
Thread Started by #yoyi

Hi all, I am doing an analysis of a salary survey. The regression gives an R-squared of 0.8 and an adjusted R-squared of 0.5. How should I trim the database to get a better R-squared? Thanks for any help.
5th September 2010 From China, Chongqing
Dear Yovi,
Please give more details about the nature of the data you collected. What do you mean by "trimming the database"? How many variables does your database have? Plain R-squared is normally quoted when there is a single explanatory variable; adjusted R-squared, which penalises each extra predictor, is the one to use with multiple explanatory variables.
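For reference, the usual relationship between the two statistics can be sketched in a few lines of Python. The function name, the sample size and the predictor count below are assumptions for illustration only; the thread does not say how many observations or variables the survey has.

```python
def adjusted_r_squared(r2, n_obs, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    if n_obs - n_predictors - 1 <= 0:
        raise ValueError("need more observations than predictors + 1")
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)

# Hypothetical case: 6 observations and 3 explanatory variables.
example = adjusted_r_squared(0.8, n_obs=6, n_predictors=3)
print(round(example, 3))
```

With these assumed numbers an R-squared of 0.8 shrinks to about 0.5, so a gap this wide tends to appear when there are few observations relative to the number of explanatory variables.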
Have a nice day.

5th September 2010 From United Kingdom
Dear Yovi

Common variable pairs in salary regression include:

1. Salary & Job Points (if jobs are evaluated)

2. Salary & Age

3. Salary & Tenure, etc.

Whichever variables are used, you are trying to establish the correlation between the two.

In conducting a salary survey, "trimming the database" refers to identifying "data outliers" or "extreme data points" and excluding them from the analysis, because including them will skew the results either upwards or downwards and make it impossible to establish trends or norms. E.g. suppose I have 5 Production Workers. Four of them receive a salary within the range of $1000 to $2000, but the fifth receives $5000. The fifth is considered an "outlier" because including this data point will skew the analysis.

To identify "outliers" you need to perform a Standard Deviation Analysis (Excel can do this): set the desired deviation step, e.g. 1, 2 or 3 standard deviations from the mean, and run the analysis. The appropriate deviation step depends on the size of your data sample.
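The same standard deviation check can be sketched outside Excel. The snippet below is a minimal illustration in Python; the function name `flag_outliers`, the individual salary figures and the step of 1.5 are assumptions chosen to mirror the five-worker example, not values from the attachment.

```python
from statistics import mean, stdev

def flag_outliers(values, step=2.0):
    """Return the values lying more than `step` sample standard deviations from the mean."""
    centre = mean(values)
    spread = stdev(values)
    if spread == 0:
        return []  # all values identical, nothing to flag
    return [v for v in values if abs(v - centre) > step * spread]

# Hypothetical salaries: four workers between $1000 and $2000, one at $5000.
salaries = [1000, 1400, 1700, 2000, 5000]
outliers = flag_outliers(salaries, step=1.5)
trimmed = [s for s in salaries if s not in outliers]
print(outliers, trimmed)
```

A step of 1.5 is used here because, with only five data points, no value can sit more than about 1.79 sample standard deviations from the mean, so a step of 2 or 3 would flag nothing. This is one reason the step has to be chosen with the sample size in mind.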

It is good practice to run two salary regressions: one before the "trimming" to depict the current situation, and another after the "trimming" to depict the desired situation.
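A before-and-after comparison of this kind might look like the following sketch. The job points and salaries are invented for illustration, the outlier is assumed to have been identified already, and `r_squared` is a hypothetical helper for a simple one-variable regression.

```python
def r_squared(xs, ys):
    """R-squared of a simple linear regression of ys on xs (squared Pearson correlation)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)

# Hypothetical data: the last job is paid far above the others at 300 points.
job_points = [100, 200, 300, 400, 500, 300]
salaries = [1000, 1500, 2000, 2500, 3000, 5000]

before = r_squared(job_points, salaries)

# Assume the last record was flagged as an outlier and is excluded.
after = r_squared(job_points[:-1], salaries[:-1])

print(f"R-squared before trimming: {before:.2f}")
print(f"R-squared after trimming:  {after:.2f}")
```

Reporting both figures keeps the current situation visible instead of hiding it behind the trimmed result.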

"Trimming" is not just for show or presentation, it indicates an area of concern for the company which must eventually be addressed.

Please see sample attachment.

When R=1, you have a "perfect" correlation, but this is rarely the case in real life. To conclude that two variables are "relatively correlated", the minimum is R=0.8 (though this depends very much on your desired standard).


Autumn Jane
5th September 2010 From Singapore, Singapore

Attached Files
File Type: pdf Customized Company Analyses.pdf (345.1 KB, 532 views)

Thank you, Autumn Jane, for a clear explanation of what "data trimming" means in this context. As the R-squared value shown was 0.8 (quite high), I did not think about outliers. There is a good explanation of outliers at http://www.statisticaloutsourcingser...m/Outlier2.pdf (though it is not about pay analysis).
Have a nice day.
5th September 2010 From United Kingdom
Dear Simhan
You are quite right that an R-squared of 0.8 is in fact quite high. But in salary analysis, it takes only one outlier (whether overpaid or underpaid) to cause serious morale issues across the organization. Therefore, the tighter the control of the data spread, the more valid the analysis.
Have a nice day.
Autumn Jane
6th September 2010 From Singapore, Singapore
Dear Yoyi, an R-squared of 0.5 is not worthless. It tells you that the factors you have chosen as independent variables are not the only ones determining salary, and that you are missing some important factor.
Are you doing a single-factor correlation or a multivariate analysis? If it is multivariate, you should also worry about interdependence between the selected independent variables. If I remember my statistics correctly, this has to do with Pearson's correlation coefficient.
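One rough way to check that interdependence is to compute Pearson's correlation coefficient between two candidate independent variables. The sketch below assumes hypothetical age and tenure figures; the `pearson_r` helper is written out by hand so the example is self-contained.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical employees: age in years and tenure in years.
age = [25, 30, 35, 40, 45]
tenure = [1, 4, 8, 12, 20]

r = pearson_r(age, tenure)
print(f"Correlation between age and tenure: {r:.2f}")
```

A coefficient close to 1 or -1 would suggest the two variables carry much the same information, so putting both into the regression adds little.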
6th September 2010 From India, Delhi
Thanks guys! Your replies are really meaningful to me!
First of all, I am doing a market survey analysis. All I have are the P25, P50 and P75. I remember my teacher telling me that the R-squared is acceptable if it is >= 0.95, which means the market data has high validity. If the R-squared is low, we should "trim" the original data. In this case, I use only annual salary and job grade to derive the regression line.
Dear Autumn Jane, your explanation is terrific, but I don't understand how an Internal Equity Analysis can determine the number of pay structures an organization should have. Could you explain further?
6th September 2010 From China, Chongqing
You are using univariate analysis; however, as people differ in experience and qualifications, you should be using multivariate analysis, because pay does not depend on the grade alone.
6th September 2010 From United Kingdom
An R-squared of 0.8 means your independent variable explains 80% of the variation in the dependent variable. An R-squared of 0.5 or greater is acceptable. Remember that in practice R-squared is never exactly 1: a value of 1 would mean a 100% relationship, which belongs to an exact mathematical expression rather than a statistical model.
If you have more than one independent variable, you should also check the tolerance and the variance inflation factor (VIF).
In technical terms, R-squared is the proportion of the variance (variation) in the dependent variable that is explained by the independent variable.
Suppose you check the relationship of salary with age.
Salary is your dependent variable and age is the independent variable.
If R-squared is 0.8, we can interpret this as: 80% of the variation in salary is explained by age.
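For the tolerance and VIF check mentioned above, the two-predictor case is simple enough to sketch by hand: tolerance is 1 minus the squared correlation between the two independent variables, and VIF is its reciprocal. The function names and the correlation values below are assumptions for illustration; with more than two predictors a statistics package would be needed.

```python
def tolerance(r_between_predictors):
    """Tolerance for a model with exactly two predictors: 1 - r^2."""
    return 1 - r_between_predictors ** 2

def vif(r_between_predictors):
    """Variance inflation factor: the reciprocal of tolerance."""
    tol = tolerance(r_between_predictors)
    if tol == 0:
        raise ValueError("predictors are perfectly correlated")
    return 1 / tol

# Hypothetical correlations between two predictors such as age and tenure.
for r in (0.2, 0.5, 0.9):
    print(f"r = {r}: tolerance = {tolerance(r):.2f}, VIF = {vif(r):.2f}")
```

A commonly quoted rule of thumb treats a VIF above about 10 as a sign of troublesome overlap between predictors, though the cut-off is a convention rather than a strict rule.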
7th September 2010 From Pakistan
Dear Yoyi
Let me come back to you on using Internal Equity Analysis to determine the number of pay structures an organization should have, as I am looking through my projects to extract relevant material for a pictorial explanation.
In the meantime, would you be able to clarify how job grades are classified in your organization, e.g. job grade 1, 2, 3....?
Autumn Jane
7th September 2010 From Singapore, Singapore
Dear Yoyi

As promised, attached please find a perfect example of using an Internal Equity Analysis to determine whether an organization should have more than 1 pay structure.

The first chart shows internal equity as originally practised, i.e. one pay structure after a job evaluation exercise. Even though the R-squared was 0.94 (exceeding the minimum of 0.8 and pretty close to 1.0), there are other tell-tale signs indicating that the company was unknowingly or unintentionally operating two pay structures.

The signs are:

1. Job Clusters - one cluster at Job Points from 50 to 250 and another cluster at Job Points from 350 all the way to 1050.

2. A "break" in the form of missing grade(s) - there are no jobs between job points 251 and 349.

3. Beyond the visual, background information about this company confirmed that the 1st cluster belongs to non-executive jobs (blue collar & clerical) while the 2nd cluster belongs to executive jobs (professional & managerial). The chart confirms that the company pays more aggressively for its executive jobs than for its non-executive jobs.

Because of all the above factors, a second internal equity analysis was constructed, but with a break or with 2 pay structures. Notice the decline in the R-squared for both lines? Although there is a decline, the 2nd chart is a better representation of the pay practice for this company. Therefore, pay recommendations should be based on the 2nd chart instead of the 1st.
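The idea of fitting one line per cluster can be sketched as follows. The job points, salaries and the break at 300 points are invented for illustration and are not taken from the attached charts; `fit_line` is a hypothetical helper for a simple least-squares fit.

```python
def fit_line(xs, ys):
    """Least-squares fit of ys on xs; returns (slope, intercept, r_squared)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx, sxy * sxy / (sxx * syy)

# Hypothetical jobs: (job points, monthly salary).
jobs = [(50, 1000), (100, 1250), (150, 1350), (200, 1650), (250, 1800),
        (350, 3000), (500, 4800), (700, 6200), (900, 8700), (1050, 10000)]

BREAK_POINT = 300  # assumed split inside the gap between the two clusters

pooled = fit_line([p for p, _ in jobs], [s for _, s in jobs])
non_exec = fit_line([p for p, _ in jobs if p < BREAK_POINT],
                    [s for p, s in jobs if p < BREAK_POINT])
executive = fit_line([p for p, _ in jobs if p >= BREAK_POINT],
                     [s for p, s in jobs if p >= BREAK_POINT])

for label, (slope, intercept, r2) in [("one structure", pooled),
                                      ("non-executive", non_exec),
                                      ("executive", executive)]:
    print(f"{label}: salary = {slope:.2f} * points + {intercept:.0f}, R-squared = {r2:.3f}")
```

Comparing the two slopes shows how differently each group is paid per job point, which a single pooled line can hide even when its R-squared looks high.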

Hope my explanation is clear and useful.


Autumn Jane
9th September 2010 From Singapore, Singapore

Attached Files
File Type: pdf Internal Equity Analysis 1.pdf (22.6 KB, 119 views)
