
Re: [R] how to tell if its better to standardize your data matrix first when you do principal

2009-11-23 Thread masterinex




Actually, it's for an assignment, Michael. All I'm looking for is some help
and suggestions; please don't get me wrong. I do believe that this is a
helpful community.


>> This sounds a bit like homework. If that is the case, please ask your
>> teacher rather than this list.
>> Anyway, it does not make sense to predict weight using a linear
>> combination (principal component) that contains weight, does it?
>>
>> Uwe Ligges
>
> It's likely to have been homework: a quick search on "masterinex"
> "xevilgang79" reveals which university this undergraduate student is at. It
> also produces a phone number, which can be used to look up an address, and a
> cell phone number.
>
> MK
__
R-help@r-project.org mailing list

PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



-- 
View this message in context: 
http://old.nabble.com/how-to-tell-if-its-better-to-standardize-your-data-matrix-first-when-you-do-principal-tp26462070p26490273.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] how to tell if its better to standardize your data matrix first when you do principal

2009-11-23 Thread Michael Kubovy


On Nov 22, 2009, at 10:22 AM, Uwe Ligges wrote:

> This sounds a bit like homework. If that is the case, please ask your teacher 
> rather than this list.
> Anyway, it does not make sense to predict weight using a linear combination 
> (principal component) that contains weight, does it?
> 
> Uwe Ligges

It's likely to have been homework: a quick search on "masterinex" "xevilgang79" 
reveals which university this undergraduate student is at. It also produces a 
phone number, which can be used to look up an address, and a cell phone number.

MK

Re: [R] how to tell if its better to standardize your data matrix first when you do principal

2009-11-23 Thread Uwe Ligges





Re: [R] how to tell if its better to standardize your data matrix first when you do principal

2009-11-23 Thread Uwe Ligges

masterinex wrote:
> Hi Hadley,
>
> I really appreciate the suggestions you gave. They were helpful, but I still
> didn't quite get it all. I really want to do a good job, so any
> comments would be helpful; please understand.

Well, we try to understand you, but we cannot quite follow you either. I think
you really need to consult a statistics textbook on PCA if my answer was not
sufficient. Given your questions, I doubt you understand what PCA does
and how it works. It does not predict anything.

Uwe Ligges


Re: [R] how to tell if its better to standardize your data matrix first when you do principal

2009-11-22 Thread masterinex


This is what my data matrix looks like. These are just the first 10
observations, but the pattern is similar for the other observations.


 1  12.3  154.25  67.75  36.2   93.1   85.2   94.5  59.0  37.3  21.9  32.0  27.4  17.1
 2   6.1  173.25  72.25  38.5   93.6   83.0   98.7  58.7  37.3  23.4  30.5  28.9  18.2
 3  25.3  154.00  66.25  34.0   95.8   87.9   99.2  59.6  38.9  24.0  28.8  25.2  16.6
 4  10.4  184.75  72.25  37.4  101.8   86.4  101.2  60.1  37.3  22.8  32.4  29.4  18.2
 5  28.7  184.25  71.25  34.4   97.3  100.0  101.9  63.2  42.2  24.0  32.2  27.7  17.7
 6  20.9  210.25  74.75  39.0  104.5   94.4  107.8  66.0  42.0  25.6  35.7  30.6  18.8
 7  19.2  181.00  69.75  36.4  105.1   90.7  100.3  58.4  38.3  22.9  31.9  27.8  17.7
 8  12.4  176.00  72.50  37.8   99.6   88.5   97.1  60.0  39.4  23.2  30.5  29.0  18.8
 9   4.1  191.00  74.00  38.1  100.9   82.5   99.9  62.9  38.3  23.8  35.9  31.1  18.2
10  11.7  198.25  73.50  42.1   99.6   88.6  104.1  63.1  41.7  25.0  35.6  30.0  19.2


And after standardizing it:

 1  -0.831228836  -0.898881671  -0.98330178  -0.77420686  -0.952294055  -0.712961621  -0.814552365  -0.0625400993  -0.53901713  -0.825399059  -0.08244945
 2  -1.588060506  -0.185928394   0.75868364   0.23560461  -0.889886435  -0.931523054  -0.155497233  -0.1252522485  -0.53901713   0.295114747  -0.59529632
 3   0.755676279  -0.908262635  -1.56396359  -1.74011349  -0.615292906  -0.444727135  -0.077038289   0.0628841989   0.15515266   0.743320270  -1.17652277
 4  -1.063161122   0.245595958   0.75868364  -0.24734870   0.133598535  -0.593746294   0.236797489   0.1674044475  -0.53901713  -0.153090775   0.05430971
 5   1.170713001   0.226834030   0.37157577  -1.56449410  -0.428070046   0.757360745   0.346640011   0.8154299886   1.58687786   0.743320270  -0.01406987
 6   0.218569932   1.202454304   1.72645331   0.45512884   0.470599683   0.201022552   1.27244       1.4007433805   1.50010664   1.938534997   1.18257281
 7   0.011051571   0.104881496  -0.20908604  -0.68639717   0.545488828  -0.166558039   0.095571389  -0.1879643976  -0.10516101  -0.078389855  -0.11663925
 8  -0.819021874  -0.082737788   0.85546060  -0.07172932  -0.140994994  -0.385119472  -0.406565855   0.1465003978   0.37208072   0.145712907  -0.59529632
 9  -1.832199755   0.480120063   1.43612241   0.05998522   0.021264819  -0.981196107   0.032804234   0.7527178395  -0.10516101   0.593918429   1.25095239
10  -0.904470611   0.752168024   1.24256848   1.81617909  -0.140994994  -0.375184861   0.691859366   0.7945259389   1.36994980   1.490329474   1.14838302



This is the result of applying PCA to the data matrix:

Standard deviations:
 [1] 30.6645414  7.5513852  3.6927427  2.8703435  2.5363007  1.9136933  1.5624131  1.3689630  1.2976189
[10]  1.1633458  1.1118231  0.7847148  0.4802303

Rotation:
              PC1          PC2          PC3           PC4           PC5           PC6           PC7          PC8
var1   0.18110712  -0.74864138  -0.46070566  -0.365658769   0.192810075  -0.132529979   0.023764851   0.03674873
var2   0.86458284   0.34243386  -0.05766909  -0.235504989  -0.046075934   0.001493006  -0.024535011   0.13439659
var3   0.03765598   0.20097537  -0.15709612  -0.343218776  -0.295201121  -0.073295697  -0.086930370  -0.54389141
var4   0.05965733   0.01737951   0.09854179  -0.030801791   0.125735684   0.341795876  -0.001735808   0.37152696
var5   0.23845698  -0.20616399   0.68948870   0.025904812   0.391188182  -0.428933369  -0.101780281  -0.16965893
var6   0.29928369  -0.47394636   0.24791449   0.341235161  -0.511378719   0.447071255  -0.077534385  -0.13198544
var7   0.19503685   0.01385823  -0.24126047   0.531403827  -0.127426510  -0.410568454   0.608163973  -0.01265457
var8   0.13261863   0.06839078  -0.37740589   0.535332339   0.366103479   0.032376851  -0.574484605  -0.05645694
var9   0.06246705   0.04407384  -0.09545362   0.037993146  -0.036651080   0.012347288  -0.192976142  -0.13027876
var10  0.03027791   0.05533988  -0.03749859  -0.009257423   0.011026593  -0.010770032  -0.104041067   0.12125263
var11  0.07435322   0.04334969  -0.02666944   0.032036374   0.464035624   0.454970952   0.347507539  -0.60527541
var12  0.04328710   0.04731771   0.00360668  -0.054200633   0.275901346   0.297800123   0.324323749   0.30487145
var13  0.02095652   0.02146485   0.03598618  -0.022510780   0.005192075   0.103988977   0.031541374   0.07877455

                PC9          PC10          PC11         PC12          PC13
var1   -0.005328345   0.030549780  -0.049283616  -0.02211988   0.015660892
var2    0.170766596  -0.144031738   0.028862963   0.06984674   0.006293703
var3   -0.282549313   0.548650592   0.131284937  -0.14740722  -0.002384605
var4    0.024070488   0.614154008  -0.551480394  -0.03446124  -0.178123011
var5   -0.157551008   0.147685248   0.008044148  -0.04068258   0.007778992
var6   -0.058675551   0.006344813   0.130814072  -0.04088919  -0.028655330
var7   -0.099243751   0.171852216  -0.149231752  -0.06690208  -0.014693444
var8    0.006629025   0.199158097   0.187226774  -0.02511968   0.070896819
var9   -0.658214712  -0.320120384  -0.53990       0.37630539  -0.023642902
var10  -0.259704149  -0.273030750  -0.074006053  -0.83676032  -0.348034215
var11   0.157450716
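
The proportion of variance attributed to each component follows directly from the standard deviations listed above: square each one to get the eigenvalue, then divide by the total. A short sketch in R, with the values copied from that output (the names `sdev`, `eigenvalues` and `prop` are only illustrative):

```r
# Standard deviations of the 13 components, copied from the output above
sdev <- c(30.6645414, 7.5513852, 3.6927427, 2.8703435, 2.5363007,
          1.9136933, 1.5624131, 1.3689630, 1.2976189, 1.1633458,
          1.1118231, 0.7847148, 0.4802303)

# Each eigenvalue is the variance along one component
eigenvalues <- sdev^2

# Share of the total variance carried by each component
prop <- eigenvalues / sum(eigenvalues)

round(prop, 4)          # proportion of variance per component
round(cumsum(prop), 4)  # cumulative proportion
```

On these numbers the first component takes roughly 90% of the total variance, and its largest loading (0.86) sits on var2, the column with by far the widest spread in the raw data. That is the usual pattern when PCA is run on unstandardized variables with very different scales, so a large first proportion here probably reflects scale rather than a more meaningful component.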

Re: [R] how to tell if its better to standardize your data matrix first when you do principal

2009-11-22 Thread masterinex


Hi Hadley,

I really appreciate the suggestions you gave. They were helpful, but I still
didn't quite get it all. I really want to do a good job, so any
comments would be helpful; please understand.





hadley wrote:
> 
> You've asked the same question on stackoverflow.com and received the
> same answer.  This is rude because it duplicates effort.  If you
> urgently need a response to a question, perhaps you should consider
> paying for it.
> 
> Hadley

Re: [R] how to tell if its better to standardize your data matrix first when you do principal

2009-11-22 Thread hadley wickham

You've asked the same question on stackoverflow.com and received the
same answer.  This is rude because it duplicates effort.  If you
urgently need a response to a question, perhaps you should consider
paying for it.

Hadley

-- 
http://had.co.nz/


Re: [R] how to tell if its better to standardize your data matrix first when you do principal

2009-11-22 Thread masterinex


So in which cases is it better to standardize the data matrix first?
Also, is PCA generally used to predict a response variable, and should I
keep that variable in my data matrix?



Re: [R] how to tell if its better to standardize your data matrix first when you do principal

2009-11-22 Thread Uwe Ligges


masterinex wrote:
> Hi guys,
>
> I'm trying to do principal component analysis in R. There are two ways of
> doing it, I believe. One is to do the principal component analysis right
> away; the other is to standardize the matrix first using s = scale(m) and
> then apply principal component analysis.
> How do I tell which result is better? What values in particular should I
> look at? I have already managed to find the eigenvalues and eigenvectors and
> the proportion of variance for each eigenvector using both methods.



Generally, it is better to standardize. But in some cases, e.g. when your
variables share the same units and their variances also indicate their
importance, it might make sense not to do so.
You should think about the analysis: you cannot know which result is
'better' unless you have an interpretation.
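
A minimal sketch of that comparison in R, assuming simulated data rather than the poster's matrix (the matrix `m` and its columns `a`, `b`, `c` are made up for illustration). `prcomp()` with `scale. = TRUE` standardizes internally, which should match calling `scale(m)` first:

```r
# Simulated data (an assumption, not the poster's matrix): three unrelated
# variables measured on very different scales
set.seed(1)
n <- 100
m <- cbind(
  a = rnorm(n, mean = 180, sd = 30),
  b = rnorm(n, mean = 70,  sd = 3),
  c = rnorm(n, mean = 18,  sd = 1)
)

pca_raw    <- prcomp(m)                 # PCA on the covariance matrix
pca_scaled <- prcomp(m, scale. = TRUE)  # PCA on the correlation matrix

# Proportion of variance explained by each component
prop_raw    <- pca_raw$sdev^2    / sum(pca_raw$sdev^2)
prop_scaled <- pca_scaled$sdev^2 / sum(pca_scaled$sdev^2)

# Standardizing by hand with scale(m) gives the same component variances
pca_manual <- prcomp(scale(m))
all.equal(pca_scaled$sdev, pca_manual$sdev)

round(rbind(raw = prop_raw, scaled = prop_scaled), 3)
```

With data like this, the unscaled first component should take nearly all of the variance simply because column `a` has the largest variance, while the scaled components share it more evenly. Neither result is "better" by that number alone; the choice depends on whether the differences in scale carry meaning.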





> I noticed that the proportion of variance for the first principal component
> was larger without standardizing. Is there a meaning to it? Isn't this
> always the case?
> Lastly, if I am supposed to predict a variable, e.g. weight, should I drop
> that variable from my data matrix when I do principal component analysis?



This sounds a bit like homework. If that is the case, please ask your 
teacher rather than this list.
Anyway, it does not make sense to predict weight using a linear 
combination (principal component) that contains weight, does it?
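
If the aim really is prediction, one common approach (principal component regression, which is a separate step and not something PCA does by itself) is to leave the response out of the PCA and regress it on the component scores afterwards. A hedged sketch with simulated data; the data frame `dat`, the predictors `x1` to `x3` and the coefficients are all hypothetical:

```r
# Simulated data (an assumption): three predictors and a response "weight"
set.seed(2)
n <- 100
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$weight <- 70 + 5 * dat$x1 + 3 * dat$x2 + rnorm(n)  # hypothetical response

# Run PCA on the predictors only; the response column is excluded
predictors <- dat[, setdiff(names(dat), "weight")]
pca <- prcomp(predictors, scale. = TRUE)

# Regress the response on the scores of the first two components
scores <- data.frame(weight = dat$weight, pca$x[, 1:2])
fit <- lm(weight ~ PC1 + PC2, data = scores)

rownames(pca$rotation)   # only the predictors appear in the loadings
summary(fit)$r.squared
```

Keeping `weight` out of `predictors` means no component is built from the variable being predicted, which is the point made above. How many components to keep is a separate decision that this sketch does not address.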


Uwe Ligges


[R] how to tell if its better to standardize your data matrix first when you do principal

2009-11-21 Thread masterinex




Hi guys,

I'm trying to do principal component analysis in R. There are two ways of
doing it, I believe. One is to do the principal component analysis right away;
the other is to standardize the matrix first using s = scale(m) and then apply
principal component analysis.
How do I tell which result is better? What values in particular should I
look at? I have already managed to find the eigenvalues and eigenvectors and
the proportion of variance for each eigenvector using both methods.

I noticed that the proportion of variance for the first principal component
was larger without standardizing. Is there a meaning to it? Isn't this
always the case?
Lastly, if I am supposed to predict a variable, e.g. weight, should I drop
that variable from my data matrix when I do principal component analysis?
