RE: [R] Mental Block with PCA of multivariate time series!
Is this along the lines of what you are trying to do? sim.data-data.frame(matrix(rnorm(350*10),350,10)) day-seq(1:350) sim.data-data.frame(day,sim.data) pc1.load.all-NULL pc2.load.all-NULL for (i in seq(0,300,by=50)){ sim.data.i-subset(sim.data,sim.data$dayisim.data$day(i+50)) pc1.load.i-princomp(sim.data.i[,2:11])$loadings[,1] pc2.load.i-princomp(sim.data.i[,2:11])$loadings[,2] pc1.load.all-rbind(pc1.load.all,pc1.load.i) pc2.load.all-rbind(pc1.load.all,pc1.load.i) } period-seq(1:7) pc1.load.all-cbind(period,pc1.load.all) pc2.load.all-cbind(period,pc1.load.all) # and plot loadings for each each variable vs. the period... Ignacio -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Laura Quinn Sent: Monday, May 16, 2005 4:34 AM To: Gavin Simpson Cc: r-help@stat.math.ethz.ch Subject: Re: [R] Mental Block with PCA of multivariate time series! Sorry, I don't think I made myself clear enough with my initial query! I am wishing to investigate the temporal evolution of the pca: if we assume that every 50 rows of my data frame is representitive of, for instance, 1 day of data, I am hoping to automate a process whereby a pca is performed on every 50 rows of data and the loading for PC1 and PC2 for each variable (i.e. each column) is represented as a point on a plot - so a years' data will be represented as two lines (representing PC1 and PC2) on a time series plot for each variable. Laura Quinn Institute of Atmospheric Science School of Earth and Environment University of Leeds Leeds LS2 9JT tel: +44 113 343 1596 fax: +44 113 343 6716 mail: [EMAIL PROTECTED] On Mon, 16 May 2005, Gavin Simpson wrote: Laura Quinn wrote: Please could someone point me in the right direction as I appear to be having a total mental block with fairly basic PCA problem! I have a large dataframe where rows represent independent observations and columns are variables. I am wanting to perform PCA sequentially on blocks of nrows at a time and produce a graphical output of the loadings for the first 2 EOFs for each variable. I'm sure I've performed a very similar routine in the past, but the method is currently escaping me. Any help gratefully received! Hi Laura, data(iris) iris.dat - iris[,1:4] pca.1 - prcomp(iris.dat[1:50, ], scale = TRUE) pca.2 - prcomp(iris.dat[51:100, ], scale = TRUE) pca.3 - prcomp(iris.dat[100:150, ], scale = TRUE) biplot(pca.1) etc... There is a better way of subsetting this data set as the 5th col of iris is a factor and we could use the subset argument to prcomp to do the subsetting without having to know that there are 50 rows per species. Take a look at that argument if you have a variable that defines the blocks for you. Is this what you were after? All the best, Gav -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Gavin Simpson [T] +44 (0)20 7679 5522 ENSIS Research Fellow [F] +44 (0)20 7679 7565 ENSIS Ltd. ECRC [E] gavin.simpsonATNOSPAMucl.ac.uk UCL Department of Geography [W] http://www.ucl.ac.uk/~ucfagls/cv/ 26 Bedford Way[W] http://www.ucl.ac.uk/~ucfagls/ London. WC1H 0AP. %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] multiple autocorrelation coefficients in spdep?
Thanks for the reply. I will look into the suggestions you have given. Indeed, I have used an lme model with direct covariance representation via geostatistical models, where I was able to fit separate terms for different groups, but part of my point is to compare the outcome from both approaches. Ignacio -Original Message- From: Roger Bivand [mailto:[EMAIL PROTECTED] Sent: Tuesday, April 26, 2005 3:00 AM To: Ignacio Colonna Cc: R-help@stat.math.ethz.ch Subject: Re: [R] multiple autocorrelation coefficients in spdep? On Mon, 25 Apr 2005, Ignacio Colonna wrote: Hello, Has anyone modified the errorsarlm in the R package spdep to allow for more than a single spatial autocorrelation coefficient (i.e. 'lambda')? Or, if not, any initial suggestions on how to make that modification? I have looked at the source code for the function and realize that any attempt to do it on my own would require much dedication, so would like to check whether someone has done it already. My R programming skills are very elementary. Specifically I would need to specify 2 different lambdas in a dataset, one for each group. I am performing a regression of grain yield against a number of soil variables, for 2 different years, where the regression includes terms for year*soil variables interaction. The spatial structure of the error is clearly different between these years, and I would thus like to fit different lambdas to them, if possible. Of course one option is to just run 2 separate regressions, but if the possibility of fitting more than 1 lambda does not seem too remote, I think fitting a single model offers some advantages. As far as I am aware, as package maintainer, this has not been done, and is not easy to do, as you have noted. It might be possible to use the lm.gls() function in the MASS package once the compound weights matrix (including the coefficients) has been fixed, perhaps using optim(). Have you considered other options perhaps including some lme variant, and/or spatial panel variants? Thanks in advance, Ignacio -- Roger Bivand Economic Geography Section, Department of Economics, Norwegian School of Economics and Business Administration, Helleveien 30, N-5045 Bergen, Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43 e-mail: [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] random interactions in lme
The code below gives almost identical results for a split-block analysis in lme and SAS proc mixed, in terms of variance components and F statistics. It just extends the example in Pinheiro Bates (p.162) to a split block design. I am including below the SAS code and the data in case you want to try it. The only difference between both is in the df for the F denominator, which I wasn't able to compute correctly in lme, but this may be my ignorance on how to correctly specify the model. It is not a big issue though, as the F values are identical, so you can compute the p-values if you know how to obtain the correct DenDF. # a split block design spbl.an1-lme(yield~rowspace*ordered(tpop),random=list(rep=pdBlocked(list(pd Ident(~1), pdIdent(~rowspace-1),pdIdent(~ordered(tpop)-1,data=spblock) * SAS code proc mixed data=splitblock method=reml; class rep rowspace tpop; model yield=rowspace tpop rowspace*tpop; random rep rep*rowspace rep*tpop; run; # data rowspacetpoprep plotyield 9 60 1 133 19 9 120 1 101 19.5 9 180 1 117 22 9 240 1 132 19.4 9 300 1 116 23.9 18 60 1 134 15.8 18 120 1 102 26.2 18 180 1 118 21.9 18 240 1 131 20 18 300 1 115 23.3 9 60 2 216 20.6 9 120 2 233 22 9 180 2 201 23.4 9 240 2 217 28.2 9 300 2 232 25.9 18 60 2 215 19.7 18 120 2 234 30.3 18 180 2 202 22.4 18 240 2 218 27.9 18 300 2 231 28.5 9 60 3 309 20.8 9 120 3 308 21.6 9 180 3 324 24.6 9 240 3 340 25.3 9 300 3 325 35.3 18 60 3 310 17.2 18 120 3 307 23.6 18 180 3 323 24.9 18 240 3 339 30.7 18 300 3 326 33 9 60 4 435 15.6 9 120 4 403 20.4 9 180 4 430 24.4 9 240 4 414 21 9 300 4 419 23.2 18 60 4 436 17.7 18 120 4 404 23.6 18 180 4 429 21.7 18 240 4 413 24.4 18 300 4 420 26.2 Ignacio -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Douglas Bates Sent: Monday, April 25, 2005 6:40 PM To: Jacob Michaelson Cc: r-help@stat.math.ethz.ch Subject: Re: [R] random interactions in lme Jacob Michaelson wrote: On Apr 24, 2005, at 8:52 AM, Douglas Bates wrote: Jacob Michaelson wrote: Hi All, I'm taking an Experimental Design course this semester, and have spent many long hours trying to coax the professor's SAS examples into something that will work in R (I'd prefer that the things I learn not be tied to a license). It's been a long semester in that regard. One thing that has really frustrated me is that lme has an extremely counterintuitive way for specifying random terms. I can usually figure out how to express a single random term, but if there are multiple terms or random interactions, the documentation available just doesn't hold up. Here's an example: a split block (strip plot) design evaluated in SAS with PROC MIXED (an excerpt of the model and random statements): model DryMatter = Compacting|Variety / outp = residuals ddfm = satterthwaite; random Rep Rep*Compacting Rep*Variety; Now the fixed part of that model is easy enough in lme: DryMatter~Compacting*Variety But I can't find anything that adequately explains how to simply add the random terms to the model, ie rep + rep:compacting + rep:variety; anything to do with random terms in lme seems to go off about grouping factors, which just isn't intuitive for me. The grouping factor is rep because the random effects are associated with the levels of rep. I don't always understand the SAS notation so you may need to help me out here. Do you expect to get a single variance component estimate for Rep*Compacting and a single variance component for Rep*Variety? If so, you would specify the model in lmer by first creating factors for the interaction of Rep and Compacting and the interaction of Rep and Variety. dat$RepC - with(dat, Rep:Compacting)[drop=TRUE] dat$RepV - with(dat, Rep:Variety)[drop=TRUE] fm - lmer(DryMatter ~ Compacting*Variety+(1|Rep)+(1|RepC)+(1|RepV), dat) Thanks for the prompt reply. I tried what you suggested, here's what I got: turf.lme=lmer(dry_matter~compacting*variety+(1|rep)+(1|rc)+(1|rv), turf.data) Error in lmer(dry_matter ~ compacting * variety + (1 | rep) + (1 | rc) + : entry 3 in matrix[9,2] has row 3 and column 2 Just to see what the
[R] multiple autocorrelation coefficients in spdep?
Hello, Has anyone modified the errorsarlm in the R package spdep to allow for more than a single spatial autocorrelation coefficient (i.e. 'lambda')? Or, if not, any initial suggestions on how to make that modification? I have looked at the source code for the function and realize that any attempt to do it on my own would require much dedication, so would like to check whether someone has done it already. My R programming skills are very elementary. Specifically I would need to specify 2 different lambdas in a dataset, one for each group. I am performing a regression of grain yield against a number of soil variables, for 2 different years, where the regression includes terms for year*soil variables interaction. The spatial structure of the error is clearly different between these years, and I would thus like to fit different lambdas to them, if possible. Of course one option is to just run 2 separate regressions, but if the possibility of fitting more than 1 lambda does not seem too remote, I think fitting a single model offers some advantages. Thanks in advance, Ignacio [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] summing values by group
Maybe aggregate() is what you are looking for? e.g. say your data frame is called 'mydata' sum.by.CAT-aggregate(mydata,list(CAT),sum) this will give you sums by CAT for all the variables in the data set and will yield 'NA' for any character variables you may have. Ignacio -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Larry White Sent: Thursday, March 24, 2005 10:12 AM To: R-help@stat.math.ethz.ch Subject: [R] summing values by group At the risk of being wacked for asking what should be obvious I have a data frame with one categorical variable CAT and several numeric variables. I want to be able to get simple statistics on the numeric variables by level. For example, just as you can use table (CAT) to get the counts, I'd like to be able to get the means and sums by category. If someone could point me in the right direction, I'd appreciate it. I've been through the SimpleR and Using R for Data Analysis... docs and I'm still clueless. thanks for your help. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] Bug on the stem function or in my brain ?
José, Notice that the values to the left of the | in your stem plot are all even. Odd numbers are included in the same line. Try stem(time,scale=2) The decimal point is 1 digit(s) to the right of the | 3 | 2 4 | 17 5 | 09 6 | 4667789 7 | 38 8 | 3 9 | 0335 10 | 00 11 | 0 ignacio -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jose A. Hernandez Sent: Monday, March 21, 2005 9:15 AM To: r-help@stat.math.ethz.ch Subject: [R] Bug on the stem function or in my brain ? Good day R-ers! I was running the basic statistics for the exam that my students took last week and something does not make sense with the stem() fucntion. Here are two of my variables: time, is time to complete the exam in minutes exam.1, is the grade for the exam In stem(), to the left of the vertical bar are the leading digits of the grades. To the right of the vertical bar are the last digits of the grades. Each single digit on the right represents one grade. time [1] 32 41 47 50 59 64 66 66 67 67 68 69 73 78 83 90 93 93 95 [20] 100 100 110 stem(time) The decimal point is 1 digit(s) to the right of the | 2 | 2 4 | 1709 6 | 466778938 8 | 30335 10 | 000 The stem and leaf plot does not reflect the actual data, the bottom line for instance says there were 3 people that spent 100 minutes working on the test. The next to bottom line says there were one 80, three 83s, one 85. And so forth. exam.1 [1] 82 100 86 81 88 78 92 23 91 49 97 9 89 78 93 60 80 80 83 [20] 94 51 100 stem(exam.1) The decimal point is 1 digit(s) to the right of the | 0 | 9 2 | 3 4 | 91 6 | 088 8 | 0012368912347 10 | 00 The Stem-and-Leaf plots DO NOT correspond to the data. Any educational insights on this issue would be appreciated. Regards, Jose class(exam.1) [1] numeric class(time) [1] numeric version _ platform i386-pc-mingw32 arch i386 os mingw32 system i386, mingw32 status major2 minor0.1 year 2004 month11 day 15 language R -- Jose A. Hernandez Ph.D. Candidate Precision Agriculture Center Department of Soil, Water, and Climate University of Minnesota 1991 Upper Buford Circle St. Paul, MN 55108 Ph. (612) 625-0445, Fax. (612) 625-2208 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] Re: Repeated Measures, groupedData and lme
Emma, I am not an expert, but I have been trying to fit similar models. Adding to Keith's reply to your question, I can suggest what I concluded was the most reasonable model for my case. Based on Keith's Model1, you might also want to allow for a correlation among years within each experimental unit (I am assuming the experiment was conducted at the exact same location over the 3 years). Say you want to impose an autoregressive, order 1 structure (you can change this to any other structure you may consider appropriate) To do this at the block*treatment unit (the smallest experimental unit size in your experiment), you can add to keith's code: correlation=corAR1(form=~1|block/treatment) thus the entire code would be Model1-lme(mg~treatment + year + treatment:year, random=~1|block, correlation=corAR1(form=~1|block/treatment),data=magnesium) This results in a model with a certain covariance among all exp.units within the same block, plus another covariance between pairs of years within the same exp.unit, and this covariance decreases as the difference in time increases. You can see graphically the structure of this covariance by doing: rho-0.3 ar1-corAR1(value=rho,form=~1|block/treatment) ar1-Initialize(ar1,data=yourdata) m1-corMatrix(ar1) plot(m1$1/name of first treatment[,1]) Now, I really hope someone more knowledgeable is checking this out there and will point out whether this is incorrect, as I have used it for my analysis assuming was the correct approach. Ignacio -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Keith Wong Sent: Friday, March 18, 2005 5:41 PM To: r-help@stat.math.ethz.ch Subject: [R] Re: Repeated Measures, groupedData and lme Hello, I'm an R-newbie, but I've been learning to use lme for repeated measures experiments as well. If I understand correctly: Outcome variable: Mg (Kg/ha) Subject/grouping variable: block Condition/treatment: treatment (19 levels) Repeated factor: time (3 levels: 99, 02, 04) I think if you specify the model formula in the lme call, then the formula structure specified in the groupedData object is ignored. One suggestion for the model: Model1-lme(mg~treatment + year + treatment:year, random=~1|block, data=magnesium) If the question of interest is the treatment:year interaction Or Model2 - lme(mg~treatment, random=~1|block, data=magnesium) Hope this helps ... and hope the experts chime in at this point to give more guidance. Keith --quoting original post--- Hello I am trying to fit a REML to some soil mineral data which has been collected over the time period 1999 - 2004. I want to know if the 19 different treatments imposed, differ in terms of their soil mineral content. A tree model of the data has shown differences between the treatments can be attributed to the Magnesium, Potassium and organic matter content of the soil, with Magnesium being the primary separating variable. I am looking at soil mineral data were collected : 99, 02, 04. In the experiment, there are 19 different treatments (treatmentcontrol, treatment6TFYM, treatment 12TFYM etc), which are replicated in 3 blocks. For the magnesium soil data, I have created the following groupedData object: magnesium-groupedData(Mg~year|treatment, inner=~block) Where mg=magnesium Kg/ha As it is a repeated measures I was going to use an lme. I have looked at Pinherio and Bates : Mixed-Effects models in S and S-plus and I am getting slightly confused. In order to fit the lme, should I specify the data to use in the model as the grouped structure model? If so is the following command correct: Model1-lme(mg~treatment, random=block|year, data=magnesium)? I am slightly worried that it isn't, because in model summary, instead of listing the 19 different treatments in the fixed effects section, it writes intercept (as normal), then treatment^1, treatment^2 etc. However if I don't specify the groupedData object in the model, then in the fixed effects section, it names the treatments (i.e. intercept, treatmentcontrol, treatment6TFYM. Should I be fitting the model using the whole data set rather than the groupedData object? Thank you very much for your help Emma Pilgrim __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] How to plot points as numbers/strings in lattice
Owen, I think this gives the plot you are looking for. There may be other better ways to do it, this is just the one I know. Inside 'panel' you would need to use 'ltext()' instead of 'text()', as in the example you provided. xyplot(V1~V2, data=a, groups=V3, panel = function(x, y, groups) ltext(x, y, label=groups, cex = .75) ) Ignacio -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Owen Solberg Sent: Tuesday, March 15, 2005 4:55 PM To: r-help@stat.math.ethz.ch Subject: [R] How to plot points as numbers/strings in lattice Hello, I would be very grateful if anyone could help with what seems like a simple lattice task. I want to use xyplot, where the symbols for the plotted points are taken from another column in the data frame. So if the data frame looked like: a - as.data.frame(matrix(data=c(1,1,10,2,2,20,3,3,30), nrow=3, ncol=3, byrow=TRUE)) a V1 V2 V3 1 1 1 10 2 2 2 20 3 3 3 30 you would get an xy scatter plot using where 10 (not a dot) is at coordinate 1,1, 20 is at 2,2, and so on. I have made two attempts. The first, below, almost works, but only takes the first character from the V3 column: library(lattice) xyplot(V1~V2, data=a, pch=as.character(a$V3)) I've also tried this example, which is given in an online user guide for trellis. It uses the built-in ethanol data set (which is also in lattice), and subscripts... but when I try the same code in lattice, I get an invalid graphics state error. library(lattice) xyplot(NOx ~ E | C, data = ethanol, aspect = 1/2, panel = function(x, y, subscripts) text(x, y, subscripts, cex = .75) ) Thank you very much in advance! Owen Solberg version _ platform i686-redhat-linux-gnu arch i686 os linux-gnu system i686, linux-gnu status major2 minor0.0 year 2004 month10 day 04 language R __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html