RE: [R] Mental Block with PCA of multivariate time series!

2005-05-16 Thread Ignacio Colonna
Is this along the lines of what you are trying to do?

sim.data-data.frame(matrix(rnorm(350*10),350,10))
day-seq(1:350)
sim.data-data.frame(day,sim.data)
pc1.load.all-NULL
pc2.load.all-NULL
 for (i in seq(0,300,by=50)){
 sim.data.i-subset(sim.data,sim.data$dayisim.data$day(i+50))
pc1.load.i-princomp(sim.data.i[,2:11])$loadings[,1]
pc2.load.i-princomp(sim.data.i[,2:11])$loadings[,2]
pc1.load.all-rbind(pc1.load.all,pc1.load.i)
pc2.load.all-rbind(pc1.load.all,pc1.load.i)
 }

period-seq(1:7)
pc1.load.all-cbind(period,pc1.load.all)
pc2.load.all-cbind(period,pc1.load.all)

# and plot loadings for each each variable vs. the period...

Ignacio


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Laura Quinn
Sent: Monday, May 16, 2005 4:34 AM
To: Gavin Simpson
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Mental Block with PCA of multivariate time series!

Sorry, I don't think I made myself clear enough with my initial query!

I am wishing to investigate the temporal evolution of the pca: if we
assume that every 50 rows of my data frame is representitive of, for
instance, 1 day of data, I am hoping to automate a process whereby a pca
is performed on every 50 rows of data and the loading for PC1 and PC2 for
each variable (i.e. each column) is represented as a point on a plot - so
a years' data will be represented as two lines (representing PC1 and PC2)
on a time series plot for each variable.



Laura Quinn
Institute of Atmospheric Science
School of Earth and Environment
University of Leeds
Leeds
LS2 9JT

tel: +44 113 343 1596
fax: +44 113 343 6716
mail: [EMAIL PROTECTED]

On Mon, 16 May 2005, Gavin Simpson wrote:

 Laura Quinn wrote:
  Please could someone point me in the right direction as I appear to be
  having a total mental block with fairly basic PCA problem!
 
  I have a large dataframe where rows represent independent
  observations and columns are variables. I am wanting to perform PCA
  sequentially on blocks of nrows at a time and produce a graphical output
  of the loadings for the first 2 EOFs for each variable.
 
  I'm sure I've performed a very similar routine in the past, but the
method
  is currently escaping me.
 
  Any help gratefully received!

 Hi Laura,

 data(iris)
 iris.dat - iris[,1:4]
 pca.1 - prcomp(iris.dat[1:50, ], scale = TRUE)
 pca.2 - prcomp(iris.dat[51:100, ], scale = TRUE)
 pca.3 - prcomp(iris.dat[100:150, ], scale = TRUE)

 biplot(pca.1)
 etc...

 There is a better way of subsetting this data set as the 5th col of iris
 is a factor and we could use the subset argument to prcomp to do the
 subsetting without having to know that there are 50 rows per species.
 Take a look at that argument if you have a variable that defines the
 blocks for you.

 Is this what you were after?

 All the best,

 Gav
 --
 %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Gavin Simpson [T] +44 (0)20 7679 5522
 ENSIS Research Fellow [F] +44 (0)20 7679 7565
 ENSIS Ltd.  ECRC [E] gavin.simpsonATNOSPAMucl.ac.uk
 UCL Department of Geography   [W] http://www.ucl.ac.uk/~ucfagls/cv/
 26 Bedford Way[W] http://www.ucl.ac.uk/~ucfagls/
 London.  WC1H 0AP.
 %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] multiple autocorrelation coefficients in spdep?

2005-04-26 Thread Ignacio Colonna
Thanks for the reply. I will look into the suggestions you have given.
Indeed, I have used an lme model with direct covariance representation via
geostatistical models, where I was able to fit separate terms for different
groups, but part of my point is to compare the outcome from both approaches.

Ignacio


-Original Message-
From: Roger Bivand [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, April 26, 2005 3:00 AM
To: Ignacio Colonna
Cc: R-help@stat.math.ethz.ch
Subject: Re: [R] multiple autocorrelation coefficients in spdep?

On Mon, 25 Apr 2005, Ignacio Colonna wrote:

 Hello,
 
 Has anyone modified the errorsarlm in the R package spdep to
 allow for more than a single spatial autocorrelation coefficient (i.e.
 'lambda')? 
 
 Or, if not, any initial suggestions on how to make that
 modification? I have looked at the source code for the function and
realize
 that any attempt to do it on my own would require much dedication, so
would
 like to check whether someone has done it already. My R programming skills
 are very elementary.
 
  
 
 Specifically I would need to specify 2 different lambdas in a dataset, one
 for each group. I am performing a regression of grain yield against a
number
 of soil variables, for 2 different years, where the regression includes
 terms for year*soil variables interaction. The spatial structure of the
 error is clearly different between these years, and I would thus like to
fit
 different lambdas to them, if possible. Of course one option is to just
run
 2 separate regressions, but if the possibility of fitting more than 1
lambda
 does not seem too remote, I think fitting a single model offers some
 advantages.
 

As far as I am aware, as package maintainer, this has not been done, and 
is not easy to do, as you have noted. It might be possible to use the 
lm.gls() function in the MASS package once the compound weights matrix 
(including the coefficients) has been fixed, perhaps using optim(). Have 
you considered other options perhaps including some lme variant, and/or 
spatial panel variants?

 Thanks in advance,
 
 Ignacio

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: [EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] random interactions in lme

2005-04-26 Thread Ignacio Colonna
The code below gives almost identical results for a split-block analysis in
lme and SAS proc mixed, in terms of variance components and F statistics. It
just extends the example in Pinheiro  Bates (p.162) to a split block
design. 

I am including below the SAS code and the data in case you want to try it.
The only difference between both is in the df for the F denominator, which I
wasn't able to compute correctly in lme, but this may be my ignorance on how
to correctly specify the model. It is not a big issue though, as the F
values are identical, so you can compute the p-values if you know how to
obtain the correct DenDF. 

# a split block design
spbl.an1-lme(yield~rowspace*ordered(tpop),random=list(rep=pdBlocked(list(pd
Ident(~1),
pdIdent(~rowspace-1),pdIdent(~ordered(tpop)-1,data=spblock)

* SAS code
proc mixed data=splitblock method=reml;
class rep rowspace tpop;
model yield=rowspace tpop rowspace*tpop;
random rep rep*rowspace rep*tpop;
run;


# data

rowspacetpoprep plotyield
9   60  1   133 19
9   120 1   101 19.5
9   180 1   117 22
9   240 1   132 19.4
9   300 1   116 23.9
18  60  1   134 15.8
18  120 1   102 26.2
18  180 1   118 21.9
18  240 1   131 20
18  300 1   115 23.3
9   60  2   216 20.6
9   120 2   233 22
9   180 2   201 23.4
9   240 2   217 28.2
9   300 2   232 25.9
18  60  2   215 19.7
18  120 2   234 30.3
18  180 2   202 22.4
18  240 2   218 27.9
18  300 2   231 28.5
9   60  3   309 20.8
9   120 3   308 21.6
9   180 3   324 24.6
9   240 3   340 25.3
9   300 3   325 35.3
18  60  3   310 17.2
18  120 3   307 23.6
18  180 3   323 24.9
18  240 3   339 30.7
18  300 3   326 33
9   60  4   435 15.6
9   120 4   403 20.4
9   180 4   430 24.4
9   240 4   414 21
9   300 4   419 23.2
18  60  4   436 17.7
18  120 4   404 23.6
18  180 4   429 21.7
18  240 4   413 24.4
18  300 4   420 26.2


Ignacio


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Douglas Bates
Sent: Monday, April 25, 2005 6:40 PM
To: Jacob Michaelson
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] random interactions in lme

Jacob Michaelson wrote:
 
 On Apr 24, 2005, at 8:52 AM, Douglas Bates wrote:
 
 Jacob Michaelson wrote:

 Hi All,
 I'm taking an Experimental Design course this semester, and have 
 spent many long hours trying to coax the professor's SAS examples 
 into something that will work in R (I'd prefer that the things I 
 learn not be tied to a license).  It's been a long semester in that 
 regard.
 One thing that has really frustrated me is that lme has an extremely 
 counterintuitive way for specifying random terms.  I can usually 
 figure out how to express a single random term, but if there are 
 multiple terms or random interactions, the documentation available 
 just doesn't hold up.
 Here's an example: a split block (strip plot) design evaluated in SAS 
 with PROC MIXED (an excerpt of the model and random statements):
 model DryMatter = Compacting|Variety / outp = residuals ddfm = 
 satterthwaite;
 random Rep Rep*Compacting Rep*Variety;
 Now the fixed part of that model is easy enough in lme: 
 DryMatter~Compacting*Variety
 But I can't find anything that adequately explains how to simply add 
 the random terms to the model, ie rep + rep:compacting + 
 rep:variety; anything to do with random terms in lme seems to go off 
 about grouping factors, which just isn't intuitive for me.


 The grouping factor is rep because the random effects are associated 
 with the levels of rep.

 I don't always understand the SAS notation so you may need to help me 
 out here.  Do you expect to get a single variance component estimate 
 for Rep*Compacting and a single variance component for Rep*Variety?  
 If so, you would specify the model in lmer by first creating factors 
 for the interaction of Rep and Compacting and the interaction of Rep 
 and Variety.

 dat$RepC - with(dat, Rep:Compacting)[drop=TRUE]
 dat$RepV - with(dat, Rep:Variety)[drop=TRUE]
 fm - lmer(DryMatter ~ Compacting*Variety+(1|Rep)+(1|RepC)+(1|RepV), dat)



 
 Thanks for the prompt reply.  I tried what you suggested, here's what I 
 got:
 
   turf.lme=lmer(dry_matter~compacting*variety+(1|rep)+(1|rc)+(1|rv), 
 turf.data)
 Error in lmer(dry_matter ~ compacting * variety + (1 | rep) + (1 | rc) +
:
 entry 3 in matrix[9,2] has row 3 and column 2
 
 Just to see what the 

[R] multiple autocorrelation coefficients in spdep?

2005-04-25 Thread Ignacio Colonna
Hello,

Has anyone modified the errorsarlm in the R package spdep to
allow for more than a single spatial autocorrelation coefficient (i.e.
'lambda')? 

Or, if not, any initial suggestions on how to make that
modification? I have looked at the source code for the function and realize
that any attempt to do it on my own would require much dedication, so would
like to check whether someone has done it already. My R programming skills
are very elementary.

 

Specifically I would need to specify 2 different lambdas in a dataset, one
for each group. I am performing a regression of grain yield against a number
of soil variables, for 2 different years, where the regression includes
terms for year*soil variables interaction. The spatial structure of the
error is clearly different between these years, and I would thus like to fit
different lambdas to them, if possible. Of course one option is to just run
2 separate regressions, but if the possibility of fitting more than 1 lambda
does not seem too remote, I think fitting a single model offers some
advantages.

 

Thanks in advance,

 

Ignacio

 


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] summing values by group

2005-03-24 Thread Ignacio Colonna
Maybe aggregate() is what you are looking for?

e.g. say your data frame is called 'mydata'

sum.by.CAT-aggregate(mydata,list(CAT),sum)

this will give you sums by CAT for all the variables in the data set and
will yield 'NA' for any character variables you may have.

Ignacio


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Larry White
Sent: Thursday, March 24, 2005 10:12 AM
To: R-help@stat.math.ethz.ch
Subject: [R] summing values by group

At the risk of being wacked for asking what should be obvious  

I have a data frame with one categorical variable CAT and several
numeric variables.  I want to be able to get simple statistics on the
numeric variables by level.  For example, just as you can use table
(CAT) to get the counts, I'd like to be able to get the means and sums
by category.

If someone could point me in the right direction, I'd appreciate it.
I've been through the SimpleR and Using R for Data Analysis... docs
and I'm still clueless.

thanks for your help.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] Bug on the stem function or in my brain ?

2005-03-21 Thread Ignacio Colonna
José,
Notice that the values to the left of the | in your stem plot are
all even. Odd numbers are included in the same line.

Try
 stem(time,scale=2)

  The decimal point is 1 digit(s) to the right of the |

   3 | 2
   4 | 17
   5 | 09
   6 | 4667789
   7 | 38
   8 | 3
   9 | 0335
  10 | 00
  11 | 0


ignacio

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Jose A. Hernandez
Sent: Monday, March 21, 2005 9:15 AM
To: r-help@stat.math.ethz.ch
Subject: [R] Bug on the stem function or in my brain ?

Good day R-ers!

I was running the basic statistics for the exam that my students took 
last week and something does not make sense with the stem() fucntion.

Here are two of my variables:

time, is time to complete the exam in minutes
exam.1, is the grade for the exam

In stem(), to the left of the vertical bar are the leading digits of the 
grades. To the right of the vertical bar are the last digits of the 
grades. Each single digit on the right represents one grade.

  time
  [1]  32  41  47  50  59  64  66  66  67  67  68  69  73  78  83  90 
93  93  95
[20] 100 100 110
  stem(time)

   The decimal point is 1 digit(s) to the right of the |

2 | 2
4 | 1709
6 | 466778938
8 | 30335
   10 | 000

The stem and leaf plot does not reflect the actual data, the bottom line 
for instance says there were 3 people that spent 100 minutes working on 
the test. The next to bottom line says there were one 80, three 83s, one 
85. And so forth.

  exam.1
  [1]  82 100  86  81  88  78  92  23  91  49  97   9  89  78  93  60 
80  80  83
[20]  94  51 100

  stem(exam.1)

   The decimal point is 1 digit(s) to the right of the |

0 | 9
2 | 3
4 | 91
6 | 088
8 | 0012368912347
   10 | 00

The Stem-and-Leaf plots DO NOT correspond to the data.

Any educational insights on this issue would be appreciated.

Regards,

Jose

  class(exam.1)
[1] numeric
  class(time)
[1] numeric

  version
  _
platform i386-pc-mingw32
arch i386
os   mingw32
system   i386, mingw32
status
major2
minor0.1
year 2004
month11
day  15
language R


-- 
Jose A. Hernandez
Ph.D. Candidate
Precision Agriculture Center

Department of Soil, Water, and Climate
University of Minnesota
1991 Upper Buford Circle
St. Paul, MN 55108

Ph. (612) 625-0445, Fax. (612) 625-2208

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] Re: Repeated Measures, groupedData and lme

2005-03-19 Thread Ignacio Colonna
Emma,
I am not an expert, but I have been trying to fit similar models.
Adding to Keith's reply to your question, I can suggest what I concluded was
the most reasonable model for my case. Based on Keith's Model1, you might
also want to allow for a correlation among years within each experimental
unit (I am assuming the experiment was conducted at the exact same location
over the 3 years). 
Say you want to impose an autoregressive, order 1 structure (you can
change this to any other structure you may consider appropriate) 
To do this at the block*treatment unit (the smallest experimental
unit size in your experiment), you can add to keith's code:

correlation=corAR1(form=~1|block/treatment)

thus the entire code would be
Model1-lme(mg~treatment + year + treatment:year, random=~1|block,
correlation=corAR1(form=~1|block/treatment),data=magnesium)

This results in a model with a certain covariance among all exp.units within
the same block, plus another covariance between pairs of years within the
same exp.unit, and this covariance decreases as the difference in time
increases.

You can see graphically the structure of this covariance by doing:
rho-0.3
ar1-corAR1(value=rho,form=~1|block/treatment)
ar1-Initialize(ar1,data=yourdata)
m1-corMatrix(ar1)
plot(m1$1/name of first treatment[,1])

Now, I really hope someone more knowledgeable is checking this out there and
will point out whether this is incorrect, as I have used it for my analysis
assuming was the correct approach. 

Ignacio


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Keith Wong
Sent: Friday, March 18, 2005 5:41 PM
To: r-help@stat.math.ethz.ch
Subject: [R] Re: Repeated Measures, groupedData and lme

Hello,

I'm an R-newbie, but I've been learning to use lme for repeated measures
experiments as well.

If I understand correctly: 
  Outcome variable: Mg (Kg/ha)
  Subject/grouping variable: block

  Condition/treatment: treatment (19 levels)
  Repeated factor: time (3 levels: 99, 02, 04)


I think if you specify the model formula in the lme call, then the formula
structure specified in the groupedData object is ignored.

One suggestion for the model:

Model1-lme(mg~treatment + year + treatment:year, random=~1|block,
data=magnesium)

If the question of interest is the treatment:year interaction

Or
Model2 - lme(mg~treatment, random=~1|block, data=magnesium)


Hope this helps ... and hope the experts chime in at this point to give more
guidance.

Keith


--quoting original post---
Hello

I am trying to fit a REML to some soil mineral data which has been
collected over the time period 1999 - 2004. I want to know if the 19
different treatments imposed, differ in terms of their soil mineral
content. A tree model of the data has shown differences between the
treatments can be attributed to the Magnesium, Potassium and organic
matter content of the soil, with Magnesium being the primary separating
variable.

I am looking at soil mineral data were collected : 99, 02, 04. 

In the experiment, there are 19 different treatments (treatmentcontrol,
treatment6TFYM, treatment 12TFYM etc),  which are replicated in 3
blocks.

For the magnesium soil data, I have created the following groupedData
object: 

magnesium-groupedData(Mg~year|treatment, inner=~block) 
Where mg=magnesium Kg/ha

As it is a repeated measures I was going to use an lme.  I have looked
at Pinherio and Bates : Mixed-Effects models in S and S-plus and I am
getting slightly confused.  In order to fit the lme, should I specify
the data to use in the model as the grouped structure model?

If so is the following command correct:

Model1-lme(mg~treatment, random=block|year, data=magnesium)? 

I am slightly worried that it isn't, because in model summary, instead
of listing the 19 different treatments in the fixed effects section, it
writes intercept (as normal), then treatment^1, treatment^2 etc.

However if I don't specify the groupedData object in the model, then in
the fixed effects section, it names the treatments (i.e. intercept,
treatmentcontrol, treatment6TFYM.

Should I be fitting the model using the whole data set rather than the
groupedData object?


Thank you very much for your help


Emma Pilgrim

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] How to plot points as numbers/strings in lattice

2005-03-15 Thread Ignacio Colonna
Owen,
I think this gives the plot you are looking for. There may be other
better ways to do it, this is just the one I know. Inside 'panel' you would
need to use 'ltext()' instead of 'text()', as in the example you provided.

xyplot(V1~V2, data=a, groups=V3,
panel = function(x, y, groups)
  ltext(x, y, label=groups, cex = .75)
)

Ignacio

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Owen Solberg
Sent: Tuesday, March 15, 2005 4:55 PM
To: r-help@stat.math.ethz.ch
Subject: [R] How to plot points as numbers/strings in lattice

Hello,

I would be very grateful if anyone could help with what seems like a 
simple lattice task.  I want to use xyplot, where the symbols for the 
plotted points are taken from another column in the data frame.  So if the 
data frame looked like:

a - as.data.frame(matrix(data=c(1,1,10,2,2,20,3,3,30), nrow=3, ncol=3,
byrow=TRUE))
a

   V1 V2 V3
1  1  1 10
2  2  2 20
3  3  3 30

you would get an xy scatter plot using where 10 (not a dot) is at 
coordinate 1,1, 20 is at 2,2, and so on.

I have made two attempts.  The first, below, almost works, but only takes 
the first character from the V3 column:

library(lattice)
xyplot(V1~V2, data=a, pch=as.character(a$V3))

I've also tried this example, which is given in an online user guide for 
trellis.  It uses the built-in ethanol data set (which is also in 
lattice), and subscripts... but when I try the same code in lattice, I get 
an invalid graphics state error.

library(lattice)
xyplot(NOx ~ E | C, data = ethanol, aspect = 1/2,
panel = function(x, y, subscripts)
  text(x, y, subscripts, cex = .75)
)


Thank you very much in advance!

Owen Solberg



 version
  _
platform i686-redhat-linux-gnu
arch i686
os   linux-gnu
system   i686, linux-gnu
status
major2
minor0.0
year 2004
month10
day  04
language R

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html