[R] RESIDUALS IN THE AR TIME SERIES FUNCTION

2005-07-04 Thread Talarico Massimiliano \(Xelion\)
Dear all,

It's possible to obtain the residuals with the AR time series function.

 

Thanks in advance,

Massimiliano Talarico

Pianificazione Commerciale - Area Crm

Unicredit Xelion Banca S.p.A.

Via Pirelli 32 - 20124 Milano

Tel.: 02 67360 525

Fax: 02 67738 525

www.xelion.it

 



__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] over/under flow

2005-07-04 Thread Martin Maechler
>>>>> "William" == William H Asquith [EMAIL PROTECTED]
>>>>>     on Sun, 3 Jul 2005 16:00:40 -0500 writes:

William Great, but to follow up, how do I select the bounds (B,E) on the root
William for an arbitrary machine?

William OVER = uniroot(function(x) lgamma(x)-log(.Machine$double.xmax), 
William c(B,E))$root

William If uniroot() is fast enough, is it appropriate for me to set B at say 1
William and E at log(.Machine$double.xmax)?  Suggestions on doing this the proper
William R way?  Perhaps this . . .

William OVER = uniroot(function(x) lgamma(x)-log(.Machine$double.xmax), 
William c(1,log(.Machine$double.xmax)))$root

William I am working on a package, so different machines will be involved; thus
William a simple c(171,172) might not be the best idea for the root?

Well,

1) We nowadays *require* compliance with
the usual ``IEEE arithmetic standard''  {put a bit too simplistically}
and I'd bet that those compiler / runtime library combinations
that can build R properly would also give cutoffs in this same interval.

2) lgamma(x) is very linear ``out there'' -- try
  curve(lgamma(x) - log(.Machine$double.xmax), 170, 180)
   --so you can easily enlarge the interval a bit, e.g. to
   (170,175) or even to (100,200)
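For instance, a machine-independent sketch along those lines (the wide
interval and the check below are an illustration, not from the thread):

OVER <- uniroot(function(x) lgamma(x) - log(.Machine$double.xmax),
                c(100, 200))$root
exp(lgamma(floor(OVER)))     # still finite on IEEE-double machines
exp(lgamma(ceiling(OVER)))   # Inf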

Martin


William On Jul 3, 2005, at 3:43 PM, Peter Dalgaard wrote:

 William H. Asquith [EMAIL PROTECTED] writes:
 
 I am porting some FORTRAN to R in which an Inf triggers an if().  The
 trigger is an infinite value of exp(lgamma(OVER)).  What is the canonical R
 style of determining OVER when exp(OVER) == Inf?  The code structure
 that I am
 porting is best left intact--so I need to query R somehow to the value
 of OVER that causes exp(lgamma(OVER)) to equal Inf.
 
 On my system,
 exp(lgamma(171)) is about the first to equal Inf.
 
 I asked similar question a few weeks ago on exp(OVER) and got the
 answer back as log(.Machine$double.xmax).  I now have the lgamma
 involved.  I think the answer is the OVER such that
 
 .Machine$double.xmax = lgamma(OVER),
 
 Not quite... (see below)
 
 but I am not sure how to invert or solve for OVER
 
 
 uniroot(function(x) lgamma(x)-log(.Machine$double.xmax), c(171,172))
 $root
 [1] 171.6244
 
 $f.root
 [1] -1.462051e-07
 
 $iter
 [1] 3
 
 $estim.prec
 [1] 6.103516e-05
 
 
 -- 
 O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
 c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph: (+45) 
 35327918
 ~~ - ([EMAIL PROTECTED])  FAX: (+45) 
 35327907
 

William __
William R-help@stat.math.ethz.ch mailing list
William https://stat.ethz.ch/mailman/listinfo/r-help
William PLEASE do read the posting guide! 
http://www.R-project.org/posting-guide.html



__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] RESIDUALS IN THE AR TIME SERIES FUNCTION

2005-07-04 Thread Gabor Grothendieck
On 7/4/05, Talarico Massimiliano (Xelion)
[EMAIL PROTECTED] wrote:
 Dear all,
 
 It's possible to obtain the residuals with AR time series function.

There is no residuals method for the ar class but you could
try this using the lh dataset as an example.  Note that the
zoo package is required in order to use zoo's lag function,
which has been enhanced to take a vector of lags, and dyn 
of the dyn package is used to add dynamic regression 
capabilities to lm.  Also, this example uses features in
the most recent versions of the dyn and zoo packages
so be sure you have the latest ones from CRAN.

ar(lh)
library(dyn) # this also brings in zoo
lh.zoo <- as.zoo(lh)
lh.lm <- dyn$lm(lh.zoo ~ lag(lh.zoo, -seq(3)))
lh.lm # seems close to the coefficients calculated by ar
resid(lh.lm)

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] RESIDUALS IN THE AR TIME SERIES FUNCTION

2005-07-04 Thread Achim Zeileis
On Mon, 4 Jul 2005, Talarico Massimiliano (Xelion) wrote:

 Dear all,

 It's possible to obtain the residuals with AR time series function.

Yes, that's true if you refer to the function ar().
Or was it a question? If so, look at the `value' section of
  help(ar)
which gives you the answer.
Z
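For instance, a minimal sketch with the lh series used elsewhere in this
thread (the 'resid' component is what that section documents):

fit <- ar(lh)
fit$resid   # residuals/innovations; the leading values are NA, one per fitted AR lag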



 Thanks in advance,

 Massimiliano Talarico

 Pianificazione Commerciale - Area Crm

 Unicredit Xelion Banca S.p.A.

 Via Pirelli 32 - 20124 Milano

 Tel.: 02 67360 525

 Fax: 02 67738 525

 www.xelion.it





 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Symbolic Maximum Likelihood in R

2005-07-04 Thread Gabor Grothendieck
On 7/3/05, Doran, Harold [EMAIL PROTECTED] wrote:
 Dear List:
 
 Is any one aware of a package that would extend the D() function and allow 
 for one to maximize a likelihood function symbolically? Something akin to 
 Solve[x==0, parameter] function in Mathematica?
 
 Clearly R has the capacity to _compute_ MLEs given a set of data. But, I'm 
 looking for a package that would allow for me to define the likelihood 
 function, find the 1st order partial derivative of this function (which can 
 already be handled in D()), and then symbolically maximize this function?
 

There are a number of free symbolic mathematics packages such as
Maxima and Yacas.  Here is a simple example in Yacas.  (Also
see http://finzi.psych.upenn.edu/R/Rhelp02a/archive/31418.html
for transferring functions from yacas to R).

C:\usr\yacas> yacas
[...various startup messages...]
In> f(x) := Exp(-(x-mu)^2)/Sqrt(2*Pi)
Out> True;
In> logf(x) := Ln(f(x));
Out> True;
In> loglik := logf(x1) + logf(x2) + logf(x3);
Out> Ln(Exp(-(x1-mu)^2)/Sqrt(2*Pi))+Ln(Exp(-(x2-mu)^2)/Sqrt(2*Pi))+Ln(Exp(-(x3-mu)^2)/Sqrt(2*Pi));
In> score := D(mu) loglik;
Out> (-(-2)*(x2-mu)*Exp(-(x2-mu)^2)*2*Pi)/(2*Pi*Exp(-(x2-mu)^2))-((-2)*(x1-mu)*Exp(-(x1-mu)^2)*2*Pi)/(2*Pi*Exp(-(x1-mu)^2))-((-2)*(x3-mu)*Exp(-(x3-mu)^2)*2*Pi)/(2*Pi*Exp(-(x3-mu)^2));
In> score := Simplify(score);
Out> 2*(x2+(-3)*mu+x1+x3);
In> Solve(score == 0, mu);
Out> (x2+x1+x3)/3;
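Within R itself, a rough sketch of the same calculation (not a symbolic
solver; the data values x1, x2, x3 below are made up for illustration) is to
let D() produce the score and then find its root numerically:

loglik <- expression(-(x1 - mu)^2 - (x2 - mu)^2 - (x3 - mu)^2)  # log-likelihood up to constants
score  <- D(loglik, "mu")                                       # symbolic derivative
f <- function(mu) eval(score, list(mu = mu, x1 = 1, x2 = 2, x3 = 6))
uniroot(f, c(-10, 10))$root   # approximately (1 + 2 + 6)/3 = 3, the sample mean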

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] A faster way to aggregate?

2005-07-04 Thread Dieter Menne
Dear List,

I have a logical data frame with NA's and a grouping factor, and I want to
calculate the % TRUE per column and group. With an indexed database, results
are mainly limited by printout time, but my R solution below lets me wait
(there are about 10* cases in the real data set).
Any suggestions to speed this up? Yes, I could wait for the result in real
life, but I am just curious whether I did something wrong. In real life the
data set is ordered by groups, but how can I use this with a data frame?

Dieter Menne


# Generate test data
ncol = 20
nrow = 2
ngroup=nrow %/% 20
colrow=ncol*nrow
group = factor(floor(runif(nrow)*ngroup))
sc = data.frame(group, matrix(ifelse(runif(colrow) > 0.1, runif(colrow) > 0.3, NA),
                nrow=nrow))

# aggregate
system.time ({
 s = aggregate(sc[2:(ncol+1)],list(group = group),
function(x) {
   xt=table(x)
   as.integer(100*xt[2]/(xt[1]+xt[2]))
}
  )
})
# 26.09  0.03 26.95NANA

# by and apply
system.time ({
  s = by (sc[2:(ncol+1)],group,function(x) {
 apply(x,2,function(x) {
 xt=table(x)
 as.integer(100*xt[2]/(xt[1]+xt[2]))
   }
 )
})
  s=do.call(rbind,s)
})

# 82.89  0.18 85.16NANA

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] compare two lists with differents levels

2005-07-04 Thread Navarre Sabine
Hi,
I would like to compare 2 lists resulting from an SQL query, but they have
different levels, so when I try to do:
 

release1 <- sqlQuery(channel, paste("select distinct c.ID, c.Title TitleCrit from
category cat, category_criteria cc, criteria c, question_criteria qc, question
q, form_question fq, form f, release_form rf, release r, product_release pr,
product p where cat.ID=cc.category and cc.criteria=c.ID and c.ID=qc.criteria
and qc.question=q.ID and fq.question=q.ID and fq.form=f.ID and f.ID=rf.form and
rf.release=r.ID and r.ID=pr.release and pr.product=p.ID and
r.ID='", param1, "';", sep=""))

release2 <- sqlQuery(channel, paste("select distinct c.ID, c.Title TitleCrit from
category cat, category_criteria cc, criteria c, question_criteria qc, question
q, form_question fq, form f, release_form rf, release r, product_release pr,
product p where cat.ID=cc.category and cc.criteria=c.ID and c.ID=qc.criteria
and qc.question=q.ID and fq.question=q.ID and fq.form=f.ID and f.ID=rf.form and
rf.release=r.ID and r.ID=pr.release and pr.product=p.ID and
r.ID='", param2, "';", sep=""))

data_NA <- matrix(data=0, nrow=length(release$Title), ncol=length(formv$TitleCrit), byrow=TRUE,
 dimnames=list(as.factor(release$Title),
               paste(as.factor(formv$TitleCrit), "(", formv$first_letter, ")", sep="")))

for (i in 1:length(formv$TitleCrit))
{
   for(k in 1:length(release1$TitleCrit))
   {
      if(formv$TitleCrit[i]==release1$TitleCrit[k])
      {
         data_NA[1,formv$TitleCrit[[i]]] <- 1
      }
      else{data_NA[1,formv$TitleCrit[[i]]] <- NA}
   }
}

for (i in 1:length(formv$TitleCrit))
{
   for(k in 1:length(release2$TitleCrit))
   {
      if(formv$TitleCrit[i]==release2$TitleCrit[k])
      {
         data_NA[2,formv$TitleCrit[[i]]] <- 1
      }
      else{data_NA[2,formv$TitleCrit[[i]]] <- NA}
   }
}

 

In R I get: Error in Ops.factor(formv$TitleCrit[i], release2$TitleCrit[k]) : level 
sets of factors are different

How can I compare these 2 lists?
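(A minimal generic sketch of the usual workaround, not taken from this
thread: factors with different level sets cannot be compared directly, but
their character values can.)

f1 <- factor(c("a", "b", "c"))
f2 <- factor(c("b", "c", "d"))
as.character(f1) == as.character(f2)     # element-wise comparison now works
as.character(f1) %in% as.character(f2)   # membership test, independent of order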

Thanks

 

SABINE

 




__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Rotate legends or other approaches to nice legend placement?

2005-07-04 Thread Alex Brown
I'm sure this general sort of question has been asked many times before 
- I would _like_ automatic and sensible legend placement in barplots so 
data is not overwritten... but since there doesn't seem to be one, one 
of the following would be useful:

One approach for this would be to place the legend to the right of the 
graph, and rotate it by 90 degrees.

Is there a sensible way to do this?

alternatively, is there a function to

1) estimate legend size
2) adjust nrows so that the full width of the drawing device is used, 
minimising height
3) use layout() so that enough space is allocated beneath the graph for 
the legend
4) draw legend
5) allow user to call plot, correctly drawing the plot in the remaining 
frame?

I have taken a look at this, but I am confused by the different units 
used by par(mar), legend(plot=F), and layout.

-Alex Brown

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] loop over large dataset

2005-07-04 Thread Federico Calboli
In my absentmindedness I'd forgotten to CC this to the list... and  
BTW, using gc() in the loop increases the runtime...


 My suggestion is that you try to vectorize the computation as much  
 as you
 can.

 From what you've shown, `new' and `ped' need to have the same  
 number of
 rows, right?

 Your `off' function seems to be randomly choosing between columns  
 1 and 2
 from its two input matrices (one row each?).  You may want to do the
 sampling all at once instead of looping over the rows.  E.g.,



 (m <- matrix(1:10, ncol=2))
      [,1] [,2]
 [1,]    1    6
 [2,]    2    7
 [3,]    3    8
 [4,]    4    9
 [5,]    5   10

 (colSample <- sample(1:2, nrow(m), replace=TRUE))
 [1] 1 1 2 1 1

 (x <- m[cbind(1:nrow(m), colSample)])
 [1] 1 2 8 4 5

 So you might want to do something like (obviously untested):

 todo <- ped[,3] * ped[,5] != 0  ## indicator of which rows to work on
 n.todo <- sum(todo)  ## how many are there?
 sire <- new[ped[todo, 3], ]
 dam <- new[ped[todo, 5], ]
 s.gam <- sire[1:nrow(sire), sample(1:2, nrow(sire), replace=TRUE)]
 d.gam <- dam[1:nrow(dam), sample(1:2, nrow(dam), replace=TRUE)]
 new[todo, 1:2] <- cbind(s.gam, d.gam)
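(A small, made-up illustration of why the whole-matrix indexing pays off
compared with a row-by-row loop; the sizes are arbitrary.)

 m <- matrix(runif(2e5), ncol = 2)
 pick <- sample(1:2, nrow(m), replace = TRUE)
 system.time(x1 <- m[cbind(1:nrow(m), pick)])      # vectorised, whole-matrix indexing
 system.time({
     x2 <- numeric(nrow(m))
     for (i in 1:nrow(m)) x2[i] <- m[i, pick[i]]   # element-by-element loop
 })
 identical(x1, x2)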



 Improving the efficiency of the code is obviously a plus, but the  
 real thing I am mesmerised by is the sheer increase in runtime...  
 how come it is not a linear increase with dataset size?

 Cheers,

 Federico

 --
 Federico C. F. Calboli
 Department of Epidemiology and Public Health
 Imperial College, St. Mary's Campus
 Norfolk Place, London W2 1PG

 Tel +44 (0)20 75941602   Fax +44 (0)20 75943193

 f.calboli [.a.t] imperial.ac.uk
 f.calboli [.a.t] gmail.com



__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] A faster way to aggregate?

2005-07-04 Thread Gabor Grothendieck
On 7/4/05, Dieter Menne [EMAIL PROTECTED] wrote:
 Dear List,
 
 I have  a logical data frame with NA's and a grouping factor, and I want to
 calculate
 the % TRUE per column and group. With an indexed database, result are mainly
 limited by printout time, but my R-solution below let's me wait (there are
 about 10* cases in the real
 data set).
 Any suggestions to speed this up? Yes, I could wait for the result in real
 life, but just curious if I did something wrong. In real life, data set is
 ordered by groups, but how can I use this with a data frame?
 
 Dieter Menne
 
 
 # Generate test data
 ncol = 20
 nrow = 2
 ngroup=nrow %/% 20
 colrow=ncol*nrow
 group = factor(floor(runif(nrow)*ngroup))
 sc = data.frame(group, matrix(ifelse(runif(colrow) > 0.1, runif(colrow) > 0.3, NA),
 nrow=nrow))
 
 # aggregate
 system.time ({
  s = aggregate(sc[2:(ncol+1)],list(group = group),
function(x) {
   xt=table(x)
   as.integer(100*xt[2]/(xt[1]+xt[2]))
}
  )
 })
 # 26.09  0.03 26.95NANA
 
 # by and apply
 system.time ({
  s = by (sc[2:(ncol+1)],group,function(x) {
 apply(x,2,function(x) {
 xt=table(x)
 as.integer(100*xt[2]/(xt[1]+xt[2]))
   }
 )
})
  s=do.call(rbind,s)
 })
 
 # 82.89  0.18 85.16NANA
 

Look at ?rowsum
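(For reference, a tiny sketch of what rowsum() does: fast group-wise column sums.)

m <- matrix(1:6, ncol = 2)
g <- c("a", "b", "a")
rowsum(m, g)   # one row per group; row "a" is the sum of rows 1 and 3 of m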

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] loop over large dataset

2005-07-04 Thread Uwe Ligges
Federico Calboli wrote:

 In my absentmindedness I'd forgotten to CC this to the list... and  
 BTW, using gc() in the loop increases the runtime...

If the data size increases, you cannot expect linear run time behaviour, 
e.g. because gc() is called more frequently. And of course, gc() needs 
some time, hence you get the expected increase in runtime. This answers 
your other question as well.

Uwe Ligges


My suggestion is that you try to vectorize the computation as much  
as you
can.

From what you've shown, `new' and `ped' need to have the same  
number of
rows, right?

Your `off' function seems to be randomly choosing between columns  
1 and 2
from its two input matrices (one row each?).  You may want to do the
sampling all at once instead of looping over the rows.  E.g.,




(m <- matrix(1:10, ncol=2))
     [,1] [,2]
[1,]    1    6
[2,]    2    7
[3,]    3    8
[4,]    4    9
[5,]    5   10

(colSample <- sample(1:2, nrow(m), replace=TRUE))
[1] 1 1 2 1 1

(x <- m[cbind(1:nrow(m), colSample)])
[1] 1 2 8 4 5

So you might want to do something like (obviously untested):

todo <- ped[,3] * ped[,5] != 0  ## indicator of which rows to work on
n.todo <- sum(todo)  ## how many are there?
sire <- new[ped[todo, 3], ]
dam <- new[ped[todo, 5], ]
s.gam <- sire[1:nrow(sire), sample(1:2, nrow(sire), replace=TRUE)]
d.gam <- dam[1:nrow(dam), sample(1:2, nrow(dam), replace=TRUE)]
new[todo, 1:2] <- cbind(s.gam, d.gam)



Improving the efficiency of the code is obviously a plus, but the  
real thing I am mesmerised by is the sheer increase in runtime...  
how come it is not a linear increase with dataset size?

Cheers,

Federico

--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St. Mary's Campus
Norfolk Place, London W2 1PG

Tel +44 (0)20 75941602   Fax +44 (0)20 75943193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com


 
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] To Predict for a vector

2005-07-04 Thread Talarico Massimiliano \(Xelion\)
Dear all,

Is it possible in R to predict the new values for a vector?

For example I have these vectors:

 

V1 at time 1 is [1,3,6,3,8,5,3]

V2 at time 2 is [5,3,7,8,9,5,4]

V3 at time 3 is [7,5,3,2,1,7,5]





Vn at time n is [5,4,3,7,8,9,3]

 

I want to estimate the vector at time n+1, with some package in R.

 

Thanks in advance.

MT

 



__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] A faster way to aggregate?

2005-07-04 Thread Dieter Menne
My Original question (edited)

 I have  a logical data frame with NA's and a grouping factor, and I want to
 calculate
 the % TRUE per column and group. With an indexed database, result are mainly
 limited by printout time, but my R-solution below lets me wait.
 Any suggestions to speed this up? 

Gabor Grothendieck ggrothendieck at gmail.com writes:

 Look at ?rowsum

The nearby colMeans works, but why is it so slow?

Dieter Menne

# Generate test data
ncol = 20
nrow = 2
ngroup=nrow %/% 20
colrow=ncol*nrow
group = factor(floor(runif(nrow)*ngroup))
sc = data.frame(group, matrix(ifelse(runif(colrow) > 0.1, runif(colrow) > 0.3, NA),
 nrow=nrow))

# aggregate (still best)
system.time ({
 s = aggregate(sc[2:(ncol+1)],list(group = group),
function(x) {
   xt=table(x)
   as.integer(100*xt[2]/(xt[1]+xt[2]))
}
  )
})
# 26.09  0.03 26.95NANA

# by and apply
system.time ({
  s1 = by (sc[2:(ncol+1)],group,function(x) {
 as.integer(100*colMeans(x,na.rm=T))

})
  s1=as.data.frame(do.call(rbind,s))
})

#  51.49  0.93 52.60NANA

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] loop over large dataset

2005-07-04 Thread Federico Calboli

On 4 Jul 2005, at 12:41, Uwe Ligges wrote:

 Federico Calboli wrote:


 In my absentmindedness I'd forgotten to CC this to the list...  
 and  BTW, using gc() in the loop increases the runtime...


 If the data size increases, you cannot expect linear run time  
 behaviour, e.g. because gc() is called more frequently. And of  
 course, gc() needs some time, hence you get the expected increase  
 in runtime. This answers you other question as well.

Is it then the internal gc() calls that increase the runtime from 5 minutes  
to more than 24 hours for a 27x increase in data (given that the code  
is exactly the same)?

Federico

--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St. Mary's Campus
Norfolk Place, London W2 1PG

Tel +44 (0)20 75941602   Fax +44 (0)20 75943193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] RMySQL typing Problem (bigint unsigned)

2005-07-04 Thread Dubravko Dolic
Dear Group,

 

if anyone has experience with the RMySQL package, maybe this behaviour is known:

 

Reading data from a table into R, the fields with datatype bigint(20) unsigned 
are transformed in some way: e.g. the query select * from orders where userid 
= 14929859848712890325 selects the correct case, but in R the userid is changed 
to 14929859848712890368. What happened here? This transformation is true for 
all fields of that type...

 

Thank You for any help

 

 

 

 

 

Dubravko Dolic

Statistical Analyst

Tel:  +49 (0)89-55 27 44 - 4630

Fax: +49 (0)89-55 27 44 - 2463

Email: [EMAIL PROTECTED]

Komdat GmbH
Nymphenburger Straße 86
80636 München
-
ONLINE MARKETING THAT WORKS
-
This electronic message contains information from Komdat Gmb...{{dropped}}

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] loop over large dataset

2005-07-04 Thread Peter Dalgaard
Federico Calboli [EMAIL PROTECTED] writes:

  behaviour, e.g. because gc() is called more frequently. And of  
  course, gc() needs some time, hence you get the expected increase  
  in runtime. This answers you other question as well.
 
 Is then internal gc() calls that increase the runtime from 5 minutes  
 to more then 24 hours for a 27x increase in data (given that the code  
 is exactely the same)?

Your original code got lost in the threading, but that order of
magnitude suggests that you have N^2/2 behaviour somewhere. The typical
culprit is code like

x <- numeric(0)
for (i in 1:N){
  newx <- 
  x <- c(x, newx)
} 

in which the extension of x causes the whole thing to be reallocated
and copied. Same thing with cbind and rbind constructs of course.
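(A sketch of the usual fix: allocate the full-length result once and fill it
in place rather than growing it; the sizes and values here are arbitrary.)

N <- 1e5
x <- numeric(N)               # one allocation up front
for (i in 1:N) x[i] <- i^2    # each iteration writes in place, no repeated reallocation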



-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph: (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] loop over large dataset

2005-07-04 Thread Federico Calboli

On 4 Jul 2005, at 15:15, Peter Dalgaard wrote:

 Your original code got lost in the threading, but that order of
 magnitude suggests that you have N^2/2 behaviour somewhere. The  
 typical
 culprit is code like

 x <- numeric(0)
 for (i in 1:N){
   newx <- 
   x <- c(x, newx)
 }

 in which the extension of x causes the whole thing to be reallocated
 and copied. Same thing with cbind and rbind constructs of course.


I changed my code a bit, and now the runtime is down to less than a  
minute (from more than 24 hours). I was copying a large dataset many  
times over; when I extracted the columns I need as independent vectors,  
runtime dropped like a stone.

Cheers,

Federico

--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St. Mary's Campus
Norfolk Place, London W2 1PG

Tel +44 (0)20 75941602   Fax +44 (0)20 75943193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] partial autoregression matrix function

2005-07-04 Thread Samuel E. Kemp
Hi,

Does anyone know of a function in R that is capable of calculating the 
partial autoregression matrix function for vector autoregressive moving 
average (VARMA) models?

The dse package has functions capable of simulating and estimating 
VARMA models, but I did not notice a function for model identification.

Any help would be greatly appreciated.

Kind regards,

Sam.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] RMySQL typing Problem (bigint unsigned)

2005-07-04 Thread Duncan Murdoch
On 7/4/2005 9:50 AM, Dubravko Dolic wrote:
 Dear Group,
 
  
 
 if anyone has experience with the RMySQL Package maybe this behaviour is know:
 
  
 
 Reading data from a table into R the fields with datatype bigint(20) unsigned 
 are transformed in some way: e.g. the query select * from orders where 
 userid = 14929859848712890325 selects the correct case but in R the userid 
 is changed to 14929859848712890368. What happened here? This transformation 
 is true for all fields of that type...

R doesn't have a bigint type, so I imagine these are being changed to 
doubles.  In double precision those are the same number.

I don't know the best way to handle this, but one way would be to do SQL 
calculations to extract the lower 10 digits separately from the upper 10 
digits.  R doubles can represent 10 digit integers exactly.
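A sketch of that idea (the table and column names, and the open connection
'con', are illustrative assumptions, not from the post):

q <- paste("select userid div 10000000000 as hi,",
           "userid mod 10000000000 as lo from orders")
## d <- dbGetQuery(con, q)   # each 10-digit half is represented exactly as an R double
## paste(d$hi, formatC(d$lo, format = "f", digits = 0, width = 10, flag = "0"),
##       sep = "")           # reassemble the 20-digit id as a character string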

Duncan Murdoch

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] A faster way to aggregate?

2005-07-04 Thread Gabor Grothendieck
On 7/4/05, Dieter Menne [EMAIL PROTECTED] wrote:
 My Original question (edited)
 
  I have  a logical data frame with NA's and a grouping factor, and I want to
  calculate
  the % TRUE per column and group. With an indexed database, result are mainly
  limited by printout time, but my R-solution below lets me wait.
  Any suggestions to speed this up?
 
 Gabor Grothendieck ggrothendieck at gmail.com writes:
 
  Look at ?rowsum
 
 Nearby colMeans works, but why so slow?
 
 Dieter Menne
 
 # Generate test data
 ncol = 20
 nrow = 2
 ngroup=nrow %/% 20
 colrow=ncol*nrow
 group = factor(floor(runif(nrow)*ngroup))
 sc = data.frame(group, matrix(ifelse(runif(colrow) > 0.1, runif(colrow) > 0.3, NA),
 nrow=nrow))
 
 # aggregate (still best)
 system.time ({
  s = aggregate(sc[2:(ncol+1)],list(group = group),
function(x) {
   xt=table(x)
   as.integer(100*xt[2]/(xt[1]+xt[2]))
}
  )
 })
 # 26.09  0.03 26.95NANA
 
 # by and apply
 system.time ({
  s1 = by (sc[2:(ncol+1)],group,function(x) {
 as.integer(100*colMeans(x,na.rm=T))
 
})
  s1=as.data.frame(do.call(rbind,s))
 })
 
 #  51.49  0.93 52.60NANA
 

Note that you did not actually try my suggestion which was rowsum,
not colMeans.

The following solution based on rowsum is more than
an order of magnitude faster than any of the solutions in your
posts:

sc1 <- as.matrix(sc[,-1])
is.na.sc1 <- is.na(sc1)
x1 <- rowsum(ifelse(is.na.sc1, 0, sc1), group)
xx <- rowsum(1 - is.na.sc1, group)
res <- floor(100*x1/xx)

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Rotate legends or other approaches to nice legend placement?

2005-07-04 Thread Marc Schwartz
On Mon, 2005-07-04 at 11:19 +0100, Alex Brown wrote:
 I'm sure this general sort of question has been asked many times before 
 - I would _like_ automatic and sensible legend placement in barplots so 
 data is not overwritten... but since there doesn't seem to be one, one 
 of the following would be useful:
 
 One approach for this would be to place the legend to the right of the 
 graph, and rotate it by 90 degrees.
 
 Is there a sensible way to do this?
 
 alternatively, is there a function to
 
 1) estimate legend size
 2) adjust nrows so that the full width of the drawing device is used, 
 minimising height
 3) use layout() so that enough space is allocated beneath the graph for 
 the legend
 4) draw legend
 5) allow user to call plot, correctly drawing the plot in the remaining 
 frame?
 
 I have taken a look at this, but I am confused by the different units 
 used by par(mar), legend(plot=F), and layout.
 
 -Alex Brown

For placing the legend outside the plot region, see this post from just
a few days ago:

https://stat.ethz.ch/pipermail/r-help/2005-July/073207.html


In terms of automating the process, that usually means making some
assumptions and then writing code to fit the assumptions, while
providing options to handle the cases that don't.

Briefly, one approach to automating legend placement within the plot
region might be something like this. Create a new function we'll call
barplotL() (not feeling overly creative this morning). 

Basically, it figures out the maximum y value from the height argument
and multiplies that by 1.5 to provide extra room at the top of the plot
region for the legend. You can of course adjust this factor as required
(ie. less room is needed for a horizontal legend).

For the upper left hand corner (ULHC) of the legend, it takes the range
of the x and y axes and then places the ULHC at 5%/95% of the respective
ranges from the ULHC of the plot region. See ?par (specifically 'usr').

It provides options for coloring the legend boxes (in lieu of the
default grey) and for making the legend horizontal instead of vertical. 

You can add other options as well, but this should get you started.

BTW, this approach presumes that 'height' will be a matrix, since I am
not sure that a legend makes sense otherwise...


barplotL <- function(height, beside = FALSE, 
                     legend = NULL, col = NULL,
                     leg.horiz = FALSE)
{
  ylim <- ifelse(beside, 
                 max(height) * 1.5, 
                 max(colSums(height) * 1.5)) 

  barplot(height = height, ylim = c(0, ylim), 
          beside = beside, col = col)

  x.pos <- par("usr")[1] + ((par("usr")[2] - par("usr")[1]) * .05)
  y.pos <- par("usr")[4] - ((par("usr")[4] - par("usr")[3]) * .05)

  if (is.null(col))
    col <- grey.colors(nrow(height))

  if (is.null(legend))
    legend <- rownames(height) 

  legend(x.pos, y.pos, legend = legend, fill = col,
         horiz = leg.horiz)
}



So, let's try it:


barplotL(VADeaths, beside = FALSE)

barplotL(VADeaths, beside = FALSE, leg.horiz = TRUE)

barplotL(VADeaths, beside = TRUE, 
         col = c("red", "yellow", "orange"))

barplotL(VADeaths, beside = TRUE, leg.horiz = TRUE, 
         col = c("red", "yellow", "orange"))


You can also look at the smartlegend() function in the gplots package on
CRAN, but you still need to adjust the y axis ranges as above to make
room for the legend itself.

HTH,

Marc Schwartz

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Lack of independence in anova()

2005-07-04 Thread Phillip Good
 If the observations are normally distributed and the 2xk design is
balanced,  theory requires that the tests for interaction and row effects be
independent.  In my program, appended below, this would translate to cntT
(approx)= cntR*cntI/N if all R routines were functioning correctly.  They
aren't.

sim2=function(size,N,p){
  cntR=0
  cntC=0
  cntI=0
  cntT=0
  cntP=0
  for(i in 1:N){
    #generate data
    v=gendata(size)
    #analyze after build(ing) design containing data
    lm.out=lm(yield~c*r,build(size,v))
    av.out=anova(lm.out)
    #if column effect is significant, increment cntC
    if (av.out[[5]][1] <= p) cntC=cntC+1
    #if row effect is significant, increment cntR
    if (av.out[[5]][2] <= p){
      cntR=cntR+1
      tmp = 1
    }
    else tmp = 0
    if (av.out[[5]][3] <= p){
      #if interaction is significant, increment cntI
      cntI=cntI+1
      #if both interaction and row effect are significant, increment cntT
      cntT=cntT + tmp
    } 
  }
  list(cntC=cntC, cntR=cntR, cntI=cntI, cntT=cntT)
}

build=function(size,v){
  #size is a vector containing the sample sizes
  col=c(rep(0,size[1]),rep(1,size[2]),rep(2,size[3]),rep(3,size[4]),
        rep(0,size[5]),rep(1,size[6]),rep(2,size[7]),rep(3,size[8]))
  row=c(rep(0,size[1]+size[2]+size[3]+size[4]),
        rep(1,size[5]+size[6]+size[7]+size[8]))
  return(data.frame(c=factor(col), r=factor(row),yield=v))
}

gendata=function(size){
  ssize=sum(size)
  return(rnorm(ssize))
}

#Example
 size=c(3,3,3,0,3,3,3,0)
 sim2(size,1,10,.16)



Phillip Good
Huntington Beach CA

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] A faster way to aggregate?

2005-07-04 Thread Dieter Menne
Gabor Grothendieck ggrothendieck at gmail.com writes:

 Note that you did not actually try my suggestion which was rowsum,
 not colMeans.

Mea culpa, I was hooked by rowSums, and so I did not become aware of the 
grouping facility of rowsum().

 
 The following solution based on rowsum is more than
 an order of magnitude faster than any of the solutions in your
 posts:
 
   sc1 <- as.matrix(sc[,-1])
   is.na.sc1 <- is.na(sc1)
   x1 <- rowsum(ifelse(is.na.sc1, 0, sc1), group)
   xx <- rowsum(1 - is.na.sc1, group)
   res <- floor(100*x1/xx)

Thanks a lot.

Dieter

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Windows compile

2005-07-04 Thread Philip Bermingham
That worked great, thank you, but it seems I have a new error occurring 
after the zipping of the help files:

hhc: not found
cp: cannot stat `C:/R/R-2.1.1/src/library/base/chm/base.chm': No such 
file or di
rectory
make[3]: *** [chm-base] Error 1
make[2]: *** [pkg-base] Error 2
make[1]: *** [rpackage] Error 2
make: *** [all] Error 2

Any advice?
Philip Bermingham

Duncan Murdoch wrote:

 On 7/4/2005 8:51 AM, Philip Bermingham wrote:

 Where would I specify the path to this file? I checked the mkrules 
 file but it doesn't mention it there.  


 The path is the system path.  You specify it in your shell, before 
 starting make, or in one of the Control Panel settings (if you want to 
 make the change system wide).

 I'd suggest writing a batch file that sets the path, and run it before 
 a build.  Then you don't need to worry about side effects of changes 
 on other programs.

 It will look something like this:

 path=.;f:\r\tools\bin;f:\perl\bin;f:\minGW\bin;f:\cygwin\bin;c:\util\misc;c:\windows\command;F:\texmf\miktex\bin;c:\progra~1\htmlhe~1
  


 but this is from a very old batch file (I don't use the Windows shell 
 any more), and of course the paths to the programs will likely be 
 different on your system than on mine.

 Also I did a search for Rpwd.exe and it seems I don't have that 
 file.  I have Rpwd.c.  It seems I'm missing an important step in the 
 build processes, but I don't know what it is.


 Rpwd.exe is made in one of the first steps of the build process.  What 
 you missed is setting up the path so that the build can proceed.

 Duncan Murdoch


 Philip.

 Duncan Murdoch wrote:

 Philip Bermingham wrote:

 I'm trying to compile R on Windows 2003 Server and following the
 instruction laid out in R inst and admin manual I continue to get this
 error:

  

 make: ./Rpwd.exe: Command not found

 make[1]: ./Rpwd.exe: Command not found

 /cygdrive/d/rp/tools/bin/make --no-print-directory -C front-ends Rpwd

 /cygdrive/d/rp/tools/bin/make -C ../../include -f Makefile.win version

 make[3]: sh.exe: Command not found



 sh.exe is one of the programs in the tools collection, so it looks 
 as though you don't have that on your path.  Getting the path right 
 is important.  A description of what you need in the path is in the 
 R Installation and Administration manual.

 Duncan Murdoch





__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Lack of independence in anova()

2005-07-04 Thread Douglas Bates
I have already had email exchanges off-list with Phillip Good pointing
out that the independence property that he wishes to establish by
simulation is a consequence of orthogonality of the column span of the
row contrasts and the interaction contrasts.  If you set the contrasts
option to a set of orthogonal contrasts such as contr.helmert or
contr.sum, which has no effect on the results of the anova, this is
easily established.

 build
function(size, v = rnorm(sum(size))) {
col=c(rep(0,size[1]),rep(1,size[2]),rep(2,size[3]),rep(3,size[4]),
rep(0,size[5]),rep(1,size[6]),rep(2,size[7]),rep(3,size[8]))
row=c(rep(0,size[1]+size[2]+size[3]+size[4]),rep(1,size[5]+size[6]
+size[7]+size[8]))
return(data.frame(c=factor(col), r=factor(row),yield=v))
}
 size
[1] 3 3 3 0 3 3 3 0
 set.seed(1234321)
 vv - build(size)
 vv
   c r  yield
1  0 0  1.2369081
2  0 0  1.5616230
3  0 0  1.8396185
4  1 0  0.3173245
5  1 0  1.0715115
6  1 0 -1.1459955
7  2 0  0.2488894
8  2 0  0.1158625
9  2 0  2.6200816
10 0 1  1.2624048
11 0 1 -0.9862578
12 0 1 -0.3235653
13 1 1  0.2039706
14 1 1 -1.4574576
15 1 1  1.9158713
16 2 1 -2.0333909
17 2 1  1.0050272
18 2 1  0.6789184
 options(contrasts = c('contr.helmert', 'contr.poly'))
 crossprod(model.matrix(~c*r, vv))
            (Intercept) c1 c2 r1 c1:r1 c2:r1
(Intercept)          18  0  0  0     0     0
c1                    0 12  0  0     0     0
c2                    0  0 36  0     0     0
r1                    0  0  0 18     0     0
c1:r1                 0  0  0  0    12     0
c2:r1                 0  0  0  0     0    36


On 7/4/05, Phillip Good [EMAIL PROTECTED] wrote:
  If the observations are normally distributed and the 2xk design is
 balanced,  theory requires that the tests for interaction and row effects be
 independent.  In my program, appended below, this would translate to cntT
 (approx)= cntR*cntI/N if all R routines were functioning correctly.  They
 aren't.
 
 sim2=function(size,N,p){
   cntR=0
   cntC=0
   cntI=0
   cntT=0
   cntP=0
   for(i in 1:N){
 #generate data
  v=gendata(size)
 #analyze after build(ing) design containing data
  lm.out=lm(yield~c*r,build(size,v))
  av.out=anova(lm.out)
 #if column effect is significant, increment cntC
  if (av.out[[5]][1] <= p) cntC=cntC+1
#if row effect is significant, increment cntR
  if (av.out[[5]][2] <= p){
cntR=cntR+1
tmp = 1
}
  else tmp =0
  if (av.out[[5]][3] <= p){
#if interaction is significant, increment cntI
 cntI=cntI+1
 #if both interaction and row effect are significant, increment cntT
 cntT=cntT + tmp
 }
  }
 list(cntC=cntC, cntR=cntR, cntI=cntI, cntT=cntT)
 }
 
 build=function(size,v){
 #size is a vector containing the sample sizes
 col=c(rep(0,size[1]),rep(1,size[2]),rep(2,size[3]),rep(3,size[4]),
 rep(0,size[5]),rep(1,size[6]),rep(2,size[7]),rep(3,size[8]))
 row=c(rep(0,size[1]+size[2]+size[3]+size[4]),rep(1,size[5]+size[6]
 +size[7]+size[8]))
 return(data.frame(c=factor(col), r=factor(row),yield=v))
 }
 
 gendata=function(size){
   ssize=sum(size);
   return (rnorm(ssize))
 }
 
 #Example
  size=c(3,3,3,0,3,3,3,0)
  sim2(size,1,10,.16)
 
 
 
 Phillip Good
 Huntington Beach CA
 
 
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Lack of independence in anova()

2005-07-04 Thread Douglas Bates
A couple more comments on this simulation.  Notice that the sim2
function is defined with arguments size, N and p but is being called
with four arguments.  It appears as if the value of p will be 10 in
that call.

If you decide to do such a simulation yourself you can save a lot of
time by just building the model matrix once and using lm.fit in
subsequent calls.

Also, there is no need to do the counting in the body of the sim2
function.  Just save the 3 p-values from each replication.  The test
of independence is equivalent to showing that the distribution of the
p-values is uniform over the unit cube.
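A minimal sketch of that approach (it still calls lm()/anova() for clarity,
and assumes the build and gendata functions and the size vector from the
original post; N is kept small here):

set.seed(1)
N <- 200
pvals <- t(replicate(N, anova(lm(yield ~ c * r, build(size, gendata(size))))[[5]][1:3]))
colnames(pvals) <- c("column", "row", "interaction")
p <- 0.16
## under independence these two numbers should agree up to simulation error:
mean(pvals[, "row"] <= p & pvals[, "interaction"] <= p)
mean(pvals[, "row"] <= p) * mean(pvals[, "interaction"] <= p)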


On 7/4/05, Phillip Good [EMAIL PROTECTED] wrote:
  If the observations are normally distributed and the 2xk design is
 balanced,  theory requires that the tests for interaction and row effects be
 independent.  In my program, appended below, this would translate to cntT
 (approx)= cntR*cntI/N if all R routines were functioning correctly.  They
 aren't.
 
 sim2=function(size,N,p){
   cntR=0
   cntC=0
   cntI=0
   cntT=0
   cntP=0
   for(i in 1:N){
 #generate data
  v=gendata(size)
 #analyze after build(ing) design containing data
  lm.out=lm(yield~c*r,build(size,v))
  av.out=anova(lm.out)
 #if column effect is significant, increment cntC
  if (av.out[[5]][1]=p)cntC=cntC+1
#if row effect is significant, increment cntR
  if (av.out[[5]][2]=p){
cntR=cntR+1
tmp = 1
}
  else tmp =0
  if (av.out[[5]][3]=p){
#if interaction is significant, increment cntI
 cntI=cntI+1
 #if both interaction and row effect are significant, increment cntT
 cntT=cntT + tmp
 }
  }
 list(cntC=cntC, cntR=cntR, cntI=cntI, cntT=cntT)
 }
 
 build=function(size,v){
 #size is a vector containing the sample sizes
 col=c(rep(0,size[1]),rep(1,size[2]),rep(2,size[3]),rep(3,size[4]),
 rep(0,size[5]),rep(1,size[6]),rep(2,size[7]),rep(3,size[8]))
 row=c(rep(0,size[1]+size[2]+size[3]+size[4]),rep(1,size[5]+size[6]
 +size[7]+size[8]))
 return(data.frame(c=factor(col), r=factor(row),yield=v))
 }
 
 gendata=function(size){
   ssize=sum(size);
   return (rnorm(ssize))
 }
 
 #Example
  size=c(3,3,3,0,3,3,3,0)
  sim2(size,1,10,.16)
 
 
 
 Phillip Good
 Huntington Beach CA
 
 
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Lack of independence in anova()

2005-07-04 Thread Phillip Good
To my surprise, the R functions I employed did NOT verify the independence
property.  I've no quarrel with the theory--it's the R functions that worry
me.

pG

-Original Message-
From: Douglas Bates [mailto:[EMAIL PROTECTED]
Sent: Monday, July 04, 2005 9:13 AM
To: [EMAIL PROTECTED]; r-help
Subject: Re: [R] Lack of independence in anova()


I have already had email exchanges off-list with Phillip Good pointing
out that the independence property that he wishes to establish by
simulation is a consequence of orthogonality of the column span of the
row contrasts and the interaction contrasts.  If you set the contrasts
option to a set of orthogonal contrasts such as contr.helmert or
contr.sum, which has no effect on the results of the anova, this is
easily established.

 build
function(size, v = rnorm(sum(size))) {
col=c(rep(0,size[1]),rep(1,size[2]),rep(2,size[3]),rep(3,size[4]),
rep(0,size[5]),rep(1,size[6]),rep(2,size[7]),rep(3,size[8]))
row=c(rep(0,size[1]+size[2]+size[3]+size[4]),rep(1,size[5]+size[6]
+size[7]+size[8]))
return(data.frame(c=factor(col), r=factor(row),yield=v))
}
 size
[1] 3 3 3 0 3 3 3 0
 set.seed(1234321)
 vv - build(size)
 vv
   c r  yield
1  0 0  1.2369081
2  0 0  1.5616230
3  0 0  1.8396185
4  1 0  0.3173245
5  1 0  1.0715115
6  1 0 -1.1459955
7  2 0  0.2488894
8  2 0  0.1158625
9  2 0  2.6200816
10 0 1  1.2624048
11 0 1 -0.9862578
12 0 1 -0.3235653
13 1 1  0.2039706
14 1 1 -1.4574576
15 1 1  1.9158713
16 2 1 -2.0333909
17 2 1  1.0050272
18 2 1  0.6789184
 options(contrasts = c('contr.helmert', 'contr.poly'))
 crossprod(model.matrix(~c*r, vv))
            (Intercept) c1 c2 r1 c1:r1 c2:r1
(Intercept)          18  0  0  0     0     0
c1                    0 12  0  0     0     0
c2                    0  0 36  0     0     0
r1                    0  0  0 18     0     0
c1:r1                 0  0  0  0    12     0
c2:r1                 0  0  0  0     0    36


On 7/4/05, Phillip Good [EMAIL PROTECTED] wrote:
  If the observations are normally distributed and the 2xk design is
 balanced,  theory requires that the tests for interaction and row effects
be
 independent.  In my program, appended below, this would translate to cntT
 (approx)= cntR*cntI/N if all R routines were functioning correctly.  They
 aren't.

 sim2=function(size,N,p){
   cntR=0
   cntC=0
   cntI=0
   cntT=0
   cntP=0
   for(i in 1:N){
 #generate data
  v=gendata(size)
 #analyze after build(ing) design containing data
  lm.out=lm(yield~c*r,build(size,v))
  av.out=anova(lm.out)
 #if column effect is significant, increment cntC
  if (av.out[[5]][1] <= p) cntC=cntC+1
#if row effect is significant, increment cntR
  if (av.out[[5]][2] <= p){
cntR=cntR+1
tmp = 1
}
  else tmp =0
  if (av.out[[5]][3] <= p){
#if interaction is significant, increment cntI
 cntI=cntI+1
 #if both interaction and row effect are significant, increment
cntT
 cntT=cntT + tmp
 }
  }
 list(cntC=cntC, cntR=cntR, cntI=cntI, cntT=cntT)
 }

 build=function(size,v){
 #size is a vector containing the sample sizes
 col=c(rep(0,size[1]),rep(1,size[2]),rep(2,size[3]),rep(3,size[4]),
 rep(0,size[5]),rep(1,size[6]),rep(2,size[7]),rep(3,size[8]))
 row=c(rep(0,size[1]+size[2]+size[3]+size[4]),rep(1,size[5]+size[6]
 +size[7]+size[8]))
 return(data.frame(c=factor(col), r=factor(row),yield=v))
 }

 gendata=function(size){
   ssize=sum(size);
   return (rnorm(ssize))
 }

 #Example
  size=c(3,3,3,0,3,3,3,0)
  sim2(size,1,10,.16)



 Phillip Good
 Huntington Beach CA



 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html



__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Lack of independence in anova()

2005-07-04 Thread Phillip Good
My bad.  The example of a call should read sim2(size,1,.16).
Originally, the intent of the program was to compare ANOVA (the gold
standard) with synchronized permutation tests when the data are from contaminated
normal distributions or the design is unbalanced.  The tests for main
effects and interactions are always independent with synchronized
permutations and ought to be for ANOVA with normal data and balanced designs.

-Original Message-
From: Douglas Bates [mailto:[EMAIL PROTECTED]
Sent: Monday, July 04, 2005 9:24 AM
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: Re: [R] Lack of independence in anova()


A couple more comments on this simulation.  Notice that the sim2
function is defined with arguments size, N and p but is being called
with four arguments.  It appears as if the value of p will be 10 in
that call.

If you decide to do such a simulation yourself you can save a lot of
time by just building the model matrix once and using lm.fit in
subsequent calls.

Also, there is no need to do the counting in the body of the sim2
function.  Just save the 3 p-values from each replication.  The test
of independence is equivalent to showing that the distribution of the
p-values is uniform over the unit cube.


On 7/4/05, Phillip Good [EMAIL PROTECTED] wrote:
  If the observations are normally distributed and the 2xk design is
 balanced,  theory requires that the tests for interaction and row effects
be
 independent.  In my program, appended below, this would translate to cntT
 (approx)= cntR*cntI/N if all R routines were functioning correctly.  They
 aren't.

 sim2=function(size,N,p){
   cntR=0
   cntC=0
   cntI=0
   cntT=0
   cntP=0
   for(i in 1:N){
 #generate data
  v=gendata(size)
 #analyze after build(ing) design containing data
  lm.out=lm(yield~c*r,build(size,v))
  av.out=anova(lm.out)
 #if column effect is significant, increment cntC
  if (av.out[[5]][1] <= p) cntC=cntC+1
#if row effect is significant, increment cntR
  if (av.out[[5]][2] <= p){
cntR=cntR+1
tmp = 1
}
  else tmp =0
  if (av.out[[5]][3] <= p){
#if interaction is significant, increment cntI
 cntI=cntI+1
 #if both interaction and row effect are significant, increment
cntT
 cntT=cntT + tmp
 }
  }
 list(cntC=cntC, cntR=cntR, cntI=cntI, cntT=cntT)
 }

 build=function(size,v){
 #size is a vector containing the sample sizes
 col=c(rep(0,size[1]),rep(1,size[2]),rep(2,size[3]),rep(3,size[4]),
 rep(0,size[5]),rep(1,size[6]),rep(2,size[7]),rep(3,size[8]))
 row=c(rep(0,size[1]+size[2]+size[3]+size[4]),rep(1,size[5]+size[6]
 +size[7]+size[8]))
 return(data.frame(c=factor(col), r=factor(row),yield=v))
 }

 gendata=function(size){
   ssize=sum(size);
   return (rnorm(ssize))
 }

 #Example
  size=c(3,3,3,0,3,3,3,0)
  sim2(size,1,10,.16)



 Phillip Good
 Huntington Beach CA



 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html



__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Colors in mtext

2005-07-04 Thread Christof Bigler
I want to assign different colors to different strings using 
mtext(...). However, when I use something like

mtext(text=c("string1","string2","string3"), 
col=c("black","blue","red"), side=3, line=0)

"string2" and "string3" are printed over "string1". When I use 
paste("string1","string2","string3"), the strings are printed 
one after the other, but I still don't get different colors for 
different strings.

Any solutions?

Christof

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] question about boxplot axis

2005-07-04 Thread yyan liu
Hi:
  I have a question about making side-by-side boxplots. 
  My response is numeric and I want to make a side-by-side
boxplot of it according to a factor vector. So
there are several boxplots on the same plot, each
boxplot corresponding to one level of the factor. The
levels of
the factor are some characters. When I make the plot,
the boxplots are arranged according to the alphabetic
order of the levels. My question is that how I can
change the order the boxplots are arranged. For
example, 
 
   fac <- c("a","b","c","d")
   boxplot(time.vec ~ fac, xlab="Events", ylab="Time In Days", col="yellow")

Then the boxplots are arranged in the order of
a,b,c,d. But for some reason, I want it to be
c,a,d,b. Any suggestion about this question?

thx a lot!

liu zhong

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] r version 2.1.0 and graphics in mac os 10.3.9

2005-07-04 Thread Luis Borda de Agua
I use mac os 10.3.9 and I've installed in my computer R 2.1.0 (I  
believe this is the latest R version).

Although it works alright when I open R by clicking on the R icon, I  
cannot use the graphics facility when I open R from an X-terminal window  
(or X11 window).

In fact, when I try to open R I get the message that I pasted below.
Is this an R bug?
Is R assuming that I should have the latest Mac OS Tiger installed?
Or is it a problem related to my computer only?

I did not have this problem when I used a previous version of R (I  
don't know which one was).

Thank you in advance for your help.

Luis BA.

___


% R

R : Copyright 2005, The R Foundation for Statistical Computing
Version 2.1.0 Patched (2005-05-12), ISBN 3-900051-07-0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for a HTML browser interface to help.
Type 'q()' to quit R.

[Previously saved workspace restored]

Error in dyn.load(x, as.logical(local), as.logical(now)) :
 unable to load shared library  
'/Library/Frameworks/R.framework/Resources/library/grDevices/libs/ 
grDevices.so':
   dlcompat: dyld: /Library/Frameworks/R.framework/Resources/bin/exec/R  
version mismatch for library: /usr/local/lib/libxml2.2.dylib  
(compatibility version of user: 9.0.0 greater than library's version:  
8.0.0)
Loading required package: grDevices
Error in dyn.load(x, as.logical(local), as.logical(now)) :
 unable to load shared library  
'/Library/Frameworks/R.framework/Resources/library/grDevices/libs/ 
grDevices.so':
   dlcompat: dyld: /Library/Frameworks/R.framework/Resources/bin/exec/R  
version mismatch for library: /usr/local/lib/libxml2.2.dylib  
(compatibility version of user: 9.0.0 greater than library's version:  
8.0.0)
In addition: Warning message:
package grDevices in options(defaultPackages) was not found
Error: package 'grDevices' could not be loaded



Luis Borda de Agua
2502 Department of Plant Biology
University of Georgia
Athens GA 30602
USA
Phone: (706) 583-0943
Fax: (706) 542-1805

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] question about boxplot axis

2005-07-04 Thread Marc Schwartz
On Mon, 2005-07-04 at 11:35 -0700, yyan liu wrote:
 Hi:
   I have a question making side by side boxplot. 
   My response is numeric and I want to make a side by
 side boxplot of it accroding to a factor vector. So,
 there are several boxplots on the same plot. Each
 boxplot is with respect to one level for a factor. The
 levels of
 the factor are some characters. When I make the plot,
 the boxplots are arranged according to the alphabetic
 order of the levels. My question is that how I can
 change the order the boxplots are arranged. For
 example, 
  
fac-c(a,b,c,d)
boxplot(time.vec~fac,xlab=Events,ylab=Time In
 Days,col=yellow)
 
 Then the boxplots are arranged in the order of
 a,b,c,d. But for some reason, I want it to be
 c,a,d,b. Any suggestion about this question?
 
 thx a lot!
 
 liu zhong



time.vec <- rnorm(40)
fac <- factor(sample(letters[1:4], 40, replace = TRUE))

# Default ordering
boxplot(time.vec ~ fac, xlab="Events", 
ylab="Time In Days", col="yellow")


# Now adjust factor levels as you desire
fac <- factor(fac, levels = c("c", "a", "d", "b"))

# Redo boxplot with new ordering
boxplot(time.vec ~ fac, xlab="Events", 
ylab="Time In Days", col="yellow")


See ?factor for more information. boxplot() sequences the boxes based
upon the factor levels.

HTH,

Marc Schwartz

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] partial autoregression matrix function

2005-07-04 Thread Paul Gilbert


Samuel E. Kemp wrote:
 Hi,
 
 Does anyone know of a function in R that is capable of calculating the 
 partial autoregression matrix function for vector autoregressive moving 
 average (VARMA) models?
?pacf

 
 The dse package has functions capable of simulating and estimating 
 VARMA models, but I did not notice a function for model identification.

In dse you might also look at
?checkResiduals
?informationTests

Paul Gilbert
 
 Any help would be greatly appreciated.
 
 Kind regards,
 
 Sam.
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] R emacs and ess

2005-07-04 Thread O.J. Shaikh
Hi all,
 I want to apologise first for being a beginner with linux/emacs/ess;
I am pretty good with R ;).

I have R-1.9.1 installed on my directory of a linux server, I also have
ess 5.2.8 installed there as well.  When I call R using (C-u M-x R) in
emacs, i receive the following error [no match].

I was wondering if emacs is not looking for ess in the right place (if
there is something I have to edit into the .emacs file) and if someone
could explain to me how I should do that, since I am not good with
linux.  

I was also wondering if emacs simply can't find R, and if someone could let me
know how I should go about fixing that.  Currently to execute R I use
(sh R) from the R bin directory.

Thanks for all your help
Omar

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Colors in mtext

2005-07-04 Thread Spencer Graves
  The documentation ?mtext says that adj is adjustment for each 
string in reading direction. For strings parallel to the axes, 'adj=0' 
means left or bottom alignment, and 'adj=1' means right or top 
alignment.  With this information, I tried the following, which looks 
like it may be close to what you are requesting:

mtext(text=c("string1","string2","string3"),
   adj=c(0, .5, 1), col=c("black","blue","red"), side=3, line=0)
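(For longer sets of labels, another sketch is one mtext() call per string
with an explicit 'at' position; the positions below are purely illustrative.)

labs <- c("string1", "string2", "string3")
cols <- c("black", "blue", "red")
plot(1:10)
xs <- seq(par("usr")[1], par("usr")[2], length.out = length(labs) + 2)
for (i in seq_along(labs))
    mtext(labs[i], side = 3, line = 0, at = xs[i + 1], col = cols[i])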

  spencer graves

Christof Bigler wrote:
 I want to assign different colors to different strings using 
 mtext(...). However, when I use something like
 
  mtext(text=c("string1","string2","string3"), 
  col=c("black","blue","red"), side=3, line=0)
 
 string2 and string3 are printed over string1. When I use 
 paste(string1,string2,string3), the series of strings are printed 
 one over each other, but I still don't get different colors for 
 different strings.
 
 Any solutions?
 
 Christof
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

-- 
Spencer Graves, PhD
Senior Development Engineer
PDF Solutions, Inc.
333 West San Carlos Street Suite 700
San Jose, CA 95110, USA

[EMAIL PROTECTED]
www.pdf.com http://www.pdf.com
Tel:  408-938-4420
Fax: 408-280-7915

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] using index of a loop as a macro variable

2005-07-04 Thread E. Michael Foster
Hi,

I'm a long-time STATA user and an R newbie. I'm doing ok, but I'm addicted
to STATA macro variables.  Is there something like a macro variable in R?

Specifically, I'd like to be able to do something like

for (i in 1:3) {
.
x`i' <- ...
}

where R would resolve x`i' to the objects named x1, x2 and x3 as I move
through the loop.  I guess I could create these in advance of the loop and
fill them in, but I'd rather not.  

Is there a way to use an index of a loop in this manner? 


thanks,
michael

E. Michael Foster
Professor of Maternal and Child Health
School of Public Health
University of North Carolina

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] using index of a loop as a macro variable

2005-07-04 Thread Charles Annis, P.E.
x <- rep(NA, 3)
for (i in 1:length(x)){
x[i] <- ...
}



will do the job, but you may be able to take advantage of R's vectorization
and do what you want with no loop at all.
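
For instance, if each x[i] depends only on i, the whole vector can often be
built in a single vectorised call (a toy sketch; the i^2 + 1 is just a
stand-in for the real computation):

i <- 1:3
x <- i^2 + 1      # instead of assigning x[1], x[2], x[3] one at a time
x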




Charles Annis, P.E.

[EMAIL PROTECTED]
phone: 561-352-9699
eFax:  614-455-3265
http://www.StatisticalEngineering.com
 

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of E. Michael Foster
Sent: Monday, July 04, 2005 4:32 PM
To: r-help@stat.math.ethz.ch
Subject: [R] using index of a loop as a macro variable

Hi,

I'm a long-time STATA user and a R newbie. I'm doing ok, but I'm addicted
to STATA macro variables.  Is there something like a macro variable in R?

Specifically, I'd like to be able to do something like

for (i in 1:3) {
.
x`i' <- ...
}

where R would resolve x`i' to the objects named x1, x2 and x3 as I move
through the loop.  I guess I could create these in advance of the loop and
fill them in, but I'd rather not.  

Is there a way to use an index of a loop in this manner? 


thanks,
michael

E. Michael Foster
Professor of Maternal and Child Health
School of Public Health
University of North Carolina

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Lack of independence in anova()

2005-07-04 Thread Douglas Bates
I wrote up a note on how to do a more efficient simulation of the
p-values from anova then discovered to my surprise that the chi-square
test for independence of the significance of the F-tests indicated
that they were not independent.  I was stumped by this but fortunately
Thomas Lumley came to my rescue with an explanation.  There is no
reason why the results of the F tests should be independent.  The
numerators are independent but the denominator is the same for both
tests.  When, due to random variation, the denominator is small, then
the p-values for both tests will tend to be small.  If, instead of
F-tests you use chi-square tests then you do see independence.

Here is the note on the simulation.

There are several things that could be done to speed the simulation of
the p-values of an anova like this under the null distribution.

If you examine the structure of a fitted lm object (use the function
str()) you will see that there are components called `qr', `effects'
and `assign'.  You can verify by experimentation that `qr' and
`assign' depend only on the experimental design.  Furthermore, the
`effects' vector can be reproduced as qr.qty(qrstr, y).
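
One way to verify that claim by experiment (a small sketch; any design will
do, and the all.equal() comparison should come out TRUE):

set.seed(1)
y  <- rnorm(18)
fm <- lm(y ~ gl(3, 6))
all.equal(as.vector(qr.qty(fm$qr, y)), as.vector(fm$effects))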

The sums of squares for the different terms in the model are the sums
of squares of elements of the effects vector as indexed by the assign
vector.  The residual sum of squares is the sum of squares of the
remaining elements of the assign vector.  You can generate 10000
replications of the calculations of the relevant sums of squares as

 set.seed(1234321)
 vv <- data.frame(c = gl(3,3,18), r = gl(2,9,18))
 vv
   c r
1  1 1
2  1 1
3  1 1
4  2 1
5  2 1
6  2 1
7  3 1
8  3 1
9  3 1
10 1 2
11 1 2
12 1 2
13 2 2
14 2 2
15 2 2
16 3 2
17 3 2
18 3 2
 fm1 <- lm(rnorm(18) ~ c*r, vv)
 fm1$assign
[1] 0 1 1 2 3 3
 asgn <- c(fm1$assign, rep(4, 12))
 system.time(res <- replicate(10000, tapply(qr.qty(fm1$qr, rnorm(18))^2, asgn, 
 sum)))
[1] 20.61  0.01 20.61  0.00  0.00
 res[,1:6]
   [,1]  [,2] [,3]  [,4]   [,5][,6]
0 0.4783121 0.3048634 0.713689 0.6937838 0.03649023  2.63392426
1 0.5825213 1.4756395 1.127018 0.5209751 1.18697199  3.32972093
2 0.2612723 3.6396106 0.547506 1.1641910 0.37843963  0.03411672
3 2.6259806 3.5504584 1.645215 0.1197238 0.85361018  4.53895212
4 9.1942755 8.2122693 4.863392 5.4413571 2.03715439 22.94815118

The rows of that array correspond to the sum of squares for the
Intercept (which we generally ignore), the factor 'c', the factor 'r',
their interaction and the residuals.

As you can see that took about 21 seconds on my system, which I expect
is a bit faster than your simulation ran.

Because I set the seed to a known value I can reproduce the results
for the first few simulations to check that the sums of squares are
correct.  Remember that the original fit (fm1) is not included in the
table.

 set.seed(1234321)
 fm1 <- lm(rnorm(18) ~ c*r, vv)
 anova(fm2 <- lm(rnorm(18) ~ c*r, vv))
Analysis of Variance Table

Response: rnorm(18)
          Df Sum Sq Mean Sq F value Pr(>F)
c          2 0.5825  0.2913  0.3801 0.6917
r          1 0.2613  0.2613  0.3410 0.5701
c:r        2 2.6260  1.3130  1.7137 0.2215
Residuals 12 9.1943  0.7662

You can continue this process if you wish further verification that
the results correspond to the fitted models.

You can get the p-values for the F-tests as

 pvalsF <- data.frame(pc = pf((res[2,]/2)/(res[5,]/12), 2, 12, low = FALSE),
+  pr = pf((res[3,]/1)/(res[5,]/12), 1, 12, low = FALSE),
+  pint = pf((res[4,]/2)/(res[5,]/12), 2, 12, low = FALSE))

Again you can check this for the first few by hand.

 pvalsF[1:5,]
  pc pr  pint
1 0.69171238 0.57006574 0.2214847
2 0.37102129 0.03975286 0.1158059
3 0.28634939 0.26771167 0.1740633
4 0.57775850 0.13506561 0.8775828
5 0.06363138 0.16124100 0.1224806

If you wish you could then check marginal distributions using
techniques like an empirical density plot.

 library(lattice)
 densityplot(~ pc, pvalsF)

At this point I would recommend checking the joint distribution but if
you want to choose a specific level and check the contingency table
that could be done as 

 xtabs(~ I(pr < 0.16) + I(pint < 0.16), pvalsF)
             I(pint < 0.16)
I(pr < 0.16) FALSE TRUE
       FALSE  7204 1240
       TRUE   1215  341

The summary method for an xtabs object provides a test of independence

 summary(xtabs(~ I(pr < 0.16) + I(pint < 0.16), pvalsF))
Call: xtabs(formula = ~I(pr < 0.16) + I(pint < 0.16), data = pvalsF)
Number of cases in table: 10000 
Number of factors: 2 
Test for independence of all factors:
Chisq = 51.6, df = 1, p-value = 6.798e-13

for which you can see the puzzling result.  However, if you use the
chisquared test based on the known residual variance of 1, you get

 pvalsC <- data.frame(pc = pchisq(res[2,], 2, low = FALSE),
+  pr = pchisq(res[3,], 1, low = FALSE),
+  pint = pchisq(res[4,], 2, low = FALSE))
 pvalsC[1:5,]
 pc pr 

Re: [R] using index of a loop as a macro variable

2005-07-04 Thread Thomas Lumley
On Mon, 4 Jul 2005, E. Michael Foster wrote:
 I'm a long-time STATA user and a R newbie. I'm doing ok, but I'm addicted
 to STATA macro variables.  Is there something like a macro variable in R?

 Specifically, I'd like to be able to do something like

 for (i in 1:3) {
   .
   x`i' <- ...
 }

 where R would resolve x`i' to the objects named x1, x2 and x3 as I move
 through the loop.  I guess I could create these in advance of the loop and
 fill them in, but I'd rather not.

 Is there a way to use an index of a loop in this manner?

No. Well, actually, yes, but you don't want to. Stata macros rarely 
translate word-for-word into R.  There is a FAQ describing how to do this 
sort of thing, but the most important paragraph is the last one, where it 
says not to do this.
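
(For reference, the pattern that FAQ describes -- and advises against --
looks roughly like this sketch:

for (i in 1:3) {
  assign(paste("x", i, sep = ""), rnorm(i))   # creates x1, x2, x3
}
get("x2")

It works, but the resulting objects are awkward to collect and manipulate.)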


What you want is a list.

for(i in 1:3){
 .
 x[[i]] <- ...
}

Now, x needs to exist before the loop. You can use
x <- NULL
to create it, or if you know how long it will be you can use
x <- vector("list", 3)
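
A complete toy version of the list pattern (the rnorm() call just stands in
for whatever the i-th computation really is):

x <- vector("list", 3)
for (i in 1:3) {
  x[[i]] <- rnorm(i)
}
x[[2]]            # retrieve the second result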



-thomas

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Lack of independence in anova()

2005-07-04 Thread Douglas Bates
On 7/4/05, Douglas Bates [EMAIL PROTECTED] wrote:
...
 The sums of squares for the different terms in the model are the sums
 of squares of elements of the effects vector as indexed by the assign
 vector.  The residual sum of squares is the sum of squares of the
 remaining elements of the assign vector. 

I meant to write effects vector not assign vector in that last sentence.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] R emacs and ess

2005-07-04 Thread Peter Dalgaard
O.J. Shaikh [EMAIL PROTECTED] writes:

 Hi all,
  I want to apologise first for being a beginner with linux/emacs/ess;
 I am pretty good with R ;).

Not at upgrading it, it seems ... ;-)

 I have R-1.9.1 installed on my directory of a linux server, I also have
 ess 5.2.8 installed there as well.  When I call R using (C-u M-x R) in
 emacs, i receive the following error [no match].
 
 I was wondering if emacs is not looking for ess in the right place (if
 there is something i have to edit into the .emacs file) and if someone
 could explain to me how i should do that since i am not good with
 linux.  
 
 I was also wondering if emacs cant find R and if someone could let me
 know how i should go about doing that.  Currently to execute R I use
 (sh R) from the R bin directory.

There's an ESS list too, but you would seem to have omitted to put
something in your emacs startup file. It's usually (require 'ess-site),
but likely different if you install in a private dir. Something like
(load "/wherever/you/put/it/ess-site"), I guess. Check the install
documentation.


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph: (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] function for cumulative occurrence of elements

2005-07-04 Thread Mrs Karen Kotschy
Hi Steven

Are you aware of the package vegan for community ecology? There is a
function in this package called specaccum, which calculates species
accumulation curves for you. Various methods can be specified, including
random.

I must admit I have not used this particular function (yet!) but it seems
like it could be useful to you.
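
A minimal sketch of the sort of call involved (untested on my side; it
assumes the vegan package is installed and uses its bundled BCI data --
your own plots-by-species table, e.g. t(ptcount) converted to 0/1, would go
in its place):

library(vegan)
data(BCI)                                  # sites x species matrix
sac <- specaccum(BCI, method = "random")   # randomised species accumulation
plot(sac)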

Regards
Karen

---
Karen Kotschy
Centre for Water in the Environment
University of the Witwatersrand
Johannesburg
South Africa


On Tue, 28 Jun 2005, Steven K Friedman wrote:


 Hello,

 I have a data set with 9700 records, and 7 parameters.

 The data were collected for a survey of forest communities.  Sample plots
 (1009) and species (139) are included in this data set. I need to determine
 how species are accumulated as new plots are considered. Basically, I want
 to develop a species area curve.

 I've included the first 20 records from the data set.  Point represents the
 plot id. The other parameters are parts of the information statistic H'.

 Using table(), I can construct a data set that lists the occurrence of a
 species at any Point (it produces a binary 0/1 data table). From there it
 gets confusing, regarding the most efficient approach to determining the
 addition of new and/or repeated species occurrences.

 ptcount <- table(sppoint.freq$species, sppoint.freq$Point)

  From here I've played around with colSums to calculate the number of species
 at each Point.  The difficulty is determining if a species is new or
 repeated.  Also since there are 1009 points a function is needed to screen
 every Point.

 Two goals are of interest: 1) the species accumulation curve, and 2) an
 accumulation curve when random Points are considered.

 Any help would be greatly appreciated.

 Thank you
 Steve Friedman


    Point        species frequency point.list point.prop   log.prop point.hprime
 1      7   American elm         7         27 0.25925926 -1.3499267    0.3499810
 2      7          apple         2         27 0.07407407 -2.6026897    0.1927918
 3      7   black cherry         8         27 0.29629630 -1.2163953    0.3604134
 4      7      black oak         1         27 0.03703704 -3.2958369    0.1220680
 5      7    chokecherry         1         27 0.03703704 -3.2958369    0.1220680
 6      7         oak sp         1         27 0.03703704 -3.2958369    0.1220680
 7      7 pignut hickory         1         27 0.03703704 -3.2958369    0.1220680
 8      7      red maple         1         27 0.03703704 -3.2958369    0.1220680
 9      7      white oak         5         27 0.18518519 -1.6863990    0.3122961
 10     9   black spruce         2         27 0.07407407 -2.6026897    0.1927918
 11     9    blue spruce         2         27 0.07407407 -2.6026897    0.1927918
 12     9        missing        12         27 0.44444444 -0.8109302    0.3604134
 13     9  Norway spruce         8         27 0.29629630 -1.2163953    0.3604134
 14     9   white spruce         3         27 0.11111111 -2.1972246    0.2441361
 15    12          apple         2         27 0.07407407 -2.6026897    0.1927918
 16    12   black cherry         1         27 0.03703704 -3.2958369    0.1220680
 17    12   black locust         1         27 0.03703704 -3.2958369    0.1220680
 18    12   black walnut         1         27 0.03703704 -3.2958369    0.1220680
 19    12          lilac         3         27 0.11111111 -2.1972246    0.2441361
 20    12        missing         2         27 0.07407407 -2.6026897    0.1927918

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Problems with eval() in connection with match.call()

2005-07-04 Thread Søren Højsgaard
Dear all, I have a problem when passing parms from one function to another when 
the argument list is just '...'. Consider this example:
 
foo <- function(){
 xx <- 111222
 bar(x=xx)
}
bar <- function(...){
  cl <- match.call(expand.dots=TRUE)
  print(cl)
  x <- eval(cl$x)
  print(x)
}
foo()

 bar(x = xx)
 Error in eval(expr, envir, enclos) : Object "xx" not found

My expectation was that xx would be evaluated to 111222 in foo before being 
passed on to bar, but obviously it is not so. Should I do something explicitly 
in foo() to 'evaluate' xx, or do I need to do something special in bar()? 
 
Thanks in advance, Søren

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Problems with eval() in connection with match.call()

2005-07-04 Thread Douglas Bates
On 7/4/05, Søren Højsgaard [EMAIL PROTECTED] wrote:
 Dear all, I have a problem when passing parms from one function to another 
 when the argument list is just '...'. Consider this example:
 
 foo <- function(){
  xx <- 111222
  bar(x=xx)
 }
 bar <- function(...){
   cl <- match.call(expand.dots=TRUE)
   print(cl)
   x <- eval(cl$x)
   print(x)
 }
 foo()
 
  bar(x = xx)
  Error in eval(expr, envir, enclos) : Object xx not found
 
 My expectation was, that xx would be evaluated to 111222 in foo before being 
 passed on to bar, but obviously it is not so. 

Welcome to the world of lazy evaluation.  Arguments are evaluated only
when needed (i.e. when first referenced) not at the time of building
the function call.   That's why the default value of an argument can
depend on previously calculated results within the function.
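
A tiny illustration of that point (a toy function, nothing to do with your
code): the default for 'k' is only evaluated when 'k' is first used, so it
sees the value 'n' has at that moment.

f <- function(n, k = n^2) {
    n <- n + 1
    k                  # default evaluated here, after n was changed
}
f(2)                   # 9, not 4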

 Should I do something explicitely in foo() to 'evaluate' xx or need I do 
 something special in bar()??

Change the eval(cl$x) to eval(cl$x, parent.frame())

 bar
function(...){
cl <- match.call(expand.dots=TRUE)
print(cl)
x <- eval(cl$x, parent.frame())
print(x)
}
 foo()
bar(x = xx)
[1] 111222

 Thanks in advance, Søren
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Problems with eval() in connection with match.call()

2005-07-04 Thread Duncan Murdoch
Søren Højsgaard wrote:
 Dear all, I have a problem when passing parms from one function to another 
 when the argument list is just '...'. Consider this example:
  
 foo <- function(){
  xx <- 111222
  bar(x=xx)
 }
 bar <- function(...){
   cl <- match.call(expand.dots=TRUE)
   print(cl)
   x <- eval(cl$x)
   print(x)
 }
 foo()
 
 
bar(x = xx)
Error in eval(expr, envir, enclos) : Object xx not found
 
 
 My expectation was, that xx would be evaluated to 111222 in foo before being 
 passed on to bar, but obviously it is not so. Should I do something 
 explicitely in foo() to 'evaluate' xx or need I do something special in 
 bar()?? 

Hi Søren.

You need to use eval.parent(cl$x) (which is the same as eval(cl$x, 
envir=parent.frame())) to get what you want.  You want the evaluation to 
happen in the caller's frame of reference, not in bar's frame.

Generally when you eval() an expression, evaluation takes place in the 
current frame:  the function's local frame when it's in a function, the 
global frame when you do it at a command line.  You can use the envir= 
argument to change where it takes place.
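
A small sketch of that distinction (toy names, run at the prompt):

z <- 1
f <- function() { z <- 2; eval(quote(z)) }                       # 2: current frame
g <- function() { z <- 2; eval(quote(z), envir = globalenv()) }  # 1: envir= overrides
f()
g()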

Things look different when you have named parameters, e.g.

bar <- function(x) {
print(x)
}

because in this case, the argument gets turned into a promise at the 
time of the call, and a promise knows its own evaluation frame.  There's 
nothing really special about named parameters though, e.g.

  bar <- function(...) {
+  cl <- list(...)
+  print(cl$x)
+ }
  foo()
[1] 111222


By using match.call, you ignore the environment, and only see the 
expression.

I imagine you could put together an example where eval.parent() gets the 
evaluation wrong, e.g. because you want to evaluate in the parent's 
parent.  I was a little surprised that it didn't happen here:

  foo
function(){
  xx <- 111222
  bar2(x=xx)
}
  bar2
function(...) { bar(...) }
  bar
function(...){
   cl <- match.call(expand.dots=TRUE)
   print(cl)
   x <- eval.parent(cl$x)
   print(x)
}
  foo()
bar(x = ..1)
[1] 111222



I guess match.call() works hard to be helpful.

Duncan Murdoch

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] how to set the position and size of the select.list window

2005-07-04 Thread wu sz
Hello,

I use select.list to obtain a window of select items, but how can I
set the position and size of this window?

Are there any functions which are used to maximize and minimize the
window of R Console?

Thank you!
shengzhe

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] help eliminating for loops

2005-07-04 Thread John Sorkin
I am having trouble with apply. Could someone suggest changes to the
code below that will eliminate the for loops?
R 2.1.1 patched
Windows 2k
Thanks,
John
 
function () 
{
NaCon <- array(dim=c(10,10))
for (i in 1:10) {NaCon[i,] <- rnorm(10, 10.77, 0.02)}
MeanNaCon <- vector(mode="numeric", length=10)
for (j in 1:10) {MeanNaCon[j] <- mean(NaCon[j, 1:10])}
print(MeanNaCon)
#assign("MakeNaCon", MakeNaCon)
hist(MeanNaCon)
}
 
John Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
Baltimore VA Medical Center GRECC and
University of Maryland School of Medicine Claude Pepper OAIC
 
University of Maryland School of Medicine
Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
 
410-605-7119 
- NOTE NEW EMAIL ADDRESS:
[EMAIL PROTECTED]

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] help eliminating for loops

2005-07-04 Thread Liaw, Andy
Would this suffice:

n <- 10
NaCon <- matrix(rnorm(n * n, mean=10.77, sd=0.02), 10, 10)
MeanNaCon <- rowMeans(NaCon)
[...]

??
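
Since the question mentioned apply() specifically: the row means can also be
written as

MeanNaCon <- apply(NaCon, 1, mean)

which gives the same result as rowMeans(), just a little more slowly.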

Andy

 From: John Sorkin
 
 I am having trouble with apply. Could someone suggest changes to the
 code below that will eliminate the for loops?
 R 2.1.1 patched
 Windows 2k
 Thanks,
 John
  
 function () 
 {
 NaCon <- array(dim=c(10,10))
 for (i in 1:10) {NaCon[i,] <- rnorm(10, 10.77, 0.02)}
 MeanNaCon <- vector(mode="numeric", length=10)
 for (j in 1:10) {MeanNaCon[j] <- mean(NaCon[j, 1:10])}
 print(MeanNaCon)
 #assign("MakeNaCon", MakeNaCon)
 hist(MeanNaCon)
 }
  
 John Sorkin M.D., Ph.D.
 Chief, Biostatistics and Informatics
 Baltimore VA Medical Center GRECC and
 University of Maryland School of Medicine Claude Pepper OAIC
  
 University of Maryland School of Medicine
 Division of Gerontology
 Baltimore VA Medical Center
 10 North Greene Street
 GRECC (BT/18/GR)
 Baltimore, MD 21201-1524
  
 410-605-7119 
 -- NOTE NEW EMAIL ADDRESS:
 [EMAIL PROTECTED]
 
   [[alternative HTML version deleted]]
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html
 
 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Derivative of a function

2005-07-04 Thread Gabriel Rodrigues Alves Margarido
Suppose I have a simple function that returns a matrix, such as:

test <- function(x){ return(matrix(c(x,x^2,x^3,x^4),2,2)) }

so that test returns:
[ x    x^3 ]
[ x^2  x^4 ]

Is it possible for me to get the derivative of an expression such as:

c(1,0) %*% test(x) %*% c(0,1)

The vectors are used just to index the matrix.
I don't want a value, but the expression to work with (in that case,
the expected expression would be 3*x^2)...

Tried functions D and deriv in many ways, but no success.
I will be grateful if anyone can help.
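
(A sketch of the scalar case, which D() does handle once the entry is
written out by hand -- here the [1,2] element of test(x):

D(expression(x^3), "x")     # gives 3 * x^2

The difficulty is getting there automatically from the matrix-valued
function and the %*% expression.)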

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

