date:20100429

Hi


r-help-boun...@r-project.org wrote on 29.04.2010 05:56:23:

 Hello,
 
 I have a data.frame:
 name  col1  col2  col3   col4
 AA      23    54  0.999  0.78
 BB     123     5  1      0.99
 AA     203    98  0.79   0.99
 
 I want to get a data.frame of mean values by name:
 
 name  col1    col2   col3    col4
 AA    113.00  76.00  0.8945  0.8850
 BB    123.00   5.00  1.00    0.99
 
 I tried to use the by function:
 
 aa <- by(test[,2:5], feature, mean)
 I found that aa is an object of class by.
 > class(aa)
 [1] "by"
 
 How can I convert aa to a data frame?

Use aggregate instead; its by argument has to be a list:

aa <- aggregate(test[,2:5], list(feature), mean)
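
For illustration, here is a minimal sketch of the aggregate route. It rebuilds the table from the question and assumes that feature is simply the name column; that is a guess, since the original post does not define test or feature.

```r
# Toy data copied from the question; 'feature' is assumed to be the name column.
test <- data.frame(name = c("AA", "BB", "AA"),
                   col1 = c(23, 123, 203),
                   col2 = c(54, 5, 98),
                   col3 = c(0.999, 1, 0.79),
                   col4 = c(0.78, 0.99, 0.99))
feature <- test$name

# aggregate() applies mean to each column within each group and
# returns an ordinary data frame; 'by' has to be a list.
aa <- aggregate(test[, 2:5], by = list(name = feature), FUN = mean)
print(aa)
print(class(aa))
```

Because the result is already a data frame, no conversion step is needed afterwards.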

Regards
Petr

 
 thanks
 YU
 
 
 
 
 
 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] non linear estimation

Hi

I entered a query about nonlinear programming in the R site search and 
got many hits; maybe you can find something there that suits your needs. 
You could also look at the CRAN task view on Optimization and 
Mathematical Programming.

Regards
Petr

r-help-boun...@r-project.org wrote on 29.04.2010 03:38:27:

 
 Any suggestions? Actually I just want to know whether there is a package for
 nonlinear estimation with restrictions. Thanks. I am new to R.

Re: [R] by funtion

Hi

r-help-boun...@r-project.org wrote on 29.04.2010 08:11:41:

 Hi
 
 you could try
 
 do.call('rbind',aa)

No, no, no. rbind and cbind bind vectors as the rows or columns of a 
***matrix***; the result is not a data frame.

> do.call(rbind,aa)
    X069rutil X102anatas
105      26.9        7.9
200      22.8       10.6
400      30.6       13.3
600      50.8       20.6
800      78.7         NA
> exp.df <- do.call(rbind,aa)
> str(exp.df)
 num [1:5, 1:2] 26.9 22.8 30.6 50.8 78.7 7.9 10.6 13.3 20.6 NA
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:5] "105" "200" "400" "600" ...
  ..$ : chr [1:2] "X069rutil" "X102anatas"

If an object has a rectangular shape and column names, that does not 
automatically make it a data frame.
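
The distinction can be checked directly. The sketch below uses a small invented list (the names pieces, m and d are hypothetical) standing in for the per-group pieces of a by object.

```r
# Hypothetical per-group results, standing in for the pieces of a 'by' object.
pieces <- list(AA = c(c1 = 113, c2 = 76), BB = c(c1 = 123, c2 = 5))

m <- do.call(rbind, pieces)      # rbind on vectors builds a matrix
print(is.matrix(m))
print(is.data.frame(m))

d <- as.data.frame(m)            # explicit conversion to a data frame
print(is.data.frame(d))
str(d)
```

The printed matrix and the printed data frame look almost the same, which is why the class has to be checked rather than guessed from the output.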

Regards
Petr




 
 then turn the matrix into a data frame
 
 regards
 
 Tengfei
 
 
 
 
 
 -- 
 Tengfei Yin
 MCDB PhD student
 1620 Howe Hall, 2274,
 Iowa State University
 Ames, IA,50011-2274
 Homepage: www.tengfei.name
 

Re: [R] by funtion

2010-04-29 Thread Tengfei Yin

Hi,

Thanks. Actually, as I mentioned in my reply, you need to turn the matrix into
a data frame at the end if you use this method, e.g.


> df=data.frame(name=c('AA','BB','AA'),c1=c(23,123,203),c2=c(54,5,98),c3=c(0.999,1,0.79),c4=c(0.78,0.99,0.99))
> # colMeans instead of mean: current versions of R no longer define mean() for data frames
> aa=by(df[,2:5],df$name,colMeans)
> dd=do.call('rbind',aa)
> df=data.frame(dd)
> df
    c1 c2     c3    c4
AA 113 76 0.8945 0.885
BB 123  5 1.0000 0.990

Regards

Tengfei

Re: [R] non linear estimation

2010-04-29 Thread Rubén Roa


 -----Original Message-----
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of JamesHuang
 Sent: Thursday, 29 April 2010 3:38
 To: r-help@r-project.org
 Subject: Re: [R] non linear estimation
 
 
 Any suggestions? Actually I just want to know whether there is a 
 package for nonlinear estimation with restrictions. Thanks. I 
 am new to R.

I do not know if there is any specific package for optimization with 
restrictions, but you can use optim with method="L-BFGS-B".
This only lets you set bounds on single parameters, so for restrictions such as 
a+b<19 in
Y=a+(b+c*x)*exp(-d*x)
you could deduce your restrictions in terms of single parameters (for example, 
in your original mail you put that a<10, a+b<19, and b<3, so the restriction 
a+b<19 is actually redundant), or else you could think of some 
re-parameterization that would turn a+b (and all other multi-parameter 
restrictions) into a single parameter.
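
A rough sketch of the optim suggestion follows. Everything numeric in it is invented for illustration: the data are simulated from the model above, and the true values, start values and box bounds are assumptions, not taken from the original question. The criterion is plain least squares.

```r
# Simulated data from Y = a + (b + c*x) * exp(-d*x); the true values,
# noise level, start values and bounds below are all invented for illustration.
set.seed(1)
x <- seq(0, 10, length.out = 50)
true <- c(a = 5, b = 2, c = 1.5, d = 0.4)
y <- true[["a"]] + (true[["b"]] + true[["c"]] * x) * exp(-true[["d"]] * x) +
  rnorm(length(x), sd = 0.1)

# Residual sum of squares as a function of the parameter vector p = (a, b, c, d).
rss <- function(p) {
  fitted <- p[1] + (p[2] + p[3] * x) * exp(-p[4] * x)
  sum((y - fitted)^2)
}

lower <- c(0, 0, 0, 0)
upper <- c(10, 3, 5, 2)    # box constraints on single parameters only

fit <- optim(par = c(1, 1, 1, 1), fn = rss, method = "L-BFGS-B",
             lower = lower, upper = upper)
print(fit$par)
print(fit$convergence)     # 0 means optim reports convergence
```

With upper bounds of 10 on a and 3 on b, any bound on a+b larger than 13 holds automatically, which is the sense in which such a restriction is redundant.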

Wait, is this homework?


 

Dr. Rubén Roa-Ureta
AZTI - Tecnalia / Marine Research Unit
Txatxarramendi Ugartea z/g
48395 Sukarrieta (Bizkaia)
SPAIN


Re: [R] by funtion

2010-04-29 Thread Yuan Jian

Thanks Tengfei,
I have another question.
> df=data.frame(name=c('AA','BB','CC'),c1=c(23,123,5),c2=c(54,5,4),c3=c(0.999,1,23),c4=c(0.78,0.99,54))
> df
  name  c1 c2     c3    c4
1   AA  23 54  0.999  0.78
2   BB 123  5  1.000  0.99
3   CC   5  4 23.000 54.00

> df1=data.frame(name=c('BB','AA','DD'),c5=c(98,87,54),c6=c(7,6,3))
> df1
  name c5 c6
1   BB 98  7
2   AA 87  6
3   DD 54  3

Now I want the intersection of df and df1 by name, that is:
  name  c1 c2    c3   c4 c5 c6
  AA    23 54 0.999 0.78 87  6
  BB   123  5 1.000 0.99 98  7

Could you give me some advice?





Re: [R] Exporting an rgl graph

2010-04-29 Thread cgenolin


I need to use the function saveTriangleAsASY in my package. Does it already
exist in a package, or may I include it?

Christophe

Re: [R] by funtion

Hi

Sorry, I did not read your reply thoroughly enough. But matrices are 
quite often mistaken for data frames. Also, if you have a list with a mixture 
of numeric and non-numeric data, such an approach gives non-numeric output, as 
a matrix can hold values of only one type. I would therefore generally 
prefer the

> dd=do.call('data.frame',aa)
> dd
         AA     BB
c1 113.0000 123.00
c2  76.0000   5.00
c3   0.8945   1.00
c4   0.8850   0.99
> t(dd)
    c1 c2     c3    c4
AA 113 76 0.8945 0.885
BB 123  5 1.0000 0.990

approach.
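
The point about mixed types can be seen on a tiny invented list (the names mixed, m and d are hypothetical): binding it into a matrix coerces everything to character, while building a data frame keeps the numeric column numeric.

```r
# Hypothetical list mixing character and numeric pieces.
mixed <- list(id = c("x", "y"), value = c(1.5, 2.5))

m <- do.call(cbind, mixed)        # matrix: everything is coerced to character
d <- do.call(data.frame, mixed)   # data frame: each column keeps its own type

print(typeof(m))
print(sapply(d, class))
```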

Regards
Petr


Re: [R] by funtion

Hi

probably merge is what you want

see

?merge
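
As a sketch of what merge does with the two data frames from the question (the result names both and either are made up here):

```r
df  <- data.frame(name = c("AA", "BB", "CC"),
                  c1 = c(23, 123, 5), c2 = c(54, 5, 4),
                  c3 = c(0.999, 1, 23), c4 = c(0.78, 0.99, 54))
df1 <- data.frame(name = c("BB", "AA", "DD"),
                  c5 = c(98, 87, 54), c6 = c(7, 6, 3))

# By default merge() keeps only the rows whose key appears in both inputs.
both <- merge(df, df1, by = "name")
print(both)

# all = TRUE keeps the unmatched rows as well, padded with NA.
either <- merge(df, df1, by = "name", all = TRUE)
print(either)
```

The default call gives the intersection by name (AA and BB); the all = TRUE variant is shown only for contrast.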
Regards
Petr


[R] Split a vector by NA's - is there a better solution then a loop ?

2010-04-29 Thread Tal Galili

Hi all,

I would like to have a function like this:
split.vec.by.NA <- function(x)

That takes a vector like this:
x <- c(2,1,2,NA,1,1,2,NA,4,5,2,3)

And returns a list of length of 3, each element of the list is the relevant
segmented vector, like this:

$`1`
[1] 2 1 2
$`2`
[1] 1 1 2
$`3`
[1] 4 5 2 3


I found how to do it with a loop, but wondered if there is some smarter
(vectorized) way of doing it.



Here is the code I used:

x <- c(2,1,2,NA,1,1,2,NA,4,5,2,3)


split.vec.by.NA <- function(x)
{
  # assumes NAs separate groups of numbers
  # TODO: add code to check for it

  number.of.groups <- sum(is.na(x)) + 1
  # positions of all the NAs, plus one position past the end of the vector
  groups.end.point.locations <- c(which(is.na(x)), length(x)+1)
  group.start <- 1
  group.end <- NA
  # every position in a group gets that group's ID;
  # the NA positions are set to 0 afterwards
  new.groups.split.id <- x
  for(i in seq_len(number.of.groups))
  {
    group.end <- groups.end.point.locations[i]-1
    new.groups.split.id[group.start:group.end] <- i
    # move the start past this NA for the next iteration
    # (after the final iteration it no longer matters)
    group.start <- groups.end.point.locations[i]+1
  }
  new.groups.split.id[is.na(x)] <- 0
  return(split(x, new.groups.split.id)[-1])
}

split.vec.by.NA(x)




Thanks,
Tal




Contact
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--


[R] using get and paste in a loop to return objects for object names listed a strings

2010-04-29 Thread Nevil Amos

I am trying to create a heap of boxplots by looping through a series of 
factors and variables in a large data.frame, using paste to construct the 
factor and response names from the colnames.

I thought I could do this using get(),
but it is not working. What am I doing wrong?

thanks

Nevil Amos


sp.codes=levels(data.all$CODE_LETTERS)

for(spp in sp.codes) {


data.sp=subset(data.all,CODE_LETTERS==spp)

responses = colnames(data.all)[c(20,28,29,19)]
 #if (spp=="BT") responses = colnames(data.all)[c(19,20,26:29)]
groups=colnames(data.all)[c(9,10,13,16,30)]

data.sp=subset(data.all,CODE_LETTERS==spp)
for (response in responses){
for (group in groups){
r<-get(paste("data.sp$",response,sep=""))
g<-get(paste("data.sp$",group, sep=""))
print (r)
print(g)

boxplot(r ~g)
}}}

Error in get(paste("data.sp$", response, sep = "")) :
  object 'data.sp$Hb' not found
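
One likely reading of the error: get() looks up an object by its name, and there is no object called data.sp$Hb, only an object data.sp with a column Hb. A column whose name is held in a string can be extracted with [[ ]] instead. The sketch below uses an invented stand-in for data.sp (all column names and values are hypothetical) and draws to a null device.

```r
# Hypothetical stand-in for data.sp; the column names are invented.
data.sp <- data.frame(Hb   = c(10.1, 11.3, 9.8, 12.0),
                      Mass = c(20, 22, 19, 25),
                      SITE = c("a", "a", "b", "b"),
                      SEX  = c("m", "f", "m", "f"))
responses <- c("Hb", "Mass")
groups    <- c("SITE", "SEX")

pdf(NULL)                       # draw to a null device so no file is written
plotted <- character(0)
for (response in responses) {
  for (group in groups) {
    r <- data.sp[[response]]    # column looked up by its name held in a string
    g <- factor(data.sp[[group]])
    boxplot(r ~ g, xlab = group, ylab = response)
    plotted <- c(plotted, paste(response, group, sep = "~"))
  }
}
invisible(dev.off())
print(plotted)
```

In an interactive session the pdf(NULL) and dev.off() lines can be dropped so that the plots appear on screen.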


Re: [R] Split a vector by NA's - is there a better solution then a loop ?

2010-04-29 Thread Romain Francois


Maybe this :

> foo <- function( x ){
+   idx <- 1 + cumsum( is.na( x ) )
+   not.na <- ! is.na( x )
+   split( x[not.na], idx[not.na] )
+ }
> foo( x )
$`1`
[1] 2 1 2

$`2`
[1] 1 1 2

$`3`
[1] 4 5 2 3
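
To see why this works, it may help to print the intermediate counter. The sketch below repeats the same idea step by step; the variable names is.gap, res, y and res.y are made up for the illustration.

```r
x <- c(2, 1, 2, NA, 1, 1, 2, NA, 4, 5, 2, 3)

is.gap <- is.na(x)
idx <- 1 + cumsum(is.gap)   # group counter goes up by one at every NA
print(rbind(x = x, idx = idx))

# Drop the NA positions themselves, then split by the counter.
res <- split(x[!is.gap], idx[!is.gap])
print(res)

# Leading or consecutive NAs produce no empty groups, only gaps in the names.
y <- c(NA, 1, 2, NA, NA, 3)
res.y <- split(y[!is.na(y)], (1 + cumsum(is.na(y)))[!is.na(y)])
print(res.y)
```

If consecutive group names matter, unname() or a renumbering step can be applied to the result.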

Romain



--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://bit.ly/9aKDM9 : embed images in Rd documents
|- http://tr.im/OIXN : raster images and RImageJ
|- http://tr.im/OcQe : Rcpp 0.7.7


Re: [R] function which saves an image of a dgtMatrix as png

2010-04-29 Thread Gildas Mazo

Thanks so much



Douglas Bates wrote:
 image applied to a sparseMatrix object uses lattice functions to
 create the image.  As described in R FAQ 7.22 you must use

 print(image(x))

 or

 show(image(x))

 or even

 plot(image(x))

 when a lattice function is called from within another function.
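
A sketch of the fix applied inside a function, assuming the Matrix package is installed and a png device is available. The function name save_matrix_png and the matrix values are invented for the example.

```r
library(Matrix)

# Small sparse matrix in the spirit of the question (values are arbitrary).
aDgt <- spMatrix(nrow = 3, ncol = 3, i = c(1, 2), j = c(1, 2), x = c(0.5, -1.2))

save_matrix_png <- function(x, file) {
  png(file)
  on.exit(dev.off())   # close the device even if drawing fails
  print(image(x))      # image() only builds the lattice object; print() draws it
  invisible(file)
}

out <- save_matrix_png(aDgt, tempfile(fileext = ".png"))
print(file.exists(out))
```

At top level the explicit print is not needed because R auto-prints the returned lattice object; inside a function nothing is auto-printed, so nothing is drawn.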
 On Wed, Apr 28, 2010 at 1:20 PM, Gildas Mazo gildas.m...@curie.fr wrote:
   
 Hi,

 I'm getting crazy:

 This does work:

 library(Matrix)
 a1<-b1<-c(1,2)
 c1<-rnorm(2)
 aDgt<-spMatrix(ncol=3,nrow=3,i=a1,j=b1,x=c1)
 png("myImage.png")
 image(aDgt)
 dev.off()

 But this doesn't !!!

 f<-function(x){
 png("myImage.png")
 image(x)
 dev.off()
 }
 f(aDgt)

 My image is saved as a text file and contains nothing at all !!!
 Thanks in advance,

 Gildas Mazo


Re: [R] by funtion

2010-04-29 Thread Tengfei Yin

Hi Petr,

Thanks for your suggestions:)

@Yuan,

Petr is right, you can try

merge(df,df1,'name')

Regards

Tengfei



[R] Compact Patricia Trees (Tries)

2010-04-29 Thread Richard R. Liu

I have an application that searches a long list of character strings to determine which
occur at the beginning of a given word.  A straightforward R script using grep
takes a long time to run.  Rewriting it to use substr and match might be an
option, but I have the impression that preparing the list as a trie and
performing trie searches might lead to dramatic improvements in performance.


I have searched the CRAN packages and find no packages that support Compact
Patricia Trees.  Does anybody know of such?
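
For reference, the substr-style alternative mentioned above can be sketched in a few lines. This is not a trie, and the prefix list and word below are made up for illustration; it only shows a vectorised prefix test that avoids building one regular expression per string.

```r
# Hypothetical data: a list of candidate prefixes and one word to test.
prefixes <- c("un", "under", "und", "over", "u", "stand")
word <- "understand"

# Compare each prefix with the leading characters of the word.
# substring() is vectorised over its 'last' argument, so one call
# extracts the leading nchar(prefix) characters for every prefix.
is.prefix <- substring(word, 1, nchar(prefixes)) == prefixes
hits <- prefixes[is.prefix]
hits
```

Whether this is fast enough for a very long list would have to be timed on the real data.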


Thanks,
Richard

Richard R. Liu
richard@pueo-owl.ch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Split a vector by NA's - is there a better solution then a loop ?

2010-04-29 Thread Tal Galili

Definitely Smarter,
Thanks!

Tal

Contact
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--




On Thu, Apr 29, 2010 at 10:56 AM, Romain Francois 
romain.franc...@dbmail.com wrote:

 Maybe this :

  > foo <- function( x ){
  +   idx <- 1 + cumsum( is.na( x ) )
  +   not.na <- ! is.na( x )
  +   split( x[not.na], idx[not.na] )
  + }
  > foo( x )

 $`1`
 [1] 2 1 2

 $`2`
 [1] 1 1 2

 $`3`
 [1] 4 5 2 3
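
To see why the cumsum trick works, it may help to look at the intermediate vectors. This is the same computation as foo above, unrolled step by step on the vector from the question:

```r
x <- c(2, 1, 2, NA, 1, 1, 2, NA, 4, 5, 2, 3)

is.na(x)                      # TRUE marks the separators
idx <- 1 + cumsum(is.na(x))   # group counter, increases by one at every NA
idx
# 1 1 1 2 2 2 2 3 3 3 3 3

not.na <- !is.na(x)
res <- split(x[not.na], idx[not.na])   # drop the NAs, then split by group
res
```

One caveat: a run of consecutive NAs leaves a gap in the group labels (for c(1, NA, NA, 2) the groups are named 1 and 3), which may or may not matter for your use.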

 Romain

 Le 29/04/10 09:42, Tal Galili a écrit :


 Hi all,

 I would like to have a function like this:
 split.vec.by.NA <- function(x)

 That takes a vector like this:
 x <- c(2,1,2,NA,1,1,2,NA,4,5,2,3)

 And returns a list of length of 3, each element of the list is the
 relevant
 segmented vector, like this:

 $`1`
 [1] 2 1 2
 $`2`
 [1] 1 1 2
 $`3`
 [1] 4 5 2 3


 I found how to do it with a loop, but wondered if there is some smarter
 (vectorized) way of doing it.



 Here is the code I used:

 x <- c(2,1,2,NA,1,1,2,NA,4,5,2,3)


 split.vec.by.NA <- function(x)
 {
   # assumes NAs are separating groups of numbers
   # TODO: add code to check for it

   number.of.groups <- sum(is.na(x)) + 1
   # all the places with NA's + a number after the end of the vector
   groups.end.point.locations <- c(which(is.na(x)), length(x) + 1)
   group.start <- 1
   group.end <- NA
   # we will replace all the places of each group with the group ID,
   # except for the NAs, which will later be replaced by 0
   new.groups.split.id <- x
   for(i in seq_len(number.of.groups))
   {
     group.end <- groups.end.point.locations[i] - 1
     new.groups.split.id[group.start:group.end] <- i
     # make the new group start higher for the next loop
     # (at the final loop it won't matter)
     group.start <- groups.end.point.locations[i] + 1
   }
   new.groups.split.id[is.na(x)] <- 0
   return(split(x, new.groups.split.id)[-1])
 }

 split.vec.by.NA(x)




 Thanks,
 Tal


 --
 Romain Francois
 Professional R Enthusiast
 +33(0) 6 28 91 30 30
 http://romainfrancois.blog.free.fr
 |- http://bit.ly/9aKDM9 : embed images in Rd documents
 |- http://tr.im/OIXN : raster images and RImageJ
 |- http://tr.im/OcQe : Rcpp 0.7.7





Re: [R] Problem with optimization (constrOptim)

2010-04-29 Thread Człowiek Kuba

Hi,

You are right, my intention was to return a set of values and to minimize
them all in a multicriteria optimization problem.

The interesting thing is that when I actually used scalar return of this
function, by minimizing sum of squares in this form:


fr <- function(z) {
t(z%*%matrix(c(2,5,6), 3,1)-matrix(c(5,4,2), 3,1))%*%(z%*%matrix(c(2,5,6),
3,1)-matrix(c(5,4,2), 3,1))
}
constrOptim((matrix(c(0,0,0,0,0,0,0,0,0),3,3)), fr)
or
nlm(fr, matrix(c(0,0,0,0,0,0,0,0,0),3,3))
--
the function also returned a non-conformable error.
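
A possible reason for the error, offered as a sketch rather than a diagnosis: constrOptim and nlm pass the parameters to the objective as a plain numeric vector, so z %*% matrix(..., 3, 1) is no longer a 3x3 times 3x1 product. One way around this is to rebuild the matrix inside the objective. The helper to_matrix and the parametrisation below (first two columns free, third column fixed by the row sums) are my own assumptions, not something from the thread; the vectors are the ones from the original post.

```r
# Hypothetical helper: rebuild a 3x3 row-stochastic matrix from 6 free
# parameters. theta holds the first two columns (column-major); the third
# column is whatever makes each row sum to 1.
to_matrix <- function(theta) {
  free <- matrix(theta, nrow = 3, ncol = 2)
  cbind(free, 1 - rowSums(free))
}

B0 <- cbind(c(2, 5, 6), c(6, 2, 3), c(6, 1, 2))   # inputs from the post
B1 <- cbind(c(5, 4, 2), c(1, 1, 1), c(3, 4, 1))   # targets from the post

# Scalar objective: sum of squared residuals over all equations
fr <- function(theta) {
  A <- to_matrix(theta)
  sum((A %*% B0 - B1)^2)
}

# Linear constraints in the form ui %*% theta - ci >= 0
ui <- rbind(diag(6),                    # each free entry >= 0
            -cbind(diag(3), diag(3)))   # free entries of each row sum to <= 1
ci <- c(rep(0, 6), rep(-1, 3))

theta0 <- rep(0.3, 6)   # must lie strictly inside the feasible region
fit <- constrOptim(theta0, fr, grad = NULL, ui = ui, ci = ci)
A.hat <- to_matrix(fit$par)
A.hat
```

With grad = NULL, constrOptim falls back to Nelder-Mead, which can be slow; supplying a gradient would be worth trying for larger matrices.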
Kind regards
Jacob



2010/4/29 Nikhil Kaza nikhil.l...@gmail.com


 fr does not return a scalar.


 Nikhil



 On Apr 28, 2010, at 3:35 AM, Cz³owiek Kuba wrote:

   Hello,

 I have the following problem:
 I have a set of n matrix equations of the form:
 [b1] = [A] * [b0]
 [b2] = [A] * [b1]
 etc.
 The column vectors [b0], [b1], ... are GIVEN. We try to estimate the matrix A.
 As there are many equations (more than cells in matrix A) the system has no
 exact solution.
 A is the transition matrix (stochastic matrix) of a Markov process, so the sum
 of each row = 1 and each entry is a probability (a_ij in [0, 1]). I tried to
 estimate A by using constrOptim the following way, but apparently it won't
 work on matrices.

 fr <- function(x) {
 x%*%matrix(c(2,5,6), 3,1)-matrix(c(5,4,2), 3,1)
 x%*%matrix(c(6,2,3), 3,1)-matrix(c(1,1,1), 3,1)
 x%*%matrix(c(6,1,2), 3,1)-matrix(c(3,4,1), 3,1)
 }
 constrOptim(matrix(c(0.5,0.4,0.1,0.2,0.3,0.5,0.5,0.2,0.3),3,3), fr, NULL,
 ui=matrix(c(1,0,0,0,1,0,0,0,1),3,3), ci=matrix(c(-.1
 ,-.1,-.1,-.1,-.1,-.1,-.1,-.1,-.1),3,3))

 It produces the following error:
 Error in ui %*% theta : non-conformable arguments

 Kind regards and thanks for help
 Jacob


Re: [R] Lattice Groups

2010-04-29 Thread Santosh

Dear R experts..
Related to the example below, (which was discussed earlier)...
How do I control the graphical elements of the boxes, whiskers etc.?  I would like
their colors to go with specific groups. I tried changing
par.settings (box.umbrella, box.rectangle etc.) and could not make them
work. A sample dataset and example code are given below.

tmp <- data.frame(
y=rnorm(100),
category=rep(factor(letters[1:5]),each=20),
level=rep(factor(0:1), length=100))

barchart(y~factor(category),groups=level,
data=tmp,jitter.x=F,
panel=function(...){
panel.superpose( ...)
panel.superpose(panel.groups=panel.bwplot,
alpha=c(0.5,0.5),
varwidth=T,notch=T,
col=c("red","blue"),
fill=c("pink","lightblue"),pch=16,
par.settings=list(box.umbrella=list(col=c("red","blue")),box.dot=list(col=c("red","blue"))),...)
panel.superpose(panel.groups=panel.loess,lwd=2,col.line=c("red","blue"),alpha=0.2,lty=1,...)
panel.abline(h=0,col="black",lty=2)},
xlab="time bin (week)",
auto.key=list(space="right",text=c("A","H"),points=T))
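
One thing that may be worth checking, offered as a sketch and not as a confirmed fix: par.settings is an argument of the high-level lattice functions (bwplot, xyplot, ...), so passing it inside a call to panel.superpose is unlikely to have any effect. The example below reuses the iris data from the older messages quoted further down and sets the box, whisker and dot settings in the top-level call. Single colours are used on purpose; how vectors of colours are recycled across boxes is something to verify on your own plot.

```r
library(lattice)

# Data from the example quoted later in this thread (iris, stacked)
cols <- c("Sepal.Width", "Petal.Length", "Petal.Width")
stackedData <- stack(iris[, cols])
df <- data.frame(y = stackedData$values,
                 x = rep(iris$Species, 3),
                 which = gl(3, nrow(iris)))

# par.settings goes to the high-level call, not to panel.superpose()
p <- bwplot(y ~ x:which, data = df,
            par.settings = list(
              box.rectangle = list(col = "red", fill = "pink"),
              box.umbrella  = list(col = "red", lty = 1),
              box.dot       = list(col = "red", pch = 16),
              plot.symbol   = list(col = "red")))
# print(p) draws the plot on the current graphics device
```

The setting names used here (box.rectangle, box.umbrella, box.dot, plot.symbol) are the ones listed by trellis.par.get().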

Thanks,
Santosh
_
On Wed, Apr 8, 2009 at 12:07 PM, Deepayan Sarkar
deepayan.sar...@gmail.comwrote:

 On Wed, Apr 8, 2009 at 10:36 AM, Lyman, Mark mark.ly...@atk.com wrote:
  I don't understand your first question, but, since no one else has
  responded I can answer your second question. panel.bwplot, unlike
  panel.xyplot doesn't use panel.superpose when groups is not NULL. In
  order to get an analogous result you need to specify that you want to
  use panel.superpose.
 
  cols <- c("Sepal.Width", "Petal.Length", "Petal.Width")
  stackedData <- stack(iris[, cols])
  df <- data.frame(y = stackedData$values, x = rep(iris$Species, 3),
                   which = gl(3, nrow(iris)))

  bwplot(y ~ x:which, data = df, groups = which, panel=panel.superpose,
  panel.groups = panel.bwplot)

  If you don't like the default colors, you can set the fill colors with
  par.settings like:

  bwplot(y ~ x:which, data = df, groups = which, panel=panel.superpose,
  panel.groups = panel.bwplot,
  par.settings=list(superpose.symbol=list(fill=2:4)))

 And to answer the first question: using panel.superpose hijacks the
 parameters of the median spot, but they can be supplied explicitly:

 bwplot(y ~ x:which, data = df, groups = which, panel=panel.superpose,
        panel.groups = panel.bwplot,
        par.settings=list(superpose.symbol=list(fill=2:4)),
        col = "black", pch = 16)

 -Deepayan

 
  Without the groups, the fill colors are controlled like this
  bwplot(y~x:which, data = df,
  par.settings=list(box.rectangle=list(fill=2:4)))
 
  Although if you have groups, using the groups argument is probably
  better.
 
  Mark Lyman
 
 
  Message: 41
  Date: Tue, 7 Apr 2009 10:50:33 +0100
  From: Richard Weeks dickywe...@hotmail.com
  Subject: [R] Lattice Groups
  To: r-help@r-project.org
  Message-ID: blu138-w2277550025ed688aae0c91dc...@phx.gbl
  Content-Type: text/plain
 
 
  Hi all,
 
 
 
  I'm trying to achieve a few things using the lattice package but am
  failing miserably.
 
  I am plotting side by side box plots and using a grouping variable, e.g.
 
 
 
  cols <- c("Sepal.Width", "Petal.Length", "Petal.Width")
  stackedData <- stack(iris[, cols])
  df <- data.frame(y = stackedData$values, x = rep(iris$Species, 3),
                   which = gl(3, nrow(iris)))
 
  bwplot(y ~ x:which, data = df, group = which, panel.groups =
  panel.bwplot)
 
 
 
  My questions are
 
  1) How am I able to retain the median spot in the boxes?
 
  2) How can I change the fill using the par.settings argument rather than
  fill =1:3 say?
 
 
 
  Best wishes,
 
 
 
  Biff
 

[R] Request - adding recycled lwd parameter to polygon

2010-04-29 Thread Tal Galili

Hello dear members of R-help and R-core mailing list,

I am not sure whether this request is a ticket that should be filed somewhere
outside the mailing list.  If so, I apologize for not doing so and would like
to know where I should have filed it.



And to the subject matter:

I would like to use a command like this:

plot(c(1,8), 1:2, type="n")
polygon(1:7, c(2,1,2,NA,2,1,2),
 col=c("red", "blue"),
 # border=c("green", "yellow"),
 border=c(1,10),
 lwd=c(1:10))

To create two triangles, with different line widths.

But the polygon command doesn't seem to recycle the lwd parameter as it
does for the col, lty, and the border parameters.

I would like the resulting plot to look like what the following code will
produce:

plot(c(1,8), 1:2, type="n")
polygon(1:3, c(2,1,2),
 col=c("red"),
 # border=c("green", "yellow"),
 border=c(1,10),
 lwd=c(1))
polygon(5:7, c(2,1,2),
 col=c("blue"),
 # border=c("green", "yellow"),
 border=c(1,10),
 lwd=c(10))
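
Until polygon itself recycles lwd, the two separate calls above can be wrapped in a small helper. The sketch below is only a workaround at R level: polygon_lwd is a hypothetical name, and it simply splits the coordinates at the NAs and calls polygon once per piece, recycling col, border and lwd over the pieces.

```r
# Hypothetical wrapper: one polygon() call per NA-separated piece.
polygon_lwd <- function(x, y, col = NA, border = NULL, lwd = par("lwd"), ...) {
  gap  <- is.na(x) | is.na(y)          # NA in either coordinate separates pieces
  grp  <- cumsum(gap)
  keep <- !gap
  xs <- split(x[keep], grp[keep])
  ys <- split(y[keep], grp[keep])
  n  <- length(xs)
  col <- rep(col, length.out = n)      # recycle per piece
  lwd <- rep(lwd, length.out = n)
  if (!is.null(border)) border <- rep(border, length.out = n)
  for (i in seq_len(n)) {
    polygon(xs[[i]], ys[[i]], col = col[i],
            border = if (is.null(border)) NULL else border[i],
            lwd = lwd[i], ...)
  }
  invisible(n)                         # number of pieces drawn
}

plot(c(1, 8), 1:2, type = "n")
polygon_lwd(1:7, c(2, 1, 2, NA, 2, 1, 2),
            col = c("red", "blue"), lwd = c(1, 10))
```

Hatching arguments (density, angle) are passed through ... unchanged and are not recycled per piece in this sketch.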


I opened up the polygon code to add an lwd parameter, to be used the same
way lty is used.
For some reason it didn't work (I am wondering if it is because
.Internal(polygon(xy$x, xy$y, col, border, lty, lwd,...)) doesn't accept
lwd...)



Here is the updated code I wrote:



polygon2 <- function (x, y = NULL, density = NULL, angle = 45, border = NULL,
    col = NA, lty = par("lty"), lwd = par("lwd"), ..., fillOddEven = FALSE)
{
    ..debug.hatch <- FALSE
    xy <- xy.coords(x, y)
    if (is.numeric(density) && all(is.na(density) | density < 0))
        density <- NULL
    if (!is.null(angle) && !is.null(density)) {
        polygon.onehatch <- function(x, y, x0, y0, xd, yd,
            ..debug.hatch = FALSE, ...) {
            if (..debug.hatch) {
                points(x0, y0)
                arrows(x0, y0, x0 + xd, y0 + yd)
            }
            halfplane <- as.integer(xd * (y - y0) - yd * (x - x0) <= 0)
            cross <- halfplane[-1L] - halfplane[-length(halfplane)]
            does.cross <- cross != 0
            if (!any(does.cross))
                return()
            x1 <- x[-length(x)][does.cross]
            y1 <- y[-length(y)][does.cross]
            x2 <- x[-1L][does.cross]
            y2 <- y[-1L][does.cross]
            t <- (((x1 - x0) * (y2 - y1) - (y1 - y0) * (x2 - x1))/
                (xd * (y2 - y1) - yd * (x2 - x1)))
            o <- order(t)
            tsort <- t[o]
            crossings <- cumsum(cross[does.cross][o])
            if (fillOddEven)
                crossings <- crossings%%2
            drawline <- crossings != 0
            lx <- x0 + xd * tsort
            ly <- y0 + yd * tsort
            lx1 <- lx[-length(lx)][drawline]
            ly1 <- ly[-length(ly)][drawline]
            lx2 <- lx[-1L][drawline]
            ly2 <- ly[-1L][drawline]
            segments(lx1, ly1, lx2, ly2, ...)
        }
        polygon.fullhatch <- function(x, y, density, angle,
            ..debug.hatch = FALSE, ...) {
            x <- c(x, x[1L])
            y <- c(y, y[1L])
            angle <- angle%%180
            if (par("xlog") || par("ylog")) {
                warning("cannot hatch with logarithmic scale active")
                return()
            }
            usr <- par("usr")
            pin <- par("pin")
            upi <- c(usr[2L] - usr[1L], usr[4L] - usr[3L])/pin
            if (upi[1L] < 0)
                angle <- 180 - angle
            if (upi[2L] < 0)
                angle <- 180 - angle
            upi <- abs(upi)
            xd <- cos(angle/180 * pi) * upi[1L]
            yd <- sin(angle/180 * pi) * upi[2L]
            if (angle < 45 || angle > 135) {
                if (angle < 45) {
                  first.x <- max(x)
                  last.x <- min(x)
                }
                else {
                  first.x <- min(x)
                  last.x <- max(x)
                }
                y.shift <- upi[2L]/density/abs(cos(angle/180 * pi))
                x0 <- 0
                y0 <- floor((min(y) - first.x * yd/xd)/y.shift) * y.shift
                y.end <- max(y) - last.x * yd/xd
                while (y0 < y.end) {
                  polygon.onehatch(x, y, x0, y0, xd, yd,
                    ..debug.hatch = ..debug.hatch, ...)
                  y0 <- y0 + y.shift
                }
            }
            else {
                if (angle < 90) {
                  first.y <- max(y)
                  last.y <- min(y)
                }
                else {
                  first.y <- min(y)
                  last.y <- max(y)
                }
                x.shift <- upi[1L]/density/abs(sin(angle/180 * pi))
                x0 <- floor((min(x) - first.y * xd/yd)/x.shift) * x.shift
                y0 <- 0
                x.end <- max(x) - last.y * xd/yd
                while (x0 < x.end) {
                  polygon.onehatch(x, y, x0, y0, xd, yd, ..debug.hatch =

Re: [R] NLS "Singular Gradient" Error

2010-04-29 Thread bsnrh


Hi Ben,

That's great, thank you very much indeed.

Kind regards,
Neal


Quoting Ben Bolker [via R]  
ml-node+2074786-1865094303-243...@n4.nabble.com:




 bsnrh bsnrh at leeds.ac.uk writes:



 Hi Ben,

 Your book refers to the mle function in the emdbookx package. I was
 wondering if it's possible to find that package on the internet?

 Many thanks,
 Neal

   If the (draft) PDF says that, it's an error.
   See the mle2 function in the bbmle package (which is available
 from CRAN).  The lambertW function is in the emdbook package,
 which is also available in CRAN.

   install.packages(c("bbmle","emdbook"))
   library(bbmle)
   library(emdbook)
   ?mle2
   ?lambertW

   (You can address further questions to me off-list ...)


Re: [R] using get and paste in a loop to return objects for object names listed a strings

2010-04-29 Thread Paul Hiemstra


Nevil Amos wrote:
I am trying to create a heap of boxplots by looping through a series
of factors and variables in a large data.frame, using paste to
construct the factor and response names from the colnames.

I thought I could do this using get();
however, it is not working. What am I doing wrong?
You don't give a reproducible example, which makes it hard to answer your
question.


Not really in response to your question, but take a look at bwplot
from the lattice package or geom_boxplot from the ggplot2 package. These
functions can do all the work of drawing boxplots for a series
of factors and variables in a large data.frame. This saves you a lot of
time.


cheers,
Paul


thanks

Nevil Amos


sp.codes=levels(data.all$CODE_LETTERS)

for(spp in sp.codes) {

data.sp=subset(data.all,CODE_LETTERS==spp)

responses = colnames(data.all)[c(20,28,29,19)]
 #if (spp=="BT") responses = colnames(data.all)[c(19,20,26:29)]
groups=colnames   (data.all)[c(9,10,13,16,30)]

data.sp=subset(data.all,CODE_LETTERS==spp)
for (response in responses){
for (group in groups){
r<-get(paste("data.sp$",response,sep=""))
g<-get(paste("data.sp$",group, sep=""))
print (r)
print(g)

boxplot(r ~g)
}}}

Error in get(paste("data.sp$", response, sep = "")) :
  object 'data.sp$Hb' not found
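
The error arises because get() looks for an object whose name is literally "data.sp$Hb"; it does not evaluate the $ operator. A column whose name is held in a string can be extracted with [[ ]] instead. The sketch below uses a made-up stand-in for data.sp (the column SEX and all values are hypothetical; only the name Hb comes from the error message).

```r
# Hypothetical stand-in for data.sp: one response (Hb) and one grouping factor.
data.sp <- data.frame(Hb  = c(12.1, 13.4, 11.8, 14.2, 12.9, 13.7),
                      SEX = factor(c("F", "M", "F", "M", "F", "M")))
responses <- "Hb"
groups <- "SEX"

for (response in responses) {
  for (group in groups) {
    r <- data.sp[[response]]   # column looked up by the name held in a string
    g <- data.sp[[group]]
    boxplot(r ~ g, xlab = group, ylab = response)
  }
}
```

Labelling the axes with the strings themselves keeps the plots identifiable when many are produced in a loop.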




--
Drs. Paul Hiemstra
Department of Physical Geography
Faculty of Geosciences
University of Utrecht
Heidelberglaan 2
P.O. Box 80.115
3508 TC Utrecht
Phone:  +3130 274 3113 Mon-Tue
Phone:  +3130 253 5773 Wed-Fri
http://intamap.geo.uu.nl/~paul
http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770


[R] Random numbers with PDF of user-defined function

2010-04-29 Thread Nick Crosbie

Hi,

In S+/R, is there an easy way to generate random numbers with a
probability distribution specified by an exact user-defined function?

For example, I have a function:

f(x) = 1/(365 * x), which should be fitted for values of x between 1 and
100,000

How do I generate random numbers with a probability distribution that
exactly maps the above function?
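
One standard approach is inversion of the cumulative distribution function. The sketch below rests on an assumption: as written, 1/(365 * x) does not integrate to 1 over [1, 100000] (the integral is log(1e5)/365, about 0.03), so I take the target to be a density proportional to 1/x on [a, b]. For that density F(x) = log(x/a) / log(b/a), hence x = a * (b/a)^u with u uniform on (0, 1). The function name r_recip is made up.

```r
# Hypothetical sampler for a density proportional to 1/x on [a, b],
# using the inverse CDF: x = a * (b/a)^u, u ~ Uniform(0, 1).
r_recip <- function(n, a = 1, b = 1e5) {
  u <- runif(n)
  a * (b / a)^u
}

set.seed(1)
x <- r_recip(1e4)
range(x)
hist(log10(x))   # log10(x) should look roughly uniform on [0, 5]
```

If the density cannot be integrated and inverted in closed form, rejection sampling or numerical inversion would be the usual alternatives.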

Nick


Re: [R] Determining whether plot.new has been called

2010-04-29 Thread Jim Lemon


On 04/29/2010 02:21 AM, Dennis Fisher wrote:

Colleagues

I have a lengthy script that calls mtext.  Under most circumstances, a graphics 
device is open and a plot exists, in which case mtext works as expected.  
However, there are some instances where the graphics device is open but no plot 
exists.  When mtext is called, I receive an error message:
Error in mtext(1) : plot.new has not been called yet

The solution is to confirm that:
a.  the device is open: length(dev.list()) > 0
b.  whether plot.new has been called.

I need help on the latter - how does one test whether plot.new has been called?


Hi Dennis,
I use:

if(dev.cur() == 1) # there is no graphics device open

which always seems to be the null device. Since I have never had 
occasion to switch to the null device, this is a sort of test for 
whether there is another graphics device open, and thus whether plot.new 
has been called (on the current device). While it has always worked for 
me, I am aware that it is a Sneaky Trick and may not work in some 
situations.
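
A different angle, sketched here as an alternative and not as a tested replacement for the trick above: instead of asking whether plot.new has been called, attempt the annotation and trap the error. The helper name safe_mtext is made up, and note that it swallows every error from mtext, not only the plot.new one.

```r
# Hypothetical helper: try the annotation, report whether it could be drawn.
safe_mtext <- function(...) {
  tryCatch({
    mtext(...)
    TRUE
  }, error = function(e) FALSE)
}

# Usage: returns FALSE instead of stopping the script when no plot exists,
# so the caller can decide whether to call plot.new() and retry.
ok <- safe_mtext("1", side = 3)
if (!ok) {
  plot.new()
  ok <- safe_mtext("1", side = 3)
}
```

Whether silently starting a new plot is acceptable depends on the script; returning the flag leaves that decision to the caller.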


Jim


[R] control span in panel.loess in xyplot

2010-04-29 Thread Santosh

Dear R gurus..

Is it possible to control span settings for different values of a grouping
variable, when using xyplot? an example code shown below
d=data.frame(x=rep(sample(1:5,rep=F),10),y=rnorm(50),z=rep(sample(LETTERS[1:2],rep=F),25))
xyplot(y~x,data=d,groups=z,panel=panel.superpose,panel.groups=panel.loess(span=c(2/3,
3/4,1/2))
or something like..
xyplot(y~x,data=d,groups=z,panel=function(...)
{panel.superpose(...);panel.groups=panel.loess(span=3/4,...)})

Thanks,
Santosh


Re: [R] control span in panel.loess in xyplot

2010-04-29 Thread Gabor Grothendieck

See ?panel.number for lattice functions that can be used in your panel
function to discover which one is currently being drawn.
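
For the grouped case specifically, a sketch of how this could look, under the assumption that panel.superpose hands a group.number argument to the panel.groups function. The spans vector and the simulated data are made up for illustration.

```r
library(lattice)

set.seed(1)
d <- data.frame(x = runif(50, 1, 5),
                y = rnorm(50),
                z = rep(LETTERS[1:2], 25))

spans <- c(2/3, 3/4)   # hypothetical: one span per level of z

p <- xyplot(y ~ x, data = d, groups = z,
            panel = panel.superpose,
            panel.groups = function(x, y, ..., group.number) {
              panel.xyplot(x, y, ...)                              # the points
              panel.loess(x, y, span = spans[group.number], ...)   # the smooth
            })
# print(p) draws the plot on the current graphics device
```

Indexing spans by group.number keeps the mapping between group and span in one place.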

On Thu, Apr 29, 2010 at 6:28 AM, Santosh santosh2...@gmail.com wrote:
 Dear R gurus..

 Is it possible to control span settings for different values of a grouping
 variable, when using xyplot? an example code shown below
 d=data.frame(x=rep(sample(1:5,rep=F),10),y=rnorm(50),z=rep(sample(LETTERS[1:2],rep=F),25))
 xyplot(y~x,data=d,groups=z,panel=panel.superpose,panel.groups=panel.loess(span=c(2/3,
 3/4,1/2))
 or something like..
 xyplot(y~x,data=d,groups=z,panel=function(...)
 {panel.superpose(...);panel.groups=panel.loess(span=3/4,...)})

 Thanks,
 Santosh

        [[alternative HTML version deleted]]


Re: [R] Help in web browser

2010-04-29 Thread Duncan Murdoch


On 28/04/2010 11:07 PM, Chintanu wrote:

Hi,

I have recently updated to R 2.10.1 in my windows system. Since then,
whenever I look for help (e.g., by using ? Function command), the
information is displayed by opening a web-browser. However, I rather would
prefer to have the information in the usual pop-up style. Is there a way to
set/do it ?? Please inform.


Set options(help_type="text") for the plain text popups.  (This was an 
installation option; you could reinstall R to set it in your profile, or 
edit it into RHOME/etc/Rprofile.site yourself.)


Duncan Murdoch


Re: [R] Compact Patricia Trees (Tries)

2010-04-29 Thread Gabor Grothendieck

Using charmatch, partial matching of 10,000 5-letter words against the same
list can be done in 10 seconds on my machine, and 10,000 5-letter words
against 100,000 10-letter words in 1 minute.  Is that good enough?  Try
this simulation:

# generate N random words each k long
rwords <- function(N, k) {
   L <- sample(letters, N*k, replace = TRUE)
   apply(matrix(L, k), 2, paste, collapse = "")
}
w1 <- rwords(1e5, 10)
w2 <- rwords(1e4, 5)

system.time(charmatch(w2, w2))

system.time(charmatch(w2, w1))


On Thu, Apr 29, 2010 at 4:05 AM, Richard R. Liu richard@pueo-owl.ch wrote:
 I have an application that a long list of character strings to determine which
 occur at the beginning of a given word.  A straight forward R script using 
 grep
 takes a long time to run.  Rewriting it to use substr and match might be an
 option, but I have the impression that preparing the list as a trie and
 performing trie searches might lead to dramatic improvements in performance.


 I have searched the CRAN packages and find no packages that support Compact
 Patricia Trees.  Does anybody know of such?


 Thanks,
 Richard

 Richard R. Liu
 richard@pueo-owl.ch


[R] UpdateLinks = FALSE

2010-04-29 Thread Albert-Jan Roskam

Hi,
 
I'm reading 100s of excel files and many of them contain links to external 
files (I hate that, but that aside). Every time such a file is opened, a menu 
pops up asking if I want to update the links. I never want to update the links. 
I used the macro recorder to see what code would be needed to suppress that 
message, but to no avail (I tried more variations, but one attempt is shown 
below).
How can I suppress such messages?
 
excel <- comCreateObject("Excel.Application")
wb <- comGetProperty(excel, "Workbooks")
comSetProperty(wb, "UpdateLinks", FALSE)
owb <- comInvoke(wb, "Open", xlsfile) # at this point, it's too late

Another query: the program at large erases any cells that contain formulae. 
Thanks to Erich, the program now works like a charm. However, some cells 
contain formulae such as 832.1 * E4 * E3  (yes I know: big, big *sigh*). I 
did not take that possibility into account while writing the program. Would it 
be possible to capture the number (832.1)? My first idea would be to access the 
formula representation (as a string) and use a nifty regular expression.
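
On the regular expression idea, a sketch of the string-handling half only. It assumes the formula text is already available as an R character string (how to fetch it through COM is not shown, and the example strings are made up). The pattern picks out numeric literals while skipping the digits inside cell references such as E4 or $E$4.

```r
# Hypothetical formula strings, as they might be read from a cell.
f1 <- "=832.1 * E4 * E3"
f2 <- "=$E$4 * 2"

# A number that is not glued to a letter, '$', digit or '.' on the left,
# and not followed by a letter, digit or '.' on the right.
pat <- "(?<![A-Za-z0-9.$])[0-9]+(\\.[0-9]+)?(?![A-Za-z0-9.])"

extract_numbers <- function(f) {
  as.numeric(regmatches(f, gregexpr(pat, f, perl = TRUE))[[1]])
}

n1 <- extract_numbers(f1)
n2 <- extract_numbers(f2)
```

Formulas containing function names with digits (LOG10, for instance) or scientific notation would need a more careful pattern.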
 
Thank you in advance.

Cheers!!
Albert-Jan

~~
All right, but apart from the sanitation, the medicine, education, wine, public 
order, irrigation, roads, a fresh water system, and public health, what have 
the Romans ever done for us?
~~


  

[R] convert Factor as numeric

2010-04-29 Thread arnaud Gaboury

Dear group,

I know this issue has been already covered, and before you reply I must say
I have read the R-FAQ and search the mailing list archive.
I still can't manage to change my factor to numeric as I couldn't find any
clear answer.

Here is my df :

Pose1 -
structure(list(DESCRIPTION = structure(c(1L, 2L, 3L, 4L, 5L, 
8L), .Label = c( SUGAR NO.11 May/10 , COTTON NO.2 May/10 , 
PLATINUM Jul/10 , ROBUSTA COFFEE (10) May/10 , WHEAT May/10 , 
PRIMARY NICKEL USD, PRM HGH GD ALUMINIUM USD, SPCL HIGH GRADE ZINC
USD, 
STANDARD LEAD USD), class = factor), POSITION = c(5, 3, -1, 
15, 4, 2), SETTLEMENT = structure(c(3L, 5L, 2L, 1L, 4L, 8L), .Label =
c(1,353., 
1,739.4000, 16.5400, 467.7500, 78.1300, 25,760.8600, 
2,415.9000, 2,421.0500, 2,357.1200), class = factor)), .Names =
c(DESCRIPTION, 
POSITION, SETTLEMENT), row.names = c(1, 2, 3, 4, 
5, 51), class = data.frame)

S <- Pose1$SETTLEMENT  #select the last column
S
[1] 16.5400    78.1300    1,739.4000 1,353.     467.7500   2,421.0500
Levels: 1,353. 1,739.4000 16.5400 467.7500 78.1300 25,760.8600
2,415.9000 2,421.0500 2,357.1200
str(S)
 Factor w/ 9 levels "1,353.","1,739.4000",..: 3 5 2 1 4 8

Now I need to change S to numeric class

S1 <- as.numeric(levels(S))[as.integer(S)]   #doesn't work, numbers are rounded or NA
Warning message:
NAs introduced by coercion

S1 <- as.numeric(levels(S))[S]  #doesn't work, numbers are rounded or NA
Warning message:
NAs introduced by coercion

S1 <- as.numeric(as.character(S))  #doesn't work, numbers are rounded or NA
Warning message:
NAs introduced by coercion
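
A likely cause, judging from the values shown: several of them contain a comma as thousands separator ("1,739.4000"), and as.numeric() returns NA for such strings. Removing the commas before converting should help. The sketch below uses a handful of the values from the post as a stand-in for the real column.

```r
# A few of the values from the post, as a factor like Pose1$SETTLEMENT
S <- factor(c("16.5400", "78.1300", "1,739.4000", "467.7500", "2,421.0500"))

# Convert to character, strip the thousands separators, then to numeric
S1 <- as.numeric(gsub(",", "", as.character(S), fixed = TRUE))
S1
```

The same gsub() call can be applied to the character column right after reading the file, before any factor is created.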

If it can help, my column S is part of a DF that has been obtained via this
line :

pose=read.csv2("LSCPos1.csv",sep=",",dec=".",as.is=T,h=T,skip=1)[,c(4,8,14,15)]

pose -
structure(list(DESCRIPTION = c(WHEAT May/10 , WHEAT May/10 , 
WHEAT May/10 , WHEAT May/10 , COTTON NO.2 May/10 , COTTON NO.2 May/10
, 
COTTON NO.2 May/10 , PLATINUM Jul/10 ,  SUGAR NO.11 May/10 , 
 SUGAR NO.11 May/10 ,  SUGAR NO.11 May/10 ,  SUGAR NO.11 May/10 , 
 SUGAR NO.11 May/10 , ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10)
May/10 , 
ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 , 
ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 , 
ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 , 
ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 , 
ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 , 
PRM HGH GD ALUMINIUM USD 09/07/10 , PRM HGH GD ALUMINIUM USD 09/07/10 , 
PRIMARY NICKEL USD 04/06/10 , PRIMARY NICKEL USD 04/06/10 , 
PRIMARY NICKEL USD 10/06/10 , PRIMARY NICKEL USD 10/06/10 , 
STANDARD LEAD USD 01/07/10 , STANDARD LEAD USD 01/07/10 , 
STANDARD LEAD USD 01/07/10 , STANDARD LEAD USD 01/07/10 , 
STANDARD LEAD USD 01/07/10 , STANDARD LEAD USD 01/07/10 , 
STANDARD LEAD USD 01/07/10 , STANDARD LEAD USD 06/07/10 , 
SPCL HIGH GRADE ZINC USD 08/07/10 , SPCL HIGH GRADE ZINC USD 08/07/10 , 
SPCL HIGH GRADE ZINC USD 08/07/10 , SPCL HIGH GRADE ZINC USD 09/07/10 , 
SPCL HIGH GRADE ZINC USD 09/07/10 , SPCL HIGH GRADE ZINC USD 09/07/10 , 
SPCL HIGH GRADE ZINC USD 09/07/10 , SPCL HIGH GRADE ZINC USD 09/07/10 , 
SPCL HIGH GRADE ZINC USD 13/04/10 , SPCL HIGH GRADE ZINC USD 13/04/10 
), CREATED.DATE = structure(c(14705, 14707, 14707, 14711, 14700, 
14700, 14711, 14711, 14708, 14708, 14708, 14711, 14711, 14707, 
14707, 14707, 14707, 14707, 14708, 14708, 14708, 14708, 14708, 
14708, 14708, 14708, 14708, 14672, 14673, 14678, 14678, 14700, 
14700, 14700, 14700, 14700, 14700, 14700, 14705, 14707, 14707, 
14707, 14708, 14708, 14708, 14708, 14708, 14622, 14634), class = Date), 
QUANITY = c(1, 1, 1, 1, 1, 1, 1, -1, 1, 1, 1, 1, 1, 2, 1, 
1, 1, 2, 1, 1, 1, 1, 2, 1, 1, -1, 1, 1, -1, -1, 1, 1, -1, 
1, -1, -1, 1, -1, 1, 1, 1, -1, -1, 1, -1, 1, 1, 1, -1), CLOSING.PRICE =
c(467.7500, 
467.7500, 467.7500, 467.7500, 78.1300, 78.1300, 
78.1300, 1,739.4000, 16.5400, 16.5400, 16.5400, 
16.5400, 16.5400, 1,353., 1,353., 1,353., 
1,353., 1,353., 1,353., 1,353., 1,353., 
1,353., 1,353., 1,353., 1,353., 2,415.9000, 
2,415.9000, 25,755.7100, 25,755.7100, 25,760.8600, 
25,760.8600, 2,355.9600, 2,355.9600, 2,355.9600, 
2,355.9600, 2,355.9600, 2,355.9600, 2,355.9600, 2,357.1200, 
2,420.7300, 2,420.7300, 2,420.7300, 2,421.0500, 2,421.0500, 
2,421.0500, 2,421.0500, 2,421.0500, 2,388.4300, 2,388.4300
)), .Names = c(DESCRIPTION, CREATED.DATE, QUANITY, 
SETTLEMENT), row.names = c(NA, -49L), class = data.frame)

 str(pose)
'data.frame':   49 obs. of  4 variables:
 $ DESCRIPTION : chr  WHEAT May/10  WHEAT May/10  WHEAT May/10  WHEAT
May/10  ...
 $ CREATED.DATE:Class 'Date'  num [1:49] 14705 14707 14707 14711 14700 ...
 $ QUANITY : num  1 1 1 1 1 1 1 -1 1 1 ...
 $ SETTLEMENT  : chr  467.7500 467.7500 467.7500 467.7500 ...


Pose$SETTLEMENT has a character class, when it should have been
numeric. So maybe a solution would be to give a numeric class when I read
my .csv file?
I tried to change class of this

Re: [R] Sweave question

2010-04-29 Thread Duncan Murdoch


On 28/04/2010 11:31 PM, Felipe Carrillo wrote:

Hi:
I am using Sweave and texi2dvi to generate a LaTeX document but
can't find a way to hide the graphics while the R chunks are being
executed. I thought results=hide would do it but that's not the case. 
  


Sweave runs figure chunks multiple times.  The first time is probably 
what you're seeing:  it just runs the code, with no special devices 
created.  You need to tell R to use something other than your screen as 
the default device for this.  That's what happens if you run Sweave in 
batch mode, or if you choose options(device=pdf).  (You'll get a file 
Rplots.pdf created.)


Duncan Murdoch

If I do:
\begin{figure}[h]
<<figA=true,echo=F,fig=T,results=hide>>=
a <- rnorm(1000)
plot(a)
@
\caption{Weekly estimates.}
\label{figure:ggplot1}
\end{figure}

The graphic doesn't get displayed but gets printed on the document

but the code below shows the graphic...how can I hide it??
\begin{figure}[h]
<<figA=true,echo=F,fig=T,results=hide>>=
library(ggplot2)
winter <- read.csv("Winter_AllYears.csv")
wintermelt <- melt(winter,id="week")
print(ggplot(wintermelt,aes(week,value/1000)) + geom_line(aes(colour=variable))+ 
opts(legend.position="none") +
facet_wrap(~variable,ncol=2) + opts(title="Winter") +
labs(y="Number of fish X 1,000",x="WEEK"))
@
\caption{Weekly estimates.}
\label{figure:ggplot1}
\end{figure}
 
Felipe D. Carrillo

Supervisory Fishery Biologist
Department of the Interior
US Fish & Wildlife Service
California, USA

 




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




Re: [R] Random numbers with PDF of user-defined function

2010-04-29 Thread Robert A LaBudde


At 05:40 AM 4/29/2010, Nick Crosbie wrote:

Hi,

In S+/R, is there an easy way to generate random numbers with a
probability distribution specified by an exact user-defined function?

For example, I have a function:

f(x) = 1/(365 * x), which should be fitted for values of x between 1 and
100,000

How do I generate random numbers with a probability distribution that
exactly maps the above function?

Nick


First of all, your pdf should be f(x) = 1 / [x log(10^5)], if x is 
continuous.


Second, compute the cdf as F(x) = ln(x) / log(10^5).

Third, compute the inverse cdf as G(p) = exp[p log(10^5)]

Finally, to generate random variates, use G(u), where u is a uniform 
random variate in [0,1].


In R,

> G <- function (p) exp(p*log(10^5))
> G(runif(5))
[1] 11178.779736  9475.748549 65939.487801 94914.354479     1.694695
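
The inverse-cdf recipe above can be wrapped in a small function and checked against the theoretical cdf. This is only a minimal sketch, assuming the density is proportional to 1/x on [1, b] with b = 100,000; the name `r_recip`, the seed and the sample size are illustrative and not from the thread.

```r
# Inverse-CDF sampling for a density proportional to 1/x on [1, b].
# Assumption: b = 1e5, so f(x) = 1 / (x * log(b)) and F(x) = log(x) / log(b).
r_recip <- function(n, b = 1e5) {
  u <- runif(n)        # uniform variates in [0, 1]
  exp(u * log(b))      # G(u) = b^u, the inverse of F
}

set.seed(1)            # fixed seed only so the check is repeatable
x <- r_recip(10000)

# Compare the empirical cdf at 1000 with the theoretical value
# F(1000) = log(1000) / log(1e5).
emp  <- mean(x <= 1000)
theo <- log(1000) / log(1e5)
```

If `emp` and `theo` disagree noticeably for a large sample, the normalising constant or the support is probably wrong.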



Robert A. LaBudde, PhD, PAS, Dpl. ACAFS  e-mail: r...@lcfltd.com
Least Cost Formulations, Ltd.URL: http://lcfltd.com/
824 Timberlake Drive Tel: 757-467-0954
Virginia Beach, VA 23464-3239Fax: 757-467-2947

Vere scire est per causas scire


Re: [R] Random numbers with PDF of user-defined function

2010-04-29 Thread Duncan Murdoch


On 29/04/2010 5:40 AM, Nick Crosbie wrote:

Hi,

In S+/R, is there an easy way to generate random numbers with a
probability distribution specified by an exact user-defined function?

For example, I have a function:

f(x) = 1/(365 * x), which should be fitted for values of x between 1 and
100,000

How do I generate random numbers with a probability distribution that
exactly maps the above function?
  


You can use sample() with the prob argument set to the values of f(x).  
You probably want replace=TRUE as well.


Duncan Murdoch
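
The sample() suggestion could look roughly like the following. It is a sketch under one assumption that the thread does not settle: x is treated as discrete, taking the integer values 1 to 100,000. The variable names are illustrative.

```r
# Discrete sampling with weights given by f(x).
# Assumption: x is restricted to the integers 1..100000.
x_values <- 1:100000
weights  <- 1 / (365 * x_values)   # f(x) from the question; sample() rescales
                                   # the weights, so they need not sum to 1

set.seed(1)                        # only to make the draw repeatable
draws <- sample(x_values, size = 1000, replace = TRUE, prob = weights)
```

Because sample() normalises `prob` internally, the factor 365 has no effect on the result.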

Nick


[R] errors returned upon trying to update JGR

2010-04-29 Thread mauede

I have upgraded R and am currently running the following version:
R version 2.10.1 Patched (2010-02-20 r51163)
Copyright (C) 2010 The R Foundation for Statistical Computing
ISBN 3-900051-07-0

The characteristics of my system are the following:
OS:  Linux 2.6.27.29-0.1-default x86_64
  Current user:  mau...@linux-326k
  System:  openSUSE 11.1 (x86_64)
  KDE:  4.1.3 (KDE 4.1.3) release 4.10.4

The attempt to upgrade JGR generated the following errors. In addition,
on-line help files cannot be found from JGR. 
> JGR(update=TRUE)
trying URL 'http://www.rforge.net/src/contrib/JGR_1.7-2.tar.gz'
Content type 'application/x-gzip' length 528295 bytes (515 Kb)
opened URL
==
downloaded 515 Kb

trying URL 'http://cran.r-project.org/src/contrib/rJava_0.8-4.tar.gz'
Content type 'application/x-gzip' length 520037 bytes (507 Kb)
opened URL
==
downloaded 507 Kb

trying URL 'http://www.rforge.net/src/contrib/JavaGD_0.5-3.tar.gz'
Content type 'application/x-gzip' length 101898 bytes (99 Kb)
opened URL
==
downloaded 99 Kb

trying URL 'http://cran.r-project.org/src/contrib/iplots_1.1-3.tar.gz'
Content type 'application/x-gzip' length 331100 bytes (323 Kb)
opened URL
==
downloaded 323 Kb


The downloaded packages are in
/tmp/RtmpXEkgtp/downloaded_packages
Warning messages:
1: In install.packages(c("JGR", "rJava", "JavaGD", "iplots"), lt, c(cran,  :
  installation of package 'rJava' had non-zero exit status
2: In install.packages(c("JGR", "rJava", "JavaGD", "iplots"), lt, c(cran,  :
  installation of package 'JavaGD' had non-zero exit status
3: In install.packages(c("JGR", "rJava", "JavaGD", "iplots"), lt, c(cran,  :
  installation of package 'JGR' had non-zero exit status

Any suggestion is welcome.
Thank you,
Maura 




Re: [R] convert Factor as numeric

2010-04-29 Thread Ben Bolker

arnaud Gaboury arnaud.gaboury at gmail.com writes:

 
 Dear group,
 
 I know this issue has been already covered, and before you reply I must say
 I have read the R-FAQ and search the mailing list archive.
 I still can't manage to change my factor to numeric as I couldn't find any
 clear answer.

(Posting via Gmane, so there will probably be four other
solutions by the time this shows up.)

  Your problem is that R does not recognize the comma
separators in your numeric format.  Thanks for posting
reproducible code!

as.numeric(gsub(",","",as.character(Pose1$SETTLEMENT)))
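
A small self-contained illustration of that conversion, as a sketch only: the factor below is made up for the example (it is not the poster's data), and `as_plain_number` is a hypothetical helper name.

```r
# A factor whose labels use a comma as thousands separator.
settlement <- factor(c("16.5400", "78.1300", "1,739.4000", "25,760.8600"))

# Convert the labels (not the integer codes) to text, drop the commas,
# then convert to numeric.
as_plain_number <- function(x) {
  as.numeric(gsub(",", "", as.character(x), fixed = TRUE))
}

prices <- as_plain_number(settlement)
```

Calling as.numeric() directly on the factor would return the internal level codes, and as.numeric(as.character()) alone yields NA for every value that still contains a comma.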


Re: [R] Problem with optimization (constrOptim)

2010-04-29 Thread Nikhil Kaza

Ah, constrOptim is for linear inequality constraints. Your ci is a  
matrix; it should be a vector.
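
To show the shapes constrOptim expects, here is a minimal sketch on a toy problem (not the Markov-matrix problem from this thread): theta is a plain vector of length p, ui is a k x p matrix, and ci is a vector of length k. The objective and the constraints are made up for illustration.

```r
# Toy problem: minimise (x1 - 2)^2 + (x2 - 2)^2
# subject to x1 >= 0, x2 >= 0 and x1 + x2 <= 1.
f <- function(theta) sum((theta - 2)^2)   # returns a scalar

# constrOptim enforces ui %*% theta - ci >= 0, one row per constraint.
ui <- rbind(c( 1,  0),    #  x1      >= 0
            c( 0,  1),    #       x2 >= 0
            c(-1, -1))    # -x1 - x2 >= -1
ci <- c(0, 0, -1)         # a vector, one entry per row of ui

theta0 <- c(0.3, 0.3)     # must lie strictly inside the feasible region
fit <- constrOptim(theta0, f, grad = NULL, ui = ui, ci = ci)
fit$par                   # the analytical optimum is c(0.5, 0.5)
```

With grad = NULL the inner optimiser is Nelder-Mead, so the result is approximate. A matrix-valued unknown would first have to be flattened into a vector, with ui built to match.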


Nikhil



On Apr 29, 2010, at 3:14 AM, Człowiek Kuba wrote:

 Hi,

 You are right, my intention was to return a set of values and to  
 minimize them all in a multicriteria optimization problem.

 The interesting thing is that when I actually used a scalar return from  
 this function, by minimizing the sum of squares in this form:

 
 fr <- function(z) {
 t(z%*%matrix(c(2,5,6), 3,1)-matrix(c(5,4,2), 3,1))%*%(z%*%matrix(c(2,5,6), 3,1)-matrix(c(5,4,2), 3,1))
 }
 constrOptim((matrix(c(0,0,0,0,0,0,0,0,0),3,3)), fr)
 or
 nlm(fr, matrix(c(0,0,0,0,0,0,0,0,0),3,3))
 --
 the function also returned a non-conformable error.
 Kind regards
 Jacob



 2010/4/29 Nikhil Kaza nikhil.l...@gmail.com

 fr does not return a scalar.


 Nikhil



 On Apr 28, 2010, at 3:35 AM, Człowiek Kuba wrote:

 Hello,

 I have the following problem:
 I have a set of n matrix equations in the form of :
 [b1] = [A] * [b0]
 [b2] = [A] * [b1]
 etc.
 vertical vectors [b0], [b1], ... are GIVEN. We try to estimate  
 matrix A. As
 there are many equations (more than cells in matrix A) the system  
 has no
 solutions.
 A is the transition matrix (stochastic matrix) of a Markov process, so the
 sum of each row = 1 and each entry is a probability (aij in [0;1]). I tried
 to estimate A by using constrOptim the following way, but apparently it
 won't work on matrices.

 fr <- function(x) {
 x%*%matrix(c(2,5,6), 3,1)-matrix(c(5,4,2), 3,1)
 x%*%matrix(c(6,2,3), 3,1)-matrix(c(1,1,1), 3,1)
 x%*%matrix(c(6,1,2), 3,1)-matrix(c(3,4,1), 3,1)
 }
 constrOptim(matrix(c(0.5,0.4,0.1,0.2,0.3,0.5,0.5,0.2,0.3),3,3), fr,  
 NULL,
 ui=matrix(c(1,0,0,0,1,0,0,0,1),3,3), ci=matrix(c(-.1
 ,-.1,-.1,-.1,-.1,-.1,-.1,-.1,-.1), 
 3,3))

 It produces the following error:
 Error in ui %*% theta : non-conformable arguments

 Kind regards and thanks for help
 Jacob


Re: [R] non linear estimation

2010-04-29 Thread JamesHuang


Hey,
thanks. I had actually already found packages such as nlme, but I could
not find the command for restrictions. Anyway, thanks for your help.

James

Re: [R] non linear estimation

2010-04-29 Thread JamesHuang


It is an assignment, haha.
I just simplified the question; I could do it in Excel using Solver. I
just wonder whether I can find a way to do it in R. The main problem is
adding restrictions. I managed to do one question without restrictions in R
using nls. 
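
For restrictions that are simple bounds on the parameters, one option in base R is least squares via optim() with method "L-BFGS-B". This is a minimal sketch on made-up data, assuming box constraints are the kind of restriction meant; the model, the data and the bounds are all illustrative and not from the assignment.

```r
# Synthetic, noise-free data from y = 5 * exp(-0.5 * x).
x <- seq(0, 10, by = 0.5)
y <- 5 * exp(-0.5 * x)

# Sum of squared errors for the model y = a * exp(-b * x); p = c(a, b).
sse <- function(p) sum((y - p[1] * exp(-p[2] * x))^2)

# Restrictions (illustrative): a >= 0 and 0 <= b <= 0.3.
# The unrestricted best b would be 0.5, so the bound on b is active.
fit <- optim(par = c(1, 0.1), fn = sse, method = "L-BFGS-B",
             lower = c(0, 0), upper = c(Inf, 0.3))
fit$par
```

More general linear inequality restrictions would need something like constrOptim(), and nonlinear ones a dedicated package from the CRAN Optimization task view.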

James

Re: [R] convert Factor as numeric

2010-04-29 Thread arnaud Gaboury

Thank you Petr, I was just trying something like that a few minutes ago :-)

as.numeric(gsub(",", "", S))  does exactly what I want. 




 -Original Message-
 From: Petr PIKAL [mailto:petr.pi...@precheza.cz]
 Sent: Thursday, April 29, 2010 1:28 PM
 To: arnaud Gaboury
 Cc: r-help@r-project.org
 Subject: Odp: [R] convert Factor as numeric
 
 Hi
 
 You have to get rid of the thousands separator first
 
 as.numeric(gsub(",", "", S))
 
 Regards
 Petr
 
 r-help-boun...@r-project.org napsal dne 29.04.2010 13:12:44:
 
  Dear group,
 
  I know this issue has been already covered, and before you reply I
 must
 say
  I have read the R-FAQ and search the mailing list archive.
  I still can't manage to change my factor to numeric as I couldn't
 find
 any
  clear answer.
 
  Here is my df :
 
  Pose1 <-
  structure(list(DESCRIPTION = structure(c(1L, 2L, 3L, 4L, 5L,
  8L), .Label = c(" SUGAR NO.11 May/10 ", "COTTON NO.2 May/10 ",
  "PLATINUM Jul/10 ", "ROBUSTA COFFEE (10) May/10 ", "WHEAT May/10 ",
  "PRIMARY NICKEL USD", "PRM HGH GD ALUMINIUM USD", "SPCL HIGH GRADE ZINC USD",
  "STANDARD LEAD USD"), class = "factor"), POSITION = c(5, 3, -1,
  15, 4, 2), SETTLEMENT = structure(c(3L, 5L, 2L, 1L, 4L, 8L), .Label =
  c("1,353.",
  "1,739.4000", "16.5400", "467.7500", "78.1300", "25,760.8600",
  "2,415.9000", "2,421.0500", "2,357.1200"), class = "factor")), .Names =
  c("DESCRIPTION",
  "POSITION", "SETTLEMENT"), row.names = c("1", "2", "3", "4",
  "5", "51"), class = "data.frame")
 
  S <- Pose1$SETTLEMENT  #select the last column
  > S
  [1] 16.5400    78.1300    1,739.4000 1,353.     467.7500   2,421.0500
  Levels: 1,353. 1,739.4000 16.5400 467.7500 78.1300 25,760.8600
  2,415.9000 2,421.0500 2,357.1200
  > str(S)
   Factor w/ 9 levels "1,353.","1,739.4000",..: 3 5 2 1 4 8
 
  Now I need to change S to numeric class
 
  > S1 <- as.numeric(levels(S))[as.integer(S)]   #doesn't work, numbers are rounded or NA
  Warning message:
  NAs introduced by coercion
 
  > S1 <- as.numeric(levels(S))[S]  #doesn't work, numbers are rounded or NA
  Warning message:
  NAs introduced by coercion
 
  > S1 <- as.numeric(as.character(S))  #doesn't work, numbers are rounded or NA
  Warning message:
  NAs introduced by coercion
 
  If it can help, my column S is part of a DF that has been obtained
 via
 this
  line :
 
 
  pose=read.csv2("LSCPos1.csv",sep=",",dec=".",as.is=T,h=T,skip=1)[,c(4,8,14,15)]
 
  pose -
  structure(list(DESCRIPTION = c(WHEAT May/10 , WHEAT May/10 ,
  WHEAT May/10 , WHEAT May/10 , COTTON NO.2 May/10 , COTTON NO.2
 May/10
  ,
  COTTON NO.2 May/10 , PLATINUM Jul/10 ,  SUGAR NO.11 May/10 ,
   SUGAR NO.11 May/10 ,  SUGAR NO.11 May/10 ,  SUGAR NO.11 May/10
 ,
   SUGAR NO.11 May/10 , ROBUSTA COFFEE (10) May/10 , ROBUSTA
 COFFEE
 (10)
  May/10 ,
  ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 ,
  ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 ,
  ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 ,
  ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 ,
  ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 ,
  PRM HGH GD ALUMINIUM USD 09/07/10 , PRM HGH GD ALUMINIUM USD
 09/07/10
 ,
  PRIMARY NICKEL USD 04/06/10 , PRIMARY NICKEL USD 04/06/10 ,
  PRIMARY NICKEL USD 10/06/10 , PRIMARY NICKEL USD 10/06/10 ,
  STANDARD LEAD USD 01/07/10 , STANDARD LEAD USD 01/07/10 ,
  STANDARD LEAD USD 01/07/10 , STANDARD LEAD USD 01/07/10 ,
  STANDARD LEAD USD 01/07/10 , STANDARD LEAD USD 01/07/10 ,
  STANDARD LEAD USD 01/07/10 , STANDARD LEAD USD 06/07/10 ,
  SPCL HIGH GRADE ZINC USD 08/07/10 , SPCL HIGH GRADE ZINC USD
 08/07/10
 ,
  SPCL HIGH GRADE ZINC USD 08/07/10 , SPCL HIGH GRADE ZINC USD
 09/07/10
 ,
  SPCL HIGH GRADE ZINC USD 09/07/10 , SPCL HIGH GRADE ZINC USD
 09/07/10
 ,
  SPCL HIGH GRADE ZINC USD 09/07/10 , SPCL HIGH GRADE ZINC USD
 09/07/10
 ,
  SPCL HIGH GRADE ZINC USD 13/04/10 , SPCL HIGH GRADE ZINC USD
 13/04/10
 
  ), CREATED.DATE = structure(c(14705, 14707, 14707, 14711, 14700,
  14700, 14711, 14711, 14708, 14708, 14708, 14711, 14711, 14707,
  14707, 14707, 14707, 14707, 14708, 14708, 14708, 14708, 14708,
  14708, 14708, 14708, 14708, 14672, 14673, 14678, 14678, 14700,
  14700, 14700, 14700, 14700, 14700, 14700, 14705, 14707, 14707,
  14707, 14708, 14708, 14708, 14708, 14708, 14622, 14634), class =
 Date),
  QUANITY = c(1, 1, 1, 1, 1, 1, 1, -1, 1, 1, 1, 1, 1, 2, 1,
  1, 1, 2, 1, 1, 1, 1, 2, 1, 1, -1, 1, 1, -1, -1, 1, 1, -1,
  1, -1, -1, 1, -1, 1, 1, 1, -1, -1, 1, -1, 1, 1, 1, -1),
 CLOSING.PRICE =
  c(467.7500,
  467.7500, 467.7500, 467.7500, 78.1300, 78.1300,
  78.1300, 1,739.4000, 16.5400, 16.5400, 16.5400,
  16.5400, 16.5400, 1,353., 1,353., 1,353.,
  1,353., 1,353., 1,353., 1,353.,
 1,353.,
  1,353., 1,353., 1,353., 1,353.,
 2,415.9000,
  2,415.9000, 25,755.7100, 25,755.7100, 25,760.8600,
  25,760.8600, 2,355.9600, 2,355.9600, 2,355.9600,
  2,355.9600, 2,355.9600, 2,355.9600, 2,355.9600,
 2,357.1200,
  2,420.7300, 2,420.7300, 2,420.7300, 2,421.0500,
 2,421.0500,

Re: [R] Split a vector by NA's - is there a better solution then a loop ?

Another option could be:

split(x, replace(cumsum(is.na(x)), is.na(x), -1))[-1]
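
A short walk-through of that one-liner on the example vector, plus a variant. The variant `split_by_na` is an assumption of mine, not from the thread: it drops the NAs before splitting, so a vector without any NA still comes back as a single group instead of losing its first group to the `[-1]`.

```r
x <- c(2, 1, 2, NA, 1, 1, 2, NA, 4, 5, 2, 3)

# One-liner from the message: cumsum(is.na(x)) numbers the runs 0, 1, 2;
# the NA positions themselves are relabelled -1, which sorts first and is
# then dropped with [-1].
by_replace <- split(x, replace(cumsum(is.na(x)), is.na(x), -1))[-1]

# Variant (hypothetical helper): remove the NAs first, so nothing has to
# be discarded afterwards.
split_by_na <- function(v) {
  keep <- !is.na(v)
  split(v[keep], cumsum(is.na(v))[keep])
}
by_filter <- split_by_na(x)
```

Both versions name the groups "0", "1", "2" rather than "1", "2", "3"; use unname() or names<- if the labels matter.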

On Thu, Apr 29, 2010 at 4:42 AM, Tal Galili tal.gal...@gmail.com wrote:

 Hi all,

 I would like to have a function like this:
 split.vec.by.NA <- function(x)

 That takes a vector like this:
 x <- c(2,1,2,NA,1,1,2,NA,4,5,2,3)

 And returns a list of length of 3, each element of the list is the relevant
 segmented vector, like this:

 $`1`
 [1] 2 1 2
 $`2`
 [1] 1 1 2
 $`3`
 [1] 4 5 2 3


 I found how to do it with a loop, but wondered if there is some smarter
 (vectorized) way of doing it.



 Here is the code I used:

 x <- c(2,1,2,NA,1,1,2,NA,4,5,2,3)


 split.vec.by.NA <- function(x)
 {
 # assumes NA are separating groups of numbers
 #TODO: add code to check for it

 number.of.groups <- sum(is.na(x)) + 1
 # all the places with NA's + a number after the ending of the vector
 groups.end.point.locations <- c(which(is.na(x)), length(x)+1)
 group.start <- 1
 group.end <- NA
 # we will replace all the places of the group with the group ID,
 # except for the NA, which will later be replaced by 0
 new.groups.split.id <- x
 for(i in seq_len(number.of.groups))
 {
 group.end <- groups.end.point.locations[i]-1
 new.groups.split.id[group.start:group.end] <- i
 # make the new group start higher for the next loop
 # (at the final loop it won't matter)
 group.start <- groups.end.point.locations[i]+1
 }
 new.groups.split.id[is.na(x)] <- 0
 return(split(x, new.groups.split.id)[-1])
 }

 split.vec.by.NA(x)




 Thanks,
 Tal




 Contact
 Details:---
 Contact me: tal.gal...@gmail.com |  972-52-7275845
 Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
 www.r-statistics.com (English)

 --





-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O


Re: [R] UpdateLinks = FALSE

2010-04-29 Thread Albert-Jan Roskam

Sorry, I intended to send this straight to the rcom mailing list. It's about 
the rcom package.

Cheers!!
Albert-Jan

~~
All right, but apart from the sanitation, the medicine, education, wine, public 
order, irrigation, roads, a fresh water system, and public health, what have 
the Romans ever done for us?
~~

--- On Thu, 4/29/10, Albert-Jan Roskam fo...@yahoo.com wrote:


From: Albert-Jan Roskam fo...@yahoo.com
Subject: [R] UpdateLinks = FALSE
To: R Mailing List r-help@r-project.org
Date: Thursday, April 29, 2010, 1:07 PM


Hi,
 
I'm reading 100s of excel files and many of them contain links to external 
files (I hate that, but that aside). Every time such a file is opened, a menu 
pops up asking if I want to update the links. I never want to update the links. 
I used the macro recorder to see what code would be needed to suppress that 
message, but to no avail (I tried more variations, but one attempt is shown 
below).
How can I suppress such messages?
 
excel <- comCreateObject("Excel.Application")
wb <- comGetProperty(excel, "Workbooks")
comSetProperty(wb, "UpdateLinks", FALSE)
owb <- comInvoke(wb, "Open", xlsfile) # at this point, it's too late

Another query: the program at large erases any cells that contain formulae. 
Thanks to Erich, the program now works like a charm. However, some cells 
contain formulae such as 832.1 * E4 * E3  (yes I know: big, big *sigh*). I 
did not take that possibility into account while writing the program. Would it 
be possible to capture the number (832.1)? My first idea would be to access the 
formula representation (as a string) and use a nifty regular expression.
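
The regular-expression idea could be sketched as follows. It assumes the formula text has already been read into a character vector (how to obtain it through rcom is not shown here) and that the constant is the first term of the formula; `extract_leading_number` is a hypothetical helper name.

```r
# Pull a leading numeric constant out of formula strings such as
# "=832.1 * E4 * E3". Formulas that do not start with a number give NA.
extract_leading_number <- function(formula) {
  # optional "=", optional spaces, then digits with an optional decimal part
  pattern <- "^=?[[:space:]]*([0-9]+(\\.[0-9]+)?).*$"
  out <- rep(NA_real_, length(formula))
  hit <- grepl(pattern, formula)
  out[hit] <- as.numeric(sub(pattern, "\\1", formula[hit]))
  out
}

formulas  <- c("=832.1 * E4 * E3", "=E4 * E3", "12*A1")
constants <- extract_leading_number(formulas)
```

Anchoring the pattern at the start avoids picking up the digits of cell references such as E4, at the cost of missing constants that appear later in the formula.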
 
Thank you in advance.

Cheers!!
Albert-Jan

~~
All right, but apart from the sanitation, the medicine, education, wine, public 
order, irrigation, roads, a fresh water system, and public health, what have 
the Romans ever done for us?
~~


      
    [[alternative HTML version deleted]]



[R] randomness in stepclass (klaR) or lda (MASS) ?

2010-04-29 Thread Eric Elguero

Hi,

a colleague ran a stepwise discriminant analysis
twice in a row and got different results, suggesting
some stochasticity in the algorithms involved.
I looked at her data and found that there was a lot
of collinearity, so that I reckoned that maybe stepclass 
(klaR) cannot find a clear winner when trying to include a 
new variable and makes a random choice. Is that true?
another possibility is that lda (from MASS) computes
CV classification rates from a random subsample instead of
using all the data (?) That might be a sensible choice
with a very large sample.
I advised her to run the function several times and
see if a consensus emerges, but that doesn't seem to
be the case, and besides, I would like to know what
really is going on.

thanks

Eric Elguero
Laboratory Genetics and Evolution of Infectious Diseases, 
Team: Genetics and Adaptation of Plasmodium
UMR 2724 CNRS-IRD,
IRD Montpellier, 
911 Avenue Agropolis, BP 64501, 
34394 Montpellier Cedex 5, 
France


> f4.U.spDA <- stepclass(f.mes, f.gp4,
"lda",improvement=0.01,prior=rep(0.25,4))
 `stepwise classification', using 10-fold cross-validated correctness
rate of method `lda'.
89 observations of 31 variables in 4 classes; direction: both
stop criterion: improvement less than 1%.
correctness rate: 0.58333;  in: X2;  variables (1): X2 
correctness rate: 0.66389;  in: X9;  variables (2): X2, X9 
correctness rate: 0.69583;  in: X27;  variables (3): X2, X9, X27 

 hr.elapsed min.elapsed sec.elapsed 
       0.00        0.00       20.77 

> f4.U.spDA <- stepclass(f.mes, f.gp4,
"lda",improvement=0.01,prior=rep(0.25,4))
 `stepwise classification', using 10-fold cross-validated correctness
rate of method `lda'.
89 observations of 31 variables in 4 classes; direction: both
stop criterion: improvement less than 1%.
correctness rate: 0.60556;  in: X2;  variables (1): X2 
correctness rate: 0.71806;  in: X6;  variables (2): X2, X6 

 hr.elapsed min.elapsed sec.elapsed 
       0.00        0.00       15.14


Re: [R] operator problem within function

2010-04-29 Thread Bunny, lautloscrew.com

Sorry for that offlist post; I did not mean to do it, I just hit the 
wrong button. Unfortunately this disadvantage is not mentioned next to $ in the 
manual. 


 
 On Apr 29, 2010, at 2:34 AM, Bunny, lautloscrew.com wrote:
 
 David,
 
 With your help I finally got it. THX!
 Sorry for handing out some ugly names.
 Reason being: it's a German dataset with German variable names. With those 
 German names you are always sure you don't use a forbidden
 name. I just did not want to hit one of those by accident when changing 
 these names for the mailing list. columna is just the Latin term for column 
 :) . Anyway here's what worked
 
 note: I just tried to use some more real names here.
  
 recode_items = function(dataframe,question_number,medium=3){
  
  # note: column names of the initial data.frame are like 
  # Question1, Question2 etc. Using [,1] would not be very practical since
  # the df contains some other data too. Indexing by names seemed the most 
  # comfortable way so far.
  question <- paste("Question",question_number,sep="")
  # needed indexing here that understands characters, that's why 
  # going with [,question_number] did not work.
  dataframe[question][dataframe[question]==3]=0
 
 This would be more typical:
 
 dataframe[dataframe[question]==3, question] <- 0
 
  
  
  return(dataframe)
  
  }
 
 recode_items(mydataframe,question_number,3)
 # this call uses the dataframe that contains the answers of survey 
 participants. Question number is an argument that selects the question from 
 the dataframe that should be recoded. In surveys some weighting schemes only 
 respect extreme answers, which is why the medium answer is recoded to zero. 
 Since it depends on the item scale what medium actually is, I need it to be 
 an argument of my function.
 
 Did you want a further logical test with that =1 or some sort of 
 assignment???
 
 So yes, it's an assignment.
 
 Moral: Generally better to use [ indexing.
 
 That's what really made my day (and it's only 9.30 a.m. here). Are there 
 exceptions to the rule?
 
 Not that I know of.
 
 I just worked a lot with the $ in the past.
 
 $colname is just syntactic sugar for either ["colname"] or [ 
 ,"colname"] and it has the disadvantage that colname is not evaluated.
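
To make the difference concrete, here is a minimal sketch of the recoding function written with [ and [[ indexing. The data frame `survey` and its values are made up for illustration, and the function body is one possible arrangement rather than the poster's final code.

```r
# Recode the "medium" answer of one question column to 0.
recode_items <- function(df, question_number, medium = 3) {
  question <- paste("Question", question_number, sep = "")
  # [[ evaluates `question`, so the column is looked up by its value
  rows <- which(df[[question]] == medium)
  df[rows, question] <- 0
  df
}

survey  <- data.frame(Question1 = c(1, 3, 5), Question2 = c(3, 3, 2))
recoded <- recode_items(survey, 2)

# With $ the name is taken literally: this looks for a column called "q".
q <- "Question2"
via_dollar <- survey$q          # NULL, there is no column named "q"
via_bracket <- survey[[q]]      # the Question2 column
```

which() also drops any NA comparisons, so missing answers are left untouched.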
 
 
 
 thx
 
 matt
 
  
 
 
 On 29.04.2010, at 00:56, David Winsemius wrote:
 
 
 On Apr 28, 2010, at 5:45 PM, David Winsemius wrote:
 
 
 On Apr 28, 2010, at 5:31 PM, Bunny, lautloscrew.com wrote:
 
 Dear all,
 
 I have a problem with processing dataframes within a function using the 
 $.
 Here's my code:
 
 
 recode_items = function(dataframe,number,medium=2){
   
   # this works
   q <- paste("columna",number,sep="")
 
 Do you really want q to equal "columna2" when number equals 2?
 
 
   # this does not work, particularly because dataframe is not processed
   # dataframe should be: givenframe$columnagivennumber
   a=dataframe$q[dataframe$q==medium]=1
 
 Did you want a further logical test with that =1 or some sort of 
 assignment???
 
 
 a) Do you want to work on the column from dataframe ( horrible name for 
 this purpose IMO) with the name columna2? If so, then start with
 
 dataframe[ , q ]
 
  the q will be evaluated in this form whereas it would not when used 
 with $.
 
 b) (A guess in absence of explanation of a goal.) Now do you want all of 
 the rows where that vector equals medium? If so ,then try this:
 
 dataframe[ dataframe[ , q ]==2 , ]  # untested in the absence of data
 
 Ooops. should have been:
 
 dataframe[ dataframe[ , q ]==medium , ] #since both q and medium will be 
 evaluated.
 
 
 
 Moral: Generally better to use [ indexing.
 
 -- 
 David.
 
 
 
 
   return(a)   
 
 }
 
 
 If I call this function, I'd like it to return my dataframe. The 
 problem appears to be somewhere around the $. I'm sure this is not too hard, 
 but somehow I am stuck. I'll keep searching the manuals.
 Thx for any help in advance.
 
 best
 
 matt
   [[alternative HTML version deleted]]
 

[R] Using plyr::ddply more (memory) efficiently?

2010-04-29 Thread Steve Lianoglou

Hi all,

In short:

I'm running ddply on an admittedly (somehow) large data.frame (not
that large). It runs fine until it finishes and gets to the
collating part where all subsets of my data.frame have been
summarized and they are being reassembled into the final summary
data.frame (sorry, don't know the correct plyr terminology). During
collation, my R workspace RAM usage goes from about 1.5 GB up to 20GB
until I kill it.

Running a similar piece of code that iterates manually w/o ddply by
using a combo of lapply and a do.call(rbind, ...) uses considerably
less ram (tops out at about 8GB).

How can I use ddply more efficiently?

Longer:

Here's more info:

 * The data.frame itself ~ 15.8 MB when loaded.
 * ~ 400,000 rows, 8 columns

It looks like so:

   exon.start exon.width exon.width.unique exon.anno counts symbol transcript  chr
1        4225        468                 0       utr      0 WASH5P     WASH5P chr1
2        4833         69                 0       utr      1 WASH5P     WASH5P chr1
3        5659        152                38       utr      1 WASH5P     WASH5P chr1
4        6470        159                 0       utr      0 WASH5P     WASH5P chr1
5        6721        198                 0       utr      0 WASH5P     WASH5P chr1
6        7096        136                 0       utr      0 WASH5P     WASH5P chr1
7        7469        137                 0       utr      0 WASH5P     WASH5P chr1
8        7778        147                 0       utr      0 WASH5P     WASH5P chr1
9        8131         99                 0       utr      0 WASH5P     WASH5P chr1
10      14601        154                 0       utr      0 WASH5P     WASH5P chr1
11      19184         50                 0       utr      0 WASH5P     WASH5P chr1
12       4693        140                36    intron      2 WASH5P     WASH5P chr1
13       4902        757                36    intron      1 WASH5P     WASH5P chr1
14       5811        659               144    intron     47 WASH5P     WASH5P chr1
15       6629         92                21    intron      1 WASH5P     WASH5P chr1
16       6919        177                 0    intron      0 WASH5P     WASH5P chr1
17       7232        237                35    intron      2 WASH5P     WASH5P chr1
18       7606        172                 0    intron      0 WASH5P     WASH5P chr1
19       7925        206                 0    intron      0 WASH5P     WASH5P chr1
20       8230       6371               109    intron     67 WASH5P     WASH5P chr1
21      14755       4429                55    intron     12 WASH5P     WASH5P chr1
...

I'm ply-ing over the transcript column and the function transforms
each such subset of the data.frame into a new data.frame that is just
1 row / transcript that basically has the sum of the counts for each
transcript.

The code would look something like this (`summaries` is the data.frame
I'm referring to):

rpkm <- ddply(summaries, .(transcript), function(df) {
  data.frame(symbol=df$symbol[1], counts=sum(df$counts))
})

(It actually calculates 2 more columns that are returned in the
data.frame, but I'm not sure that's really important here).

To test some things out, I've written another function to manually
iterate/create subsets of my data.frame to summarize.

I'm using sqldf to dump the data.frame into a db, then I lapply over
subsets of the db `where transcript=x` to summarize each subset of my
data into a list of single-row data.frames (like ddply is doing), and
finish with a `do.call(rbind, the.dfs)` on this list.

This returns the same exact result ddply would return, and by the time
`do.call` finishes, my RAM usage hits about 8gb.
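
For readers following along, the manual approach described above can be sketched on a toy data.frame. This is an illustration only: it uses split() instead of the sqldf round trip, the tiny `summaries` below is made up, and the rowsum() line is an extra base-R shortcut of mine for the counts column, not something used in the original code.

```r
# Toy stand-in for the real `summaries` data.frame.
summaries <- data.frame(
  transcript = c("T1", "T1", "T2", "T2", "T2"),
  symbol     = c("A", "A", "B", "B", "B"),
  counts     = c(0, 1, 2, 47, 1),
  stringsAsFactors = FALSE
)

# One single-row data.frame per transcript, then bind them together.
pieces <- lapply(split(summaries, summaries$transcript), function(df) {
  data.frame(transcript = df$transcript[1],
             symbol     = df$symbol[1],
             counts     = sum(df$counts),
             stringsAsFactors = FALSE)
})
manual <- do.call(rbind, pieces)

# Group-wise sums without building any intermediate data.frames.
fast <- rowsum(summaries$counts, summaries$transcript)
```

Building hundreds of thousands of one-row data.frames is costly in both approaches; summing the numeric column directly avoids that step when only totals are needed.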

So, what am I doing wrong with ddply that makes the difference in RAM
usage in the last step (collation, the equivalent of my final
`do.call(rbind, my.dfs)`) more than 12GB?

Thanks,
-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


Re: [R] randomness in stepclass (klaR) or lda (MASS) ?

2010-04-29 Thread Uwe Ligges




On 29.04.2010 15:01, Eric Elguero wrote:

Hi,

a colleague ran a stepwise discriminant analysis
twice in a row and got different results, suggesting
some stochasticity in the algorithms involved.
I looked at her data and found that there was a lot
of collinearity, so that I reckoned that maybe stepclass
(klaR) cannot find a clear winner when trying to include a
new variable and makes a random choice. Is that true?


Yes, since a cross validation is involved.
If you want stable results, you could try leave-one-out or set a seed.
Anyway, if your variables are collinear I wonder if the stepwise approach 
is the smartest solution here.





another possibility is that lda (from MASS) computes
CV classification rates from a random subsample instead of
using all the data (?) That might be a sensible choice
with a very large sample.
I advised her to run the function several times and
see if a consensus emerges, but that doesn't seem to
be the case, and besides, I would like to know what
really is going on.


Well, it is called cross validation which is based on random sampling if 
you do not have k=n -fold CV (=leave-one-out).

Again, to get reproducible results, you will need to set a seed.
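
As a minimal sketch of why two runs can differ and what a seed changes: the helper below (make_folds is a hypothetical name, not part of klaR) assigns 89 observations to 10 folds at random, roughly as a 10-fold cross validation would. It is a base R illustration only, not the code stepclass actually runs.

```r
# Hypothetical helper: randomly assign n observations to k CV folds.
make_folds <- function(n, k) sample(rep(seq_len(k), length.out = n))

n <- 89   # number of observations reported in the quoted output
k <- 10   # 10-fold cross validation

set.seed(42)
folds_a <- make_folds(n, k)
set.seed(42)                 # same seed, same partition
folds_b <- make_folds(n, k)
folds_c <- make_folds(n, k)  # no reseeding: a different partition

identical(folds_a, folds_b)  # TRUE
identical(folds_a, folds_c)  # FALSE (with overwhelming probability)
```

Calling set.seed() with the same value right before each stepclass() call should therefore reproduce the same variable selection.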


If the results are that unstable: Do you really have a sufficient number 
of observations for your classification problem?


Uwe Ligges






thanks

Eric Elguero
Laboratory Genetics and Evolution of Infectious Diseases,
Team: Genetics and Adaptation of Plasmodium
UMR 2724 CNRS-IRD,
IRD Montpellier,
911 Avenue Agropolis, BP 64501,
34394 Montpellier Cedex 5,
France



f4.U.spDA <- stepclass(f.mes, f.gp4, lda, improvement=0.01, prior=rep(0.25,4))
  `stepwise classification', using 10-fold cross-validated correctness
rate of method lda'.
89 observations of 31 variables in 4 classes; direction: both
stop criterion: improvement less than 1%.
correctness rate: 0.58333;  in: X2;  variables (1): X2
correctness rate: 0.66389;  in: X9;  variables (2): X2, X9
correctness rate: 0.69583;  in: X27;  variables (3): X2, X9, X27

 hr.elapsed min.elapsed sec.elapsed
       0.00        0.00       20.77


f4.U.spDA <- stepclass(f.mes, f.gp4, lda, improvement=0.01, prior=rep(0.25,4))
  `stepwise classification', using 10-fold cross-validated correctness
rate of method lda'.
89 observations of 31 variables in 4 classes; direction: both
stop criterion: improvement less than 1%.
correctness rate: 0.60556;  in: X2;  variables (1): X2
correctness rate: 0.71806;  in: X6;  variables (2): X2, X6

 hr.elapsed min.elapsed sec.elapsed
       0.00        0.00       15.14


[R] Dendrogram and fusion levels

2010-04-29 Thread Timothée Poisot

Dear users,

I am trying to extract the fusion levels from a dendrogram (in my case, 
phylogenetic trees in the 'phylo' format of APE). So far, I have not been 
successful. I can't believe there is not a library to do it, but I can't find a 
function that would extract the fusion levels.

Do you know any way to extract the fusion levels?

Any help would be appreciated,
Timothée



---
Timothée POISOT
-
Institut des Sciences de l'Evolution
Université Montpellier 2, CC 065
Place Eugène Bataillon
34095 Montpellier CEDEX 5
-
Phone   :   (+33)4 67 14 40 61
Fax :   (+33)4 67 14 40 61
E-mail  :   tpoi...@um2.fr
Web :   http://www.timotheepoisot.fr/
---



[R] lm() with non-linear coefficients constraints? --- nls?

2010-04-29 Thread ivo welch

dear R experts---quick question.  I need to estimate a model that looks like

 y = (b*T+d*T^3) + (1-b-3*d*T^2)*x + (3*d*T)*x^2 + (-d)*x^3

I only have three parameters.  Is nls() the right tool for the job, or is
there something faster/better?
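
For what it is worth, the expanded cubic above can be rewritten as y = x + b*(T - x) + d*(T - x)^3, which is a form nls() accepts directly. The sketch below is only an illustration on simulated data: the "true" values, the noise level and the starting values are assumptions made up for the example, and real data may need better starting values.

```r
set.seed(1)
# Simulated data; b0, d0 and T0 are made-up "true" values.
b0 <- 0.5; d0 <- 0.1; T0 <- 2
x <- runif(200, -3, 3)
y <- x + b0 * (T0 - x) + d0 * (T0 - x)^3 + rnorm(200, sd = 0.1)

# Same model as the expanded cubic, written in terms of (T - x).
# The parameter is named Tpar to avoid masking T (shorthand for TRUE).
fit <- nls(y ~ x + b * (Tpar - x) + d * (Tpar - x)^3,
           start = list(b = 0.4, d = 0.08, Tpar = 1.8))
coef(fit)
```

Writing the model in terms of (T - x) keeps the three parameters explicit, so the coefficient restrictions are imposed by construction rather than tested afterwards.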

/iaw

Ivo Welch (ivo.we...@brown.edu, ivo.we...@gmail.com)
CV Starr Professor of Economics (Finance), Brown University
http://welch.econ.brown.edu/


Re: [R] operator problem within function



On Apr 29, 2010, at 9:03 AM, Bunny, lautloscrew.com wrote:

Sorry for that offlist post, did not mean to do it intentionally.  
just hit the wrong button. Unfortunately this disadvantage is not  
written next to $ in the manual.


Hmmm. Not my manual:

Both [[ and $ select a single element of the list. The main  
difference is that $ does not allow computed indices, whereas [[ does.



It also says that the correct equivalent using extraction operators of  
$ would be:


x$name  ==  x[["name", exact = FALSE]]
--
David.





On Apr 29, 2010, at 2:34 AM, Bunny, lautloscrew.com wrote:


David,

With your help I finally got it. Thanks!
Sorry for handing out some ugly names.
The reason: it's a German dataset with German variable names.  
With those German names you can always be sure you don't use a forbidden
name. I just did not want to hit one of those by accident when  
changing these names for the mailing list. "columna" is just the  
Latin term for column :) . Anyway, here's what worked:


note: I just tried to use some more real names here.

recode_items = function(dataframe,question_number,medium=3){

		# Note: column names of the initial data.frame are like
		# Question1, Question2 etc. Using [,1] would not be very practical
		# since the df contains some other data too. Indexing by names
		# seemed to be the most comfortable way so far.

question <- paste("Question",question_number,sep="")
		# Needed indexing here that understands characters; that's why
		# going with [,question_number] did not work.

dataframe[question][dataframe[question]==3]=0


This would be more typical:

dataframe[dataframe[question]==3, question] <- 0




return(dataframe)

}

recode_items(mydataframe,question_number,3)
# this call uses the dataframe that contains the answers of survey  
participants. Question number is an argument that selects the  
question from the dataframe that should be recoded. In surveys  
some weighting schemes only respect extreme answers, which is why  
the medium answer is recoded to zero. Since it depends on the item  
scale what medium actually is, I need it to be an argument of my  
function.


Did you want a further logical test with that =1 or some sort  
of assignment???


So yes, it´s an assignment.


Moral: Generally better to use [ indexing.


That´s what really made my day (and it´s only 9.30 a.m. here ) .  
Are there exceptions to rule?


Not that I know of.


I just worked a lot with the $ in the past.


$colname is just syntactic sugar for either ["colname"] or  
[ ,"colname"] and it has the disadvantage that colname is not  
evaluated.
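
A small sketch of the difference, using a made-up toy data frame (the values are assumptions for illustration; the column names follow the Question1, Question2 pattern described above). The function is one possible version of recode_items that uses the medium argument instead of a hard-coded 3.

```r
df <- data.frame(Question1 = c(1, 3, 5), Question2 = c(3, 3, 2))
question <- paste("Question", 1, sep = "")

is.null(df$question)  # TRUE: `$` looks for a column literally named "question"
df[[question]]        # 1 3 5: `[[` evaluates the variable first

recode_items <- function(dataframe, question_number, medium = 3) {
  question <- paste("Question", question_number, sep = "")
  rows <- which(dataframe[[question]] == medium)  # which() skips NA answers
  dataframe[rows, question] <- 0
  dataframe
}

recoded <- recode_items(df, 1)
recoded$Question1  # 1 0 5
```

Only the requested column is recoded; Question2 keeps its 3s.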





thx

matt




On 29.04.2010, at 00:56, David Winsemius wrote:



On Apr 28, 2010, at 5:45 PM, David Winsemius wrote:



On Apr 28, 2010, at 5:31 PM, Bunny, lautloscrew.com wrote:


Dear all,

i have a problem with processing dataframes within a function  
using the $.

Here´s my code:


recode_items = function(dataframe,number,medium=2){

# this works
q <- paste("columna",number,sep="")


Do you really want q to equal "columna2" when number equals 2?



	# this does not work, particularly because dataframe is not  
processed

# dataframe should be: givenframe$columnagivennumber
a=dataframe$q[dataframe$q==medium]=1


Did you want a further logical test with that =1 or some sort  
of assignment???




a) Do you want to work on the column from dataframe ( horrible  
name for this purpose IMO) with the name columna2? If so, then  
start with


dataframe[ , q ]

 the q will be evaluated in this form whereas it would not  
when used with $.


b) (A guess in absence of explanation of a goal.) Now do you  
want all of the rows where that vector equals medium? If  
so ,then try this:


dataframe[ dataframe[ , q ]==2 , ]  # untested in the absence of  
data


Ooops. should have been:

dataframe[ dataframe[ , q ]==medium , ] #since both q and medium  
will be evaluated.





Moral: Generally better to use [ indexing.

--
David.





return(a)   

}


If I call this function, i´d like it to return my  dataframe.   
The problem appears to be somewhere around the $. I´m sure this  
not too hard, but somehow i am stuck. I´ll keep searchin the  
manuals.

Thx for any help in advance.

best

matt










David Winsemius, MD
West Hartford,

Re: [R] Split a vector by NA's - is there a better solution then a loop ?

2010-04-29 Thread Barry Rowlingson

On Thu, Apr 29, 2010 at 1:27 PM, Henrique Dallazuanna www...@gmail.com wrote:
 Another option could be:

 split(x, replace(cumsum(is.na(x)), is.na(x), -1))[-1]


One thing none of the solutions so far do (except I haven't tried
Tal's original code) is insert an empty group between adjacent NA
values, for example in:

 x = c(1,2,3,NA,NA,4,5,6)

  split(x, replace(cumsum(is.na(x)), is.na(x), -1))[-1]
$`0`
[1] 1 2 3

$`2`
[1] 4 5 6

Maybe this never happens in Tal's case, or it's not what he wanted
anyway, but I thought I'd point it out!
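
If the empty group between adjacent NAs is wanted, one possible sketch is to make the group index a factor with one level per gap, so split() keeps the empty levels. split_by_na is a hypothetical helper name, not something from the earlier replies.

```r
# Hypothetical helper: split x at every NA, keeping empty groups.
split_by_na <- function(x) {
  is_na <- is.na(x)
  grp <- factor(cumsum(is_na), levels = 0:sum(is_na))
  # Unused factor levels become zero-length list elements.
  unname(split(x[!is_na], grp[!is_na]))
}

x <- c(1, 2, 3, NA, NA, 4, 5, 6)
res <- split_by_na(x)
length(res)  # 3
res[[2]]     # numeric(0), the empty group between the two NAs
```

Note that a leading or trailing NA also yields an empty first or last group with this approach.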

Barry


[R] time zone convert

Hi there,

I've got a column vector in a csv file as follows, and I need to add 11
hours to each of them. Is there a way that I can do it? (The actual file
size is much bigger than this.)

Time
01-DEC-2008 00:00:28.611
01-DEC-2008 00:00:43.155
01-DEC-2008 00:01:06.677
01-DEC-2008 00:01:06.677
01-DEC-2008 00:01:06.677
01-DEC-2008 00:01:06.919
01-DEC-2008 00:23:46.452
02-DEC-2008 00:03:17.646
02-DEC-2008 00:03:17.652
03-DEC-2008 00:15:11.485
03-DEC-2008 00:18:44.652
03-DEC-2008 00:22:17.447

Thank you in advance.

Cheers,

Carol


Re: [R] time zone convert

Try this:

Time2 <- gsub("\\.*", "", tolower(Time))
modifyList(Time2, list(hour = Time2$hour + 11))


On Thu, Apr 29, 2010 at 10:33 AM, Carol Gao carol.g...@gmail.com wrote:

 Hi there,

 I've got a column vector in a csv file as follows, and I need to add 11
 hours to each of them. Is there a way that I can do it? (The actual file
 size is much bigger than this.)

 Time
 01-DEC-2008 00:00:28.611
 01-DEC-2008 00:00:43.155
 01-DEC-2008 00:01:06.677
 01-DEC-2008 00:01:06.677
 01-DEC-2008 00:01:06.677
 01-DEC-2008 00:01:06.919
 01-DEC-2008 00:23:46.452
 02-DEC-2008 00:03:17.646
 02-DEC-2008 00:03:17.652
 03-DEC-2008 00:15:11.485
 03-DEC-2008 00:18:44.652
 03-DEC-2008 00:22:17.447

 Thank you in advance.

 Cheers,

 Carol





-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O


[R] R Anova Analysis

2010-04-29 Thread Yanwei Tan


Dear all,

I have a quite basic question about ANOVA in R. Sorry for 
this, but I have no clue how to explain this result.


I have two datasets, named nmda123 and nmda456. Each dataset has 
three samples, which were each measured three times. I would like to 
compare their means with a post hoc test in R; please see 
the output below:


(CREB, mCREB and No virus are the name of samples)

 nmda123
     Values      ind
1 6.7171265     CREB
2 5.0343117     CREB
3 6.900         CREB
4 0.1195394    mCREB
5 0.1221876    mCREB
6 0.190        mCREB
7 1.000     No Virus
8 1.000     No Virus
9 1.000     No Virus

 nmda456
     Values      ind
1 6.4486940     CREB
2 6.2277490     CREB
3 6.500         CREB
4 0.200        mCREB
5 0.3766052    mCREB
6 0.400        mCREB
7 1.000     No Virus
8 1.000     No Virus
9 1.000     No Virus

 TukeyHSD(aov(Values ~ ind, data = nmda456))
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = Values ~ ind, data = nmda456)

$ind
                     diff        lwr        upr     p adj
mCREB-CREB     -6.0666126 -6.3289033 -5.8043219 0.000
No Virus-CREB  -5.3921477 -5.6544383 -5.1298570 0.000
No Virus-mCREB  0.6744649  0.4121743  0.9367556 0.0005382

 TukeyHSD(aov(Values ~ ind, data = nmda123))
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = Values ~ ind, data = nmda123)

$ind
                    diff        lwr       upr     p adj
mCREB-CREB     -6.073237 -7.5618886 -4.584585 0.392
No Virus-CREB  -5.217146 -6.7057976 -3.728495 0.943
No Virus-mCREB  0.856091 -0.6325606  2.344743 0.2588450

So my question is about the No Virus-mCREB comparison. Even looking at 
the data by eye, there is a big difference between No Virus and mCREB in 
nmda123, yet the p-value is 0.2588450, whereas in nmda456 the 
p-value is 0.0005382. I can see that the difference is bigger in 
nmda123 than in nmda456, so I do not know why. Sorry for my inexperience in 
statistics.
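
One thing that may be worth checking is the within-group spread: TukeyHSD() uses a single pooled residual variance from all three groups, so noisy replicates in one group widen the interval for every comparison. The sketch below re-enters the values as printed above (6.900 is taken as 6.9, and so on) and compares the pooled residual standard deviations; resid_sd is a hypothetical helper name.

```r
lev <- c("CREB", "mCREB", "No Virus")
ind <- factor(rep(lev, each = 3), levels = lev)
nmda123 <- data.frame(
  Values = c(6.7171265, 5.0343117, 6.9, 0.1195394, 0.1221876, 0.19, 1, 1, 1),
  ind = ind)
nmda456 <- data.frame(
  Values = c(6.4486940, 6.2277490, 6.5, 0.2, 0.3766052, 0.4, 1, 1, 1),
  ind = ind)

fit123 <- aov(Values ~ ind, data = nmda123)
fit456 <- aov(Values ~ ind, data = nmda456)

# Pooled residual standard deviation shared by all pairwise comparisons
resid_sd <- function(fit) sqrt(deviance(fit) / df.residual(fit))
resid_sd(fit123)  # roughly 0.59
resid_sd(fit456)  # roughly 0.10

# Spread of the replicates within each group
tapply(nmda123$Values, nmda123$ind, sd)
tapply(nmda456$Values, nmda456$ind, sd)
```

Under this reading, the CREB replicates in nmda123 (5.03 versus 6.9) inflate the pooled error, which would explain the wide interval for No Virus-mCREB even though that difference itself is larger than in nmda456.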


Thanks for your reply and time!

Cheers,
Wei


Re: [R] Using plyr::dply more (memory) efficiently?

2010-04-29 Thread Matthew Dowle

I don't know about that,  but try this :

install.packages("data.table", repos="http://R-Forge.R-project.org")
require(data.table)
summaries = data.table(summaries)
summaries[,sum(counts),by=symbol]

Please let us know if that returns the correct result,  and if its 
memory/speed is ok ?

Matthew
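
For comparison, a base R sketch of the same per-transcript sum that avoids building one small data.frame per group. It uses rowsum() on a toy stand-in for `summaries`; the second transcript name (TX2) and all counts are made up for illustration, and whether this is light enough on memory for the real 400,000-row data is untested.

```r
# Toy stand-in for `summaries`; TX2 and the counts are hypothetical.
summaries <- data.frame(
  transcript = c("WASH5P", "WASH5P", "WASH5P", "TX2", "TX2"),
  symbol     = c("WASH5P", "WASH5P", "WASH5P", "TX2", "TX2"),
  counts     = c(0, 1, 1, 5, 7),
  stringsAsFactors = FALSE
)

# One pass over the counts column, grouped by transcript.
totals <- rowsum(summaries$counts, summaries$transcript)

rpkm <- data.frame(
  transcript = rownames(totals),
  # first symbol seen for each transcript, as in df$symbol[1]
  symbol = summaries$symbol[match(rownames(totals), summaries$transcript)],
  counts = totals[, 1],
  row.names = NULL
)
rpkm
```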

Steve Lianoglou mailinglist.honey...@gmail.com wrote in message 
news:w2kbbdc7ed01004290606lc425e47cs95b36f6bf0a...@mail.gmail.com...
 Hi all,

 In short:

 I'm running ddply on an admittedly (somehow) large data.frame (not
 that large). It runs fine until it finishes and gets to the
 collating part where all subsets of my data.frame have been
 summarized and they are being reassembled into the final summary
 data.frame (sorry, don't know the correct plyr terminology). During
 collation, my R workspace RAM usage goes from about 1.5 GB upto 20GB
 until I kill it.

 Running a similar piece of code that iterates manually w/o ddply by
 using a combo of lapply and a do.call(rbind, ...) uses considerably
 less ram (tops out at about 8GB).

 How can I use ddply more efficiently?

 Longer:

 Here's more info:

 * The data.frame itself ~ 15.8 MB when loaded.
 * ~ 400,000 rows, 8 columns

 It looks like so:

    exon.start exon.width exon.width.unique exon.anno counts symbol transcript  chr
  1       4225        468                 0       utr      0 WASH5P     WASH5P chr1
  2       4833         69                 0       utr      1 WASH5P     WASH5P chr1
  3       5659        152                38       utr      1 WASH5P     WASH5P chr1
  4       6470        159                 0       utr      0 WASH5P     WASH5P chr1
  5       6721        198                 0       utr      0 WASH5P     WASH5P chr1
  6       7096        136                 0       utr      0 WASH5P     WASH5P chr1
  7       7469        137                 0       utr      0 WASH5P     WASH5P chr1
  8       7778        147                 0       utr      0 WASH5P     WASH5P chr1
  9       8131         99                 0       utr      0 WASH5P     WASH5P chr1
 10      14601        154                 0       utr      0 WASH5P     WASH5P chr1
 11      19184         50                 0       utr      0 WASH5P     WASH5P chr1
 12       4693        140                36    intron      2 WASH5P     WASH5P chr1
 13       4902        757                36    intron      1 WASH5P     WASH5P chr1
 14       5811        659               144    intron     47 WASH5P     WASH5P chr1
 15       6629         92                21    intron      1 WASH5P     WASH5P chr1
 16       6919        177                 0    intron      0 WASH5P     WASH5P chr1
 17       7232        237                35    intron      2 WASH5P     WASH5P chr1
 18       7606        172                 0    intron      0 WASH5P     WASH5P chr1
 19       7925        206                 0    intron      0 WASH5P     WASH5P chr1
 20       8230       6371               109    intron     67 WASH5P     WASH5P chr1
 21      14755       4429                55    intron     12 WASH5P     WASH5P chr1
 ...

 I'm ply-ing over the transcript column and the function transforms
 each such subset of the data.frame into a new data.frame that is just
 1 row / transcript that basically has the sum of the counts for each
 transcript.

 The code would look something like this (`summaries` is the data.frame
 I'm referring to):

 rpkm <- ddply(summaries, .(transcript), function(df) {
   data.frame(symbol=df$symbol[1], counts=sum(df$counts))
 })

 (It actually calculates 2 more columns that are returned in the
 data.frame, but I'm not sure that's really important here).

 To test some things out, I've written another function to manually
 iterate/create subsets of my data.frame to summarize.

 I'm using sqldf to dump the data.frame into a db, then I lapply over
 subsets of the db `where transcript=x` to summarize each subset of my
 data into a list of single-row data.frames (like ddply is doing), and
 finish with a `do.call(rbind, the.dfs)` on this list.

 This returns the same exact result ddply would return, and by the time
 `do.call` finishes, my RAM usage hits about 8gb.

 So, what am I doing wrong with ddply that makes the difference in RAM
 usage in the last step (collation, the equivalent of my final
 `do.call(rbind, my.dfs)`) more than 12GB?

 Thanks,
 -steve

 -- 
 Steve Lianoglou
 Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
 Contact Info: http://cbio.mskcc.org/~lianos/contact



[R] Simple loop code

2010-04-29 Thread RCulloch


Hi fellow R Users,

I find that I typically rewrite my data specific to data in columns, which
is by no means efficient and I am struggling to break out of this bad habit
and utilise some of the excellent things R can do! I have tried to look at
'for' but I don't really follow it, and I wondered if anyone could help with
a simple example using my script so I could follow this and build on it, so
for example, wanting to change an ID code from alphanumeric to numeric. The
example below works, but takes ages, given I have a lot of IDs, to do
manually! 

Any thoughts on how to create a loop to go through each ID and give them a
unique number would be most welcome!

Cheers,

Ross


levels(dat.ID$ID2)[levels(dat.ID$ID2)=='A1'] <- 1
levels(dat.ID$ID2)[levels(dat.ID$ID2)=='A2'] <- 2
levels(dat.ID$ID2)[levels(dat.ID$ID2)=='D1'] <- 3
levels(dat.ID$ID2)[levels(dat.ID$ID2)=='D2'] <- 4
levels(dat.ID$ID2)[levels(dat.ID$ID2)=='D4'] <- 5
levels(dat.ID$ID2)[levels(dat.ID$ID2)=='D5'] <- 6
levels(dat.ID$ID2)[levels(dat.ID$ID2)=='D6'] <- 7

[R] Changing from 32-bit builds to 64-bit builds

2010-04-29 Thread Sachi Ito

Hi,

Probably this is a very simple question for all the programmers, but how do
you change from 32-bit builds (default) to 64-bit builds?

I've been trying to run Anova to compare two models, but I get the following
error message:

Error: cannot allocate vector of size 1.2 Gb
R(3122,0xa0ab44e0) malloc: *** mmap(size=1337688064) failed (error code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
R(3122,0xa0ab44e0) malloc: *** mmap(size=1337688064) failed (error code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug


I suppose it's a problem with memory allocation because of the big data
size, so I thought I should use 64-bit builds instead of 32.

As it was recommended in a manual, I've entered the following:

 CC='gcc -arch x86_64'
 CXX='g++ -arch x86_64'
 F77='gfortran -arch x86_64'
 FC='gfortran -arch x86_64'
 OBJC='gcc -arch x86_64'

But it still gave me the error.  I'd greatly appreciate it if someone could answer
this question!

Thank you,
Sachi


Re: [R] Simple loop code

Try this:

factor(dat.ID$ID2, labels = 1:7)
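
A sketch of the same idea that does not depend on knowing there are exactly 7 IDs, followed by an explicit for loop since that was the original question. The toy dat.ID below is an assumption that uses the ID codes from the question; ID.num and num_loop are hypothetical names.

```r
# Toy data using the ID codes from the question.
dat.ID <- data.frame(
  ID2 = factor(c("A1", "D2", "A2", "D1", "A1", "D6", "D4", "D5"))
)

# Vectorised: the integer codes of a factor are already 1..n,
# numbered in the order of its (sorted) levels.
dat.ID$ID.num <- as.integer(dat.ID$ID2)

# The same result with an explicit loop over the levels.
ids <- levels(dat.ID$ID2)
num_loop <- integer(nrow(dat.ID))
for (i in seq_along(ids)) {
  num_loop[dat.ID$ID2 == ids[i]] <- i
}

dat.ID$ID.num  # 1 4 2 3 1 7 5 6
```

Both versions scale to any number of IDs, which avoids writing one line per level.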

On Thu, Apr 29, 2010 at 8:39 AM, RCulloch ross.cull...@dur.ac.uk wrote:


 Hi fellow R Users,

 I find that I typically rewrite my data specific to data in columns, which
 is by no means efficient and I am struggling to break out of this bad habit
 and utilise some of the excellent things R can do! I have tried to look at
 'for' but I don't really follow it, and I wondered if anyone could help
 with
 a simple example using my script so I could follow this and build on it, so
 for example, wanting to change an ID code from alphanumeric to numeric. The
 example below works, but takes ages, given I have a lot of IDs, to do
 manually!

 Any thoughts on how to create a loop to go through each ID and give them a
 unique number would be most welcome!

 Cheers,

 Ross


 levels(dat.ID$ID2)[levels(dat.ID$ID2)=='A1'] <- 1
 levels(dat.ID$ID2)[levels(dat.ID$ID2)=='A2'] <- 2
 levels(dat.ID$ID2)[levels(dat.ID$ID2)=='D1'] <- 3
 levels(dat.ID$ID2)[levels(dat.ID$ID2)=='D2'] <- 4
 levels(dat.ID$ID2)[levels(dat.ID$ID2)=='D4'] <- 5
 levels(dat.ID$ID2)[levels(dat.ID$ID2)=='D5'] <- 6
 levels(dat.ID$ID2)[levels(dat.ID$ID2)=='D6'] <- 7




-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O


Re: [R] using get and paste in a loop to return objects for object names listed a strings

2010-04-29 Thread Nevil Amos

Thanks for that, the package looks very useful.  It gave me the answer 
in a roundabout way: it reminded me that I needed to use attach() so that 
get() was only dealing with the objects in the data.frame, rather than 
using data.frame$factorname.


I therefore managed to sort out a workaround, but will be looking at 
ggplot2 for other things.


the work around and the head of the data file is shown below

 head(data.all)
   Line Capture_ID Landscape_Name Band_text Bird_IDDate 
CODE_LETTERS Site_Name Age_Class SEX_ Capture_Number Mass Season 
SEASON_CLASS   EVC Moult Sine_Julian Cosine_Julian  HCT   Hb 
Site_Cond Logs_Length
10   10 10  Axe Creek   42605012275  7/11/2007 
0:00  YTH   Ax1 AF  1 21.5 
Spring   SS Heathy Dry Forest N   -0.80  0.59 
0.48   NA43   5
13   13 13  Axe Creek   37136021170  8/11/2007 
0:00  YTH   Ax1 AF  1 21.5 
Spring   SS Heathy Dry Forest N   -0.79  0.61 
0.53 20.443   5
19   19 21  Axe Creek   37136031171  9/11/2007 
0:00  YTH   Ax1 AF  1 19.5 
Spring   SS Heathy Dry Forest N   -0.78  0.62 
0.53   NA43   5
30   30 34  Axe Creek   37136041172 10/11/2007 
0:00  YTH   Ax1 UM  1 24.5 
Spring   SS Heathy Dry Forest Y   -0.76  0.63   
NA   NA43   5
31   31 35  Axe Creek   37136051173 10/11/2007 
0:00  YTH   Ax1 UU  1   NA 
Spring   SS Heathy Dry Forest U   -0.76  0.63   
NA   NA43   5
32   32 36  Axe Creek   37136061174 10/11/2007 
0:00  YTH   Ax1 UM  1 23.5 
Spring   SS Heathy Dry Forest U   -0.76  0.63 
0.50   NA43   5
   Litter_Cov Understorey TreeCov H.L WBC  BCI   CCIPca1 YEAR 
Hab_Config

10   22.5  650.35  NA  NA   NANA 2007  D
13   22.5  650.35  NA  NA -3.11592 0.6215803 2007  D
19   22.5  650.35  NA  NA   NANA 2007  D
30   22.5  650.35  NA  NA   NANA 2007  D
31   22.5  650.35  NA  NA   NANA 2007  D
32   22.5  650.35  NA  NA   NANA 2007  D


sp.codes=levels(data.all$CODE_LETTERS)

for(spp in sp.codes) {


data.sp=subset(data.all,CODE_LETTERS==spp)

responses = colnames(data.all)[c(20,28,29,19)]
 #if (spp==BT) responses = colnames(data.all)[c(19]#,20,26:29)]
groups=colnames   (data.all)[c(9,10)]# ,13,16,30

attach(data.sp)
for (response in responses){
for (group in groups){
g=get(group)
r=get(response)
boxplot(r ~g, main=spp,xlab=group,ylab=response)

}
}
detach(data.sp)
}
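
For reference, a sketch of the same loop without attach() and get(): reformulate() builds the formula from the column names held as strings, and boxplot() looks the columns up through its data argument. The toy data.all below is made up for illustration (only the column names come from the listing above), and the plots are written to a temporary PDF.

```r
# Toy data; column names follow the listing above, values are made up.
data.all <- data.frame(
  CODE_LETTERS = rep(c("YTH", "BT"), each = 6),
  Age_Class    = rep(c("A", "U"), 6),
  SEX_         = rep(c("F", "M", "U"), 4),
  Mass = c(21.5, 21.5, 19.5, 24.5, 22.0, 23.5, 30.1, 28.4, 29.9, 31.2, 27.5, 30.0),
  HCT  = c(0.48, 0.53, 0.53, 0.50, 0.49, 0.51, 0.44, 0.46, 0.45, 0.47, 0.43, 0.46),
  stringsAsFactors = TRUE
)
responses <- c("Mass", "HCT")
groups    <- c("Age_Class", "SEX_")

out <- tempfile(fileext = ".pdf")  # all boxplots go into one PDF file
pdf(out)
n_plots <- 0
for (spp in levels(data.all$CODE_LETTERS)) {
  data.sp <- data.all[data.all$CODE_LETTERS == spp, ]
  for (response in responses) {
    for (group in groups) {
      # reformulate("SEX_", "Mass") gives the formula Mass ~ SEX_
      boxplot(reformulate(group, response), data = data.sp,
              main = spp, xlab = group, ylab = response)
      n_plots <- n_plots + 1
    }
  }
}
dev.off()
n_plots  # 8: 2 species x 2 responses x 2 groups
```

Passing the subset through data keeps the search path clean, so a forgotten detach() cannot leave stale columns visible.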




On 29/04/2010 7:05 PM, Paul Hiemstra wrote:

Nevil Amos wrote:
I am trying to create a heap of boxplots, by looping through a series 
of factors and variables in a large data.frame, using paste to 
construct the factor and response names from the colnames.

I thought I could do this using get();
however, it is not working. What am I doing wrong?
You don't give a reproducible example, this makes it hard to answer 
your question.


But not really in response to your question, take a look at histogram 
from the lattice package or geom_boxplot from the ggplot2 package. 
These functions can do all the work for you of drawing boxplots for a 
series of factors and variables in a large data.frame. This saves you 
a lot of time.


cheers,
Paul








Re: [R] randomness in stepclass (klaR) or lda (MASS) ?

2010-04-29 Thread Eric Elguero

On Thu, 2010-04-29 at 15:08 +0200, Uwe Ligges wrote:

 Well, it is called cross validation which is based on random sampling if 
 you do not have k=n -fold CV (=leave-one-out).
 Again, to get reproducible results, you will need to set a seed.
 

thank you. I thought that leave-one-out was the default.

I looked at the reference file and I am not sure how to get it.

Is that by setting fold=1 ?

 
 If the results are that unstable: Do you really have a sufficient number 
 of observations for your classification problem?

you're probably right.

e.e.


Eric Elguero
Laboratory Genetics and Evolution of Infectious Diseases,
Team: Genetics and Adaptation of Plasmodium
UMR 2724 CNRS-IRD,
IRD Montpellier,
911 Avenue Agropolis, BP 64501,
34394 Montpellier Cedex 5,
France


Re: [R] time zone convert

Appreciate it! I was trying on the code you sent, then some error codes
turned up:

The first line runs ok, the second line:

 modifyList(Time2, list(hour = Time2$hour + 11))
Error in Time2$hour : $ operator is invalid for atomic vectors

The time format I used for reading the Time vector is %d-%b-%Y %H:%M:%OS.
Should I change any code above?

Carol



On Thu, Apr 29, 2010 at 11:47 PM, Henrique Dallazuanna www...@gmail.comwrote:

 Try this:

 Time2 <- gsub("\\.*", "", tolower(Time))
 modifyList(Time2, list(hour = Time2$hour + 11))


 On Thu, Apr 29, 2010 at 10:33 AM, Carol Gao carol.g...@gmail.com wrote:

 Hi there,

 I've got a column vector in a csv file as follows, and I need to add 11
 hours to each of them. Is there a way that I can do it? (The actual file
 size is much bigger than this.)

 Time
 01-DEC-2008 00:00:28.611
 01-DEC-2008 00:00:43.155
 01-DEC-2008 00:01:06.677
 01-DEC-2008 00:01:06.677
 01-DEC-2008 00:01:06.677
 01-DEC-2008 00:01:06.919
 01-DEC-2008 00:23:46.452
 02-DEC-2008 00:03:17.646
 02-DEC-2008 00:03:17.652
 03-DEC-2008 00:15:11.485
 03-DEC-2008 00:18:44.652
 03-DEC-2008 00:22:17.447

 Thank you in advance.

 Cheers,

 Carol





 --
 Henrique Dallazuanna
 Curitiba-Paraná-Brasil
 25° 25' 40 S 49° 16' 22 O



Re: [R] time zone convert

Oops,

I sent you the wrong code; try this instead:

Time2 <- strptime(Time, '%d-%b-%Y %H:%M:%S')
modifyList(Time2, list(hour = Time2$hour + 11))
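
An alternative sketch (not the code from this reply) that keeps the fractional seconds: parse the strings to POSIXct and add 11 hours expressed in seconds. The "C" time locale and tz = "UTC" are assumptions made here so that the example is self-contained; substitute the zone the data were recorded in.

```r
# Assumption: English month abbreviations; the "C" locale provides them.
invisible(Sys.setlocale("LC_TIME", "C"))

Time <- c("01-DEC-2008 00:00:28.611",
          "01-DEC-2008 00:23:46.452",
          "03-DEC-2008 00:22:17.447")

# %OS reads seconds including the fractional part;
# month names are matched case-insensitively on input.
t0 <- as.POSIXct(Time, format = "%d-%b-%Y %H:%M:%OS", tz = "UTC")
t1 <- t0 + 11 * 3600   # POSIXct arithmetic is in seconds

format(t1, "%d-%b-%Y %H:%M:%OS3")
```

Because POSIXct is a plain number of seconds, the addition rolls over to the next day automatically when the shifted time passes midnight.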

On Thu, Apr 29, 2010 at 11:14 AM, Carol Gao carol.g...@gmail.com wrote:

 Appreciate it! I was trying on the code you sent, then some error codes
 turned up:

 The first line runs ok, the second line:


  modifyList(Time2, list(hour = Time2$hour + 11))
 Error in Time2$hour : $ operator is invalid for atomic vectors

 The time format I used for reading the Time vector is %d-%b-%Y %H:%M:%OS.
 Should I change any code above?

 Carol




 On Thu, Apr 29, 2010 at 11:47 PM, Henrique Dallazuanna 
 www...@gmail.comwrote:

 Try this:

 Time2 <- gsub("\\.*", "", tolower(Time))
 modifyList(Time2, list(hour = Time2$hour + 11))


 On Thu, Apr 29, 2010 at 10:33 AM, Carol Gao carol.g...@gmail.com wrote:

 Hi there,

 I've got a column vector in a csv file as follows, and I need to add 11
 hours to each of them. Is there a way that I can do it? (The actual file
 size is much bigger than this.)

 Time
 01-DEC-2008 00:00:28.611
 01-DEC-2008 00:00:43.155
 01-DEC-2008 00:01:06.677
 01-DEC-2008 00:01:06.677
 01-DEC-2008 00:01:06.677
 01-DEC-2008 00:01:06.919
 01-DEC-2008 00:23:46.452
 02-DEC-2008 00:03:17.646
 02-DEC-2008 00:03:17.652
 03-DEC-2008 00:15:11.485
 03-DEC-2008 00:18:44.652
 03-DEC-2008 00:22:17.447

 Thank you in advance.

 Cheers,

 Carol





 --
 Henrique Dallazuanna
 Curitiba-Paraná-Brasil
 25° 25' 40 S 49° 16' 22 O





-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O


Re: [R] randomness in stepclass (klaR) or lda (MASS) ?

2010-04-29 Thread Uwe Ligges




On 29.04.2010 16:04, Eric Elguero wrote:

On Thu, 2010-04-29 at 15:08 +0200, Uwe Ligges wrote:


Well, it is called cross-validation, which is based on random sampling
unless you use k = n folds (= leave-one-out).
Again, to get reproducible results, you will need to set a seed.



thank you. I thought that leave-one-out was the default.



As you can see in ?stepclass:

fold: parameter for cross-validation; omitted if 'cv.groups' is specified.

and the Usage line tells us:

..., fold = 10, ...

hence 10-fold is the default.




I looked at the reference file and I am not sure how to get it.

Is that by setting fold=1 ?



No, leave-one-out is n-fold, hence you need fold = n (the number of observations)!

Uwe Ligges
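
Why a seed matters for 10-fold but not for leave-one-out can be sketched in base R. The helper make_folds below is hypothetical (it is not part of klaR and only mimics how a k-fold split might be drawn), and n = 20 is an arbitrary number of observations.

```r
# Hypothetical helper (not from klaR): randomly assign n observations to k folds.
make_folds <- function(n, k) {
  split(sample(n), rep(seq_len(k), length.out = n))
}

n <- 20

# 10-fold CV: the split depends on the random number stream,
# so fixing the seed is what makes two runs agree.
set.seed(1)
folds_a <- make_folds(n, 10)
set.seed(1)
folds_b <- make_folds(n, 10)
same_with_seed <- identical(folds_a, folds_b)

# Leave-one-out is k = n: every fold holds exactly one observation,
# so the collection of held-out sets no longer depends on the seed.
loo <- make_folds(n, n)
loo_sizes <- lengths(loo)
loo_members <- sort(unlist(loo, use.names = FALSE))
```

With stepclass itself this would presumably translate to calling set.seed() before the call, or to passing fold = nrow(yourdata) for leave-one-out.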





If the results are that unstable: Do you really have a sufficient number
of observations for your classification problem?


you're probably right.

e.e.


Eric Elguero
Laboratory Genetics and Evolution of Infectious Diseases,
Team: Genetics and Adaptation of Plasmodium
UMR 2724 CNRS-IRD,
IRD Montpellier,
911 Avenue Agropolis, BP 64501,
34394 Montpellier Cedex 5,
France





Re: [R] Compact Patricia Trees (Tries)

2010-04-29 Thread Richard Liu


Gabor,

Thanks for the suggestion, I'll try it out tonight or tomorrow.

Regards,
Richard

_
Richard R. Liu
Dittingerstr. 33
CH-4053 Basel
Switzerland
Tel. +41 79 708 67 66

Sent from my iPhone 3GS

On Apr 29, 2010, at 13:06, Gabor Grothendieck  
ggrothendi...@gmail.com wrote:



Using charmatch, partial matches of 10,000 5-letter words to the same
list can be done in 10 seconds on my machine, and 10,000 5-letter words
to 100,000 10-letter words in 1 minute.  Is that good enough?  Try
this simulation:

# generate N random words each k long
rwords <- function(N, k) {
  L <- sample(letters, N*k, replace = TRUE)
  apply(matrix(L, k), 2, paste, collapse = "")
}
w1 <- rwords(1e5, 10)
w2 <- rwords(1e4, 5)

system.time(charmatch(w2, w2))

system.time(charmatch(w2, w1))
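
For the prefix question itself (which strings in a list occur at the beginning of a given word), a minimal base-R sketch follows. The prefixes vector and word are made-up illustration data, and startsWith() assumes R 3.3.0 or later, which is newer than this thread. The second variant is the substr-and-match idea mentioned in the original post.

```r
prefixes <- c("stat", "sta", "tis", "statistics", "z")   # made-up list
word <- "statistics"                                      # made-up word

# Variant 1: test every candidate string against the start of the word.
hits <- prefixes[startsWith(word, prefixes)]

# Variant 2: cut the word to each distinct prefix length and look the
# pieces up in the list, so the work grows with the number of distinct
# lengths rather than with the length of the list.
lens <- sort(unique(nchar(prefixes)))
cands <- substring(word, 1, lens[lens <= nchar(word)])
hits2 <- cands[cands %in% prefixes]
```

Both variants find the same strings here; which one is faster on a long list is something to measure with system.time() rather than assume.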


On Thu, Apr 29, 2010 at 4:05 AM, Richard R. Liu richard@pueo-owl.ch wrote:
I have an application that searches a long list of character strings to
determine which occur at the beginning of a given word.  A straightforward
R script using grep takes a long time to run.  Rewriting it to use substr
and match might be an option, but I have the impression that preparing the
list as a trie and performing trie searches might lead to dramatic
improvements in performance.

I have searched the CRAN packages and find no packages that support Compact
Patricia Trees.  Does anybody know of such?


Thanks,
Richard

Richard R. Liu
richard@pueo-owl.ch


Re: [R] R Anova Analysis

2010-04-29 Thread Dennis Murphy

Hi:

It strikes me as a little curious that the No Virus values in each of your
example data sets are all *exactly* 1. Why is that?

Dennis

On Thu, Apr 29, 2010 at 4:52 AM, Yanwei Tan t...@nbio.uni-heidelberg.de wrote:

 Dear all,

 I have a quite basic questions about anova analysis in R, sorry for this,
 but I have no clue how to explain this result.

 I have two datasets which are named: nmda123, nmda456. Each dataset has
 three samples which were measured three times. And I would like to compare
 means of them with Posthoc test using R, following please see the output:

 (CREB, mCREB and No virus are the name of samples)

 > nmda123
      Values      ind
 1 6.7171265     CREB
 2 5.0343117     CREB
 3 6.900         CREB
 4 0.1195394    mCREB
 5 0.1221876    mCREB
 6 0.190        mCREB
 7 1.000     No Virus
 8 1.000     No Virus
 9 1.000     No Virus

 > nmda456
      Values      ind
 1 6.4486940     CREB
 2 6.2277490     CREB
 3 6.500         CREB
 4 0.200        mCREB
 5 0.3766052    mCREB
 6 0.400        mCREB
 7 1.000     No Virus
 8 1.000     No Virus
 9 1.000     No Virus

 > TukeyHSD(aov(Values ~ ind, data = nmda456))
   Tukey multiple comparisons of means
     95% family-wise confidence level

 Fit: aov(formula = Values ~ ind, data = nmda456)

 $ind
                      diff        lwr        upr     p adj
 mCREB-CREB     -6.0666126 -6.3289033 -5.8043219 0.000
 No Virus-CREB  -5.3921477 -5.6544383 -5.1298570 0.000
 No Virus-mCREB  0.6744649  0.4121743  0.9367556 0.0005382

 > TukeyHSD(aov(Values ~ ind, data = nmda123))
   Tukey multiple comparisons of means
     95% family-wise confidence level

 Fit: aov(formula = Values ~ ind, data = nmda123)

 $ind
                     diff        lwr       upr     p adj
 mCREB-CREB     -6.073237 -7.5618886 -4.584585 0.392
 No Virus-CREB  -5.217146 -6.7057976 -3.728495 0.943
 No Virus-mCREB  0.856091 -0.6325606  2.344743 0.2588450

 So my question is about the No Virus-mCREB comparison. Even judging by eye,
 there is a big difference between No Virus and mCREB in nmda123, so why is
 the p-value 0.2588450 there, while in nmda456 it is 0.0005382? The
 difference is even bigger in nmda123 than in nmda456, and I do not know
 why. Sorry for my inexperience in statistics.
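
One way to look into this is to compare the pooled residual variance of the two fits, since Tukey's intervals are built from it. The sketch below re-enters the values printed in the post (assuming the printed digits are the full data) and uses only base R; resid_ms is a helper name introduced here for illustration.

```r
grp <- factor(rep(c("CREB", "mCREB", "No Virus"), each = 3))
nmda123 <- data.frame(
  Values = c(6.7171265, 5.0343117, 6.9, 0.1195394, 0.1221876, 0.19, 1, 1, 1),
  ind = grp
)
nmda456 <- data.frame(
  Values = c(6.4486940, 6.2277490, 6.5, 0.2, 0.3766052, 0.4, 1, 1, 1),
  ind = grp
)

# Residual mean square: the pooled within-group variance behind the intervals.
resid_ms <- function(d) {
  fit <- aov(Values ~ ind, data = d)
  deviance(fit) / df.residual(fit)
}
ms123 <- resid_ms(nmda123)
ms456 <- resid_ms(nmda456)

# Within-group standard deviations show where the extra noise comes from.
sd123 <- tapply(nmda123$Values, nmda123$ind, sd)
sd456 <- tapply(nmda456$Values, nmda456$ind, sd)

tk123 <- TukeyHSD(aov(Values ~ ind, data = nmda123))$ind
tk456 <- TukeyHSD(aov(Values ~ ind, data = nmda456))$ind
```

In this sketch the CREB replicates in nmda123 range from about 5.0 to 6.9, which inflates the pooled residual variance, so the No Virus-mCREB difference is judged against much wider intervals than in nmda456. A group whose values are all exactly 1 also has zero variance, so the equal-variance assumption behind the pooling looks questionable for these data.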

 Thanks for your reply and time!

 Cheers,
 Wei


[R] merge on criteria

2010-04-29 Thread Alex Jameson

Hi,

I have two files (file1.txt and file2.txt) which I would like to merge
based on certain criteria, i.e. combining rows with matching GeneID and
Exons. I have used merge(), but it does not give me the desired outcome.
merged.txt shows the result I would like.




*File1.txt*

  AffyProbe          ProbeType Flag GeneSymbol GeneID Exons Chrom Strand AffyStart AffyEnd
1 1007_s_at:1105:483 0         0    DDR1       780    21    6     +      30975403  30975427
2 1007_s_at:1119:177 0         0    DDR1       780    21    6     +      30975549  30975573
3 1007_s_at:1136:469 0         0    DDR1       780    21    6     +      30975766  30975790
4 1007_s_at:192:205  0         0    DDR1       780    21    6     +      30975523  30975547
5 1007_s_at:474:1161 0         0    DDR1       780    21    6     +      30975745  30975769
6 1007_s_at:504:983  0         0    DDR1       780    21    6     +      30975575  30975599
7 1007_s_at:50:779   0         0    DDR1       780    21    6     +      30975758  30975782

*File2.txt*

   AgilentProbe ProbeType Flag GeneSymbol GeneID Exons Chrom Strand AgilentStart AgilentEnd
1  A_23_P11     0         0    FAM174B    400451 5     15    -      90961852     90961793
2  A_23_P100022 0         0    SV2B       9899   14    15    +      89639333     89639392
3  A_23_P100056 0         0    RBPMS2     348093 8     15    -      62819428     62819369
4  A_23_P100074 0         0    AVEN       57099  6     15    -      31946031     31945972
5  A_23_P100092 0         0    ZSCAN29    146050 5     15    -      41440680     41440621
6  A_23_P100103 0         0    VPS39      23339  24    15    -      40240319     40240260
7  A_23_P100111 0         0    CHP        11261  7     15    +      39358845     39358904
8  A_23_P100127 0         0    CASC5      57082  11    15    +      38704817     38704876
9  A_23_P100133 0         0    ATMIN      23300  4     16    +      79636596     79636655
10 A_23_P100141 0         0    UNKL       64718  12    16    -      1355346      1355287


*merged.txt (Should look like this)*

   GeneSymbol GeneID Exons Chrome AffyMatrixProbeID AffyStart AffyEnd
AgilentProbeID AgilentStart AgilentEnd DDR1 780 21 6   A_24_P123601
30975848 30975907 RFC2 5982 10 7 1053_at:120:925,
1053_at:504:41,
1053_at:522:871,
1053_at:828:1025,
203696_s_at:291:651 73287845,
73287869,
73287863,
73287881,
73287850 73287821,
73287845,
73287839,
73287857,
73287826 A_23_P93823 73287861 73287802 RFC2 5982 11 7 HSPA6 3310
1 1   A_23_P114903 159762782 159762841 PAX8 7849 12 2   A_23_P210001
113691555 113691496 GUCA1A 2978 6 6 UBA7 7318 24 3
1294_at:1079:379,
1294_at:361:881,
203281_s_at:524:889,
203281_s_at:678:1017,
203281_s_at:68:1153 49818386,
49818398,
49818378,
49818434,
49818422 49818362,
49818374,
49818354,
49818420,
49818398


sorry for the long tables,

thanks

Alex

Student
University of Colorado
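
A possible approach, sketched under assumptions: collapse the probes that share a gene and exon into one comma-separated cell, then merge on the shared key columns with all = TRUE so that genes present in only one file are kept. The two small data frames below are cut-down stand-ins built from values in the post (only a few columns), and collapse is a helper name introduced here.

```r
affy <- data.frame(
  AffyProbe  = c("1053_at:120:925", "1053_at:504:41", "1007_s_at:1105:483"),
  GeneSymbol = c("RFC2", "RFC2", "DDR1"),
  GeneID     = c(5982, 5982, 780),
  Exons      = c(10, 10, 21),
  AffyStart  = c(73287845, 73287869, 30975403),
  stringsAsFactors = FALSE
)
agilent <- data.frame(
  AgilentProbe = c("A_23_P93823", "A_24_P123601", "A_23_P114903"),
  GeneSymbol   = c("RFC2", "DDR1", "HSPA6"),
  GeneID       = c(5982, 780, 3310),
  Exons        = c(10, 21, 1),
  AgilentStart = c(73287861, 30975848, 159762782),
  stringsAsFactors = FALSE
)

key <- c("GeneSymbol", "GeneID", "Exons")

# One row per gene/exon, with multiple probes pasted into a single cell.
collapse <- function(x) paste(x, collapse = ",")
affy1 <- aggregate(affy[c("AffyProbe", "AffyStart")],
                   by = affy[key], FUN = collapse)

# Full outer join on the key columns.
merged <- merge(affy1, agilent, by = key, all = TRUE)
```

Rows without a partner in the other file come back with NA in the missing columns, which seems to correspond to the blank cells in merged.txt.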


[R] Generalized Estimating Equation (GEE): Why is Link = Identity?

2010-04-29 Thread Sachi Ito

Hi,

I'm running GEE using geepack.

I set corstr = "ar1" as below:

> m.ar <- geeglm(L ~ O + A,
+ data = firstgrouptxt, id = id,
+ family = binomial, corstr = "ar1")


> summary(m.ar)

Call:
geeglm(formula = L ~ O + A, family = binomial,
data = firstgrouptxt, id = id, corstr = "ar1")

 Coefficients:
            Estimate  Std.err    Wald Pr(>|W|)
(Intercept) -2.62516  0.21154 154.001   <2e-16 ***
ontask       0.00498  0.12143   0.002   0.9673
attachmentB  0.73216  0.35381   4.282   0.0385 *
attachmentC  0.25960  0.33579   0.598   0.4395
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Estimated Scale Parameters:
            Estimate Std.err
(Intercept)    1.277  0.3538

Correlation: Structure = ar1  Link = identity

Estimated Correlation Parameters:
      Estimate  Std.err
alpha    0.978 0.005725
Number of clusters:   49   Maximum cluster size: 533


Then, it shows that :
Correlation: Link = identity

Why is it not Link = logit?
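
A possible reading, not confirmed anywhere in this thread: the Link = identity on the Correlation: line describes how the correlation parameter is modelled (geepack's geese() has a separate cor.link argument whose default is "identity"), not the link of the mean model. The mean-model link comes from the family, which can be checked in base R without fitting anything:

```r
fam <- binomial()

# The link used for the mean model when family = binomial is left at its default.
mean_link <- fam$link

# The intercept reported in the summary is on that link scale; the inverse
# link turns it back into a probability.
p_intercept <- fam$linkinv(-2.62516)
```

If the coefficients in the summary are on the logit scale, as this suggests, the mean model is using the logit link even though the correlation line says identity.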


Thank you,
Sachi


Re: [R] Changing from 32-bit builds to 64-bit builds

2010-04-29 Thread Marc Schwartz

On Apr 29, 2010, at 8:56 AM, Sachi Ito wrote:

 Hi,
 
 Probably this is a very simple question for all the programmers, but how do
 you change from 32-bit builds (default) to 64-bit builds?
 
 I've been trying to run Anova to compare two models, but I get the following
 error message:
 
 Error: cannot allocate vector of size 1.2 Gb
 R(3122,0xa0ab44e0) malloc: *** mmap(size=1337688064) failed (error code=12)
 *** error: can't allocate region
 *** set a breakpoint in malloc_error_break to debug
 R(3122,0xa0ab44e0) malloc: *** mmap(size=1337688064) failed (error code=12)
 *** error: can't allocate region
 *** set a breakpoint in malloc_error_break to debug
 
 
 I suppose it's a problem with memory allocation because of the big data
 size, so I thought I should use 64-bit builds instead of 32.
 
 As it was recommended in a manual, I've entered the following:
 
 CC='gcc -arch x86_64'
 CXX='g++ -arch x86_64'
 F77='gfortran -arch x86_64'
 FC='gfortran -arch x86_64'
 OBJC='gcc -arch x86_64'
 
 But it still gave me error.  I'd greatly appreciate if someone can answer
 this question!
 
 Thank you,
 Sachi


What OS? If Linux, which distribution?

For the more common platforms, there are pre-built 64 bit binary versions of R 
available, including Windows:

  http://cran.r-project.org/bin/windows64/

Also, moving to 64 bit to take advantage of a larger memory space presumes that 
you also have the physical memory available on your computer.

If you successfully built the 64 bit version on your system, it is possible 
that you still have the 32 bit version installed and that is what is being run. 
You can check this by using:

  .Machine$sizeof.pointer

If it returns 4, you are running 32 bit R and if it returns 8, 64 bit.
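
As a small sketch, the same check can be turned into a one-line report; the variable names are chosen here for illustration.

```r
# Pointer size in bytes: 4 on a 32-bit build, 8 on a 64-bit build.
pointer_bytes <- .Machine$sizeof.pointer
bits <- 8 * pointer_bytes

message("Running ", bits, "-bit R (arch: ", R.version$arch, ")")
```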

HTH,

Marc Schwartz


Re: [R] Sweave question

2010-04-29 Thread Felipe Carrillo

Thanks Duncan, it does exactly what I want. How do I get my options back to 
print graphics on the computer screen? I tried options(device="screen") but it 
didn't work.
 
Felipe D. Carrillo
Supervisory Fishery Biologist
Department of the Interior
US Fish & Wildlife Service
California, USA
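
On restoring the default device: a common base-R pattern, offered here as a sketch rather than as the answer given on the list, is to keep the value that options() returns and hand it back afterwards. The variable names are illustrative.

```r
# Remember whatever the default device was before changing it.
original <- getOption("device")

# options() returns the previous values of the options it changes.
old <- options(device = "pdf")
during <- getOption("device")

# ... run Sweave() here ...

# Hand the saved values back to restore the earlier default.
options(old)
restored <- getOption("device")
```

This avoids having to know the platform-specific name of the screen device.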



- Original Message 
 From: Duncan Murdoch murdoch.dun...@gmail.com
 To: Felipe Carrillo mazatlanmex...@yahoo.com
 Cc: r-h...@stat.math.ethz.ch
 Sent: Thu, April 29, 2010 4:12:58 AM
 Subject: Re: [R] Sweave question
 
 On 28/04/2010 11:31 PM, Felipe Carrillo wrote:
 Hi:
 I am using Sweave and texi2dvi to generate a LaTeX document but can't
 find the way to hide the graphics while the R chunks are being
 executed. I thought results=hide would do it but that's not the case.

 Sweave runs figure chunks multiple times.  The first time is probably what
 you're seeing:  it just runs the code, with no special devices created.  You
 need to tell R to use something other than your screen as the default device
 for this.  That's what happens if you run Sweave in batch mode, or if you
 choose options(device="pdf").  (You'll get a file Rplots.pdf created.)

 Duncan Murdoch

 If I do:
 
 \begin{figure}[h]
 
 <<figA=true,echo=F,fig=T,results=hide>>=
 a <- rnorm(1000)
 plot(a)
 @
 \caption{Weekly estimates.}
 \label{figure:ggplot1}
 \end{figure}
 
 The graphic doesn't get displayed but gets printed on the document,
 but the code below shows the graphic...how can I hide it??
 
 \begin{figure}[h]
 
 <<figA=true,echo=F,fig=T,results=hide>>=
 
 library(ggplot2)
 winter <- read.csv("Winter_AllYears.csv")
 
 wintermelt <- melt(winter,id="week")
 
 print(ggplot(wintermelt,aes(week,value/1000)) + 
 geom_line(aes(colour=variable))+ 
 opts(legend.position="none") +
 facet_wrap(~variable,ncol=2) + 
 opts(title="Winter") + labs(y="Number of fish X 1,000",x="WEEK"))
 
 @
 \caption{Weekly estimates.}
 \label{figure:ggplot1}
 
 \end{figure}

Re: [R] Simple loop code

2010-04-29 Thread RCulloch


Thanks Henrique, 

that works! for anyone else as slow as me, just:

##Assign 
x <- factor(dat.ID$ID2, labels = 1:7)  
##Convert to dataframe
x <- as.data.frame(x)
##Then bind to your data
z <- cbind(y,x)

Thanks again, I expected it to be more complicated!

Cheers,

Ross

Re: [R] time zone convert

I tried your new lines with some random time, and it seems to be working
perfectly well, as follows:
> z <- strptime("20/2/06 23:16:16.683", "%d/%m/%y %H:%M:%OS")
> modifyList(z, list(hour = z$hour + 11))
[1] "2006-02-21 10:16:16"

Now it seems that I have some problem with my Time vector. Time was
created by the following code:
Time <- paste(anz$Date.G.,anz$Time.G.)
The original data looks like the following, with the rows of the two columns
corresponding to each other.
Date.G.
01-DEC-2008
01-DEC-2008
02-DEC-2008
03-DEC-2008
04-DEC-2008
...

Time.G.
00:03:57.398
00:04:03.778
00:04:38.639
00:04:38.639
00:04:38.639
...
Somehow, I can't read Time with strptime(Time, "%d-%b-%Y %H:%M:%OS"). Do you
know what is wrong with it?

Sorry for asking such questions, as I am quite new to R. Thanks for helping
me out.

Carol



On Fri, Apr 30, 2010 at 12:16 AM, Henrique Dallazuanna www...@gmail.com wrote:

 Oops,

 I sent you the wrong code; try this instead:

 Time2 <- strptime(Time, '%d-%b-%Y %H:%M:%S')

 modifyList(Time2, list(hour = Time2$hour + 11))

 On Thu, Apr 29, 2010 at 11:14 AM, Carol Gao carol.g...@gmail.com wrote:

 Appreciate it! I was trying on the code you sent, then some error codes
 turned up:

 The first line runs ok, the second line:


  modifyList(Time2, list(hour = Time2$hour + 11))
 Error in Time2$hour : $ operator is invalid for atomic vectors

 The time format I used for reading the Time vector is %d-%b-%Y
 %H:%M:%OS. Should I change any code above?

 Carol





Re: [R] time zone convert

On Thu, Apr 29, 2010 at 11:44 AM, Carol Gao carol.g...@gmail.com wrote:

 I tried your new lines with some random time, and it seems to be working
 perfectly well, as follows:
 > z <- strptime("20/2/06 23:16:16.683", "%d/%m/%y %H:%M:%OS")
 > modifyList(z, list(hour = z$hour + 11))
 [1] "2006-02-21 10:16:16"

 Now it seems that I have some problem with my Time vector. Time was
 created by the following code:
 Time <- paste(anz$Date.G.,anz$Time.G.)
 The original data looks like the following, with the rows of the two
 columns corresponding to each other.
 Date.G.
 01-DEC-2008
 01-DEC-2008
 02-DEC-2008
 03-DEC-2008
 04-DEC-2008
 ...

 Time.G.
 00:03:57.398
 00:04:03.778
 00:04:38.639
 00:04:38.639
 00:04:38.639
 ...
 Somehow, I can't read Time with strptime(Time, "%d-%b-%Y %H:%M:%OS"). Do you
 know what is wrong with it?


Why not?

anz <- data.frame(Date.G = c("01-DEC-2008", "01-DEC-2008", "02-DEC-2008",
                             "03-DEC-2008", "04-DEC-2008"),
                  Time.G = c("00:03:57.398", "00:04:03.778", "00:04:38.639",
                             "00:04:38.639", "00:04:38.639"))

Time <- strptime(paste(anz$Date.G, anz$Time.G), '%d-%b-%Y %H:%M:%S')
modifyList(Time, list(hour = Time$hour + 11))




 Sorry for asking such questions, as I am quite new to R. Thanks for helping
 me out.

 Carol








-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O


[R] EBCDIC

2010-04-29 Thread Michael Steven Rooney

Does R have a package or function that can read a file that has been
downloaded from a mainframe in EBCDIC format?

Thanks,
Mike


[R] Fidelity of lattice graphics captured to jpeg or png

2010-04-29 Thread Rob James

I am generating images via lattice from Frank Harrell's RMS package. 
These images are characterized by coloured lines and grey-scale 
confidence intervals.  I need to port them to OpenOffice etc., and have 
tried both png and jpeg (at high quality), but in neither format can I 
subsequently see the grey-scale confidence intervals.  Other than 
moving to LaTeX, does anyone have other suggestions?



Re: [R] time zone convert

That's weird. I opened a new R window and pasted your code, and it shows

> anz1 <- data.frame(Date.G = c("01-DEC-2008",
+ "01-DEC-2008","02-DEC-2008","03-DEC-2008","04-DEC-2008"),
+   Time.G =
+ c("00:03:57.398","00:04:03.778","00:04:38.639","00:04:38.639","00:04:38.639"))
>
> Time <- strptime(paste(anz1$Date.G, anz1$Time.G), '%d-%b-%Y %H:%M:%S')
> modifyList(Time, list(hour = Time$hour + 11))
[1] NA NA NA NA NA

What could possibly be the reason for that?
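
One possible cause, which the thread does not confirm: %b matches month abbreviations in the current locale, so "DEC" may fail to parse under a non-English LC_TIME setting. The sketch below assumes that diagnosis, switches LC_TIME to "C" for the duration, and adds the 11 hours as seconds to a POSIXct value instead of editing the POSIXlt fields. The tz = "UTC" argument is an arbitrary choice made here to keep the example free of daylight-saving effects, and the two input strings are made up in the format of the post.

```r
# Remember the current setting, then use the "C" locale, whose month
# abbreviations are the English ones (Jan, Feb, ..., Dec).
old_locale <- Sys.getlocale("LC_TIME")
Sys.setlocale("LC_TIME", "C")

Time <- c("01-DEC-2008 00:03:57.398", "04-DEC-2008 23:04:38.639")

# %OS reads seconds including the fractional part.
parsed <- as.POSIXct(strptime(Time, "%d-%b-%Y %H:%M:%OS", tz = "UTC"))

# POSIXct counts seconds, so 11 hours is 11 * 3600.
shifted <- parsed + 11 * 3600

# Put the locale back the way it was.
Sys.setlocale("LC_TIME", old_locale)
```

If the NAs disappear after the locale switch, the month names were the problem; if they do not, the cause lies elsewhere.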


Re: [R] EBCDIC

2010-04-29 Thread Erik Iverson

Perhaps ?read.table, ?file, and ?iconv will offer some information about 
how to use different encodings in R.


Michael Steven Rooney wrote:

Does R have a package or function that can read a file that has been
downloaded from a mainframe in EBCDIC format?

Thanks,
Mike


Re: [R] Simple loop code



On Apr 29, 2010, at 10:37 AM, RCulloch wrote:



Thanks Henrique,

that works! for anyone else as slow as me, just:

##Assign
x <- factor(dat.ID$ID2, labels = 1:7)
##Convert to dataframe
x <- as.data.frame(x)


The more typical methods for converting a factor to a character vector
would be:


(ff <- factor(substring("statistics", 1:10, 1:10), levels=letters))
levels(ff)[ff]
# [1] "s" "t" "a" "t" "i" "s" "t" "i" "c" "s"
as.character(ff)
# [1] "s" "t" "a" "t" "i" "s" "t" "i" "c" "s"


##Then bind to your data
z <- cbind(y,x)


Oooh. Not a good practice, at least for the newish useR. cbind and
rbind create matrices and as a consequence coerce all of their
elements to be of the same type. Numeric columns would become
character vectors. Not generally a desired result. This would be safer:


dat.ID$ID2.cf <- as.character( factor(dat.ID$ID2, labels = 1:7) )
--
David.
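
A caveat worth checking on your own data: whether cbind returns a matrix depends on its arguments. A small base-R sketch with made-up vectors num and chr:

```r
num <- c(1.5, 2.5)
chr <- c("a", "b")

# Only atomic vectors: the result is a matrix, so everything is coerced
# to one type (character here).
m <- cbind(num, chr)

# At least one data frame argument: cbind uses its data-frame method,
# and each column keeps its own type.
df <- cbind(data.frame(num), chr)

# Assigning a new column directly has the same effect and is explicit.
d2 <- data.frame(num)
d2$chr <- chr
```

With only atomic vectors the result is a character matrix, as described above; if any argument is a data frame (as x is after as.data.frame(x)), the column types are kept.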


Thanks again, I expected it to be more complicated!

Cheers,

Ross


David Winsemius, MD
West Hartford, CT


Re: [R] Fidelity of lattice graphics captured to jpeg or png

2010-04-29 Thread Joshua Wiley

When I need high quality graphics from R, I usually use pdf or
postscript.  If you need a rasterized format, use a graphics editing
program to rasterize at whatever quality you want (e.g., GIMP which is
free).

HTH,

Josh

On Thu, Apr 29, 2010 at 8:05 AM, Rob James r...@aetiologic.ca wrote:
 I am generating images via lattice from Frank Harrell's RMS package. These
 images are characterized by coloured lines and grey-scale confidence
 intervals.  I need to port them to OpenOffice etc., and have tried both png
 and jpeg (at high quality), but in neither format can I subsequently see
 the grey-scale confidence intervals.  Other than moving to LaTeX, does
 anyone have other suggestions?





-- 
Joshua Wiley
Senior in Psychology
University of California, Riverside
http://www.joshuawiley.com/


Re: [R] Using plyr::dply more (memory) efficiently?

2010-04-29 Thread Steve Lianoglou

Hi Matthew,

On Thu, Apr 29, 2010 at 9:52 AM, Matthew Dowle mdo...@mdowle.plus.com wrote:
 I don't know about that, but try this:

 install.packages("data.table", repos="http://R-Forge.R-project.org")
 require(data.table)
 summaries = data.table(summaries)
 summaries[,sum(counts),by=symbol]

 Please let us know if that returns the correct result, and if its
 memory/speed is ok?

Thanks for directing me to the data.table package. I read through some
of the vignettes, and it looks quite nice.

While your sample code would provide an answer if I just wanted to
compute some summary statistic/function of groups of my data.frame
(using `by=symbol`), what's the best way to produce several pieces of
info per subset?

For instance, I see that I can do something like this:

summaries[, list(counts=sum(counts), width=sum(exon.width)), by=symbol]

But what if I need to do some more complex processing within the
subsets defined in `by=symbol` -- like several lines of programming
logic for 1 result, say.

I guess I can open a new block that just returns a data.table? Like:

summaries[, {
  cnts <- sum(counts)
  ew <- sum(exon.width)
  # ... some complex things
  complex <- # .. result of complex things
  data.table(counts=cnts, width=ew, cplx=complex)
}, by=symbol]

Is that right? (I mean, it looks like it's working, but maybe there's
a more idiomatic way(?))

-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


[R] Fidelity of lattice graphics captured to jpeg or png... followup

2010-04-29 Thread Rob James

Subsequent investigations (via GIMP) show that the problem is in OO, and
not with the images themselves.

Off to the OO forums.

 Original Message 
Subject:Fidelity of lattice graphics captured to jpeg or png
Date:   Thu, 29 Apr 2010 08:05:04 -0700
From:   Rob James r...@aetiologic.ca
To: r-help@r-project.org

I am generating images via lattice from Frank Harrell's RMS package.
These images are characterized by coloured lines and grey-scale
confidence intervals.  I need to port them to OpenOffice etc., and have
tried both png and jpeg (at high quality), but in neither format can I
subsequently see the grey-scale confidence intervals.  Other than
moving to LaTeX, does anyone have other suggestions?


[R] generating correlated random variables from different distributions

2010-04-29 Thread Richard and Barbara Males

I need to generate a set of correlated random variables for a Monte
Carlo simulation.   The solutions I have found
(http://www.stat.uiuc.edu/stat428/cndata.html,
http://www.sitmo.com/doc/Generating_Correlated_Random_Numbers), using
Cholesky Decomposition, seem to work only if the variables come from
the same distribution with the same parameters.  My situation is that
each variable may be described by a different distribution (or
different parameters of the same distribution).  This approach does
not seem to work, see code and results below.  Am I missing something
here?  My math/statistics is not very good, will I need to generate
correlated uniform random variables on (0,1) and then use the inverse
distributions to get the desired results I am looking for?  That is
acceptable, but I would prefer to just generate the individual
distributions and then correlate them.  Any advice much appreciated.
Thanks in advance

R. Males
Cincinnati, Ohio, USA

Sample Code:
# Testing Correlated Random Variables

# reference http://www.sitmo.com/doc/Generating_Correlated_Random_Numbers
# reference http://www.stat.uiuc.edu/stat428/cndata.html
# create the correlation matrix
corMat=matrix(c(1,0.6,0.3,0.6,1,0.5,0.3,0.5,1),3,3)
cholMat=chol(corMat)
# create the matrix of random variables
set.seed(1000)
nValues=10000

# generate some random values

matNormalAllSame=cbind(rnorm(nValues),rnorm(nValues),rnorm(nValues))
matNormalDifferent=cbind(rnorm(nValues,1,1.5),rnorm(nValues,2,0.5),rnorm(nValues,6,1.8))
matUniformAllSame=cbind(runif(nValues),runif(nValues),runif(nValues))
matUniformDifferent=cbind(runif(nValues,1,1.5),runif(nValues,2,3.5),runif(nValues,6,10.8))

# bind to a matrix
print("correlation Matrix")
print(corMat)
print("Cholesky Decomposition")
print(cholMat)

# test same normal

resultMatNormalAllSame=matNormalAllSame%*%cholMat
print("correlation matNormalAllSame")
print(cor(resultMatNormalAllSame))

# test different normal

resultMatNormalDifferent=matNormalDifferent%*%cholMat
print("correlation matNormalDifferent")
print(cor(resultMatNormalDifferent))

# test same uniform
resultMatUniformAllSame=matUniformAllSame%*%cholMat
print("correlation matUniformAllSame")
print(cor(resultMatUniformAllSame))

# test different uniform
resultMatUniformDifferent=matUniformDifferent%*%cholMat
print("correlation matUniformDifferent")
print(cor(resultMatUniformDifferent))

and results

[1] "correlation Matrix"
     [,1] [,2] [,3]
[1,]  1.0  0.6  0.3
[2,]  0.6  1.0  0.5
[3,]  0.3  0.5  1.0
[1] "Cholesky Decomposition"
     [,1] [,2]      [,3]
[1,]    1  0.6 0.3000000
[2,]    0  0.8 0.4000000
[3,]    0  0.0 0.8660254
[1] "correlation matNormalAllSame" == ok
          [,1]      [,2]      [,3]
[1,] 1.0000000 0.6036468 0.3013823
[2,] 0.6036468 1.0000000 0.5005440
[3,] 0.3013823 0.5005440 1.0000000
[1] "correlation matNormalDifferent" == no good
          [,1]      [,2]      [,3]
[1,] 1.0000000 0.9141472 0.2676162
[2,] 0.9141472 1.0000000 0.2959178
[3,] 0.2676162 0.2959178 1.0000000
[1] "correlation matUniformAllSame" == ok
          [,1]      [,2]      [,3]
[1,] 1.0000000 0.5971519 0.2959195
[2,] 0.5971519 1.0000000 0.5011267
[3,] 0.2959195 0.5011267 1.0000000
[1] "correlation matUniformDifferent" == no good
          [,1]      [,2]      [,3]
[1,] 1.0000000 0.2312000 0.0351460
[2,] 0.2312000 1.0000000 0.1526293
[3,] 0.0351460 0.1526293 1.0000000
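
A note on why the "different" cases come out wrong: multiplying by the Cholesky factor only reproduces corMat when the input columns are independent and share the same variance. With standard deviations 1.5 and 0.5, for instance, the first two columns end up with a correlation of about 0.91 rather than 0.6, which matches the output above. One possible way around it is the route suggested in the question: correlate standard normals, map them to uniforms on (0,1) with pnorm(), then apply the inverse distribution function of each desired marginal. The following is only a minimal sketch under those assumptions; the three marginals are illustrative choices, not taken from the original post, and the Pearson correlations of the result will be close to, but not exactly equal to, the target values.

```r
set.seed(1000)
nValues <- 10000
corMat <- matrix(c(1,   0.6, 0.3,
                   0.6, 1,   0.5,
                   0.3, 0.5, 1), 3, 3)

# Step 1: independent standard normals, correlated via the Cholesky factor
z <- matrix(rnorm(nValues * 3), nValues, 3) %*% chol(corMat)

# Step 2: map each column to a uniform on (0,1)
u <- pnorm(z)

# Step 3: apply the inverse CDF of each desired marginal
# (illustrative marginals: normal, uniform, exponential)
x <- cbind(normal  = qnorm(u[, 1], mean = 1, sd = 1.5),
           uniform = qunif(u[, 2], min = 2, max = 3.5),
           expo    = qexp(u[, 3], rate = 0.5))

print(round(cor(x), 2))                       # Pearson, close to corMat
print(round(cor(x, method = "spearman"), 2))  # rank correlation
```

If the exact Pearson correlation matters more than the exact marginals, an alternative is to standardize each column before applying the Cholesky factor and rescale afterwards; that keeps the correlations but only preserves the marginal shape for normal inputs.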



Re: [R] Split a vector by NA's - is there a better solution then a loop ?

2010-04-29 Thread Charles C. Berry


On Thu, 29 Apr 2010, Barry Rowlingson wrote:


On Thu, Apr 29, 2010 at 1:27 PM, Henrique Dallazuanna www...@gmail.com wrote:

Another option could be:

split(x, replace(cumsum(is.na(x)), is.na(x), -1))[-1]



One thing none of the solutions so far do (except I haven't tried
Tal's original code) is insert an empty group between adjacent NA
values, for example in:

x = c(1,2,3,NA,NA,4,5,6)

 split(x, replace(cumsum(is.na(x)), is.na(x), -1))[-1]
$`0`
[1] 1 2 3

$`2`
[1] 4 5 6

Maybe this never happens in Tal's case, or it's not what he wanted
anyway, but I thought I'd point it out!


The ever useful rle() helps


y <- rle(!is.na(x))
split(x, rep( cumsum(y$val)*y$val, y$len ) )[-1]

$`1`
[1] 1 2 3

$`2`
[1] 4 5 6
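
The rle() idea above can be wrapped into a small helper. This is only a sketch: the function name split_by_na is made up for illustration, and it drops the NA values before splitting instead of relying on [-1], so a vector without any NA still keeps its single group.

```r
# Hypothetical helper: split a vector into runs of non-NA values
split_by_na <- function(x) {
  ok   <- !is.na(x)
  runs <- rle(ok)
  # number the non-NA runs 1, 2, ...; NA runs get group 0
  grp  <- rep(cumsum(runs$values) * runs$values, runs$lengths)
  # keep only the non-NA elements, so group 0 never appears
  unname(split(x[ok], grp[ok]))
}

x <- c(1, 2, 3, NA, NA, 4, 5, 6)
res <- split_by_na(x)
print(res)
```

Like the other solutions in this thread, it does not insert an empty group between adjacent NA values.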


Chuck



Barry



Charles C. Berry(858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901


Re: [R] merged files

2010-04-29 Thread Alex Jameson

Hi,

I have two files (file1.txt and file2.txt) which I would like to merge
based on certain criteria, i.e. the merge should combine rows with
matching GeneID and Exons.
I have used merge(), but it does not give me the desired outcome.
merged.txt shows the result I would like.




*File1.txt*

   AffyProbe          ProbeType Flag GeneSymbol GeneID Exons Chrom Strand AffyStart AffyEnd
1  1007_s_at:1105:483 0         0    DDR1       780    21    6     +      30975403  30975427
2  1007_s_at:1119:177 0         0    DDR1       780    21    6     +      30975549  30975573
3  1007_s_at:1136:469 0         0    DDR1       780    21    6     +      30975766  30975790
4  1007_s_at:192:205  0         0    DDR1       780    21    6     +      30975523  30975547
5  1007_s_at:474:1161 0         0    DDR1       780    21    6     +      30975745  30975769
6  1007_s_at:504:983  0         0    DDR1       780    21    6     +      30975575  30975599
7  1007_s_at:50:779   0         0    DDR1       780    21    6     +      30975758  30975782

*File2.txt*

   AgilentProbe ProbeType Flag GeneSymbol GeneID Exons Chrom Strand AgilentStart AgilentEnd
1  A_23_P11     0         0    FAM174B    400451 5     15    -      90961852     90961793
2  A_23_P100022 0         0    SV2B       9899   14    15    +      89639333     89639392
3  A_23_P100056 0         0    RBPMS2     348093 8     15    -      62819428     62819369
4  A_23_P100074 0         0    AVEN       57099  6     15    -      31946031     31945972
5  A_23_P100092 0         0    ZSCAN29    146050 5     15    -      41440680     41440621
6  A_23_P100103 0         0    VPS39      23339  24    15    -      40240319     40240260
7  A_23_P100111 0         0    CHP        11261  7     15    +      39358845     39358904
8  A_23_P100127 0         0    CASC5      57082  11    15    +      38704817     38704876
9  A_23_P100133 0         0    ATMIN      23300  4     16    +      79636596     79636655
10 A_23_P100141 0         0    UNKL       64718  12    16    -      1355346      1355287


*merged.txt (Should look like this)*

   GeneSymbol GeneID Exons Chrome AffyMatrixProbeID AffyStart AffyEnd
AgilentProbeID AgilentStart AgilentEnd DDR1 780 21 6   A_24_P123601
30975848 30975907 RFC2 5982 10 7 1053_at:120:925,
1053_at:504:41,
1053_at:522:871,
1053_at:828:1025,
203696_s_at:291:651 73287845,
73287869,
73287863,
73287881,
73287850 73287821,
73287845,
73287839,
73287857,
73287826 A_23_P93823 73287861 73287802 RFC2 5982 11 7 HSPA6 3310
1 1   A_23_P114903 159762782 159762841 PAX8 7849 12 2   A_23_P210001
113691555 113691496 GUCA1A 2978 6 6 UBA7 7318 24 3
1294_at:1079:379,
1294_at:361:881,
203281_s_at:524:889,
203281_s_at:678:1017,
203281_s_at:68:1153 49818386,
49818398,
49818378,
49818434,
49818422 49818362,
49818374,
49818354,
49818420,
49818398
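
One possible way to get a layout like the one above is to first collapse each file to one row per gene/exon combination and then do an outer merge on the shared key columns. The sketch below is a guess at what is wanted, not a tested solution for the real files: the two small data frames are toy stand-ins that reuse a few values from the tables above, and the helper name collapse is made up for illustration.

```r
# Toy stand-ins for file1.txt and file2.txt (a few values reused from above)
file1 <- data.frame(
  AffyProbe  = c("1007_s_at:1105:483", "1007_s_at:1119:177"),
  GeneSymbol = "DDR1", GeneID = 780, Exons = 21, Chrom = "6",
  AffyStart  = c(30975403, 30975549),
  AffyEnd    = c(30975427, 30975573),
  stringsAsFactors = FALSE)

file2 <- data.frame(
  AgilentProbe = c("A_24_P123601", "A_23_P100022"),
  GeneSymbol   = c("DDR1", "SV2B"),
  GeneID       = c(780, 9899),
  Exons        = c(21, 14),
  Chrom        = c("6", "15"),
  AgilentStart = c(30975848, 89639333),
  AgilentEnd   = c(30975907, 89639392),
  stringsAsFactors = FALSE)

keys <- c("GeneSymbol", "GeneID", "Exons", "Chrom")

# Hypothetical helper: paste all values of a group into one string
collapse <- function(v) paste(v, collapse = ", ")

# One row per gene/exon, with the probes and positions collapsed
affy <- aggregate(file1[c("AffyProbe", "AffyStart", "AffyEnd")],
                  by = file1[keys], FUN = collapse)
agilent <- aggregate(file2[c("AgilentProbe", "AgilentStart", "AgilentEnd")],
                     by = file2[keys], FUN = collapse)

# all = TRUE keeps gene/exon pairs that appear in only one of the files
merged <- merge(affy, agilent, by = keys, all = TRUE)
print(merged)
```

With the real data, file1 and file2 would come from read.table() or read.delim() with header = TRUE; the column names would need to match the actual headers.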


sorry for the long tables,

thanks

Alex

Student
University of Colorado


[R] Tinn-R related problem

2010-04-29 Thread Maspfuhl, Oliver

Dear Mr Hewitt,

I am having exactly the same problem as described on page

https://stat.ethz.ch/pipermail/r-help/2008-March/156809.html

(please find a copy below). I wonder if you happen to have heard of any
solution to it (i.e. which Windows settings have to be altered in order to
solve the problem). The mysterious thing about it is that I didn't change
anything before this happened: I didn't upgrade R, Tinn-R or any other program;
it happened right in the middle of working with R.

Many thanks in advance, and kind regards,
Oliver

Oliver Maspfuhl

Commerzbank AG

oliver.maspf...@commerzbank.com

http://www.commerzbank.de

[R] Tinn-R related problem

David Hewitt dhewitt37 at gmail.com

Mon Mar 10 17:10:34 CET 2008


A few weeks ago all of a sudden the backspace, enter and direction keys
were not working. I updated Tinn-R to the newest version but still no
solution. After this I tried reinstalling it (prior to that I removed
Tinn-R and deleted all the leftovers manually) and still no change. In
every other execution (e.g. when I save a file) every key works fine.

I've used Tinn-R with R on Win XP ever since I started with R, and I've
never had this problem. The only immediate thing that comes to mind is that
you should be installing R in SDI mode to get it working with Tinn-R. At
least that's what they say, and I've never tried it the other way (MDI).
Maybe just uninstall R and Tinn-R, then reload R, use Custom installation
and pick SDI, then reinstall Tinn-R. Worth a shot.

From what I have read in the other forums I believe this issue is not
necessarily R or Tinn-R related but might be some hidden Windows settings
(I'm using XP) but of this I'm not sure.

If that's the case, I can't help. What occurred a few weeks ago that might
have been related? Did you upgrade R?

-
David Hewitt
Virginia Institute of Marine Science
http://www.vims.edu/fish/students/dhewitt/
--
View this message in context:
http://www.nabble.com/Tinn-R-related-problem-tp15950714p15950865.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] time zone convert

Thanks! I think it now works after I changed the time zone and language
settings on my PC. It seems that when the system was set to a language
other than English, it read the time a bit differently.
Not sure if that was the reason, but thanks for your help.
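
A likely explanation is that %b matches month abbreviations in the current locale, so "DEC" may fail to parse (giving NA) under a non-English LC_TIME setting. The following is a minimal sketch, assuming the timestamps can be treated as UTC, that switches LC_TIME to "C" just for parsing and then adds 11 hours; the two sample values are taken from the data posted in this thread.

```r
Time <- c("01-DEC-2008 00:00:28.611", "03-DEC-2008 00:22:17.447")

# Parse with English month abbreviations regardless of the system language
old_locale <- Sys.getlocale("LC_TIME")
invisible(Sys.setlocale("LC_TIME", "C"))
parsed <- as.POSIXct(strptime(Time, "%d-%b-%Y %H:%M:%OS", tz = "UTC"))
invisible(Sys.setlocale("LC_TIME", old_locale))

# POSIXct counts seconds, so 11 hours is 11 * 60 * 60
shifted <- parsed + 11 * 60 * 60
print(format(shifted, "%Y-%m-%d %H:%M:%S"))
```

Whether UTC is the right zone depends on where the data came from; that part is an assumption here.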

Cheers,

Carol

On Fri, Apr 30, 2010 at 1:05 AM, Carol Gao carol.g...@gmail.com wrote:

 that's weird. I opened a new R window and paste your code, it turns up
 showing

  anz1 <- data.frame(Date.G = c("01-DEC-2008",
 "01-DEC-2008","02-DEC-2008","03-DEC-2008","04-DEC-2008"),

 +   Time.G =
 c("00:03:57.398","00:04:03.778","00:04:38.639","00:04:38.639","00:04:38.639"))
 
  Time <- strptime(paste(anz1$Date.G, anz1$Time.G), '%d-%b-%Y %H:%M:%S')

  modifyList(Time, list(hour = Time$hour + 11))
 [1] NA NA NA NA NA

 What could possibly be the reason for that?


 On Fri, Apr 30, 2010 at 12:52 AM, Henrique Dallazuanna 
 www...@gmail.comwrote:



 On Thu, Apr 29, 2010 at 11:44 AM, Carol Gao carol.g...@gmail.com wrote:

 I tried your new lines with some random time, it seems to be working
 perfectly well, just as follows:
  z <- strptime("20/2/06 23:16:16.683", "%d/%m/%y %H:%M:%OS")
  modifyList(z, list(hour = z$hour + 11))
 [1] 2006-02-21 10:16:16

 Now it seems that I have some problem with my Time vector. As Time was
 created by the following code:
 Time <- paste(anz$Date.G.,anz$Time.G.)
 The original data looks like the following with each row correspond to
 each.
 Date.G.
 01-DEC-2008
 01-DEC-2008
 02-DEC-2008
 03-DEC-2008
 04-DEC-2008
 ...

 Time.G.
 00:03:57.398
 00:04:03.778
 00:04:38.639
 00:04:38.639
 00:04:38.639
 ...
 Somehow, I can't read Time with strptime(Time, "%d-%b-%Y %H:%M:%OS"). Do
 you know what is wrong with it?


 Why not?

 anz <- data.frame(Date.G = c("01-DEC-2008",
 "01-DEC-2008","02-DEC-2008","03-DEC-2008","04-DEC-2008"),
   Time.G =
 c("00:03:57.398","00:04:03.778","00:04:38.639","00:04:38.639","00:04:38.639"))

 Time <- strptime(paste(anz$Date.G, anz$Time.G), '%d-%b-%Y %H:%M:%S')
 modifyList(Time, list(hour = Time$hour + 11))




 Sorry for asking such questions, as I am quite new to R. Thanks for
 helping me out.

 Carol




 On Fri, Apr 30, 2010 at 12:16 AM, Henrique Dallazuanna www...@gmail.com
  wrote:

 Ops,

 I sent to you a wrong code, try this indeed:

 Time2 <- strptime(Time, '%d-%b-%Y %H:%M:%S')

 modifyList(Time2, list(hour = Time2$hour + 11))

 On Thu, Apr 29, 2010 at 11:14 AM, Carol Gao carol.g...@gmail.comwrote:

 Appreciate it! I was trying on the code you sent, then some error codes
 turned up:

 The first line runs ok, the second line:


  modifyList(Time2, list(hour = Time2$hour + 11))
 Error in Time2$hour : $ operator is invalid for atomic vectors

 The time format I used for reading the Time vector is %d-%b-%Y
 %H:%M:%OS. Should I change any code above?

 Carol




 On Thu, Apr 29, 2010 at 11:47 PM, Henrique Dallazuanna 
 www...@gmail.com wrote:

 Try this:

 Time2 <- gsub("\\.*", "", tolower(Time))
 modifyList(Time2, list(hour = Time2$hour + 11))


 On Thu, Apr 29, 2010 at 10:33 AM, Carol Gao carol.g...@gmail.comwrote:

 Hi there,

 I've got a column vector in a csv file as follows, and I need to add
 11
 hours to each of them. Is there a way that I can do it? (The actual
 file
 size is much bigger than this.)

 Time
 01-DEC-2008 00:00:28.611
 01-DEC-2008 00:00:43.155
 01-DEC-2008 00:01:06.677
 01-DEC-2008 00:01:06.677
 01-DEC-2008 00:01:06.677
 01-DEC-2008 00:01:06.919
 01-DEC-2008 00:23:46.452
 02-DEC-2008 00:03:17.646
 02-DEC-2008 00:03:17.652
 03-DEC-2008 00:15:11.485
 03-DEC-2008 00:18:44.652
 03-DEC-2008 00:22:17.447

 Thank you in advance.

 Cheers,

 Carol





 --
 Henrique Dallazuanna
 Curitiba-Paraná-Brasil
 25° 25' 40 S 49° 16' 22 O





 --
 Henrique Dallazuanna
 Curitiba-Paraná-Brasil
 25° 25' 40 S 49° 16' 22 O





 --
 Henrique Dallazuanna
 Curitiba-Paraná-Brasil
 25° 25' 40 S 49° 16' 22 O





[R] Odp: convert Factor as numeric

Hi

You have to get rid of the thousands separator first

as.numeric(gsub(",", "", S))
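
A small self-contained illustration of that line, using a few of the SETTLEMENT values from the post below. It is only a sketch: the explicit as.character() and fixed = TRUE are additions here, not required, to make it clear that the factor labels are converted and that the comma is matched literally.

```r
# A factor whose labels contain a thousands separator
S <- factor(c("16.5400", "78.1300", "1,739.4000", "467.7500", "2,421.0500"))

# as.numeric(as.character(S)) alone gives NA for "1,739.4000" and "2,421.0500"
S_num <- as.numeric(gsub(",", "", as.character(S), fixed = TRUE))
print(S_num)
```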

Regards
Petr

r-help-boun...@r-project.org napsal dne 29.04.2010 13:12:44:

 Dear group,
 
 I know this issue has been already covered, and before you reply I must 
say
 I have read the R-FAQ and search the mailing list archive.
 I still can't manage to change my factor to numeric as I couldn't find 
any
 clear answer.
 
 Here is my df :
 
 Pose1 -
 structure(list(DESCRIPTION = structure(c(1L, 2L, 3L, 4L, 5L, 
 8L), .Label = c( SUGAR NO.11 May/10 , COTTON NO.2 May/10 , 
 PLATINUM Jul/10 , ROBUSTA COFFEE (10) May/10 , WHEAT May/10 , 
 PRIMARY NICKEL USD, PRM HGH GD ALUMINIUM USD, SPCL HIGH GRADE ZINC
 USD, 
 STANDARD LEAD USD), class = factor), POSITION = c(5, 3, -1, 
 15, 4, 2), SETTLEMENT = structure(c(3L, 5L, 2L, 1L, 4L, 8L), .Label =
 c(1,353., 
 1,739.4000, 16.5400, 467.7500, 78.1300, 25,760.8600, 
 2,415.9000, 2,421.0500, 2,357.1200), class = factor)), .Names =
 c(DESCRIPTION, 
 POSITION, SETTLEMENT), row.names = c(1, 2, 3, 4, 
 5, 51), class = data.frame)
 
 S <- Pose1$SETTLEMENT  #select the last column
  S
 [1] 16.540078.13001,739.4000 1,353. 467.7500   2,421.0500
 Levels: 1,353. 1,739.4000 16.5400 467.7500 78.1300 25,760.8600
 2,415.9000 2,421.0500 2,357.1200
  str(S)
  Factor w/ 9 levels 1,353.,1,739.4000,..: 3 5 2 1 4 8
 
 Now I need to change S to numeric class
 
  S1 <- as.numeric(levels(S))[as.integer(S)]   #doesn't work, numbers are rounded or NA
 Warning message:
 NAs introduced by coercion
 
  S1 <- as.numeric(levels(S))[S]  #doesn't work, numbers are rounded or NA
 Warning message:
 NAs introduced by coercion
 
  S1 <- as.numeric(as.character(S))  #doesn't work, numbers are rounded or NA
 Warning message:
 NAs introduced by coercion
 
 If it can help, my column S is part of a DF that has been obtained via 
this
 line :
 
 
 pose=read.csv2("LSCPos1.csv",sep=",",dec=".",as.is=T,h=T,skip=1)[,c(4,8,14,15)]
 
 pose -
 structure(list(DESCRIPTION = c(WHEAT May/10 , WHEAT May/10 , 
 WHEAT May/10 , WHEAT May/10 , COTTON NO.2 May/10 , COTTON NO.2 
May/10
 , 
 COTTON NO.2 May/10 , PLATINUM Jul/10 ,  SUGAR NO.11 May/10 , 
  SUGAR NO.11 May/10 ,  SUGAR NO.11 May/10 ,  SUGAR NO.11 May/10 , 
  SUGAR NO.11 May/10 , ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE 
(10)
 May/10 , 
 ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 , 
 ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 , 
 ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 , 
 ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 , 
 ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 , 
 PRM HGH GD ALUMINIUM USD 09/07/10 , PRM HGH GD ALUMINIUM USD 09/07/10 
, 
 PRIMARY NICKEL USD 04/06/10 , PRIMARY NICKEL USD 04/06/10 , 
 PRIMARY NICKEL USD 10/06/10 , PRIMARY NICKEL USD 10/06/10 , 
 STANDARD LEAD USD 01/07/10 , STANDARD LEAD USD 01/07/10 , 
 STANDARD LEAD USD 01/07/10 , STANDARD LEAD USD 01/07/10 , 
 STANDARD LEAD USD 01/07/10 , STANDARD LEAD USD 01/07/10 , 
 STANDARD LEAD USD 01/07/10 , STANDARD LEAD USD 06/07/10 , 
 SPCL HIGH GRADE ZINC USD 08/07/10 , SPCL HIGH GRADE ZINC USD 08/07/10 
, 
 SPCL HIGH GRADE ZINC USD 08/07/10 , SPCL HIGH GRADE ZINC USD 09/07/10 
, 
 SPCL HIGH GRADE ZINC USD 09/07/10 , SPCL HIGH GRADE ZINC USD 09/07/10 
, 
 SPCL HIGH GRADE ZINC USD 09/07/10 , SPCL HIGH GRADE ZINC USD 09/07/10 
, 
 SPCL HIGH GRADE ZINC USD 13/04/10 , SPCL HIGH GRADE ZINC USD 13/04/10 

 ), CREATED.DATE = structure(c(14705, 14707, 14707, 14711, 14700, 
 14700, 14711, 14711, 14708, 14708, 14708, 14711, 14711, 14707, 
 14707, 14707, 14707, 14707, 14708, 14708, 14708, 14708, 14708, 
 14708, 14708, 14708, 14708, 14672, 14673, 14678, 14678, 14700, 
 14700, 14700, 14700, 14700, 14700, 14700, 14705, 14707, 14707, 
 14707, 14708, 14708, 14708, 14708, 14708, 14622, 14634), class = 
Date), 
 QUANITY = c(1, 1, 1, 1, 1, 1, 1, -1, 1, 1, 1, 1, 1, 2, 1, 
 1, 1, 2, 1, 1, 1, 1, 2, 1, 1, -1, 1, 1, -1, -1, 1, 1, -1, 
 1, -1, -1, 1, -1, 1, 1, 1, -1, -1, 1, -1, 1, 1, 1, -1), 
CLOSING.PRICE =
 c(467.7500, 
 467.7500, 467.7500, 467.7500, 78.1300, 78.1300, 
 78.1300, 1,739.4000, 16.5400, 16.5400, 16.5400, 
 16.5400, 16.5400, 1,353., 1,353., 1,353., 
 1,353., 1,353., 1,353., 1,353., 
1,353., 
 1,353., 1,353., 1,353., 1,353., 
2,415.9000, 
 2,415.9000, 25,755.7100, 25,755.7100, 25,760.8600, 
 25,760.8600, 2,355.9600, 2,355.9600, 2,355.9600, 
 2,355.9600, 2,355.9600, 2,355.9600, 2,355.9600, 
2,357.1200, 
 2,420.7300, 2,420.7300, 2,420.7300, 2,421.0500, 
2,421.0500, 
 2,421.0500, 2,421.0500, 2,421.0500, 2,388.4300, 2,388.4300
 )), .Names = c(DESCRIPTION, CREATED.DATE, QUANITY, 
 SETTLEMENT), row.names = c(NA, -49L), class = data.frame)
 
  str(pose)
 'data.frame':   49 obs. of  4 variables:
  $ DESCRIPTION : chr  WHEAT May/10  WHEAT May/10  WHEAT May/10  
WHEAT
 May/10  ...
  $ CREATED.DATE:Class 'Date'  num [1:49] 14705 14707 14707 14711 14700 
...
  $ QUANITY

Re: [R] time zone convert



On Apr 29, 2010, at 10:14 AM, Carol Gao wrote:

Appreciate it! I was trying on the code you sent, then some error codes
turned up:

The first line runs ok, the second line:


modifyList(Time2, list(hour = Time2$hour + 11))

Error in Time2$hour : $ operator is invalid for atomic vectors

The time format I used for reading the Time vector is %d-%b-%Y %H:%M:%OS.


It appears you have already created a datetime object from a read  
operation on that csv file, in which case adding 11 hours should be  
straightforward.


Time.plus.11hr <- Time + 11*60*60

--
David


Should I change any code above?

Carol



On Thu, Apr 29, 2010 at 11:47 PM, Henrique Dallazuanna www...@gmail.com 
wrote:



Try this:

Time2 <- gsub("\\.*", "", tolower(Time))
modifyList(Time2, list(hour = Time2$hour + 11))


On Thu, Apr 29, 2010 at 10:33 AM, Carol Gao carol.g...@gmail.com  
wrote:



Hi there,

I've got a column vector in a csv file as follows, and I need to  
add 11
hours to each of them. Is there a way that I can do it? (The  
actual file

size is much bigger than this.)

Time
01-DEC-2008 00:00:28.611
01-DEC-2008 00:00:43.155
01-DEC-2008 00:01:06.677
01-DEC-2008 00:01:06.677
01-DEC-2008 00:01:06.677
01-DEC-2008 00:01:06.919
01-DEC-2008 00:23:46.452
02-DEC-2008 00:03:17.646
02-DEC-2008 00:03:17.652
03-DEC-2008 00:15:11.485
03-DEC-2008 00:18:44.652
03-DEC-2008 00:22:17.447

Thank you in advance.

Cheers,

Carol






--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O





David Winsemius, MD
West Hartford, CT


Re: [R] Multiple cex sizes in main for xyplot?

2010-04-29 Thread Bert Gunter

Felix:

Oh, yes. That gives me what I want without having to resort to padding
parameters.

I don't know why it works (vs specifying the y locations), but I suppose
that's confounded with the details of lattice engineering, which I wanted to
avoid.

So many thanks for your help.

Bert Gunter
Genentech Nonclinical Biostatistics
 
 
-Original Message-
From: foolish.andr...@gmail.com [mailto:foolish.andr...@gmail.com] On Behalf
Of Felix Andrews
Sent: Wednesday, April 28, 2010 4:33 PM
To: Bert Gunter
Cc: r-help@r-project.org
Subject: Re: [R] Multiple cex sizes in main for xyplot?

I don't think there's a much better way to do it... but this seems to work:

xyplot((0:1)~(0:1),
   main = textGrob(lab=c("Some Text","\nSome More Text"), x=c(0.5,0.5),
   gp=gpar(cex=c(1.2,1.0), lineheight=2))
   )

-Felix


On 29 April 2010 08:06, Bert Gunter gunter.ber...@gene.com wrote:
 Folks:

 I would like to write two lines of text in two different font sizes (or
 faces or ...) as the title (main) of a trellis plot.  The following code
 does it, but not well:

 xyplot((0:1)~(0:1),
        main = textGrob(lab=c("Some Text","Some More Text"), y=c(.95,.8),
                        gp=gpar(cex=c(1.2,1.0)))
        )

 There is too much space between the title text and the plot. I assume that
 can be fixed by fooling with padding settings in lattice.options(), but my
 question is: Is there a better, simpler way to do this?
 Would using grid graphics directly by pushing title and plot viewports and
 then adding the lattice graph to the plot viewport be a better way to go?

 OS = Windows XP
 R = 2.11.0
 lattice_0.18-3
 device = windows


 Bert Gunter
 Genentech Nonclinical Biostatistics







-- 
Felix Andrews / ???
Postdoctoral Fellow
Integrated Catchment Assessment and Management (iCAM) Centre
Fenner School of Environment and Society [Bldg 48a]
The Australian National University
Canberra ACT 0200 Australia
M: +61 410 400 963
T: + 61 2 6125 4670
E: felix.andr...@anu.edu.au
CRICOS Provider No. 00120C
-- 
http://www.neurofractal.org/felix/


Re: [R] Generalized Estimating Equation (GEE): Why is Link = Identity?

2010-04-29 Thread Thomas Stewart

From the GEE article in R News, Vol. 2/3, December 2002:

Allows different covariates in separate models
for the mean, scale, and correlation via various
link functions.

Geepack offers link functions for the scale, correlation, and mean models.
 As the output suggests,

Correlation: Structure = ar1  Link = identity

does not refer to the mean link.  In fact, if you look at the output from
m.ar you would see:

Scale Link:   identity
Estimated Scale Parameters:  [1] 1

Correlation:  Structure = ar1Link = identity

See the R news article for more info on other correlation and scale link
functions.  The take home message is this: the mean link is exactly what you
think it is, the logit.

-tgs



On Thu, Apr 29, 2010 at 10:28 AM, Sachi Ito s.ito@gmail.com wrote:

 Hi,

 I'm running GEE using geepack.

 I set corstr = "ar1" as below:

  m.ar <- geeglm(L ~ O + A,
  + data = firstgrouptxt, id = id,
  + family = binomial, corstr = "ar1")


  summary(m.ar)

 Call:
 geeglm(formula = L ~ O + A, family = binomial,
     data = firstgrouptxt, id = id, corstr = "ar1")

  Coefficients:
             Estimate  Std.err    Wald Pr(>|W|)
 (Intercept) -2.62516  0.21154 154.001   <2e-16 ***
 ontask       0.00498  0.12143   0.002   0.9673
 attachmentB  0.73216  0.35381   4.282   0.0385 *
 attachmentC  0.25960  0.33579   0.598   0.4395
 ---
 Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

 Estimated Scale Parameters:
             Estimate Std.err
 (Intercept)    1.277  0.3538

 Correlation: Structure = ar1  Link = identity

 Estimated Correlation Parameters:
       Estimate  Std.err
 alpha    0.978 0.005725
 Number of clusters:   49   Maximum cluster size: 533


 Then, it shows that :
 Correlation: Link = identity

 Why is it not Link = logit?


 Thank you,
 Sachi





[R] reduce size of pdf

2010-04-29 Thread Nevil Amos


is there a way to reduce the size of pdf files  in R: ?
compression?

lower dpi ?

or some other option?


Re: [R] variable importance in Random Forest

2010-04-29 Thread Changbin Du

HI, Andy,

Thanks so much for your reply!

In the paper "Classification and Regression by randomForest", on the first
page, it says that random forest estimates the importance of a variable
by looking at how much the prediction error increases when the variable is
permuted...

In the help document of randomForest, the variable importance is measured
as the total decrease in node impurities. Should it be the total *increase*
in node impurities?

If it is the total decrease in node impurities, does that contradict the paper?

Also, in fit$importance, what is the meaning of the first two columns?

 fit$importance
                         0           1 MeanDecreaseAccuracy MeanDecreaseGini
CT            0.0022352025 0.003829344         0.0030311246         5.184427
DP            0.0069461974 0.016387520         0.0116650960        15.440624
DY            0.0141150255 0.026031690         0.0200603555        19.901538
FC            0.0024279188 0.005158945         0.0037948155         5.527078
NE            0.0352705133 0.070503233         0.0527718526        46.278504
NW            0.0256059127 0.034433862         0.0299981496        26.440402
QT            0.0037228694 0.008181262         0.0059571350         9.308828
SK            0.0048187014 0.008895719         0.0068609174        10.662129
TA            0.0042134249 0.011746533         0.0079851331        12.878367
WC            0.0177155268 0.014981440         0.0163366320        14.240232
WD            0.0232972311 0.034083695         0.0286702065        25.335182
WG            0.0328547215 0.053142508         0.0429480441        30.663749
WW            0.0093983693 0.006377956         0.0078681474         7.250101
YG            0.0051691399 0.007338639         0.0062618144        11.084111
num_cell      0.0061355526 0.005373049         0.0057463613         5.060577
num_genes     0.0364878788 0.044544488         0.0404558096        32.745034
position      0.0025375614 0.011566496         0.0070255302        10.070505
freq_hypo     0.0008723241 0.001757602         0.0013181209         1.930695
freq_intra    0.0009449492 0.001943090         0.0014431451         2.611950
log_hypo      0.0004514713 0.001366561         0.0009096419         1.736749
acid_per      0.0125815445 0.023360179         0.0179634375        21.131681
base_per      0.0070077737 0.012196570         0.0096129124        13.675893
charge_per    0.0095668425 0.024125997         0.0168345956        20.969665
hydrophob_per 0.0185736697 0.031941513         0.0252200036        25.994903
polar_per     0.0169369327 0.023633413         0.0202776247        20.890415
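
On the two measures: as far as the randomForest help page describes them, the accuracy-based columns come from permuting a variable and seeing how much the prediction error gets worse (for a classification forest the first columns are the per-class versions, here classes 0 and 1), while MeanDecreaseGini adds up how much each split on that variable reduces node impurity. A good split lowers impurity, so a large total decrease means an important variable, and the two descriptions do not contradict each other. To make the permutation idea concrete, here is a minimal sketch that deliberately uses a plain linear model on simulated data instead of a random forest; the data, variable names and helper function are all made up for illustration.

```r
set.seed(1)
n <- 500
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), noise = rnorm(n))
d$y <- 3 * d$x1 + 0.5 * d$x2 + rnorm(n)   # x1 matters most, noise not at all

train <- d[1:250, ]
test  <- d[251:500, ]
fit <- lm(y ~ x1 + x2 + noise, data = train)

# Mean squared prediction error on a data set
mse <- function(data) mean((data$y - predict(fit, newdata = data))^2)
base_mse <- mse(test)

# Permutation importance: shuffle one variable, measure the increase in error
perm_importance <- sapply(c("x1", "x2", "noise"), function(v) {
  shuffled <- test
  shuffled[[v]] <- sample(shuffled[[v]])
  mse(shuffled) - base_mse
})
print(round(perm_importance, 3))
```

randomForest does the analogous thing per tree on the out-of-bag cases; this sketch uses a single held-out set, which is a simplification.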




On Thu, Apr 29, 2010 at 5:22 AM, Liaw, Andy andy_l...@merck.com wrote:

  Please see the Detail section of the help page for the importance()
 function in the randomForest package, and let me know which part of it you
 do not understand.

 For boosting, you need to read its documentation and decide for yourself if
 its importance measure is at all comparable to the two in RF.

 Andy

  --
 *From:* Changbin Du [mailto:changb...@gmail.com]
 *Sent:* Wednesday, April 28, 2010 8:58 PM
 *To:* Liaw, Andy
 *Cc:* r-help@r-project.org
 *Subject:* variable importance in Random Forest

 HI, Dear Andy,

 I run the RandomFOrest in R, and get the following resutls in variable
 importance:

 What is the meaning of MeanDecreaseAccuracy  and MeanDecreaseGini?

 I found they are raw values, they are not scaled to 1, right?

 Which column if most similar to the variable rel.influence in Boosting?

 Thanks so much!



  fit$importance
                          0           1 MeanDecreaseAccuracy MeanDecreaseGini
 CT            0.0022352025 0.003829344         0.0030311246         5.184427
 DP            0.0069461974 0.016387520         0.0116650960        15.440624
 DY            0.0141150255 0.026031690         0.0200603555        19.901538
 FC            0.0024279188 0.005158945         0.0037948155         5.527078
 NE            0.0352705133 0.070503233         0.0527718526        46.278504
 NW            0.0256059127 0.034433862         0.0299981496        26.440402
 QT            0.0037228694 0.008181262         0.0059571350         9.308828
 SK            0.0048187014 0.008895719         0.0068609174        10.662129
 TA            0.0042134249 0.011746533         0.0079851331        12.878367
 WC            0.0177155268 0.014981440         0.0163366320        14.240232
 WD            0.0232972311 0.034083695         0.0286702065        25.335182
 WG            0.0328547215 0.053142508         0.0429480441        30.663749
 WW            0.0093983693 0.006377956         0.0078681474         7.250101
 YG            0.0051691399 0.007338639         0.0062618144        11.084111
 num_cell      0.0061355526 0.005373049         0.0057463613         5.060577
 num_genes     0.0364878788 0.044544488         0.0404558096        32.745034
 position      0.0025375614 0.011566496         0.0070255302        10.070505
 freq_hypo     0.0008723241 0.001757602         0.0013181209         1.930695
 freq_intra    0.0009449492 0.001943090         0.0014431451

Re: [R] Using plyr::dply more (memory) efficiently?

2010-04-29 Thread Matthew Dowle


Steve Lianoglou mailinglist.honey...@gmail.com wrote in message 
news:t2ybbdc7ed01004290812n433515b5vb15b49c170f5a...@mail.gmail.com...

 Thanks for directing me to the data.table package. I read through some
 of the vignettes, and it looks quite nice.

 While your sample code would provide an answer if I wanted to just
 compute some summary statistic/function of groups of my data.frame
 (using `by=symbol`), what's the best way to produce several pieces of
 info per subset?

 For instance, I see that I can do something like this:

 summaries[, list(counts=sum(counts), width=sum(exon.width)), by=symbol]

Yes, that's it.

 But what if I need to do some more complex processing within the
 subsets defined in `by=symbol` -- like several lines of programming
 logic for 1 result, say.

 I guess I can open a new block that just returns a data.table? Like:

 summaries[, {
  cnts <- sum(counts)
  ew <- sum(exon.width)
  # ... some complex things
  complex <- # .. result of complex things
  data.table(counts=cnts, width=ew, cplx=complex)
}, by=symbol]

 Is that right? (I mean, it looks like it's working, but maybe there's
 a more idiomatic way(?))

Yes, you got it.  Rather than a data.table at the end though, just return a 
list; it's faster.
Shorter vectors will still be recycled to match any longer ones.

Or just this :

summaries[, list(
counts = sum(counts),
width = sum(exon.width),
cplx = # .. result of complex things
), by=symbol]


Sounds like it's working, but could you give us an idea whether it is quick 
and memory efficient?
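
The block-in-j pattern discussed above can be sketched as follows. This assumes the data.table package is installed; the `summaries` table and its values are made up for illustration, with column names taken from the thread, and `cplx` is only a placeholder for the "complex" result.

```r
library(data.table)

# Made-up stand-in for the `summaries` table from the thread
summaries <- data.table(
  symbol     = c("A", "A", "B", "B", "B"),
  counts     = c(10, 5, 3, 2, 1),
  exon.width = c(100, 200, 50, 50, 150)
)

# Several statements per group: wrap them in { } and return a list
res <- summaries[, {
  cnts <- sum(counts)
  ew   <- sum(exon.width)
  cplx <- cnts / ew            # placeholder for the "complex" result
  list(counts = cnts, width = ew, cplx = cplx)
}, by = symbol]

print(res)
```

Returning a plain list from the block, as suggested, avoids building a data.table object for every group.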


Re: [R] Can't load doSMP from REvolutionR in regular R2.11.0

2010-04-29 Thread David M Smith

We haven't tested doSMP with the mingw compiler (hence why we haven't
yet submitted it to CRAN). We compiled it under R 2.10 using the same
Intel compilers we use for REvolution R. It is open source (GPL) so
you're welcome to try compiling it under mingw yourself, but we can't
offer support for that configuration.

# David Smith

On Wed, Apr 28, 2010 at 5:10 PM, Tao Shi shi...@hotmail.com wrote:
 I was testing out the doSMP package from REvolutionR in my regular R2.11.0
 installation and I got the following error message.  Well, one obvious thing
 is that R2.11.0 was built using i386-pc-mingw32, which is different from
 what revoIPC used.  I could just use REvolutionR, but all my R peripherals
 were set up to work with the regular R2.11.0.  So, I really want to make
 this work.  Any ideas?

--
David M Smith da...@revolution-computing.com
VP of Marketing, REvolution Computing  http://blog.revolution-computing.com
Tel: +1 (650) 330-0553 x205 (Palo Alto, CA, USA)

Download REvolution R free:
www.revolution-computing.com/downloads/revolution-r.php


Re: [R] Using plyr::dply more (memory) efficiently?

2010-04-29 Thread Steve Lianoglou

Hi Matthew,

 Sounds like it's working, but could you give us an idea whether it is quick
 and memory efficient?

I actually can't believe what I'm seeing, I just recoded the function
to use data.table.

What has taken something on the order of ~ 20-30mins with an
lapply/do.call(rbind, ...) combo (actually I was using sqldf to do
quicker subselects) just finished in  1 min.

The memory being used in my R workspace now is still under 2GB, where
previously it was ~ 8GB when do.call(rbind, ...)-ing my list into a
data.frame, and +20GB with ddply.

I'm going to double check that I have the same results, but for now
I'm completely blown away.

data.table is awesome, thanks for this package.

-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


Re: [R] merged files



On Apr 29, 2010, at 10:21 AM, Alex Jameson wrote:


Hi,

i have two files (file1.txt and file2.txt) which i would like to  
merge,

based on certain criteria, i.e.
it combines data based on matching geneID and exons.
i have used the merge option,


Huh? What is the merge option? (There is a merge _function_.)


but it


It?  Please provide the code you used. Have you yet read the Posting  
Guide as I urged you earlier?



does not give me the desired outcome.
merged.txt shows the result i would like.



Given that those two files have no GeneID and Exons in common (after I  
took your mangled HTML posting and fixed each one to create readable  
files), I would expect that this call, which would implement the merge  
you requested above, would produce 0 rows:


merge(dtd, File2, by=c("GeneID", "Exons"))  # which would be an inner join


Many (most?) of the numbers in the third desired file that we are  
seeing in mangled form do not appear in either of those two input  
files, so you appear to be requesting that we hack into your system to  
get them. Now what was it that you really wanted? (And no more HTML  
postings ... and use the dput function. That would be an equivalent to  
the dump method in the Posting Guide which (again) I urge you to read.)
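
A small sketch of what such a merge does, under the assumption that both tables share GeneID and Exons columns. The data frames `affy` and `agilent` and their probe names are hypothetical miniatures, not the poster's files.

```r
# Hypothetical miniature versions of the two probe tables
affy <- data.frame(
  GeneID    = c(780, 780, 5982),
  Exons     = c(21, 21, 10),
  AffyProbe = c("p1", "p2", "p3"),
  stringsAsFactors = FALSE
)
agilent <- data.frame(
  GeneID       = c(780, 9899),
  Exons        = c(21, 14),
  AgilentProbe = c("a1", "a2"),
  stringsAsFactors = FALSE
)

# Inner join: only GeneID/Exons pairs present in both tables survive
inner <- merge(affy, agilent, by = c("GeneID", "Exons"))

# Full outer join: keeps unmatched rows from both sides, filled with NA
outer <- merge(affy, agilent, by = c("GeneID", "Exons"), all = TRUE)

# dput() prints a text form of an object that can be pasted into a post
dput(inner)
```

If the two inputs have no key pairs in common, the inner join has 0 rows, while all = TRUE still returns every row from both sides.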


--
David




*File1.txt*

  AffyProbe          ProbeType Flag GeneSymbol GeneID Exons Chrom Strand AffyStart AffyEnd
1 1007_s_at:1105:483 0         0    DDR1       780    21    6     +      30975403  30975427
2 1007_s_at:1119:177 0         0    DDR1       780    21    6     +      30975549  30975573
3 1007_s_at:1136:469 0         0    DDR1       780    21    6     +      30975766  30975790
4 1007_s_at:192:205  0         0    DDR1       780    21    6     +      30975523  30975547
5 1007_s_at:474:1161 0         0    DDR1       780    21    6     +      30975745  30975769
6 1007_s_at:504:983  0         0    DDR1       780    21    6     +      30975575  30975599
7 1007_s_at:50:779   0         0    DDR1       780    21    6     +      30975758  30975782

*File2.txt*

   AgilentProbe ProbeType Flag GeneSymbol GeneID Exons Chrom Strand AgilentStart AgilentEnd
1  A_23_P11     0         0    FAM174B    400451 5     15    -      90961852     90961793
2  A_23_P100022 0         0    SV2B       9899   14    15    +      89639333     89639392
3  A_23_P100056 0         0    RBPMS2     348093 8     15    -      62819428     62819369
4  A_23_P100074 0         0    AVEN       57099  6     15    -      31946031     31945972
5  A_23_P100092 0         0    ZSCAN29    146050 5     15    -      41440680     41440621
6  A_23_P100103 0         0    VPS39      23339  24    15    -      40240319     40240260
7  A_23_P100111 0         0    CHP        11261  7     15    +      39358845     39358904
8  A_23_P100127 0         0    CASC5      57082  11    15    +      38704817     38704876
9  A_23_P100133 0         0    ATMIN      23300  4     16    +      79636596     79636655
10 A_23_P100141 0         0    UNKL       64718  12    16    -      1355346      1355287



*merged.txt (Should look like this)*

  GeneSymbol GeneID Exons Chrome AffyMatrixProbeID AffyStart  
AffyEnd
AgilentProbeID AgilentStart AgilentEnd DDR1 780 21 6
A_24_P123601

30975848 30975907 RFC2 5982 10 7 1053_at:120:925,
1053_at:504:41,
1053_at:522:871,
1053_at:828:1025,
203696_s_at:291:651 73287845,
73287869,
73287863,
73287881,
73287850 73287821,
73287845,
73287839,
73287857,
73287826 A_23_P93823 73287861 73287802 RFC2 5982 11 7  
HSPA6 3310
1 1   A_23_P114903 159762782 159762841 PAX8 7849 12 2
A_23_P210001

113691555 113691496 GUCA1A 2978 6 6 UBA7 7318 24 3
1294_at:1079:379,
1294_at:361:881,
203281_s_at:524:889,
203281_s_at:678:1017,
203281_s_at:68:1153 49818386,
49818398,
49818378,
49818434,
49818422 49818362,
49818374,
49818354,
49818420,
49818398


sorry for the long tables,

thanks

Alex

Student
University of Colorado



David Winsemius, MD
West Hartford, CT


Re: [R] reduce size of pdf

2010-04-29 Thread Greg Snow

It would help if we knew how big your pdf is and why it is big.  Can you show 
an example or at least describe the process used to generate the file and what 
your goals are in creating/displaying the file?

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of Nevil Amos
 Sent: Thursday, April 29, 2010 9:38 AM
 To: r-help@r-project.org
 Subject: [R] reduce size of pdf
 
 is there a way to reduce the size of pdf files  in R: ?
 compression?
 
 lower dpi ?
 
 or some other option?
 

[R] How to extract data table

2010-04-29 Thread ericyujin99


I'm a very new user of R.
My problem is this: I have a data table (3 columns and 100 rows)
assigned to a variable x.
How can I write the table to an external file (Excel or another format)
without losing any information, so that the data look nicer?
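
One common route is a CSV file, which Excel and most other programs can open. A minimal sketch, assuming `x` is a data frame; the table built here is a made-up stand-in and the file is written to a temporary directory.

```r
# Hypothetical 100-row, 3-column table standing in for `x`
x <- data.frame(id    = 1:100,
                group = rep(c("a", "b"), 50),
                value = seq(0.5, 50, by = 0.5),
                stringsAsFactors = FALSE)

# Write to a CSV file (here in a temporary directory)
out_file <- file.path(tempdir(), "x.csv")
write.csv(x, out_file, row.names = FALSE)

# Read it back to check that nothing was lost
y <- read.csv(out_file, stringsAsFactors = FALSE)
all.equal(x, y)
```

write.table() offers the same thing with control over the separator, for example sep = "\t" for tab-delimited output.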


Re: [R] Split a vector by NA's - is there a better solution then a loop ?

2010-04-29 Thread Thomas Stewart

Or, you can modify Romain's function to account for sequential NAs.

x <- c(1,2,NA,1,1,2,NA,NA,4,5,2,3)
foo <- function( x ){
   idx <- 1 + cumsum( is.na( x ) )
   not.na <- ! is.na( x )

   f <- factor(idx[not.na],levels=1:max(idx))

   split( x[not.na], f )
 }
foo( x )

$`1`
[1] 1 2

$`2`
[1] 1 1 2

$`3`
numeric(0)

$`4`
[1] 4 5 2 3
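
To see why the factor levels matter, the two versions can be compared on the same input. A small sketch; the names split_plain and split_keep_empty are made up here for the earlier function and the modified one.

```r
x <- c(1, 2, NA, 1, 1, 2, NA, NA, 4, 5, 2, 3)

# Earlier version: groups are labelled by the running count of NAs
split_plain <- function(x) {
  idx <- 1 + cumsum(is.na(x))
  not.na <- !is.na(x)
  split(x[not.na], idx[not.na])
}

# Modified version: fixing the factor levels keeps empty groups
split_keep_empty <- function(x) {
  idx <- 1 + cumsum(is.na(x))
  not.na <- !is.na(x)
  f <- factor(idx[not.na], levels = seq_len(max(idx)))
  split(x[not.na], f)
}

names(split_plain(x))        # no group "3": nothing lies between the two NAs
names(split_keep_empty(x))   # group "3" is kept as an empty vector
```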

-tgs

On Thu, Apr 29, 2010 at 4:00 AM, Tal Galili tal.gal...@gmail.com wrote:

 Definitely Smarter,
 Thanks!

 Tal

 Contact
 Details:---
 Contact me: tal.gal...@gmail.com |  972-52-7275845
 Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
 www.r-statistics.com (English)

 --




 On Thu, Apr 29, 2010 at 10:56 AM, Romain Francois 
 romain.franc...@dbmail.com wrote:

  Maybe this :
 
  > foo <- function( x ){
  +   idx <- 1 + cumsum( is.na( x ) )
  +   not.na <- ! is.na( x )
  +   split( x[not.na], idx[not.na] )
  + }
  > foo( x )
 
  $`1`
  [1] 2 1 2
 
  $`2`
  [1] 1 1 2
 
  $`3`
  [1] 4 5 2 3
 
  Romain
 
  Le 29/04/10 09:42, Tal Galili a écrit :
 
 
  Hi all,
 
  I would like to have a function like this:
  split.vec.by.NA <- function(x)
 
  That takes a vector like this:
  x <- c(2,1,2,NA,1,1,2,NA,4,5,2,3)
 
  And returns a list of length of 3, each element of the list is the
  relevant
  segmented vector, like this:
 
  $`1`
  [1] 2 1 2
  $`2`
  [1] 1 1 2
  $`3`
  [1] 4 5 2 3
 
 
  I found how to do it with a loop, but wondered if there is some smarter
  (vectorized) way of doing it.
 
 
 
  Here is the code I used:
 
  x <- c(2,1,2,NA,1,1,2,NA,4,5,2,3)
 
  split.vec.by.NA <- function(x)
  {
    # assumes NAs are separating groups of numbers
    # TODO: add code to check for it
 
    number.of.groups <- sum(is.na(x)) + 1
    # all the positions with NAs, plus one position past the end of the vector
    groups.end.point.locations <- c(which(is.na(x)), length(x)+1)
    group.start <- 1
    group.end <- NA
    # we will replace every position in a group with the group ID,
    # except for the NAs, which will later be replaced by 0
    new.groups.split.id <- x
    for(i in seq_len(number.of.groups))
    {
      group.end <- groups.end.point.locations[i]-1
      new.groups.split.id[group.start:group.end] <- i
      # move the group start up for the next iteration
      # (at the final iteration it won't matter)
      group.start <- groups.end.point.locations[i]+1
    }
    new.groups.split.id[is.na(x)] <- 0
    return(split(x, new.groups.split.id)[-1])
  }
 
  split.vec.by.NA(x)
 
 
 
 
  Thanks,
  Tal
 
 
  --
  Romain Francois
  Professional R Enthusiast
  +33(0) 6 28 91 30 30
  http://romainfrancois.blog.free.fr
  |- http://bit.ly/9aKDM9 : embed images in Rd documents
  |- http://tr.im/OIXN : raster images and RImageJ
  |- http://tr.im/OcQe : Rcpp 0.7.7
 
 
 


Re: [R] operator problem within function

2010-04-29 Thread Bunny, lautloscrew.com

Nice, thanks. Which manual do you use? An Introduction to R? Or something 
special?

matt


On 29.04.2010, at 15:25, David Winsemius wrote:

 
 On Apr 29, 2010, at 9:03 AM, Bunny, lautloscrew.com wrote:
 
 Sorry for that off-list post; I did not mean to do it. I just hit 
 the wrong button. Unfortunately this disadvantage is not written next to $ 
 in the manual.
 
 Hmmm. Not my manual:
 
 Both [[ and $ select a single element of the list. The main difference is 
 that $ does not allow computed indices, whereas [[ does.
 
 
 It also says that the correct equivalent using extraction operators of $ 
 would be:
 
 x$name  ==  x[["name", exact = FALSE]]
 -- 
 David.
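
The difference quoted from the help page can be seen on a toy list. A minimal sketch; the list `x` and the variable `name` are made up for illustration.

```r
# Small hypothetical list with one element called "name"
x <- list(name = 1:3)
name <- "other"              # a variable that happens to be called `name`

x$name                       # literal lookup of the element "name"
x[[name]]                    # index is evaluated: looks up "other", so NULL
x[["name", exact = FALSE]]   # the documented equivalent of x$name

# Partial matching is the other difference
x$na                         # partial match on "name"
x[["na"]]                    # exact matching by default, so NULL
```

Inside a function, where the column name arrives as a character argument, only the [[ (or [) form picks up the value of that argument.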
 
 
 
 On Apr 29, 2010, at 2:34 AM, Bunny, lautloscrew.com wrote:
 
 David,
 
 With your help I finally got it. Thanks!
 Sorry for handing out some ugly names.
 The reason: it's a German dataset with German variable names. With those 
 German names you can always be sure you don't use a forbidden
 name. I just did not want to hit one of those by accident when changing 
 these names for the mailing list. columna is just the Latin term for 
 column :) . Anyway, here's what worked.
 
 Note: I just tried to use some more realistic names here.

 recode_items = function(dataframe,question_number,medium=3){

# note: column names of the initial data.frame are like Question1,
# Question2 etc. Using [,1] would not be very practical since
# the df contains some other data too. Indexing by names seemed to be
# the most comfortable way so far.
question <- paste("Question",question_number,sep="")
# needed indexing here that understands characters; that's why
# going with [,question_number] did not work.
dataframe[question][dataframe[question]==3]=0
 
 This would be more typical:
 
 dataframe[dataframe[question]==3, question] <- 0
 
return(dataframe)

}
 
 recode_items(mydataframe,question_number,3)
 # this call uses the dataframe that contains the answers of survey 
 participants. Question number is an argument that selects the question 
 from the dataframe that should be recoded. In surveys some weighting 
 schemes only respect extreme answers, which is why the medium answer is 
 recoded to zero. Since it depends on the item scale what medium actually 
 is, I need it to be an argument of my function.
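
A self-contained sketch of the recoding described above, using the `medium` argument in the comparison rather than a hard-coded 3. The `survey` data frame is made up, and which() is used so that missing answers are left untouched; both choices are assumptions for illustration, not the poster's final code.

```r
# Recode the "medium" answer of one question column to 0
recode_items <- function(dataframe, question_number, medium = 3) {
  question  <- paste("Question", question_number, sep = "")
  is_medium <- dataframe[[question]] == medium
  dataframe[which(is_medium), question] <- 0
  dataframe
}

# Made-up survey data
survey <- data.frame(id        = 1:4,
                     Question1 = c(1, 3, 5, 3),
                     Question2 = c(3, 3, 2, 4))

recode_items(survey, 1)$Question1
recode_items(survey, 2, medium = 2)$Question2
```

The function returns a modified copy, so the result has to be assigned, e.g. survey <- recode_items(survey, 1).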
 
 Did you want a further logical test with that =1 or some sort of 
 assignment???
 
 So yes, it's an assignment.
 
 Moral: Generally better to use [ indexing.
 
 That's what really made my day (and it's only 9:30 a.m. here). Are there 
 exceptions to the rule?
 
 Not that I know of.
 
 I just worked a lot with the $ in the past.
 
 $colname is just syntactic sugar for either ["colname"] or 
 [ ,"colname"] and it has the disadvantage that colname is not evaluated.
 
 
 
 thx
 
 matt
 

 
 
 On 29.04.2010, at 00:56, David Winsemius wrote:
 
 
 On Apr 28, 2010, at 5:45 PM, David Winsemius wrote:
 
 
 On Apr 28, 2010, at 5:31 PM, Bunny, lautloscrew.com wrote:
 
 Dear all,
 
 I have a problem with processing data frames within a function using the 
 $ operator.
 Here's my code:
 
 
 recode_items = function(dataframe,number,medium=2){
 
 # this works
 q <- paste("columna",number,sep="")
 
 Do you really want q to equal "columna2" when number equals 2?
 
 
 # this does not work, particularly because dataframe is not 
 processed
 # dataframe should be: givenframe$columnagivennumber
 a=dataframe$q[dataframe$q==medium]=1
 
 Did you want a further logical test with that =1 or some sort of 
 assignment???
 
 
 a) Do you want to work on the column from dataframe (a horrible name for 
 this purpose IMO) with the name "columna2"? If so, then start with
 
 dataframe[ , q ]
 
 The q will be evaluated in this form, whereas it would not be when 
 used with $.
 
 b) (A guess in the absence of an explanation of the goal.) Now do you want 
 all of the rows where that vector equals medium? If so, then try this:
 
 dataframe[ dataframe[ , q ]==2 , ]  # untested in the absence of data
 
 Oops, that should have been:
 
 dataframe[ dataframe[ , q ]==medium , ]  # both q and medium are evaluated
 
 
 
 Moral: Generally better to use [ indexing.
 
 -- 
 David.
 
 
 
 
 return(a)   
 
 }
 
 
 If I call this function, I'd like it to return my data frame. The 
 problem appears to be somewhere around the $. I'm sure this is not too 
 hard, but somehow I am stuck. I'll keep searching the manuals.
 Thanks in advance for any help.
 
 best
 
 matt

[R] how to parse out fitting statistics and write them into a data frame?

2010-04-29 Thread weix1


Hello, everyone:
I am conducting t tests between drug and control for about 50,000 genes using
the following syntax (treatment is a factor):

result <- lapply(split(data, data$gene), function(x) lm(value~treatment, x))

However, the result is a list, and I do not know whether more model-fitting
statistics (like the p value of the t test) are included in result or not. If I
print the first element of result I get the following:

> result[1]
$`1007_s_at`

Call:
lm(formula = logvalue ~ treatment, data = x)

Coefficients:
 (Intercept)  treatmentveh
      8.9403        0.3232

> summary(result[1])
          Length Class Mode
1007_s_at 13     lm    list

So my question is whether more fitting statistics (other than the coefficient
estimates, e.g. p values) are included in the result. If yes, how can I
parse them into a data frame so that I can output those statistics into a
.csv file that can be shared with my clients? If not, how can I modify the
code so that more statistics are computed and stored?
Any constructive suggestions are welcome.
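
One possible approach, sketched on simulated data: an lm object does not store p values itself, but summary() computes a coefficient table from which they can be pulled. The gene names, sample sizes and values below are made up; the column names value, treatment and gene follow the post.

```r
# Simulated long-format data: 3 hypothetical genes, 2 treatments, 4 replicates
set.seed(1)
data <- expand.grid(rep = 1:4,
                    treatment = c("drug", "veh"),
                    gene = c("g1", "g2", "g3"))
data$treatment <- factor(data$treatment)
data$value <- rnorm(nrow(data), mean = 8)

fits <- lapply(split(data, data$gene),
               function(d) lm(value ~ treatment, data = d))

# Pull the treatment row out of each coefficient table
stats <- do.call(rbind, lapply(names(fits), function(g) {
  co <- summary(fits[[g]])$coefficients
  data.frame(gene     = g,
             estimate = co[2, "Estimate"],
             t.value  = co[2, "t value"],
             p.value  = co[2, "Pr(>|t|)"])
}))

# Write the collected statistics to a CSV file (here in a temp directory)
write.csv(stats, file.path(tempdir(), "lm_stats.csv"), row.names = FALSE)
stats
```

With a two-level treatment factor, the p value in the second row of the coefficient table is the same as that of a two-sample t test with equal variances assumed.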


Re: [R] operator problem within function