Re: [R] by funtion
Hi,

You could try do.call('rbind', aa) and then turn the resulting matrix into a data frame.

Regards,
Tengfei

On Wed, Apr 28, 2010 at 10:56 PM, Yuan Jian jayuan2...@yahoo.com wrote:
> Hello,
>
> I have a data.frame:
>
>     name  col1  col2   col3  col4
>     AA      23    54  0.999  0.78
>     BB     123     5  1      0.99
>     AA     203    98  0.79   0.99
>
> I want to get a data.frame of mean values by name:
>
>     name    col1   col2    col3    col4
>     AA    113.00  76.00  0.8945  0.8850
>     BB    123.00   5.00  1.00    0.99
>
> I tried to use the by function:
>
>     aa <- by(test[,2:5], feature, mean)
>
> I found that aa is an object of class "by":
>
>     class(aa)
>     [1] "by"
>
> How can I convert aa to a data frame?
>
> Thanks,
> YU

--
Tengfei Yin
MCDB PhD student
1620 Howe Hall, 2274, Iowa State University
Ames, IA, 50011-2274
Homepage: www.tengfei.name

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[R] Odp: by funtion
Hi

r-help-boun...@r-project.org wrote on 29.04.2010 05:56:23:

> I tried to use the by function:
>
>     aa <- by(test[,2:5], feature, mean)
>
> I found that aa is an object of class "by". How can I convert aa to a
> data frame?

Use aggregate instead:

    aa <- aggregate(test[,2:5], list(feature), mean)

Regards
Petr
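A minimal sketch of the aggregate approach on the data from the original question. The data frame name `test` follows the thread; the grouping column is passed as a named list because aggregate expects its `by` argument to be a list, and the name `name` for the group column is a choice made for this example.

```r
# Data from the original question
test <- data.frame(
  name = c("AA", "BB", "AA"),
  col1 = c(23, 123, 203),
  col2 = c(54, 5, 98),
  col3 = c(0.999, 1, 0.79),
  col4 = c(0.78, 0.99, 0.99)
)

# Mean of every numeric column within each level of name.
# The result is already a data frame with one row per group.
aa <- aggregate(test[, 2:5], by = list(name = test$name), FUN = mean)
print(aa)
```

Unlike by, aggregate returns a data frame directly, so no conversion step is needed afterwards.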
Re: [R] non linear estimation
Hi

I put the query "nonlinear programming" into the R site search and got many hits; maybe you can find something there that suits your needs. You could also look at the CRAN task view "Optimization and Mathematical Programming".

Regards
Petr

r-help-boun...@r-project.org wrote on 29.04.2010 03:38:27:

> Any suggestions? Actually I just want to know if there is a package for
> nonlinear estimation with restrictions. Thanks. I am new to R.
> --
> View this message in context:
> http://r.789695.n4.nabble.com/non-linear-estimation-tp2072136p2074911.html
Re: [R] by funtion
Hi

r-help-boun...@r-project.org wrote on 29.04.2010 08:11:41:

> Hi
> you could try do.call('rbind',aa)

No, no, no. rbind and cbind bind vectors as rows or columns of a ***matrix***; the result is not a data frame.

    do.call(rbind, aa)
        X069rutil X102anatas
    105      26.9        7.9
    200      22.8       10.6
    400      30.6       13.3
    600      50.8       20.6
    800      78.7         NA

    exp.df <- do.call(rbind, aa)
    str(exp.df)
     num [1:5, 1:2] 26.9 22.8 30.6 50.8 78.7 7.9 10.6 13.3 20.6 NA
     - attr(*, "dimnames")=List of 2
      ..$ : chr [1:5] "105" "200" "400" "600" ...
      ..$ : chr [1:2] "X069rutil" "X102anatas"

If an object has a rectangular shape and column names, that does not automatically mean it is a data frame.

Regards
Petr

> then turn the matrix into data frame
>
> regards
> Tengfei
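The distinction drawn above can be checked directly. A small sketch with made-up group means (the list `parts` is hypothetical, standing in for the by result):

```r
# Hypothetical per-group results, standing in for the 'by' output
parts <- list(AA = c(c1 = 113, c2 = 76),
              BB = c(c1 = 123, c2 = 5))

m <- do.call(rbind, parts)
is.matrix(m)        # TRUE
is.data.frame(m)    # FALSE: rectangular with dimnames, but still a matrix

# An explicit conversion is needed to get a data frame
d <- as.data.frame(m)
is.data.frame(d)    # TRUE
```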
Re: [R] by funtion
Hi,

Thanks. Actually, as I mentioned in my reply, you need to turn the matrix into a data frame at the end if you use this method, e.g.

    df=data.frame(name=c('AA','BB','AA'),c1=c(23,123,203),c2=c(54,5,98),c3=c(0.999,1,0.79),c4=c(0.78,0.99,0.99))
    aa=by(df[,2:5],df$name,mean)
    dd=do.call('rbind',aa)
    df=data.frame(dd)
    df
        c1 c2     c3    c4
    AA 113 76 0.8945 0.885
    BB 123  5 1.0000 0.990

Regards
Tengfei

On Thu, Apr 29, 2010 at 1:30 AM, Petr PIKAL petr.pi...@precheza.cz wrote:
> No, No, No. rbind and cbind binds vectors as rows or columns of
> ***matrix***, result is not a data frame
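In current versions of R, calling mean on a data frame no longer returns the column means, so the by call above may return NA with a warning. A sketch of the same idea with colMeans swapped in (a substitution made for this example, not part of the original post); the result name `res` is also an assumption:

```r
df <- data.frame(
  name = c("AA", "BB", "AA"),
  c1 = c(23, 123, 203),
  c2 = c(54, 5, 98),
  c3 = c(0.999, 1, 0.79),
  c4 = c(0.78, 0.99, 0.99)
)

# One numeric vector of column means per group
aa <- by(df[, 2:5], df$name, colMeans)

# rbind stacks the per-group vectors into a matrix ...
dd <- do.call(rbind, aa)

# ... which is then converted explicitly, keeping the group as a column
res <- data.frame(name = rownames(dd), dd, row.names = NULL)
print(res)
```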
Re: [R] non linear estimation
-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On behalf of JamesHuang
Sent: Thursday, 29 April 2010 3:38
To: r-help@r-project.org
Subject: Re: [R] non linear estimation

> Any suggestions? Actually I just want to know if there is a package for
> nonlinear estimation with restrictions. Thanks. I am new to R.

I do not know of any specific package for optimization with restrictions, but you can use optim with method="L-BFGS-B". This only lets you set bounds on single parameters, so for restrictions such as a+b<19 in Y=a+(b+c*x)*exp(-d*x) you could deduce your restrictions in terms of single parameters (for example, in your original mail you wrote a<10, a+b<19, and b<3, so the restriction a+b<19 is actually redundant), or else you could think of a re-parameterization that turns a+b (and every other multi-parameter restriction) into a single parameter.

Wait, is this homework?

Dr. Rubén Roa-Ureta
AZTI - Tecnalia / Marine Research Unit
Txatxarramendi Ugartea z/g
48395 Sukarrieta (Bizkaia)
SPAIN
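A rough sketch of the box-constrained fit described above. The data are simulated, and the "true" parameter values, the noise level, the starting point, and the lower bound of 0 on d are all assumptions made for illustration; only the model form and the use of optim with method = "L-BFGS-B" come from the message.

```r
set.seed(1)

# Model from the thread: Y = a + (b + c*x) * exp(-d*x), with p = (a, b, c, d)
model <- function(p, x) p[1] + (p[2] + p[3] * x) * exp(-p[4] * x)

# Simulated data (hypothetical parameter values)
x <- seq(0, 10, length.out = 50)
y <- model(c(5, 2, 1.5, 0.4), x) + rnorm(length(x), sd = 0.1)

# Residual sum of squares to be minimized
rss <- function(p) sum((y - model(p, x))^2)

# Box constraints on single parameters: a <= 10, b <= 3, d >= 0 (assumed)
fit <- optim(par = c(1, 1, 1, 1), fn = rss, method = "L-BFGS-B",
             lower = c(-Inf, -Inf, -Inf, 0),
             upper = c(10, 3, Inf, Inf))
print(fit$par)
```

A restriction that couples several parameters, such as one on a+b, cannot be expressed through lower and upper; that is where the suggested re-parameterization comes in.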
Re: [R] by funtion
Thanks Tengfei,

I have another question.

    df=data.frame(name=c('AA','BB','CC'),c1=c(23,123,5),c2=c(54,5,4),c3=c(0.999,1,23),c4=c(0.78,0.99,54))
    df
      name  c1 c2     c3    c4
    1   AA  23 54  0.999  0.78
    2   BB 123  5  1.000  0.99
    3   CC   5  4 23.000 54.00

    df1=data.frame(name=c('BB','AA','DD'),c5=c(98,87,54),c6=c(7,6,3))
    df1
      name c5 c6
    1   BB 98  7
    2   AA 87  6
    3   DD 54  3

Now I want to get the intersection of df and df1 in terms of name, that is:

    name  c1 c2    c3   c4 c5 c6
    AA    23 54 0.999 0.78 87  6
    BB   123  5 1.000 0.99 98  7

Could you give me some advice?

--- On Thu, 29/4/10, Tengfei Yin yinteng...@gmail.com wrote:
> Thanks, actually I mentioned in the reply, you need to turn the matrix
> into data frame in the end if use this method.
Re: [R] Exporting an rgl graph
I need to use the function saveTriangleAsASY in my package. Does it already exist in a package, or may I include it?

Christophe
--
View this message in context: http://r.789695.n4.nabble.com/Exporting-an-rgl-graph-tp1872712p2075086.html
Re: [R] by funtion
Hi

Sorry, I did not read your reply thoroughly enough. But in general matrices are quite often mistaken for data frames. Also, if you have a list with a mixture of numeric and non-numeric data, this approach gives non-numeric output, because a matrix can hold values of only one type. I would therefore generally prefer the

    dd=do.call('data.frame',aa)
    dd
             AA     BB
    c1 113.0000 123.00
    c2  76.0000   5.00
    c3   0.8945   1.00
    c4   0.8850   0.99
    t(dd)
        c1 c2     c3    c4
    AA 113 76 0.8945 0.885
    BB 123  5 1.0000 0.990

approach.

Regards
Petr

r-help-boun...@r-project.org wrote on 29.04.2010 08:44:10:

> Thanks, actually I mentioned in the reply, you need to turn the matrix
> into data frame in the end if use this method.
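The point about mixed types can be seen in a two-column toy case. The column names `id` and `value` are made up for this sketch:

```r
# Binding a character and a numeric column into a matrix coerces everything
# to character, because a matrix holds a single type
m <- cbind(id = c("AA", "BB"), value = c(113, 123))
is.character(m)      # TRUE: the numbers are now strings

# A data frame keeps one type per column
d <- data.frame(id = c("AA", "BB"), value = c(113, 123))
is.numeric(d$value)  # TRUE
```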
Re: [R] by funtion
Hi

Probably merge is what you want; see

    ?merge

Regards
Petr

r-help-boun...@r-project.org wrote on 29.04.2010 09:13:34:

> now I want to get interaction for df and df1 in terms of name.
> [...]
> could give advice?
[R] Split a vector by NA's - is there a better solution then a loop ?
Hi all,

I would like to have a function like this:

    split.vec.by.NA <- function(x)

that takes a vector like this:

    x <- c(2,1,2,NA,1,1,2,NA,4,5,2,3)

and returns a list of length 3, where each element of the list is the relevant segment of the vector, like this:

    $`1`
    [1] 2 1 2

    $`2`
    [1] 1 1 2

    $`3`
    [1] 4 5 2 3

I found how to do it with a loop, but wondered if there is some smarter (vectorized) way of doing it.

Here is the code I used:

    x <- c(2,1,2,NA,1,1,2,NA,4,5,2,3)

    split.vec.by.NA <- function(x)
    {
        # assumes NAs are separating groups of numbers
        # TODO: add code to check for it
        number.of.groups <- sum(is.na(x)) + 1
        # all the places with NAs + a number after the end of the vector
        groups.end.point.locations <- c(which(is.na(x)), length(x)+1)
        group.start <- 1
        group.end <- NA
        # we will replace all the places of the group with the group ID,
        # except for the NAs, which will later be replaced by 0
        new.groups.split.id <- x
        for(i in seq_len(number.of.groups))
        {
            group.end <- groups.end.point.locations[i]-1
            new.groups.split.id[group.start:group.end] <- i
            # make the new group start higher for the next loop
            # (at the final loop it won't matter)
            group.start <- groups.end.point.locations[i]+1
        }
        new.groups.split.id[is.na(x)] <- 0
        return(split(x, new.groups.split.id)[-1])
    }

    split.vec.by.NA(x)

Thanks,
Tal

Contact Details:
Contact me: tal.gal...@gmail.com | 972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English)
[R] using get and paste in a loop to return objects for object names listed a strings
I am trying to create a heap of boxplots by looping through a series of factors and variables in a large data.frame, using paste to construct the factor and response names from the colnames. I thought I could do this using get(); however, it is not working. What am I doing wrong?

Thanks,
Nevil Amos

    sp.codes=levels(data.all$CODE_LETTERS)
    for(spp in sp.codes) {
      data.sp=subset(data.all,CODE_LETTERS==spp)
      responses = colnames(data.all)[c(20,28,29,19)]
      #if (spp=="BT") responses = colnames(data.all)[c(19,20,26:29)]
      groups=colnames(data.all)[c(9,10,13,16,30)]
      data.sp=subset(data.all,CODE_LETTERS==spp)
      for (response in responses){
        for (group in groups){
          r<-get(paste("data.sp$",response,sep=""))
          g<-get(paste("data.sp$",group,sep=""))
          print(r)
          print(g)
          boxplot(r ~ g)
    }}}

    Error in get(paste("data.sp$", response, sep = "")) :
      object 'data.sp$Hb' not found
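One possible way around the error, offered as a sketch rather than as an answer from the thread: get() looks up an object by its name, and "data.sp$Hb" is an expression, not the name of an object. Columns can instead be selected by name with `[[`. The data frame and its column names below are invented stand-ins for the poster's data.

```r
# Hypothetical stand-in for one species subset
data.sp <- data.frame(
  Hb   = c(10.1, 11.3, 9.8, 12.0, 10.7, 11.9),
  Mass = c(21, 25, 19, 27, 22, 26),
  Site = factor(c("A", "A", "B", "B", "C", "C")),
  Sex  = factor(c("F", "M", "F", "M", "F", "M"))
)
responses <- c("Hb", "Mass")
groups    <- c("Site", "Sex")

results <- list()
for (response in responses) {
  for (group in groups) {
    r <- data.sp[[response]]   # column selected by its name held in a string
    g <- data.sp[[group]]
    label <- paste(response, group, sep = " ~ ")
    # boxplot() draws the plot and invisibly returns its summary statistics
    results[[label]] <- boxplot(r ~ g, main = label)
  }
}
```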
Re: [R] Split a vector by NA's - is there a better solution then a loop ?
Maybe this:

    > foo <- function( x ){
    +     idx <- 1 + cumsum( is.na( x ) )
    +     not.na <- ! is.na( x )
    +     split( x[not.na], idx[not.na] )
    + }
    > foo( x )
    $`1`
    [1] 2 1 2

    $`2`
    [1] 1 1 2

    $`3`
    [1] 4 5 2 3

Romain

On 29/04/10 09:42, Tal Galili wrote:
> I found how to do it with a loop, but wondered if there is some smarter
> (vectorized) way of doing it.

--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://bit.ly/9aKDM9 : embed images in Rd documents
|- http://tr.im/OIXN : raster images and RImageJ
|- http://tr.im/OcQe : Rcpp 0.7.7
Re: [R] function which saves an image of a dgtMatrix as png
Thanks so much

Douglas Bates wrote:
> image applied to a sparseMatrix object uses lattice functions to create
> the image. As described in R FAQ 7.22 you must use
>
>     print(image(x))
>
> or
>
>     show(image(x))
>
> or even
>
>     plot(image(x))
>
> when a lattice function is called from within another function.
>
> On Wed, Apr 28, 2010 at 1:20 PM, Gildas Mazo gildas.m...@curie.fr wrote:
>> Hi,
>>
>> I'm going crazy:
>>
>> This works:
>>
>>     library(Matrix)
>>     a1<-b1<-c(1,2)
>>     c1<-rnorm(2)
>>     aDgt<-spMatrix(ncol=3,nrow=3,i=a1,j=b1,x=c1)
>>     png("myImage.png")
>>     image(aDgt)
>>     dev.off()
>>
>> But this doesn't!
>>
>>     f<-function(x){
>>     png("myImage.png")
>>     image(x)
>>     dev.off()
>>     }
>>     f(aDgt)
>>
>> My image is saved as a text file and contains nothing at all!
>>
>> Thanks in advance,
>> Gildas Mazo
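The advice quoted above, put together as a small sketch. The function name `save_image` and its `device` argument are made up for this example; the pdf device is used in the call only because it is available on every R build, while png remains the default as in the original question.

```r
library(Matrix)

# Writes the image of a sparse matrix to 'file'. The explicit print() is what
# actually draws the lattice object when image() is called inside a function.
save_image <- function(x, file, device = grDevices::png) {
  device(file)
  on.exit(grDevices::dev.off(), add = TRUE)
  print(image(x))
  invisible(file)
}

aDgt <- spMatrix(nrow = 3, ncol = 3, i = c(1, 2), j = c(1, 2), x = c(0.5, -1.2))
out <- save_image(aDgt, tempfile(fileext = ".pdf"), device = grDevices::pdf)
```

At top level the print step happens automatically, which is why the first version in the question worked and the version wrapped in f did not.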
Re: [R] by funtion
Hi Petr,

Thanks for your suggestions :)

@Yuan, Petr is right, you can try

    merge(df,df1,'name')

Regards
Tengfei

On Thu, Apr 29, 2010 at 2:20 AM, Petr PIKAL petr.pi...@precheza.cz wrote:
> Hi
> probably merge is what you want
> see
> ?merge
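The merge call suggested above, run on the two data frames from the question. The result name `both` is an assumption made for this sketch:

```r
df <- data.frame(
  name = c("AA", "BB", "CC"),
  c1 = c(23, 123, 5),
  c2 = c(54, 5, 4),
  c3 = c(0.999, 1, 23),
  c4 = c(0.78, 0.99, 54)
)
df1 <- data.frame(
  name = c("BB", "AA", "DD"),
  c5 = c(98, 87, 54),
  c6 = c(7, 6, 3)
)

# By default merge keeps only the rows whose key appears in both data frames
both <- merge(df, df1, by = "name")
print(both)
```

Rows for CC and DD are dropped because each appears in only one input; the arguments all, all.x, and all.y change that behaviour.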
[R] Compact Patricia Trees (Tries)
I have an application that searches a long list of character strings to determine which occur at the beginning of a given word. A straightforward R script using grep takes a long time to run. Rewriting it to use substr and match might be an option, but I have the impression that preparing the list as a trie and performing trie searches might lead to dramatic improvements in performance. I have searched the CRAN packages and found no packages that support Compact Patricia Trees. Does anybody know of one?

Thanks,
Richard

Richard R. Liu
richard@pueo-owl.ch
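For comparison, a sketch of the substr-and-match option mentioned above, in base R only. This is not a trie; the function name `prefixes_of` and the example strings are invented for illustration.

```r
# Returns the candidates that occur at the beginning of 'word'.
# All leading substrings of the word are generated once, and the candidate
# list is then matched against that small set.
prefixes_of <- function(word, candidates) {
  heads <- substring(word, 1, seq_len(nchar(word)))
  candidates[candidates %in% heads]
}

hits <- prefixes_of("international",
                    c("in", "inter", "intern", "nation", "int", "x"))
print(hits)   # "in" "inter" "intern" "int"
```

Whether this is fast enough for a long list would have to be measured; it avoids running a regular expression per candidate, but it still scans the whole candidate vector for each word.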
Re: [R] Split a vector by NA's - is there a better solution then a loop ?
Definitely smarter, thanks!

Tal

Contact me: tal.gal...@gmail.com | 972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English)

On Thu, Apr 29, 2010 at 10:56 AM, Romain Francois romain.franc...@dbmail.com wrote:

> Maybe this:
>
>     foo <- function( x ){
>         idx <- 1 + cumsum( is.na( x ) )
>         not.na <- ! is.na( x )
>         split( x[not.na], idx[not.na] )
>     }
>     foo( x )
>     $`1`
>     [1] 2 1 2
>
>     $`2`
>     [1] 1 1 2
>
>     $`3`
>     [1] 4 5 2 3
>
> Romain
>
> Le 29/04/10 09:42, Tal Galili a écrit :
>
>> Hi all,
>>
>> I would like to have a function like this:
>>
>>     split.vec.by.NA <- function(x)
>>
>> that takes a vector like this:
>>
>>     x <- c(2,1,2,NA,1,1,2,NA,4,5,2,3)
>>
>> and returns a list of length 3, where each element of the list is the relevant segment of the vector, like this:
>>
>>     $`1`
>>     [1] 2 1 2
>>
>>     $`2`
>>     [1] 1 1 2
>>
>>     $`3`
>>     [1] 4 5 2 3
>>
>> I found how to do it with a loop, but wondered if there is some smarter (vectorized) way of doing it. Here is the code I used:
>>
>>     x <- c(2,1,2,NA,1,1,2,NA,4,5,2,3)
>>     split.vec.by.NA <- function(x)
>>     {
>>       # assumes NAs are separating groups of numbers
>>       # TODO: add code to check for it
>>       number.of.groups <- sum(is.na(x)) + 1
>>       # all the places with NAs + a number after the end of the vector
>>       groups.end.point.locations <- c(which(is.na(x)), length(x)+1)
>>       group.start <- 1
>>       group.end <- NA
>>       # we will replace all the places of the group with the group ID,
>>       # except for the NAs, which will later be replaced by 0
>>       new.groups.split.id <- x
>>       for(i in seq_len(number.of.groups))
>>       {
>>         group.end <- groups.end.point.locations[i]-1
>>         new.groups.split.id[group.start:group.end] <- i
>>         # make the new group start higher for the next loop
>>         # (at the final loop it won't matter)
>>         group.start <- groups.end.point.locations[i]+1
>>       }
>>       new.groups.split.id[is.na(x)] <- 0
>>       return(split(x, new.groups.split.id)[-1])
>>     }
>>     split.vec.by.NA(x)
>>
>> Thanks,
>> Tal
>
> --
> Romain Francois
> Professional R Enthusiast
> +33(0) 6 28 91 30 30
> http://romainfrancois.blog.free.fr
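A small follow-up sketch of the same cumsum-and-split idea from this thread. The function name split_by_na and the renumbering step are my own additions, not part of the posted answers; the renumbering only matters when NAs are leading or consecutive, in which case the group labels would otherwise skip numbers.

```r
# Split a vector into runs separated by NA (sketch; the name is hypothetical).
split_by_na <- function(x) {
  keep  <- !is.na(x)
  group <- cumsum(is.na(x)) + 1L      # every NA starts a new group id
  out   <- split(x[keep], group[keep])
  names(out) <- seq_along(out)        # renumber so labels are consecutive
  out
}

x <- c(2, 1, 2, NA, 1, 1, 2, NA, 4, 5, 2, 3)
res <- split_by_na(x)
print(res)

# Leading and consecutive NAs do not create empty groups.
res2 <- split_by_na(c(NA, NA, 1, NA, NA, 2, 3, NA))
print(res2)
```

Because split() drops group ids that have no elements, nothing else is needed to cope with repeated NAs.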
Re: [R] Problem with optimization (constrOptim)
Hi,

You are right, my intention was to return a set of values and to minimize them all in a multicriteria optimization problem. The interesting thing is that when I actually used a scalar return from this function, minimizing the sum of squares in this form:

    fr <- function(z) {
      t(z%*%matrix(c(2,5,6), 3,1)-matrix(c(5,4,2), 3,1))%*%(z%*%matrix(c(2,5,6), 3,1)-matrix(c(5,4,2), 3,1))
    }
    constrOptim((matrix(c(0,0,0,0,0,0,0,0,0),3,3)), fr)

or

    nlm(fr, matrix(c(0,0,0,0,0,0,0,0,0),3,3))

the function also returned a non-conformable error.

Kind regards
Jacob

2010/4/29 Nikhil Kaza nikhil.l...@gmail.com

> fr does not return a scalar.
>
> Nikhil
>
> On Apr 28, 2010, at 3:35 AM, Człowiek Kuba wrote:
>
>> Hello,
>>
>> I have the following problem: I have a set of n matrix equations of the form:
>>
>>     [b1] = [A] * [b0]
>>     [b2] = [A] * [b1]
>>     etc.
>>
>> The vertical vectors [b0], [b1], ... are GIVEN. We try to estimate the matrix A. As there are many equations (more than cells in matrix A), the system has no exact solution. A is the transition matrix (stochastic matrix) of a Markov process, so the sum of each row = 1 and each entry is a probability (a_ij in [0, 1]).
>>
>> I tried to estimate A by using constrOptim in the following way, but apparently it won't work on matrices.
>>
>>     fr <- function(x) {
>>       x%*%matrix(c(2,5,6), 3,1)-matrix(c(5,4,2), 3,1)
>>       x%*%matrix(c(6,2,3), 3,1)-matrix(c(1,1,1), 3,1)
>>       x%*%matrix(c(6,1,2), 3,1)-matrix(c(3,4,1), 3,1)
>>     }
>>     constrOptim(matrix(c(0.5,0.4,0.1,0.2,0.3,0.5,0.5,0.2,0.3),3,3), fr, NULL,
>>       ui=matrix(c(1,0,0,0,1,0,0,0,1),3,3),
>>       ci=matrix(c(-.1,-.1,-.1,-.1,-.1,-.1,-.1,-.1,-.1),3,3))
>>
>> It produces the following error:
>>
>>     Error in ui %*% theta : non-conformable arguments
>>
>> Kind regards and thanks for help
>> Jacob
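A hedged sketch of how the estimation problem in this thread might be set up so that constrOptim accepts it. It is not code from the thread: the data (A_true, X, Y), the helper to_matrix and the choice to drop the third column of A as a free parameter are all assumptions made for illustration. The points it tries to show are that the parameter is a plain vector, the objective returns a single number, ui has one row per constraint, ci is a vector, and the starting value lies strictly inside the feasible region.

```r
# Hypothetical example data: columns of X are inputs b_k,
# columns of Y are the corresponding outputs A %*% b_k.
set.seed(1)
A_true <- matrix(c(0.7, 0.2, 0.1,
                   0.1, 0.8, 0.1,
                   0.2, 0.3, 0.5), nrow = 3, byrow = TRUE)
X <- matrix(runif(15), nrow = 3)
Y <- A_true %*% X

# theta holds the first two columns of A (column-major); the third column
# is implied by the requirement that each row sums to 1.
to_matrix <- function(theta) {
  M <- matrix(theta, nrow = 3, ncol = 2)
  cbind(M, 1 - rowSums(M))
}

# Scalar objective: residual sum of squares over all equations.
rss <- function(theta) sum((to_matrix(theta) %*% X - Y)^2)

# Linear inequality constraints in the form ui %*% theta - ci >= 0:
#   rows 1-6: each free entry >= 0
#   rows 7-9: the two free entries of each row sum to <= 1
ui <- rbind(diag(6), -cbind(diag(3), diag(3)))
ci <- c(rep(0, 6), rep(-1, 3))

theta0 <- rep(1/3, 6)  # strictly interior starting value
fit <- constrOptim(theta0, rss, grad = NULL, ui = ui, ci = ci)
A_hat <- to_matrix(fit$par)
print(round(A_hat, 3))
```

With grad = NULL, constrOptim falls back to Nelder-Mead, so the estimate is only approximate; supplying a gradient function would allow BFGS.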
Re: [R] Lattice Groups
Dear R experts,

Related to the example below (which was discussed earlier): how do I control the graphical elements of the box, whiskers, etc.? I would like their colors to go with specific groups. I tried changing par.settings (box.umbrella, box.rectangle, etc.) and could not make them work. A sample dataset and example code are given below.

    tmp <- data.frame(y=rnorm(100), category=rep(factor(letters[1:5]),each=20), level=rep(factor(0:1), length=100))
    barchart(y~factor(category), groups=level, data=tmp, jitter.x=F,
      panel=function(...){
        panel.superpose(...)
        panel.superpose(panel.groups=panel.bwplot, alpha=c(0.5,0.5), varwidth=T, notch=T,
          col=c("red","blue"), fill=c("pink","lightblue"), pch=16,
          par.settings=list(box.umbrella=list(col=c("red","blue")), box.dot=list(col=c("red","blue"))), ...)
        panel.superpose(panel.groups=panel.loess, lwd=2, col.line=c("red","blue"), alpha=0.2, lty=1, ...)
        panel.abline(h=0, col="black", lty=2)},
      xlab="time bin (week)",
      auto.key=list(space="right", text=c("A","H"), points=T))

Thanks,
Santosh

On Wed, Apr 8, 2009 at 12:07 PM, Deepayan Sarkar deepayan.sar...@gmail.com wrote:

> On Wed, Apr 8, 2009 at 10:36 AM, Lyman, Mark mark.ly...@atk.com wrote:
>
>> I don't understand your first question, but, since no one else has responded, I can answer your second question. panel.bwplot, unlike panel.xyplot, doesn't use panel.superpose when groups is not NULL. In order to get an analogous result you need to specify that you want to use panel.superpose.
>>
>>     cols <- c("Sepal.Width", "Petal.Length", "Petal.Width")
>>     stackedData <- stack(iris[, cols])
>>     df <- data.frame(y = stackedData$values, x = rep(iris$Species, 3), which = gl(3, nrow(iris)))
>>     bwplot(y ~ x:which, data = df, groups = which, panel = panel.superpose, panel.groups = panel.bwplot)
>>
>> If you don't like the default colors, you can set the fill colors with par.settings like:
>>
>>     bwplot(y ~ x:which, data = df, groups = which, panel = panel.superpose, panel.groups = panel.bwplot,
>>            par.settings = list(superpose.symbol = list(fill = 2:4)))
>
> And to answer the first question: using panel.superpose hijacks the parameters of the median spot, but they can be supplied explicitly:
>
>     bwplot(y ~ x:which, data = df, groups = which, panel = panel.superpose, panel.groups = panel.bwplot,
>            par.settings = list(superpose.symbol = list(fill = 2:4)), col = "black", pch = 16)
>
> -Deepayan
>
>> Without the groups, the fill colors are controlled like this:
>>
>>     bwplot(y ~ x:which, data = df, par.settings = list(box.rectangle = list(fill = 2:4)))
>>
>> Although if you have groups, using the groups argument is probably better.
>>
>> Mark Lyman
>>
>>> Date: Tue, 7 Apr 2009 10:50:33 +0100
>>> From: Richard Weeks dickywe...@hotmail.com
>>> Subject: [R] Lattice Groups
>>> To: r-help@r-project.org
>>>
>>> Hi all,
>>>
>>> I'm trying to achieve a few things using the lattice package but am failing miserably. I am plotting side-by-side box plots and using a grouping variable, e.g.
>>>
>>>     cols <- c("Sepal.Width", "Petal.Length", "Petal.Width")
>>>     stackedData <- stack(iris[, cols])
>>>     df <- data.frame(y = stackedData$values, x = rep(iris$Species, 3), which = gl(3, nrow(iris)))
>>>     bwplot(y ~ x:which, data = df, group = which, panel.groups = panel.bwplot)
>>>
>>> My questions are:
>>> 1) How am I able to retain the median spot in the boxes?
>>> 2) How can I change the fill using the par.settings argument rather than fill = 1:3, say?
>>>
>>> Best wishes,
>>> Biff
[R] Request - adding recycled lwd parameter to polygon
Hello dear members of the R-help and R-core mailing lists,

I am not sure if this request is a ticket that should be filed somewhere outside the mailing list. If so, I apologize for not doing so and would like to know where I should have filed it.

And to the subject matter: I would like to use a command like this:

    plot(c(1,8), 1:2, type="n")
    polygon(1:7, c(2,1,2,NA,2,1,2),
            col=c("red", "blue"),
            # border=c("green", "yellow"),
            border=c(1,10),
            lwd=c(1:10))

to create two triangles with different line widths. But the polygon command doesn't seem to recycle the lwd parameter as it does for the col, lty, and border parameters.

I would like the resulting plot to look like what the following code produces:

    plot(c(1,8), 1:2, type="n")
    polygon(1:3, c(2,1,2),
            col=c("red"),
            # border=c("green", "yellow"),
            border=c(1,10),
            lwd=c(1))
    polygon(5:7, c(2,1,2),
            col=c("blue"),
            # border=c("green", "yellow"),
            border=c(1,10),
            lwd=c(10))

I opened up the polygon code to add the lwd parameter so that it is used the way lty is used. For some reason it didn't work (I am wondering if it is because .Internal(polygon(xy$x, xy$y, col, border, lty, lwd, ...)) somehow doesn't accept lwd...)

Here is the updated code I wrote:

    polygon2 <- function (x, y = NULL, density = NULL, angle = 45, border = NULL,
        col = NA, lty = par("lty"), lwd = par("lwd"), ..., fillOddEven = FALSE)
    {
        ..debug.hatch <- FALSE
        xy <- xy.coords(x, y)
        if (is.numeric(density) && all(is.na(density) | density < 0))
            density <- NULL
        if (!is.null(angle) && !is.null(density)) {
            polygon.onehatch <- function(x, y, x0, y0, xd, yd, ..debug.hatch = FALSE, ...)
            {
                if (..debug.hatch) {
                    points(x0, y0)
                    arrows(x0, y0, x0 + xd, y0 + yd)
                }
                halfplane <- as.integer(xd * (y - y0) - yd * (x - x0) <= 0)
                cross <- halfplane[-1L] - halfplane[-length(halfplane)]
                does.cross <- cross != 0
                if (!any(does.cross))
                    return()
                x1 <- x[-length(x)][does.cross]
                y1 <- y[-length(y)][does.cross]
                x2 <- x[-1L][does.cross]
                y2 <- y[-1L][does.cross]
                t <- (((x1 - x0) * (y2 - y1) - (y1 - y0) * (x2 - x1))/(xd * (y2 - y1) - yd * (x2 - x1)))
                o <- order(t)
                tsort <- t[o]
                crossings <- cumsum(cross[does.cross][o])
                if (fillOddEven)
                    crossings <- crossings%%2
                drawline <- crossings != 0
                lx <- x0 + xd * tsort
                ly <- y0 + yd * tsort
                lx1 <- lx[-length(lx)][drawline]
                ly1 <- ly[-length(ly)][drawline]
                lx2 <- lx[-1L][drawline]
                ly2 <- ly[-1L][drawline]
                segments(lx1, ly1, lx2, ly2, ...)
            }
            polygon.fullhatch <- function(x, y, density, angle, ..debug.hatch = FALSE, ...)
            {
                x <- c(x, x[1L])
                y <- c(y, y[1L])
                angle <- angle%%180
                if (par("xlog") || par("ylog")) {
                    warning("cannot hatch with logarithmic scale active")
                    return()
                }
                usr <- par("usr")
                pin <- par("pin")
                upi <- c(usr[2L] - usr[1L], usr[4L] - usr[3L])/pin
                if (upi[1L] < 0)
                    angle <- 180 - angle
                if (upi[2L] < 0)
                    angle <- 180 - angle
                upi <- abs(upi)
                xd <- cos(angle/180 * pi) * upi[1L]
                yd <- sin(angle/180 * pi) * upi[2L]
                if (angle < 45 || angle > 135) {
                    if (angle < 45) {
                        first.x <- max(x)
                        last.x <- min(x)
                    }
                    else {
                        first.x <- min(x)
                        last.x <- max(x)
                    }
                    y.shift <- upi[2L]/density/abs(cos(angle/180 * pi))
                    x0 <- 0
                    y0 <- floor((min(y) - first.x * yd/xd)/y.shift) * y.shift
                    y.end <- max(y) - last.x * yd/xd
                    while (y0 < y.end) {
                        polygon.onehatch(x, y, x0, y0, xd, yd, ..debug.hatch = ..debug.hatch, ...)
                        y0 <- y0 + y.shift
                    }
                }
                else {
                    if (angle < 90) {
                        first.y <- max(y)
                        last.y <- min(y)
                    }
                    else {
                        first.y <- min(y)
                        last.y <- max(y)
                    }
                    x.shift <- upi[1L]/density/abs(sin(angle/180 * pi))
                    x0 <- floor((min(x) - first.y * xd/yd)/x.shift) * x.shift
                    y0 <- 0
                    x.end <- max(x) - last.y * xd/yd
                    while (x0 < x.end) {
                        polygon.onehatch(x, y, x0, y0, xd, yd, ..debug.hatch =
Re: [R] NLS "Singular Gradient" Error
Hi Ben,

That's great, thank you very much indeed.

Kind regards,
Neal

Quoting Ben Bolker [via R] ml-node+2074786-1865094303-243...@n4.nabble.com:

> bsnrh bsnrh at leeds.ac.uk writes:
>
>> Hi Ben,
>>
>> Your book refers to the mle function in the emdbookx package. I was wondering if it's possible to find that package on the internet?
>>
>> Many thanks,
>> Neal
>
> If the (draft) PDF says that, it's an error. See the mle2 function in the bbmle package (which is available from CRAN). The lambertW function is in the emdbook package, which is also available on CRAN.
>
>     install.packages(c("bbmle","emdbook"))
>     library(bbmle)
>     library(emdbook)
>     ?mle2
>     ?lambertW
>
> (You can address further questions to me off-list ...)
Re: [R] using get and paste in a loop to return objects for object names listed a strings
Nevil Amos wrote:

> I am trying to create a heap of boxplots, by looping through a series of factors and variables in a large data.frame, using paste to construct the factor and response names from the colnames.
>
> I thought I could do this using get(); however, it is not working. What am I doing wrong?

You don't give a reproducible example, which makes it hard to answer your question. But, not really in response to your question, take a look at bwplot from the lattice package or geom_boxplot from the ggplot2 package. These functions can do all the work of drawing boxplots for a series of factors and variables in a large data.frame. This saves you a lot of time.

cheers,
Paul

> thanks
>
> Nevil Amos
>
>     sp.codes=levels(data.all$CODE_LETTERS)
>     for(spp in sp.codes) {
>       data.sp=subset(data.all,CODE_LETTERS==spp)
>       responses = colnames(data.all)[c(20,28,29,19)]
>       #if (spp=="BT") responses = colnames(data.all)[c(19,20,26:29)]
>       groups=colnames(data.all)[c(9,10,13,16,30)]
>       data.sp=subset(data.all,CODE_LETTERS==spp)
>       for (response in responses){
>         for (group in groups){
>           r<-get(paste("data.sp$",response,sep=""))
>           g<-get(paste("data.sp$",group, sep=""))
>           print (r)
>           print(g)
>           boxplot(r ~g)
>     }}}
>
>     Error in get(paste("data.sp$", response, sep = "")) :
>       object 'data.sp$Hb' not found

--
Drs. Paul Hiemstra
Department of Physical Geography, Faculty of Geosciences, University of Utrecht
Heidelberglaan 2, P.O. Box 80.115, 3508 TC Utrecht
Phone: +3130 274 3113 Mon-Tue
Phone: +3130 253 5773 Wed-Fri
http://intamap.geo.uu.nl/~paul
http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770
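For what it is worth, the error in the question arises because get() looks up an object by name, and there is no object literally called "data.sp$Hb"; the `$` is an operator, not part of a name. A minimal sketch of the usual workaround, indexing the data frame with `[[` and a character string, is below. The data frame and its column names are made up for the illustration (only CODE_LETTERS and Hb appear in the original post).

```r
# Hypothetical stand-in for data.all; SEX, AGE and MASS are invented columns.
set.seed(1)
data.all <- data.frame(
  CODE_LETTERS = rep(c("BT", "WT"), each = 20),
  SEX  = factor(sample(c("F", "M"), 40, replace = TRUE)),
  AGE  = factor(sample(c("adult", "juv"), 40, replace = TRUE)),
  Hb   = rnorm(40, mean = 15),
  MASS = rnorm(40, mean = 30)
)

responses <- c("Hb", "MASS")
groups    <- c("SEX", "AGE")
stats     <- list()

for (spp in unique(data.all$CODE_LETTERS)) {
  data.sp <- data.all[data.all$CODE_LETTERS == spp, ]
  for (response in responses) {
    for (group in groups) {
      r <- data.sp[[response]]   # column selected by its name as a string
      g <- data.sp[[group]]
      key <- paste(spp, response, group, sep = ".")
      # boxplot() draws the plot and invisibly returns its summary statistics
      stats[[key]] <- boxplot(r ~ g, main = key, xlab = group, ylab = response)
    }
  }
}
```

An alternative with the same effect is to build the formula from the names, e.g. boxplot(reformulate(group, response), data = data.sp).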
[R] Random numbers with PDF of user-defined function
Hi,

In S+/R, is there an easy way to generate random numbers with a probability distribution specified by an exact user-defined function?

For example, I have a function:

    f(x) = 1/(365 * x),

which should be fitted for values of x between 1 and 100,000.

How do I generate random numbers with a probability distribution that exactly maps the above function?

Nick
Re: [R] Determining whether plot.new has been called
On 04/29/2010 02:21 AM, Dennis Fisher wrote:

> Colleagues
>
> I have a lengthy script that calls mtext. Under most circumstances, a graphics device is open and a plot exists, in which case mtext works as expected. However, there are some instances where the graphics device is open but no plot exists. When mtext is called, I receive an error message:
>
>     Error in mtext(1) : plot.new has not been called yet
>
> The solution is to confirm:
> a. that the device is open: length(dev.list()) > 0
> b. whether plot.new has been called.
>
> I need help on the latter - how does one test whether plot.new has been called?

Hi Dennis,

I use:

    if(dev.cur() == 1) # there is no graphics device open

Device 1 always seems to be the null device. Since I have never had occasion to switch to the null device, this is a sort of test for whether there is another graphics device open, and thus whether plot.new has been called (on the current device). While it has always worked for me, I am aware that it is a Sneaky Trick and may not work in some situations.

Jim
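An alternative that sidesteps the detection problem, offered only as a sketch and not something proposed in the thread: attempt the mtext call and start a new plot only if it fails. The wrapper name mtext_safely is hypothetical, and note that the handler reacts to any error from mtext, not just the "plot.new has not been called yet" one.

```r
# Call mtext(); if it fails (e.g. because no plot exists yet on the device),
# start an empty plot and try once more. Returns TRUE invisibly when a new
# plot had to be started. (Hypothetical helper, not from the thread.)
mtext_safely <- function(...) {
  started_new_plot <- FALSE
  tryCatch(
    mtext(...),
    error = function(e) {
      plot.new()
      started_new_plot <<- TRUE
      mtext(...)
    }
  )
  invisible(started_new_plot)
}

pdf(NULL)                                   # a device is open, but no plot exists
first  <- mtext_safely("Header", side = 3)  # has to call plot.new() first
second <- mtext_safely("Footer", side = 1)  # plot now exists
invisible(dev.off())
```

The design choice here is to ask forgiveness rather than permission, since base graphics does not seem to expose a documented "has plot.new been called" query.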
[R] control span in panel.loess in xyplot
Dear R gurus,

Is it possible to control span settings for different values of a grouping variable when using xyplot? Example code is shown below:

    d=data.frame(x=rep(sample(1:5,rep=F),10),y=rnorm(50),z=rep(sample(LETTERS[1:2],rep=F),25))
    xyplot(y~x,data=d,groups=z,panel=panel.superpose,panel.groups=panel.loess(span=c(2/3, 3/4,1/2)))

or something like:

    xyplot(y~x,data=d,groups=z,panel=function(...) {panel.superpose(...);panel.groups=panel.loess(span=3/4,...)})

Thanks,
Santosh
Re: [R] control span in panel.loess in xyplot
See ?panel.number for lattice functions that can be used in your panel function to discover which one is currently being drawn.

On Thu, Apr 29, 2010 at 6:28 AM, Santosh santosh2...@gmail.com wrote:

> Dear R gurus,
>
> Is it possible to control span settings for different values of a grouping variable when using xyplot? Example code is shown below:
>
>     d=data.frame(x=rep(sample(1:5,rep=F),10),y=rnorm(50),z=rep(sample(LETTERS[1:2],rep=F),25))
>     xyplot(y~x,data=d,groups=z,panel=panel.superpose,panel.groups=panel.loess(span=c(2/3, 3/4,1/2)))
>
> or something like:
>
>     xyplot(y~x,data=d,groups=z,panel=function(...) {panel.superpose(...);panel.groups=panel.loess(span=3/4,...)})
>
> Thanks,
> Santosh
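To make the hint above concrete, here is one way it could look. This is a sketch rather than the responder's code: it relies on panel.superpose passing a group.number argument to the panel.groups function, and the data and the spans vector are invented for the example.

```r
library(lattice)

set.seed(42)
d <- data.frame(x = runif(100, 1, 5),
                y = rnorm(100),
                z = rep(c("A", "B"), each = 50))

spans <- c(2/3, 1/2)   # hypothetical choice: one span per level of z

p <- xyplot(y ~ x, data = d, groups = z,
            panel = panel.superpose,
            panel.groups = function(x, y, ..., group.number) {
              panel.xyplot(x, y, ...)                              # the points
              panel.loess(x, y, span = spans[group.number], ...)   # group-specific smooth
            })
print(p)
```

The key difference from the attempts in the question is that panel.groups is given a function, which is then called once per group, instead of the result of calling panel.loess directly.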
Re: [R] Help in web browser
On 28/04/2010 11:07 PM, Chintanu wrote:

> Hi,
>
> I have recently updated to R 2.10.1 on my Windows system. Since then, whenever I look for help (e.g., by using the ?Function command), the information is displayed by opening a web browser. However, I would rather have the information in the usual pop-up style.
>
> Is there a way to set this? Please inform.

Set

    options(help_type="text")

for the plain text popups. (This was an installation option; you could reinstall R to set it in your profile, or edit it into RHOME/etc/Rprofile.site yourself.)

Duncan Murdoch
Re: [R] Compact Patricia Trees (Tries)
Using charmatch, partial matching of 10,000 5-letter words against the same list can be done in 10 seconds on my machine, and 10,000 5-letter words against 100,000 10-letter words in 1 minute. Is that good enough? Try this simulation:

    # generate N random words each k long
    rwords <- function(N, k) {
      L <- sample(letters, N*k, replace = TRUE)
      apply(matrix(L, k), 2, paste, collapse = "")
    }
    w1 <- rwords(1e5, 10)
    w2 <- rwords(1e4, 5)
    system.time(charmatch(w2, w2))
    system.time(charmatch(w2, w1))

On Thu, Apr 29, 2010 at 4:05 AM, Richard R. Liu richard@pueo-owl.ch wrote:

> I have an application that searches a long list of character strings to determine which occur at the beginning of a given word. A straightforward R script using grep takes a long time to run. Rewriting it to use substr and match might be an option, but I have the impression that preparing the list as a trie and performing trie searches might lead to dramatic improvements in performance. I have searched the CRAN packages and find no packages that support Compact Patricia Trees. Does anybody know of such?
>
> Thanks,
> Richard
>
> Richard R. Liu
> richard@pueo-owl.ch
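For the narrower task in the question (which strings in a list are prefixes of one given word), a direct vectorised comparison may already be enough. The sketch below is not from the thread; the function names and the sample prefixes are hypothetical. It uses startsWith(), which is in base R from version 3.3.0 onward, and shows a substring()-based equivalent for older versions.

```r
# Which entries of `prefixes` occur at the beginning of `word`?
prefixes_of <- function(word, prefixes) {
  prefixes[startsWith(word, prefixes)]
}

# Equivalent without startsWith(): compare each prefix with the leading
# characters of the word; substring() recycles over the vector of lengths.
prefixes_of_old <- function(word, prefixes) {
  prefixes[substring(word, 1, nchar(prefixes)) == prefixes]
}

prefixes <- c("un", "under", "over", "re", "stand")
hits     <- prefixes_of("understand", prefixes)
hits_old <- prefixes_of_old("understand", prefixes)
print(hits)
```

Both versions do one pass over the prefix list per word, so a trie would still win asymptotically for very long lists and many words; this is just the simple baseline.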
[R] UpdateLinks = FALSE
Hi,

I'm reading 100s of Excel files and many of them contain links to external files (I hate that, but that aside). Every time such a file is opened, a menu pops up asking if I want to update the links. I never want to update the links. I used the macro recorder to see what code would be needed to suppress that message, but to no avail (I tried more variations, but one attempt is shown below). How can I suppress such messages?

    excel <- comCreateObject("Excel.Application")
    wb <- comGetProperty(excel, "Workbooks")
    comSetProperty(wb, "UpdateLinks", FALSE)
    owb <- comInvoke(wb, "Open", xlsfile) # at this point, it's too late

Another query: the program at large erases any cells that contain formulae. Thanks to Erich, the program now works like a charm. However, some cells contain formulae such as 832.1 * E4 * E3 (yes I know: big, big *sigh*). I did not take that possibility into account while writing the program. Would it be possible to capture the number (832.1)? My first idea would be to access the formula representation (as a string) and use a nifty regular expression.

Thank you in advance.

Cheers!!
Albert-Jan

~~
All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a fresh water system, and public health, what have the Romans ever done for us?
~~
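On the second query, a rough sketch of the regular-expression idea, assuming the formula text has already been obtained as an R character string (how to read it through the COM interface is not shown here and is left as an assumption). The function name and the sample formulas are invented; only a plain decimal constant at the very start of the formula is recognised.

```r
# Extract a leading numeric constant from formula strings such as
# "=832.1 * E4 * E3"; returns NA where the formula does not start with a number.
extract_leading_number <- function(f) {
  pattern <- "^=?[[:space:]]*[0-9]*\\.?[0-9]+"
  hit <- regexpr(pattern, f)
  out <- rep(NA_real_, length(f))
  out[hit > 0] <- as.numeric(sub("^=?[[:space:]]*", "", regmatches(f, hit)))
  out
}

formulas <- c("=832.1 * E4 * E3", "=E4*E3", "=12*B2", "=SUM(A1:A3)")
nums <- extract_leading_number(formulas)
print(nums)
```

Constants that appear later in the formula, negative numbers and exponent notation are deliberately not handled, to avoid mistaking a cell reference such as E4 for part of a number.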
[R] convert Factor as numeric
Dear group, I know this issue has been already covered, and before you reply I must say I have read the R-FAQ and search the mailing list archive. I still can't manage to change my factor to numeric as I couldn't find any clear answer. Here is my df : Pose1 - structure(list(DESCRIPTION = structure(c(1L, 2L, 3L, 4L, 5L, 8L), .Label = c( SUGAR NO.11 May/10 , COTTON NO.2 May/10 , PLATINUM Jul/10 , ROBUSTA COFFEE (10) May/10 , WHEAT May/10 , PRIMARY NICKEL USD, PRM HGH GD ALUMINIUM USD, SPCL HIGH GRADE ZINC USD, STANDARD LEAD USD), class = factor), POSITION = c(5, 3, -1, 15, 4, 2), SETTLEMENT = structure(c(3L, 5L, 2L, 1L, 4L, 8L), .Label = c(1,353., 1,739.4000, 16.5400, 467.7500, 78.1300, 25,760.8600, 2,415.9000, 2,421.0500, 2,357.1200), class = factor)), .Names = c(DESCRIPTION, POSITION, SETTLEMENT), row.names = c(1, 2, 3, 4, 5, 51), class = data.frame) S-Pose1$SETTLEMENT #select the last column S [1] 16.540078.13001,739.4000 1,353. 467.7500 2,421.0500 Levels: 1,353. 1,739.4000 16.5400 467.7500 78.1300 25,760.8600 2,415.9000 2,421.0500 2,357.1200 str(S) Factor w/ 9 levels 1,353.,1,739.4000,..: 3 5 2 1 4 8 Now I need to change S to numeric class S1-as.numeric(levels(S))[as.integer(S)] #doesn't work, numbers are rounded or NA Warning message: NAs introduced by coercion S1-as.numeric(levels(S))[S] #doesn't work, numbers are rounded or NA Warning message: NAs introduced by coercion S1-as.numeric(as.character(S)) #doesn't work, numbers are rounded or NA Warning message: NAs introduced by coercion If it can help, my column S is part of a DF that has been obtained via this line : pose=read.csv2(LSCPos1.csv,sep=,,dec=.,as.is=T,h=T,skip=1)[,c(4,8,14, 15)] pose - structure(list(DESCRIPTION = c(WHEAT May/10 , WHEAT May/10 , WHEAT May/10 , WHEAT May/10 , COTTON NO.2 May/10 , COTTON NO.2 May/10 , COTTON NO.2 May/10 , PLATINUM Jul/10 , SUGAR NO.11 May/10 , SUGAR NO.11 May/10 , SUGAR NO.11 May/10 , SUGAR NO.11 May/10 , SUGAR NO.11 May/10 , ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE 
(10) May/10 , ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 , PRM HGH GD ALUMINIUM USD 09/07/10 , PRM HGH GD ALUMINIUM USD 09/07/10 , PRIMARY NICKEL USD 04/06/10 , PRIMARY NICKEL USD 04/06/10 , PRIMARY NICKEL USD 10/06/10 , PRIMARY NICKEL USD 10/06/10 , STANDARD LEAD USD 01/07/10 , STANDARD LEAD USD 01/07/10 , STANDARD LEAD USD 01/07/10 , STANDARD LEAD USD 01/07/10 , STANDARD LEAD USD 01/07/10 , STANDARD LEAD USD 01/07/10 , STANDARD LEAD USD 01/07/10 , STANDARD LEAD USD 06/07/10 , SPCL HIGH GRADE ZINC USD 08/07/10 , SPCL HIGH GRADE ZINC USD 08/07/10 , SPCL HIGH GRADE ZINC USD 08/07/10 , SPCL HIGH GRADE ZINC USD 09/07/10 , SPCL HIGH GRADE ZINC USD 09/07/10 , SPCL HIGH GRADE ZINC USD 09/07/10 , SPCL HIGH GRADE ZINC USD 09/07/10 , SPCL HIGH GRADE ZINC USD 09/07/10 , SPCL HIGH GRADE ZINC USD 13/04/10 , SPCL HIGH GRADE ZINC USD 13/04/10 ), CREATED.DATE = structure(c(14705, 14707, 14707, 14711, 14700, 14700, 14711, 14711, 14708, 14708, 14708, 14711, 14711, 14707, 14707, 14707, 14707, 14707, 14708, 14708, 14708, 14708, 14708, 14708, 14708, 14708, 14708, 14672, 14673, 14678, 14678, 14700, 14700, 14700, 14700, 14700, 14700, 14700, 14705, 14707, 14707, 14707, 14708, 14708, 14708, 14708, 14708, 14622, 14634), class = Date), QUANITY = c(1, 1, 1, 1, 1, 1, 1, -1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 2, 1, 1, 1, 1, 2, 1, 1, -1, 1, 1, -1, -1, 1, 1, -1, 1, -1, -1, 1, -1, 1, 1, 1, -1, -1, 1, -1, 1, 1, 1, -1), CLOSING.PRICE = c(467.7500, 467.7500, 467.7500, 467.7500, 78.1300, 78.1300, 78.1300, 1,739.4000, 16.5400, 16.5400, 16.5400, 16.5400, 16.5400, 1,353., 1,353., 1,353., 1,353., 1,353., 1,353., 1,353., 1,353., 1,353., 1,353., 1,353., 1,353., 2,415.9000, 2,415.9000, 25,755.7100, 25,755.7100, 25,760.8600, 25,760.8600, 2,355.9600, 2,355.9600, 2,355.9600, 
2,355.9600, 2,355.9600, 2,355.9600, 2,355.9600, 2,357.1200, 2,420.7300, 2,420.7300, 2,420.7300, 2,421.0500, 2,421.0500, 2,421.0500, 2,421.0500, 2,421.0500, 2,388.4300, 2,388.4300 )), .Names = c(DESCRIPTION, CREATED.DATE, QUANITY, SETTLEMENT), row.names = c(NA, -49L), class = data.frame) str(pose) 'data.frame': 49 obs. of 4 variables: $ DESCRIPTION : chr WHEAT May/10 WHEAT May/10 WHEAT May/10 WHEAT May/10 ... $ CREATED.DATE:Class 'Date' num [1:49] 14705 14707 14707 14711 14700 ... $ QUANITY : num 1 1 1 1 1 1 1 -1 1 1 ... $ SETTLEMENT : chr 467.7500 467.7500 467.7500 467.7500 ... Pose$SETTLEMENT has a character class, when it should have been numeric. So maybe a solution would be to give a numeric class when I read my .csv file? I tried to change class of this
Re: [R] Sweave question
On 28/04/2010 11:31 PM, Felipe Carrillo wrote:

> Hi:
> I am using Sweave and texi2dvi to generate a LaTeX document but can't find the way to hide the graphics while the R chunks are being executed. I thought results=hide would do it but that's not the case.

Sweave runs figure chunks multiple times. The first time is probably what you're seeing: it just runs the code, with no special devices created. You need to tell R to use something other than your screen as the default device for this. That's what happens if you run Sweave in batch mode, or if you choose options(device="pdf"). (You'll get a file Rplots.pdf created.)

Duncan Murdoch

> If I do:
>
>     \begin{figure}[h]
>     <<figA=true,echo=F,fig=T,results=hide>>=
>     a <- rnorm(1000)
>     plot(a)
>     @
>     \caption{Weekly estimates.}
>     \label{figure:ggplot1}
>     \end{figure}
>
> the graphic doesn't get displayed but gets printed in the document. But the code below shows the graphic... how can I hide it?
>
>     \begin{figure}[h]
>     <<figA=true,echo=F,fig=T,results=hide>>=
>     library(ggplot2)
>     winter <- read.csv("Winter_AllYears.csv")
>     wintermelt <- melt(winter,id="week")
>     print(ggplot(wintermelt,aes(week,value/1000)) + geom_line(aes(colour=variable)) +
>       opts(legend.position="none") + facet_wrap(~variable,ncol=2) + opts(title="Winter") +
>       labs(y="Number of fish X 1,000",x="WEEK"))
>     @
>     \caption{Weekly estimates.}
>     \label{figure:ggplot1}
>     \end{figure}
>
> Felipe D. Carrillo
> Supervisory Fishery Biologist
> Department of the Interior
> US Fish & Wildlife Service
> California, USA
Re: [R] Random numbers with PDF of user-defined function
At 05:40 AM 4/29/2010, Nick Crosbie wrote:

> Hi,
>
> In S+/R, is there an easy way to generate random numbers with a probability distribution specified by an exact user-defined function?
>
> For example, I have a function:
>
>     f(x) = 1/(365 * x),
>
> which should be fitted for values of x between 1 and 100,000.
>
> How do I generate random numbers with a probability distribution that exactly maps the above function?
>
> Nick

First of all, your pdf should be f(x) = 1 / [x log(10^5)], if x is continuous on [1, 100000], so that it integrates to 1.

Second, compute the cdf as F(x) = log(x) / log(10^5).

Third, compute the inverse cdf as G(p) = exp[p log(10^5)].

Finally, to generate random variates, use G(u), where u is a uniform random variate in [0,1].

In R,

    > G <- function (p) exp(p*log(10^5))
    > G(runif(5))
    [1] 11178.779736  9475.748549 65939.487801 94914.354479     1.694695

Robert A. LaBudde, PhD, PAS, Dpl. ACAFS   e-mail: r...@lcfltd.com
Least Cost Formulations, Ltd.             URL: http://lcfltd.com/
824 Timberlake Drive                      Tel: 757-467-0954
Virginia Beach, VA 23464-3239             Fax: 757-467-2947

Vere scire est per causas scire
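The inverse-cdf recipe above generalises to any interval [lower, upper] with density proportional to 1/x: the cdf is log(x/lower) / log(upper/lower), so the inverse is lower * (upper/lower)^u. A short sketch follows; the function name r_recip and its argument names are my own, not from the thread.

```r
# Draw n variates with density proportional to 1/x on [lower, upper]
# by inverting the cdf F(x) = log(x / lower) / log(upper / lower).
r_recip <- function(n, lower = 1, upper = 1e5) {
  stopifnot(lower > 0, upper > lower)
  u <- runif(n)
  lower * (upper / lower)^u
}

set.seed(123)
draws <- r_recip(1e5)
range(draws)

# Sanity check: log10 of the draws should look uniform on [0, 5],
# so its mean should be close to 2.5.
mean(log10(draws))
```

Note that the constant 1/365 in the original f(x) drops out: only the shape 1/x and the interval determine the distribution once it is normalised.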
Re: [R] Random numbers with PDF of user-defined function
On 29/04/2010 5:40 AM, Nick Crosbie wrote:

> Hi,
>
> In S+/R, is there an easy way to generate random numbers with a probability distribution specified by an exact user-defined function?
>
> For example, I have a function:
>
>     f(x) = 1/(365 * x),
>
> which should be fitted for values of x between 1 and 100,000.
>
> How do I generate random numbers with a probability distribution that exactly maps the above function?

You can use sample() with the prob argument set to the values of f(x). You probably want replace=TRUE as well.

Duncan Murdoch

> Nick
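A minimal sketch of the sample() suggestion, under the assumption that x only takes the integer values 1 to 100,000 (the thread does not say whether x is discrete). sample() rescales the prob weights internally, so f does not need to sum to 1.

```r
f <- function(x) 1 / (365 * x)   # the function from the question
x <- 1:100000                    # assumed support: integers only

set.seed(2010)
draws <- sample(x, size = 1000, replace = TRUE, prob = f(x))
head(draws)

# Small values should dominate, since the weights fall off like 1/x.
mean(draws <= 10)
```

If x is meant to be continuous, this discrete version is only an approximation, and inverting the cdf is the more exact route.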
[R] errors returned upon trying to update JGR
I have upgraded R and am currently running the following version:

    R version 2.10.1 Patched (2010-02-20 r51163)
    Copyright (C) 2010 The R Foundation for Statistical Computing
    ISBN 3-900051-07-0

The characteristics of my system are the following:

    OS: Linux 2.6.27.29-0.1-default x86_64
    Current user: mau...@linux-326k
    System: openSUSE 11.1 (x86_64)
    KDE: 4.1.3 (KDE 4.1.3) release 4.10.4

On-line help files cannot be found from JGR. The JGR upgrade attempt generated the following errors:

    > JGR(update=TRUE)
    trying URL 'http://www.rforge.net/src/contrib/JGR_1.7-2.tar.gz'
    Content type 'application/x-gzip' length 528295 bytes (515 Kb)
    opened URL
    ==
    downloaded 515 Kb

    trying URL 'http://cran.r-project.org/src/contrib/rJava_0.8-4.tar.gz'
    Content type 'application/x-gzip' length 520037 bytes (507 Kb)
    opened URL
    ==
    downloaded 507 Kb

    trying URL 'http://www.rforge.net/src/contrib/JavaGD_0.5-3.tar.gz'
    Content type 'application/x-gzip' length 101898 bytes (99 Kb)
    opened URL
    ==
    downloaded 99 Kb

    trying URL 'http://cran.r-project.org/src/contrib/iplots_1.1-3.tar.gz'
    Content type 'application/x-gzip' length 331100 bytes (323 Kb)
    opened URL
    ==
    downloaded 323 Kb

    The downloaded packages are in
            /tmp/RtmpXEkgtp/downloaded_packages
    Warning messages:
    1: In install.packages(c("JGR", "rJava", "JavaGD", "iplots"), lt, c(cran, :
      installation of package 'rJava' had non-zero exit status
    2: In install.packages(c("JGR", "rJava", "JavaGD", "iplots"), lt, c(cran, :
      installation of package 'JavaGD' had non-zero exit status
    3: In install.packages(c("JGR", "rJava", "JavaGD", "iplots"), lt, c(cran, :
      installation of package 'JGR' had non-zero exit status

Any suggestion is welcome.

Thank you,
Maura
Re: [R] convert Factor as numeric
arnaud Gaboury arnaud.gaboury at gmail.com writes:

Dear group, I know this issue has been already covered, and before you reply I must say I have read the R-FAQ and searched the mailing list archive. I still can't manage to change my factor to numeric as I couldn't find any clear answer.

(Posting via Gmane, so there will probably be four other solutions by the time this shows up.)

Your problem is that R does not recognize the comma separators in your numeric format. Thanks for posting reproducible code!

as.numeric(gsub(",", "", as.character(Pose1$SETTLEMENT)))
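The conversion above can be tried on a small stand-alone factor. This is only a sketch: the vector `settlement` is made up for illustration (a few values borrowed from the post) and is not the poster's full data.

```r
# Hypothetical factor whose labels use a comma as the thousands separator
settlement <- factor(c("16.5400", "78.1300", "1,739.4000", "2,421.0500"))

# as.numeric() applied to the factor itself returns the internal level codes,
# not the numbers printed in the labels
codes <- as.numeric(settlement)

# Convert to character first, remove the commas, then convert to numeric
values <- as.numeric(gsub(",", "", as.character(settlement), fixed = TRUE))
print(values)
```

Going through `as.character()` first matters because the comma removal has to act on the labels rather than on the integer codes.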
Re: [R] Problem with optimization (constrOptim)
Ah, constrOptim is for linear inequality constraints. Your ci is a matrix; it should be a vector.

Nikhil

On Apr 29, 2010, at 3:14 AM, Człowiek Kuba wrote:

Hi, You are right, my intention was to return a set of values and to minimize them all in a multicriteria optimization problem. The interesting thing is that when I actually used a scalar return value for this function, by minimizing the sum of squares in this form:

fr <- function(z) { t(z%*%matrix(c(2,5,6), 3,1)-matrix(c(5,4,2), 3,1))%*%(z%*%matrix(c(2,5,6), 3,1)-matrix(c(5,4,2), 3,1)) }
constrOptim((matrix(c(0,0,0,0,0,0,0,0,0),3,3)), fr)

or

nlm(fr, matrix(c(0,0,0,0,0,0,0,0,0),3,3))

the function also returned a non-conformable error.

Kind regards Jacob

2010/4/29 Nikhil Kaza nikhil.l...@gmail.com

fr does not return a scalar. Nikhil

On Apr 28, 2010, at 3:35 AM, Człowiek Kuba wrote:

Hello, I have the following problem: I have a set of n matrix equations of the form:

[b1] = [A] * [b0]
[b2] = [A] * [b1]
etc.

The column vectors [b0], [b1], ... are GIVEN. We try to estimate the matrix A. As there are many equations (more than cells in matrix A), the system has no exact solution. A is the transition matrix (stochastic matrix) of a Markov process, so the sum of each row = 1 and each entry is a probability (a_ij in [0, 1]). I tried to estimate A by using constrOptim the following way, but apparently it won't work on matrices.

fr <- function(x) {
  x%*%matrix(c(2,5,6), 3,1)-matrix(c(5,4,2), 3,1)
  x%*%matrix(c(6,2,3), 3,1)-matrix(c(1,1,1), 3,1)
  x%*%matrix(c(6,1,2), 3,1)-matrix(c(3,4,1), 3,1)
}
constrOptim(matrix(c(0.5,0.4,0.1,0.2,0.3,0.5,0.5,0.2,0.3),3,3), fr, NULL, ui=matrix(c(1,0,0,0,1,0,0,0,1),3,3), ci=matrix(c(-.1,-.1,-.1,-.1,-.1,-.1,-.1,-.1,-.1), 3,3))

It produces the following error:

Error in ui %*% theta : non-conformable arguments

Kind regards and thanks for help Jacob
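Putting the two replies together (scalar objective, parameter vector, vector `ci`), one possible setup is sketched below. It is not the posters' code. The assumptions are mine: the row-sum equality is handled by estimating only the first two columns of A and computing the third as one minus their sum, and the helper name `build_A` and the matrices `B0`/`B1` (assembled from the vectors in the post) are illustrative.

```r
# Columns of B0 are the "given" vectors, columns of B1 the vectors they map to
B0 <- cbind(c(2, 5, 6), c(6, 2, 3), c(6, 1, 2))
B1 <- cbind(c(5, 4, 2), c(1, 1, 1), c(3, 4, 1))

# Hypothetical helper: rebuild a 3x3 matrix whose rows sum to 1
# from 6 free parameters (the first two columns, stored column-wise)
build_A <- function(theta) {
  free <- matrix(theta, nrow = 3, ncol = 2)
  cbind(free, 1 - rowSums(free))
}

# Scalar objective: sum of squared residuals over all equations
fr <- function(theta) {
  A <- build_A(theta)
  sum((A %*% B0 - B1)^2)
}

# Linear inequality constraints in the form ui %*% theta - ci >= 0:
#   rows 1-6: each free entry >= 0
#   rows 7-9: the two free entries of a row sum to <= 1 (third entry >= 0)
ui <- rbind(diag(6), -cbind(diag(3), diag(3)))
ci <- c(rep(0, 6), rep(-1, 3))

theta0 <- rep(0.3, 6)  # strictly inside the feasible region
fit <- constrOptim(theta0, fr, grad = NULL, ui = ui, ci = ci)
A_hat <- build_A(fit$par)
print(round(A_hat, 3))
```

With `grad = NULL`, `constrOptim` falls back to Nelder-Mead, which is slow but enough for six parameters. The starting value must satisfy all constraints strictly.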
Re: [R] non linear estimation
Hey, thanks. I had actually already found packages such as nlme, but I failed to find the command for restrictions. Anyway, thanks for your help.

James

-- View this message in context: http://r.789695.n4.nabble.com/non-linear-estimation-tp2072136p2075338.html
Re: [R] non linear estimation
It is an assignment, haha. I just simplified the question, and I could do it in Excel using Solver. I just wonder whether I can find a way to do it in R. The main problem is adding restrictions; I managed to do one question without restrictions in R with nls.

James

-- View this message in context: http://r.789695.n4.nabble.com/non-linear-estimation-tp2072136p2075343.html
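For simple box restrictions on the parameters, `nls` itself accepts `lower` and `upper` bounds when `algorithm = "port"` is used. The sketch below uses simulated data and a made-up exponential model, since the actual assignment is not shown in the thread.

```r
set.seed(1)
# Simulated data for a hypothetical model y = a * exp(-k * x) + noise
x <- seq(1, 10, length.out = 50)
y <- 2 * exp(-0.3 * x) + rnorm(50, sd = 0.02)

# Bounds are only honoured with algorithm = "port"
fit <- nls(y ~ a * exp(-k * x),
           start = list(a = 1, k = 0.1),
           algorithm = "port",
           lower = c(a = 0, k = 0),
           upper = c(a = 5, k = 1))
print(coef(fit))
```

Restrictions that link several parameters (for example, a sum constraint) do not fit this form and would need a reparametrization or a general optimizer.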
Re: [R] convert Factor as numeric
TY Petr, I was just trying something like that a few minutes ago :-)

as.numeric(gsub(",", "", S))

does exactly what I want.

-Original Message-
From: Petr PIKAL [mailto:petr.pi...@precheza.cz]
Sent: Thursday, April 29, 2010 1:28 PM
To: arnaud Gaboury
Cc: r-help@r-project.org
Subject: Odp: [R] convert Factor as numeric

Hi

You have to get rid of the thousands separator first:

as.numeric(gsub(",", "", S))

Regards Petr

r-help-boun...@r-project.org wrote on 29.04.2010 13:12:44:

Dear group, I know this issue has been already covered, and before you reply I must say I have read the R-FAQ and searched the mailing list archive. I still can't manage to change my factor to numeric as I couldn't find any clear answer. Here is my df:

Pose1 <- structure(list(DESCRIPTION = structure(c(1L, 2L, 3L, 4L, 5L, 8L), .Label = c("SUGAR NO.11 May/10 ", "COTTON NO.2 May/10 ", "PLATINUM Jul/10 ", "ROBUSTA COFFEE (10) May/10 ", "WHEAT May/10 ", "PRIMARY NICKEL USD", "PRM HGH GD ALUMINIUM USD", "SPCL HIGH GRADE ZINC USD", "STANDARD LEAD USD"), class = "factor"), POSITION = c(5, 3, -1, 15, 4, 2), SETTLEMENT = structure(c(3L, 5L, 2L, 1L, 4L, 8L), .Label = c("1,353.", "1,739.4000", "16.5400", "467.7500", "78.1300", "25,760.8600", "2,415.9000", "2,421.0500", "2,357.1200"), class = "factor")), .Names = c("DESCRIPTION", "POSITION", "SETTLEMENT"), row.names = c(1, 2, 3, 4, 5, 51), class = "data.frame")

S <- Pose1$SETTLEMENT #select the last column
S
[1] 16.5400 78.1300 1,739.4000 1,353. 467.7500 2,421.0500
Levels: 1,353.
1,739.4000 16.5400 467.7500 78.1300 25,760.8600 2,415.9000 2,421.0500 2,357.1200

str(S)
Factor w/ 9 levels 1,353.,1,739.4000,..: 3 5 2 1 4 8

Now I need to change S to numeric class:

S1 <- as.numeric(levels(S))[as.integer(S)] #doesn't work, numbers are rounded or NA
Warning message: NAs introduced by coercion
S1 <- as.numeric(levels(S))[S] #doesn't work, numbers are rounded or NA
Warning message: NAs introduced by coercion
S1 <- as.numeric(as.character(S)) #doesn't work, numbers are rounded or NA
Warning message: NAs introduced by coercion

If it can help, my column S is part of a DF that has been obtained via this line:

pose=read.csv2("LSCPos1.csv", sep=",", dec=".", as.is=T, h=T, skip=1)[,c(4, 8, 14, 15)]

pose <- structure(list(DESCRIPTION = c(WHEAT May/10 , WHEAT May/10 , WHEAT May/10 , WHEAT May/10 , COTTON NO.2 May/10 , COTTON NO.2 May/10 , COTTON NO.2 May/10 , PLATINUM Jul/10 , SUGAR NO.11 May/10 , SUGAR NO.11 May/10 , SUGAR NO.11 May/10 , SUGAR NO.11 May/10 , SUGAR NO.11 May/10 , ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 , PRM HGH GD ALUMINIUM USD 09/07/10 , PRM HGH GD ALUMINIUM USD 09/07/10 , PRIMARY NICKEL USD 04/06/10 , PRIMARY NICKEL USD 04/06/10 , PRIMARY NICKEL USD 10/06/10 , PRIMARY NICKEL USD 10/06/10 , STANDARD LEAD USD 01/07/10 , STANDARD LEAD USD 01/07/10 , STANDARD LEAD USD 01/07/10 , STANDARD LEAD USD 01/07/10 , STANDARD LEAD USD 01/07/10 , STANDARD LEAD USD 01/07/10 , STANDARD LEAD USD 01/07/10 , STANDARD LEAD USD 06/07/10 , SPCL HIGH GRADE ZINC USD 08/07/10 , SPCL HIGH GRADE ZINC USD 08/07/10 , SPCL HIGH GRADE ZINC USD 08/07/10 , SPCL HIGH GRADE ZINC USD 09/07/10 , SPCL HIGH GRADE ZINC USD 09/07/10 , SPCL HIGH GRADE ZINC USD 09/07/10 , SPCL HIGH GRADE ZINC USD 09/07/10
, SPCL HIGH GRADE ZINC USD 09/07/10 , SPCL HIGH GRADE ZINC USD 13/04/10 , SPCL HIGH GRADE ZINC USD 13/04/10 ), CREATED.DATE = structure(c(14705, 14707, 14707, 14711, 14700, 14700, 14711, 14711, 14708, 14708, 14708, 14711, 14711, 14707, 14707, 14707, 14707, 14707, 14708, 14708, 14708, 14708, 14708, 14708, 14708, 14708, 14708, 14672, 14673, 14678, 14678, 14700, 14700, 14700, 14700, 14700, 14700, 14700, 14705, 14707, 14707, 14707, 14708, 14708, 14708, 14708, 14708, 14622, 14634), class = Date), QUANITY = c(1, 1, 1, 1, 1, 1, 1, -1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 2, 1, 1, 1, 1, 2, 1, 1, -1, 1, 1, -1, -1, 1, 1, -1, 1, -1, -1, 1, -1, 1, 1, 1, -1, -1, 1, -1, 1, 1, 1, -1), CLOSING.PRICE = c(467.7500, 467.7500, 467.7500, 467.7500, 78.1300, 78.1300, 78.1300, 1,739.4000, 16.5400, 16.5400, 16.5400, 16.5400, 16.5400, 1,353., 1,353., 1,353., 1,353., 1,353., 1,353., 1,353., 1,353., 1,353., 1,353., 1,353., 1,353., 2,415.9000, 2,415.9000, 25,755.7100, 25,755.7100, 25,760.8600, 25,760.8600, 2,355.9600, 2,355.9600, 2,355.9600, 2,355.9600, 2,355.9600, 2,355.9600, 2,355.9600, 2,357.1200, 2,420.7300, 2,420.7300, 2,420.7300, 2,421.0500, 2,421.0500,
Re: [R] Split a vector by NA's - is there a better solution then a loop ?
Another option could be:

split(x, replace(cumsum(is.na(x)), is.na(x), -1))[-1]

On Thu, Apr 29, 2010 at 4:42 AM, Tal Galili tal.gal...@gmail.com wrote:

Hi all, I would like to have a function like this:

split.vec.by.NA <- function(x)

That takes a vector like this:

x <- c(2,1,2,NA,1,1,2,NA,4,5,2,3)

And returns a list of length 3, where each element of the list is the relevant segment of the vector, like this:

$`1`
[1] 2 1 2
$`2`
[1] 1 1 2
$`3`
[1] 4 5 2 3

I found how to do it with a loop, but wondered if there is some smarter (vectorized) way of doing it. Here is the code I used:

x <- c(2,1,2,NA,1,1,2,NA,4,5,2,3)
split.vec.by.NA <- function(x) {
  # assumes NAs are separating groups of numbers
  # TODO: add code to check for it
  number.of.groups <- sum(is.na(x)) + 1
  groups.end.point.locations <- c(which(is.na(x)), length(x)+1) # all the positions of NAs + one position past the end of the vector
  group.start <- 1
  group.end <- NA
  new.groups.split.id <- x # we will replace all the positions of a group with the group ID, except for the NAs, which will later be replaced by 0
  for(i in seq_len(number.of.groups)) {
    group.end <- groups.end.point.locations[i]-1
    new.groups.split.id[group.start:group.end] <- i
    group.start <- groups.end.point.locations[i]+1 # move the group start forward for the next iteration (at the final iteration it won't matter)
  }
  new.groups.split.id[is.na(x)] <- 0
  return(split(x, new.groups.split.id)[-1])
}
split.vec.by.NA(x)

Thanks, Tal

Contact me: tal.gal...@gmail.com | 972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English)

-- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40" S 49° 16' 22" O
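A closely related variant, sketched here as an alternative and not as anyone's posted answer, drops the NA positions before splitting, so no placeholder group has to be removed afterwards. The variable names are illustrative.

```r
x <- c(2, 1, 2, NA, 1, 1, 2, NA, 4, 5, 2, 3)

# Each NA bumps the running group counter; the NAs themselves are then dropped
keep <- !is.na(x)
group <- cumsum(is.na(x))
pieces <- split(x[keep], group[keep])
print(pieces)
```

The resulting list is named "0", "1", "2" instead of "1", "2", "3"; `unname(pieces)` or `setNames()` can adjust that if the names matter.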
Re: [R] UpdateLinks = FALSE
Sorry, I intended to send this straight to the rcom mailing list. It's about the rcom package.

Cheers!! Albert-Jan

~~ All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a fresh water system, and public health, what have the Romans ever done for us? ~~

--- On Thu, 4/29/10, Albert-Jan Roskam fo...@yahoo.com wrote:

From: Albert-Jan Roskam fo...@yahoo.com
Subject: [R] UpdateLinks = FALSE
To: R Mailing List r-help@r-project.org
Date: Thursday, April 29, 2010, 1:07 PM

Hi, I'm reading hundreds of Excel files, and many of them contain links to external files (I hate that, but that aside). Every time such a file is opened, a menu pops up asking if I want to update the links. I never want to update the links. I used the macro recorder to see what code would be needed to suppress that message, but to no avail (I tried more variations, but one attempt is shown below). How can I suppress such messages?

excel <- comCreateObject("Excel.Application")
wb <- comGetProperty(excel, "Workbooks")
comSetProperty(wb, "UpdateLinks", FALSE)
owb <- comInvoke(wb, "Open", xlsfile) # at this point, it's too late

Another query: the program at large erases any cells that contain formulae. Thanks to Erich, the program now works like a charm. However, some cells contain formulae such as 832.1 * E4 * E3 (yes I know: big, big *sigh*). I did not take that possibility into account while writing the program. Would it be possible to capture the number (832.1)? My first idea would be to access the formula representation (as a string) and use a nifty regular expression.

Thank you in advance.

Cheers!! Albert-Jan
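The regular-expression idea in the second query can be sketched in base R, assuming the formula has already been read into a character string (how to get that string out of Excel via rcom is not covered here). The pattern and the helper name `formula_constants` are illustrative choices.

```r
# Hypothetical helper: return numeric literals in a formula string,
# skipping digits that belong to cell references such as E4 or $E$3
formula_constants <- function(formula) {
  # A number not preceded or followed by a letter, digit, dot or "$"
  pattern <- "(?<![A-Za-z0-9.$])[0-9]+(?:\\.[0-9]+)?(?![A-Za-z0-9.])"
  hits <- regmatches(formula, gregexpr(pattern, formula, perl = TRUE))[[1]]
  as.numeric(hits)
}

constants <- formula_constants("=832.1 * E4 * E3")
print(constants)
```

This is deliberately narrow: it does not handle scientific notation, locale-specific decimal commas, or numbers inside quoted strings.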
[R] randomness in stepclass (klaR) or lda (MASS) ?
Hi,

a colleague ran a stepwise discriminant analysis twice in a row and got different results, suggesting some stochasticity in the algorithms involved. I looked at her data and found that there was a lot of collinearity, so I reckoned that maybe stepclass (klaR) cannot find a clear winner when trying to include a new variable and makes a random choice. Is that true?

Another possibility is that lda (from MASS) computes CV classification rates from a random subsample instead of using all the data (?). That might be a sensible choice with a very large sample.

I advised her to run the function several times and see if a consensus emerges, but that doesn't seem to be the case, and besides, I would like to know what really is going on.

thanks

Eric Elguero
Laboratory Genetics and Evolution of Infectious Diseases, Team: Genetics and Adaptation of Plasmodium
UMR 2724 CNRS-IRD, IRD Montpellier, 911 Avenue Agropolis, BP 64501, 34394 Montpellier Cedex 5, France

f4.U.spDA <- stepclass(f.mes, f.gp4, "lda", improvement=0.01, prior=rep(0.25,4))
`stepwise classification', using 10-fold cross-validated correctness rate of method lda'.
89 observations of 31 variables in 4 classes; direction: both
stop criterion: improvement less than 1%.
correctness rate: 0.58333; in: X2; variables (1): X2
correctness rate: 0.66389; in: X9; variables (2): X2, X9
correctness rate: 0.69583; in: X27; variables (3): X2, X9, X27
 hr.elapsed min.elapsed sec.elapsed
       0.00        0.00       20.77

f4.U.spDA <- stepclass(f.mes, f.gp4, "lda", improvement=0.01, prior=rep(0.25,4))
`stepwise classification', using 10-fold cross-validated correctness rate of method lda'.
89 observations of 31 variables in 4 classes; direction: both
stop criterion: improvement less than 1%.
correctness rate: 0.60556; in: X2; variables (1): X2
correctness rate: 0.71806; in: X6; variables (2): X2, X6
 hr.elapsed min.elapsed sec.elapsed
       0.00        0.00       15.14
Re: [R] operator problem within function
Sorry for that offlist post, I did not mean to do it intentionally; I just hit the wrong button. Unfortunately this disadvantage is not written next to $ in the manual.

On Apr 29, 2010, at 2:34 AM, Bunny, lautloscrew.com wrote:

David, with your help I finally got it. THX! Sorry for handing out some ugly names. Reason being: it's a German dataset with German variable names. With those German names you are always sure you don't use a forbidden name. I just did not want to hit one of those by accident when changing these names for the mailing list. columna is just the Latin term for column :) . Anyway, here's what worked. Note: I just tried to use some more real names here.

recode_items = function(dataframe,question_number,medium=3){
  # note: column names of the initial data.frame are like Question1, Question2 etc. Using [,1] would not be very practical since
  # the df contains some other data too. Indexing by names seemed to be the most comfortable way so far.
  question <- paste("Question",question_number,sep="")
  # needed indexing here that understands characters, that's why going with [,question_number] did not work.
  dataframe[question][dataframe[question]==3]=0

This would be more typical:

  dataframe[dataframe[question]==3, question] <- 0

  return(dataframe)
}
recode_items(mydataframe,question_number,3)

# This call uses the dataframe that contains the answers of survey participants. Question number is an argument that selects the question from the dataframe that should be recoded. In surveys some weighting schemes only respect extreme answers, which is why the medium answer is recoded to zero. Since it depends on the item scale what medium actually is, I need it to be an argument of my function.

Did you want a further logical test with that =1 or some sort of assignment???

So yes, it's an assignment.

Moral: Generally better to use [ indexing.

That's what really made my day (and it's only 9.30 a.m. here).

Are there exceptions to the rule?

Not that I know of.

I just worked a lot with the $ in the past.

$colname is just syntactic sugar for either ["colname"] or [ , "colname"], and it has the disadvantage that colname is not evaluated.

thx matt

On 29.04.2010, at 00:56, David Winsemius wrote:

On Apr 28, 2010, at 5:45 PM, David Winsemius wrote:

On Apr 28, 2010, at 5:31 PM, Bunny, lautloscrew.com wrote:

Dear all, I have a problem with processing dataframes within a function using the $. Here's my code:

recode_items = function(dataframe,number,medium=2){
  # this works
  q <- paste("columna",number,sep="")

Do you really want q to equal "columna2" when number equals 2?

  # this does not work, particularly because dataframe is not processed
  # dataframe should be: givenframe$columnagivennumber
  a=dataframe$q[dataframe$q==medium]=1

Did you want a further logical test with that =1 or some sort of assignment???

a) Do you want to work on the column from dataframe (horrible name for this purpose IMO) with the name "columna2"? If so, then start with dataframe[ , q ]; the q will be evaluated in this form, whereas it would not be when used with $.

b) (A guess in the absence of an explanation of the goal.) Now do you want all of the rows where that vector equals medium? If so, then try this:

dataframe[ dataframe[ , q ]==2 , ] # untested in the absence of data

Ooops. It should have been:

dataframe[ dataframe[ , q ]==medium , ] # since both q and medium will be evaluated.

Moral: Generally better to use [ indexing.

-- David.

  return(a)
}

If I call this function, I'd like it to return my dataframe. The problem appears to be somewhere around the $. I'm sure this is not too hard, but somehow I am stuck. I'll keep searching the manuals. Thx for any help in advance.

best matt
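For reference, a self-contained version of the recoding function discussed in this thread might look like the sketch below. The data frame `survey` is made up, and using `which()` to stay safe with missing answers is an addition of this sketch, not something the posters wrote.

```r
recode_items <- function(dataframe, question_number, medium = 3) {
  # Build the column name as a string, e.g. "Question1"
  question <- paste("Question", question_number, sep = "")
  # [[ ]] evaluates `question`; which() drops NA comparisons
  rows <- which(dataframe[[question]] == medium)
  dataframe[rows, question] <- 0
  dataframe
}

# Hypothetical survey answers on a 1-5 scale
survey <- data.frame(id = 1:4,
                     Question1 = c(1, 3, 5, 3),
                     Question2 = c(3, 2, NA, 4))
recoded <- recode_items(survey, 1)
print(recoded)
```

The function returns a modified copy; the caller has to assign the result (as in `recoded <- ...`) for the change to persist.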
[R] Using plyr::dply more (memory) efficiently?
Hi all,

In short: I'm running ddply on an admittedly (somewhat) large data.frame (not that large). It runs fine until it finishes and gets to the collating part, where all subsets of my data.frame have been summarized and are being reassembled into the final summary data.frame (sorry, I don't know the correct plyr terminology). During collation, my R workspace RAM usage goes from about 1.5 GB up to 20 GB until I kill it. Running a similar piece of code that iterates manually without ddply, using a combination of lapply and do.call(rbind, ...), uses considerably less RAM (it tops out at about 8 GB). How can I use ddply more efficiently?

Longer: Here's more info:

* The data.frame itself is ~ 15.8 MB when loaded.
* ~ 400,000 rows, 8 columns

It looks like so:

   exon.start exon.width exon.width.unique exon.anno counts symbol transcript  chr
1        4225        468                 0       utr      0 WASH5P     WASH5P chr1
2        4833         69                 0       utr      1 WASH5P     WASH5P chr1
3        5659        152                38       utr      1 WASH5P     WASH5P chr1
4        6470        159                 0       utr      0 WASH5P     WASH5P chr1
5        6721        198                 0       utr      0 WASH5P     WASH5P chr1
6        7096        136                 0       utr      0 WASH5P     WASH5P chr1
7        7469        137                 0       utr      0 WASH5P     WASH5P chr1
8        7778        147                 0       utr      0 WASH5P     WASH5P chr1
9        8131         99                 0       utr      0 WASH5P     WASH5P chr1
10      14601        154                 0       utr      0 WASH5P     WASH5P chr1
11      19184         50                 0       utr      0 WASH5P     WASH5P chr1
12       4693        140                36    intron      2 WASH5P     WASH5P chr1
13       4902        757                36    intron      1 WASH5P     WASH5P chr1
14       5811        659               144    intron     47 WASH5P     WASH5P chr1
15       6629         92                21    intron      1 WASH5P     WASH5P chr1
16       6919        177                 0    intron      0 WASH5P     WASH5P chr1
17       7232        237                35    intron      2 WASH5P     WASH5P chr1
18       7606        172                 0    intron      0 WASH5P     WASH5P chr1
19       7925        206                 0    intron      0 WASH5P     WASH5P chr1
20       8230       6371               109    intron     67 WASH5P     WASH5P chr1
21      14755       4429                55    intron     12 WASH5P     WASH5P chr1
...

I'm ply-ing over the transcript column, and the function transforms each such subset of the data.frame into a new data.frame that is just 1 row per transcript, which basically has the sum of the counts for each transcript.

The code would look something like this (`summaries` is the data.frame I'm referring to):

rpkm <- ddply(summaries, .(transcript), function(df) {
  data.frame(symbol=df$symbol[1], counts=sum(df$counts))
})

(It actually calculates 2 more columns that are returned in the data.frame, but I'm not sure that's really important here.)

To test some things out, I've written another function to manually iterate over and create subsets of my data.frame to summarize. I'm using sqldf to dump the data.frame into a db, then I lapply over subsets of the db `where transcript=x` to summarize each subset of my data into a list of single-row data.frames (like ddply is doing), and finish with a `do.call(rbind, the.dfs)` on this list. This returns exactly the same result ddply would return, and by the time `do.call` finishes, my RAM usage hits about 8 GB.

So, what am I doing wrong with ddply that makes the difference in RAM usage in the last step (collation, the equivalent of my final `do.call(rbind, my.dfs)`) more than 12 GB?

Thanks, -steve

-- Steve Lianoglou
Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
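For a summary this simple (one sum per transcript), one way to sidestep building hundreds of thousands of one-row data.frames is to aggregate the column directly in base R. This is a sketch on a tiny made-up data.frame, not a diagnosis of what ddply does internally. The transcript "OTHER1" and the symbol "GENE2" are invented for illustration.

```r
# Tiny stand-in for the poster's `summaries` data.frame
summaries <- data.frame(
  transcript = c("WASH5P", "WASH5P", "WASH5P", "OTHER1", "OTHER1"),
  symbol     = c("WASH5P", "WASH5P", "WASH5P", "GENE2", "GENE2"),
  counts     = c(0, 1, 47, 3, 4),
  stringsAsFactors = FALSE
)

# rowsum() adds up `counts` within each transcript in a single pass
totals <- rowsum(summaries$counts, summaries$transcript)

# Keep the first symbol seen for each transcript, as the ddply call did
first <- !duplicated(summaries$transcript)
rpkm <- data.frame(
  transcript = summaries$transcript[first],
  symbol     = summaries$symbol[first],
  stringsAsFactors = FALSE
)
rpkm$counts <- as.vector(totals[rpkm$transcript, 1])
print(rpkm)
```

Whether this helps with the two extra columns mentioned in the post depends on whether they can also be expressed as per-group sums or similar vectorized reductions.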
Re: [R] randomness in stepclass (klaR) or lda (MASS) ?
On 29.04.2010 15:01, Eric Elguero wrote:

Hi, a colleague ran a stepwise discriminant analysis twice in a row and got different results, suggesting some stochasticity in the algorithms involved. I looked at her data and found that there was a lot of collinearity, so I reckoned that maybe stepclass (klaR) cannot find a clear winner when trying to include a new variable and makes a random choice. Is that true?

Yes, since a cross validation is involved. If you want stable results, you could try leave-one-out or set a seed. Anyway, if your variables are collinear, I wonder if the stepwise approach is the smartest solution here.

Another possibility is that lda (from MASS) computes CV classification rates from a random subsample instead of using all the data (?). That might be a sensible choice with a very large sample. I advised her to run the function several times and see if a consensus emerges, but that doesn't seem to be the case, and besides, I would like to know what really is going on.

Well, it is called cross validation, which is based on random sampling unless you have k=n-fold CV (= leave-one-out). Again, to get reproducible results, you will need to set a seed. If the results are that unstable: do you really have a sufficient number of observations for your classification problem?

Uwe Ligges

thanks Eric Elguero
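The effect of the seed can be seen with a hand-rolled 10-fold cross-validation around `MASS::lda`. This sketch uses the built-in `iris` data and a hypothetical helper `cv_rate`. It does not call klaR's `stepclass`, so it only illustrates the mechanism (random fold assignment), not that function's internals.

```r
library(MASS)

# Hypothetical helper: k-fold cross-validated correctness rate of lda on `data`
cv_rate <- function(data, k = 10) {
  # The random fold assignment is the only stochastic step
  folds <- sample(rep(seq_len(k), length.out = nrow(data)))
  correct <- 0
  for (i in seq_len(k)) {
    fit <- lda(Species ~ ., data = data[folds != i, ])
    pred <- predict(fit, data[folds == i, ])$class
    correct <- correct + sum(pred == data$Species[folds == i])
  }
  correct / nrow(data)
}

# Same seed, same folds, same estimate
set.seed(42); rate1 <- cv_rate(iris)
set.seed(42); rate2 <- cv_rate(iris)
print(c(rate1, rate2))
```

Without the `set.seed()` calls, repeated runs can give slightly different rates, and with small samples those differences may be enough to change which variable a stepwise search picks.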
[R] Dendrogram and fusion levels
Dear users,

I am trying to extract the fusion levels from a dendrogram (in my case, phylogenetic trees in the 'phylo' format of APE). So far, I have not been successful. I can't believe there is no library to do it, but I can't find a function that would extract the fusion levels. Do you know any way to extract the fusion levels?

Any help would be appreciated,

Timothée

Timothée POISOT - Institut des Sciences de l'Evolution
Université Montpellier 2, CC 065 Place Eugène Bataillon 34095 Montpellier CEDEX 5
Phone : (+33)4 67 14 40 61 Fax : (+33)4 67 14 40 61
E-mail : tpoi...@um2.fr Web : http://www.timotheepoisot.fr/
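For trees built with `hclust`, the fusion levels are stored directly in the fitted object, as the sketch below shows on the built-in `USArrests` data. This does not answer the question for `phylo` objects as such. For ultrametric `phylo` trees, ape's `branching.times()` is, as far as I know, the closest analogue, but it is not used here.

```r
# Complete-linkage clustering on a built-in data set
hc <- hclust(dist(USArrests), method = "complete")

# One fusion level per merge: n - 1 values for n observations
fusion_levels <- hc$height
print(head(fusion_levels))

# Which two clusters were joined at each level (negative = single observation)
print(head(hc$merge))
```

Row i of `hc$merge` corresponds to element i of `hc$height`, so the two can be combined into one table if needed.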
[R] lm() with non-linear coefficients constraints? --- nls?
Dear R experts---quick question. I need to estimate a model that looks like

y = (b*T+d*T^3) + (1-b-3*d*T^2)*x + (3*d*T)*x^2 + (-d)*x^3

I only have three parameters. Is nls() the right tool for the job, or is there something faster/better?

/iaw

Ivo Welch (ivo.we...@brown.edu, ivo.we...@gmail.com)
CV Starr Professor of Economics (Finance), Brown University
http://welch.econ.brown.edu/
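A direct `nls` fit of this model is sketched below on simulated data. The true values, noise level and starting values are invented for illustration, and the parameter is called `T0` here only to avoid clashing with R's `T` shorthand for `TRUE`. Expanding the terms shows the model is the same as y = x + b*(T - x) + d*(T - x)^3, which is how the data are simulated.

```r
set.seed(123)
# Hypothetical "true" parameters used only to simulate data
b_true <- 0.5; d_true <- 0.1; T_true <- 2

x <- seq(-2, 6, length.out = 200)
y <- x + b_true * (T_true - x) + d_true * (T_true - x)^3 + rnorm(200, sd = 0.05)

# The model exactly as written in the post, with T renamed to T0
fit <- nls(y ~ (b * T0 + d * T0^3) + (1 - b - 3 * d * T0^2) * x +
               (3 * d * T0) * x^2 + (-d) * x^3,
           start = list(b = 0.4, d = 0.08, T0 = 1.8))
print(coef(fit))
```

As with any nonlinear least-squares fit, convergence depends on reasonable starting values, and real data may need a few attempts.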
Re: [R] operator problem within function
On Apr 29, 2010, at 9:03 AM, Bunny, lautloscrew.com wrote:

Sorry for that offlist post, I did not mean to do it intentionally; I just hit the wrong button. Unfortunately this disadvantage is not written next to $ in the manual.

Hmmm. Not my manual: "Both [[ and $ select a single element of the list. The main difference is that $ does not allow computed indices, whereas [[ does." It also says that the correct equivalent of $ using extraction operators would be: x$name is equivalent to x[["name", exact = FALSE]]

-- David.

On Apr 29, 2010, at 2:34 AM, Bunny, lautloscrew.com wrote:

David,

With your help I finally got it. THX! Sorry for handing out some ugly names. The reason: it's a German dataset with German variable names. With those German names you can always be sure you don't use a forbidden name, and I did not want to hit one of those by accident when changing the names for the mailing list. "columna" is just the Latin term for column :). Anyway, here's what worked (note: I tried to use some more realistic names here):

    recode_items = function(dataframe, question_number, medium = 3) {
      # Note: column names of the initial data.frame are like Question1, Question2 etc.
      # Using [,1] would not be very practical since the df contains some other data too.
      # Indexing by names seemed to be the most comfortable way so far.
      question <- paste("Question", question_number, sep = "")
      # Needed indexing here that understands characters; that's why going with
      # [, question_number] did not work.
      dataframe[question][dataframe[question] == 3] = 0

This would be more typical:

      dataframe[dataframe[question] == 3, question] <- 0

      return(dataframe)
    }

    recode_items(mydataframe, question_number, 3)

This call uses the dataframe that contains the answers of the survey participants. The question number is an argument that selects the question from the dataframe that should be recoded. In surveys some weighting schemes only respect extreme answers, which is why the medium answer is recoded to zero. Since it depends on the item scale what medium actually is, I need it to be an argument of my function.

Did you want a further logical test with that =1 or some sort of assignment???

So yes, it's an assignment.

Moral: Generally better to use [ indexing.

That's what really made my day (and it's only 9.30 a.m. here). Are there exceptions to the rule?

Not that I know of.

I just worked a lot with the $ in the past.

$colname is just syntactic sugar for either ["colname"] or [ , "colname"], and it has the disadvantage that colname is not evaluated.

thx
matt

On 29.04.2010, at 00:56, David Winsemius wrote:
On Apr 28, 2010, at 5:45 PM, David Winsemius wrote:
On Apr 28, 2010, at 5:31 PM, Bunny, lautloscrew.com wrote:

Dear all,

I have a problem with processing dataframes within a function using the $. Here's my code:

    recode_items = function(dataframe, number, medium = 2) {
      # this works
      q <- paste("columna", number, sep = "")

Do you really want q to equal "columna2" when number equals 2?

      # this does not work, particularly because dataframe is not processed
      # dataframe should be: givenframe$columnagivennumber
      a = dataframe$q[dataframe$q == medium] = 1

Did you want a further logical test with that =1 or some sort of assignment???

a) Do you want to work on the column from dataframe (a horrible name for this purpose IMO) with the name "columna2"? If so, then start with

    dataframe[ , q ]

The q will be evaluated in this form, whereas it would not be when used with $.

b) (A guess in the absence of an explanation of the goal.) Now do you want all of the rows where that vector equals medium? If so, then try this:

    dataframe[ dataframe[ , q ] == 2 , ]   # untested in the absence of data

Ooops. That should have been:

    dataframe[ dataframe[ , q ] == medium , ]   # since both q and medium will be evaluated

Moral: Generally better to use [ indexing.

-- David.

      return(a)
    }

If I call this function, I'd like it to return my dataframe. The problem appears to be somewhere around the $. I'm sure this is not too hard, but somehow I am stuck. I'll keep searching the manuals. Thx for any help in advance.

best
matt

David Winsemius, MD
West Hartford,
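To make the point about computed column names concrete, here is a minimal sketch. The `survey` data frame and its values are invented for illustration; only the `Question1`, `Question2` naming scheme comes from the post. It uses `medium` instead of the hard-coded 3 and wraps the comparison in `which()` so that missing answers are skipped, both of which are my assumptions about what was intended.

```r
# Hypothetical survey data; only the QuestionN naming scheme is from the thread.
survey <- data.frame(id = 1:4,
                     Question1 = c(1, 3, 5, 3),
                     Question2 = c(3, 2, 3, 4))

recode_items <- function(dataframe, question_number, medium = 3) {
  # Build the column name at run time; `$` cannot use it, `[[` and `[` can.
  question <- paste("Question", question_number, sep = "")
  # which() drops NA comparisons, so rows with missing answers are left alone.
  hit <- which(dataframe[[question]] == medium)
  dataframe[hit, question] <- 0   # recode the medium answer to zero
  dataframe
}

recoded <- recode_items(survey, 1)

# `$` does not evaluate its argument: this looks for a column literally
# named "question", which does not exist, and returns NULL.
question <- "Question1"
literal_lookup <- survey$question
```

The function returns a modified copy, so the caller has to assign the result (`survey <- recode_items(survey, 1)`) if the recoding should persist.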
Re: [R] Split a vector by NA's - is there a better solution then a loop ?
On Thu, Apr 29, 2010 at 1:27 PM, Henrique Dallazuanna www...@gmail.com wrote:

Another option could be:

    split(x, replace(cumsum(is.na(x)), is.na(x), -1))[-1]

One thing none of the solutions so far do (although I haven't tried Tal's original code) is insert an empty group between adjacent NA values, for example in:

    x = c(1,2,3,NA,NA,4,5,6)
    split(x, replace(cumsum(is.na(x)), is.na(x), -1))[-1]
    $`0`
    [1] 1 2 3

    $`2`
    [1] 4 5 6

Maybe this never happens in Tal's case, or it's not what he wanted anyway, but I thought I'd point it out!

Barry
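If the empty group between adjacent NA values is wanted, one possible variant (a sketch of mine, not something proposed in the thread) is to declare every possible group id as a factor level, because `split()` keeps empty factor levels:

```r
x <- c(1, 2, 3, NA, NA, 4, 5, 6)
na <- is.na(x)

# Group id = number of NAs seen so far. Listing all ids from 0 to the total
# number of NAs as factor levels makes split() return the empty groups too.
grp <- factor(cumsum(na)[!na], levels = 0:sum(na))
pieces <- split(x[!na], grp)
```

Here `pieces` has three elements named "0", "1" and "2", and the middle one is a zero-length numeric vector standing for the gap between the two NA values.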
[R] time zone convert
Hi there,

I've got a column vector in a csv file as follows, and I need to add 11 hours to each of them. Is there a way that I can do it? (The actual file size is much bigger than this.)

    Time
    01-DEC-2008 00:00:28.611
    01-DEC-2008 00:00:43.155
    01-DEC-2008 00:01:06.677
    01-DEC-2008 00:01:06.677
    01-DEC-2008 00:01:06.677
    01-DEC-2008 00:01:06.919
    01-DEC-2008 00:23:46.452
    02-DEC-2008 00:03:17.646
    02-DEC-2008 00:03:17.652
    03-DEC-2008 00:15:11.485
    03-DEC-2008 00:18:44.652
    03-DEC-2008 00:22:17.447

Thank you in advance.

Cheers,
Carol
Re: [R] time zone convert
Try this:

    Time2 <- gsub("\\.*", "", tolower(Time))
    modifyList(Time2, list(hour = Time2$hour + 11))

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
[R] R Anova Analysis
Dear all,

I have a quite basic question about ANOVA in R. Sorry for this, but I have no clue how to explain this result. I have two datasets, named nmda123 and nmda456. Each dataset has three samples which were measured three times, and I would like to compare their means with a post hoc test in R. Please see the output below (CREB, mCREB and No Virus are the names of the samples):

    nmda123
         Values      ind
    1 6.7171265     CREB
    2 5.0343117     CREB
    3 6.900         CREB
    4 0.1195394    mCREB
    5 0.1221876    mCREB
    6 0.190        mCREB
    7 1.000     No Virus
    8 1.000     No Virus
    9 1.000     No Virus

    nmda456
         Values      ind
    1 6.4486940     CREB
    2 6.2277490     CREB
    3 6.500         CREB
    4 0.200        mCREB
    5 0.3766052    mCREB
    6 0.400        mCREB
    7 1.000     No Virus
    8 1.000     No Virus
    9 1.000     No Virus

    TukeyHSD(aov(Values ~ ind, data = nmda456))
      Tukey multiple comparisons of means
        95% family-wise confidence level

    Fit: aov(formula = Values ~ ind, data = nmda456)

    $ind
                         diff        lwr        upr     p adj
    mCREB-CREB     -6.0666126 -6.3289033 -5.8043219 0.000
    No Virus-CREB  -5.3921477 -5.6544383 -5.1298570 0.000
    No Virus-mCREB  0.6744649  0.4121743  0.9367556 0.0005382

    TukeyHSD(aov(Values ~ ind, data = nmda123))
      Tukey multiple comparisons of means
        95% family-wise confidence level

    Fit: aov(formula = Values ~ ind, data = nmda123)

    $ind
                        diff        lwr       upr     p adj
    mCREB-CREB     -6.073237 -7.5618886 -4.584585 0.392
    No Virus-CREB  -5.217146 -6.7057976 -3.728495 0.943
    No Virus-mCREB  0.856091 -0.6325606  2.344743 0.2588450

My question is about the No Virus-mCREB comparison. Looking at the data by eye, there is a big difference between No Virus and mCREB in nmda123, yet the p-value is 0.2588450, whereas in nmda456 the p-value is 0.0005382. The difference is even bigger in nmda123 than in nmda456, so I do not know why. Sorry for my inexperience in statistics. Thanks for your reply and time!

Cheers,
Wei
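A possible explanation, offered as a sketch rather than as an answer given in the thread: `TukeyHSD` on a one-way `aov` fit uses a single pooled residual variance for every pairwise comparison. In the posted output the intervals for nmda123 are about ±1.49 wide, against about ±0.26 for nmda456, for all three comparisons alike. The code below re-enters the posted values (assuming the truncated entries such as "6.900" are 6.9, and so on) and shows that the CREB replicates in nmda123 are much more spread out, which inflates the pooled variance and therefore widens the No Virus-mCREB interval as well.

```r
# Values re-typed from the post; truncated entries (6.900, 0.190, ...) are
# assumed to be 6.9, 0.19, ... Levels are fixed so the order is locale-independent.
ind <- factor(rep(c("CREB", "mCREB", "No Virus"), each = 3),
              levels = c("CREB", "mCREB", "No Virus"))
nmda123 <- data.frame(
  Values = c(6.7171265, 5.0343117, 6.9, 0.1195394, 0.1221876, 0.19, 1, 1, 1),
  ind = ind)
nmda456 <- data.frame(
  Values = c(6.4486940, 6.2277490, 6.5, 0.2, 0.3766052, 0.4, 1, 1, 1),
  ind = ind)

# Within-group standard deviations: compare the CREB entries.
sd123 <- tapply(nmda123$Values, nmda123$ind, sd)
sd456 <- tapply(nmda456$Values, nmda456$ind, sd)

# Pooled residual mean square shared by all three comparisons.
pooled_mse <- function(d) {
  fit <- aov(Values ~ ind, data = d)
  deviance(fit) / df.residual(fit)
}

# Half-width of a 95% Tukey interval with 3 groups of 3 observations each.
half_width <- function(mse) qtukey(0.95, nmeans = 3, df = 6) * sqrt(mse / 3)

hw123 <- half_width(pooled_mse(nmda123))
hw456 <- half_width(pooled_mse(nmda456))
```

Under these assumptions `hw123` and `hw456` reproduce the interval half-widths in the posted tables, so the larger p-value in nmda123 comes from the noisier CREB group rather than from the No Virus and mCREB values themselves. Comparing only those two groups, or using a method that does not assume equal variances, would be a separate modelling decision.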
Re: [R] Using plyr::dply more (memory) efficiently?
I don't know about that, but try this:

    install.packages("data.table", repos="http://R-Forge.R-project.org")
    require(data.table)
    summaries = data.table(summaries)
    summaries[,sum(counts),by=symbol]

Please let us know if that returns the correct result, and if its memory/speed is ok?

Matthew

Steve Lianoglou mailinglist.honey...@gmail.com wrote in message news:w2kbbdc7ed01004290606lc425e47cs95b36f6bf0a...@mail.gmail.com...

Hi all,

In short: I'm running ddply on an admittedly (somehow) large data.frame (not that large). It runs fine until it finishes and gets to the collating part, where all subsets of my data.frame have been summarized and are being reassembled into the final summary data.frame (sorry, I don't know the correct plyr terminology). During collation, my R workspace RAM usage goes from about 1.5 GB up to 20 GB, until I kill it. Running a similar piece of code that iterates manually without ddply, using a combo of lapply and do.call(rbind, ...), uses considerably less RAM (it tops out at about 8 GB). How can I use ddply more efficiently?

Longer: Here's more info:

* The data.frame itself is ~ 15.8 MB when loaded.
* ~ 400,000 rows, 8 columns

It looks like so:

       exon.start exon.width exon.width.unique exon.anno counts symbol transcript  chr
    1        4225        468                 0       utr      0 WASH5P     WASH5P chr1
    2        4833         69                 0       utr      1 WASH5P     WASH5P chr1
    3        5659        152                38       utr      1 WASH5P     WASH5P chr1
    4        6470        159                 0       utr      0 WASH5P     WASH5P chr1
    5        6721        198                 0       utr      0 WASH5P     WASH5P chr1
    6        7096        136                 0       utr      0 WASH5P     WASH5P chr1
    7        7469        137                 0       utr      0 WASH5P     WASH5P chr1
    8        7778        147                 0       utr      0 WASH5P     WASH5P chr1
    9        8131         99                 0       utr      0 WASH5P     WASH5P chr1
    10      14601        154                 0       utr      0 WASH5P     WASH5P chr1
    11      19184         50                 0       utr      0 WASH5P     WASH5P chr1
    12       4693        140                36    intron      2 WASH5P     WASH5P chr1
    13       4902        757                36    intron      1 WASH5P     WASH5P chr1
    14       5811        659               144    intron     47 WASH5P     WASH5P chr1
    15       6629         92                21    intron      1 WASH5P     WASH5P chr1
    16       6919        177                 0    intron      0 WASH5P     WASH5P chr1
    17       7232        237                35    intron      2 WASH5P     WASH5P chr1
    18       7606        172                 0    intron      0 WASH5P     WASH5P chr1
    19       7925        206                 0    intron      0 WASH5P     WASH5P chr1
    20       8230       6371               109    intron     67 WASH5P     WASH5P chr1
    21      14755       4429                55    intron     12 WASH5P     WASH5P chr1
    ...

I'm ply-ing over the transcript column, and the function transforms each such subset of the data.frame into a new data.frame that is just 1 row per transcript and basically holds the sum of the counts for each transcript. The code would look something like this (`summaries` is the data.frame I'm referring to):

    rpkm <- ddply(summaries, .(transcript), function(df) {
      data.frame(symbol=df$symbol[1], counts=sum(df$counts))
    })

(It actually calculates 2 more columns that are returned in the data.frame, but I'm not sure that's really important here.)

To test some things out, I've written another function to manually iterate over and create subsets of my data.frame to summarize. I'm using sqldf to dump the data.frame into a db, then I lapply over subsets of the db `where transcript=x` to summarize each subset of my data into a list of single-row data.frames (like ddply is doing), and finish with a `do.call(rbind, the.dfs)` on this list. This returns exactly the same result ddply would return, and by the time `do.call` finishes, my RAM usage hits about 8 GB.

So, what am I doing wrong with ddply that makes the difference in RAM usage in the last step (collation, the equivalent of my final `do.call(rbind, my.dfs)`) more than 12 GB?

Thanks,
-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
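For the specific case of summing one column per group, base R can also do the job without building one small data.frame per transcript. The following is only a sketch on invented toy data (the column names match the post, the values do not), and I have not measured its memory use on a 400,000-row table.

```r
# Toy stand-in for `summaries`; values are invented for illustration.
summaries <- data.frame(
  transcript = c("T1", "T1", "T2", "T2", "T2"),
  symbol     = c("G1", "G1", "G2", "G2", "G2"),
  counts     = c(0, 1, 2, 47, 1),
  stringsAsFactors = FALSE)

# One row per transcript/symbol pair with the summed counts.
per_transcript <- aggregate(counts ~ transcript + symbol,
                            data = summaries, FUN = sum)

# rowsum() returns a one-column matrix of group sums, with the
# transcript names as row names.
totals <- rowsum(summaries$counts, summaries$transcript)
```

Both calls compute the group sums in a single pass over the vectors, which is the part of the ddply workflow that the `do.call(rbind, ...)` step otherwise has to reassemble from many pieces.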
[R] Simple loop code
Hi fellow R Users,

I find that I typically rewrite my code for each specific column of data, which is by no means efficient, and I am struggling to break out of this bad habit and utilise some of the excellent things R can do! I have tried to look at 'for' but I don't really follow it, and I wondered if anyone could help with a simple example using my script so I could follow this and build on it. For example, I want to change an ID code from alphanumeric to numeric. The example below works, but takes ages to do manually, given I have a lot of IDs! Any thoughts on how to create a loop to go through each ID and give them a unique number would be most welcome!

Cheers,
Ross

    levels(dat.ID$ID2)[levels(dat.ID$ID2)=='A1'] <- 1
    levels(dat.ID$ID2)[levels(dat.ID$ID2)=='A2'] <- 2
    levels(dat.ID$ID2)[levels(dat.ID$ID2)=='D1'] <- 3
    levels(dat.ID$ID2)[levels(dat.ID$ID2)=='D2'] <- 4
    levels(dat.ID$ID2)[levels(dat.ID$ID2)=='D4'] <- 5
    levels(dat.ID$ID2)[levels(dat.ID$ID2)=='D5'] <- 6
    levels(dat.ID$ID2)[levels(dat.ID$ID2)=='D6'] <- 7

--
View this message in context: http://r.789695.n4.nabble.com/Simple-loop-code-tp2075322p2075322.html
Sent from the R help mailing list archive at Nabble.com.
[R] Changing from 32-bit builds to 64-bit builds
Hi,

Probably this is a very simple question for all the programmers, but how do you change from 32-bit builds (the default) to 64-bit builds? I've been trying to run Anova to compare two models, but I get the following error message:

    Error: cannot allocate vector of size 1.2 Gb
    R(3122,0xa0ab44e0) malloc: *** mmap(size=1337688064) failed (error code=12)
    *** error: can't allocate region
    *** set a breakpoint in malloc_error_break to debug
    R(3122,0xa0ab44e0) malloc: *** mmap(size=1337688064) failed (error code=12)
    *** error: can't allocate region
    *** set a breakpoint in malloc_error_break to debug

I suppose it's a problem with memory allocation because of the big data size, so I thought I should use 64-bit builds instead of 32. As was recommended in a manual, I've entered the following:

    CC='gcc -arch x86_64'
    CXX='g++ -arch x86_64'
    F77='gfortran -arch x86_64'
    FC='gfortran -arch x86_64'
    OBJC='gcc -arch x86_64'

But it still gave me the error. I'd greatly appreciate it if someone could answer this question!

Thank you,
Sachi
Re: [R] Simple loop code
Try this:

    factor(dat.ID$ID2, labels = 1:7)

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
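Since the original question asked for a loop, here is a small sketch showing both routes on made-up IDs: the vectorised conversion and an explicit `for` loop that produces the same codes. Note that `factor(x, labels = 1:7)` assumes there are exactly seven levels, whereas `as.integer()` works for any number of them.

```r
# Made-up IDs in the style of the post.
ID2 <- factor(c("A1", "D2", "A2", "D1", "A1", "D6"))

# Vectorised: each value becomes the position of its level in levels(ID2).
codes <- as.integer(ID2)

# The same thing written as an explicit loop over the levels.
lev <- levels(ID2)
codes_loop <- integer(length(ID2))
for (i in seq_along(lev)) {
  codes_loop[ID2 == lev[i]] <- i   # every row with this ID gets number i
}

# Lookup table from original ID to numeric code.
lookup <- data.frame(ID = lev, code = seq_along(lev))
```

The numbering follows the sorted order of the levels, so "A1" becomes 1, "A2" becomes 2, and so on. Keeping `lookup` around makes it possible to translate the numbers back later.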
Re: [R] using get and paste in a loop to return objects for object names listed a strings
Thanks for that, the package looks very useful. It gave me the answer in a roundabout way: it reminded me I needed to use attach() so that get() was only dealing with the objects in the data.frame, rather than using data.frame$factorname. I therefore managed to sort out a workaround, but will be looking at ggplot2 for other things. The workaround and the head of the data file are shown below.

    head(data.all)
    Line Capture_ID Landscape_Name Band_text Bird_IDDate CODE_LETTERS Site_Name Age_Class SEX_ Capture_Number Mass Season SEASON_CLASS EVC Moult Sine_Julian Cosine_Julian HCT Hb Site_Cond Logs_Length
    10 10 10 Axe Creek 42605012275 7/11/2007 0:00 YTH Ax1 AF 1 21.5 Spring SS Heathy Dry Forest N -0.80 0.59 0.48 NA43 5
    13 13 13 Axe Creek 37136021170 8/11/2007 0:00 YTH Ax1 AF 1 21.5 Spring SS Heathy Dry Forest N -0.79 0.61 0.53 20.443 5
    19 19 21 Axe Creek 37136031171 9/11/2007 0:00 YTH Ax1 AF 1 19.5 Spring SS Heathy Dry Forest N -0.78 0.62 0.53 NA43 5
    30 30 34 Axe Creek 37136041172 10/11/2007 0:00 YTH Ax1 UM 1 24.5 Spring SS Heathy Dry Forest Y -0.76 0.63 NA NA43 5
    31 31 35 Axe Creek 37136051173 10/11/2007 0:00 YTH Ax1 UU 1 NA Spring SS Heathy Dry Forest U -0.76 0.63 NA NA43 5
    32 32 36 Axe Creek 37136061174 10/11/2007 0:00 YTH Ax1 UM 1 23.5 Spring SS Heathy Dry Forest U -0.76 0.63 0.50 NA43 5
    Litter_Cov Understorey TreeCov H.L WBC BCI CCIPca1 YEAR Hab_Config
    10 22.5 650.35 NA NA NANA 2007 D
    13 22.5 650.35 NA NA -3.11592 0.6215803 2007 D
    19 22.5 650.35 NA NA NANA 2007 D
    30 22.5 650.35 NA NA NANA 2007 D
    31 22.5 650.35 NA NA NANA 2007 D
    32 22.5 650.35 NA NA NANA 2007 D

    sp.codes=levels(data.all$CODE_LETTERS)
    for(spp in sp.codes) {
      data.sp=subset(data.all,CODE_LETTERS==spp)
      responses = colnames(data.all)[c(20,28,29,19)]
      #if (spp==BT) responses = colnames(data.all)[c(19]#,20,26:29)]
      groups=colnames (data.all)[c(9,10)]# ,13,16,30
      attach(data.sp)
      for (response in responses){
        for (group in groups){
          g=get(group)
          r=get(response)
          boxplot(r ~g, main=spp,xlab=group,ylab=response)
        }
      }
      detach(data.sp)
    }

On 29/04/2010 7:05 PM, Paul Hiemstra wrote:

Nevil Amos wrote:

I am trying to create a heap of boxplots by looping through a series of factors and variables in a large data.frame, using paste to construct the factor and response names from the colnames. I thought I could do this using get(), however it is not working. What am I doing wrong?

You don't give a reproducible example, which makes it hard to answer your question. But, not really in response to your question, take a look at histogram from the lattice package or geom_boxplot from the ggplot2 package. These functions can do all the work of drawing boxplots for a series of factors and variables in a large data.frame for you. This saves you a lot of time.

cheers,
Paul
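The attach()/get() workaround can usually be replaced by `[[` indexing with the column name held in a variable, which avoids changing the search path inside the loop. The sketch below uses invented toy data and made-up column names (`species`, `sex`, `age`, `mass`, `hct`) in place of `data.all`, and writes the plots to a temporary PDF so that it runs unattended.

```r
set.seed(1)  # toy data standing in for data.all; all names and values are made up
dat <- data.frame(
  species = rep(c("YTH", "BT"), each = 20),
  sex     = rep(c("F", "M"), times = 20),
  age     = rep(c("A", "U"), each = 10, times = 2),
  mass    = rnorm(40, mean = 22, sd = 2),
  hct     = rnorm(40, mean = 0.5, sd = 0.03))

responses <- c("mass", "hct")
groups    <- c("sex", "age")
stats     <- list()

pdf(file.path(tempdir(), "boxplots.pdf"))  # one page per plot
for (spp in unique(dat$species)) {
  d <- dat[dat$species == spp, ]
  for (response in responses) {
    for (group in groups) {
      r <- d[[response]]   # column chosen by name at run time, no attach() or get()
      g <- d[[group]]
      key <- paste(spp, response, group, sep = ".")
      stats[[key]] <- boxplot(r ~ g, main = spp, xlab = group, ylab = response)
    }
  }
}
dev.off()
```

`boxplot()` returns its summary statistics invisibly, so collecting them in `stats` is optional; it is done here only to make the result of each iteration inspectable.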
Re: [R] randomness in stepclass (klaR) or lda (MASS) ?
On Thu, 2010-04-29 at 15:08 +0200, Uwe Ligges wrote:

Well, it is called cross validation, which is based on random sampling if you do not have k=n-fold CV (= leave-one-out). Again, to get reproducible results, you will need to set a seed.

Thank you. I thought that leave-one-out was the default. I looked at the reference file and I am not sure how to get it. Is that by setting fold=1?

If the results are that unstable: Do you really have a sufficient number of observations for your classification problem?

You're probably right.

e.e.

Eric Elguero
Laboratory Genetics and Evolution of Infectious Diseases, Team: Genetics and Adaptation of Plasmodium
UMR 2724 CNRS-IRD, IRD Montpellier, 911 Avenue Agropolis, BP 64501, 34394 Montpellier Cedex 5, France
Re: [R] time zone convert
Appreciate it! I was trying the code you sent, and then an error turned up. The first line runs ok; the second line gives:

    modifyList(Time2, list(hour = Time2$hour + 11))
    Error in Time2$hour : $ operator is invalid for atomic vectors

The time format I used for reading the Time vector is %d-%b-%Y %H:%M:%OS. Should I change any of the code above?

Carol

On Thu, Apr 29, 2010 at 11:47 PM, Henrique Dallazuanna www...@gmail.com wrote:

Try this:

    Time2 <- gsub("\\.*", "", tolower(Time))
    modifyList(Time2, list(hour = Time2$hour + 11))
Re: [R] time zone convert
Oops, I sent you the wrong code. Try this instead:

    Time2 <- strptime(Time, '%d-%b-%Y %H:%M:%S')
    modifyList(Time2, list(hour = Time2$hour + 11))

On Thu, Apr 29, 2010 at 11:14 AM, Carol Gao carol.g...@gmail.com wrote:

Appreciate it! I was trying the code you sent, and then an error turned up. The first line runs ok; the second line gives:

    modifyList(Time2, list(hour = Time2$hour + 11))
    Error in Time2$hour : $ operator is invalid for atomic vectors

The time format I used for reading the Time vector is %d-%b-%Y %H:%M:%OS. Should I change any of the code above?

Carol

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
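An alternative to editing the `hour` field of a POSIXlt object is to parse into POSIXct and add the offset in seconds. This is a sketch under two assumptions of mine: the session uses English month abbreviations (so that `%b` matches "DEC"), and the timestamps can be treated as UTC. The `%OS` conversion keeps the fractional seconds that `%S` would drop.

```r
# Three of the timestamps from the original post.
Time <- c("01-DEC-2008 00:00:28.611",
          "01-DEC-2008 00:23:46.452",
          "03-DEC-2008 00:22:17.447")

# %OS reads seconds including the fractional part; tz = "UTC" is an assumption.
parsed <- as.POSIXct(Time, format = "%d-%b-%Y %H:%M:%OS", tz = "UTC")

# POSIXct counts seconds, so 11 hours is 11 * 3600.
shifted <- parsed + 11 * 3600

# Back to text; %OS3 prints seconds to three decimal places.
shifted_text <- format(shifted, "%d-%b-%Y %H:%M:%OS3")
```

If the goal is really a time zone conversion rather than a fixed offset, `format(parsed, tz = ...)` with a named zone may be the better fit, because a fixed 11 hours ignores daylight saving changes.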
Re: [R] randomness in stepclass (klaR) or lda (MASS) ?
On 29.04.2010 16:04, Eric Elguero wrote:

On Thu, 2010-04-29 at 15:08 +0200, Uwe Ligges wrote:

Well, it is called cross validation, which is based on random sampling if you do not have k=n-fold CV (= leave-one-out). Again, to get reproducible results, you will need to set a seed.

Thank you. I thought that leave-one-out was the default.

As you can see in ?stepclass:

    fold: parameter for cross-validation; omitted if 'cv.groups' is specified.

and the Usage line tells us:

    ..., fold = 10, ...

hence 10-fold is the default.

I looked at the reference file and I am not sure how to get it. Is that by setting fold=1?

No, leave-one-out is n-fold, hence you need n!

Uwe Ligges

If the results are that unstable: Do you really have a sufficient number of observations for your classification problem?

You're probably right.

e.e.

Eric Elguero
Laboratory Genetics and Evolution of Infectious Diseases, Team: Genetics and Adaptation of Plasmodium
UMR 2724 CNRS-IRD, IRD Montpellier, 911 Avenue Agropolis, BP 64501, 34394 Montpellier Cedex 5, France
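The two points made above (random fold assignment needs a seed, and leave-one-out means k = n) can be illustrated without klaR. The helper `make_folds` below is hypothetical and only mimics the general idea of k-fold assignment; it is not the code that stepclass uses internally.

```r
n <- 12   # number of observations (arbitrary for the illustration)
k <- 4    # number of folds

# Hypothetical helper: shuffle the fold labels 1..k across the n observations.
make_folds <- function(n, k) sample(rep(seq_len(k), length.out = n))

# Same seed, same split: this is what makes a k-fold run reproducible.
set.seed(42)
folds_a <- make_folds(n, k)
set.seed(42)
folds_b <- make_folds(n, k)

# Leave-one-out is the special case k = n: every fold holds exactly one
# observation, so the partition itself no longer depends on the random draw.
loo <- make_folds(n, n)
```

With stepclass itself, the equivalent steps would be calling `set.seed()` immediately before `stepclass(...)`, and passing `fold` equal to the number of observations to get leave-one-out, as stated in the reply above.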
Re: [R] Compact Patricia Trees (Tries)
Gabor,

Thanks for the suggestion, I'll try it out tonight or tomorrow.

Regards,
Richard

_
Richard R. Liu
Dittingerstr. 33
CH-4053 Basel
Switzerland
Tel. +41 79 708 67 66
Sent from my iPhone 3GS

On Apr 29, 2010, at 13:06, Gabor Grothendieck ggrothendi...@gmail.com wrote:

Using charmatch, partial matches of 10,000 5-letter words to the same list can be done in 10 seconds on my machine, and 10,000 5-letter words to 100,000 10-letter words in 1 minute. Is that good enough? Try this simulation:

    # generate N random words each k long
    rwords <- function(N, k) {
      L <- sample(letters, N*k, replace = TRUE)
      apply(matrix(L, k), 2, paste, collapse = "")
    }
    w1 <- rwords(1e5, 10)
    w2 <- rwords(1e4, 5)
    system.time(charmatch(w2, w2))
    system.time(charmatch(w2, w1))

On Thu, Apr 29, 2010 at 4:05 AM, Richard R. Liu richard@pueo-owl.ch wrote:

I have an application that searches a long list of character strings to determine which occur at the beginning of a given word. A straightforward R script using grep takes a long time to run. Rewriting it to use substr and match might be an option, but I have the impression that preparing the list as a trie and performing trie searches might lead to dramatic improvements in performance. I have searched the CRAN packages and find no packages that support Compact Patricia Trees. Does anybody know of such?

Thanks,
Richard

Richard R. Liu
richard@pueo-owl.ch
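For the task as originally described (which strings from a list occur at the beginning of a given word), current versions of R offer `startsWith()`, which is vectorised over the prefixes. It was added in R 3.3.0, so it was not available when this thread was written, and I have not benchmarked it against a trie. The prefixes and words below are invented for illustration.

```r
# Invented example data.
prefixes <- c("in", "inter", "nation", "int", "xy")
word <- "international"

# startsWith(x, prefix) recycles x against the vector of prefixes and
# returns one logical per prefix.
hits <- prefixes[startsWith(word, prefixes)]

# For several words, apply the same test to each of them.
words <- c("international", "xylophone", "banana")
hits_by_word <- lapply(words, function(w) prefixes[startsWith(w, prefixes)])
names(hits_by_word) <- words
```

Note that this answers a slightly different question from the `charmatch` simulation above, which looks for table entries that begin with the given string rather than for list entries that are prefixes of the word.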
Re: [R] R Anova Analysis
Hi:

It strikes me as a little curious that the No Virus values in each of your example data sets are all *exactly* 1. Why is that?

Dennis
[R] merge on criteria
Hi,

I have two files (file1.txt and file2.txt) which I would like to merge based on certain criteria, i.e. combining data based on matching GeneID and Exons. I have used the merge option, but it does not give me the desired outcome. merged.txt shows the result I would like.

*File1.txt*

      AffyProbe          ProbeType Flag GeneSymbol GeneID Exons Chrom Strand Affytart AffyEnd
    1 1007_s_at:1105:483 0         0    DDR1       780    21    6     +      30975403 30975427
    2 1007_s_at:1119:177 0         0    DDR1       780    21    6     +      30975549 30975573
    3 1007_s_at:1136:469 0         0    DDR1       780    21    6     +      30975766 30975790
    4 1007_s_at:192:205  0         0    DDR1       780    21    6     +      30975523 30975547
    5 1007_s_at:474:1161 0         0    DDR1       780    21    6     +      30975745 30975769
    6 1007_s_at:504:983  0         0    DDR1       780    21    6     +      30975575 30975599
    7 1007_s_at:50:779   0         0    DDR1       780    21    6     +      30975758 30975782

*File2.txt*

       AgilentProbe ProbeType Flag GeneSymbol GeneID Exons Chrom Strand AgilentStart AgilentEnd
    1  A_23_P11     0         0    FAM174B    400451 5     15    -      90961852     90961793
    2  A_23_P100022 0         0    SV2B       9899   14    15    +      89639333     89639392
    3  A_23_P100056 0         0    RBPMS2     348093 8     15    -      62819428     62819369
    4  A_23_P100074 0         0    AVEN       57099  6     15    -      31946031     31945972
    5  A_23_P100092 0         0    ZSCAN29    146050 5     15    -      41440680     41440621
    6  A_23_P100103 0         0    VPS39      23339  24    15    -      40240319     40240260
    7  A_23_P100111 0         0    CHP        11261  7     15    +      39358845     39358904
    8  A_23_P100127 0         0    CASC5      57082  11    15    +      38704817     38704876
    9  A_23_P100133 0         0    ATMIN      23300  4     16    +      79636596     79636655
    10 A_23_P100141 0         0    UNKL       64718  12    16    -      1355346      1355287

*merged.txt (should look like this)*

    GeneSymbol GeneID Exons Chrome AffyMatrixProbeID AffyStart AffyEnd AgilentProbeID AgilentStart AgilentEnd
    DDR1 780 21 6 A_24_P123601 30975848 30975907
    RFC2 5982 10 7 1053_at:120:925, 1053_at:504:41, 1053_at:522:871, 1053_at:828:1025, 203696_s_at:291:651 73287845, 73287869, 73287863, 73287881, 73287850 73287821, 73287845, 73287839, 73287857, 73287826 A_23_P93823 73287861 73287802
    RFC2 5982 11 7
    HSPA6 3310 1 1 A_23_P114903 159762782 159762841
    PAX8 7849 12 2 A_23_P210001 113691555 113691496
    GUCA1A 2978 6 6
    UBA7 7318 24 3 1294_at:1079:379, 1294_at:361:881, 203281_s_at:524:889, 203281_s_at:678:1017, 203281_s_at:68:1153 49818386, 49818398, 49818378, 49818434, 49818422 49818362, 49818374, 49818354, 49818420, 49818398

Sorry for the long tables, thanks,

Alex
Student
University of Colorado
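One way to get a table shaped like merged.txt, sketched here on a few invented rows, is to first collapse the several Affymetrix probes per gene/exon into a single comma-separated string and then do a full outer merge on the key columns. The column names follow the post; treating GeneSymbol, GeneID and Exons together as the key is my assumption about the intended matching rule.

```r
# Small invented extracts with the column names used in the post.
affy <- data.frame(
  GeneSymbol = c("RFC2", "RFC2", "UBA7"),
  GeneID     = c(5982, 5982, 7318),
  Exons      = c(10, 10, 24),
  AffyProbe  = c("1053_at:120:925", "1053_at:504:41", "1294_at:1079:379"),
  stringsAsFactors = FALSE)

agilent <- data.frame(
  GeneSymbol   = c("RFC2", "HSPA6"),
  GeneID       = c(5982, 3310),
  Exons        = c(10, 1),
  AgilentProbe = c("A_23_P93823", "A_23_P114903"),
  stringsAsFactors = FALSE)

# Step 1: one row per gene/exon, with all Affy probes pasted together.
affy_collapsed <- aggregate(AffyProbe ~ GeneSymbol + GeneID + Exons,
                            data = affy, FUN = paste, collapse = ", ")

# Step 2: full outer join, so gene/exon pairs present in only one file are
# kept, with NA in the columns coming from the other file.
merged <- merge(affy_collapsed, agilent,
                by = c("GeneSymbol", "GeneID", "Exons"), all = TRUE)
```

The start and end coordinates could be collapsed the same way by listing them on the left-hand side via `cbind(...)`, or by calling `aggregate` once per column. Rows that should appear with both probe columns empty (such as GUCA1A in the desired output) would need a third source listing all gene/exon pairs.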
[R] Generalized Estimating Equation (GEE): Why is Link = Identity?
Hi,

I'm running GEE using geepack. I set corstr = "ar1" as below:

    > m.ar <- geeglm(L ~ O + A,
    +                data = firstgrouptxt, id = id,
    +                family = binomial, corstr = "ar1")
    > summary(m.ar)

    Call:
    geeglm(formula = L ~ O + A, family = binomial, data = firstgrouptxt,
        id = id, corstr = "ar1")

     Coefficients:
                Estimate  Std.err    Wald Pr(>|W|)
    (Intercept) -2.62516  0.21154 154.001   <2e-16 ***
    ontask       0.00498  0.12143   0.002   0.9673
    attachmentB  0.73216  0.35381   4.282   0.0385 *
    attachmentC  0.25960  0.33579   0.598   0.4395
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Estimated Scale Parameters:
                Estimate Std.err
    (Intercept)    1.277  0.3538

    Correlation: Structure = ar1  Link = identity

    Estimated Correlation Parameters:
          Estimate  Std.err
    alpha    0.978 0.005725
    Number of clusters:   49   Maximum cluster size: 533

Then, it shows that:

    Correlation: Link = identity

Why is it not Link = logit?

Thank you,
Sachi
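One possible reading, offered as an assumption rather than as an answer given in the thread: the "Link = identity" entry is printed under the "Correlation:" heading, so it most likely describes the link used for the correlation parameter (alpha), not the link of the mean model. The mean model's link comes from the `family` argument, and for `binomial` the default can be checked in base R without geepack:

```r
# The family object carries the link used for the mean model.
fam <- binomial()
mean_link <- fam$link

# A different link has to be requested explicitly, for example probit.
probit_link <- binomial(link = "probit")$link
```

Since `family = binomial` was passed without a `link` argument in the call above, the coefficients in the summary would then be on the logit scale.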
Re: [R] Changing from 32-bit builds to 64-bit builds
What OS? If Linux, which distribution?

For the more common platforms, there are pre-built 64-bit binary versions of R available, including Windows:

    http://cran.r-project.org/bin/windows64/

Also, moving to 64-bit to take advantage of a larger memory space presumes that you also have the physical memory available on your computer.

If you successfully built the 64-bit version on your system, it is possible that you still have the 32-bit version installed and that is what is being run. You can check this by using:

    .Machine$sizeof.pointer

If it returns 4, you are running 32-bit R, and if it returns 8, 64-bit.

HTH,

Marc Schwartz
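The check described above can be wrapped into a couple of lines that report the result in bits, together with the architecture string that R records about its own build:

```r
# Pointer size in bytes times 8 gives the address width of the running R.
bits <- 8 * .Machine$sizeof.pointer

# Architecture R was built for, e.g. "x86_64" or "i386".
arch <- R.version$arch

message("This R session is ", bits, "-bit (", arch, ")")
```

Running this in the session that produced the allocation error shows whether the 64-bit build is actually the one being started.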
Re: [R] Sweave question
Thanks Duncan, it does exactly what I want. How do I get my options back to print graphics on the computer screen? I tried options(device=screen) but it didn't work.

Felipe D. Carrillo
Supervisory Fishery Biologist
Department of the Interior
US Fish & Wildlife Service
California, USA

----- Original Message ----
From: Duncan Murdoch murdoch.dun...@gmail.com
To: Felipe Carrillo mazatlanmex...@yahoo.com
Cc: r-h...@stat.math.ethz.ch
Sent: Thu, April 29, 2010 4:12:58 AM
Subject: Re: [R] Sweave question

On 28/04/2010 11:31 PM, Felipe Carrillo wrote:

> Hi:
> I am using Sweave and texi2dvi to generate a LaTeX document but can't find the way to hide the graphics while the R chunks are being executed. I thought results=hide would do it but that's not the case.

Sweave runs figure chunks multiple times. The first time is probably what you're seeing: it just runs the code, with no special devices created. You need to tell R to use something other than your screen as the default device for this. That's what happens if you run Sweave in batch mode, or if you choose options(device="pdf"). (You'll get a file Rplots.pdf created.)

Duncan Murdoch

> If I do:
>
> \begin{figure}[h]
> <<figA=true,echo=F,fig=T,results=hide>>=
> a <- rnorm(1000)
> plot(a)
> @
> \caption{Weekly estimates.}
> \label{figure:ggplot1}
> \end{figure}
>
> The graphic doesn't get displayed but gets printed on the document, but the code below shows the graphic... how can I hide it??
>
> \begin{figure}[h]
> <<figA=true,echo=F,fig=T,results=hide>>=
> library(ggplot2)
> winter <- read.csv("Winter_AllYears.csv")
> wintermelt <- melt(winter, id="week")
> print(ggplot(wintermelt, aes(week, value/1000)) +
>   geom_line(aes(colour=variable)) +
>   opts(legend.position="none") +
>   facet_wrap(~variable, ncol=2) +
>   opts(title="Winter") +
>   labs(y="Number of fish X 1,000", x="WEEK"))
> @
> \caption{Weekly estimates.}
> \label{figure:ggplot1}
> \end{figure}
>
> Felipe D. Carrillo
> Supervisory Fishery Biologist
> Department of the Interior
> US Fish & Wildlife Service
> California, USA
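One way to get the previous default device back is to keep the value that options() returns when the setting is changed. The lines below are a minimal sketch using base R only; "report.Rnw" is a hypothetical file name, and the variable names are illustrative. Note that the on-screen device has a platform-specific name ("windows", "X11" or "quartz"), so restoring the saved value avoids having to guess it.

```r
# Remember what the default device was before touching it.
before <- getOption("device")

# options() invisibly returns the previous values, so keep them.
old <- options(device = "pdf")

# ... run the Sweave step here, e.g. Sweave("report.Rnw") (hypothetical file) ...

# Put the saved setting back; new plots go to the original device again.
options(old)
identical(getOption("device"), before)
```

Saving and restoring keeps the code portable across platforms, since it never names the screen device explicitly.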
Re: [R] Simple loop code
Thanks Henrique, that works! For anyone else as slow as me, just:

##Assign
x <- factor(dat.ID$ID2, labels = 1:7)
##Convert to dataframe
x <- as.data.frame(x)
##Then bind to your data
z <- cbind(y,x)

Thanks again, I expected it to be more complicated!

Cheers,

Ross

--
View this message in context: http://r.789695.n4.nabble.com/Simple-loop-code-tp2075322p2075586.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] time zone convert
I tried your new lines with a random time, and they seem to work perfectly well:

z <- strptime("20/2/06 23:16:16.683", "%d/%m/%y %H:%M:%OS")
modifyList(z, list(hour = z$hour + 11))
[1] "2006-02-21 10:16:16"

Now it seems that I have some problem with my Time vector. Time was created by the following code:

Time <- paste(anz$Date.G., anz$Time.G.)

The original data looks like the following, with the rows of the two columns corresponding to each other.

Date.G.
01-DEC-2008
01-DEC-2008
02-DEC-2008
03-DEC-2008
04-DEC-2008
...

Time.G.
00:03:57.398
00:04:03.778
00:04:38.639
00:04:38.639
00:04:38.639
...

Somehow, I can't read Time in strptime(Time, "%d-%b-%Y %H:%M:%OS"). Do you know what was wrong with it?

Sorry for asking such questions, as I am quite new to R. Thanks for helping me out.

Carol

On Fri, Apr 30, 2010 at 12:16 AM, Henrique Dallazuanna www...@gmail.com wrote:

> Oops, I sent you the wrong code, try this instead:
>
> Time2 <- strptime(Time, '%d-%b-%Y %H:%M:%S')
> modifyList(Time2, list(hour = Time2$hour + 11))
>
> On Thu, Apr 29, 2010 at 11:14 AM, Carol Gao carol.g...@gmail.com wrote:
>
>> Appreciate it!
>>
>> I was trying the code you sent, then some errors turned up. The first line runs ok, the second line:
>>
>> modifyList(Time2, list(hour = Time2$hour + 11))
>> Error in Time2$hour : $ operator is invalid for atomic vectors
>>
>> The time format I used for reading the Time vector is "%d-%b-%Y %H:%M:%OS". Should I change any code above?
>>
>> Carol
>>
>> On Thu, Apr 29, 2010 at 11:47 PM, Henrique Dallazuanna www...@gmail.com wrote:
>>
>>> Try this:
>>>
>>> Time2 <- gsub("\\.*", "", tolower(Time))
>>> modifyList(Time2, list(hour = Time2$hour + 11))
>>>
>>> On Thu, Apr 29, 2010 at 10:33 AM, Carol Gao carol.g...@gmail.com wrote:
>>>
>>>> Hi there,
>>>>
>>>> I've got a column vector in a csv file as follows, and I need to add 11 hours to each of them. Is there a way that I can do it? (The actual file size is much bigger than this.)
>>>>
>>>> Time
>>>> 01-DEC-2008 00:00:28.611
>>>> 01-DEC-2008 00:00:43.155
>>>> 01-DEC-2008 00:01:06.677
>>>> 01-DEC-2008 00:01:06.677
>>>> 01-DEC-2008 00:01:06.677
>>>> 01-DEC-2008 00:01:06.919
>>>> 01-DEC-2008 00:23:46.452
>>>> 02-DEC-2008 00:03:17.646
>>>> 02-DEC-2008 00:03:17.652
>>>> 03-DEC-2008 00:15:11.485
>>>> 03-DEC-2008 00:18:44.652
>>>> 03-DEC-2008 00:22:17.447
>>>>
>>>> Thank you in advance.
>>>>
>>>> Cheers,
>>>> Carol
>>>
>>> --
>>> Henrique Dallazuanna
>>> Curitiba-Paraná-Brasil
>>> 25° 25' 40" S 49° 16' 22" O
Re: [R] time zone convert
On Thu, Apr 29, 2010 at 11:44 AM, Carol Gao carol.g...@gmail.com wrote:

> I tried your new lines with some random time, it seems to be working perfectly well, just as follows:
>
> z <- strptime("20/2/06 23:16:16.683", "%d/%m/%y %H:%M:%OS")
> modifyList(z, list(hour = z$hour + 11))
> [1] "2006-02-21 10:16:16"
>
> Now it seems that I have some problem with my Time vector. As Time was created by the following code:
>
> Time <- paste(anz$Date.G., anz$Time.G.)
>
> The original data looks like the following with each row correspond to each.
>
> Date.G.
> 01-DEC-2008
> 01-DEC-2008
> 02-DEC-2008
> 03-DEC-2008
> 04-DEC-2008
> ...
>
> Time.G.
> 00:03:57.398
> 00:04:03.778
> 00:04:38.639
> 00:04:38.639
> 00:04:38.639
> ...
>
> Somehow, I can't read Time in strptime(Time, "%d-%b-%Y %H:%M:%OS"). Do you know what was wrong with it?

Why not?

anz <- data.frame(Date.G = c("01-DEC-2008", "01-DEC-2008", "02-DEC-2008", "03-DEC-2008", "04-DEC-2008"),
                  Time.G = c("00:03:57.398", "00:04:03.778", "00:04:38.639", "00:04:38.639", "00:04:38.639"))

Time <- strptime(paste(anz$Date.G, anz$Time.G), '%d-%b-%Y %H:%M:%S')
modifyList(Time, list(hour = Time$hour + 11))

> Sorry for asking such questions, as I am quite new to R. Thanks for helping me out.
>
> Carol

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
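An alternative to editing the hour field is to convert the parsed times to POSIXct and add seconds. The following is a minimal sketch, not the code from the thread: the two sample strings are taken from or modelled on the question, the names `parsed` and `shifted` are illustrative, and UTC is an assumed time zone chosen so that no daylight-saving shift interferes.

```r
# Two sample values in the format from the question.
Time <- c("01-DEC-2008 00:00:28.611", "01-DEC-2008 23:30:00.000")

# %b matches month abbreviations of the current locale, so switch
# LC_TIME to "C" (English names) while parsing, then switch back.
oldLocale <- Sys.getlocale("LC_TIME")
Sys.setlocale("LC_TIME", "C")
parsed <- as.POSIXct(strptime(Time, "%d-%b-%Y %H:%M:%OS", tz = "UTC"))
Sys.setlocale("LC_TIME", oldLocale)

# POSIXct counts seconds, so 11 hours is 11 * 3600 seconds.
shifted <- parsed + 11 * 3600
format(shifted, "%Y-%m-%d %H:%M:%S")
```

Arithmetic on POSIXct rolls over midnight automatically, so the second value lands on the next calendar day.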
[R] EBCDIC
Does R have a package or function that can read a file that has been downloaded from a mainframe in EBCDIC format?

Thanks,
Mike
[R] Fidelity of lattice graphics captured to jpeg or png
I am generating images via lattice from Frank Harrell's rms package. These images are characterized by coloured lines and grey-scale confidence intervals. I need to port them to OpenOffice etc., and have tried both png and jpeg (at high quality), but in neither format can I subsequently see the grey-scale confidence intervals.

Other than moving to LaTeX, does anyone have other suggestions?
Re: [R] time zone convert
That's weird. I opened a new R window and pasted your code, and it shows:

> anz1 <- data.frame(Date.G = c("01-DEC-2008", "01-DEC-2008", "02-DEC-2008", "03-DEC-2008", "04-DEC-2008"),
+                    Time.G = c("00:03:57.398", "00:04:03.778", "00:04:38.639", "00:04:38.639", "00:04:38.639"))
> Time <- strptime(paste(anz1$Date.G, anz1$Time.G), '%d-%b-%Y %H:%M:%S')
> modifyList(Time, list(hour = Time$hour + 11))
[1] NA NA NA NA NA

What could possibly be the reason for that?
Re: [R] EBCDIC
Perhaps ?read.table, ?file, and ?iconv will offer some information about how to use different encodings in R.

Michael

Steven Rooney wrote:

> Does R have package/function that can read a file that has been downloaded from a mainframe in EBCDIC format?
>
> Thanks,
> Mike
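As a rough illustration of the iconv route, the sketch below converts a few EBCDIC bytes to text. It rests on assumptions that are not in the thread: that the data use code page 037, and that the platform's iconv knows that code page under the name "IBM037" (the available names differ between systems, which is why the code checks iconvlist() first). The byte values spell a sample word and are for illustration only.

```r
# The word "HELLO" encoded in EBCDIC code page 037 (illustrative bytes).
ebcdic <- as.raw(c(0xC8, 0xC5, 0xD3, 0xD3, 0xD6))

# Assumption: the platform's iconv calls this code page "IBM037".
enc <- "IBM037"

if (enc %in% iconvlist()) {
  # iconv() accepts a list of raw vectors and returns character strings.
  txt <- iconv(list(ebcdic), from = enc, to = "UTF-8")
  print(txt)
} else {
  message("Encoding ", enc, " is not offered by this platform's iconv; ",
          "look through iconvlist() for an equivalent name.")
}
```

For a plain text file, the same encoding name can be given as the encoding argument of file() or read.table(). Mainframe extracts with fixed-length records or packed-decimal fields would probably need readBin() and field-by-field conversion instead.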
Re: [R] Simple loop code
On Apr 29, 2010, at 10:37 AM, RCulloch wrote:

> Thanks Henrique, that works! for anyone else as slow as me, just:
>
> ##Assign
> x <- factor(dat.ID$ID2, labels = 1:7)
> ##Convert to dataframe
> x <- as.data.frame(x)

The more typical methods for converting a factor to a character vector would be:

(ff <- factor(substring("statistics", 1:10, 1:10), levels=letters))
levels(ff)[ff]
# [1] "s" "t" "a" "t" "i" "s" "t" "i" "c" "s"
as.character(ff)
# [1] "s" "t" "a" "t" "i" "s" "t" "i" "c" "s"

> ##Then bind to your data
> z <- cbind(y,x)

Oooh. Not a good practice, at least for the newish useR. cbind and rbind create matrices and as a consequence coerce all of their elements to be of the same type. Numeric columns would become character vectors. Not generally a desired result. This would be safer:

dat.ID$ID2.cf <- as.character( factor(dat.ID$ID2, labels = 1:7) )

--
David.

> Thanks again, I expected it to be more complicated!
>
> Cheers,
> Ross

David Winsemius, MD
West Hartford, CT
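The two points in this reply can be tried on toy data. The sketch below uses invented values and illustrative names (`m`, `y`, `z`); it is not the poster's data. It also shows a nuance: the coercion to a single type happens when every argument to cbind is an atomic vector, whereas cbind with a data frame argument returns a data frame and keeps column types.

```r
# Factor to character, two equivalent ways.
ff <- factor(substring("statistics", 1:10, 1:10), levels = letters)
viaLevels <- levels(ff)[ff]
viaAsChar <- as.character(ff)

# cbind on atomic vectors of mixed type: everything becomes character.
m <- cbind(1:3, c("a", "b", "c"))
mode(m)

# cbind with a data frame argument dispatches to the data frame method,
# so the numeric column stays numeric.
y <- data.frame(score = c(1.5, 2.5, 3.5))
z <- cbind(y, id = c("a", "b", "c"))
sapply(z, class)

# Adding the column by assignment, as suggested in the reply.
y$id <- as.character(factor(c(3, 1, 2), labels = c("a", "b", "c")))
y
```

Assigning with `$<-` makes the intent explicit and never changes the types of the existing columns.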
Re: [R] Fidelity of lattice graphics captured to jpeg or png
When I need high quality graphics from R, I usually use pdf or postscript. If you need a rasterized format, use a graphics editing program to rasterize at whatever quality you want (e.g., GIMP, which is free).

HTH,

Josh

On Thu, Apr 29, 2010 at 8:05 AM, Rob James r...@aetiologic.ca wrote:

> I am generating images via lattice from Frank Harrell's RMS package. These images are characterized by coloured lines and grey-scale confidence intervals. I need to port them to Openoffice/etc, and have tried both png and jpeg (at high quality), but in neither format can I subsequently see the grey scale confidence intervals.
>
> Other than moving to LaTeX, does anyone have other suggestions?

--
Joshua Wiley
Senior in Psychology
University of California, Riverside
http://www.joshuawiley.com/
Re: [R] Using plyr::dply more (memory) efficiently?
Hi Matthew,

On Thu, Apr 29, 2010 at 9:52 AM, Matthew Dowle mdo...@mdowle.plus.com wrote:

> I don't know about that, but try this:
>
> install.packages("data.table", repos="http://R-Forge.R-project.org")
> require(data.table)
> summaries = data.table(summaries)
> summaries[,sum(counts),by=symbol]
>
> Please let us know if that returns the correct result, and if its memory/speed is ok ?

Thanks for directing me to the data.table package. I read through some of the vignettes, and it looks quite nice.

While your sample code would provide the answer if I wanted to just compute some summary statistic/function of groups of my data.frame (using `by=symbol`), what's the best way to produce several pieces of info per subset?

For instance, I see that I can do something like this:

summaries[, list(counts=sum(counts), width=sum(exon.width)), by=symbol]

But what if I need to do some more complex processing within the subsets defined in `by=symbol`, like several lines of programming logic for 1 result, say.

I guess I can open a new block that just returns a data.table? Like:

summaries[, {
  cnts <- sum(counts)
  ew <- sum(exon.width)
  # ... some complex things
  complex <- # .. result of complex things
  data.table(counts=cnts, width=ew, cplx=complex)
}, by=symbol]

Is that right? (I mean, it looks like it's working, but maybe there's a more idiomatic way(?))

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
[R] Fidelity of lattice graphics captured to jpeg or png... followup
Subsequent investigations (via GIMP) show that the problem is in OO, and not with the images themselves. Off to the OO forums.

-------- Original Message --------
Subject: Fidelity of lattice graphics captured to jpeg or png
Date: Thu, 29 Apr 2010 08:05:04 -0700
From: Rob James r...@aetiologic.ca
To: r-help@r-project.org

I am generating images via lattice from Frank Harrell's RMS package. These images are characterized by coloured lines and grey-scale confidence intervals. I need to port them to Openoffice/etc, and have tried both png and jpeg (at high quality), but in neither format can I subsequently see the grey scale confidence intervals.

Other than moving to LaTeX, does anyone have other suggestions?
[R] generating correlated random variables from different distributions
I need to generate a set of correlated random variables for a Monte Carlo simulation. The solutions I have found (http://www.stat.uiuc.edu/stat428/cndata.html, http://www.sitmo.com/doc/Generating_Correlated_Random_Numbers), using Cholesky decomposition, seem to work only if the variables come from the same distribution with the same parameters. My situation is that each variable may be described by a different distribution (or different parameters of the same distribution). This approach does not seem to work; see code and results below. Am I missing something here?

My math/statistics is not very good. Will I need to generate correlated uniform random variables on (0,1) and then use the inverse distributions to get the results I am looking for? That is acceptable, but I would prefer to just generate the individual distributions and then correlate them. Any advice much appreciated.

Thanks in advance

R. Males
Cincinnati, Ohio, USA

Sample Code:

# Testing Correlated Random Variables
# reference http://www.sitmo.com/doc/Generating_Correlated_Random_Numbers
# reference http://www.stat.uiuc.edu/stat428/cndata.html

# create the correlation matrix
corMat=matrix(c(1,0.6,0.3,0.6,1,0.5,0.3,0.5,1),3,3)
cholMat=chol(corMat)

# create the matrix of random variables
set.seed(1000)
nValues=10000

# generate some random values
matNormalAllSame=cbind(rnorm(nValues),rnorm(nValues),rnorm(nValues))
matNormalDifferent=cbind(rnorm(nValues,1,1.5),rnorm(nValues,2,0.5),rnorm(nValues,6,1.8))
matUniformAllSame=cbind(runif(nValues),runif(nValues),runif(nValues))
matUniformDifferent=cbind(runif(nValues,1,1.5),runif(nValues,2,3.5),runif(nValues,6,10.8))

# bind to a matrix
print("correlation Matrix")
print(corMat)
print("Cholesky Decomposition")
print(cholMat)

# test same normal
resultMatNormalAllSame=matNormalAllSame%*%cholMat
print("correlation matNormalAllSame")
print(cor(resultMatNormalAllSame))

# test different normal
resultMatNormalDifferent=matNormalDifferent%*%cholMat
print("correlation matNormalDifferent")
print(cor(resultMatNormalDifferent))

# test same uniform
resultMatUniformAllSame=matUniformAllSame%*%cholMat
print("correlation matUniformAllSame")
print(cor(resultMatUniformAllSame))

# test different uniform
resultMatUniformDifferent=matUniformDifferent%*%cholMat
print("correlation matUniformDifferent")
print(cor(resultMatUniformDifferent))

and results:

[1] "correlation Matrix"
     [,1] [,2] [,3]
[1,]  1.0  0.6  0.3
[2,]  0.6  1.0  0.5
[3,]  0.3  0.5  1.0

[1] "Cholesky Decomposition"
     [,1] [,2]      [,3]
[1,]    1  0.6 0.3000000
[2,]    0  0.8 0.4000000
[3,]    0  0.0 0.8660254

[1] "correlation matNormalAllSame"  == ok
          [,1]      [,2]      [,3]
[1,] 1.0000000 0.6036468 0.3013823
[2,] 0.6036468 1.0000000 0.5005440
[3,] 0.3013823 0.5005440 1.0000000

[1] "correlation matNormalDifferent"  == no good
          [,1]      [,2]      [,3]
[1,] 1.0000000 0.9141472 0.2676162
[2,] 0.9141472 1.0000000 0.2959178
[3,] 0.2676162 0.2959178 1.0000000

[1] "correlation matUniformAllSame"  == ok
          [,1]      [,2]      [,3]
[1,] 1.0000000 0.5971519 0.2959195
[2,] 0.5971519 1.0000000 0.5011267
[3,] 0.2959195 0.5011267 1.0000000

[1] "correlation matUniformDifferent"  == no good
          [,1]      [,2]      [,3]
[1,] 1.0000000 0.2312000 0.0351460
[2,] 0.2312000 1.0000000 0.1526293
[3,] 0.0351460 0.1526293 1.0000000
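A likely reason the "different" cases fail is that multiplying by the Cholesky factor only yields the target correlation when the input columns are uncorrelated with equal variances. The sketch below shows one common workaround, which is essentially the route the poster already guesses at: correlate standard normals first, then rescale them (for normal marginals) or push them through pnorm() and an inverse CDF (a Gaussian copula, for other marginals). It is a sketch under those assumptions, not the only approach; the names `z`, `xNormal`, `u` and `xUnif` are illustrative. For non-normal marginals the resulting Pearson correlations are only approximately equal to the target.

```r
set.seed(1000)
n <- 10000
corMat <- matrix(c(1, 0.6, 0.3, 0.6, 1, 0.5, 0.3, 0.5, 1), 3, 3)

# Step 1: correlated standard normals via the Cholesky factor.
z <- matrix(rnorm(n * 3), n, 3) %*% chol(corMat)

# Step 2a: normal marginals with different parameters.
# Shifting and scaling each column does not change the correlations.
mu <- c(1, 2, 6)
sigma <- c(1.5, 0.5, 1.8)
xNormal <- sweep(sweep(z, 2, sigma, "*"), 2, mu, "+")

# Step 2b: uniform marginals with different ranges.
# pnorm() maps each column to correlated U(0,1) values, and qunif()
# (the inverse CDF) maps those to the desired range.
u <- pnorm(z)
lower <- c(1, 2, 6)
upper <- c(1.5, 3.5, 10.8)
xUnif <- sapply(1:3, function(j) qunif(u[, j], lower[j], upper[j]))

round(cor(xNormal), 2)
round(cor(xUnif), 2)
```

Any other quantile function (qgamma, qlnorm, and so on) can replace qunif() in step 2b, with the same caveat that the achieved correlation drifts somewhat from the target for strongly skewed marginals.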
Re: [R] Split a vector by NA's - is there a better solution then a loop ?
On Thu, 29 Apr 2010, Barry Rowlingson wrote:

> On Thu, Apr 29, 2010 at 1:27 PM, Henrique Dallazuanna www...@gmail.com wrote:
>> Another option could be:
>>
>> split(x, replace(cumsum(is.na(x)), is.na(x), -1))[-1]
>
> One thing none of the solutions so far do (except I haven't tried Tal's original code) is insert an empty group between adjacent NA values, for example in:
>
> x = c(1,2,3,NA,NA,4,5,6)
>
> split(x, replace(cumsum(is.na(x)), is.na(x), -1))[-1]
> $`0`
> [1] 1 2 3
>
> $`2`
> [1] 4 5 6
>
> Maybe this never happens in Tal's case, or it's not what he wanted anyway, but I thought I'd point it out!

The ever useful rle() helps:

y <- rle(!is.na(x))
split(x, rep( cumsum(y$val)*y$val, y$len ) )[-1]
$`1`
[1] 1 2 3

$`2`
[1] 4 5 6

Chuck

> Barry

Charles C. Berry                            (858) 534-2098
                                            Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu               UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
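Both approaches from this exchange can be run side by side. The sketch below reuses Barry's example vector; the names `byCumsum`, `runs`, `grp` and `byRle` are illustrative. The rle variant is written slightly differently from Chuck's one-liner: it filters out the NA positions instead of dropping the first group with [-1], an adjustment made here so that a vector without any NA is still returned whole.

```r
x <- c(1, 2, 3, NA, NA, 4, 5, 6)

# Henrique's approach: group by the running count of NAs, and send the
# NA positions themselves to group -1, which is then dropped.
byCumsum <- split(x, replace(cumsum(is.na(x)), is.na(x), -1))[-1]

# rle-based approach: number each run of non-NA values 1, 2, 3, ...
# and give the NA runs the group number 0.
runs <- rle(!is.na(x))
grp <- rep(cumsum(runs$values) * runs$values, runs$lengths)
byRle <- split(x[grp > 0], grp[grp > 0])

byCumsum
byRle
```

The only visible difference is in the group names: the cumsum version keeps gaps in the numbering where adjacent NAs occurred, while the rle version numbers the groups consecutively.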
Re: [R] merged files
Hi,

I have two files (file1.txt and file2.txt) which I would like to merge based on certain criteria, i.e. combining the data on matching GeneID and Exons. I have used the merge option, but it does not give me the desired outcome. merged.txt shows the result I would like.

File1.txt

   AffyProbe           ProbeType Flag GeneSymbol GeneID Exons Chrom Strand AffyStart AffyEnd
1  1007_s_at:1105:483  0         0    DDR1       780    21    6     +      30975403  30975427
2  1007_s_at:1119:177  0         0    DDR1       780    21    6     +      30975549  30975573
3  1007_s_at:1136:469  0         0    DDR1       780    21    6     +      30975766  30975790
4  1007_s_at:192:205   0         0    DDR1       780    21    6     +      30975523  30975547
5  1007_s_at:474:1161  0         0    DDR1       780    21    6     +      30975745  30975769
6  1007_s_at:504:983   0         0    DDR1       780    21    6     +      30975575  30975599
7  1007_s_at:50:779    0         0    DDR1       780    21    6     +      30975758  30975782

File2.txt

   AgilentProbe  ProbeType Flag GeneSymbol GeneID Exons Chrom Strand AgilentStart AgilentEnd
1  A_23_P11      0         0    FAM174B    400451 5     15    -      90961852     90961793
2  A_23_P100022  0         0    SV2B       9899   14    15    +      89639333     89639392
3  A_23_P100056  0         0    RBPMS2     348093 8     15    -      62819428     62819369
4  A_23_P100074  0         0    AVEN       57099  6     15    -      31946031     31945972
5  A_23_P100092  0         0    ZSCAN29    146050 5     15    -      41440680     41440621
6  A_23_P100103  0         0    VPS39      23339  24    15    -      40240319     40240260
7  A_23_P100111  0         0    CHP        11261  7     15    +      39358845     39358904
8  A_23_P100127  0         0    CASC5      57082  11    15    +      38704817     38704876
9  A_23_P100133  0         0    ATMIN      23300  4     16    +      79636596     79636655
10 A_23_P100141  0         0    UNKL       64718  12    16    -      1355346      1355287

merged.txt (should look like this; cells with several values are comma-separated, and some cells are empty)

GeneSymbol GeneID Exons Chrom AffyMatrixProbeID AffyStart AffyEnd AgilentProbeID AgilentStart AgilentEnd
DDR1    780   21  6   A_24_P123601  30975848  30975907
RFC2    5982  10  7   1053_at:120:925, 1053_at:504:41, 1053_at:522:871, 1053_at:828:1025, 203696_s_at:291:651   73287845, 73287869, 73287863, 73287881, 73287850   73287821, 73287845, 73287839, 73287857, 73287826   A_23_P93823  73287861  73287802
RFC2    5982  11  7
HSPA6   3310  1   1   A_23_P114903  159762782  159762841
PAX8    7849  12  2   A_23_P210001  113691555  113691496
GUCA1A  2978  6   6
UBA7    7318  24  3   1294_at:1079:379, 1294_at:361:881, 203281_s_at:524:889, 203281_s_at:678:1017, 203281_s_at:68:1153   49818386, 49818398, 49818378, 49818434, 49818422   49818362, 49818374, 49818354, 49818420, 49818398

Sorry for the long tables, thanks.

Alex
Student
University of Colorado
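One possible way to get a layout like merged.txt is to collapse the probes to one row per gene and exon first, and only then merge with all = TRUE so that unmatched rows from either file are kept. The sketch below is an assumption about what is wanted, run on a small invented subset of the columns; the data frames `affy` and `agilent` and the helper `collapse` are hypothetical names, and only a few values are borrowed from the tables above.

```r
# Toy stand-ins for the two files (subset of columns, few rows).
affy <- data.frame(
  GeneSymbol = c("RFC2", "RFC2", "UBA7"),
  GeneID     = c(5982, 5982, 7318),
  Exons      = c(10, 10, 24),
  AffyProbe  = c("1053_at:120:925", "1053_at:504:41", "1294_at:1079:379"),
  AffyStart  = c(73287845, 73287869, 49818386),
  stringsAsFactors = FALSE
)
agilent <- data.frame(
  GeneSymbol   = c("RFC2", "HSPA6"),
  GeneID       = c(5982, 3310),
  Exons        = c(10, 1),
  AgilentProbe = c("A_23_P93823", "A_23_P114903"),
  AgilentStart = c(73287861, 159762782),
  stringsAsFactors = FALSE
)

keys <- c("GeneSymbol", "GeneID", "Exons")

# Hypothetical helper: join several values into one comma-separated cell.
collapse <- function(v) paste(v, collapse = ", ")

# One row per gene/exon, with the probe values collapsed into a string.
affy1 <- aggregate(affy[c("AffyProbe", "AffyStart")],
                   by = affy[keys], FUN = collapse)

# all = TRUE keeps gene/exon pairs that occur in only one of the files.
merged <- merge(affy1, agilent, by = keys, all = TRUE)
merged
```

If the Agilent file can also hold several probes per gene and exon, the same aggregate step would be applied to it before merging. Cells without a match come out as NA rather than blank.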
[R] Tinn-R related problem
Dear Mr Hewitt,

I am having exactly the same problem as described on the page https://stat.ethz.ch/pipermail/r-help/2008-March/156809.html (please find a copy below). I wonder if you happen to have heard of any solution to it (i.e. which Windows settings have to be altered in order to solve the problem). The mysterious thing about it is that I didn't change anything before this happened: I didn't upgrade R, Tinn-R or any other program. It happened right in the middle of working with R.

Many thanks in advance, and kind regards,

Oliver

Oliver Maspfuhl
Commerzbank AG
oliver.maspf...@commerzbank.com
http://www.commerzbank.de

[R] Tinn-R related problem
David Hewitt dhewitt37 at gmail.com
Mon Mar 10 17:10:34 CET 2008

> A few weeks ago all of a sudden the backspace, enter and direction keys were not working. I updated Tinn-R to the newest version but still no solution. After this I tried reinstalling it (prior to that I removed Tinn-R and deleted all the leftovers manually) and still no change. In every other execution (e.g. when I save a file) every key works fine.

I've used Tinn-R with R on Win XP ever since I started with R, and I've never had this problem. The only immediate thing that comes to mind is that you should be installing R in SDI mode to get it working with Tinn-R. At least that's what they say, and I've never tried it the other way (MDI). Maybe just uninstall R and Tinn-R, then reload R, use Custom installation and pick SDI, then reinstall Tinn-R. Worth a shot.

> From what I have read in the other forums I believe this issue is not necessarily R or Tinn-R related but might be some hidden Windows settings (I'm using XP) but of this I'm not sure.

If that's the case, I can't help. What occurred a few weeks ago that might have been related? Did you upgrade R?

-
David Hewitt
Virginia Institute of Marine Science
http://www.vims.edu/fish/students/dhewitt/

--
View this message in context: http://www.nabble.com/Tinn-R-related-problem-tp15950714p15950865.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] time zone convert
Thanks! I think it works now that I have changed the time zone and language settings on my PC. It seems that when the system is set to a language other than English, it reads the time a bit differently. I'm not sure that was the reason, but thanks for your help.

Cheers,
Carol

On Fri, Apr 30, 2010 at 1:05 AM, Carol Gao carol.g...@gmail.com wrote:

That's weird. I opened a new R window and pasted your code, and it shows:

  anz1 <- data.frame(Date.G = c("01-DEC-2008", "01-DEC-2008", "02-DEC-2008", "03-DEC-2008", "04-DEC-2008"),
                     Time.G = c("00:03:57.398", "00:04:03.778", "00:04:38.639", "00:04:38.639", "00:04:38.639"))
  Time <- strptime(paste(anz1$Date.G, anz1$Time.G), '%d-%b-%Y %H:%M:%S')
  modifyList(Time, list(hour = Time$hour + 11))
  [1] NA NA NA NA NA

What could possibly be the reason for that?

On Fri, Apr 30, 2010 at 12:52 AM, Henrique Dallazuanna www...@gmail.com wrote:

On Thu, Apr 29, 2010 at 11:44 AM, Carol Gao carol.g...@gmail.com wrote:

I tried your new lines with some random time, and it seems to work perfectly well, as follows:

  z <- strptime("20/2/06 23:16:16.683", "%d/%m/%y %H:%M:%OS")
  modifyList(z, list(hour = z$hour + 11))
  [1] 2006-02-21 10:16:16

Now it seems that I have some problem with my Time vector. Time was created by the following code:

  Time <- paste(anz$Date.G., anz$Time.G.)

The original data looks like the following, with the rows of the two columns corresponding to each other:

  Date.G.: 01-DEC-2008 01-DEC-2008 02-DEC-2008 03-DEC-2008 04-DEC-2008 ...
  Time.G.: 00:03:57.398 00:04:03.778 00:04:38.639 00:04:38.639 00:04:38.639 ...

Somehow, I can't read Time with strptime(Time, "%d-%b-%Y %H:%M:%OS"). Do you know what is wrong with it?

Why not?

  anz <- data.frame(Date.G = c("01-DEC-2008", "01-DEC-2008", "02-DEC-2008", "03-DEC-2008", "04-DEC-2008"),
                    Time.G = c("00:03:57.398", "00:04:03.778", "00:04:38.639", "00:04:38.639", "00:04:38.639"))
  Time <- strptime(paste(anz$Date.G, anz$Time.G), '%d-%b-%Y %H:%M:%S')
  modifyList(Time, list(hour = Time$hour + 11))

Sorry for asking such questions, as I am quite new to R. Thanks for helping me out.

Carol

On Fri, Apr 30, 2010 at 12:16 AM, Henrique Dallazuanna www...@gmail.com wrote:

Oops, I sent you the wrong code. Try this instead:

  Time2 <- strptime(Time, '%d-%b-%Y %H:%M:%S')
  modifyList(Time2, list(hour = Time2$hour + 11))

On Thu, Apr 29, 2010 at 11:14 AM, Carol Gao carol.g...@gmail.com wrote:

Appreciate it! I tried the code you sent, and an error turned up. The first line runs OK, but the second line gives:

  modifyList(Time2, list(hour = Time2$hour + 11))
  Error in Time2$hour : $ operator is invalid for atomic vectors

The time format I used for reading the Time vector is %d-%b-%Y %H:%M:%OS. Should I change any code above?

Carol

On Thu, Apr 29, 2010 at 11:47 PM, Henrique Dallazuanna www...@gmail.com wrote:

Try this:

  Time2 <- gsub("\\.*", "", tolower(Time))
  modifyList(Time2, list(hour = Time2$hour + 11))

On Thu, Apr 29, 2010 at 10:33 AM, Carol Gao carol.g...@gmail.com wrote:

Hi there, I've got a column vector in a csv file as follows, and I need to add 11 hours to each of them. Is there a way that I can do it? (The actual file size is much bigger than this.)

  Time
  01-DEC-2008 00:00:28.611
  01-DEC-2008 00:00:43.155
  01-DEC-2008 00:01:06.677
  01-DEC-2008 00:01:06.677
  01-DEC-2008 00:01:06.677
  01-DEC-2008 00:01:06.919
  01-DEC-2008 00:23:46.452
  02-DEC-2008 00:03:17.646
  02-DEC-2008 00:03:17.652
  03-DEC-2008 00:15:11.485
  03-DEC-2008 00:18:44.652
  03-DEC-2008 00:22:17.447

Thank you in advance.

Cheers,
Carol
[R] Odp: convert Factor as numeric
Hi

You have to get rid of the thousands separator first:

  as.numeric(gsub(",", "", S))

Regards
Petr

r-help-boun...@r-project.org wrote on 29.04.2010 13:12:44:

Dear group, I know this issue has already been covered, and before you reply I must say I have read the R-FAQ and searched the mailing list archive. I still can't manage to change my factor to numeric, as I couldn't find any clear answer. Here is my df:

  Pose1 <- structure(list(DESCRIPTION = structure(c(1L, 2L, 3L, 4L, 5L, 8L), .Label = c("SUGAR NO.11 May/10", "COTTON NO.2 May/10", "PLATINUM Jul/10", "ROBUSTA COFFEE (10) May/10", "WHEAT May/10", "PRIMARY NICKEL USD", "PRM HGH GD ALUMINIUM USD", "SPCL HIGH GRADE ZINC USD", "STANDARD LEAD USD"), class = "factor"), POSITION = c(5, 3, -1, 15, 4, 2), SETTLEMENT = structure(c(3L, 5L, 2L, 1L, 4L, 8L), .Label = c("1,353.", "1,739.4000", "16.5400", "467.7500", "78.1300", "25,760.8600", "2,415.9000", "2,421.0500", "2,357.1200"), class = "factor")), .Names = c("DESCRIPTION", "POSITION", "SETTLEMENT"), row.names = c("1", "2", "3", "4", "5", "51"), class = "data.frame")

  S <- Pose1$SETTLEMENT  # select the last column
  S
  [1] 16.5400 78.1300 1,739.4000 1,353. 467.7500 2,421.0500
  Levels: 1,353.
1,739.4000 16.5400 467.7500 78.1300 25,760.8600 2,415.9000 2,421.0500 2,357.1200 str(S) Factor w/ 9 levels 1,353.,1,739.4000,..: 3 5 2 1 4 8 Now I need to change S to numeric class S1-as.numeric(levels(S))[as.integer(S)] #doesn't work, numbers are rounded or NA Warning message: NAs introduced by coercion S1-as.numeric(levels(S))[S] #doesn't work, numbers are rounded or NA Warning message: NAs introduced by coercion S1-as.numeric(as.character(S)) #doesn't work, numbers are rounded or NA Warning message: NAs introduced by coercion If it can help, my column S is part of a DF that has been obtained via this line : pose=read.csv2(LSCPos1.csv,sep=,,dec=.,as.is=T,h=T,skip=1)[,c(4,8,14, 15)] pose - structure(list(DESCRIPTION = c(WHEAT May/10 , WHEAT May/10 , WHEAT May/10 , WHEAT May/10 , COTTON NO.2 May/10 , COTTON NO.2 May/10 , COTTON NO.2 May/10 , PLATINUM Jul/10 , SUGAR NO.11 May/10 , SUGAR NO.11 May/10 , SUGAR NO.11 May/10 , SUGAR NO.11 May/10 , SUGAR NO.11 May/10 , ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 , ROBUSTA COFFEE (10) May/10 , PRM HGH GD ALUMINIUM USD 09/07/10 , PRM HGH GD ALUMINIUM USD 09/07/10 , PRIMARY NICKEL USD 04/06/10 , PRIMARY NICKEL USD 04/06/10 , PRIMARY NICKEL USD 10/06/10 , PRIMARY NICKEL USD 10/06/10 , STANDARD LEAD USD 01/07/10 , STANDARD LEAD USD 01/07/10 , STANDARD LEAD USD 01/07/10 , STANDARD LEAD USD 01/07/10 , STANDARD LEAD USD 01/07/10 , STANDARD LEAD USD 01/07/10 , STANDARD LEAD USD 01/07/10 , STANDARD LEAD USD 06/07/10 , SPCL HIGH GRADE ZINC USD 08/07/10 , SPCL HIGH GRADE ZINC USD 08/07/10 , SPCL HIGH GRADE ZINC USD 08/07/10 , SPCL HIGH GRADE ZINC USD 09/07/10 , SPCL HIGH GRADE ZINC USD 09/07/10 , SPCL HIGH GRADE ZINC USD 09/07/10 , SPCL HIGH GRADE ZINC USD 09/07/10 
, SPCL HIGH GRADE ZINC USD 09/07/10 , SPCL HIGH GRADE ZINC USD 13/04/10 , SPCL HIGH GRADE ZINC USD 13/04/10 ), CREATED.DATE = structure(c(14705, 14707, 14707, 14711, 14700, 14700, 14711, 14711, 14708, 14708, 14708, 14711, 14711, 14707, 14707, 14707, 14707, 14707, 14708, 14708, 14708, 14708, 14708, 14708, 14708, 14708, 14708, 14672, 14673, 14678, 14678, 14700, 14700, 14700, 14700, 14700, 14700, 14700, 14705, 14707, 14707, 14707, 14708, 14708, 14708, 14708, 14708, 14622, 14634), class = Date), QUANITY = c(1, 1, 1, 1, 1, 1, 1, -1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 2, 1, 1, 1, 1, 2, 1, 1, -1, 1, 1, -1, -1, 1, 1, -1, 1, -1, -1, 1, -1, 1, 1, 1, -1, -1, 1, -1, 1, 1, 1, -1), CLOSING.PRICE = c(467.7500, 467.7500, 467.7500, 467.7500, 78.1300, 78.1300, 78.1300, 1,739.4000, 16.5400, 16.5400, 16.5400, 16.5400, 16.5400, 1,353., 1,353., 1,353., 1,353., 1,353., 1,353., 1,353., 1,353., 1,353., 1,353., 1,353., 1,353., 2,415.9000, 2,415.9000, 25,755.7100, 25,755.7100, 25,760.8600, 25,760.8600, 2,355.9600, 2,355.9600, 2,355.9600, 2,355.9600, 2,355.9600, 2,355.9600, 2,355.9600, 2,357.1200, 2,420.7300, 2,420.7300, 2,420.7300, 2,421.0500, 2,421.0500, 2,421.0500, 2,421.0500, 2,421.0500, 2,388.4300, 2,388.4300 )), .Names = c(DESCRIPTION, CREATED.DATE, QUANITY, SETTLEMENT), row.names = c(NA, -49L), class = data.frame) str(pose) 'data.frame': 49 obs. of 4 variables: $ DESCRIPTION : chr WHEAT May/10 WHEAT May/10 WHEAT May/10 WHEAT May/10 ... $ CREATED.DATE:Class 'Date' num [1:49] 14705 14707 14707 14711 14700 ... $ QUANITY
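Petr's suggestion at the top of this thread can be tried on a small factor. The following is only a minimal sketch: the vector `S` here is a made-up subset of the SETTLEMENT values, not the poster's full data, and the explicit `as.character()` call is added for clarity.

```r
# Toy factor with thousands separators (a made-up subset of the SETTLEMENT values).
S <- factor(c("16.5400", "78.1300", "1,739.4000", "467.7500", "2,421.0500"))

# as.numeric(as.character(S)) alone returns NA for "1,739.4000" because of the
# comma, so remove the commas from the labels before converting.
S1 <- as.numeric(gsub(",", "", as.character(S), fixed = TRUE))

print(S1)
```

Converting through `as.character()` matters because `as.numeric()` applied directly to a factor returns the internal integer codes rather than the printed values.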
Re: [R] time zone convert
On Apr 29, 2010, at 10:14 AM, Carol Gao wrote:

Appreciate it! I tried the code you sent, and an error turned up. The first line runs OK, but the second line gives:

  modifyList(Time2, list(hour = Time2$hour + 11))
  Error in Time2$hour : $ operator is invalid for atomic vectors

The time format I used for reading the Time vector is %d-%b-%Y %H:%M:%OS.

It appears you have already created a datetime object from a read operation on that csv file, in which case adding 11 hours should be straightforward:

  Time.plus.11hr <- Time + 11*60*60

-- David

Should I change any code above?

Carol

On Thu, Apr 29, 2010 at 11:47 PM, Henrique Dallazuanna www...@gmail.com wrote:

Try this:

  Time2 <- gsub("\\.*", "", tolower(Time))
  modifyList(Time2, list(hour = Time2$hour + 11))

On Thu, Apr 29, 2010 at 10:33 AM, Carol Gao carol.g...@gmail.com wrote:

Hi there, I've got a column vector in a csv file as follows, and I need to add 11 hours to each of them. Is there a way that I can do it? (The actual file size is much bigger than this.)

  Time
  01-DEC-2008 00:00:28.611
  01-DEC-2008 00:00:43.155
  01-DEC-2008 00:01:06.677
  01-DEC-2008 00:01:06.677
  01-DEC-2008 00:01:06.677
  01-DEC-2008 00:01:06.919
  01-DEC-2008 00:23:46.452
  02-DEC-2008 00:03:17.646
  02-DEC-2008 00:03:17.652
  03-DEC-2008 00:15:11.485
  03-DEC-2008 00:18:44.652
  03-DEC-2008 00:22:17.447

Thank you in advance.

Cheers,
Carol

-- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40" S 49° 16' 22" O

David Winsemius, MD
West Hartford, CT
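For readers following this thread, here is a minimal sketch of adding 11 hours when the timestamps are still character strings, as in the original post. It rests on two assumptions that are not confirmed in the thread: that the NA results came from `%b` being matched against a non-English locale, and that the platform accepts the "C" locale name. The variable names `parsed` and `shifted` are illustrative only.

```r
# "%b" matches month abbreviations in the current locale, so "DEC" may fail to
# parse on a non-English system. Switch LC_TIME to "C" while parsing.
# Assumption: the platform accepts the "C" locale name.
old_locale <- Sys.getlocale("LC_TIME")
invisible(Sys.setlocale("LC_TIME", "C"))

Time <- c("01-DEC-2008 00:00:28.611",
          "01-DEC-2008 00:23:46.452",
          "03-DEC-2008 00:22:17.447")

# Parse to POSIXct; %OS reads fractional seconds, and tz = "UTC" avoids
# daylight-saving jumps when shifting.
parsed <- as.POSIXct(Time, format = "%d-%b-%Y %H:%M:%OS", tz = "UTC")

# POSIXct arithmetic is in seconds, so 11 hours is 11 * 60 * 60.
shifted <- parsed + 11 * 60 * 60

# Restore the previous locale setting.
invisible(Sys.setlocale("LC_TIME", old_locale))

print(format(shifted, "%Y-%m-%d %H:%M:%S"))
```

Adding seconds to a POSIXct value, as David suggests, only works once the strings have actually been parsed; applied to a plain character vector the addition would fail.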
Re: [R] Multiple cex sizes in main for xyplot?
Felix:

Oh, yes. That gives me what I want without having to resort to padding parameters. I don't know why it works (vs. specifying the y locations), but I suppose that's confounded with the details of lattice engineering, which I wanted to avoid. So many thanks for your help.

Bert Gunter
Genentech Nonclinical Biostatistics

-----Original Message-----
From: foolish.andr...@gmail.com [mailto:foolish.andr...@gmail.com] On Behalf Of Felix Andrews
Sent: Wednesday, April 28, 2010 4:33 PM
To: Bert Gunter
Cc: r-help@r-project.org
Subject: Re: [R] Multiple cex sizes in main for xyplot?

I don't think there's a much better way to do it... but this seems to work:

  xyplot((0:1)~(0:1),
         main = textGrob(lab=c("Some Text", "\nSome More Text"), x=c(0.5,0.5),
                         gp=gpar(cex=c(1.2,1.0), lineheight=2)))

-Felix

On 29 April 2010 08:06, Bert Gunter gunter.ber...@gene.com wrote:

Folks:

I would like to write two lines of text in two different font sizes (or faces or ...) as the title ("main") of a trellis plot. The following code does it, but not well:

  xyplot((0:1)~(0:1),
         main = textGrob(lab=c("Some Text", "Some More Text"), y=c(.95,.8),
                         gp=gpar(cex=c(1.2,1.0))))

There is too much space between the title text and the plot. I assume that can be fixed by fooling with padding settings in lattice.options(), but my question is: is there a better, simpler way to do this? Would using grid graphics directly, by pushing title and plot viewports and then adding the lattice graph to the plot viewport, be a better way to go?

OS = Windows XP
R = 2.11.0
lattice_0.18-3
device = windows

Bert Gunter
Genentech Nonclinical Biostatistics

-- Felix Andrews
Postdoctoral Fellow, Integrated Catchment Assessment and Management (iCAM) Centre
Fenner School of Environment and Society [Bldg 48a]
The Australian National University, Canberra ACT 0200 Australia
M: +61 410 400 963 T: + 61 2 6125 4670 E: felix.andr...@anu.edu.au
CRICOS Provider No. 00120C
http://www.neurofractal.org/felix/
Re: [R] Generalized Estimating Equation (GEE): Why is Link = Identity?
From the GEE article in R News, Vol. 2/3, December 2002: "Allows different covariates in separate models for the mean, scale, and correlation via various link functions."

Geepack offers link functions for the scale, correlation, and mean models. As the output suggests, "Correlation: Structure = ar1 Link = identity" does not refer to the mean link. In fact, if you look at the output from m.ar you would see:

  Scale Link: identity
  Estimated Scale Parameters:
  [1] 1
  Correlation: Structure = ar1 Link = identity

See the R News article for more info on other correlation and scale link functions. The take-home message is this: the mean link is exactly what you think it is, the logit.

-tgs

On Thu, Apr 29, 2010 at 10:28 AM, Sachi Ito s.ito@gmail.com wrote:

Hi, I'm running GEE using geepack. I set corstr = "ar1" as below:

  m.ar <- geeglm(L ~ O + A,
                 data = firstgrouptxt, id = id,
                 family = binomial, corstr = "ar1")
  summary(m.ar)

  Call:
  geeglm(formula = L ~ O + A, family = binomial, data = firstgrouptxt, id = id, corstr = "ar1")

  Coefficients:
              Estimate  Std.err    Wald Pr(>|W|)
  (Intercept) -2.62516  0.21154 154.001   <2e-16 ***
  ontask       0.00498  0.12143   0.002   0.9673
  attachmentB  0.73216  0.35381   4.282   0.0385 *
  attachmentC  0.25960  0.33579   0.598   0.4395
  ---
  Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

  Estimated Scale Parameters:
              Estimate Std.err
  (Intercept)    1.277  0.3538

  Correlation: Structure = ar1 Link = identity

  Estimated Correlation Parameters:
        Estimate  Std.err
  alpha    0.978 0.005725
  Number of clusters: 49   Maximum cluster size: 533

Then it shows "Correlation: Link = identity". Why is it not Link = logit?

Thank you,
Sachi
[R] reduce size of pdf
Is there a way to reduce the size of PDF files produced in R? Compression? Lower dpi? Or some other option?
Re: [R] variable importance in Random Forest
Hi Andy,

Thanks so much for your reply! On the first page of the paper "Classification and Regression by randomForest", it says that random forest estimates the importance of a variable by looking at how much the prediction error increases when the variable is permuted...

In the help document of randomForest, the importance is measured as the total decrease in node impurities. Should it be the total *increase* in node impurities? If it is the total decrease in node impurities, does that contradict the paper?

Also, in fit$importance, what is the meaning of the first two columns?

  fit$importance
                           0           1 MeanDecreaseAccuracy MeanDecreaseGini
  CT            0.0022352025 0.003829344         0.0030311246         5.184427
  DP            0.0069461974 0.016387520         0.0116650960        15.440624
  DY            0.0141150255 0.026031690         0.0200603555        19.901538
  FC            0.0024279188 0.005158945         0.0037948155         5.527078
  NE            0.0352705133 0.070503233         0.0527718526        46.278504
  NW            0.0256059127 0.034433862         0.0299981496        26.440402
  QT            0.0037228694 0.008181262         0.0059571350         9.308828
  SK            0.0048187014 0.008895719         0.0068609174        10.662129
  TA            0.0042134249 0.011746533         0.0079851331        12.878367
  WC            0.0177155268 0.014981440         0.0163366320        14.240232
  WD            0.0232972311 0.034083695         0.0286702065        25.335182
  WG            0.0328547215 0.053142508         0.0429480441        30.663749
  WW            0.0093983693 0.006377956         0.0078681474         7.250101
  YG            0.0051691399 0.007338639         0.0062618144        11.084111
  num_cell      0.0061355526 0.005373049         0.0057463613         5.060577
  num_genes     0.0364878788 0.044544488         0.0404558096        32.745034
  position      0.0025375614 0.011566496         0.0070255302        10.070505
  freq_hypo     0.0008723241 0.001757602         0.0013181209         1.930695
  freq_intra    0.0009449492 0.001943090         0.0014431451         2.611950
  log_hypo      0.0004514713 0.001366561         0.0009096419         1.736749
  acid_per      0.0125815445 0.023360179         0.0179634375        21.131681
  base_per      0.0070077737 0.012196570         0.0096129124        13.675893
  charge_per    0.0095668425 0.024125997         0.0168345956        20.969665
  hydrophob_per 0.0185736697 0.031941513         0.0252200036        25.994903
  polar_per     0.0169369327 0.023633413         0.0202776247        20.890415

On Thu, Apr 29, 2010 at 5:22 AM, Liaw, Andy
andy_l...@merck.com wrote: Please see the Detail section of the help page for the importance() function in the randomForest package, and let me know which part of it you do not understand. For boosting, you need to read its documentation and decide for yourself if its importance measure is at all comparable to the two in RF. Andy -- *From:* Changbin Du [mailto:changb...@gmail.com] *Sent:* Wednesday, April 28, 2010 8:58 PM *To:* Liaw, Andy *Cc:* r-help@r-project.org *Subject:* variable importance in Random Forest HI, Dear Andy, I run the RandomFOrest in R, and get the following resutls in variable importance: What is the meaning of MeanDecreaseAccuracy and MeanDecreaseGini? I found they are raw values, they are not scaled to 1, right? Which column if most similar to the variable rel.influence in Boosting? Thanks so much! fit$importance 0 1 MeanDecreaseAccuracy MeanDecreaseGini CT0.0022352025 0.003829344 0.0030311246 5.184427 DP0.0069461974 0.016387520 0.0116650960 15.440624 DY0.0141150255 0.026031690 0.0200603555 19.901538 FC0.0024279188 0.005158945 0.0037948155 5.527078 NE0.0352705133 0.070503233 0.0527718526 46.278504 NW0.0256059127 0.034433862 0.0299981496 26.440402 QT0.0037228694 0.008181262 0.0059571350 9.308828 SK0.0048187014 0.008895719 0.0068609174 10.662129 TA0.0042134249 0.011746533 0.0079851331 12.878367 WC0.0177155268 0.014981440 0.0163366320 14.240232 WD0.0232972311 0.034083695 0.0286702065 25.335182 WG0.0328547215 0.053142508 0.0429480441 30.663749 WW0.0093983693 0.006377956 0.0078681474 7.250101 YG0.0051691399 0.007338639 0.0062618144 11.084111 num_cell 0.0061355526 0.005373049 0.0057463613 5.060577 num_genes 0.0364878788 0.044544488 0.0404558096 32.745034 position 0.0025375614 0.011566496 0.0070255302 10.070505 freq_hypo 0.0008723241 0.001757602 0.0013181209 1.930695 freq_intra0.0009449492 0.001943090 0.0014431451
Re: [R] Using plyr::dply more (memory) efficiently?
Steve Lianoglou mailinglist.honey...@gmail.com wrote in message news:t2ybbdc7ed01004290812n433515b5vb15b49c170f5a...@mail.gmail.com...

Thanks for directing me to the data.table package. I read through some of the vignettes, and it looks quite nice. While your sample code would provide an answer if I wanted to just compute some summary statistic/function of groups of my data.frame (using `by=symbol`), what's the best way to produce several pieces of info per subset? For instance, I see that I can do something like this:

  summaries[, list(counts=sum(counts), width=sum(exon.width)), by=symbol]

Yes, that's it.

But what if I need to do some more complex processing within the subsets defined in `by=symbol`, like several lines of programming logic for 1 result, say? I guess I can open a new block that just returns a data.table? Like:

  summaries[, {
    cnts <- sum(counts)
    ew <- sum(exon.width)
    # ... some complex things
    complex <- # .. result of complex things
    data.table(counts=cnts, width=ew, cplx=complex)
  }, by=symbol]

Is that right? (I mean, it looks like it's working, but maybe there's a more idiomatic way(?))

Yes, you got it. Rather than a data.table at the end, though, just return a list; it's faster. Shorter vectors will still be recycled to match any longer ones. Or just this:

  summaries[, list(
    counts = sum(counts),
    width = sum(exon.width),
    cplx = # .. result of complex things
  ), by=symbol]

Sounds like it's working, but could you give us an idea whether it is quick and memory efficient?
Re: [R] Can't load doSMP from REvolutionR in regular R2.11.0
We haven't tested doSMP with the mingw compiler (which is why we haven't yet submitted it to CRAN). We compiled it under R 2.10 using the same Intel compilers we use for REvolution R. It is open source (GPL), so you're welcome to try compiling it under mingw yourself, but we can't offer support for that configuration.

# David Smith

On Wed, Apr 28, 2010 at 5:10 PM, Tao Shi shi...@hotmail.com wrote:

I was testing out the doSMP package from REvolutionR in my regular R2.11.0 installation and I got the following error message. Well, one obvious thing is that R2.11.0 was built using i386-pc-mingw32, which is different from what revoIPC used. I could just use REvolutionR, but all my R peripherals were set up to work with the regular R2.11.0. So, I really want to make this work. Any ideas?

-- David M Smith da...@revolution-computing.com
VP of Marketing, REvolution Computing http://blog.revolution-computing.com
Tel: +1 (650) 330-0553 x205 (Palo Alto, CA, USA)
Download REvolution R free: www.revolution-computing.com/downloads/revolution-r.php
Re: [R] Using plyr::dply more (memory) efficiently?
Hi Matthew,

Sounds like it's working, but could you give us an idea whether it is quick and memory efficient?

I actually can't believe what I'm seeing. I just recoded the function to use data.table. What had taken something on the order of 20-30 mins with an lapply/do.call(rbind, ...) combo (actually I was using sqldf to do quicker subselects) just finished in 1 min. The memory being used in my R workspace now is still under 2GB, where previously it was ~8GB when do.call(rbind, ...)-ing my list into a data.frame, and +20GB with ddply.

I'm going to double check that I have the same results, but for now I'm completely blown away. data.table is awesome, thanks for this package.

-steve

-- Steve Lianoglou
Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
Re: [R] merged files
On Apr 29, 2010, at 10:21 AM, Alex Jameson wrote:

Hi, i have two files (file1.txt and file2.txt) which i would like to merge, based on certain criteria, i.e. it combines data based on matching geneID and exons. i have used the merge option,

Huh? What is the "merge option"? (There is a merge _function_.)

but it

It? Please provide the code you used. Have you yet read the Posting Guide as I urged you earlier?

does not give me the desired outcome. merged.txt shows the result i would like.

Given that those two files have no GeneID and Exons in common (after I took your mangled HTML posting and fixed each one to create readable files), I would expect that this call, which would implement the merge you requested above, would produce 0 rows:

  merge(dtd, File2, by=c("GeneID", "Exons"))  # which would be an inner join

Many (most?) of the numbers in the third "desired" file that we are seeing in mangled form do not appear in either of those two input files, so you appear to be requesting that we hack into your system to get them. Now what was it that you really wanted?

(And no more HTML postings ... and use the dput function. That would be an equivalent to the dump method in the Posting Guide, which (again) I urge you to read.)

-- David

*File1.txt*

    AffyProbe          ProbeType Flag GeneSymbol GeneID Exons Chrom Strand AffyStart AffyEnd
  1 1007_s_at:1105:483 0         0    DDR1       780    21    6     +      30975403  30975427
  2 1007_s_at:1119:177 0         0    DDR1       780    21    6     +      30975549  30975573
  3 1007_s_at:1136:469 0         0    DDR1       780    21    6     +      30975766  30975790
  4 1007_s_at:192:205  0         0    DDR1       780    21    6     +      30975523  30975547
  5 1007_s_at:474:1161 0         0    DDR1       780    21    6     +      30975745  30975769
  6 1007_s_at:504:983  0         0    DDR1       780    21    6     +      30975575  30975599
  7 1007_s_at:50:779   0         0    DDR1       780    21    6     +      30975758  30975782

*File2.txt*

     AgilentProbe ProbeType Flag GeneSymbol GeneID Exons Chrom Strand AgilentStart AgilentEnd
  1  A_23_P11     0         0    FAM174B    400451 5     15    -      90961852     90961793
  2  A_23_P100022 0         0    SV2B       9899   14    15    +      89639333     89639392
  3  A_23_P100056 0         0    RBPMS2     348093 8     15    -      62819428     62819369
  4  A_23_P100074 0         0    AVEN       57099  6     15    -      31946031     31945972
  5  A_23_P100092 0         0    ZSCAN29    146050 5     15    -      41440680     41440621
  6  A_23_P100103 0         0    VPS39      23339  24    15    -      40240319     40240260
  7  A_23_P100111 0         0    CHP        11261  7     15    +      39358845     39358904
  8  A_23_P100127 0         0    CASC5      57082  11    15    +      38704817     38704876
  9  A_23_P100133 0         0    ATMIN      23300  4     16    +      79636596     79636655
  10 A_23_P100141 0         0    UNKL       64718  12    16    -      1355346      1355287

*merged.txt (Should look like this)*

  GeneSymbol GeneID Exons Chrome AffyMatrixProbeID AffyStart AffyEnd AgilentProbeID AgilentStart AgilentEnd
  DDR1 780 21 6 A_24_P123601 30975848 30975907
  RFC2 5982 10 7 1053_at:120:925, 1053_at:504:41, 1053_at:522:871, 1053_at:828:1025, 203696_s_at:291:651 73287845, 73287869, 73287863, 73287881, 73287850 73287821, 73287845, 73287839, 73287857, 73287826 A_23_P93823 73287861 73287802
  RFC2 5982 11 7
  HSPA6 3310 1 1 A_23_P114903 159762782 159762841
  PAX8 7849 12 2 A_23_P210001 113691555 113691496
  GUCA1A 2978 6 6
  UBA7 7318 24 3 1294_at:1079:379, 1294_at:361:881, 203281_s_at:524:889, 203281_s_at:678:1017, 203281_s_at:68:1153 49818386, 49818398, 49818378, 49818434, 49818422 49818362, 49818374, 49818354, 49818420, 49818398

sorry for the long tables, thanks

Alex
Student, University of Colorado

David Winsemius, MD
West Hartford, CT
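David's point about the inner join can be illustrated with a small sketch. The two data frames below are toy stand-ins, not the poster's files: the probe and gene values are loosely borrowed from the tables in the post, and the names `file1`, `file2`, `inner`, and `outer` are hypothetical.

```r
# Toy stand-ins for File1 and File2 (values loosely borrowed from the post).
file1 <- data.frame(GeneID = c(780, 5982), Exons = c(21, 10),
                    AffyProbe = c("1007_s_at", "1053_at"),
                    stringsAsFactors = FALSE)
file2 <- data.frame(GeneID = c(5982, 3310), Exons = c(10, 1),
                    AgilentProbe = c("A_23_P93823", "A_23_P114903"),
                    stringsAsFactors = FALSE)

# Inner join: only (GeneID, Exons) pairs present in BOTH inputs survive.
inner <- merge(file1, file2, by = c("GeneID", "Exons"))

# Full outer join: every pair from either input is kept, with NA where the
# other file has no matching row.
outer <- merge(file1, file2, by = c("GeneID", "Exons"), all = TRUE)

print(inner)
print(outer)
```

If the desired merged.txt is meant to keep genes that appear in only one file, `all = TRUE` (or `all.x` / `all.y`) may be closer to what was wanted than the default inner join, though that is a guess about the poster's intent.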
Re: [R] reduce size of pdf
It would help if we knew how big your pdf is and why it is big. Can you show an example, or at least describe the process used to generate the file and what your goals are in creating/displaying the file?

-- Gregory (Greg) L. Snow Ph.D.
Statistical Data Center, Intermountain Healthcare
greg.s...@imail.org 801.408.8111

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Nevil Amos
Sent: Thursday, April 29, 2010 9:38 AM
To: r-help@r-project.org
Subject: [R] reduce size of pdf

is there a way to reduce the size of pdf files in R: ? compression? lower dpi ? or some other option?
[R] How to extract data table
I'm a very new user of R. My problem is that I have a data table of 3 columns and 100 rows assigned to a variable x. How can I write the table to an external file, in Excel or some other format, without losing any information, so that the data looks nicer?
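One common way to do this in base R is to write the table as a CSV file, which Excel can open. The sketch below is only an illustration under stated assumptions: the contents of `x` are invented (3 columns, 100 rows), and `out_file` is a hypothetical path in a temporary directory.

```r
# Invented stand-in for the poster's table: 3 columns, 100 rows.
x <- data.frame(id    = 1:100,
                value = seq(0.5, 50, by = 0.5),
                group = rep(c("a", "b"), 50),
                stringsAsFactors = FALSE)

# Hypothetical output path; replace with any file name you like.
out_file <- file.path(tempdir(), "x.csv")

# row.names = FALSE avoids writing an extra column of row numbers.
write.csv(x, out_file, row.names = FALSE)

# Reading the file back is a quick check that nothing was lost.
back <- read.csv(out_file, stringsAsFactors = FALSE)
print(head(back))
```

Note that `write.csv()` writes numbers with a limited number of significant digits, so values with very long decimal expansions may not round-trip exactly.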
Re: [R] Split a vector by NA's - is there a better solution then a loop ?
Or, you can modify Romain's function to account for sequential NAs.

  x <- c(1,2,NA,1,1,2,NA,NA,4,5,2,3)
  foo <- function( x ){
    idx <- 1 + cumsum( is.na( x ) )
    not.na <- ! is.na( x )
    f <- factor(idx[not.na], levels=1:max(idx))
    split( x[not.na], f )
  }
  foo( x )
  $`1`
  [1] 1 2
  $`2`
  [1] 1 1 2
  $`3`
  numeric(0)
  $`4`
  [1] 4 5 2 3

-tgs

On Thu, Apr 29, 2010 at 4:00 AM, Tal Galili tal.gal...@gmail.com wrote:

Definitely smarter, thanks!

Tal

Contact me: tal.gal...@gmail.com | 972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English)

On Thu, Apr 29, 2010 at 10:56 AM, Romain Francois romain.franc...@dbmail.com wrote:

Maybe this:

  foo <- function( x ){
    idx <- 1 + cumsum( is.na( x ) )
    not.na <- ! is.na( x )
    split( x[not.na], idx[not.na] )
  }
  foo( x )
  $`1`
  [1] 2 1 2
  $`2`
  [1] 1 1 2
  $`3`
  [1] 4 5 2 3

Romain

On 29/04/10 09:42, Tal Galili wrote:

Hi all,

I would like to have a function like this:

  split.vec.by.NA <- function(x)

that takes a vector like this:

  x <- c(2,1,2,NA,1,1,2,NA,4,5,2,3)

and returns a list of length 3, where each element of the list is the relevant segment of the vector, like this:

  $`1`
  [1] 2 1 2
  $`2`
  [1] 1 1 2
  $`3`
  [1] 4 5 2 3

I found how to do it with a loop, but wondered if there is some smarter (vectorized) way of doing it.

Here is the code I used:

  x <- c(2,1,2,NA,1,1,2,NA,4,5,2,3)
  split.vec.by.NA <- function(x) {
    # assumes NA are separating groups of numbers
    # TODO: add code to check for it
    number.of.groups <- sum(is.na(x)) + 1
    # all the places with NA's + a number after the end of the vector
    groups.end.point.locations <- c(which(is.na(x)), length(x)+1)
    group.start <- 1
    group.end <- NA
    # we will replace all the places of the group with the group ID,
    # except for the NA, which will later be replaced by 0
    new.groups.split.id <- x
    for(i in seq_len(number.of.groups)) {
      group.end <- groups.end.point.locations[i]-1
      new.groups.split.id[group.start:group.end] <- i
      # make the new group start higher for the next loop
      # (at the final loop it won't matter)
      group.start <- groups.end.point.locations[i]+1
    }
    new.groups.split.id[is.na(x)] <- 0
    return(split(x, new.groups.split.id)[-1])
  }
  split.vec.by.NA(x)

Thanks,
Tal

-- Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://bit.ly/9aKDM9 : embed images in Rd documents
|- http://tr.im/OIXN : raster images and RImageJ
|- http://tr.im/OcQe : Rcpp 0.7.7
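The vectorized approach from this thread can be collected into one self-contained sketch. The function name `split_by_na` and the `drop_empty` argument are hypothetical additions for illustration; the core logic (cumulative sum of NA positions, then `split()` on a factor with all levels kept) follows the replies above.

```r
# Split a vector into runs separated by NA values.
# drop_empty = TRUE discards the empty groups produced by consecutive NAs
# (this argument is an illustrative addition, not part of the thread's code).
split_by_na <- function(x, drop_empty = FALSE) {
  if (length(x) == 0) return(list())
  # Each NA bumps the group counter, so values between NAs share an index.
  idx <- 1 + cumsum(is.na(x))
  keep <- !is.na(x)
  # Keeping every level makes split() emit empty groups for back-to-back NAs.
  groups <- factor(idx[keep], levels = seq_len(max(idx)))
  out <- split(x[keep], groups)
  if (drop_empty) out <- out[lengths(out) > 0]
  out
}

x <- c(1, 2, NA, 1, 1, 2, NA, NA, 4, 5, 2, 3)
res <- split_by_na(x)
print(res)
```

Whether the empty group between two adjacent NAs should be kept depends on the use case, which is why the sketch leaves it as an option rather than choosing one behaviour.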
Re: [R] operator problem within function
Nice, thx. Which manual do you use? An Introduction to R? Or something special?

matt

On 29.04.2010, at 15:25, David Winsemius wrote:

On Apr 29, 2010, at 9:03 AM, Bunny, lautloscrew.com wrote:

Sorry for that off-list post, I did not mean to do it; I just hit the wrong button. Unfortunately this disadvantage is not written next to $ in the manual.

Hmmm. Not my manual: "Both [[ and $ select a single element of the list. The main difference is that $ does not allow computed indices, whereas [[ does." It also says that the correct equivalent using extraction operators of $ would be:

    x$name == x[["name", exact = FALSE]]

-- David.

On Apr 29, 2010, at 2:34 AM, Bunny, lautloscrew.com wrote:

David, with your help I finally got it. THX! Sorry for handing out some ugly names. The reason: it's a German dataset with German variable names. With those German names you are always sure you don't use a forbidden name. I just did not want to hit one of those by accident when changing these names for the mailing list. columna is just the Latin term for column :). Anyway, here's what worked. Note: I just tried to use some more real names here.

    recode_items = function(dataframe,question_number,medium=3){
      # note: column names of the initial data.frame are like Question1, Question2 etc.
      # Using [,1] would not be very practical since the df contains some other data too.
      # Indexing by names seemed to be the most comfortable way so far.
      question <- paste("Question",question_number,sep="")
      # needed indexing here that understands characters,
      # that's why going with [,question_number] did not work.
      dataframe[question][dataframe[question]==3]=0

This would be more typical:

    dataframe[dataframe[question]==3, question] <- 0

      return(dataframe)
    }

    recode_items(mydataframe,question_number,3)

This call uses the dataframe that contains the answers of survey participants. The question number is an argument that selects the question from the dataframe that should be recoded. In surveys some weighting schemes only respect extreme answers, which is why the medium answer is recoded to zero. Since it depends on the item scale what medium actually is, I need it to be an argument of my function.

Did you want a further logical test with that =1 or some sort of assignment???

So yes, it's an assignment.

Moral: Generally better to use [ indexing.

That's what really made my day (and it's only 9.30 a.m. here). Are there exceptions to the rule?

Not that I know of.

I just worked a lot with the $ in the past.

$colname is just syntactic sugar for either ["colname"] or [ ,"colname"] and it has the disadvantage that colname is not evaluated.

thx matt

On 29.04.2010, at 00:56, David Winsemius wrote:

On Apr 28, 2010, at 5:45 PM, David Winsemius wrote:

On Apr 28, 2010, at 5:31 PM, Bunny, lautloscrew.com wrote:

Dear all, I have a problem with processing dataframes within a function using the $. Here's my code:

    recode_items = function(dataframe,number,medium=2){
      # this works
      q <- paste("columna",number,sep="")

Do you really want q to equal "columna2" when number equals 2?

      # this does not work, particularly because dataframe is not processed
      # dataframe should be: givenframe$columnagivennumber
      a=dataframe$q[dataframe$q==medium]=1

Did you want a further logical test with that =1 or some sort of assignment???

a) Do you want to work on the column from dataframe (horrible name for this purpose IMO) with the name "columna2"? If so, then start with dataframe[ , q ]; the q will be evaluated in this form whereas it would not when used with $.

b) (A guess in absence of explanation of a goal.) Now do you want all of the rows where that vector equals medium? If so, then try this:

    dataframe[ dataframe[ , q ]==2 , ]   # untested in the absence of data

Ooops. Should have been:

    dataframe[ dataframe[ , q ]==medium , ]   # since both q and medium will be evaluated

Moral: Generally better to use [ indexing.

-- David.

      return(a)
    }

If I call this function, I'd like it to return my dataframe. The problem appears to be somewhere around the $. I'm sure this is not too hard, but somehow I am stuck. I'll keep searching the manuals. Thx for any help in advance.

best matt
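The advice in this thread can be put together roughly as follows. This is only a sketch: the survey data frame is invented for illustration, and unlike the version posted in the thread it compares against the medium argument instead of a hard-coded 3 and treats NA answers as non-matches.

```r
# Sketch of the recoding function discussed in the thread.
# Assumption: columns are named Question1, Question2, ... as described.
recode_items <- function(df, question_number, medium = 3) {
  # build the column name as a character string
  question <- paste("Question", question_number, sep = "")
  # [[ evaluates `question`, whereas df$question would look for a
  # column literally called "question"
  hits <- !is.na(df[[question]]) & df[[question]] == medium
  df[hits, question] <- 0
  df
}

# Made-up example data, not from the original post
survey <- data.frame(id = 1:4,
                     Question1 = c(1, 3, 5, 3),
                     Question2 = c(3, 2, 3, 4))
recoded <- recode_items(survey, 1)
print(recoded)
```

Only Question1 is touched here; Question2 keeps its 3s because the column is selected by the computed name.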
[R] how to parse out fitting statistics and write them into a data frame?
Hello, everyone:

I am conducting a t test between drug and control for about 50,000 genes using the following syntax (treatment is a factor):

    result <- lapply(split(data, data$gene), function(x) lm(value~treatment,x))

However, the result is a list and I do not know whether more model fitting statistics (like the p value of the t test) are included in result or not. If I print the first element of result I get the following:

    result[1]
    $`1007_s_at`

    Call:
    lm(formula = logvalue ~ treatment, data = x)

    Coefficients:
     (Intercept)  treatmentveh
          8.9403        0.3232

    summary(result[1])
              Length Class Mode
    1007_s_at 13     lm    list

So my question is whether more fitting statistics (other than the coefficient estimates, like the p value) are included in the result. If yes, how can I parse them into a data frame so that I can output those statistics into a .csv file that can be shared with my clients? If not, how can I modify the code so that more statistics can be computed and stored? Any constructive suggestions are welcome.

-- View this message in context: http://r.789695.n4.nabble.com/how-to-parse-out-fitting-statistics-and-write-them-into-a-data-frame-tp2075707p2075707.html
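One possible way to get at the p values, sketched here on invented toy data since the poster's data are not shown: the coefficient table returned by coef(summary(fit)) carries the estimate, standard error, t value and p value for each term. Note also that result[1] is a one-element list, so summary() on it describes the list; result[[1]] would be the lm fit itself. The object names and the output file name below are assumptions for illustration.

```r
# Toy data standing in for the poster's `data` (hypothetical values)
set.seed(1)
data <- data.frame(
  gene      = rep(c("g1", "g2", "g3"), each = 6),
  treatment = factor(rep(c("drug", "veh"), times = 9)),
  value     = rnorm(18, mean = 8)
)

# One lm fit per gene, as in the original post
fits <- lapply(split(data, data$gene),
               function(d) lm(value ~ treatment, data = d))

# Pull the treatment row (row 2) out of the coefficient table
extract_stats <- function(fit) {
  coefs <- coef(summary(fit))  # columns: Estimate, Std. Error, t value, Pr(>|t|)
  data.frame(estimate  = coefs[2, "Estimate"],
             std_error = coefs[2, "Std. Error"],
             t_value   = coefs[2, "t value"],
             p_value   = coefs[2, "Pr(>|t|)"])
}

stats <- do.call(rbind, lapply(fits, extract_stats))
stats <- data.frame(gene = names(fits), stats,
                    row.names = NULL, stringsAsFactors = FALSE)

# Write to a temporary location; replace with a real path as needed
out_file <- file.path(tempdir(), "gene_stats.csv")
write.csv(stats, out_file, row.names = FALSE)
print(stats)
```

With a two-level treatment factor, the t statistic on the treatment coefficient is the same as that of a two-sample t test with equal variances assumed, so the p_value column is the quantity asked about.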
Re: [R] operator problem within function
That was copied from the help page that comes up with:

    ?"$"

It is rather special.

-- David.

On Apr 29, 2010, at 12:26 PM, Bunny, lautloscrew.com wrote:

Nice, thx. Which manual do you use? An Introduction to R? Or something special?

matt

On 29.04.2010, at 15:25, David Winsemius wrote:

On Apr 29, 2010, at 9:03 AM, Bunny, lautloscrew.com wrote:

Sorry for that off-list post, I did not mean to do it; I just hit the wrong button. Unfortunately this disadvantage is not written next to $ in the manual.

Hmmm. Not my manual: "Both [[ and $ select a single element of the list. The main difference is that $ does not allow computed indices, whereas [[ does." It also says that the correct equivalent using extraction operators of $ would be:

    x$name == x[["name", exact = FALSE]]

-- David.
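The behaviour described on that help page can be tried out on a small list. The list below is made up for illustration; the point is that $ matches names partially and does not evaluate its argument, while [[ matches exactly by default and accepts a computed name.

```r
# Made-up list for illustration
x <- list(name = 1:3, other = "a")

partial_dollar <- x$na                      # $ matches "na" partially to "name"
exact_bracket  <- x[["na"]]                 # [[ is exact by default, so NULL
loose_bracket  <- x[["na", exact = FALSE]]  # same lookup as x$na

col <- "name"
not_evaluated <- x$col     # looks for an element literally called "col": NULL
evaluated     <- x[[col]]  # evaluates `col` and returns x[["name"]]

print(list(partial_dollar, exact_bracket, loose_bracket,
           not_evaluated, evaluated))
```

This is why a column name built with paste() has to go through [ or [[ rather than $.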