Re: [R] yaxp problem for more irregular time series in one plot
lanc...@fns.uniba.sk wrote:
> Good day,
>
> I'm trying to get several time series into one plot. Because the values of the variables differ widely, I need a logarithmic y axis. The code I use is the following:
>
>   nvz_3_data <- read.csv('/home/tomas/R_outputs/nvz_3.csv')
>   date <- (nvz_3_data$date)
>   NO3 <- (nvz_3_data$NO3)
>   NH4 <- (nvz_3_data$NH4)
>   date_p <- as.POSIXct(date, "CET")
>   par(mfrow=c(2,1), ylog = TRUE, yaxp = c(0.01, 100, 3))
>   plot(date_p, NO3, log = "y", type = "l", col = "darkred", main = "NVZ-1", xlab = "time", ylab = "NO3-")
>   lines(date_p, NH4, col = "darkblue", lty = "dotted")
>   plot(date_p, NH4, log = "y", type = "l", col = "darkblue", main = "NVZ-1", xlab = "time", ylab = "NH4+")
>
> So, as I understood it, the extreme (max and min) values on the y axis are controlled by yaxp, but it is ignored in the plot, and the NH4 values fall outside the plot (see the attached picture). Does somebody know what I am doing wrong?
>
> Many thanks in advance
> Tomas
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

Hey Tomas,

the yaxp setting isn't ignored. The problem is the range of your y data. Your plot is scaled to the NO3 data, and because the NH4 values are higher, you won't be able to see them on the plot. A solution is to set the ylim argument in your plot call.

Here is your changed code:

  nvz_3_data <- read.csv('/home/tomas/R_outputs/nvz_3.csv')
  date <- (nvz_3_data$date)
  NO3 <- (nvz_3_data$NO3)
  NH4 <- (nvz_3_data$NH4)
  maxy <- max(NO3, NH4)                      ## the maximum value of your data
  miny <- min(NO3[NO3 > 0], NH4[NH4 > 0])    ## the minimum value of your data which is > 0 (because: log)
  date_p <- as.POSIXct(date, "CET")
  par(mfrow=c(2,1), ylog = TRUE, yaxp = c(0.01, 100, 3))
  plot(date_p, NO3, log = "y", ylim = c(miny, maxy), type = "l", col = "darkred", main = "NVZ-1", xlab = "time", ylab = "NO3-")
  lines(date_p, NH4, col = "darkblue", lty = "dotted")
  plot(date_p, NH4, log = "y", type = "l", col = "darkblue", main = "NVZ-1", xlab = "time", ylab = "NH4+")

I hope I was able to help you.

Regards,
Christian
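For readers without the original CSV, the ylim idea can be tried on made-up data. This is only a sketch: the simulated NO3 and NH4 values and the dates are assumptions for illustration, not Tomas's measurements.

```r
# Simulated stand-ins for the two series (assumed data, not the real CSV)
set.seed(42)
date_p <- as.POSIXct("2009-01-01", tz = "CET") + 86400 * sort(sample(0:200, 40))
NO3 <- exp(rnorm(40, mean = 0, sd = 0.5))   # values around 1
NH4 <- exp(rnorm(40, mean = 3, sd = 0.5))   # values around 20

# Common y range over both series; only positive values are valid on a log axis
ylim <- range(c(NO3[NO3 > 0], NH4[NH4 > 0]))

plot(date_p, NO3, log = "y", ylim = ylim, type = "l", col = "darkred",
     xlab = "time", ylab = "concentration")
lines(date_p, NH4, col = "darkblue", lty = "dotted")
```

Because ylim spans both series, the second line drawn with lines() stays inside the plot region.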
Re: [R] graphs
Mary A. Marion wrote:
> Hello,
>
> I am plotting two distributions and want to draw a vertical line at the critical point 149. How can I stop it from going further up than the norm(140,15) curve?
>
>   x <- seq(75,225,0.1)
>   plot(x, dnorm(x, mean=140, sd=15), type='l', col='navy')
>   abline(v = 149, col = "black")
>   curve(dnorm(x, mean=150, sd=15), from=75, to=225, col='orange', add=TRUE)
>
> Thank you.
>
> Sincerely,
> Mary A. Marion

Hey,

in your case you shouldn't use abline. Instead, try segments:

  segments(149, 0, 149, dnorm(149, mean=140, sd=15))

Regards,
Christian Porsche
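Put together with the original plotting code, the segments() suggestion might look like the sketch below; the variable names crit and top are introduced here only for readability.

```r
x <- seq(75, 225, 0.1)
crit <- 149                                # critical point from the question
top <- dnorm(crit, mean = 140, sd = 15)    # height of the N(140, 15) curve at crit

plot(x, dnorm(x, mean = 140, sd = 15), type = "l", col = "navy", ylab = "density")
curve(dnorm(x, mean = 150, sd = 15), from = 75, to = 225, col = "orange", add = TRUE)

# Vertical line from the x axis up to the navy curve, and no further
segments(crit, 0, crit, top, col = "black")
```

Unlike abline(v = 149), which spans the whole plot region, segments() stops at the y value it is given.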
Re: [R] graphs
Hi Mary,

One can use arrows too. Here is the code:

  x <- seq(75,225,0.1)
  plot(x, dnorm(x, mean=140, sd=15), type='l', col='navy')
  arrows(149, 0, 149, dnorm(149,140,15), length=0)
  par(new=T)
  plot(x, dnorm(x, mean=150, sd=15), type='l', col='orange', axes=F)

Regards
Radha

On Sun, Jul 26, 2009 at 5:09 AM, Mary A. Marion mms...@comcast.net wrote:
> Hello,
>
> I am plotting two distributions and want to draw a vertical line at the critical point 149. How can I stop it from going further up than the norm(140,15) curve?
>
>   x <- seq(75,225,0.1)
>   plot(x, dnorm(x, mean=140, sd=15), type='l', col='navy')
>   abline(v = 149, col = "black")
>   curve(dnorm(x, mean=150, sd=15), from=75, to=225, col='orange', add=TRUE)
>
> Thank you.
>
> Sincerely,
> Mary A. Marion
Re: [R] Sweave, cacheSweave, and data frame
Dear Gabor,

Thanks for the suggestion. I am writing a research paper using Sweave, not building an R package. If I understand correctly, the --no-vignettes option does not really help in my case.

Shige

On Sun, Jul 26, 2009 at 11:56 AM, Gabor Grothendieck ggrothendi...@gmail.com wrote:
> Another thing you can do to save time is to use the --no-vignettes switch when you build the package.
>
> On Sat, Jul 25, 2009 at 11:05 PM, Shige Song shiges...@gmail.com wrote:
>> Dear All,
>>
>> I have been using Sweave (mainly via the Sweave.sh script) and really like it. I am working on a paper (using Sweave, of course) which includes several time-consuming computations, and it gets tedious to re-compile the whole thing every time I make changes. Then I discovered the cacheSweave package, which seems the right solution to my problem. I only have one problem. Here is what I did:
>>
>> --
>> <<results=hide,echo=false>>=
>> library(foreign)
>> library(Zelig)
>> library(memisc)
>> options(digits=4)
>> @
>> <<echo=false>>=
>> d <- read.dta("~/project/abortion/data/data_transition_wide.dta")
>> @
>> ...
>> --
>>
>> It can be compiled using "Sweave.sh foo.Rnw", but when I tried "Sweave.sh -c foo.Rnw", I got this error message:
>>
>> ...Processing code chunks ...
>> 1 : term hide
>> Error in data.frame(chunk = options$label, chunkprefix = chunkprefix, :
>>   arguments imply differing number of rows: 0, 1
>>
>> I thought it might be an incompatibility problem between the cacheSweave and the foreign packages, but the problem was still there when I tried to read the data in using the read.table function.
>>
>> Any ideas? Many thanks.
>>
>> Best,
>> Shige
Re: [R] Sweave, cacheSweave, and data frame
Right. It's to avoid building the vignettes when you are mainly interested in testing out a package.

On Sun, Jul 26, 2009 at 3:07 AM, Shige Song shiges...@gmail.com wrote:
> Dear Gabor,
>
> Thanks for the suggestion. I am writing a research paper using Sweave, not building an R package. If I understand correctly, the --no-vignettes option does not really help in my case.
>
> Shige
>
> On Sun, Jul 26, 2009 at 11:56 AM, Gabor Grothendieck ggrothendi...@gmail.com wrote:
>> Another thing you can do to save time is to use the --no-vignettes switch when you build the package.
>>
>> On Sat, Jul 25, 2009 at 11:05 PM, Shige Song shiges...@gmail.com wrote:
>>> [...]
Re: [R] Determine the dimension-names of an element in an array in R
Hi Gabor:

Many thanks for your prompt reply! The code is fine, but I need it in a more general form: as I had mentioned, I need to input any 0 to find its dimension-names. Actually, I was using sapply to calculate correlations, and this idea was required in the middle of the correlation calculation. Here is the way I tried my calculation.

  a = c("A1","A2","A3","A4","A5")
  b = c("B1","B2","B3")
  c = c("C1","C2","C3","C4")
  d = c("D1","D2")
  e = c("E1","E2","E3","E4","E5","E6","E7","E8")

  DataArray_1 = array(c(rnorm(240)), dim=c(length(a),length(b),length(d),length(e)), dimnames=list(a,b,d,e))
  DataArray_2 = array(c(rnorm(320)), dim=c(length(a),length(c),length(d),length(e)), dimnames=list(a,c,d,e))

  #Defining an empty array which will contain the correlation values (output array)
  Correl = array(NA, dim=c(length(a),length(b),length(c),length(d)), dimnames=list(a,b,c,d))

  #Calculating Correlation between attributes b & c over values of e
  Correl = sapply(Correl, function(d) cor(DataArray_1[...], DataArray_2[...], use="pairwise.complete.obs"))

This is where I get stuck. In the above, d is acting as an element in the Correl array. Hence I need to get the dimension-names for d.

  #The first element of Correl will be:
  cor(DataArray_1[dimnames(Correl)[[1]][1],dimnames(Correl)[[2]][1],dimnames(Correl)[[4]][1],],
      DataArray_2[dimnames(Correl)[[1]][1],dimnames(Correl)[[3]][1],dimnames(Correl)[[4]][1],],
      use="pairwise.complete.obs")

So my problem boils down to extracting the dim-names in terms of the element (d) and not in terms of Correl (this is what I have written as ... in the above code).

My sincere thanks for your valuable time & suggestions.

Many Thanks & Kind Regards,
Sauvik

On Sun, Jul 26, 2009 at 5:26 AM, Gabor Grothendieck ggrothendi...@gmail.com wrote:
> Try this:
>
>   ix <- c(1, 3, 4, 2)
>   mapply("[", dimnames(mydatastructure), ix)
>   [1] "S1" "T3" "U4" "V2"
>
> On Sat, Jul 25, 2009 at 5:12 PM, Sauvik De sauvik.s...@gmail.com wrote:
>> Hi:
>>
>> How can I extract the dimension-names of a pre-defined element in a multidimensional array in R?
>>
>> A toy example is provided below: I have a 4-dimensional array with each dimension having a certain length. In the example below, mydatastructure explains the structure of my data.
>>
>>   mydatastructure = array(0, dim=c(length(b),length(z),length(x),length(d)), dimnames=list(b,z,x,d))
>>
>> where
>>
>>   b = c("S1","S2","S3","S4","S5")
>>   z = c("T1","T2","T3")
>>   x = c("U1","U2","U3","U4")
>>   d = c("V1","V2")
>>
>> Clearly, mydatastructure contains many 0's. Now how can I get the dimension-names of any particular 0? That is, my input should be a particular 0 in the array mydatastructure (suppose this 0 corresponds to S1, T3, U4 & V2 in the array). Then my output should be S1, T3, U4 & V2.
>>
>> The function dimnames didn't help me with the solution. Any idea will be greatly appreciated.
>>
>> Thanks for your time!
>>
>> Kind Regards,
>> Sauvik
Re: [R] Sweave, cacheSweave, and data frame
Shige Song wrote:
> I have been using Sweave (mainly via the Sweave.sh script) and really like it. I am working on a paper (using Sweave, of course) which includes several [...]

To check what's wrong, please post a minimal reproducible example. From your post it is difficult to guess whether this has to do with the data frame or with something else.

Dieter
Re: [R] Sweave, cacheSweave, and data frame
Shige Song wrote:
> Dear All,
>
> I have been using Sweave (mainly via the Sweave.sh script) and really like it. I am working on a paper (using Sweave, of course) which includes several time-consuming computations, and it gets tedious to re-compile the whole thing every time I make changes. Then I discovered the cacheSweave package, which seems the right solution to my problem. I only have one problem. Here is what I did:

I can't help with your question, but another approach to do what you want is to have one script that does the time-consuming calculations and saves the results (using save()) to a file, then have your Sweave document load() the object as necessary. You could then also move the read.dta() line to this other script, and load() that data, simplifying your Sweave document.

Duncan Murdoch

> --
> <<results=hide,echo=false>>=
> library(foreign)
> library(Zelig)
> library(memisc)
> options(digits=4)
> @
> <<echo=false>>=
> d <- read.dta("~/project/abortion/data/data_transition_wide.dta")
> @
> ...
> --
>
> It can be compiled using "Sweave.sh foo.Rnw", but when I tried "Sweave.sh -c foo.Rnw", I got this error message:
>
> ...Processing code chunks ...
> 1 : term hide
> Error in data.frame(chunk = options$label, chunkprefix = chunkprefix, :
>   arguments imply differing number of rows: 0, 1
>
> I thought it might be an incompatibility problem between the cacheSweave and the foreign packages, but the problem was still there when I tried to read the data in using the read.table function.
>
> Any ideas? Many thanks.
>
> Best,
> Shige
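The save()/load() workflow Duncan describes could be sketched roughly as follows. The file name, the temporary directory, and the toy computation are placeholders chosen for this illustration; in practice the first part would live in a separate script and the second part in a chunk of the Sweave document.

```r
## Part 1: separate script that runs the slow computation once
cache_file <- file.path(tempdir(), "slow_results.RData")  # hypothetical location
results <- colMeans(matrix(1:6, nrow = 2))                # stand-in for a slow computation
save(results, file = cache_file)

## Part 2: what a chunk in the Sweave document would do instead of recomputing
rm(results)        # simulate a fresh R session
load(cache_file)   # restores the object 'results' under its original name
print(results)
```

Only the first script needs to be re-run when the expensive computation changes; recompiling the document then costs a load() call.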
Re: [R] Determine the dimension-names of an element in an array in R
Sauvik De wrote:
> [...] So my problem boils down to extracting the dim-names in terms of the element (d) and not in terms of Correl (this is what I have written as ... in the above code). [...]

Hey,

I have spent some time writing a function which should fulfill your needs. So I hope ;-)

  findIndex <- function(data, element) {
    ld <- length(data)
    el <- which(is.element(data, element))
    lel <- length(el)
    ndim <- length(dim(data))
    ind <- array(, dim=c(lel,ndim), dimnames=list(el,1:ndim))
    precomma <- ""
    tempdata <- data
    tempel <- el
    for (j in 1:lel) {
      data <- tempdata
      el <- tempel
      ld <- length(data)
      for (i in ndim:1) {
        ratio <- el[j]/(ld/dim(data)[i])
        if (ratio-trunc(ratio) > 0) {
          ind[j,i] <- trunc(ratio)+1
        } else {
          ind[j,i] <- trunc(ratio)
        }
        if (length(dim(data)) > 1) {
          k <- 1
          while (k >= 1 & k <= (i-1)) {
            precomma <- paste(precomma, ",", sep="")
            k <- k+1
          }
          data <- as.array(eval(parse(text=paste("data[", precomma, ind[j,i], "]", sep=""))))
          precomma <- ""
          ld <- length(data)
          el[j] <- which(is.element(data, element))
        }
      }
    }
    return(ind)
  }

Regards,
Christian Porsche
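As a possible alternative to a hand-written index search, base R's arrayInd() converts a linear position in an array into one subscript per dimension, which can then be looked up in dimnames(). The sketch below assumes the position of the element is known, since a bare value such as 0 does not carry its location; the position 116 is chosen here so that it corresponds to S1, T3, U4, V2.

```r
b <- c("S1", "S2", "S3", "S4", "S5")
z <- c("T1", "T2", "T3")
x <- c("U1", "U2", "U3", "U4")
d <- c("V1", "V2")
mydatastructure <- array(0, dim = c(length(b), length(z), length(x), length(d)),
                         dimnames = list(b, z, x, d))

pos <- 116                                  # linear (column-major) position of one element
idx <- arrayInd(pos, dim(mydatastructure))  # 1 x 4 matrix of subscripts

# Pick the idx[k]-th name from the k-th dimnames component
element_names <- mapply(`[`, dimnames(mydatastructure), idx[1, ])
print(element_names)
```

The last step is the same mapply() call Gabor suggested earlier in the thread, with the subscripts computed rather than typed in.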
Re: [R] zoo plot: yearly marks on X-Axis
stvienna wiener wrote:
> Hi all,
>
> I am plotting a financial time series, but I need a more detailed x axis. Example:
>
>   x <- zoo(rnorm(1:6000), as.Date("1992-11-11")+c(1:6000))
>   plot(x)
>
> The x axis is labeled 1995, 2000 and 2005. I would need either 1995, 1997, etc., or maybe yearly labels. I used Google first, then looked at ?plot.zoo, but couldn't get it working.
>
> Regards,
> Steve

Hey,

try something like the following:

  plot(x, y, xaxt="n")
  axis.Date(1, at=seq(as.Date("1960-01-01"), max(as.Date(x)), "years"), labels = FALSE, tcl = -0.2)
  axis.Date(1, at=seq(as.Date("1960-01-01"), max(as.Date(x)), "5 years"), labels = TRUE, las=3, tcl = -0.2)

Regards,
Christian
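The same idea can be tried without the zoo package. This sketch uses a plain Date vector and simulated values (both assumptions made for the illustration), with small ticks every year and labels every second year, as Steve asked.

```r
set.seed(1)
dates <- as.Date("1992-11-11") + 1:6000
values <- cumsum(rnorm(6000))               # simulated series, not real financial data

plot(dates, values, type = "l", xaxt = "n", xlab = "", ylab = "value")

yearly   <- seq(as.Date("1993-01-01"), max(dates), by = "year")
labelled <- seq(as.Date("1993-01-01"), max(dates), by = "2 years")

axis.Date(1, at = yearly, labels = FALSE, tcl = -0.2)   # unlabelled tick every year
axis.Date(1, at = labelled, format = "%Y", las = 3)     # labelled tick every 2 years
```

With a zoo object, the dates would presumably come from its time index rather than from a separate vector.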
[R] Is there an R implementation for the Barnard's exact test (a substitute for fisher.test) ?
Hello R-help members,

Today I came across an article on Barnard's exact test (http://www.cytel.com/Papers/twobinomials.pdf), which is supposed to be a more powerful alternative to fisher.test because it doesn't assume that the row and column totals are known in advance. Any pointers to such a function?

Thanks,
Tal

--
My contact information:
Tal Galili
Phone number: 972-50-3373767
FaceBook: Tal Galili
My Blogs: http://www.r-statistics.com/ http://www.talgalili.com http://www.biostatistics.co.il
Re: [R] Determine the dimension-names of an element in an array in R
Hi:

Many thanks for your valuable time! But I am not sure how you intend the function to be used in this situation. As I had mentioned, the first element of my output array should be:

  cor(DataArray_1[dimnames(Correl)[[1]][1],dimnames(Correl)[[2]][1],dimnames(Correl)[[4]][1],],
      DataArray_2[dimnames(Correl)[[1]][1],dimnames(Correl)[[3]][1],dimnames(Correl)[[4]][1],],
      use="pairwise.complete.obs")

in my code below, and I wish to get the output array of correlations using sapply as follows:

  Correl = sapply(Correl, function(d) cor(DataArray_1[...], DataArray_2[...], use="pairwise.complete.obs"))

So it would be of great help if you could kindly specify how to use your function findIndex in place of the "...".

Apologies for all this!

Thanks & Regards,
Sauvik

On Sun, Jul 26, 2009 at 3:54 PM, Poersching poerschin...@web.de wrote:
> [...]
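One way to fill in the "..." is to iterate over the positions of Correl rather than over its values, because sapply(Correl, ...) only passes each NA value and not where it sits. The sketch below is not the approach proposed in the thread; it relies on base R's arrayInd() and apply(), and the names a_n, b_n, c_n, d_n, e_n replace a to e only to avoid masking the function c().

```r
set.seed(1)
a_n <- paste0("A", 1:5); b_n <- paste0("B", 1:3); c_n <- paste0("C", 1:4)
d_n <- paste0("D", 1:2); e_n <- paste0("E", 1:8)

DataArray_1 <- array(rnorm(240), dim = c(5, 3, 2, 8), dimnames = list(a_n, b_n, d_n, e_n))
DataArray_2 <- array(rnorm(320), dim = c(5, 4, 2, 8), dimnames = list(a_n, c_n, d_n, e_n))
Correl <- array(NA_real_, dim = c(5, 3, 4, 2), dimnames = list(a_n, b_n, c_n, d_n))

# One row of subscripts (a, b, c, d) per element of Correl, in storage order
idx <- arrayInd(seq_along(Correl), dim(Correl))

# Correlate over the e dimension for each (a, b, c, d) combination
Correl[] <- apply(idx, 1, function(i) {
  cor(DataArray_1[i[1], i[2], i[4], ],
      DataArray_2[i[1], i[3], i[4], ],
      use = "pairwise.complete.obs")
})
```

Because the rows of idx follow the storage order of Correl, assigning with Correl[] puts each correlation in the matching cell and keeps the dimnames.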
Re: [R] How to add 95% confidence intervals in the calibration plot?
zhu yao wrote:
> Dear experts:
>
> I am a newbie to R. Recently, I have been trying to build prediction models with R and the Design library. I have read Prof. Harrell's excellent book, but I did not quite understand it. I have two problems with the validation and calibration of prediction models:
>
> 1. Can someone explain the results output by the validate() function? How do I get the 95% CI of the c-value from validate?

validate does not provide that confidence interval, unfortunately.

> 2. How do I add a 95% CI to the calibration plot?

That is not provided except for survival models. Next time please include your code so we can see what model you are using.

Thanks
Frank

> Yao Zhu
> Department of Urology
> Fudan University Shanghai Cancer Center

--
Frank E Harrell Jr
Professor and Chair, School of Medicine
Department of Biostatistics, Vanderbilt University
Re: [R] How to add 95% confidence intervals in the calibration plot?
Thanks for your reply.

Actually, I'm confused about the results in the article "Postoperative nomogram for survival of patients with retroperitoneal sarcoma treated with curative intent":
http://annonc.oxfordjournals.org/cgi/content/abstract/mdp298v1

It states:

  Nomogram model
  The Cox model was used as the basis for the nomogram (Table 2). Figure 2 depicts the final nomogram and portrays the association between each variable and survival based on the scoring system derived from this analysis. The concordance index (discrimination) after internal validation with 200 bootstrapping resamples was 0.73 (95% CI 0.71–0.75). Similarly, Figure 3 illustrates the calibration of the nomogram before and after internal validation with bootstrapping samples. Calibration was excellent with observed outcomes always within 95% CI of the predicted survival probability.

Figure 3 is http://i3.6.cn/cvbnm/a9/c8/8b/c01aad248a0b4ae6ef677600614bd4fa.jpg

2009/7/26 Frank E Harrell Jr f.harr...@vanderbilt.edu
> [...]
Re: [R] How to add 95% confidence intervals in the calibration plot?
zhu yao wrote:
> Thanks for your reply. Actually, I'm confused about the results in the article "Postoperative nomogram for survival of patients with retroperitoneal sarcoma treated with curative intent", http://annonc.oxfordjournals.org/cgi/content/abstract/mdp298v1
>
> It states:
>
> Nomogram model. The Cox model was used as the basis for the nomogram (Table 2). Figure 2 depicts the final nomogram and portrays the association between each variable and survival based on the scoring system derived from this analysis. The concordance index (discrimination) after internal validation with 200 bootstrapping resamples was 0.73 (95% CI 0.71–0.75). Similarly, Figure 3 illustrates the calibration of the nomogram before and after internal validation with bootstrapping samples. Calibration was excellent with observed outcomes always within 95% CI of the predicted survival probability.

Figure 3 is provided by the Design package without modification. As I stated before, it does provide those CIs for survival models. I guess that the CI for the c-index was obtained without bootstrap validation using the Hmisc package's rcorr.cens function (and Dxy = 2*(C - 0.5)), or by using an approximate bootstrap analysis they programmed.

Note that in the abstract the authors wrongly used the confidence intervals in Fig. 3 to conclude excellent validation of the model. Their conclusion can arise from just having large confidence intervals.

Frank

> Figure 3 is http://i3.6.cn/cvbnm/a9/c8/8b/c01aad248a0b4ae6ef677600614bd4fa.jpg
>
> 2009/7/26 Frank E Harrell Jr f.harr...@vanderbilt.edu:
>> zhu yao wrote:
>>> Dear experts: I am a newbie to R. Recently I have been trying to build prediction models with R and the Design library. I have read Prof. Harrell's excellent book, but I did not quite understand it. I have two problems about the validation and calibration of prediction models:
>>> 1. Can someone explain the results output by the validate() function? How can I get a 95% CI for the c-value from validate?
>> validate does not provide that confidence interval, unfortunately.
>>> 2. How can I add 95% CIs to the calibration plot?
>> That is not provided except for survival models. Next time please include your code so we can see what model you are using. Thanks, Frank
>>> Yao Zhu
>>> Department of Urology, Fudan University Shanghai Cancer Center

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
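The relation Dxy = 2*(C - 0.5) mentioned in the reply is a linear rescaling, so point estimates and interval endpoints convert directly between the two scales. A minimal base-R sketch, using only the figures quoted from the abstract; the helper names and the normal-approximation step are assumptions made for illustration, not something the article or the Design package states:

```r
# Concordance index and 95% CI as quoted from the abstract in the thread.
c_index <- 0.73
c_ci    <- c(lower = 0.71, upper = 0.75)

# Hypothetical helpers: Somers' Dxy is a linear rescaling of C.
c_to_dxy <- function(c) 2 * (c - 0.5)
dxy_to_c <- function(dxy) dxy / 2 + 0.5

dxy    <- c_to_dxy(c_index)   # point estimate on the Dxy scale
dxy_ci <- c_to_dxy(c_ci)      # interval endpoints on the Dxy scale

# Because C = Dxy / 2 + 0.5, a standard error for Dxy is twice that of C.
# The SE implied by the quoted interval, ASSUMING a normal approximation:
se_c   <- unname(diff(c_ci)) / (2 * qnorm(0.975))
se_dxy <- 2 * se_c
```

Going the other way, a standard error reported on the Dxy scale would be halved to get one for C.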
[R] R: R: Is there a way to extract some fields data from HTML pages through any R function ?
It works if the web file address is of the type "http://". It does not work if the web file address is of the type "ftp://".

outFile <- read.xls("ftp://ftp.sanger.ac.uk/pub/mirbase/sequences/CURRENT/miRNA.xls")
Error in xls2csv(xls, sheet, verbose = verbose, ..., perl = perl) :
  Unable to read xls file 'ftp://ftp.sanger.ac.uk/pub/mirbase/sequences/CURRENT/miRNA.xls'.
Error in file.exists(tfn) : invalid 'file' argument

But the file does exist, as shown in the following:

download.file("ftp://ftp.sanger.ac.uk/pub/mirbase/sequences/CURRENT/miRNA.xls", "outFile")
trying URL 'ftp://ftp.sanger.ac.uk/pub/mirbase/sequences/CURRENT/miRNA.xls'
ftp data connection made, file length 2563072 bytes
opened URL
downloaded 2.4 Mb

Can the two steps (download + read.xls) be performed with one command line only ?

Thank you,
Maura

-----Original Message-----
From: r-help-boun...@r-project.org on behalf of Daniel Nordlund
Sent: Mon 06/07/2009 20.45
To: r-h...@stat.math.ethz.ch
Subject: Re: [R] R: R: Is there a way to extract some fields data from HTML pages through any R function ?

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of mau...@alice.it
> Sent: Sunday, July 05, 2009 11:28 PM
> To: Martin Morgan
> Cc: r-h...@stat.math.ethz.ch
> Subject: [R] R: R: Is there a way to extract some fields data from HTML pages through any R function ?
>
> It helps. But it is overly sophisticated. I have already downloaded and used the Excel file containing the validated stuff. Since there are R commands to download gzip as well as FASTA files, I wonder whether it is possible to automatically download the Excel file from http://mirecords.umn.edu/miRecords/download.php
> Actually the latter may not be the actual file URL, because it is necessary to click on the word "here" to download the file.
> Thank you, Maura

Maura,

I haven't seen a response to your question (however, I may just have missed it, or you may have received an off-line response). I went to the URL above and found that the Excel file is at http://mirecords.umn.edu/miRecords/download_data.php?v=1

I think you could use the read.xls() function from the gdata package to get the file, something like this:

library(gdata)
df <- read.xls("http://mirecords.umn.edu/miRecords/download_data.php?v=1")

Hope this is helpful,
Dan

Daniel Nordlund
Bothell, WA USA
Re: [R] R: R: Is there a way to extract some fields data from HTML pages through any R function ?
I will send you offline an enhancement for read.xls that accepts ftp connections.

On Sun, Jul 26, 2009 at 11:32 AM, mau...@alice.it wrote:
> It works if the web file address is of the type "http://". It does not work if the web file address is of the type "ftp://".
> [...]
> Can the two steps (download + read.xls) be performed with one command line only ?
> Thank you, Maura
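Until a reader accepts ftp:// addresses directly, the two steps asked about in this thread can be wrapped in one call. The sketch below is only an illustration: `read_remote` and its arguments are hypothetical names, not part of gdata, and the reader function is passed in so that the wrapper itself depends on base R only. The offline check assumes a Unix-like path so that prefixing "file://" gives a valid URL.

```r
# Hypothetical helper: download to a temporary file, then hand it to a reader.
read_remote <- function(url, reader, ..., fileext = ".xls") {
  tmp <- tempfile(fileext = fileext)
  on.exit(unlink(tmp), add = TRUE)   # remove the temporary copy afterwards
  # mode = "wb" keeps binary files such as .xls intact on every platform.
  status <- download.file(url, destfile = tmp, mode = "wb", quiet = TRUE)
  if (status != 0) stop("download failed for ", url)
  reader(tmp, ...)
}

# Intended use from the thread (needs the gdata package and network access):
# mirna <- read_remote(
#   "ftp://ftp.sanger.ac.uk/pub/mirbase/sequences/CURRENT/miRNA.xls",
#   gdata::read.xls)

# Offline check with a local file:// URL and base R's read.csv
# (assumes a Unix-like absolute path).
src <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:3, value = c(2.5, 3.5, 4.5)), src, row.names = FALSE)
demo <- read_remote(paste0("file://", src), read.csv, fileext = ".csv")
```

Passing the reader as an argument keeps the download logic independent of the file format, so the same wrapper would serve for read.csv or read.table as well.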
Re: [R] Collinearity in Linear Multiple Regression
Stephan Kolassa wrote:
> Hi Alex, I personally have had more success with the (more complicated) collinearity diagnostics proposed by Belsley, Kuh & Welsch in their book "Regression Diagnostics" than with Variance Inflation Factors. See also: Belsley, D. A. A Guide to using the collinearity diagnostics. Computational Economics, 1991, 4, 33-50.
> However, I know of no R package that implements these diagnostics. Anyway, it's not hard to do so oneself.

R code utilizing singular value decomposition and variance decomposition proportions along lines proposed by D. Belsley, Conditioning Diagnostics (1991), is available at http://www.uri.edu/artsci/ecn/burkett/scivdp.R

-John

> Good luck! Stephan
>
> Alex Roy wrote:
>> Dear all, how can I test for collinearity in the predictor data set for multiple linear regression? Thanks, Alex

--
John P. Burkett
Department of Economics
University of Rhode Island
Kingston, RI 02881-0808 USA
phone (401) 874-9195
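For readers who cannot reach the script linked above, the following base-R sketch shows the general shape of these diagnostics: condition indexes from the singular values of the column-scaled design matrix, plus variance-decomposition proportions. It is an independent illustration written for this note, not the code at that URL; the function name `belsley_diag` and the simulated data are made up.

```r
# Hypothetical helper illustrating Belsley-style conditioning diagnostics.
belsley_diag <- function(X) {
  X <- as.matrix(X)
  # Scale each column to unit length (no centering), intercept included.
  Xs <- sweep(X, 2, sqrt(colSums(X^2)), "/")
  s  <- svd(Xs)
  cond_index <- max(s$d) / s$d
  # phi[k, j] = v[k, j]^2 / d[j]^2: part of var(b_k) tied to singular value j.
  phi <- sweep(s$v^2, 2, s$d^2, "/")
  # Rows: singular values (same order as cond_index); columns: coefficients.
  var_prop <- t(phi / rowSums(phi))
  colnames(var_prop) <- colnames(X)
  list(cond_index = cond_index, var_prop = var_prop)
}

set.seed(1)
n  <- 100
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.001)   # nearly collinear with x1 (made-up data)
x3 <- rnorm(n)
X  <- cbind(intercept = 1, x1 = x1, x2 = x2, x3 = x3)

res <- belsley_diag(X)
round(res$cond_index, 1)
round(res$var_prop, 3)
```

A commonly cited rule of thumb is to look for a condition index above roughly 30 whose row carries large variance proportions (say above 0.5) for two or more coefficients; in the simulated data that row should single out x1 and x2.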
Re: [R] suggestion for paired t-tests
jacktanner wrote:
> There's a funny inconsistency in how t.test handles paired=T or paired=F. If the x and y parameters are lists, paired=F works, but paired=T doesn't.
>
> lg = read.csv("my.csv")
> a = subset(lg, condition=="a")["score"]
> b = subset(lg, condition=="b")["score"]
> t.test(a,b)
> t.test(a,b, paired=TRUE)
> Error in `[.data.frame`(y, yok) : undefined columns selected
>
> But this works:
>
> a=a[,1]
> b=b[,1]
> t.test(a,b, paired=TRUE)
> ...

It's sort of an accident that this works for the unpaired case. You can follow what happens via

debug(stats:::t.test.default)

... there is some code

if (paired)
    xok <- yok <- complete.cases(x, y)
else {
    yok <- !is.na(y)
    xok <- !is.na(x)
}

If paired is FALSE, !is.na(y) and !is.na(x) happen to convert x and y into matrices, whence they can be used for the rest of the computations. If paired is TRUE, x and y remain data frames.

Bottom line: a data frame with a single column in it really isn't the same as a vector ...
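The bottom line above can be seen directly by comparing single-bracket and double-bracket extraction. A small sketch with made-up scores (the data frame `lg` here is invented for illustration and stands in for the poster's CSV file):

```r
# Made-up data standing in for my.csv: five paired scores per condition.
lg <- data.frame(
  condition = rep(c("a", "b"), each = 5),
  score     = c(5.1, 4.8, 6.0, 5.5, 5.9, 5.6, 5.0, 6.4, 5.4, 6.3)
)

a_df <- subset(lg, condition == "a")["score"]    # one-column data frame
a    <- subset(lg, condition == "a")[["score"]]  # plain numeric vector
b    <- subset(lg, condition == "b")[["score"]]

class(a_df)   # "data.frame"
class(a)      # "numeric"

# With plain vectors the paired test runs without special handling.
res <- t.test(a, b, paired = TRUE)
res$parameter   # degrees of freedom: number of pairs minus one
```

Using `[[` (or `$score`) at the point of extraction avoids relying on how t.test happens to treat data frames.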
[R] moving text labels in plot
Hi R users,

I need to specify some parameter in my plot code to move the Y axis label to the left.

plot(temp, develo_rate, xlab = expression(paste("Temperature (", C^o, ")")), ylab = expression(paste("Development rate (", d^-1, ")")), las=1, pch=19, xlim=c(0,32), ylim=c(0,0.03), xaxs = "i", yaxs = "i")

The plot result is attached. Any help?

Ivan

attachment: scater_temp_desen_rate.png
[R] Non-Linear Regression with two Predictors
Hello there,

I am using nls for the first time, for a non-linear regression with a logistic growth function:

startparam <- c(alpha=3e+07, beta=4000, gamma=2)
fit <- nls(dataset$V2 ~ (alpha / (1 + exp(beta - gamma * dataset$V1))), data=dataset, start=startparam)

Everything works fine and I get good results. Now I would like to improve the results using my dummy variable (dataset$V6), which is 0 for half of the time and then 1. This is my new nls call:

startparam <- c(alpha=3e+07, beta=4000, gamma=2, delta=100)
fit <- nls(dataset$V2 ~ ((alpha / (1 + exp(beta - gamma * dataset$V1))) + (dataset$V6 * dataset$V1 * delta)), data=dataset, start=startparam)

I get a "singular gradient matrix" error. Can anyone give me the right nls call for this problem?

Regards
[R] problems hist() and density
Hello,

I have a problem with the hist() function and showing densities. The densities sum to 50 and not to 1! I use R version 2.9.1 (2009-06-26) and I load the seqinR library. My data is the following vector:

 [1] 0.140 0.200 0.220 0.2828283 0.160 0.160 0.360
 [8] 0.160 0.220 0.260 0.200 0.300 0.220 0.2342342
[15] 0.180 0.220 0.160 0.230 0.200 0.220 0.240
[22] 0.200 0.220 0.220 0.260 0.200 0.160 0.220
[29] 0.2342342 0.200 0.220 0.200 0.200 0.140 0.180
[36] 0.220 0.160 0.160 0.140 0.220 0.200 0.2871287
[43] 0.290 0.200 0.1836735 0.200 0.200 0.290 0.240
[50] 0.220 0.280 0.200 0.2745098 0.220 0.230 0.180
[57] 0.230 0.180 0.260 0.220 0.222 0.220 0.260
[64] 0.220 0.220 0.260 0.220 0.200 0.220

I use the following command:

tmp <- hist(data, freq=FALSE, plot=FALSE)

and that's the result:

$breaks
 [1] 0.14 0.16 0.18 0.20 0.22 0.24 0.26 0.28 0.30 0.32 0.34 0.36

$counts
 [1] 10 4 15 19 8 5 2 5 0 0 1

$intensities
 [1] 7.2463754 2.8985507 10.8695652 13.7681159 5.7971014 3.6231884
 [7] 1.4492754 3.6231884 0.000 0.000 0.7246377

$density
 [1] 7.2463754 2.8985507 10.8695652 13.7681159 5.7971014 3.6231884
 [7] 1.4492754 3.6231884 0.000 0.000 0.7246377

$mids
 [1] 0.15 0.17 0.19 0.21 0.23 0.25 0.27 0.29 0.31 0.33 0.35

$xname
[1] "data"

$equidist
[1] TRUE

attr(,"class")
[1] "histogram"
[R] smoothScatter problems
Hello,

I'm having some trouble getting a good result for a smoothScatter plot. I have some data that I want to plot on a log scale, but when I use smoothScatter the result is not correct. The problem seems to be that with the log="x" argument smoothScatter calculates the bins linearly, so the plot is skewed towards the right. See for example:

file("http://dckd.nl/~jeroen/drop/example.rdata")
smoothScatter(d, log="x")
smoothScatter(log(d$x), d$y)

I could also use the latter way to produce the result, but I would like to use the original units on the x-axis, not the log units.

I also want to use these results in a publication, but the PDFs saved from these results are not very nice. With Acrobat there are lots of blocks, and with other viewers there are many white lines through the blue colours.

Thanks,
Jeroen.
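One way to keep the binning on the log scale while still showing original units is to plot the log-transformed values and draw the x-axis by hand. The sketch below is a suggestion under stated assumptions, not a reply from the thread: the data are simulated because the posted example file is not reproduced here, and the fallback to plot() only exists so the code still runs where the KernSmooth package (which smoothScatter relies on) is not installed.

```r
set.seed(42)
# Simulated, strongly right-skewed x values (made up for illustration).
d   <- data.frame(x = rlnorm(2000, meanlog = 3, sdlog = 1.5))
d$y <- log10(d$x) + rnorm(2000, sd = 0.5)

lx    <- log10(d$x)
# Tick positions at whole powers of ten covering the data range.
ticks <- 10^seq(floor(min(lx)), ceiling(max(lx)))

if (requireNamespace("KernSmooth", quietly = TRUE)) {
  # Bins are computed on log10(x); the default x-axis is suppressed.
  smoothScatter(lx, d$y, xaxt = "n", xlab = "x (original units)", ylab = "y")
} else {
  plot(lx, d$y, xaxt = "n", pch = ".", xlab = "x (original units)", ylab = "y")
}
# Label the log10 positions with the untransformed values.
axis(1, at = log10(ticks), labels = ticks)
```

The density estimate is then computed on the transformed scale, which is what the second call in the post already does; only the axis labelling changes.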
Re: [R] problems hist() and density
It is the 'area' under the curve that sums to zero. Look at what the difference is between the 'breaks' (0.02). Multiply this by 50 and you get 1.

On Sun, Jul 26, 2009 at 7:43 AM, Jan Teichmann jan.teichm...@googlemail.com wrote:
> Hello, I have a problem with the hist() function and showing densities. The densities sum to 50 and not to 1!
> [...]

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
Re: [R] problems hist() and density
"sums to one", I should have said.

On Sun, Jul 26, 2009 at 7:43 AM, Jan Teichmann jan.teichm...@googlemail.com wrote:
> Hello, I have a problem with the hist() function and showing densities. The densities sum to 50 and not to 1!
> [...]

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
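The arithmetic behind this answer can be checked from the breaks and counts posted in the question: each density value is count / (n * bin width), so the bar heights sum to 1 / 0.02 = 50 while the bar areas sum to one. A short base-R sketch (the second part uses simulated data purely as an extra check):

```r
# Breaks and counts copied from the posted hist() output.
breaks  <- seq(0.14, 0.36, by = 0.02)
counts  <- c(10, 4, 15, 19, 8, 5, 2, 5, 0, 0, 1)
width   <- diff(breaks)

# Density as hist() defines it: relative frequency divided by bin width.
density <- counts / (sum(counts) * width)

sum(density)           # heights: about 50, i.e. 1 / 0.02
sum(density * width)   # areas: 1

# The same identity holds for any histogram object, e.g. on simulated data.
set.seed(1)
h    <- hist(rnorm(200), plot = FALSE)
area <- sum(h$density * diff(h$breaks))
```

If relative frequencies that sum to one are what is wanted, `counts / sum(counts)` gives them directly.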
Re: [R] moving text labels in plot
Try this:

par(mar=c(4,6,2,1))
plot(0, xlab = expression(paste("Temperature (", C^o, ")")), ylab = "", las=1, pch=19, xlim=c(0,32), ylim=c(0,0.03), xaxs = "i", yaxs = "i")
mtext(expression(paste("Development rate (", d^-1, ")")), 2, line=4)

2009/7/26 Luis Iván Ortiz Valencia liov2...@gmail.com:
> Hi R users, I need to specify some parameter in my plot code to move the Y axis label to the left.
> [...]
> The plot result is attached. Any help? Ivan

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
Re: [R] Determine the dimension-names of an element in an array in R
Sauvik De wrote:
> Hi: Lots of thanks for your valuable time! But I am not sure how you would like to use the function in this situation. As I had mentioned, the first element of my output array should be like:
>
> cor(DataArray_1[dimnames(Correl)[[1]][1], dimnames(Correl)[[2]][1], dimnames(Correl)[[4]][1], ], DataArray_2[dimnames(Correl)[[1]][1], dimnames(Correl)[[3]][1], dimnames(Correl)[[4]][1], ], use="pairwise.complete.obs")
>
> in my code below, and the output array of correlations is what I wish to get using sapply as follows:
>
> Correl = sapply(Correl, function(d) cor(DataArray_1[...], DataArray_2[...], use="pairwise.complete.obs"))
>
> So it would be of great help if you could kindly specify how to utilise your function findIndex in "...". Apologies for all this!
> Thanks & Regards, Sauvik

Hey, sorry, I didn't understand your problem last time, but I hope this solution solves it. :-) It's only a for loop, but an apply function may work too. I will think about this, but for now... ;-)

la <- length(a)
lb <- length(b)
lc <- length(c)
ld <- length(d)
for (ia in 1:la) {
  for (ib in 1:lb) {
    for (ic in 1:lc) {
      for (id in 1:ld) {
        Correl[ia,ib,ic,id] <- cor(
          DataArray_1[dimnames(Correl)[[1]][ia], dimnames(Correl)[[2]][ib], dimnames(Correl)[[4]][id], ],
          DataArray_2[dimnames(Correl)[[1]][ia], dimnames(Correl)[[3]][ic], dimnames(Correl)[[4]][id], ],
          use="pairwise.complete.obs")
      }
    }
  }
}
## with function findIndex you can find the dimensions with
## e.g. cor values greater than 0.5 or smaller than -0.5, like:
findIndex(Correl, Correl[Correl > 0.5])
findIndex(Correl, Correl[Correl < (-0.5)])

I have changed the code of the function findIndex in the line that contains:

el[j] <- which(is.element(data, element[j]))

Regards,
Christian

> On Sun, Jul 26, 2009 at 3:54 PM, Poersching poerschin...@web.de wrote:
>> Sauvik De wrote:
>>> Hi Gabor: Many thanks for your prompt reply! The code is fine. But I need it in a more general form, as I had mentioned that I need to input any 0 to find its dimension-names. Actually, I was using sapply to calculate correlations, and this idea was required in the middle of the correlation calculation. I am providing the way I tried my calculation.
>>>
>>> a = c("A1","A2","A3","A4","A5")
>>> b = c("B1","B2","B3")
>>> c = c("C1","C2","C3","C4")
>>> d = c("D1","D2")
>>> e = c("E1","E2","E3","E4","E5","E6","E7","E8")
>>> DataArray_1 = array(c(rnorm(240)), dim=c(length(a),length(b),length(d),length(e)), dimnames=list(a,b,d,e))
>>> DataArray_2 = array(c(rnorm(320)), dim=c(length(a),length(c),length(d),length(e)), dimnames=list(a,c,d,e))
>>> # Defining an empty array which will contain the correlation values (output array)
>>> Correl = array(NA, dim=c(length(a),length(b),length(c),length(d)), dimnames=list(a,b,c,d))
>>> # Calculating correlation between attributes b & c over values of e
>>> Correl = sapply(Correl, function(d) cor(DataArray_1[...], DataArray_2[...], use="pairwise.complete.obs"))
>>>
>>> This is where I get stuck. In the above, d is acting as an element in the Correl array. Hence I need to get the dimension-names for d.
>>>
>>> # The first element of Correl will be:
>>> cor(DataArray_1[dimnames(Correl)[[1]][1], dimnames(Correl)[[2]][1], dimnames(Correl)[[4]][1], ], DataArray_2[dimnames(Correl)[[1]][1], dimnames(Correl)[[3]][1], dimnames(Correl)[[4]][1], ], use="pairwise.complete.obs")
>>>
>>> So my problem boils down to extracting the dim-names in terms of the element (d) and not in terms of Correl (that is what I have written as "..." in the above code). My sincere thanks for your valuable time & suggestions.
>>> Many Thanks & Kind Regards, Sauvik
>>>
>>> On Sun, Jul 26, 2009 at 5:26 AM, Gabor Grothendieck ggrothendi...@gmail.com wrote:
>>>> Try this:
>>>>
>>>> ix <- c(1, 3, 4, 2)
>>>> mapply("[", dimnames(mydatastructure), ix)
>>>> [1] "S1" "T3" "U4" "V2"
>>>>
>>>> On Sat, Jul 25, 2009 at 5:12 PM, Sauvik De sauvik.s...@gmail.com wrote:
>>>>> Hi: How can I extract the dimension-names of a pre-defined element in a multidimensional array in R? A toy example is provided below. I have a 4-dimensional array with each dimension having a certain length. In the example below, mydatastructure explains the structure of my data.
>>>>>
>>>>> mydatastructure = array(0, dim=c(length(b),length(z),length(x),length(d)), dimnames=list(b,z,x,d))
>>>>>
>>>>> where
>>>>>
>>>>> b = c("S1","S2","S3","S4","S5")
>>>>> z = c("T1","T2","T3")
>>>>> x = c("U1","U2","U3","U4")
>>>>> d = c("V1","V2")
>>>>>
>>>>> Clearly, mydatastructure contains many 0's. Now how can I get the dimension-names of any particular 0? That is, my input should be a particular 0 in the array mydatastructure (suppose this 0 corresponds to S1, T3, U4 & V2 in the array). Then my output should be S1, T3, U4 & V2. The function dimnames didn't help me with the solution. Any idea will be greatly appreciated. Thanks for
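An element's value alone (a 0, or an NA) cannot identify its position, but its linear position can. One way to go from a linear position to the dimension names is base R's arrayInd() combined with the mapply idea quoted above. This is a sketch for illustration only: `names_at` is a made-up helper name, and it is not the findIndex function discussed in the thread.

```r
b <- c("S1", "S2", "S3", "S4", "S5")
z <- c("T1", "T2", "T3")
x <- c("U1", "U2", "U3", "U4")
d <- c("V1", "V2")
arr <- array(0, dim = c(length(b), length(z), length(x), length(d)),
             dimnames = list(b, z, x, d))

# Hypothetical helper: dimension names of the element at linear position k.
names_at <- function(arr, k) {
  idx <- arrayInd(k, dim(arr))[1, ]              # subscripts along each dimension
  mapply(function(nm, i) nm[i], dimnames(arr), idx)
}

# Linear position of arr["S1", "T3", "U4", "V2"]:
# 1 + (3 - 1) * 5 + (4 - 1) * 15 + (2 - 1) * 60 = 116
k <- 116
names_at(arr, k)   # "S1" "T3" "U4" "V2"

# Looping over positions rather than values keeps this information available:
all_names <- t(sapply(seq_along(arr), function(k) names_at(arr, k)))
```

Inside such a loop over `seq_along(arr)`, the names returned for position k could be used to index the two data arrays, and the results reshaped with `array(values, dim(arr), dimnames(arr))`.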
[R] obtain names of variables and data from glm object
Suppose we have some glm object such as:

myglm <- glm( y ~ x, data=DAT)

Is there an elegant way (or the right way within the R way of thinking) to obtain the names of the response variable, the predictor variables, and the dataset, as character strings?

For instance, suppose the right way was to use the (currently fictitious) functions theresponse(), thepredictors(), and theDataSet(). Then I would be able to write a function that obtains the names and subsequently pastes text along the following lines:

theResponse <- theresponse( myglm )
theFirstPredictor <- thepredictors( myglm )[1]
theDataSet <- theDataSet(myglm)
title(main= paste(theResponse, "is the response and", theFirstPredictor, "is the only predictor"))

In reality, I can of course extract

formula(myglm)
y ~ x

but I see no elegant way to extract the names of the predictor and response from this object. The deparse() function doesn't quite solve this problem:

deparse(formula(myglm))
[1] "y ~ x"
deparse(formula(myglm)[2])
[1] "y()"
deparse(formula(myglm)[3])
[1] "x()"

Ideally the elegant method would, in this example, return the character strings "x", "y", and "DAT".

Thanks for any insights.

Jake

Jacob A. Wegelin
Assistant Professor
Department of Biostatistics
Virginia Commonwealth University
730 East Broad Street Room 3006
P. O. Box 980032
Richmond VA 23298-0032 U.S.A.
E-mail: jwege...@vcu.edu
URL: http://www.people.vcu.edu/~jwegelin
Re: [R] obtain names of variables and data from glm object
Try this:

g <- glm(demand ~ Time, BOD, family = gaussian)
all.vars(formula(g))

The result will be a character vector whose 1st component is the name of the response and whose subsequent components are the names of the predictor variables.

On Sun, Jul 26, 2009 at 3:14 PM, Jacob Wegelin jacob.wege...@gmail.com wrote:
> Suppose we have some glm object such as:
> myglm <- glm( y ~ x, data=DAT)
> Is there an elegant way (or the right way within the R way of thinking) to obtain the names of the response variable, the predictor variables, and the dataset, as character strings?
> [...]
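Building on the all.vars() suggestion, the sketch below also recovers the data set name from the stored call and shows why the single-bracket deparse() attempts in the question returned "y()". It uses the built-in BOD data; the variable names are made up for illustration.

```r
g <- glm(demand ~ Time, data = BOD, family = gaussian)

vars       <- all.vars(formula(g))
response   <- vars[1]     # name of the response
predictors <- vars[-1]    # names of the predictor variables

# The unevaluated data argument is kept in the call; deparse() gives its name.
data_name <- deparse(g$call$data)

# [[ ]] extracts the bare symbol, whereas [ ] keeps a call wrapper
# (which is what produced "y()" in the question).
response2 <- deparse(formula(g)[[2]])

main_title <- paste(response, "is the response and",
                    predictors[1], "is the only predictor")
```

Note that all.vars() returns variable names, so for a formula such as log(y) ~ x it gives "y" rather than "log(y)"; `deparse(formula(g)[[2]])` keeps the full expression in that case.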
[R] splitting multiple data in one column into multiple rows with one entry per column
Dear R colleagues,

I annotated a list of single nucleotide polymorphisms (SNPs) with the corresponding genes using biomaRt. The result is the following data.frame (pasted from R):

   snp         ensembl_gene_id
1  rs8032583
2  rs1071600   ENSG0101605
3  rs13406898  ENSG0167165
4  rs7030479   ENSG0107249
5  rs1244414   ENSG0165629
6  rs1005636   ENSG0230681
7  rs927913    ENSG0151655;ENSG0227546
8  rs4832680
9  rs4435168   ENSG0229164;ENSG0225227;ENSG0211817
10 rs7035549
11 rs12707538  ENSG0186472

As you can see, the SNP with the identifier rs4435168 corresponds to 3 gene ids, and rs927913 corresponds to 2 gene ids. As I'd like to perform a join of several data.frames using the ensembl_gene_id later on, I'd like to split rows with multiple gene identifiers into rows with only one ensembl gene identifier each. So for the example of rs4435168 it should look like this (faked output):

   snp         ensembl_gene_id
...
9  rs4435168   ENSG0229164
10 rs4435168   ENSG0225227
11 rs4435168   ENSG0211817
...

This is just a simple example. In the end there will be a lot of other columns, which should be replicated like the snp column. Does anyone know how to do this? I tried strsplit, which nicely splits the multiple entries in the column ensembl_gene_id. But how do I go on? I'd appreciate any kind of help very much!

Best regards from Munich,
Felix
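One way to continue from strsplit, sketched here in base R as an illustration rather than as an answer taken from the list: count how many identifiers each row holds, repeat each row that many times, and then overwrite the identifier column with the flattened pieces. The small data frame is retyped from three rows of the post, and rows without a gene id are kept with NA (an assumption about the desired behaviour).

```r
snps <- data.frame(
  snp = c("rs8032583", "rs927913", "rs4435168"),
  ensembl_gene_id = c("",
                      "ENSG0151655;ENSG0227546",
                      "ENSG0229164;ENSG0225227;ENSG0211817"),
  stringsAsFactors = FALSE
)

# Split on ";"; empty entries yield zero pieces, so give them a single NA.
ids <- strsplit(snps$ensembl_gene_id, ";", fixed = TRUE)
ids <- lapply(ids, function(v) if (length(v) == 0) NA_character_ else v)

# Repeat every row once per identifier; all other columns are replicated too.
n_ids <- vapply(ids, length, integer(1))
long  <- snps[rep(seq_len(nrow(snps)), times = n_ids), , drop = FALSE]
long$ensembl_gene_id <- unlist(ids)
rownames(long) <- NULL
long
```

Because the rows are replicated by index, any additional columns in the real data frame are carried along without further code.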
[R] mixdist package in R
Hi, All,

I fitted a 3-component normal mixture model with the mixdist package in R. How can I get the density of new data after I fit the model? Is there any function to do it?

Thanks,
Cindy
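Whatever package produced the fit, the density of a normal mixture at a new point is the proportion-weighted sum of the component normal densities, so it can be evaluated in base R once the fitted proportions, means and standard deviations are in hand. The sketch below is an assumption-laden illustration: `mix_density` is a made-up helper, the parameter values are invented, and where mixdist stores the estimates in the fitted object should be checked against its documentation.

```r
# Hypothetical helper: density of a normal mixture at the points in x.
mix_density <- function(x, prop, mu, sigma) {
  stopifnot(length(prop) == length(mu), length(mu) == length(sigma))
  # One column per component: prop[k] * N(x; mu[k], sigma[k]).
  comp <- vapply(seq_along(prop),
                 function(k) prop[k] * dnorm(x, mean = mu[k], sd = sigma[k]),
                 numeric(length(x)))
  rowSums(matrix(comp, nrow = length(x)))
}

# Invented estimates standing in for a fitted 3-component model.
prop  <- c(0.3, 0.5, 0.2)
mu    <- c(0, 4, 9)
sigma <- c(1, 1.5, 2)

new_x   <- c(-1, 2.5, 10)
dens    <- mix_density(new_x, prop, mu, sigma)

# Sanity check: the mixture density should integrate to (about) one.
total <- integrate(function(x) mix_density(x, prop, mu, sigma), -20, 30)$value
```

The same function works for any number of components, as long as the proportions sum to one.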
Re: [R] moving text labels in plot
Luis Iván Ortiz Valencia wrote:
> Hi R users, I need to specify some parameter in my plot code to move the Y axis label to the left.
>
> plot(temp, develo_rate, xlab = expression(paste("Temperature (", C^o, ")")), ylab = expression(paste("Development rate (", d^-1, ")")), las=1, pch=19, xlim=c(0,32), ylim=c(0,0.03), xaxs = "i", yaxs = "i")
>
> The plot result is attached. Any help? Ivan

From the looks of your plot, the label is already as far to the left as it can go. The problem is that the left margin of the plot is not big enough to allow good separation between the axis label and the axis numbering. You can adjust the plot margins by calling the par() function before creating your plot. In this case you want to set the mar option and pass a vector of four numbers:

par( mar = c(bottom, left, top, right) )

Note that the defaults for mar are:

c( 5.1, 4.1, 4.1, 2.1 )

Hope that helps!

-Charlie

Charlie Sharpsteen
Undergraduate
Environmental Resources Engineering
Humboldt State University
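The two suggestions in this thread (a wider left margin via par(mar = ...) and placing the label with mtext()) combine naturally. The sketch below uses invented data in place of the poster's temp and develo_rate, and the margin and line values are only starting points to adjust by eye.

```r
# Invented data standing in for the poster's variables.
temp        <- seq(2, 30, by = 4)
develo_rate <- c(0.002, 0.005, 0.009, 0.013, 0.018, 0.022, 0.025, 0.027)

# Widen only the left margin (default is 4.1 lines); keep the old settings.
old <- par(mar = c(5.1, 6.1, 4.1, 2.1))

plot(temp, develo_rate,
     xlab = expression(paste("Temperature (", C^o, ")")),
     ylab = "",                      # suppress the default y label
     las = 1, pch = 19,
     xlim = c(0, 32), ylim = c(0, 0.03), xaxs = "i", yaxs = "i")

# Draw the y label further out, in the space freed by the wider margin.
mtext(expression(paste("Development rate (", d^-1, ")")), side = 2, line = 4.5)

par(old)   # restore the previous margins
```

The `line` argument of mtext() counts margin lines outward from the axis, so it must stay below the left margin width set in mar.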
Re: [R] labelling points plotted in a 2D plan
Hello Tal! Nothing showed up when I used those commands! The plot still shows dots with no labels! Thanks On Sun, Jul 26, 2009 at 7:28 AM, Tal Galili tal.gal...@gmail.com wrote: Hi Khaled, Did my answer help ? On Sun, Jul 26, 2009 at 12:15 AM, Khaled OUANES koua...@gmail.com wrote: Thanks for the answer Tal! But I can't get it to work correctly! :( Please bear with me this is the first time I am using R! and I am in a rush to correct a paper in fact on the plane I am plotting a table fullpointed=read.table(fullpoints_backup.txt,h=F) plot(range(-2.5,0.95),range(0.00,1.00),type=n,axes=TRUE) and in this table there are 300 points I want to label the first 175 points with A and the others with S I couldn't figure how to configure correctly labels.to.plot - sample(c(A,B), 100, replace = T) and text(x, y , labels = labels.to.plot) ? for instance: 0,48875 0,142857143 the point plotted will be labelled a 0,409 0,142857143 the point plotted will be labelled a 0,45611 0,25 labelled a 0,49833 0,2 labelled a #the first 175 0,61158 0,125labelled S 0,5709 0,125labelled S 0,53266 0,125labelled S # the remaining Regards On Sat, Jul 25, 2009 at 5:32 PM, Tal Galili tal.gal...@gmail.com wrote: Sure, Here is an example: # get some random data to play with x - runif(100) y - runif(100) labels.to.plot - sample(c(A,B), 100, replace = T) # set up the window, play them one by one to see what they do plot.window(ylim = c(0,1), xlim = c(0,1)) plot.new() axis(1) axis(2) box() # plot the things you wished to plot, where you wanted them plotted text(x, y , labels = labels.to.plot) Cheers, Tal On Sat, Jul 25, 2009 at 7:20 PM, Khaled OUANES koua...@gmail.com wrote: hey thanks for the answer but I couldn't achieve it? would you explain a bit more? I have like 300 points to label! 
thanks -- -- My contact information: Tal Galili Phone number: 972-50-3373767 FaceBook: Tal Galili My Blogs: http://www.r-statistics.com/ http://www.talgalili.com http://www.biostatistics.co.il
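A possible sketch for the labelling asked about in this thread, assuming the two coordinate columns end up as numeric columns V1 and V2. The data frame below is random and only stands in for fullpoints_backup.txt. Since the sample rows in the post use decimal commas, reading the real file may also need read.table("fullpoints_backup.txt", header = FALSE, dec = ","); that is a guess based on the excerpt, not something confirmed in the thread.

```r
# Hypothetical stand-in for fullpoints_backup.txt: 300 rows, two numeric columns
set.seed(1)
fullpointed <- data.frame(V1 = runif(300, -2.5, 0.95),
                          V2 = runif(300, 0, 1))

# First 175 points are labelled "A", the remaining ones "S"
n_first <- 175
labels.to.plot <- rep(c("A", "S"),
                      times = c(n_first, nrow(fullpointed) - n_first))

# Assumed output location, so the sketch also runs non-interactively
out_pdf <- file.path(tempdir(), "labelled_points.pdf")
pdf(out_pdf)

# Empty plot with the axis ranges from the original post
plot(range(-2.5, 0.95), range(0, 1), type = "n", xlab = "V1", ylab = "V2")

# Draw the labels at the point coordinates instead of dots
text(fullpointed$V1, fullpointed$V2, labels = labels.to.plot, cex = 0.7)

dev.off()
```

rep() with a times vector builds the label vector in file order, so no sampling is involved and each row keeps a fixed label.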
[R] Question about rpart decision trees (being used to predict customer churn)
Hi,

I am using rpart decision trees to analyze customer churn. I am finding that the decision trees created are not effective because they do not recognize factors that influence churn. I have created an example situation below. What do I need to do for rpart to build a tree with the variable experience? My guess is that this would happen if rpart used the loss matrix while creating the tree.

    experience <- as.factor(c(rep("good", 90), rep("bad", 10)))
    cancel <- as.factor(c(rep("no", 85), rep("yes", 5), rep("no", 5), rep("yes", 5)))
    table(experience, cancel)

              cancel
    experience no yes
          bad   5   5
          good 85   5

    rpart(cancel ~ experience)

    n= 100

    node), split, n, loss, yval, (yprob)
          * denotes terminal node

    1) root 100 10 no (0.900 0.100) *

I tried the following commands with no success.

    rpart(cancel ~ experience, control=rpart.control(cp=.0001))
    rpart(cancel ~ experience, parms=list(split='information'))
    rpart(cancel ~ experience, parms=list(split='information'), control=rpart.control(cp=.0001))
    rpart(cancel ~ experience, parms=list(loss=matrix(c(0,1,1,0), nrow=2, ncol=2)))

Thanks a lot for your help.

Best regards, Robert
Re: [R] running a .r script and saving the output to a file
Hello,

I am running R under Ubuntu 8.04. I am trying to do numerous linear fits to various subsets of my data set. I am having trouble convincing R to send the output from these fits to text files from within a script. When I run my script, all the plots are created as postscript files in the correct directory, and text files appear with the correct names for all the summaries of the fits. However, all of these files are blank! If I copy/paste the sink commands from my script manually into the command line, the files are created correctly, but if the same commands are executed from within my script, no output is generated.

The output section of my script, which correctly creates the plots I want, but not the text, is:

    x_seq1_fit=lm(xagb$X5mag[x_seq1_filt]~x1per)
    x_seq2_fit=lm(xagb$X5mag[x_seq2_filt]~x2per)
    setwd("/home/driebel/sage/output/pl_fits/R/5mag/chem")
    postscript(file="x_seq1_fit.ps")
    plot(x1per,xagb$X5mag[x_seq1_filt],ylim=c(13,7))
    abline(x_seq1_fit,col="red")
    dev.off()
    postscript(file="x_seq1_resid.ps")
    plot(x1per,x_seq1_fit$res)
    abline(h=0,col="red")
    dev.off()
    postscript(file="x_seq2_fit.ps")
    plot(x2per,xagb$X5mag[x_seq2_filt],ylim=c(13,7))
    abline(x_seq2_fit,col="red")
    dev.off()
    postscript(file="x_seq2_resid.ps")
    plot(x2per,x_seq2_fit$res)
    abline(h=0,col="red")
    dev.off()
    sink(file="x_seq2_fit.dat")
    summary(x_seq2_fit)
    sink()
    sink(file="x_seq1_fit.dat")
    summary(x_seq1_fit)
    sink()

Is there something else one must do with sink from within a script?

Thanks for your help, Dave
[R] ROC curve using epicalc (after logistic regression)
Dear R-help list,

I'm attempting to use the ROC routine from the epicalc package after performing a logistic regression analysis. My code is included after the sessionInfo() result. The datafile (GasketMelt1.csv) is attached. I updated both R and the epicalc package and tried again before sending this request.

sessionInfo result:

    R version 2.9.1 (2009-06-26)
    i386-pc-mingw32

    locale:
    LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

    attached base packages:
    [1] splines stats graphics grDevices utils datasets methods
    [8] base

    other attached packages:
    [1] caret_4.19 lattice_0.17-25 epicalc_2.9.1.2 survival_2.35-4
    [5] foreign_0.8-36

    loaded via a namespace (and not attached):
    [1] grid_2.9.1 tools_2.9.1

Header information from package 'epicalc':

    Package: epicalc
    Version: 2.9.1.2
    Date: 2009-07-14

My code ...

    #
    # Logistic Regression (the model result is as expected)
    #
    dfile = 'GasketMelt1.csv'
    gmelt.df = read.csv(dfile, header = TRUE, as.is = TRUE)
    names(gmelt.df)
    gmelt.df$p = gmelt.df$Pass / gmelt.df$Total
    gmelt.glm = glm(p ~ Time + Temperature + Depth + Time*Temperature +
                        Time*Depth + Temperature*Depth,
                    family = binomial(link = "logit"), data=gmelt.df, weight=Total)
    summary(gmelt.glm)
    #
    # ROC
    #
    library(epicalc)
    lroc(gmelt.glm, graph = TRUE, line.col = "red")

The error message:

    lroc(gmelt.glm, graph = TRUE, line.col = "red")
    Error in dimnames(x) <- dn :
      length of 'dimnames' [2] not equal to array extent

Have I overlooked something? Many thanks to anyone who might have a suggestion.

Cliff
Re: [R] splitting multiple data in one column into multiple rows with one entry per column
Try this:

    x <- read.table(textConnection("snp ensembl_gene_id
    rs8032583
    rs1071600 ENSG0101605
    rs13406898 ENSG0167165
    rs7030479 ENSG0107249
    rs1244414 ENSG0165629
    rs1005636 ENSG0230681
    rs927913 ENSG0151655;ENSG0227546
    rs4832680
    rs4435168 ENSG0229164;ENSG0225227;ENSG0211817
    rs7035549
    rs12707538 ENSG0186472"), header=TRUE, fill=TRUE)
    closeAllConnections()
    x.new <- do.call(rbind, apply(x, 1, function(.row){
        .ids <- unlist(strsplit(.row[2], ';'))
        # check for no data in second column; substitute a blank
        if (length(.ids) == 0) return(cbind(.row[1], ""))
        else return(cbind(.row[1], .ids))
    }))
    x.new

                                .ids
    snp              rs8032583
    snp              rs1071600  ENSG0101605
    snp              rs13406898 ENSG0167165
    snp              rs7030479  ENSG0107249
    snp              rs1244414  ENSG0165629
    snp              rs1005636  ENSG0230681
    ensembl_gene_id1 rs927913   ENSG0151655
    ensembl_gene_id2 rs927913   ENSG0227546
    snp              rs4832680
    ensembl_gene_id1 rs4435168  ENSG0229164
    ensembl_gene_id2 rs4435168  ENSG0225227
    ensembl_gene_id3 rs4435168  ENSG0211817
    snp              rs7035549
    snp              rs12707538 ENSG0186472

On Sun, Jul 26, 2009 at 3:26 PM, Felix Müller-Sarnowski drfl...@googlemail.com wrote:

Dear R colleagues,

I annotated a list of single nucleotide polymorphisms (SNP) with the corresponding genes using biomaRt. The result is the following data.frame (pasted from R):

       snp        ensembl_gene_id
    1  rs8032583
    2  rs1071600  ENSG0101605
    3  rs13406898 ENSG0167165
    4  rs7030479  ENSG0107249
    5  rs1244414  ENSG0165629
    6  rs1005636  ENSG0230681
    7  rs927913   ENSG0151655;ENSG0227546
    8  rs4832680
    9  rs4435168  ENSG0229164;ENSG0225227;ENSG0211817
    10 rs7035549
    11 rs12707538 ENSG0186472

As you can see, the SNP with the identifier rs4435168 corresponds to 3 gene ids, and rs927913 corresponds to 2 gene ids. As I'd like to perform a join of several data.frames using the ensembl_gene_id later on, I'd like to split rows with multiple gene identifiers into several rows with only one ensembl gene identifier each. So for the example of rs4435168 it should look like this (faked output):

       snp        ensembl_gene_id
    ...
    9  rs4435168  ENSG0229164
    10 rs4435168  ENSG0225227
    11 rs4435168  ENSG0211817
    ...

This is just a simple example. In the end there will be a lot of other columns, which should be replicated like the snp column. Does anyone know how to do this? I tried strsplit, which nicely splits the multiple entries in column ensembl_gene_id. But how to go on? I'd appreciate any kind of help very much!

Best regards from Munich, Felix

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
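Because the question mentions that many other columns will need to be replicated along with snp, here is a hedged alternative sketch that works on the whole data frame instead of a two-column matrix. The column chr and all identifiers below are invented for illustration. The approach repeats each row index once per gene id and then overwrites the id column with the flattened ids.

```r
# Hypothetical input: one extra column (chr) that must be carried along
x <- data.frame(snp             = c("rs1", "rs2", "rs3"),
                ensembl_gene_id = c("", "G1", "G2;G3"),
                chr             = c(15, 2, 10),
                stringsAsFactors = FALSE)

# Split the multi-valued column; an empty string yields character(0)
ids <- strsplit(x$ensembl_gene_id, ";", fixed = TRUE)

# Keep rows without any gene id by substituting a single blank
ids[lengths(ids) == 0] <- ""

# Repeat every row once per gene id, then fill in one id per row
long <- x[rep(seq_len(nrow(x)), lengths(ids)), , drop = FALSE]
long$ensembl_gene_id <- unlist(ids)
rownames(long) <- NULL

long
```

All remaining columns are copied automatically because whole rows are repeated, so the code does not change when more columns are added.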
Re: [R] running a .r script and saving the output to a file
Hi Dave,

I don't know about using sink(), but if you run your script in batch mode all the output will be saved to a text file for you.

    R CMD BATCH infile outfile

(at the Linux prompt, not from within R).

Sarah

On Sun, Jul 26, 2009 at 4:47 PM, David Riebel drie...@pha.jhu.edu wrote: [...] Is there something else one must do with sink from within a script? Thanks for your help, Dave

-- Sarah Goslee http://www.functionaldiversity.org
[R] normal mixture model
Hi, All,

I want to fit a normal mixture model. Which package in R is best for this? I was using the package 'mixdist', but I need to group the data into groups before fitting the model, and different groupings seem to lead to different results. What other package can I use that is stable? And are there packages that can automatically determine the number of components?

Thank you, Cindy
Re: [R] normal mixture model
You can use mclustBIC in package mclust (uses the BIC for deciding about the number of components and hierarchical clustering for initialisation).

Christian

On Sun, 26 Jul 2009, cindy Guo wrote: Hi, All, I want to fit a normal mixture model. Which package in R is best for this? [...] And are there packages that can automatically determine the number of components? Thank you, Cindy

Christian Hennig, University College London, Department of Statistical Science, Gower St., London WC1E 6BT, phone +44 207 679 1698, chr...@stats.ucl.ac.uk, www.homepages.ucl.ac.uk/~ucakche
Re: [R] running a .r script and saving the output to a file
You have to explicitly 'print' the output. At the command line there is an implicit 'print'. Try:

    sink(file="x_seq2_fit.dat")
    print(summary(x_seq2_fit))
    sink()
    sink(file="x_seq1_fit.dat")
    print(summary(x_seq1_fit))
    sink()

On Sun, Jul 26, 2009 at 4:47 PM, David Riebel drie...@pha.jhu.edu wrote: [...] Is there something else one must do with sink from within a script? Thanks for your help, Dave

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
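A small self-contained sketch of the explicit print() advice above. The built-in cars data set and the file names are stand-ins, not the poster's data. The second half shows capture.output(), which is one alternative way to write printed output to a file without pairing sink() calls.

```r
# Stand-in model: the built-in 'cars' data instead of the poster's xagb data
fit <- lm(dist ~ speed, data = cars)

# Assumed output location
out_file <- file.path(tempdir(), "fit_summary.dat")

sink(out_file)
print(summary(fit))   # explicit print(): results are not auto-printed under source()
sink()

# Alternative: capture.output() sends the printed summary to a file directly
out_file2 <- file.path(tempdir(), "fit_summary2.dat")
capture.output(summary(fit), file = out_file2)
```

Auto-printing only happens for top-level expressions at the console, which is why the same sink() lines behave differently when pasted by hand and when run through source().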
Re: [R] normal mixture model
Hi, Christian,

Thank you for the reply. I just tried it. Does the function mclustBIC only give the best model, or does it also do EM to get the cluster means and variances according to the best model it picks? I didn't find them. Is there a way to automatically select the best number of components and do EM? I need to fit the normal mixture model in a loop (one EM per iteration), so I want it to do everything automatically.

Thanks, Cindy

On Sun, Jul 26, 2009 at 3:46 PM, Christian Hennig chr...@stats.ucl.ac.uk wrote: You can use mclustBIC in package mclust (uses the BIC for deciding about the number of components and hierarchical clustering for initialisation). Christian [...]
[R] Specify CRAN repository from command line
Hi,

It feels like I should be able to do something like:

    R CMD INSTALL lib='/usr/lib64/R/library' repos='http://proxy.url/cran' package

We have a bunch of servers (compute nodes in a Rocks cluster) in an isolated subnet. There is a basic pass-through proxy set up on the firewall (the head node) which just passes HTTP requests through to our nearest CRAN mirror.

When using install.packages it's easy to make R install from the repository with the repos='address' option, but I can't figure out how to do this from the command line.

Is there a command line option for this? Currently I'm doing it using an R script, but that's causing issues because it's not 'visible' to the installer.

This would greatly streamline R installation with a standard package set.

Regards, Aaron Hicks
[R] how to use do.call together with cbind and get inside a function
Dear R-helpers:

I have a question related to using do.call to call cbind and get.

    # the following works
    vec1 <- c(1,2)
    vec2 <- c(3,4)
    ColNameVec <- c('vec1','vec2')
    mat <- do.call(cbind,lapply(ColNameVec,get))
    mat

    # put the code above into a function and it no longer works
    # before doing so, first remove vec1 and vec2 from the global environment
    rm(vec1,vec2)
    test <- function() {
        vec1 <- c(1,2)
        vec2 <- c(3,4)
        ColNameVec <- c('vec1','vec2')
        mat <- do.call(cbind,lapply(ColNameVec,get))
        return(mat)
    }
    test()

In my task, I have to run do.call(cbind,lapply(ColNameVec,get)) inside a function. Can someone kindly help?

Many thanks in advance! -Sean
Re: [R] Specify CRAN repository from command line
On 27 July 2009 at 14:55, Aaron Hicks wrote:
| It feels like I should be able to do something like:
|
| R CMD INSTALL lib='/usr/lib64/R/library' repos='http://proxy.url/cran' package

Here's what I do using littler; you can substitute Rscript as well:

    #!/usr/bin/env r
    #
    # a simple example to install one or more packages

    if (is.null(argv) | length(argv) < 1) {
        cat("Usage: installr.r pkg1 [pkg2 pkg3 ...]\n")
        q()
    }

    ## adjust as necessary, see help('download.packages')
    repos <- "http://cran.us.r-project.org"

    ## this makes sense on Debian where no packages touch /usr/local
    lib.loc <- "/usr/local/lib/R/site-library"

    install.packages(argv, lib.loc, repos, dependencies=TRUE)

That way, I just say 'install.r foo bar baz' and these packages, plus their Depends:, will get installed.

[ That said, I actually don't do that anymore because we now have cran2deb, so I can just say 'apt-get install r-cran-foo r-cran-bar r-cran-baz', but that is a special case for Debian and pretty recent as per the announcement a few days ago. ]

| We have a bunch of servers (compute nodes in a Rocks cluster) in an isolated subnet, there is a basic pass-through proxy set up on the firewall (the head node) which just passes HTTP requests through to our nearest CRAN mirror.
|
| when using install.packages it's easy to make R install from the repository with the repos='address' option, but I can't figure out how do this from the command line.
|
| Is there a command line option for this? Currently I'm doing it using an R script, but that's causing issues because it's not 'visible' to the installer.
|
| This would greatly streamline R installation with a standard package set.

The repos argument can otherwise be set in ~/.Rprofile or Rprofile.site.

Dirk

-- Three out of two people have difficulties with fractions.
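The closing remark about ~/.Rprofile or Rprofile.site could look roughly like the fragment below. The mirror URL is the placeholder from the original question, not a real address, and which startup file it belongs in depends on the local installation.

```r
# Fragment for ~/.Rprofile or Rprofile.site (placement is an assumption)
local({
    r <- getOption("repos")
    r["CRAN"] <- "http://proxy.url/cran"   # placeholder URL from the question
    options(repos = r)
})
```

With the option set, install.packages("somepackage") uses that repository without an explicit repos argument, so every R session or script on the node picks it up.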
[R] Disable summary statistics in LaTeX tables using MEMISC package
Dear All,

The mtable function in the memisc package is very useful for producing publication-quality tables directly from estimated models. There are cases where only the estimated coefficients and standard errors are needed in the table, but not the summary statistics such as N, the value of the likelihood, AIC, etc. The summary.stats=FALSE option is supposed to do this, but there seems to be a problem.

The easiest way to verify this is to run the anes48.R file, which is distributed as part of the package, then change the last line from:

    mtable(model1,model6,model7,summary.stats=c("Deviance","AIC","N"))

to:

    mtable(model1,model6,model7,summary.stats=FALSE)

I get the error message "Error in as.table.default(sumstats) : cannot coerce into a table".

Any ideas? Thanks.

Best, Shige
[R] Superstring in text()
I'd like to paste a superscript together with a number stored in an object. Thanks for any help.

Murray

    mycor <- cor(1:10,1:10)
    plot(1:10,1:10)
    text(8,2,paste(expression(R^2), " = ", mycor))
Re: [R] how to use do.call together with cbind and get inside a function
Use

    lapply(ColNameVec, get, environment())

so that it gets the objects from the current environment. See:

    ?get
    ?environment

On Sun, Jul 26, 2009 at 11:16 PM, Sean Zhang seane...@gmail.com wrote: Dear R-helpers: I have a question related to using do.call to call cbind and get. [...] In my task, I have to run do.call(cbind,lapply(ColNameVec,get)) inside a function, can someone kindly help? Many thanks in advance! -Sean
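Put together, the suggestion above can be sketched as a complete function. The mget() variant is an addition not mentioned in the thread; it looks up several names at once and keeps them as column names.

```r
test <- function() {
    vec1 <- c(1, 2)
    vec2 <- c(3, 4)
    ColNameVec <- c("vec1", "vec2")

    # environment() is evaluated here, so it refers to this function's frame
    mat <- do.call(cbind, lapply(ColNameVec, get, environment()))
    colnames(mat) <- ColNameVec
    mat
}

test2 <- function() {
    vec1 <- c(1, 2)
    vec2 <- c(3, 4)
    ColNameVec <- c("vec1", "vec2")

    # mget() returns a named list, so cbind() picks up the column names
    do.call(cbind, mget(ColNameVec, envir = environment()))
}

m  <- test()
m2 <- test2()
```

The original call fails because get is invoked from inside lapply, so its default lookup starts in a frame that does not see the function's local variables; passing the environment explicitly avoids that.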
[R] Writing to a UDP server from R?
Hello,

I have used socketConnection to connect to a TCP server. I haven't figured out a way to do the same with a UDP server, i.e. I have a server listening on port 9000, communicating via UDP. I would like to send packets to this server from R.

This does not work:

    u <- socketConnection('localhost',9000)

    Error in socketConnection("localhost", 9000, blocking = F) :
      cannot open the connection
    In addition: Warning message:
    In socketConnection("localhost", 9000, blocking = F) :
      localhost:9000 cannot be opened

I have confirmed that something is indeed listening on the other side. Any help would be appreciated.

Regards, Saptarshi
Re: [R] Superstring in text()
    mycor <- cor(1:10,1:10)
    plot(1:10,1:10)
    text(8,2,bquote(R^2 == .(mycor)))

HTH, Andrej

Murray Pung wrote: I'd like to paste a superstring with a number in an object. Thanks for any help. Murray [...]
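A slightly extended sketch of the bquote() answer above. Rounding the value and the output file name are additions for illustration only. bquote() substitutes the current value of whatever is wrapped in .( ) into the plotmath expression, and == is the plotmath operator for an equals sign.

```r
mycor <- cor(1:10, 1:10)

# Substitute the rounded value into the plotmath expression
lab <- bquote(R^2 == .(round(mycor, 3)))

# Assumed output location, so the sketch also runs non-interactively
out_pdf <- file.path(tempdir(), "r2_label.pdf")
pdf(out_pdf)
plot(1:10, 1:10)
text(8, 2, lab)   # drawn as R with a superscript 2, an equals sign and the value
dev.off()
```

paste(expression(R^2), ...) does not work here because paste() turns the expression into a plain character string, so the caret is drawn literally.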
Re: [R] Superstring in text()
Sorry, there should be a caret symbol (^) between R and 2.

Murray Pung wrote: I'd like to paste a superstring with a number in an object. Thanks for any help. Murray [...]
Re: [R] running a .r script and saving the output to a file
Check out: http://akastrin.wordpress.com/category/r/

David Riebel wrote: Hello, I am running R under Ubuntu 8.04. [...] Is there something else one must do with sink from within a script? Thanks for your help, Dave