[R] Vector recycling and zoo
I am a bit confused about the different approaches taken to recycling in plain data frames and zoo objects. When carrying out simple arithmetic, data frames seem to recycle single arguments; zoo objects do not. Here is an example:

  > x <- data.frame(a=1:5*2, b=1:5*3)
  > x
     a  b
  1  2  3
  2  4  6
  3  6  9
  4  8 12
  5 10 15
  > x$a/x$a[1]
  [1] 1 2 3 4 5
  > x <- zoo(x)
  > x$a/x$a[1]
  1
  1

I feel understanding this difference would lead me to a greater understanding of the zoo module!

Sean.

--
Sean Carmody
Twitter: http://twitter.com/seancarmody
Stable: http://mulestable.net/sean
The Stubborn Mule Blog: http://www.stubbornmule.net
Forum: http://mulestable.net/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[R] Location attribute
Hi everybody,

A question: how can I find the location (number) of an attribute from its name? E.g.

  X1 X2 X3 X4 X5 X6
   1  3  5  2  1  7
   6  7  4  5  2  9

How can I find out that the attribute X4 is in position 4?

I hope you can help me. Thank you all very much in advance.

Agustín
[R] p value
How to compute the p-value of a statistic generally?

--
View this message in context: http://r.789695.n4.nabble.com/p-value-tp2217867p2217867.html
Sent from the R help mailing list archive at Nabble.com.
[R] How to manage an error message about NA/NaN/Inf
Dear R Family,

I have an error message and would like to learn how to deal with it. The original series is as follows; I show just the first 10 observations.

  > dif_transaud[1:10]
   [1]  0.0065880493 -0.0065880490 -0.0131743570  0.0197745715  0.0065889175
   [6]  0.0131813110  0.0065923924 -0.0395587070  0.0000156455  0.0197693578

Then I transformed them as follows:

  > dif_transaud_sq <- dif_transaud^2
  > lnabsdif_transaud <- 0.5*log(dif_transaud_sq)
  > lnabsdif_transaud[1:10]
   [1]  -5.022498  -5.022498  -4.329483  -3.923358  -5.022366  -4.328955
   [7]  -5.021839  -3.229969 -11.065327  -3.923622

Finally, I ran the program, which is part of a wavelet transform:

  > mra.out <- mra(lnabsdif_transaud, filter="la8", n.levels=8,
  + boundary="reflection", fast=TRUE, method="modwt")

However, this triggered an error message:

  Error in FUN(1L[[1L]], ...) :
    NA/NaN/Inf in foreign function call (arg 1)

I guess there are a few big negative numbers in lnabsdif_transaud. I was wondering if there is an appropriate way to truncate those numbers in a reasonable way.

Regards,
Moohwan Kim
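The error text refers to non-finite values (NA, NaN or Inf) reaching compiled code, which is a different condition from a large but finite negative number. One plausible cause, offered here only as an assumption because the full series is not shown, is an exact zero difference somewhere in the data, since log(0) is -Inf. A minimal sketch, with a made-up vector `d` standing in for dif_transaud:

```r
# Hypothetical stand-in for dif_transaud: one exact zero among ordinary values.
d <- c(0.0065880493, -0.0065880490, 0, 0.0197745715)

lnabs <- log(abs(d))              # mathematically the same as 0.5 * log(d^2)
bad <- which(!is.finite(lnabs))   # positions holding NA, NaN, Inf or -Inf
bad

# One possible remedy (an assumption, not advice given in the thread):
# drop the non-finite entries before passing the series on.
clean <- lnabs[is.finite(lnabs)]
```

Whether dropping, replacing or otherwise treating those entries is appropriate depends on the analysis; the sketch only shows how to locate them.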
Re: [R] How to rank matrix data by deciles?
On 2010-05-13 17:50, Phil Spector wrote:
> Vincent -
> I'm afraid there's no solution other than artificially modifying the zeroes:
>
>   > vec
>    [1] 26.58950617 5.73074074 5.9622 5.6478 20.95728395 0. 0.0700 12.8689
>    [9] 3.64543210 0.05049383 25.6089 3.53246914 0. 31.39049383 3.77641975 13.19617284
>   [17] 0.
>   > cut(vec,quantile(vec,(0:10)/10),include.lowest=TRUE,label=FALSE)
>   Error in cut.default(vec, quantile(vec, (0:10)/10), include.lowest = TRUE, :
>     'breaks' are not unique
>   > vec[vec==0] = jitter(vec[vec==0])
>   > cut(vec,quantile(vec,(0:10)/10),include.lowest=TRUE,label=FALSE)
>    [1] 10 6 7 5 9 1 3 7 4 2 9 4 2 10 5 8 1
>
> It gives an answer, but it may not make sense for all data.
>
> - Phil

The problem is that quantile() produces duplicate values for the breaks used in cut(). Phil's suggestion modifies the data. It might be preferable to modify the breaks:

  eps <- .Machine$double.eps  # or use something like 1e-10
  brks <- quantile(vec, (0:10)/10) + eps*(0:10)
  cut(vec, brks, include.lowest=TRUE, labels=FALSE)
  #[1] 10 6 7 5 9 1 3 7 4 2 9 4 1 10 5 8 1

-Peter Ehlers

> On Thu, 13 May 2010, vincent.deluard wrote:
>> Dear Phil,
>>
>> You helped me with a request to rank matrix columns by deciles two weeks ago. This really unblocked me on this project, but I found a little bug.
>>
>> As before, my data is in a matrix:
>>
>>   > madebt[1:16,1:2]
>>         X4.19.2010  X4.16.2010
>>    [1,] 26.61197531 26.58950617
>>    [2,]  5.72765432  5.73074074
>>    [3,]  5.95839506  5.9622
>>    [4,]  5.6433      5.6478
>>    [5,] 20.93814815 20.95728395
>>    [6,]  0.          0.
>>    [7,]  0.0700      0.0700
>>    [8,] 12.87802469 12.8689
>>    [9,]  3.64407407  3.64543210
>>   [10,]  0.05037037  0.05049383
>>   [11,] 25.59024691 25.6089
>>   [12,]  3.47987654  3.53246914
>>   [13,]  0.          0.
>>   [14,] 31.39037037 31.39049383
>>   [15,]  3.78296296  3.77641975
>>   [16,] 13.17876543 13.19617284
>>
>> The apply function will work for this sample of my data:
>>
>>   > debtdeciles = apply(madebt[1:16,1:2],2,function(x) cut(x,quantile(x,(0:10)/10, na.rm=TRUE),label=FALSE,include.lowest=TRUE))
>>   > debtdeciles
>>         X4.19.2010 X4.16.2010
>>    [1,]         10         10
>>    [2,]          6          6
>>    [3,]          6          6
>>    [4,]          5          5
>>    [5,]          8          8
>>    [6,]          1          1
>>    [7,]          2          2
>>    [8,]          7          7
>>    [9,]          4          4
>>   [10,]          2          2
>>   [11,]          9          9
>>   [12,]          3          3
>>   [13,]          1          1
>>   [14,]         10         10
>>   [15,]          4          4
>>   [16,]          8          8
>>
>> However, it will fail for
>>
>>   > madebt[1:17,1:2]
>>         X4.19.2010  X4.16.2010
>>    [1,] 26.61197531 26.58950617
>>    [2,]  5.72765432  5.73074074
>>    [3,]  5.95839506  5.9622
>>    [4,]  5.6433      5.6478
>>    [5,] 20.93814815 20.95728395
>>    [6,]  0.          0.
>>    [7,]  0.0700      0.0700
>>    [8,] 12.87802469 12.8689
>>    [9,]  3.64407407  3.64543210
>>   [10,]  0.05037037  0.05049383
>>   [11,] 25.59024691 25.6089
>>   [12,]  3.47987654  3.53246914
>>   [13,]  0.          0.
>>   [14,] 31.39037037 31.39049383
>>   [15,]  3.78296296  3.77641975
>>   [16,] 13.17876543 13.19617284
>>   [17,]  0.          0.
>>   > debtdeciles = apply(madebt[1:17,1:2],2,function(x)
>>   + cut(x,quantile(x,(0:10)/10, na.rm=TRUE),label=FALSE,include.lowest=TRUE))
>>   Error in cut.default(x, quantile(x, (0:10)/10, na.rm = TRUE), label = FALSE, :
>>     'breaks' are not unique
>>
>> My guess is that we now have 3 zeros in each column. For each decile, we cannot have more than 2 elements (total of 17 numbers in each column) and I believe R cannot determine where to put the third zero.
>>
>> Do you have any solution for this problem?
>>
>> Many thanks,
>>
>> --
>> View this message in context: http://r.789695.n4.nabble.com/How-to-rank-matrix-data-by-deciles-tp2133496p2215945.html
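The break-shifting idea from the reply can be tried on a small self-contained vector. The sketch below uses made-up data (three zeros among 17 values, mirroring the situation in the thread) and the larger offset of 1e-10 that the reply mentions as an alternative to .Machine$double.eps; the variable names are illustrative only.

```r
# Made-up data: 17 values, three of them zero, as in the failing case.
vec <- c(0, 0, 0, 1:14)

brks <- quantile(vec, (0:10)/10)
dup <- anyDuplicated(brks) > 0    # TRUE: the first two breaks are both 0

# cut() refuses duplicated breaks; capture the message instead of stopping.
err <- tryCatch(cut(vec, brks, include.lowest = TRUE, labels = FALSE),
                error = function(e) conditionMessage(e))

# Shift each break by a tiny, increasing amount so that all are distinct.
brks2 <- brks + 1e-10 * (0:10)
dec <- cut(vec, brks2, include.lowest = TRUE, labels = FALSE)
dec
```

With this approach all tied zeros land in the first bin, so some bins can hold more values than others; whether that is acceptable depends on how the deciles are used afterwards.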
[R] sample
Hi,

I am sampling two random columns from females and two random columns from males to produce tetraploid offspring. For every female I am sampling a random male. In the end I want to write out a matrix with all the offspring, but that does not work. I always get only the offspring from the last females. There must be a mistake in my script:

  moms <- read.delim("females.txt", stringsAsFactors=FALSE, header=TRUE)
  dads <- read.delim("males.txt", stringsAsFactors=FALSE, header=TRUE)
  output_offspring <- data.frame()
  for (i in 1:nrow(moms)){
    rdad=sample(1:nrow(dads),1)
    kid <- c(sample(moms[i,2:5],2),sample(dads[rdad,2:5],2))
    output_offspring <- rbind(output_offspring,c(moms$SampleID[i],dads$SampleID[rdad],kid))
  }
  write.table(output_offspring,"offspring_7.txt",row.names=T,col.names=T,quote=F)

females.txt:

  SampleID A1 A2 A3 A4
  GM920222 GATTGCC GATTGCC GATAGAC GATAGAC
  GM930040 GTCATCA GAGTGCA ACTATAA GATTGCC
  GM930040 GTCATCA GAGTGCA ACTATAA GATTGCC
  GM960023 GATTGCC GTCATCA GATTGCC GATTGCC
  GM920224 ACTAGAA GTCATCA GTCATCA ACTAGAA
  GM920224 ACTAGAA GTCATCA GTCATCA ACTAGAA
  GM920034 GATTGCC GTCATCA GATTGCA GATTGCA
  GM920096 GATTGCC GATTGCC GATTGCA GATTGCC
  GM930029 GTCATCA GATTGCC GTCATCA GATTGCC
  GM940031 GATTGCC GAGTGCA GATTGCA ACTAGAA
  GM960028 GATTGCC GAGTGCA GATTGCA ACTAGAA
  GM980007 GTCATCA GATTGCC ACTTGAA GTCATCA
  GM970009 ACTAGAA GTCAGAA GTCAGCA ACTAGCA
  GM930026 ACTAGAA GAGTGCA GAGTGCA ACTAGAA
  GM920031 GATTGCC GTCATCA GATTGCC GATTGCC
  GM990105 GATTGCC GATTGCC GTCAGCA GTCAGCA
  GM920202 GATTGCC GATTGCC GATTGCC GATTGCC
  GM920089 GAGTGCA GTCAGAA ACTATCA GATTGCC
  GM980051 ACTAGAA ACTAGAA GATAGCC GATAGCC
  GM930109 GTCATCA GAGTGCA GAA ACTAGAA
  GM940039 GTCATCA GAGTGCA GTTTGCC ACTTTCA
  GM050099 GAGTGCA GTCAGAA GTTATCC ACTTTCA
  GM050099 GAGTGCA GTCAGAA GTTATCC ACTTTCA
  GM030005 ACTAGAA GAGTGCA ACTAGAA ACTAGAA
  GM050009 ACTAGAA GATTGCC GATTGCC ACTAGAA
  GM990027 GATTGCC GAGTGCA GATTGCA GATTGCC
  GM990066 GATTGCC GTCATCA GTCATCA GATTGCC

males.txt:

  SampleID A1 A2 A3 A4
  WI920425 ACTAGAA ACCATCA ACTAGAA ACTAGAA
  WI920408 ACTAGAA ACTAGAA ACTAGAA ACTAGAA
  WI920009 ACTAGAA ACTAGAA ACTAGAA GATTGCC
  WI920352 ACTTTCA ACGTTCA GAGAGAA GATTGCA
  WI920004 GATTGCC GATTGCC ACTAGAA ACTAGAA
  WI920353 ACTAGAA GATTGCC ACTAGAA GATTGCC
  WI920410 ACTAGAA GTCAGAA GAGTACC ACTTTCA
  WI920007 ACTAGAA ACTTTCA GAATGCA GTTAGAC
  WI920015 ACTTTCA ACGTTCA GTCAGAA GATTGCC
  WI920426 ACTAGAA GTCATCA GTCATCA ACTAGAA
  WI920433 ACTAGAA GTCAGAA GTCTGCA ACTTGCA
  WI920370 GATTGCC GAGTGCA GATTGCA ACTAGAA
  WI920437 GTCATCA GTCAGAA GATTGCC ACTTTCA
  WI920027 GATTGCC GAGTGCA GATTACC GATTGCC
  WI920415 GATTGCC GAGTGCA GTCATCA ACT
  WI920023 ACTTTCA GTCAGAA GAGATCA GATTGCC
  WI920360 GATTGCC GTCATCA GATTGCA ACTTTCA
  WI920017 GATTGCC GTCAGAA GATTTCC ACTAGCA
  WI920028 GTCATCA GTCAGAA GATTGCC ACTTGCA
  WI920361 GATTGCC GAGTGCA GTCAGCA GATTGCC
  WI920367 GATTGCC GATTGCC GTCATCA GATTGCC
  WI920366 GATTGCC GATTGCC GTCTGCA GTCTGCA
  WI920365 GATTGCC GAGTGCA GTCAGCA GTTTGCC
  WI920362 GATTGCC GAGTGCA GATTGCA ACTAGAA
  WI920441 GATTGCC GAGTGCA GATTGCA ACTAGAA
  WI920022 GTCATCA GTCAGAA GATTGCC ACTTTCA
  WI920356 GTCATCA GTCAGAA GATTGCC ACTTTCA
  WI920355 GATTGCC GATTGCC GATTGCC GATTGCC
  WI920423 GATTGCC GATTGCC GATTGCC GATTGCC
  WI920021 GATTGCC GATTGCC GTCAGCA GATTGCC
  WI920359 GATTGCC GATTGCC GTCAGCA GATTGCC
  WI920024 GATTGCC GATTGCC GATTGCC GATTGCC
  WI920369 GATTGCC GATTGCC GATTGCC GATTGCC
  WI920416 GATTGCC GATTGCC GATTGCC GATTGCC
  WI920427 GATTGCC GATTGCC GATTGCC GATTGCC
  WI920428 GATTGCC GATTGCC GATTGCC GATTGCC
  WI920431 GATTGCC GATTGCC GATTGCC GATTGCC
  WI920001 GATTGCC GTCATCA GTCATCA GATTGCC
  WI920010 GATTGCC GTCATCA GTCATCA GATTGCC
  WI920349 GATTGCC GTCATCA GTCATCA GATTGCC
  WI920363 GATTGCC GTCATCA GTCATCA GATTGCC
  WI920417 GATTGCC GTCATCA GTCATCA GATTGCC
  WI920430 GATTGCC GTCATCA GTCATCA GATTGCC
Re: [R] improvement
Hi,

What if I just want a vector filled with the names that have length(index) > 0? For example, if

  nombreC <- c("Juan", "Carlos", "Ana", "María")
  nombreL <- c("Juan Campo", "Carlos Gallardo", "Ana Iglesias", "María Bacaldi", "Juan Grondona", "Dario Grandineti", "Jaime Acosta", "Lourdes Serrano")

I would like to obtain a matrix called vaca with two columns, name and index, where name is an element of nombreC and index is its position in nombreL. I don't want any rows for elements of nombreC that do not appear in nombreL. I would also like to count how many matches there are. For example:

  vaca:
      name index
  1   Juan     1
  2   Juan     5
  3 Carlos     2
  4    Ana     3
  5  María     4

Here is the code:

  vaca <- do.call(rbind,lapply(noquote(nombreC),function(.name) {
    index <- grep(.name,noquote(nombreL))
    index <- if( length(index) > 0) index else 0
    data.frame(name=.name,index=index)
  }))
  vaca <- vaca[vaca$index > 0,]
  cuenta <- nrow(vaca)

Thanks,
Sebastián

2010/5/11 markle...@verizon.net:
> Hi: I added another column to make the output more understandable. I hope it helps.
>
>   do.call(rbind,lapply(nombreC,function(.name) {
>     index <- grep(.name,nombreL)
>     nummatches <- if (length(index) > 0) length(index) else 0
>     index <- if( length(index) > 0) index else 0
>     data.frame(name=.name,index=index,nummatches=nummatches)
>   }))
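The filtering step (insert a 0, then drop the zeros) can be avoided by collecting the grep() results in a list and expanding them directly. This is only a sketch of one alternative, not the approach taken in the thread; it uses a shortened, ASCII-only subset of the names from the post, and "Pedro" is a hypothetical name added to show a non-matching entry.

```r
nombreC <- c("Juan", "Carlos", "Ana", "Pedro")   # "Pedro" matches nothing
nombreL <- c("Juan Campo", "Carlos Gallardo", "Ana Iglesias",
             "Juan Grondona", "Dario Grandineti")

# One integer vector of matching positions per name (possibly empty).
hits <- lapply(nombreC, function(nm) grep(nm, nombreL, fixed = TRUE))

# Repeat each name once per match; names without matches drop out.
vaca <- data.frame(name = rep(nombreC, lengths(hits)),
                   index = unlist(hits))
cuenta <- nrow(vaca)
vaca
```

Here fixed = TRUE treats each name as a literal string rather than a regular expression, which seems the safer default for plain names.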
Re: [R] Location attribute
On 16-May-10 06:11:52, Agustín Muñoz M. (AMFOR) wrote:
> Hi everybody,
>
> A question: how can I find the location (number) of an attribute from its name? E.g.
>
>   X1 X2 X3 X4 X5 X6
>    1  3  5  2  1  7
>    6  7  4  5  2  9
>
> How can I find out that the attribute X4 is in position 4?
>
> I hope you can help me. Thank you all very much in advance.
>
> Agustín

You can use the function colnames(), with either a matrix or a dataframe, to extract (or set) the column names:

  X <- matrix(c( 1,3,5,2,1,7,6,7,4,5,2,9), byrow=TRUE,nrow=2)
  colnames(X) <- c("X1","X2","X3","X4","X5","X6")
  X
  #      X1 X2 X3 X4 X5 X6
  # [1,]  1  3  5  2  1  7
  # [2,]  6  7  4  5  2  9
  colnames(X)
  # [1] "X1" "X2" "X3" "X4" "X5" "X6"
  which(colnames(X)=="X4")
  # [1] 4

  X <- data.frame(X1=c(1,6),X2=c(3,7),X3=c(5,4),
                  X4=c(2,5),X5=c(1,2),X6=c(7,9))
  X
  #   X1 X2 X3 X4 X5 X6
  # 1  1  3  5  2  1  7
  # 2  6  7  4  5  2  9
  colnames(X)
  # [1] "X1" "X2" "X3" "X4" "X5" "X6"
  which(colnames(X)=="X4")
  # [1] 4

So, in either case, which(colnames(X)=="X4") will give the result you want.

Hoping this helps,
Ted.

E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 16-May-10 Time: 09:42:28
-- XFMail --
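As a small complement to the which() idiom above, base R's match() returns the position of the first matching name and NA when there is no match, which can be convenient when looking up several names at once. A brief sketch using the same data frame as in the reply ("X9" is a deliberately absent, made-up name):

```r
X <- data.frame(X1 = c(1, 6), X2 = c(3, 7), X3 = c(5, 4),
                X4 = c(2, 5), X5 = c(1, 2), X6 = c(7, 9))

pos <- match("X4", colnames(X))          # position of the column named "X4"
pos

several <- match(c("X4", "X2"), colnames(X))   # several lookups in one call
missing_pos <- match("X9", colnames(X))        # NA: no such column
```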
Re: [R] p value
Hi Soham,

I don't feel your question is well defined. But an equally ill-defined answer would be: through a permutation test.

Contact Details:
Contact me: tal.gal...@gmail.com | 972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English)

On Sat, May 15, 2010 at 7:04 PM, Soham soham.tommarvolorid...@gmail.com wrote:
> How to compute the p-value of a statistic generally?
>
> --
> View this message in context: http://r.789695.n4.nabble.com/p-value-tp2217867p2217867.html
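To make the permutation-test suggestion concrete, here is a minimal sketch for one common case, the difference in means between two groups. The data, the choice of statistic and the number of permutations are all made up for illustration; the general recipe is to recompute the statistic under random relabelling and see how often it is at least as extreme as the observed value.

```r
set.seed(42)                        # reproducible shuffles
x <- c(5.1, 4.9, 6.2, 5.8, 6.0)     # made-up group 1
y <- c(4.2, 4.8, 5.0, 4.4, 4.6)     # made-up group 2

obs <- mean(x) - mean(y)            # observed statistic
pooled <- c(x, y)
n_x <- length(x)

# Recompute the statistic under random relabelling of the pooled data.
perm <- replicate(9999, {
  idx <- sample(length(pooled), n_x)
  mean(pooled[idx]) - mean(pooled[-idx])
})

# Two-sided p-value; the +1 counts the observed arrangement itself.
p_value <- (sum(abs(perm) >= abs(obs)) + 1) / (length(perm) + 1)
p_value
```

The same skeleton should carry over to other statistics by swapping the two lines that compute `obs` and the permuted value, provided the relabelling scheme matches the null hypothesis being tested.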
Re: [R] problem with making multiple plots (geom_pointrange) in a loop (ggplot2)
Hi,

On 16 May 2010 03:31, michael westphal mi_westp...@yahoo.com wrote:
> [ snipped ]
> Any suggestions?

I'd suggest you
- read the posting guide
- upgrade your R to the latest version
- don't post to two mailing lists
- make your example minimal, self-contained, reproducible
- show the result of sessionInfo()

HTH,

baptiste
[R] RODBC-Error-sqlSave
Dear R-community,

After repeating the sqlSave command 3 times on a data frame (of size 13149 rows * 5 columns) to my MS Access database, I get the following error:

  Error in sqlSave(channel, eksport_transp_acc_2, "transp_acc_scenarier", :
    unable to append to table ‘transp_acc_scenarier’

This means that the first 2 saves complete, but the third one somehow does not. I suspect that it may be due to an out-of-memory problem. My PC has 2 CPUs, 1.83 GHz, 0.99 GB RAM.

Has anyone got an idea of what causes the problem and how to solve it? I have also tried the function gc(), but without success.

Thanks in advance,
Best regards,
Johan

PS: I use the following code, where the file eksport_transp_acc_2_rbind.csv is of size 13149*5:

  library(RODBC)
  eksport_transp_acc_2 <- read.table(file = "results/csv/eksport_transp_acc_2_rbind.csv", sep = ";", header = T)
  sqlSave(channel, eksport_transp_acc_2, "transp_acc_scenarier", append = T, fast = F, rownames = F)

--
Johan Lassen
In the cities people live in time - in the mountains people live in space
Re: [R] sample
On 2010-05-16 2:23, Laetitia Schmid wrote:
> Hi,
>
> I am sampling two random columns from females and two random columns from males to produce tetraploid offspring. For every female I am sampling a random male. In the end I want to write out a matrix with all the offspring, but that does not work. I always get only the offspring from the last females. There must be a mistake in my script:
>
>   moms <- read.delim("females.txt", stringsAsFactors=FALSE, header=TRUE)
>   dads <- read.delim("males.txt", stringsAsFactors=FALSE, header=TRUE)
>   output_offspring <- data.frame()
>   for (i in 1:nrow(moms)){
>     rdad=sample(1:nrow(dads),1)
>     kid <- c(sample(moms[i,2:5],2),sample(dads[rdad,2:5],2))
>     output_offspring <- rbind(output_offspring,c(moms$SampleID[i],dads$SampleID[rdad],kid))
>   }

(When I run your code, I get an error.)

It's always best to pre-assign your output to have the desired dimensions and then fill in the cells:

  output_offspring <- as.data.frame(matrix(, nrow=nrow(moms), ncol=6),
                                    stringsAsFactors=FALSE)
  for (i in 1:nrow(moms)){
    rdad <- sample(1:nrow(dads),1)
    kid <- c(sample(moms[i,2:5],2), sample(dads[rdad,2:5],2))
    output_offspring[i,] <- c(moms$SampleID[i], dads$SampleID[rdad], kid)
  }

Personally, I would work with matrices, since all of your data are string variables.
-Peter Ehlers

> write.table(output_offspring,"offspring_7.txt",row.names=T,col.names=T,quote=F)
>
> [ females.txt and males.txt listings snipped ]
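The closing remark about working with matrices can be sketched as follows. This is an illustration under stated assumptions, not the code from the thread: it uses only the first three rows of each posted file, typed inline so the example is self-contained, and the output column names (mom, dad, m1, m2, d1, d2) are made up.

```r
set.seed(1)   # reproducible draws

# First three rows of females.txt and males.txt from the post.
moms <- data.frame(SampleID = c("GM920222", "GM930040", "GM960023"),
                   A1 = c("GATTGCC", "GTCATCA", "GATTGCC"),
                   A2 = c("GATTGCC", "GAGTGCA", "GTCATCA"),
                   A3 = c("GATAGAC", "ACTATAA", "GATTGCC"),
                   A4 = c("GATAGAC", "GATTGCC", "GATTGCC"),
                   stringsAsFactors = FALSE)
dads <- data.frame(SampleID = c("WI920425", "WI920408", "WI920009"),
                   A1 = c("ACTAGAA", "ACTAGAA", "ACTAGAA"),
                   A2 = c("ACCATCA", "ACTAGAA", "ACTAGAA"),
                   A3 = c("ACTAGAA", "ACTAGAA", "ACTAGAA"),
                   A4 = c("ACTAGAA", "ACTAGAA", "GATTGCC"),
                   stringsAsFactors = FALSE)

mom_alleles <- as.matrix(moms[, 2:5])   # character matrices of alleles
dad_alleles <- as.matrix(dads[, 2:5])

# Pre-allocate one row per mother: two IDs plus four sampled alleles.
offspring <- matrix(NA_character_, nrow = nrow(moms), ncol = 6,
                    dimnames = list(NULL, c("mom", "dad", "m1", "m2", "d1", "d2")))

for (i in seq_len(nrow(moms))) {
  rdad <- sample(nrow(dads), 1)         # pick one father at random
  offspring[i, ] <- c(moms$SampleID[i], dads$SampleID[rdad],
                      sample(mom_alleles[i, ], 2),
                      sample(dad_alleles[rdad, ], 2))
}
offspring
```

Because every cell is a string, a character matrix avoids the list and factor handling that comes with filling a data frame row by row.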
Re: [R] abline limit constrain x-range how?
On 05/16/2010 12:03 AM, Giovanni Azua wrote:
> Hello,
>
> I managed to linearize my LDA decision boundaries; now I would like to call abline three times but be able to specify the exact x range. I was reading the doc but it doesn't seem to support this use case. Are there alternatives? The reason why I use abline is that I first call plot to plot all three datasets and then call abline to append these decision boundary lines to the existing plot ...

Hi Giovanni,

Try the ablineclip function in the plotrix package.

Jim
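If installing plotrix is not an option, a line restricted to a given x range can also be drawn with base graphics by computing its two endpoints and passing them to segments(). The helper name `line_endpoints`, the intercept, the slope and the range below are all hypothetical values chosen for illustration.

```r
# Hypothetical helper: endpoints of y = a + b*x restricted to [x1, x2].
line_endpoints <- function(a, b, x1, x2) {
  list(x = c(x1, x2), y = a + b * c(x1, x2))
}

plot(1:10, 1:10)                    # stand-in for the existing scatter plot
ends <- line_endpoints(a = 1, b = 0.5, x1 = 2, x2 = 6)

# Add the clipped line to the current plot.
segments(ends$x[1], ends$y[1], ends$x[2], ends$y[2], col = "red")
```

Calling the helper once per decision boundary, each with its own range, adds the three lines to the same plot.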
Re: [R] Normalizing plot tick values
On 05/16/2010 03:10 AM, rajesh j wrote:
> Hi,
>
> I have a plot whose tick values along the axis have a certain range 0 - x. I need to normalize this range without changing my data files. For example, if my plot has tick values at 10, 20, 30, 40, 50, ... I have to make these 2, 4, 6, etc., but without changing the plot data. I am hoping I can add something to the plot command that says, roughly, tick values divided by a quantity.
>
> Any help is appreciated.

Hi Rajesh,

I think the axis.mult function in the plotrix package will do what you want.

Jim
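The same relabelling can also be done with base graphics alone, by suppressing the default axis and drawing one whose labels are the tick positions divided by a constant. A minimal sketch with made-up data and a divisor of 5, matching the 10, 20, 30 to 2, 4, 6 example in the question:

```r
x <- seq(0, 50, by = 5)            # made-up data on the original 0-50 scale
y <- x^2

plot(x, y, xaxt = "n")             # draw the plot without the default x axis
ticks <- seq(10, 50, by = 10)      # tick positions, in data units
tick_labels <- ticks / 5           # labels shown: 2, 4, 6, 8, 10
axis(1, at = ticks, labels = tick_labels)
```

The data themselves are untouched; only the text printed at each tick changes.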
Re: [R] Vector recycling and zoo
On May 16, 2010, at 2:00 AM, Sean Carmody wrote:
> I am a bit confused about the different approaches taken to recycling in plain data frames and zoo objects. When carrying out simple arithmetic, data frames seem to recycle single arguments; zoo objects do not. Here is an example:
>
>   > x <- data.frame(a=1:5*2, b=1:5*3)
>   > x
>      a  b
>   1  2  3
>   2  4  6
>   3  6  9
>   4  8 12
>   5 10 15
>   > x$a/x$a[1]
>   [1] 1 2 3 4 5
>   > x <- zoo(x)
>   > x$a/x$a[1]
>   1
>   1
>
> I feel understanding this difference would lead me to a greater understanding of the zoo module!

I think you do have misunderstandings about the zoo package, but I do not think they are in the area of vector recycling. Notice the effect of your application of the zoo function to x:

  > x$a
   1  2  3  4  5
   2  4  6  8 10
  > x$a[1]
  1
  2

You have in effect transposed the elements in x and are now getting a two element column vector when requesting x$a[1]. The term vector recycling is applied to situations where short vectors are reused, starting with their first elements, until the necessary length is achieved. For instance, if you request:

  > data.frame(x=1:2, y=letters[1:10])
     x y
  1  1 a
  2  2 b
  3  1 c
  4  2 d
  5  1 e
  6  2 f
  7  1 g
  8  2 h
  9  1 i
  10 2 j

Or

  > plot(1:10, col=c("red","green"))

> Sean.
> --
> Sean Carmody
[R] Loading Intraday Time Series Data
Hi,

I am trying to load a data file that looks like this:

  Date,Time,Open,High,Low,Close,Up,Down
  05/02/2001,0030,421.20,421.20,421.20,421.20,11,0
  05/02/2001,0130,421.20,421.40,421.20,421.40,7,0
  05/02/2001,0200,421.30,421.30,421.30,421.30,0,5
  05/02/2001,0230,421.60,421.60,421.50,421.50,26,1
  etc.

into an R timeseries or ts object. The key point is that both the date and time need to become part of the index.

With zoo, this line will load the data:

  z <- read.zoo("foo_hs.csv", format = "%m/%d/%Y", sep=",", header = TRUE )

but the Time does not become part of the index this way. This means the index is non-unique, and that is not the goal.

Could someone kindly show me a way, using R itself, to deal with the separate Date and Time columns so as to properly combine them into the index for the timeseries?

Thanks!
Re: [R] number of location attribute with its name
On May 16, 2010, at 2:24 AM, Agustín Muñoz M. (AMFOR) wrote:
> Hi everybody,
>
> A question: how can I find the location (number) of an attribute from its name? E.g.
>
>   X1 X2 X3 X4 X5 X6
>    1  3  5  2  1  7
>    6  7  4  5  2  9
>
> How can I find out that the attribute X4 is in position 4?

It is probably not an attribute in R terms, but rather a column name. Your English is a bit unclear as to what you want, but perhaps you want:

  names(dataframe_name)[4]
Re: [R] RODBC-Error-sqlSave
Let us see if it is an R issue. Try this: import the CSV into MS Access directly. If that succeeds, we will then check R.

Caveman

On Sun, May 16, 2010 at 11:48 AM, Johan Lassen johanlas...@gmail.com wrote:
> Dear R-community,
>
> After repeating the sqlSave command 3 times on a data frame (of size 13149 rows * 5 columns) to my MS Access database, I get the following error:
>
>   Error in sqlSave(channel, eksport_transp_acc_2, "transp_acc_scenarier", :
>     unable to append to table ‘transp_acc_scenarier’
>
> This means that the first 2 saves complete, but the third one somehow does not. I suspect that it may be due to an out-of-memory problem. My PC has 2 CPUs, 1.83 GHz, 0.99 GB RAM.
>
> Has anyone got an idea of what causes the problem and how to solve it? I have also tried the function gc(), but without success.
>
> Thanks in advance,
> Best regards,
> Johan
>
> PS: I use the following code, where the file eksport_transp_acc_2_rbind.csv is of size 13149*5:
>
>   library(RODBC)
>   eksport_transp_acc_2 <- read.table(file = "results/csv/eksport_transp_acc_2_rbind.csv", sep = ";", header = T)
>   sqlSave(channel, eksport_transp_acc_2, "transp_acc_scenarier", append = T, fast = F, rownames = F)
>
> --
> Johan Lassen
> In the cities people live in time - in the mountains people live in space
Re: [R] Loading Intraday Time Series Data
In zoo the index= argument to read.zoo can be a vector of column indices to indicate that the time is split across multiple columns, and the FUN= argument can be used to process the multiple columns. In this example the resulting z uses chron:

  L <- "Date,Time,Open,High,Low,Close,Up,Down
  05/02/2001,0030,421.20,421.20,421.20,421.20,11,0
  05/02/2001,0130,421.20,421.40,421.20,421.40,7,0
  05/02/2001,0200,421.30,421.30,421.30,421.30,0,5
  05/02/2001,0230,421.60,421.60,421.50,421.50,26,1"

  library(zoo)
  library(chron)

  f <- function(x) chron(paste(x[,1]), sprintf("%04d00", x[,2]),
                         format = c("M/D/Y", "HMS"))

  # z <- read.zoo("myfile.csv", index = 1:2, sep=",", header = TRUE, FUN = f)
  z <- read.zoo(textConnection(L), index = 1:2, sep=",", header = TRUE, FUN = f)

On Sun, May 16, 2010 at 7:22 AM, Steve Johns steve.jo...@verizon.net wrote:
> Hi,
>
> I am trying to load a data file that looks like this:
>
>   Date,Time,Open,High,Low,Close,Up,Down
>   05/02/2001,0030,421.20,421.20,421.20,421.20,11,0
>   05/02/2001,0130,421.20,421.40,421.20,421.40,7,0
>   05/02/2001,0200,421.30,421.30,421.30,421.30,0,5
>   05/02/2001,0230,421.60,421.60,421.50,421.50,26,1
>   etc.
>
> into an R timeseries or ts object. The key point is that both the date and time need to become part of the index.
>
> With zoo, this line will load the data:
>
>   z <- read.zoo("foo_hs.csv", format = "%m/%d/%Y", sep=",", header = TRUE )
>
> but the Time does not become part of the index this way. This means the index is non-unique, and that is not the goal.
>
> Could someone kindly show me a way, using R itself, to deal with the separate Date and Time columns so as to properly combine them into the index for the timeseries?
>
> Thanks!
Re: [R] Loading Intraday Time Series Data
Hi Steve,

I think what you want to do is get a unique time-date from the first two columns. Try something like this (changing the file name, obviously). mydates should give you a time and date format that you can add to the existing data.frame.

  mydata <- read.table("C:/rdata/dates.junk.csv", header=TRUE, sep=",",
                       colClasses=c("character", "character", "numeric", "numeric",
                                    "numeric", "numeric", "numeric", "numeric"))
  df1 <- paste(mydata[,1], " ", mydata[,2])
  mydates <- strptime(df1, "%d/%m/%Y %H%M")

--- On Sun, 5/16/10, Steve Johns steve.jo...@verizon.net wrote:

From: Steve Johns steve.jo...@verizon.net
Subject: [R] Loading Intraday Time Series Data
To: r-help@r-project.org
Received: Sunday, May 16, 2010, 7:22 AM

> Hi,
>
> I am trying to load a data file that looks like this:
>
>   Date,Time,Open,High,Low,Close,Up,Down
>   05/02/2001,0030,421.20,421.20,421.20,421.20,11,0
>   05/02/2001,0130,421.20,421.40,421.20,421.40,7,0
>   05/02/2001,0200,421.30,421.30,421.30,421.30,0,5
>   05/02/2001,0230,421.60,421.60,421.50,421.50,26,1
>   etc.
>
> into an R timeseries or ts object. The key point is that both the date and time need to become part of the index.
>
> With zoo, this line will load the data:
>
>   z <- read.zoo("foo_hs.csv", format = "%m/%d/%Y", sep=",", header = TRUE )
>
> but the Time does not become part of the index this way. This means the index is non-unique, and that is not the goal.
>
> Could someone kindly show me a way, using R itself, to deal with the separate Date and Time columns so as to properly combine them into the index for the timeseries?
>
> Thanks!
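A self-contained base R variant of this idea, with the sample rows typed inline, might look like the sketch below. It assumes the dates are month/day/year, as the format string in the original question suggests; if the file is day-first, swap %m and %d. The time zone "UTC" is also an assumption, chosen only to keep the result independent of the local machine.

```r
csv_text <- "Date,Time,Open,High,Low,Close,Up,Down
05/02/2001,0030,421.20,421.20,421.20,421.20,11,0
05/02/2001,0130,421.20,421.40,421.20,421.40,7,0
05/02/2001,0200,421.30,421.30,421.30,421.30,0,5
05/02/2001,0230,421.60,421.60,421.50,421.50,26,1"

# Read Date and Time as character so that "0030" keeps its leading zeros.
mydata <- read.csv(text = csv_text,
                   colClasses = c(Date = "character", Time = "character"))

# Assumes month/day/year, as in the format string of the original post.
stamp <- as.POSIXct(paste(mydata$Date, mydata$Time),
                    format = "%m/%d/%Y %H%M", tz = "UTC")

anyDuplicated(stamp)   # 0 means every timestamp is unique
```

The resulting `stamp` vector can then serve as the index of a time-series object, for example as the order.by argument of zoo().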
Re: [R] Questions about ggplot2
I started with the summarized data, and there are different ways to do this. For this example, let there be four columns and a corresponding sum of 1s.

    library(ggplot2)
    mydf <- data.frame(colname = c("A", "B", "C", "D"), mycolsum = c(1:4))
    p <- ggplot(mydf, aes(x = colname, y = mycolsum))
    p <- p + geom_bar(stat = "identity")

    # Here is one way a legend would be created, and how to remove it.
    library(ggplot2)
    mydf <- data.frame(colname = c("A", "B", "C", "D"), mycolsum = c(1:4))
    p <- ggplot(mydf, aes(fill = colname, x = colname, y = mycolsum))
    p <- p + geom_bar(stat = "identity")
    p + opts(legend.position = "none")

On Thu, May 13, 2010 at 11:33 AM, Christopher David Desjardins cddesjard...@gmail.com wrote:

> Hi
> I have two questions about using ggplot2.
>
> First, I have multiple columns of data that I would like to combine into one histogram where each column of data would correspond to one bar in the histogram. Each column has 0 or 1s and I want my bars in the histogram to correspond to the sum of the 1s in each column. Does that make sense?
>
> Second, is there a way to completely turn off the legend?
>
> Thanks!
> Chris
>
> PS - Please cc me on the email as I'm a digest subscriber.
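The reply starts from already summarized data. A minimal sketch of the step before that, going from 0/1 columns to one count per column, might look like this. The data frame `flags` is made up for illustration. The plotting part assumes a reasonably recent ggplot2, where `theme(legend.position = "none")` takes the place of the `opts()` call used in 2010 and `geom_col()` is shorthand for `geom_bar(stat = "identity")`.

```r
# Hypothetical 0/1 data: one column per bar.
flags <- data.frame(A = c(1, 0, 0), B = c(1, 1, 0), C = c(1, 1, 1))

# colSums() counts the 1s in each column; reshape to one row per column.
mydf <- data.frame(colname = names(flags), mycolsum = unname(colSums(flags)))

# Plot only if ggplot2 is available, so the summary step runs either way.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(mydf, aes(x = colname, y = mycolsum, fill = colname)) +
    geom_col() +                       # same as geom_bar(stat = "identity")
    theme(legend.position = "none")    # replaces opts() in current ggplot2
  print(p)
}
```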
Re: [R] Vector recycling and zoo
When you combine zoo objects with arithmetic it merges them using all = FALSE:

    > library(zoo)
    > x <- data.frame(a=1:5*2, b=1:5*3)
    > x <- zoo(x); x
       a  b
    1  2  3
    2  4  6
    3  6  9
    4  8 12
    5 10 15
    > # these two are the same
    > x$a/x$a[1]
    1
    1
    > m <- merge(x$a, x$a[1], all = FALSE)
    > m
      x$a x$a[1]
    1   2      2
    > m[,1]/m[,2]
    1
    1

On Sun, May 16, 2010 at 3:00 AM, Sean Carmody seancarm...@gmail.com wrote:

> I am a bit confused about the different approaches taken to recycling in plain data frames and zoo objects. When carrying out simple arithmetic, data frames seem to recycle single arguments, zoo objects do not. Here is an example
>
>     > x <- data.frame(a=1:5*2, b=1:5*3)
>     > x
>        a  b
>     1  2  3
>     2  4  6
>     3  6  9
>     4  8 12
>     5 10 15
>     > x$a/x$a[1]
>     [1] 1 2 3 4 5
>     > x <- zoo(x)
>     > x$a/x$a[1]
>     1
>     1
>
> I feel understanding this difference would lead me to a greater understanding of the zoo module!
>
> Sean.
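Following on from the merge explanation above, one way to get the data-frame-style recycling back is to strip the index from the divisor before dividing. The sketch below assumes the zoo package is installed; `coredata()` is zoo's accessor for the underlying values, and the variable names `aligned` and `rescaled` are only illustrative.

```r
library(zoo)  # assumes the zoo package is installed

x <- zoo(cbind(a = 1:5 * 2, b = 1:5 * 3))

# x$a[1] is still a zoo object with the single index 1, so the division
# is carried out only on the index values both operands share.
aligned <- x$a / x$a[1]

# Dropping the index with coredata() leaves a plain number, which is
# recycled over the whole series as in the data frame case.
rescaled <- x$a / coredata(x$a)[1]

print(aligned)
print(rescaled)
```

The design choice in zoo seems to be that arithmetic is always between observations at matching times, so a one-observation series is treated as "defined at one time only" rather than as a constant.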
[R] Box-Cox Transformation: Drastic differences when varying added constants
Dear experts,

I tried to learn about the Box-Cox transformation but found the following thing: When I had to add a constant to make all values of the original variable positive, I found that the lambda estimates (box.cox.powers function) differed dramatically depending on the specific constant chosen.

In addition, the correlation between the transformed variable and the original was not 1 (as I think it should be to use the transformed variable meaningfully) but much lower.

With higher added values (and a right-skewed variable) the lambda estimate was even negative and the correlation between the transformed variable and the original variable was -.91!!?

I guess there is something fundamental missing in my current thinking about Box-Cox...

Best,
Holger

P.S. Here is what I did:

    # Creating a skewed variable X (mixture of two normals)
    x1 = rnorm(120,0,.5)
    x2 = rnorm(40,2.5,2)
    X = c(x1,x2)

    # Adding a small constant
    Xnew1 = X + abs(min(X)) + .1
    box.cox.powers(Xnew1)
    Xtrans1 = Xnew1^.2682    # (the value of the lambda estimate)

    # Adding a larger constant
    Xnew2 = X + abs(min(X)) + 1
    box.cox.powers(Xnew2)
    Xtrans2 = Xnew2^-.2543   # (the value of the lambda estimate)

    # Plotting it all
    par(mfrow=c(3,2))
    hist(X)
    qqnorm(X)
    qqline(X,lty=2)
    hist(Xtrans1)
    qqnorm(Xtrans1)
    qqline(Xtrans1,lty=2)
    hist(Xtrans2)
    qqnorm(Xtrans2)
    qqline(Xtrans2,lty=2)

    # correlation among original and transformed variables
    round(cor(cbind(X,Xtrans1,Xtrans2)),2)
[R] Reading JPEG file, converting to HEX
Colleagues,

I am using R to assemble RTF documents (which are plain text). I need to embed a JPEG graphic that was created with R. I presume that the steps need to be:

a. read the file into R
b. convert the object to HEX format
c. write the converted object to a textfile.

If I read the file into R using readLines, I get the following (only the first 5 lines shown):

    > readLines("/path/to/file")
    [1] \xff\xd8\xff\xe0
    [2] \002\xa3\003\001\
    [3] \v\xff\xc4
    [4] \026\027\030\031\032%'()*456789:CDEFGHIJSTUVWXYZcdefghijstuvwxyz\x83\x84\x85\x86\x87\x88\x89\x8a\x92\x93\x94\x95\x96\x97\x98\x99\x9a\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xff\xc4
    [5] \v\xff\xc4

and I also receive a number of warning messages:

    Warning messages:
    1: In encodeString(object, quote = "\"", na.encode = FALSE) :
      it is not known that wchar_t is Unicode on this platform

I assume (naively) that I need some other approach to reading the file. I also presume (again, naively) that once I have read the file successfully, I can convert the contents to hex format. However, it is not obvious to me what approach should be used to read the file.

I found the command read.jpeg (in rimage; of note, I needed to use the Windows version because the OSX version appears to be broken). However, this command creates an imagematrix, which does not appear to be what I need.

Any thoughts? Thanks in advance.

Dennis

Dennis Fisher MD
P (The P Less Than Company)
Phone: 1-866-PLessThan (1-866-753-7784)
Fax: 1-866-PLessThan (1-866-753-7784)
www.PLessThan.com
Re: [R] p value
runif(1)

Bert Gunter
Genentech Nonclinical Statistics

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Soham
Sent: Saturday, May 15, 2010 9:05 AM
To: r-help@r-project.org
Subject: [R] p value

> How to compute the p-value of a statistic generally?
[R] function density (stats): parameter n
Dear R-list members,

About the parameter n of the function density() (Kernel Density Estimation, package stats):

The R HTML documentation says about the parameter n: "the number of equally spaced points at which the density is to be estimated. When n > 512, it is rounded up to the next power of 2 for efficiency reasons."

Note: 512 is the default size for n.

The code below:

    data <- rnorm(500)
    d <- density(data,n=800)
    length(d$x)

produces this result:

    [1] 800

Here, according to the R HTML documentation, d$x gives the n coordinates of the points where the density is estimated. Of course, given that n=800, the next power of 2 would be 1024.

With regard to the parameter n, does the R documentation match what function density() actually does?

I am using R 2.11.0 running on Windows XP.

Thank you very much.

Paulo Barata

--
Paulo Barata
Fundacao Oswaldo Cruz - Oswaldo Cruz Foundation
Rua Leopoldo Bulhoes 1480 - 8A
21041-210 Rio de Janeiro - RJ
Brazil
E-mail: pbar...@infolink.com.br
Alternative e-mail: paulo.bar...@ensp.fiocruz.br
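The observation in this question is easy to reproduce. As far as I can tell from the source of `density.default`, the rounding to a power of 2 applies only to the internal grid used for the FFT, and the result is then interpolated back onto the n points the caller asked for; treat that explanation as my reading of the code, not as a statement from the documentation quoted above. The sketch below only checks the externally visible behaviour.

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
data <- rnorm(500)

d_default <- density(data)           # n defaults to 512
d_800     <- density(data, n = 800)  # 800 is not a power of 2

# The returned grid has exactly the requested number of points.
length(d_default$x)
length(d_800$x)

# The evaluation points are equally spaced: second differences vanish.
stopifnot(max(abs(diff(d_800$x, differences = 2))) < 1e-8)
```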
Re: [R] plot for linear discriminant
Hi Giovanni,

Have a look at the classifly package for an alternative approach that works for all classification algorithms. If you provided a small reproducible example, I could step through it for you.

Hadley

On Sat, May 15, 2010 at 6:19 AM, Giovanni Azua brave...@gmail.com wrote:

> Hello,
>
> I have a labelled dataset with three classes. I have computed manually the LDA hyperplanes that separate the classes from each other, i.e.
>
>     \hat{\delta}_j(x) = x^T b_j + c_j
>
> where b_j \in \mathbb{R}^p and c_j \in \mathbb{R}. My concrete b_j looks like e.g.
>
>     b_j <- rbind(1,2)
>     c_j <- 3
>
> How can I plot y = x^T b_j + c_j ?? Two problems:
>
> 1- I need lines and the dimension of my x is 2
> 2- I would like the plotted lines to end when they intersect so they nicely show the decision boundaries
>
> Any pointers? Maybe an example with ggplot2; I could not find any from the showcase documentation page ...
>
> Thanks in advance,
> Best regards,
> Giovanni

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/
Re: [R] Reading JPEG file, converting to HEX
Use readBin:

    readBin('/path/to/file', 'raw', n=100)

Here is an example of reading in a JPEG file:

    > x <- readBin(fileName,'raw',n=100)
    > str(x)
     raw [1:801403] ff d8 ff e1 ...
    > # convert to a HEX string (on the first 20 bytes)
    > paste(sprintf("%s", x[1:20]), collapse='')
    [1] ffd8ffe1fffe457869664d4d002a0008

On Sun, May 16, 2010 at 11:13 AM, Dennis Fisher fis...@plessthan.com wrote:

> Colleagues,
>
> I am using R to assemble RTF documents (which are plain text). I need to embed a JPEG graphic that was created with R. I presume that the steps need to be:
> a. read the file into R
> b. convert the object to HEX format
> c. write the converted object to a textfile.
> [...]
> However, it is not obvious to me what approach should be used to read the file.
> [...]

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
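The three steps from the question (read, convert to hex, write to a text file) can be put together along the lines of the readBin answer. This is a sketch under stated assumptions: the temporary file with six arbitrary bytes stands in for a real JPEG, and `file.size()` is used so that `n` always covers the whole file instead of a hand-picked count.

```r
# Stand-in for a real JPEG: a temp file holding a few arbitrary bytes
# (the first two are the JPEG start-of-image marker ff d8).
f <- tempfile(fileext = ".jpg")
writeBin(as.raw(c(0xff, 0xd8, 0xff, 0xe0, 0x00, 0x10)), f)

# a. read the file as raw bytes; file.size() gives the exact byte count,
#    so the read is never cut short by too small an n.
bytes <- readBin(f, what = "raw", n = file.size(f))

# b. as.character() on a raw vector yields two-digit lowercase hex strings.
hex <- paste(as.character(bytes), collapse = "")

# c. write the hex text to a plain text file.
out <- tempfile(fileext = ".txt")
writeLines(hex, out)
```

Unlike `readLines()`, `readBin()` does not interpret the bytes as text in any encoding, which is why the warnings about `wchar_t` from the original attempt do not arise here.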
Re: [R] how to profile R interpreter?
Look for Rprof in the utils package.

On 5/12/2010 9:22 PM, xiaoming gu wrote:

> Hi, all. Does anyone know how to profile R interpreter? I've tried gprof but it doesn't work. Thanks.
>
> Xiaoming

--
Erich Neuwirth, University of Vienna
Faculty of Computer Science
Center for Computer Science Didactics and Learning Research
Visit our SunSITE at http://sunsite.univie.ac.at
Phone: +43-1-4277-39902 Fax: +43-1-4277-39459
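A minimal use of `Rprof` might look like the following. Note that `Rprof` samples R-level function calls, so it answers "where does my R code spend its time" and may not be what the gprof question about the interpreter itself was after. The busy loop is arbitrary filler so that the sampler has something to record.

```r
prof_file <- tempfile(fileext = ".out")

Rprof(prof_file)                       # start sampling the call stack
t0 <- proc.time()[["elapsed"]]
while (proc.time()[["elapsed"]] - t0 < 0.5) {
  invisible(sort(runif(1e5)))          # arbitrary busy work to be profiled
}
Rprof(NULL)                            # stop profiling

# Summarise the samples: time per function, by self time and by total time.
prof <- summaryRprof(prof_file)
head(prof$by.self)
```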
Re: [R] Discretize factors?
Thanks,

That gives me exactly what I'm looking for. Two quick questions:

1) What would be the fastest way to do this if I have other continuous data as well? For example, I have a data frame with 10 variables and want to discretize one of them using this method (say, column 6). I thought something like this would work, but it gives me an error:

    > new.data <- rbind(data[,1:5], model.matrix(~0+data[,6]), data[,7:10])
    Error in rbind(deparse.level, ...) :
      numbers of columns of arguments do not match

2) What exactly is it doing? It appears as if it is a formula similar to lm, but not actually doing any regression?

Thanks again!

-N

On 5/15/10 11:17 AM, Thomas Stewart wrote:

> Maybe this?
>
>     group <- factor(c("A", "B","B","C","C","C"))
>     model.matrix(~0+group)
>
> -tgs
>
> On Sat, May 15, 2010 at 2:02 PM, Noah Silverman n...@smartmediacorp.com wrote:
>
>> Hi,
>>
>> I'm looking for an easy way to discretize factors in R.
>>
>> I've noticed that the lm function does this automatically with a nice result. If I have
>>
>>     group <- c("A", "B","B","C","C","C")
>>
>> and run:
>>
>>     lm(result ~ x1 + group)
>>
>> The lm function has split the group into separate binary variables {0,1} before performing the regression. I now have:
>>
>>     groupA groupB groupC
>>
>> Some of the other models that I want to try won't accept factors, so they need to be discretized this way.
>>
>> Is there a command in R for this, or some easy shortcut? (I tried digging into the lm code, but couldn't find where this is being done.)
>>
>> Thanks!
>> -N
Re: [R] Box-Cox Transformation: Drastic differences when varying added constants
On 2010-05-16 6:22, Holger Steinmetz wrote:

> Dear experts, I tried to learn about Box-Cox-transformation but found the following thing: When I had to add a constant to make all values of the original variable positive, I found that the lambda estimates (box.cox.powers-function) differed dramatically depending on the specific constant chosen.

Let's say that x is such that 1/x has a Normal distribution, i.e. lambda = -1. Then y = (1/x) + b also has a Normal distribution. But you're expecting 1/(x+b) to also have a Normal distribution.

> In addition, the correlation between the transformed variable and the original were not 1 (as I think it should be to use the transformed variable meaningfully) but much lower.

Again, your expectation is faulty. The relationship between the original and transformed variables is not linear (otherwise, why do the transformation?), but cor() computes the Pearson correlation coefficient by default. Try method='spearman'. Better yet, plot the transformed variables vs the original variable for further enlightenment.

-Peter Ehlers

> With higher added values (and a right skewed variable) the lambda estimate was even negative and the correlation between the transformed variable and the original variable was -.91!!? I guess that is something fundamental missing in my current thinking about box-cox...
>
> Best, Holger
>
> P.S. Here is what I did:
> [...]
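The point about Pearson versus Spearman correlation can be checked numerically. The sketch below reuses the mixture from the original post with an arbitrary seed; the exponents 0.27 and -0.25 are round stand-ins for the reported lambda estimates, not re-estimated values. It also illustrates something not stated in the thread but easy to verify: a raw power with negative exponent is a decreasing function, which would account for a negative correlation, whereas the usual Box-Cox form (x^lambda - 1) / lambda is increasing for any nonzero lambda.

```r
set.seed(42)                                   # arbitrary seed
X <- c(rnorm(120, 0, 0.5), rnorm(40, 2.5, 2))  # skewed mixture, as in the post
Xnew <- X + abs(min(X)) + 1                    # shifted to be strictly positive

lambda_pos <- 0.27                  # stand-in for a positive lambda estimate
lambda_neg <- -0.25                 # stand-in for a negative lambda estimate

pow_pos <- Xnew^lambda_pos          # raw power, increasing in Xnew
pow_neg <- Xnew^lambda_neg          # raw power, decreasing in Xnew

# Box-Cox form (x^lambda - 1) / lambda: increasing even for lambda < 0
bc_neg <- (Xnew^lambda_neg - 1) / lambda_neg

vars <- cbind(X, pow_pos, pow_neg, bc_neg)
p <- cor(vars)                        # Pearson: measures linear association
s <- cor(vars, method = "spearman")   # Spearman: measures monotone association

round(p, 2)
round(s, 2)
```

Because every transformation here is strictly monotone, the Spearman correlations with X are exactly +1 or -1, while the Pearson values fall short of that in absolute value.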
Re: [R] Discretize factors?
Update,

I have it working, but now it's producing really ugly labels. Must be a small adjustment to the code. Any ideas??

    ## Create example data.frame
    group <- c("A", "B","B","C","C","C")
    a <- c(1,4,3,4,5,6)
    b <- c(5,4,5,3,4,5)
    d <- data.frame(cbind(a,b,group))

    # create new frame with discretized group
    > cbind(d[,1:2], model.matrix(~0+d[,3]) )
      a b d[, 3]A d[, 3]B d[, 3]C
    1 1 5       1       0       0
    2 4 4       0       1       0
    3 3 5       0       1       0
    4 4 3       0       0       1
    5 5 4       0       0       1
    6 6 5       0       0       1

So, as you can see, it works, but the labels for the groups don't look right.

I then tried using the column name instead of the number and still got ugly results:

    > cbind(d[,1:2], model.matrix(~0+d[,"group"]) )
      a b d[, "group"]A d[, "group"]B d[, "group"]C
    1 1 5             1             0             0
    2 4 4             0             1             0
    3 3 5             0             1             0
    4 4 3             0             0             1
    5 5 4             0             0             1
    6 6 5             0             0             1

Any ideas?

-N

On 5/15/10 11:02 AM, Noah Silverman wrote:

> Hi,
>
> I'm looking for an easy way to discretize factors in R.
> [...]
Re: [R] Splines under tension
Thank you for the helpful direction to the smoothing splines function; it is exactly what I am trying to do.

My data however is 3-D, i.e. I have x and y values, which are coordinates for different field sites, and z values, which are really what I am interested in analysing with interpolation. This has posed a problem with many of the spline functions in R. Even if I input my coordinate data as a matrix for the 'x' value and my site data as the 'y' values, I get the following error:

    Error in xy.coords(x, y) : 'x' and 'y' lengths differ

I have made sure that there are the same number of values and that they are all of the same type, i.e. numeric, but with little luck, and I am a bit lost as to what to try next. Does anyone have any suggestions?

Thanks,
Sam
Re: [R] plot for linear discriminant
Hello Hadley,

Thank you very much for your help! I have just received your book btw :)

On May 16, 2010, at 6:16 PM, Hadley Wickham wrote:

> Hi Giovanni,
>
> Have a look at the classifly package for an alternative approach that works for all classification algorithms. If you provided a small reproducible example, I could step through it for you.
>
> Hadley

Please find below a self-contained example. I managed to complete the task using the graphics package. I would be curious to see how to get one of those really nice ggplot2 graphs with decision boundaries and class regions :)

Thank you!
Best regards,
Giovanni

    # =====
    # (1) Generate sample labelled data
    # =====
    rm(list=ls())                               # clear workspace
    library(mvtnorm)                            # needed for rmvnorm
    set.seed(11)                                # predictability of results

    sigma <- cbind(c(0.5, 0.3), c(0.3, 0.5))    # true covariance matrix
    mu <- matrix(0,nrow=3,ncol=2)
    mu[1,] <- c(3, 1.5)                         # true mean vectors
    mu[2,] <- c(4, 4)
    mu[3,] <- c(8.5, 2)

    x <- matrix(0, nrow = 300, ncol = 3)
    x[,3] <- rep(1:3, each = 100)               # class labels
    x[1:100,1:2]   <- rmvnorm(n = 100, mean = mu[1,], sigma = sigma)   # simulate data
    x[101:200,1:2] <- rmvnorm(n = 100, mean = mu[2,], sigma = sigma)
    x[201:300,1:2] <- rmvnorm(n = 100, mean = mu[3,], sigma = sigma)

    # =====
    # (2) Plot the labelled data
    # =====
    ##
    ## Function for plotting the data separated by classes, hacked out of predplot:
    ## http://stat.ethz.ch/teaching/lectures/FS_2010/CompStat/predplot.R
    ##
    plotclasses <- function(x, main = "", len = 200, ...) {
      xp <- seq(min(x[,1]), max(x[,1]), length=len)
      yp <- seq(min(x[,2]), max(x[,2]), length=len)
      grid <- expand.grid(xp, yp)
      colnames(grid) <- colnames(x)[-3]
      plot(x[,1],x[,2],col=x[,3],pch=x[,3],main=main,xlab='x_1',ylab='x_2')
      text(2.5,4.8,"Class 1",cex=.8)            # class 1
      text(4.2,1.0,"Class 2",cex=.8)            # class 2
      text(8.0,0.5,"Class 3",cex=.8)            # class 3
    }
    plotclasses(x)

    # =====
    # (3) Functions needed: calculate separating hyperplane between two given
    # classes and converting hyperplanes to line equations for the p=2 case
    # =====
    ##
    ## Returns the coefficients for the hyperplane that separates one class from another.
    ## Computes the coefficients according to the formula:
    ## $x^T\hat{\Sigma}^{-1}(\hat{\mu}_0-\hat{\mu}_1) - \frac{1}{2}(\hat{\mu}_0 +
    ## \hat{\mu}_1)^T\hat{\Sigma}^{-1}(\hat{\mu}_0-\hat{\mu}_1)+\log(\frac{p_0}{p_1})$
    ##
    ## sigmainv(DxD) - precalculated sigma (covariance matrix) inverse
    ## mu1(1xD) - precalculated mu mean for class 1
    ## mu2(1xD) - precalculated mu mean for class 2
    ## prior1 - precalculated prior probability for class 1
    ## prior2 - precalculated prior probability for class 2
    ##
    ownldahyperplane <- function(sigmainv,mu1,mu2,prior1,prior2) {
      J <- nrow(mu)                             # number of classes
      b <- sigmainv%*%(mu1 - mu2)
      c <- -(1/2)*t(mu1 + mu2)%*%sigmainv%*%(mu1 - mu2) + log(prior1/prior2)
      return(list(b=b,c=c))
    }

    ##
    ## Returns linear betas (intersect and slopes) for the given hyperplane structure.
    ## The structure is a list that matches the output of the function defined above.
    ##
    ownlinearize <- function(sephyp) {
      return(list(beta0=-sephyp$c/sephyp$b[2],  # line slope and intersect
                  beta1=-sephyp$b[1]/sephyp$b[2]))
    }

    # =====
    # (4) Run lda
    # =====
    library(MASS)                               # needed for lda/qda
    # read in a function that plots
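The conversion from a separating hyperplane to a plottable line, which is what the linearizing helper in the posted code computes, can be verified on the small b_j and c_j from the original question. In two dimensions the boundary is the set of points with b[1]*x1 + b[2]*x2 + c = 0, which rearranges to x2 = -c/b[2] - (b[1]/b[2])*x1. The names `c0`, `beta0` and `beta1` below are illustrative only, and the sketch assumes b[2] is nonzero (a vertical boundary would need separate handling).

```r
# Values taken from the question: b_j <- rbind(1, 2), c_j <- 3
b <- c(1, 2)
c0 <- 3   # named c0 to avoid masking base::c()

# Points on the boundary satisfy b[1]*x1 + b[2]*x2 + c0 = 0, hence
#   x2 = -c0 / b[2] - (b[1] / b[2]) * x1
beta0 <- -c0 / b[2]       # intercept of the boundary line
beta1 <- -b[1] / b[2]     # slope of the boundary line

# Any point on the line should give a discriminant difference of zero.
x1 <- c(-2, 0, 5)
x2 <- beta0 + beta1 * x1
stopifnot(all(abs(b[1] * x1 + b[2] * x2 + c0) < 1e-12))

# In a base-graphics plot the line could then be drawn with
#   abline(a = beta0, b = beta1, lty = 2)
```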
Re: [R] Attempt to customise the plotpc() function
Nikos Alexandris:

> Among the (R-)tools I've seen on the net for (bivariate) Principal Component scatter plots (+histograms), plotpc [1] is the one I like most.
> [...]
> I started the modification by attempting first to get a prcomp version of plotpc() (named it plotpc.svd()) by altering the following:
> [...]
> I am a bit lost now about where I should continue looking for required modifications in the code. Any hints?

Once again I am replying to myself ;-)

I've spent many, many hours searching in manuals and on the net, and trying crazy things, to understand where this mystical (to me) function un() could be sourced from. Eventually I decided to change all of its occurrences to unit(), as I was sure this was the function meant to be used, but I hit another wall: another unknown function, my.plot.something().

I was then sure that they are sourced from somewhere. At some point I was enlightened and had a look in the source plotpc.R, where I found just what I was expecting to find ;-). It may look silly, but to the inexperienced useR it is not. I was only looking at the print-out of plotpc (without the parentheses) and was puzzled that the un() function is nowhere.

- Question: why isn't the whole source of plotpc.R printed out with plotpc? Or why isn't any clue given where this un() function is coming from? methods() and getAnywhere() don't say anything about it.

* Note to self: look at the Source.R

Anyhow, it was a long learning-night-session and I am good to go. In fact, I am very close to what I want to do. I've added the option to use either prcomp or princomp, as well as whether the input dataset should be centered and/or scaled.

I still have some strange issue with respect to how the plotted histograms (of PC1 and PC2) along with their text are flipped. Hopefully I'll fix this too.

I will soon be contacting the author to send him my modifications as enhancement wishes. If anybody is interested to know or help, please post here.

Regards, Nikos
Re: [R] Attempt to customise the plotpc() function
Nikos,

I think you can just replace the line

    pc <- princomp(x[,1:2], scores=TRUE, na.action=na.fail)

with

    pc <- prcomp(x[,1:2], retx=TRUE, center=pc.center, scale.=pc.scale, na.action=na.fail)

and rename the components of pc

    names(pc) <- c('sdev', 'loadings', 'center', 'scale', 'scores')

and then use the rest of the plotpc() code as is (except for maybe having to use flip1=TRUE, etc).

As to why other functions used in plotpc() are not printed when you ask R to print plotpc(): why should they be? Can you imagine the mess that would result if you got the printouts of is.na(), pushViewport, popViewport, ...? Egad!

Anyway, as you've discovered, when you want to modify code, look at the sources.

-Peter Ehlers

On 2010-05-16 12:05, Nikos Alexandris wrote:

> [...]
> - Question: why isn't the whole source of plotpc.R printed out with plotpc? Or why isn't any clue given where this un() function is coming from? methods() and getAnywhere() say anything about it.
> [...]
> I've added the option to use either prcomp or princomp as well as if the input dataset should be centered and/or scaled.
> [...]
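The renaming trick in the reply above relies on the component order of a `prcomp` result, which can be inspected directly. The sketch uses the `USArrests` data that ships with R as an arbitrary stand-in for the two plotted columns. The last part illustrates that the sign of a principal component is arbitrary, which may be related to the need for the flip arguments mentioned above; that connection is my guess, not something stated in the thread.

```r
# USArrests ships with base R; any numeric two-column data would do.
x <- USArrests[, c("Murder", "Assault")]

pc <- prcomp(x, retx = TRUE, center = TRUE, scale. = TRUE)
names(pc)        # "sdev" "rotation" "center" "scale" "x"

# Rename so code written for princomp() finds $loadings and $scores.
names(pc) <- c("sdev", "loadings", "center", "scale", "scores")

# The sign of each component is arbitrary, so flipping a column of both
# loadings and scores describes the same decomposition of the data.
flipped <- pc
flipped$loadings[, 1] <- -flipped$loadings[, 1]
flipped$scores[, 1]   <- -flipped$scores[, 1]

# Both versions reconstruct the same (centered, scaled) data.
same <- all.equal(pc$scores %*% t(pc$loadings),
                  flipped$scores %*% t(flipped$loadings))
```

After the renaming, printing `pc` with the `prcomp` print method will no longer work as usual, since that method expects the original component names; the renamed object is meant only to be handed to code written against `princomp` output.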
Re: [R] Discretize factors?
On 2010-05-16 11:06, Noah Silverman wrote:

> Update, I have it working, but now its producing really ugly labels. Must be a small adjustment to the code. Any ideas??
> [...]
> I then tried using the column name instead of number and still got ugly results:
> [...]
> Any ideas?

Can't you just use names(...) <- c() on your final dataframe?

-Peter Ehlers
Re: [R] Discretize factors?
I could, but with close to 100 columns, it's messy.

On 5/16/10 11:22 AM, Peter Ehlers wrote:
> Can't you just use names(...) <- c() on your final dataframe?
>
> -Peter Ehlers
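With many columns, one way to avoid retyping every name is to rewrite only the offending prefix. The following is a minimal sketch, not code from this thread: it assumes the dummy columns come from model.matrix(~0+d[,3]) as in the example above, so that their names start with the literal text "d[, 3]". The replacement prefix "group" is an arbitrary choice, and `group` is made a factor explicitly so the sketch does not depend on the R version.

```r
# Example data as in the thread.
d <- data.frame(a = c(1, 4, 3, 4, 5, 6),
                b = c(5, 4, 5, 3, 4, 5),
                group = factor(c("A", "B", "B", "C", "C", "C")))

mm <- model.matrix(~ 0 + d[, 3])

# Replace the literal prefix "d[, 3]" in every column name in one step;
# fixed = TRUE keeps the brackets from being read as a regular expression.
colnames(mm) <- sub("d[, 3]", "group", colnames(mm), fixed = TRUE)

out <- cbind(d[, 1:2], mm)
names(out)
```

The same sub() call works unchanged however many dummy columns there are, which is the point when a frame has close to 100 of them.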
Re: [R] Attempt to customise the plotpc() function
Peter Ehlers wrote:
> Nikos, I think you can just replace the line
>
>     pc <- princomp(x[,1:2], scores=TRUE, na.action=na.fail)
>
> with
>
>     pc <- prcomp(x[,1:2], retx=TRUE, center=pc.center, scale.=pc.scale, na.action=na.fail)
>
> and rename the components of pc
>
>     names(pc) <- c('sdev', 'loadings', 'center', 'scale', 'scores')

Right. Yet it is still not enough. I had to change the definition of the limits that feed the viewport, mainly because of the huge difference between an unscaled and a scaled dataset before the PC analysis takes place. Because I want to give (myself) the option of having really informative plots, I've added an extra grid.points() call which, when the data are transformed (centered and/or scaled), prints both the original and the transformed point cloud (with another pch and/or color).

> and then use the rest of the plotpc() code as is (except for maybe having to use flip1=TRUE, etc).

Hmm... I am _now_ working on it to understand how I could make this automatic! If I set flip1 and flip2 (=TRUE), the histograms are located where they should (optically) be printed, but the text (rotation angle) that accompanies the histogram is, I think, not correct. It is quite the opposite angle that is being printed. Any ideas?

> As to why other functions used in plotpc() are not printed when you ask R to print plotpc(): why should they be? Can you imagine the mess that would result if you got the printouts of is.na(), pushViewport, popViewport, ...? Egad!

Thank you Peter. I understand it now. [Ignorant me, but if you don't know something you will probably make mistakes (which is, after all, the learning process).]

> Anyway, as you've discovered, when you want to modify code, look at the sources.

Thank you Peter.

Kindest regards, Nikos
Re: [R] Attempt to customise the plotpc() function
Peter Ehlers wrote:
> and then use the rest of the plotpc() code as is (except for maybe having to use flip1=TRUE, etc).

Nikos:
> Hmm... I am _now_ working on it to understand how I could make this automatic! If I set flip1 and flip2 (=TRUE), the histograms are located where they should (optically) be printed, but the text (rotation angle) that accompanies the histogram is, I think, not correct. It is quite the opposite angle that is being printed. Any ideas?

I was wrong. flip is about the location only and has nothing to do with the rotation angle of the principal components (right?). So it's ok. Maybe there is still a way to auto-define the flips? But it's not worth spending time on it...

Thanks again, Nikos
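The renaming trick suggested in this thread can be tried outside plotpc(). The sketch below is only an illustration under stated assumptions: it uses the built-in USArrests data instead of Nikos's dataset, and it hard-codes center = TRUE and scale. = TRUE where the thread uses the user-defined pc.center and pc.scale arguments.

```r
# USArrests ships with R; any two numeric columns would do.
x <- USArrests[, 1:2]

pc <- prcomp(x, retx = TRUE, center = TRUE, scale. = TRUE)
names(pc)   # prcomp's own names: sdev, rotation, center, scale, x

# Rename so that code written for princomp() finds pc$loadings and pc$scores.
names(pc) <- c("sdev", "loadings", "center", "scale", "scores")

head(pc$scores)
```

The renamed object is still not a drop-in princomp result: prcomp() uses the divisor n - 1 for the variances where princomp() uses n, pc$loadings is a plain matrix rather than an object of class "loadings", and the signs of the components are arbitrary in both functions, which may be related to the need for flip1/flip2 discussed above.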
Re: [R] RODBC-Error-sqlSave
Thank you so much for pointing out this obvious check of the MS Access database! Inspired, I tried to import the csv file directly into the MS Access database and encountered an error saying (freely translated from Danish): "Cannot find search key". The MS Access database is in MS Access 2000 format and I run MS Office 2007 on my machine. Hence I tried making a new MS Access database in 2002-2003 format and did the same operations in R. With this new set-up for the database I had no problems at all saving the large dataframe from R to the new database. It saved even much larger dataframes quickly. So somehow, setting the database up in 2002-2003 format solved the problem for me. Thank you very much!

2010/5/16 Orvalho Augusto orvaq...@gmail.com
> Let us see if it is an R issue. Try this: read the CSV in MS Access directly, as an import in MS Access. If you succeed, we will check R then.
>
> Caveman
>
> On Sun, May 16, 2010 at 11:48 AM, Johan Lassen johanlas...@gmail.com wrote:
>> Dear R-community,
>>
>> After repeating the sqlSave command 3 times on a dataframe (of size 13149 rows * 5 columns) to my MS Access database I get the following error:
>>
>>     Error in sqlSave(channel, eksport_transp_acc_2, "transp_acc_scenarier", : unable to append to table transp_acc_scenarier
>>
>> This means that the first 2 saves complete, but the third somehow does not. I have an idea that perhaps it is due to some out-of-memory problem. My PC has 2 CPUs, 1.83 GHz, 0.99 GB RAM. Has anyone got an idea of what causes the problem and how to solve it? I have also tried the function gc(), but without success.
>>
>> Thanks in advance, Best regards, Johan
>>
>> PS: I use the following code, where the file eksport_transp_acc_2_rbind.csv is of size 13149*5:
>>
>>     library(RODBC)
>>     eksport_transp_acc_2 <- read.table(file = "results/csv/eksport_transp_acc_2_rbind.csv", sep = ";", header = T)
>>     sqlSave(channel, eksport_transp_acc_2, "transp_acc_scenarier", append = T, fast = F, rownames = F)

-- Johan Lassen
In the cities people live in time - in the mountains people live in space (Buddhist monk).
Re: [R] how to profile R interpreter?
Erich Neuwirth wrote:
> Look for Rprof in the utils package.

This was already suggested, but the original poster clarified that he is looking to profile the R interpreter itself, not R scripts.

- Charlie Sharpsteen
Undergraduate -- Environmental Resources Engineering
Humboldt State University

-- View this message in context: http://r.789695.n4.nabble.com/how-to-profile-R-interpreter-tp2196633p2218846.html
[R] GSA getting ngenes into the list
Greetings,

I suppose this is a simple matter for R experts, but for me...

I am using GSA and have the GSA.obj (class GSA) created with GSA, which contains the value ngenes, visible with GSA.obj$ngenes.

I also have the list object (class list) created with GSA.listsets, which contains a table of negative sets with score, p and FDR, and a table of positive sets with the same.

But the list object does not contain ngenes. How do I get ngenes into the list? Suggestions appreciated. Thank you.

-- Loren Engrav, MD
Professor and Chief, Plastic Surgery, 1977-2001
Associate Director, Burn Center, 1977-2001
Univ Washington Seattle
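Since the result of GSA.listsets is described above as an ordinary list, one possible approach is plain assignment of a new named component. The sketch below does not call the GSA package at all: GSA.obj and GSA.list are hypothetical stand-ins built by hand, and only the component name ngenes is taken from the post. How ngenes lines up with the rows of the two tables is not something this sketch can tell.

```r
# Hypothetical stand-ins: a real GSA.obj would come from GSA() and a real
# list from GSA.listsets(); the values here are made up for illustration.
GSA.obj <- list(ngenes = c(15L, 42L, 8L))
GSA.list <- list(negative = data.frame(Score = -1.2, p = 0.01, FDR = 0.05),
                 positive = data.frame(Score = 0.9, p = 0.02, FDR = 0.08))

# A list takes a new named component by simple assignment.
GSA.list$ngenes <- GSA.obj$ngenes

names(GSA.list)
```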
Re: [R] Discretize factors?
Maybe this will lead you to an acceptable solution. Note that I changed how the data set is created. (In your example, the numeric variables were being converted to factor variables. It seems to me that you want something different.) The key difference between my code and yours is that I use the variable name in the model matrix function; that is, I use ~0+grp instead of ~0+d[,3]. As seen below, this change creates non-ugly results.

    grp <- c("A", "B","B","C","C","C")
    a <- c(1,4,3,4,5,6)
    b <- c(5,4,5,3,4,5)
    d <- data.frame(a=a,b=b,grp=grp)
    str(d)
    'data.frame': 6 obs. of 3 variables:
     $ a  : num 1 4 3 4 5 6
     $ b  : num 5 4 5 3 4 5
     $ grp: Factor w/ 3 levels "A","B","C": 1 2 2 3 3 3

    d <- cbind(d,model.matrix(~0+grp,data=d))
    d
      a b grp grpA grpB grpC
    1 1 5   A    1    0    0
    2 4 4   B    0    1    0
    3 3 5   B    0    1    0
    4 4 3   C    0    0    1
    5 5 4   C    0    0    1
    6 6 5   C    0    0    1

    str(d)
    'data.frame': 6 obs. of 6 variables:
     $ a   : num 1 4 3 4 5 6
     $ b   : num 5 4 5 3 4 5
     $ grp : Factor w/ 3 levels "A","B","C": 1 2 2 3 3 3
     $ grpA: num 1 0 0 0 0 0
     $ grpB: num 0 1 1 0 0 0
     $ grpC: num 0 0 0 1 1 1

If you are trying to automate the process (convert factor variables to dummy variables without direct user input of variable names), you have several options. Here is a quick function I wrote that you may have to alter for your own needs.

-tgs

    grp <- c("A", "B","B","C","C","C")
    sex <- c("m","m","m","f","f","f")
    educ <- c("none","some","some","grad","law","med")
    a <- c(1,4,3,4,5,6)
    b <- c(5,4,5,3,4,5)
    d <- data.frame(a=a,b=b,grp=grp,sex=sex,educ=educ)

    Factors.to.dummies <- function(data){
      Factor.Flag <- sapply(data,is.factor)
      formula <- paste("~0+",paste(colnames(data)[Factor.Flag],collapse="+"),sep="")
      data2 <- model.matrix(as.formula(formula),data=data)
      return(cbind(data,data2))
    }

    Factors.to.dummies(d)
      a b grp sex educ grpA grpB grpC sexm educlaw educmed educnone educsome
    1 1 5   A   m none    1    0    0    1       0       0        1        0
    2 4 4   B   m some    0    1    0    1       0       0        0        1
    3 3 5   B   m some    0    1    0    1       0       0        0        1
    4 4 3   C   f grad    0    0    1    0       0       0        0        0
    5 5 4   C   f  law    0    0    1    0       1       0        0        0
    6 6 5   C   f  med    0    0    1    0       0       1        0        0

On Sun, May 16, 2010 at 2:24 PM, Noah Silverman n...@smartmediacorp.com wrote:
> I could, but with close to 100 columns, it's messy.
Re: [R] Discretize factors?
And if you do have many variables in one dataframe, you might wish to construct the formula first using paste():

    nm <- c("0", names(d)[-c(1,2)])
    fo <- as.formula(paste("~", paste(nm, collapse= "+")))
    d <- cbind(d, model.matrix(fo, data=d))

-Peter Ehlers

On 2010-05-16 15:30, Thomas Stewart wrote:
> The key difference between my code and yours is that I use the variable name in the model matrix function; that is, I use ~0+grp instead of ~0+d[,3]. As seen below, this change creates non-ugly results.
>
> If you are trying to automate the process (convert factor variables to dummy variables without direct user input of variable names), you have several options. Here is a quick function I wrote that you may have to alter for your own needs.
>
> -tgs
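The two suggestions in this thread (select the categorical columns automatically, then build the formula from their names) can be combined. The following is a sketch under assumptions rather than anyone's posted code: the function name factors_to_dummies is made up, it uses reformulate() in place of paste(), and it treats character columns as categorical too, because recent versions of R no longer convert strings to factors in data.frame() by default.

```r
factors_to_dummies <- function(data) {
  # Flag factor and character columns; model.frame() turns characters
  # into factors when the model matrix is built.
  is_cat <- vapply(data, function(col) is.factor(col) || is.character(col),
                   logical(1))
  if (!any(is_cat)) return(data)

  # Build ~ grp + sex + ... - 1 from the column names (no intercept).
  fo <- reformulate(names(data)[is_cat], intercept = FALSE)
  cbind(data, model.matrix(fo, data = data))
}

d <- data.frame(a = c(1, 4, 3, 4, 5, 6),
                grp = c("A", "B", "B", "C", "C", "C"),
                sex = c("m", "m", "m", "f", "f", "f"))
out <- factors_to_dummies(d)
names(out)
```

As in the output posted above, only the first categorical variable gets a full set of indicator columns; later ones drop their first level (here there is a sexm column but no sexf), which is how model.matrix() avoids redundant columns.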
[R] predict.lda breaks when priors are specified
Dear R help,

What am I doing wrong here? When I don't specify the priors it works just fine, but when I specify the priors it breaks. Does anyone know why and how I can fix it?

    N=2
    ncontrol=ncases=50
    X <- as.matrix(rnorm(N,0,1))
    eta <- -5.3 + X * 1.7
    p <- exp(eta)/(1+exp(eta))
    Y <- rbinom(N,1,p)
    controls <- sample(seq_len(N), ncontrol, prob=!Y)
    cases <- sample(seq_len(N), ncases, prob=Y)
    data <- rbind(
      data.frame(Y = 0, X = cbind(1,X[controls,])),
      data.frame(Y = 1, X = cbind(1,X[cases,])))
    head(data)
      Y X.1        X.2
    1 0   1  0.6965323
    2 0   1 -0.0817520
    3 0   1  2.8673412
    4 0   1 -0.2351386
    5 0   1  0.2653452
    6 0   1 -1.2437612

    m <- lda(Y~X,subset=c(controls,cases),priors=c(.95,.05))
    predict(m)
    Error in model.frame.default(formula = Y ~ X, priors = c(0.95, 0.05), : variable lengths differ (found for '(priors)')
    predict(m,prior=c(.95,0.05))
    Error in model.frame.default(formula = Y ~ X, priors = c(0.95, 0.05), : variable lengths differ (found for '(priors)')

Thanks, Andrew
Re: [R] predict.lda breaks when priors are specified
Never mind. Stupid misplaced 's'.

-Andrew

On Sun, May 16, 2010 at 5:39 PM, Andrew Redd ar...@stat.tamu.edu wrote:
> Dear R help, What am I doing wrong here? when I don't specify the priors it works just fine but when I specify the priors it breaks.
>
>     m <- lda(Y~X,subset=c(controls,cases),priors=c(.95,.05))
>     predict(m)
>     Error in model.frame.default(formula = Y ~ X, priors = c(0.95, 0.05), : variable lengths differ (found for '(priors)')
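The misplaced 's' appears to be the one in priors=: the argument of lda() in MASS is spelled prior. Judging from the error message quoted in this thread, the unrecognised priors= was carried along in the stored call and handed to model.frame() when predict() rebuilt the data. The sketch below shows the intended spelling; it is an illustration only and assumes a two-class subset of the built-in iris data in place of the simulated case/control data.

```r
library(MASS)  # lda() lives in MASS, a recommended package shipped with R

# Two-class subset of iris, standing in for the case/control data.
d <- droplevels(iris[iris$Species != "setosa", ])

# 'prior' (no trailing 's') is the argument that sets the class priors.
fit <- lda(Species ~ Sepal.Length + Sepal.Width, data = d,
           prior = c(0.95, 0.05))
fit$prior

# predict() reuses fit$prior unless its own 'prior' argument is given.
pred <- predict(fit)
table(pred$class)
```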
Re: [R] Vector recycling and zoo
Thanks David,

Your comment made me realise that whereas when x is a data frame, x$a is a numeric vector, when x is of class zoo, x$a is also of class zoo, so the following does what I was expecting:

    x$a/as.numeric(x$a[1])

Sean.

On Sun, May 16, 2010 at 9:25 PM, David Winsemius dwinsem...@comcast.net wrote:
> On May 16, 2010, at 2:00 AM, Sean Carmody wrote:
>> I am a bit confused about the different approaches taken to recycling in plain data frames and zoo objects. When carrying out simple arithmetic, data frames seem to recycle single arguments, zoo objects do not. Here is an example
>>
>>     x <- data.frame(a=1:5*2, b=1:5*3)
>>     x
>>        a  b
>>     1  2  3
>>     2  4  6
>>     3  6  9
>>     4  8 12
>>     5 10 15
>>     x$a/x$a[1]
>>     [1] 1 2 3 4 5
>>     x <- zoo(x)
>>     x$a/x$a[1]
>>     1
>>     1
>>
>> I feel understanding this difference would lead me to a greater understanding of the zoo module!
>
> I think you do have misunderstandings about the zoo package but I do not think it is in the area of vector recycling. Notice the effect of your application of the zoo function to x:
>
>     x$a
>      1  2  3  4  5
>      2  4  6  8 10
>     x$a[1]
>     1
>     2
>
> You have in effect transposed the elements in x and are now getting a two element column vector when requesting x$a[1]. The term vector recycling is applied to situations where short vectors are reused starting with their first elements until the necessary length is achieved. For instance if you request:
>
>     data.frame(x=1:2, y=letters[1:10])
>        x y
>     1  1 a
>     2  2 b
>     3  1 c
>     4  2 d
>     5  1 e
>     6  2 f
>     7  1 g
>     8  2 h
>     9  1 i
>     10 2 j
>
> Or
>
>     plot(1:10, col=c("red","green"))

-- Sean Carmody
Twitter: http://twitter.com/seancarmody
Stable: http://mulestable.net/sean
The Stubborn Mule Blog: http://www.stubbornmule.net
Forum: http://mulestable.net/
Re: [R] Vector recycling and zoo
Normally that would be written like this, using the coredata extraction function, which extracts the data portion of a zoo object:

    x$a / coredata( x$a[1] )

On Sun, May 16, 2010 at 7:32 PM, Sean Carmody seancarm...@gmail.com wrote:
> Thanks David, Your comment made me realise that whereas when x is a data frame, x$a is a numeric vector, when x is of class zoo, x$a is also of class zoo, so the following does what I was expecting:
>
>     x$a/as.numeric(x$a[1])
>
> Sean.
Re: [R] Vector recycling and zoo
Or even:

    with(x, a / coredata(a[1]))

On Sun, May 16, 2010 at 7:48 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote:
> Normally that would be written like this, using the coredata extraction function, which extracts the data portion of a zoo object:
>
>     x$a / coredata( x$a[1] )
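The behaviour discussed in this thread can be reproduced on a small univariate series. The sketch below assumes the zoo package is installed and uses a hand-made series z in place of the column x$a. It relies on the fact that arithmetic between two zoo objects first aligns them on the index values they share, so dividing by z[1] keeps only index 1, while dividing by a plain number recycles it over the whole series.

```r
library(zoo)  # assumed to be installed

z <- zoo(c(2, 4, 6, 8, 10), order.by = 1:5)

# Both operands are zoo objects: they are aligned on their common index
# (only index 1 here), so a single value comes back.
z / z[1]

# Dividing by the underlying number instead: ordinary recycling applies
# and the result keeps the full index.
r <- z / coredata(z[1])
coredata(r)
```

Both as.numeric() and coredata() strip the index in this case; coredata() is the accessor that zoo itself provides for the purpose.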
[R] sapply code
Hi r-users,

I have this code here, but I just wonder how I can use 'sapply' to make it more efficient:

    lamda_cor <- eigen(winter_cor)$values
    lamda_cor
    [1] 1.3459066 1.0368399 0.8958128 0.7214407

    lamda_cxn <- function(dt) {
      n <- length(dt)
      term <- vector(length=n, mode="numeric")
      for (i in 1:n) {
        term[i] <- (dt[i]/n)*log(dt[i]/n)
      }
      #sum(term)
      cxn <- 1 + (1/log(n))*sum(term)
      cxn
    }

    lamda_cxn(lamda_cor)
    [1] 0.01861457

Thank you so much for all the help given.
Re: [R] sapply code
Try this:

    1 + (1 / log(length(lamda_cor))) * sum((l <- lamda_cor / length(lamda_cor)) * log(l))

On Sun, May 16, 2010 at 10:43 PM, Roslina Zakaria zrosl...@yahoo.com wrote:
> Hi r-users, I have this code here, but I just wonder how do I use 'sapply' to make it more efficient
>
>     lamda_cor <- eigen(winter_cor)$values
>     lamda_cor
>     [1] 1.3459066 1.0368399 0.8958128 0.7214407
>
>     lamda_cxn <- function(dt) {
>       n <- length(dt)
>       term <- vector(length=n, mode="numeric")
>       for (i in 1:n) {
>         term[i] <- (dt[i]/n)*log(dt[i]/n)
>       }
>       #sum(term)
>       cxn <- 1 + (1/log(n))*sum(term)
>       cxn
>     }
>
>     lamda_cxn(lamda_cor)
>     [1] 0.01861457
>
> Thank you so much for all helps given.

-- Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
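The one-liner in this thread works because division and log() in R already operate on whole vectors, so no explicit loop is needed. A sketch of the same calculation written as functions follows; the names lamda_cxn_vec and lamda_cxn_sapply are made up here, and the eigenvalues are typed in from the printed output because winter_cor itself is not available.

```r
# Eigenvalues as printed in the original post.
lamda_cor <- c(1.3459066, 1.0368399, 0.8958128, 0.7214407)

# Vectorised version: 1 + sum(p * log(p)) / log(n), with p = dt / n.
lamda_cxn_vec <- function(dt) {
  n <- length(dt)
  p <- dt / n
  1 + sum(p * log(p)) / log(n)
}

# The same thing with sapply(), as asked; it gives the same value but is
# not expected to be faster than the vectorised form.
lamda_cxn_sapply <- function(dt) {
  n <- length(dt)
  1 + sum(sapply(dt, function(v) (v / n) * log(v / n))) / log(n)
}

lamda_cxn_vec(lamda_cor)
lamda_cxn_sapply(lamda_cor)
```

Splitting the expression into n and p also avoids the assignment inside the arithmetic that the one-liner uses, which makes the function easier to read.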
Re: [R] Splines under tension
-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of sam.e
Sent: Sunday, May 16, 2010 10:13 AM
To: r-help@r-project.org
Subject: Re: [R] Splines under tension

> Thank you for the helpful direction to the smoothing splines function; it is exactly what I am trying to do. My data, however, is 3-D, i.e. I have x and y values, which are coordinates for different field sites, and z values, which are really what I am interested in analysing with interpolation.

Look into the 'Tps' (thin plate splines) function in the fields package.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> This has posed a problem with many of the spline functions in R. Even if I input my coordinate data as a matrix as my 'x' value and my site data as my 'y' values I get the following error:
>
>     Error in xy.coords(x, y) : 'x' and 'y' lengths differ
>
> I have made sure that there are the same number of values and that they are all of the same type, i.e. numeric, but with little luck, and I am a bit lost as to what to try next. Does anyone have any suggestions?
>
> Thanks, Sam
>
> -- View this message in context: http://r.789695.n4.nabble.com/Splines-under-tension-tp2173887p2218693.html
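A likely explanation for the error quoted above, offered as an assumption since Sam's actual call is not shown, is that one-dimensional spline functions pass their inputs through xy.coords(), which compares length(x) with length(y). For a matrix, length() counts every cell, so an n-by-2 coordinate matrix has length 2n and can never match n response values. The small made-up example below shows the mismatch.

```r
# Six hypothetical field sites: coordinates in a 6 x 2 matrix, one z each.
xy <- cbind(x = c(1, 2, 3, 1, 2, 3),
            y = c(1, 1, 1, 2, 2, 2))
z <- c(0.5, 0.7, 0.9, 0.6, 0.8, 1.0)

nrow(xy)    # number of sites
length(xy)  # number of cells: rows times columns
length(z)

# xy.coords() sees the two lengths disagree and stops.
msg <- tryCatch(xy.coords(xy, z), error = function(e) conditionMessage(e))
msg
```

Functions designed for surfaces, such as the thin plate spline suggested above, take the coordinates as a matrix with one row per site and compare the number of rows, not the number of cells, with the number of responses.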
Re: [R] Normalizing plot tick values
Hi, I tried using this in my plot and I get an error saying

    'at' and 'labels' lengths differ, 8 != 5

~Rajesh

On Sat, May 15, 2010 at 10:47 PM, Henrique Dallazuanna www...@gmail.com wrote:
> Try this:
>
>     x <- 1:100
>     plot(x, xaxt = 'n')
>     axis(1, axTicks(1), pretty(x) / 10)
>
> On Sat, May 15, 2010 at 2:10 PM, rajesh j akshay.raj...@gmail.com wrote:
>> Hi, I have a plot whose tick values along the axis have a certain range 0 - x. I need to normalize this range without changing my data files. For example, if my plot has tick values at 10, 20, 30, 40, 50, ... I have to make these 2, 4, 6, etc., but without changing the plot data. I am hoping I can add something to the plot command that amounts to dividing the tick values by a quantity. Any help is appreciated.
>>
>> Rajesh
>
> -- Henrique Dallazuanna
> Curitiba-Paraná-Brasil
> 25° 25' 40" S 49° 16' 22" O

-- Rajesh.J
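The length mismatch reported above most likely arises because axTicks(1) and pretty(x) are computed independently and need not return the same number of values. One way around it, sketched here with made-up data and a divisor of 5 (so that 10, 20, 30 are labelled 2, 4, 6), is to derive the labels from the tick positions themselves, which guarantees equal lengths.

```r
pdf(NULL)  # null device so the sketch runs without a display; omit interactively

x <- seq(10, 80, by = 10)
plot(x, x^2, xaxt = "n")   # draw the plot without an x axis

at <- axTicks(1)                     # where R would have put the ticks
axis(1, at = at, labels = at / 5)    # same positions, rescaled labels

dev.off()
```

The data and the plotted positions are unchanged; only the text printed at each tick is divided by the chosen quantity.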