[R] save txt file
Hi, I have two questions.

Question 1: I define two variables, a and b:

a <- rbinom(4, 10, 0.8)
output: [1] 9 7 8 8

b <- rbinom(2, 6, 0.7)
output: [1] 4 5

If I write write.table(a, file = filename, etc.), it saves only the values of variable a. Is there a way to save the values of a and b in a .txt file as consecutive data (I will be using many variables), like this: 9 7 8 8 4 5

Question 2: Is it possible to save the data as rows? (9 7 8 8 4 5)

Thanks, Eiger

-- View this message in context: http://www.nabble.com/save-txt-file-tp25531307p25531307.html Sent from the R help mailing list archive at Nabble.com.
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] any advice on web interfaces to R?
Mitchell Maltenfort mmalten at gmail.com writes:

> I saw http://cran.r-project.org/doc/FAQ/R-FAQ.html#R-Web-Interfaces and I'm still not sure yet which platform (Linux, Windows, etc.) I'll be working on -- and no, it's not under my control to pick. I was wondering if anyone out there had good advice, that would save me time and stomach acid, on how to set up a web browser to send a list of commands to R and put the resulting table or graph in a web page. Thanks!

It isn't cross-platform (no Windows), but RApache is a very good framework for this. Take a look at http://www.jeroenooms.com/stockplot.html to see a fancy use. For an alternative interface, where no web programming is needed, you can look at http://www.math.csi.cuny.edu/gWidgetsWWW/.
[R] Stretch the x-axis for better alignment comparison
I have the following code that aligns the two graphs. The problem is that in the .pdf it gives me, the x-axis (0-100) is broken down into 0-20, 20-40, and so on. I wonder if there is a way for it to display the x-axis (and y-axis) in more detail than that. I'd appreciate your input.

pdf(file="VECTOR ICA ALIGNMENT.pdf", height=5, width=5)
par(oma=c(4,4,4,4), mar=c(2,2,2,2), mgp=c(1.8,0.1.8,0), mfrow=c(1,1))
vector <- read.table(file=paste("a_i_u_100.TXT", sep=""))
plot(test$V2, test$V1, xlim=c(1,100), ylim=c(-1,10), xlab="TRs", ylab="amplitude", col="blue", type="l")
ica <- read.table(file=paste("ica_100.TXT", sep=""))
lines(test$V2, test$V1, col="red")
title("VECTOR ICA ALIGNMENT")
dev.off()
[R] Box plot
Hi, Is there a way I can plot the median as well as the quantiles in the actual boxplot using the boxplot command? Thanks
-- View this message in context: http://www.nabble.com/Box-plot-tp25531261p25531261.html
[R] set choose.files directory?
Hi, I've been trying to set the directory for choose.files() as follows [R 2.9.0 running on XP]:

setwd("C:/Documents and Settings/2/Data")
getwd()
infile2 = choose.files(filters = Filters[c("txt","All"),], caption = "Choose ECD datafile")
#...do a bunch of stuff...

It appears the working directory isn't updated until after choose.files() executes. getwd() returns the correct path, as set by setwd(), but the path does not appear to be used by choose.files() unless I execute choose.files() a second time. Help.

Thanks. Martin
[R] scaled Schoenfeld residuals
Hi, sorry if this has been discussed before, but I'm wondering why the scaled Schoenfeld residuals do not follow the defining formula for obtaining them from the ordinary Schoenfeld residuals, but are instead offset by the estimated parameter values. E.g.

library(survival)
attach(ovarian)
sv <- Surv(futime, fustat)
f1 <- coxph(sv ~ age + ecog.ps)
f1
schres <- resid(f1, type="schoenfeld")
schresc <- resid(f1, type="scaledsch")
n = sum(fustat)
V <- f1$var
schresc1 <- t(n * V %*% t(schres))
#schresc1 is how the scaled Schoenfeld residuals are defined
#in terms of the number of events
#variance of the parameter estimates,
#and ordinary Schoenfeld residuals
#but schresc1 and schresc differ
schresc
schresc1
#schresc is schresc1 offset by the parameter estimates
beta <- as.vector(f1$coef)
nbeta <- outer(rep(1, n), beta)
nbeta
schresc - nbeta
schresc1
#is there a reason for the offset
#or am I missing something?

Thanks, Greg

Greg Dropkin gr...@gn.apc.org
Re: [R] Problem with xtabs(), exclude=NULL, and counting NA's
Webb Sprague webb.sprague at gmail.com writes:

> xtabs(~wkhp, x, exclude=NULL, na.action=na.pass)
> wkhp
>   20   30   40   45   60 <NA>
>    1    1   10    1    3    4

now this doesn't even work

table(wtf, exclude=NULL)
wtf
   [0,10)   [10,20)   [20,30)   [30,40)   [40,50)   [50,60)   [60,70)   [70,80)
     8562     15297      9666      6659      3583       667      1357       238
  [80,90)  [90,100) [100,110) [110,120) [120,130) [130,140) [140,150) [150,160)
61 311571995 7 3 111
[180,190)      <NA>
161 62161

Compare:

xtabs(~ wtf, exclude=NULL, na.action=na.pass)
wtf
   [0,10)   [10,20)   [20,30)   [30,40)   [40,50)   [50,60)   [60,70)   [70,80)
     8562     15297      9666      6659      3583       667      1357       238
  [80,90)  [90,100) [100,110) [110,120) [120,130) [130,140) [140,150) [150,160)
61 311571995 7 3 111
[180,190)
161

version
               _
platform       x86_64-unknown-linux-gnu
arch           x86_64
os             linux-gnu
system         x86_64, linux-gnu
status
major          2
minor          9.2
year           2009
month          08
day            24
svn rev        49384
language       R
version.string R version 2.9.2 (2009-08-24)
[R] Statistical analysis
Hi all, I have got two datasets, one of them is rainfall data and the other one is groundwater level data. I would like to see whether there is a correlation between these two datasets and if there is, to what extent they are correlated. My stats background is limited, therefore any advice on which command I should use in R would be greatly appreciated. Thanks in advance. Chris
-- View this message in context: http://www.nabble.com/Statistical-analysis-tp25531331p25531331.html
Re: [R] How to show number in the %f format?
On Sep 23, 2009, at 6:42 PM, Peng Yu wrote:

> On Wed, Sep 23, 2009 at 5:16 PM, David Winsemius dwinsem...@comcast.net wrote:
>>
>> On Sep 23, 2009, at 5:58 PM, Peng Yu wrote:
>>
>>> Hi, I have the following matrix, which is printed in %e format (in C's way). I am wondering how to make it print in %f format (in C's way)?
>>
>> ??printf   # scroll down to base package listings, the C function
>> ?sprintf   # the S/R function
>
> I tried the following command. The column names are missing and the command is a little complicated. Is there any better solution?
>
> t(apply(significant_analysis_results[,7:8], 1, function(x){sprintf("%.7f", x)}))

Why not apply to the column index? ... rather than to the row and then transposing.

>           [,1]      [,2]
> Nab2      0.0000019 0.0000000
> Rasal1    0.0000248 0.0000105
> Ccndbp1   0.0000001 0.0002269
> Svep1     0.0000000 0.0000000
> Ppara     0.0008219 0.0000000
> Pros1     0.0000009 0.0000000
> Papss2    0.0000000 0.0000002
> Hdac9     0.0000000 0.0000000
> Adcyap1r1 0.0000000 0.0000000
> Robo1     0.0000000 0.0000000
> Sema3a    0.0000000 0.0000000
> Rab9b     0.0000110 0.0000011
> Tgfb3     0.0000000 0.0000000
> Slc9a9    0.0074608 0.0000000
> Creb5     0.0000003 0.0000000
> Ccnd1     0.0007869 0.0000001
> Pafah1b3  0.0000000 0.0000068
> Tiam2     0.0000000 0.0000000
> Etv5      0.0000000 0.0000000
> Hcrtr2    0.0000000 0.0000166
>
> Regards, Peng
>
>>> significant_analysis_results[,7:8]
>>>           pval(ki-wt)     pval(ko-wt)
>>> Nab2      1.913348979e-06 2.731944670e-09
>>> Rasal1    2.482254110e-05 1.054711084e-05
>>> Ccndbp1   6.307674516e-08 2.268947934e-04
>>> Svep1     0.000000000e+00 1.564526286e-12
>>> Ppara     8.218961690e-04 2.802202914e-13
>>> Pros1     8.787052919e-07 0.000000000e+00
>>> Papss2    0.000000000e+00 2.190819073e-07
>>> Hdac9     0.000000000e+00 8.881784197e-16
>>> Adcyap1r1 2.085731587e-11 1.998401444e-15
>>> Robo1     0.000000000e+00 0.000000000e+00
>>> Sema3a    4.903322193e-11 0.000000000e+00
>>> Rab9b     1.099629676e-05 1.116694168e-06
>>> Tgfb3     0.000000000e+00 0.000000000e+00
>>> Slc9a9    7.460784795e-03 1.552167950e-09
>>> Creb5     2.959174867e-07 8.973577437e-11
>>> Ccnd1     7.868573521e-04 1.460805570e-07
>>> Pafah1b3  1.576464070e-08 6.757446065e-06
>>> Tiam2     0.000000000e+00 0.000000000e+00
>>> Etv5      2.279731959e-12 0.000000000e+00
>>> Hcrtr2    1.258646520e-10 1.661509722e-05
>>>
>>> str(significant_analysis_results[,7:8])
>>>  num [1:20, 1:2] 1.91e-06 2.48e-05 6.31e-08 0.00 8.22e-04 ...
>>>  - attr(*, "dimnames")=List of 2
>>>   ..$ : chr [1:20] "Nab2" "Rasal1" "Ccndbp1" "Svep1" ...
>>>   ..$ : chr [1:2] "pval(ki-wt)" "pval(ko-wt)"

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
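The suggestion in this thread (format the values with sprintf() without losing the row and column names) can be sketched as follows. This is only an illustration under stated assumptions: the small matrix `pvals` is a hypothetical stand-in built from the first two rows of the posted data, and assigning into `fixed[]` is one common way to keep the dimnames, not necessarily what either poster had in mind.

```r
# Hypothetical stand-in for significant_analysis_results[, 7:8] (first two rows only)
pvals <- matrix(
  c(1.913348979e-06, 2.482254110e-05,
    2.731944670e-09, 1.054711084e-05),
  nrow = 2,
  dimnames = list(c("Nab2", "Rasal1"), c("pval(ki-wt)", "pval(ko-wt)"))
)

# sprintf() is vectorised but drops dim and dimnames; assigning into fixed[]
# keeps the shape and names of the original matrix (it becomes a character matrix).
fixed <- pvals
fixed[] <- sprintf("%.7f", pvals)
print(fixed, quote = FALSE)

# formatC() keeps the attributes of its input, so it is a one-call alternative.
alt <- formatC(pvals, format = "f", digits = 7)
print(alt, quote = FALSE)
```

Either way no apply() or t() is needed, because the formatting is done on the whole matrix at once.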
[R] Odp: Box plot
Hi

r-help-boun...@r-project.org wrote on 23.09.2009 21:17:10:

> Hi, Is there a way I can plot the median as well as the quantiles in the actual boxplot using the boxplot command?

AFAIK boxplot produces a box that marks the upper and lower quartiles and the median, so you will need to be more precise about what you want. But if you modify the standard boxplot, you can easily deceive your audience if they expect a standard boxplot.

Regards
Petr

> Thanks
> -- View this message in context: http://www.nabble.com/Box-plot-tp25531261p25531261.html
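To make Petr's point concrete, here is a small sketch of what boxplot() already computes and how extra quantiles could be marked on top of it. The data, the 5% and 95% probabilities and the temporary PDF file are all assumptions chosen for illustration; as noted above, anything added to a standard box plot should be labelled so readers are not misled.

```r
set.seed(42)
x <- rnorm(101)                      # made-up data

# boxplot(plot = FALSE) returns the statistics without drawing:
# lower whisker, lower hinge, median, upper hinge, upper whisker
bp <- boxplot(x, plot = FALSE)
five <- bp$stats[, 1]

# Additional quantiles that a standard box plot does not show (assumed choice)
extra <- quantile(x, probs = c(0.05, 0.95))

out_pdf <- tempfile(fileext = ".pdf")
pdf(out_pdf)
boxplot(x, main = "Box plot with 5% and 95% quantiles marked")
points(rep(1, 2), extra, pch = 4, col = "red")   # mark the extra quantiles
text(1.3, five[3], labels = sprintf("median = %.2f", five[3]))
invisible(dev.off())
```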
Re: [R] Problem with xtabs(), exclude=NULL, and counting NA's
On Sep 23, 2009, at 5:11 PM, ws wrote:

> Webb Sprague webb.sprague at gmail.com writes:
>
>> xtabs(~wkhp, x, exclude=NULL, na.action=na.pass)
>> wkhp
>>   20   30   40   45   60 <NA>
>>    1    1   10    1    3    4
>
> now this doesn't even work

Try:

wtf <- factor(x, levels=c(levels(wtf), NA), exclude=NULL)
xtabs(~ wtf, exclude=NULL, na.action=na.pass)

-- David

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
Re: [R] Problem with xtabs(), exclude=NULL, and counting NA's
On Sep 24, 2009, at 3:06 AM, David Winsemius wrote:

> On Sep 23, 2009, at 5:11 PM, ws wrote:
>
>> now this doesn't even work
>
> Try:
>
> wtf <- factor(x, levels=c(levels(wtf), NA), exclude=NULL)
> xtabs(~ wtf, exclude=NULL, na.action=na.pass)

Note: that will mean that sum(is.na(wtf)) will return 0 and is.na(wtf) <- will no longer have any valid targets.

-- David

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
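The workaround discussed in this thread (turn NA into an explicit factor level so that xtabs() counts it) can be tried on a toy vector. The data below are invented, and addNA() is used here as a base R shortcut for the factor(..., exclude = NULL) call suggested above; treat it as a sketch rather than the exact code from the thread.

```r
# Invented data with two missing values
wkhp <- c(20, 30, 40, 40, 45, 60, NA, NA)
f <- factor(wkhp)

# With the default na.action, rows with NA are dropped before tabulation
tab_plain <- xtabs(~ f)

# addNA() makes NA a regular level, so it survives the model frame step
f_na <- addNA(f)
tab_na <- xtabs(~ f_na)

tab_plain
tab_na

# As noted in the reply, is.na() no longer sees the missing values
sum(is.na(f_na))
```

The side effect mentioned in the note is visible in the last line: once NA is a level, the factor has no missing codes left.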
Re: [R] Stretch the x-axis for better alignment comparison
On Wed, Sep 23, 2009 at 11:25:23AM -0700, Maggie wrote:

> I have the following code that aligns the two graphs. Problem is that in .pdf it gives me it x-axis (0-100) is broken down into 0-20, 20-40..and so on. I wonder if there is for it to display the x-axis (and y-axis) in more detail than that.

Without the necessary data I cannot directly reproduce your example, but have a look at this for a start:

plot(0:10)
axis(1, seq(0,10,0.2), labels=F)

You may also want to use xaxt='n' in the plot command and then use axis() to build the axis the way you want it. If reading out data from the graph is a concern, you may also want to look at the grid() command.

cu
Philipp

--
Dr. Philipp Pagel
Lehrstuhl für Genomorientierte Bioinformatik
Technische Universität München
Wissenschaftszentrum Weihenstephan
85350 Freising, Germany
http://webclu.bio.wzw.tum.de/~pagel/
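Putting the pieces of this advice together (suppress the default axis with xaxt, then draw labelled and unlabelled ticks with axis(), optionally add grid()), a self-contained sketch might look like the following. The sine curve, the tick spacing and the temporary PDF file are assumptions made only so that the example runs without the poster's data files.

```r
# Made-up series standing in for the data read from the .TXT files
x <- 1:100
y <- sin(x / 10)

out_pdf <- tempfile(fileext = ".pdf")
pdf(file = out_pdf, height = 5, width = 5)

# xaxt = "n" tells plot() not to draw the x-axis
plot(x, y, type = "l", col = "blue", xlim = c(0, 100),
     xaxt = "n", xlab = "TRs", ylab = "amplitude")

# Labelled major ticks every 10 units
major <- seq(0, 100, by = 10)
axis(1, at = major, labels = major)

# Shorter, unlabelled minor ticks every 2 units
axis(1, at = seq(0, 100, by = 2), labels = FALSE, tcl = -0.2)

# Light reference lines make it easier to read values off the graph
grid()
invisible(dev.off())
```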
Re: [R] save txt file
Any R manual helps here, and there are many easy ways to do it.

?cbind
?matrix
?data.frame

If you need it in rows, matrix transposition helps, or add the byrow argument when using the matrix function (or by.row; I don't remember off the top of my head). ?dim could also do the job if you make one vector out of a and b.

Best,
Daniel

- cuncta stricte discussurus -

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Eiger
Sent: Wednesday, September 23, 2009 5:37 PM
To: r-help@r-project.org
Subject: [R] save txt file

> Hi, I have 2 questions:
>
> Question 1: I define 2 variables: a, b:
> a <- rbinom(4,10,0.8)   output: [1] 9 7 8 8
> b <- rbinom(2,6,0.7)   output: [1] 4 5
> if I write: write.table(a, file = filename, etc. etc. ) it save only the values of variable a. There is a way to save in a .txt file the values a and b as consecutive data? (but I would use many variables..) ..like this: 9 7 8 8 4 5
>
> Question 2: is possible save data as rows? (9 7 8 8 4 5)
>
> thank's Eiger
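As a concrete illustration of the hints above, one possible approach is to combine the vectors with c() and let write() decide how many values go on each line. This is only a sketch: the temporary file names are placeholders, the seed is arbitrary, and write() is just one of several base R functions that would do the job. (For the matrix route mentioned above, the argument is spelled byrow.)

```r
set.seed(1)
a <- rbinom(4, 10, 0.8)
b <- rbinom(2, 6, 0.7)

# One vector holding the values of a followed by the values of b;
# with many variables, c(a, b, ...) or unlist(list(a, b, ...)) works the same way
ab <- c(a, b)

# Question 1: consecutive values, one per line
out_col <- tempfile(fileext = ".txt")
write(ab, file = out_col, ncolumns = 1)

# Question 2: all values in a single row
out_row <- tempfile(fileext = ".txt")
write(ab, file = out_row, ncolumns = length(ab))

readLines(out_col)
readLines(out_row)
```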
Re: [R] set choose.files directory?
See http://finzi.psych.upenn.edu/Rhelp08/2009-March/192978.html

You can do something like:

default.search = paste(getwd(), "/*.txt", sep="")
infile2 = choose.files(default.search, filters = Filters[c("txt","All"),], caption = "Choose ECD datafile")

Schalk Heunis

On Wed, Sep 23, 2009 at 8:57 PM, mdusa...@umn.edu wrote:

> Hi, I've been trying to set the directory for choose.files as follows: [R2.9.0 running on XP]
>
> setwd("C:/Documents and Settings/2/Data")
> getwd()
> infile2 = choose.files(filters = Filters[c("txt","All"),], caption = "Choose ECD datafile")
> #...do a bunch of stuff...
>
> It appears the working directory isn't updated until after choose.files() executes. getwd() returns the correct path, as set by setwd(), but the path does not appear to be used by choose.files() unless I execute choose.files() a second time. Help. Thanks. Martin
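The same idea can be written with file.path(), with a guard so the script does not fail where choose.files() is unavailable. This is a sketch under assumptions: choose.files() and Filters exist only in R for Windows, the dialog only makes sense in an interactive session, and the fallback branch is a hypothetical choice for everything else.

```r
# Build the default search pattern from the current working directory
default_search <- file.path(getwd(), "*.txt")

if (.Platform$OS.type == "windows" && interactive()) {
  # Windows only: open the dialog in the working directory, filtered to .txt files
  infile2 <- choose.files(default = default_search,
                          filters = Filters[c("txt", "All"), ],
                          caption = "Choose ECD datafile")
} else {
  # Assumed fallback for other platforms or non-interactive runs
  infile2 <- list.files(getwd(), pattern = "\\.txt$", full.names = TRUE)
}
```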
Re: [R] stripchart with pch %in% 21:25 with bg
> stripchart.formula() works for me with your modification to stripchart.default().

Great!

> But you don't need the 'bordered' pch for that.

Indeed, but this may improve legibility:

#
n <- 500
x <- rnorm(n)
y <- rnorm(n)
fac1 <- rep(c("male", "female"), n)
fac2 <- rep(c("blue", "red"), each = n/2)
par(mfrow = c(1,2), lend = "butt", mar = c(5,4,1,1), oma = c(0,0,4,0))
plot(x, y, pch = ifelse(fac1 == "male", 15, 19), col = fac2, asp = 1)
mbg <- rgb(0.9, 0.9, 0.9, 0.9)
legend("topleft", inset = 0.02, c("male","female"), pch = c(15,19), bg = mbg)
legend("bottomright", inset = 0.02, c("blue","red"), col = c("blue","red"), bg = mbg, lty = 1, lwd = 5)
plot(x, y, pch = ifelse(fac1 == "male", 22, 21), bg = fac2, asp = 1)
legend("topleft", inset = 0.02, c("male","female"), pch = c(22, 21), bg = mbg)
legend("bottomright", inset = 0.02, c("blue","red"), col = c("blue","red"), bg = mbg, lty = 1, lwd = 5)
mtext("Comparison of pch c(15, 19) versus c(22, 21)", outer = TRUE, cex = 1.5, line = 1)
##

Best,
Jean

--
Jean R. Lobry (lo...@biomserv.univ-lyon1.fr)
Laboratoire BBE-CNRS-UMR-5558, Univ. C. Bernard - LYON I,
43 Bd 11/11/1918, F-69622 VILLEURBANNE CEDEX, FRANCE
phone: +33 472 43 27 56 fax: +33 472 43 13 88
http://pbil.univ-lyon1.fr/members/lobry/
Re: [R] Reading data
On 09/23/2009 10:42 PM, Ashta wrote:

> Dear R-users, I am a new user of R. I am eager to learn about it. I wanted to read and summarize a simple data file. I used the following:
>
> rel <- read.table("C:/Documents and Settings/ashta/My Documents/R_data/rel.dat", quote="", header=FALSE, sep="", col.names=c("id","orel","nrel"))
> summary(rel)
>
> Below is the error message:
>
> rel <- read.table("C:/Documents and Settings/ashta/My Documents/R_data/rel.dat", quote="", header=FALSE, sep="", col.names=
> + c("id","orel","nrel"))
> Error in file(file, "r") : cannot open the connection
> In addition: Warning message:
> In file(file, "r") : cannot open file 'file=C:/Documents and Settings/sewalem/My Documents/R_data/rel.dat': Invalid argument
> summary(rel)
> Error in summary(rel) : object 'rel' not found
>
> Does it need a library? Where can I get the library?

Hi Ashta,
If you have checked that the file rel.dat is really there where you think it is, there is a nasty trick that Windows plays with many files. For example, if you have created this file in Notepad and saved it, you may find that .txt has been added to the filename. So the real filename is rel.dat.txt. Of course, Windows won't show you that unless you go into Folder Options in Windows Explorer and turn off the "Hide known extensions" option. This is a wild guess, but it has happened to me so often that I am wary of it.

Jim
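Jim's guess can be checked from inside R rather than through Windows Explorer, since file.exists() and list.files() report the real file names including any hidden extension. The sketch below fakes the situation in a temporary directory; the directory, the file contents and the variable names are all invented for the illustration.

```r
# Simulate a folder where Notepad silently saved "rel.dat" as "rel.dat.txt"
data_dir <- tempfile("R_data")
dir.create(data_dir)
writeLines(c("1 0.5 0.6", "2 0.7 0.8"), file.path(data_dir, "rel.dat.txt"))

# The name we believe the file has
wanted <- file.path(data_dir, "rel.dat")
file.exists(wanted)

# What is really in the folder: full names, extensions included
candidates <- list.files(data_dir, pattern = "^rel\\.dat", full.names = TRUE)
basename(candidates)

# Read the file under its real name
rel <- read.table(candidates[1], header = FALSE,
                  col.names = c("id", "orel", "nrel"))
summary(rel)
```

No extra library is needed for any of this; read.table() and the file helpers are part of base R.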
Re: [R] repeated measures
Check for missing values.

Tal

On Wed, Sep 23, 2009 at 3:27 PM, pompon julien.pom...@agr.gc.ca wrote:

> Hi, I am performing a repeated measures 2-way ANOVA to assess the influence of plant and leaf on aphid fecundity. Fecundity is measured for each aphid on a single leaf. Here is what I typed.
>
> wingless <- reshape(Wingless, varying = list(c("d0","d1","d2","d3","d4","d5","d6","d7","d8","d9","d10","d11","d12","d13","d14","d15","d16")), v.names = c("fecundity"), timevar = "time", direction = "long")
> wingless.aov <- aov(fecundity ~ factor(time) * clip.cage * plant + Error(factor(id)), data = wingless)
> summary(wingless.aov)
>
> and I obtained
>
> Error: factor(id)
>                        Df Sum Sq Mean Sq F value  Pr(>F)
> factor(time)            4 56.789  14.197  3.0613 0.05925 .
> clip.cage               1 14.149  14.149  3.0509 0.10621
> plant                   1  3.251   3.251  0.7010 0.41880
> factor(time):clip.cage  1  0.304   0.304  0.0655 0.80240
> clip.cage:plant         1 17.114  17.114  3.6903 0.07880 .
> Residuals              12 55.652   4.638
> ---
> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> Error: Within
>                               Df Sum Sq Mean Sq F value  Pr(>F)
> factor(time)                  16 340.83   21.30 11.5222 < 2e-16 ***
> factor(time):clip.cage        16  27.34    1.71  0.9242 0.54195
> factor(time):plant            16  46.36    2.90  1.5673 0.07783 .
> factor(time):clip.cage:plant  16  24.50    1.53  0.8281 0.65304
> Residuals                    255 471.44    1.85
> ---
> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> I don't understand why I have the factor(time) in my between-subject results, whereas with a similar set of data I don't.
>
> Thank you very much, Julien Pompon.
> -- View this message in context: http://www.nabble.com/repeated-measures-tp25531110p25531110.html

--
My contact information:
Tal Galili
Phone number: 972-50-3373767
FaceBook: Tal Galili
My Blogs: http://www.r-statistics.com/ http://www.talgalili.com http://www.biostatistics.co.il
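A quick way to act on the advice to check for missing values is to count NAs per time column and list the subjects with incomplete records, before reshaping. The toy data frame below is invented (only the id, d0, d1 naming follows the post). The reasoning offered here is a hedged guess: with missing cells the design is unbalanced, and in that case aov() with an Error() term may no longer keep within-subject terms out of the between-subject stratum.

```r
# Invented wide-format data: one row per aphid, one column per day
wide <- data.frame(
  id = 1:4,
  d0 = c(1, 2, NA, 4),
  d1 = c(2, NA, 3, 5)
)

# Number of missing values in each column
na_per_column <- colSums(is.na(wide))
na_per_column

# Subjects with at least one missing measurement
incomplete_ids <- wide$id[!complete.cases(wide)]
incomplete_ids
```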
Re: [R] Stretch the x-axis for better alignment comparison
On 09/24/2009 04:25 AM, Maggie wrote:

> I have the following code that aligns the two graphs. Problem is that in .pdf it gives me it x-axis (0-100) is broken down into 0-20, 20-40..and so on. I wonder if there is for it to display the x-axis (and y-axis) in more detail than that. I'd appreciate your input --
>
> pdf(file="VECTOR ICA ALIGNMENT.pdf", height=5, width=5)
> par(oma=c(4,4,4,4), mar=c(2,2,2,2), mgp=c(1.8,0.1.8,0), mfrow=c(1,1))
> vector <- read.table(file=paste("a_i_u_100.TXT", sep=""))
> plot(test$V2, test$V1, xlim=c(1,100), ylim=c(-1,10), xlab="TRs", ylab="amplitude", col="blue", type="l")
> ica <- read.table(file=paste("ica_100.TXT", sep=""))
> lines(test$V2, test$V1, col="red")
> title("VECTOR ICA ALIGNMENT")
> dev.off()

Hi Maggie,
The axis function defaults to fairly spaced-out labels, and will omit labels if they are too crowded. You can get around that by specifying that the x-axis (xaxt="n") or y-axis (yaxt="n") is not drawn by plot and then adding one or both later. If you have the crowded axis problem, have a look at the staxlab function in the plotrix package.

Jim
Re: [R] Statistical analysis
Chris Li wrote:

> Hi all, I have got two datasets, one of them is rainfall data and the other one is groundwater level data. I would like to see whether there is a correlation between these two datasets and if there is, to what extent they are correlated. My stats background is limited, therefore any advice on which command I should use in R would be greatly appreciated. Thanks in advance. Chris

Hi,
My advice would be to get an introductory statistics book and start with that. There is an introductory stats book by Dalgaard that uses R. Strikes two birds with one blow.

http://www.amazon.com/Introductory-Statistics-R-Peter-Dalgaard/dp/0387954759

cheers,
Paul

--
Drs. Paul Hiemstra
Department of Physical Geography
Faculty of Geosciences
University of Utrecht
Heidelberglaan 2 P.O. Box 80.115 3508 TC Utrecht
Phone: +3130 274 3113 Mon-Tue
Phone: +3130 253 5773 Wed-Fri
http://intamap.geo.uu.nl/~paul
Re: [R] Statistical analysis
Hi Chris,

If I understand your question correctly, what you want is both easy and hard.

Easy:

# making a reproducible example, as asked in the posting guide
# two vectors
water <- rnorm(1000)
rain <- rgamma(1000, .5)
# the following does everything you mention and more
summary(lm(water~rain))
cor(water, rain)

Hard: lm() and cor() assume independence of observations, linearity of the relation, normality of the residuals. Are these assumptions valid for your problem? Are your datasets time series? There will be ??autocorrelation in both datasets. There may be a ?lag. Decide whether to estimate and correct for those. Are there multiple sample locations? There may be dependence. Would you rather assume rain and change in groundwater level are related? Etc.

Cheers,
Arien

Chris Li wrote:

> Hi all, I have got two datasets, one of them is rainfall data and the other one is groundwater level data. I would like to see whether there is a correlation between these two datasets and if there is, to what extent they are correlated. My stats background is limited, therefore any advice on which command I should use in R would be greatly appreciated. Thanks in advance. Chris

--
drs. H.A. (Arien) Lam (Ph.D. student)
Department of Physical Geography
Faculty of Geosciences
Utrecht University, The Netherlands
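For the "hard" part, the lag question in particular can be explored with the cross-correlation function ccf() from the stats package. The series below are simulated under an assumed three-step delay between rain and groundwater, purely to show how a lag shows up; real data would also need the autocorrelation issues mentioned above to be dealt with.

```r
set.seed(1)
n <- 200
rain <- rgamma(n, shape = 0.5)

# Simulated groundwater response: follows rain with an assumed 3-step delay plus noise
water <- c(rep(0, 3), rain[1:(n - 3)]) + rnorm(n, sd = 0.1)

# Cross-correlation at lags -10..10, without drawing the plot
cc <- ccf(rain, water, lag.max = 10, plot = FALSE)

# Lag with the strongest correlation
best_lag <- cc$lag[which.max(cc$acf)]
best_lag
```

According to ?ccf, the lag k value of ccf(x, y) estimates the correlation between x[t+k] and y[t], so for ccf(rain, water) a peak at a negative lag points to rain leading the groundwater level.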
[R] Downloading currency data from from Yahoo
Hi, I wanted to download some currency data using the quantmod package, however I got the following error:

getSymbols('USD/GBP', src='yahoo')
Error in download.file(paste(yahoo.URL, "s=", Symbols.name, "&a=", from.m, :
  cannot open URL 'http://chart.yahoo.com/table.csv?s=USD/GBP&a=0&b=01&c=2007&d=8&e=24&f=2009&g=d&q=q&y=0&z=USD/GBP&x=.csv'
In addition: Warning message:
In download.file(paste(yahoo.URL, "s=", Symbols.name, "&a=", from.m, :
  cannot open: HTTP status was '404 Not Found'

Can anyone please tell me how to get rid of that? Your help will be highly appreciated. Thanks
-- View this message in context: http://www.nabble.com/Downloading-currency-data-from-from-Yahoo-tp25553792p25553792.html
Re: [R] save txt file
cls59 wrote:

> Hope this helps! -Charlie

Thanks!!! :)
-- View this message in context: http://www.nabble.com/save-data-in-a-txt-file-tp25531307p25554140.html
Re: [R] Multiply Normal Curves
R-helpers, I have been trying to do this problem without much success. I managed to do a graph for x, but it is not what I want to define (I want to specify the number of observations as well). I have also been able to do a simple random sample.

parms <- data.frame(ID=c(1,2,3), mu=c(10000,34000,50000), sigma=c(2000,3000,5000))
curve(dnorm(x, mean=parms$mu[1], sd=parms$sigma[1]), from=2000, to=80000, ylab="density", col="red")
curve(dnorm(x, mean=parms$mu[2], sd=parms$sigma[2]), from=1000, to=80000, ylab="density", col="blue", add=TRUE)
curve(dnorm(x, mean=parms$mu[3], sd=parms$sigma[3]), from=1000, to=80000, ylab="density", col="forestgreen", add=TRUE)

###

R-helpers, I have been learning a little bit of R. I am simulating, and I want to draw a normal curve for all my variables so that I can see the overlaps and reduce them. After that I want to draw a graph of all the values that are in the data frame to see whether they also follow a normal distribution. Lastly I will try to sample from this data. Please help and make suggestions.

#My original code
#code for dataframe
Hypermarket <- matrix(rnorm(100, mean=50000, sd=5000))
Supermarket <- matrix(rnorm(400, mean=34000, sd=3000))
Minimarket <- matrix(rnorm(1000, mean=10000, sd=2000))
Cornershop <- matrix(rnorm(1500, mean=2500, sd=500))
Spazashop <- matrix(rnorm(2000, mean=1000, sd=250))
dat = data.frame(type=c(rep("Hypermarket",100), rep("Supermarket",400), rep("Minimarket",1000), rep("Cornershop",1500), rep("Spazashop",2000)), value=c(Hypermarket, Supermarket, Minimarket, Cornershop, Spazashop))
dat

#code for histogram of Hypermarket (please suggest something simple)
hist(Hypermarket, breaks=seq(30000, 65000, 1000), freq=F)
x <- seq(30000, 65000, 500)
lines(x, dnorm(x, 50000, 5000))
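One way to keep the number of observations, the means and the standard deviations together is a small parameter table that drives the simulation, the overlaid density curves and the final sample. Everything in this sketch is an assumption for illustration: only three of the shop types are used, their parameters are read off the post as best the archived text allows, the sample size of 50 is arbitrary, and matplot() is swapped in for the repeated curve() calls.

```r
# Assumed parameters per shop type: number of shops, mean and sd of the value
parms <- data.frame(
  type  = c("Minimarket", "Supermarket", "Hypermarket"),
  n     = c(1000, 400, 100),
  mu    = c(10000, 34000, 50000),
  sigma = c(2000, 3000, 5000)
)

set.seed(123)
# Simulate n values for each type and stack them in one data frame
values <- unlist(lapply(seq_len(nrow(parms)), function(i) {
  rnorm(parms$n[i], mean = parms$mu[i], sd = parms$sigma[i])
}))
dat <- data.frame(type = rep(parms$type, times = parms$n), value = values)

# Theoretical densities on a common grid, one column per type
x <- seq(0, 80000, length.out = 500)
dens <- sapply(seq_len(nrow(parms)), function(i) {
  dnorm(x, mean = parms$mu[i], sd = parms$sigma[i])
})

cols <- c("red", "blue", "forestgreen")
out_pdf <- tempfile(fileext = ".pdf")
pdf(out_pdf)
matplot(x, dens, type = "l", lty = 1, col = cols,
        xlab = "value", ylab = "density")
legend("topright", legend = parms$type, col = cols, lty = 1)
invisible(dev.off())

# Simple random sample of 50 rows from the simulated data
smp <- dat[sample(nrow(dat), 50), ]
```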
[R] aggregate() - error message
Dear list, would anybody be able to tell me why the statement

Tripstatistics = aggregate(TripsData[2:3], by=list(Trip=Tripmatch), FUN=mean)

seems to work well with TripsData 1 but not with TripsData 2? With TripsData 2 it yields

Error in FUN(X[[1L]], ...) : arguments must have same length

I can't see a difference in the two data sets. Could someone shed light on the error message? Maybe that would help me figure out where the problem is. Thanks very much for any help.

Juliane

PS: The examples below contain the first 5 rows of my two datafiles.

TripsData 1
      Trip Distance TimeDiff Tripmatch
7329 35,36 172.7453   0.0041     35,36
7371 36,35 172.7453   0.0004     35,36
7372 35,36 172.7453   0.0000     35,36
7373 36,35 172.7453   0.0004     35,36
7374 35,36 172.7453   0.0001     35,36

TripsData 2
      Trip Distance TimeDiff Tripmatch
1617 19,20 1087.365   0.0441     19,20
1899 20,19 1087.365   0.0207     19,20
1915 19,20 1087.365   0.0361     19,20
3285 20,19 1087.365   0.0356     19,20
3826 19,20 1087.365   0.0697     19,20
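One possible cause of this error, offered only as a guess since the full data are not shown, is that by=list(Trip=Tripmatch) refers to a free-standing object called Tripmatch (for example one left over from the first data set or from attach()) rather than to the Tripmatch column of the data frame being aggregated. If its length differs from the number of rows, aggregate() cannot line the groups up. The sketch below reproduces both situations with the five posted rows; the object names `trips` and `stale` are invented.

```r
# The five posted rows of "TripsData 2"
trips <- data.frame(
  Trip      = c("19,20", "20,19", "19,20", "20,19", "19,20"),
  Distance  = rep(1087.365, 5),
  TimeDiff  = c(0.0441, 0.0207, 0.0361, 0.0356, 0.0697),
  Tripmatch = rep("19,20", 5)
)

# Grouping by the column of the same data frame works
stats <- aggregate(trips[2:3], by = list(Trip = trips$Tripmatch), FUN = mean)
stats

# Grouping by a vector of a different length fails with a length error
stale <- c("35,36", "35,36")
err <- tryCatch(
  aggregate(trips[2:3], by = list(Trip = stale), FUN = mean),
  error = function(e) conditionMessage(e)
)
err
```

Comparing length(Tripmatch) with nrow(TripsData) for the failing data set would confirm or rule out this explanation.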
[R] Maximum likelihood estimation of parameters make no biological sense
R-help,

I'm trying to estimate some parameters using the maximum likelihood method. The model describes fish growth using a sigmoidal type of curve:

fn_w <- function(params) {
  Winf  <- params[1]
  k     <- params[2]
  t0    <- params[3]
  b     <- params[4]
  sigma <- params[5]
  what  <- Winf * (1-exp(- k *(tt - t0)))^b
  logL  <- -sum(dnorm(log(wobs),log(what),sqrt(sigma),TRUE))
  return(logL)
}

tt <- 4:14
wobs <- c(1.545, 1.920, 2.321, 2.591, 3.676, 4.425, 5.028, 5.877, 6.990, 6.800, 6.900)

And then the optimization method:

OPT <- optim(c(8, .1, 0, 3, 1), fn_w, method="L-BFGS-B",
             lower=c(0.0, 0.001, 0.001, 0.001, 0.01), upper = rep(Inf, 5),
             hessian=TRUE, control=list(trace=1))

which gives:

$par
    Winf         k           t0      b           sigma
[1] 24.27206813  0.04679844  0.0010  1.61760492  0.0100

$value
[1] -11.69524

$counts
function gradient
     143      143

$convergence
[1] 0

$message
[1] CONVERGENCE: REL_REDUCTION_OF_F = FACTR*EPSMCH

$hessian
              [,1]          [,2]         [,3]          [,4]          [,5]
[1,]  1.867150e+00  1.262763e+03    -7.857719 -5.153276e+01 -1.492850e-05
[2,]  1.262763e+03  8.608461e+05 -5512.469266 -3.562137e+04  9.693180e-05
[3,] -7.857719e+00 -5.512469e+03    41.670222  2.473167e+02 -5.356813e+01
[4,] -5.153276e+01 -3.562137e+04   247.316675  1.535086e+03 -1.464370e-03
[5,] -1.492850e-05  9.693180e-05   -53.568127 -1.464370e-03  1.730462e+04

after iteration number 80.

From the biological point of view, Winf = 24 (the hypothesized asymptotic maximum weight) makes no sense, while the b parameter is nowhere near b = 3, leading to a non-sigmoidal curve. However, the least-squares method provides more sensible parameter estimates:

$par
    Winf        k          t0         b
[1] 10.3827256  0.0344187  3.1751376  2.9657368

$value
[1] 2.164403

$counts
function gradient
      48       48

$convergence
[1] 0

$message
[1] CONVERGENCE: REL_REDUCTION_OF_F = FACTR*EPSMCH

Is there anything wrong with my MLE function and parameters? I have also tried different initial parameters. Can anyone give me a clue on this?

Thanks in advance.
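A note on the fit reported above: at the returned optimum both t0 (0.001) and sigma (0.01) sit exactly on their lower bounds, so the bounds may be shaping the other estimates. One way to take sigma out of the search is to profile it: with normal errors on the log scale, the variance that maximizes the likelihood for given growth parameters is RSS/n. The following is only a sketch under that assumption. The function name `nll_profile`, the starting values and the bounds are illustrative choices, not the poster's code, and it is not claimed to reproduce the least-squares fit quoted in the post.

```r
tt   <- 4:14
wobs <- c(1.545, 1.920, 2.321, 2.591, 3.676, 4.425,
          5.028, 5.877, 6.990, 6.800, 6.900)

# Negative log-likelihood with the log-scale variance profiled out.
# p = c(Winf, k, t0, b); the variance is replaced by its MLE, RSS / n.
nll_profile <- function(p) {
  what <- p[1] * (1 - exp(-p[2] * (tt - p[3])))^p[4]
  # Penalise parameter values where the curve is undefined or non-positive.
  if (any(!is.finite(what)) || any(what <= 0)) return(1e10)
  n   <- length(wobs)
  rss <- sum((log(wobs) - log(what))^2)
  0.5 * n * (log(2 * pi * rss / n) + 1)
}

# Illustrative starting values and bounds (assumptions, not from the post).
start <- c(8, 0.1, 0.1, 3)
fit <- optim(start, nll_profile, method = "L-BFGS-B",
             lower = c(0.001, 0.001, 0.001, 0.001), upper = rep(Inf, 4))

# Variance estimate implied by the fitted growth parameters.
what_hat   <- fit$par[1] * (1 - exp(-fit$par[2] * (tt - fit$par[3])))^fit$par[4]
sigma2_hat <- sum((log(wobs) - log(what_hat))^2) / length(wobs)

fit$par       # Winf, k, t0, b
sigma2_hat    # compare with the lower bound of 0.01 used in the post
```

If `sigma2_hat` comes out below 0.01, the bound on sigma in the original call was active, and loosening it would be worth trying before judging the other parameters.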
[R] Fw: Re: Multiple Normal Curves
Sorry about the subject.

--- On Thu, 24/9/09, KABELI MEFANE kabelimef...@yahoo.co.uk wrote:

From: KABELI MEFANE kabelimef...@yahoo.co.uk
Subject: Re: [R] Multiply Normal Curves
To: R-help@r-project.org
Date: Thursday, 24 September, 2009, 11:48 AM

R-helpers,

I have been trying to solve this problem without much success. I managed to draw a graph for x, but it is not what I want to define (I want to specify the number of observations as well). I have also been able to draw a simple random sample.

parms <- data.frame(ID=c(1,2,3), mu=c(1,34000,5), sigma=c(2000,3000,5000))
curve(dnorm(x, mean=parms$mu[1], sd=parms$sigma[1]), from=2000, to=8, ylab="density", col="red")
curve(dnorm(x, mean=parms$mu[2], sd=parms$sigma[2]), from=1000, to=8, ylab="density", col="blue", add=TRUE)
curve(dnorm(x, mean=parms$mu[3], sd=parms$sigma[3]), from=1000, to=8, ylab="density", col="forestgreen", add=TRUE)

###

R-helpers,

I have been learning a little bit of R. I am simulating, and I want to draw a normal curve for each of my variables so that I can see the overlaps and reduce them. After that I want to draw a graph of all the values in the data frame to see whether they also follow a normal distribution. Lastly, I will try to sample from this data. Please help and make suggestions.

#My original codes
#code for dataframe
Hypermarket <- matrix(rnorm(100, mean=5, sd=5000))
Supermarket <- matrix(rnorm(400, mean=34000, sd=3000))
Minimarket <- matrix(rnorm(1000, mean=1, sd=2000))
Cornershop <- matrix(rnorm(1500, mean=2500, sd=500))
Spazashop <- matrix(rnorm(2000, mean=1000, sd=250))
dat=data.frame(type=c(rep("Hypermarket",100), rep("Supermarket",400), rep("Minimarket",1000), rep("Cornershop",1500), rep("Spazashop",2000)),
               value=c(Hypermarket, Supermarket, Minimarket, Cornershop, Spazashop))
dat

#code for histogram of Hypermarket (Please suggest something simple)
hist(Hypermarket, breaks=seq(3, 65000, 1000), freq=FALSE)
x <- seq(3, 65000, 500)
lines(x, dnorm(x, 5, 5000))
[R] Bug
Hello R users,

I tried to get the maximum sale date from my data frame using sqldf in R. The first time I executed the following code

sqldf("select max(sale_date) from test1")

I got the result 9997.0, but when I ran the same code a second time, the result was 2031-04-09 (which is the correct one!). Why did this happen?

Thanks.

-- View this message in context: http://www.nabble.com/Bug-tp25548042p25548042.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] aggregate() - error message
Hi,

Are you trying to get columns 2 and 3 from TripsData? In that case you need to say TripsData[,2:3].

Paul

Juliane Struve wrote:

Dear list, would anybody be able to tell me why the statement

Tripstatistics=aggregate(TripsData[2:3],by=list(Trip=Tripmatch),FUN=mean)

seems to work well with TripsData 1 but not with TripsData 2 ? With TripsData 2 it yields

Error in FUN(X[[1L]], ...) : arguments must have same length

I can't see a difference in the two data sets. Could someone shed light on the error message ? Maybe that would help me figure out where the problem is. Thanks very much for any help. Juliane

PS: The examples below contain the first 5 rows of my two datafiles.

TripsData 1
      Trip   Distance  TimeDiff  Tripmatch
7329  35,36  172.7453  0.0041    35,36
7371  36,35  172.7453  0.0004    35,36
7372  35,36  172.7453  0.        35,36
7373  36,35  172.7453  0.0004    35,36
7374  35,36  172.7453  0.0001    35,36

TripsData 2
      Trip   Distance  TimeDiff  Tripmatch
1617  19,20  1087.365  0.0441    19,20
1899  20,19  1087.365  0.0207    19,20
1915  19,20  1087.365  0.0361    19,20
3285  20,19  1087.365  0.0356    19,20
3826  19,20  1087.365  0.0697    19,20
Re: [R] aggregate() - error message
Hi Juliane,

Try TripsData[, 2:3] instead of TripsData[ 2:3 ] in the aggregate call.

HTH,
Jorge

On Thu, Sep 24, 2009 at 6:52 AM, Juliane Struve wrote:

Dear list, would anybody be able to tell me why the statement

Tripstatistics=aggregate(TripsData[2:3],by=list(Trip=Tripmatch),FUN=mean)

seems to work well with TripsData 1 but not with TripsData 2 ? With TripsData 2 it yields

Error in FUN(X[[1L]], ...) : arguments must have same length

I can't see a difference in the two data sets. Could someone shed light on the error message ? Maybe that would help me figure out where the problem is. Thanks very much for any help. Juliane

PS: The examples below contain the first 5 rows of my two datafiles.

TripsData 1
      Trip   Distance  TimeDiff  Tripmatch
7329  35,36  172.7453  0.0041    35,36
7371  36,35  172.7453  0.0004    35,36
7372  35,36  172.7453  0.        35,36
7373  36,35  172.7453  0.0004    35,36
7374  35,36  172.7453  0.0001    35,36

TripsData 2
      Trip   Distance  TimeDiff  Tripmatch
1617  19,20  1087.365  0.0441    19,20
1899  20,19  1087.365  0.0207    19,20
1915  19,20  1087.365  0.0361    19,20
3285  20,19  1087.365  0.0356    19,20
3826  19,20  1087.365  0.0697    19,20
[R] problem on SUSE Linux Enterprise Server 10 (ia64)
Dear Sir,

When I install R on SUSE Linux Enterprise Server 10 (ia64) (Linux a450 2.6.16.21-0.8-default #1 SMP Mon Jul 3 18:25:39 UTC 2006 ia64 ia64 ia64 GNU/Linux), it reports the following error messages at the end:

# ./configure
checking build system type... ia64-unknown-linux-gnu
checking host system type... ia64-unknown-linux-gnu
loading site script './config.site'
loading build specific script './config.site'
checking for pwd... /bin/pwd
checking whether builddir is srcdir... yes
checking for working aclocal... found
checking for working autoconf... found
.
checking for readline/readline.h... no
checking for rl_callback_read_char in -lreadline... no
checking for main in -lncurses... yes
checking for rl_callback_read_char in -lreadline... no
checking for history_truncate_file... no
configure: error: --with-readline=yes (default) and headers/libs are not available

Could you tell me how to fix the problem? Thank you!

Best wishes,
Yuan Zhidong
Re: [R] Bug
Please read and follow the last line to every message on r-help.

On Thu, Sep 24, 2009 at 5:32 AM, dhansekaran dhana...@gmail.com wrote:

Hello R users I tried to get maximum of sale date from my dataframe using sqldf in R. First time when i was executing the following code

sqldf("select max(sale_date) from test1")

i got the result as 9997.0 BUT when i was running the same for second time, the result was 2031-04-09 (this is what correct one!) why it was happened? thanks.

-- View this message in context: http://www.nabble.com/Bug-tp25548042p25548042.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Fw: Re: Multiple Normal Curves
On 09/24/2009 08:57 PM, KABELI MEFANE wrote: Sorry about the subject --- On Thu, 24/9/09, KABELI MEFANEkabelimef...@yahoo.co.uk wrote: From: KABELI MEFANEkabelimef...@yahoo.co.uk Subject: Re: [R] Multiply Normal Curves To: R-help@r-project.org Date: Thursday, 24 September, 2009, 11:48 AM R -helpers i have been trying to do this problem without must success,i managed to do a graph for x, but it is not what i want to define,(i want to specify number of observations as well). I have also been able to do simple rendom sample. data.frame(ID=c(1,2,3),mu=c(1,34000,5),sigma=c(2000,3000,5000)) curve(dnorm(x,mean=parms$mu[1],sd=parms$sigma[1]),from=2000, to=8, ylab=density, col=red) curve(dnorm(x,mean=parms$mu[2],sd=parms$sigma[2]),from=1000, to=8, ylab=density, col=blue, add=TRUE) curve(dnorm(x,mean=parms$mu[3],sd=parms$sigma[3]),from=1000, to=8, ylab=density, col=forestgreen, add=TRUE) ### R-helpers I have been learning a little bit of R. I am simulating and i want to draw a normal curve for all my variables so that i will see the overlaps and reduce them, after that i want to draw a gragh of all the values that are in the data frame to see if it follows a normal distribution also. Lastly i will try to sample from this data. Please help and make suggestions. Hi Kabeli, I think you want to get multiple histograms and normal curves on the same plot. You can do something like that if you get a table of frequencies for each of your three sets of values using cut or hist, combine these vectors of frequencies into a matrix and pass this to barplot. Then draw your normal curves using curve on top of the grouped bars. Jim __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
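Jim's suggestion can be sketched roughly as follows. The group labels, sizes, means and standard deviations below are hypothetical placeholders (the values in the original post are partly garbled in the archive), and scaling each normal density by n times the bin width is just one way to put the curves on the count scale of the bars.

```r
set.seed(1)

# Hypothetical groups: labels, sizes, means and sds are placeholders.
parms <- data.frame(type  = c("A", "B", "C"),
                    n     = c(100, 400, 1000),
                    mu    = c(50000, 34000, 10000),
                    sigma = c(5000, 3000, 2000))
vals <- lapply(seq_len(nrow(parms)),
               function(i) rnorm(parms$n[i], parms$mu[i], parms$sigma[i]))

# Common bins for all groups, then one column of counts per group.
breaks  <- pretty(range(unlist(vals)), n = 30)
counts  <- sapply(vals, function(v) hist(v, breaks = breaks, plot = FALSE)$counts)
bin_mid <- head(breaks, -1) + diff(breaks) / 2

# Grouped bars: rows of t(counts) are groups, columns are bins.
cols  <- c("red", "blue", "forestgreen")
bar_x <- barplot(t(counts), beside = TRUE, col = cols, border = NA,
                 names.arg = bin_mid, las = 2, ylab = "count")
legend("topright", legend = parms$type, fill = cols)

# Expected counts under each normal curve, drawn at the centre of each bin's bars.
centres <- colMeans(bar_x)
for (i in seq_len(nrow(parms))) {
  expected <- parms$n[i] * diff(breaks)[1] *
    dnorm(bin_mid, parms$mu[i], parms$sigma[i])
  lines(centres, expected, col = cols[i], lwd = 2)
}
```

The x positions returned by barplot() are bar coordinates, not data values, which is why the curves are evaluated at the bin midpoints and then drawn at `centres`.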
[R] problem in forcing variables in CART
Hi,

Can someone tell me whether it is possible to force in variables at various levels of the tree in CART? I have been using RPART for CART, which comes with the PARTYKIT package (is this package good for CART, or is there a better one?). I am facing a problem with the tree it generates: it picks up variables which don't make much sense. I want it to pick up a few variables at a few levels. Can someone please help me out in this regard? It's urgent.

Thanks
[R] superimposing xyplots on same scale
I have two xyplots that I want to superimpose (code below). By default they are displayed on slightly different y scales (one runs from 10 to 25, the other from 10 to 30). I would like to force them both onto the same scale (10 to 30) so the relation between the two is clear. Is there a way to do this?

Thanks much.

pct_compl_chart <- xyplot(pct_compl ~ date, col="red", type="b", pch=15,
                          scales=list(tick.number=5),
                          ylab=list(label="Pct. Compl."), layout=c(1,5),
                          xlab=list(label=""),
                          between = list(x = c(0, 0, 0), y = c(8,-10,-10,-10,-10)))

time_pct_chart <- xyplot(time_pct ~ date, col="blue", type="b", pch=15,
                         scales=list(tick.number=5),
                         ylab=list(label=""), layout=c(1,5),
                         xlab=list(label=""),
                         between = list(x = c(0, 0, 0), y = c(8,-10,-10,-10,-10)))
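One way to get both series onto a common y scale, sketched here with made-up data since the poster's data frame was not included: lattice accepts an extended formula of the form `y1 + y2 ~ x`, which draws both series in the same panel, and `ylim` fixes the range explicitly. The data frame `d`, its values and the axis label are hypothetical.

```r
library(lattice)

# Hypothetical data standing in for the poster's two series.
set.seed(1)
d <- data.frame(date      = seq(as.Date("2009-01-01"), by = "month", length.out = 12),
                pct_compl = runif(12, 10, 25),
                time_pct  = runif(12, 10, 30))

# Both series in one panel, sharing a y axis that runs from 10 to 30.
p <- xyplot(pct_compl + time_pct ~ date, data = d,
            type = "b", ylim = c(10, 30),
            scales = list(tick.number = 5),
            xlab = "", ylab = "Percent",
            par.settings = simpleTheme(col = c("red", "blue"), pch = 15),
            auto.key = list(points = TRUE, lines = TRUE))
print(p)
```

If the two charts need to stay as separate objects, passing the same `ylim = c(10, 30)` to each of the original xyplot() calls should also put them on matching scales.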
Re: [R] aggregate() - error message
Dear all, thanks you very much for replying. However, this does not seem to solve the problem. I still get the same error message when using TripsData[,2:3]. Is there anything else that could be wrong with this statement or the data ? What length is the error message referring to ? Many thanks, Juliane Dr. Juliane Struve Environmental Scientist 10, Lynwood Crescent Sunningdale SL5 0BL 01344 620811 - Original Message From: Paul Emberson em...@calidasoft.co.uk To: Juliane Struve juliane_str...@yahoo.co.uk Cc: r-help@r-project.org Sent: Thursday, 24 September, 2009 12:27:00 Subject: Re: [R] aggregate() - error message Hi, Are you trying to get columns 2 and 3 from TripsData in which case you need to say TripsData[,2:3] ? Paul Juliane Struve wrote: Dear list, would anybody be able to tell me why the statement Tripstatistics=aggregate(TripsData[2:3],by=list(Trip=Tripmatch),FUN=mean) seems to work well with TripsData 1 but not with TripsData 2 ? With TripsData 2 it yields Error in FUN(X[[1L]], ...) : arguments must have same length I can't see a difference in the two data sets. Could someone shed light on the error message ? Maybe that would help me figure out where the problem is. Thanks very much for any help. Juliane PS: The examples below contain the first 5 rows of my two datafiles. TripsData 1 Trip Distance TimeDiff Tripmatch 7329 35,36 172.7453 0.0041 35,36 7371 36,35 172.7453 0.0004 35,36 7372 35,36 172.7453 0. 35,36 7373 36,35 172.7453 0.0004 35,36 7374 35,36 172.7453 0.0001 35,36 TripsData 2 Trip Distance TimeDiff Tripmatch 1617 19,20 1087.365 0.0441 19,20 1899 20,19 1087.365 0.0207 19,20 1915 19,20 1087.365 0.0361 19,20 3285 20,19 1087.365 0.0356 19,20 3826 19,20 1087.365 0.0697 19,20 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
Re: [R] aggregate() - error message
Dear Jorge,

thank you very much for your help. So with() was missing.

Tripstatistics=with(TripsData,aggregate(TripsData[,3:4],by=list(Trip=Tripmatch),FUN=mean))

works.

Best wishes,
Juliane

Dr. Juliane Struve
Environmental Scientist
10, Lynwood Crescent
Sunningdale SL5 0BL
01344 620811

From: Jorge Ivan Velez jorgeivanve...@gmail.com
Cc: R mailing list r-help@r-project.org
Sent: Thursday, 24 September, 2009 13:32:59
Subject: Re: [R] aggregate() - error message

Hi Julien,

Works for me:

# Data sets
TripsData1 <- read.table(textConnection("Trip Distance TimeDiff Tripmatch
7329 35,36 172.7453 0.0041 35,36
7371 36,35 172.7453 0.0004 35,36
7372 35,36 172.7453 0. 35,36
7373 36,35 172.7453 0.0004 35,36
7374 35,36 172.7453 0.0001 35,36"), header = TRUE)

TripsData2 <- read.table(textConnection("Trip Distance TimeDiff Tripmatch
1617 19,20 1087.365 0.0441 19,20
1899 20,19 1087.365 0.0207 19,20
1915 19,20 1087.365 0.0361 19,20
3285 20,19 1087.365 0.0356 19,20
3826 19,20 1087.365 0.0697 19,20"), header = TRUE)
closeAllConnections()

# Individually
with(TripsData1, aggregate(TripsData1[, 2:3], by = list(Tripmatch), FUN=mean))
#   Group.1 Distance TimeDiff
# 1   35,36 172.7453    0.001

with(TripsData2, aggregate(TripsData2[, 2:3], by = list(Tripmatch), FUN=mean))
#   Group.1 Distance TimeDiff
# 1   19,20 1087.365  0.04124

# Combining both data sets
AllTrips <- rbind(TripsData1, TripsData2)
with(AllTrips, aggregate(AllTrips[, 2:3], by = list(Tripmatch), FUN=mean))
#   Group.1  Distance TimeDiff
# 1   35,36  172.7453  0.00100
# 2   19,20 1087.3650  0.04124

HTH,
Jorge

On Thu, Sep 24, 2009 at 8:23 AM, Juliane Struve wrote:

Dear all, thanks you very much for replying. However, this does not seem to solve the problem. I still get the same error message when using TripsData[,2:3]. Is there anything else that could be wrong with this statement or the data ? What length is the error message referring to ? Many thanks, Juliane

Dr. Juliane Struve Environmental Scientist 10, Lynwood Crescent Sunningdale SL5 0BL 01344 620811

- Original Message
From: Paul Emberson em...@calidasoft.co.uk
Cc: r-help@r-project.org
Sent: Thursday, 24 September, 2009 12:27:00
Subject: Re: [R] aggregate() - error message

Hi, Are you trying to get columns 2 and 3 from TripsData in which case you need to say TripsData[,2:3] ? Paul

Juliane Struve wrote:

Dear list, would anybody be able to tell me why the statement

Tripstatistics=aggregate(TripsData[2:3],by=list(Trip=Tripmatch),FUN=mean)

seems to work well with TripsData 1 but not with TripsData 2 ? With TripsData 2 it yields

Error in FUN(X[[1L]], ...) : arguments must have same length

I can't see a difference in the two data sets. Could someone shed light on the error message ? Maybe that would help me figure out where the problem is. Thanks very much for any help. Juliane

PS: The examples below contain the first 5 rows of my two datafiles.

TripsData 1
      Trip   Distance  TimeDiff  Tripmatch
7329  35,36  172.7453  0.0041    35,36
7371  36,35  172.7453  0.0004    35,36
7372  35,36  172.7453  0.        35,36
7373  36,35  172.7453  0.0004    35,36
7374  35,36  172.7453  0.0001    35,36

TripsData 2
      Trip   Distance  TimeDiff  Tripmatch
1617  19,20  1087.365  0.0441    19,20
1899  20,19  1087.365  0.0207    19,20
1915  19,20  1087.365  0.0361    19,20
3285  20,19  1087.365  0.0356    19,20
3826  19,20  1087.365  0.0697    19,20
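A possible reading of why with() made the difference: in the original call, `Tripmatch` inside `list(Trip=Tripmatch)` is looked up in the workspace rather than in TripsData, so a leftover vector of a different length would produce a length-mismatch error of this kind. That is an assumption about the poster's session, which the thread does not confirm. In the sketch below the data frame copies the five TripsData 2 rows from the post, while the stale `Tripmatch` vector is invented for illustration.

```r
# Five rows copied from "TripsData 2" in the post.
TripsData <- data.frame(
  Trip      = c("19,20", "20,19", "19,20", "20,19", "19,20"),
  Distance  = rep(1087.365, 5),
  TimeDiff  = c(0.0441, 0.0207, 0.0361, 0.0356, 0.0697),
  Tripmatch = rep("19,20", 5)
)

# Invented for illustration: a workspace vector left over from another
# data set, with a different length from nrow(TripsData).
Tripmatch <- rep("35,36", 3)

# Grouping by the workspace vector fails: 5 rows versus 3 group labels.
res <- try(aggregate(TripsData[, 2:3], by = list(Trip = Tripmatch), FUN = mean),
           silent = TRUE)
inherits(res, "try-error")

# Naming the column explicitly (or using with(), as in the thread)
# makes aggregate() group by the column of the data frame itself.
stats <- aggregate(TripsData[, 2:3],
                   by = list(Trip = TripsData$Tripmatch), FUN = mean)
stats
```

Checking `length(Tripmatch)` against `nrow(TripsData)` before the call is a quick way to see whether this is what happened.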
[R] P-value and R-squared variable selection criteria
Hi R community,

I have a question. I'll explain my situation. I have to build a climate model to obtain monthly and annual temperature from 2004 to 2008 for a specific area in Almeria (Spain). To build this climate model, I will use multiple regression. My dependent variable will be monthly and annual temperature, and the independent variables will be latitude, longitude and altitude. I will work with climate data from 10 climate stations distributed over my area of interest. I have to fit the climate model to the data to get the temperature for each month, and I need to use the p-value and the adjusted R-squared of the model to obtain the best fit.

Here is an example. My climate data will be:

  V1 V2 V3 V4  V5
1  1 18  3  6 187
2  2 21  6  8  68
3  3 23  9  5  42
4  4 19  8  2 194
5  5 17  3  2 225

(V1 - climate station, V2 - temperature, V3 - latitude, V4 - longitude, V5 - altitude)

I fit the model to the data

fit <- lm(V2~V3+V4+V5, data=clima)

And I get

Call:
lm(formula = V2 ~ V3 + V4 + V5, data = clima)

Residuals:
       1        2        3        4        5
 0.24684 -0.25200  0.17487 -0.05865 -0.11107

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 22.103408   2.526638   8.748   0.0725 .
V3           0.236477   0.152067   1.555   0.3638
V4          -0.073973   0.169716  -0.436   0.7383
V5          -0.024684   0.006951  -3.551   0.1748
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.4133 on 1 degrees of freedom
Multiple R-squared: 0.9926, Adjusted R-squared: 0.9706
F-statistic: 44.95 on 3 and 1 DF, p-value: 0.1091

The p-value for this model is 0.1091. However, I see that variable V4 has a really high p-value, so if I take it out, my model will have a better p-value. So:

fit2 <- lm(V2~V4+V5, data=clima)

Call:
lm(formula = V2 ~ V4 + V5, data = clima)

Residuals:
       1        2        3        4        5
 0.28356 -0.21880  0.05952  0.40918 -0.53346

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 25.764478   1.199212  21.485  0.00216 **
V4          -0.278286   0.140452  -1.981  0.18606
V5          -0.034109   0.004451  -7.664  0.01660 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.5403 on 2 degrees of freedom
Multiple R-squared: 0.9748, Adjusted R-squared: 0.9497
F-statistic: 38.74 on 2 and 2 DF, p-value: 0.02516

My new p-value for the model is lower, and better. So this is what I have to do: import the climate data and build the climate model using those independent variables that give me the best p-value for the model, and I have to do it automatically (in this example I did it manually).

So, my question after all this long explanation: is there a package or command I can download to select independent variables using the p-value and the adjusted R-squared as criteria, or do I have to build what I need myself? I guess I can build it myself, but it will take me a while, so I would like to know if there is a package that would help me do it faster.

Well, thanks in advance.

Lucas
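Base R has step(), which automates this kind of search but ranks models by AIC rather than by p-value or adjusted R-squared. If those two criteria are required, one option with only three candidate predictors is to fit every subset and tabulate both. Below is a minimal sketch using the five rows from the post; the helper names `preds`, `subsets` and `res` are illustrative, and with so few observations the ranking should be read with caution.

```r
# The five stations shown in the post.
clima <- data.frame(V1 = 1:5,
                    V2 = c(18, 21, 23, 19, 17),
                    V3 = c(3, 6, 9, 8, 3),
                    V4 = c(6, 8, 5, 2, 2),
                    V5 = c(187, 68, 42, 194, 225))

# Every non-empty subset of the candidate predictors (7 in total).
preds   <- c("V3", "V4", "V5")
subsets <- unlist(lapply(seq_along(preds),
                         function(k) combn(preds, k, simplify = FALSE)),
                  recursive = FALSE)

# Fit each model and collect adjusted R-squared and the overall F-test p-value.
res <- do.call(rbind, lapply(subsets, function(s) {
  sm <- summary(lm(reformulate(s, response = "V2"), data = clima))
  f  <- sm$fstatistic
  data.frame(model   = paste(s, collapse = " + "),
             adj_r2  = sm$adj.r.squared,
             p_value = unname(pf(f[1], f[2], f[3], lower.tail = FALSE)),
             stringsAsFactors = FALSE)
}))

# Models ordered from highest to lowest adjusted R-squared.
res[order(-res$adj_r2), ]
```

The same table can be sorted by `p_value` instead, and the loop extends to more predictors, although the number of subsets doubles with each one added.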
[R] pipe data from plot(). was: ROCR.plot methods, cross validation averaging
All,

I'm trying again with a slightly more generic version of my first question. I can extract the plotted values from hist(), boxplot(), and even plot.randomForest(). Observe:

# get some data
dat <- rnorm(100)
# grab histogram data
hdat <- hist(dat)
hdat   #provides details of the hist output
#grab boxplot data
bdat <- boxplot(dat)
bdat   #provides details of the boxplot output
# the same works for randomForest
library(randomForest)
data(mtcars)
RFdat <- plot(randomForest(mpg ~ ., mtcars, keep.forest=FALSE, ntree=100), log="y")
RFdat
##But, I can't use this method in ROCR
library(ROCR)
data(ROCR.xval)
RCdat <- plot(perf, avg="threshold")
RCdat
## output: NULL

Does anyone have any tricks for piping or extracting these data? Or, perhaps for steering me in another direction?

Thanks,
Tim

From: Tim Howard tghow...@gw.dec.state.ny.us
Subject: [R] ROCR.plot methods, cross validation averaging
To: osan...@mpi-sb.mpg.de, tobias.s...@mpi-sb.mpg.de, r-help@r-project.org
Message-ID: 4aba1079.6d16.00d...@gw.dec.state.ny.us
Content-Type: text/plain; charset=US-ASCII

Dear R-help and ROCR developers (Tobias Sing and Oliver Sander) -

I think my first question is generic and could apply to many methods, which is why I'm directing this initially to R-help as well as Tobias and Oliver.

Question 1. The plot function in ROCR will average your cross validation data if asked. I'd like to use that averaged data to find a best cutoff but I can't figure out how to grab the actual data that get plotted. A simple redirect of the plot (such as test <- plot(mydata)) doesn't do it.

Question 2. I am asking ROCR to average lists with varying lengths for each list entry. See my example below. None of the ROCR examples have data structured in this manner. Can anyone speak to whether the averaging methods in ROCR allow for this? If I can't easily grab the data as desired from Question 1, can someone help me figure out how to average the lists, by threshold, similarly?

Question 3. If my cross validation data happen to have a list entry whose length = 2, ROCR errors out. Please see the second part of my example. Any suggestions?

#reproducible examples exemplifying my questions
##part one##
library(ROCR)
data(ROCR.xval)
# set up data so it looks more like my real data
sampSize <- c(4, 55, 20, 75, 350, 250, 6, 120, 200, 25)
testSet <- ROCR.xval
# do the extraction
for (i in 1:length(ROCR.xval[[1]])){
  y <- sample(c(1:350),sampSize[i])
  testSet$predictions[[i]] <- ROCR.xval$predictions[[i]][y]
  testSet$labels[[i]] <- ROCR.xval$labels[[i]][y]
}
# now massage the data using ROCR, set up for a ROC plot
# if it errors out here, run the above sample again.
pred <- prediction(testSet$predictions, testSet$labels)
perf <- performance(pred,"tpr","fpr")
# create the ROC plot, averaging by cutoff value
plot(perf, avg="threshold")
# check out the structure of the data
str(perf)
# note the ragged edges of the list and that I assume averaging
# whether it be vertical, horizontal, or threshold, somehow
# accounts for this?

## part two ##
# add a list entry with only two values
p...@x.values[[1]] <- c(0,1)
p...@y.values[[1]] <- c(0,1)
p...@alpha.values[[1]] <- c(Inf,0)
plot(perf, avg="threshold")
##output results in an error with this message
# Error in if (from == to) rep.int(from, length.out) else as.vector(c(from, :
#   missing value where TRUE/FALSE needed

Thanks in advance for your help

Tim Howard
New York Natural Heritage Program
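On Question 2, here is a base-R sketch of one way to average ROC curves by threshold across folds of unequal size. It does not use ROCR and is not claimed to reproduce ROCR's internal averaging; the simulated folds, the helper `rates_at` and the choice of 50 cutoffs are all assumptions made for illustration. The last lines pick a cutoff by maximizing TPR minus FPR, which is one common criterion among several.

```r
set.seed(42)

# Simulated folds of unequal size (hypothetical stand-ins for real CV output).
folds <- lapply(c(30, 55, 120), function(n) {
  labels <- rbinom(n, 1, 0.5)
  scores <- rnorm(n, mean = labels)   # positives score higher on average
  list(scores = scores, labels = labels)
})

# True and false positive rates of one fold at a shared set of cutoffs.
rates_at <- function(fold, cutoffs) {
  pos <- fold$labels == 1
  tpr <- sapply(cutoffs, function(cc) mean(fold$scores[pos] >= cc))
  fpr <- sapply(cutoffs, function(cc) mean(fold$scores[!pos] >= cc))
  cbind(fpr = fpr, tpr = tpr)
}

# One common grid of cutoffs, from the highest score down to the lowest.
all_scores <- unlist(lapply(folds, `[[`, "scores"))
cutoffs    <- seq(max(all_scores), min(all_scores), length.out = 50)

# Average the per-fold rates cutoff by cutoff; fold length no longer matters
# because every fold is evaluated on the same grid.
per_fold <- lapply(folds, rates_at, cutoffs = cutoffs)
avg      <- Reduce(`+`, per_fold) / length(per_fold)
avg_df   <- data.frame(cutoff = cutoffs, avg)

# The averaged curve is now ordinary data: plot it or search it for a cutoff.
plot(avg_df$fpr, avg_df$tpr, type = "l", xlab = "FPR", ylab = "TPR")
best <- avg_df[which.max(avg_df$tpr - avg_df$fpr), ]
best
```

Because each fold contributes one rate per cutoff regardless of its size, every fold gets equal weight in the average; weighting by fold size would be a different, equally defensible choice.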
Re: [R] pipe data from plot(). was: ROCR.plot methods, cross validation averaging
On Sep 24, 2009, at 9:09 AM, Tim Howard wrote: All, I'm trying again with a slightly more generic version of my first question. I can extract the plotted values from hist(), boxplot(), and even plot.randomForest(). Observe: # get some data dat - rnorm(100) # grab histogram data hdat - hist(dat) hdat #provides details of the hist output #grab boxplot data bdat - boxplot(dat) bdat #provides details of the boxplot output # the same works for randomForest library(randomForest) data(mtcars) RFdat - plot(randomForest(mpg ~ ., mtcars, keep.forest=FALSE, ntree=100), log=y) RFdat ##But, I can't use this method in ROCR library(ROCR) data(ROCR.xval) RCdat - plot(perf, avg=threshold) That code throws an object not found error. Perhaps you defined perf earlier? David RCdat ## output: NULL Does anyone have any tricks for piping or extracting these data? Or, perhaps for steering me in another direction? Thanks, Tim From: Tim Howard tghow...@gw.dec.state.ny.us Subject: [R] ROCR.plot methods, cross validation averaging To: osan...@mpi-sb.mpg.de, tobias.s...@mpi-sb.mpg.de, r-help@r-project.org Message-ID: 4aba1079.6d16.00d...@gw.dec.state.ny.us Content-Type: text/plain; charset=US-ASCII Dear R-help and ROCR developers (Tobias Sing and Oliver Sander) - I think my first question is generic and could apply to many methods, which is why I'm directing this initially to R-help as well as Tobias and Oliver. Question 1. The plot function in ROCR will average your cross validation data if asked. I'd like to use that averaged data to find a best cutoff but I can't figure out how to grab the actual data that get plotted. A simple redirect of the plot (such as test - plot(mydata)) doesn't do it. Question 2. I am asking ROCR to average lists with varying lengths for each list entry. See my example below. None of the ROCR examples have data structured in this manner. Can anyone speak to whether the averaging methods in ROCR allow for this? 
If I can't easily grab the data as desired from Question 1, can someone help me figure out how to average the lists, by threshold, similarly? Question 3. If my cross validation data happen to have a list entry whose length = 2, ROCR errors out. Please see the second part of my example. Any suggestions? #reproducible examples exemplifying my questions ##part one## library(ROCR) data(ROCR.xval) # set up data so it looks more like my real data sampSize - c(4, 55, 20, 75, 350, 250, 6, 120, 200, 25) testSet - ROCR.xval # do the extraction for (i in 1:length(ROCR.xval[[1]])){ y - sample(c(1:350),sampSize[i]) testSet$predictions[[i]] - ROCR.xval$predictions[[i]][y] testSet$labels[[i]] - ROCR.xval$labels[[i]][y] } # now massage the data using ROCR, set up for a ROC plot # if it errors out here, run the above sample again. pred - prediction(testSet$predictions, testSet$labels) perf - performance(pred,tpr,fpr) # create the ROC plot, averaging by cutoff value plot(perf, avg=threshold) # check out the structure of the data str(perf) # note the ragged edges of the list and that I assume averaging # whether it be vertical, horizontal, or threshold, somehow # accounts for this? ## part two ## # add a list entry with only two values p...@x.values[[1]] - c(0,1) p...@y.values[[1]] - c(0,1) p...@alpha.values[[1]] - c(Inf,0) plot(perf, avg=threshold) ##output results in an error with this message # Error in if (from == to) rep.int(from, length.out) else as.vector(c(from, : # missing value where TRUE/FALSE needed Thanks in advance for your help Tim Howard New York Natural Heritage Program __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] pipe data from plot(). was: ROCR.plot methods, cross validation averaging
Whoops, sorry. Here is the full set with the missing lines:

library(ROCR)
data(ROCR.xval)
pred <- prediction(ROCR.xval$predictions, ROCR.xval$labels)
perf <- performance(pred,"tpr","fpr")
RCdat <- plot(perf, avg="threshold")
RCdat

Thanks.
Tim

David Winsemius dwinsem...@comcast.net 9/24/2009 9:25 AM

On Sep 24, 2009, at 9:09 AM, Tim Howard wrote:

All, I'm trying again with a slightly more generic version of my first question. I can extract the plotted values from hist(), boxplot(), and even plot.randomForest(). Observe:

# get some data
dat <- rnorm(100)
# grab histogram data
hdat <- hist(dat)
hdat   #provides details of the hist output
#grab boxplot data
bdat <- boxplot(dat)
bdat   #provides details of the boxplot output
# the same works for randomForest
library(randomForest)
data(mtcars)
RFdat <- plot(randomForest(mpg ~ ., mtcars, keep.forest=FALSE, ntree=100), log="y")
RFdat
##But, I can't use this method in ROCR
library(ROCR)
data(ROCR.xval)
RCdat <- plot(perf, avg="threshold")

That code throws an object not found error. Perhaps you defined perf earlier?

David

RCdat
## output: NULL

Does anyone have any tricks for piping or extracting these data? Or, perhaps for steering me in another direction? Thanks, Tim

From: Tim Howard tghow...@gw.dec.state.ny.us
Subject: [R] ROCR.plot methods, cross validation averaging
To: osan...@mpi-sb.mpg.de, tobias.s...@mpi-sb.mpg.de, r-help@r-project.org
Message-ID: 4aba1079.6d16.00d...@gw.dec.state.ny.us
Content-Type: text/plain; charset=US-ASCII

Dear R-help and ROCR developers (Tobias Sing and Oliver Sander) - I think my first question is generic and could apply to many methods, which is why I'm directing this initially to R-help as well as Tobias and Oliver.

Question 1. The plot function in ROCR will average your cross validation data if asked. I'd like to use that averaged data to find a best cutoff but I can't figure out how to grab the actual data that get plotted. A simple redirect of the plot (such as test <- plot(mydata)) doesn't do it.

Question 2.
I am asking ROCR to average lists with varying lengths for each list entry. See my example below. None of the ROCR examples have data structured in this manner. Can anyone speak to whether the averaging methods in ROCR allow for this? If I can't easily grab the data as desired from Question 1, can someone help me figure out how to average the lists, by threshold, similarly? Question 3. If my cross validation data happen to have a list entry whose length = 2, ROCR errors out. Please see the second part of my example. Any suggestions? #reproducible examples exemplifying my questions ##part one## library(ROCR) data(ROCR.xval) # set up data so it looks more like my real data sampSize - c(4, 55, 20, 75, 350, 250, 6, 120, 200, 25) testSet - ROCR.xval # do the extraction for (i in 1:length(ROCR.xval[[1]])){ y - sample(c(1:350),sampSize[i]) testSet$predictions[[i]] - ROCR.xval$predictions[[i]][y] testSet$labels[[i]] - ROCR.xval$labels[[i]][y] } # now massage the data using ROCR, set up for a ROC plot # if it errors out here, run the above sample again. pred - prediction(testSet$predictions, testSet$labels) perf - performance(pred,tpr,fpr) # create the ROC plot, averaging by cutoff value plot(perf, avg=threshold) # check out the structure of the data str(perf) # note the ragged edges of the list and that I assume averaging # whether it be vertical, horizontal, or threshold, somehow # accounts for this? 
## part two ## # add a list entry with only two values p...@x.values[[1]] - c(0,1) p...@y.values[[1]] - c(0,1) p...@alpha.values[[1]] - c(Inf,0) plot(perf, avg=threshold) ##output results in an error with this message # Error in if (from == to) rep.int(from, length.out) else as.vector(c(from, : # missing value where TRUE/FALSE needed Thanks in advance for your help Tim Howard New York Natural Heritage Program __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Downloading data from internet
Hi all,

I want to download data from these two sources directly into R:

http://www.rateinflation.com/consumer-price-index/usa-cpi.php
http://eaindustry.nic.in/asp2/list_d.asp

The first is the CPI of the US and the second is the WPI of India. Can anyone give me a clue on how to download them directly into R? I want to turn them into zoo objects for further analysis.

Thanks,
Re: [R] P-value and R-squared variable selection criteria
Lucas,

This problem is very old, older than keypunches. There are several methods for selecting variables (forward, backward, both, all subsets) using a variety of criteria (p-values, R^2, adjusted R^2, Cp, AIC, BIC, and more). Be sure you understand the methods, especially their tendency to overfit.

I use the BIC: the function is stepAIC from the MASS package with the parameter k = log(sample size).

Joe
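A minimal sketch of the BIC-based selection described above, on simulated data. The variable names x1 to x4, the coefficients, and the sample size are made up for illustration. The sketch uses step() from base R's stats package, which takes the same k penalty argument as stepAIC in MASS, so it runs without extra packages.

```r
# Simulated data: only x1 and x3 truly affect y (an illustrative assumption)
set.seed(1)
n <- 200
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
d$y <- 2 * d$x1 + 2 * d$x3 + rnorm(n)

full <- lm(y ~ x1 + x2 + x3 + x4, data = d)

# k = 2 (the default) gives AIC; k = log(n) gives the BIC penalty
fit_bic <- step(full, direction = "backward", k = log(n), trace = 0)

# Names of the predictors that survived the selection
kept <- attr(terms(fit_bic), "term.labels")
print(kept)
```

The BIC penalty grows with the sample size, so it tends to keep fewer variables than AIC. Any stepwise procedure can still overfit, so the selected model is best checked on data not used for the selection.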
Re: [R] pipe data from plot(). was: ROCR.plot methods, cross validation averaging
On Sep 24, 2009, at 9:09 AM, Tim Howard wrote:

> All,
> I'm trying again with a slightly more generic version of my first question. I can extract the plotted values from hist(), boxplot(), and even plot.randomForest(). Observe:
>
> # get some data
> dat <- rnorm(100)
> # grab histogram data
> hdat <- hist(dat)
> hdat  # provides details of the hist output
> # grab boxplot data
> bdat <- boxplot(dat)
> bdat  # provides details of the boxplot output
> # the same works for randomForest
> library(randomForest)
> data(mtcars)
> RFdat <- plot(randomForest(mpg ~ ., mtcars, keep.forest = FALSE, ntree = 100), log = "y")
> RFdat
>
> ## But, I can't use this method in ROCR
> library(ROCR)
> data(ROCR.xval)
> RCdat <- plot(perf, avg = "threshold")
> RCdat
> ## output: NULL
>
> Does anyone have any tricks for piping or extracting these data? Or, perhaps for steering me in another direction?

After looking at the examples in ROCR, my guess is that you really ought to examine the perf object itself. It's an S4 object, so access to its internals is a bit different. In the example performance object I just created, the values in the y-values slot would be obtainable with:

p...@y.values

There is also help from:

?plot-methods

-- David

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
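The S4 slot access mentioned in this reply can be illustrated without ROCR installed. The sketch below defines a hypothetical stand-in class, PerfLike, with the three list slots the thread refers to (x.values, y.values, alpha.values). The class name and the numbers are assumptions for illustration only; on a real ROCR performance object the same @ and slot() calls should apply.

```r
library(methods)

# Hypothetical stand-in for a ROCR performance object: three list slots,
# one list entry per cross-validation run (lengths may differ between runs)
setClass("PerfLike", representation(x.values = "list",
                                    y.values = "list",
                                    alpha.values = "list"))

perf_like <- new("PerfLike",
                 x.values     = list(c(0, 0.5, 1), c(0, 1)),
                 y.values     = list(c(0, 0.8, 1), c(0, 1)),
                 alpha.values = list(c(Inf, 0.6, 0.1), c(Inf, 0.2)))

slotNames(perf_like)             # lists the available slots
perf_like@y.values[[1]]          # direct slot access with @
slot(perf_like, "alpha.values")  # equivalent access by slot name

# Number of points in each run, showing the ragged structure
run_lengths <- sapply(perf_like@y.values, length)
```

str() on the object gives the same overview in one call, which is how the ragged list lengths in the original question were spotted.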
Re: [R] superimposing xyplots on same scale
Hi,

try ?as.layer in the latticeExtra package.

HTH,
baptiste

2009/9/24 Larry White ljw1...@gmail.com:

> I have two xyplots that I want to superimpose (code below). By default they are displayed on slightly different y scales (one runs from 10 to 25, the other from 10 to 30). I would like to force them both onto the same scale (10 to 30) so the relation between the two is clear. Is there a way to do this?
>
> thanks much
>
> pct_compl_chart <- xyplot(pct_compl ~ date, col = "red", type = "b", pch = 15,
>     scales = list(tick.number = 5), ylab = list(label = "Pct. Compl."),
>     layout = c(1, 5), xlab = list(label = ""),
>     between = list(x = c(0, 0, 0), y = c(8, -10, -10, -10, -10)))
>
> time_pct_chart <- xyplot(time_pct ~ date, col = "blue", type = "b", pch = 15,
>     scales = list(tick.number = 5), ylab = list(label = ""),
>     layout = c(1, 5), xlab = list(label = ""),
>     between = list(x = c(0, 0, 0), y = c(8, -10, -10, -10, -10)))
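As an alternative to layering two separate plots, a sketch using only the lattice package: the extended formula pct_compl + time_pct ~ date draws both series in one panel, and ylim sets the shared 10 to 30 scale. The data frame below is made up to stand in for the poster's variables. If the as.layer route is preferred, passing the same ylim to both xyplot calls should serve the same purpose, though that is not tested here.

```r
library(lattice)

# Made-up data standing in for the poster's pct_compl, time_pct and date
dat <- data.frame(date      = as.Date("2009-09-01") + 0:9,
                  pct_compl = seq(10, 25, length.out = 10),
                  time_pct  = seq(12, 30, length.out = 10))

# Extended formula: both series share one panel; ylim fixes the common y scale
p <- xyplot(pct_compl + time_pct ~ date, data = dat,
            type = "b", pch = 15, col = c("red", "blue"),
            ylim = c(10, 30), ylab = "Percent", xlab = "")
print(p)
```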
Re: [R] pipe data from plot(). was: ROCR.plot methods, cross validation averaging
Tim,

if I understand correctly, you are trying to get the numerical values of the averaged cross-validation curves. Unfortunately, the plot function of ROCR does not return anything in the current version (it's a good suggestion to change this). If you want a quick fix, you could change the plot.performance function of ROCR to return the values you want.

Kind regards,
Tobias

On Thu, Sep 24, 2009 at 3:09 PM, Tim Howard tghow...@gw.dec.state.ny.us wrote:

> Does anyone have any tricks for piping or extracting these data? Or, perhaps for steering me in another direction?
Re: [R] P-value and R-squared variable selection criteria
Don't throw out the baby with the bath water just yet. Note that even though your first model is insignificant, the R-squared is very high. This is because you fit the whole model, with an intercept and three coefficients, on 1 degree of freedom. You need to first import the data, then run the model, and then decide which coefficients to include.

Second, you may have data redundancy issues, for example if altitude correlates with longitude or latitude (since you have so few stations from a very restricted region, this seems more likely than for larger regions). Check the correlations. If they are high, you may think about data reduction strategies (e.g. principal components analysis).

Further, your data is panel data (where the cross-section is the 10 stations and the time series is the 2004 to 2008 monthly data). Thus, it is very likely that fitting OLS without recognizing the dependence of the time series within each station is problematic. On top of that, there is certainly correlation across stations, e.g. due to seasonal patterns, that you may want to account for.

That said, if you want to step down a model to exclude the insignificant predictor variables one by one (more specifically, those with a t-value smaller than 1), use step:

x1 <- rnorm(100)
x2 <- rnorm(100)
x3 <- rnorm(100)
x4 <- rnorm(100)
e <- rnorm(100, 0, 2)
y <- x1 + x3 + e
reg <- lm(y ~ x1 + x2 + x3 + x4)
summary(reg)
step(reg)
reg2 <- lm(y ~ x1 + x3)
summary(reg2)

HTH
Daniel

- cuncta stricte discussurus -

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Lucas Sevilla García
Sent: Thursday, September 24, 2009 8:49 AM
To: r-help@r-project.org
Subject: [R] P-value and R-squared variable selection criteria

> Hi R community,
>
> I have a question. I'll explain my situation. I have to build a climate model to obtain monthly and annual temperature from 2004 to 2008 for a specific area in Almeria (Spain). To build this climate model, I will use multiple regression. My dependent variable will be monthly and annual temperature, and the independent variables will be latitude, longitude and altitude. I will work with climate data from 10 climate stations distributed over my area of interest. I have to fit the climate model to the data to get the temperature for each month, and I need to use the p-value and adjusted R-squared of the model to obtain the best fit. Here is an example. My climate data will be:
>
>   V1 V2 V3 V4  V5
> 1  1 18  3  6 187
> 2  2 21  6  8  68
> 3  3 23  9  5  42
> 4  4 19  8  2 194
> 5  5 17  3  2 225
>
> (V1 - climate station, V2 - temperature, V3 - latitude, V4 - longitude, V5 - altitude)
>
> I fit the model to the data:
>
> fit <- lm(V2 ~ V3 + V4 + V5, data = clima)
>
> And I get:
>
> Call:
> lm(formula = V2 ~ V3 + V4 + V5, data = clima)
>
> Residuals:
>        1        2        3        4        5
>  0.24684 -0.25200  0.17487 -0.05865 -0.11107
>
> Coefficients:
>              Estimate Std. Error t value Pr(>|t|)
> (Intercept) 22.103408   2.526638   8.748   0.0725 .
> V3           0.236477   0.152067   1.555   0.3638
> V4          -0.073973   0.169716  -0.436   0.7383
> V5          -0.024684   0.006951  -3.551   0.1748
> ---
> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> Residual standard error: 0.4133 on 1 degrees of freedom
> Multiple R-squared: 0.9926, Adjusted R-squared: 0.9706
> F-statistic: 44.95 on 3 and 1 DF, p-value: 0.1091
>
> The p-value for this model is 0.1091. However, I see that variable V4 has a really high p-value, so if I take it out, my model will have a better p-value. So:
>
> fit2 <- lm(V2 ~ V4 + V5, data = clima)
>
> Call:
> lm(formula = V2 ~ V4 + V5, data = clima)
>
> Residuals:
>        1        2        3        4        5
>  0.28356 -0.21880  0.05952  0.40918 -0.53346
>
> Coefficients:
>              Estimate Std. Error t value Pr(>|t|)
> (Intercept) 25.764478   1.199212  21.485  0.00216 **
> V4          -0.278286   0.140452  -1.981  0.18606
> V5          -0.034109   0.004451  -7.664  0.01660 *
> ---
> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> Residual standard error: 0.5403 on 2 degrees of freedom
> Multiple R-squared: 0.9748, Adjusted R-squared: 0.9497
> F-statistic: 38.74 on 2 and 2 DF, p-value: 0.02516
>
> My new p-value for the model is lower, and better. So this is what I have to do: import the climate data and build the climate model using those independent variables that give me the best p-value for the model, and I have to do it automatically (in this example I did it manually). So, my question after this long explanation: is there a package or function I can download to select independent variables using the p-value and adjusted R-squared as criteria, or do I have to build what I need myself? I guess I can build it myself, but it will take me a while, so I would like to know if there is a package that would help me do it faster.
>
> Well, thanks in advance.
>
> Lucas
Re: [R] Statistical analysis
Rainfall data is widely accepted to follow a random walk process and hence is non-stationary. Therefore, if a correlation or regression coefficient is measured on the raw data, you may land in the world of spurious measures. I would suggest that you first check whether there is a unit root in your data. If there is, estimate the correlation or any other statistical measure on the differenced data.

Best,

cls59 wrote:

> Chris Li wrote:
>
>> Hi all,
>> I have got two datasets, one of them is rainfall data and the other one is groundwater level data. I would like to see whether there is a correlation between these two datasets and, if there is, to what extent they are correlated. My stats background is limited, therefore any advice on which command I should use in R would be greatly appreciated.
>>
>> Thanks in advance.
>> Chris
>
> Supposing you have two variables (precipitation, p, and groundwater potential, h), a simple test for linear correlation is to produce a scatterplot of h vs. p:
>
> plot( h ~ p )
>
> If it looks linear, then it may be worthwhile to have R estimate the coefficient of correlation for the data:
>
> cor( p, h )
>
> If the correlation coefficient is close to +/- 1, then your data is exhibiting a strong linear trend and a linear model may be appropriate:
>
> linModel <- lm( h ~ p )
> abline( linModel )
>
> Good luck!
>
> -Charlie
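A small sketch of the check-then-difference workflow suggested in this reply, using base R only. The two series are simulated independent random walks standing in for rainfall (p) and groundwater level (h); they are not real data, and whether actual rainfall behaves like a random walk is something the unit-root test should be allowed to decide. PP.test is the Phillips-Perron test in the stats package; other unit-root tests exist in add-on packages.

```r
set.seed(42)
n <- 300
# Two independent random walks as stand-ins for rainfall (p) and groundwater (h)
p <- cumsum(rnorm(n))
h <- cumsum(rnorm(n))

# Phillips-Perron unit-root test (null hypothesis: the series has a unit root)
pp_p <- PP.test(p)
pp_h <- PP.test(h)

# Correlation of the raw levels can be large purely by chance ...
cor_levels <- cor(p, h)
# ... so compare it with the correlation of the period-to-period changes
cor_diffs <- cor(diff(p), diff(h))
```

A large p-value from PP.test means a unit root cannot be rejected, in which case the correlation of the differenced series is the safer number to report.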
Re: [R] Bug
You have found a bug. It would be best to use dput(test1) to show unambiguously what is in test1, but in the absence of that I will assume that it is as in test1 shown below.

library(sqldf)
test1 <- data.frame(sale_date = as.Date(c("2008-08-01", "2031-01-09",
    "1990-01-03", "2007-02-03", "1997-01-03", "2004-02-04")))
sqldf("select max(sale_date) from test1")
#   max(sale_date)
# 1         9864.0

Evidently it is taking the internal numeric representation, storing it in the database as characters, and then taking the maximum of those characters. As the fifth entry starts with 9, it is the maximum when sorted alphabetically:

as.numeric(test1[[1]])
# [1] 14092 22288  7307 13547  9864 12452

I will have to investigate whether the problem is in sqldf or the underlying software. In the meantime, if you represent the Date data as character you should be OK:

test2 <- transform(test1, sale_date = as.character(sale_date))
sqldf("select max(sale_date) from test2")
#   max(sale_date)
# 1     2031-01-09

packageDescription("sqldf")$Version
# [1] "0-1.7"
R.version.string
# [1] "R version 2.9.2 Patched (2009-09-08 r49647)"

Please provide the output of dput(test1) so that we know unambiguously what your data looks like.

On Thu, Sep 24, 2009 at 9:07 AM, dhanasekaran dhana...@gmail.com wrote:

> The data looks like
>
> 2008-08-01
> 2031-01-09
> 1990-01-03
> 2007-02-03
> 1997-01-03
> 2004-02-04
>
> Thanks.
>
> On Thu, Sep 24, 2009 at 5:20 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote:
>
>> Please read and follow the last line to every message on r-help.
>>
>> On Thu, Sep 24, 2009 at 5:32 AM, dhansekaran dhana...@gmail.com wrote:
>>
>>> Hello R users,
>>> I tried to get the maximum sale date from my data frame using sqldf in R. The first time I executed the following code
>>>
>>> sqldf("select max(sale_date) from test1")
>>>
>>> I got the result 9997.0, BUT when I ran the same code a second time, the result was 2031-04-09 (this is the correct one!). Why did this happen?
>>>
>>> thanks.
>
> --
> Best
> Dhanasekaran
>
> "Without trust, words become the hollow sound of a wooden gong. With trust, words become life itself."
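The mechanism described in this reply can be reproduced in base R without sqldf. The sketch below only mimics the suspected behaviour (numbers stored and compared as text); it is not a statement about what sqldf or the database actually does internally.

```r
sale_date <- as.Date(c("2008-08-01", "2031-01-09", "1990-01-03",
                       "2007-02-03", "1997-01-03", "2004-02-04"))

# Internal representation of a Date: days since 1970-01-01
days <- as.numeric(sale_date)

# Comparing those numbers as text puts "9864" last, because "9" > "2" > "1"
max_as_text <- max(as.character(days))
as.Date(as.numeric(max_as_text), origin = "1970-01-01")  # 1997-01-03

# ISO-formatted date strings sort in date order, hence the character workaround
max_iso <- max(as.character(sale_date))                  # "2031-01-09"
```

The workaround works because the yyyy-mm-dd format puts the most significant part first, so alphabetical order and chronological order coincide.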
[R] Modelling
Dear R-users,

Suppose I have the following sample of data:

0 1 2 4 3
1 2 1 3 1
1 3 3 4 1
0 1 2 1 2
1 4 1 4 2
1 2 2 1 1

The first variable is the response variable, where 0 is defective and 1 is normal. The other four are factors (x1, x2, x3, x4) that influence the outcome. I want to fit a binomial model in R. I also want to order the factors based on their degree of influence on the outcome. How do I do this in R?

Thanks in advance,
Ashta
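One possible approach, sketched on simulated data because six rows are too few to fit four predictors reliably. The data-generating process below (x1 strong, x3 weak, x2 and x4 irrelevant) is an assumption made purely for illustration. The sketch fits a logistic regression with glm() and ranks the predictors by the likelihood-ratio statistic from drop1(); this is only one of several ways to define "degree of influence".

```r
set.seed(123)
n <- 500
# Simulated predictors on a 1-4 scale, like the sample rows (illustrative only)
d <- data.frame(x1 = sample(1:4, n, replace = TRUE),
                x2 = sample(1:4, n, replace = TRUE),
                x3 = sample(1:4, n, replace = TRUE),
                x4 = sample(1:4, n, replace = TRUE))
# Assumed process: x1 matters most, x3 a little, x2 and x4 not at all
d$y <- rbinom(n, 1, plogis(-3 + 1.2 * d$x1 + 0.3 * d$x3))

fit <- glm(y ~ x1 + x2 + x3 + x4, family = binomial, data = d)

# Likelihood-ratio test for dropping each term; a larger LRT means more influence
lrt <- drop1(fit, test = "Chisq")
ranking <- rownames(lrt)[-1][order(lrt$LRT[-1], decreasing = TRUE)]
print(ranking)
```

The predictors are treated as numeric here. If x1 to x4 are categorical, wrapping them in factor() in the formula makes drop1() test each factor as a whole.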
Re: [R] Problem with xtabs(), exclude=NULL, and counting NA's
wtf <- factor(x, levels = c(levels(wtf), NA), exclude = NULL)
xtabs(~ wtf, exclude = NULL, na.action = na.pass)

Also see addNA.

Hadley

-- http://had.co.nz/
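A short base R illustration of the addNA() route mentioned in this reply, on a made-up vector. It shows NA being turned into an explicit factor level so that it is counted like any other value.

```r
x <- c("a", "b", NA, "a")

# addNA() turns NA into an explicit factor level, so it is tabulated like any other
f <- addNA(factor(x))
tab <- table(f)
print(tab)

# For a plain vector, table() can also count NAs directly
tab2 <- table(x, useNA = "ifany")
```

Both tables report two "a", one "b" and one NA.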
Re: [R] pipe data from plot(). was: ROCR.plot methods, cross validation averaging
David,

Thank you for your reply. Yes, I can access the y-values slot with

p...@y-values

but note that in the cross-validation example (ROCR.xval), the plot function averages across the list of ten vectors in the y-values slot. I might be able to create a function to average across these ten vectors, but since the plot function already does it for me, I thought it most efficient to get the values from the function. The compounding factor is that the averaging needs to incorporate some kind of complex (to me at least) equalization based on the third slot (alpha.values). I don't know how to average vectors (especially uneven-length vectors) that align using the alpha values (suggestions here welcome!). Again, the plot function does this for me... if I could just get those values.

Tobias,

Your suggestion to change the plot.performance function is a good one. I'll see if I can get in there and tweak it.

Thanks to both of you for the help.

Tim
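For the question of averaging uneven-length vectors aligned on their threshold values, here is a rough base R sketch. It is not ROCR's own algorithm: avg_by_threshold is a hypothetical helper that interpolates each run onto a common grid of finite thresholds with approx() and then averages across runs. Holding the end values constant outside a run's own threshold range (rule = 2) and returning NA for runs with fewer than two finite thresholds are both assumptions of this sketch.

```r
# Average several curves at common threshold (alpha) values.
# x_list, y_list, alpha_list: lists with one numeric vector per run, mirroring
# the x.values / y.values / alpha.values slots (lengths may differ between runs).
avg_by_threshold <- function(x_list, y_list, alpha_list, n_grid = 50) {
  # Build the grid from finite thresholds only (a ROC curve often starts at Inf)
  finite_alpha <- unlist(alpha_list)
  finite_alpha <- finite_alpha[is.finite(finite_alpha)]
  grid <- seq(min(finite_alpha), max(finite_alpha), length.out = n_grid)

  interp <- function(v, a) {
    ok <- is.finite(a) & is.finite(v)
    # A run with fewer than two usable points cannot be interpolated
    if (sum(ok) < 2) return(rep(NA_real_, length(grid)))
    # rule = 2: hold the end value outside this run's own threshold range
    approx(a[ok], v[ok], xout = grid, rule = 2, ties = mean)$y
  }
  x_mat <- mapply(interp, x_list, alpha_list)  # n_grid rows, one column per run
  y_mat <- mapply(interp, y_list, alpha_list)

  data.frame(alpha = grid,
             x = rowMeans(x_mat, na.rm = TRUE),
             y = rowMeans(y_mat, na.rm = TRUE))
}

# Two made-up runs of different length
xs     <- list(c(0, 0.5, 1), c(0, 0.2, 0.6, 1))
ys     <- list(c(0, 0.8, 1), c(0, 0.5, 0.9, 1))
alphas <- list(c(Inf, 0.6, 0.1), c(Inf, 0.9, 0.5, 0.1))
avg <- avg_by_threshold(xs, ys, alphas, n_grid = 5)
```

With a real performance object, the three arguments would presumably be its x.values, y.values and alpha.values slots. The averaged data frame can then be searched for a cutoff, for example the row that maximizes y - x.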
Re: [R] scaled Schoenfeld residuals
On Wed, 23 Sep 2009, Greg Dropkin wrote:

> hi
> sorry if this has been discussed before, but I'm wondering why the scaled Schoenfeld residuals do not follow the defining formula for obtaining them from the ordinary Schoenfeld residuals, but are instead offset by the estimated parameter values.

Because their purpose in life is to be smoothed against time to get an estimate of the parameter as a function of time (plot.cox.zph).

-thomas

Thomas Lumley
Assoc. Professor, Biostatistics
tlum...@u.washington.edu
University of Washington, Seattle
Re: [R] xtable - print - suppress output
On Mon, 21 Sep 2009, David Winsemius wrote:

> On Sep 21, 2009, at 5:52 PM, Martin Batholdy wrote:
>
>> I use xtable to convert data.frames to html tables. But when I use the print command I always get the whole output printed, even if I just want to save the html table into a variable:
>>
>> table <- print(xtable(CERAT), type = "html")
>>
>> How can I suppress that output is printed?
>
> Perhaps by diverting it somewhere else? (after the example in xtable's help page)
>
> capture.output(print(tli.table, type = "html"), file = "HTout.html")
>
> R is not an HTML editor, so it would seem less than intuitive to send it to a character variable. It would not work to assign the value of capture.output since that is an invisible NULL.

If that were true, capture.output() would be pretty useless. The returned value is NULL if the file= argument is specified, otherwise it is the captured output.

-thomas

Thomas Lumley
Assoc. Professor, Biostatistics
tlum...@u.washington.edu
University of Washington, Seattle
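The two behaviours of capture.output() described in this reply can be checked with a plain data frame, so the sketch below does not need xtable. The data frame and the temporary file are made up for illustration.

```r
df <- data.frame(a = 1:2, b = c("x", "y"))

# Without file=, capture.output() returns the printed lines as a character vector
lines_out <- capture.output(print(df))

# With file=, the output goes to the file and the return value is NULL (invisibly)
tmp <- tempfile(fileext = ".txt")
res <- capture.output(print(df), file = tmp)
file_lines <- readLines(tmp)
```

Applied to the original question, something like html <- capture.output(print(xtable(CERAT), type = "html")) should leave the HTML lines in html without printing them to the console, assuming xtable is loaded and CERAT exists.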
Re: [R] pipe data from plot(). was: ROCR.plot methods, cross validation averaging
Yes, that's exactly what I am after. Thank you for clarifying my problem for me! I'll try to dive into the plot.performance function.

Best,
Tim

Tobias Sing tobias.s...@gmail.com 9/24/2009 9:57 AM

> Tim,
>
> if I understand correctly, you are trying to get the numerical values of averaged cross-validation curves. Unfortunately the plot function of ROCR does not return anything in the current version (it's a good suggestion to change this).
>
> If you want a quick fix, you could change the plot.performance function of ROCR to return back the values you wanted.
>
> Kind regards,
> Tobias

On Thu, Sep 24, 2009 at 3:09 PM, Tim Howard tghow...@gw.dec.state.ny.us wrote:

> All,
>
> I'm trying again with a slightly more generic version of my first question. I can extract the plotted values from hist(), boxplot(), and even plot.randomForest(). Observe:
>
>   # get some data
>   dat <- rnorm(100)
>   # grab histogram data
>   hdat <- hist(dat)
>   hdat  # provides details of the hist output
>   # grab boxplot data
>   bdat <- boxplot(dat)
>   bdat  # provides details of the boxplot output
>   # the same works for randomForest
>   library(randomForest)
>   data(mtcars)
>   RFdat <- plot(randomForest(mpg ~ ., mtcars, keep.forest=FALSE, ntree=100), log="y")
>   RFdat
>   ## But, I can't use this method in ROCR
>   library(ROCR)
>   data(ROCR.xval)
>   RCdat <- plot(perf, avg="threshold")
>   RCdat
>   ## output: NULL
>
> Does anyone have any tricks for piping or extracting these data? Or, perhaps for steering me in another direction?
>
> Thanks,
> Tim
>
> From: Tim Howard tghow...@gw.dec.state.ny.us
> Subject: [R] ROCR.plot methods, cross validation averaging
> To: osan...@mpi-sb.mpg.de, tobias.s...@mpi-sb.mpg.de, r-help@r-project.org
> Message-ID: 4aba1079.6d16.00d...@gw.dec.state.ny.us
> Content-Type: text/plain; charset=US-ASCII
>
> Dear R-help and ROCR developers (Tobias Sing and Oliver Sander) -
>
> I think my first question is generic and could apply to many methods, which is why I'm directing this initially to R-help as well as Tobias and Oliver.
>
> Question 1. The plot function in ROCR will average your cross validation data if asked. I'd like to use that averaged data to find a best cutoff but I can't figure out how to grab the actual data that get plotted. A simple redirect of the plot (such as test <- plot(mydata)) doesn't do it.
>
> Question 2. I am asking ROCR to average lists with varying lengths for each list entry. See my example below. None of the ROCR examples have data structured in this manner. Can anyone speak to whether the averaging methods in ROCR allow for this? If I can't easily grab the data as desired from Question 1, can someone help me figure out how to average the lists, by threshold, similarly?
>
> Question 3. If my cross validation data happen to have a list entry whose length = 2, ROCR errors out. Please see the second part of my example. Any suggestions?
>
>   # reproducible examples exemplifying my questions
>   ## part one ##
>   library(ROCR)
>   data(ROCR.xval)
>   # set up data so it looks more like my real data
>   sampSize <- c(4, 55, 20, 75, 350, 250, 6, 120, 200, 25)
>   testSet <- ROCR.xval
>   # do the extraction
>   for (i in 1:length(ROCR.xval[[1]])){
>     y <- sample(c(1:350),sampSize[i])
>     testSet$predictions[[i]] <- ROCR.xval$predictions[[i]][y]
>     testSet$labels[[i]] <- ROCR.xval$labels[[i]][y]
>   }
>   # now massage the data using ROCR, set up for a ROC plot
>   # if it errors out here, run the above sample again.
>   pred <- prediction(testSet$predictions, testSet$labels)
>   perf <- performance(pred,"tpr","fpr")
>   # create the ROC plot, averaging by cutoff value
>   plot(perf, avg="threshold")
>   # check out the structure of the data
>   str(perf)
>   # note the ragged edges of the list and that I assume averaging
>   # whether it be vertical, horizontal, or threshold, somehow
>   # accounts for this?
>
>   ## part two ##
>   # add a list entry with only two values
>   p...@x.values[[1]] <- c(0,1)
>   p...@y.values[[1]] <- c(0,1)
>   p...@alpha.values[[1]] <- c(Inf,0)
>   plot(perf, avg="threshold")
>   ## output results in an error with this message
>   # Error in if (from == to) rep.int(from, length.out) else as.vector(c(from, :
>   # missing value where TRUE/FALSE needed
>
> Thanks in advance for your help
> Tim Howard
> New York Natural Heritage Program
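While plot() in ROCR returns nothing, threshold averaging can also be computed by hand. The following is only a minimal base-R sketch under stated assumptions: the folds are simulated, the helper rates_at() and the cutoff grid are hypothetical, and this is not ROCR's internal algorithm. It evaluates TPR and FPR on one common grid of cutoffs in every fold and averages across folds, which sidesteps the unequal fold lengths.

```r
# Threshold-averaged ROC values computed by hand (base R only).
# NOTE: illustrative sketch; this is not ROCR's own averaging code.
set.seed(1)

# Simulated cross-validation folds of unequal size (hypothetical data).
fold_sizes <- c(40, 55, 20, 75)
folds <- lapply(fold_sizes, function(n) {
  labels <- rbinom(n, 1, 0.5)
  scores <- plogis(rnorm(n, mean = 1.5 * labels - 0.75))
  list(scores = scores, labels = labels)
})

# TPR and FPR of one fold at a single cutoff (hypothetical helper).
rates_at <- function(fold, cutoff) {
  pred_pos <- fold$scores >= cutoff
  c(tpr = sum(pred_pos & fold$labels == 1) / sum(fold$labels == 1),
    fpr = sum(pred_pos & fold$labels == 0) / sum(fold$labels == 0))
}

# Evaluate every fold on the same cutoffs, then average across folds.
cutoffs <- seq(0, 1, by = 0.05)
avg_roc <- t(sapply(cutoffs, function(cut) {
  rowMeans(sapply(folds, rates_at, cutoff = cut))
}))
avg_roc <- data.frame(cutoff = cutoffs, avg_roc)

# Example use: the cutoff that maximises tpr - fpr on the averaged curve.
best <- avg_roc[which.max(avg_roc$tpr - avg_roc$fpr), ]
print(best)
```

Because the result is an ordinary data frame, picking a "best" cutoff is just a matter of indexing; the tpr - fpr criterion used here is one possible choice, not one named in the thread.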
Re: [R] Downloading data from from internet
Bogaso wrote:

> Hi all,
>
> I want to download data from those two different sources, directly into R:
>
> http://www.rateinflation.com/consumer-price-index/usa-cpi.php
> http://eaindustry.nic.in/asp2/list_d.asp
>
> First one is CPI of US and 2nd one is WPI of India. Can anyone please give any clue how to download them directly into R. I want to make them zoo object for further analysis.
>
> Thanks,

The following site did not load for me:

  http://eaindustry.nic.in/asp2/list_d.asp

But I was able to extract the table from the US CPI site using Duncan Temple Lang's XML package:

  library(XML)

First, download the website into R:

  html.raw <- readLines( 'http://www.rateinflation.com/consumer-price-index/usa-cpi.php' )

Then, convert to an HTML object using the XML package:

  html.data <- htmlTreeParse( html.raw, asText = T, useInternalNodes = T )

A quick scan of the page source in the browser reveals that the table you want is encased in a div with a class of dynamicContent -- we will use an XPath specification[1] to retrieve all rows in that table:

  table.html <- getNodeSet( html.data, '//d...@class=dynamicContent]/table/tr' )

Now, the data values can be extracted from the cells in the rows using a little sapply and xpathSApply voodoo:

  table.data <- t( sapply( table.html, function( row ){
    row.data <- xpathSApply( row, './td', xmlValue )
    return( row.data )
  }))

Good luck!

-Charlie

[1]: http://www.w3schools.com/XPath/xpath_syntax.asp

-----
Charlie Sharpsteen
Undergraduate
Environmental Resources Engineering
Humboldt State University
--
View this message in context: http://www.nabble.com/Downloading-data-from-from-internet-tp25568930p25572316.html
Sent from the R help mailing list archive at Nabble.com.
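The row-by-row extraction described above can be tried offline on a small inline page. This is a sketch under assumptions: the HTML string is made up, and the XPath expression is written for that made-up page rather than being a restoration of the one in the message. Only htmlParse(), getNodeSet(), xpathSApply() and xmlValue() from the XML package are used, and the code does nothing if the package is not installed.

```r
# Sketch: pull the data rows out of a table inside a div of a given class.
# The HTML is invented for illustration; it is not the real CPI page.
html.txt <- paste0(
  "<html><body><div class='dynamicContent'><table>",
  "<tr><th>Month</th><th>CPI</th></tr>",
  "<tr><td>Jan</td><td>211.1</td></tr>",
  "<tr><td>Feb</td><td>212.2</td></tr>",
  "</table></div></body></html>")

if (requireNamespace("XML", quietly = TRUE)) {
  library(XML)
  html.data <- htmlParse(html.txt, asText = TRUE)

  # Keep only rows that contain <td> cells, so the header row is skipped.
  rows <- getNodeSet(html.data, "//div[@class='dynamicContent']//tr[td]")

  # One character vector per row, bound together into a matrix.
  table.data <- t(sapply(rows, function(row) xpathSApply(row, "./td", xmlValue)))

  cpi <- data.frame(month = table.data[, 1],
                    cpi   = as.numeric(table.data[, 2]))
  print(cpi)
}
```

Restricting the XPath to tr[td] keeps every row the same length, which is what lets sapply() return a matrix instead of a list.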
[R] graphics mailing list?
Dear all,

Would it make sense to have a separate mailing list (special interest group*) for Grid graphics? (or is there one already?) I don't feel comfortable asking questions about the design of a new grid class in R-help where I'm guessing most people won't be interested.

Of course having yet another mailing list would only make sense if it's to be followed by those people who work with Grid (lattice, vcd, ggplot2, latticeExtra, Rgraphics, etc.). Having read a bit of code from these packages recently, I get the feeling that several people may have been facing similar problems or reinventing the same things.

Just a thought,

Best regards,

baptiste

*: http://www.r-project.org/mail.html
Re: [R] Problem with xtabs(), exclude=NULL, and counting NA's
On Sep 24, 2009, at 10:06 AM, hadley wickham wrote:

>> wtf <- factor(x, levels(c(levels(wtf), NA), exclude=NULL)
>> xtabs (~ wtf, exclude=NULL, na.action=na.pass)
>
> Also see addNA.

That is nice. The addNA function does not exactly jump off the page for the (too) casual reader. In the context of the OP's original problem, these lines of code are illustrative:

  > xtabs(~addNA(wkhp), x, exclude=NULL, na.action=na.pass)
  addNA(wkhp)
    20   30   40   45   60 <NA>
     1    1   10    1    3    4

  > x$wtf <- cut(x$wkhp, breaks=seq(20, 80, by=20) )
  > xtabs( ~ addNA(wkhp), x)
  addNA(wkhp)
    20   30   40   45   60 <NA>
     1    1   10    1    3    4

  > xtabs( ~ addNA(wtf), x)
  addNA(wtf)
  (20,40] (40,60] (60,80]    <NA>
       11       4       0       5

> Hadley
>
> --
> http://had.co.nz/

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
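For anyone without the OP's data frame, the effect of addNA can be reproduced in base R. This is a small sketch: the vector wkhp below is hypothetical, built only so that its counts match the tables shown above, and table() is used in place of xtabs() to keep the example minimal.

```r
# Hypothetical data chosen to reproduce the counts in the tables above.
wkhp <- c(20, 30, rep(40, 10), 45, rep(60, 3), rep(NA, 4))

# addNA() turns NA into an explicit factor level, so it gets counted.
tab_raw <- table(addNA(wkhp))
print(tab_raw)

# Same idea after binning: 20 falls outside (20,40] and becomes NA too.
wtf <- cut(wkhp, breaks = seq(20, 80, by = 20))
tab_cut <- table(addNA(wtf))
print(tab_cut)
```

The value 20 moves into the NA cell after cut() because the default intervals are open on the left, which is why the NA count rises from 4 to 5.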
Re: [R] graphics mailing list?
On 9/24/2009 10:34 AM, baptiste.auguie wrote:

> Dear all,
>
> Would it make sense to have a separate mailing list (special interest group*) for Grid graphics? (or is there one already?) I don't feel comfortable asking questions about the design of a new grid class in R-help where I'm guessing most people won't be interested.

That sounds more like an R-devel question.

Duncan Murdoch

> Of course having yet another mailing list would only make sense if it's to be followed by those people who work with Grid (lattice, vcd, ggplot2, latticeExtra, Rgraphics, etc.). Having read a bit of code from these packages recently, I get the feeling that several people may have been facing similar problems or reinventing the same things.
>
> Just a thought,
>
> Best regards,
>
> baptiste
>
> *: http://www.r-project.org/mail.html
Re: [R] graphics mailing list?
Why just grid? Why not a list for all kinds of graphics?

On 09/24/2009 04:34 PM, baptiste.auguie wrote:

> Dear all,
>
> Would it make sense to have a separate mailing list (special interest group*) for Grid graphics? (or is there one already?) I don't feel comfortable asking questions about the design of a new grid class in R-help where I'm guessing most people won't be interested.
>
> Of course having yet another mailing list would only make sense if it's to be followed by those people who work with Grid (lattice, vcd, ggplot2, latticeExtra, Rgraphics, etc.). Having read a bit of code from these packages recently, I get the feeling that several people may have been facing similar problems or reinventing the same things.
>
> Just a thought,
>
> Best regards,
>
> baptiste
>
> *: http://www.r-project.org/mail.html

--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://tr.im/ztCu : RGG #158:161: examples of package IDPmisc
|- http://tr.im/yw8E : New R package : sos
`- http://tr.im/y8y0 : search the graph gallery from R
Re: [R] How to show number in the %f format?
There is also the formatC function, whose description is "Formatting numbers individually and flexibly, using 'C' style format specifications."

-Don

At 2:28 AM -0400 9/24/09, David Winsemius wrote:

> On Sep 23, 2009, at 6:42 PM, Peng Yu wrote:
>
>> On Wed, Sep 23, 2009 at 5:16 PM, David Winsemius dwinsem...@comcast.net wrote:
>>>
>>> On Sep 23, 2009, at 5:58 PM, Peng Yu wrote:
>>>
>>>> Hi,
>>>>
>>>> I have the following matrix, which is printed %e format (in C's way). I am wondering how make it be printed in %f format (in C's way)?
>>>
>>> ??printf   # scroll down to base package listings, the C function
>>> ?sprintf   # the s/r function
>>
>> I tried the following command. The column names are missing and the command is a little complicated. Is there any better solution?
>>
>> t(apply(significant_analysis_results[,7:8],1,function(x){sprintf("%.7f",x)}))
>
> Why not apply to the column index? ... rather than to the row and then transposing.
>
>>           [,1]      [,2]
>> Nab2      0.019     0.000
>> Rasal1    0.248     0.105
>> Ccndbp1   0.001     0.0002269
>> Svep1     0.000     0.000
>> Ppara     0.0008219 0.000
>> Pros1     0.009     0.000
>> Papss2    0.000     0.002
>> Hdac9     0.000     0.000
>> Adcyap1r1 0.000     0.000
>> Robo1     0.000     0.000
>> Sema3a    0.000     0.000
>> Rab9b     0.110     0.011
>> Tgfb3     0.000     0.000
>> Slc9a9    0.0074608 0.000
>> Creb5     0.003     0.000
>> Ccnd1     0.0007869 0.001
>> Pafah1b3  0.000     0.068
>> Tiam2     0.000     0.000
>> Etv5      0.000     0.000
>> Hcrtr2    0.000     0.166
>>
>> Regards,
>> Peng
>>
>>>> significant_analysis_results[,7:8]
>>>>           pval(ki-wt)     pval(ko-wt)
>>>> Nab2      1.913348979e-06 2.731944670e-09
>>>> Rasal1    2.482254110e-05 1.054711084e-05
>>>> Ccndbp1   6.307674516e-08 2.268947934e-04
>>>> Svep1     0.0e+00         1.564526286e-12
>>>> Ppara     8.218961690e-04 2.802202914e-13
>>>> Pros1     8.787052919e-07 0.0e+00
>>>> Papss2    0.0e+00         2.190819073e-07
>>>> Hdac9     0.0e+00         8.881784197e-16
>>>> Adcyap1r1 2.085731587e-11 1.998401444e-15
>>>> Robo1     0.0e+00         0.0e+00
>>>> Sema3a    4.903322193e-11 0.0e+00
>>>> Rab9b     1.099629676e-05 1.116694168e-06
>>>> Tgfb3     0.0e+00         0.0e+00
>>>> Slc9a9    7.460784795e-03 1.552167950e-09
>>>> Creb5     2.959174867e-07 8.973577437e-11
>>>> Ccnd1     7.868573521e-04 1.460805570e-07
>>>> Pafah1b3  1.576464070e-08 6.757446065e-06
>>>> Tiam2     0.0e+00         0.0e+00
>>>> Etv5      2.279731959e-12 0.0e+00
>>>> Hcrtr2    1.258646520e-10 1.661509722e-05
>>>>
>>>> str(significant_analysis_results[,7:8])
>>>>  num [1:20, 1:2] 1.91e-06 2.48e-05 6.31e-08 0.00 8.22e-04 ...
>>>>  - attr(*, "dimnames")=List of 2
>>>>   ..$ : chr [1:20] "Nab2" "Rasal1" "Ccndbp1" "Svep1" ...
>>>>   ..$ : chr [1:2] "pval(ki-wt)" "pval(ko-wt)"
>
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT

--
Don MacQueen
Environmental Protection Department
Lawrence Livermore National Laboratory
Livermore, CA, USA
925-423-1062
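A short sketch of the two suggestions in this thread, formatC and sprintf, applied to a whole matrix so that row and column names survive. The matrix m holds three rows copied from the p-value table above; the object names m, out_formatC and out_sprintf are made up for the example.

```r
# Three rows taken from the p-value matrix quoted above.
m <- matrix(c(1.913348979e-06, 8.218961690e-04, 7.460784795e-03,
              2.731944670e-09, 2.802202914e-13, 1.552167950e-09),
            ncol = 2,
            dimnames = list(c("Nab2", "Ppara", "Slc9a9"),
                            c("pval(ki-wt)", "pval(ko-wt)")))

# formatC() keeps the dim and dimnames of its input.
out_formatC <- formatC(m, format = "f", digits = 7)

# sprintf() drops them, but assigning into m2[] puts the strings back
# into a copy that still carries the original dimnames.
out_sprintf <- m
out_sprintf[] <- sprintf("%.7f", m)

print(out_formatC, quote = FALSE)
```

Either form avoids the apply()/t() round trip, so the missing column names reported above do not arise.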
Re: [R] Statistical analysis
Since today's ground water may be influenced by yesterday's rainfall, you may want to look at the dynlm package and possibly lag.plot and the zoo package.

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Chris Li
> Sent: Wednesday, September 23, 2009 5:37 PM
> To: r-help@r-project.org
> Subject: [R] Statistical analysis
>
> Hi all,
>
> I have got two datasets, one of them is rainfall data and the other one is groundwater level data. I would like to see whether there is a correlation between these two datasets and if there is, to what extent they are correlated.
>
> My stats background is limited, therefore any advice on which command I should use in R would be greatly appreciated.
>
> Thanks in advance.
>
> Chris
> --
> View this message in context: http://www.nabble.com/Statistical-analysis-tp25531331p25531331.html
> Sent from the R help mailing list archive at Nabble.com.
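The point about lagged influence can be illustrated with base R alone. This is a sketch on simulated data: the series rain and gw, the one-day delay and all coefficients are assumptions made for the example, and plain cor(), ccf() and lm() stand in for the dynlm and zoo tools mentioned above.

```r
set.seed(42)
n <- 200

# Hypothetical daily rainfall, and a groundwater level that responds
# to the previous day's rain plus a little noise.
rain      <- rgamma(n, shape = 0.5, scale = 10)
rain_lag1 <- c(NA, rain[-n])
gw        <- 5 + 0.08 * rain_lag1 + rnorm(n, sd = 0.1)
d <- data.frame(gw, rain, rain_lag1)

# Same-day correlation versus correlation with yesterday's rain.
cor_same_day <- cor(d$gw, d$rain, use = "complete.obs")
cor_lag1     <- cor(d$gw, d$rain_lag1, use = "complete.obs")

# Cross-correlation over several lags. ccf(x, y) at lag k estimates
# cor(x[t+k], y[t]), so rain leading gw by one day shows up at k = -1.
cc <- ccf(d$rain[-1], d$gw[-1], lag.max = 5, plot = FALSE)
best_lag <- cc$lag[which.max(cc$acf)]

# A regression with the lagged predictor, in place of a dynlm() call.
fit <- lm(gw ~ rain + rain_lag1, data = d)
print(c(same_day = cor_same_day, lag1 = cor_lag1, best_lag = best_lag))
```

With real measurements the same-day correlation alone could look weak even when the relationship is strong, which is why checking a few lags first is worthwhile.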
Re: [R] making R print on screen
On 09/24/2009 05:26 PM, ld7631 wrote:

> Hello!
>
> I am running a for loop. In the loop I am producing some intermediary results and asking R to print it (of the type below). However, I noticed - when the task is complicated and takes a lot of time, R does not print those intermediary results immediately, but prints them in batches - or does not print at all until we are done with the whole calculation.
>
> Is there any way to force R to really print everything it's supposed to be printing as soon as one iteration is over?
>
> Thanks a lot!
>
>   for(i in ) {
>     x <- ...
>     print(x)
>   }

Maybe ?flush.console

--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://tr.im/ztCu : RGG #158:161: examples of package IDPmisc
|- http://tr.im/yw8E : New R package : sos
`- http://tr.im/y8y0 : search the graph gallery from R
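A minimal sketch of how flush.console() would sit in such a loop. The loop body is a stand-in (a square and a short sleep) since the original computation is not shown, and the results vector is added only so the values can be inspected afterwards.

```r
results <- numeric(0)
for (i in 1:3) {
  x <- i^2            # stand-in for the slow computation
  Sys.sleep(0.1)      # pretend this step takes a while
  print(x)
  flush.console()     # ask R to write buffered console output now
  results <- c(results, x)
}
```

On consoles that do not buffer output the call should simply have no visible effect, so leaving it in is harmless.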
Re: [R] making R print on screen
On Sep 24, 2009, at 10:26 AM, ld7631 wrote:

> Hello!
>
> I am running a for loop. In the loop I am producing some intermediary results and asking R to print it (of the type below). However, I noticed - when the task is complicated and takes a lot of time, R does not print those intermediary results immediately, but prints them in batches - or does not print at all until we are done with the whole calculation.
>
> Is there any way to force R to really print everything it's supposed to be printing as soon as one iteration is over?
>
> Thanks a lot!
>
>   for(i in ) {
>     x <- ...
>     print(x)
>   }

Presumably you are on Windows? If so, see:

  http://cran.r-project.org/bin/windows/base/rw-FAQ.html#The-output-to-the-console-seems-to-be-delayed

HTH,

Marc Schwartz
Re: [R] making R print on screen
Thanks a lot, everyone!

On Thu, Sep 24, 2009 at 11:41 AM, Mario Valle mva...@cscs.ch wrote:

> ?flush.console
>
> Ciao!
> mario
>
> ld7631 wrote:
>> Hello!
>>
>> I am running a for loop. In the loop I am producing some intermediary results and asking R to print it (of the type below). However, I noticed - when the task is complicated and takes a lot of time, R does not print those intermediary results immediately, but prints them in batches - or does not print at all until we are done with the whole calculation.
>>
>> Is there any way to force R to really print everything it's supposed to be printing as soon as one iteration is over?
>>
>> Thanks a lot!
>>
>>   for(i in ) {
>>     x <- ...
>>     print(x)
>>   }
>
> --
> Ing. Mario Valle
> Data Analysis and Visualization Group | http://www.cscs.ch/~mvalle
> Swiss National Supercomputing Centre (CSCS) | Tel: +41 (91) 610.82.60
> v. Cantonale Galleria 2, 6928 Manno, Switzerland | Fax: +41 (91) 610.82.82

--
Dimitri Liakhovitski
Ninah.com
dimitri.liakhovit...@ninah.com
Re: [R] Downloading data from from internet
Thanks for explaining this, Charlie.

Just for completeness and to make things a little easier, the XML package has a function named readHTMLTable() and you can call it with a URL and it will attempt to read all the tables in the page.

  tbls = readHTMLTable('http://www.rateinflation.com/consumer-price-index/usa-cpi.php')

yields a list with 10 elements, and the table of interest with the data is the 10th one.

  tbls[[10]]

The function does the XPath voodoo and sapply() work for you and uses some heuristics. There are various controls one can specify and also various methods for working with sub-parts of the HTML document directly.

D.

cls59 wrote:

> Bogaso wrote:
>> Hi all,
>>
>> I want to download data from those two different sources, directly into R:
>>
>> http://www.rateinflation.com/consumer-price-index/usa-cpi.php
>> http://eaindustry.nic.in/asp2/list_d.asp
>>
>> First one is CPI of US and 2nd one is WPI of India. Can anyone please give any clue how to download them directly into R. I want to make them zoo object for further analysis.
>>
>> Thanks,
>
> The following site did not load for me:
>
>   http://eaindustry.nic.in/asp2/list_d.asp
>
> But I was able to extract the table from the US CPI site using Duncan Temple Lang's XML package:
>
>   library(XML)
>
> First, download the website into R:
>
>   html.raw <- readLines( 'http://www.rateinflation.com/consumer-price-index/usa-cpi.php' )
>
> Then, convert to an HTML object using the XML package:
>
>   html.data <- htmlTreeParse( html.raw, asText = T, useInternalNodes = T )
>
> A quick scan of the page source in the browser reveals that the table you want is encased in a div with a class of dynamicContent -- we will use an XPath specification[1] to retrieve all rows in that table:
>
>   table.html <- getNodeSet( html.data, '//d...@class=dynamicContent]/table/tr' )
>
> Now, the data values can be extracted from the cells in the rows using a little sapply and xpathSApply voodoo:
>
>   table.data <- t( sapply( table.html, function( row ){
>     row.data <- xpathSApply( row, './td', xmlValue )
>     return( row.data )
>   }))
>
> Good luck!
>
> -Charlie
>
> [1]: http://www.w3schools.com/XPath/xpath_syntax.asp
>
> -----
> Charlie Sharpsteen
> Undergraduate
> Environmental Resources Engineering
> Humboldt State University
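To see what readHTMLTable() hands back without depending on the live site, it can be pointed at a parsed inline document. This is a sketch under assumptions: the two-table HTML string is invented, so the table of interest is the 2nd element here rather than the 10th, and the code is skipped when the XML package is not installed.

```r
# Invented page with two tables; the second one carries the data.
html.txt <- paste0(
  "<html><body>",
  "<table><tr><td>navigation</td></tr></table>",
  "<table>",
  "<tr><th>Month</th><th>CPI</th></tr>",
  "<tr><td>Jan</td><td>211.1</td></tr>",
  "<tr><td>Feb</td><td>212.2</td></tr>",
  "</table>",
  "</body></html>")

if (requireNamespace("XML", quietly = TRUE)) {
  doc  <- XML::htmlParse(html.txt, asText = TRUE)
  tbls <- XML::readHTMLTable(doc, stringsAsFactors = FALSE)

  n_tables <- length(tbls)      # one list element per <table>
  cpi_tbl  <- tbls[[2]]         # pick the table of interest by position

  # Cells arrive as text, so convert the numeric column explicitly.
  cpi_values <- as.numeric(as.character(cpi_tbl[[2]]))
  print(cpi_tbl)
}
```

Selecting by position is fragile if the page layout changes, so checking length(tbls) and the column contents before converting is a reasonable precaution.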
Re: [R] dotchart to barplots
I decided to use your tip and plot the bars using different shades of grey as follows:

  barplot(t(as.matrix(intersect.data[,2:5])),
          col=c('black','grey40','darkgrey','white'),
          beside = T, horiz = T, legend.text = names(intersect.data)[-1],
          axes=TRUE, border=TRUE, plot.grid=F, cex=2,
          names.arg = intersect.data[,1], cex.axis = 0.7, cex.names = 0.7,
          las=1, xlim=c(0,80), ylim=c(0,75),
          xlab="Number of Features", ylab="Rank interval")
  box()

I was trying to use cex.axis=10 to scale the expansion of the axis. It complains, giving the following error:

  Error in barplot.default(t(as.matrix(intersect.data[, 2:5])), col = c("black", :
    formal argument "cex.axis" matched by multiple actual arguments

Can you help me with the correct usage of the argument in my case?

Thanks ../Murli

From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of Nair, Murlidharan T [mn...@iusb.edu]
Sent: Wednesday, September 23, 2009 5:11 PM
To: Greg Snow; Henrique Dallazuanna
Cc: r-help@r-project.org
Subject: Re: [R] dotchart to barplots

Thanks Greg. I was also thinking about it after I saw my plots.

Cheers../Murli

-----Original Message-----
From: Greg Snow [mailto:greg.s...@imail.org]
Sent: Wednesday, September 23, 2009 3:49 PM
To: Nair, Murlidharan T; Henrique Dallazuanna
Cc: r-help@r-project.org
Subject: RE: [R] dotchart to barplots

The current recommendation is to not put designs/hash lines/pictures/etc. into the bars, but to use a single solid color (gray in your case). Back when a quality graph meant using a pen plotter, hash lines made sense as a way to distinguish between bars, but quality graphics no longer depend on pen plotters (I don't remember the last time I actually saw one) and hash lines can cause what is called the Moire effect or Moire vibrations (there should be an accent on the 'e'), which distorts the effects of the graph and can even cause nausea in the viewer. Other patterns in the bars run the risk of causing other optical illusions and distorting the true content of the graph.

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Nair, Murlidharan T
Sent: Wednesday, September 23, 2009 1:21 PM
To: Henrique Dallazuanna
Cc: r-help@r-project.org
Subject: Re: [R] dotchart to barplots

I had tried names.arg=c(intersect.data[,1]), so that was the problem. That solves part of what I need. Is there a way to rotate how it is written on the y-axis? Also, can I use designs instead of gray scale and make keys for them?

Thanks for chipping in.

Cheers../Murli

-----Original Message-----
From: Henrique Dallazuanna [mailto:www...@gmail.com]
Sent: Wednesday, September 23, 2009 3:09 PM
To: Nair, Murlidharan T
Cc: r-help@r-project.org
Subject: Re: [R] dotchart to barplots

Try this:

  barplot(t(as.matrix(intersect.data[,2:5])), beside = T, horiz = T,
          names.arg = intersect.data[,1], cex.axis = 0.7, cex.names = 0.7)

On Wed, Sep 23, 2009 at 4:01 PM, Nair, Murlidharan T mn...@iusb.edu wrote:

> Hi,
>
> I am trying to plot the following data so that it can be visually represented well. I tried the dotchart but I felt it was too spread out. Then I tried the barplot, which is good enough for me. Is there a way to give the labels for the y-axis as in the dot chart? Also, I feel the grey level is confusing, so are there options for designs within the bars? I cannot use color as the journal wants it in black and white. I also need to specify the key. If someone has done it, I would appreciate your input.
>
> Cheers../Murli
>
>   intersect.data <- structure(list(X = structure(c(1L, 3L, 8L, 9L, 10L,
>     11L, 12L, 13L, 14L, 15L, 2L, 4L, 5L, 6L, 7L), .Label = c("1-100",
>     "1001-1100", "101-200", "1101-1200", "1201-1300", "1301-1400",
>     "1401-1532", "201-300", "301-400", "401-500", "501-600", "601-700",
>     "701-800", "801-900", "901-1000"), class = "factor"),
>     MCM.Cell.vs.MCM.Tumor = c(6L, 7L, 12L, 9L, 13L, 7L, 11L, 4L, 8L,
>     11L, 11L, 12L, 4L, 15L, 28L),
>     Ttest.Tumor.vs.Ttest.Cell = c(4L, 2L, 7L, 9L, 8L, 10L, 4L, 7L, 8L,
>     7L, 5L, 7L, 4L, 5L, 9L),
>     Ttest.Cell.vs.MCM.Cell = c(66L, 22L, 14L, 7L, 11L, 6L, 12L, 7L, 9L,
>     8L, 7L, 9L, 9L, 5L, 20L),
>     Ttest.Tumor.vs.MCM.Tumor = c(31L, 18L, 8L, 12L, 5L, 8L, 5L, 8L, 9L,
>     8L, 10L, 12L, 13L, 8L, 18L)),
>     .Names = c("X", "MCM.Cell.vs.MCM.Tumor", "Ttest.Tumor.vs.Ttest.Cell",
>     "Ttest.Cell.vs.MCM.Cell", "Ttest.Tumor.vs.MCM.Tumor"),
>     class = "data.frame", row.names = c(NA, -15L))
>
>   dotchart(as.matrix(intersect.data[-1]), labels=intersect.data[,1], cex=0.5, gpch=70)
>   barplot(t(as.matrix(intersect.data[,2:5])), beside=T, horiz=T)
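A cut-down sketch of a grouped, horizontal, greyscale barplot with a legend, using the first three rows of the data above. It passes only cex.axis and cex.names for text size and leaves out cex= and plot.grid=; whether those extra arguments are what triggers the reported error is not confirmed in this thread, so treat that as an assumption. The object names counts, shades and mids are made up, and a null PDF device is used so the code runs without a display.

```r
# First three rank intervals from the data frame quoted above.
counts <- data.frame(
  interval = c("1-100", "101-200", "201-300"),
  MCM.Cell.vs.MCM.Tumor     = c(6, 7, 12),
  Ttest.Tumor.vs.Ttest.Cell = c(4, 2, 7),
  Ttest.Cell.vs.MCM.Cell    = c(66, 22, 14),
  Ttest.Tumor.vs.MCM.Tumor  = c(31, 18, 8))

shades <- c("black", "grey40", "grey70", "white")

pdf(NULL)  # null device: nothing is written to disk
mids <- barplot(t(as.matrix(counts[, 2:5])),
                beside = TRUE, horiz = TRUE,
                col = shades,
                names.arg = counts$interval,
                legend.text = names(counts)[-1],
                las = 1,                       # horizontal axis labels
                cex.axis = 0.7, cex.names = 0.7,
                xlab = "Number of Features", ylab = "Rank interval")
box()
invisible(dev.off())
```

barplot() returns the bar midpoints, one row per series and one column per interval, which is handy if labels or error bars need to be added afterwards.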
[R] basic cubic spline smoothing (resending because not sure about pending)
Hello,

I come from a non-statistics background, but R is available to me, and I needed to test an implementation of a smoothing spline that I have written in C++, so I would like to match the results with R (for my unit tests).

I am following Smoothing Splines, D.G. Pollock (available online), where we have a list of points (xi, yi), and the yi points are random such that:

  y_i = f(x_i) + e_i

where e_i is normal with mean 0 and variance sigma_i^2.

There is a smoothing parameter lambda between 0 and 1:
. when lambda is 0, smoothness is all that matters, and the fitting function will be a straight line.
. when lambda is 1, the result is the interpolating spline.
In my case, this parameter is an input.

The resulting function is the spline that minimizes the criterion in (62) in the referenced paper.

I am trying to call smooth.spline in R with parameters that match my problem above. So I tried this sequence of calls in R:

  x <- c(1.,5.,10., 15., 20., 25., 30., 35.)
  y <- c(-999.98099, -1001.61875, -1007.9, -1019.36875, -1036.4, -1059.21875, -1087.9, -1122.36875)
  smooth.spline( x,y, w=NULL, spar=0.5, cv=TRUE, all.knots= TRUE )

I am unsure about spar being the smoothness parameter, about where to put the standard errors of the points, and about the return of the smooth.spline function:

  Smoothing Parameter  spar= 0.5  lambda= 0.006833112
  Equivalent Degrees of Freedom (Df): 3.221101
  Penalized Criterion: 66.27819
  PRESS: 56.13537

Basically, what would make sense to me is the list of points through which the resulting spline passes, or the errors from the initial y_i, or the coefficients of the cubic polynomials within the [x_i, x_i+1] intervals.

Also, how would I plot the smoothed spline and see the progression from straight line to interpolating spline as I change the smoothing parameter?

Any help is appreciated,

Best regards,
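A sketch of how the pieces asked about above can be pulled out of a smooth.spline fit, using the x and y from the message. The standard errors sigma are hypothetical (the message gives none) and enter through w = 1/sigma^2, which is an assumption about how the weights should map to the paper's variances. Note that R's convention runs the other way from the one described above: larger spar (and lambda) means a smoother fit, approaching a straight line, while small values approach interpolation, so matching another implementation exactly may need a rescaling that is not worked out here.

```r
x <- c(1, 5, 10, 15, 20, 25, 30, 35)
y <- c(-999.98099, -1001.61875, -1007.9, -1019.36875,
       -1036.4, -1059.21875, -1087.9, -1122.36875)
sigma <- rep(0.5, length(x))   # hypothetical standard errors

fit <- smooth.spline(x, y, w = 1 / sigma^2, spar = 0.5, all.knots = TRUE)

# Fitted values at the data points and the errors from the original y_i.
fitted_y <- predict(fit, x)$y
resid_y  <- y - fitted_y

# Residual sum of squares for a range of spar values: it grows as the
# fit moves from near-interpolation towards a straight line.
spars <- c(0.2, 0.6, 1.0)
fits  <- lapply(spars, function(s)
  smooth.spline(x, y, w = 1 / sigma^2, spar = s, all.knots = TRUE))
rss <- sapply(fits, function(f) sum((y - predict(f, x)$y)^2))

# Plot the progression on a fine grid (null device, nothing saved).
grid <- seq(min(x), max(x), length.out = 200)
pdf(NULL)
plot(x, y, pch = 16, main = "smooth.spline fits for several spar values")
for (i in seq_along(fits)) lines(predict(fits[[i]], grid), lty = i)
legend("bottomleft", legend = paste("spar =", spars), lty = seq_along(spars))
invisible(dev.off())

print(data.frame(spar = spars, rss = rss))
```

predict() also accepts a deriv argument, which may help when comparing against the piecewise cubic coefficients of another implementation.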
Re: [R] graphics mailing list?
(Sorry about the double post earlier, googlemail is having hiccups today)

2009/9/24 Romain Francois romain.franc...@dbmail.com:

> Why just grid? Why not a list for all kinds of graphics?

I figured that a good share of the traffic on r-help might be considered graphics-related, while I was aiming at discussing less documented areas. But I agree that the distinction shouldn't be made on a particular package or system.

Best,

baptiste

> On 09/24/2009 04:34 PM, baptiste.auguie wrote:
>>
>> Dear all,
>>
>> Would it make sense to have a separate mailing list (special interest group*) for Grid graphics? (or is there one already?) I don't feel comfortable asking questions about the design of a new grid class in R-help where I'm guessing most people won't be interested.
>>
>> Of course having yet another mailing list would only make sense if it's to be followed by those people who work with Grid (lattice, vcd, ggplot2, latticeExtra, Rgraphics, etc.). Having read a bit of code from these packages recently, I get the feeling that several people may have been facing similar problems or reinventing the same things.
>>
>> Just a thought,
>>
>> Best regards,
>>
>> baptiste
>>
>> *: http://www.r-project.org/mail.html
>
> --
> Romain Francois
> Professional R Enthusiast
> +33(0) 6 28 91 30 30
> http://romainfrancois.blog.free.fr
> |- http://tr.im/ztCu : RGG #158:161: examples of package IDPmisc
> |- http://tr.im/yw8E : New R package : sos
> `- http://tr.im/y8y0 : search the graph gallery from R
[R] RODBC problem
Hi,

I'm attempting to use the RODBC package on Windows Vista to import an Excel spreadsheet. The spreadsheet has three worksheets, the last of which is blank. Following an example in Phil Spector's book (p. 34), after creating a connection named con I did the following:

  > con
  RODBC Connection 3
  Details:
    case=nochange
    DBQ=c:\temp\test.xls
    DefaultDir=c:\temp
    Driver={Microsoft Excel Driver (*.xls)}
    DriverId=790
    MaxBufferSize=2048
    PageTimeout=5

  > tbls <- sqlTables(con)
  > tbls
         TABLE_CAT TABLE_SCHEM TABLE_NAME   TABLE_TYPE REMARKS
  1 c:\\temp\\test          NA    Sheet1$ SYSTEM TABLE      NA
  2 c:\\temp\\test          NA    Sheet2$ SYSTEM TABLE      NA
  3 c:\\temp\\test          NA    Sheet3$ SYSTEM TABLE      NA

Everything seems to be fine. Then I did

  > qry <- paste("SELECT * FROM", tbls$TABLE_NAME[1], sep = ' ')
  > qry
  [1] "SELECT * FROM Sheet1$"
  > sqlQuery(con, qry)

and got the error message

  [1] "42000 -3506 [Microsoft][ODBC Excel Driver] Syntax error in FROM clause."
  [2] "[RODBC] ERROR: Could not SQLExecDirect 'SELECT * FROM Sheet1$'"

Any advice as to why and how to fix it? What's the syntax error that I'm just not seeing?

Thanks,

Walt

--
Walter R. Paczkowski, Ph.D.
Data Analytics Corp.
44 Hamilton Lane
Plainsboro, NJ 08536
(V) 609-936-8999
(F) 609-936-3733
dataanalyt...@earthlink.net
www.dataanalyticscorp.com
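One thing worth trying, offered as an assumption rather than a confirmed diagnosis of this error: the Excel ODBC driver generally expects worksheet names ending in $ to be wrapped in square brackets inside SQL. The sketch below only builds the query string, so it runs without RODBC or a spreadsheet; the sheet name is the value of tbls$TABLE_NAME[1] shown above.

```r
# Value of tbls$TABLE_NAME[1] as printed in the message above.
sheet <- "Sheet1$"

# Wrap the sheet name in square brackets before putting it in the query.
qry <- paste0("SELECT * FROM [", sheet, "]")
print(qry)

# With the open connection `con` from the message, this would be run as:
#   sqlQuery(con, qry)
# RODBC's sqlFetch(con, "Sheet1") may be another route to the same sheet.
```

If the bracketed form works, the same paste0() pattern can be applied over all of tbls$TABLE_NAME to read every worksheet.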
[R] basic cubic spline smoothing
Hello, I come from a non statistics background, but R is available to me, and I needed to test an implementation of smoothing spline that I have written in c++, so I would like to match the results with R (for my unit tests) I am following http://www.nabble.com/file/p25569553/SPLINES.PDF SPLINES.PDF where we have a list of points (xi, yi), the yi points are random such that: y_i = f(x_i) + e_i where e_i is normal with mean 0 and variance sigma_i^2 There is a smoothing parameter lambda between 0 and 1. .when lambda is 0, smoothness is all that matters, and the fitting function will be a straight line. .when lambda is 1, the result is the interpolating spline. In my case, this parameter is an input. The resulting function is the spline that minimizes the criteria in (62) in the attached paper. I am trying to call smooth.spline in R with parameters that match my problem above. So I tried this sequence of calls in R: x - c(1.,5.,10., 15., 20., 25., 30., 35.) y - c(-999.98099, -1001.61875, -1007.9, -1019.36875, -1036.4, -1059.21875, -1087.9, -1122.36875) smooth.spline( x,y, w=NULL, spar=0.5, cv=TRUE, all.knots= TRUE ) I am unsure about spar being the smoothness parameter, about where to put the standard errors of the points, and about the return of the smooth.spline function: Smoothing Parameter spar= 0.5 lambda= 0.006833112 Equivalent Degrees of Freedom (Df): 3.221101 Penalized Criterion: 66.27819 PRESS: 56.13537 Basically, what would make sense to me is the list of points by which the resulting spline passes, or the errors from the initial y_i, or the coefficients of the cubic polynomials within the [x_i, x_i+1] intervals. Also, how would I plot the smoothed spline and see the progression from straight line to interpolating spline as I change the smoothing parameter. best regards, -- View this message in context: http://www.nabble.com/basic-cubic-spline-smoothing-tp25569553p25569553.html Sent from the R help mailing list archive at Nabble.com. 
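A minimal sketch of how the questions in the post above might be approached with smooth.spline. The standard errors `sigma` are hypothetical (the post gives none), and weighting by 1/sigma^2 is the usual convention rather than something confirmed in the thread. Note that in R a larger spar (and lambda) means a smoother, more nearly linear fit, which appears to be the reverse of the lambda described in the post, and the two are not on the same scale.

```r
# Data from the post above.
x <- c(1, 5, 10, 15, 20, 25, 30, 35)
y <- c(-999.98099, -1001.61875, -1007.9, -1019.36875,
       -1036.4, -1059.21875, -1087.9, -1122.36875)

# Hypothetical per-point standard errors (assumption, not from the post).
sigma <- rep(1, length(x))

# Weights are conventionally taken as 1 / variance.
fit <- smooth.spline(x, y, w = 1 / sigma^2, spar = 0.5, all.knots = TRUE)

fitted_y <- predict(fit, x)$y   # smoothed values at the original x
resid_y  <- y - fitted_y        # deviations from the observed y

# Overlay fits for several spar values to see the progression
# from near-interpolation (small spar) to near-linear (large spar).
plot(x, y, pch = 19)
grid_x <- seq(min(x), max(x), length.out = 200)
spars <- c(0.2, 0.5, 0.8, 1.0)
for (i in seq_along(spars)) {
  f <- smooth.spline(x, y, w = 1 / sigma^2, spar = spars[i], all.knots = TRUE)
  lines(predict(f, grid_x), col = i)
}
```

The fitted values and residuals give the points the spline passes through and the errors relative to the observed y_i, which may be the most convenient quantities to compare against a C++ implementation.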
[R] lmer() vs. fixed effects regression
Hi,

First, some quick terminology I am using:
Fixed effects = model with unit dummy variables
Random effects = model without unit dummy variables, integrating unit-level variance out of the likelihood

I am confused about the difference between the multilevel modeling framework of lmer() and a fixed effects model with unit dummy variables. Say I had the following model with individual-level variable x, estimated with lmer():

    model1 <- lmer(y ~ x + (1|unit))

How is model1 different from this?

    model2 <- lm(y ~ x + factor(unit))

I was under the impression that the lmer() function was a random effects estimator as I have defined above. But if you use the command ranef(model1), R returns unit-specific deviations from the intercept. This seems to be more in line with the fixed effects estimator, which returns intercept estimates for each unit.

Question 2: Why is it that I can add unit-level predictors to lmer(), which I cannot do in a standard fixed effects model with unit dummies?

Thank you.
[R] more strange behavior of Revolution R 1.3.0
It runs more than twice as slowly using 8 cores as using a single core when inverting a large matrix. Tested on an 8-core Windows XP 64 machine.

    > n = 1000
    > n.simu = 100
    > func1 = function()
    + {
    +   x = rnorm(n*n)
    +   dim(x) = c(n,n)
    +   y = solve(x)
    + }
    > setMKLthreads(1)
    > system.time(for(i in 1:n.simu) func1())
       user  system elapsed
      69.48    2.42   71.91
    > setMKLthreads(8)
    > system.time(for(i in 1:n.simu) func1())
       user  system elapsed
     179.06   17.90   41.70

Jason Liao
Re: [R] scaled Schoenfeld residuals
hi

thanks. I see that cox.zph is plotting and smoothing the scaled Schoenfeld residuals as generated by R, but since the term is already in the literature with a formula, maybe the help should clarify the offset. I found it confusing anyway.

thanks for help
greg

Thomas Lumley (tlumley at u.washington.edu) wrote on Thu Sep 24 16:18:17 CEST 2009:

> On Wed, 23 Sep 2009, Greg Dropkin wrote:
>
>> hi
>> sorry if this has been discussed before, but I'm wondering why the scaled Schoenfeld residuals do not follow the defining formula for obtaining them from the ordinary Schoenfeld residuals, but are instead offset by the estimated parameter values.
>
> Because their purpose in life is to be smoothed against time to get an estimate of the parameter as a function of time (plot.cox.zph).
>
> -thomas
>
> Thomas Lumley Assoc. Professor, Biostatistics tlumley at u.washington.edu University of Washington, Seattle
[R] Access to conditioning variables (lattice)
[using R version 2.8.1 (2008-12-22)]

Hello,

I'm trying to access the conditioning variables of an xyplot within a 'panel' function but I have not been able to figure out how to do so. Here is a simple example that describes what I wish to do (the problem lies with the commented line):

    dataset <- data.frame(x = c(1,2), y = c(4,5), Type = factor(c("a","b")))
    xyplot(y ~ x | Type, dataset,
           panel = function(...) {
             panel.xyplot(...)
             # do_something_with(conditioning_variables[which.packet()])
           })

The problem I am facing is that I do not know how to generically access the conditioning variables within the panel function. In this simple case, I can achieve what I want to do with the following call:

    do_something_with(Type[which.packet()])

but that requires the panel function call to have prior knowledge of the object used as the conditioning variable, which is not flexible enough for my needs.

Thank you,
Martin D. Lepage
[R] multinormial runs tests?
Dear R users,

I would like to test the randomness in a series of N values (N>=2). I know that runs.test works for a dichotomous factor only:

    x <- rep(c(1,2), 50)
    runs.test(factor(x))

However, it doesn't work for series that can take any of N values (N>2):

    x <- rep(c(1,2,5,4), 50)
    runs.test(factor(x))
    Error in runs.test(factor(x)) : x does not contain dichotomous data

Is there an R function that does a multinomial runs test?

Thank you very much,
sincerely,
Julia
[R] how to make a function recognize the name of an object/vector given as argument
Dear guRus,

I'd like to learn how to make a function recognize the name of an object/vector given as argument. If I have:

    testFun <- function(x,y) plot(x,y, main=paste("plot of",names(x),"and",names(y)))   # this is just a simple example ...
    a1 <- 5:8
    b1 <- 9:6
    testFun(a1,b1)

This returns the plot, but not the names of the objects/vectors given as arguments. Since 'names()' refers to the elements INSIDE the object/vector, I don't get what I'm looking for. In fact, I (also) would like to know that a1 and b1 were actually given as arguments to my function. As in the example, this could be useful for (sub-)titles of graphs etc. Is there a way to get this kind of information?

For completeness:

    > sessionInfo()
    R version 2.9.1 (2009-06-26)
    i386-pc-mingw32
    locale:
    LC_COLLATE=French_France.1252;LC_CTYPE=French_France.1252;LC_MONETARY=French_France.1252;LC_NUMERIC=C;LC_TIME=French_France.1252
    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

Thanks in advance,
Wolfgang

Wolfgang Raffelsberger, PhD
Laboratoire de BioInformatique et Génomique Intégratives
CNRS UMR7104, IGBMC, 1 rue Laurent Fries, 67404 Illkirch Strasbourg, France
Tel (+33) 388 65 3300 Fax (+33) 388 65 3276
wolfgang.raffelsberger (at) igbmc.fr
Re: [R] how to make a function recognize the name of an object/vector given as argument
Try this,

    testFun <- function(x,y) plot(x,y, main=paste("plot of",deparse(substitute(x)),"and", deparse(substitute(y))))
    a1 <- 5:8
    b1 <- 9:6
    testFun(a1,b1)

    ?deparse

HTH,
baptiste

2009/9/24 Wolfgang Raffelsberger wr...@igbmc.fr:
> Dear guRus, I'd like to learn how to make a function recognize the name of an object/vector given as argument
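The suggestion above can be sketched as a self-contained script. Returning the title invisibly is an addition made here only so that the result can be inspected; it is not part of the original reply.

```r
# Build the plot title from the expressions supplied as arguments.
testFun <- function(x, y) {
  # substitute() captures the unevaluated argument; deparse() turns it into text.
  main <- paste("plot of", deparse(substitute(x)), "and", deparse(substitute(y)))
  plot(x, y, main = main)
  invisible(main)   # returned only for inspection (assumption, not in the thread)
}

a1 <- 5:8
b1 <- 9:6
title_used <- testFun(a1, b1)
```

Because substitute() looks at the call to the function in which it is evaluated, it has to be used inside the function that actually receives the arguments, not in a helper called from it.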
Re: [R] more strange behavior of Revolution R 1.3.0
Please discuss with REvolution support. Most of us do not have a version of REvolution R.

Uwe Ligges

Jason Liao wrote:
> It runs more than twice as slowly using 8 core than using a single core in inverting large matrix. Tested on 8 core Windows XP 64 machine.
Re: [R] Problem with xtabs(), exclude=NULL, and counting NA's
>> Also see addNA.
> Works great. Sometimes R drives me crazy, but Hadley, you make it much easier for me

That is nice. The addNA function does not exactly jump off the page for the (too) casual reader. In the context of the OP's original problem, these lines of code are illustrative. Here is an example from the actual dataset:

    > xtabs(pwgtp ~ addNA(cut(pums.mex$JWMNP, c(0, 19, 65, Inf), include.lowest=TRUE)))
    addNA(cut(pums.mex$JWMNP, c(0, 19, 65, Inf), include.lowest = TRUE))
      [0,19]  (19,65] (65,Inf]       NA
       39053    46787     2521    61914

    > xtabs(pwgtp ~ (cut(pums.mex$JWMNP, c(0, 19, 65, Inf), include.lowest=TRUE)))
    cut(pums.mex$JWMNP, c(0, 19, 65, Inf), include.lowest = TRUE)
      [0,19]  (19,65] (65,Inf]
       39053    46787     2521

Thanks to all!
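A small self-contained sketch of the same idea, using made-up data in place of pums.mex (the data frame `d` and its values are hypothetical; only the column names follow the thread):

```r
# Hypothetical stand-in for pums.mex: JWMNP = travel time, pwgtp = person weight.
d <- data.frame(JWMNP = c(5, 30, 70, NA, NA, 12),
                pwgtp = c(10, 20, 30, 40, 50, 60))

band <- cut(d$JWMNP, c(0, 19, 65, Inf), include.lowest = TRUE)

# Rows whose band is NA are dropped from the weighted table.
without_na <- xtabs(pwgtp ~ band, data = d)

# addNA() turns NA into an explicit factor level, so those weights are counted.
with_na <- xtabs(pwgtp ~ addNA(band), data = d)

print(without_na)
print(with_na)
```

The difference between the two totals is the weight carried by the records with a missing JWMNP.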
Re: [R] Post-Hoc tests for Friedman Test?
On Sep 20, 2009, at 9:35 PM, David Winsemius wrote:

> On Sep 20, 2009, at 9:05 PM, j...@terraspark.com wrote:
>
>> Hi there all,
>> This is my first post to the list and I'll first say a few things:
>> - R is great!
>> - The archives of this list have helped me solve all of my problems/questions so far
>> - I only know enough statistics to be dangerous
>>
>> I'm looking for a way to do post-hoc tests for the Friedman test. I have a dataset from a within-subjects design with 5 conditions where some of the dependent variables are ordinal, resulting from (summed) Likert-scaled questionnaire data. From what I've read, I could use a wilcox.test on pairs of conditions and adjust the p level, but is there something in R that does a better job/automates this? I've seen references to the npmc package but that doesn't seem to do what I'm looking for, because it only accepts a data frame with two columns, i.e. there's no way to specify grouping/subject identifiers.
>>
>> Thanks,
>
> There is a worked example in the coin package for using a permutation test to examine differences after a Friedman test. The authors, Hothorn, Hornik, van de Wiel, and Zeileis, call this method the Wilcoxon-Nemenyi-McDonald-Thompson test and cite Hollander & Wolfe (1999), page 295.
> http://finzi.psych.upenn.edu/R/library/coin/html/SymmetryTests.html

A further option just presented itself during a search for an unrelated question: the MTP function in the multtest package has a robust=TRUE set of methods with these equivalencies offered:

    t.onesamp or t.pair:  Wilcoxon signed rank, wilcox.test with y=NULL or paired=TRUE
    t.twosamp.equalvar:   Wilcoxon rank sum or Mann-Whitney, wilcox.test
    f:                    Kruskal-Wallis rank sum, kruskal.test
    f.block:              Friedman rank sum, friedman.test
    f.twoway:             Friedman rank sum, friedman.test

http://finzi.psych.upenn.edu/R/library/multtest/html/MTP.html

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
Re: [R] Access to conditioning variables (lattice)
> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Martin D. Lepage
> Sent: Thursday, September 24, 2009 7:38 AM
> To: r-help@r-project.org
> Subject: [R] Access to conditioning variables (lattice)
>
> The problem I am facing is that I do not know how to generically access the conditioning variables within the panel function. In this simple case, I can achieve what I want to do with the following call:
>     do_something_with(Type[which.packet()])

If your panel function has an argument called 'subscripts' then xyplot will pass it the row numbers of the data argument that correspond to the current panel. E.g.,

    > xyplot(y ~ x | Type, dataset,
    +        panel = function(..., subscripts) {
    +          panel.xyplot(...)
    +          cat("subscripts=", deparse(subscripts), ":\n")
    +          print(dataset[subscripts,])
    +        })
    subscripts= 1L :
      x y Type
    1 1 4    a
    subscripts= 2L :
      x y Type
    2 2 5    b

You can use that information to add panel-specific information to the plot.

Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com

> but that requires the panel function call to have prior knowledge of the object used as the conditioning variable, which is not flexible enough for my needs.
> Thank you,
> Martin D. Lepage
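The 'subscripts' approach described in the reply can be sketched as a complete script. The `seen` list and the text label are additions made here for illustration; they are not from the thread.

```r
library(lattice)

dataset <- data.frame(x = c(1, 2), y = c(4, 5), Type = factor(c("a", "b")))

seen <- list()  # collects the rows handled by each panel (illustrative only)

p <- xyplot(y ~ x | Type, data = dataset,
            panel = function(x, y, subscripts, ...) {
              panel.xyplot(x, y, ...)
              rows <- dataset[subscripts, ]        # rows of the data shown in this panel
              seen[[length(seen) + 1]] <<- rows
              # Label the panel with the conditioning level of its rows.
              panel.text(mean(x), mean(y),
                         labels = as.character(rows$Type[1]), pos = 3)
            })

print(p)  # the panel function runs when the trellis object is printed
```

Since the rows come from indexing the data frame, every column is available in the panel, including whichever variables were used for conditioning.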
Re: [R] RODBC problem
Try it without the '$' in the table name, that has worked for me in the past.

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Data Analytics Corp.
> Sent: Thursday, September 24, 2009 10:23 AM
> To: r-help@r-project.org
> Subject: [R] RODBC problem
>
> Hi,
> I'm attempting to use the RODBC package on Windows Vista to import an Excel spreadsheet. The spreadsheet has three worksheets, the last of which is blank. Following an example in Phil Spector's book (p. 34), after creating a connection named con I did the following:
>
>     > con
>     RODBC Connection 3
>     Details:
>       case=nochange
>       DBQ=c:\temp\test.xls
>       DefaultDir=c:\temp
>       Driver={Microsoft Excel Driver (*.xls)}
>       DriverId=790
>       MaxBufferSize=2048
>       PageTimeout=5
>     > tbls <- sqlTables(con)
>     > tbls
>            TABLE_CAT TABLE_SCHEM TABLE_NAME   TABLE_TYPE REMARKS
>     1 c:\\temp\\test          NA    Sheet1$ SYSTEM TABLE      NA
>     2 c:\\temp\\test          NA    Sheet2$ SYSTEM TABLE      NA
>     3 c:\\temp\\test          NA    Sheet3$ SYSTEM TABLE      NA
>
> Everything seems to be fine. Then I did
>
>     > qry <- paste("SELECT * FROM", tbls$TABLE_NAME[1], sep = ' ')
>     > qry
>     [1] "SELECT * FROM Sheet1$"
>     > sqlQuery(con, qry)
>
> and got the error message
>
>     [1] 42000 -3506 [Microsoft][ODBC Excel Driver] Syntax error in FROM clause.
>     [RODBC] ERROR: Could not SQLExecDirect 'SELECT * FROM Sheet1$'
>
> Any advice as to why and how to fix it? What's the syntax error that I'm just not seeing?
>
> Thanks,
> Walt
>
> --
> Walter R. Paczkowski, Ph.D.
> Data Analytics Corp.
> 44 Hamilton Lane
> Plainsboro, NJ 08536
> (V) 609-936-8999
> (F) 609-936-3733
> dataanalyt...@earthlink.net
> www.dataanalyticscorp.com
Re: [R] RODBC problem
Walt

I get the same message using R 2.9.2 on Vista. Using sqlFetch(con,'Sheet1') seems to work, however.

HTH
Schalk Heunis

On Thu, Sep 24, 2009 at 6:23 PM, Data Analytics Corp. dataanalyt...@earthlink.net wrote:
> Hi, I'm attempting to use the RODBC package on Windows Vista to import an excel spreadsheet.
[R] October-November R /S Courses: Nationwide (1) R/S+ Fundamentals and (2) R/S-Plus Advanced Programming. in San Francisco, New York City, Boston, Washington DC, Seattle and Salt Lake City
XLSolutions Corporation is proud to announce our October-November R/S course schedule in New York City, San Francisco, Boston, Washington DC, Seattle and Salt Lake City. Taught by top R/S+ gurus!

http://www.xlsolutions-corp.com/rplus.asp

(1-a) R/S-PLUS: An Introduction to R and S, October-November, 2009
(1-b) R/S-PLUS Fundamentals and Programming Techniques, October-November, 2009
(2) R/S+ System: Advanced Programming, October-November, 2009

Ask for group discount and reserve your seat now - Earlybird Rates. Payment due after the class!

Email Sue Turner: s...@xlsolutions-corp.com
http://www.xlsolutions-corp.com/rplus.asp

Please let us know if you and your colleagues are interested in this class to take advantage of group discount. Register now to secure your seat!

Cheers,
Elvis Miller, PhD
Manager Training
XLSolutions Corporation
206 686 1578
www.xlsolutions-corp.com
el...@xlsolutions-corp.com
Re: [R] RODBC problem
Schalk,

This worked. Thanks for the hint.

Walt

Schalk Heunis wrote:
> Walt I get the same message using R2.9.2 on Vista. Using sqlFetch(con,'Sheet1') seems to however. HTH Schalk Heunis

--
Walter R. Paczkowski, Ph.D.
Data Analytics Corp.
44 Hamilton Lane
Plainsboro, NJ 08536
(V) 609-936-8999
(F) 609-936-3733
dataanalyt...@earthlink.net
www.dataanalyticscorp.com
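For readers who want to keep using sqlQuery rather than sqlFetch: the Excel ODBC driver generally expects worksheet names such as Sheet1$ to be wrapped in square brackets. This was not stated in the thread, so treat the sketch below as an assumption to try; `sheet_query` is a hypothetical helper, and the RODBC calls are shown only as comments because they need a Windows machine with the Excel driver and an open connection `con`.

```r
# Wrap an Excel worksheet name in square brackets for use in a SQL statement.
# (hypothetical helper, not part of RODBC)
sheet_query <- function(table_name) {
  paste0("SELECT * FROM [", table_name, "]")
}

qry <- sheet_query("Sheet1$")
print(qry)

# With an open RODBC connection `con`, one would then run something like:
#   library(RODBC)
#   dat <- sqlQuery(con, qry)
# or, as suggested in the thread:
#   dat <- sqlFetch(con, "Sheet1")
```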
[R] Does anybody know how to connect to KDB from within R?
Please give me some pointers... Thanks a lot!
Re: [R] multinormial runs tests?
You can do this by simulation:

Generate data from a multinomial of the same length as your data (the sample function can help), using either theoretical or observed probabilities.
Measure the length of the longest run, or the number of runs (the rle function can help).
Repeat this a bunch of times (the replicate function can help).
See how your observed data compare to the simulations.

Hope this helps,

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of liujb
> Sent: Thursday, September 24, 2009 8:52 AM
> To: r-help@r-project.org
> Subject: [R] multinormial runs tests?
>
> Dear R users, I would like to test the randomness in a series of N values.
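The steps above can be sketched as follows. This is one possible reading of the advice, assuming the number of runs as the test statistic, observed category frequencies as the null probabilities, and a two-sided Monte Carlo p-value; `n_runs` is a helper name introduced here.

```r
set.seed(1)

# Number of runs = number of blocks of identical consecutive values.
n_runs <- function(v) length(rle(as.vector(v))$lengths)

# The perfectly periodic series from the question.
x <- rep(c(1, 2, 5, 4), 50)
obs <- n_runs(x)

# Null distribution: independent draws with the observed category frequencies.
probs <- as.numeric(table(x)) / length(x)
cats  <- names(table(x))
sim <- replicate(2000,
                 n_runs(sample(cats, length(x), replace = TRUE, prob = probs)))

# Two-sided Monte Carlo p-value, with the usual +1 correction.
extreme <- abs(sim - mean(sim)) >= abs(obs - mean(sim))
p_value <- (1 + sum(extreme)) / (1 + length(sim))

print(c(observed_runs = obs, mean_simulated_runs = mean(sim), p_value = p_value))
```

The same skeleton works with the longest run as the statistic by replacing `length(...)` with `max(...)` inside `n_runs`.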
[R] Time Series Format Error
I have received the following error:

    P34annual <- read.table("A:\\Data\\Output\\Sparrow\\Hydro_Data\\P34_Annual.txt", header=TRUE, sep=",", stringsAsFactors=FALSE, skip=1)
    P34annual$GS <- rep(1.86, dim(P34annual)[1])
    P34annual$Depth <- as.numeric(P34annual$P34_stage) - as.numeric(P34annual$GS)
    P34annual.ts <- as.ts(data=P34annual$Depth, frequency=(1), start=c(1981), end=c(2009))

    Error in inherits(x, "ts") : element 1 is empty;
       the part of the args list of '.Internal' being evaluated was:
       (x, what, which)

If anyone can let me know where I've messed up I'd appreciate it greatly.

Thanks
Steve

Steve Friedman Ph. D.
Spatial Statistical Analyst
Everglades and Dry Tortugas National Park
950 N Krome Ave (3rd Floor)
Homestead, Florida 33034
steve_fried...@nps.gov
Office (305) 224 - 4282
Fax (305) 224 - 4147
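A likely cause, offered as a guess since no reply is included here: as.ts() has no `data` argument, so its first argument `x` ends up missing, which would match "element 1 is empty". Calling ts() with the series as the first argument avoids this. The sketch below uses made-up depths in place of P34annual$Depth.

```r
# Hypothetical annual depths standing in for P34annual$Depth
# (29 values, one per year from 1981 to 2009).
depth <- seq(0.5, by = 0.1, length.out = 29)

# ts() takes the data as its first argument, plus start and frequency;
# the end is implied by the length of the series.
depth_ts <- ts(depth, start = 1981, frequency = 1)

print(depth_ts)
```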
[R] panel.text question
Dear R-help,

I would like to add text to each of four panels in a plot generated by xyplot in the lattice library. Sample code is given below; the plot generated has the first label repeated in all panels! How can I get the labels to be different in each panel?

    library(lattice)
    x <- rnorm(400)
    y <- rnorm(400)
    a <- gl(4, 100)
    xyplot(y ~ x | a, panel=function(...){
      panel.loess(...)
      panel.text(0, 2, label=c('best','better','bad','worst'))})

Thanks
Osman

Osman O. Al-Radi, MD, MSc, FRCSC
Staff Cardiovascular Surgeon
Co-medical director, Tissue Bank
The Hospital for Sick Children
University of Toronto, Canada
[R] error in amer function
Dear all,

I am trying to reproduce the example in the vignette "Using lme4 to fit Generalized Additive Mixed Models" with my dataset. But...

    mod <- amer(pasvig ~ -1 + harvf + tp(dias, by=harvf) + (1 | pac), data=exemplo)
    Error in if (from == to) rep.int(from, length.out) else as.vector(c(from,  :
      missing value where TRUE/FALSE needed

(The error message is translated from the Portuguese locale.) A subset of the dataset is attached. No clue about what is wrong.

Thanks a lot!
Marilia
[R] multicomp plotting
I am trying to plot my multiple comparison data. Can anyone give me some input of the error I am getting. The data and code is appended below. Thanks ../Murli library(multcomp) sig.data-structure(list(X = 1:63, Cell.lines = structure(c(1L, 6L, 13L, 25L, 33L, 42L, 2L, 7L, 14L, 26L, 34L, 43L, 3L, 4L, 5L, 18L, 22L, 52L, 58L, 8L, 27L, 35L, 45L, 9L, 36L, 46L, 10L, 15L, 28L, 37L, 47L, 11L, 16L, 29L, 38L, 44L, 12L, 17L, 30L, 39L, 48L, 19L, 23L, 53L, 59L, 20L, 21L, 24L, 54L, 60L, 31L, 40L, 49L, 50L, 32L, 41L, 51L, 55L, 61L, 56L, 62L, 57L, 63L), .Label = c(DU145-Caki-2, DU145-Calu1, HCE-7-DU145, HCT116-DU145, HT29-DU145, LAPC4-Caki-2, LAPC4-Calu1, LAPC4-EC-17, LAPC4-Fet, LAPC4-HCE-7, LAPC4-HCT116, LAPC4-HT29, LNCaP-Caki-2, LNCaP-Calu1, LNCaP-HCE-7, LNCaP-HCT116, LNCaP-HT29, LS174-DU145, LS174-LAPC4, LS174-LNCaP, MCF7-LNCaP, MDA-MB-468-DU145, MDA-MB-468-LAPC4, MDA-MB-468-LNCaP, PC3-Caki-2, PC3-Calu1, PC3-EC-17, PC3-HCE-7, PC3-HCT116-2, PC3-HT29, PC3-LS174, PC3-MDA-MB-468, RWPE1-Caki-2, RWPE1-Calu1, RWPE1-EC-17, RWPE1-Fet, RWPE1-HCE-7, RWPE1-HCT116, RWPE1-HT29, RWPE1-LS174, RWPE1-MDA-MB-468, RWPE2-Caki-2, RWPE2-Calu1, RWPE2-E-HCT116, RWPE2-EC-17, RWPE2-Fet, RWPE2-HCE-7, RWPE2-HT29, RWPE2-LS174, RWPE2-MCF7, RWPE2-MDA-MB-468, SW480-DU145, SW480-LAPC4, SW480-LNCaP, SW480-PC3, SW480-RWPE1, SW480-RWPE2, TE3-DU145, TE3-LAPC4, TE3-LNCaP, TE3-PC3, TE3-RWPE1, TE3-RWPE2), class = factor), estimate = c(-2759.302703, -3690.072718, -2607.150854, -3282.218985, -3635.312686, -3786.281227, -1189.109264, -2119.879279, -1036.957415, -1712.025546, -2065.119246, -2216.087787, 1253.075395, 1009.183561, 808.413018, 2038.189972, 788.61518, 1453.525701, 1001.526663, -1135.02519, -727.171457, -1080.265157, -1231.233698, -682.040377, -627.280345, -778.248885, -2183.84541, -1100.923546, -1775.991677, -2129.085377, -2280.053918, -1939.953576, -857.031712, -1532.099843, -1885.193544, -2036.162085, -1739.183033, -656.261169, -1331.3293, -1684.423001, -1835.391542, 2968.959987, 1719.385195, 2384.295716, 
1932.296678, 1886.038123, -578.466846, 636.463331, 1301.373852, 849.374814, -2561.106254, -2914.199954, -3065.168495, -600.663526, -1311.531462, -1664.625162, -1815.593703, 1976.441983, 1524.442945, 2329.535683, 1877.536646, 2480.504224, 2028.505187), lower = c(-3326.68652, -4257.45653, -3174.53467, -3849.6028, -4202.6965, -4353.66504, -1756.49308, -2687.26309, -1604.34123, -2279.40936, -2632.50306, -2783.4716, 685.69158, 441.79975, 241.02921, 1470.80616, 221.23137, 886.14189, 434.14285, -1702.409, -1294.55527, -1647.64897, -1798.61751, -1249.42419, -1194.66416, -1345.6327, -2751.22922, -1668.30736, -2343.37549, -2696.46919, -2847.43773, -2507.33739, -1424.41552, -2099.48366, -2452.57736, -2603.5459, -2306.56685, -1223.64498, -1898.71311, -2251.80681, -2402.77535, 2401.57617, 1152.00138, 1816.9119, 1364.91287, 1318.65431, -1145.85066, 69.07952, 733.99004, 281.991, -3128.49007, -3481.58377, -3632.55231, -1168.04734, -1878.91527, -2232.00897, -2382.97752, 1409.05817, 957.05913, 1762.15187, 1310.15283, 1913.12041, 1461.12137), upper = c(-2191.918891, -3122.688906, -2039.767042, -2714.835173, -3067.928873, -3218.897414, -621.725451, -1552.495466, -469.573602, -1144.641733, -1497.735434, -1648.703975, 1820.459207, 1576.567374, 1375.796831, 2605.573784, 1355.998992, 2020.909513, 1568.910476, -567.641377, -159.787644, -512.881345, -663.849886, -114.656565, -59.896532, -210.865073, -1616.461597, -533.539733, -1208.607864, -1561.701565, -1712.670106, -1372.569764, -289.6479, -964.716031, -1317.809731, -1468.778272, -1171.799221, -88.877357, -763.945488, -1117.039188, -1268.007729, 3536.343799, 2286.769007, 2951.679528, 2499.680491, 2453.421935, -11.083033, 1203.847143, 1868.757664, 1416.758627, -1993.722441, -2346.816142, -2497.784683, -33.279714, -744.147649, -1097.24135, -1248.209891, 2543.825795, 2091.826758, 2896.919496, 2444.920458, 3047.888037, 2595.888999), p.val.raw = c(2.22e-15, 0, 8.22e-15, 0, 0, 0, 6.2e-08, 7.41e-13, 6.07e-07, 6.36e-11, 1.29e-12, 2.85e-13, 
2.47e-08, 9.33e-07, 2.3e-05, 1.71e-12, 3.18e-05, 1.59e-09, 1.05e-06, 1.37e-07, 8.74e-05, 3.13e-07, 3.37e-08, 0.000184, 0.000452, 3.77e-05, 3.91e-13, 2.29e-07, 3.02e-11, 6.75e-13, 1.54e-13, 4.84e-12, 1.05e-05, 5.77e-10, 8.81e-12, 1.75e-12, 4.62e-11, 0.000281, 8.24e-09, 8.83e-11, 1.53e-11, 4.44e-16, 5.83e-11, 5.82e-14, 5.26e-12, 8.73e-12, 0.001, 0.000389, 1.25e-08, 1.18e-05, 1.2e-14, 6.66e-16, 2.22e-16, 0.000698, 1.08e-08, 1.12e-10, 1.92e-11, 3.28e-12, 6.36e-10, 9.66e-14, 9.58e-12, 2.44e-14, 1.89e-12), p.val.bon = c(3.4e-13, 0, 1.26e-12, 0, 0, 0, 9.48e-06, 1.13e-10, 9.28e-05, 9.73e-09, 1.98e-10, 4.37e-11, 3.77e-06, 0.000143, 0.00352, 2.62e-10, 0.00487, 2.43e-07, 0.000161, 2.1e-05, 0.0134, 4.79e-05, 5.15e-06, 0.0281, 0.0691, 0.00577, 5.99e-11, 3.5e-05, 4.61e-09, 1.03e-10, 2.36e-11, 7.41e-10, 0.0016, 8.83e-08, 1.35e-09, 2.67e-10, 7.07e-09, 0.043, 1.26e-06, 1.35e-08, 2.35e-09, 6.79e-14, 8.92e-09, 8.9e-12, 8.05e-10,
Re: [R] panel.text question
Try this:

    xyplot(y ~ x | a, panel = function(x, y, subscripts, ...) {
        panel.loess(x, y)
        panel.text(0, 2, labels = c('best', 'better', 'bad', 'worst')[tail(subscripts, 1) / 100])
    })

On Thu, Sep 24, 2009 at 2:45 PM, Osman Al-Radi osman.al.r...@gmail.com wrote:

> Dear R-help,
>
> I would like to add text to each of the four panels in a plot generated by xyplot in the lattice library. Sample code is given below; the plot it generates has the first label repeated in all panels! How can I get the labels to be different in each panel?
>
>     library(lattice)
>     x <- rnorm(400)
>     y <- rnorm(400)
>     a <- gl(4, 100)
>     xyplot(y ~ x | a, panel = function(...) {
>         panel.loess(...)
>         panel.text(0, 2, label = c('best', 'better', 'bad', 'worst'))
>     })
>
> Thanks
> Osman
>
> Osman O. Al-Radi, MD, MSc, FRCSC
> Staff Cardiovascular Surgeon
> Co-medical director, Tissue Bank
> The Hospital for Sick Children
> University of Toronto, Canada

-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
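The indexing in the reply above works because each panel holds exactly 100 consecutive rows. A minimal sketch of an alternative that does not depend on the row layout, assuming lattice's `panel.number()` (the index of the panel currently being drawn) is acceptable here; the name `panel_labels` is made up for illustration and the seed is arbitrary:

```r
library(lattice)

set.seed(1)                      # arbitrary seed, only for reproducibility
x <- rnorm(400)
y <- rnorm(400)
a <- gl(4, 100)

# hypothetical name: one label per panel, in panel order
panel_labels <- c("best", "better", "bad", "worst")

p <- xyplot(y ~ x | a, panel = function(x, y, ...) {
  panel.loess(x, y)
  # panel.number() is the index (1..4) of the panel currently being drawn
  panel.text(0, 2, labels = panel_labels[panel.number()])
})
print(p)
```

Because the label is chosen from the panel index rather than from the row subscripts, the same panel function should keep working if the groups have unequal sizes.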
[R] Color of the plot which correspond to the group of the observations
Dear All,

Let:
dp: depth of the river
tp: temperature with respect to depth

These pairs of observations fall into 3 different groups:
Obs. 1, 3, 5 and 7 are from the first group
Obs. 2, 4 and 10 are from the second group
Obs. 6, 8 and 9 are from the third group

We can draw a simple scatter plot with depth on the y-axis and temperature on the x-axis, with each pair shown as a red dot, using the plot function as shown below.

    dp <- c(1,4,3,2,5,7,9,8,9,2)
    tp <- 1:10
    plot(tp, dp, type = 'p', col = 'red')

Could someone please advise on how to produce a plot in which the color of each pair of observations corresponds to its group? For instance, we might have red, blue and green dots corresponding to groups 1, 2 and 3, respectively.

Thank you
Fir
Re: [R] more strange behavior of Revolution R 1.3.0
The best place for questions specific to REvolution R is the REvolution forums: http://forums.revolution-computing.com/ (Note: due to anti-spam moderation, it may take a little while for your post to appear.)

In this particular case, it looks to me like you're getting a significant speedup after setting it to use 8 cores rather than 1 (elapsed time of 42 vs 72 seconds -- with threaded applications it's the wall clock that counts). The big increase in the user (CPU) time indicates that using multiple threads carries a lot of overhead for this problem, but nonetheless it's a net real-world benefit.

# David Smith

On Thu, Sep 24, 2009 at 7:13 AM, Jason Liao jl...@hes.hmc.psu.edu wrote:

> It runs more than twice as slowly using 8 cores than using a single core when inverting a large matrix. Tested on an 8-core Windows XP 64 machine.
>
>     n = 1000
>     n.simu = 100
>     func1 = function()
>     {
>         x = rnorm(n*n)
>         dim(x) = c(n,n)
>         y = solve(x)
>     }
>
>     setMKLthreads(1)
>     system.time(for(i in 1:n.simu) func1())
>        user  system elapsed
>       69.48    2.42   71.91
>
>     setMKLthreads(8)
>     system.time(for(i in 1:n.simu) func1())
>        user  system elapsed
>      179.06   17.90   41.70
>
> Jason Liao

-- 
David M Smith da...@revolution-computing.com
Director of Community, REvolution Computing www.revolution-computing.com
Tel: +1 (206) 577-4778 x3203 (San Francisco, USA)

Check out our upcoming events schedule at www.revolution-computing.com/events
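The distinction between wall-clock time and CPU time in the reply can be put into numbers. A small sketch using only the timings quoted in the thread; the variable names are made up for illustration:

```r
# Timings copied from the quoted system.time() output
one_thread   <- c(user = 69.48,  system = 2.42,  elapsed = 71.91)
eight_thread <- c(user = 179.06, system = 17.90, elapsed = 41.70)

# Wall-clock speedup: how many times faster the 8-thread run finished
speedup <- one_thread[["elapsed"]] / eight_thread[["elapsed"]]

# CPU cost ratio: total CPU seconds (user + system), 8 threads vs 1 thread
cpu_ratio <- sum(eight_thread[c("user", "system")]) /
             sum(one_thread[c("user", "system")])

round(c(speedup = speedup, cpu_ratio = cpu_ratio), 2)
```

A speedup above 1 alongside a CPU ratio above 1 is the pattern described in the reply: the job finishes sooner, but the threads together spend more CPU time than the single-threaded run did.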
Re: [R] Color of the plot which correspond to the group of the observations
It is easier in lattice:

    library(lattice)
    dp <- c(1,4,3,2,5,7,9,8,9,2)
    tp <- 1:10
    # grouping vector: first 3 obs. in group 1, next 3 in group 2, last 4 in group 3
    gg <- rep(1:3, c(3,3,4))
    ddff <- data.frame(dp=dp, tp=tp, gg=gg)
    xyplot(dp ~ tp, groups=gg, data=ddff)
Re: [R] Color of the plot which correspond to the group of the observations
Try these three options:

    dp <- c(1,4,3,2,5,7,9,8,9,2)
    tp <- 1:10
    group <- factor(c(1, 2, 1, 2, 1, 3, 1, 3, 3, 2), labels=letters[1:3])

    # base graphics: the factor's integer codes index the current palette
    plot(tp, dp, type = 'p', col = group)

    d <- data.frame(dp=dp, tp=tp, group=group)

    library(lattice)
    xyplot(dp ~ tp, data=d, groups=group, auto.key=TRUE)

    library(ggplot2)
    qplot(tp, dp, data=d, colour=group)

HTH,
baptiste

2009/9/24 FMH kagba2...@yahoo.com:

> Dear All,
>
> Let:
> dp: depth of the river
> tp: temperature with respect to depth
>
> These pair of observations are in 3 different groups i.e:
> Obs. 1,3,5,7 from the first group
> Obs. 2,4 and 10 from second group
> Obs 6,8 and 9 from third group.
>
> We can have a simple scatter plot, between depth as y-axis and temperature as x-axis, with each pairs are denoted by a red dot, by using a plot function shown below.
>
>     dp <- c(1,4,3,2,5,7,9,8,9,2)
>     tp <- 1:10
>     plot(tp, dp, type = 'p', col = 'red')
>
> Could someone please give some advices on the way to have a plot in which the color of each pair of observation corrrespond to its group? For instance, we might have red, blue and green dots corresponding to groups 1,2 and 3, respectively.
>
> Thank you
> Fir
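The question asks specifically for red, blue and green dots for groups 1, 2 and 3. A minimal base-graphics sketch of that mapping, assuming the group membership listed in the question; `grp`, `group_cols` and `point_cols` are names made up for illustration:

```r
dp <- c(1, 4, 3, 2, 5, 7, 9, 8, 9, 2)
tp <- 1:10
# group of each observation, as listed in the question
grp <- c(1, 2, 1, 2, 1, 3, 1, 3, 3, 2)

# hypothetical lookup vector: position i holds the color for group i
group_cols <- c("red", "blue", "green")
point_cols <- group_cols[grp]

plot(tp, dp, type = "p", pch = 19, col = point_cols,
     xlab = "temperature", ylab = "depth")
legend("topleft", legend = paste("group", 1:3), col = group_cols, pch = 19)
```

Indexing a color vector by the group number keeps the color choice explicit instead of depending on the current palette.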
[R] Non-parametric test for location with two unpaired sets of data measured on ordinal scale.
Please forgive a stats question. I have two sets of data (unpaired) measured on an ordinal scale. I want to test whether the two sets differ (i.e. whether they have the same location):

set1: 1,3,2,2,4,3,3,2,2
set2: 4,4,4,3,3,5,4,4

What is the most appropriate non-parametric test of location?

Thanks,
John
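One commonly used option for two unpaired ordinal samples is the Wilcoxon rank-sum (Mann-Whitney) test, which base R provides as `wilcox.test()`. The sketch below only shows how such a test could be run on the two sets from the question; it is not an answer given in the thread, and whether this test is the most appropriate one depends on assumptions about the data that the post does not state.

```r
set1 <- c(1, 3, 2, 2, 4, 3, 3, 2, 2)
set2 <- c(4, 4, 4, 3, 3, 5, 4, 4)

# Unpaired two-sample Wilcoxon rank-sum test. The data contain ties, so an
# exact p-value cannot be computed; exact = FALSE requests the normal
# approximation explicitly instead of falling back to it with a warning.
res <- wilcox.test(set1, set2, paired = FALSE, exact = FALSE)
print(res)
```

The returned object carries the test statistic in `res$statistic` and the two-sided p-value in `res$p.value`.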
[R] col headers in read.table()
Hi,

I was trying to read in a file test.txt, which has the following data:

        norm    norm    norm    class   class   class
    a   1       2       3       4       5       6
    b   3       4       5       6       7       8
    c   5       6       7       8       9       10

In my R code, I do the following:

    mat <- read.table('test.txt', header=TRUE, row.names=1, sep='\t')
    mat
      norm norm.1 norm.2 class class.1 class.2
    a    1      2      3     4       5       6
    b    3      4      5     6       7       8
    c    5      6      7     8       9      10

What do I need to do so that I don't get 'norm.1', 'norm.2' etc., but just 'norm', 'norm', ..., i.e. without the numbers?

thanks,
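The suffixes come from `read.table()` making column names syntactically valid and unique, which is controlled by its `check.names` argument. A minimal sketch, assuming the file is tab-separated with a header row that has one field fewer than the data rows; the string `txt` is a stand-in for test.txt so the example runs without the file:

```r
# Stand-in for test.txt: tab-separated, header has one field fewer than the rows
txt <- paste(
  "norm\tnorm\tnorm\tclass\tclass\tclass",
  "a\t1\t2\t3\t4\t5\t6",
  "b\t3\t4\t5\t6\t7\t8",
  "c\t5\t6\t7\t8\t9\t10",
  sep = "\n"
)

# check.names = FALSE stops read.table() from making the names unique
mat <- read.table(text = txt, header = TRUE, row.names = 1, sep = "\t",
                  check.names = FALSE)
names(mat)
```

With duplicated names, `mat$norm` and `mat[["norm"]]` return only the first matching column, so the repeated columns are safer to select by position, e.g. `mat[, 1:3]`.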