Re: [R] problem reading from serial connection since 2.10.0
RTFM time: what do you think the description of 'file' in ?scan says about devices (I can see nothing that says they are supported). It does however say that As from R 2.10.0 this can be a compressed file (see ‘file’). which is a clear indication of what changed in R 2.10.0. Did you follow the xref?: it tells you how to use a file() connection to achieve the former behaviour. If you are reading repeatedly from a file or device, you would do better to open a connection, read repeatedly from that connection and close it when done. So (untested, of course), something like con - file(COM1, rb) for(i in 1:10) foo - readLines(con, n=1) close(con) [Technical aside: a text-mode file() connection when opened looks at the file twice, once in binary mode to read the header to see if it is compressed and if so how, then using the appropriate decompressor to prepare to read the contents.] There are many other aspects of file() connections that will not work with devices (seeking, for example) and we decided long ago not to support them. Nevertheless, it is helpful for documentation purposes to have such reports when alpha/beta testing is requested and not three months later. On Tue, 29 Dec 2009, Tom Gottfried wrote: Dear list, I have a balance connected to the serial port of a windows machine (COM1) and I read the text output of the balance with scan(COM1, what=character, sep=\n, n=1) after calling the previous line I press the print key on the balance which triggers sending one line of text to the serial connection and with R 2.9.2 I get something like Read 1 item + 32.004 mg Now with R 2.10.1 (and previously with 2.10.0) I have to press the print key on the balance twice to get the same result, thus apparently I have to send two lines of text to make scan() reading only one. But this is only when reading from the serial port, not when reading from a text file on disk. The same is true for 2.10.1 on Linux (I did not try 2.9.2 nor 2.10.0 on Linux). I can't figure out from the documentation nor the NEWS whether something should be specified differently when calling scan since 2.10.0. Any ideas what this behaviour comes from? Here the sessionInfo() for the three cases I have tested: R version 2.9.2 (2009-08-24) i386-pc-mingw32 locale: LC_COLLATE=German_Germany.1252;LC_CTYPE=German_Germany.1252;LC_MONETARY=German_Germany.1252;LC_NUMERIC=C;LC_TIME=German_Germany.1252 attached base packages: [1] stats graphics grDevices utils datasets [6] methods base respectively: R version 2.10.1 (2009-12-14) i386-pc-mingw32 locale: [1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252 [3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C [5] LC_TIME=German_Germany.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base respectively: R version 2.10.1 (2009-12-14) x86_64-unknown-linux-gnu locale: [1] LC_CTYPE=de_DE.UTF-8 LC_NUMERIC=C [3] LC_TIME=de_DE.UTF-8LC_COLLATE=de_DE.UTF-8 [5] LC_MONETARY=C LC_MESSAGES=de_DE.UTF-8 [7] LC_PAPER=de_DE.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils methods base other attached packages: [1] lattice_0.17-26 RODBC_1.3-1 loaded via a namespace (and not attached): [1] grid_2.10.1 Thanks a lot! Tom __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Ops method does not dispatch on either of two classes
This is as documented on the help page for Ops (the page in base, not the one in Methods) which is linked from ?|. For background you should also read the reference on that help page. You are wrong in asserting that the internal method is 'for logicals': please do study the help page. It covers e.g. integer vectors which is what I suspect you have here (assuming this is something to do with package 'bit', unmentioned). On Mon, 28 Dec 2009, Jens Oehlschlägel wrote: I have defined boolean methods for bit and bitwhich objects, for example |.bit - function(e1,e2) and |.bitwhich - function(e1,e2) Both methods coerce their arguments to the respective class, however if I do something like bit_obj | bitwhich_obj then I get a warning Warning message: Incompatible methods (|.bit, |.bitwhich) for | and none of the two methods is called. Instead the (internal) method for logicals seems to be called - not even coercing its arguments to logical. Same problem with Ops.bit and Ops.bitwhich . What is the recommended way to get my methods reliably dispatched? They are! Take a look at what classes in R itself have S3 Ops methods and how they achieve what you seem to want via class inheritance. Jens Oehlschlägel __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Histogram/BoxPlot Panel
Dear Baptiste, Thanks a lot for the excellent example, which convinced me to start studying ggplot2. A trivial question: is there an easy way to generate a boxplot without outliers? Using R standard plotting facilities, this amounts to giving outline=FALSE within boxplot. Can I easily achieve the same using ggplot2? Beside not plotting the outliers, I also would like the y range to adjust automatically. I did some online research, someone suggested preprocessing the data, but I can hardly believe that this is the only way to go. Cheers Lorenzo baptiste auguie wrote: I forgot the base graphics way, ## divide the window in 4x4 cells par(mfrow=n2mfrow(length(datasets))) ## loop over the list of datasets and plot each one be.quiet - lapply(datasets, function(ii) boxplot(y~x, data=ii)) ggplot2 has a website with many examples, http://had.co.nz/ggplot2/ as well as a book. Lattice also has a dedicated book, and a companion website with the figures, http://r-forge.r-project.org/projects/lmdvr/ HTH, baptiste 2009/12/29 baptiste auguie baptiste.aug...@googlemail.com: Hi, Here is some artificial data followed by minimal ggplot2 and lattice examples, makeUpData - function(){ data.frame(x=sample(letters[1:4], 100, repl=TRUE), y=rnorm(100)) } datasets - replicate(15, makeUpData(), simplify=FALSE) names(datasets) - paste(dataset, seq_along(datasets), sep=) str(datasets) require(reshape) ## combine the datasets in one long format data.frame m - melt(datasets, meas=c(y)) str(m) require(ggplot2) ggplot(m)+ geom_boxplot(mapping=aes(x, value))+ facet_wrap(~L1) # or more concisely qplot(x, value, data=m, geom=boxplot, facets=~L1) require(lattice) bwplot(value~x | L1, data=m) HTH, baptiste 2009/12/29 Lorenzo Isella lorenzo.ise...@gmail.com: Dear All, I am given 15 different data sets and I would like to generate a panel showing all of them. Each dataset will be presented either as a boxplot or as a histogram. There are several possible ways to achieve this (as far as I know) (1) using plot and mfrow() (2) using lattice (3) using ggplot/ggplot2 I am not very experienced (to be euphemistic) about (2) and (3). My question then is: how would you try to organize these 15 histograms/boxplots into a single figure? Can anyone provide me with a simple example (with artificial data) for (2) and (3) (or point me to some targeted online resource)? Any suggestion is welcome. Many thanks Lorenzo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Axis Labels for Compound Plots in ggplot2
Dear All, I am trying to stitch together multiple plots using ggplot2 Consider for instance the following snippet based on an old thread (http://tinyurl.com/ylehm2t) library(ggplot2) vplayout - function(x, y) viewport(layout.pos.row=x, layout.pos.col=y) draw4 - function(pdfname, a,b,c,d,w,h) { pdf(pdfname,width=w, height=h) grid.newpage() pushViewport(viewport(layout=grid.layout(2,2) ) ) print(a, vp=vplayout(1,1)) print(b, vp=vplayout(1,2)) print(c, vp=vplayout(2,1)) print(d, vp=vplayout(2,2)) dev.off() } data(diamonds) set.seed(1234) randind - sample(nrow(diamonds),1000,replace=FALSE) dsmall - diamonds[randind,] a - qplot(carat, data=dsmall, geom=histogram,binwidth=1) b - qplot(carat, data=dsmall, geom=histogram,binwidth=.1) c - qplot(carat, data=dsmall, geom=histogram,binwidth=.01) d - qplot(carat, data=dsmall, geom=histogram,binwidth=2) width - 7 height - 7 draw4( test-4.pdf, a,b,c,d, width, height) Is there any way to give an overall label along the y and x axis? E.g.something like a xlab to write some text which applies to the x axis of all the plots and which should go at the middle bottom of the compound plot (and something perfectly along these lines for the y axis). Many thanks Lorenzo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] boot function returns the same results every time - there appears to be not resampling of the original data.
R 2.8.1 windows XP I am trying to learn how to use the boot function to perform a bootstrap of a regression. I have written a short trial program, shown below. Clearly I have done something wrong as the output of each of the 100 bootstrap values for the regression are exactly the same - there does not appear to be any bootstrap respampling!. What have I done wrong? # Define function to be run. Function will return # beta coefficeint for x. fitter-function(d) { fit1-lm(y~x,data=d) print(names(fit1)) print(summary(fit1)) summary(fit1)$coefficients[2,1] } # Define dataframe x-1:10 y-x+rnorm(10) d-data.frame(x,y) #Run boot strap boot(d,fitter,R=100,sim=parametric) John David Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) Confidentiality Statement: This email message, including any attachments, is for th...{{dropped:6}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Axis Labels for Compound Plots in ggplot2
Hi, You can set up a Grid layout with one viewport at the bottom and another on the left and use grid.text to add your labels. An example is given below using grid.pack. The gridExtra package provides a convenient wrapper for these regular arrangements of plots, ##library(gridExtra) #http://gridextra.googlecode.com/ source(http://gridextra.googlecode.com/svn/trunk/R/arrange.r;) arrange(a, b, c, d) I will add a viewport on the left side for a global y axis, but in the meantime you can use packGrob, ## return a grob with all plots p - arrange(a, b, c, d, plot=FALSE) g - frameGrob() g - packGrob(g, p) g - packGrob(g, textGrob(my y label, rot=90), side = left, border = unit(rep(4, 4), mm)) g - packGrob(g, textGrob(my x label), side = bottom, border = unit(rep(4, 4), mm)) grid.draw(g) HTH, baptiste 2009/12/30 Lorenzo Isella lorenzo.ise...@gmail.com: Dear All, I am trying to stitch together multiple plots using ggplot2 Consider for instance the following snippet based on an old thread (http://tinyurl.com/ylehm2t) library(ggplot2) vplayout - function(x, y) viewport(layout.pos.row=x, layout.pos.col=y) draw4 - function(pdfname, a,b,c,d,w,h) { pdf(pdfname,width=w, height=h) grid.newpage() pushViewport(viewport(layout=grid.layout(2,2) ) ) print(a, vp=vplayout(1,1)) print(b, vp=vplayout(1,2)) print(c, vp=vplayout(2,1)) print(d, vp=vplayout(2,2)) dev.off() } data(diamonds) set.seed(1234) randind - sample(nrow(diamonds),1000,replace=FALSE) dsmall - diamonds[randind,] a - qplot(carat, data=dsmall, geom=histogram,binwidth=1) b - qplot(carat, data=dsmall, geom=histogram,binwidth=.1) c - qplot(carat, data=dsmall, geom=histogram,binwidth=.01) d - qplot(carat, data=dsmall, geom=histogram,binwidth=2) width - 7 height - 7 draw4( test-4.pdf, a,b,c,d, width, height) Is there any way to give an overall label along the y and x axis? E.g.something like a xlab to write some text which applies to the x axis of all the plots and which should go at the middle bottom of the compound plot (and something perfectly along these lines for the y axis). Many thanks Lorenzo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Negbin Error Warnings
Hi, I ran a negative binomial regression (NBR) using the Zelig-package and the negbin model. When I then try to use the simumlation approach using the setx () and sim() functions to calculate expected values and first difference for different levels of one of my independent variables, I get 50 errors warnings, telling me that the calculation rpois produced NAs. However, the data I use doesn't have any NAs. Does this mean that the NBR-model is the wrong model for this data? Thanks in advance. ___ Preisknaller: WEB.DE DSL Flatrate für nur 16,99 Euro/mtl.! http://produkte.web.de/go/02/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Positioning plots on top of each other (aligment borders)
Hello, I want to place two plots on top of each other. However, the problem is that I can't figure out a simple way to align them correctly. Is there a way to specify this? Since the data is bunch of coordinates and the second layer is an outline of a map (a .ps file I import using the grImport package), I suppose one option would be to specify a set of artificial coordinates that make up the very corners of that plot, and then have the second layer will this same space. Any ideas how to do this? //John __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] glm error: cannot correct step size
R 2.8.1 windows XP I am getting an error message that I don't understand when I try to run GLM. The error only occurs when I have all independent variables in the model. When I drop one independent variable, the model runs fine. Can anyone help me understand what the error means and how I can correct it? Thank you, John fit11-glm(AAMTCARE~BMI+BMIsq+SEX+jPHI+jMEDICAID+factor(AgeCat)+ + factor(jINDINC)+jMARSTAT+jEDUCATION+factor(jsmokercat)+factor(jrace),data=SimData,family=Gamma(link=log)) Warning: step size truncated due to divergence Error in glm.fit(x = X, y = Y, weights = weights, start = start, etastart = etastart, : inner loop 1; cannot correct step size # Drop factor(jrace) and model runs without a problem. fit11-glm(AAMTCARE~BMI+BMIsq+SEX+jPHI+jMEDICAID+factor(AgeCat)+ + factor(jINDINC)+jMARSTAT+jEDUCATION+factor(jsmokercat) ,data=SimData,family=Gamma(link=log)) # Drop jPHI and model runs without a problem. fit11-glm(AAMTCARE~BMI+BMIsq+SEX+jMEDICAID+factor(AgeCat)+ + factor(jINDINC)+jMARSTAT+jEDUCATION+factor(jsmokercat)+jrace,data=SimData,family=Gamma(link=log)) Thanks, John John David Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) Confidentiality Statement: This email message, including any attachments, is for th...{{dropped:6}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] boot function returns the same results every time - there appears to be not resampling of the original data.
On Wed, 30 Dec 2009, John Sorkin wrote: R 2.8.1 Rather old: have you updated package boot since? windows XP I am trying to learn how to use the boot function to perform a bootstrap of a regression. I have written a short trial program, shown below. Clearly I have done something wrong as the output of each of the 100 bootstrap values for the regression are exactly the same - there does not appear to be any bootstrap respampling!. What have I done wrong? First, sim=parametric does not give 'bootstrap respampling', so I wonder if you have consulted the book for which this is support software? You will find it hard to use boot without a clear grasp of the underlying concepts. For bootstrapping a regression (and why you almost certainly do not want to do that) see the examples in MASS. You clearly missed the following on the help page: For the parametric bootstrap it is necessary for the user to specify how the resampling is to be conducted. The best way of accomplishing this is to specify the function ‘ran.gen’ which will return a simulated data set from the observed data set and a set of parameter estimates specified in ‘mle’. The default ran.gen() function just returns the data # Define function to be run. Function will return # beta coefficeint for x. fitter-function(d) { fit1-lm(y~x,data=d) print(names(fit1)) print(summary(fit1)) summary(fit1)$coefficients[2,1] } # Define dataframe x-1:10 y-x+rnorm(10) d-data.frame(x,y) #Run boot strap boot(d,fitter,R=100,sim=parametric) John David Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) Confidentiality Statement: This email message, including any attachments, is for th...{{dropped:6}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] hdf5 package
I am having the same issue with the HDF5 package. I installed the latest hdf5 DLLs from HDFGroup. I also installed the HDFView application from the same. This was all done on Windows. The example in ?hdf5load can be used to test. The example works fine for saving and loading the structure, but the ex1.hdf file file produced is not readable by HDFView. I have an hdf5 file written on Linux using visitconvert to write a 3D file format to hdf5 format. This file is readable by HDFView on Windows but cannot be loaded by hdf5load on Windows. When attempting to load the file, the process apparently goes into a loop producing multiple lines of Processing object: .. .. which is a Group and finally ends with Error: C stack usage is too close to the limit This can be reproduced by using the data below to create a file named test.3D and converting the file to an HDF5 structure using (Linux) visitconvert test.3D test Silo and supplying 1 for HDF5 and 0 for Single File at the prompts. x y z theta 740322.174 181641.09 0.5 22.7385591768410 740332.174 181641.09 0.5 22.5965486314811 740342.174 181641.09 0.5 22.3900892567167 740352.174 181641.09 0.5 22.0708441524548 740362.174 181641.09 0.5 21.5597876586419 740372.174 181641.09 0.5 20.8148058389816 740382.174 181641.09 0.5 20.3085959981628 740392.174 181641.09 0.5 20.7148449544629 740402.174 181641.09 0.5 21.3456158108870 740412.174 181641.09 0.5 21.806643177 740422.174 181641.09 0.5 21.9911529065165 740382.174 181411.09 16.5 23.7646023452337 740392.174 181411.09 16.5 23.7251135887479 740402.174 181411.09 16.5 23.6082663058794 740412.174 181411.09 16.5 23.5401409105013 Mark __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Negbin Error Warnings
Dear Clara, could you, please, share the summary() of your data. bests milton 2009/12/30 Clara Brück clara_bru...@web.de Hi, I ran a negative binomial regression (NBR) using the Zelig-package and the negbin model. When I then try to use the simumlation approach using the setx () and sim() functions to calculate expected values and first difference for different levels of one of my independent variables, I get 50 errors warnings, telling me that the calculation rpois produced NAs. However, the data I use doesn't have any NAs. Does this mean that the NBR-model is the wrong model for this data? Thanks in advance. ___ Preisknaller: WEB.DE http://web.de/ DSL Flatrate für nur 16,99 Euro/mtl.! http://produkte.web.de/go/02/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Fwd: glm error: cannot correct step size
I forgot to cc: the list on this response. -- Forwarded message -- From: Douglas Bates ba...@stat.wisc.edu Date: Wed, Dec 30, 2009 at 8:05 AM Subject: Re: [R] glm error: cannot correct step size To: John Sorkin jsor...@grecc.umaryland.edu On Wed, Dec 30, 2009 at 7:50 AM, John Sorkin jsor...@grecc.umaryland.edu wrote: R 2.8.1 windows XP I am getting an error message that I don't understand when I try to run GLM. The error only occurs when I have all independent variables in the model. When I drop one independent variable, the model runs fine. Can anyone help me understand what the error means and how I can correct it? Thank you, John fit11-glm(AAMTCARE~BMI+BMIsq+SEX+jPHI+jMEDICAID+factor(AgeCat)+ + factor(jINDINC)+jMARSTAT+jEDUCATION+factor(jsmokercat)+factor(jrace),data=SimData,family=Gamma(link=log)) Warning: step size truncated due to divergence Error in glm.fit(x = X, y = Y, weights = weights, start = start, etastart = etastart, : inner loop 1; cannot correct step size That error message and the warning indicate that the algorithm (iteratively reweighted least squares or IRLS) for fitting the parameters cannot converge, which can be because the model is over-specified. As you see, when you remove of the terms you do get parameter estimates and I would guess that, even then, the model will be over-specified. # Drop factor(jrace) and model runs without a problem. fit11-glm(AAMTCARE~BMI+BMIsq+SEX+jPHI+jMEDICAID+factor(AgeCat)+ + factor(jINDINC)+jMARSTAT+jEDUCATION+factor(jsmokercat) ,data=SimData,family=Gamma(link=log)) # Drop jPHI and model runs without a problem. fit11-glm(AAMTCARE~BMI+BMIsq+SEX+ jMEDICAID+factor(AgeCat)+ + factor(jINDINC)+jMARSTAT+jEDUCATION+factor(jsmokercat)+jrace,data=SimData,family=Gamma(link=log)) Thanks, John John David Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) Confidentiality Statement: This email message, including any attachments, is for th...{{dropped:6}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] seg-fault... but on what
I got the following after running succesfully through this loop 28 million times... the loop opens text files in a directory and inserts line by line into a database... *** caught segfault *** address 0xc010, cause 'memory not mapped' Traceback: 1: .getGeneric(f, where, package) 2: getGeneric(coerce, where = where) 3: as(obj, integer) 4: mysqlConnectionInfo(dbObj, ...) 5: dbGetInfo(conn, rsId) 6: dbGetInfo(conn, rsId) 7: dbListResults(con) 8: dbListResults(con) 9: mysqlQuickSQL(conn, statement, ...) 10: dbGetQuery(con, sql) 11: dbGetQuery(con, sql) 12: doTryCatch(return(expr), name, parentenv, handler) 13: tryCatchOne(expr, names, parentenv, handlers[[1L]]) 14: tryCatchList(expr, classes, parentenv, handlers) 15: tryCatch(expr, error = function(e) {call - conditionCall(e)if (!is.null(call)) {if (identical(call[[1L]], quote(doTryCatch))) call - sys.call(-4L)dcall - deparse(call)[1L]prefix - paste(Error in, dcall, : ) LONG - 75Lmsg - conditionMessage(e)sm - strsplit(msg, \n)[[1L]]if (14L + nchar(dcall, type = w) + nchar(sm[1L], type = w) LONG) prefix - paste(prefix, \n , sep = )}else prefix - Error : msg - paste(prefix, conditionMessage(e), \n, sep = ).Internal(seterrmessage(msg[1L])) if (!silent identical(getOption(show.error.messages), TRUE)) {cat(msg, file = stderr()) .Internal(printDeferredWarnings())}invisible(structure(msg, class = try-error))}) 16: try(dbGetQuery(con, sql)) 17: eval.with.vis(expr, envir, enclos) 18: eval.with.vis(ei, envir) 19: source(~/SoothSayer/EODData/DBScripts/loadEODQuotes.r) Possible actions: 1: abort (with core dump, if enabled) 2: normal R exit 3: exit R without saving workspace 4: exit R saving workspace Selection: The code is pretty simple... library(RMySQL) drv - dbDriver(MySQL) con - dbConnect(drv, host=localhost, dbname=markets, user=root, pass=embryoni3) fileList - list.files(pattern = [[:upper:]]{2,}, all.files = FALSE, full.names = FALSE, recursive = FALSE, ignore.case = FALSE) for (x in 1:length(fileList)){ string - strsplit(fileList[x], _, fixed = TRUE) string - string[[1]][1] fileLines - read.csv(fileList[x],header = FALSE,row.names = NULL ) for ( j in 1:length(fileLines[,1])){ sql - paste(insert into endOfDayData (date,market,symbol,open,high,low,close,volume) values (', fileLines[j,2],',',string,',',as.character(fileLines[j,1]),',',fileLines[j,3],',',fileLines[j,4],',',fileLines[j ,5],',',fileLines[j,6],',',fileLines[j,7],'),sep=) #print(sql) atmpt - try(dbGetQuery(con,sql)) options(show.error.messages = TRUE) if(inherits(atmpt, try-error)){ } } } dbDisconnect(con) My understanding was that in general R liked to put a placekeeper like NA or NULL in for non-existent elements of defined structures... so I was surprised to see a seg fault... I'm having trouble interpretting the error... any clues would be appreciated. Nick [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] select elements and transpose
Hi all, Given the following, xx [[1]] V1 V2 V3 [1,] 1 2 3 [2,] 4 5 6 [3,] 7 8 9 [[2]] V1 V2 V3 [1,]10 11 12 [2,]13 14 15 [3,]16 17 18 [[3]] V1 V2 V3 [1,]19 20 21 [2,]22 23 24 [3,]25 26 27 how do i extract elements in each file so that after transpose, it looks something like the following; 1 10 19 2 11 20 3 12 21 and so on.. Thanks.. -- Muhammad Rahiz | Doctoral Student in Regional Climate Modeling Climate Research Laboratory, School of Geography the Environment Oxford University Centre for the Environment South Parks Road, Oxford, OX1 3QY, United Kingdom Tel: +44 (0)1865-285194 Mobile: +44 (0)7854-625974 Email: muhammad.ra...@ouce.ox.ac.uk __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] parallel computing--help
Hey R Users I am trying to to implement a linux snow sockets parallelism in R, I am pretty new to parallel computing Here is the problem I am facing Whenever I am doing newSOCKnode(localhost) Its working fine But whenever I do newSOCKnode(ip to another computer on the network ) It throws up the following error Fatal error: cannot open file '/home/t136/R/i486-pc-linux-gnu-library/2.9/snow/RSOCKnode.R': No such file or directory R is stuck I have to force quit it with CTRL C . Now t136 is the machine I am working on and manually I have checked that there exists a file RSOCKnode.R on the specified directory SSH is working fine as I have checked system(ssh -l me 'ip to another computer on the network' ls) Works fine and I am getting the list of files on the other computer on my machine Please help! Sayan Dasgupta [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] seg-fault... but on what
On 30/12/2009 9:20 AM, Nick Torenvliet wrote: I got the following after running succesfully through this loop 28 million times... the loop opens text files in a directory and inserts line by line into a database... This looks like a bug in RMySql or possibly R. What versions are you using? Can you make it happen reproducibly with the same script? Can you simplify the script so it happens quickly, rather than requiring 28 million iterations? Without a simple recipe to reproduce it, I doubt if anyone would be able to track down the problem. Duncan Murdoch *** caught segfault *** address 0xc010, cause 'memory not mapped' Traceback: 1: .getGeneric(f, where, package) 2: getGeneric(coerce, where = where) 3: as(obj, integer) 4: mysqlConnectionInfo(dbObj, ...) 5: dbGetInfo(conn, rsId) 6: dbGetInfo(conn, rsId) 7: dbListResults(con) 8: dbListResults(con) 9: mysqlQuickSQL(conn, statement, ...) 10: dbGetQuery(con, sql) 11: dbGetQuery(con, sql) 12: doTryCatch(return(expr), name, parentenv, handler) 13: tryCatchOne(expr, names, parentenv, handlers[[1L]]) 14: tryCatchList(expr, classes, parentenv, handlers) 15: tryCatch(expr, error = function(e) {call - conditionCall(e)if (!is.null(call)) {if (identical(call[[1L]], quote(doTryCatch))) call - sys.call(-4L)dcall - deparse(call)[1L]prefix - paste(Error in, dcall, : ) LONG - 75Lmsg - conditionMessage(e)sm - strsplit(msg, \n)[[1L]]if (14L + nchar(dcall, type = w) + nchar(sm[1L], type = w) LONG) prefix - paste(prefix, \n , sep = )}else prefix - Error : msg - paste(prefix, conditionMessage(e), \n, sep = ).Internal(seterrmessage(msg[1L])) if (!silent identical(getOption(show.error.messages), TRUE)) {cat(msg, file = stderr()) .Internal(printDeferredWarnings())}invisible(structure(msg, class = try-error))}) 16: try(dbGetQuery(con, sql)) 17: eval.with.vis(expr, envir, enclos) 18: eval.with.vis(ei, envir) 19: source(~/SoothSayer/EODData/DBScripts/loadEODQuotes.r) Possible actions: 1: abort (with core dump, if enabled) 2: normal R exit 3: exit R without saving workspace 4: exit R saving workspace Selection: The code is pretty simple... library(RMySQL) drv - dbDriver(MySQL) con - dbConnect(drv, host=localhost, dbname=markets, user=root, pass=embryoni3) fileList - list.files(pattern = [[:upper:]]{2,}, all.files = FALSE, full.names = FALSE, recursive = FALSE, ignore.case = FALSE) for (x in 1:length(fileList)){ string - strsplit(fileList[x], _, fixed = TRUE) string - string[[1]][1] fileLines - read.csv(fileList[x],header = FALSE,row.names = NULL ) for ( j in 1:length(fileLines[,1])){ sql - paste(insert into endOfDayData (date,market,symbol,open,high,low,close,volume) values (', fileLines[j,2],',',string,',',as.character(fileLines[j,1]),',',fileLines[j,3],',',fileLines[j,4],',',fileLines[j ,5],',',fileLines[j,6],',',fileLines[j,7],'),sep=) #print(sql) atmpt - try(dbGetQuery(con,sql)) options(show.error.messages = TRUE) if(inherits(atmpt, try-error)){ } } } dbDisconnect(con) My understanding was that in general R liked to put a placekeeper like NA or NULL in for non-existent elements of defined structures... so I was surprised to see a seg fault... I'm having trouble interpretting the error... any clues would be appreciated. Nick [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] seg-fault... but on what
Thanks Duncan, I'll work on reproducibility and possibly getting a core dump... If I get those I'll post to R-dev or RMySQL-dev. Nick On Wed, Dec 30, 2009 at 9:36 AM, Duncan Murdoch murd...@stats.uwo.cawrote: On 30/12/2009 9:20 AM, Nick Torenvliet wrote: I got the following after running succesfully through this loop 28 million times... the loop opens text files in a directory and inserts line by line into a database... This looks like a bug in RMySql or possibly R. What versions are you using? Can you make it happen reproducibly with the same script? Can you simplify the script so it happens quickly, rather than requiring 28 million iterations? Without a simple recipe to reproduce it, I doubt if anyone would be able to track down the problem. Duncan Murdoch *** caught segfault *** address 0xc010, cause 'memory not mapped' Traceback: 1: .getGeneric(f, where, package) 2: getGeneric(coerce, where = where) 3: as(obj, integer) 4: mysqlConnectionInfo(dbObj, ...) 5: dbGetInfo(conn, rsId) 6: dbGetInfo(conn, rsId) 7: dbListResults(con) 8: dbListResults(con) 9: mysqlQuickSQL(conn, statement, ...) 10: dbGetQuery(con, sql) 11: dbGetQuery(con, sql) 12: doTryCatch(return(expr), name, parentenv, handler) 13: tryCatchOne(expr, names, parentenv, handlers[[1L]]) 14: tryCatchList(expr, classes, parentenv, handlers) 15: tryCatch(expr, error = function(e) {call - conditionCall(e)if (!is.null(call)) {if (identical(call[[1L]], quote(doTryCatch))) call - sys.call(-4L)dcall - deparse(call)[1L]prefix - paste(Error in, dcall, : ) LONG - 75Lmsg - conditionMessage(e)sm - strsplit(msg, \n)[[1L]]if (14L + nchar(dcall, type = w) + nchar(sm[1L], type = w) LONG) prefix - paste(prefix, \n , sep = )}else prefix - Error : msg - paste(prefix, conditionMessage(e), \n, sep = ).Internal(seterrmessage(msg[1L])) if (!silent identical(getOption(show.error.messages), TRUE)) {cat(msg, file = stderr()) .Internal(printDeferredWarnings())}invisible(structure(msg, class = try-error))}) 16: try(dbGetQuery(con, sql)) 17: eval.with.vis(expr, envir, enclos) 18: eval.with.vis(ei, envir) 19: source(~/SoothSayer/EODData/DBScripts/loadEODQuotes.r) Possible actions: 1: abort (with core dump, if enabled) 2: normal R exit 3: exit R without saving workspace 4: exit R saving workspace Selection: The code is pretty simple... library(RMySQL) drv - dbDriver(MySQL) con - dbConnect(drv, host=localhost, dbname=markets, user=root, pass=embryoni3) fileList - list.files(pattern = [[:upper:]]{2,}, all.files = FALSE, full.names = FALSE, recursive = FALSE, ignore.case = FALSE) for (x in 1:length(fileList)){ string - strsplit(fileList[x], _, fixed = TRUE) string - string[[1]][1] fileLines - read.csv(fileList[x],header = FALSE,row.names = NULL ) for ( j in 1:length(fileLines[,1])){ sql - paste(insert into endOfDayData (date,market,symbol,open,high,low,close,volume) values (', fileLines[j,2],',',string,',',as.character(fileLines[j,1]),',',fileLines[j,3],',',fileLines[j,4],',',fileLines[j ,5],',',fileLines[j,6],',',fileLines[j,7],'),sep=) #print(sql) atmpt - try(dbGetQuery(con,sql)) options(show.error.messages = TRUE) if(inherits(atmpt, try-error)){ } } } dbDisconnect(con) My understanding was that in general R liked to put a placekeeper like NA or NULL in for non-existent elements of defined structures... so I was surprised to see a seg fault... I'm having trouble interpretting the error... any clues would be appreciated. Nick [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] multivariate group means
Hello, I look for a simple command computing multivariate group means and returning an object of class matrix rather than list. Does any such function exist in standard packages? I'm beginning with R, so I'm sorry if the solution is trivial. Ondra Mikula __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] boot function returns the same results every time - there appears to be not resampling of the original data.
Prof Ripley, Thank you for your advise. You recommend I consult the book for which this is support software. I would be happy to do so, but I don't know what book you are referring to. Thank you, John John David Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) Prof Brian Ripley rip...@stats.ox.ac.uk 12/30/2009 8:55 AM On Wed, 30 Dec 2009, John Sorkin wrote: R 2.8.1 Rather old: have you updated package boot since? windows XP I am trying to learn how to use the boot function to perform a bootstrap of a regression. I have written a short trial program, shown below. Clearly I have done something wrong as the output of each of the 100 bootstrap values for the regression are exactly the same - there does not appear to be any bootstrap respampling!. What have I done wrong? First, sim=parametric does not give 'bootstrap respampling', so I wonder if you have consulted the book for which this is support software? You will find it hard to use boot without a clear grasp of the underlying concepts. For bootstrapping a regression (and why you almost certainly do not want to do that) see the examples in MASS. You clearly missed the following on the help page: For the parametric bootstrap it is necessary for the user to specify how the resampling is to be conducted. The best way of accomplishing this is to specify the function ̔ran.gen̓ which will return a simulated data set from the observed data set and a set of parameter estimates specified in ̔mle̓. The default ran.gen() function just returns the data # Define function to be run. Function will return # beta coefficeint for x. fitter-function(d) { fit1-lm(y~x,data=d) print(names(fit1)) print(summary(fit1)) summary(fit1)$coefficients[2,1] } # Define dataframe x-1:10 y-x+rnorm(10) d-data.frame(x,y) #Run boot strap boot(d,fitter,R=100,sim=parametric) John David Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) Confidentiality Statement: This email message, including any attachments, is for\...{{dropped:25}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Negbin Error Warnings
Perhaps this will give a clue: rpois(1,-1) [1] NaN Warning message: In rpois(1, -1) : NAs produced # (lambda needs to be nonnegative) -Peter Ehlers Clara Brück wrote: Hi, I ran a negative binomial regression (NBR) using the Zelig-package and the negbin model. When I then try to use the simumlation approach using the setx () and sim() functions to calculate expected values and first difference for different levels of one of my independent variables, I get 50 errors warnings, telling me that the calculation rpois produced NAs. However, the data I use doesn't have any NAs. Does this mean that the NBR-model is the wrong model for this data? Thanks in advance. ___ Preisknaller: WEB.DE DSL Flatrate für nur 16,99 Euro/mtl.! http://produkte.web.de/go/02/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Peter Ehlers University of Calgary 403.202.3921 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] boot function returns the same results every time - there appears to be not resampling of the original data.
John, packageDescription('boot') will give you a hint. -Peter Ehlers John Sorkin wrote: Prof Ripley, Thank you for your advise. You recommend I consult the book for which this is support software. I would be happy to do so, but I don't know what book you are referring to. Thank you, John John David Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) Prof Brian Ripley rip...@stats.ox.ac.uk 12/30/2009 8:55 AM On Wed, 30 Dec 2009, John Sorkin wrote: R 2.8.1 Rather old: have you updated package boot since? windows XP I am trying to learn how to use the boot function to perform a bootstrap of a regression. I have written a short trial program, shown below. Clearly I have done something wrong as the output of each of the 100 bootstrap values for the regression are exactly the same - there does not appear to be any bootstrap respampling!. What have I done wrong? First, sim=parametric does not give 'bootstrap respampling', so I wonder if you have consulted the book for which this is support software? You will find it hard to use boot without a clear grasp of the underlying concepts. For bootstrapping a regression (and why you almost certainly do not want to do that) see the examples in MASS. You clearly missed the following on the help page: For the parametric bootstrap it is necessary for the user to specify how the resampling is to be conducted. The best way of accomplishing this is to specify the function ̔ran.gen̓ which will return a simulated data set from the observed data set and a set of parameter estimates specified in ̔mle̓. The default ran.gen() function just returns the data # Define function to be run. Function will return # beta coefficeint for x. fitter-function(d) { fit1-lm(y~x,data=d) print(names(fit1)) print(summary(fit1)) summary(fit1)$coefficients[2,1] } # Define dataframe x-1:10 y-x+rnorm(10) d-data.frame(x,y) #Run boot strap boot(d,fitter,R=100,sim=parametric) John David Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) Confidentiality Statement: This email message, including any attachments, is for\...{{dropped:25}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Peter Ehlers University of Calgary 403.202.3921 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] select elements and transpose
If I understand, you can try this: array(do.call(rbind, xx), c(3, 3, 3)) On Wed, Dec 30, 2009 at 11:45 AM, Muhammad Rahiz muhammad.ra...@ouce.ox.ac.uk wrote: Hi all, Given the following, xx [[1]] V1 V2 V3 [1,] 1 2 3 [2,] 4 5 6 [3,] 7 8 9 [[2]] V1 V2 V3 [1,]10 11 12 [2,]13 14 15 [3,]16 17 18 [[3]] V1 V2 V3 [1,]19 20 21 [2,]22 23 24 [3,]25 26 27 how do i extract elements in each file so that after transpose, it looks something like the following; 1 10 19 2 11 20 3 12 21 and so on.. Thanks.. -- Muhammad Rahiz | Doctoral Student in Regional Climate Modeling Climate Research Laboratory, School of Geography the Environment Oxford University Centre for the Environment South Parks Road, Oxford, OX1 3QY, United Kingdom Tel: +44 (0)1865-285194 Mobile: +44 (0)7854-625974 Email: muhammad.ra...@ouce.ox.ac.uk __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] NA or work around ??
I've searched and tried several ideas (na.action. and other things), but I can't see to figure this out. I'm guessing this is so simple I'll feel foolish for asking, but here goes. Thanks, L.A. Dataset$Rcil=with(Dataset, ifelse(Rpr = .95, Dataset[,percentchgn], NA)) Dataset$LLCI-with(Dataset, ave(Rcil, LEAID, Property, FUN=function(x)max(x))) LEAID percentchgn PropertyRpr Rcil LLCI 12 2036 12.190220 UNSOLD 0.237 12.190220 16.09097 13 2036 14.741559 UNSOLD 0.9992714 14.741559 16.09097 14 2036 15.882518 UNSOLD 0.9955750 15.882518 16.09097 15 2036 16.090965 UNSOLD 0.9807892 16.090965 16.09097 34 2201 -6.542363 UNSOLD -1.000 NA NA 55 2201 -5.060431 UNSOLD 0.9848659 -5.060431 NA 56 2201 -5.056231 UNSOLD 0.9699841 -5.056231 NA 57 2201 -4.994895 UNSOLD 0.9441028 NA NA 58 2201 -4.982279 UNSOLD 0.9020456 NA NA 99 2202 -8.982821 UNSOLD -1.000 NA NA 241 22029.704573 UNSOLD -1.000 NA NA 242 2202 10.447709 UNSOLD 0.237 10.447709NA 243 2202 12.616943 UNSOLD 0.9992714 12.616943NA 244 2202 12.943887 SOLD 0.9955750 12.943887 NA 245 2202 13.774406 SOLD 0.9807892 13.774406 NA 246 2202 15.007848 SOLD 0.9364319 NA NA 333 23332.099854 UNSOLD 0.7853246 NA NA 333 23332.343452 UNSOLD 0.7853246 NA NA What I'm trying to accomplish is: LEAID percentchgn PropertyRpr Rcil LLCI 12 2036 12.190220 UNSOLD 0.237 12.190220 16.09097 13 2036 14.741559 UNSOLD 0.9992714 14.741559 16.09097 14 2036 15.882518 UNSOLD 0.9955750 15.882518 16.09097 15 2036 16.090965 UNSOLD 0.9807892 16.090965 16.09097 34 2201 -6.542363 UNSOLD -1.000 NA -5.056231 55 2201 -5.060431 UNSOLD 0.9848659 -5.060431 -5.056231 56 2201 -5.056231 UNSOLD 0.9699841 -5.056231 -5.056231 57 2201 -4.994895 UNSOLD 0.9441028 NA -5.056231 58 2201 -4.982279 UNSOLD 0.9020456 NA -5.056231 99 2202 -8.982821 UNSOLD -1.000 NA 12.616943 241 22029.704573 UNSOLD -1.000 NA 12.616943 242 2202 10.447709 UNSOLD 0.237 10.447709 12.616943 243 2202 12.616943 UNSOLD 0.9992714 12.616943 12.616943 244 2202 12.943887 SOLD 0.9955750 12.943887 13.774406 245 2202 13.774406 SOLD 0.9807892 13.774406 13.774406 246 2202 15.007848 SOLD 0.9364319 NA 13.774406 333 23332.099854 UNSOLD 0.7853246NANA 333 23332.343452 UNSOLD 0.7853246NANA -- View this message in context: http://n4.nabble.com/NA-or-work-around-tp991019p991019.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Removing replicated rows
Dear All, I have a matrix like X and need to create a new matrix based on the vector m which gives the number of replicates for each unique row. The resulting matrix should look like y. x x1 x2 [1,] 2 0.5 [2,] 2 0.5 [3,] 2 0.5 [4,] 6 1.0 [5,] 6 1.0 [6,] 6 1.0 [7,] 1 1.5 [8,] 1 1.5 [9,] 1 1.5 [10,] 4 2.0 [11,] 4 2.0 [12,] 3 2.5 [13,] 3 2.5 [14,] 3 2.5 m-c(1,1,3,1,2) y y1 y2 [1,] 2 0.5 [2,] 6 1.0 [3,] 1 1.5 [4,] 1 1.5 [5,] 1 1.5 [6,] 4 2.0 [7,] 3 2.5 [8,] 3 2.5 can anyone suggest the best way to do this? Many thanks in advance, Farida [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] What am I doing wrong in my loops?
Dear kind list people: I have the following code: hours [1] 0 1 2 4 5 6 7 8 9 10 11 12 13 14 15 [16] 16 17 18 19 20 21 22 23 alist $`0` [1] 3 10 10 6 5 6 4 8 9 3 7 5 8 3 6 7 2 6 6 1 4 8 10 4 10 [26] 13 6 2 8 4 7 3 4 7 9 6 4 7 4 4 4 3 $`1` [1] 1 1 3 2 3 4 2 1 $`2` [1] 1 1 3 3 . . . mn=c(length(alist)*1000) il=c(length(alist)*1000) # Now calculate the means for(i in 1:length(alist)) { for(j in 1:1000) {mn[i*j]=mean(sample(alist[[i]], 1000, replace=TRUE)); il[i*j]= hours[i]}} But not even the il vector is correct: il [1] 0 1 2 4 5 6 7 8 9 10 11 12 13 14 [15] 15 16 17 18 19 20 21 22 23 12 5 13 9 14 [29] 0 15 0 16 11 17 7 18 0 19 13 20 0 21 [43] 0 22 15 23 0 16 7 10 17 13 0 18 11 14 [57] 19 1 0 20 0 1 21 16 13 22 0 17 23 14 [71] 0 18 0 1 15 19 11 13 0 20 9 1 0 21 . . . and after a while, both mn and il get lots of NAs. The first 1000 entries should be 0. What am I doing wrong? And is there a way to do this without the two for loops? Thanks, Jim Rome __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Plotting power function to practice data
Hi, I tried to be comprehensive but Jim's comment is indeed in place. I have data of a practice experiment where people practice a certain motor task and time-to-completion was recorded. Appropriately, the time measure declines as practice goes on. And, again appropriately, the relation seems to be non-linear. It looks like y=1/x but much less steep. I understand that the general case of such functions is called power-function. So what I'm looking for is something like abline() with a power-function fit (rather then a linear one). Jim is also correct writing that I would like to have separate fits for 'c' and 'nc'. Of course this can be achieved using subset() but, as Dennis wrote, some graphic functions include an option to graph the data by groups. Thanks again for any tip or reference. dror - On Tue, Dec 29, 2009 at 10:56 PM, jim holtman jholt...@gmail.com wrote: It would help if you were more explicit on what you were trying to look at. I assume that you want two curves ('c' and'nc') on one graphs and you can do that with the basic plot routines, or you can use the 'lattice' package, but without knowing what you are looking for, it is hard to tell. On Tue, Dec 29, 2009 at 3:33 PM, Dror D Lev dror.te...@gmail.com wrote: Hello, I have practice data of motor action in the format: S | Cond. | Time +-+ 01 | c | 1.23 01 | nc| 0.89 02 | c | 2.15 02 | nc| 1.80 . I want to look at the learning curves graphically. I will appreciate pointers to relevant functions / packages. Thanks in advance, dror [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Squared correlation error
Hi everyone, Thanks a lot for the explanation⦠I tried the following code to compute R2 for a regression system but it does not work: my_svm_model - function(myformula, mydata, mytestdata) { mymodel - svm(myformula, data=mydata) k- summary(mymodel) k$r.squared } Can anyone please tell me what I have to change to compute R2? Many thanks, Nancy Date: Tue, 29 Dec 2009 17:28:59 + From: rip...@stats.ox.ac.uk To: spec...@stat.berkeley.edu CC: nancyada...@hotmail.com; Subject: Re: [R] R2 On Tue, 29 Dec 2009, Phil Spector wrote: Nancy - Please notice that ** is not an R operator. The caret (^) is the exponentiation operator in R. - Phil Actually, no, 2**3 [1] 8 Indeed ^ is the preferred exponentiation operator, but ** has 'always' been allowed. See the following note in ?^ Note: ʽ**ʼ is translated in the parser to ʽ^ʼ, but this was undocumented for many years. It appears as an index entry in Becker _et al_ (1988), pointing to the help for ʽDeprecatedʼ but is not actually mentioned on that page. Even though it has been deprecated in S for 20 years, it is still accepted. I added that note in May 2008 (and it is not intended to be a reference to current versions of S-PLUS, since we cannot keep checking that, but that Svr4 accepted it). On Tue, 29 Dec 2009, Nancy Adam wrote: Hi everyone, I tried to write the code of computing R2 for a regression system but I failed. This is the code I use for computing RMSE: my_svm_model - function(myformula, mydata, mytestdata) { mymodel - svm(myformula, data=mydata) mytest - predict(mymodel, mytestdata) error - mytest - mytestdata[,1] -sqrt(mean(error**2)) } can anyone please tell me what I have to change to compute R2 instead of RMSE? Many thanks, Nancy _ [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 _ Keep your friends updatedâeven when youâre not signed in. http://www.microsoft.com/middleeast/windows/windowslive/see-it-in-action/social-network-basics.aspx?ocid=PID23461::T:WLMTAGL:ON:WL:en-xm:SI_SB_5:092010 [[alternative HTML version deleted]] _ Windows Live Hotmail: Your friends can get your Facebook updates, right from Hotmail®. http://www.microsoft.com/middleeast/windows/windowslive/see-it-in-action/social-network-basics.aspx?ocid=PID23461::T:WLMTAGL:ON:WL:en-xm:SI_SB_4:092009 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] NA or work around ??
Try this: Dataset$LLCI-with(Dataset, ave(Rcil, LEAID, Property, FUN = function(x)max(x, na.rm = TRUE))) On Wed, Dec 30, 2009 at 2:44 PM, L.A. ro...@millect.com wrote: I've searched and tried several ideas (na.action. and other things), but I can't see to figure this out. I'm guessing this is so simple I'll feel foolish for asking, but here goes. Thanks, L.A. Dataset$Rcil=with(Dataset, ifelse(Rpr = .95, Dataset[,percentchgn], NA)) Dataset$LLCI-with(Dataset, ave(Rcil, LEAID, Property, FUN=function(x)max(x))) LEAID percentchgn Property Rpr Rcil LLCI 12 2036 12.190220 UNSOLD 0.237 12.190220 16.09097 13 2036 14.741559 UNSOLD 0.9992714 14.741559 16.09097 14 2036 15.882518 UNSOLD 0.9955750 15.882518 16.09097 15 2036 16.090965 UNSOLD 0.9807892 16.090965 16.09097 34 2201 -6.542363 UNSOLD -1.000 NA NA 55 2201 -5.060431 UNSOLD 0.9848659 -5.060431 NA 56 2201 -5.056231 UNSOLD 0.9699841 -5.056231 NA 57 2201 -4.994895 UNSOLD 0.9441028 NA NA 58 2201 -4.982279 UNSOLD 0.9020456 NA NA 99 2202 -8.982821 UNSOLD -1.000 NA NA 241 2202 9.704573 UNSOLD -1.000 NA NA 242 2202 10.447709 UNSOLD 0.237 10.447709 NA 243 2202 12.616943 UNSOLD 0.9992714 12.616943 NA 244 2202 12.943887 SOLD 0.9955750 12.943887 NA 245 2202 13.774406 SOLD 0.9807892 13.774406 NA 246 2202 15.007848 SOLD 0.9364319 NA NA 333 2333 2.099854 UNSOLD 0.7853246 NA NA 333 2333 2.343452 UNSOLD 0.7853246 NA NA What I'm trying to accomplish is: LEAID percentchgn Property Rpr Rcil LLCI 12 2036 12.190220 UNSOLD 0.237 12.190220 16.09097 13 2036 14.741559 UNSOLD 0.9992714 14.741559 16.09097 14 2036 15.882518 UNSOLD 0.9955750 15.882518 16.09097 15 2036 16.090965 UNSOLD 0.9807892 16.090965 16.09097 34 2201 -6.542363 UNSOLD -1.000 NA -5.056231 55 2201 -5.060431 UNSOLD 0.9848659 -5.060431 -5.056231 56 2201 -5.056231 UNSOLD 0.9699841 -5.056231 -5.056231 57 2201 -4.994895 UNSOLD 0.9441028 NA -5.056231 58 2201 -4.982279 UNSOLD 0.9020456 NA -5.056231 99 2202 -8.982821 UNSOLD -1.000 NA 12.616943 241 2202 9.704573 UNSOLD -1.000 NA 12.616943 242 2202 10.447709 UNSOLD 0.237 10.447709 12.616943 243 2202 12.616943 UNSOLD 0.9992714 12.616943 12.616943 244 2202 12.943887 SOLD 0.9955750 12.943887 13.774406 245 2202 13.774406 SOLD 0.9807892 13.774406 13.774406 246 2202 15.007848 SOLD 0.9364319 NA 13.774406 333 2333 2.099854 UNSOLD 0.7853246 NA NA 333 2333 2.343452 UNSOLD 0.7853246 NA NA -- View this message in context: http://n4.nabble.com/NA-or-work-around-tp991019p991019.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] What am I doing wrong in my loops?
First of all your expression 'i*j' is storing on top of one another. Look at what happens in the case of i=1,j=8 i=2,j=4 i=4,j=2, These are all storing to the same location. Instead make 'mn' 'li' matrices: mn - matrix(length(alist), 1000) and then within the loop: mn[i,j] - value On Wed, Dec 30, 2009 at 11:52 AM, James Rome jamesr...@gmail.com wrote: Dear kind list people: I have the following code: hours [1] 0 1 2 4 5 6 7 8 9 10 11 12 13 14 15 [16] 16 17 18 19 20 21 22 23 alist $`0` [1] 3 10 10 6 5 6 4 8 9 3 7 5 8 3 6 7 2 6 6 1 4 8 10 4 10 [26] 13 6 2 8 4 7 3 4 7 9 6 4 7 4 4 4 3 $`1` [1] 1 1 3 2 3 4 2 1 $`2` [1] 1 1 3 3 . . . mn=c(length(alist)*1000) il=c(length(alist)*1000) # Now calculate the means for(i in 1:length(alist)) { for(j in 1:1000) {mn[i*j]=mean(sample(alist[[i]], 1000, replace=TRUE)); il[i*j]= hours[i]}} But not even the il vector is correct: il [1] 0 1 2 4 5 6 7 8 9 10 11 12 13 14 [15] 15 16 17 18 19 20 21 22 23 12 5 13 9 14 [29] 0 15 0 16 11 17 7 18 0 19 13 20 0 21 [43] 0 22 15 23 0 16 7 10 17 13 0 18 11 14 [57] 19 1 0 20 0 1 21 16 13 22 0 17 23 14 [71] 0 18 0 1 15 19 11 13 0 20 9 1 0 21 . . . and after a while, both mn and il get lots of NAs. The first 1000 entries should be 0. What am I doing wrong? And is there a way to do this without the two for loops? Thanks, Jim Rome __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] What am I doing wrong in my loops?
Oops. Thank you. I guess I needed mn[i*1000 + j]. I should have known better. But I had wanted to make this into a data frame with df=data.frame(mn, il) so that I could plot it using factors. Can I do that with a matrix? And can I eliminate the loops? On 12/30/09 12:11 PM, jim holtman wrote: First of all your expression 'i*j' is storing on top of one another. Look at what happens in the case of i=1,j=8 i=2,j=4 i=4,j=2, These are all storing to the same location. Instead make 'mn' 'li' matrices: mn - matrix(length(alist), 1000) and then within the loop: mn[i,j] - value On Wed, Dec 30, 2009 at 11:52 AM, James Rome jamesr...@gmail.com mailto:jamesr...@gmail.com wrote: Dear kind list people: I have the following code: hours [1] 0 1 2 4 5 6 7 8 9 10 11 12 13 14 15 [16] 16 17 18 19 20 21 22 23 alist $`0` [1] 3 10 10 6 5 6 4 8 9 3 7 5 8 3 6 7 2 6 6 1 4 8 10 4 10 [26] 13 6 2 8 4 7 3 4 7 9 6 4 7 4 4 4 3 $`1` [1] 1 1 3 2 3 4 2 1 $`2` [1] 1 1 3 3 . . . mn=c(length(alist)*1000) il=c(length(alist)*1000) # Now calculate the means for(i in 1:length(alist)) { for(j in 1:1000) {mn[i*j]=mean(sample(alist[[i]], 1000, replace=TRUE)); il[i*j]= hours[i]}} But not even the il vector is correct: il [1] 0 1 2 4 5 6 7 8 9 10 11 12 13 14 [15] 15 16 17 18 19 20 21 22 23 12 5 13 9 14 [29] 0 15 0 16 11 17 7 18 0 19 13 20 0 21 [43] 0 22 15 23 0 16 7 10 17 13 0 18 11 14 [57] 19 1 0 20 0 1 21 16 13 22 0 17 23 14 [71] 0 18 0 1 15 19 11 13 0 20 9 1 0 21 . . . and after a while, both mn and il get lots of NAs. The first 1000 entries should be 0. What am I doing wrong? And is there a way to do this without the two for loops? Thanks, Jim Rome __ R-help@r-project.org mailto:R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html http://www.r-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Removing replicated rows
Try this: unique(x)[rep(1:nrow(unique(x)), m),] On Wed, Dec 30, 2009 at 2:35 PM, farida...@gmail.com wrote: Dear All, I have a matrix like X and need to create a new matrix based on the vector m which gives the number of replicates for each unique row. The resulting matrix should look like y. x x1 x2 [1,] 2 0.5 [2,] 2 0.5 [3,] 2 0.5 [4,] 6 1.0 [5,] 6 1.0 [6,] 6 1.0 [7,] 1 1.5 [8,] 1 1.5 [9,] 1 1.5 [10,] 4 2.0 [11,] 4 2.0 [12,] 3 2.5 [13,] 3 2.5 [14,] 3 2.5 m-c(1,1,3,1,2) y y1 y2 [1,] 2 0.5 [2,] 6 1.0 [3,] 1 1.5 [4,] 1 1.5 [5,] 1 1.5 [6,] 4 2.0 [7,] 3 2.5 [8,] 3 2.5 can anyone suggest the best way to do this? Many thanks in advance, Farida [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Identifying outliers in non-normally distributed data
Greetings: I could also use guidance on this topic. I provide manure sample proficiency sets to agricultural labs in the United States and Canada. There are about 65 labs in the program. My data sets are much smaller and typically non-symmetrical with obvious outliers. Usually, there are 30 to 60 sets of data, each with triple replicates (90 to 180 observations). There are definitely outliers caused by the following: reporting in the wrong units, sending in the wrong spreadsheet, entering data in the wrong row, misplacing decimal points, calculation errors, etc. For each analysis, it is common that two to three labs make these types of errors. Since there are replicates, errors like misplaced decimal points are more obvious. However, most of the outlier errors are repeated for all three replicates. I use the median and Median Absolute Deviation (MAD, constant = 1) to flag labs for accuracy. Labs where the average of their three reps deviates more than 2.5 MAD values from the median are flagged for accuracy. With this method, it is not necessary to identify the outliers. A collegue suggested running the data twice. On the first run, outliers more than 4.0 MAD units from the median are removed. On the second run, values exceeding 2.9 times the MAD are flagged for accuracy. I tried this in R with a normally distributed data set of 100,000, and the 4.0 MAD values were nearly identical to the outliers identified with boxplot. With my data set, the flags do not change very much if the data is run one time with the flags set at 2.5 MAD units compared to running the data twice and removing the 4.0 MAD outliers and flagging the second set at 2.9 MAD units. Using either one of these methods might work for you, but I am not sure of the statistical value of these methods. Yours, Jerry Floren Brian G. Peterson wrote: John wrote: Hello, I've been searching for a method for identify outliers for quite some time now. The complication is that I cannot assume that my data is normally distributed nor symmetrical (i.e. some distributions might have one longer tail) so I have not been able to find any good tests. The Walsh's Test (http://www.statistics4u.info/ fundsta...liertest.html#), as I understand assumes that the data is symmetrical for example. Also, while I've found some interesting articles: http://tinyurl.com/yc7w4oq (Missing Values, Outliers, Robust Statistics Non-parametric Methods) I don't really know what to use. Any ideas? Any R packages available for this? Thanks! PS. My data has 1000's of observations.. Take a look at package 'robustbase', it provides most of the standard robust measures and calculations. While you didn't say what kind of data you're trying to identify outliers in, if it is time series data the function Return.clean in PerformanceAnalytics may be useful. Regards, - Brian -- Brian G. Peterson http://braverock.com/brian/ Ph: 773-459-4973 IM: bgpbraverock __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://n4.nabble.com/Identifying-outliers-in-non-normally-distributed-data-tp987921p991062.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Plotting power function to practice data
Sounds like you might want to use nls() to fit the data and then use either curve() or predict() to do the plotting. -Peter Ehlers Dror D Lev wrote: Hi, I tried to be comprehensive but Jim's comment is indeed in place. I have data of a practice experiment where people practice a certain motor task and time-to-completion was recorded. Appropriately, the time measure declines as practice goes on. And, again appropriately, the relation seems to be non-linear. It looks like y=1/x but much less steep. I understand that the general case of such functions is called power-function. So what I'm looking for is something like abline() with a power-function fit (rather then a linear one). Jim is also correct writing that I would like to have separate fits for 'c' and 'nc'. Of course this can be achieved using subset() but, as Dennis wrote, some graphic functions include an option to graph the data by groups. Thanks again for any tip or reference. dror - On Tue, Dec 29, 2009 at 10:56 PM, jim holtman jholt...@gmail.com wrote: It would help if you were more explicit on what you were trying to look at. I assume that you want two curves ('c' and'nc') on one graphs and you can do that with the basic plot routines, or you can use the 'lattice' package, but without knowing what you are looking for, it is hard to tell. On Tue, Dec 29, 2009 at 3:33 PM, Dror D Lev dror.te...@gmail.com wrote: Hello, I have practice data of motor action in the format: S | Cond. | Time +-+ 01 | c | 1.23 01 | nc| 0.89 02 | c | 2.15 02 | nc| 1.80 . I want to look at the learning curves graphically. I will appreciate pointers to relevant functions / packages. Thanks in advance, dror [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Peter Ehlers University of Calgary 403.202.3921 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] What am I doing wrong in my loops?
Here is a hint as to how to replace the loops: hours - 0:5 result - lapply(hours, function(.hr){ + cbind(.hr, replicate(5, mean(sample(1:100, 1000, replace=TRUE + }) do.call(rbind, result) .hr [1,] 0 50.232 [2,] 0 51.135 [3,] 0 49.169 [4,] 0 50.264 [5,] 0 52.621 [6,] 1 51.852 [7,] 1 50.557 [8,] 1 50.501 [9,] 1 49.710 [10,] 1 49.867 [11,] 2 50.536 [12,] 2 51.562 [13,] 2 51.269 [14,] 2 50.638 [15,] 2 51.083 [16,] 3 49.785 [17,] 3 48.843 [18,] 3 51.156 [19,] 3 50.877 [20,] 3 51.012 [21,] 4 52.214 [22,] 4 50.470 [23,] 4 50.388 [24,] 4 49.600 [25,] 4 51.057 [26,] 5 51.202 [27,] 5 51.022 [28,] 5 52.084 [29,] 5 49.944 [30,] 5 49.946 On Wed, Dec 30, 2009 at 12:16 PM, James Rome jamesr...@gmail.com wrote: Oops. Thank you. I guess I needed mn[i*1000 + j]. I should have known better. But I had wanted to make this into a data frame with df=data.frame(mn, il) so that I could plot it using factors. Can I do that with a matrix? And can I eliminate the loops? On 12/30/09 12:11 PM, jim holtman wrote: First of all your expression 'i*j' is storing on top of one another. Look at what happens in the case of i=1,j=8 i=2,j=4 i=4,j=2, These are all storing to the same location. Instead make 'mn' 'li' matrices: mn - matrix(length(alist), 1000) and then within the loop: mn[i,j] - value On Wed, Dec 30, 2009 at 11:52 AM, James Rome jamesr...@gmail.com wrote: Dear kind list people: I have the following code: hours [1] 0 1 2 4 5 6 7 8 9 10 11 12 13 14 15 [16] 16 17 18 19 20 21 22 23 alist $`0` [1] 3 10 10 6 5 6 4 8 9 3 7 5 8 3 6 7 2 6 6 1 4 8 10 4 10 [26] 13 6 2 8 4 7 3 4 7 9 6 4 7 4 4 4 3 $`1` [1] 1 1 3 2 3 4 2 1 $`2` [1] 1 1 3 3 . . . mn=c(length(alist)*1000) il=c(length(alist)*1000) # Now calculate the means for(i in 1:length(alist)) { for(j in 1:1000) {mn[i*j]=mean(sample(alist[[i]], 1000, replace=TRUE)); il[i*j]= hours[i]}} But not even the il vector is correct: il [1] 0 1 2 4 5 6 7 8 9 10 11 12 13 14 [15] 15 16 17 18 19 20 21 22 23 12 5 13 9 14 [29] 0 15 0 16 11 17 7 18 0 19 13 20 0 21 [43] 0 22 15 23 0 16 7 10 17 13 0 18 11 14 [57] 19 1 0 20 0 1 21 16 13 22 0 17 23 14 [71] 0 18 0 1 15 19 11 13 0 20 9 1 0 21 . . . and after a while, both mn and il get lots of NAs. The first 1000 entries should be 0. What am I doing wrong? And is there a way to do this without the two for loops? Thanks, Jim Rome __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Squared correlation error
Hi Nancy, On Wed, Dec 30, 2009 at 12:08 PM, Nancy Adam nancyada...@hotmail.com wrote: Hi everyone, Thanks a lot for the explanation… I tried the following code to compute R2 for a regression system but it does not work: my_svm_model - function(myformula, mydata, mytestdata) { mymodel - svm(myformula, data=mydata) k- summary(mymodel) k$r.squared } Can anyone please tell me what I have to change to compute R2? Out of curiosity, what makes you think you'll find r.squared in your k object? Did you see that in the help somewhere? Anyway, given that you've already shown that you know how to calculate RMSE of a regression model, it should be pretty straightforward to rig up the formulas found on the wikipedia page to get R^2: http://en.wikipedia.org/wiki/Coefficient_of_determination -steve -- Steve Lianoglou Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University Contact Info: http://cbio.mskcc.org/~lianos/contact __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Squared correlation error
There is a function called defaultSummay in the caret package that will compute r-squared and rmse. Max On Dec 30, 2009, at 1:30 PM, Steve Lianoglou mailinglist.honey...@gmail.com wrote: Hi Nancy, On Wed, Dec 30, 2009 at 12:08 PM, Nancy Adam nancyada...@hotmail.com wrote: Hi everyone, Thanks a lot for the explanation… I tried the following code to compute R2 for a regression system but it does not work: my_svm_model - function(myformula, mydata, mytestdata) { mymodel - svm(myformula, data=mydata) k- summary(mymodel) k$r.squared } Can anyone please tell me what I have to change to compute R2? Out of curiosity, what makes you think you'll find r.squared in your k object? Did you see that in the help somewhere? Anyway, given that you've already shown that you know how to calculate RMSE of a regression model, it should be pretty straightforward to rig up the formulas found on the wikipedia page to get R^2: http://en.wikipedia.org/wiki/Coefficient_of_determination -steve -- Steve Lianoglou Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University Contact Info: http://cbio.mskcc.org/~lianos/contact __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Identifying outliers in non-normally distributed data
Gents: Whole books could be -- and have been -- written on this matter. Personally, when scientists start asking me about criteria for outlier removal, it sends shivers up my spine and I break into a cold, clammy sweat. What is an outlier anyway? Statisticians (of which I'm one) have promulgated the deception that outliers can be defined by purely statistical criteria, and that they can then be removed from the analysis. That is a lie. The only acceptable scientific definition of an outlier that can be legitimately removed is of data that can be confirmed to have been corrupted in some way, for example, as Jerry describes below. All purely statistical criteria are arbitrary in some way and therefore potentially dangerous. The real question is: what is the scientific purpose of the analysis? -- how are the results to be used? There are a variety of effective so-called robust/resistant statistical procedures (e.g. see the R packages robust, robustbase, and rrcov, among many others) that might then be useful to accomplish the purpose even in the presence of unusual values (outliers is a term I now avoid due to its 'political' implications). This is almost always a wiser course of action (there are even theoretical justifications for this) than using statistical criteria to identify and remove the unusual values. However, use of such tools involves subtle issues that probably cannot be properly aired in a forum such as this. I therefore think you would do well to get a competent local statistician to consult with on these matters. Yes, I do believe that scientists often require advanced statistical tools that go beyond their usual training to properly analyze even what appear to be straightforward scientific data. It is a conundrum I cannot resolve, but that does not mean I can deny it. Finally, a word of wisdom from a long-ago engineering colleague: Whenever I see an outlier, I'm never sure whether to throw it away or patent it. Cheers, Bert Gunter Genentech Nonclinical Statistics -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Jerry Floren Sent: Wednesday, December 30, 2009 9:47 AM To: r-help@r-project.org Subject: Re: [R] Identifying outliers in non-normally distributed data Greetings: I could also use guidance on this topic. I provide manure sample proficiency sets to agricultural labs in the United States and Canada. There are about 65 labs in the program. My data sets are much smaller and typically non-symmetrical with obvious outliers. Usually, there are 30 to 60 sets of data, each with triple replicates (90 to 180 observations). There are definitely outliers caused by the following: reporting in the wrong units, sending in the wrong spreadsheet, entering data in the wrong row, misplacing decimal points, calculation errors, etc. For each analysis, it is common that two to three labs make these types of errors. Since there are replicates, errors like misplaced decimal points are more obvious. However, most of the outlier errors are repeated for all three replicates. I use the median and Median Absolute Deviation (MAD, constant = 1) to flag labs for accuracy. Labs where the average of their three reps deviates more than 2.5 MAD values from the median are flagged for accuracy. With this method, it is not necessary to identify the outliers. A collegue suggested running the data twice. On the first run, outliers more than 4.0 MAD units from the median are removed. On the second run, values exceeding 2.9 times the MAD are flagged for accuracy. I tried this in R with a normally distributed data set of 100,000, and the 4.0 MAD values were nearly identical to the outliers identified with boxplot. With my data set, the flags do not change very much if the data is run one time with the flags set at 2.5 MAD units compared to running the data twice and removing the 4.0 MAD outliers and flagging the second set at 2.9 MAD units. Using either one of these methods might work for you, but I am not sure of the statistical value of these methods. Yours, Jerry Floren Brian G. Peterson wrote: John wrote: Hello, I've been searching for a method for identify outliers for quite some time now. The complication is that I cannot assume that my data is normally distributed nor symmetrical (i.e. some distributions might have one longer tail) so I have not been able to find any good tests. The Walsh's Test (http://www.statistics4u.info/ fundsta...liertest.html#), as I understand assumes that the data is symmetrical for example. Also, while I've found some interesting articles: http://tinyurl.com/yc7w4oq (Missing Values, Outliers, Robust Statistics Non-parametric Methods) I don't really know what to use. Any ideas? Any R packages available for this? Thanks! PS. My data has 1000's of observations.. Take a look at package 'robustbase', it provides most of the standard robust measures and calculations.
Re: [R] Removing replicated rows
It worked perfectly. Thank you! On Wed, Dec 30, 2009 at 12:21 PM, Henrique Dallazuanna www...@gmail.comwrote: Try this: unique(x)[rep(1:nrow(unique(x)), m),] On Wed, Dec 30, 2009 at 2:35 PM, farida...@gmail.com wrote: Dear All, I have a matrix like X and need to create a new matrix based on the vector m which gives the number of replicates for each unique row. The resulting matrix should look like y. x x1 x2 [1,] 2 0.5 [2,] 2 0.5 [3,] 2 0.5 [4,] 6 1.0 [5,] 6 1.0 [6,] 6 1.0 [7,] 1 1.5 [8,] 1 1.5 [9,] 1 1.5 [10,] 4 2.0 [11,] 4 2.0 [12,] 3 2.5 [13,] 3 2.5 [14,] 3 2.5 m-c(1,1,3,1,2) y y1 y2 [1,] 2 0.5 [2,] 6 1.0 [3,] 1 1.5 [4,] 1 1.5 [5,] 1 1.5 [6,] 4 2.0 [7,] 3 2.5 [8,] 3 2.5 can anyone suggest the best way to do this? Many thanks in advance, Farida [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Plotting power function to practice data
Thank you Peter. nls() predict() seems to do the job. dror - On Wed, Dec 30, 2009 at 8:10 PM, Peter Ehlers ehl...@ucalgary.ca wrote: Sounds like you might want to use nls() to fit the data and then use either curve() or predict() to do the plotting. -Peter Ehlers Dror D Lev wrote: Hi, I tried to be comprehensive but Jim's comment is indeed in place. I have data of a practice experiment where people practice a certain motor task and time-to-completion was recorded. Appropriately, the time measure declines as practice goes on. And, again appropriately, the relation seems to be non-linear. It looks like y=1/x but much less steep. I understand that the general case of such functions is called power-function. So what I'm looking for is something like abline() with a power-function fit (rather then a linear one). Jim is also correct writing that I would like to have separate fits for 'c' and 'nc'. Of course this can be achieved using subset() but, as Dennis wrote, some graphic functions include an option to graph the data by groups. Thanks again for any tip or reference. dror - On Tue, Dec 29, 2009 at 10:56 PM, jim holtman jholt...@gmail.com wrote: It would help if you were more explicit on what you were trying to look at. I assume that you want two curves ('c' and'nc') on one graphs and you can do that with the basic plot routines, or you can use the 'lattice' package, but without knowing what you are looking for, it is hard to tell. On Tue, Dec 29, 2009 at 3:33 PM, Dror D Lev dror.te...@gmail.com wrote: Hello, I have practice data of motor action in the format: S | Cond. | Time +-+ 01 | c | 1.23 01 | nc| 0.89 02 | c | 2.15 02 | nc| 1.80 . I want to look at the learning curves graphically. I will appreciate pointers to relevant functions / packages. Thanks in advance, dror [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html http://www.r-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Peter Ehlers University of Calgary 403.202.3921 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] lm() and factors appending
How for the love of god can I prevent the lm() function from padding on to my factor variables? I start out with 2 tables: Table1 123123 124351 ... 626773 Table2 Count,IS_DEAD,IS_BURNING 1231,T,F 4521,F,T ... 3321,T,T Everything looks fine when I import the data. then we get a oh_crap - lm(table1 ~ Count + IS_DEAD + IS_BURNING, table2) Magically when I look at my oh_crap coefficents they get turned into Count, IS_DEADTRUE, IS_BURNINGTRUE I get it that it finds them to be factors by how in the name of all that is holy do I prevent them from doing that crap since later after a stepwise removal I go into one of the models grab what coefficents were kept (IS_BURNINGTTRUE now) and read future regressors from the original table which read IS_BURNING rather then IS_BURNINGTRUE. Since there is a mix of numeric and dummy regressors I cannot selectively append TRUE to variables names as I don't have control of what regressors get imported (one month there might be 50 and another month 2). How can I stop lm() from padding onto a coefficent's name? I have no objection to post processing by finding and assasinating any name with the word TRUE appended to the end of a name but in that case then how do I change the coeff name? Maddening... who thought it would be a good idea to append things without asking? A simple lm(appendFactor=FALSE) would have been nice... took me 3 hours to find out what was going wrong on this Grabbing coffee, a carton of cigs, and heading outside to smash my head with a brick to make the hurting stop [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] capturing stdout
I'm wondering if anyone knows of a way to capture system messages, for instance when I do the following: system(../DBScripts/getEODData.sh) .//LSE_20091230.txt: 154.80 kB 207.24 kB/s ./Fundamentals//LSE.txt: 420.58 kB 301.47 kB/s ./Fundamentals//MLSE.txt:3.42 kB 16.20 kB/s Remote host has closed the connection. ncftpget /: remote host closed control connection. I'd like to get the stdout from the system call into some memory location where I can play with it... Is there a simple way to do this? [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] capturing stdout
Nick, You don't specify OS, so details might differ, but on my linux system the help for system (which should be the first place you'd look anyway) has an entire section on stdout and stderr Stdout and stderr: Error messages written to ‘stderr’ will be sent by the shell to the terminal unless ‘ignore.stderr = TRUE’. They can be captured (in the most likely shells) by system(some command 21, intern=TRUE) What happens to output sent to ‘stdout’ and ‘stderr’ if ‘intern = FALSE’ is interface-specific, and it is unsafe to assume that such messages will appear on a console (they do on the Mac OS X console, but not on some others). Sarah On Wed, Dec 30, 2009 at 3:15 PM, Nick Torenvliet nick.torenvl...@gmail.com wrote: I'm wondering if anyone knows of a way to capture system messages, for instance when I do the following: system(../DBScripts/getEODData.sh) .//LSE_20091230.txt: 154.80 kB 207.24 kB/s ./Fundamentals//LSE.txt: 420.58 kB 301.47 kB/s ./Fundamentals//MLSE.txt: 3.42 kB 16.20 kB/s Remote host has closed the connection. ncftpget /: remote host closed control connection. I'd like to get the stdout from the system call into some memory location where I can play with it... Is there a simple way to do this? -- Sarah Goslee http://www.functionaldiversity.org __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] capturing stdout
?capture.output ?sink What is the problem you are trying to solve? Sent from my iPhone. On Dec 30, 2009, at 15:15, Nick Torenvliet nick.torenvl...@gmail.com wrote: I'm wondering if anyone knows of a way to capture system messages, for instance when I do the following: system(../DBScripts/getEODData.sh) .//LSE_20091230.txt: 154.80 kB 207.24 kB/s ./Fundamentals//LSE.txt: 420.58 kB 301.47 kB/s ./Fundamentals//MLSE.txt:3.42 kB 16.20 kB/s Remote host has closed the connection. ncftpget /: remote host closed control connection. I'd like to get the stdout from the system call into some memory location where I can play with it... Is there a simple way to do this? [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] lm() and factors appending
Gee, so much excitement over such a little thing. So how exactly would you prefer lm() to tell you what levels of your factors are being taken as the reference levels? Post-processing, if necessary, would seem to be pretty trivial. If you want to re-run lm() on modified sets of predictors, what's wrong with editing the lm(...) command or using update()? -Peter Ehlers Idgarad wrote: How for the love of god can I prevent the lm() function from padding on to my factor variables? I start out with 2 tables: Table1 123123 124351 ... 626773 Table2 Count,IS_DEAD,IS_BURNING 1231,T,F 4521,F,T ... 3321,T,T Everything looks fine when I import the data. then we get a oh_crap - lm(table1 ~ Count + IS_DEAD + IS_BURNING, table2) Magically when I look at my oh_crap coefficents they get turned into Count, IS_DEADTRUE, IS_BURNINGTRUE I get it that it finds them to be factors by how in the name of all that is holy do I prevent them from doing that crap since later after a stepwise removal I go into one of the models grab what coefficents were kept (IS_BURNINGTTRUE now) and read future regressors from the original table which read IS_BURNING rather then IS_BURNINGTRUE. Since there is a mix of numeric and dummy regressors I cannot selectively append TRUE to variables names as I don't have control of what regressors get imported (one month there might be 50 and another month 2). How can I stop lm() from padding onto a coefficent's name? I have no objection to post processing by finding and assasinating any name with the word TRUE appended to the end of a name but in that case then how do I change the coeff name? Maddening... who thought it would be a good idea to append things without asking? A simple lm(appendFactor=FALSE) would have been nice... took me 3 hours to find out what was going wrong on this Grabbing coffee, a carton of cigs, and heading outside to smash my head with a brick to make the hurting stop [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Peter Ehlers University of Calgary 403.202.3921 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] select elements and transpose
Hi: An alternative is the following: # This regenerates your dataset: xx - list(matrix(1:9, nrow = 3, byrow = TRUE), matrix(10:18, nrow = 3, byrow = TRUE), matrix(19:27, nrow = 3, byrow = TRUE)) # Flatten matrix elements of each component into vectors: xxf - lapply(xx, function(x) {y - t(x); dim(y) - NULL; y}) # rbind them into a matrix: do.call(rbind, xxf) # If you prefer the result as a long vector, then as.vector(do.call(rbind, xxf)) Results: do.call(rbind, xxf) [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [1,]123456789 [2,] 10 11 12 13 14 15 16 17 18 [3,] 19 20 21 22 23 24 25 26 27 as.vector(do.call(rbind, xxf)) [1] 1 10 19 2 11 20 3 12 21 4 13 22 5 14 23 6 15 24 7 16 25 8 17 26 9 [26] 18 27 HTH, Dennis On Wed, Dec 30, 2009 at 5:45 AM, Muhammad Rahiz muhammad.ra...@ouce.ox.ac.uk wrote: Hi all, Given the following, xx [[1]] V1 V2 V3 [1,] 1 2 3 [2,] 4 5 6 [3,] 7 8 9 [[2]] V1 V2 V3 [1,]10 11 12 [2,]13 14 15 [3,]16 17 18 [[3]] V1 V2 V3 [1,]19 20 21 [2,]22 23 24 [3,]25 26 27 how do i extract elements in each file so that after transpose, it looks something like the following; 1 10 19 2 11 20 3 12 21 and so on.. Thanks.. -- Muhammad Rahiz | Doctoral Student in Regional Climate Modeling Climate Research Laboratory, School of Geography the Environment Oxford University Centre for the Environment South Parks Road, Oxford, OX1 3QY, United Kingdom Tel: +44 (0)1865-285194 Mobile: +44 (0)7854-625974 Email: muhammad.ra...@ouce.ox.ac.uk __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] factorial analysis influenced by data skewness
Hello I run a factorial analysis on 20 variables describing the behaviour of insects on 9 different resistant plants. I then biologically interpreted each of the 5 factors obtained, with respect to the variables with the highest loading value for each factor. And I run a GLM on each factor with plant as independent variable. Eventually, I inferred the resistance mechanism by comparing (through post-hoc Dunnett's test) the factor score between each resistant plant and a susceptible standard. This analysis gave me good results. However, I am worried that the factors I obtained may vary with the variability of the variables. For instance, if most of the plants exhibit the same type of resistance, resulting in a skewness of my dataset, do the factors reflect all the resistance mechanisms, even the ones in minority? Additionally, I would appreciate your opinion on conducting GLM on factors. thank you very much in advance. Julien. -- View this message in context: http://n4.nabble.com/factorial-analysis-influenced-by-data-skewness-tp991087p991087.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Fwd: Negbin Error Warnings
Dear Clara, Thanks for the reply. I am forwarding your message to the list, ok. When I wrote was a way of get further information to help the helpers. happy holidays, milton -- Forwarded message -- From: Clara Brück clara_bru...@web.de Date: 2009/12/30 Subject: Re: [R] Negbin Error Warnings To: milton ruser milton.ru...@gmail.com Dear Milton, Thanks for your email. I ran the NBR-model and then tried to compute the first differences of some expected values: m3.x1 -setx(m3, advisory=1) m3.x2 -setx(m3, advisory=0) m3.s1 -sim (m3,x=m3.x1, x1=m3.x2) However, I then get 50 error warnings, and there all like this: In rpois(k, (mu * rgamma(k, theta))/theta) : NAs produced Bests Clara -Ursprüngliche Nachricht- Von: milton ruser milton.ru...@gmail.com Gesendet: 30.12.09 15:06:04 An: Clara Brück clara_bru...@web.de CC: r-help@r-project.org Betreff: Re: [R] Negbin Error Warnings Dear Clara, could you, please, share the summary() of your data. bests milton 2009/12/30 Clara Brück clara_bru...@web.de Hi, I ran a negative binomial regression (NBR) using the Zelig-package and the negbin model. When I then try to use the simumlation approach using the setx () and sim() functions to calculate expected values and first difference for different levels of one of my independent variables, I get 50 errors warnings, telling me that the calculation rpois produced NAs. However, the data I use doesn't have any NAs. Does this mean that the NBR-model is the wrong model for this data? Thanks in advance. ___ Preisknaller: WEB.DE http://web.de/ DSL Flatrate für nur 16,99 Euro/mtl.! http://produkte.web.de/go/02/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-http://www.r-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. ___ Preisknaller: WEB.DE http://web.de/ DSL Flatrate für nur 16,99 Euro/mtl.! http://produkte.web.de/go/02/ [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Positioning plots on top of each other (aligment borders)
On 12/30/2009 08:32 PM, Jay wrote: Hello, I want to place two plots on top of each other. However, the problem is that I can't figure out a simple way to align them correctly. Is there a way to specify this? Since the data is bunch of coordinates and the second layer is an outline of a map (a .ps file I import using the grImport package), I suppose one option would be to specify a set of artificial coordinates that make up the very corners of that plot, and then have the second layer will this same space. Any ideas how to do this? Hi John, I may be mistaken, but wouldn't it be easier to overlay your points on a map generated by the maps package? Your user units would then correspond to your coordinates. Jim __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Error: Insufficient memory. Please close the console
I just upgraded to R version 2.10.1 Patched (2009-12-27 r50842) on WinXP, and I'm getting this error inconsistently both on start and close of sessions. I never saw it on previous versions (from 2.6 to 2.9). Not sure what to check on, I don't believe I'm using any more resources than I normally do. Peter Keller [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Robust Bivariate Poisson
To Experts, x = x1 + x3 y = x2 + x3 x3 = cov(x1,x2) By refer to package bivpois, I'ld like to downweight previous x,y in my data based on date, I am wondering if lambda3 should robust as well, since lambda3 measure the co-relationship of lambda1 and lambda2, but if robust lambda3 might effect the covariance of lambda1 and lambda2 I though. I am sorry if misunderstanding the concept of a bivariate Poisson. Appreciate if experts willing to share precious advice. -- View this message in context: http://n4.nabble.com/Robust-Bivariate-Poisson-tp991214p991214.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Fwd: Negbin Error Warnings
Clara, The most likely problem is that theta==0 for some of the calculations. Or mu is NA, I suppose. -Peter Ehlers milton ruser wrote: Dear Clara, Thanks for the reply. I am forwarding your message to the list, ok. When I wrote was a way of get further information to help the helpers. happy holidays, milton -- Forwarded message -- From: Clara Brück clara_bru...@web.de Date: 2009/12/30 Subject: Re: [R] Negbin Error Warnings To: milton ruser milton.ru...@gmail.com Dear Milton, Thanks for your email. I ran the NBR-model and then tried to compute the first differences of some expected values: m3.x1 -setx(m3, advisory=1) m3.x2 -setx(m3, advisory=0) m3.s1 -sim (m3,x=m3.x1, x1=m3.x2) However, I then get 50 error warnings, and there all like this: In rpois(k, (mu * rgamma(k, theta))/theta) : NAs produced Bests Clara -Ursprüngliche Nachricht- Von: milton ruser milton.ru...@gmail.com Gesendet: 30.12.09 15:06:04 An: Clara Brück clara_bru...@web.de CC: r-help@r-project.org Betreff: Re: [R] Negbin Error Warnings Dear Clara, could you, please, share the summary() of your data. bests milton 2009/12/30 Clara Brück clara_bru...@web.de Hi, I ran a negative binomial regression (NBR) using the Zelig-package and the negbin model. When I then try to use the simumlation approach using the setx () and sim() functions to calculate expected values and first difference for different levels of one of my independent variables, I get 50 errors warnings, telling me that the calculation rpois produced NAs. However, the data I use doesn't have any NAs. Does this mean that the NBR-model is the wrong model for this data? Thanks in advance. ___ Preisknaller: WEB.DE http://web.de/ DSL Flatrate für nur 16,99 Euro/mtl.! http://produkte.web.de/go/02/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-http://www.r-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. ___ Preisknaller: WEB.DE http://web.de/ DSL Flatrate für nur 16,99 Euro/mtl.! http://produkte.web.de/go/02/ [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Peter Ehlers University of Calgary 403.202.3921 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error: Insufficient memory. Please close the console
peter_kel...@fws.gov wrote: I just upgraded to R version 2.10.1 Patched (2009-12-27 r50842) on WinXP, and I'm getting this error inconsistently both on start and close of sessions. I never saw it on previous versions (from 2.6 to 2.9). Not sure what to check on, I don't believe I'm using any more resources than I normally do. Do you have a .RData file loading? If so, delete it (or rename it). -Peter Ehlers Peter Keller [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Peter Ehlers University of Calgary 403.202.3921 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] what don't I get about numeric/double comparisons in R way?
Hi, I'm pretty much an R noob and I'm missing some paradigm in R I think. I can't figure our how to compare numerics. here's a transcript of my tests. Any pointers? print(range_sd) [1] 34.40783 is.numeric(range_sd) [1] TRUE is.numeric(foo) [1] TRUE is.double(range_sd) [1] TRUE is.double(foo) [1] TRUE identical(range_sd, foo) [1] FALSE identical(range_sd, 34.40783) [1] FALSE identical(range_sd, 34.40783, num.eq=FALSE) [1] FALSE identical(range_sd, 34.40783, num.eq=TRUE) [1] FALSE library(RUnit) Warning message: package 'RUnit' was built under R version 2.10.1 checkEquals(range_sd, 34.40783) Error in checkEquals(range_sd, 34.40783) : Mean relative difference: 5.79431e-08 checkEquals(range_sd, foo) Error in checkEquals(range_sd, foo) : Mean relative difference: 5.79431e-08 if( range_sd == 34.40783 ) { print(foo)} if( range_sd == 34.40783 ) { print(foo)} if( range_sd != 34.40783 ) { print(foo)} [1] foo all.equal(range_sd, 34.40783) [1] Mean relative difference: 5.79431e-08 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] what don't I get about numeric/double comparisons in R way?
On 12/31/2009 02:41 PM, donahc...@me.com wrote: Hi, I'm pretty much an R noob and I'm missing some paradigm in R I think. I can't figure our how to compare numerics. here's a transcript of my tests. Any pointers? Hi donahchoo, You're comparing the printed value of range_sd, which has been truncated, to the actual value. As the printout says, the difference is small, but present. If you set range_sd to the printed value: range_sd-34.40783 the comparisons will return TRUE. Jim __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] what don't I get about numeric/double comparisons in R way?
On Dec 30, 2009, at 10:10 PM, Jim Lemon wrote: On 12/31/2009 02:41 PM, donahc...@me.com wrote: Hi, I'm pretty much an R noob and I'm missing some paradigm in R I think. I can't figure our how to compare numerics. here's a transcript of my tests. Any pointers? Hi donahchoo, You're comparing the printed value of range_sd, which has been truncated, to the actual value. As the printout says, the difference is small, but present. If you set range_sd to the printed value: range_sd-34.40783 the comparisons will return TRUE. Jim Thanks, Jim. I figured this out after sending the email, but I still can't compare. Here's some more tests, note that all.equal returns true but so does != actual_mean - mean( range) actual_sd - sd( range) expected_mean - 218.213483146067 expected_sd - 159.277118043936 print(paste(expected_mean, expected_mean)) print(paste(actual_mean, actual_mean)) print(paste(expected_sd, expected_sd)) print(paste(actual_sd, actual_sd)) if( expected_mean != actual_mean ) { stop( not equal ) } Output: source(/Users/adhamh/Developer/r/test.r) [1] expected_mean 218.213483146067 [1] actual_mean 218.213483146067 [1] expected_sd 159.277118043936 [1] actual_sd 159.277118043936 Error in eval.with.vis(expr, envir, enclos) : not equal identical(expected_mean, actual_mean) [1] FALSE all.equal(expected_mean, actual_mean) [1] TRUE __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] cross validation for species distribution
Dear, I wanna make cross-validation for the species data of species distribution models. Please kindly suggest any package containing cross validation suiting the purpose. Thank you. Elaine [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] what don't I get about numeric/double comparisons in R way?
donahc...@me.com wrote: On Dec 30, 2009, at 10:10 PM, Jim Lemon wrote: On 12/31/2009 02:41 PM, donahc...@me.com wrote: Hi, I'm pretty much an R noob and I'm missing some paradigm in R I think. I can't figure our how to compare numerics. here's a transcript of my tests. Any pointers? Hi donahchoo, You're comparing the printed value of range_sd, which has been truncated, to the actual value. As the printout says, the difference is small, but present. If you set range_sd to the printed value: range_sd-34.40783 the comparisons will return TRUE. Jim Thanks, Jim. I figured this out after sending the email, but I still can't compare. Here's some more tests, note that all.equal returns true but so does != actual_mean - mean( range) actual_sd - sd( range) expected_mean - 218.213483146067 expected_sd - 159.277118043936 print(paste(expected_mean, expected_mean)) print(paste(actual_mean, actual_mean)) print(paste(expected_sd, expected_sd)) print(paste(actual_sd, actual_sd)) if( expected_mean != actual_mean ) { stop( not equal ) } Output: source(/Users/adhamh/Developer/r/test.r) [1] expected_mean 218.213483146067 [1] actual_mean 218.213483146067 [1] expected_sd 159.277118043936 [1] actual_sd 159.277118043936 Error in eval.with.vis(expr, envir, enclos) : not equal identical(expected_mean, actual_mean) [1] FALSE all.equal(expected_mean, actual_mean) [1] TRUE Your code is not reproducible, but it seems fairly clear that you haven't yet read the help pages for identical() and all.equal(); or, if you have, you haven't done so very carefully. For ?identical, you'll find this: The function all.equal is also sometimes used to test equality this way, but was intended for something different: it allows for small differences in numeric results. For ?all.equal, you'll see mention of an argument 'tolerance'. As a new R-user, you would be well advised to get very familiar with the '?' key. Welcome to R. -Peter Ehlers __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Peter Ehlers University of Calgary 403.202.3921 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.