[R] mac question
Hi there, currently, I've updated R on my Mac (OS X) to version 2.10. I was wondering if I have to install all additional packages again??? In Windows, I just needed to copy the library folder of the old installation but how does it work with Mac? Can anybody give me a hint? Antje __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] about the cox result
Hi all: I finished cox analysis like this: fit_cox<-coxph(Surv(dat$Time, dat$death) ~ dat$CD4 + strata(dat$gender),data=dat); > fit_cox Call: coxph(formula = Surv(data_ori$Time, data_ori$death) ~ data_ori$drug + strata(data_ori$gender), data = data_ori) coef exp(coef) se(coef)zp data_ori$drugddI 0.216 1.240.146 1.47 0.14 Likelihood ratio test=2.17 on 1 df, p=0.140 n= 467 I wanna extract the result: 0.216 1.240.146 1.47 0.14 and the corresponding name "coef exp(coef) se(coef)zp" from fit_cox. I use the command: str(fit_cox),but I can only find 0.216 of the result,but the other four results( 1.240.146 1.47 0.14) can't be found. Anyone can help me? Thanks a lot! [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] 1 dimensional optimization with local minima
I am using numerical optimization to fit a 1 parameter model, in which the input parameter is bounded. I am currently using optimize(), however, the problem turns out to have local minima, and optimize does not always seem to find the global minimum. I could to write a wrapping function that tries multiple intervals or starting values, but I would prefer a package that has built-in methods to make it more robust against local minima. I checked the CRAN Task View for Optimization, however there seem to be a lot of alternatives, and not being an optimization expert, I could use some advice. What could be an appropriate package or function for one-dimensional bounded optimization, that includes some protection against local minima? - Jeroen Ooms * Dept. of Methodology and Statistics * Utrecht University Visit http://www.jeroenooms.com www.jeroenooms.com to explore some of my current projects. -- View this message in context: http://old.nabble.com/1-dimensional-optimization-with-local-minima-tp26160001p26160001.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] hierarchical clustering with Jaccard index
hi, I want to do hierarchical clustering with Jaccord index. I tried to do with vegan package for finding index and hierarchical clustering with hclust function. While doing clustering it is showing an error message as "invalid distance method". I would be grateful if anyone tells how to rectify the error. Thanks in advance, kind regards, Ms.Karunambigai M PhD Scholar Dept. of Biostatistics NIMHANS Bangalore India Keep up with people you care about with Yahoo! India Mail. Learn how. http://in.overview.mail.yahoo.com/connectmore [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] design matrix construction question
> On Nov 2, 2009, at 10:40 PM, Ben Bolker wrote: > > with the following simple data frame > > dd = structure(list(z = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L > > ), .Label = c("a", "b"), class = "factor"), x = c(0.3, 0.2, 0.1, > > 0, 0, 0, 0.2, 0.3)), .Names = c("z", "x"), row.names = c(NA, > > -8L), class = "data.frame") > > > > I would like know if it's possible to use model.matrix() > > to construct the following design matrix in some sensible way: > > > > za zb > > 1 1 0 > > 2 1 0 > > 3 1 0 > > 4 0 0 > > 5 0 0 > > 6 0 0 > > 7 0 1 > > 8 0 1 How about something like this? dd$xf <- (dd$x > 0) + 0 model.matrix(~ z:xf - 1, dd) za:xf zb:xf 1 1 0 2 1 0 3 1 0 4 0 0 5 0 0 6 0 0 7 0 1 8 0 1 attr(,"assign") [1] 1 1 attr(,"contrasts") attr(,"contrasts")$z [1] "contr.treatment" > > thanks > > Ben Bolker > > > > -- Ken Knoblauch Inserm U846 Stem-cell and Brain Research Institute Department of Integrative Neurosciences 18 avenue du Doyen Lépine 69500 Bron France tel: +33 (0)4 72 91 34 77 fax: +33 (0)4 72 91 34 61 portable: +33 (0)6 84 10 64 10 http://www.sbri.fr/members/kenneth-knoblauch.html __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] help with SSOAP (can't find working examples)
First of all, let me confess that I am a newbie to R and don't know much about the language or the environment. We have a need for plugging in R in our production runtime and need the ability to pull data out of our existing services. I am trying to see if I can take advantage of SSOAP such that we can expose the data via webservices and use SSOAP to call into them. Our runtime is mostly composed of python, so the idea is to use rpy2 to interface from python to R, and use SSOAP to call back into python modules where needed. Now the problem is that I have hard time trying to find a single example that works. I am using windows and here are the version numbers: R: 2.9.2 SSOAP: 0.5-3 XMLSchema: 0.1-1 XML: 2.6-0 The packages are up to date as per "Update packages". I tried all the examples that came under library/SSOAP/examples directory and most failed during genSOAPClientInterface(), while one failed right in processWSDL(). This is the most common error during genSOAPClientInterface(): Error in paste(" .elementFormQualified", .elementFormQualified, sep = " = ") : no slot of name "elementFormQualified" for this object of class "SchemaTypes" There was only one reference to this error in the archives, and it seems like the OP got the problem solved after updating XMLSchema (which happened to be the same version as what I have). I tried a couple of WSDL's that I have for creating python webservices, as well as a "hello world" example, and they all have the same error as well. While KEGG.wsdl, and others gave the above error, the eutils.wsdl gave me the below error: Error: Cannot resolve SOAP type in empty context Interop.wsdl got me the below error: Error: Cannot resolve string in SchemaCollection nwis.wsdl failes in processWSDL() itself, with the below error: Error in parse(text = paste(txt, collapse = "\n")) : unexpected input in "function(x, ..., obj = new( ‘" I am most interested in solving the first error that I reported (as the rest might be real issues with the wsdl syntax). Here is the full output for KEGG.wsdl: [1] "ArrayOfstring" defining class ArrayOfstring defining setAs() for ArrayOfstring [1] "SSDBRelation" defining class SSDBRelation defining setAs() for SSDBRelation [1] "ArrayOfSSDBRelation" defining class ArrayOfSSDBRelation defining setAs() for ArrayOfSSDBRelation [1] "MotifResult" defining class MotifResult defining setAs() for MotifResult [1] "ArrayOfMotifResult" defining class ArrayOfMotifResult defining setAs() for ArrayOfMotifResult [1] "Definition" defining class Definition defining setAs() for Definition [1] "ArrayOfDefinition" defining class ArrayOfDefinition defining setAs() for ArrayOfDefinition [1] "LinkDBRelation" defining class LinkDBRelation defining setAs() for LinkDBRelation [1] "ArrayOfLinkDBRelation" defining class ArrayOfLinkDBRelation defining setAs() for ArrayOfLinkDBRelation Operation list_databases resolve(, SchemaCollection) ArrayOfDefinition resolve(, list) ArrayOfDefinition Note: Method with signature "ArrayType#list" chosen for function "resolve", target signature "ArrayType#SchemaCollection". "SimpleSequenceType#SchemaCollection" would also be valid Error in paste(" .elementFormQualified", .elementFormQualified, sep = " = ") : no slot of name "elementFormQualified" for this object of class "SchemaTypes" In addition: Warning message: In .findOrCopyClass(class2, classDef2, where, "subclass") : Class "VirtualXMLSchemaClass" is defined (with package slot "XMLSchema") but no metadata object found to revise subclass information---not exported? Making a copy in package ".GlobalEnv" I am not an expert in web services and WSDL, but I did compare them with a few public WSDL samples and couldn't really make out an issue. I would really appreciate any help or information in getting SSOAP to work in my environment. Hopefully it is something trivial, like a configuration issue. And one more thing, I found version 0.5.4 of SSOAP at omegahat.org website and tried that out as well, but it didn't make any difference. I found a newer version of XMLSchema as well, but it didn't install (with compilation errors). I have activeperl and mingw in the path. Thank you, Hari __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] why is adnonis function called adonis {package vegan}
Hello all, I used google and looked at the documentation to find out why the ADONIS function is called adonis (in the vegan package). I am writing a document and would like to include a abbreviation list (similar to "ANOSIM = Analysis of Similarities"). regards, Steve __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] design matrix construction question
On Nov 2, 2009, at 10:40 PM, Ben Bolker wrote: with the following simple data frame dd = structure(list(z = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L ), .Label = c("a", "b"), class = "factor"), x = c(0.3, 0.2, 0.1, 0, 0, 0, 0.2, 0.3)), .Names = c("z", "x"), row.names = c(NA, -8L), class = "data.frame") I would like know if it's possible to use model.matrix() to construct the following design matrix in some sensible way: za zb 1 1 0 2 1 0 3 1 0 4 0 0 5 0 0 6 0 0 7 0 1 8 0 1 In other words, I want column 1 to be (z=="a" & x>0) and column 2 to be (z=="b" & x>0). I can construct this matrix using sweep(model.matrix(~z-1,dd),1,dd$x>0,"*") and then stick it into lm.fit -- but is there a more elegant way to do this in general? Elegance? You decide. But at least more economical from a keystroke perspective: model.matrix(~z-1,dd)*(dd$x>0) #-- za zb 1 1 0 2 1 0 3 1 0 4 0 0 5 0 0 6 0 0 7 0 1 8 0 1 attr(,"assign") [1] 1 1 attr(,"contrasts") attr(,"contrasts")$z [1] "contr.treatment I haven't found a formula combining (z-1) and I(x>0) that works, although I can imagine there is one. thanks Ben Bolker David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] save an object by dynamicly created name
Hi Henrik and David, Thank you very much for your suggestions. I found both meet my needs. In the following code, I found save(obj,file=pathname) would save the content into an object called obj. path <- "~/"; dir.create(path); for (i in 1:10) { assign( paste("m", i, sep=""), i:5) filename <- sprintf("m%02d.Rdta", i) pathname <- file.path(path, filename) obj =get(paste("m", i, sep="")) save(obj, file=pathname) } A tweak to this would be path <- "~/"; dir.create(path); for (i in 1:10) { assign( paste("m", i, sep=""), i:5) filename <- sprintf("m%02d.Rdta", i) pathname <- file.path(path, filename) save(list = paste("m", i, sep=""), file=pathname) } Thanks a lot Hao -Original Message- From: henrik.bengts...@gmail.com [mailto:henrik.bengts...@gmail.com] On Behalf Of Henrik Bengtsson Sent: Monday, November 02, 2009 12:34 AM To: David Winsemius Cc: r-help@r-project.org; jeffc Subject: Re: [R] save an object by dynamicly created name On Sun, Nov 1, 2009 at 9:18 PM, David Winsemius wrote: > > On Nov 1, 2009, at 11:28 PM, Henrik Bengtsson wrote: > >> On Sun, Nov 1, 2009 at 7:48 PM, David Winsemius >> wrote: >>> >>> On Nov 1, 2009, at 10:16 PM, Henrik Bengtsson wrote: >>> path <- "data"; dir.create(path); for (i in 1:10) { m <- i:5; filename <- sprintf("m%02d.Rbin", i); pathname <- file.path(path, filename); save(m, file=pathname); } >>> >>> That would result in each of the ten files containing an object with the >>> same name == "m". (Also on my system R data files have type Rdta.) So I >>> thought what was requested might have been a slight mod: >>> >>> path <- "~/"; >>> dir.create(path); >>> >>> for (i in 1:10) { >>> assign( paste("m", i, sep=""), i:5) >>> filename <- sprintf("m%02d.Rdta", i) >>> pathname <- file.path(path, filename) >>> obj =get(paste("m", i, sep="")) >>> save(obj, file=pathname) >>> } >> >> Then a more convenient solution is to use saveObject() and >> loadObject() of R.utils. saveObject() does not save the name of the >> object save. > > The OP asked for this outcome : > > " I would like to save m as m1, m2, m3 ..., > to file /home/data/m1, /home/data/m2, home/data/m3, ..." > > >> If you want to save multiple objects, the wrap them up >> in a list. > > I agree that a list would makes sense if it were to be stored in one file , > although it was not what requested. That comment was not for the OP, but for saveObject()/loadObject() in general. > But wouldn't that require assign()-ing a name before list()-wrapping? Nope, the whole point of using saveObject()/loadObject() is to save the objects/values without their names that you happens to choose in the current session, and to avoid overwriting existing ones in your next session. My example could also have been: library("R.utils"); saveObject(list(a=1,b=LETTERS,c=Sys.time()), file="foo.Rbin"); y <- loadObject("foo.Rbin"); z <- loadObject("foo.Rbin"); stopifnot(identical(y,z)); If you really want to attach the elements of the saved list, do: attachLocally(loadObject("foo.Rbin")); > str(a) num 1 > str(b) chr [1:26] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" ... > str(c) POSIXct[1:1], format: "2009-11-01 21:30:41" > > I suppose we ought to mention that the use of assign to create a variable is > a FAQ ... 7.21? Yep, I have now referred to it a sufficient number of times > to refer to it by number. > > http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-can-I-turn-a-string-into-a- variable_003f My personal take on assign() and get() is that if you find yourself using them (at this level), there is a good chance there exists a better solution that you should use instead. My $.02 /H > > -- > David > >> loadObject() does not assign variable, but instead return >> them. Example: >> >> library("R.utils"); >> x <- list(a=1,b=LETTERS,c=Sys.time()); >> saveObject(x, file="foo.Rbin"); >> y <- loadObject("foo.Rbin"); >> stopifnot(identical(x,y)); > >> >> So, for the original example, I'd recommend: >> >> library("R.utils"); >> path <- "data"; >> mkdirs(path); >> >> for (i in 1:10) { >> m <- i:5; >> filename <- sprintf("m%02d.Rbin", i); >> saveObject(m, file=filename, path=path); >> } >> >> and loading the objects back as: >> >> for (i in 1:10) { >> filename <- sprintf("m%02d.Rbin", i); >> m <- loadObject(filename, path=path); >> print(m); >> } >> /Henrik >> >>> >>> -- >>> David. >>> /H On Sun, Nov 1, 2009 at 6:53 PM, jeffc wrote: > > Hi, > > I would like to save a few dynamically created objects to disk. The > following is the basic flow of the code segment > > for(i = 1:10) { > m = i:5 > save(m, file = ...) ## ??? > } > To distinguish different objects to be saved, I would like to save m as > m1, > m2, m3 ..., to file /home/data/m1, /home/data/m2, home/data/m3, ... > > I tried a couple of methods on translating between object names and > strings > (below) b
[R] OpenOffice Calc ODBC equivalent
Hi R users: I am using RODBC to create some new ".xls" files each with several sheets (about 100) with sqlSave() from a data.frame inside R without any problem, but on windows XP platform. I would like to know if it is posible to make a similar solution on linux with openoffice? RODBC work only (for the moment) with ".xls" on windows, via the ODBC driver. Is it posible to make the same thing on openoffice? I mean, an ".odt" file with several sheets each, generated from a data.frame in R? Thank you for your help. Kenneth __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] design matrix construction question
with the following simple data frame dd = structure(list(z = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L ), .Label = c("a", "b"), class = "factor"), x = c(0.3, 0.2, 0.1, 0, 0, 0, 0.2, 0.3)), .Names = c("z", "x"), row.names = c(NA, -8L), class = "data.frame") I would like know if it's possible to use model.matrix() to construct the following design matrix in some sensible way: za zb 1 1 0 2 1 0 3 1 0 4 0 0 5 0 0 6 0 0 7 0 1 8 0 1 In other words, I want column 1 to be (z=="a" & x>0) and column 2 to be (z=="b" & x>0). I can construct this matrix using sweep(model.matrix(~z-1,dd),1,dd$x>0,"*") and then stick it into lm.fit -- but is there a more elegant way to do this in general? I haven't found a formula combining (z-1) and I(x>0) that works, although I can imagine there is one. thanks Ben Bolker -- Ben Bolker Associate professor, Biology Dep't, Univ. of Florida bol...@ufl.edu / www.zoology.ufl.edu/bolker GPG key: www.zoology.ufl.edu/bolker/benbolker-publickey.asc signature.asc Description: OpenPGP digital signature __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] re ading tokens
On Nov 2, 2009, at 8:00 PM, j daniel wrote: Greetings, I am not familiar with processing text in R. Can someone tell me how to read each line of words as separate elements in a list? FE, I would like to turn: word1 word2 word3 word2 word4 into a list of length two with three character elements in the first list and two elements in the second. I know that this should be easy, but I am a little confused by the text functions. > txt <- textConnection("word1 word2 word3 + word2 word4") > strsplit(readLines(txt), " ") [[1]] [1] "word1" "word2" "word3" [[2]] [1] "word2" "word4" -- David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] re ading tokens
Is this what you want: > x <- readLines(textConnection("word1 word2 word3 + word2 word4")) > closeAllConnections() > yourList <- strsplit(x, '[[:space:]]+') > > > yourList [[1]] [1] "word1" "word2" "word3" [[2]] [1] "word2" "word4" > On Mon, Nov 2, 2009 at 8:00 PM, j daniel wrote: > > Greetings, > > I am not familiar with processing text in R. Can someone tell me how to > read each line of words as separate elements in a list? > > FE, I would like to turn: > > word1 word2 word3 > word2 word4 > > into a list of length two with three character elements in the first list > and two elements in the second. I know that this should be easy, but I am a > little confused by the text functions. > > Thanks in advance! > -- > View this message in context: > http://old.nabble.com/reading-tokens-tp26159915p26159915.html > Sent from the R help mailing list archive at Nabble.com. > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] re ading tokens
Greetings, I am not familiar with processing text in R. Can someone tell me how to read each line of words as separate elements in a list? FE, I would like to turn: word1 word2 word3 word2 word4 into a list of length two with three character elements in the first list and two elements in the second. I know that this should be easy, but I am a little confused by the text functions. Thanks in advance! -- View this message in context: http://old.nabble.com/reading-tokens-tp26159915p26159915.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Can't pass file name as parameter to Corpus function
I'm working on a small project to extract high-frequency terms from a document and then display those terms in web page. To this end, I've to pass the file name as parameter to the Corpus function to build a corpus of only one document. I can build the corpus using the code below interactively in R. But calling the function with a file name as the parameter I got the error message saying "Error in eval(expr, envir, enclos) : object 'strFileName' not found" test<-function(strFileName) { src <- URISource(strFileName) cor <- Corpus(src, readerControl = list(reader = readPDF, language = "en_US", load = TRUE)) } After running the following code in R I checked the docURISource$URI and the value is "strFileName" rather than "C:\\Temp\\readme.txt". I also checked the URI when I was debugging the function and the URI is also "strFileName" rather than "C:\\Temp\\readme.txt". strFileName <- "C:\\Temp\\readme.txt" docURISource = URISource(strFileName, encoding = "UTF_8") docCorpus = Corpus(docURISource) So it seems when running interactively the URI contains strFileName as a variable but when called as a function, the URI contains strFileName as the actual value. Can anybody point out what's the problem? thanks in advance!! Jianli -- View this message in context: http://old.nabble.com/Can%27t-pass-file-name-as-parameter-to-Corpus-function-tp26159498p26159498.html Sent from the R help mailing list archive at Nabble.com. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] residuals(lme) error message
Dear R-help, I am trying to perform residual diagnostics on a repeated measures mixed effect model. With 3 time points as the fixed effect and subjects as the random effects. > time.linearc<- summary(lmer(cnim~(as.factor(time)), random= ~1|subj, data=ex.rmc, na.action=na.omit)) > time.linearc When I perform : >res <- residuals(time.linearc) I receive the following error message: Error in val[, level] : incorrect number of dimensions If possible could someone explain the problem? Thanks, Rupa [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Frequency
try sort (table(MAT), decreasing=T) if MAT is your matrix I think this is what you want. though if you want to sort by the first occurrence then it is a different story. Nikhil On 2 Nov 2009, at 1:35PM, Val wrote: V1 v2 v3 v4 569 10 347 10 46 10 18 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] kernel density estimation and classification
Hi, I am interested to use Kernel Density Estimation on bivariate data for multi-class classification. So far, I have managed to use the 'ks' package to plot the contours of the kernel density estimates based on 8-class training dataset with only 2 variables. However, I do not know how to make use of the Kernel Density Analysis results as input into a classifier (e.g. Bayesian classifiers) to accomplish the task of classifying testing dataset into 8 different classes. I will really appreciate any suggestions or advice. Thanks. regards, WK __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Time Series methods
Sure, but it seems like the point of the time series class is so you don't have to create a factor based on the period of sampling. Is there anyway of imposing calculations like median on the time series class, or is Kjetil's suggestion the only approach? Thanks! On Sun, Nov 1, 2009 at 4:09 PM, Kjetil Halvorsen < kjetilbrinchmannhalvor...@gmail.com> wrote: > introduce a factor variable with the months and then use tapply? > > Kjetil > > On Sun, Nov 1, 2009 at 9:07 PM, Tim Bean wrote: > > Hello, I have a quick question about time series methodology. If I want > to > > display a boxplot of time series data, sorted by period, I can type: > > > > boxplot(data ~ cycle(data)); > > > > where data is of class "ts" > > > > Is there a similar method for calculating, say, the median value of each > > time step within the series? (So for a monthly data set, calculate median > > for all Januarys, all Februarys, all Marchs, etc.) > > > > Thanks, > > Tim > > > >[[alternative HTML version deleted]] > > > > __ > > R-help@r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to exclude certain columns by column names?
Try this: > x[, colnames(x) != 'a'] [1] 3 4 On Tue, Nov 3, 2009 at 9:31 AM, Peng Yu wrote: > I can exclude columns by column number using '-'. But I wondering if > there is an easy way to exclude some columns by column names. > >> x=cbind(c(1,2),c(3,4)) >> x > [,1] [,2] > [1,] 1 3 > [2,] 2 4 >> colnames(x)=c('a','b') >> x > a b > [1,] 1 3 > [2,] 2 4 >> x[,-'a'] > Error in -"a" : invalid argument to unary operator >> x[,-1] > [1] 3 4 > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to exclude certain columns by column names?
Hi Peng, Here is a suggestion: subset(x, select = -a) See ?subset for more details. HTH, Jorge On Mon, Nov 2, 2009 at 8:31 PM, Peng Yu <> wrote: > I can exclude columns by column number using '-'. But I wondering if > there is an easy way to exclude some columns by column names. > > > x=cbind(c(1,2),c(3,4)) > > x > [,1] [,2] > [1,]13 > [2,]24 > > colnames(x)=c('a','b') > > x > a b > [1,] 1 3 > [2,] 2 4 > > x[,-'a'] > Error in -"a" : invalid argument to unary operator > > x[,-1] > [1] 3 4 > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How to exclude certain columns by column names?
I can exclude columns by column number using '-'. But I wondering if there is an easy way to exclude some columns by column names. > x=cbind(c(1,2),c(3,4)) > x [,1] [,2] [1,]13 [2,]24 > colnames(x)=c('a','b') > x a b [1,] 1 3 [2,] 2 4 > x[,-'a'] Error in -"a" : invalid argument to unary operator > x[,-1] [1] 3 4 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] modifying predict.nnet() to function with errorest()
Dave, You should look at the train() function in teh caret package. Max On Mon, Nov 2, 2009 at 6:01 PM, Armitage, Dave wrote: > Greetings, > > I am having trouble calculating artificial neural network misclassification > errors using errorest() from the ipred package. I have had no problems > estimating the values with randomForest() or svm(), but can't seem to get it > to work with nnet(). I believe this is due to the output of the > predict.nnet() function within cv.factor(). Below is a quick example of the > problem I'm experiencing. Any ideas on how to get around it or will it > simply not work with nnet()? > >> library(MASS) >> library(nnet) >> library(ipred) >> data(iris3) >> set.seed(191) >> >> samp <- c(sample(1:50,25), sample(51:100,25), sample(101:150,25)) >> ird <- data.frame(rbind(iris3[,,1], iris3[,,2], iris3[,,3]), > > + species = factor(c(rep("s",50), rep("c", 50), rep("v", 50 >> >> errorest(species ~., data = ird, subset = samp, model = nnet, size = 2, >> rang =0.1, decay = 5e-4, maxit = 200) > > # weights: 19 > initial value 73.864441 > . > . > . > final value 0.339114 > converged > Error in cv.factor(y, formula, data, model = model, predict = predict, : > predict does not return factor values > > > > Thanks, > Dave > __ > > Dave Armitage > Wildlife Ecology and Conservation > University of Florida > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Max __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] qqplot
Dear Carol, > -Original Message- > From: carol white [mailto:wht_...@yahoo.com] > Sent: November-02-09 4:04 PM > To: 'Peter Flom'; John Fox > Cc: r-h...@stat.math.ethz.ch; 'Yihui Xie' > Subject: RE: [R] qqplot > > Thanks for all your replies. > > I just wanted to compare the distributions of two populations and see how > different they are. I thought that QQplot would be a good idea. It's true > that the mean and sd of the two distributions are not the same. I don't know > if any other method or graphical presentation like five number, mean and sd > as suggested, or boxplot would have been more suited to be used. It depends on what you're interested in comparing. If you're interested in differences in shape adjusting for differences in centre and spread, then a QQ plot with a line fit through the first and third quartiles is a reasonable choice. If you centre the two data sets to medians of 0, the intercept of the line will give you information about the difference in centres and the slope about differences in spread. If you're primarily interested in comparing centres and spreads then parallel boxplots would be a good choice. Regards, John > > thanks for your advices, > > --- On Mon, 11/2/09, John Fox wrote: > > > From: John Fox > > Subject: RE: [R] qqplot > > To: "'Peter Flom'" > > Cc: r-h...@stat.math.ethz.ch, "'carol white'" , "'Yihui > Xie'" > > Date: Monday, November 2, 2009, 9:24 AM > > Dear Peter, > > > > I assumed that Carol wanted to compare the shapes of the > > distributions and > > to adjust for differences in centre and spread. To put a > > line through the > > quartiles or to base a line on the medians and IQRs is more > > robust than > > using the means and sds. > > > > Best, > > John > > > > > > > -Original Message- > > > From: r-help-boun...@r-project.org > > [mailto:r-help-boun...@r-project.org] > > On > > > Behalf Of Peter Flom > > > Sent: November-02-09 11:57 AM > > > To: carol white; Yihui Xie > > > Cc: r-h...@stat.math.ethz.ch > > > Subject: Re: [R] qqplot > > > > > > carol white > > wrote > > > > > > >So the conclusion is that abline(0,1) should > > always be used and if it > > > doesn't go through the qqplot, the two distributions > > are not similar? > > > > > > I think it depends what you mean by "similar". > > E.g., if you mean "are > > both > > > of these distributions (e.g.) normal?" then > > abline(0,1) is not always > > useful. > > > But if you mean "Do these have the same mean, sd, and > > distribution?" then > > > abline(0,1) is the way to go. > > > > > > Peter > > > > > > Peter L. Flom, PhD > > > Statistical Consultant > > > Website: www DOT peterflomconsulting DOT com > > > Writing; http://www.associatedcontent.com/user/582880/peter_flom.html > > > Twitter: �...@peterflom > > > > > > __ > > > R-help@r-project.org > > mailing list > > > https://stat.ethz.ch/mailman/listinfo/r-help > > > PLEASE do read the posting guide > > http://www.R-project.org/posting-guide.html > > > and provide commented, minimal, self-contained, > > reproducible code. > > > > > > > > > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] avoiding loop
The first thing I would suggest is convert your dataframes to matrices so that you are not having to continually convert them in the calls to the functions. Also I am not sure what the code: realized_prob = with(DF, { ind <- (CHOSEN == 1) n <- tapply(theta_multiple[ind], CS[ind], sum) d <- tapply(theta_multiple, CS, sum) n / d }) is doing. It looks like 'n' and 'd' might have different lengths since they are being created by two different (CS & CS[ind]) sequences. I have no idea why you are converting to the "DF" dataframe. THere is no need for that. You could just leave the vectors (e.g., theta_multiple, CS and ind) as they are and work with them. This is probably where most of your time is being spent. So if you start with matrices and leave the dataframes out of the main loop you will probably see an increase in performance. 2009/11/2 parkbomee : > This is the Rprof() report by self time. > Is it also possible that these routines, which take long self.time, are > causing the optim() to be slow? > > > $by.self > self.time self.pct total.time total.pct > "FUN" 94.16 16.5 94.16 16.5 > "unlist"80.46 14.1 120.54 21.1 > "lapply"76.94 13.5 255.48 44.7 > "match" 60.76 10.6 60.88 10.7 > "as.matrix.data.frame" 31.00 5.4 51.12 8.9 > "as.character" 29.28 5.1 29.28 5.1 > "unique.default"24.36 4.3 24.40 4.3 > "data.frame"21.06 3.7 55.78 9.8 > "split.default" 20.42 3.6 84.38 14.8 > "tapply"13.84 2.4 414.28 72.5 > "structure" 11.32 2.0 22.36 3.9 > "factor"11.08 1.9 127.68 22.3 > "attributes<-" 11.00 1.9 11.00 1.9 > "=="10.56 1.8 10.56 1.8 > "%*%" 10.30 1.8 10.30 1.8 > "as.vector" 10.22 1.8 10.22 1.8 > "as.integer" 9.86 1.7 9.86 1.7 > "list" 9.64 1.7 9.64 1.7 > "exp"7.12 1.2 7.12 1.2 > "as.data.frame.integer" 5.98 1.0 8.10 1.4 > >> To: bbom...@hotmail.com >> CC: jholt...@gmail.com; r-help@r-project.org >> Subject: Re: [R] avoiding loop >> From: mtmor...@fhcrc.org >> Date: Sun, 1 Nov 2009 22:14:09 -0800 >> >> parkbomee writes: >> >> > Thank you all. >> > >> > What Chuck has suggested might not be applicable since the number of >> > different times is around 40,000. >> > >> > The object of optimization in my function is the varying "value", >> > which is basically data * parameter, of which "parameter" is the >> > object of optimization.. >> > >> > And from the r profiling with a subset of data, >> > I got this report..any idea what "" is? >> > >> > >> > $by.total >> > total.time total.pct self.time self.pct >> > "" 571.56 100.0 0.02 0.0 >> > "optim" 571.56 100.0 0.00 0.0 >> > "fn" 571.54 100.0 0.98 0.2 >> >> You're giving us 'by.total', so these are saying that all the time was >> spent in these functions or the functions they called. Probably all >> are in 'optim' and its arguments; since little self.time is spent >> here, there isn't much to work with >> >> > "eval" 423.74 74.1 0.00 0.0 >> > "with.default" 423.74 74.1 0.00 0.0 >> > "with" 423.74 74.1 0.00 0.0 >> >> These are probably in the internals of optim, where the function >> you're trying to optimize is being set up for evaluation. Again >> there's little self.time, and all these say is that a big piece of the >> time is being spent in code called by this code. >> >> > "tapply" 414.28 72.5 13.84 2.4 >> > "lapply" 255.48 44.7 76.94 13.5 >> > "factor" 127.68 22.3 11.08 1.9 >> > "unlist" 120.54 21.1 80.46 14.1 >> > "FUN" 94.16 16.5 94.16 16.5 >> >> these look like they are tapply-related calls (looking at the code for >> tapply, it calls lapply, factor, and unlist, and FUN is the function >> argument to tapply), perhaps from the function you're optimizing (did >> you implement this as suggested below? it would really help to have a >> possibly simplified version of the code you're calling). >> >> There is material to work with here, as apparently a fairly large >> amount of self.time is being spent in each of these functions. So >> here's a sample data set >> >> n <- 10 >> set.seed(123) >> df <- data.frame(time=sort(as.integer(ceiling(runif(n)*n/5))), >> value=ceiling(runif(n)*5)) >> >> It would have been helpful for you to provide reproducible code like >> that above, so that
Re: [R] a prolem with constrOptim
Hi, I noticed that you are setting very small tolerance as convergence criterion. In most optimization problems, it does not make much practical sense to set such a low tolerance. If you increase this, then `constrOptim' also works. For example, constrOptim(theta=theta.0, f=likelihood, grad=score, ui=RR,ci=zeros, data = data2, method="BFGS", control=list(fnscale=-1,reltol=1e-10)) However, it should be noted that my function `contrOptim.nl' provides a better answer for both data1 and data2 in terms of having a larger log-likelihood at convergence. Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: rvarad...@jhmi.edu Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.h tml -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Davidov, Ori (NIH/NIEHS) [V] Sent: Monday, November 02, 2009 11:36 AM To: r-help@r-project.org Subject: [R] a prolem with constrOptim Hi, I apologize for the long message but the problem I encountered can't be stated in a few lines. I am having some problems with the function constrOptim. My goal is to maximize the likelihood of product of K multinomials, each with four catagories under linear constraints on the parameter values. I have found that the function does not work for many data configurations. #The likelihood and score functions are: likelihood = function(theta,data) { p = matrix(theta,nrow=3,byrow=F) s = 1-apply(p,2,sum) p = rbind(p,s) Z = matrix(data,nrow=4,byrow=F) sum(log(p)*Z) } score = function(theta,data) { k = length(data)/4 S = numeric(3*k) for(i in 1:k) { n = data[(1+4*(i-1)):(4*i)] p = theta[(1+3*(i-1)):(3*i)] P = 1-sum(p) S[(1+3*(i-1)):(3*i)] = n[1:3]/p-n[4]/P } S } #where theta=(p11,p12,p13,p21,p22,p23,...,pK1,pK2,pK3). #The function Rmat calculates the restriction matrix needed for constrained estimation Rmat = function(k) { R = matrix(1,4,3) R[1,2] = R[1,3] = R[2,3] = R[3,2] = 0 RR = cbind(-R,R) RRR = matrix(0,4*(k-1),3*k) for ( i in 1:(k-1) ) RRR[4*(i-1)+1:4,3*(i-1)+1:6] = RR RRR } #The function gen.data generates data gen.data = function(p,N) { k = ncol(p) Data = matrix(0,4,k) for(j in 1:k) { Data[,j] = as.vector(rmultinom(1, size = N[j], prob=p[,j])) } as.vector(Data) } #I have found that the function returns an error under some configurations of the data ( 0's seem to harm it). Why is this happening and how can it #be corrected? Why do some configurations with 0's crash the program and others don't? # For example for the data: data1 = c(0,6,8,6,3,4,4,9,2,2,5,11) data2 = c(0,6,8,6,0,4,4,9,2,2,5,11) # inputs for constrOptim K = 3 RR = Rmat(K) zeros = rep(0,nrow(RR)) theta.0 = numeric(3*K) for(i in 1:K) {theta.0[(1+3*(i-1)):(3*i)] = rep(1/10,3)+i/100} # data1 gives an error but data2 does not! constrOptim(theta=theta.0, f=likelihood, grad=score, ui=RR,ci=zeros, data = data1, method="BFGS", control=list(fnscale=-1,abstol=1e-16,reltol=1e-16))$par constrOptim(theta=theta.0, f=likelihood, grad=score, ui=RR,ci=zeros, data = data2, method="BFGS", control=list(fnscale=-1,abstol=1e-16,reltol=1e-16))$par # If you want to further check run the simulation below. K = 3 N = rep(20,K) p = c(1/10,2/10,2/10,5/10) p = matrix(rep(p,K),nrow=4,byrow=F) RR = Rmat(K) zeros = rep(0,nrow(RR)) theta.0 = numeric(3*K) for(i in 1:K) {theta.0[(1+3*(i-1)):(3*i)] = rep(1/10,3)+i/100} for(i in 1:1000) { data = gen.data(p,N) print(i) print(data) theta.til = constrOptim(theta=theta.0, f=likelihood, grad=score, ui=RR,ci=zeros, data = data, method="BFGS", control=list(fnscale=-1,abstol=1e-16,reltol=1e-16))$par print(theta.til) } [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, repr
[R] modifying predict.nnet() to function with errorest()
Greetings, I am having trouble calculating artificial neural network misclassification errors using errorest() from the ipred package. I have had no problems estimating the values with randomForest() or svm(), but can't seem to get it to work with nnet(). I believe this is due to the output of the predict.nnet() function within cv.factor(). Below is a quick example of the problem I'm experiencing. Any ideas on how to get around it or will it simply not work with nnet()? library(MASS) library(nnet) library(ipred) data(iris3) set.seed(191) samp <- c(sample(1:50,25), sample(51:100,25), sample(101:150,25)) ird <- data.frame(rbind(iris3[,,1], iris3[,,2], iris3[,,3]), + species = factor(c(rep("s",50), rep("c", 50), rep("v", 50 errorest(species ~., data = ird, subset = samp, model = nnet, size = 2, rang =0.1, decay = 5e-4, maxit = 200) # weights: 19 initial value 73.864441 . . . final value 0.339114 converged Error in cv.factor(y, formula, data, model = model, predict = predict, : predict does not return factor values Thanks, Dave __ Dave Armitage Wildlife Ecology and Conservation University of Florida __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] a prolem with constrOptim
Hi Davidov, You can use the `constrOptim.nl' function that I have written for solving problems like yours. It is better and more general than the `constrOptim' function. It is able to solve your problem, but I am not sure how well posed your problem is. One of the parameters seems to go to 0. May be this ok. Here is the code for solving your problem: source("constrOptim_nl.txt") hin <- function(par, ...) { # function that defines the inequality constraints k = 3 R = matrix(1,4,3) R[1,2] = R[1,3] = R[2,3] = R[3,2] = 0 RR = cbind(-R,R) RRR = matrix(0, 4*(k-1),3*k) for ( i in 1:(k-1) ) RRR[4*(i-1)+1:4,3*(i-1)+1:6] = RR c(RRR %*% par) } ans1 <- constrOptim.nl(par=theta.0, fn=likelihood, hin=hin, data=data1, control=list(fnscale=-1)) ans2 <- constrOptim.nl(par=theta.0, fn=likelihood, hin=hin, data=data2, control=list(fnscale=-1)) This works for both data1 and data2. > ans1 $par [1] 7.158690e-10 2.500022e-01 2.682024e-01 1.314431e-01 1.750843e-01 2.118290e-01 2.089612e-01 9.771505e-02 2.647332e-01 $value [1] -72.81515 > ans2 $par [1] 8.068277e-12 1.659718e-01 2.332808e-01 1.363536e-04 1.91e-01 2.331728e-01 7.855021e-02 1.240325e-01 2.235095e-01 $value [1] -65.75015 Let me know if you are interested in obtaining the `constrOptim.nl' function. Hope this helps, Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: rvarad...@jhmi.edu Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.h tml -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Davidov, Ori (NIH/NIEHS) [V] Sent: Monday, November 02, 2009 11:36 AM To: r-help@r-project.org Subject: [R] a prolem with constrOptim Hi, I apologize for the long message but the problem I encountered can't be stated in a few lines. I am having some problems with the function constrOptim. My goal is to maximize the likelihood of product of K multinomials, each with four catagories under linear constraints on the parameter values. I have found that the function does not work for many data configurations. #The likelihood and score functions are: likelihood = function(theta,data) { p = matrix(theta,nrow=3,byrow=F) s = 1-apply(p,2,sum) p = rbind(p,s) Z = matrix(data,nrow=4,byrow=F) sum(log(p)*Z) } score = function(theta,data) { k = length(data)/4 S = numeric(3*k) for(i in 1:k) { n = data[(1+4*(i-1)):(4*i)] p = theta[(1+3*(i-1)):(3*i)] P = 1-sum(p) S[(1+3*(i-1)):(3*i)] = n[1:3]/p-n[4]/P } S } #where theta=(p11,p12,p13,p21,p22,p23,...,pK1,pK2,pK3). #The function Rmat calculates the restriction matrix needed for constrained estimation Rmat = function(k) { R = matrix(1,4,3) R[1,2] = R[1,3] = R[2,3] = R[3,2] = 0 RR = cbind(-R,R) RRR = matrix(0,4*(k-1),3*k) for ( i in 1:(k-1) ) RRR[4*(i-1)+1:4,3*(i-1)+1:6] = RR RRR } #The function gen.data generates data gen.data = function(p,N) { k = ncol(p) Data = matrix(0,4,k) for(j in 1:k) { Data[,j] = as.vector(rmultinom(1, size = N[j], prob=p[,j])) } as.vector(Data) } #I have found that the function returns an error under some configurations of the data ( 0's seem to harm it). Why is this happening and how can it #be corrected? Why do some configurations with 0's crash the program and others don't? # For example for the data: data1 = c(0,6,8,6,3,4,4,9,2,2,5,11) data2 = c(0,6,8,6,0,4,4,9,2,2,5,11) # inputs for constrOptim K = 3 RR = Rmat(K) zeros = rep(0,nrow(RR)) theta.0 = numeric(3*K) for(i in 1:K) {theta.0[(1+3*(i-1)):(3*i)] = rep(1/10,3)+i/100} # data1 gives an error but data2 does not! constrOptim(theta=theta.0, f=likelihood, grad=score, ui=RR,ci=zeros, data = data1, method="BFGS", control=list(fnscale=-1,abstol=1e-16,reltol=1e-16))$par constrOptim(theta=theta.0, f=likelihood, grad=score, ui=RR,ci=zeros, data = data2, method="BFGS", control=list(fnscale=-1,abstol=1e-16,reltol=1e-16))$par # If you want to further check run the simul
[R] AR Simulation with non-normal innovations - Correct
Dear Users, I would like to simulate an AR(1) (y_t=ct1+y_t-1+e_t) model in R where the innovations are supposed to follow a t-GARCH(1,1) proccess. By t-GARCH I want to mean that: e_t=n_t*sqrt(h_t) and h_t=ct2+a*(e_t)^2+b*h_t-1. where n_t is a random variable with t-Student distribution. If someone could give some guidelines, I can going developing the model. I did it in matlab, but the loops are very slowly, so I would like to try R. Thanks in advance, Rick [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] AR Simulation with non-normal innovations
Dear Users, I would like to simulate AR(1) (y_t=ct1+y_t-1+e_t) model in R where the innovations are supposed to follow a t-GARCH(1,1) proccess. By t-GARCH I want to mean that: e_t=n_t*sqrt(h_t) and h_t=ct2+a*(e_t)^2+b*h_t-1. If someone could give some guidelines, I can going developing the model. I did it in matlab, but the loops are very slowly, so I would like to try R. Thanks in advance, Rick [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] problems with read.csv
I use indata = read.csv(file.choose(),header=TRUE) of course you can specify your file. Joe King 206-913-2912 j...@joepking.com "Never throughout history has a man who lived a life of ease left a name worth remembering." --Theodore Roosevelt -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Fang (Betty) Yang Sent: Monday, November 02, 2009 1:10 PM To: r-help@r-project.org Subject: [R] problems with read.csv Dear all, I'd like to ask help on R code to get the same results as the following Splus code: >indata<-importData("/home/data_new.csv") >indata[1:5,4] [1] 0930 1601 1006 1032 1020 I tried the following R code: > indata<-read.csv("/home/data_new.csv") > indata[1:5,4] [1] 930 1601 1006 1032 1020 I'd like the first one to be 0930, too. Thanks in advance, Betty [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Regex matching that gives byte offset?
On Monday 02 November 2009 13:41:45 Prof Brian Ripley wrote: > On Mon, 2 Nov 2009, Johannes Graumann wrote: > > Hmmm ... that should do it, thanks. But how would one use this on a file > > without reading it into memory completely? > > ?file, ?readLines, ?readBin > > will tell you about connections. ... all of which I only get to read by the line and a regexpr on that will not give me the absolute offset. "grep -buo" on the unix command line is really fast for this. If I can't find the native R equivalent, I'm of a mind to do this via a sys call - ugly and not portable, but SOOO fast ... is it possible in R? Joh > > > Joh > > > > On Wednesday 28 October 2009 16:29:00 Prof Brian Ripley wrote: > >> Do you mean like regexpr() (on the same help page)? > >> > >> Depending on your locale, you might actually prefer the character > >> offset: if you want to match in a MBCS and have byte offsets you will > >> need to work a bit harder if useBytes=TRUE is not sufficient for you. > >> > >> On Wed, 28 Oct 2009, Johannes Graumann wrote: > >>> Hi, > >>> > >>> Is there any way of doing 'grep' ore something like it on the content > >>> of a text file and extract the byte positioning of the match in the > >>> file? I'm facing the need to access rather largish (>600MB) XML files > >>> and would like to be able to index them ... > >>> > >>> Thanks for any help or flogging, > >>> > >>> Joh > >>> > >>> __ > >>> R-help@r-project.org mailing list > >>> https://stat.ethz.ch/mailman/listinfo/r-help > >>> PLEASE do read the posting guide > >>> http://www.R-project.org/posting-guide.html and provide commented, > >>> minimal, self-contained, reproducible code. > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Plots with k-means
David, Eduardo, Thanks for the code. I have run it and I'm not sure what to do with the graph when it comes up. Can I interact with it, such as an RGL graph? I've tried clicking or dragging with the mouse and nothing happens. My system is a Windows Vista and R2.9. Best, Iuri. On Mon, Nov 2, 2009 at 7:35 PM, David Winsemius wrote: > The attached file did not come through to the list. I think you have some > non-standard characters (or at least non-standard in my locale). I was able > to get the code to run after using the Zap Gremlins function in > TextWrangler. Prior to that "treatment" pretty much every line threw an > error of this sort: > > > > setClass(Class = 'POI', > +representation(matrizSim = 'matrix',cos.query.docs = 'vector', > Error: unexpected input in: > "setClass(Class = 'POI', > ¬" > > > wordsInQuery = 'ANY',docs = 'matrix', objeto = 'matrix', objetoC > Error: unexpected input in "¬" > > = 'matrix', > Error: unexpected '=' in "=" > > > Pcoords = 'matrix', PcoordsFI = 'matrix', newPcoords = 'matrix', > Error: unexpected input in "¬" > > newcoords = 'numeric' , > Error: unexpected ',' in "newcoords = 'numeric' ," > > > newcoords_1 = 'numeric', M = 'numeric', poisTextCol = > Error: unexpected input in "¬" > > I also needed to remove a couple of spaces between function names and > parentheses when these occurred at ends-of-lines. Attached is a working > version as a .txt file (which should make it through the list-serv: > > > > > > -- David. > > sessionInfo() > R version 2.10.0 Patched (2009-10-29 r50258) > x86_64-apple-darwin9.8.0 > > locale: > [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8 > > attached base packages: > [1] splines stats graphics grDevices utils datasets methods > base > > other attached packages: > [1] rms_2.1-0 Hmisc_3.7-0 survival_2.35-7 > > loaded via a namespace (and not attached): > [1] cluster_1.12.1 grid_2.10.0 lattice_0.17-26 > > > > > On Nov 2, 2009, at 3:43 PM, eduardo san miguel wrote: > > I send r-code in an attached file. >> >> 2009/11/2 Iuri Gavronski : >> >>> Eduardo, >>> >>> Would you mind sending me the R code in an attached file. Your code >>> didn't >>> work here and I am not sure it is because of line breaks from the email >>> program. >>> >>> Iuri. >>> >>> On Mon, Nov 2, 2009 at 10:53 AM, eduardo san miguel < >>> >> eduardosa...@gmail.com> >> >>> wrote: >>> Hello all, I have almost finished the development of a new package where ideas from Tamara Munzner, George Furnas and Costa and Venturini are implemented. 1.- Da Costa, David & Venturini, Gilles (2006). An Interactive Visualization Environment for Data Exploration Using Points of Interest. adma 2006: 416-423 2.- Furnas, George (1986). Generalized Fisheye Views. Human Factors in computing systems, CHI '86 conference proceedings, ACM, New York, pp. 16-23. 3.- Heidi Lam, Ronald A. Rensink, and Tamara Munzner (2006). Effects of 2D Geometric Transformations on Visual Memory. Proc. Applied Perception in Graphics and Visualization (APGV 2006), 119-126, 2006. 4.- Keith Lau, Ron Rensink, and Tamara Munzner (2004). Perceptual Invariance of Nonlinear Focus+Context Transformations. Proc. First Symposium on Applied Perception in Graphics and Visualization (APGV 04) 2004, pp 65-72. This is a sample with some basic functionality and a VERY BASIC example with kmeans plotting. Comments will be greatly appreciated. Regards -- R CODE require(methods) setClass(Class = 'POI', representation(matrizSim = 'matrix',cos.query.docs = 'vector', wordsInQuery = 'ANY',docs = 'matrix', objeto = 'matrix', objetoC = 'matrix', Pcoords = 'matrix', PcoordsFI = 'matrix', newPcoords = 'matrix', newcoords = 'numeric' , newcoords_1 = 'numeric', M = 'numeric', poisTextCol = 'character' , colores = 'vector' , poisCircleCol = 'character' , linesCol = 'character', itemsCol = 'character', LABELS = 'logical', vscale = 'numeric', hscale = 'numeric', circleCol = 'character', plotCol = 'character', itemsFamily = 'character', lenteDefault = 'numeric', zoomDefault = 'numeric' , rateDefault = 'numeric' , topKDefault = 'numeric' , pal = 'character', selected = 'numeric' , circRadio = 'numeric' , IncVscale = 'numeric', cgnsphrFont = 'numeric', xClick_old = 'numeric', yClick_old = 'numeric', wordsInQueryFull = 'character' ), prototype(cos.query.docs = 0, colores = 0, newcoords = 0, newcoords_1 = 0, M = 3, vscale = 0.5 , hscale = 1.5 , circleCol = 'black' , itemsCol = 'white', poisTextCol = '#fff5ee', poisCircleCol = '#fff5ee', linesCol = 'white', plotCol = 'black', itemsFamily = 'sans', lenteDefault
Re: [R] problems with read.csv
On Nov 2, 2009, at 4:09 PM, Fang (Betty) Yang wrote: Dear all, I'd like to ask help on R code to get the same results as the following Splus code: indata<-importData("/home/data_new.csv") indata[1:5,4] [1] 0930 1601 1006 1032 1020 I tried the following R code: indata<-read.csv("/home/data_new.csv") Do you get what you desire with setting as.is=TRUE? indata<-read.csv("/home/data_new.csv", as.is=TRUE ) Also see the colClasses arguments in read.xxx functions. indata[1:5,4] [1] 930 1601 1006 1032 1020 I'd like the first one to be 0930, too. So you want the strings (aka "character" vectors) to stay strings. -- David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] problems with read.csv
Superficially, one answer is indata <- read.csv("/home/data_new.csv", colClasses="character") but I'm not sure that's what you want... -Ista On Mon, Nov 2, 2009 at 4:09 PM, Fang (Betty) Yang wrote: > Dear all, > > > > I'd like to ask help on R code to get the same results as the following > Splus code: > > > > > >>indata<-importData("/home/data_new.csv") > > > >>indata[1:5,4] > > [1] 0930 1601 1006 1032 1020 > > > > I tried the following R code: > > > >> indata<-read.csv("/home/data_new.csv") > >> indata[1:5,4] > > [1] 930 1601 1006 1032 1020 > > > > I'd like the first one to be 0930, too. > > > > Thanks in advance, > > > > > > Betty > > > > > > > > > > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Ista Zahn Graduate student University of Rochester Department of Clinical and Social Psychology http://yourpsyche.org __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Plots with k-means
The attached file did not come through to the list. I think you have some non-standard characters (or at least non-standard in my locale). I was able to get the code to run after using the Zap Gremlins function in TextWrangler. Prior to that "treatment" pretty much every line threw an error of this sort: > setClass(Class = 'POI', +representation(matrizSim = 'matrix',cos.query.docs = 'vector', Error: unexpected input in: "setClass(Class = 'POI', ¬" > wordsInQuery = 'ANY',docs = 'matrix', objeto = 'matrix', objetoC Error: unexpected input in "¬" > = 'matrix', Error: unexpected '=' in "=" > Pcoords = 'matrix', PcoordsFI = 'matrix', newPcoords = 'matrix', Error: unexpected input in "¬" > newcoords = 'numeric' , Error: unexpected ',' in "newcoords = 'numeric' ," > newcoords_1 = 'numeric', M = 'numeric', poisTextCol = Error: unexpected input in "¬" I also needed to remove a couple of spaces between function names and parentheses when these occurred at ends-of-lines. Attached is a working version as a .txt file (which should make it through the list- serv: require(methods) setClass(Class = 'POI', representation(matrizSim = 'matrix',cos.query.docs = 'vector', wordsInQuery = 'ANY',docs = 'matrix', objeto = 'matrix', objetoC = 'matrix', Pcoords = 'matrix', PcoordsFI = 'matrix', newPcoords = 'matrix', newcoords = 'numeric' , newcoords_1 = 'numeric', M = 'numeric', poisTextCol = 'character' , colores = 'vector' , poisCircleCol = 'character' , linesCol = 'character', itemsCol = 'character', LABELS = 'logical', vscale = 'numeric', hscale = 'numeric', circleCol = 'character', plotCol = 'character', itemsFamily = 'character', lenteDefault = 'numeric', zoomDefault = 'numeric' , rateDefault = 'numeric' , topKDefault = 'numeric' , pal = 'character', selected = 'numeric' , circRadio = 'numeric' , IncVscale = 'numeric', cgnsphrFont = 'numeric', xClick_old = 'numeric', yClick_old = 'numeric', wordsInQueryFull = 'character' ), prototype(cos.query.docs = 0, colores = 0, newcoords = 0, newcoords_1 = 0, M = 3, vscale = 0.5 , hscale = 1.5 , circleCol = 'black' , itemsCol = 'white', poisTextCol = '#fff5ee', poisCircleCol = '#fff5ee', linesCol = 'white', plotCol = 'black', itemsFamily = 'sans', lenteDefault = 1, zoomDefault = 15 , rateDefault = 0.1 , topKDefault = 25, pal = 'topo' , selected = 1 , circRadio = 0.25 , IncVscale = 0.05 , cgnsphrFont = 1.01, LABELS = T) ) setGeneric("puntosMedios" , function(Pcoords, detalle = 5){standardGeneric("puntosMedios")}) setMethod("puntosMedios" , signature = "matrix", function(Pcoords, detalle = 5){ for (i in 1:detalle){ new_pcoords = matrix(rep(0,4*nrow(Pcoords)), nrow = 2* nrow(Pcoords), byrow = T ) cont = 0 for (i in 1:nrow(Pcoords)){ if (i == nrow(Pcoords)) { cont = cont + 1 new_pcoords[cont,] = Pcoords[i,] cont = cont + 1 new_pcoords[cont,] = Pcoords[i,] - ((Pcoords[i,]-Pcoords[1,])/2) }else{ cont = cont + 1 new_pcoords[cont,] = Pcoords[i,] cont = cont + 1 new_pcoords[cont,] = Pcoords[i,] - ((Pcoords[i,]-Pcoords[i+1,])/2)}} Pcoords = new_pcoords} return(Pcoords) } ) setGeneric("fishIout" , function(x, value){standardGeneric ("fishIout")}) setMethod("fishIout" , signature = "numeric", function(x, value){ d = value if (x > 0){ signo = 1 }else{ signo = -1 } x = abs(x) return(signo*(-(x/((d*x)-d-1 } ) setGeneric("fishIin" , function(x, value){standardGeneric ("fishIin")}) setMethod("fishIin" , signature = "numeric", function(x, value){ d = value if (x > 0){ signo = 1 }else{ signo = -1 } x = abs(x) return(signo*(((d+1)*x)/(d*x+1))) } ) setGeneric("toPolar" , function(x, y){standardGeneric ("toPolar")}) setMethod("toPolar" , signature = "numeric", function(x, y){ t1 = atan2(y,x) rP = sqrt(x^2+y^2) return(c(t1 = t1,rP = rP)) } ) setGeneric("toCartesian" , function(t1, rP){standardGeneric ("toCartesian")}) setMethod("toCartesian" , signature = "numeric", function(t1, rP){ x1 = rP*cos(t1) y1 = rP*sin(t1) return(c(x = x1,y = y1)) } ) setGeneric("circulo" , function(cx, cy, r, circleCol, PLOT = TRUE){standardGeneric ("circulo")}) setMethod("circulo" , signature = "numeric", function(cx, cy, r, circleCol, PLOT = TRUE){ t = seq(0,2*pi,length=100) circle = t(rbind(cx+sin(t)*r,cy+cos(t)*r)) if (PLOT == TRUE) plot(circle,type='l',,ylim=c(-1.15,1.15),xlim=c(-1.15,1.15), ann=FALSE, axes=F, col = circleCol) return(circle) } ) setGeneric("circulin" , function(cx, cy, r = 0.045, objeto, col = 'blue', PLOT = TRUE, label = 0){ standardGeneric ("circulin")}) setMethod("circulin" , signature = "ANY", function(cx, cy, r = 0.045, objeto, col = 'blue', P
Re: [R] Robust ANOVA or alternative test?
See the nlme library and accompanying book by Pinheiro and Bates. The lme function is appropriate for linear mixed models with normal errors, and the 'weights' argument provides flexible methods for modeling error heterogeneity. hth, Kingsford On Mon, Nov 2, 2009 at 12:10 PM, jinyan fan wrote: > Dear Colleagues, > > I am running a three-way mixed-design ANOVA, with one manipulated IV, a group > membership IV, and pre-/post- within subject factor. I am interested in the > three-way interaction effect. The regular ANVOA is problematic, because (a) > the sample sizes are very unbalanced, due to the group IV, (b) the > homogeneity of variance is violated, and (c) the homogeneity of covariance is > also violated. I wonder if R has a procedure to address this issue, either by > some sort of Robust ANOVA procedure, or by some alternative tests? Thanks a > lot. > > Jinyan Fan > > Assistant Professor > Department of Psychology > Hofstra University > Hempstead, NY 11550 > Work: 516-463-6349 > > > > [[alternative HTML version deleted]] > > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] need help in using Hessian matrix
Hi I need to find the Hessian matrix for a complicated function from a certain kind of data but i keep getting this error Error in f1 - f2 : non-numeric argument to binary operator the data is given by U<-runif(n) Us<-sort(U) tau1<- 2 F1tau<- pgamma((tau1/theta1),shape,1) N1<-sum(Ushttps://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] problems with read.csv
Dear all, I'd like to ask help on R code to get the same results as the following Splus code: >indata<-importData("/home/data_new.csv") >indata[1:5,4] [1] 0930 1601 1006 1032 1020 I tried the following R code: > indata<-read.csv("/home/data_new.csv") > indata[1:5,4] [1] 930 1601 1006 1032 1020 I'd like the first one to be 0930, too. Thanks in advance, Betty [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] qqplot
Thanks for all your replies. I just wanted to compare the distributions of two populations and see how different they are. I thought that QQplot would be a good idea. It's true that the mean and sd of the two distributions are not the same. I don't know if any other method or graphical presentation like five number, mean and sd as suggested, or boxplot would have been more suited to be used. thanks for your advices, --- On Mon, 11/2/09, John Fox wrote: > From: John Fox > Subject: RE: [R] qqplot > To: "'Peter Flom'" > Cc: r-h...@stat.math.ethz.ch, "'carol white'" , "'Yihui > Xie'" > Date: Monday, November 2, 2009, 9:24 AM > Dear Peter, > > I assumed that Carol wanted to compare the shapes of the > distributions and > to adjust for differences in centre and spread. To put a > line through the > quartiles or to base a line on the medians and IQRs is more > robust than > using the means and sds. > > Best, > John > > > > -Original Message- > > From: r-help-boun...@r-project.org > [mailto:r-help-boun...@r-project.org] > On > > Behalf Of Peter Flom > > Sent: November-02-09 11:57 AM > > To: carol white; Yihui Xie > > Cc: r-h...@stat.math.ethz.ch > > Subject: Re: [R] qqplot > > > > carol white > wrote > > > > >So the conclusion is that abline(0,1) should > always be used and if it > > doesn't go through the qqplot, the two distributions > are not similar? > > > > I think it depends what you mean by "similar". > E.g., if you mean "are > both > > of these distributions (e.g.) normal?" then > abline(0,1) is not always > useful. > > But if you mean "Do these have the same mean, sd, and > distribution?" then > > abline(0,1) is the way to go. > > > > Peter > > > > Peter L. Flom, PhD > > Statistical Consultant > > Website: www DOT peterflomconsulting DOT com > > Writing; http://www.associatedcontent.com/user/582880/peter_flom.html > > Twitter: �...@peterflom > > > > __ > > R-help@r-project.org > mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, > reproducible code. > > > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Plots with k-means
I send r-code in an attached file. 2009/11/2 Iuri Gavronski : > Eduardo, > > Would you mind sending me the R code in an attached file. Your code didn't > work here and I am not sure it is because of line breaks from the email > program. > > Iuri. > > On Mon, Nov 2, 2009 at 10:53 AM, eduardo san miguel < eduardosa...@gmail.com> > wrote: >> >> Hello all, >> >> I have almost finished the development of a new package where ideas >> from Tamara Munzner, George Furnas and Costa and Venturini are >> implemented. >> >> 1.- Da Costa, David & Venturini, Gilles (2006). An Interactive >> Visualization Environment for Data Exploration Using Points of >> Interest. adma 2006: 416-423 >> >> 2.- Furnas, George (1986). Generalized Fisheye Views. Human Factors in >> computing systems, CHI '86 conference proceedings, ACM, New York, pp. >> 16-23. >> >> 3.- Heidi Lam, Ronald A. Rensink, and Tamara Munzner (2006). Effects >> of 2D Geometric Transformations on Visual Memory. Proc. Applied >> Perception in Graphics and Visualization (APGV 2006), 119-126, 2006. >> >> 4.- Keith Lau, Ron Rensink, and Tamara Munzner (2004). Perceptual >> Invariance of Nonlinear Focus+Context Transformations. Proc. First >> Symposium on Applied Perception in Graphics and Visualization (APGV >> 04) 2004, pp 65-72. >> >> This is a sample with some basic functionality and a VERY BASIC >> example with kmeans plotting. >> >> Comments will be greatly appreciated. >> >> Regards >> >> -- R CODE >> require(methods) >> >> setClass(Class = 'POI', >>representation(matrizSim = 'matrix',cos.query.docs = 'vector', >> wordsInQuery = 'ANY',docs = 'matrix', objeto = 'matrix', objetoC >> = 'matrix', >> Pcoords = 'matrix', PcoordsFI = 'matrix', newPcoords = 'matrix', >> newcoords = 'numeric' , >> newcoords_1 = 'numeric', M = 'numeric', poisTextCol = >> 'character' , colores = 'vector' , >> poisCircleCol = 'character' , linesCol = 'character', itemsCol = >> 'character', >> LABELS = 'logical', vscale = 'numeric', hscale = 'numeric', >> circleCol = 'character', >> plotCol = 'character', itemsFamily = 'character', lenteDefault >> = 'numeric', >> zoomDefault = 'numeric' , rateDefault = 'numeric' , >> topKDefault = 'numeric' , >> pal = 'character', selected = 'numeric' , circRadio = >> 'numeric' , IncVscale = 'numeric', >> cgnsphrFont = 'numeric', xClick_old = 'numeric', yClick_old = >> 'numeric', >> wordsInQueryFull = 'character' ), >> prototype(cos.query.docs = 0, colores = 0, newcoords = 0, >> newcoords_1 = 0, M = 3, >> vscale = 0.5 , hscale = 1.5 , circleCol = 'black' , >> itemsCol = 'white', >> poisTextCol = '#fff5ee', poisCircleCol = '#fff5ee', >> linesCol = 'white', >> plotCol = 'black', itemsFamily = 'sans', lenteDefault = >> 1, zoomDefault = 15 , >> rateDefault = 0.1 , topKDefault = 25, pal = 'topo' , >> selected = 1 , >> circRadio = 0.25 , IncVscale = 0.05 , cgnsphrFont = >> 1.01, LABELS = T) >> ) >> >> setGeneric("puntosMedios" , >> function(Pcoords, detalle = 5){standardGeneric >> ("puntosMedios")}) >> >> setMethod("puntosMedios" , >>signature = "matrix", >>function(Pcoords, detalle = 5){ >> >> for (i in 1:detalle){ >>new_pcoords = matrix(rep(0,4*nrow(Pcoords)), nrow = 2* >> nrow(Pcoords), byrow = T ) >>cont = 0 >>for (i in 1:nrow(Pcoords)){ >> if (i == nrow(Pcoords)) { >>cont = cont + 1 >>new_pcoords[cont,] = Pcoords[i,] >>cont = cont + 1 >>new_pcoords[cont,] = Pcoords[i,] - >> ((Pcoords[i,]-Pcoords[1,])/2) >>}else{ >>cont = cont + 1 >>new_pcoords[cont,] = Pcoords[i,] >>cont = cont + 1 >>new_pcoords[cont,] = Pcoords[i,] - >> ((Pcoords[i,]-Pcoords[i+1,])/2)}} >>Pcoords = new_pcoords} >>return(Pcoords) >> >> } >> ) >> >> setGeneric("fishIout" , >> function(x, value){standardGeneric ("fishIout")}) >> >> setMethod("fishIout" , >>signature = "numeric", >>function(x, value){ >> >> d = value >>if (x > 0){ >>signo = 1 >>}else{ >>signo = -1 >>} >>x = abs(x) >>return(signo*(-(x/((d*x)-d-1 >> } >> ) >> >> setGeneric("fishIin" , >> function(x, value){standardGeneric ("fishIin")}) >> >> setMethod("fishIin" , >>signature = "numeric", >>function(x, value){ >> >> d = value >>if (x > 0){ >>signo = 1 >>}else{ >>signo = -1 >>} >>x = abs(x) >> >>return(signo*(((d+1)*x)/(d*x+1))) >> } >> ) >> >> setGeneric("toPolar" , >> function(x, y){standardGeneric ("toPolar")}) >> >> setMethod("toPolar" , >>signature = "numeric", >>function(x, y){ >> >>t1 = atan2(y,x) >>
[R] reference on p.adjust()
I'm wondering if there is a textbook that summarize the methods on adjusting p-values for multiple comparisons. Most of the references on p.adjust() are over 10 years old. I feel it would be better if somebody could recommend me a textbook on multiple hypothesis correction, so that I can quickly get a general idea of different methods. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Frequency
On Nov 2, 2009, at 2:49 PM, Jorge Ivan Velez wrote: Hi Ashta, Yes, it is possible. Here is a suggestion: # Data set x <- read.table(textConnection("v1 v2 v3 v4 569 10 347 10 46 10 18"), header = TRUE) closeAllConnections() # Table res <- table(data.matrix(x)) f <- sort(res, decreasing = TRUE) data.frame(value = names(f), counts = f) # :-) require(fortunes) fortune('Only how') HTH, Jorge On Mon, Nov 2, 2009 at 2:26 PM, Ashta <> wrote: Thank you Jorge and # 10 4 6 3 5 7 9 18 # 3 2 2 1 1 1 1 1 This one works fine for me. Is it possible to transpose it? I tried t(res[order(res, decreasing = TRUE)]), but it did not work! And in the spirit of that fortune above, also look at: t(t(res[order(res, decreasing = TRUE)])) The extra t() coerces to a matrix and the outer t() does the transpose (I think). -- David. I want the result like this 10 2 4 2 6 2 3 1 . . . . On Mon, Nov 2, 2009 at 1:45 PM, Jorge Ivan Velez <> wrote: Hi Val, Here is a suggestion: res <- table(unlist(x)) res[order(res, decreasing = TRUE)] # 10 4 6 3 5 7 9 18 # 3 2 2 1 1 1 1 1 HTH, Jorge On Mon, Nov 2, 2009 at 1:35 PM, Val <> wrote: BAYESIAN INFERENCES FOR MILKING TEMPERAMENT IN CANADIAN HOLSTEINS Hi All, I have a data set "x" with several variables. Sample of the data is shown below V1 v2 v3 v4 569 10 347 10 46 10 18 I want the frequency of each data point sorted by their occurrence. Below is the output that I want 10=3 6=2 4=2 9=1 5=1 7=1 3=1 How do I do it in R? Thanks in advance Val [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] "Safe" way to automatically install required packages...
On Mon, Nov 2, 2009 at 10:56 AM, Jonathan Greenberg wrote: > R-helpers: > > I'm working on an r-package that I want to make as easy-to-use as possible > for a novice R-user, which includes automatically installing required > packages. I, myself, am a novice R-packager, so the solution I came up > with was to embed: > > print("Loading required packages...") > if (!require("reshape")) { install.packages("reshape") } > if (!require("reshape")) { > print("Could not install package 'reshape', please contact your > sysadmin.") > return() > } > > in the code proper, and put together the package using package.skeleton() > and R CMD build. > > I'm guessing there's a better way to do this -- any suggestions? > --j Place the dependencies of your package in a comma-seperated list in the depends: field of the DESCRIPTION file. When a user runs install.packages( 'yourPackage', dependencies = T ), R will take care of downloading and installing the dependencies -Charlie __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Incremental ReadLines
James, I think those are Unix commands? I'm on Windows, so that's not an option (for now) Also the suggestions posed by Duncan and Phil seem to be working. Thank you so much, such a simple thing to add the "r" or "rt" to the file connection. I read about blocking, but I didn't imagine that it meant "chunks". I was thinking something more like "blocking out", or guarding (perhaps for security). On Mon, Nov 2, 2009 at 1:47 PM, James W. MacDonald wrote: > Hi Gene, > > Rather than using R to parse this file, have you considered using either > grep or sed to pre-process the file and then read it in? > > It looks like you just want lines starting with numbers, so something like > > grep '^[0-9]\+' thefile.csv > otherfile.csv > > should be much faster, and then you can just read in otherfile.csv using > read.csv(). > > Best, > > Jim > > > > Gene Leynes wrote: > >> I've been trying to figure out how to read in a large file for a few days >> now, and after extensive research I'm still not sure what to do. >> >> I have a large comma delimited text file that contains 59 fields in each >> record. >> There is also a header every 121 records >> >> This function works well for smallish records >> getcsv=function(fname){ >>ff=file(description = fname) >>x <- readLines(ff) >>closeAllConnections() >>x <- x[x != ""] # REMOVE BLANKS >>x=x[grep("^[-0-9]", x)] # REMOVE ALL TEXT >> >>spl=strsplit(x,',') # THIS PART IS SLOW, BUT MANAGABLE >> >> >> xx=t(sapply(1:length(spl),function(temp)as.vector(na.omit(as.numeric(spl[[temp]]) >>return(xx) >> } >> It's not elegant, but it works. >> For 121,000 records it completes in 2.3 seconds >> For 121,000*5 records it completes in 63 seconds >> For 121,000*10 records it doesn't complete >> >> When I try other methods to read the file in chunks (using scan), the >> process breaks down because I have to start at the beginning of the file >> on >> every iteration. >> For example: >> fnn=function(n,col){ >>a=122*(n-1)+2 >>xx=scan(fname,skip=a-1,nlines=121,sep=',',quiet=TRUE,what=character(0)) >>xx=xx[xx!=''] >>xx=matrix(xx,ncol=49,byrow=TRUE) >>xx[,col] >> } >> system.time(sapply(1:10,fnn,c=26)) # 0.31 Seconds >> system.time(sapply(91:90,fnn,c=26))# 1.09 Seconds >> system.time(sapply(901:910,fnn,c=26)) # 5.78 Seconds >> >> Even though I'm only getting the 26th column for 10 sets of records, it >> takes a lot longer the further into the file I go. >> >> How can I tell scan to pick up where it left off, without it starting at >> the >> beginning?? There must be a good example somewhere. >> >> I have done a lot of research (in fact, thank you to Michael J. Crawley >> and >> others for your help thus far) >> >> Thanks, >> >> Gene >> >>[[alternative HTML version deleted]] >> >> __ >> R-help@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > -- > James W. MacDonald, M.S. > Biostatistician > Douglas Lab > University of Michigan > Department of Human Genetics > 5912 Buhl > 1241 E. Catherine St. > Ann Arbor MI 48109-5618 > 734-615-7826 > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Sorting NA's and Graphing a histogram
Hi Koraelus, Here is a suggestion for the first part: index <- apply(Dataset, 1, function(x) sum( is.na( x ) ) < 2 ) d2 <- Dataset[index ,] d2 # Alpha Beta Gamma Delta # A 12 3 4 # BNA2 4 5 # D 891011 # E 5 NA 713 Could you please explain us in more detail what would you like to do in the second part? HTH, Jorge On Mon, Nov 2, 2009 at 1:57 PM, Koraelus <> wrote: > > Let's say I have a dataset > > Dataset<-read.csv("Dataset.txt", header=T, row.names=1) > > Dataset > Alpha Beta Gamma Delta > A 1 23 4 > B NA24 5 > C NA3NA NA > D 8 9 10 11 > E5 NA 713 > F NANA NA 4 > > I want to remove all rows with more than one NA > > So I want it to look like > > Alpha Beta Gamma Delta > A 12 34 > B NA 2 45 > D 8 9 10 11 > E 5 NA 7 13 > > Then I want to turn the rownames into variables A, B, D, E that I can work > with as if I had used the attach() cmd, but on the rownames. > > plot(A) > -- > View this message in context: > http://old.nabble.com/Sorting-NA%27s-and-Graphing-a-histogram-tp26157779p26157779.html > Sent from the R help mailing list archive at Nabble.com. > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Robust ANOVA or alternative test?
Hello On 11/2/09, jinyan fan wrote: > to address this issue, either by some sort of Robust ANOVA procedure, or by > some alternative tests? Thanks a lot. > I cannot address this specific question, but you may want to address robustness-related queries to r-sig-robust, and perhaps check the Robust [1] Task View. Best Liviu [1] http://cran.r-project.org/web/views/Robust.html __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Frequency
Hi Ashta, Yes, it is possible. Here is a suggestion: # Data set x <- read.table(textConnection("v1 v2 v3 v4 569 10 347 10 46 10 18"), header = TRUE) closeAllConnections() # Table res <- table(data.matrix(x)) f <- sort(res, decreasing = TRUE) data.frame(value = names(f), counts = f) # :-) require(fortunes) fortune('Only how') HTH, Jorge On Mon, Nov 2, 2009 at 2:26 PM, Ashta <> wrote: > Thank you Jorge and > > > res <- table(unlist(x)) > > res[order(res, decreasing = TRUE)] > > # 10 4 6 3 5 7 9 18 > > # 3 2 2 1 1 1 1 1 > > This one works fine for me. Is it possible to transpose it? > I tried t(res[order(res, decreasing = TRUE)]), but it did not work! > > I want the result like this > 10 2 > 4 2 > 6 2 > 3 1 > . . > . . > > > > > On Mon, Nov 2, 2009 at 1:45 PM, Jorge Ivan Velez > <> wrote: > > Hi Val, > > > > Here is a suggestion: > > > > res <- table(unlist(x)) > > res[order(res, decreasing = TRUE)] > > # 10 4 6 3 5 7 9 18 > > # 3 2 2 1 1 1 1 1 > > > > HTH, > > Jorge > > > > > > On Mon, Nov 2, 2009 at 1:35 PM, Val <> wrote: > > > >> BAYESIAN INFERENCES FOR MILKING TEMPERAMENT IN CANADIAN HOLSTEINS > >> > >> Hi All, > >> > >> I have a data set "x" with several variables. Sample of the data is > shown > >> below > >> > >> V1 v2 v3 v4 > >> > >> 569 10 > >> > >> 347 10 > >> > >> 46 10 18 > >> > >> > >> > >> I want the frequency of each data point sorted by their occurrence. > >> > >> > >> > >> Below is the output that I want > >> > >> 10=3 > >> > >> 6=2 > >> > >> 4=2 > >> > >> 9=1 > >> > >> 5=1 > >> > >> 7=1 > >> > >> 3=1 > >> > >> How do I do it in R? > >> > >> > >> > >> Thanks in advance > >> > >> > >> > >> Val > >> > >>[[alternative HTML version deleted]] > >> > >> __ > >> R-help@r-project.org mailing list > >> https://stat.ethz.ch/mailman/listinfo/r-help > >> PLEASE do read the posting guide > >> http://www.R-project.org/posting-guide.html > >> and provide commented, minimal, self-contained, reproducible code. > >> > > > >[[alternative HTML version deleted]] > > > > __ > > R-help@r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Incremental ReadLines
Hi Gene, Rather than using R to parse this file, have you considered using either grep or sed to pre-process the file and then read it in? It looks like you just want lines starting with numbers, so something like grep '^[0-9]\+' thefile.csv > otherfile.csv should be much faster, and then you can just read in otherfile.csv using read.csv(). Best, Jim Gene Leynes wrote: I've been trying to figure out how to read in a large file for a few days now, and after extensive research I'm still not sure what to do. I have a large comma delimited text file that contains 59 fields in each record. There is also a header every 121 records This function works well for smallish records getcsv=function(fname){ ff=file(description = fname) x <- readLines(ff) closeAllConnections() x <- x[x != ""] # REMOVE BLANKS x=x[grep("^[-0-9]", x)] # REMOVE ALL TEXT spl=strsplit(x,',') # THIS PART IS SLOW, BUT MANAGABLE xx=t(sapply(1:length(spl),function(temp)as.vector(na.omit(as.numeric(spl[[temp]]) return(xx) } It's not elegant, but it works. For 121,000 records it completes in 2.3 seconds For 121,000*5 records it completes in 63 seconds For 121,000*10 records it doesn't complete When I try other methods to read the file in chunks (using scan), the process breaks down because I have to start at the beginning of the file on every iteration. For example: fnn=function(n,col){ a=122*(n-1)+2 xx=scan(fname,skip=a-1,nlines=121,sep=',',quiet=TRUE,what=character(0)) xx=xx[xx!=''] xx=matrix(xx,ncol=49,byrow=TRUE) xx[,col] } system.time(sapply(1:10,fnn,c=26)) # 0.31 Seconds system.time(sapply(91:90,fnn,c=26))# 1.09 Seconds system.time(sapply(901:910,fnn,c=26)) # 5.78 Seconds Even though I'm only getting the 26th column for 10 sets of records, it takes a lot longer the further into the file I go. How can I tell scan to pick up where it left off, without it starting at the beginning?? There must be a good example somewhere. I have done a lot of research (in fact, thank you to Michael J. Crawley and others for your help thus far) Thanks, Gene [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- James W. MacDonald, M.S. Biostatistician Douglas Lab University of Michigan Department of Human Genetics 5912 Buhl 1241 E. Catherine St. Ann Arbor MI 48109-5618 734-615-7826 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Incremental ReadLines
On 11/2/2009 2:03 PM, Gene Leynes wrote: I've been trying to figure out how to read in a large file for a few days now, and after extensive research I'm still not sure what to do. I have a large comma delimited text file that contains 59 fields in each record. There is also a header every 121 records You can open the connection before reading, then read in blocks of lines and process those. You don't need to reopen it every time. For example, ff <- file(fname, open="rt") # rt is read text for (block in 1:nblocks) { x <- readLines(ff, n=121) # process this block } close(ff) Duncan Murdoch This function works well for smallish records getcsv=function(fname){ ff=file(description = fname) x <- readLines(ff) closeAllConnections() x <- x[x != ""] # REMOVE BLANKS x=x[grep("^[-0-9]", x)] # REMOVE ALL TEXT spl=strsplit(x,',') # THIS PART IS SLOW, BUT MANAGABLE xx=t(sapply(1:length(spl),function(temp)as.vector(na.omit(as.numeric(spl[[temp]]) return(xx) } It's not elegant, but it works. For 121,000 records it completes in 2.3 seconds For 121,000*5 records it completes in 63 seconds For 121,000*10 records it doesn't complete When I try other methods to read the file in chunks (using scan), the process breaks down because I have to start at the beginning of the file on every iteration. For example: fnn=function(n,col){ a=122*(n-1)+2 xx=scan(fname,skip=a-1,nlines=121,sep=',',quiet=TRUE,what=character(0)) xx=xx[xx!=''] xx=matrix(xx,ncol=49,byrow=TRUE) xx[,col] } system.time(sapply(1:10,fnn,c=26)) # 0.31 Seconds system.time(sapply(91:90,fnn,c=26))# 1.09 Seconds system.time(sapply(901:910,fnn,c=26)) # 5.78 Seconds Even though I'm only getting the 26th column for 10 sets of records, it takes a lot longer the further into the file I go. How can I tell scan to pick up where it left off, without it starting at the beginning?? There must be a good example somewhere. I have done a lot of research (in fact, thank you to Michael J. Crawley and others for your help thus far) Thanks, Gene [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Frequency
Thank you Jorge and > res <- table(unlist(x)) > res[order(res, decreasing = TRUE)] > # 10 4 6 3 5 7 9 18 > # 3 2 2 1 1 1 1 1 This one works fine for me. Is it possible to transpose it? I tried t(res[order(res, decreasing = TRUE)]), but it did not work! I want the result like this 10 2 4 2 6 2 3 1 . . . . On Mon, Nov 2, 2009 at 1:45 PM, Jorge Ivan Velez wrote: > Hi Val, > > Here is a suggestion: > > res <- table(unlist(x)) > res[order(res, decreasing = TRUE)] > # 10 4 6 3 5 7 9 18 > # 3 2 2 1 1 1 1 1 > > HTH, > Jorge > > > On Mon, Nov 2, 2009 at 1:35 PM, Val <> wrote: > >> BAYESIAN INFERENCES FOR MILKING TEMPERAMENT IN CANADIAN HOLSTEINS >> >> Hi All, >> >> I have a data set "x" with several variables. Sample of the data is shown >> below >> >> V1 v2 v3 v4 >> >> 5 6 9 10 >> >> 3 4 7 10 >> >> 4 6 10 18 >> >> >> >> I want the frequency of each data point sorted by their occurrence. >> >> >> >> Below is the output that I want >> >> 10 =3 >> >> 6=2 >> >> 4=2 >> >> 9=1 >> >> 5=1 >> >> 7=1 >> >> 3=1 >> >> How do I do it in R? >> >> >> >> Thanks in advance >> >> >> >> Val >> >> [[alternative HTML version deleted]] >> >> __ >> R-help@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Sorting NA's and Graphing a histogram
Let's say I have a dataset Dataset<-read.csv("Dataset.txt", header=T, row.names=1) Dataset Alpha Beta Gamma Delta A 1 23 4 B NA24 5 C NA3NA NA D 8 9 10 11 E5 NA 713 F NANA NA 4 I want to remove all rows with more than one NA So I want it to look like Alpha Beta Gamma Delta A 12 34 B NA 2 45 D 8 9 10 11 E 5 NA 7 13 Then I want to turn the rownames into variables A, B, D, E that I can work with as if I had used the attach() cmd, but on the rownames. plot(A) -- View this message in context: http://old.nabble.com/Sorting-NA%27s-and-Graphing-a-histogram-tp26157779p26157779.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Robust ANOVA or alternative test?
Dear Colleagues, I am running a three-way mixed-design ANOVA, with one manipulated IV, a group membership IV, and pre-/post- within subject factor. I am interested in the three-way interaction effect. The regular ANVOA is problematic, because (a) the sample sizes are very unbalanced, due to the group IV, (b) the homogeneity of variance is violated, and (c) the homogeneity of covariance is also violated. I wonder if R has a procedure to address this issue, either by some sort of Robust ANOVA procedure, or by some alternative tests? Thanks a lot. Jinyan Fan Assistant Professor Department of Psychology Hofstra University Hempstead, NY 11550 Work: 516-463-6349 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Incremental ReadLines
I've been trying to figure out how to read in a large file for a few days now, and after extensive research I'm still not sure what to do. I have a large comma delimited text file that contains 59 fields in each record. There is also a header every 121 records This function works well for smallish records getcsv=function(fname){ ff=file(description = fname) x <- readLines(ff) closeAllConnections() x <- x[x != ""] # REMOVE BLANKS x=x[grep("^[-0-9]", x)] # REMOVE ALL TEXT spl=strsplit(x,',') # THIS PART IS SLOW, BUT MANAGABLE xx=t(sapply(1:length(spl),function(temp)as.vector(na.omit(as.numeric(spl[[temp]]) return(xx) } It's not elegant, but it works. For 121,000 records it completes in 2.3 seconds For 121,000*5 records it completes in 63 seconds For 121,000*10 records it doesn't complete When I try other methods to read the file in chunks (using scan), the process breaks down because I have to start at the beginning of the file on every iteration. For example: fnn=function(n,col){ a=122*(n-1)+2 xx=scan(fname,skip=a-1,nlines=121,sep=',',quiet=TRUE,what=character(0)) xx=xx[xx!=''] xx=matrix(xx,ncol=49,byrow=TRUE) xx[,col] } system.time(sapply(1:10,fnn,c=26)) # 0.31 Seconds system.time(sapply(91:90,fnn,c=26))# 1.09 Seconds system.time(sapply(901:910,fnn,c=26)) # 5.78 Seconds Even though I'm only getting the 26th column for 10 sets of records, it takes a lot longer the further into the file I go. How can I tell scan to pick up where it left off, without it starting at the beginning?? There must be a good example somewhere. I have done a lot of research (in fact, thank you to Michael J. Crawley and others for your help thus far) Thanks, Gene [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] "Safe" way to automatically install required packages...
If you package "depends" on another package, it will be automatically installed. Hadley On Mon, Nov 2, 2009 at 12:56 PM, Jonathan Greenberg wrote: > R-helpers: > > I'm working on an r-package that I want to make as easy-to-use as possible > for a novice R-user, which includes automatically installing required > packages. I, myself, am a novice R-packager, so the solution I came up > with was to embed: > > print("Loading required packages...") > if (!require("reshape")) { install.packages("reshape") } > if (!require("reshape")) { > print("Could not install package 'reshape', please contact your > sysadmin.") > return() > } > > in the code proper, and put together the package using package.skeleton() > and R CMD build. > > I'm guessing there's a better way to do this -- any suggestions? > --j > > -- > > Jonathan A. Greenberg, PhD > Postdoctoral Scholar > Center for Spatial Technologies and Remote Sensing (CSTARS) > University of California, Davis > One Shields Avenue > The Barn, Room 250N > Davis, CA 95616 > Phone: 415-763-5476 > AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307 > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- http://had.co.nz/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] using exists with coef from an arima fit
On 2/11/2009, at 5:27 PM, Erin Hodgess wrote: Dear R People: I have the output from an arima model fit in an object xxx. I want to verify that the ma1 coefficient is there, so I did the following: xxx$coef ar1ar2ma1 intercept 1.3841297 -0.4985667 -0.996 -0.1091657 str(xxx$coef) Named num [1:4] 1.384 -0.499 -1 -0.109 - attr(*, "names")= chr [1:4] "ar1" "ar2" "ma1" "intercept" exists('xxx$coef["ma1"]') [1] FALSE Other than using xxx$coef[3], is there another way to check this, please? (1) as.logical(length(grep("ma1",names(xxx$coef))) (2) length(grep("ma1",names(xxx$coef))) != 0 (3) with(as.list(xxx$coef),exists("ma1")) spring to mind. cheers, Rolf ## Attention:\ This e-mail message is privileged and confid...{{dropped:9}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] "Safe" way to automatically install required packages...
R-helpers: I'm working on an r-package that I want to make as easy-to-use as possible for a novice R-user, which includes automatically installing required packages. I, myself, am a novice R-packager, so the solution I came up with was to embed: print("Loading required packages...") if (!require("reshape")) { install.packages("reshape") } if (!require("reshape")) { print("Could not install package 'reshape', please contact your sysadmin.") return() } in the code proper, and put together the package using package.skeleton() and R CMD build. I'm guessing there's a better way to do this -- any suggestions? --j -- Jonathan A. Greenberg, PhD Postdoctoral Scholar Center for Spatial Technologies and Remote Sensing (CSTARS) University of California, Davis One Shields Avenue The Barn, Room 250N Davis, CA 95616 Phone: 415-763-5476 AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Frequency
sorry, I forgot to send my reply to the list, I got to remember to hit reply all: So I set up a dummy matrix, v1,v2,v3,v4, an datamatrix v1 = c(5,3,4) v2 = c(6,4,6) v3 = c(9,7,10) v4 = c(10,10,18) datamatrix=c(v1,v2,v3,v4) then do sort(table(datamatrix)) Joe King 206-913-2912 j...@joepking.com "Never throughout history has a man who lived a life of ease left a name worth remembering." --Theodore Roosevelt -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Val Sent: Monday, November 02, 2009 10:35 AM To: r-help@r-project.org Subject: [R] Frequency BAYESIAN INFERENCES FOR MILKING TEMPERAMENT IN CANADIAN HOLSTEINS Hi All, I have a data set "x" with several variables. Sample of the data is shown below V1 v2 v3 v4 569 10 347 10 46 10 18 I want the frequency of each data point sorted by their occurrence. Below is the output that I want 10=3 6=2 4=2 9=1 5=1 7=1 3=1 How do I do it in R? Thanks in advance Val [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Frequency
Hi Val, Here is a suggestion: res <- table(unlist(x)) res[order(res, decreasing = TRUE)] # 10 4 6 3 5 7 9 18 # 3 2 2 1 1 1 1 1 HTH, Jorge On Mon, Nov 2, 2009 at 1:35 PM, Val <> wrote: > BAYESIAN INFERENCES FOR MILKING TEMPERAMENT IN CANADIAN HOLSTEINS > > Hi All, > > I have a data set "x" with several variables. Sample of the data is shown > below > > V1 v2 v3 v4 > > 569 10 > > 347 10 > > 46 10 18 > > > > I want the frequency of each data point sorted by their occurrence. > > > > Below is the output that I want > > 10=3 > > 6=2 > > 4=2 > > 9=1 > > 5=1 > > 7=1 > > 3=1 > > How do I do it in R? > > > > Thanks in advance > > > > Val > >[[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to print the full name of the factors in summary?
Jack: You are confused! You have specified that the factor is ordered (ordered = TRUE), so you do _not get_ the (6, not 7 btw)single degree of freedom contrasts corresponding to weekday names. Instead you get the (default) contrasts for an ordered factor (.L = linear, .Q = quadratic, etc.). Also you apparently do not have all weekdays present or maybe there is partial confounding with some of your other factors, thus producing the singularities given in the message. While you can read up on such matters ("contrasts") in the docs and texts about R (V&R's MASS has a nice explanation), these issues are inherently technical. So if you do not have the requisite statistical background, I suggest you consult a statistician. To paraphrase an R fortune -- if you do not know how to properly use the statistical tools, you shouldn't be using them. Cheers, Bert Gunter Genentech Nonclinical Biostatistics -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Jen-Chien Chang Sent: Monday, November 02, 2009 7:02 AM To: r-help@r-project.org Subject: [R] how to print the full name of the factors in summary? Hi, I am wondering if there is a simple way to fix the problem I am having. For unknown reason, I could not get the full name of the factors to be printed in the summary. I have tried to used summary.lm as well but the problem still persists. SJ$Weekday <- factor(SJ$Weekday,1:7,c("Mon","Tue","Wed","Thu","Fri","Sat","Sun"),ordered=T ) attach(SJ) lm.SJ <- lm(Demand ~ Weekday+Month+Holiday+Season) summary(lm.SJ) Call: lm(formula = Demand ~ Weekday + Month + Holiday + Season) Residuals: Min 1Q Median 3Q Max -69.767 -12.224 -1.378 10.857 91.376 Coefficients: (3 not defined because of singularities) Estimate Std. Error t value Pr(>|t|) (Intercept) 88.7091 3.3442 26.527 < 2e-16 *** Weekday.L20.8132 2.8140 7.396 1.08e-12 *** Weekday.Q -12.7667 2.8156 -4.534 7.99e-06 *** Weekday.C -10.6375 2.8113 -3.784 0.000182 *** Weekday^4-8.3325 2.8103 -2.965 0.003238 ** - Is there a way for summary to print the full name of the factors and levels? Say Weekday.Tue instead Weekday.L? Thanks! Jack Chang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Frequency
BAYESIAN INFERENCES FOR MILKING TEMPERAMENT IN CANADIAN HOLSTEINS Hi All, I have a data set "x" with several variables. Sample of the data is shown below V1 v2 v3 v4 569 10 347 10 46 10 18 I want the frequency of each data point sorted by their occurrence. Below is the output that I want 10=3 6=2 4=2 9=1 5=1 7=1 3=1 How do I do it in R? Thanks in advance Val [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] "object not found" within function
Hi, I am trying to write a function to compute many cross-tabulations with the -svytable- command. Here is a simplified example of the structure of my code (adapted from the -svytable- help file): data(api) func.example<-function(variable){ dclus1<-svydesign(id=~1, weights=~pw,data=apiclus1, fpc=~fpc) svytable(~ variable, dclus1) } When I call this function with: func.example(api99) I get the following error: Error in eval(expr, envir, enclos) : object 'variable' not found. (Everything works fine when I type svytable(~ api99, dclus1).) I guess that the problem has something to do with function environments, but since I am new to R, I can't seem to figure out how to rectify the code. I am running R version 2.10 on Windows Vista. Thanks for the help! Thushyanthan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] R and Python
I am a long time user of R for financial time series analysis (xts, zoo, etc) and for my next project I am thinking of adding the Python language to the mix. The reason for adding Python is primarily its non-statistical capabilities. So my questions are what have people's experiences been with using interop between R and Python. I see there are two items, rPy and RSPython. It looks like rPy makes it possible to call R code from Python, and RSPython goes both ways. My needs would be to use Python to drive R to get it's extensive time series and stats, and also to get to Python objects from inside R. I searched the list archives and it looks like many people use rPy, but what about RSPython for calling Python from R? It looks like rPy only goes from R into Python, and RSPython has not been updated since 2005? Other messages in the archives state that RSPython only works on Unix? Would I be foolish to build anything mission critical on RSPython? Is there a better way to get at Python from R? Thanks as always. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] a prolem with constrOptim
Hi, I apologize for the long message but the problem I encountered can't be stated in a few lines. I am having some problems with the function constrOptim. My goal is to maximize the likelihood of product of K multinomials, each with four catagories under linear constraints on the parameter values. I have found that the function does not work for many data configurations. #The likelihood and score functions are: likelihood = function(theta,data) { p = matrix(theta,nrow=3,byrow=F) s = 1-apply(p,2,sum) p = rbind(p,s) Z = matrix(data,nrow=4,byrow=F) sum(log(p)*Z) } score = function(theta,data) { k = length(data)/4 S = numeric(3*k) for(i in 1:k) { n = data[(1+4*(i-1)):(4*i)] p = theta[(1+3*(i-1)):(3*i)] P = 1-sum(p) S[(1+3*(i-1)):(3*i)] = n[1:3]/p-n[4]/P } S } #where theta=(p11,p12,p13,p21,p22,p23,...,pK1,pK2,pK3). #The function Rmat calculates the restriction matrix needed for constrained estimation Rmat = function(k) { R = matrix(1,4,3) R[1,2] = R[1,3] = R[2,3] = R[3,2] = 0 RR = cbind(-R,R) RRR = matrix(0,4*(k-1),3*k) for ( i in 1:(k-1) ) RRR[4*(i-1)+1:4,3*(i-1)+1:6] = RR RRR } #The function gen.data generates data gen.data = function(p,N) { k = ncol(p) Data = matrix(0,4,k) for(j in 1:k) { Data[,j] = as.vector(rmultinom(1, size = N[j], prob=p[,j])) } as.vector(Data) } #I have found that the function returns an error under some configurations of the data ( 0's seem to harm it). Why is this happening and how can it #be corrected? Why do some configurations with 0's crash the program and others don't? # For example for the data: data1 = c(0,6,8,6,3,4,4,9,2,2,5,11) data2 = c(0,6,8,6,0,4,4,9,2,2,5,11) # inputs for constrOptim K = 3 RR = Rmat(K) zeros = rep(0,nrow(RR)) theta.0 = numeric(3*K) for(i in 1:K) {theta.0[(1+3*(i-1)):(3*i)] = rep(1/10,3)+i/100} # data1 gives an error but data2 does not! constrOptim(theta=theta.0, f=likelihood, grad=score, ui=RR,ci=zeros, data = data1, method="BFGS", control=list(fnscale=-1,abstol=1e-16,reltol=1e-16))$par constrOptim(theta=theta.0, f=likelihood, grad=score, ui=RR,ci=zeros, data = data2, method="BFGS", control=list(fnscale=-1,abstol=1e-16,reltol=1e-16))$par # If you want to further check run the simulation below. K = 3 N = rep(20,K) p = c(1/10,2/10,2/10,5/10) p = matrix(rep(p,K),nrow=4,byrow=F) RR = Rmat(K) zeros = rep(0,nrow(RR)) theta.0 = numeric(3*K) for(i in 1:K) {theta.0[(1+3*(i-1)):(3*i)] = rep(1/10,3)+i/100} for(i in 1:1000) { data = gen.data(p,N) print(i) print(data) theta.til = constrOptim(theta=theta.0, f=likelihood, grad=score, ui=RR,ci=zeros, data = data, method="BFGS", control=list(fnscale=-1,abstol=1e-16,reltol=1e-16))$par print(theta.til) } [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Summing rows based on criteria
I am attempting to clean up some land use building data and need to join some buildings together making sure not to double count GIS slivers. The first data.frame is the original, the 2nd adds all the acres for each identical bldgid. I now want to a) throw out all but one of the the cases where the Years and ImpValue are Identical, b) Sum the impvalues based on: 1) The Years are identical and the ImpValue are not 2)The ImpValues are identical and the Years are not Resulting in the 3rd data frame. Please consider the following DF=cbind(Acres,Bldgid,Year,ImpValue) DF<-as.data.frame(DF) DF Acres Bldgid Year ImpValue 1 100.00 1 1946 1000 2 101.00 2 1952 1400 3 100.00 3 1922 1300 4 130.00 4 1910 900 5 156.00 5 1955 5000 60.50 5 1955 1200 7 293.00 6 1999 500 8 300.00 7 1990 9000 90.09 7 1991 9000 10 100.00 8 2000 1000 11 12.50 8 2000 1000 #Aggregate acres where identical ids dupbuild<-aggregate(DF$Acres,DF["Bldgid"],sum) colnames(dupbuild)[2]<-"Acres" #Add aggregated Acres to DF DF$Acres<-dupbuild$Acres[match(DF$Bldgid,dupbuild$Bldgid)] DF Acres dgid Year ImpValue 1 100.00 1 1946 1000 2 101.00 2 1952 1400 3 100.00 3 1922 1300 4 130.00 4 1910 900 5 156.50 5 1955 5000 6 156.50 5 1955 1200 7 293.00 6 1999 500 8 300.09 7 1990 9000 9 300.09 7 1991 9000 10 112.50 8 2000 1000 11 112.50 8 2000 1000 #desired outcome data frame Acres dgid Year ImpValue 1 100.00 1 1946 1000 2 101.00 2 1952 1400 3 100.00 3 1922 1300 4 130.00 4 1910 900 5 156.50 5 1955 7200 #combined 5 & 6 7 293.00 6 1999 500 8 300.09 7 1990 18000 #combined 8 & 9 10 112.50 8 2000 1000 #one case thrown out So in this case the Impvalue are added together for rows 5 & 6 (from dataframe example 2) b/c the years are identical and the Impvalue is not, rows 8 and 9 have their Impvalue summed because there Years are identical but the improvement value is not, and one of the cases is thrown out of rows 10 & 11 because they have identical years and ImpValue. When rows are joined the Year value is no longer important but to remain consistent i would like to keep the earliest (lowest) year. There will be instances in the actual data where there are more than 2 cases to consider if that makes any coding difference as i didnt include any in my example data. It would also be useful to include a new column keeping track of how many joined bldgids . I think i can figure that one out though. Hope this is all clear. Thanks for the guidance and insights JR -- View this message in context: http://old.nabble.com/Summing-rows-based-on-criteria-tp26157755p26157755.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Avoiding for loops
Wow, That's nice. Should work well, but I just realized that I missed something in explaining my code. I need to calculate the exp function on X so it should be exp(x) / sum(exp(x)) for each group I tried this with: foo <- ave(rawdata$foo,rawdata$code,FUN=function(x) exp(x) / sum(exp(x))) That didn't work. I got a lot of NaN -N On 11/2/09 3:30 AM, Peter Dalgaard wrote: > Dimitris Rizopoulos wrote: > >> you could try something along these lines: >> >> data<- data.frame(y = rnorm(100), group = rep(1:10, each = 10)) >> >> data$sum<- ave(data$y, data$group, FUN = sum) >> data$norm.y<- data$y / data$sum >> data >> > .. or even > > transform(data, norm=ave(y, group, FUN = function(x) x/sum(x))) > > >> I hope it helps. >> >> Best, >> Dimitris >> >> >> Noah Silverman wrote: >> >>> Hi, >>> >>> I'm trying to normalize some data. >>> My data is organized by groups. I want to normalize PER GROUP as >>> opposed to over the entire data set. >>> >>> The current double loop that I'm using takes almost an hour to run on >>> about 30,000 rows of data in 2,500 groups. >>> >>> I'm currently doing this: >>> >>> - >>> for(group in unique(data$group)){ >>> sum_V1<- sum(data$V1[data$group == group]) >>> >>> for(subject in data$subject[data$group == group]){ >>> data$V1_norm[(data$group == group& data$subject == subject)] >>> <- data$V1[(data$group == group& data$subject == subject)] / sum_V1 >>> } >>> } >>> - >>> >>> Can anyone point me to a faster way to do this in R. >>> >>> Thanks! >>> >>> -N >>> >>> __ >>> R-help@r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide >>> http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. >>> >>> >> > > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] qqplot
Peter Ehlers wrote > >That's not what qqline() does and for good reason - it treats >x and y asymmetrically. > >But qqline() is a very simple function, using the quartiles >as also suggested by John. Here's a modified version that >should work for Carol: > >qqline2 <- function (x, y, ...) >{ > y <- quantile(y[!is.na(y)], c(0.25, 0.75)) > x <- quantile(x[!is.na(x)], c(0.25, 0.75)) > slope <- diff(y)/diff(x) > int <- y[1L] - slope * x[1L] > abline(int, slope, ...) >} > >I've just replaced the line using Normal quantiles with >the quantiles of x (and omitted the datax option). > Peter Good point. Peter Peter L. Flom, PhD Statistical Consultant Website: www DOT peterflomconsulting DOT com Writing; http://www.associatedcontent.com/user/582880/peter_flom.html Twitter: @peterflom __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] qqplot
John Fox wrote > >I assumed that Carol wanted to compare the shapes of the distributions and >to adjust for differences in centre and spread. To put a line through the >quartiles or to base a line on the medians and IQRs is more robust than >using the means and sds. > Hi John Indeed it is. It all depends on what she wants to use the QQplot for, and what she wants to adjust for. But I would guess (and I haven't done any simuls) that the method with mean and sd will usually be pretty close to the one with medians and IQRs. Maybe I will look at it a bit. Peter Peter L. Flom, PhD Statistical Consultant Website: www DOT peterflomconsulting DOT com Writing; http://www.associatedcontent.com/user/582880/peter_flom.html Twitter: @peterflom __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] qqplot
Dear Peter, I assumed that Carol wanted to compare the shapes of the distributions and to adjust for differences in centre and spread. To put a line through the quartiles or to base a line on the medians and IQRs is more robust than using the means and sds. Best, John > -Original Message- > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On > Behalf Of Peter Flom > Sent: November-02-09 11:57 AM > To: carol white; Yihui Xie > Cc: r-h...@stat.math.ethz.ch > Subject: Re: [R] qqplot > > carol white wrote > > >So the conclusion is that abline(0,1) should always be used and if it > doesn't go through the qqplot, the two distributions are not similar? > > I think it depends what you mean by "similar". E.g., if you mean "are both > of these distributions (e.g.) normal?" then abline(0,1) is not always useful. > But if you mean "Do these have the same mean, sd, and distribution?" then > abline(0,1) is the way to go. > > Peter > > Peter L. Flom, PhD > Statistical Consultant > Website: www DOT peterflomconsulting DOT com > Writing; http://www.associatedcontent.com/user/582880/peter_flom.html > Twitter: @peterflom > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] qqplot
Peter Flom wrote: David Winsemius wrote I always assumed that the intercept was zero and the slope = unity. y <- rt(200, df = 5) qqnorm(y); qqline(y, col = 2) qqplot(y, rt(300, df = 5)) abline(0, 1, col="red") Suppose you have the following x <- rnorm(500) y <- 500*(x + runif(500, 0,1)) qqplot(x,y) Then an abline(0,1) will not be useful; it will be an almost horizontal line. But m1 <- lm(y~x) abline(coef(m1)) does the trick. Peter That's not what qqline() does and for good reason - it treats x and y asymmetrically. But qqline() is a very simple function, using the quartiles as also suggested by John. Here's a modified version that should work for Carol: qqline2 <- function (x, y, ...) { y <- quantile(y[!is.na(y)], c(0.25, 0.75)) x <- quantile(x[!is.na(x)], c(0.25, 0.75)) slope <- diff(y)/diff(x) int <- y[1L] - slope * x[1L] abline(int, slope, ...) } I've just replaced the line using Normal quantiles with the quantiles of x (and omitted the datax option). -Peter Ehlers Peter L. Flom, PhD Statistical Consultant Website: www DOT peterflomconsulting DOT com Writing; http://www.associatedcontent.com/user/582880/peter_flom.html Twitter: @peterflom __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] save an object by dynamicly created name
jeffc wrote: > > Hi, > > I would like to save a few dynamically created objects to disk. The > following is the basic flow of the code segment > > for(i = 1:10) { >m = i:5 >save(m, file = ...) ## ??? > } > To distinguish different objects to be saved, I would like to save m as > m1, m2, m3 ..., to file /home/data/m1, /home/data/m2, home/data/m3, ... > > save(m, file = paste("/home/data/m", i, ".rdata", sep="") Dieter -- View this message in context: http://old.nabble.com/save-an-object-by-dynamicly-created-name-tp26155437p26157748.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] qqplot
carol white wrote >So the conclusion is that abline(0,1) should always be used and if it doesn't >go through the qqplot, the two distributions are not similar? I think it depends what you mean by "similar". E.g., if you mean "are both of these distributions (e.g.) normal?" then abline(0,1) is not always useful. But if you mean "Do these have the same mean, sd, and distribution?" then abline(0,1) is the way to go. Peter Peter L. Flom, PhD Statistical Consultant Website: www DOT peterflomconsulting DOT com Writing; http://www.associatedcontent.com/user/582880/peter_flom.html Twitter: @peterflom __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] qqplot
David Winsemius wrote >I always assumed that the intercept was zero and the slope = unity. > > y <- rt(200, df = 5) > qqnorm(y); qqline(y, col = 2) > qqplot(y, rt(300, df = 5)) > abline(0, 1, col="red") > Suppose you have the following x <- rnorm(500) y <- 500*(x + runif(500, 0,1)) qqplot(x,y) Then an abline(0,1) will not be useful; it will be an almost horizontal line. But m1 <- lm(y~x) abline(coef(m1)) does the trick. Peter Peter L. Flom, PhD Statistical Consultant Website: www DOT peterflomconsulting DOT com Writing; http://www.associatedcontent.com/user/582880/peter_flom.html Twitter: @peterflom __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] qqplot
So the conclusion is that abline(0,1) should always be used and if it doesn't go through the qqplot, the two distributions are not similar? Thanks --- On Mon, 11/2/09, Yihui Xie wrote: > From: Yihui Xie > Subject: Re: [R] qqplot > To: "carol white" > Cc: "David Winsemius" , r-h...@stat.math.ethz.ch > Date: Monday, November 2, 2009, 8:42 AM > abline(0,1) is somewhere in the > upper-left corner which you are unable > to see. At least the first distribution seems to have a > larger mean > than the second one (i.e. they are not the same > distribution). > > Regards, > Yihui > -- > Yihui Xie > Phone: 515-294-6609 Web: http://yihui.name > Department of Statistics, Iowa State University > 3211 Snedecor Hall, Ames, IA > > > > On Mon, Nov 2, 2009 at 10:33 AM, carol white > wrote: > > if I have the two following matrices, abline(0,1) > doesn't go through. QQplot is attached. > > > > [,1] [,2] [,3] [,4] > [,5] > > 2.149644 1.992864 3.346375 2.793511 3.428230 > > 1.100762 2.152981 2.735401 2.175185 3.323058 > > 1.212406 2.131813 2.672598 2.389996 3.242490 > > 1.183770 1.908633 2.661237 2.590545 2.906059 > > 1.665190 1.778923 2.636062 2.475619 4.013407 > > > > > > 0.601 0.083 0.520 0.920 -0.007 > > -0.778 0.427 -0.605 -0.066 -0.283 > > -0.599 0.348 -0.693 0.284 -0.436 > > -0.519 0.081 -0.590 0.678 -1.095 > > 0.009 -0.253 -0.940 0.526 1.623 > > > > > > --- On Mon, 11/2/09, David Winsemius > wrote: > > > >> From: David Winsemius > >> Subject: Re: [R] qqplot > >> To: "carol white" > >> Cc: r-h...@stat.math.ethz.ch > >> Date: Monday, November 2, 2009, 8:17 AM > >> > >> On Nov 2, 2009, at 10:40 AM, carol white wrote: > >> > >> > Hi, > >> > We could use qqplot to see how two > distributions are > >> different from each other. To show better how they > are > >> different (departs from the straight line), how is > it > >> possible to plot the straight line that goes > through them? I > >> am looking for some thing like qqline for qqnorm. > I thought > >> of abline but how to determine the slope and > intercept? > >> > >> I always assumed that the intercept was zero and > the slope > >> = unity. > >> > >> y <- rt(200, df = 5) > >> qqnorm(y); qqline(y, col = 2) > >> qqplot(y, rt(300, df = 5)) > >> abline(0, 1, col="red") > >> > >> I am open to education if that assumption is too > >> simplistic, but you have not offered anything in > the way of > >> a counter-example. > >> > >> > > >> == > >> > >> David Winsemius, MD > >> Heritage Laboratories > >> West Hartford, CT > >> > >> > > > > > > > > __ > > R-help@r-project.org > mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, > reproducible code. > > > > > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] qqplot
abline(0,1) is somewhere in the upper-left corner which you are unable to see. At least the first distribution seems to have a larger mean than the second one (i.e. they are not the same distribution). Regards, Yihui -- Yihui Xie Phone: 515-294-6609 Web: http://yihui.name Department of Statistics, Iowa State University 3211 Snedecor Hall, Ames, IA On Mon, Nov 2, 2009 at 10:33 AM, carol white wrote: > if I have the two following matrices, abline(0,1) doesn't go through. QQplot > is attached. > > [,1] [,2] [,3] [,4] [,5] > 2.149644 1.992864 3.346375 2.793511 3.428230 > 1.100762 2.152981 2.735401 2.175185 3.323058 > 1.212406 2.131813 2.672598 2.389996 3.242490 > 1.183770 1.908633 2.661237 2.590545 2.906059 > 1.665190 1.778923 2.636062 2.475619 4.013407 > > > 0.601 0.083 0.520 0.920 -0.007 > -0.778 0.427 -0.605 -0.066 -0.283 > -0.599 0.348 -0.693 0.284 -0.436 > -0.519 0.081 -0.590 0.678 -1.095 > 0.009 -0.253 -0.940 0.526 1.623 > > > --- On Mon, 11/2/09, David Winsemius wrote: > >> From: David Winsemius >> Subject: Re: [R] qqplot >> To: "carol white" >> Cc: r-h...@stat.math.ethz.ch >> Date: Monday, November 2, 2009, 8:17 AM >> >> On Nov 2, 2009, at 10:40 AM, carol white wrote: >> >> > Hi, >> > We could use qqplot to see how two distributions are >> different from each other. To show better how they are >> different (departs from the straight line), how is it >> possible to plot the straight line that goes through them? I >> am looking for some thing like qqline for qqnorm. I thought >> of abline but how to determine the slope and intercept? >> >> I always assumed that the intercept was zero and the slope >> = unity. >> >> y <- rt(200, df = 5) >> qqnorm(y); qqline(y, col = 2) >> qqplot(y, rt(300, df = 5)) >> abline(0, 1, col="red") >> >> I am open to education if that assumption is too >> simplistic, but you have not offered anything in the way of >> a counter-example. >> >> > >> == >> >> David Winsemius, MD >> Heritage Laboratories >> West Hartford, CT >> >> > > > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how can I convert .csv format to matrix???
bbslover wrote: > > In my disk C:/ have a a.csv file, I want to read it to R, importantly, > when I use x=read.csv("C:/a.csv") ,the x format is data.frame, I want > to it to become matrix format, how can I do it ? > as.matrix should do the job; if it does not (second example), you probably have rownames or some other string in your data set. Dieter # use read.csv instead x = data.frame(a=1:10,b=11:20) xm = as.matrix(x) str(xm) xm # looks good x1 = data.frame(a=1:10,b=11:20,c=letters[1:10]) xm1 = as.matrix(x1) str(xm1) xm1 # looks strange, because it's all strings -- View this message in context: http://old.nabble.com/how-can-I-convert-.csv-format-to-matrixtp26156643p26157732.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] R and Python
I am a long time user of R for financial time series analysis (xts, zoo, etc) and for my next project I am thinking of adding the Python language to the mix. The reason for adding Python is primarily its non-statistical capabilities. So my questions are what have people's experiences been with using interop between R and Python. I see there are two items, rPy and RSPython. It looks like rPy makes it possible to call R code from Python, and RSPython goes both ways. My needs would be to use Python to drive R to get it's extensive time series and stats, and also to get to Python objects from inside R. I searched the list archives and it looks like many people use rPy, but what about RSPython for calling Python from R? It looks like rPy only goes from R into Python, and RSPython has not been updated since 2005? Other messages in the archives state that RSPython only works on Unix? Would I be foolish to build anything mission critical on RSPython? Is there a better way to get at Python from R? Thanks as always. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] qqplot
if I have the two following matrices, abline(0,1) doesn't go through. QQplot is attached. [,1] [,2] [,3] [,4] [,5] 2.149644 1.992864 3.346375 2.793511 3.428230 1.100762 2.152981 2.735401 2.175185 3.323058 1.212406 2.131813 2.672598 2.389996 3.242490 1.183770 1.908633 2.661237 2.590545 2.906059 1.665190 1.778923 2.636062 2.475619 4.013407 0.601 0.083 0.5200.920 -0.007 -0.778 0.427 -0.605 -0.066 -0.283 -0.599 0.348 -0.6930.284 -0.436 -0.519 0.081 -0.5900.678 -1.095 0.009 -0.253 -0.9400.526 1.623 --- On Mon, 11/2/09, David Winsemius wrote: > From: David Winsemius > Subject: Re: [R] qqplot > To: "carol white" > Cc: r-h...@stat.math.ethz.ch > Date: Monday, November 2, 2009, 8:17 AM > > On Nov 2, 2009, at 10:40 AM, carol white wrote: > > > Hi, > > We could use qqplot to see how two distributions are > different from each other. To show better how they are > different (departs from the straight line), how is it > possible to plot the straight line that goes through them? I > am looking for some thing like qqline for qqnorm. I thought > of abline but how to determine the slope and intercept? > > I always assumed that the intercept was zero and the slope > = unity. > > y <- rt(200, df = 5) > qqnorm(y); qqline(y, col = 2) > qqplot(y, rt(300, df = 5)) > abline(0, 1, col="red") > > I am open to education if that assumption is too > simplistic, but you have not offered anything in the way of > a counter-example. > > > > == > > David Winsemius, MD > Heritage Laboratories > West Hartford, CT > > <>__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] qqplot
Carol, You could run a line through the pairs of first and third quartiles of the two distributions, i.e., c(quantile(x, .25), quantile(y, .25)) and c(quantile(x, .75), quantile(y, .75)). (Of course, you'd want the line to extend across the whole graph.) I hope this helps, John > -Original Message- > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On > Behalf Of carol white > Sent: November-02-09 10:40 AM > To: r-h...@stat.math.ethz.ch > Subject: [R] qqplot > > Hi, > We could use qqplot to see how two distributions are different from each > other. To show better how they are different (departs from the straight > line), how is it possible to plot the straight line that goes through them? I > am looking for some thing like qqline for qqnorm. I thought of abline but how > to determine the slope and intercept? > > Best wishes, > > Carol > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to print the full name of the factors in summary?
Jen-Chien and Adaikalavan, Weekday is specified as an ordered factor, for which the default contrast type is contr.poly, i.e., orthogonal-polynomial contrasts. I would have expected a 6th-degree polynomial for the 7 days of the week, and there's a message that suggests that 3 coefficients are aliased, so there must be redundancies among the factors. If you want contr.treatment, treating "Mon" as the baseline level, then simply omit ordered=T from the call to factor(), but this won't make the collinearities disappear. I hope this helps, John -- John Fox, Professor Department of Sociology McMaster University Hamilton, Ontario, Canada web: socserv.mcmaster.ca/jfox > -Original Message- > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On > Behalf Of Adaikalavan Ramasamy > Sent: November-02-09 10:43 AM > To: Jen-Chien Chang > Cc: r-help@r-project.org > Subject: Re: [R] how to print the full name of the factors in summary? > > It would be useful to say which package the object SJ comes from or > provide a more reproducible example. > > Assuming that Demand variable is continuous and you are fitting a > standard lm() model, then your results looks suspicious. Where are the > coefficients for Month, Holiday, Season? > > > > > Jen-Chien Chang wrote: > > Hi, > > > > I am wondering if there is a simple way to fix the problem I am having. > > For unknown reason, I could not get the full name of the factors to be > > printed in the summary. I have tried to used summary.lm as well but the > > problem still persists. > > > > SJ$Weekday <- > > > factor(SJ$Weekday,1:7,c("Mon","Tue","Wed","Thu","Fri","Sat","Sun"),ordered=T ) > > > > > > attach(SJ) > > lm.SJ <- lm(Demand ~ Weekday+Month+Holiday+Season) > > summary(lm.SJ) > > Call: > > lm(formula = Demand ~ Weekday + Month + Holiday + Season) > > > > Residuals: > > Min 1Q Median 3Q Max > > -69.767 -12.224 -1.378 10.857 91.376 > > > > Coefficients: (3 not defined because of singularities) > > Estimate Std. Error t value Pr(>|t|) > > (Intercept) 88.7091 3.3442 26.527 < 2e-16 *** > > Weekday.L20.8132 2.8140 7.396 1.08e-12 *** > > Weekday.Q -12.7667 2.8156 -4.534 7.99e-06 *** > > Weekday.C -10.6375 2.8113 -3.784 0.000182 *** > > Weekday^4-8.3325 2.8103 -2.965 0.003238 ** > > - > > > > Is there a way for summary to print the full name of the factors and > > levels? Say Weekday.Tue instead Weekday.L? > > > > Thanks! > > > > Jack Chang > > > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] qqplot
On Nov 2, 2009, at 10:40 AM, carol white wrote: Hi, We could use qqplot to see how two distributions are different from each other. To show better how they are different (departs from the straight line), how is it possible to plot the straight line that goes through them? I am looking for some thing like qqline for qqnorm. I thought of abline but how to determine the slope and intercept? I always assumed that the intercept was zero and the slope = unity. y <- rt(200, df = 5) qqnorm(y); qqline(y, col = 2) qqplot(y, rt(300, df = 5)) abline(0, 1, col="red") I am open to education if that assumption is too simplistic, but you have not offered anything in the way of a counter-example. == David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] partial matching with grep()
Is this what you want: depends on it occurring at the endo of the string: > grep("\\.x[^x]*$",c("a.x" ,"b.x","a.xx"),value=TRUE) [1] "a.x" "b.x" On Mon, Nov 2, 2009 at 9:57 AM, Vito Muggeo (UniPa) wrote: > dear all, > This is a probably a silly question. > If I type >> grep("x",c("a.x" ,"b.x","a.xx"),value=TRUE) > [1] "a.x" "b.x" "a.xx" > > Instead, I would like to obtain only > "a.x" "b.x" > How is it possible to get this result with grep()? > > many thanks for your attention, > best, > vito > > -- > > Vito M.R. Muggeo > Dip.to Sc Statist e Matem `Vianelli' > Università di Palermo > viale delle Scienze, edificio 13 > 90128 Palermo - ITALY > tel: 091 6626240 > fax: 091 485726/485612 > http://dssm.unipa.it/vmuggeo > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] qqplot
Hi, We could use qqplot to see how two distributions are different from each other. To show better how they are different (departs from the straight line), how is it possible to plot the straight line that goes through them? I am looking for some thing like qqline for qqnorm. I thought of abline but how to determine the slope and intercept? Best wishes, Carol __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Problem building R 2.10.0 - Matrix package
All: Attached is the output file from building R 2.10.0 on RedHat Linux. I have never previously experienced any problems when building R from source with new releases. But, now I get a compile error with the Matrix package: CHOLMOD/Include/cholmod.h:87:22: error: UFconfig.h: No such file or directory make[3]: *** [CHMfactor.o] Error 1 make[3]: Leaving directory `/tmp/RtmppKsKKl/R.INSTALL327b23c6/Matrix/src' ERROR: compilation failed for package 'Matrix' Any help or suggestions would be appreciated. I am simply doing this, as I always have: ./configure --prefix=/Local/install/location (I can not use the default) make I should add that even with this error with the Matrix package, I can do a 'make install' OK. Subsequently, R seems to come up fine from the command line… Regards, Tom -- Thomas E Adams National Weather Service Ohio River Forecast Center 1901 South State Route 134 Wilmington, OH 45177 EMAIL: thomas.ad...@noaa.gov VOICE: 937-383-0528 FAX:937-383-0033 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] partial matching with grep()
On Nov 2, 2009, at 9:57 AM, Vito Muggeo (UniPa) wrote: dear all, This is a probably a silly question. If I type > grep("x",c("a.x" ,"b.x","a.xx"),value=TRUE) [1] "a.x" "b.x" "a.xx" > grep("\\.x\\b",c("a.x" ,"b.x","a.xx"),value=TRUE) [1] "a.x" "b.x" Or: > grep("\\.x$",c("a.x" ,"b.x","a.xx"),value=TRUE) [1] "a.x" "b.x" Why you might ask. You need the double "\\" in order to get the interpreter to interpret them as single escapes prior to the period and the "\" in "\b", which is the end-of-word match pattern, while "$" is an end of sentence pattern. Instead, I would like to obtain only "a.x" "b.x" How is it possible to get this result with grep()? many thanks for your attention, best, vito -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 6626240 fax: 091 485726/485612 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] convert list to numeric
An example will help show the difference between single vs. double brackets. ## xl is a list with three elements xl <- list( a=1:3, b=2:7, c=c('X','Y')) xl $a [1] 1 2 3 $b [1] 2 3 4 5 6 7 $c [1] "X" "Y" ## with single brackets, we get a subset. A subset of a list is still a list xl[1] $a [1] 1 2 3 ## extract the first element of xl, as itself, whatever it is xl[[1]] [1] 1 2 3 ## with single brackets, we get a subset. In the next example the subset ## consists of the first and third elements of xl, i.e., a list having two elements xl[c(1,3)] $a [1] 1 2 3 $c [1] "X" "Y" ## Similarly for data frames, and note how the formatting is different when the object ## is printed to the screen xd <- data.frame( a=1:3, b=2:4, c=c('X','Y','Z')) xd a b c 1 1 2 X 2 2 3 Y 3 3 4 Z class(xd) [1] "data.frame" xd[2] b 1 2 2 3 3 4 class(xd[2]) [1] "data.frame" xd[[2]] [1] 2 3 4 class(xd[[2]]) [1] "integer" At 6:22 AM -0800 11/2/09, dadrivr wrote: Great, that works very well. What is the purpose of double brackets vs single ones? I will remember next time to include a subset of the data, so that readers can run the script. Thanks again for your help! Benilton Carvalho wrote: it appears that what you really want is to use: task[[i]] instead of task[i] b On Nov 1, 2009, at 11:04 PM, dadrivr wrote: I would like to preface this by saying that I am new to R, so I would ask that you be patient and thorough, so that I'm not completely clueless. I am trying to convert a list to numeric so that I can perform computations on it (specifically mean-center the variable), but I am running into problems. I have imported the data set into "task" (data frame). The data frame is made of factors with variable names in the first row. I am running a loop to set a variable equal to a column in the data frame. Here is an example of my problem: for (i in 1:dim(task)[2]){ predictor.loop <- c(task[i]) predictor.loop.mc <- predictor.loop - mean(predictor.loop, na.rm=T) } I get the following error: Error in predictor.loop - mean(predictor.loop, na.rm = T) : non-numeric argument to binary operator In addition: Warning message: In mean.default(predictor.loop, na.rm = T) : argument is not numeric or logical: returning NA The column is entirely made up of numerical data, except for the header, which is a string. My problem is that I receive an error because the predictor.loop variable is not numerical, so I need to find a way to convert it. I tried using: predictor.loop <- c(as.numeric(task[i])) But I get the following error: "Error: (list) object cannot be coerced to type 'double'" If I call the variable, I can assign it to a numerical list (e.g., predictor loop <- task$variablename), but since I am assigning the variable in a loop, >> I have to find another way as the variable name would have to change >> in each >> loop iteration. Any help would be greatly appreciated. Thanks! >> -- >> View this message in context: >> http://*old.nabble.com/convert-list-to-numeric-tp26155039p26155039.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://*stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://*www.*R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://*stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://*www.*R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://*old.nabble.com/convert-list-to-numeric-tp26155039p26157105.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://*stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://*www.*R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- -- Don MacQueen Environmental Protection Department Lawrence Livermore National Laboratory Livermore, CA, USA 925-423-1062 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to print the full name of the factors in summary?
It would be useful to say which package the object SJ comes from or provide a more reproducible example. Assuming that Demand variable is continuous and you are fitting a standard lm() model, then your results looks suspicious. Where are the coefficients for Month, Holiday, Season? Jen-Chien Chang wrote: Hi, I am wondering if there is a simple way to fix the problem I am having. For unknown reason, I could not get the full name of the factors to be printed in the summary. I have tried to used summary.lm as well but the problem still persists. SJ$Weekday <- factor(SJ$Weekday,1:7,c("Mon","Tue","Wed","Thu","Fri","Sat","Sun"),ordered=T) attach(SJ) lm.SJ <- lm(Demand ~ Weekday+Month+Holiday+Season) summary(lm.SJ) Call: lm(formula = Demand ~ Weekday + Month + Holiday + Season) Residuals: Min 1Q Median 3Q Max -69.767 -12.224 -1.378 10.857 91.376 Coefficients: (3 not defined because of singularities) Estimate Std. Error t value Pr(>|t|) (Intercept) 88.7091 3.3442 26.527 < 2e-16 *** Weekday.L20.8132 2.8140 7.396 1.08e-12 *** Weekday.Q -12.7667 2.8156 -4.534 7.99e-06 *** Weekday.C -10.6375 2.8113 -3.784 0.000182 *** Weekday^4-8.3325 2.8103 -2.965 0.003238 ** - Is there a way for summary to print the full name of the factors and levels? Say Weekday.Tue instead Weekday.L? Thanks! Jack Chang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] package lme4
On Sun, Nov 1, 2009 at 9:01 AM, wenjun zheng wrote: > Hi R Users, > When I use package lme4 for mixed model analysis, I can't distinguish > the significant and insignificant variables from all random independent > variables. > Here is my data and result: > Data: > > Rice<-data.frame(Yield=c(8,7,4,9,7,6,9,8,8,8,7,5,9,9,5,7,7,8,8,8,4,8,6,4,8,8,9), > Variety=rep(rep(c("A1","A2","A3"),each=3),3), > Stand=rep(c("B1","B2","B3"),9), > Block=rep(1:3,each=9)) > Rice.lmer<-lmer(Yield ~ (1|Variety) + (1|Stand) + (1|Block) + > (1|Variety:Stand), data = Rice) > > Result: > > Linear mixed model fit by REML > Formula: Yield ~ (1 | Variety) + (1 | Stand) + (1 | Block) + (1 | > Variety:Stand) > Data: Rice > AIC BIC logLik deviance REMLdev > 96.25 104.0 -42.12 85.33 84.25 > Random effects: > Groups Name Variance Std.Dev. > Variety:Stand (Intercept) 1.345679 1.16003 > Block (Intercept) 0.00 0.0 > Stand (Intercept) 0.89 0.94281 > Variety (Intercept) 0.024691 0.15714 > Residual 0.67 0.81650 > Number of obs: 27, groups: Variety:Stand, 9; Block, 3; Stand, 3; Variety, 3 > Fixed effects: > Estimate Std. Error t value > (Intercept) 7.1852 0.6919 10.38 > Can you give me some advice for recognizing the significant variables among > random effects above without other calculating. Well, since the estimate of the variance due to Block is zero, that's probably not one of the significant random effects. Why do you want to do this without other calculations? In olden days when each model fit involved substantial calculations by hand one did try to avoid fitting multiple models but now that is not a problem. You can get a hint of which random effects will be significant by looking at their precision in a "caterpillar plot" and then fit the reduced model and use anova to compare models. See the enclosed > Any suggestions will be appreciated. > Wenjun > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > R version 2.10.0 Patched (2009-11-01 r50288) Copyright (C) 2009 The R Foundation for Statistical Computing ISBN 3-900051-07-0 R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type 'license()' or 'licence()' for distribution details. Natural language support but running in an English locale R is a collaborative project with many contributors. Type 'contributors()' for more information and 'citation()' on how to cite R or R packages in publications. Type 'demo()' for some demos, 'help()' for on-line help, or 'help.start()' for an HTML browser interface to help. Type 'q()' to quit R. > library(lme4) Loading required package: Matrix Loading required package: lattice > Rice <- data.frame(Yield = c(8,7,4,9,7,6,9,8,8,8,7,5,9,9,5,7, +7,8,8,8,4,8,6,4,8,8,9), +Variety = gl(3, 3, 27, labels = c("A1","A2","A3")), +Stand = gl(3, 1, 27, labels = c("B1","B2","B3")), +Block = gl(3, 9)) > print(fm1 <- lmer(Yield ~ 1 + (1|Block) + (1|Stand) + + (1|Variety) + (1|Variety:Stand), Rice)) Linear mixed model fit by REML Formula: Yield ~ 1 + (1 | Block) + (1 | Stand) + (1 | Variety) + (1 | Variety:Stand) Data: Rice AIC BIC logLik deviance REMLdev 96.25 104.0 -42.1285.33 84.25 Random effects: GroupsNameVariance Std.Dev. Variety:Stand (Intercept) 1.34568 1.16003 Variety (Intercept) 0.02469 0.15713 Stand (Intercept) 0.9 0.94281 Block (Intercept) 0.0 0.0 Residual 0.7 0.81650 Number of obs: 27, groups: Variety:Stand, 9; Variety, 3; Stand, 3; Block, 3 Fixed effects: Estimate Std. Error t value (Intercept) 7.1852 0.6919 10.38 > ## Estimate of Block variance is zero, remove that term > print(fm2 <- lmer(Yield ~ 1 + (1|Stand) + (1|Variety) + (1|Variety:Stand), + Rice)) Linear mixed model fit by REML Formula: Yield ~ 1 + (1 | Stand) + (1 | Variety) + (1 | Variety:Stand) Data: Rice AIC BIC logLik deviance REMLdev 94.25 100.7 -42.1285.33 84.25 Random effects: GroupsNameVariance Std.Dev. Variety:Stand (Intercept) 1.345679 1.16003 Variety (Intercept) 0.024692 0.15714 Stand (Intercept) 0.88 0.94281 Residual 0.67 0.81650 Number of obs: 27, groups: Variety:Stand, 9; Variety, 3; Stand, 3 Fixed effects: Estimate Std. Error t value (Intercept) 7.1852 0.6919 10.38 > ## Very small estimate of variance for Variety > ## Ch
Re: [R] superscript troubles
Hi, Try this, x = rnorm(1) y = rnorm(1) leg = bquote(r^2*"="*.(round(x,digits=3))*", P="*.(round(y, digits=3))) plot.new() legend (bty ="n","topright",legend=leg) HTH, baptiste 2009/11/2 Jacob Kasper : > I know that this has been revisited over and over, yet I cannot figure out > how to solve this case of superscript troubles... > I would like the 2 in r2 to be superscript, yet I am pasting text before and > after it. I have tried several variations but have not solved this yet, any > suggestions? > > legend (bty = > "n","topright",paste("r2=",round(summary(lat_x)$r.squared,digits=3),", > P=",round(coefficients(summary(lat_x))[2,4], digits=3))) > Thank you > Jacob > > -- > Jacob Kasper > http://twitter.com/Protect_Oceans > 66°04' N > 23°07' W > Coastal & Marine Management Master's Student > University Centre of the Westfjords > > Sundstræti 14 37 Devens Rd > Ísafjörður, 400 Swampscott, MA 01907 > Iceland USA > > [[alternative HTML version deleted]] > > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] superscript troubles
Jacob Kasper wrote: > I know that this has been revisited over and over, yet I cannot figure out > how to solve this case of superscript troubles... > I would like the 2 in r2 to be superscript, yet I am pasting text before and > after it. I have tried several variations but have not solved this yet, any > suggestions? > > legend (bty = > "n","topright",paste("r2=",round(summary(lat_x)$r.squared,digits=3),", > P=",round(coefficients(summary(lat_x))[2,4], digits=3))) > Thank you > Jacob You want the whole thing to be an expression. To insert values, bquote() is your friend. Something like this legend(bty="n", "topright", bquote(r^2 == .(round(summary(lat_x)$r.squared,digits=3) * "," * P == .(round(coefficients(summary(lat_x))[2,4], digits=3) ) ) (untested, if you want us to test, give us a reproducible example...) -- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - (p.dalga...@biostat.ku.dk) FAX: (+45) 35327907 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Interaction contrasts or posthoc test for glm (MASS) with ANOVA design
On Nov 2, 2009, at 8:40 AM, Maya Pfaff wrote: Dear R experts I am running a negative-binomial GLM (glm.nb) to test the null hypotheses that species 1 and 2 are equally abundant between site 1 and site2, and between each other. So, I have a 2x2 factorial design with factors Site (1,2) and Taxon (1,2). Since the Site:Taxon interaction is significant, I need to do the equivalent to a "post-hoc test" for ANOVA, however, the same tests (e.g. Tukey HSD) do not seem to be applicable for the GLM. I tried specifying orthogonal contrasts, but could not figure out what the interaction contrast (see Site1:Taxon1 in below example) means. I'm a bit puzzled by this request, since it appears that you already have the test result you seek in the form of the line beginning Site1:Taxon1. (It might be a different story if you had more species.) In the default glm setup, the contrasts are of type "treatment". Your Site1:Taxon1 coefficients would then be the mean difference (and se) from the estimates based on Intercept+Taxon+Site( on the scale being regressed on). You will notice that the z value does not change when you use treatment contrasts instead of those you specified. The only thing further to do would be to construct a reduced model without the interaction, take the difference in deviances of the model, and compare to chi-sq(significance level, df=1) and that would give you the ML test rather than the score test which results from the by coefficient tests. When I do that I get s1: 2 x log-likelihood: -1155.4010 s2: 2 x log-likelihood: -1181.8600 so > 1-pchisq(26.459,df=1) [1] 2.691913e-07 (Which is immaterially different than the Pr(>|z|) that you get from the default output. Of course that was the way we did it in school 20 years ago and it would be much faster to do: > anova(s1,s2) Likelihood ratio tests of Negative Binomial Models Response: counts Model theta Resid. df2 x log-lik. Testdf LR stat. Pr(Chi) 1 Site + Taxon 0.5263897 151 -1181.860 2 Site * Taxon 0.6126726 150 -1155.401 1 vs 2 1 26.45960 2.691083e-07 -- David Could you please advise me how to specify a meaningful interaction contrast (i.e. contrast species within sites)? Alternatively, could you recommend a way to do posthoc comparisons? Thanks for your time and kind regards Maya library(MASS) counts <- c(1, 4, 9, 2, 1, 4, 2, 4, 1, 3, 2, 2, 1, 3, 1, 1, 2, 1, 113, 83, 49, 46, 13, 52, 4, 10, 14, 10, 3, 19, 8, 21, 151, 186, 99, 11, 29, 24, 24, 62, 15, 98, 30, 21, 63, 29, 48, 11, 16, 35, 21, 17, 6, 2, 2, 3, 3, 4, 4, 2, 1, 2, 2, 3, 4, 8, 10, 3, 14, 3, 11, 23, 3, 51, 8, 8, 7, 1, 13, 8, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 5, 8, 1, 1, 20, 2, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 3, 0, 1, 0, 5, 1, 1, 9, 0, 34, 4, 1, 17, 0, 7, 33, 86, 73, 67, 79, 109, 27, 37, 23, 12, 17, 41, 8, 38, 4, 23, 14, 49, 64, 39, 31, 156, 110, 97, 33, 170, 137, 72, 28, 54) Site <- factor(c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2)) Taxon <- factor(c(2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)) contrasts(Site) <- cbind(c(1,-1)) contrasts(Taxon) <- cbind(c(1,-1)) s1 <- glm.nb(counts ~ Site*Taxon) summary(s1) Call: glm.nb(formula = counts ~ Site * Taxon, init.theta = 0.612672617555492, link = log) Deviance Residuals: Min 1Q Median 3QMax -2.269021 -1.215727 -0.454719 0.003515 2.517288 Coefficients: Estimate Std. Error z value Pr(>|z|) (Intercept)2.6139 0.1097 23.831 < 2e-16 *** Site1 -0.8664 0.1097 -7.899 2.81e-15 *** Taxon1-0.3814 0.1097 -3.477 0.000506 *** Site1:Taxon1 -0.5977 0.1097 -5.449 5.06e-08 *** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 (Dispersion parameter for Negative Binomial(0.6127) family taken to be 1) Null deviance: 242.94 on 153 degrees of freedom Residual deviance: 176.20 on 150 degrees of freedom AIC: 1165.4 Number of Fisher Scoring iterations: 1 Theta: 0.6127 Std. Err.: 0.0690 2 x log-likelihood: -1155.4010 TukeyHSD(
[R] superscript troubles
I know that this has been revisited over and over, yet I cannot figure out how to solve this case of superscript troubles... I would like the 2 in r2 to be superscript, yet I am pasting text before and after it. I have tried several variations but have not solved this yet, any suggestions? legend (bty = "n","topright",paste("r2=",round(summary(lat_x)$r.squared,digits=3),", P=",round(coefficients(summary(lat_x))[2,4], digits=3))) Thank you Jacob -- Jacob Kasper http://twitter.com/Protect_Oceans 66°04' N 23°07' W Coastal & Marine Management Master's Student University Centre of the Westfjords Sundstræti 1437 Devens Rd Ísafjörður, 400 Swampscott, MA 01907 Iceland USA [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] how to print the full name of the factors in summary?
Hi, I am wondering if there is a simple way to fix the problem I am having. For unknown reason, I could not get the full name of the factors to be printed in the summary. I have tried to used summary.lm as well but the problem still persists. SJ$Weekday <- factor(SJ$Weekday,1:7,c("Mon","Tue","Wed","Thu","Fri","Sat","Sun"),ordered=T) attach(SJ) lm.SJ <- lm(Demand ~ Weekday+Month+Holiday+Season) summary(lm.SJ) Call: lm(formula = Demand ~ Weekday + Month + Holiday + Season) Residuals: Min 1Q Median 3Q Max -69.767 -12.224 -1.378 10.857 91.376 Coefficients: (3 not defined because of singularities) Estimate Std. Error t value Pr(>|t|) (Intercept) 88.7091 3.3442 26.527 < 2e-16 *** Weekday.L20.8132 2.8140 7.396 1.08e-12 *** Weekday.Q -12.7667 2.8156 -4.534 7.99e-06 *** Weekday.C -10.6375 2.8113 -3.784 0.000182 *** Weekday^4-8.3325 2.8103 -2.965 0.003238 ** - Is there a way for summary to print the full name of the factors and levels? Say Weekday.Tue instead Weekday.L? Thanks! Jack Chang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Using processed objects as arguments of a function
This would return a vector the names of objects whose names begin with "fund": ls()[grep("^fund", ls())] I'm such a grep-noob that I don't know off the top of my head what the pattern should be to restrict it to only those objects with digits in the next position. Perhaps: do.call( "rbind", sapply(ls()[grep("^m", ls())], get) ) -- David. On Nov 2, 2009, at 5:24 AM, Jim Lemon wrote: On 11/02/2009 05:04 PM, Steven Kang wrote: Dear R users, I wish to utilise processed and saved objects as arguments of a function. Specifically, I have created objects using *"assign"*& *"paste"* functions with an incremental index i, the names of the objects are: fund1, fund2, fund3,., fund80,. (where the numerical value increments according to the index i& class of these objects are dataframes) I wish to collapse these objects row wisely using *"rbind"* function. paste("fund", 1:i, sep = "") results in list of objects as characters...& get(paste("fund", 1:i, sep = "")) outputs fund1... Are there any methods to use these objects as an argument of "rbind" to collapse the dataframes? Your expertise in resolving this issue would be highly appreciated. Hi Steven, There is probably a neater way to construct the list of dataframes, but this will probably do what you want: dnames<-paste("fund",1:nfunds,sep="") makelist<-function(x) { nitems<-length(x) newlist<-vector("list",nitems) for(item in 1:nitems) newlist[[item]]<-get(x[item]) return(newlist) } dflist<-makelist(dfnames) do.call("rbind",dflist) Of course all of the dataframes must have the same number of columns or the result will be messy or not there at all. Jim __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] partial matching with grep()
dear all, This is a probably a silly question. If I type > grep("x",c("a.x" ,"b.x","a.xx"),value=TRUE) [1] "a.x" "b.x" "a.xx" Instead, I would like to obtain only "a.x" "b.x" How is it possible to get this result with grep()? many thanks for your attention, best, vito -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 6626240 fax: 091 485726/485612 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] convert list to numeric
It's a way of extracting from a list. See help("[") or help("Extract"). dadrivr wrote: Great, that works very well. What is the purpose of double brackets vs single ones? I will remember next time to include a subset of the data, so that readers can run the script. Thanks again for your help! Benilton Carvalho wrote: it appears that what you really want is to use: task[[i]] instead of task[i] b On Nov 1, 2009, at 11:04 PM, dadrivr wrote: I would like to preface this by saying that I am new to R, so I would ask that you be patient and thorough, so that I'm not completely clueless. I am trying to convert a list to numeric so that I can perform computations on it (specifically mean-center the variable), but I am running into problems. I have imported the data set into "task" (data frame). The data frame is made of factors with variable names in the first row. I am running a loop to set a variable equal to a column in the data frame. Here is an example of my problem: for (i in 1:dim(task)[2]){ predictor.loop <- c(task[i]) predictor.loop.mc <- predictor.loop - mean(predictor.loop, na.rm=T) } I get the following error: Error in predictor.loop - mean(predictor.loop, na.rm = T) : non-numeric argument to binary operator In addition: Warning message: In mean.default(predictor.loop, na.rm = T) : argument is not numeric or logical: returning NA The column is entirely made up of numerical data, except for the header, which is a string. My problem is that I receive an error because the predictor.loop variable is not numerical, so I need to find a way to convert it. I tried using: predictor.loop <- c(as.numeric(task[i])) But I get the following error: "Error: (list) object cannot be coerced to type 'double'" If I call the variable, I can assign it to a numerical list (e.g., predictor loop <- task$variablename), but since I am assigning the variable in a loop, I have to find another way as the variable name would have to change in each loop iteration. Any help would be greatly appreciated. Thanks! -- View this message in context: http://old.nabble.com/convert-list-to-numeric-tp26155039p26155039.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] convert list to numeric
It's a way of extracting from a list. See help("[") or help("Extract") Regards, Adai dadrivr wrote: Great, that works very well. What is the purpose of double brackets vs single ones? I will remember next time to include a subset of the data, so that readers can run the script. Thanks again for your help! Benilton Carvalho wrote: it appears that what you really want is to use: task[[i]] instead of task[i] b On Nov 1, 2009, at 11:04 PM, dadrivr wrote: I would like to preface this by saying that I am new to R, so I would ask that you be patient and thorough, so that I'm not completely clueless. I am trying to convert a list to numeric so that I can perform computations on it (specifically mean-center the variable), but I am running into problems. I have imported the data set into "task" (data frame). The data frame is made of factors with variable names in the first row. I am running a loop to set a variable equal to a column in the data frame. Here is an example of my problem: for (i in 1:dim(task)[2]){ predictor.loop <- c(task[i]) predictor.loop.mc <- predictor.loop - mean(predictor.loop, na.rm=T) } I get the following error: Error in predictor.loop - mean(predictor.loop, na.rm = T) : non-numeric argument to binary operator In addition: Warning message: In mean.default(predictor.loop, na.rm = T) : argument is not numeric or logical: returning NA The column is entirely made up of numerical data, except for the header, which is a string. My problem is that I receive an error because the predictor.loop variable is not numerical, so I need to find a way to convert it. I tried using: predictor.loop <- c(as.numeric(task[i])) But I get the following error: "Error: (list) object cannot be coerced to type 'double'" If I call the variable, I can assign it to a numerical list (e.g., predictor loop <- task$variablename), but since I am assigning the variable in a loop, I have to find another way as the variable name would have to change in each loop iteration. Any help would be greatly appreciated. Thanks! -- View this message in context: http://old.nabble.com/convert-list-to-numeric-tp26155039p26155039.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] convert list to numeric
Great, that works very well. What is the purpose of double brackets vs single ones? I will remember next time to include a subset of the data, so that readers can run the script. Thanks again for your help! Benilton Carvalho wrote: > > it appears that what you really want is to use: > > task[[i]] > > instead of task[i] > > b > > On Nov 1, 2009, at 11:04 PM, dadrivr wrote: > >> >> I would like to preface this by saying that I am new to R, so I >> would ask >> that you be patient and thorough, so that I'm not completely >> clueless. I am >> trying to convert a list to numeric so that I can perform >> computations on it >> (specifically mean-center the variable), but I am running into >> problems. I >> have imported the data set into "task" (data frame). The data frame >> is made >> of factors with variable names in the first row. I am running a >> loop to set >> a variable equal to a column in the data frame. Here is an example >> of my >> problem: >> >> for (i in 1:dim(task)[2]){ >> predictor.loop <- c(task[i]) >> predictor.loop.mc <- predictor.loop - mean(predictor.loop, na.rm=T) >> } >> >> I get the following error: >> Error in predictor.loop - mean(predictor.loop, na.rm = T) : >> non-numeric argument to binary operator >> In addition: Warning message: >> In mean.default(predictor.loop, na.rm = T) : >> argument is not numeric or logical: returning NA >> >> The column is entirely made up of numerical data, except for the >> header, >> which is a string. My problem is that I receive an error because the >> predictor.loop variable is not numerical, so I need to find a way to >> convert >> it. I tried using: >> predictor.loop <- c(as.numeric(task[i])) >> But I get the following error: "Error: (list) object cannot be >> coerced to >> type 'double'" >> >> If I call the variable, I can assign it to a numerical list (e.g., >> predictor >> loop <- task$variablename), but since I am assigning the variable in >> a loop, >> I have to find another way as the variable name would have to change >> in each >> loop iteration. Any help would be greatly appreciated. Thanks! >> -- >> View this message in context: >> http://old.nabble.com/convert-list-to-numeric-tp26155039p26155039.html >> Sent from the R help mailing list archive at Nabble.com. >> >> __ >> R-help@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > -- View this message in context: http://old.nabble.com/convert-list-to-numeric-tp26155039p26157105.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Titles on lattice colorkey
r-help.20.trevva wrote: > > Dear R-ers, > > I'm not sure if this is a missing feature, a support request, or stupidity > on my part, but nevertheless, its a question. Is it possible to add titles > to colorkey legends? As far as I can tell, there is a command to do it for > normal "key" legends, but not for "colorkeys". > > eg it works for a normal key, created through auto.key > > xyplot(decrease ~ treatment, OrchardSprays, groups = rowpos, >type = "a", >auto.key = list(space = "right", points = FALSE, lines = > TRUE,title="Key title")) > > but there is no comparable command for a colorkey > > x <- seq(pi/4, 5 * pi, length = 100) > y <- seq(pi/4, 5 * pi, length = 100) > r <- as.vector(sqrt(outer(x^2, y^2, "+"))) > grid <- expand.grid(x=x, y=y) > grid$z <- cos(r^2) * exp(-r/(pi^3)) > levelplot(z~x*y, grid, cuts = 50, scales=list(log="e"), xlab="", > ylab="", main="Weird Function", sub="with log scales", > region = TRUE, > colorkey = list(space="right",title="Doesn't work")) > > > Cheers, > > Mark > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > Maybe using library(grid) we can solve. Try levelplot(z~x*y, grid, cuts = 50, scales=list(log="e"), xlab="", ylab="", main="Weird Function", sub="with log scales", region = TRUE, colorkey=list(space="right", title="Doesn't work"), legend=list(top=list(fun=grid::textGrob("Size\n(m)", x=1.09 Walmes Zeviani. -- View this message in context: http://old.nabble.com/Titles-on-lattice-colorkey-tp22868692p26156677.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Interaction contrasts or posthoc test for glm (MASS) with ANOVA design
Dear R experts I am running a negative-binomial GLM (glm.nb) to test the null hypotheses that species 1 and 2 are equally abundant between site 1 and site2, and between each other. So, I have a 2x2 factorial design with factors Site (1,2) and Taxon (1,2). Since the Site:Taxon interaction is significant, I need to do the equivalent to a "post-hoc test" for ANOVA, however, the same tests (e.g. Tukey HSD) do not seem to be applicable for the GLM. I tried specifying orthogonal contrasts, but could not figure out what the interaction contrast (see Site1:Taxon1 in below example) means. Could you please advise me how to specify a meaningful interaction contrast (i.e. contrast species within sites)? Alternatively, could you recommend a way to do posthoc comparisons? Thanks for your time and kind regards Maya > library(MASS) > counts <- c(1, 4, 9, 2, 1, 4, 2, 4, 1, 3, 2, 2, 1, 3, 1, 1, 2, 1, 113, 83, 49, 46, 13, 52, 4, 10, 14, 10, 3, 19, 8, 21, 151, 186, 99, 11, 29, 24, 24, 62, 15, 98, 30, 21, 63, 29, 48, 11, 16, 35, 21, 17, 6, 2, 2, 3, 3, 4, 4, 2, 1, 2, 2, 3, 4, 8, 10, 3, 14, 3, 11, 23, 3, 51, 8, 8, 7, 1, 13, 8, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 5, 8, 1, 1, 20, 2, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 3, 0, 1, 0, 5, 1, 1, 9, 0, 34, 4, 1, 17, 0, 7, 33, 86, 73, 67, 79, 109, 27, 37, 23, 12, 17, 41, 8, 38, 4, 23, 14, 49, 64, 39, 31, 156, 110, 97, 33, 170, 137, 72, 28, 54) > Site <- factor(c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2)) > Taxon <- factor(c(2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)) > contrasts(Site) <- cbind(c(1,-1)) > contrasts(Taxon) <- cbind(c(1,-1)) > s1 <- glm.nb(counts ~ Site*Taxon) > summary(s1) Call: glm.nb(formula = counts ~ Site * Taxon, init.theta = 0.612672617555492, link = log) Deviance Residuals: Min 1Q Median 3QMax -2.269021 -1.215727 -0.454719 0.003515 2.517288 Coefficients: Estimate Std. Error z value Pr(>|z|) (Intercept)2.6139 0.1097 23.831 < 2e-16 *** Site1 -0.8664 0.1097 -7.899 2.81e-15 *** Taxon1-0.3814 0.1097 -3.477 0.000506 *** Site1:Taxon1 -0.5977 0.1097 -5.449 5.06e-08 *** --- Signif. codes: 0 â***â 0.001 â**â 0.01 â*â 0.05 â.â 0.1 â â 1 (Dispersion parameter for Negative Binomial(0.6127) family taken to be 1) Null deviance: 242.94 on 153 degrees of freedom Residual deviance: 176.20 on 150 degrees of freedom AIC: 1165.4 Number of Fisher Scoring iterations: 1 Theta: 0.6127 Std. Err.: 0.0690 2 x log-likelihood: -1155.4010 > > TukeyHSD(s1) Error in UseMethod("TukeyHSD") : no applicable method for "TukeyHSD" [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.