Re: [R] Array analogue of row and col
slice.index() in base

On 4/2/2013 9:53 AM, Enrico Bibbona wrote:

Great! Thanks a lot,
Enrico

2013/4/2 Duncan Murdoch

On 02/04/2013 6:36 AM, Enrico Bibbona wrote:

Is there any function that extends to multidimensional arrays the functionality of "row" and "col", which are defined only for matrices?

Thanks,
Enrico Bibbona

Not as far as I know, but there are a lot of functions in packages. You could write your own, something like this. I've skipped any error checking; you'll want to add that.

    indices <- function(a, which) {
        d <- dim(a)
        prod_before <- prod(d[seq_len(which-1)])
        result <- rep(seq_len(d[which]), each=prod_before, length.out = prod(d))
        dim(result) <- d
        result
    }

Then indices(a, 1) gives an array of the same shape as a where entry i, j, k, ... is i, indices(a, 2) has entry i, j, k, ... equal to j, etc.

Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Array analogue of row and col
slice.index() in base

On 4/2/2013 6:36 AM, Enrico Bibbona wrote:

Is there any function that extends to multidimensional arrays the functionality of "row" and "col", which are defined only for matrices?

Thanks,
Enrico Bibbona
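As a quick check of the slice.index() suggestion, the sketch below compares it with a hand-rolled helper in the style posted earlier in this thread. The array `a` and its dimensions are made up for illustration, and no error checking is included.

```r
# Illustrative 2 x 3 x 4 array; the values themselves do not matter.
a <- array(0, dim = c(2, 3, 4))

# Hand-rolled helper in the style posted in this thread (no error checking).
indices <- function(a, which) {
    d <- dim(a)
    prod_before <- prod(d[seq_len(which - 1)])
    result <- rep(seq_len(d[which]), each = prod_before, length.out = prod(d))
    dim(result) <- d
    result
}

# slice.index() is in base R: entry [i, j, k] of slice.index(a, 2) is j.
rows <- slice.index(a, 1)
cols <- slice.index(a, 2)

# Both approaches should agree along every dimension.
agree <- sapply(seq_along(dim(a)),
                function(k) all(slice.index(a, k) == indices(a, k)))
```

For a plain matrix m, slice.index(m, 1) and slice.index(m, 2) play the roles of row(m) and col(m).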
Re: [R] list of matrices --> array
abind() (from package 'abind') can take a list of arrays as its first argument, so in general, no need for do.call() with abind(). As another poster pointed out, simplify2array() can also be used; while abind() gives more options regarding which dimension is created and how dimension names are constructed.

> x <- list(A=cbind(X=c(a=1,b=2,c=3,d=4),Y=5:8,Z=9:12),
+           B=cbind(X=c(a=13,b=14,c=15,d=16),Y=17:20,Z=21:24))
> x
$A
  X Y  Z
a 1 5  9
b 2 6 10
c 3 7 11
d 4 8 12

$B
   X  Y  Z
a 13 17 21
b 14 18 22
c 15 19 23
d 16 20 24

> dim(abind(x, along=3))
[1] 4 3 2
> dim(abind(x, along=1.5))
[1] 4 2 3
> dim(abind(x, along=0.5))
[1] 2 4 3
> dim(abind(x, along=1, hier.names=T)) # construct rownames in a hierarchical manner A.a, A.b, etc
[1] 8 3
> dim(abind(x, along=2, hier.names=T)) # construct colnames in a hierarchical manner
[1] 4 6
> abind(x, along=2, hier.names=T)
  A.X A.Y A.Z B.X B.Y B.Z
a   1   5   9  13  17  21
b   2   6  10  14  18  22
c   3   7  11  15  19  23
d   4   8  12  16  20  24
>

On 2/14/2013 3:53 AM, Rolf Turner wrote:

    require(abind)
    do.call(abind,c(my_list,list(along=0))) # Gives 2 x 4 x 5
    do.call(abind,c(my_list,list(along=3))) # Gives 4 x 5 x 2

The latter seems more natural to me.

cheers,
Rolf Turner

On 02/14/2013 07:03 PM, Murat Tasan wrote:

i'm somehow embarrassed to even ask this, but is there any built-in method for doing this:

    my_list <- list()
    my_list[[1]] <- matrix(1:20, ncol = 5)
    my_list[[2]] <- matrix(20:1, ncol = 5)

now, knowing that these matrices are identical in dimension, i'd like to unfold the list to a 2x4x5 (or some other permutation of the dim sizes) array.

i know i can initialize the array, then loop through my_list to fill the array, but somehow this seems inelegant. i also know i can vectorize the matrices and unlist the list, then build the array from that single vector, but this also seems inelegant (and an easy place to introduce errors/bugs).

i can't seem to find any built-in that handles this already... but maybe i just haven't looked hard enough :-/
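For readers without the abind package, the simplify2array() route mentioned in this thread can be sketched with base R alone. The example reuses the poster's my_list; the aperm() permutation to get a 2 x 4 x 5 layout is one choice among several.

```r
# The two 4 x 5 matrices from the original question.
my_list <- list()
my_list[[1]] <- matrix(1:20, ncol = 5)
my_list[[2]] <- matrix(20:1, ncol = 5)

# simplify2array() stacks equally sized matrices along a new last dimension.
arr <- simplify2array(my_list)   # 4 x 5 x 2

# aperm() reorders dimensions; here the list index is moved to the front.
arr_front <- aperm(arr, c(3, 1, 2))   # 2 x 4 x 5
```

This assumes all list elements have identical dimensions; abind() remains the more flexible option when dimension names or the position of the new dimension matter.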
Re: [R] Problem in installing and starting Rattle
On Mon, Mar 7, 2011 at 6:57 AM, nerice wrote:
> CHECK FOR CONFLICTS IN YOUR PATH !!!
>
> I had a related problem when trying to use library "RGtk2" for the first
> time. My problem was that when loading the library R was looking for the
> file "zlib1.dll" but couldn't find the procedure to launch RGtk2. I was
> getting an "Entry Point not found" error from Rgui.exe.
>
> The reason was because I had another package in my PATH environment variable
> (C:\program files\Intel\WiFi\bin) which had a CONFLICTING version of the
> "zlib1.dll" - and it was looking in this file and not the "zlib1.dll" which
> came with GTK+.
>
> Removed this conflict from the PATH and all was ok.

I also had a related problem trying to use library RGtk2 under Windows XP. I kept getting the message

    unable to load shared object C:\R\site-library\RGtk2\libs\i386\RGtk2.dll

Uninstalling and reinstalling various versions of GTK2, both through R and outside R, many times, did not help. With the help of Neil Rice's comment, I found that there was another version of zlib1.dll in a directory on my PATH. Removing that directory from the path (inside R) fixed the problem and RGtk2 runs fine now.

Here's an R expression to look for other copies of zlib1.dll on the path:

> with(list(x=file.path(strsplit(Sys.getenv("PATH"), ";")[[1]], "zlib1.dll")),
+      x[file.exists(x)])
[1] "C:\\R\\GTK2-Runtime\\bin/zlib1.dll"
>

There are many ways to modify the PATH. To set PATH inside R:

> Sys.setenv(PATH=paste(grep("UNWANTEDPATH", strsplit(Sys.getenv("PATH"),
+     ";")[[1]], value=T, invert=T), collapse=";"))

(substitute some pattern that matches the unwanted directory for "UNWANTEDPATH", remembering to use four backslashes to match one).

-- Tony Plate
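The PATH search above can be wrapped in a small helper. This is only a sketch: the function name find_on_path is made up here, and it uses .Platform$path.sep instead of a hard-coded ";" so the same code also runs outside Windows.

```r
# Hypothetical helper: list every copy of `fname` found in the directories
# on a PATH-style string (defaults to the current PATH).
find_on_path <- function(fname, path = Sys.getenv("PATH")) {
    dirs <- strsplit(path, .Platform$path.sep, fixed = TRUE)[[1]]
    dirs <- dirs[nzchar(dirs)]            # drop empty entries
    candidates <- file.path(dirs, fname)
    candidates[file.exists(candidates)]
}

# Usage: more than one hit suggests a possible DLL conflict.
hits <- find_on_path("zlib1.dll")
```

If several paths come back, the first one in PATH order is typically the one that gets loaded, which matches the conflict described in this thread.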
Re: [R] mvbutils and trackObjs
trackObjs will not work together with mvbutils -- both use active bindings to store objects on disk, and I would expect that trying to use both together would cause all sorts of nasty conflicts.

I would think it would be possible to fold the creation/modification time recording from trackObjs into mvbutils, but it would probably be a significant amount of work.

-- Tony Plate

On 05/31/2010 09:56 PM, Day, Roger S wrote:

Hello Colleagues,

I've recently become a fan of Mark Bravington's mvbutils package for organizing analysis projects in a tree. Using cd(), Save(), fixr(), mlazy() etcetera solves nicely some of the nuisances that have worried or annoyed me and sometimes caused big problems over the years. Well thought out.

Now one feature that would be fabulous would be automatic time-stamping of objects. The trackObjs package provides this, among other services.

The question is, can the two packages work together peacefully?

One curiosity: they have opposite ideas of the word "cache". In mvbutils, a cached object is one that is stored on disk, only retrieved into memory when needed. In trackObjs, a cached object is one that is stored in memory as well as on disk. More than a curiosity, also potentially a bad sign for compatibility between the two packages.

Anyone have experience with the two together, and care to share it?

Thanks!

Roger Day
University of Pittsburgh
Departments of Biomedical Informatics and Biostatistics
University of Pittsburgh Cancer Institute
University of Pittsburgh Molecular Medicine Institute
Room 310, Suite 301
Cancer Pavilion (CNPAV)
5150 Centre Ave.
Pittsburgh, PA 15232
e-mail: da...@pitt.edu
cell phone 412-609-3918
assistant: Lucy Cafeo: (412) 623-2952
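For readers unfamiliar with active bindings, the mechanism both packages are said to rely on can be sketched in a few lines of base R. This is not how mvbutils or trackObjs are implemented; the environment, the variable name x, and the temporary .rds backing file are all assumptions made for illustration.

```r
# A scratch environment and a file that stands in for on-disk storage.
env <- new.env()
backing_file <- tempfile(fileext = ".rds")
saveRDS(1:3, backing_file)

# An active binding runs a function on every read and every assignment:
# called with no argument on read, with the new value on assignment.
makeActiveBinding("x", function(value) {
    if (missing(value)) {
        readRDS(backing_file)            # read: fetch from disk
    } else {
        saveRDS(value, backing_file)     # write: store to disk
        invisible(value)
    }
}, env)

first_read <- env$x          # triggers the read branch
env$x <- letters[1:2]        # triggers the write branch
second_read <- env$x
```

Since a symbol can have only one binding, two packages that each want to install their own active binding for the same variable would be expected to interfere, which is consistent with the concern raised above.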
Re: [R] Variable variables using R ... e.g., looping over data frames with a numeric separator
On 05/17/2010 03:51 PM, Monte Shaffer wrote:

    for(i in 1:L-1) {
        dataStr = gsub(' ','',paste("fData.",i));
        dataVar = eval(dataStr);
        ## GOAL is to grab data frame 'fData.1' and do stuff with it,
        ## then in next loop grab data frame 'fData.2' and do stuff with it
    }

As Dan Davison said, the more standard R way would be to put all your data frames in a list, then iterate over the list.

However, if you do want to have your data frames in separate variables, and then get each data frame in a loop similar to the code fragment above, try something like this:

    for (i in 1:(L-1)) {
        dataName <- paste("fData.", i, sep="")
        df <- get(dataName)
        ... do something with data frame df ...
    }

You can also give additional arguments to get() to tell it where to look (pos=, envir=), and whether to look in parent environments (inherits=TRUE/FALSE).

-- Tony Plate
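The list-based approach recommended above can be sketched as follows. The data frames fData.1 and fData.2 and the value of L are invented here; mget() collects existing variables into a named list, after which lapply()/sapply() replace the explicit loop.

```r
# Made-up example data standing in for the poster's fData.* variables.
fData.1 <- data.frame(a = 1:2)
fData.2 <- data.frame(a = 3:5)
L <- 3

# Collect the separately named data frames into one named list.
frame_names <- paste("fData.", 1:(L - 1), sep = "")
frames <- mget(frame_names, envir = environment())

# "Do stuff" with each data frame; here just count rows.
row_counts <- sapply(frames, nrow)
```

Note the parentheses in 1:(L - 1): without them, 1:L-1 evaluates as (1:L) - 1 and starts at 0.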
Re: [R] How to change the default Date format for write.csvfunction?
You can use a numeric value for the quote= argument to write.table to specify which columns should have quotes.

> d <- data.frame(ticker=c("IBM", "IBM"), date = as.Date(c("2009-12-03", "2009-12-04")), price=c(120.00, 123.00))
> d1 <- as.data.frame(lapply(d, function(x) if (is(x, "Date")) format(x, "%m/%d/%Y") else x))
> write.table(d1, quote=which(sapply(d, function(x) !is.numeric(x) & !is(x, "Date"))))
"ticker" "date" "price"
"1" "IBM" 12/03/2009 120
"2" "IBM" 12/04/2009 123

-- Tony Plate

William Dunlap wrote:

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of george@bnymellon.com
Sent: Monday, December 28, 2009 8:18 AM
To: r-help@r-project.org
Subject: Re: [R] How to change the default Date format for write.csvfunction?

Hi,

This problem might be a little harder than it appears. I received a few emails, all suggesting that I convert the Date field to character by calling format(date, "%m/%d/%Y") in one way or another. Well, this is not the solution I'm looking for and it doesn't work for me. All the date fields are generated with quotes around them, which will be treated by other software as strings instead of dates. Please note, the write.csv() function doesn't put quotes around dates. All I need is to change the format behavior of Date without adding any quotes. So the CSV output I'm looking for should be:

    "ticker","date","price"
    "IBM",12/03/2009,120
    "IBM",12/04/2009,123

Not this:

    "ticker","date","price"
    "IBM","12/03/2009",120
    "IBM","12/04/2009",123

Write a function that adds double quotes for the columns with classes that you want quoted and call write.csv with quote=FALSE. E.g., the following function f puts double quotes around character and factor columns:

    f <- function (dataframe) {
        doubleQuoteNoFancy <- function(x) paste("\"", x, "\"", sep = "")
        for (i in seq_along(dataframe)) {
            if (is(dataframe[[i]], "character"))
                dataframe[[i]] <- doubleQuoteNoFancy(dataframe[[i]])
            else if (is(dataframe[[i]], "factor"))
                levels(dataframe[[i]]) <- doubleQuoteNoFancy(levels(dataframe[[i]]))
            else if (is(dataframe[[i]], "Date"))
                dataframe[[i]] <- format(dataframe[[i]], "%m/%d/%Y")
        }
        colnames(dataframe) <- doubleQuoteNoFancy(colnames(dataframe))
        dataframe
    }

Use it as:

> write.csv(f(d), file=stdout(), quote=FALSE, row.names=FALSE)
"ticker","date","price"
"IBM",12/03/2009,120
"IBM",12/04/2009,123

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

Thanks for trying though.

George

From: george@bnymellon.com
To: r-help@r-project.org
Date: 12/28/2009 10:20 AM
Subject: [R] How to change the default Date format for write.csv function?
Sent by: r-help-boun...@r-project.org

Hi,

I have a data.frame containing a Date column. When using the write.csv() function to generate a CSV file, I always get the Date column formatted as "YYYY-MM-DD". I would like to have it formatted as "MM/DD/YYYY", but could not find an easy way to do it. Here is the test code:

    d <- data.frame(ticker=c("IBM", "IBM"), date = as.Date(c("2009-12-03", "2009-12-04")), price=c(120.00, 123.00))
    write.csv(d, file="C:/temp/test.csv", row.names=FALSE)

The test.csv generated looks like this:

    "ticker","date","price"
    "IBM",2009-12-03,120
    "IBM",2009-12-04,123

I would like to have the date fields in the CSV formatted as "MM/DD/YYYY". Is there any easy way to do this?

Thanks in advance.

George Zou
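The numeric quote= idea from this thread also works with write.csv() directly, which gives the comma-separated layout the poster asked for. The sketch below is one way to combine the two suggestions; the variable names d1 and quote_cols are illustrative, and stringsAsFactors = FALSE is set explicitly so the ticker column is character in any R version.

```r
d <- data.frame(ticker = c("IBM", "IBM"),
                date = as.Date(c("2009-12-03", "2009-12-04")),
                price = c(120.00, 123.00),
                stringsAsFactors = FALSE)

# Format the Date column as text, leaving the other columns alone.
d1 <- d
d1$date <- format(d$date, "%m/%d/%Y")

# Quote only the columns that were character in the ORIGINAL data frame,
# so the formatted dates stay unquoted.
quote_cols <- unname(which(sapply(d, is.character)))

# capture.output() is used only to collect the lines for inspection;
# pass file = "some/path.csv" instead to write a file.
csv_lines <- capture.output(write.csv(d1, quote = quote_cols, row.names = FALSE))
```

Column names are still quoted because write.table() quotes row and column names whenever quote is not FALSE.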
Re: [R] 2D array of strings
matrix(str, ncol=1)

Francesco Napolitano wrote:

Sorry for the dumb question, but I couldn't figure this out myself. Consider the following:

> str <- c("abc","def")
> array(str, c(2,1))
     [,1]
[1,] "abc"
[2,] "def"

How can i obtain the outcome of the second instruction without specifying the number of rows?

Thank you in advance,
Francesco.
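A short sketch of the answer, together with two base R alternatives that also avoid spelling out the number of rows. The variable is renamed strs here to avoid masking the base function str(); that renaming is the only change to the poster's setup.

```r
strs <- c("abc", "def")

m1 <- matrix(strs, ncol = 1)   # rows inferred from length(strs)
m2 <- as.matrix(strs)          # a plain vector becomes a one-column matrix
m3 <- cbind(strs)              # same shape, but with a column name "strs"
```

All three are 2 x 1 character matrices; cbind() differs only in attaching a column name.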
Re: [R] [Rd] Error in namespaceExport(ns, exports) :
Try looking in the NAMESPACE file (in the same directory as the DESCRIPTION file).

-- Tony Plate

David Scherrer wrote:

Dear all,

I get the error

    "Error in namespaceExport(ns, exports) : undefined exports function1 , function2"

when compiling or even when I roxygen my package. The two function I once had in my package but I deleted them including their .Rd files. I also can't find them in any other function or help file.

So does anybody know where these functions are still listed that causes this error?

Many thanks,
David

__
r-de...@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [R] Help with complicated regular expression
One of these should be a start. If there can be no extra text at the beginning or end, start with "^" and end with "$".

> x <- c("WORD ( 123)", "WORD(1 )", "WORD\t ( 21\t)", "WORD \t ( 1 \t )", "decoy((2))", "more words in front(2)")
> grep("[[:alpha:]]+[ \t]*\\([ \t]*[0-9]+[ \t]*\\)", x)
[1] 1 2 3 4 6
> grep("^[[:alpha:]]+[ \t]*\\([ \t]*[0-9]+[ \t]*\\)", x)
[1] 1 2 3 4

-- Tony Plate

Dennis Fisher wrote:

Colleagues,

I am using R (2.9.2, all platforms) to search for a complicated text string using regular expressions. I would appreciate any help you can provide.

The string consists of the following elements:

    SOMEWORDWITHNOSPACES
    any number of spaces and/or tabs
    (
    any number of spaces and/or tabs
    integer
    any number of spaces and/or tabs
    )

Examples include:

    WORD ( 123)
    WORD(1 )
    WORD\t ( 21\t)
    WORD \t ( 1 \t )
    etc.

I don't need to substitute anything, only to identify if such a string exists. Any help with regular expressions would be appreciated. Thanks.

Dennis

Dennis Fisher MD
P < (The "P Less Than" Company)
Phone: 1-866-PLessThan (1-866-753-7784)
Fax: 1-866-PLessThan (1-866-753-7784)
www.PLessThan.com
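Since the poster only needs a yes/no answer, grepl() with both anchors may be the most direct form. This is a sketch built on the pattern above; the extra test string with trailing text is added here for illustration.

```r
x <- c("WORD ( 123)", "WORD(1 )", "WORD\t ( 21\t)", "WORD \t ( 1 \t )",
       "decoy((2))", "more words in front(2)", "WORD(1) trailing text")

# Same pattern as above, anchored at both ends so nothing may precede or follow.
pattern <- "^[[:alpha:]]+[ \t]*\\([ \t]*[0-9]+[ \t]*\\)$"

# grepl() returns one logical per element instead of the matching indices.
is_match <- grepl(pattern, x)
any_match <- any(is_match)
```

Here only the first four strings match; the last three fail because of the doubled parentheses, the leading words, and the trailing text respectively.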
Re: [R] Loadings and scores from fastICA?
ICA and PCA both model the data as a product of two matrices (usually called something like components or loadings & weights or scores). It's how those matrices are constructed that differs. PCA is often a first step in doing ICA. I'd suggest reading the ICA tutorial by Aapo Hyvärinen and Erkki Oja (http://www.cis.hut.fi/aapo/papers/IJCNN99_tutorialweb/) -- it's an excellent introduction.

-- Tony Plate

Joel Fürstenberg-Hägg wrote:

Ok, so then the S gives the individual components, good. Thanks Tony!

But what about the principal components from the PCA plot, how are they calculated?

And is the linear mixing matrix A really the same as the loadings/weights? There must be different loadings for the PCA and ICA, right?

Best regards,
Joel

> Date: Wed, 11 Nov 2009 14:29:06 -0700
> From: tpl...@acm.org
> To: joel_furstenberg_h...@hotmail.com
> CC: r-help@r-project.org
> Subject: Re: [R] Loadings and scores from fastICA?
>
> The help for fastICA says:
>
>     The data matrix X is considered to be a linear combination of
>     non-Gaussian (independent) components i.e. X = SA where columns of
>     S contain the independent components and A is a linear mixing
>     matrix.
>
> The value of fastICA is a list with components "S" (the estimated source
> matrix) and "A" (the estimated mixing matrix). Are these what you want?
>
> -- Tony Plate
>
> Joel Fürstenberg-Hägg wrote:
> > Hi all,
> >
> > Does anyone know how to get the independent components and loadings from
> > an Independent Component Analysis (ICA), as well as principal components
> > and loadings from a Principal Component Analysis (PCA) using the fastICA
> > package? Or perhaps if there's another way to do ICAs in R?
> >
> > Below is an example from the fastICA manual
> > (http://cran.r-project.org/web/packages/fastICA/fastICA.pdf)
> >
> > if(require(MASS))
> > {
> > x <- mvrnorm(n = 1000, mu = c(0, 0), Sigma = matrix(c(10, 3, 3, 1), 2, 2))
> > x1 <- mvrnorm(n = 1000, mu = c(-1, 2), Sigma = matrix(c(10, 3, 3, 1), 2, 2))
> > X <- rbind(x, x1)
> > a <- fastICA(X, 2, alg.typ = "deflation", fun = "logcosh", alpha = 1, method = "R", row.norm = FALSE, maxit = 200, tol = 0.0001, verbose = TRUE)
> > par(mfrow = c(1, 3))
> > plot(a$X, main = "Pre-processed data")
> > plot(a$X%*%a$K, main = "PCA components")
> > plot(a$S, main = "ICA components")
> > }
> >
> > Best regards,
> >
> > Joel
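The "product of two matrices" view can be checked directly for PCA with base R's prcomp(). The data below are simulated purely for illustration; the sketch shows that the scores times the transposed loadings reproduce the centered data. By the help text quoted above, the analogous product for fastICA would be S times A, which is not run here because it needs the fastICA package.

```r
set.seed(42)
X <- matrix(rnorm(200), ncol = 4)   # simulated data: 50 observations, 4 variables

pc <- prcomp(X)
scores   <- pc$x          # principal components ("scores"), one row per observation
loadings <- pc$rotation   # loadings, one column per component

# prcomp() centers the data by default, so compare against the centered X.
Xc <- scale(X, center = TRUE, scale = FALSE)
reconstruction <- scores %*% t(loadings)
max_error <- max(abs(reconstruction - Xc))
```

max_error should be at the level of floating-point round-off, confirming that PCA factors the centered data into scores and loadings.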
Re: [R] Loadings and scores from fastICA?
The help for fastICA says:

    The data matrix X is considered to be a linear combination of
    non-Gaussian (independent) components i.e. X = SA where columns of
    S contain the independent components and A is a linear mixing
    matrix.

The value of fastICA is a list with components "S" (the estimated source matrix) and "A" (the estimated mixing matrix). Are these what you want?

-- Tony Plate

Joel Fürstenberg-Hägg wrote:

Hi all,

Does anyone know how to get the independent components and loadings from an Independent Component Analysis (ICA), as well as principal components and loadings from a Principal Component Analysis (PCA) using the fastICA package? Or perhaps if there's another way to do ICAs in R?

Below is an example from the fastICA manual (http://cran.r-project.org/web/packages/fastICA/fastICA.pdf)

    if(require(MASS))
    {
    x <- mvrnorm(n = 1000, mu = c(0, 0), Sigma = matrix(c(10, 3, 3, 1), 2, 2))
    x1 <- mvrnorm(n = 1000, mu = c(-1, 2), Sigma = matrix(c(10, 3, 3, 1), 2, 2))
    X <- rbind(x, x1)
    a <- fastICA(X, 2, alg.typ = "deflation", fun = "logcosh", alpha = 1, method = "R", row.norm = FALSE, maxit = 200, tol = 0.0001, verbose = TRUE)
    par(mfrow = c(1, 3))
    plot(a$X, main = "Pre-processed data")
    plot(a$X%*%a$K, main = "PCA components")
    plot(a$S, main = "ICA components")
    }

Best regards,
Joel
Re: [R] partial cumsum
William Dunlap wrote:

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of smu
Sent: Wednesday, November 11, 2009 7:58 AM
To: r-help@r-project.org
Subject: [R] partial cumsum

Hello,

I am searching for a function to calculate "partial" cumsums. For example, it should calculate the cumulative sums until an NA appears, and restart the cumsum calculation after the NA.

this:

    x <- c(1, 2, 3, NA, 5, 6, 7, 8, 9, 10)

should become this:

    1 3 6 NA 5 11 18 26 35 45

Perhaps

> ave(x, rev(cumsum(rev(is.na(x)))), FUN=cumsum)
 [1]  1  3  6 NA  5 11 18 26 35 45

Nice simple function! Here's a different approach I use that's faster for long vectors with many NA values. Note, however, that this approach can suffer from catastrophic round-off error because it does a cumsum over the whole vector (after replacing NAs with zeros) and then subtracts out the cumsum at the most recent NA values. Most of the body of this function is devoted to allowing (an unreasonable degree of) flexibility in specification of where to reset.

    cumsum.reset <- function(x, reset.at=which(is.na(x)), na.rm=F) {
        # compute the cumsum of x, resetting the cumsum to 0 at each element indexed by reset.at
        if (is.logical(reset.at)) {
            if (length(reset.at)>length(x)) {
                if ((length(reset.at) %% length(x))!=0)
                    stop("length of reset.at must be a multiple of length of x")
                x <- rep(x, len=length(reset.at))
            } else if (length(reset.at) [...]

> x <- c(1, 2, 3, NA, 5, 6, 7, 8, 9, 10)
> cumsum.reset(x)
 [1]  1  3  6  0  5 11 18 26 35 45
> ave(x, rev(cumsum(rev(is.na(x)))), FUN=cumsum)
 [1]  1  3  6 NA  5 11 18 26 35 45

The speedup from not breaking the input vector into smaller vectors is (to me) surprisingly small -- only a factor of 3:

> x <- replace(rnorm(1e6), sample(1e6, 1), NA)
> all.equal(replace(ave(x, rev(cumsum(rev(is.na(x)))), FUN=cumsum), is.na(x), 0), cumsum.reset(x))
[1] TRUE
> system.time(cumsum.reset(x))
   user  system elapsed
   0.31    0.03    0.35
> system.time(ave(x, rev(cumsum(rev(is.na(x)))), FUN=cumsum))
   user  system elapsed
   0.99    0.05    1.15

So, I'd go with the ave() approach unless this type of cumsum is the core of a long computationally intensive job. And if that's the case, it would make sense to code it in C.

-- Tony Plate

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

any ideas?

thank you and best regards,
stefan
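The "cumsum over the whole vector, then subtract the running total at the most recent NA" idea described above can be sketched compactly. This is not the cumsum.reset function from the post (whose body is incomplete in the archive); the name cumsum_reset_na and its NA-preserving output are choices made here, and it shares the round-off caveat mentioned in the post.

```r
# Sketch: cumulative sum that restarts after each NA, keeping NA in the output.
cumsum_reset_na <- function(x) {
    na <- is.na(x)
    total <- cumsum(replace(x, na, 0))            # running total with NAs as 0
    # position of the most recent NA at or before each element (0 if none yet)
    last_na <- cummax(ifelse(na, seq_along(x), 0L))
    offset <- c(0, total)[last_na + 1L]           # running total at that NA
    out <- total - offset
    out[na] <- NA
    out
}

x <- c(1, 2, 3, NA, 5, 6, 7, 8, 9, 10)
res <- cumsum_reset_na(x)

# Reference result from the ave() one-liner in this thread.
ref <- ave(x, rev(cumsum(rev(is.na(x)))), FUN = cumsum)
```

For the example vector both res and ref are 1 3 6 NA 5 11 18 26 35 45. Use replace(res, is.na(res), 0) if zeros at the reset points are preferred.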
Re: [R] Comparison of vectors in a matrix
This is a tricky data entry problem. The right technique will depend on the fine details of the data, and it's not clear what those are. E.g., when you say "In my first column, for example, I have "henry" ", it's unclear to me whether or not the double quotes are part of the data -- which is why it's nice to provide reproducible examples.

But, if you do have quoted strings in your data fields as they exist in an R matrix, you can do something like the following:

> # each element of the matrix x contains one or more quoted strings, separated by commas
> x <- matrix(c('"a", "b"', '"c"', '"b"', '"d"'), ncol=2, dimnames=list(c("row1", "row2"), c("X","Y")))
> x
     X                Y
row1 "\"a\", \"b\"" "\"b\""
row2 "\"c\""        "\"d\""
> # use R's parsing and evaluation to turn '"a", "b"' into c("a", "b"), and turn that
> # into a matrix containing character vectors of various lengths.
> matrix(lapply(parse(text=paste("c(", x, ")")), eval), ncol=ncol(x), dimnames=dimnames(x))
     X           Y
row1 Character,2 "b"
row2 "c"         "d"

- Tony Plate

esterhazy wrote:

Yes, thanks for this, this is exactly what I want to do.

However, I have a remaining problem, which is how to get R to understand that each entry in my matrix is a vector of names. I have been trying to import my text file with the names in each vector of names enclosed in quotes and separated by commas, or separated by spaces, or without quotes, etc, with no luck. Every time, R seems to consider the vector of names as just one long name.

In my first column, for example, I have "henry", in the second, "mary", "ruth", and in the third "mary", "joseph", and I have no idea how to get R to see that "mary", "ruth", for example, is composed of two strings of text, rather than just one.

Thanks for any further help!

http://old.nabble.com/file/p26305756/ffoexample.txt ffoexample.txt

Tony Plate wrote:

Nice problem! If I understand you correctly, here's how to do it (with list-based matrices):

> set.seed(1)
> (x <- matrix(lapply(rpois(10,2)+1, function(k) sample(letters[1:10], size=k)), ncol=2, dimnames=list(1:5,c("A","B"))))
  A           B
1 Character,2 Character,5
2 Character,2 Character,5
3 Character,3 Character,3
4 Character,5 Character,3
5 Character,2 "i"
> x[1,1]
[[1]]
[1] "c" "b"

> x[1,2]
[[1]]
[1] "c" "d" "a" "j" "f"

> (y <- cbind(x, "A-B"=apply(x, 1, function(ab) setdiff(ab[[1]], ab[[2]]))))
  A           B           A-B
1 Character,2 Character,5 "b"
2 Character,2 Character,5 "g"
3 Character,3 Character,3 Character,3
4 Character,5 Character,3 Character,2
5 Character,2 "i"         Character,2
> y[1,3]
[[1]]
[1] "b"

-- Tony Plate

esterhazy wrote:

Hi,

I have a matrix with two columns, and the elements of the matrix are vectors. So for example, in line 3 of column 1 I have a vector v31=("marc", "robert", "marie").

What I need to do is to compare all vectors in column 1 and 2, so as to get, for example, setdiff(v31,v32) into a new column.

Is there a way to do this in R?

Thanks!
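If the cells are plain comma-separated names (with or without quotes), an alternative to parse()/eval() is to split the text with strsplit(), which avoids evaluating file contents as R code. The sketch below is built on assumptions: the matrix x, its dimnames, and the names in it are invented to resemble the poster's description.

```r
# Invented input: each cell holds one or more names separated by commas.
x <- matrix(c('"mary", "ruth"', '"henry"', '"mary", "joseph"', '"henry"'),
            ncol = 2, dimnames = list(c("row1", "row2"), c("A", "B")))

# Split each cell on commas, then strip quotes and surrounding whitespace.
cells <- lapply(strsplit(x, ",", fixed = TRUE),
                function(s) trimws(gsub('"', "", s, fixed = TRUE)))

# Rebuild a list-based matrix with the same shape as x.
lm <- matrix(cells, nrow = nrow(x), dimnames = dimnames(x))

# Row-wise set difference of column A minus column B.
a_minus_b <- mapply(setdiff, lm[, "A"], lm[, "B"], SIMPLIFY = FALSE)
```

Here a_minus_b is a list with one element per row: "ruth" for row1 and an empty character vector for row2. It can be attached with cbind(lm, "A-B" = a_minus_b) as in the list-matrix example above.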
Re: [R] Comparison of vectors in a matrix
Nice problem! If I understand you correctly, here's how to do it (with list-based matrices):

> set.seed(1)
> (x <- matrix(lapply(rpois(10,2)+1, function(k) sample(letters[1:10], size=k)), ncol=2, dimnames=list(1:5,c("A","B"))))
  A           B
1 Character,2 Character,5
2 Character,2 Character,5
3 Character,3 Character,3
4 Character,5 Character,3
5 Character,2 "i"
> x[1,1]
[[1]]
[1] "c" "b"

> x[1,2]
[[1]]
[1] "c" "d" "a" "j" "f"

> (y <- cbind(x, "A-B"=apply(x, 1, function(ab) setdiff(ab[[1]], ab[[2]]))))
  A           B           A-B
1 Character,2 Character,5 "b"
2 Character,2 Character,5 "g"
3 Character,3 Character,3 Character,3
4 Character,5 Character,3 Character,2
5 Character,2 "i"         Character,2
> y[1,3]
[[1]]
[1] "b"

-- Tony Plate

esterhazy wrote:

Hi,

I have a matrix with two columns, and the elements of the matrix are vectors. So for example, in line 3 of column 1 I have a vector v31=("marc", "robert", "marie").

What I need to do is to compare all vectors in column 1 and 2, so as to get, for example, setdiff(v31,v32) into a new column.

Is there a way to do this in R?

Thanks!
Re: [R] prcomp - principal components in R
The output of summary prcomp displays the cumulative amount of variance explained relative to the total variance explained by the principal components PRESENT in the object. So, it is always guaranteed to be at 100% for the last principal component present. You can see this from the code in summary.prcomp() (see this code with getAnywhere("summary.prcomp")). Here's how to get the output you want (the last line in the transcript below): set.seed(1) summary(pc1 <- prcomp(x)) Importance of components: PC1 PC2 PC3 PC4 PC5 Standard deviation 1.175 1.058 0.976 0.916 0.850 Proportion of Variance 0.275 0.223 0.190 0.167 0.144 Cumulative Proportion 0.275 0.498 0.688 0.856 1.000 summary(pc2 <- prcomp(x, tol=0.8)) Importance of components: PC1 PC2 PC3 Standard deviation 1.17 1.058 0.976 Proportion of Variance 0.40 0.324 0.276 Cumulative Proportion 0.40 0.724 1.000 pc2$sdev [1] 1.1749061 1.0581362 0.9759016 pc1$sdev [1] 1.1749061 1.0581362 0.9759016 0.9164905 0.8503122 svd(scale(x, center=T, scale=F))$d / sqrt(nrow(x)-1) [1] 1.1749061 1.0581362 0.9759016 0.9164905 0.8503122 cumsum(pc1$sdev^2) / sum((svd(scale(x, center=T, scale=F))$d / sqrt(nrow(x)-1))^2) [1] 0.2752317 0.4984734 0.6883643 0.8558386 1.000 # output in terms of the cumulative % of the total variance cumsum(pc2$sdev^2) / sum((svd(scale(x, center=T, scale=F))$d / sqrt(nrow(x)-1))^2) [1] 0.2752317 0.4984734 0.6883643 It's probably better to get prcomp to compute all the components in the first place, because the SVD is the bulk of the computation anyway (so doing it again will be slower for large matrices.) Then just look at the most important principal components. However, there may be a shortcut for computing the values of D in the SVD of a matrix -- you could look for that if you have demanding computations (e.g., the sqrts of the eigen values of the covariance matrix of scaled x: sqrt(eigen(var(scale(x, center=T, scale=F)), only.values=T)$values)). 
-- Tony Plate

zubin wrote:
Hello, not understanding the output of prcomp, I reduce the number of components and the output continues to show cumulative 100% of the variance explained, which can't be the case dropping from 8 components to 3. How do i get the output in terms of the cumulative % of the total variance, so when i go from total solution of 8 (8 variables in the data set), to a reduced number of components, i can evaluate % of variance explained, or am I missing something??

8 variables in the data set

> princ = prcomp(df[,-1],rotate="varimax",scale=TRUE)
> summary(princ)
Importance of components:
                         PC1   PC2   PC3   PC4   PC5   PC6    PC7    PC8
Standard deviation     1.381 1.247 1.211 0.994 0.927 0.764 0.6708 0.4366
Proportion of Variance 0.238 0.194 0.183 0.124 0.107 0.073 0.0562 0.0238
Cumulative Proportion  0.238 0.433 0.616 0.740 0.847 0.920 0.9762 *1.*

> princ = prcomp(df[,-1],rotate="varimax",scale=TRUE,tol=.75)
> summary(princ)
Importance of components:
                         PC1   PC2   PC3
Standard deviation     1.381 1.247 1.211
Proportion of Variance 0.387 0.316 0.297
Cumulative Proportion  0.387 0.703 *1.000*
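The idea in the reply above (divide by the total variance, not by the variance of the retained components) can also be sketched without a second SVD. This is a minimal illustration on made-up data; x, k and total_var are hypothetical names, and it assumes the default centering with no scaling, so that the total variance is simply the sum of the column variances.

```r
# Hypothetical data: 50 observations of 5 variables (not the poster's data).
set.seed(1)
x <- matrix(rnorm(250), ncol = 5)

# Total variance of the centered data: the sum of the column variances,
# which equals the sum of all squared component standard deviations.
total_var <- sum(apply(x, 2, var))

pc <- prcomp(x)   # compute all components once
k  <- 3           # number of leading components to report

# Cumulative proportion of the TOTAL variance explained by the first k PCs.
cum_prop <- cumsum(pc$sdev[seq_len(k)]^2) / total_var
cum_prop
```

If the data are standardized with scale.=TRUE, the total variance is just the number of columns, so the denominator becomes ncol(x).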
Re: [R] rm(list<-ls()) error
"<-" and "=" are not universally interchangable. args(rm) function (..., list = character(0L), pos = -1, envir = as.environment(pos), inherits = FALSE) The call rm(list <- ls()) assigns the result of ls() to the variable 'list' and passes that value as an anonymous argument to rm() (Probably more than you want to know: or it would if rm() didn't have non-standard evaluation rules -- as it happens, list <- ls() is recognized as an invalid argument before it is evaluated.) The call rm(list=ls()) calls rm() with the 'list' argument having the value of ls() Here's an example that doesn't confuse things by having non-standard evaluation rules: f <- function(a=1, b=2) cat("a=", a, "b=", b, "\n") b Error: object 'b' not found f(b <- 33) a= 33 b= 2 b [1] 33 f(b=33) a= 1 b= 33 -- Tony Plate Feng Li wrote: Dear R, Why rm(list<-ls()) gives an error but rm(list=ls()) not? I remember the operator ‘<-’ can be used anywhere... Thanks! Feng __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Create Artificial Binary Matrix based on probability
> x <- matrix(sample(0:1, 1200, replace=T, prob=c(0.952, 0.048)), ncol=30)
> table(x)
x
   0    1
1131   69
> x <- matrix(sample(0:1, 1200, replace=T, prob=c(0.952, 0.048)), ncol=30)
> table(x)
x
   0    1
1151   49

bikemike42 wrote:
Dear All, I am trying to create an artificial binary matrix such that each cell has a probability of 0.048 of having a 1. So far the closest I've come is by using a random Poisson distribution with a mean of 0.048, but I can't figure out how to limit the max value to 1. Otherwise that would work fine it seems. Any suggestions? The main code I've got to create said matrix so far is:

a<-replicate(26,rpois(57,0.048))

Thanks in Advance, Mike
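Since each cell is an independent 0/1 draw, a Bernoulli draw via rbinom() with size = 1 is another way to write the same thing as the sample() call above. A small sketch using the 57 x 26 shape from the question; the seed and the variable names are arbitrary choices for illustration.

```r
set.seed(42)   # arbitrary seed, only for reproducibility
n_row <- 57
n_col <- 26
p     <- 0.048

# Each cell is 1 with probability p and 0 otherwise.
a <- matrix(rbinom(n_row * n_col, size = 1, prob = p),
            nrow = n_row, ncol = n_col)

table(a)   # counts of 0s and 1s
mean(a)    # observed fraction of 1s, should be near p
```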
Re: [R] column names of a correlation matrix
Here's a simple example that might help you get what you want:

> set.seed(1)
> x <- matrix(rnorm(30), ncol=3, dimnames=list(NULL, letters[1:3]))
> (xc <- cor(x))
           a          b          c
a  1.0000000 -0.3767034 -0.7158385
b -0.3767034  1.0000000  0.6040273
c -0.7158385  0.6040273  1.0000000
> (cd <- data.frame(elt=outer(colnames(xc), colnames(xc), paste, sep=":")[upper.tri(xc)], row=row(xc)[upper.tri(xc)], col=col(xc)[upper.tri(xc)], cor=xc[upper.tri(xc)]))
  elt row col        cor
1 a:b   1   2 -0.3767034
2 a:c   1   3 -0.7158385
3 b:c   2   3  0.6040273
> cd[order(-cd$cor),]
  elt row col        cor
3 b:c   2   3  0.6040273
1 a:b   1   2 -0.3767034
2 a:c   1   3 -0.7158385

If you need something more efficient, try using which(..., arr.ind=T) to pick out matrix style indices, e.g.:

> (ii <- which(xc > -0.4 & upper.tri(xc), arr.ind=T))
  row col
a   1   2
b   2   3
> cbind(ii, cor=xc[ii])
  row col        cor
a   1   2 -0.3767034
b   2   3  0.6040273

-- Tony Plate

Lee William wrote:
Hi! All, I am working on a correlation matrix of 4217x4217 named 'cor_expN'. I wish to obtain pairs with highest correlation values. So, I did this

> b=matrix(data=NA,nrow=4217,ncol=1)
> rownames(b)=rownames(cor_expN)
> for(i in 1:4217){b[i,]=max(cor_expN[i,])}
> head(b)
                   [,1]
aaeA_b3241_14 0.7181912
aaeB_b3240_15 0.7513084
aaeR_b3243_15 0.7681684
aaeX_b3242_12 0.5230587
aas_b2836_14  0.6615927
aat_b0885_14  0.6344144

Now I want the corresponding columns for the above values. For that I tried this

> c=matrix(data=NA,nrow=4217,ncol=1)
> for(i in 1:4217){b[i,]=colnames(max(cor_expN[i,]))}

And got the following error:
Error in b[i, ] = colnames(max(cor_expN[i, ])) : number of items to replace is not a multiple of replacement length

Any thoughts? Lee
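For the question as literally asked (which column holds the largest correlation in each row), one possible sketch is to blank out the diagonal and use which.max() per row. It reuses the small example matrix from the reply; best_col and best are hypothetical names, and it assumes the self-correlations on the diagonal should be ignored.

```r
set.seed(1)
x  <- matrix(rnorm(30), ncol = 3, dimnames = list(NULL, letters[1:3]))
xc <- cor(x)

# Ignore each variable's correlation with itself.
diag(xc) <- NA

# which.max() skips NA, so it returns the position of the largest
# off-diagonal correlation in each row.
best_col <- colnames(xc)[apply(xc, 1, which.max)]

best <- data.frame(row = rownames(xc),
                   col = best_col,
                   cor = apply(xc, 1, max, na.rm = TRUE))
best
```

colnames(max(...)) in the original attempt fails because max() returns a bare number with no names attached; which.max() returns a position that can be looked up in colnames().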
Re: [R] LDA Precdict - Seems to be predicting on the Training Data
Maybe you're getting strange results because you're not supplying a data object to lda() when you build your fit. When I do it the "standard" way, predict.lda() uses the new data and produces a result of length 6 as expected:

> myDat <- read.csv("clipboard", sep="\t")
> fit <- lda(c1 ~ v1 + v2 + v3, data=myDat[1:10,])
> predict(fit, myDat[11:16,])
$class
[1] c c c b c a
Levels: a b c
...

-- Tony Plate

BostonR wrote:
When I import a simple dataset, run LDA, and then try to use the model to forecast out of sample data, I get a forecast for the training set not the out of sample set. Others have posted this question, but I do not see the answers to their posts. Here is some sample data:

Date      Names  v1          v2          v3          c1
1/31/2009 Name1  0.714472361 0.902552278 0.783353694 a
1/31/2009 Name2  0.512158919 0.770451596 0.111853346 a
1/31/2009 Name3  0.470693282 0.129200065 0.800973877 a
1/31/2009 Name4  0.24236898  0.472219638 0.486599763 b
1/31/2009 Name5  0.785619735 0.628511593 0.106868172 b
1/31/2009 Name6  0.718718387 0.697257275 0.690326648 b
1/31/2009 Name7  0.327331186 0.01715109  0.861421706 c
1/31/2009 Name8  0.632011743 0.599040196 0.320741634 c
1/31/2009 Name9  0.302804404 0.475166304 0.907143632 c
1/31/2009 Name10 0.545284813 0.967196462 0.945163717 a
1/31/2009 Name11 0.563720418 0.024862018 0.970685281 a
1/31/2009 Name12 0.357614427 0.417490445 0.415162276 a
1/31/2009 Name13 0.154971203 0.425227967 0.856866993 b
1/31/2009 Name14 0.935080173 0.488659307 0.194967973 a
1/31/2009 Name15 0.363069339 0.334206603 0.639795596 b
1/31/2009 Name16 0.862889297 0.821752532 0.549552875 a

Attached is the code:

myDat <-read.csv(file="f:\\Systematiq\\data\\TestData.csv", header=TRUE,sep=",")
myData <- data.frame(myDat)
length(myDat[,1])
train <- myDat[1:10,]
outOfSample <- myDat[11:16,]
outOfSample <- (cbind(outOfSample$v1,outOfSample$v2,outOfSample$v3))
outOfSample <-data.frame(outOfSample)
length(train[,1])
length(outOfSample[,1])
fit <- lda(train$c1~train$v1+train$v2+train$v3)
forecast <- predict(fit,outOfSample)$class
length(forecast) # I am expecting this to be same as length(outOfSample[,1]), which is 6

Output:
> length(forecast) # I am expecting this to be same as length(outOfSample[,1]), which is 6
[1] 10
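The point of the reply above is that a formula written as train$c1 ~ train$v1 + ... refers to variables literally named train$v1, which predict() cannot find in the new data frame, so it falls back to the training values. A sketch of the data= form on the built-in iris data (a stand-in, since the poster's file is not available); it assumes the MASS package, which normally ships with R, is installed.

```r
library(MASS)   # provides lda(); assumed to be installed

# Hypothetical train/test split of the built-in iris data.
train <- iris[c(1:40, 51:90, 101:140), ]
test  <- iris[c(41:50, 91:100, 141:150), ]

# Plain column names in the formula plus data=, so that predict() can
# look the same names up in newdata.
fit  <- lda(Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width,
            data = train)
pred <- predict(fit, newdata = test)$class

length(pred)   # one prediction per row of 'test'
```

Note that the new data frame must keep the original column names; rebuilding it with cbind(), as in the question, replaces them with X1, X2, X3.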
Re: [R] rbind to array members
Unfortunately, I can't read your examples. Can you repost without the formatting characters which are confusing when rendered in plain text?

One thing to keep in mind is that an R array is a regular object -- e.g., if you have a 2 x 3 x 4 array, then if you want to add a row to a slice, you must add a row to all the slices. When I read "to add a row in place to a single table of the 3 dimensional array" it sounds like you might be trying to do something that's not possible with R arrays. However, if I could see your examples, then I could probably give more help.

-- Tony Plate

Another Oneforyou wrote:
library(abind) ## array binding

I've looked into using abind(), but it seems I might not understand it properly. I can build my 2 table array, and insert a row into each table using:

        x <- array(0,c(1,3,2))
        x[,,1] <- c(1,2,3)
        x[,,2] <- c(7,8,9)

And I can use rbind() or abind() to add a row to the first table and assign it to a separate object.

        y <- rbind(x[,,1], c(4,5,6))
        z <- abind(x[,,1], c(4,5,6),along=0)

But I can't determine how to add a row in place to a single table of the 3 dimensional array. For example, this does not work:

        x[,,1] <- abind(x[,,1], c(4,5,6),along=0)
Error in x[, , 1] <- abind(x[, , 1], c(4, 5, 6), along = 0) :
  number of items to replace is not a multiple of replacement length

I would like 'x' to be:

x
, , 1

     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6

, , 2

     [,1] [,2] [,3]
[1,]    7    8    9

Thanks for any help.
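Because every slice of an array must have the same dimensions, one way to get "tables" with different numbers of rows is to hold them in a list of matrices instead. This is only a sketch of that alternative, using the numbers from the question; it is not what abind() does.

```r
# A list of matrices can be ragged: each element has its own dimensions.
x <- list(matrix(c(1, 2, 3), nrow = 1),
          matrix(c(7, 8, 9), nrow = 1))

# Add a row to the first table only.
x[[1]] <- rbind(x[[1]], c(4, 5, 6))

x
sapply(x, nrow)   # number of rows in each table
```

If a true 3-d array is required later, the shorter tables would first need to be padded (for example with NA rows) to a common shape.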
Re: [R] populating an array
R doesn't access arrays like C; use [i,j] to access a 2-d array, e.g.:

> my_array <- array(0,dim=c(2,2))
> for(i in seq(1,2,by=1)){
+   for(j in seq(1,2,by=1)){
+     my_array[i,j] = i+j
+   }
+ }
> my_array
     [,1] [,2]
[1,]    2    3
[2,]    3    4

tdm wrote:
Hi, Can someone please give me a pointer as to how I can set values of an array? Why does the code below not work?

> my_array <- array(dim=c(2,2))
> my_array[][] = 0
> my_array
     [,1] [,2]
[1,]    0    0
[2,]    0    0
> for(i in seq(1,2,by=1)){ for(j in seq(1,2,by=1)){ my_array[i][j] = 5 } }
Warning messages:
1: In my_array[i][j] = 5 : number of items to replace is not a multiple of replacement length
2: In my_array[i][j] = 5 : number of items to replace is not a multiple of replacement length
> my_array
     [,1] [,2]
[1,]    5    0
[2,]    5    0
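As background to the reply above: my_array[i] selects a single element by its linear (column-major) position, so a second [j] acts on that length-one vector rather than on a column. For simple fills the loop can usually be avoided altogether; a short sketch, with filled as a hypothetical name:

```r
# Same result as the double loop that sets my_array[i, j] = i + j.
my_array <- outer(1:2, 1:2, "+")
my_array

# Set every element of an existing array to one value, keeping its shape.
filled <- array(NA, dim = c(2, 2))
filled[] <- 5
filled
```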
Re: [R] Re use objects from within a custom made function
test$x doesn't evaluate the function, you want something like test(1,2)$x, e.g.: test <- function(i, j){ x <- i:j y <- i*j z <- i/j return(list(x=x,y=y,z=z)) } test(1,2)$x [1] 1 2 test(1,2)$y [1] 2 test(1,2)$z [1] 0.5 Or if you want to avoid evaluating your function multiple times: res <- test(1,2) res$x [1] 1 2 res$y [1] 2 res$z [1] 0.5 res $x [1] 1 2 $y [1] 2 $z [1] 0.5 Stropharia wrote: Hi everyone, i'm having a problem extracting objects out of functions i've created, so i can use them for further analysis. Here's a small example: # --- test <- function(i, j){ x <- i:j y <- i*j z <- i/j return(x,y,z) } # --- This returns the 3 objects as $x, $y and $z. I cannot, however, access these objects individually by typing for example: test$x I know i can do this by adding an extra arrow head to the assignment arrow (<<-), but I am sure that is not how it is done in some of the established R functions (like when calling lm$coef out of the lm function). Is there a simple command i've omitted from the function that allows access to objects inside it? Thanks in advance, Steve __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] help with the use of mtext to create main title over multiple plots
Try playing around with the "oma" setting in par() -- it sets the outer margins, which by default are zero. The following shows the mtext label for me, using the windows device: par(mfrow=c(2,2)) par("oma") [1] 0 0 0 0 par("oma"=c(0,0,2,0)) for (i in 1:4) plot(0:1,0:1) mtext(text = "my test plots", side = 3, outer = TRUE) Mark Kimpel wrote: I'm trying to use mtext to create a main title over multiple plots. Below is a simple self-contained example and my sessionInfo (I should note I've also tried this with R-2.8.1 with the same results). When I execute the code chunk below, I get the plots, but no title. I've tried this using the screen driver, pdf, and postscript. I've used different sizes of paper. I suspect I am making an elementary error but searching the help files and help archives hasn't provided me an answer. Thanks for any help, Mark # setwd("~/Desktop") pdf("my.test.plots.pdf", paper = "letter") par(mfrow=c(2,2)) for (i in 1:4){ plot(1:6, 1:6) } mtext(text = "my test plots", side = 3, outer = TRUE) dev.off() # R version 2.10.0 Under development (unstable) (2009-09-21 r49771) x86_64-unknown-linux-gnu locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8 [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] car_1.2-15 loaded via a namespace (and not attached): [1] tools_2.10.0 Mark W. Kimpel MD ** Neuroinformatics ** Dept. 
of Psychiatry Indiana University School of Medicine 15032 Hunter Court, Westfield, IN 46074 (317) 490-5129 Work, & Mobile & VoiceMail (317) 399-1219 Skype No Voicemail please [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
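Putting the oma suggestion from the reply together with the pdf() workflow in the question gives something like the sketch below. The temporary file name, the margin size of 2 lines and the line = 0.5 offset are arbitrary choices for illustration.

```r
out <- tempfile(fileext = ".pdf")   # hypothetical output file
pdf(out)

# Reserve 2 lines of outer margin at the top for the overall title.
op <- par(mfrow = c(2, 2), oma = c(0, 0, 2, 0))
for (i in 1:4) {
  plot(1:6, 1:6, main = paste("panel", i))
}

# outer = TRUE writes into the outer margin, above all four panels.
mtext("my test plots", side = 3, outer = TRUE, line = 0.5)

par(op)
dev.off()
```

par() has to be called after the device is opened, because graphical parameters belong to the current device.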
Re: [R] how to have 'match' ignore no-matches
> x <- data.frame(d=letters[1:3], e=letters[3:5])
> lookuptable <- c(a="aa", c="cc", e="ee")
> match.or.keep <- function(x, lookuptable) {if (is.factor(x)) x <- as.character(x); m <- match(x, names(lookuptable)); ifelse(is.na(m), x, lookuptable[m])}
> # to return a matrix
> apply(x, 2, match.or.keep, lookuptable=lookuptable)
     d    e
[1,] "aa" "cc"
[2,] "b"  "d"
[3,] "cc" "ee"
> # to return a data frame
> as.data.frame(lapply(x, match.or.keep, lookuptable=lookuptable))
   d  e
1 aa cc
2  b  d
3 cc ee

Jill Hollenbach wrote:
Let me clarify: I'm using this--

dfnew<- sapply(df, function(df) lookuptable[match(df, lookuptable [ ,1]), 2])

lookup
0101 01:01
0201 02:01
0301 03:01
0401 04:01

df
0101 0301
0201 0401
0101 0502

dfnew
01:01 03:01
02:01 04:01
01:01 NA

but what I want is:

dfnew2
01:01 03:01
02:01 04:01
01:01 0502

thanks again, Jill

Jill Hollenbach wrote:
Hi all, I think this is a very basic question, but I'm new to this so please bear with me. I'm using match to translate elements of a data frame using a lookup table. If the content of a particular cell is not found in the lookup table, the function returns NA. I'm wondering how I can just ignore those cells, and return the original contents if no match is found in the lookup table. Many thanks in advance, this site has been extremely helpful for me so far, Jill

Jill Hollenbach, PhD, MPH
Assistant Staff Scientist
Center for Genetics
Children's Hospital Oakland Research Institute
jhollenb...@chori.org
Re: [R] grep or other complex string matching approach to capture necessary information...
You could use grep, but it's probably easier to use %in% (see also is.element()), e.g.: house_info[ house_info[,1] %in% c("Water damage", "water pipes damaged", "leaking water"), ] water_evaluation.water_evaluation_selection. house_number 6 water pipes damaged 489 8 water pipes damaged 512 11 water pipes damaged 597 19 Water damage 478 21 water pipes damaged 373 23 Water damage 465 house_info[ house_info[,1] %in% c("Water damage", "water pipes damaged", "leaking water"), 2] [1] 489 512 597 478 373 465 337 362 234 535 551 351 415 495 220 216 317 443 346 577 585 268 463 441 225 200 304 486 390 476 485 247 [33] 399 504 262 551 575 359 538 sort(unique(house_info[ house_info[,1] %in% c("Water damage", "water pipes damaged", "leaking water"), 2])) [1] 200 216 220 225 234 247 262 268 304 317 337 346 351 359 362 373 390 399 415 441 443 463 465 476 478 485 486 489 495 504 512 535 [33] 538 551 575 577 585 597 Also, an easier way to generated random integers is sample(), e.g. sample(1:3, size=5, rep=T) [1] 3 1 2 1 1 (This is more straightforward, and more easily avoids possibly unintended errors such as floor(runif(100, 1,6) never generating a 6, but do be careful of the gotcha that sample(2:3, ...) will generate a selection of 2's and 3's, while sample(3,...) will generate samples from 1, 2, and 3.) -- Tony Plate Jason Rupert wrote: Say I have the following data: house_number<-floor(runif(100, 200, 600)) water_evaluation<-c("No water damage", "Water damage", "Water On", "Water off", "water pipes damaged", "leaking water") water_evaluation_selection<-floor(runif(100, 1,6)) house_info<-data.frame(water_evaluation[water_evaluation_selection], house_number) And, that I only want to pull out the ones with negative water evaluations, i.e. Water damage, water pipes damaged, and leaking water. Should/could I use grep in order to pull the house numbers out of house_info with those negative water evaluations? 
I guess I want to know the house numbers from house_info where the water evaluation is negative. Is there a way to use grep or another R function in order to acquire that information? Thank you again in advance for any insights. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
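The two points in the reply above (%in% for exact labels, sample() for random integers) can be combined in one self-contained sketch. The column name evaluation and the regular expression are illustrative assumptions; the grepl() variant is shown only to answer the "could I use grep" part and needs an explicit exclusion for "No water damage".

```r
water_evaluation <- c("No water damage", "Water damage", "Water On",
                      "Water off", "water pipes damaged", "leaking water")

set.seed(1)
# sample() draws from all six labels; floor(runif(100, 1, 6)) never gives 6.
idx <- sample(seq_along(water_evaluation), size = 100, replace = TRUE)
house_info <- data.frame(evaluation   = water_evaluation[idx],
                         house_number = sample(200:599, 100, replace = TRUE),
                         stringsAsFactors = FALSE)

# Exact matching against the known negative labels.
negative <- c("Water damage", "water pipes damaged", "leaking water")
bad <- house_info[house_info$evaluation %in% negative, ]
sort(unique(bad$house_number))

# Pattern matching with grepl(); the pattern also hits "No water damage",
# so that label has to be excluded explicitly.
hit  <- grepl("damage|leak", house_info$evaluation, ignore.case = TRUE)
bad2 <- house_info[hit & house_info$evaluation != "No water damage", ]
```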
Re: [R] Downloading currency data from from Yahoo
you simply had the wrong symbol. a search on yahoo.com suggests the symbol might be "GBP=x", getSymbols('GBP=x',src='yahoo') str(get("GBP=X")) An ‘xts’ object from 2007-01-03 to 2009-09-23 containing: Data: num [1:707, 1:6] 0.51 0.51 0.52 0.52 0.52 0.52 0.51 0.51 0.51 0.51 ... - attr(*, "dimnames")=List of 2 ..$ : NULL ..$ : chr [1:6] "GBP=X.Open" "GBP=X.High" "GBP=X.Low" "GBP=X.Close" ... Indexed by objects of class: [Date] TZ: GMT xts Attributes: List of 2 $ src: chr "yahoo" $ updated: POSIXct[1:1], format: "2009-09-24 19:58:01" -- Tony Plate Bogaso wrote: Hi, I wanted to download some currency data using "quantmod" package, however got following error : getSymbols('USD/GBP',src='yahoo') Error in download.file(paste(yahoo.URL, "s=", Symbols.name, "&a=", from.m, : cannot open URL 'http://chart.yahoo.com/table.csv?s=USD/GBP&a=0&b=01&c=2007&d=8&e=24&f=2009&g=d&q=q&y=0&z=USD/GBP&x=.csv' In addition: Warning message: In download.file(paste(yahoo.URL, "s=", Symbols.name, "&a=", from.m, : cannot open: HTTP status was '404 Not Found' Can anyone please tell me how to get rid of that? Your help will be highly appreciated. Thanks __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How can I get "predict.lm" results with manual calculations ? (a floating point problem)
Results can be slightly different when matrix algebra routines are called. Here's your example again. When the prediction is computed directly using matrix multiplication, the result is the same as 'predict' produces (at least in this case.) set.seed(1) n <- 100 x <- rnorm(n) y <- rnorm(n) aa <- lm(y ~ x) all.equal(as.numeric(predict(aa, new)), as.numeric(aa$coef[1] + aa$coef[2] * new$x), tol=0) [1] "Mean relative difference: 1.840916e-16" all.equal(as.numeric(predict(aa, new)), as.numeric(cbind(1, new$x) %*% aa$coef), tol=0) [1] TRUE These types of small differences are often not indicative of lower precision in one method, but rather just random floating-point inaccuracies that can depend on things like the order numbers are summed in (e.g., ((bigNegNum + bigPosNum) + smallPosNum) will often be slightly different to ((bigPosNum + smallPosNum) + bigNegNum). They can also depend on whether intermediate results are kept in CPU registers, which sometimes have higher precision than 64 bits. Usually, they're nothing to worry about, which is one of the major reasons that all.equal() has a non-zero default for the tol= argument. -- Tony Plate Tal Galili wrote: Hello dear r-help group I am turning for you for help with FAQ number 7.31: "Why doesn't R think these numbers are equal?" http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f *My story* is this: I wish to run many lm predictions and need to have them run fast. Using predict.lm is relatively slow, so I tried having it run faster by doing the prediction calculations manually. But doing that gave me problematic results (I won't go into the details of how I found that out). 
I then discovered that the problem was that the manual calculations I used for the lm predictions yielded different results than that of predict.lm, *here is an example*: predict.lm.diff.from.manual.compute <- function(sample.size = 100) { x <- rnorm(sample.size) y <- x + rnorm(sample.size) new <- data.frame(x = seq(-3, 3, length.out = sample.size)) aa <- lm(y ~ x) predict.lm.result <- sum(predict(aa, new, se.fit = F)) manual.lm.compute.result <- sum(aa$coeff[1]+ new * aa$coeff[2]) # manual.lm.compute.result == predict.lm.result return(all.equal(manual.lm.compute.result , predict.lm.result, tol=0)) } # and here are the results of running the code several times: predict.lm.diff.from.manual.compute(100) [1] "Mean relative difference: 1.046407e-15" predict.lm.diff.from.manual.compute(1000) [1] "Mean relative difference: 4.113951e-16" predict.lm.diff.from.manual.compute(1) [1] "Mean relative difference: 2.047455e-14" predict.lm.diff.from.manual.compute(10) [1] "Mean relative difference: 1.294251e-14" predict.lm.diff.from.manual.compute(100) [1] "Mean relative difference: 5.508314e-13" And that leaves me with *the question*: Can I reproduce more accurate results from the manual calculations (as the ones I might have gotten from predict.lm) ? Maybe some parameter to increase the precision of the computation ? Many thanks, Tal __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
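In practice the usual remedy is the one hinted at in the reply: compare with a tolerance instead of asking for bit-identical results. A self-contained sketch along the lines of the example in the question; p_predict and p_manual are hypothetical names, and the 1e-10 bound is an arbitrary illustration.

```r
set.seed(1)
n   <- 100
x   <- rnorm(n)
y   <- x + rnorm(n)
new <- data.frame(x = seq(-3, 3, length.out = n))

fit <- lm(y ~ x)

# Prediction via predict() and via an explicit design-matrix product.
p_predict <- as.numeric(predict(fit, new))
p_manual  <- as.numeric(cbind(1, new$x) %*% coef(fit))

# all.equal() with its default tolerance treats rounding noise as equality.
isTRUE(all.equal(p_predict, p_manual))

# The largest absolute discrepancy, for a sense of scale.
max(abs(p_predict - p_manual))
```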
Re: [R] un run run...
You could try setting options(error=function() NULL). This should cause R in batch mode to continue running after an error (the same way it does in interactive mode).

-- Tony Plate

Nir Shachaf wrote:
Hi All, I am running an Rscript with a bunch of algorithms that are UNSTABLE under some parameter settings. At a certain point one of them sends an error message and my whole run STOPS! What I would like is to save the error message in some file or variable and carry on to the next command line without stopping this run... Any help or ideas would be welcome, please, with a concrete example (not just "have you thought of using 'tryCatch'" etc.). Thanks all! Nir
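Since the question asks for a concrete example, here is one possible sketch using tryCatch(), which is a different technique from the options(error=) setting suggested above. run_step and params are made-up stand-ins for the unstable algorithm and its settings.

```r
# Hypothetical unstable step: fails for negative parameters.
run_step <- function(p) {
  if (p < 0) stop("unstable for p = ", p)
  sqrt(p)
}

params <- c(4, -1, 9)

# Run every setting; on error, keep the condition object instead of stopping.
out <- lapply(params, function(p) {
  tryCatch(run_step(p), error = function(e) e)
})

# Which runs failed, and with what message?
failed <- vapply(out, inherits, logical(1), what = "error")
msgs   <- vapply(out[failed], conditionMessage, character(1))

# Save the messages to a file (a temporary file here) and carry on.
log_file <- tempfile(fileext = ".log")
writeLines(msgs, log_file)
```

The successful results remain in out[!failed], so later code can continue with whatever did work.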
Re: [R] A question on operation on list
Here's one way:

> set.seed(1)
> x1 <- lapply(1:5, function(i) rnorm(2))
> x2 <- lapply(x1, function(x) outer(x, x))
> Reduce("+", x2, 0)
          [,1]      [,2]
[1,]  1.768406 -1.534413
[2,] -1.534413  3.890200

and another way using the abind package:

> library(abind)
> dim(abind(along=0, x2))
[1] 5 2 2
> colSums(abind(along=0, x2))
          [,1]      [,2]
[1,]  1.768406 -1.534413
[2,] -1.534413  3.890200

-- Tony Plate

megh wrote:
Hi, I have created a list object like that :

x = vector("list")
for (i in 1:5) x[[i]] = rnorm(2)
x

Now I want to do two things :
1. for each i, I want to do following matrix calculation : t(x[[i]]) %*% x[[i]] i.e. for each i, I want to get a 2x2 matrix
2. Next I want to get x[[1]] + x[[2]] +

I did following :

res=vector("list"); res = sapply(x, function(i) t(x[[i]]) %*% x[[i]])

However above syntax is not giving desired result. Any suggestion please?
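Two side notes on the exchange above, as a sketch. For a plain vector v, t(v) %*% v is the 1x1 inner product, while the 2x2 matrix asked for is the outer product v %*% t(v). And the sum of those outer products equals crossprod(X) when the vectors are stacked as the rows of a matrix X, which avoids building the intermediate list. The names s1, s2 and X are hypothetical.

```r
set.seed(1)
x1 <- lapply(1:5, function(i) rnorm(2))

# Sum of the 2x2 outer products, as in the reply.
s1 <- Reduce("+", lapply(x1, function(v) outer(v, v)))

# Same quantity in one step: stack the vectors as rows, then t(X) %*% X.
X  <- do.call(rbind, x1)
s2 <- crossprod(X)

all.equal(s1, s2)
```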
Re: [R] Checking a (new) package - examples require other package functions
Did you try putting library("your-package-name", char=TRUE) at the start of the examples?

-- Tony Plate

Rebecca Sela wrote:
I am creating an R package. I ran R CMD check on the package, and everything passed until it tried to run the examples. Then, the result was:

* checking examples ... ERROR
Running examples in REEMtree-Ex.R failed.
The error most likely occurred in:

### * AutoCorrelationLRtest

flush(stderr()); flush(stdout())

### Name: AutoCorrelationLRtest
### Title: Test for autocorrelation in the residuals of a RE-EM tree
### Aliases: AutoCorrelationLRtest
### Keywords: htest tree models

### ** Examples

# Estimation without autocorrelation
simpleEMresult<-RandomEffectsTree(Y~D+t+X, data=simpleREEMdata, random=~1|ID, simpleREEMdata$ID)
Error: couldn't find function "RandomEffectsTree"
Execution halted

The function "RandomEffectsTree" is defined in the R code for the package. How can I refer to other functions from the package in examples? (I have the "Writing R-extensions" PDF, so it would be enough to point me to the right page, if the answer is in there and I just missed it.) Thanks! Rebecca
Re: [R] Finding rows common to two datasets
I think merge() can do what's wanted, but you do have to be careful that values match exactly. Here's an example where two data frames print the same in a row for columns 'a' and 'b', but are not exactly same. merge() returns zero rows. This problem can be fixed in this case by rounding, but that's not a good general solution because very close numbers can round to different numbers, e.g., 1.499 and 1.501. Here are examples: x <- data.frame(a=c(1.001,2), b=c(3,4), c=LETTERS[1:2]) y <- data.frame(a=c(1,2), b=c(3,5), c=LETTERS[3:4]) x a b c 1 1 3 A 2 2 4 B y a b c 1 1 3 C 2 2 5 D # x[1,"a"] and y[1,"a"] look the same, but are very slightly different merge(x, y, by=c("a", "b")) [1] a b c.x c.y <0 rows> (or 0-length row.names) # make x1 a version of x where the values are rounded to whole numbers x1 <- x x1$a <- round(x1$a) merge(x1, y, by=c("a", "b")) a b c.x c.y 1 1 3 A C # intersect() returns columns that are the same in each dataframe, not rows intersect(x, y) c 1 C 2 D intersect(x1, y) a c 1 1 C 2 2 D -- Tony Plate jim holtman wrote: You are missing a comma: common <- intersect(data_frame_x[,c("Latitude", "Longitude")], data_frame_y[,c("Latitude","Longitude")]) On Tue, Apr 28, 2009 at 5:49 AM, Steve Murray wrote: Thanks for the reply, however, when I do the following command, I receive the message: 'data frame with 0 columns and 0 rows'. I've checked again though, and there should be several thousand rows where the Latitude and Longitude pairs are the same. common <- intersect(data_frame_x[c("Latitude", "Longitude")], data_frame_y[c("Latitude","Longitude")]) common data frame with 0 columns and 0 rows Is there an obvious solution to this? Should I be using 'unique' instead, and if so, how would I get the above to correspond to this command? Thanks, Steve Date: Tue, 28 Apr 2009 13:36:51 +0530 Subject: Re: [R] Finding rows common to two datasets From: umesh.sriniva...@gmail.com To: smurray...@hotmail.com CC: r-help@r-project.org Dear Steve, Try ? 
intersect and see if that might help. Cheers, Umesh On Tue, Apr 28, 2009 at 1:29 PM, Steve Murray> wrote: Dear all, I have 2 data frames, both with 14 columns of data and differing numbers of rows. The first two columns are 'Latitude' and 'Longitude'. I want to find the pairs of Latitude and Longitude coordinates which are common to both datasets, and output a new data frame which is composed of these coincident rows. I tried using the 'unique' command, but had difficulties interpreting the help file. Many thanks for any help offered, Steve __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
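For the original question (rows whose Latitude and Longitude pair occurs in both data frames), the merge() approach from the reply above looks roughly like this. The data are made up for illustration; the value_x and value_y columns stand in for the remaining twelve columns.

```r
# Hypothetical stand-ins for the two data frames.
data_frame_x <- data.frame(Latitude  = c(10.5, 20.25, 30),
                           Longitude = c(100, 110.5, 120),
                           value_x   = 1:3)
data_frame_y <- data.frame(Latitude  = c(20.25, 30, 40),
                           Longitude = c(110.5, 125, 130),
                           value_y   = 4:6)

# Inner join on both key columns: only coordinate PAIRS present in both
# data frames survive, with the other columns of each side attached.
common <- merge(data_frame_x, data_frame_y, by = c("Latitude", "Longitude"))
common
```

Latitude 30 appears in both inputs but with different longitudes, so it is correctly dropped; as the reply notes, coordinates that only look equal when printed will not match either.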
Re: [R] Puzzled by an error with apply()
One simple explanation for the error message you received is that you have a typo: 'lapply' instead of 'apply':

> x <- matrix(1:6, 2)
> apply(x, 1, sum)
[1]  9 12
> lapply(x, 1, sum)
Error in match.fun(FUN) : '1' is not a function, character or symbol

However, it's difficult to diagnose without at least seeing some cut'n'pasted transcripts. The output of traceback() would also be useful.

-- Tony Plate

Gang Chen wrote:

I've written a function, myFunc, that works fine with myFunc(data, ...), but when I use apply() to run it with an array of data

apply(myArray, 1, myFunc, ...)

I get a strange error:

Error in match.fun(FUN) : '1' is not a function, character or symbol

which really puzzles me because '1' is meant to be the margin of the array I want to apply over, so how come apply() treats it as a function? I have been successfully using apply() for a while, so I must have made a stupid mistake this time. Hopefully somebody can point out something obviously wrong without me providing any details of the function.

TIA, Gang
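To make the difference in argument order concrete, here is a small sketch. The function my_func and its argument k are hypothetical stand-ins for the poster's myFunc, which was never shown.

```r
m <- matrix(1:6, nrow = 2)

# Hypothetical stand-in for myFunc: scales the sum of a row by 'k'
my_func <- function(v, k = 1) k * sum(v)

# apply(X, MARGIN, FUN, ...): extra arguments are passed on to FUN
row_totals <- apply(m, 1, my_func, k = 10)

# lapply(X, FUN, ...) has no MARGIN argument, so the 1 ends up as FUN
# and match.fun() rejects it; the error is caught here to keep going
err <- tryCatch(lapply(m, 1, my_func),
                error = function(e) conditionMessage(e))
```

Here row_totals is 90 and 120 (ten times the row sums 9 and 12), and err holds the text of the match.fun() error.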
Re: [R] sorting by creation time in ls()
R doesn't keep track of when objects were created, so that's not possible. If this is important to you, you could look at the 'trackObjs' package, which does this and also stores individual objects in individual files (and writes them to the file when they are changed in R).

-- Tony Plate

Alexy Khrabrov wrote:

When trying to remember what I did in the session, especially after coming back to it after a few days, I'd like to mimic Unix's ls -ltrh -- does R retain the time at which a certain variable is created? If not, would it make a useful addition, to have ls with an option to sort by creation time?

Cheers, Alexy
Re: [R] Strange behaviour of ISOdatetime
Have you checked that that time exists in the time zone you are using? From ?ISOdatetime:

Note ... Remember that in most timezones some times do not occur and some occur twice because of transitions to/from summer time. What happens in those cases is OS-specific.

You could try working out what your system is using as the transition to/from summer time. (If you need to generate times that are 2 hours after midnight, try using ISOdatetime to generate the midnight times and add 2 hours.) On my system, all this works fine:

> ISOdatetime(1995,03,26,2,0,0)
[1] "1995-03-26 02:00:00 MST"
> ISOdatetime(1995,03,26,0,0,0) + 2 * 60 * 60
[1] "1995-03-26 02:00:00 MST"
> sessionInfo()
R version 2.8.1 (2008-12-22)
i386-pc-mingw32

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] TimeWarp_0.7    abind_1.2-0     trackObjs_0.8-0 tap.misc_1.0
[5] bmc.misc_1.0    RtTests_0.1-5

-- Tony Plate

Pedro de Barros wrote:

Hi All, I am seeing a strange behaviour of ISOdatetime. On my work computer, I get NA when I try to do

> ISOdatetime(1995,03,26,2,0,0)
[1] NA

But other dates and/or times (hours) work OK

> ISOdatetime(1995,03,25,2,0,0)
[1] "1995-03-25 02:00:00 GMT"

On my home computer, I do not have this problem. I am running the same version of R (2.8.1 patched) on both machines, the same version of Gnu Emacs (22.3.1) and the same version of ESS (5.3.10). Both are running Windows XP. Has anyone experienced this before?

Pedro
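One way to sidestep summer-time gaps, if the times do not have to be in local time, is to build them in UTC. This is only a sketch under that assumption; if local clock times are really what is needed, the transition rules of the local time zone still apply.

```r
# Build the time in UTC, where no summer-time transitions occur
t_utc <- ISOdatetime(1995, 3, 26, 2, 0, 0, tz = "UTC")

# Or, as suggested in the reply: midnight plus two hours (in seconds)
t_add <- ISOdatetime(1995, 3, 26, 0, 0, 0, tz = "UTC") + 2 * 60 * 60

# Format explicitly in UTC so the printed value does not depend on the machine
stamp <- format(t_utc, "%Y-%m-%d %H:%M:%S", tz = "UTC")
```

Both constructions give the same instant, and neither returns NA because UTC has no skipped hours.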
Re: [R] abind
It looks like you are trying to construct a ragged array, where the extent of the dimensions varies. However, in R, ordinary arrays have a regular structure, e.g., the rows of a matrix always have the same number of columns. This is the kind of object abind() constructs. So, to bind your two matrices together, abind() requires that their dimensions match:

> a <- matrix(1:6, ncol=3, byrow=T)
> b <- matrix(7:10, ncol=2, byrow=T)
> a
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
> b
     [,1] [,2]
[1,]    7    8
[2,]    9   10
> library(abind)
> abind(a, b, along=3)
Error in abind(a, b, along = 3) :
  arg 'X2' has dims=2, 2, 1; but need dims=2, 3, X

The only way to get abind to bind these together on the third dimension is to pad out the smaller matrix with NA's, e.g.:

> abind(a, cbind(b, NA), along=3)
, , 1

     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6

, , 2

     [,1] [,2] [,3]
[1,]    7    8   NA
[2,]    9   10   NA

('a' and 'b' do match on the number of rows, so you can bind them together as columns as does cbind(), e.g.:

> abind(a, b, along=2)
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    2    3    7    8
[2,]    4    5    6    9   10
)

If none of this is what you want, you could consider storing the matrices in a list, as another poster suggested.

-- Tony Plate

Suyan Tian wrote:

I am trying to combine two arrays with different dimensions into one. For example

The first one is
1 2 3
4 5 6

The second one is
7 8
9 10

The result would be like

, , 1
1 2 3
4 5 6

, , 2
7 8
9 10

I used abind to do this, but failed. Could somebody please let me know how to do this in R? Thanks so much.

Suyan
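The two alternatives mentioned in the reply (a list, or padding with NA) can also be sketched in base R without the abind package. The names ragged, b_pad and stacked are illustrative only; array() is used here in place of abind(), so this is not the abind method itself.

```r
a <- matrix(1:6, ncol = 3, byrow = TRUE)
b <- matrix(7:10, ncol = 2, byrow = TRUE)

# Option 1: keep the differently shaped matrices in a list
ragged <- list(a, b)

# Option 2: pad 'b' with a column of NA so both are 2 x 3,
# then stack them along a third dimension
b_pad   <- cbind(b, NA)
stacked <- array(c(a, b_pad), dim = c(nrow(a), ncol(a), 2))
```

With the list, ragged[[2]] keeps its original 2 x 2 shape; with the padded array, stacked[, , 2] is 2 x 3 with NA in the last column.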
Re: [R] name returned by lapply
If you return the value as a named list, you get your answer using unlist(res, recursive=F):

> res <- lapply(1:2, function(i) {val <- list(i); names(val) <- paste("Hugo", i, sep="_"); return(val)})
> unlist(res, rec=F)
$Hugo_1
[1] 1

$Hugo_2
[1] 2

>

Antje wrote:

Oh true, this would solve the problem too :-) Thanks a lot for the suggestions!

Antje

Martin Morgan wrote:

Antje <[EMAIL PROTECTED]> writes:

Thanks a lot for your help! I know that I cannot directly access the list created, I just was not sure if there is any "format" of the return value which could additionally provide a name for the returned list. I tried to return the values as a list with the appropriate name but then I end up with a list entry as list entry... Okay, then I'll solve it with a loop and thanks for the hint with the article

maybe this:

res <- lapply(1:5, function(i) list(key=paste("Hugo", i, sep="_"), val=i))
val <- lapply(res, "[[", "val")
names(val) <- lapply(res, "[[", "key")
val
$Hugo_1
[1] 1

$Hugo_2
[1] 2

$Hugo_3
[1] 3

$Hugo_4
[1] 4

$Hugo_5
[1] 5

Martin

Ciao, Antje

Gavin Simpson wrote:

On Fri, 2008-07-18 at 14:19 +0200, Antje wrote:

Hi Gavin, thanks a lot for your answer. Maybe I did not explain very well what I want to do and probably chose a bad example. I don't mind spaces or names starting with a number. I could even name it: "Hugo1", "Hugo2", ... My biggest problem is that not only the values are calculated/estimated within my function but also the names (yes, in reality my function is more complicated). Maybe it's easier to explain like this: the parameter x can be a coordinate position of mountains on earth. Within the function the height of the mountain is estimated, and its name. In the end, I'd like to get a list where the entry is named like the mountain and it contains its height (or other measurements...)

## now that we have a list, we change the names to what you want
names(ret) <- paste(1:10, "info_within_function")

so this would not work, because I don't have the information anymore about the naming...

OK, so you can't do what you want to do in the manner you tried, via lapply, as you don't have control of how the list is produced once the loop over 1:10 has been performed. At the stage that 'test' is being applied, all it knows about is 'x' and it doesn't have access to the list being built up by lapply(). The *apply family of functions help us to *not* write out formal loops in R, but here this is causing you a problem. So we can specify an explicit loop and fill in information as and when we want from within the loop:

## create list to hold results
n <- 10
ret <- vector(mode = "list", length = n)
## initialise loop
for(i in seq_len(n)) {
    ## do whatever you need to do here, but this line just
    ## replicates what 'test' did earlier
    ret[[i]] <- c(1,2,3,4,5)
    ## now add the name in
    names(ret)[i] <- paste("Mountain", i, sep = "")
}
ret

Alternatively, collect a vector of names during the loop and then once the loop is finished do a single call to names(ret) to replace all the names at once:

n <- 10
ret <- vector(mode = "list", length = n)
## new vector to hold vector of names
name.vec <- character(n)
for(i in seq_len(n)) {
    ret[[i]] <- c(1,2,3,4,5)
    ## now we just fill in this vector as we go
    name.vec[i] <- paste("Mountain", i, sep = "")
}
## now replace all the names at once
names(ret) <- name.vec
ret

This latter version is likely to be more efficient if n is big, so you don't incur the overhead of the repeated calls to names().

The moral of the story is to not jump to using *apply all the time to avoid loops. Loops in R are just fine, so use the tool that helps you do the job most efficiently *and* most transparently.

Take a look at the R Help Desk article by Uwe Ligges and John Fox in the current issue of RNews: http://www.r-project.org/doc/Rnews/Rnews_2008-1.pdf which goes into this in much more detail.

HTH

G
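The "return the name together with the value" idea from this thread can be condensed with setNames(). The worker function describe() and its made-up heights are hypothetical; the point is only that each call returns both a computed name and a computed value, and the names are attached afterwards.

```r
# Hypothetical worker: computes both a name and a value for input i
describe <- function(i) {
  list(name = paste("Mountain", i, sep = "_"), height = i * 1000)
}

res <- lapply(1:3, describe)

# Extract the values, then name them with the names computed alongside;
# vapply() checks that every name is a single character string
heights <- setNames(lapply(res, `[[`, "height"),
                    vapply(res, `[[`, character(1), "name"))
```

After this, heights$Mountain_2 is 2000, and the explicit loop shown above remains an equally valid choice.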
Re: [R] automation of R? running an R script at a certain time each night?
Felipe Carrillo wrote:

Hi: Edward, were you able to automate the process? If so, do you mind giving me a hint on how you did it? I am facing the same problem. I created a batch file which runs fine using the task scheduler, but it only opens Tinn-R and R and doesn't execute my script. My task scheduler executes every day at 8:00 am. This is my batch file:

@echo off
Start "" "C:\Program Files\R\R-2.7.0\bin\Rgui.exe"
start "" "C:\Documents and settings\Desktop\Software\MyScript.r"

What am I missing?

Looks to me that the batch file above starts up the R GUI, and opens an editor, so it sounds like it is doing exactly what you're asking it to. Here's how I can get R to run as a scheduled task in Windows XP: Create a new "Scheduled Task" with the following in the "Run:" box:

C:\Rbuild\R-2.6.2\bin\Rcmd.exe BATCH c:/Temp/Rtest.R

Then schedule it how I want it to run. Here's the contents of my .R file and the output:

$ cat c:/Temp/Rtest.R
cat(file="c:/Temp/Rtest.txt", date(), "\n", append=TRUE)
$ cat c:/Temp/Rtest.txt
Fri May 23 10:34:22 2008
Fri May 23 10:34:50 2008
Fri May 23 10:40:01 2008

(I'm sure there are dozens of other ways of doing this, but the above worked for me.)

-- Tony Plate

--- Edward Wijaya <[EMAIL PROTECTED]> wrote:

You might try cron job under Windows. http://drupal.org/node/31506 HTH. - Edward

On Thu, May 22, 2008 at 8:51 AM, Thomas Pujol <[EMAIL PROTECTED]> wrote:

I am using R in a Windows environment. I store my data in a Microsoft SQL database that gets updated automatically nightly. Once my SQL db is updated, I wish to automatically run an R "script". Any tips on "good" ways to approach this task? Is there an easy way to "launch" an R script using the Windows or other "scheduler"? Can I have an R script run every hour, and within the script check to determine if my database is updated, and proceed only if it is updated?
Felipe D. Carrillo
Fishery Biologist
Department of the Interior
US Fish & Wildlife Service
California, USA
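For the last question in this thread (run every hour, but proceed only if the data were updated), one possible pattern is sketched below. How to detect a database update was not described in the thread, so the check is replaced here by a purely hypothetical stand-in: a marker file whose modification time is compared with a stamp file written after each successful run. The function names needs_run() and run_if_updated() are illustrative.

```r
# Hypothetical stand-in for "has the database been updated?": compare the
# modification time of a marker file with that of a stamp file that is
# rewritten after every successful run.
needs_run <- function(marker, stamp) {
  if (!file.exists(marker)) return(FALSE)   # nothing to process yet
  if (!file.exists(stamp))  return(TRUE)    # never run before
  file.mtime(marker) > file.mtime(stamp)
}

run_if_updated <- function(marker, stamp, job) {
  if (!needs_run(marker, stamp)) return(invisible(FALSE))
  job()                                     # the real work goes here
  cat(date(), "\n", file = stamp)           # record this run
  invisible(TRUE)
}
```

A script containing a call such as run_if_updated(marker, stamp, my_job) could then be scheduled hourly in the way described in the reply; in practice the marker check would be replaced by a query against the database.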
Re: [R] R is a virus, spyware or malware (gasp!)
And the installed Rgui.exe in 2.6.2 is 10240 bytes, which is their "most common file size." They also say (fill in your own comments...):

> RGUI.EXE has been seen to perform the following behavior(s):
>
> * Can communicate with other computer systems using HTTP protocols
> * This Process Creates Other Processes On Disk
> * This Process Deletes Other Processes From Disk
> * Executes a Process
> * The Process is packed and/or encrypted using a software packing process
> * Accesses the MS Outlook Address Book

and

> What you should do about RGUI.EXE:
>
> Check Your PC Now
> The most common objects with the name of RGUI.EXE have yet to be classified as safe by our research department.

So, I suspect that they don't actually classify it as a virus, they just haven't classified it as safe... (I couldn't see any email contact address on their web page.)

-- Tony Plate

Peter Dalgaard wrote:

Ioannis Dimakos wrote:

Of course it's a virus. Once you catch the virus, you feel this rush, this fever to abandon all other statistical packages. On a more serious note, though, the size of the RGUI.exe file as reported in the webpage is nowhere near the actual size of the R distribution exe pack.

Hmm, the installed Rgui.exe in 2.7.0rc (which is all I have, under Wine on a Fedora machine) appears to be 27648 bytes, which is one of the cited numbers. Presumably something needs to be done...

I

On Sun, May 18, 2008 17:30, Philippe Grosjean wrote:

After a search session in Google, I found this page: http://www.prevx.com/filenames/X1993788672854780728-0/RGUI.EXE.html which classifies Rgui.exe (clearly stated as "R for Windows GUI front-end") in a database of virus, spyware and malware! No comments!

Philippe Grosjean
Re: [R] Accessing items in a list of lists
Try this:

> data1 <- list(a = 1, b = 2, c = 3)
> data2 <- list(a = 4, b = 5, c = 6)
> data3 <- list(a = 3, b = 6, c = 9)
> comb <- list(data1 = data1, data2 = data2, data3 = data3)
> sapply(comb, "[[", "a")
data1 data2 data3
    1     4     3
> # Also, this can be useful:
> comb[[c("data2", "b")]]
[1] 5
>

[EMAIL PROTECTED] wrote:

Using R 2.6.2, say I have the following list of lists, "comb":

data1 <- list(a = 1, b = 2, c = 3)
data2 <- list(a = 4, b = 5, c = 6)
data3 <- list(a = 3, b = 6, c = 9)
comb <- list(data1 = data1, data2 = data2, data3 = data3)

So that all names for the lowest level list are common. How can I most efficiently access all of the sublist items "a" indexed by the outer list names? For example, I can loop through comb[[i]], unlisting as I go, and then look up the field "a", as below, but there has got to be a cleaner way.

finaldata <- double(0)
for(i in 1:length(names(comb))) {
    test <- unlist(comb[[i]])
    finaldata <- c(finaldata, test[which(names(test) == "a")])
}
data.frame(names(comb), finaldata)

Gives what I want:

  names.comb. finaldata
1       data1         1
2       data2         4
3       data3         3

Any help you can give would be greatly appreciated. Thanks.
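To get all the way to the data frame the poster asked for, the sapply() idea can be combined with data.frame(). The sketch below uses vapply() instead of sapply(), which is a substitution on my part: it behaves the same here but stops with an error if any element "a" is not a single number. The column names name and a are illustrative.

```r
comb <- list(data1 = list(a = 1, b = 2, c = 3),
             data2 = list(a = 4, b = 5, c = 6),
             data3 = list(a = 3, b = 6, c = 9))

# One value of 'a' per outer list element, named by the outer names
a_vals <- vapply(comb, function(el) el[["a"]], numeric(1))

# Outer names and extracted values side by side
finaldata <- data.frame(name = names(comb), a = a_vals, row.names = NULL)
```

The result has three rows, one per outer list element, with a equal to 1, 4 and 3.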
Re: [R] Max consecutive increase in sequence
If the increases or decreases could be any size, rle(sign(diff(x))) could do it:

> x <- c(1, 2, 3, 4, 4, 4, 5, 6, 5, 4, 3, 2, 1, 1, 1, 1, 1)
> r <- rle(sign(diff(x)))
> r
Run Length Encoding
  lengths: int [1:5] 3 2 2 5 4
  values : num [1:5] 1 0 1 -1 0
> i1 <- which(r$lengths==max(r$lengths[r$values==1]) & r$values==1)[1]
> i2 <- which(r$lengths==max(r$lengths[r$values==-1]) & r$values==-1)[1]
> i1
[1] 1
> i2
[1] 4
> rbind(up=c(start=cumsum(c(1, r$lengths))[i1], len=r$lengths[i1]),
+       down=c(start=cumsum(c(1, r$lengths))[i2], len=r$lengths[i2]))
     start len
up       1   3
down     8   5
>

Ingmar Visser wrote:

rle(diff(sq)) could be helpful here,

best, Ingmar

On May 13, 2008, at 11:19 PM, Marko Milicic wrote:

Hi all R helpers, I'm trying to come up with a nice and elegant way of "detecting" consecutive increases/decreases in a sequence of numbers. I'm trying with a combination of the which() and diff() functions, but unsuccessfully. For example:

sq <- c(1, 2, 3, 4, 4, 4, 5, 6, 5, 4, 3, 2, 1, 1, 1, 1, 1);

I'd like to find a way to calculate
a) maximum consecutive increase = 3 (from 1 to 4)
b) maximum consecutive decrease = 5 (from 6 to 1)

All ideas are highly welcomed!
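The rle(sign(diff(x))) steps above can be wrapped in a small helper. The function name longest_run() and its return format are my own choices, and returning NA with length 0 when there is no run in the requested direction is an added assumption, not something discussed in the thread.

```r
# Longest run of consecutive increases (direction = 1) or decreases (-1).
# Returns the start position in x and the number of steps in the run.
longest_run <- function(x, direction = 1) {
  r <- rle(sign(diff(x)))
  hits <- which(r$values == direction)
  if (length(hits) == 0) return(c(start = NA, len = 0))
  best   <- hits[which.max(r$lengths[hits])]   # first of the longest runs
  starts <- cumsum(c(1, r$lengths))            # start index of every run
  c(start = starts[best], len = r$lengths[best])
}

sq   <- c(1, 2, 3, 4, 4, 4, 5, 6, 5, 4, 3, 2, 1, 1, 1, 1, 1)
up   <- longest_run(sq, 1)    # start 1, len 3
down <- longest_run(sq, -1)   # start 8, len 5
```

These values agree with the 'up' and 'down' rows of the rbind() output in the reply.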
Re: [R] test
You probably should check this section in your R-help subscription options (via https://stat.ethz.ch/mailman/options/r-help/, I think):

Receive your own posts to the list? Ordinarily, you will get a copy of every message you post to the list. If you don't want to receive this copy, set this option to No.

I see 5 identical posts with the subject "Several questions about MCMClogit" on R-help recently.

-- Tony Plate

j t wrote:

Sorry to bother you. I have tried to post my question more than 10 times, but I still don't see it. It drives me crazy!!! This is a test for posting some simple pure text.

Chao
Re: [R] Format integer
Try something like one of these (as documented in ?formatC):

> formatC(13, flag="0", width=10)
[1] "0000000013"
> sprintf("%010g", 13)
[1] "0000000013"
>

Anh Tran wrote:

Hi, What's one way to convert an integer to a string with preceding 0's, such that '13' becomes '013', to be put into a string? I've tried formatC, but it removes all the zeros and replaces them with blanks.

Thanks
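For the specific case in the question (a width of 3, so that 13 becomes "013"), the same two functions can be used with a smaller width. The vector ids is made-up example data.

```r
ids <- c(7, 13, 250)

# sprintf() with an integer format: pad with zeros to width 3
padded <- sprintf("%03d", as.integer(ids))

# formatC() equivalent: flag = "0" pads with zeros instead of blanks
padded_fc <- formatC(ids, width = 3, flag = "0", format = "d")
```

Both give "007" "013" "250"; numbers with more than three digits are not truncated, they simply come out wider.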
Re: [R] Replicating Rows
Another way is using straightforward indexing:

> x <- cbind(trips=c(1,3,2), y=1:3, z=4:6)
> x
     trips y z
[1,]     1 1 4
[2,]     3 2 5
[3,]     2 3 6
> # generate row indices with the appropriate
> # number of repeats
> ii <- rep(seq(len=nrow(x)), x[,1])
> ii
[1] 1 2 2 2 3 3
> # use these indices to select data rows
> x[ii, -1]
     y z
[1,] 1 4
[2,] 2 5
[3,] 2 5
[4,] 2 5
[5,] 3 6
[6,] 3 6
>

Jorge Ivan Velez wrote:

Hi Marion, Try this:

set.seed(123)
mydf=data.frame(trips=rpois(10,5), matrix(rnorm(10*5),ncol=5))
mydf
sapply(mydf[,-1],rep,mydf[,1])

HTH, Jorge

On Wed, May 7, 2008 at 11:41 PM, <[EMAIL PROTECTED]> wrote:

Hi, I have a data matrix in which there are 1000 rows by 30 columns. The first column of each row is a numeric indicating the number of trips taken to a particular location, with location attributes in the following column entries for that row. I want to repeat each row based on the number of trips taken (as indicated by the number in the first column), i.e., if 1,1 indicates 4 trips, I want to replicate row 1 four times, and do this for each entry of column 1. I have played with the rep command with little luck. Can anyone help? This is probably very simple.

Thank you, mw

Marion Wittmann, Ph.D. candidate
Environmental Science and Management
University of California
Santa Barbara, CA 93106-5131
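The same indexing idea carries over to a data frame, which matters when the attribute columns are not all numeric. The data frame trips_df and its columns are made up for illustration.

```r
trips_df <- data.frame(trips = c(1, 3, 2),
                       site  = c("a", "b", "c"),
                       depth = c(4, 5, 6))

# Repeat each row index as many times as the 'trips' column says
idx <- rep(seq_len(nrow(trips_df)), times = trips_df$trips)

expanded <- trips_df[idx, ]
rownames(expanded) <- NULL   # drop the "2.1", "2.2", ... row names
```

The result has sum(trips_df$trips) rows, here 6; a row with zero trips would simply be dropped.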
Re: [R] How to stop buffering of "cat"
If you're using Rgui under Windows, see FAQ 7.1:

7.1 When using Rgui the output to the console seems to be delayed. This is deliberate: the console output is buffered ...

(the FAQ says how to turn it off -- it's a menu item).

Vidhu Choudhary wrote:

Hi All, My R code takes a very long time to finish processing. I want to see what stage the script has reached, so I wrote some output messages using cat. But instead of being displayed at the different stages, the cat messages are buffered and displayed at the end, when the entire processing is done. Can you please suggest how to stop this buffering, or some alternative way to display messages?

Thank you, Vidhu
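Besides the menu item, progress messages can be pushed out from within the script by calling flush.console() after each cat(). The helper name report() is hypothetical, and how much difference the call makes depends on the console in use.

```r
# Print a progress message and ask the console to display it right away
report <- function(...) {
  cat(..., "\n")
  flush.console()
  invisible(NULL)
}

for (step in 1:3) {
  Sys.sleep(0.1)                  # stand-in for a long computation
  report("finished step", step)
}
```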
Re: [R] Applying user function over a large matrix
It's quite possible that much of the time spent in loess() is setting up the data (i.e., the formula, terms, model.frame, etc.), and that much of that is repeated identically for each call to loess(). I would suggest looking at the code of loess() to work out what arguments it is calling simpleLoess() with, and then try calling stats:::simpleLoess() directly. (Of course you have to be careful with this because it is not using the published API.)

-- Tony Plate

Sudipta Sarkar wrote:

Respected R experts, I am trying to apply a user function that basically calls and applies the R loess function from the stats package over each time series. I have a large matrix of size 21 X 900 and I need to apply loess to each column, hence I have implemented a separate user function that applies loess over each column, and I am calling this function foo as follows:

xc<-apply(t,2,foo)

where t is my 21 X 900 matrix and loess. This is turning out to be a very slow process and I need to repeat this step for 25-30 such large matrix chunks. Is there any trick I can use to make this work faster? Any help will be deeply appreciated.

Regards
Sudipta Sarkar PhD
Senior Analyst/Scientist
Lanworth Inc. (Formerly Forest One Inc.)
300 Park Blvd., Ste 425
Itasca, IL
Ph: 630-250-0468
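Before reaching for internals, it may help to time the plain column-wise version on a small stand-in matrix to confirm that loess() is where the time goes. The sketch below uses only the public loess() interface; the matrix tm, the function smooth_col() and the span value are assumptions, since the poster's foo was not shown.

```r
set.seed(1)
tm <- matrix(rnorm(21 * 50), nrow = 21)   # small made-up stand-in for 21 x 900

# Smooth one column against its row index with the public loess() interface
smooth_col <- function(y, x = seq_along(y), span = 0.75) {
  fit <- loess(y ~ x, data = data.frame(x = x, y = y), span = span)
  as.numeric(fitted(fit))
}

smoothed <- apply(tm, 2, smooth_col)

# Rough timing of the whole pass; scale up the column count to extrapolate
timing <- system.time(apply(tm, 2, smooth_col))
```

If the elapsed time is dominated by loess() itself, that supports looking at the setup overhead as suggested in the reply.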
Re: [R] What objects will save.image saves ? And how to specify objects to be saved..
save.image() is a wrapper for save(). (Type 'save.image' (without quotes) at the prompt to see the code for the function.) Try something like this (not tested):

> load("oldvars.RData")
> old.vars <- ls(all=TRUE)
> ... computation ...
> save(list=setdiff(ls(all=TRUE), old.vars), file="newvars.RData")

You can play with the args (pattern= & all=) of ls() to select which vars you are interested in. If your computations might change some of the existing variables, and you want to save those at the end too, you'll probably have to keep track of those manually. AFAIK, it's not possible to add some new vars to an existing .RData file -- that's why I wrote the above to save them in a .RData file that's different to the one that contained the old variables.

-- Tony Plate

Ng Stanley wrote:
> Read and reread, can't make out. Will try an experiment later
>
> On Wed, Apr 16, 2008 at 7:51 PM, Henrique Dallazuanna <[EMAIL PROTECTED]> wrote:
>
>> See ?save
>>
>> On Wed, Apr 16, 2008 at 8:46 AM, Ng Stanley <[EMAIL PROTECTED]> wrote:
>>
>>> Hi,
>>>
>>> I have a R script that loads an image R.data, does some operations, then
>>> save to the R.data again. Suppose I have done some computation before
>>> loading the R script, will all the objects before the R script execution
>>> be saved to R.data ? If yes, how can I specify save.image to save only
>>> those objects created in the R script ?
>>>
>>> Thanks
>>> Stanley
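A self-contained variant of the "snapshot, compute, save the difference" idea is sketched below. It works in a scratch environment so that it does not touch the global workspace; the function name save_new_objects() and the object names are illustrative. One detail to be aware of with the untested snippet in the reply: old.vars is itself created after the snapshot is taken, so it would be saved along with the new objects unless it is excluded or, as here, kept outside the environment being saved.

```r
# Save only the objects that were not present in the 'before' snapshot
save_new_objects <- function(before, file, envir = globalenv()) {
  new_names <- setdiff(ls(envir = envir, all.names = TRUE), before)
  save(list = new_names, file = file, envir = envir)
  invisible(new_names)
}

# Demonstration in a scratch environment
work <- new.env()
assign("old_a", 1, envir = work)
before <- ls(envir = work, all.names = TRUE)   # snapshot, held outside 'work'
assign("new_b", 2, envir = work)               # the "computation"

rdata <- tempfile(fileext = ".RData")
saved <- save_new_objects(before, rdata, envir = work)
```

Loading rdata into a fresh environment then yields only new_b.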
Re: [R] Use of ellipses ... in argument list of optim(), integrate(), etc.
Ravi Varadhan wrote:
> Hi,
>
> I have noticed that there is a change in the use of ellipses or ... in R
> versions 2.6.1 and later. In versions 2.5.1 and earlier, the ... were always
> at the end of the argument list, but in 2.6.1 they are placed after the main
> arguments and before method control arguments. This results in the user
> having to specify the exact (complete) names of the control arguments, i.e.
> partial matching is not allowed.
>
> An example with integrate() :
>
>> integrate(function(x) exp(-x^2), lower=-Inf, upper=L, subdiv=1000)
>
> Error in f(x, ...) : unused argument(s) (subdiv = 1000)
>
>> integrate(function(x) exp(-x^2), lower=-Inf, upper=L, subdivisions=1000)
>
> 1.633051 with absolute error < 1.6e-06
>
> Here is an example with optim():
>
>> res <- optim(50, fw, meth="BFGS", control=list(maxit=2, temp=20,
> parscale=20))
>
> Error in fn(par, ...) : unused argument(s) (meth = "BFGS")
>
> FYI, I am using R version 2.6.1 on Windows XP.
>
> May I ask what the rationale behind this change is and also about the pros
> and cons of the two different ways of specifying (...)?

Putting optim() arguments after the ... disallows the use of abbreviated actual arguments for optim(). This is generally a good thing, because prior to this change, it was impossible to supply, via the '...' arguments of optim(), an argument to fn() whose name was a prefix of one of the arguments of optim(). E.g., if your function had an argument named 'm', you could not previously supply it via the '...' argument of optim(), because if you did something like optim(x, fun, m=240), intending 'm' to be passed to 'fun', the 'm' would instead match the 'method' argument of optim().

The cons of the new argument structure are that abbreviations for names of arguments of optim() can't be used (a minor and debatable con), and that previous code that used abbreviations might break, but it will likely break quickly and noisily, so it's not too bad (the only case where it wouldn't break is when fn has a '...' argument itself, and it ignores unrecognized components, or where there are other argument name collisions).

-- Tony Plate

> Thank you very much.
>
> Best,
> Ravi.
>
> ---
> Ravi Varadhan, Ph.D.
> Assistant Professor, The Center on Aging and Health
> Division of Geriatric Medicine and Gerontology
> Johns Hopkins University
> Ph: (410) 502-2619
> Fax: (410) 614-9625
> Email: [EMAIL PROTECTED]
> Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html
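The matching rule behind this discussion can be seen with two toy functions that differ only in where '...' sits. The names before_dots() and after_dots() are made up, and neither is meant to reproduce optim() itself.

```r
# Formals placed before '...' can be partially matched; those after cannot
before_dots <- function(fn, method = "default", ...) {
  list(method = method, extra = list(...))
}
after_dots <- function(fn, ..., method = "default") {
  list(method = method, extra = list(...))
}

r1 <- before_dots(sum, m = 240)   # 'm' partially matches 'method'
r2 <- after_dots(sum, m = 240)    # 'm' is passed through in '...'
```

In r1 the value 240 has been swallowed by method and nothing is left in '...'; in r2 method keeps its default and m = 240 arrives in '...', which is the behaviour described for optim() from 2.6.1 on.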
Re: [R] using eval-parse-paste in a loop
Using eval-parse for this looks like overkill. You should just be able to do something straightforward like:

for (i in 1:120)
    assign(paste("book", i, sep=""),
           read.xls(paste("Book", i, ".xls", sep=""), sheet=1, from=4, colClasses="numeric"))

which would put your spreadsheets in variables book1, book2, etc. Or, if you want to put everything in a list:

books <- lapply(1:120, function(i)
    read.xls(paste("Book", i, ".xls", sep=""), sheet=1, from=4, colClasses="numeric"))

BTW, the reason your attempts with parse() were failing is that the first argument of parse() is a file name (do args(parse) to see this quickly). You wanted parse(text=...).

-- Tony Plate

Michael Anyadike-Danes wrote:
> R-helpers
>
> I have 120 small Excel sheets to read and I am using
> library(xlsReadWrite): one example below.
>
> I had hoped to read sheets by looping over a list of numbers in their
> name (eg Book1.xls, Book2.xls, etc).
>
> I thought I had seen examples which used eval-parse-paste in this way.
>
> However, I have not been able to get it to work.
>
> 1. is this a feasible approach?
>
> 2. if not advice would be welcome.
>
> 3. Equally, advice about a better approach would also be v. welcome.
>
> I haven't included the data because my failed attempt is
> data-independent (and probably more basic).
>
> Michael Anyadike-Danes
>
> # show that read.xls works
>
>> test <- read.xls("Book1.xls",sheet=1,from=4,colClasses="numeric")
>> str(test)
> 'data.frame': 23 obs. of 13 variables:
> $ Off.Flows..Thousands.: num 117.19 NaN NA 1.43 2.26 ...
> $ Off.Flows..Thousands.: num 8.42 NaN NA 0.08 0.11 0.01 0.11 1.59 0.16 0.04 ...
> $ Off.Flows..Thousands.: num 20 NaN NA 0.2 0.3 0.02 0.32 4.39 0.41 0.11 ...
> $ Off.Flows..Thousands.: num 12.36 NaN NA 0.14 0.27 ...
> $ Off.Flows..Thousands.: num 7.59 NaN NA 0.11 0.18 0.01 0.14 1.46 0.23 0.06 ...
> $ Off.Flows..Thousands.: num 10.31 NaN NA 0.12 0.23 ...
> $ Off.Flows..Thousands.: num 7.55 NaN NA 0.08 0.2 0.01 0.11 1.6 0.23 0.05 ...
> $ Off.Flows..Thousands.: num 10.57 NaN NA 0.19 0.21 ...
> $ Off.Flows..Thousands.: num 9.36 NaN NA 0.13 0.26 0.02 0.13 2.12 0.27 0.07 ...
> $ Off.Flows..Thousands.: num 8.21 NaN NA 0.1 0.19 0.01 0.1 1.9 0.23 0.06 ...
> $ Off.Flows..Thousands.: num 9.04 NaN NA 0.13 0.11 0.01 0.17 1.54 0.17 0.05 ...
> $ Off.Flows..Thousands.: num 13.42 NaN NA 0.14 0.19 ...
> $ Off.Flows..Thousands.: num 0.37 NaN NA NaN 0.01 NaN 0.01 0.05 0.02 NaN ...
>
> ### simple minded attempt substituting eval-parse-paste
>
>> nam <- 1
>
>> test <- eval(parse(paste('read.xls("Book',nam,'.xls",sheet=1,from=4,colClasses="numeric")',sep='')))
> Error in file(file, "r") : unable to open connection
> In addition: Warning message:
> In file(file, "r") :
>   cannot open file 'read.xls("Book1.xls",sheet=1,from=4,colClasses="numeric")', reason 'Invalid argument'
>
> ### stripping off eval, looking for clues
>
>> parse(paste('read.xls("Book',nam,'.xls",sheet=1,from=4,colClasses="numeric")',sep=''))
> Error in file(file, "r") : unable to open connection
> In addition: Warning message:
> In file(file, "r") :
>   cannot open file 'read.xls("Book1.xls",sheet=1,from=4,colClasses="numeric")', reason 'Invalid argument'
>
> ### stripping off parse, still looking for clues
>
>> paste('read.xls("Book',nam,'.xls",sheet=1,from=4,colClasses="numeric")',sep='')
> [1] "read.xls(\"Book1.xls\",sheet=1,from=4,colClasses=\"numeric\")"
>
> Economic Research Institute of Northern Ireland
> Floral Buildings
> 2-14 East Bridge Street
> Belfast BT1 3NQ
> Tel: (028) 90727362
> Fax: (028) 90319003
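For anyone who wants to try the pattern without the Excel files, here is a small self-contained sketch. It is only a stand-in: it writes three tiny CSV files to a temporary directory and uses read.csv() in place of read.xls(); the file names and columns are invented for the example.

```r
# Stand-in for the spreadsheets: three small CSV files in a temporary directory
dir <- tempdir()
for (i in 1:3) {
  write.csv(data.frame(id = i, value = i * 10),
            file.path(dir, paste0("Book", i, ".csv")), row.names = FALSE)
}

# Build each file name from the index and collect the results in a named list
files <- file.path(dir, paste0("Book", 1:3, ".csv"))
books <- lapply(files, read.csv)
names(books) <- paste0("book", 1:3)

# Individual sheets are then reached by name or by position
second_value <- books$book2$value

# If a string really must be evaluated, the code goes in the 'text' argument
three <- eval(parse(text = "1 + 2"))
```

Keeping the data frames in one list usually makes later steps (looping, combining with do.call(rbind, books)) easier than 120 separately named variables.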
Re: [R] Data frame with 0 rows.
Rolf Turner wrote:
> For reasons best known only to myself ( :-) ) I wish to create a data
> frame with 0 rows and 9 columns.
>
> The best I've been able to come up with is:
>
> junk <- as.data.frame(matrix(0,nrow=0,ncol=9))
>
> Is there a sexier way?

I'm unsure of their virtue or seediness, but here are some alternatives:

> data.frame(a=numeric(0), b=numeric(0)) # include 9 arguments if you like
[1] a b
<0 rows> (or 0-length row.names)
> as.data.frame(rep(list(a=numeric(0)), 9))
[1] a   a.1 a.2 a.3 a.4 a.5 a.6 a.7 a.8
<0 rows> (or 0-length row.names)
>

-- Tony Plate

> cheers,
>
> Rolf
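A quick way to check that the alternatives really give the same shape, and to control the column names at the same time, is sketched below. The names v1..v9 are made up for the example.

```r
# The matrix route from the question
junk1 <- as.data.frame(matrix(0, nrow = 0, ncol = 9))

# A list of nine zero-length numeric vectors with names chosen up front
cols <- paste0("v", 1:9)
junk2 <- as.data.frame(setNames(rep(list(numeric(0)), 9), cols))

dim(junk1)
dim(junk2)
names(junk2)
```

The list route has the small advantage that each column can be given its own type (character(0), integer(0), and so on) before any rows exist.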
Re: [R] Odp: How to make t.test handle "NA" and "essentially constant values" ?
Petr PIKAL wrote:
> Hi
>
> [EMAIL PROTECTED] wrote on 12.02.2008 09:09:23:
>
>> Hi,
>>
>> First problem:
>>> test <- matrix(c(1,1,2,1), 2,2)
>>> apply(test, 1, function(x) { t.test(x) $p.value })
>> Error in t.test.default(x) : data are essentially constant
>
> make your data not constant
>
>> Second problem:
>>> test <- matrix(c(1,0,NA,1), 2,2)
>>> apply(test, 1, function(x) { t.test(x) $p.value })
>> Error in t.test.default(x) : not enough 'x' observations
>
> increase number of observations
>
>> How to make t-test ignore these errors?
>
> Well, the procedure is complaining that you do not give it correct data.
> You should be grateful for great software which prevents you from doing
> silly things such as trying to compute t.test when the data have zero
> variation or the number of observations is 1.

It's nice that the software recognizes situations in which a sensible answer can't be computed. At that point, there are two possible actions: (1) stop with an informative error, and (2) silently return NA. Option (1) is wonderful for interactive use, but option (2) is easier to handle in programs where one is making many calls to the function as part of some automated procedure (e.g., as part of a bootstrap procedure).

Speaking from personal experience, it can be quite a drag when one has set up and mostly-debugged a long computation only to have it stop with an error like "data are essentially constant" right near the end because of some condition for which the function author thought it better to stop with an error rather than return NA (or some other indication that there was no sensible answer). (This didn't happen with t.test, but I've experienced it with a few other functions.)

So, I don't think it's at all unreasonable for the OP to request a way to make t.test() return NA instead of stopping with an error.

Looking at the code for t.test, it doesn't look like there's any argument to specify such behavior, so the options are to write one's own version of t.test, or use try() as other posters have suggested. Here's an example using try():

> my.t.test.p.value <- function(...) {
+     obj <- try(t.test(...), silent=TRUE)
+     if (is(obj, "try-error")) return(NA) else return(obj$p.value)
+ }
> my.t.test.p.value(numeric(0))
[1] NA
> my.t.test.p.value(1:10)
[1] 0.000278196
> my.t.test.p.value(1)
[1] NA
> my.t.test.p.value(c(1,1,1))
[1] NA
> my.t.test.p.value(c(1,2,NA))
[1] 0.2048328
> my.t.test.p.value(c(1,2))
[1] 0.2048328
>

hope this helps,

Tony Plate

> Regards
> Petr
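The same idea can be written with tryCatch() and applied row by row to the two matrices from the original question. This is only a sketch; safe_p_value is a name made up here, and returning NA on every error (not just the two seen above) is an assumption that may be too broad for some uses.

```r
# Return the p-value, or NA if t.test() stops with any error
safe_p_value <- function(x, ...) {
  tryCatch(t.test(x, ...)$p.value, error = function(e) NA_real_)
}

# First problem: the second row is constant
test1 <- matrix(c(1, 1, 2, 1), 2, 2)
p1 <- apply(test1, 1, safe_p_value)

# Second problem: the first row has only one non-missing value
test2 <- matrix(c(1, 0, NA, 1), 2, 2)
p2 <- apply(test2, 1, safe_p_value)

p1
p2
```

If it matters why a row failed, the handler could return conditionMessage(e) alongside the NA instead of discarding it.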
Re: [R] Maximum number of variables allowed in a multiple linear regression model
Bert Gunter wrote:
> I strongly suggest you collaborate with a local statistician. I can think of
> no circumstance where multiple regression on "hundreds of thousands of
> variables" is anything more than a fancy random number generator.

That sounds like a challenge! What is the largest regression problem (in terms of numbers of variables) that people have encountered where it made sense to do some sort of linear regression (and gave useful results)? (Including multilevel and Bayesian techniques.)

However, the original poster did say "hundreds to thousands", which is smaller than "hundreds of thousands". When I try a regression problem with 3,000 coefficients in R running under Windows XP 64 bit with 8Gb of memory on the machine and the /3Gb option active (i.e., R can get up to 3Gb), R 2.6.1 runs out of memory (apparently trying to duplicate the model matrix):

R version 2.6.1 (2007-11-26)
Copyright (C) 2007 The R Foundation for Statistical Computing
ISBN 3-900051-07-0

> m <- 3000
> n <- m * 10
> x <- matrix(rnorm(n*m), ncol=m, nrow=n, dimnames=list(paste("C",1:n,sep=""), paste("X",1:m,sep="")))
> dim(x)
[1] 30000  3000
> k <- sample(m, 10)
> y <- rowSums(x[,k]) + 10 * rnorm(n)
> fit <- lm.fit(y=y, x=x)
Error: cannot allocate vector of size 686.6 Mb
> object.size(x)/2^20
[1] 687.7787
> memory.size()
[1] -2022.552
>

and the Windows process monitor shows the peak memory usage for Rgui.exe at 2,137,923K. But in a 64 bit version of R, I would be surprised if it was not possible to run this (given sufficient memory).

However, R easily handles a slightly smaller problem:

> m <- 1000 # of variables
> n <- m * 10 # of rows
> k <- sample(m, 10)
> x <- matrix(rnorm(n*m), ncol=m, nrow=n, dimnames=list(paste("C",1:n,sep=""), paste("X",1:m,sep="")))
> y <- rowSums(x[,k]) + 10 * rnorm(n)
> fit <- lm.fit(y=y, x=x)
> # distribution of coefs that should be one vs zero
> round(rbind(one=quantile(fit$coefficients[k]), zero=quantile(fit$coefficients[-k])), digits=2)
        0%   25%   50%  75% 100%
one   0.94  0.98  1.04 1.10 1.18
zero -0.30 -0.08 -0.01 0.06 0.29
>

To echo Bert Gunter's cautions, one must be careful doing ordinary linear regression with large numbers of coefficients. It does seem a little unlikely that there is sufficient data to get useful estimates of three thousand coefficients using linear regression in data managed in Excel (though I guess it could be possible using Excel 12.0, which can handle up to 1 million rows; recent versions prior to 2008 could handle only 64K rows, see http://en.wikipedia.org/wiki/Microsoft_Excel#Versions ).

So, the suggestion to consult a local statistician is good advice: there may be other more suitable approaches, and if some form of linear regression is an appropriate approach, there are things to do to gain confidence that the results of the linear regression convey useful information.

-- Tony Plate

> -- Bert Gunter
> Genentech Nonclinical Statistics
>
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Michelle Chu
> Sent: Tuesday, February 05, 2008 9:00 AM
> To: R-help@r-project.org
> Subject: [R] Maximum number of variables allowed in a multiple linear regression model
>
> Hi,
>
> I appreciate it if someone can confirm the maximum number of variables
> allowed in a multiple linear regression model. Currently, I am looking for
> a software with the capacity of handling approximately 3,000 variables. I
> am using Excel to process the results.
> Any information for processing a matrix from Excel with hundreds to
> thousands of variables will be helpful.
>
> Best Regards,
> Michelle
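The 686.6 Mb in the error message can be checked by hand: a numeric matrix needs 8 bytes per entry, so the size follows directly from the dimensions. The sketch below does that arithmetic and then runs a much smaller version of the same simulation (50 variables instead of 3,000; the sizes and seed are arbitrary choices for illustration) so that it finishes instantly on any machine.

```r
# Size of the 30000 x 3000 model matrix in Mb (8 bytes per double)
m_big <- 3000
n_big <- m_big * 10
size_mb <- n_big * m_big * 8 / 2^20

# A scaled-down version of the simulation above
set.seed(1)
m <- 50
n <- m * 10
x <- matrix(rnorm(n * m), nrow = n, ncol = m)
k <- sample(m, 5)                     # the variables that truly matter
y <- rowSums(x[, k]) + rnorm(n)
fit <- lm.fit(x = x, y = y)

size_mb
range(fit$coefficients[k])            # should be near 1
range(fit$coefficients[-k])           # should be near 0
```

Since lm.fit() needs at least one working copy of x, the real memory requirement is a multiple of size_mb, which is consistent with the failure under a 3Gb limit.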
Re: [R] unload & reload a (new version of a) package
See ?detach, in particular the 'unload' argument and "Details" (issues involve namespaces & methods, among other things). Also, note that if the package loaded any compiled code (DLLs in Windows), some OSes do not support unloading & reloading these.

-- Tony Plate

Harte, Thomas P wrote:
> i'm putting the final touches on a package that i'm developing and i noticed
> that if i detach the package, and then re-build & re-install it (using R CMD)
> then I can't get the newer version of the package to load in the existing R
> session (i have to close it out and start a new session, then the newer
> version of the package is loaded).
>
> looking through the source of 'detach' i see :
>
>     .Call("R_lazyLoadDBflush", paste(libpath, "/R/", pkgname, ".rdb", sep = ""), PACKAGE = "base")
>
> is there some absolute way similar to the above to flush the package db
> and ensure that a newer version of the package can be loaded into the
> existing R session? detach calls .Last.lib and seems to go through the
> motions of purging the loaded package; why, then, is the package still
> lurking around in the existing R session?
>
> it's not a big deal; it's only a minor pain having to re-start an R session.
> i'm more interested in why this is happening.
>
> cheers,
>
> thomas.
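As a rough sketch of the detach-with-unload cycle: the example below uses the 'splines' package purely as a stand-in for the package under development, and assumes nothing else in the session imports it. Whether the compiled code is really released still depends on the OS, as noted above.

```r
pkg <- "splines"                       # stand-in for your own package
library(pkg, character.only = TRUE)

# Detach from the search path and also try to unload the namespace
detach(paste0("package:", pkg), character.only = TRUE, unload = TRUE)
still_loaded <- pkg %in% loadedNamespaces()

# ... re-build and re-install the package here ...

# Attach again; with the namespace gone, this reads the installed copy afresh
library(pkg, character.only = TRUE)
attached_again <- paste0("package:", pkg) %in% search()
```

If still_loaded comes back TRUE, some other loaded namespace is holding on to the package, and restarting R is the reliable way out.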
Re: [R] `[.data.frame`(df3, , -2) and NA columns
Assigning a name vector to a dataframe that is shorter than the number of columns results in some columns having NA values for their names. "[.data.frame" has the following code in it:

    cols <- names(x)
    ...
    if (any(is.na(cols)))
        stop("undefined columns selected")

so, if a dataframe x has NA values for column names, you should expect x[...] to *sometimes* stop with that error (with a bit of reading and testing you could probably work out exactly when that error will occur).

-- Tony Plate

Dieter Menne wrote:
> Dear baseRs,
>
> I recently made a mistake when renaming data frame columns, accidentally
> creating an NA column. I found the following strange behavior when negative
> indexes are used.
>
> Can anyone explain what happens here. No "workarounds" required, just
> curious.
>
> Dieter
>
> Version: Windows, R version 2.6.1 (2007-11-26)
>
> #-
> df = data.frame(a=0:10,b=10:20)
> df[,-2] #ok
> names(df)=c("A") # implicitly creates an NA column
> df[,-2]
> df[,-2,drop=FALSE] # has nothing to do with drop
>
> df3 = data.frame(a=0:10,b=10:20,c=20:30)
> df3[,-2] #ok
> names(df3)=c("A","B") #creates an NA column
> df3[,-2] # error
> # Error in `[.data.frame`(df3, , -2) : undefined columns selected
>
> names(df3)[3]="NaN" # another reserved word
> df3[,-2] # no problem
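Although no workaround was asked for, a short sketch of how to spot and repair the NA name may be useful to later readers. The replacement name "C" is an arbitrary choice.

```r
df3 <- data.frame(a = 0:10, b = 10:20, c = 20:30)
names(df3) <- c("A", "B")        # too short: the third name becomes NA

bad_names <- names(df3)
is.na(bad_names)                 # flags the unnamed column

# Give every NA name a real value before indexing
names(df3)[is.na(names(df3))] <- "C"
kept <- names(df3[, -2])
```

Checking anyNA(names(x)) right after a renaming step catches this kind of slip before it surfaces as "undefined columns selected" somewhere far from the cause.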
[R] [R-pkgs] new version of trackObjs
The trackObjs package stores objects in files on disk so that files are automatically rewritten when objects are changed, and so that objects are accessible but do not occupy memory until they are accessed. It also tracks times when objects are created and modified, and caches some basic characteristics of objects to allow for fast summaries of objects.

This version, trackObjs_0.8-0, fixes some bugs:

  o Fixed faulty detection of conflicting existing objects when starting to track to an existing directory.

  o Replaced environment on function that is in the active binding for a tracked object. Previously, that function could, if constructed via track(obj <- value), have a copy of the tracked object in its environment, which would stay present taking up memory even if the object was flushed out of the tracking environment.

  o Fixed bug that stopped track.stop(all=TRUE) from working.

-- Tony Plate
Re: [R] Efficient way to find consecutive integers in vector?
Martin Maechler wrote: >>>>>> "MS" == Marc Schwartz <[EMAIL PROTECTED]> >>>>>> on Thu, 20 Dec 2007 16:33:54 -0600 writes: > > MS> On Thu, 2007-12-20 at 22:43 +0100, Johannes Graumann wrote: > >> Hi all, > >> > >> Does anybody have a magic trick handy to isolate directly consecutive > >> integers from something like this: > >> c(1,2,3,4,7,8,9,10,12,13) > >> > >> The result should be, that groups 1-4, 7-10 and 12-13 are consecutive > >> integers ... > >> > >> Thanks for any hints, Joh > > MS> Not fully tested, but here is one possible approach: > > >> Vec > MS> [1] 1 2 3 4 7 8 9 10 12 13 > > MS> Breaks <- c(0, which(diff(Vec) != 1), length(Vec)) > > >> Breaks > MS> [1] 0 4 8 10 > > >> sapply(seq(length(Breaks) - 1), > MS> function(i) Vec[(Breaks[i] + 1):Breaks[i+1]]) > MS> [[1]] > MS> [1] 1 2 3 4 > > MS> [[2]] > MS> [1] 7 8 9 10 > > MS> [[3]] > MS> [1] 12 13 > > > > MS> For a quick test, I tried it on another vector: > > > MS> set.seed(1) > MS> Vec <- sort(sample(20, 15)) > > >> Vec > MS> [1] 1 2 3 4 5 6 8 9 10 11 14 15 16 19 20 > > MS> Breaks <- c(0, which(diff(Vec) != 1), length(Vec)) > > >> Breaks > MS> [1] 0 6 10 13 15 > > >> sapply(seq(length(Breaks) - 1), > MS> function(i) Vec[(Breaks[i] + 1):Breaks[i+1]]) > MS> [[1]] > MS> [1] 1 2 3 4 5 6 > > MS> [[2]] > MS> [1] 8 9 10 11 > > MS> [[3]] > MS> [1] 14 15 16 > > MS> [[4]] > MS> [1] 19 20 > > Seems ok, but ``only works for increasing sequences''. > More than 12 years ago, I had encountered the same problem and > solved it like this: > > In package 'sfsmisc', there has been the function inv.seq(), > named for "inversion of seq()", > which does this too, currently returning an expression, > but returning a call in the development version of sfsmisc: > > Its definition is currently > > inv.seq <- function(i) { > ## Purpose: 'Inverse seq': Return a short expression for the 'index' `i' > ## > ## Arguments: i: vector of (usually increasing) integers. 
> ## > ## Author: Martin Maechler, Date: 3 Oct 95, 18:08 > ## > ## EXAMPLES: cat(rr <- inv.seq(c(3:12, 20:24, 27, 30:33)),"\n"); eval(rr) > ## r2 <- inv.seq(c(20:13, 3:12, -1:-4, 27, 30:31)); eval(r2); r2 > li <- length(i <- as.integer(i)) > if(li == 0) return(expression(NULL)) > else if(li == 1) return(as.expression(i)) > ##-- now have: length(i) >= 2 > di1 <- abs(diff(i)) == 1#-- those are just simple sequences n1:n2 ! > s1 <- i[!c(FALSE,di1)] # beginnings > s2 <- i[!c(di1,FALSE)] # endings > > ## using text & parse {cheap and dirty} : > mkseq <- function(i,j) if(i == j) i else paste(i,":",j, sep="") > parse(text = > paste("c(", paste(mapply(mkseq, s1,s2), collapse = ","), ")", sep = > ""), > srcfile = NULL)[[1]] > } > > with example code > > > v <- c(1:10,11,6,5,4,0,1) > > (iv <- inv.seq(v)) > c(1:11, 6:4, 0:1) > > stopifnot(identical(eval(iv), as.integer(v))) > > iv[[2]] > 1:11 > > str(iv) > language c(1:11, 6:4, 0:1) > > str(iv[[2]]) > language 1:11 > > > > > Now, given that this stems from 1995, I should be excused for > using parse(text = *) [see fortune(106) if you don't understand]. > > However, doing this differently by constructing the resulting > language object directly {using substitute(), as.symbol(), > as.expression() ... etc} > seems not quite trivial. > > So here's the Friday afternoon / Christmas break quizz: > > What's the most elegant way > to replace the last statements in inv.seq() > ---- > ## using text & parse {cheap and dirty} : > mkseq <- function(i,j) if(i == j) i else paste(i,":",j, sep="") > parse(text = >
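For the original question (increasing vectors only), one compact approach is to start a new group wherever the difference to the previous element is not 1. The second half of the sketch below is one possible take on the quiz: it builds the call c(1:4, 7:10, 12:13) directly with call() and as.call() instead of parse(text = ...). It is not the sfsmisc implementation and does not handle decreasing runs.

```r
v <- c(1, 2, 3, 4, 7, 8, 9, 10, 12, 13)

# Group id increases by one at every break in the sequence
runs <- split(v, cumsum(c(TRUE, diff(v) != 1)))
unname(runs)

# Build a language object for each run, then wrap them all in c(...)
pieces <- lapply(runs, function(r) {
  if (length(r) == 1) r[1] else call(":", r[1], r[length(r)])
})
cl <- as.call(c(list(as.name("c")), unname(pieces)))

cl          # an unevaluated call
eval(cl)    # evaluates back to the original values
```

Because cl is a call rather than text, its parts can be inspected or modified with the usual list-style indexing, e.g. cl[[2]].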
Re: [R] assigning and saving datasets in a loop, with names changing with "i"
Marie Pierre Sylvestre wrote:
> Dear R users,
>
> I am analysing a very large data set and I need to perform several data
> manipulations. The dataset is so big that the only way I can play with it
> without having memory problems (E.g. "cannot allocate vectors of size...")
> is to write a batch script to:
>
> 1. cut the data into pieces
> 2. save the pieces in separate .RData files
> 3. Remove everything from the environment
> 4. load one of the piece
> 5. perform the manipulations on it
> 6. save it and remove it from the environment
> 7. Redo 4-6 for every piece
> 8. Merge everything together at the end
>
> It works if coded line by line but since I'll have to perform these tasks
> on other data sets, I am trying to automate this as much as I can.

The trackObjs package is designed to make it easy to work in approximately this manner -- it saves objects automatically to disk but they are still accessible as normal.

Here's how you could do the above - this example works with 10 8Mb objects in an R session with a limit of 40Mb.

# allow R only 40Mb of vector memory
mem.limits(vsize=40e6)
mem.limits()/1e6
library(trackObjs)
# start tracking to store data objects in the directory 'data'
# each object is 8Mb, and we store 10 of them
track.start("data")
n <- 10
m <- 1e6
constructObject <- function(i) i+rnorm(m)
# steps 1, 2 & 3:
for (i in 1:n) {
    xname <- paste("x", i, sep="")
    cat("", xname)
    assign(xname, constructObject(i))
    # store in a file, accessible by name:
    track(list=xname)
}
cat("\n")
gc(TRUE)
# accessing object by name
object.size(x1)/2^20 # In Mb
mean(x1)
mean(x2)
gc(TRUE)
# steps 4:6
# accessing object through a constructed name
result <- sapply(1:n, function(i) mean(get(paste("x", i, sep=""))))
result
# remove the data objects
track.remove(list=paste("x", 1:n, sep=""))
track.stop()

Here's a full transcript of the above - note how whenever gc() is called there is hardly any vector memory in use.

> # allow R only 40Mb of vector memory
> mem.limits(vsize=40e6)
nsize vsize
NA 4000
> mem.limits()/1e6
nsize vsize
   NA    40
> library(trackObjs)
> # start tracking to store data objects in the directory 'data'
> # each object is 8Mb, and we store 10 of them
> track.start("data")
> n <- 10
> m <- 1e6
> constructObject <- function(i) i+rnorm(m)
> # steps 1, 2 & 3:
> for (i in 1:n) {
+     xname <- paste("x", i, sep="")
+     cat("", xname)
+     assign(xname, constructObject(i))
+     # store in a file, accessible by name:
+     track(list=xname)
+ }
 x1 x2 x3 x4 x5 x6 x7 x8 x9 x10> cat("\n")

> gc(TRUE)
Garbage collection 19 = 6+0+13 (level 2) ...
4.0 Mbytes of cons cells used (42%)
0.7 Mbytes of vectors used (5%)
used (Mb) gc trigger (Mb) limit (Mb) max used (Mb)
Ncells 148362 4.0 35 9.4 NA 35 9.4
Vcells 89973 0.71950935 14.9 38.2 2112735 16.2
> # accessing object by name
> object.size(x1)/2^20 # In Mb
[1] 7.629417
> mean(x1)
[1] 0.998635
> mean(x2)
[1] 1.999656
> gc(TRUE)
Garbage collection 22 = 7+1+14 (level 2) ...
4.0 Mbytes of cons cells used (43%)
0.7 Mbytes of vectors used (6%)
used (Mb) gc trigger (Mb) limit (Mb) max used (Mb)
Ncells 149264 4.0 35 9.4 NA 35 9.4
Vcells 90160 0.71560747 12.0 38.2 2112735 16.2
> # steps 4:6
> result <- sapply(1:n, function(i) mean(get(paste("x", i, sep=""))))
> result
[1] 0.998635 1.999656 2.997368 4.000197 5.000159 6.001216 6.999552
[8] 7.999743 8.82 10.001355
> # remove the data objects
> track.remove(list=paste("x", 1:n, sep=""))
[1] "x1" "x2" "x3" "x4" "x5" "x6" "x7" "x8" "x9" "x10"
> track.stop()
>

> I am using a loop in which I used 'assign' and 'get' (pseudo code below).
> My problem is when I use 'get', it prints the whole object on the screen.
> I am wondering whether there is a more efficient way to do what I need to
> do. Any help would be appreciated. Please keep in mind that the whole
> process is quite computer-intensive, so I can't keep everything in the
> environment while R performs calculations.
>
> Say I have 1 big dataframe called data.
> I use 'split' to divide it into a list of 12 dataframes (call this list
> my.list)
>
> my.fun is a function that takes a dataframe, performs several
> manipulations on it and returns a dataframe.
>
> for (i in 1:12){
>   assign( paste( "data", i, sep=""), my.fun(my.list[i])) # this works
>   # now I need to save this new object as a RData.
>
>   # The following line does not work
>   save(paste("data", i, sep = ""), file = paste( paste("data", i, sep = ""), "RData", sep="."))
> }
>
> # This works but it is a bit convoluted!!!
> temp <- get(paste("data", i, sep = ""))
> save(temp, file = "lala.RData")
> }
>
> I am *sure* there is something more clever to do but I can't find it. Any
> help would be appreciated.
>
> best regards,
>
> MP
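On the specific save() problem in the quoted loop: save() takes object names as character strings through its 'list' argument, so no get() or temporary copy is needed. Below is a minimal base-R sketch with a toy data frame standing in for the real data; the three-way split and the temporary directory are assumptions made to keep it self-contained.

```r
# Toy stand-in for the big data frame, split into three pieces
my.list <- split(data.frame(g = rep(1:3, each = 2), v = 1:6),
                 rep(1:3, each = 2))
out_dir <- tempdir()

for (i in seq_along(my.list)) {
  obj_name <- paste("data", i, sep = "")
  assign(obj_name, my.list[[i]])               # note [[i]], not [i]
  # save() accepts object names as strings via 'list'
  save(list = obj_name,
       file = file.path(out_dir, paste(obj_name, "RData", sep = ".")))
  rm(list = obj_name)                          # free the memory again
}

# Later: load one piece back; load() returns the names it restored
e <- new.env()
loaded <- load(file.path(out_dir, "data2.RData"), envir = e)
piece <- e$data2$v
```

Note also that my.list[i] in the quoted code passes a one-element list to my.fun, whereas my.list[[i]] passes the data frame itself.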
Re: [R] Function reference
R does this sort of thing easily without any parse/eval acrobatics needed. E.g., you can do:

> stu <- function(x) {return( 1 + (2*x*x) - (3*x) )}
> (x <- 0:3)
[1] 0 1 2 3
> stu(x)
[1]  1  0  3 10
> metafun <- function(FUN, data) FUN(data)
> metafun(stu, x)
[1]  1  0  3 10
> # if you want to be able to use a character-data name for the function:
> metafun2 <- function(FUN, data) {if (is.character(FUN)) FUN <- getFunction(FUN); FUN(data)}
> metafun2("stu", x)
[1]  1  0  3 10
> metafun2(stu, x)
[1]  1  0  3 10
>

What went wrong with your code was that your parse() constructed a list of 4 expressions, and evaluating that returned the value of the last one:

> (fun <- "stu")
[1] "stu"
> paste( fun, "(", x, ")", sep = "" )
[1] "stu(0)" "stu(1)" "stu(2)" "stu(3)"
> parse( text = paste( fun, "(", x, ")", sep = "" ) )
expression(stu(0), stu(1), stu(2), stu(3))
attr(,"srcfile")
>

(Others have observed that in a very large proportion of the situations where people reach for parse/eval, there's a neater, cleaner & more direct way of doing the job.)

-- Tony Plate

Talbot Katz wrote:
> Hi.
>
> I'm looking for an R equivalent to something like function pointers in C/C++.
> I have a search procedure that evaluates the fitness of each point it
> reaches as it moves along, and decides where to move next based on its
> fitness evaluation. I want to be able to pass different fitness functions to
> this procedure. I am trying to find a good way to do this. I was thinking
> of passing in the name of the function and then using eval. However, I
> haven't gotten this to work the way I'd like it to. Consider the following
> example:
>
>> stu <- function(x) {return( 1 + (2*x*x) - (3*x) )}
>> (x=0:3)
> [1] 0 1 2 3
>> stu(x)
> [1]  1  0  3 10
>> (fun="stu")
> [1] "stu"
>> eval( parse( text = paste( fun, "(", x, ")", sep = "" ) ) )
> [1] 10
>
> Notice that the function I defined called "stu" will operate on a vector x
> and return a vector y = stu(x) such that y[i] equals stu(x[i]).
> When I tried to pass stu and x to a procedure that would evaluate stu(x)
> I only get stu(x[N]), where N is the index of the last element of x. What
> am I doing wrong? Is there a better way to pass function references?
>
> I can get the following to work, but it seems awfully clunky:
>
>> sapply( 1:length(x), function(i){ return( eval( parse( text = paste( fun, "(", x[i], ")", sep = "" ) ) ) ) } )
> [1]  1  0  3 10
>
> Thanks!
>
> -- TMK --
> 212-460-5430 home
> 917-656-5351 cell
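Base R's match.fun() covers both cases (a function object or its name as a string) in one call, and is what apply() and friends use internally. Here is a small sketch along the lines of metafun2 above; apply_fitness and the second fitness function are made-up names for illustration.

```r
stu <- function(x) 1 + 2 * x * x - 3 * x
other_fitness <- function(x, shift = 0) -(x - shift)^2

# Accept either a function or the name of one; pass extra arguments along
apply_fitness <- function(FUN, data, ...) {
  FUN <- match.fun(FUN)
  FUN(data, ...)
}

x <- 0:3
by_object <- apply_fitness(stu, x)
by_name   <- apply_fitness("stu", x)
shifted   <- apply_fitness(other_fitness, x, shift = 1)
```

Passing the function object itself is usually the safer habit, since lookup by name depends on where the function happens to be defined.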
Re: [R] Factor Madness
Whoops, it looks like there's a typo in ?cbind (R version 2.6.0 Patched (2007-10-11 r43143)), and I blindly copied it into my message. That should read (emphasis added):

"and convert character columns to factors unless stringsAsFactors = ***FALSE***"

Here's an example:

> x <- data.frame(X=1:3)
> sapply(cbind(x, letters[1:3]), class)
           X letters[1:3]
   "integer"     "factor"
> sapply(cbind(x, letters[1:3], stringsAsFactors=FALSE), class)
           X letters[1:3]
   "integer"  "character"
>

Thanks to Mark Leeds for pointing that out to me in a private message!

(I see this still in the source at https://svn.r-project.org/R/trunk/src/library/base/man/cbind.Rd -- is that the right place to look for the latest source to make sure it hasn't been fixed already?)

-- Tony Plate

Tony Plate wrote:
> From ?cbind:
>
> Data frame methods
> The cbind data frame method is just a wrapper for data.frame(...,
> check.names = FALSE). This means that it will split matrix columns in data
> frame arguments, and convert character columns to factors unless
> stringsAsFactors = TRUE is passed.
>
> (I'm guessing 'spectrum' is a data.frame before the code fragment you've
> shown)
>
> hope this helps,
>
> Tony Plate
>
> Johannes Graumann wrote:
>> Why is class(spectrum[["Ion"]]) after this "factor"?
>>
>> spectrum <- cbind(spectrum,Ion=rep("",
>> nrow(spectrum)),Deviation.AMU=rep(0.0, nrow(spectrum)))
>>
>> slowly going crazy ...
>>
>> Joh
Re: [R] Factor Madness
From ?cbind:

  Data frame methods
  The cbind data frame method is just a wrapper for data.frame(..., check.names = FALSE). This means that it will split matrix columns in data frame arguments, and convert character columns to factors unless stringsAsFactors = TRUE is passed.

(I'm guessing 'spectrum' is a data.frame before the code fragment you've shown)

hope this helps,

Tony Plate

Johannes Graumann wrote:
> Why is class(spectrum[["Ion"]]) after this "factor"?
>
> spectrum <- cbind(spectrum,Ion=rep("",
> nrow(spectrum)),Deviation.AMU=rep(0.0, nrow(spectrum)))
>
> slowly going crazy ...
>
> Joh
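Since the default for stringsAsFactors can differ between R versions and session options, the least surprising route is to pass it explicitly. A short sketch follows; the 'spectrum' data frame here is invented, as the original was not shown.

```r
# Made-up stand-in for the poster's data frame
spectrum <- data.frame(Mass = c(100.1, 200.2, 300.3))

# Character column converted to a factor
as_factor <- cbind(spectrum, Ion = rep("", nrow(spectrum)),
                   stringsAsFactors = TRUE)

# Character column kept as character
as_char <- cbind(spectrum, Ion = rep("", nrow(spectrum)),
                 Deviation.AMU = rep(0.0, nrow(spectrum)),
                 stringsAsFactors = FALSE)

class(as_factor[["Ion"]])
class(as_char[["Ion"]])
```

Keeping the column as character matters here because assigning a new string into a factor column fails unless that string is already one of its levels.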
Re: [R] Array dimnames
I can't quite understand what you're having difficulty with (is it constructing the array, or coping with the different 'matrices' having different column names, or something else?)

However, your sample data looks like it has a mixture of factor (region) and numeric data (Qty), so you're probably storing it in a data frame. AFAIK, there is no 3d object in R that can store mixed-type data like a data frame can. An array object in R has to have the same data type for every column etc.

-- Tony Plate

dave mitchell wrote:
> Dear all,
> Possibly a rudimentary question, however any help is greatly appreciated. I
> am sorting a large matrix into an array of dim(p(i),q,3). I put each entry
> into a corresponding matrix (1 of the 3) based on some criteria. I figure
> this will assist me in condensing code as I can loop through the 3rd
> dimension of the array instead of generating 3 separate matrices and using
> the same block of code 3 times. My question is how to get the colnames of
> the 3 nested matrices in the array to match the colnames of the data
> matrix. In other words...
>
> DATA:
>      Exp  region  Qty   Ct    ...q
> 1    S    CB      3.55  2.15  .
> 2    S    TG      4.16  2.18  .
> 3    C    OO      2.36  3.65  .
> 4    C    .       .     .
> .    .    .       .     .
> .    .    .       .     .
> .    .    .       .     .
> p    ...
>
> ARRAY
> 1
>      [,1]  [,2]  [,3]  [,4]  ...q
> 1    SOME DATA WILL FILL THIS  .
> 2    .     .     .     .
> 3    .     .     .     .
> 4    .     .     .     .
> .    .     .     .     .
> .    .     .     .     .
> .    .     .     .     .
> P(1) ...
>
> 2
>      [,1]  [,2]  [,3]  [,4]  ...q
> 1    SOME DATA WILL FILL THIS  .
> 2    .     .     .     .
> 3    .     .     .     .
> 4    .     .     .     .
> .    .     .     .     .
> .    .     .     .     .
> .    .     .     .     .
> P(2) ...
>
> 3
>      [,1]  [,2]  [,3]  [,4]  ...q
> 1    SOME DATA WILL FILL THIS  .
> 2    .     .     .     .
> 3    .     .     .     .
> 4    .     .     .     .
> .    .     .     .     .
> .    .     .     .     .
> .    .     .     .     .
> P(3) ...
>
> Again, how to get those [,1], [,2]... to read (and operate) in the same
> fashion as the column names in the data matrix? Also, am I interpreting the
> dimensions of the array incorrectly? Please feel free to post any helpful
> links on the subject, as I have found "dimnames" and "array" in the R-help
> documentation unhelpful. Any help is greatly appreciated.
>
> Dave Mitchell
> Undergraduate: Statistics and Mathematics,
> University of Illinois, Urbana-Champaign
>
> [[alternative HTML version deleted]]
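For the column-name part of the question, here is a minimal sketch, assuming the data have been reduced to an all-numeric matrix (an array holds a single type, as noted above). The slice labels g1, g2, g3 and the tiny two-row matrix are hypothetical, chosen only for illustration.

```r
# Numeric-only stand-in for the data matrix (values taken from the sample rows).
dat <- matrix(c(3.55, 4.16, 2.15, 2.18), nrow = 2,
              dimnames = list(NULL, c("Qty", "Ct")))

# A 3-d array with one slice per group; every slice has the same dimensions.
arr <- array(NA_real_, dim = c(nrow(dat), ncol(dat), 3))

# dimnames() takes one entry per dimension: rows, columns, slices.
dimnames(arr) <- list(NULL, colnames(dat), c("g1", "g2", "g3"))

# Columns and slices can now be addressed by name.
arr[, "Qty", "g2"] <- dat[, "Qty"]

# Each slice is an ordinary matrix that carries the column names.
for (k in dimnames(arr)[[3]]) {
  print(colnames(arr[, , k]))
}
```

If the three groups have different numbers of rows (p(1), p(2), p(3)), a plain list of matrices may be a better fit than an array, since all slices of an array share one shape.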
Re: [R] what does cut(data, breaks=n) actually do?
Peter Dalgaard wrote:
> melissa cline wrote:
>> Hello,
>>
>> I'm trying to bin a quantity into 2-3 bins for calculating entropy and
>> mutual information. One of the approaches I'm exploring is the cut()
>> function, which is what the mutualInfo function in binDist uses. When it's
>> called in the format cut(data, breaks=n), it somehow splits the data into n
>> distinct bins. Can anyone tell me how cut() decides where to cut?
>>
> This is one case where reading the actual R code is easier than
> explaining what it does. From cut.default
>
>     if (length(breaks) == 1) {
>         if (is.na(breaks) | breaks < 2)
>             stop("invalid number of intervals")
>         nb <- as.integer(breaks + 1)
>         dx <- diff(rx <- range(x, na.rm = TRUE))
>         if (dx == 0)
>             dx <- rx[1]
>         breaks <- seq.int(rx[1] - dx/1000, rx[2] + dx/1000, length.out = nb)
>     }
>
> so basically it takes the range, extends it a bit and splits it into
> equally long segments.
>
> (For the sometimes more attractive option of splitting into groups of
> roughly equal size, there is cut2 in the Hmisc package, or use quantile())
>

It can be a bit dangerous to use quantile() to provide breaks for cut(), because quantiles can be non-unique, which cut() doesn't like:

> x1 <- c(1,1,1,1,1,1,1,1,1,2)
> cut(x1, breaks=quantile(x1, (0:2)/2))
Error in cut.default(x1, breaks = quantile(x1, (0:2)/2)) :
        'breaks' are not unique
>

However, cut2() in Hmisc handles this situation gracefully:

> library(Hmisc)

Attaching package: 'Hmisc'

        The following object(s) are masked from package:base :

         format.pval, round.POSIXt, trunc.POSIXt, units

> cut2(x1, g=2)
 [1] 1 1 1 1 1 1 1 1 1 2
Levels: 1 2
>

(Additionally, a potentially dangerous peculiarity of quantile() for this kind of purpose is that its return values can be out of order (i.e., diff(quantile(...))<0, at rounding error level), but this doesn't actually upset cut() in R because cut() sorts the breaks prior to using them.)
-- Tony Plate
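The two points in this thread can be sketched in base R. The first part recomputes the equal-width breaks by hand, following the cut.default fragment quoted above (the internals of cut() may differ in other R versions, so this mirrors the quoted code rather than any particular release). The second part drops duplicated quantiles with unique() before calling cut(); that is one possible workaround, not what cut2() does internally.

```r
# Part 1: equal-width breaks, following the quoted cut.default fragment.
x  <- c(0, 2.5, 7, 10)
n  <- 2                                   # requested number of bins
rx <- range(x, na.rm = TRUE)
dx <- diff(rx)
breaks <- seq(rx[1] - dx / 1000, rx[2] + dx / 1000, length.out = n + 1)
print(breaks)
bins <- cut(x, breaks = breaks)
print(table(bins))

# Part 2: quantile-based breaks with duplicates removed first.
x1 <- c(1, 1, 1, 1, 1, 1, 1, 1, 1, 2)
q  <- unique(quantile(x1, (0:2) / 2))     # 1, 1, 2 collapses to 1, 2
f  <- cut(x1, breaks = q, include.lowest = TRUE)
print(table(f))
```

Note that removing duplicates leaves fewer bins than requested (here a single bin), which may or may not be acceptable for an entropy calculation.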
Re: [R] matrix graph
Try these:

> x <- matrix(rnorm(100), ncol=10)
> persp(x)
> contour(x)

Also, look at the R graph gallery: http://addictedtor.free.fr/graphiques/

-- Tony Plate

threshold wrote:
> Hi All, simple question:
> do you know how to graph the following object/matrix in a 'surface manner':
>
>        [,1]   [,2]  [,3]  [,4]  [,5]  [,6]
> [1,] -0.154 -0.065 0.129 0.637 0.780 0.221
> [2,]  0.236  0.580 0.448 0.729 0.859 0.475
> [3,]  0.401  0.506 0.310 0.650 0.822 0.448
> [4,]  0.548  0.625 0.883 0.825 0.945 0.637
> [5,]  0.544  0.746 0.823 0.877 0.861 0.642
> [6,]  0.262  0.399 0.432 0.620 0.711 0.404
>
> will be very grateful for hints.
>
> rob
Re: [R] How to know created time of object in R?
The trackObjs package will do this, at the level of objects in an environment. E.g., from the docs:

> library(trackObjs)
> track.start("tmp1")
> x <- 123                          # Not yet tracked
> track(x)                          # Variable 'x' is now tracked
> track(y <- matrix(1:6, ncol=2))   # 'y' is assigned & tracked
> z1 <- list("a", "b", "c")
> z2 <- Sys.time()
> track(list=c("z1", "z2"))         # Track a bunch of variables
> track.summary(size=F)             # See a summary of tracked vars
            class    mode extent length            modified TA TW
x         numeric numeric    [1]      1 2007-09-07 08:50:58  0  1
y          matrix numeric  [3x2]      6 2007-09-07 08:50:58  0  1
z1           list    list  [[3]]      3 2007-09-07 08:50:58  0  1
z2 POSIXt,POSIXct numeric    [1]      1 2007-09-07 08:50:58  0  1
> # (TA="total accesses", TW="total writes")

(creation time is tracked too, but not displayed with default settings)

For more info, look in the trackObjs package.

-- Tony Plate

Dong-hyun Oh wrote:
> Dear UseRs,
>
> I would like to know the created time and date of specific object.
> Is there any function for it?
>
> Thank you in advance.
>
>
> Sincerely,
Re: [R] timezone conversion difficulties with the new US daylight saving time switch over
Mike Waters wrote:
> Tony Plate wrote:
>> [...]
> You don't say if this an R-specific problem, or if it afflicts your
> computer system clock as well.

Thanks, I should have noted that my computer system clock is fine, and as far as I can tell it (correctly) believes we are still in Daylight Saving mode. It did not incorrectly fall back last Sunday at the date that would have been the end of Daylight Saving under the old rules.

In case it matters, I do have a program running on my computer that synchronizes time: "Domain Time II version 3.1.b.20040724R"

-- Tony Plate
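For the Mountain-to-Eastern conversion at issue in this thread, the following is a minimal sketch that names the zones with Olson/IANA identifiers instead of POSIX-style strings such as "EST5EDT". It assumes an R build that ships or can access the Olson time zone database, which may not hold for the Windows builds discussed here, and it is not a diagnosis of the behaviour reported.

```r
# Interpret the wall-clock time as Mountain time, independent of the
# system time zone setting (assumes the Olson zone names are available).
mt <- as.POSIXct("2007-10-30 11:38:47", tz = "America/Denver")

# Render the same instant in Eastern time. On 2007-10-30 US daylight
# saving time was still in effect under the 2007 rules.
ny <- format(mt, tz = "America/New_York", usetz = TRUE)
print(ny)
```

Because both zones are given explicitly, the result should not depend on the TZ environment variable or the operating system's clock settings.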
[R] timezone conversion difficulties with the new US daylight saving time switch over
I'm having difficulties with daylight saving times in US time zones. (Apologies for the long post, but the problem seems subtle and complex, unless I'm doing something completely wrong, in which case it should be evident from the first 10 lines below.)

This is what I see, using a (slightly modified) example from ?as.POSIXlt :

> as.POSIXlt((d <- Sys.time()), "EST5EDT") # the current time in New York
[1] "2007-10-30 12:38:47 EST"
> d
[1] "2007-10-30 11:38:47 Mountain Daylight Time"
>

The problem is that Mountain Time is 2 hours behind Eastern Time and the US is still on Daylight Saving Time - the current time in New York should be reported as "13:38:47 EDT", not "12:38:47 EST".

This is running on Windows XP 64 bit (SP2), and I see the same behavior on Windows XP 32 bit and on Windows 2000 Server (SP4).

AFAICS, this problem only occurs this week, and this week is unusual in that it is the first year that Daylight Saving Time in the US ends on the first Sunday in November rather than the last Sunday in October (I don't know whether this is the cause of the problem, but it seems likely). I see the same problem around the same week last year, but before and after this week in both years, the conversions are fine:

> # *** problem in 2007
> as.POSIXlt(as.POSIXct("2007-10-30 11:38:47"), "EST5EDT")
[1] "2007-10-30 12:38:47 EST"
> # before the problem week 2007
> as.POSIXlt(as.POSIXct("2007-10-20 11:38:47"), "EST5EDT")
[1] "2007-10-20 13:38:47 EDT"
> # after the problem week 2007
> as.POSIXlt(as.POSIXct("2007-11-05 11:38:47"), "EST5EDT")
[1] "2007-11-05 13:38:47 EST"
> # *** problem in 2006
> as.POSIXlt(as.POSIXct("2006-10-30 11:38:47"), "EST5EDT")
[1] "2006-10-30 12:38:47 EST"
> # before the problem week 2006
> as.POSIXlt(as.POSIXct("2006-10-27 11:38:47"), "EST5EDT")
[1] "2006-10-27 13:38:47 EDT"
> # after the problem week 2006
> as.POSIXlt(as.POSIXct("2006-11-07 11:38:47"), "EST5EDT")
[1] "2006-11-07 13:38:47 EST"
>

My computer is set to the Mountain Daylight timezone, and is set to automatically adjust for Daylight Saving Time changes.

> sessionInfo()
R version 2.6.0 Patched (2007-10-11 r43143)
i386-pc-mingw32

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base
> Sys.timezone()
[1] "Mountain Daylight Time"
>

If I explicitly set env var TZ, the conversion problems go away, but the time reported by Sys.time() is inappropriately not in daylight saving time:

> Sys.time()
[1] "2007-10-30 13:14:38 Mountain Daylight Time"
> Sys.setenv(TZ="MST7MDT")
> Sys.time()
[1] "2007-10-30 12:14:51 MST"
>

If I set my system timezone to Eastern Daylight Time, and restart R, I also get problematic behavior (as.POSIXlt inappropriately adjusting a time by an hour on the day after Daylight saving time ends):

> Sys.timezone()
[1] "Eastern Daylight Time"
> as.POSIXlt((d <- Sys.time()), "EST5EDT")
[1] "2007-10-30 12:57:40 EST"
> d
[1] "2007-10-30 13:57:40 Eastern Daylight Time"
>
> # *** problem week 2007
> as.POSIXlt(as.POSIXct("2007-10-30 11:38:47"), "EST5EDT")
[1] "2007-10-30 10:38:47 EST"
> # before the problem week 2007
> as.POSIXlt(as.POSIXct("2007-10-20 11:38:47"), "EST5EDT")
[1] "2007-10-20 11:38:47 EDT"
> # after the problem week 2007
> as.POSIXlt(as.POSIXct("2007-11-05 11:38:47"), "EST5EDT")
[1] "2007-11-05 11:38:47 EST"
> # *** problem week 2006 - the day is after the switch, but
> # the time gets adjusted by one hour
> as.POSIXlt(as.POSIXct("2006-10-30 11:38:47"), "EST5EDT")
[1] "2006-10-30 10:38:47 EST"
> # before the problem week 2006
> as.POSIXlt(as.POSIXct("2006-10-27 11:38:47"), "EST5EDT")
[1] "2006-10-27 11:38:47 EDT"
> # after the problem week 2006
> as.POSIXlt(as.POSIXct("2006-11-10 11:38:47"), "EST5EDT")
[1] "2006-11-10 11:38:47 EST"
>

The problem in 2006 goes away if I set TZ="EST5EDT":

> Sys.setenv(TZ = "EST5EDT")
> Sys.timezone()
[1] "EST"
> as.POSIXlt(as.POSIXct("2006-10-30 11:38:47"), "EST5EDT")
[1] "2006-10-30 11:38:47 EST"
>

Questions:
(1) should I be using a different way to convert times between time zones?
(2) is there a problem in how R
Re: [R] functions applied to two vectors
> a <- c(2, 3, 7, 5)
> b <- c(4, 7, 8, 9)
> mapply(seq, a, b)
[[1]]
[1] 2 3 4

[[2]]
[1] 3 4 5 6 7

[[3]]
[1] 7 8

[[4]]
[1] 5 6 7 8 9

> mapply(sum, a, b)
[1]  6 10 15 14
>

hope this helps,

Tony Plate

Anya Okhmatovskaia wrote:
> Hi,
>
> I am very new to R, so I apologize if I have missed some trivial thing in
> the manuals/archives. I am trying to get rid of for loops in my code and
> replace them with R vector magic (with no success).
>
> Let's say I have 2 vectors of the same length:
> a <- c(2, 3, 7, 5)
> b <- c(4, 7, 8, 9)
>
> What I'd like to do is to generate a list(?) of 4 sequences using a[i] as a
> start indices, and b[i] as end indices:
> 2,3,4
> 3,4,5,6,7
> 7,8
> 5,6,7,8,9
>
> My first guess: "a:b", of course, does not work - only one sequence gets
> generated using the first values from both vectors, plus a warning. Is there
> a special syntax I can use to make ":" treat its operands as vectors?
> More generally, is there a standard way to apply some arbitrary functions to
> two or more vectors in an element-by-element fashion? For instance, sum(a,b)
> will sum all values in both vectors; how can I make it produce a vector of
> pairwise sums instead?
>
> Thank you,
> Anuta
>
> [[alternative HTML version deleted]]
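As a small complement to the mapply() calls above, here is a sketch of two related base R idioms: arithmetic operators are already element-wise, so the pairwise sum needs no apply function at all, and Map() is a variant of mapply() that always returns a list. The variable names are illustrative only.

```r
a <- c(2, 3, 7, 5)
b <- c(4, 7, 8, 9)

# "+" is vectorised, so this is the pairwise sum directly.
pair_sums <- a + b
print(pair_sums)

# Map() applies a function element by element and always returns a list,
# which suits results of differing lengths such as these sequences.
seqs <- Map(seq, a, b)
print(seqs)

# The ":" operator can be used the same way by wrapping it in a function.
seqs_colon <- Map(function(from, to) from:to, a, b)
```

mapply() tries to simplify its result to a vector or matrix when all pieces have the same length; Map() (or mapply(..., SIMPLIFY = FALSE)) avoids that surprise.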
Re: [R] Input appreciated: R teaching idea + a way to improve R-wiki
hadley wickham wrote:
> On 10/23/07, Philippe Grosjean <[EMAIL PROTECTED]> wrote:
>> Hi Matt,
>>
>> The R-Wiki is actively maintained... the addition of material to it is
>> up to R users with any kind of initiative like this being warmly
>> welcome. As for Bill Venable's comment, I totally agree: you should
>> better test your concept first, and be ready to have very poor, as well
>> as probably some excellent documents. I think it should be wise to
>> announce to your students that "the best documents will be posted to the
>> R wiki", so that you may place a filter somewhere.
>>
>> As for the format, PDF is interesting as the student could learn Sweave
>> too. However, the R Wiki allows for further corrections and additions to
>> the documents. For the possible section in the Wiki, may be, a dedicated
>> section like "Users' guide (written by users)" could be created, and
>> then, you will organize material inside as you like. Otherwise, the
>> existing "Guides" section should be fine (feel free to create
>> subdirectories).
>>
>> I tend to give a lot of attention to documents written by "beginners",
>> because they are the best people to tell what is difficult and what is
>> not in R! It is the starting motivation for the R Wiki, indeed.
>
> But they are simultaneously the worst people to provide good advice.
> The wiki seems to be riddled with poor practice and "hacks" to get
> around misunderstandings of the way R works.

Is there any way on the R-Wiki for people to quickly and easily add an annotation indicating that they believe some particular advice is poor practice? Ideally, these annotations would be easily searchable so that other users could find and fix or respond to them.

-- Tony Plate

>
> Hadley
Re: [R] Save and load workspace in R: strange error.
Sounds like you are having permissions problems. And as you're using a mix of Unix and WinXP, you might be suffering from some strange permissions settings. WinXP allows a very rich set of permissions, and many exotic combinations have no corresponding mapping to the much simpler 9-bit Unix permissions space -- so the limited view provided by the Unix permission string -rw-r--r-- may not really reflect what you can do with the file. On my own system, where I use WinXP and cygwin, I've sometimes seen very strange Windows permission sets that look OK in cygwin, but essentially disable use of a particular file. I generally fix this by resetting ownership and all permissions using Windows dialogs.

It sounds like you need to get your sysadmins to help you sort this problem out.

-- Tony Plate

Hongxiao Zhu wrote:
> Tony,
>
> Thanks for return. Actually, the data object 'junk4.RData' was created
> but have size 0. It seems no data was saved. But the real data that I
> want to load have data in it, which I can't load use my own user
> account. But if using other people's user account under the same
> system, it can be loaded.
>
> All the files has the following property if I use ls -l:
> -rw-r--r--
>
> The OS I used is windows xp. But I use SSH to connect to the unix server.
>
> I have been using this server for a long time, this error happened
> since some day and from then on, I can never load/save workspace.
>
> Hong
>
> **
> * Hongxiao Zhu *
> * Department of Statistics, Rice University *
> * Office: DH 3136, Phone: 713-348-2839 *
> * http://www.stat.rice.edu/~hxzhu/ *
> **
>
> On Wed, 3 Oct 2007, Tony Plate wrote:
>
>> Did you check whether 'junk4.RData' was created and what its length
>> was - maybe an empty file is being created. Is there some sort of
>> quota or permissions problem? My suggestion would be to look at the
>> size and permissions on the directory and the file. If you need more
>> help, I would suggest posting more details back to the list, e.g.,
>> what OS you are using, and a directory listing that shows file sizes
>> and permissions (i.e., as you get with 'ls -l' on Unix systems.)
>>
>> -- Tony Plate
>>
>> Hongxiao Zhu wrote:
>>> Hi,
>>>
>>> I tried to load a .RData object on unix system using R, it gives error:
>>>
>>> Error: restore file may be empty -- no data loaded
>>> In addition: Warning message:
>>> file 'junk3.RData' has magic number ''
>>> Use of save versions prior to 2 is deprecated
>>>
>>> This happens only for using MY user account for the Unix system. I
>>> tried to use a friends's user account to load the same data object,
>>> it is
>>> fine. And it never happened to me before until sometime last week.
>>> And This error happens even when I generate a simple random number
>>> from my user account and save it, and load it again.(So obviously it
>>> is not a R version mismatch problem). Does anybody know what happened?
>>>
>>> Here is an example what happened:
>>>
>>>> x=rnorm(100)
>>>> save.image('junk4.RData')
>>>> load('junk4.RData')
>>> Error: restore file may be empty -- no data loaded
>>> In addition: Warning message:
>>> file 'junk4.RData' has magic number ''
>>> Use of save versions prior to 2 is deprecated
>>>
>>> Thanks for any suggestion.
>>>
>>> Hongxiao
>>>
>>>
>>> **
>>> * Hongxiao Zhu *
>>> * Department of Statistics, Rice University *
>>> * Office: DH 3136, Phone: 713-348-2839 *
>>> * http://www.stat.rice.edu/~hxzhu/ *
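The size and permission checks suggested in this thread can also be done from inside R. The following is a rough sketch using a temporary file; it assumes nothing about the poster's server, and file.access() in particular is only a hint, since it can be unreliable on some network file systems.

```r
# Save a small object to a temporary file (the path is arbitrary).
path <- tempfile(fileext = ".RData")
x <- rnorm(100)
save(x, file = path)

# A zero-byte file here would point to a failed write (quota, permissions).
info <- file.info(path)
print(info$size)

# file.access() returns 0 on success; mode 4 tests read permission,
# mode 2 tests write permission.
print(file.access(path, mode = 4))
print(file.access(dirname(path), mode = 2))

# Only attempt to load when something was actually written, and load into
# a separate environment so the workspace is not overwritten.
if (info$size > 0) {
  env <- new.env()
  loaded <- load(path, envir = env)
  print(loaded)
}
```

Running the same checks under a second account, as was tried in the thread, would help separate an account-level problem (quota, ownership) from a problem with the file itself.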
Re: [R] Save and load workspace in R: strange error.
Did you check whether 'junk4.RData' was created and what its length was - maybe an empty file is being created. Is there some sort of quota or permissions problem? My suggestion would be to look at the size and permissions on the directory and the file. If you need more help, I would suggest posting more details back to the list, e.g., what OS you are using, and a directory listing that shows file sizes and permissions (i.e., as you get with 'ls -l' on Unix systems.)

-- Tony Plate

Hongxiao Zhu wrote:
> Hi,
>
> I tried to load a .RData object on unix system using R, it gives error:
>
> Error: restore file may be empty -- no data loaded
> In addition: Warning message:
> file 'junk3.RData' has magic number ''
> Use of save versions prior to 2 is deprecated
>
> This happens only for using MY user account for the Unix system. I
> tried to use a friends's user account to load the same data object, it is
> fine. And it never happened to me before until sometime last week.
> And This error happens even when I generate a simple random number
> from my user account and save it, and load it again.(So obviously it is
> not a R version mismatch problem). Does anybody know what happened?
>
> Here is an example what happened:
>
>> x=rnorm(100)
>> save.image('junk4.RData')
>> load('junk4.RData')
> Error: restore file may be empty -- no data loaded
> In addition: Warning message:
> file 'junk4.RData' has magic number ''
> Use of save versions prior to 2 is deprecated
>
> Thanks for any suggestion.
>
> Hongxiao
>
>
> **
> * Hongxiao Zhu *
> * Department of Statistics, Rice Univeristy *
> * Office: DH 3136, Phone: 713-348-2839 *
> * http://www.stat.rice.edu/~hxzhu/ *
Re: [R] how can I attach a variable stored in
Here's a function that does what I think you want to do:

> attach.firstvar <- function(file) {
+     tmpenv <- new.env()
+     vars <- load(file, envir=tmpenv)
+     x <- get(vars[1], envir=tmpenv, inherits=FALSE)
+     if (is.list(x))
+         attach(x, name=vars[1])
+     return(vars)
+ }
> x <- list(xa=1, xb=2, xc=3)
> save(list="x", file="tmp1.rda")
> remove(list="x")
> attach.firstvar("tmp1.rda")
[1] "x"
> ls(pos=2)
[1] "xa" "xb" "xc"
> find("xa")
[1] "x"
> search()
 [1] ".GlobalEnv"        "x"                 "package:stats"
 [4] "package:graphics"  "package:grDevices" "package:utils"
 [7] "package:datasets"  "package:methods"   "Autoloads"
[10] "package:base"
> xa
[1] 1
> xb
[1] 2
>

Peter Waltman wrote:
> Hi Mark -
>
> Thanks for the reply. Sorry I didn't really clarify too well what I'm
> trying to do. The issue is not that I can't see the variable that gets
> loaded.
>
> The issue is that the variable is a list variable, and I'd like to write
> a function that will take the .RData filename and attach the variable it
> contains so that I can more easily access its contents, i.e.
>
> foo.bar <- list( "a"= "a", "b"=1 )
> save( file="foo.bar.RData", foo.bar )
> rm( foo.bar )
>
> my.fn <- function( fname ) {
>    load( fname )
>    attach( ls( pat="foo" ) ) # I want to attach( foo.bar ), but
> this doesn't work
> }
>
> ls() # prints out "foo.bar"
>
> attach( ls( ) ) # still doesn't work
> attach( foo.bar ) # works
>
> So, basically, the question is how can I attach the variable that's
> stored in a file if I don't already know its name?
>
> Thanks again!
>
> Peter
>
> Leeds, Mark (IED) wrote:
>> I don't think I understand your question but John Fox has written a very
>> nice document about scoping and environments on his website.
>> It's probably easy to find the site by googling "John Fox" but, if you
>> can't find it, let me know.
>>
>> As I said, I don't think that I understand your question but, if you
>> loaded a list variable using load("whatever.Rdata"), the variable will
>> just be sitting in your workspace. You don't need to attach anything
>> because load just loads the data right into the workspace.
>> So typing the variable name should show the data.
>>
>> -Original Message-
>> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
>> On Behalf Of Peter Waltman
>> Sent: Thursday, September 20, 2007 1:32 PM
>> To: r-help@r-project.org
>> Subject: [R] how can I attach a variable stored in
>>
>> Hi -
>>
>> Any help would be greatly appreciated.
>>
>> I'm loading a list variable that's stored in an .RData file and would
>> like attach it.
>>
>> I've used attach( ), but that only lets me see the variable
>> that's stored in the file.
>>
>> As the variable name is of the form "comp.x.x", I've tried using attach(
>> ls( pat="comp" ) ), but get an error as ls() just gives back a string.
>>
>> I've also played around with eval(), but don't really quite get what
>> that function does since it seems to get into the R internals which I
>> don't entirely understand and I haven't found any great unified
>> documentation on R's handling environment and scoping.
>>
>> Thanks,
>>
>> Peter Waltman
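A variation on the attach.firstvar() idea in this thread is to return the loaded object instead of attaching it, which keeps the search path unchanged. The helper name load_first and the temporary file are hypothetical choices for illustration; the sketch reuses the foo.bar example from the question.

```r
# Load an .RData file into a private environment and return the first
# object it contains, whatever that object is called.
load_first <- function(file) {
  env <- new.env()
  vars <- load(file, envir = env)          # load() returns the object names
  get(vars[1], envir = env, inherits = FALSE)
}

# Recreate the situation from the question with a temporary file.
foo.bar <- list(a = "a", b = 1)
path <- tempfile(fileext = ".RData")
save(foo.bar, file = path)
rm(foo.bar)

obj <- load_first(path)
print(obj$b)

# with() gives attach-like access to the list elements for one expression.
print(with(obj, b + 1))
```

The choice between attaching and returning is a matter of taste; returning the object tends to avoid masking surprises when several files contain elements with the same names.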
Re: [R] a quick question about "format()"
runner wrote:
> In the documentation of 'pairs'(package:graphics), within the last example,
> it reads:
>
> format(c(r, 0.123456789), digits=3)[1]
>
> Why not simply use: format(r, digits=3)? What is the difference?

Here are some examples of the difference:

> for (r in 1.2*10^(-6:9)) cat(format(c(r, 0.123456789), digits=3)[1], format(r, digits=3), "\n")
1.20e-06 1.2e-06
0.000012 1.2e-05
0.00012 0.00012
0.0012 0.0012
0.012 0.012
0.120 0.12
1.200 1.2
12.000 12
120.000 120
1200.000 1200
1.20e+04 12000
1.20e+05 120000
1.20e+06 1200000
1.20e+07 1.2e+07
1.20e+08 1.2e+08
1.20e+09 1.2e+09
>

>
> Thanks.
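The difference shown above comes from format() choosing one common representation for all elements of a vector. A minimal sketch for a single value, using r = 0.12 from the table:

```r
r <- 0.12

# Formatted alone, r needs only two significant digits.
plain <- format(r, digits = 3)

# Formatted together with 0.123456789, both elements share a layout with
# three decimal places, so r is padded with a trailing zero.
padded <- format(c(r, 0.123456789), digits = 3)[1]

print(c(plain = plain, padded = padded))
```

Pairing r with a fixed companion value is thus a way to get a more uniform number of decimals across separate calls, which is presumably why the pairs() example does it.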