Re: [R] Conditional statement
Generate the numbers, count the negatives, and then set the negatives to zero:

> set.seed(1)
> x <- rnorm(100,5,3)
> sum(x<0)
[1] 3
> x[x<0] <- 0
> sum(x<0)
[1] 0
>

On Mon, Nov 16, 2009 at 7:43 AM, Rafael Moral wrote:
> Dear useRs,
>
> I wrote a function that simulates a stochastic model in discrete time.
> The problem is that the stochastic parameters should not be negative and
> sometimes they happen to be.
> How can I make it so that, when it draws a negative number, it is turned
> into zero in that time step?
>
> Here is the function:
>
> stochastic_prost <- function(Fmean, Fsd, Smean, Ssd, f, s, n, time,
>     out=FALSE, plot=TRUE) {
>   nt <- rep(0, time)
>   nt[1] <- n
>   for(n in 2:time) {
>     nt[n] <- 0.5*rnorm(1, Fmean, Fsd)*rnorm(1, Smean,
>       Ssd)*exp(1)^(-(f+s)*nt[n-1])*nt[n-1]}
>   if(out==TRUE) {print(data.frame(nt))}
>   if(plot==TRUE) {plot(1:time, nt, type='l', main='Simulation',
>     ylab='Population', xlab='Generations')}
> }
>
> The 2 rnorm()'s should not be negative; when negative they should turn into
> zero.
>
> Thanks in advance,
> Rafael

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
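
Applied to the simulation in the question, the same idea can be used on each draw as it is made. This is only a sketch, assuming base R's pmax() is an acceptable stand-in for the subset assignment shown in the reply; the helper name rnorm_nonneg is made up for illustration and is not part of the original function.

```r
# Hypothetical helper: draw from a normal distribution and
# replace any negative draw with zero.
rnorm_nonneg <- function(n, mean, sd) {
  pmax(rnorm(n, mean, sd), 0)  # element-wise maximum of each draw and 0
}

set.seed(1)
draws <- rnorm_nonneg(100, 5, 3)
sum(draws < 0)  # no negative values remain
```

Inside stochastic_prost, each rnorm(1, Fmean, Fsd) call could then become rnorm_nonneg(1, Fmean, Fsd), and likewise for the Smean/Ssd draw.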
Re: [R] extracting values from correlation matrix
Assuming that your data is in a data frame 'cordata', the following should work:

cordata$cor2_value <- sapply(1:nrow(cordata), function(.row){
    # as.character() guards against factor columns, which would
    # otherwise index by integer code instead of by name
    cor2[as.character(cordata$rowname[.row]), as.character(cordata$colname[.row])]
})

On Mon, Nov 16, 2009 at 11:44 AM, Lee William wrote:
> Hi! All,
>
> I have 2 correlation matrices of 4000x4000 both with same row names and
> column names say cor1 and cor2. I have extracted some information from 1st
> matrix cor1 which is something like this:
>
> rowname  colname  cor1_value
> a        b        0.8
> b        a        0.8
> c        f        0.62
> d        k        0.59
> -        -        --
> -        -        --
>
> Now I wish to extract values from matrix cor2 for the same rowname and
> colname as above so that it looks similar to something like this with values
> in cor2_value:
>
> rowname  colname  cor1_value  cor2_value
> a        b        0.8         ---
> b        a        0.8         ---
> c        f        0.62        ---
> d        k        0.59        ---
> -        -        --          ---
> -        -        --          ---
>
> I am running out of ideas. So I decided to post this on mailing list. Please
> Help!
>
> Best
> Lee

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
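
The lookup can also be done in one step, without sapply(), by indexing the matrix with a two-column character matrix. This is a sketch on toy data; the small cor2 and cordata objects below are stand-ins invented for illustration, and this form of indexing is assumed to be available (it works in current versions of R, but may not have in the version current when this thread was written).

```r
# Toy stand-ins for the poster's objects (made up for illustration)
nms  <- c("a", "b", "c")
cor2 <- matrix((1:9) / 10, nrow = 3, dimnames = list(nms, nms))
cordata <- data.frame(rowname = c("a", "b", "c"),
                      colname = c("b", "a", "a"),
                      stringsAsFactors = FALSE)

# A two-column character matrix selects one cell per row: (rowname, colname)
idx <- cbind(as.character(cordata$rowname), as.character(cordata$colname))
cordata$cor2_value <- cor2[idx]
cordata
```

For a 4000x4000 matrix and many pairs this avoids one function call per row, which is usually the slow part of the sapply() version.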
Re: [R] replacing the dates format in R for exporting the data set...
First of all, the unquoted expression 2009-08-06 is arithmetic (2009 - 8 - 6), so it evaluates to 1995; this is probably not what you were expecting. What do you want your expression to do? Is 'toms_dat' a data frame? If so, your expression 'toms_dat ==2009-08-06' seems strange. So tell us what you want to do, not how you want to do it.

On Tue, Nov 17, 2009 at 4:54 PM, ychu066 wrote:
>
> hi everyone, i am having difficulties with replacing the dates format in R
> for exporting the data set...
>
> eg: the code that i used was
> toms_dat<- replace(toms_dat, toms_dat ==2009-08-06, 2)
> toms_dat<- replace(toms_dat, toms_dat ==2009-08-04, 1)
>
> but when i export the data as into txt file or excel file the dates come up
> with very large numbers.
>
> please help me ...=)

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
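
To make the point about 1995 concrete, here is a small sketch of the difference between the bare expression and a real Date, followed by one way to recode and export. The 'dates' vector is a made-up stand-in for a column of the poster's data, and the guess that the "very large numbers" are internal day counts is an assumption the thread does not confirm.

```r
# Unquoted, the "date" is just subtraction
2009-08-06                      # 2009 - 8 - 6

# Quoted and converted, it is a real Date
d <- as.Date("2009-08-06")
class(d)

# Hypothetical stand-in for one column of the poster's data
dates <- as.Date(c("2009-08-04", "2009-08-06", "2009-08-04"))

# Compare against Date values to recode, as the replace() calls intended
code <- ifelse(dates == as.Date("2009-08-06"), 2, 1)

# format() gives text, so an export shows dates rather than day counts
out <- data.frame(date = format(dates, "%Y-%m-%d"), code = code)
out
```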
Re: [R] Plotting a dataframe with date format
It plots fine for me. I see 2007 and 2008 on the x-axis.

On Tue, Nov 17, 2009 at 4:56 PM, separent wrote:
>
> I tried to plot the attached dataframe with the following command.
>
> plot(inclino.06.1.r00.time.select.transpose[,1],inclino.06.1.r00.time.select.transpose[,2])
>
> The first column is in date format, second is numeric. The plot does not
> correspond to my values. Why?
>
> Regards,
>
> Serge-Étienne Parent
> Golder Associés
> Montréal
>
> http://old.nabble.com/file/p26396493/inclino.06.1.r00.time.select.transpose.rda
> inclino.06.1.r00.time.select.transpose.rda

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
Re: [R] replacing the dates format in R for exporting the data set...
?write.table

If you read the help file, and do a little experimenting, you will see that there is a parameter 'row.names=FALSE' that may answer your question. Also, since you did not have column names on your input, you get V1, V2,... You can put in your own column names. It helps again to read the help file on 'read.table' and look at the parameter 'col.names'. There is also the colnames function. It also might help to (re)read the Intro to R.

On Tue, Nov 17, 2009 at 8:27 PM, ychu066 wrote:
>
> Moreover, I want to rename the column name V1,V2,V3,V4.V146. how do i
> write the code in R ???
>
> thanks everyone that look at the thread/
>
>
>
> ychu066 wrote:
>>
>> hi everyone, i am having difficulties with replacing the dates format in R
>> for exporting the data set...
>>
>> eg: the code that i used was
>> toms_dat<- replace(toms_dat, toms_dat ==2009-08-06, 2)
>> toms_dat<- replace(toms_dat, toms_dat ==2009-08-04, 1)
>>
>> but when i export the data as into txt file or excel file the dates come
>> up with very large numbers.
>>
>> please help me ...=)
>>
> http://old.nabble.com/file/p26400792/what.csv what.csv

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
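
The two suggestions in the reply (renaming columns and leaving row names out of the export) can be sketched as follows. The data frame and the new column names are invented for illustration; only write.table(), its row.names argument and colnames() come from the reply.

```r
# Hypothetical data with default V1, V2, ... names, as read.table() assigns
dat <- data.frame(V1 = 1:3, V2 = c("x", "y", "z"))

# Rename the columns after the fact
colnames(dat) <- c("id", "label")

# row.names = FALSE leaves the 1, 2, 3 row labels out of the file
out_file <- tempfile(fileext = ".txt")
write.table(dat, file = out_file, row.names = FALSE, sep = "\t", quote = FALSE)
readLines(out_file)
```

With 146 columns, the names would more likely be built programmatically (for example with paste()) than typed out, but that depends on what the columns represent.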
Re: [R] error message; ylim + log="y"
like this?

> plot(c(),c(), xlim=c(1,10), ylim=c(0,1), log="y")
Error in axis(side = side, at = at, labels = labels, ...) :
  CreateAtVector [log-axis()]: axp[0] = 0 < 0!
In addition: Warning messages:
1: In is.na(y) :
  is.na() applied to non-(list or vector) of type 'NULL'
2: In plot.window(...) :
  nonfinite axis limits [GScale(-inf,4,2, .); log=1]
3: In axis(side = side, at = at, labels = labels, ...) :
  CreateAtVector "log"(from axis()): axp[0] = 0 !

You have no data to plot. What were you expecting it to do?

When you say "lot of error messages", please include them and also follow the posting guide.

On Wed, Nov 18, 2009 at 4:52 PM, Martin Batholdy wrote:
> Hi,
>
>
> I get a lot of error messages with this command, but I don't understand why;
>
> plot(c(),c(), xlim=c(1,10), ylim=c(0,1), log="y")
>
>
> thanks for any help!

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
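
The warning about nonfinite axis limits points at a second issue in the original call: ylim starts at 0, and zero has no position on a log axis. If the goal was an empty plot region to draw into later (an assumption, since the poster did not say), a sketch along these lines may be closer to what was intended. The lower limit 0.01 is an arbitrary positive value chosen for illustration.

```r
# Empty plot on a log-scaled y axis; both limits must be positive
y_limits <- c(0.01, 1)
plot(1, 1, type = "n", xlim = c(1, 10), ylim = y_limits, log = "y",
     xlab = "x", ylab = "y")
```

type = "n" sets up the axes without drawing the single dummy point, so points() or lines() can add data afterwards.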
Re: [R] Is there an variant of apply() that does not return anything?
invisible(apply(...))

On Thu, Nov 19, 2009 at 5:21 PM, Peng Yu wrote:
> There are a few versions of apply() (e.g., lapply(), sapply()). I'm
> wondering if there is one that does not return anything but just
> silently applies a function to the list argument.
>
> For example, the plot function is applied to each element in 'alist'.
> It is redundant to return anything from apply.
>
> apply(alist,function(x){ plot each element of alist})

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
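
Spelled out for a list, the suggestion looks roughly like this. The sketch uses lapply() rather than apply(), on the assumption that 'alist' really is a list (apply() expects something with dimensions); the toy data is made up.

```r
alist <- list(1:10, (1:10)^2, sqrt(1:10))  # toy data

# lapply() for the side effect only; invisible() suppresses the printed result
invisible(lapply(alist, function(x) plot(x, type = "l")))

# A plain loop returns nothing to begin with
for (x in alist) plot(x, type = "l")
```

Either form still builds or discards a result internally; the difference is only in what gets printed at the console.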
Re: [R] Remove leading and trailing white spaces
try this:

> x <- ' middle of the string '
> sub("^[[:space:]]*(.*?)[[:space:]]*$", "\\1", x, perl=TRUE)
[1] "middle of the string"

On Fri, Nov 20, 2009 at 10:51 AM, Bos, Roger wrote:
> I have a character string and I would like to remove the leading and
> trailing white spaces. The example for 'sub' shows how to remove the
> trailing white spaces, but I still can't figure out how to remove both
> trailing and leading white spaces because I can't find any documentation
> for what "+$" means or what "\\s+$" means. Maybe it's because I don't
> have a Unix background. Thanks in advance for any help with this.
>
> str <- ' Now is the time '
> sub(' +$', '', str) ## spaces only
> sub('[[:space:]]+$', '', str) ## white space, POSIX-style
> sub('\\s+$', '', str, perl = TRUE) ## Perl-style white space
>
> Thanks,
>
> Roger

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
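
Since the question also asked what "+$" and "\\s+$" mean, here is an annotated alternative to the pattern in the reply. It is a sketch of a common idiom rather than the reply's own method; trimws() is a base R helper in R 3.2.0 and later, which postdates this thread.

```r
x <- "   middle of the string   "

# ^\\s+  : one or more (+) whitespace characters (\\s) at the start (^)
# \\s+$  : one or more whitespace characters at the end ($)
# |      : either of the two; gsub() replaces every match
gsub("^\\s+|\\s+$", "", x)

# Base R (3.2.0 and later) also ships a helper for this
trimws(x)
```

The pieces are documented under ?regex in R, which covers "+", "$", "^" and the character classes.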
Re: [R] read a file into a matrix
What is wrong with the "extra step"? Is it taking too much time (you did not specify that), or is it taking too much memory? How many times are you going to be doing it? If not many, then maybe it is OK. You have to quantify what you are asking for. It may take longer to send a message to R-Help and get a response than to just read the file in and process it.

On Fri, Nov 20, 2009 at 1:01 PM, Peng Yu wrote:
> On Sat, Nov 21, 2009 at 11:55 AM, Steve Lianoglou wrote:
>>> read.delim gives me a data.frame. Is there a function that can return
>>> the result in a matrix rather than data.frame?
>>
>> m <- as.matrix(read.delim(..))
>
> I knew this approach. But this takes an extra step. Is there a command
> that read a file directly into a matrix?
>
> Regards,
> Peng

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
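
For what it is worth, a purely numeric file with no header can be read without going through a data frame by combining scan() and matrix(). This is a sketch under those assumptions, not something proposed in the thread; the temporary file and its 3-column layout are made up.

```r
# Write a small tab-delimited numeric file to work with (toy data)
f <- tempfile(fileext = ".txt")
write.table(matrix(1:6, nrow = 2), f, sep = "\t",
            row.names = FALSE, col.names = FALSE)

# scan() returns a plain vector; matrix() shapes it, filling by row to
# match the file layout. ncol = 3 is known here only because we wrote the file.
m <- matrix(scan(f, sep = "\t", quiet = TRUE), ncol = 3, byrow = TRUE)
m
```

It is still two function calls, so whether it beats as.matrix(read.delim(...)) is a question of time and memory, which is exactly what the reply asks the poster to quantify.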
Re: [R] How to concatenate a vector of strings to a string?
?paste

Read the help file, esp. the collapse parameter. Might help to reread Intro to R.

On Fri, Nov 20, 2009 at 8:03 PM, Peng Yu wrote:
>> paste(c('a','b'),sep='')
> [1] "a" "b"
>
> The above command doesn't concatenate the strings in a single string.
> I'm wondering what is the correct way to do so.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
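
The difference between the two arguments, in a short example:

```r
parts <- c("a", "b")
paste(parts, sep = "")        # sep joins separate arguments; still two strings
paste(parts, collapse = "")   # collapse joins the elements of one vector
```

The first call returns a vector of length two, the second a single string.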
Re: [R] Help with indexing
try this:

> # create a factor and then convert back to numeric
> x$nb <- as.integer(factor(x$name, levels=unique(x$name))) + 99
> x
  name freq  nb
1 Mary    1 100
2 Mary    2 100
3 Mary    3 100
4  Sam    1 101
5  Sam    2 101
6 John    1 102
7 John    2 102
8 John    3 102
9 John    4 102

On Sat, Nov 21, 2009 at 7:00 PM, Dana Sevak wrote:
> Dear R Helpers,
>
> I am missing something very elementary here, and I don't seem to get it from
> the help pages of the ave, seq and seq_along functions, so I wonder if you
> could offer a quick help.
>
> To use an example from an earlier post on this list, I have a dataframe of
> this kind:
>
> dat = data.frame(name = rep(c("Mary", "Sam", "John"), c(3,2,4)))
> dat$freq = ave(seq_along(dat$name), dat$name, FUN = seq_along)
>
> dat
>   name freq
> 1 Mary    1
> 2 Mary    2
> 3 Mary    3
> 4  Sam    1
> 5  Sam    2
> 6 John    1
> 7 John    2
> 8 John    3
> 9 John    4
>
> What I need is another column assigning a number to each name starting from
> index 100, that is:
>
>   name freq  nb
> 1 Mary    1 100
> 2 Mary    2 100
> 3 Mary    3 100
> 4  Sam    1 101
> 5  Sam    2 101
> 6 John    1 102
> 7 John    2 102
> 8 John    3 102
> 9 John    4 102
>
> What is the easiest way to do this?
>
> Thanks a lot for your kind help.
>
> Dana

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
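
An equivalent way to get the same numbering, shown here only as an alternative sketch to the factor conversion in the reply, is match() against the unique names:

```r
x <- data.frame(name = rep(c("Mary", "Sam", "John"), c(3, 2, 4)))

# match() gives each name's position in the list of unique names,
# in order of first appearance; adding 99 starts the numbering at 100
x$nb <- match(x$name, unique(x$name)) + 99
x
```

Both versions number the names by first appearance rather than alphabetically, which is why the reply sets levels=unique(x$name) explicitly.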
Re: [R] how to read BRFSS file
Exactly what have you tried and what did not work? I downloaded the '.asc' (text) version of the data and it appears to be fixed format with 1294 characters per line; there are about 414K lines of data in the file. How much of the data do you need to extract? You can read in a portion of the file at a time and then extract just the fields that you need for processing. If it is not too many fields, this should be a reasonably sized object.

On Sat, Nov 21, 2009 at 7:58 PM, chloe yoon wrote:
> hello,
> I am trying to do exploratory factor analysis with BRFSS dataset (
> http://www.cdc.gov/brfss/technical_infodata/surveydata/2008.htm) for a
> couple of days, but I was not able to do that and got frustrated. Can
> anybody help me with step by step guide? BRFSS dataset provides ASCII or SAS
> format.
> Thank you.
>
> chloe

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
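
Reading only selected fields from a fixed-format file can be sketched with read.fwf(). Everything about the layout below is invented for illustration: the real BRFSS column positions come from its codebook and are not reproduced here, and the toy file stands in for the 1294-character records.

```r
# Toy fixed-width file; NOT the real BRFSS layout
f <- tempfile(fileext = ".asc")
writeLines(c("01 2345AB", "02 6789CD"), f)

# Positive widths are read, negative widths are skipped.
# Hypothetical layout: 2-char state, 1 skipped, 4-char value, 2-char code
dat <- read.fwf(f, widths = c(2, -1, 4, 2),
                col.names = c("state", "value", "code"),
                n = 1000)   # n caps how many records are read in this pass
dat
```

Skipping unwanted stretches with negative widths keeps the resulting object small, which is the point made in the reply about extracting just the fields needed.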
Re: [R] Removing "+" and "?" signs
'?' is a metacharacter in a regular expression. You have to escape it:

> x <- "asdf+,jkl?"
>
> gsub("?", " ", x)
Error in gsub("?", " ", x) : invalid regular expression '?'
In addition: Warning message:
In gsub("?", " ", x) :
  regcomp error: 'Invalid preceding regular expression'
> # escape it
> gsub("\\?", " ", x)
[1] "asdf+,jkl "

On Sun, Nov 22, 2009 at 6:01 PM, Steven Kang wrote:
> Hi all,
>
>
> I get an error message when trying to replace *+* or *?* signs (with empty
> space) from a string.
>
> x <- "asdf+,jkl?"
>
> gsub("?", " ", x)
>
>
> Error message:
>
> Error in
> gsub("?", " ", x) :
> invalid regular expression '?'
> In addition: Warning message:
> In gsub("?", " ", x) :
> regcomp error: 'Invalid preceding regular expression'
>
> Your expertise in resolving this issue would be appreciated.
>
> Thanks.
>
>
>
> Steven

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
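
The question asked about both "+" and "?", and "+" is a metacharacter as well. Two other ways to handle them, offered as a sketch alongside the escaping shown in the reply:

```r
x <- "asdf+,jkl?"

# Inside [ ] both characters are literal, so no escaping is needed,
# and one call handles both signs
gsub("[+?]", " ", x)

# fixed = TRUE treats the whole pattern as a literal string
gsub("+", " ", x, fixed = TRUE)
```

The character class replaces both signs at once; the fixed = TRUE form handles one literal string per call.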
Re: [R] Check if string has all alphabets or numbers
try this:

> mywords<- c("harry","met","sally","subway10","1800Movies","12345")
> grep("^[[:alpha:]]*$", mywords) # letters
[1] 1 2 3
> grep("^[[:digit:]]*$", mywords) # numbers
[1] 6
>

On Mon, Nov 23, 2009 at 8:28 AM, Harsh wrote:
> Hi R users,
> I'd like to know if anyone has come across problems wherein it was necessary
> to check if strings contained all alphabets, some numbers or all numbers?
>
> In my attempt to test if a string is numeric, alpha-numeric (also includes
> if string is only alphabets) :
>
> # Reproducible R code below
> mywords<- c("harry","met","sally","subway10","1800Movies","12345")
>
> mywords.alphanum
> <-lapply(sapply(mywords,function(x)strsplit(x,NULL)),function(y)
> ifelse(sum(is.na(sapply(y,as.numeric))) == 0 & length(y) >
> 0,"numeric","alpha-numeric"))
>
> names(mywords.alphanum)[(which(mywords.alphanum == "numeric"))]
>
>
> I understand that such "one-liners" (the second line of code above) that
> make multiple calls are discouraged, but I seem to find them fascinating.
>
> Looking forward to alternate solutions/packages for the above problem.
>
> Thanks
> Harsh Singhal

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
Re: [R] Check if string has all alphabets or numbers
Added a little more:

> mywords<- c("harry","met","sally","subway10","1800Movies","12345",
+ "not correct 123")
> all.letters <- grep("^[[:alpha:]]*$", mywords)
> all.numbers <- grep("^[[:digit:]]*$", mywords) # numbers
> mixed <- grep("^[[:digit:][:alpha:]]*$", mywords)
> all.letters
[1] 1 2 3
> all.numbers
[1] 6
> # mixed
> setdiff(mixed, c(all.numbers, all.letters))
[1] 4 5
> # not any of the above
> setdiff(seq(length(mywords)), c(mixed, all.numbers, all.letters))
[1] 7
>

On Mon, Nov 23, 2009 at 8:28 AM, Harsh wrote:
> Hi R users,
> I'd like to know if anyone has come across problems wherein it was necessary
> to check if strings contained all alphabets, some numbers or all numbers?
>
> In my attempt to test if a string is numeric, alpha-numeric (also includes
> if string is only alphabets) :
>
> # Reproducible R code below
> mywords<- c("harry","met","sally","subway10","1800Movies","12345")
>
> mywords.alphanum
> <-lapply(sapply(mywords,function(x)strsplit(x,NULL)),function(y)
> ifelse(sum(is.na(sapply(y,as.numeric))) == 0 & length(y) >
> 0,"numeric","alpha-numeric"))
>
> names(mywords.alphanum)[(which(mywords.alphanum == "numeric"))]
>
>
> I understand that such "one-liners" (the second line of code above) that
> make multiple calls are discouraged, but I seem to find them fascinating.
>
> Looking forward to alternate solutions/packages for the above problem.
>
> Thanks
> Harsh Singhal

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
Re: [R] Check if string has all alphabets or numbers
Here is the way you can use grepl to get the various combinations:

> mywords<- c("harry","met","sally","subway10","1800Movies","12345",
+ "not correct 123", "")
>
> numbers <- grepl("^[[:digit:]]+$", mywords)
> letters <- grepl("^[[:alpha:]]+$", mywords)
> both <- grepl("^[[:digit:][:alpha:]]+$", mywords)
>
> mywords[letters]
[1] "harry" "met" "sally"
> mywords[numbers]
[1] "12345"
> mywords[xor((letters | numbers), both)] # letters & numbers mixed
[1] "subway10" "1800Movies"
>

On Mon, Nov 23, 2009 at 9:17 AM, hadley wickham wrote:
>>> mywords<- c("harry","met","sally","subway10","1800Movies","12345", "not
>>> correct 123")
>>> all.letters <- grep("^[[:alpha:]]*$", mywords)
>>> all.numbers <- grep("^[[:digit:]]*$", mywords) # numbers
>>> mixed <- grep("^[[:digit:][:alpha:]]*$", mywords)
>
> mywords<- c("harry","met","sally","subway10","1800Movies","12345",
> "not correct 123", "")
> mywords[grepl("^[[:digit:][:alpha:]]*$", mywords)]
>
> So maybe you should use
>
> mywords[grepl("^[[:digit:][:alpha:]]+$", mywords)]
>
>
> And grepl is highly recommended over grep.
>
> Hadley
>
> --
> http://had.co.nz/

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
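
The three grepl() tests can be folded into one labelling function. This is a sketch, not code from the thread: the name classify_words and the four labels are made up, and [[:alnum:]] is used as shorthand for the combined digit/alpha class.

```r
# Hypothetical helper wrapping the grepl() tests
classify_words <- function(words) {
  is_num   <- grepl("^[[:digit:]]+$", words)
  is_alpha <- grepl("^[[:alpha:]]+$", words)
  is_alnum <- grepl("^[[:alnum:]]+$", words)
  ifelse(is_num, "numbers",
    ifelse(is_alpha, "letters",
      ifelse(is_alnum, "mixed", "other")))
}

mywords <- c("harry", "met", "sally", "subway10", "1800Movies", "12345",
             "not correct 123", "")
data.frame(word = mywords, class = classify_words(mywords))
```

Using "+" rather than "*" in each pattern is what sends the empty string to "other", which is the point raised in the quoted message.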
Re: [R] how to have R automatically print traceback upon errors
I use this:

options(error=utils::recover)

and anytime an error occurs in the interactive mode, it will print out the traceback and then allow you to explore the variables at each level of the stack; just like putting 'browser()' in the code at the error point. Here is what I get when running under Windows:

> x <- function() xyz() # non-existent function
>
>
> x() # call the function
Error in x() : could not find function "xyz"   <== error message

Enter a frame number, or 0 to exit

1: x()   <== traceback

Selection: 0
>

On Mon, Nov 23, 2009 at 7:52 PM, Hao Cen wrote:
> Hi,
>
> I wonder how to have R automatically print stack trace produced by
> traceback upon errors during interactive uses. I tried the suggestions on
> http://old.nabble.com/Automatically-execute-traceback-when-execution-of-script-causes-error--td22368483.html#a22368775
>
> and used options(error = recover)
> options(showErrorCalls = T)
>
> It just produces an extra message like "recover called non-interactively;
> frames dumped, use debugger() to view"
>
> Thanks
>
> Jeff

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
Re: [R] write to file append by column
You cannot append a column. Best bet: read the old file in, do a 'cbind', and write the object back out.

On Tue, Nov 24, 2009 at 5:59 AM, e-letter wrote:
> Readers,
>
> Scenario: data x consists of one column;
> 1
> 2
> 3
>
> data y;
> 4
> 5
> 6
>
> Is it possible to write to file such that the file is:
> 1,4
> 2,5
> 3,6
>
> using the write.file function? I have tried the command:
>
> write(x,file="file.csv",ncolumns=1,append=TRUE,sep=",")
> write(y,file="file.csv",ncolumns=1,append=TRUE,sep=",")
>
> but the result is:
>
> 1
> 2
> 3
> 4
> 5
> 6
>
> yours,
>
> rhelpatconference.jabber.org
> r 251
> mandriva 2008

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
Re: [R] write to file append by column
Here is the way to get the required output:

> x <- data.frame(a=1:3)
> write.csv(x, file='tempxx.csv', row.names=FALSE)
> # new data
> newData <- data.frame(b=4:6)
> # read old data back in
> oldData <- read.csv('tempxx.csv')
> # cbind the new data
> write.csv(cbind(oldData, newData), file='tempxx.csv', row.names=FALSE)
>
> file.show('tempxx.csv')
"a","b"
1,4
2,5
3,6

On Tue, Nov 24, 2009 at 8:22 AM, e-letter wrote:
> On 24/11/2009, jim holtman wrote:
>> You can not append a column. Best bet, read the old file in, do a
>> 'cbind', write the object back out.
>>
>> On Tue, Nov 24, 2009 at 5:59 AM, e-letter wrote:
>>> Readers,
>>>
>>> Scenario: data x consists of one column;
>>> 1
>>> 2
>>> 3
>>>
>>> data y;
>>> 4
>>> 5
>>> 6
>>>
>>> Is it possible to write to file such that the file is:
>>> 1,4
>>> 2,5
>>> 3,6
>>>
>>> using the write.file function? I have tried the command:
>>>
>>> write(x,file="file.csv",ncolumns=1,append=TRUE,sep=",")
>>> write(y,file="file.csv",ncolumns=1,append=TRUE,sep=",")
>>>
>>> but the result is:
>>>
>>> 1
>>> 2
>>> 3
>>> 4
>>> 5
>>> 6
>>>
>>> yours,
>>>
>>> rhelpatconference.jabber.org
>>> r 251
>>> mandriva 2008
>>
> This is the requested format:
> 1,4
> 2,5
> 3,6
>
> The write functions described previously produce the following format:
> 1
> 2
> 3
> 4
> 5
> 6

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
Re: [R] Removing white space help
Is this what you want:

> x <- " and fgh-"
> gsub(" +", "", x)
[1] "andfgh-"

On Tue, Nov 24, 2009 at 4:18 PM, Ramyathulasingam wrote:
>
> Hi there
>
> I am trying to remove the white space and replace it with nothing but didnt
> have any luck with that
>
> x <- and fgh-
>
> i can replace the comma using gsub
> gsub("\\-","",x)
> but i cant replace the white space with nothing.
>
> Ramya

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
Re: [R] Test Binary File
Do you know how it is structured? Is it 64-bit floating point, 32-bit floating point, 64-bit integer, 32-bit integer, byte values, etc.? If we know the structure, then we can determine how to decode the information.

On Wed, Nov 25, 2009 at 7:34 AM, Jason Rupert wrote:
> I've got an error with the way I'm using readBin on a binary file of unknown
> internal structure. I know the structure consists of rows and columns, but
> I'm not sure how many of each.
>
> So, does anyone know of a valid test set of binary data that I could
> reference while trying to figure out the technique of using readBin?
>
> It would be really helpful to try out readBin on a readily available and
> understood binary file instead of starting with one of dubious internal
> structure.
>
> Thank you again for your help and feedback.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
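
One way to get a binary file with a fully understood structure is to write it yourself with writeBin() and then practise reading it back. The sketch below assumes 8-byte doubles and a 3 x 4 matrix, both chosen arbitrarily for illustration; it says nothing about the poster's actual file.

```r
# Build a small binary file with a known layout: a 3 x 4 matrix of
# 8-byte doubles, written column by column (toy data)
m <- matrix(as.numeric(1:12), nrow = 3)
f <- tempfile(fileext = ".bin")
writeBin(as.vector(m), f, size = 8)

# Read it back; 'what', 'size' and 'n' must match how the file was written
con <- file(f, "rb")
vals <- readBin(con, what = "double", n = 12, size = 8)
close(con)

# The row count is not stored in the file, so it has to be supplied
m2 <- matrix(vals, nrow = 3)
identical(m, m2)
```

Changing 'what' or 'size' on the read side and watching the values turn to nonsense is a quick way to see why the reply asks about the storage type first.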
Re: [R] tick marks on fold change versus fold change plot
It sounds like you want a log scale on both axes:

plot(..., log='xy')

On Wed, Nov 25, 2009 at 12:24 PM, Alla Bulashevska wrote:
>
> Dear R users,
> i try to produce the fold change versus fold change plot
> where i have the values for x and y ranging from 0.01 to
> 100. So i start with
> plot(x,y,xlim=c(0.01,100),ylim=c(0.01,100), axes=F).
> Then i would like both axes to have tick marks as
> c(0.01,0.1,1,10,100) but they should appear equidistant.
> How should i manage this?
> Thank you for your help,
> Alla.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
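A possible way to combine the log scale with the tick marks asked for is to suppress the default axes and draw them with axis(). This is only a sketch: the data are random numbers invented for the example, and pdf(NULL) is used so that it runs without opening a graphics window.

```r
# Made-up fold-change data in the range 0.01 .. 100
set.seed(1)
x <- 10^runif(50, -2, 2)
y <- 10^runif(50, -2, 2)
ticks <- c(0.01, 0.1, 1, 10, 100)

pdf(NULL)   # null device: nothing is written to disk
plot(x, y, log = "xy", xlim = range(ticks), ylim = range(ticks), axes = FALSE)
axis(1, at = ticks, labels = ticks)   # x axis with the requested ticks
axis(2, at = ticks, labels = ticks)   # y axis with the requested ticks
box()
invisible(dev.off())

# On a log scale the requested ticks are equally spaced:
tick_gaps <- diff(log10(ticks))
```

Each tick is a factor of 10 from its neighbour, so on the log scale they end up equidistant.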
Re: [R] Importing many files from a single code
Exactly what do you mean by "import"? What commands are you using? You can get a list of the files in a directory and then iterate through reading each one in. If you use 'lapply', you can 'read.table' in some data frames and then 'rbind' them into a single data frame. You need to be more specific on the problem you are trying to solve. On Wed, Nov 25, 2009 at 9:35 AM, ram basnet wrote: > Dear R users, > > Does somebody know the way to import many files by a single command in R ? I > have 50 files in a directory and now, i am importing the files repeatedly > (one by one). If there is a way to import all files at a time, it makes much > more easy and save times too. > Thanks in advance. > > > Sincerely, > Ram Kumar Basent > Wageningen University, > the Netherlands > > > > > [[alternative HTML version deleted]] > > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
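The list-the-files, read-each-one, rbind approach described above could look roughly like this. The directory, file names and columns are made up so that the sketch is self-contained; with real data you would point list.files() at your own directory and adjust the read.table() arguments to your file format.

```r
# Create three small made-up files in a temporary directory
dir <- file.path(tempdir(), "import_demo")
dir.create(dir, showWarnings = FALSE)
for (i in 1:3) {
  write.table(data.frame(id = i, value = c(10, 20) * i),
              file = file.path(dir, paste0("file", i, ".txt")),
              row.names = FALSE)
}

# Get the file names, read each file, and stack the results
files <- list.files(dir, pattern = "\\.txt$", full.names = TRUE)
pieces <- lapply(files, read.table, header = TRUE)   # list of data frames
all_data <- do.call(rbind, pieces)                   # one combined data frame

unlink(dir, recursive = TRUE)                        # clean up the demo files
```

This assumes every file has the same columns; rbind will complain if they do not.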
Re: [R] Unique observations
shouldn't the first observation for Tree1 be " Tree1 leaves 01-01-2009"? > x <- read.table(textConnection("Tree disease date + Tree1 leaves 01-01-2009 + Tree2 roots 13-09-2009 + Tree1 roots 24-10-2009"), header=TRUE) > closeAllConnections() > # split by "Tree" and take first observation > do.call('rbind', lapply(split(x, x$Tree), function(.tr) .tr[1,])) Tree disease date Tree1 Tree1 leaves 01-01-2009 Tree2 Tree2 roots 13-09-2009 > On Wed, Nov 25, 2009 at 9:44 AM, John Lipkins wrote: > Hey R list, > > A beginners question. How can I do the following: > > In my research population it is possible that several items can appear > several times, measured on different moments in time. This is being supplied > in a total list with all observations identified by a number (per item) and > a moment of observation (date). Now I want to make a unique list of this > observation preserving the characteristics of the first observation. As > example: > > Tree disease date > Tree1 leaves 01-01-2009 > Tree2 roots 13-09-2009 > Tree1 roots 24-10-2009 > > Now I want to create a list of unique elements (in the example only once > Tree1 and Tree2) with the first observed disease and date. For the example > the result would look like: > > Tree disease date > Tree1 roots 24-10-2008 > Tree2 roots 13-09-2009 > > Can someone help me with this question? > > Thanks in advance. > Kind regards, > > John > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
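The split/lapply solution above takes the first row of each group as the rows happen to be stored. If "first" is meant chronologically, one alternative (a sketch, assuming the dates are day-month-year strings as in the example) is to sort by date and then drop the repeated trees with duplicated():

```r
x <- read.table(text = "Tree disease date
Tree1 leaves 01-01-2009
Tree2 roots 13-09-2009
Tree1 roots 24-10-2009", header = TRUE)

# Parse the day-month-year strings so that ordering is chronological
x$date <- as.Date(x$date, format = "%d-%m-%Y")
x <- x[order(x$Tree, x$date), ]

# Keep the first (earliest) row for each tree
first_obs <- x[!duplicated(x$Tree), ]
```

For the three rows in the example this keeps the 'leaves' observation for Tree1, the same rows as the split/lapply version.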
Re: [R] difference of two rows
Try this:

> x <- read.table(textConnection("ID YEAR
+ 13 2007
+ 15 2003
+ 15 2006
+ 15 2008
+ 21 2006
+ 21 2007"), header=TRUE)
> x$diff <- ave(x$YEAR, x$ID, FUN=function(a) c(diff(a), NA))
>
> x
  ID YEAR diff
1 13 2007   NA
2 15 2003    3
3 15 2006    2
4 15 2008   NA
5 21 2006    1
6 21 2007   NA

On Wed, Nov 25, 2009 at 10:55 AM, clion wrote:
>
> Dear R user,
> I'd like to calculate the difference of two rows, where "ID" is the same.
> eg.: I've got the following dataframe:
> ID YEAR
> 13 2007
> 15 2003
> 15 2006
> 15 2008
> 21 2006
> 21 2007
>
> and I'd like to get the difference, like this:
> ID YEAR diff
> 13 2007 NA
> 15 2003 3
> 15 2006 2
> 15 2008 NA
> 21 2006 1
> 21 2007 NA
>
> that should be fairly easy...I hope
> Thanks for any helpful comments
> B.
>
> --
> View this message in context:
> http://old.nabble.com/difference-of-two-rows-tp26515212p26515212.html
> Sent from the R help mailing list archive at Nabble.com.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
Re: [R] Feature request for as.Date() function
Seems to work fine in my testing: > x <- read.csv(textConnection("date,value + 2009-01-01,10 + 2009-02-01,1 + 'NA', 3"), colClasses=c("Date", 'integer')) > > str(x) 'data.frame': 3 obs. of 2 variables: $ date :Class 'Date' num [1:3] 14245 14276 NA $ value: int 10 1 3 > x <- read.csv(textConnection("date,value + 2009-01-01,10 + 2009-02-01,1 + NA, 3"), colClasses=c("Date", 'integer')) > > str(x) 'data.frame': 3 obs. of 2 variables: $ date :Class 'Date' num [1:3] 14245 14276 NA $ value: int 10 1 3 > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. On Wed, Nov 25, 2009 at 12:38 PM, wrote: > Hello - > > I have a csv file with a few date columns. Some of the records have an > "NA" character string instead of the date. When I attempt to use > read.csv() and typecast the columns using colClasses, I receive the > following error: > Error in charToDate(x) : > character string is not in a standard unambiguous format > > Similarly, the following command produces the same error: > as.Date("NA") > > However, as.Date(NA) performs as documented. > > Can we enhance the as.Date() function to convert "NA" strings into NA > value prior to type conversion? > > Thanks! > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] 2-character plotting characters?
?text

You will have to plot your own text at each point that you want.

On Sat, Nov 28, 2009 at 6:38 PM, Ben Seligman wrote:
> I am trying to make a plot using the plot command in which I would like the
> plotting characters to be two-character strings (they're two-letter
> abbreviations of country names). I've tried the pch argument and this, of
> course, only produces 1-character strings. Looking through Intro to R and
> the reference manual, I can't find any obvious way around this. Would
> anyone have any suggestions?
>
> Thanks so much!
>
> -Ben
>
> --
> Benjamin Seligman
> Stanford University, School of Medicine
> MD Candidate, SMS II

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
Re: [R] How to find where the source code of an R function or package is installed?
Check out: Uwe Ligges. R Help Desk: Accessing the sources. R News, 6(4):43-45, October 2006 On Sat, Nov 28, 2009 at 11:00 PM, Peng Yu wrote: > I'm wondering where is the source of an R function or a package is. > For example, where is 'attributes'? > >> attributes > function (obj) .Primitive("attributes") > > I also do understand what .Primitive mean. Could somebody let me know > how to locate source file in an R installation? Why typing > 'attributes' does not give its definition? > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Removing objects from a list based on nrow
One thing to be careful of is the case where no data frame has fewer than 3 rows:

> df1<-data.frame(letter=c("A","B","C","D","E"),number=c(1,2,3,4,5))
> df2<-data.frame(letter=c("A","B"),number=c(1,2))
> df3<-data.frame(letter=c("A","B","C","D","E"),number=c(1,2,3,4,5))
> df4<-data.frame(letter=c("A","B","C","D","E"),number=c(1,2,3,4,5))
>
> lst<-list(df1,df3,df4)
> lst
[[1]]
  letter number
1      A      1
2      B      2
3      C      3
4      D      4
5      E      5

[[2]]
  letter number
1      A      1
2      B      2
3      C      3
4      D      4
5      E      5

[[3]]
  letter number
1      A      1
2      B      2
3      C      3
4      D      4
5      E      5

> lst[-which(sapply(lst, nrow) < 3)]
list()
>

Notice the list is now empty. Instead use:

> lst[sapply(lst, nrow) >=3]
[[1]]
  letter number
1      A      1
2      B      2
3      C      3
4      D      4
5      E      5

[[2]]
  letter number
1      A      1
2      B      2
3      C      3
4      D      4
5      E      5

[[3]]
  letter number
1      A      1
2      B      2
3      C      3
4      D      4
5      E      5

On Sun, Nov 29, 2009 at 3:43 AM, Linlin Yan wrote:
> Try these:
> sapply(lst, nrow) # get row numbers
> which(sapply(lst, nrow) < 3) # get the index of rows which has less than 3
> rows
> lst <- lst[-which(sapply(lst, nrow) < 3)] # remove the rows from the list
>
> On Sun, Nov 29, 2009 at 4:36 PM, Tim Clark wrote:
>> Dear List,
>>
>> I have a list containing data frames of various numbers of rows. I need to
>> remove any data frame that has less than 3 rows. For example:
>>
>> df1<-data.frame(letter=c("A","B","C","D","E"),number=c(1,2,3,4,5))
>> df2<-data.frame(letter=c("A","B"),number=c(1,2))
>> df3<-data.frame(letter=c("A","B","C","D","E"),number=c(1,2,3,4,5))
>> df4<-data.frame(letter=c("A","B","C","D","E"),number=c(1,2,3,4,5))
>>
>> lst<-list(df1,df2,df3,df4)
>>
>> How can I determine that the second object (df2) has less than 3 rows and
>> remove it from the list?
>>
>> Thanks!
>>
>> Tim
>>
>> Tim Clark
>> Department of Zoology
>> University of Hawaii

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
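The difference between the two forms of indexing can be checked on small inputs. The sketch below uses Filter() from base R as one more way to write the "keep" version; the helper name keep_big and the two test lists are made up for the illustration.

```r
df_long  <- data.frame(letter = c("A", "B", "C", "D", "E"), number = 1:5)
df_short <- data.frame(letter = c("A", "B"), number = 1:2)

with_short    <- list(df_long, df_short, df_long)   # one data frame is too small
without_short <- list(df_long, df_long)             # nothing to remove

# Keep the elements that pass the test instead of deleting those that fail
keep_big <- function(lst) Filter(function(d) nrow(d) >= 3, lst)

n_kept_a <- length(keep_big(with_short))      # the 2-row data frame is dropped
n_kept_b <- length(keep_big(without_short))   # list is left unchanged

# By contrast, negative indexing with an empty which() result empties the list
n_bad <- length(without_short[-which(sapply(without_short, nrow) < 3)])
```

The problem with the negative index is that -integer(0) is still integer(0), and indexing with a zero-length index selects nothing.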
Re: [R] RSQLite does not read very large values correctly
It appears that you were reading the number in as an integer and not as numeric. The value that you are seeing (-596864072) is the numeric value truncated to 32 bits. The number would have been 6DC6C93B8 in hex; dropping the leading '6' gives the result you see as a 32-bit integer. Check your database definition and how you are reading in your data.

On Mon, Nov 30, 2009 at 11:14 AM, Ruecker, Sebastian wrote:
> Hello,
>
> I am trying to import data from an SQLite database to R.
> Unfortunately, I seem to get wrong data when I try to import very large
> numbers.
>
> For example:
> I look at the database via SQLiteStudio(v.1.1.3) and I see the following
> values:
>
> OrderID Day TimeToclose
> 1 2009-11-25 29467907000
> 2 2009-11-25 29467907000
> 3 2009-11-25 29467907000
>
>
> Now I run this R Code:
>
>> library("DBI")
>> library("RSQLite")
>>
>> # DB Connection
>> con <- dbConnect(dbDriver("SQLite"), "C:/Temp/TickDB01.db")
>> raw_Data <- dbGetQuery(con, "SELECT OrderID, Day, TimeToClose FROM
> Tr_TickData WHERE OrderID in (1,2,3)")
>> raw_Data
> OrderID Day TimeToClose
> 1 1 2009-11-25 -596864072
> 2 2 2009-11-25 -596864072
> 3 3 2009-11-25 -596864072
>
>
> The values are totally wrong... Is it because RSQLite has a problem with
> big numbers?
> TimeToClose is microseconds till 17:00.
>
> When I make the numbers smaller, it works again:
>
>> raw_Data <- dbGetQuery(con, "SELECT TimeToClose/1000 as TTC FROM
> Tr_TickData WHERE OrderID in (1,2,3)")
>> raw_Data
> TTC
> 1 29467907
> 2 29467907
> 3 29467907
>
>
> I would appreciate any help with this problem!
>
> Thanks and regards,
>
> Sebastian

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
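The truncation can be reproduced with plain arithmetic, without a database. The helper wrap_int32 below is made up for this illustration; it keeps the low 32 bits of a value and reinterprets them as a signed integer, which is what the explanation above describes.

```r
# Reproduce the 32-bit wrap-around arithmetically
# (doubles hold integers of this size exactly, so the arithmetic is safe)
wrap_int32 <- function(v) {
  low <- v %% 2^32                       # keep only the low 32 bits
  ifelse(low >= 2^31, low - 2^32, low)   # reinterpret as a signed value
}

wrapped <- wrap_int32(29467907000)   # the value stored in the database
small   <- wrap_int32(29467907)      # fits in 32 bits, so it is unchanged
```

This also shows why dividing by 1000 in the query appeared to fix things: the smaller value fits in a 32-bit integer.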
Re: [R] paste name in for loop?
Here is what you want:

xout <- c(1,5,10,25,50,100)
for(i in xout) { print(paste("Areal_Ppt_",i,"sqmi.txt", sep="")) }

Notice that 'i' will be assigned each value in xout; you do not have to index into the vector. Notice that your second value is 50, which is xout[5]: on the second pass 'i' is 5, so xout[i] picks the fifth element, and for the larger values of 'i' the index runs past the end of the vector and gives NA.

On Mon, Nov 30, 2009 at 7:49 PM, Douglas M. Hultstrand wrote:
> Hello,
>
> I am trying to create subsets of grouped data (by area size), and use the
> area size as part of the output name. The code below works for area (xout)
> 1 and 50, the other files are given NA for an area.
>
> A simple example:
> xout <- c(1,5,10,25,50,100)
> for(i in xout) { print(paste("Areal_Ppt_",xout[i],"sqmi.txt", sep="")) }
> [1] "Areal_Ppt_1sqmi.txt"
> [1] "Areal_Ppt_50sqmi.txt"
> [1] "Areal_Ppt_NAsqmi.txt"
> [1] "Areal_Ppt_NAsqmi.txt"
> [1] "Areal_Ppt_NAsqmi.txt"
> [1] "Areal_Ppt_NAsqmi.txt"
>
> The actual code and partial dataset are below.
>
> Thanks for your help,
> Doug
>
> ###
> ### Real Code ###
> ###
> data2 <- read.table("GROUP.txt", header=T, sep=",")
> xout <- c(1,5,10,25,50,100)
> for(i in xout) {
> name <- paste("Areal_Ppt_",xout[i],"sqmi.txt", sep="")
> b.1 <- subset(data2, area == i)
> write.table(b.1, file=name,quote=FALSE,row.names=FALSE, sep=",")
> }
>
> ##
> ### Dataset GROUP.txt ###
> ###
> hr,area,avg_ppt
> 21,1,0
> 21,5,0.001
> 21,10,0.001
> 21,25,0.005
> 21,50,0.01
> 21,100,0.011
> 22,1,0.003
> 22,5,0.005
> 22,10,0.00824
> 22,25,0.04258
> 22,50,0.057
> 22,100,0.101
> 23,1,2.10328
> 23,5,2.02755
> 23,10,1.93808
> 23,25,1.78408
> 23,50,1.67407
> 23,100,1.568
> 24,1,3.20842
> 24,5,3.09228
> 24,10,2.95452
> 24,25,2.71661
> 24,50,2.54607
> 24,100,2.38108
>
> --
> -
> Douglas M. Hultstrand, MS
> Senior Hydrometeorologist
> Metstat, Inc.
> Windsor, Colorado
> voice: 970.686.1253
> email: dmhul...@metstat.com
> web: http://www.metstat.com

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
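For comparison, here is a sketch of the two correct ways to write the loop (over the values, or over the positions with seq_along) next to the vectorised form. The variable names by_value, by_index and vectorised are made up for the illustration; all three should give the same file names.

```r
xout <- c(1, 5, 10, 25, 50, 100)

# Option 1: loop over the values themselves
by_value <- character(0)
for (area in xout) {
  by_value <- c(by_value, paste("Areal_Ppt_", area, "sqmi.txt", sep = ""))
}

# Option 2: loop over the positions and index into the vector
by_index <- character(0)
for (i in seq_along(xout)) {
  by_index <- c(by_index, paste("Areal_Ppt_", xout[i], "sqmi.txt", sep = ""))
}

# paste() is vectorised, so no loop is needed just to build the names
vectorised <- paste("Areal_Ppt_", xout, "sqmi.txt", sep = "")
```

The mistake in the original code was mixing the two: looping over the values but then using each value as a position.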
Re: [R] Error message when logical indexing vecor is all FALSE
?try On Tue, Dec 1, 2009 at 12:22 PM, Jannis wrote: > Dears, > > > is there any way to "switch off" or work around the error message that > pops up when I do something like: > > > A<-B['logical vector'] > > > and when 'logical vector' only consists of FALSE values? My problem is > that this message always kicks me out of my loops and always testing via > an if clause whether 'logical vector' contains any TRUE values is much > too complex due to many different conditions and several of the above > statements (and actually it seems to make my code really slow). > > > Cheers > Jannis > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] sort a data frame by a vector
Is this what you want: > dataDF = data.frame(A1 = c("B", "A", "C"), A2 = c(1,2,3)) > dataDF A1 A2 1 B 1 2 A 2 3 C 3 > dataDF[order(dataDF$A1),] A1 A2 2 A 2 1 B 1 3 C 3 > If you want the sequence "CAB" then you will have to change the factors in column 1: > dataDF$A1 <- factor(dataDF$A1, levels=c("C", "A", "B")) > dataDF[order(dataDF$A1),] A1 A2 3 C 3 2 A 2 1 B 1 > On Tue, Dec 1, 2009 at 10:36 PM, Hao Cen wrote: > Hi, > > > > I have a a vector and a data frame with two columns > > vec = c("C", "A", "B") > > dataDF = data.frame(A1 = c("B", "A", "C"), A2 = c(1,2,3)) > > > > I would like to sort the data frame by column A1 such that the order of > elements in A1 is as the same as in vec. > > > > After the ordering, the data frame would be > > A1 A2 > > C 3 > > A 2 > > B 1 > > > > Any suggestions would be appreciated. > > > > Thanks in advance > > > > Jeff > > > [[alternative HTML version deleted]] > > __________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] sort a data frame by a vector
The factor statement should have been: (missed the 'vec' on the first reading) dataDF$A1 <- factor(dataDF$A1, levels=vec) On Tue, Dec 1, 2009 at 10:57 PM, jim holtman wrote: > Is this what you want: > >> dataDF = data.frame(A1 = c("B", "A", "C"), A2 = c(1,2,3)) >> dataDF > A1 A2 > 1 B 1 > 2 A 2 > 3 C 3 >> dataDF[order(dataDF$A1),] > A1 A2 > 2 A 2 > 1 B 1 > 3 C 3 >> > > If you want the sequence "CAB" then you will have to change the > factors in column 1: > >> dataDF$A1 <- factor(dataDF$A1, levels=c("C", "A", "B")) >> dataDF[order(dataDF$A1),] > A1 A2 > 3 C 3 > 2 A 2 > 1 B 1 >> > > > On Tue, Dec 1, 2009 at 10:36 PM, Hao Cen wrote: >> Hi, >> >> >> >> I have a a vector and a data frame with two columns >> >> vec = c("C", "A", "B") >> >> dataDF = data.frame(A1 = c("B", "A", "C"), A2 = c(1,2,3)) >> >> >> >> I would like to sort the data frame by column A1 such that the order of >> elements in A1 is as the same as in vec. >> >> >> >> After the ordering, the data frame would be >> >> A1 A2 >> >> C 3 >> >> A 2 >> >> B 1 >> >> >> >> Any suggestions would be appreciated. >> >> >> >> Thanks in advance >> >> >> >> Jeff >> >> >> [[alternative HTML version deleted]] >> >> __ >> R-help@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > > > -- > Jim Holtman > Cincinnati, OH > +1 513 646 9390 > > What is the problem that you are trying to solve? > -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
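If you would rather not change the factor levels of the column, a possible alternative (a sketch, not the only way) is to order by the position of each A1 value in 'vec' using match():

```r
vec <- c("C", "A", "B")
dataDF <- data.frame(A1 = c("B", "A", "C"), A2 = c(1, 2, 3))

# match() gives the position of each A1 value in vec; order() sorts by it
sorted <- dataDF[order(match(dataDF$A1, vec)), ]
```

Values of A1 that do not appear in 'vec' get NA from match() and are placed last by order().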
Re: [R] find the index of the next largest element in a sorted vector
Is this what you want: > x <- c(0,3,4) > ?findInterval > findInterval(2, x) [1] 1 > On Wed, Dec 2, 2009 at 9:34 AM, Hao Cen wrote: > Hi, > > How can I find the index of the next largest element in a sorted vector if > an element is not found. > > for example, searching 2 in c(0,3,4) would return 1 since 2 is not in the > vector and 0 is the next largest element to 2. > > I tried which and match and neither returns such information. > >> which(c(0,3,4) == 2) > integer(0) >> match(2, c(0,3,4)) > [1] NA > > > thanks > > Jeff > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Reordering the results from table(cut()) by break argument
try this: > dat <- rnorm(100) > breaks <- -3:3 > table((cut(dat, breaks))) (-3,-2] (-2,-1] (-1,0] (0,1] (1,2] (2,3] 1 10 35 39 13 2 > x <- table((cut(dat, breaks))) > rev(x) (2,3] (1,2] (0,1] (-1,0] (-2,-1] (-3,-2] 2 13 39 35 10 1 > On Wed, Dec 2, 2009 at 12:18 PM, Mark Heckmann wrote: > I have a vector and need to count how many data points fall inside each bin: > > dat <- rnorm(100) > breaks <- -3:3 > table((cut(dat, breaks))) > > (-3,-2] (-2,-1] (-1,0] (0,1] (1,2] (2,3] > 3 13 42 30 12 0 > > if I reverse the breaks vector, the results remains the same: > breaks <- rev(breaks) > table((cut(dat, breaks))) > > (-3,-2] (-2,-1] (-1,0] (0,1] (1,2] (2,3] > 3 13 42 30 12 0 > > What I would like is break to also determine the order of the table output, > in this case it should also be reversed, like: > ( 3, 2] ( 2, 1] ( 1,0] (0,-1] (-1,-2] (-2,-3] > 0 12 30 42 13 3 > > Thus I would like to reorder the vector using break, but I do not know how. > > TIA > Mark > ––– > Mark Heckmann > Dipl. Wirt.-Ing. cand. Psych. > Vorstraße 93 B01 > 28359 Bremen > Blog: www.markheckmann.de > R-Blog: http://ryouready.wordpress.com > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
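rev() reverses the finished table. If the reversed order should also carry over to later steps (plots, further tables), another option is to reverse the factor levels before tabulating. A sketch with made-up random data follows; the seed is arbitrary.

```r
set.seed(42)              # arbitrary seed so the sketch is repeatable
dat <- rnorm(100)
breaks <- -3:3

bins   <- cut(dat, breaks)   # factor with levels in increasing order
counts <- table(bins)

# Reverse the level order, then tabulate again
rev_bins   <- factor(bins, levels = rev(levels(bins)))
rev_counts <- table(rev_bins)
```

The interval labels themselves stay as cut() wrote them, e.g. "(2,3]"; only their order changes.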
Re: [R] Reading comments in text file from R
comment.char='' On Wed, Dec 2, 2009 at 2:05 PM, Graham Smith wrote: > Thanks all. > > I assumed it would be easy, but searching yielded nothing useful. > > Graham > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Extracting vectors from a matrix (err, I think) in RMySQL
try this:

> salaries
   yearID POS pct
1    2009  RF 203
2    2009  DH 200
3    2009  1B 198
4    2009  3B 180
5    2009  LF 169
6    2009  SS 156
7    2009  CF 148
8    2009  2B  97
9    2009   C  86
10   2008  DH 234
11   2008  1B 199
12   2008  RF 197
13   2008  3B 191
14   2008  SS 180
15   2008  CF 164
16   2008  LF 156
17   2008  2B 104
18   2008   C  98
> x <- split(salaries[c('yearID','pct')], salaries$POS)
> x
$`1B`
   yearID pct
3    2009 198
11   2008 199

$`2B`
   yearID pct
8    2009  97
17   2008 104

$`3B`
   yearID pct
4    2009 180
13   2008 191

$C
   yearID pct
9    2009  86
18   2008  98

$CF
   yearID pct
7    2009 148
15   2008 164

$DH
   yearID pct
2    2009 200
10   2008 234

$LF
   yearID pct
5    2009 169
16   2008 156

$RF
   yearID pct
1    2009 203
12   2008 197

$SS
   yearID pct
6    2009 156
14   2008 180

>

On Wed, Dec 2, 2009 at 4:01 PM, Wells Oliver wrote:
> I have a query which returns a data set like so:
>
>> salaries
> yearID POS pct
> 1 2009 RF 203
> 2 2009 DH 200
> 3 2009 1B 198
> 4 2009 3B 180
> 5 2009 LF 169
> 6 2009 SS 156
> 7 2009 CF 148
> 8 2009 2B 97
> 9 2009 C 86
> 10 2008 DH 234
> 11 2008 1B 199
> 12 2008 RF 197
> 13 2008 3B 191
> 14 2008 SS 180
> 15 2008 CF 164
> 16 2008 LF 156
> 17 2008 2B 104
> 18 2008 C 98
>
> I'd like to make a vector for all data for a given position, so for example
> here I'd like all yearID and pct for POS 'RF which should look like:
>
> yearID pct
> 1 2009 203
> 2 2008 197
>
> Apologies if I'm mangling terminology here.
>
> --
> Wells Oliver
> we...@submute.net

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
Re: [R] calculation problem when export and import data
Exactly what errors are you getting? What is the 'str(a)' so we have an idea of the data you are processing. Why don't you use save/load so that the data is saved in the original format. Have you checked the structure of the data before/after the write.table/read.table? Also take a look at what is being returned with your 'mean(a[rep,])'; this would appear to be multivalued depending on what your dataframe is: e.g., > x <- data.frame(a=1:10, b=1:10, c=letters[1:10]) > mean(x) a b c 5.5 5.5 NA Warning message: In mean.default(X[[3L]], ...) : argument is not numeric or logical: returning NA So there is more information that you have to provide; also try to look at the structure of all your objects to see if they are what you think they should be. On Wed, Dec 2, 2009 at 6:36 PM, aegea wrote: > > Hello, > > I have a question on export and import data. Thank you for any suggestions. > > data 'simul' is generated as follows: > N <- 20 > n <- N/2 > nsets <- 10 > simul <- matrix(0,nsets,N) > th <- c(0,1, 1) > for(i in 1:nsets){ > simul[i,] <- rnorm(N,mean= rep(th[1:2],N/2),sd=th[3]) > } > > I exported data as follows: > write.table(simul, file="D:\\test.txt", row.names=F, col.names=F) > > When I want to use this data, I imported as follows: > a=read.table("D:\\test.txt") > > So far, it works well. When I deal with data, I need use each row to do > calculations: > > for(rep in 1:nsets){ > y <- a[rep,] > b<-c(mean(y)+3, mean(y)-4) # cannot calculate mean(y), the mean of this row > m<-sd(y) # also cannot calculate sd(y) > } > > I need a lot of calculation based on y, but after I imported data, R comes > error on it. > > Could you please give me some suggestions? > > > -- > View this message in context: > http://n4.nabble.com/calculation-problem-when-export-and-import-data-tp947250p947250.html > Sent from the R help mailing list archive at Nabble.com. 
> > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
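If the cause turns out to be that read.table returns a data frame, so that a[rep,] is a one-row data frame rather than a numeric vector, then converting to a matrix first is one way around it. This is a sketch under that assumption, with a small made-up stand-in for 'simul' and a temporary file in place of D:\\test.txt:

```r
set.seed(1)
simul <- matrix(rnorm(20), nrow = 2)   # made-up stand-in for 'simul'

tmp <- tempfile(fileext = ".txt")
write.table(simul, file = tmp, row.names = FALSE, col.names = FALSE)

a <- as.matrix(read.table(tmp))        # data frame -> numeric matrix
unlink(tmp)

# Each row of a matrix is a plain numeric vector, so mean() and sd() work
row_means <- apply(a, 1, mean)
row_sds   <- apply(a, 1, sd)
```

Inside a loop, y <- unlist(a[rep, ]) on the original data frame would serve the same purpose for a single row.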
Re: [R] data manipulation
try this: > x <- c('v2FfaPre15','v2FfaPre10','v2FfaPre5','v2Ffa2', > 'v2Ffa3','v2Ffa4') > sub("^.*?([0-9]+)$", "\\1", x, perl=TRUE) [1] "15" "10" "5" "2" "3" "4" > On Thu, Dec 3, 2009 at 9:00 AM, oscar linares wrote: > Dear Wiza[R]ds, > > I have a data.frame header that looks like this: > > v2FfaPre15 v2FfaPre10 v2FfaPre5 v2Ffa2 v2Ffa3 v2Ffa4 > > I need it to look like this, > > 15 10 5 2 3 4 > > i.e., with v2FfaPre and v2Ffa stripped off > > Any suggestions, > > Thanks in advance! > > -- > Oscar > Oscar A. Linares, MD > Translational Medicine Unit > LaPlaisance Bay, Bolles Harbor > Monroe, Michigan 48161 > > Department of Medicine, > University of Toledo College of Medicine > Toledo, OH 43606-3390 > > Department of Internal Medicine, > The Detroit Medical Center (DMC) > Harper University Hospital > Wayne State University School of Medicine > Detroit, Michigan 48201 > > [[alternative HTML version deleted]] > > __________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Two-way/Three-way sum.
try this:

> x
   State Month Year Value
1     NC   Jan 1996     1
2     NC   Jan 1996     2
3     NC   Feb 1997     2
4     NC   Feb 1997     3
5     NC   Mar 1998     3
6     NC   Mar 1998     4
7     NY   Jan 1996     4
8     NY   Jan 1996     5
9     NY   Feb 1997     5
10    NY   Feb 1997     6
11    NY   Mar 1998     6
12    NY   Mar 1998     7
> tapply(x$Value, list(x$State, x$Year), sum)
   1996 1997 1998
NC    3    5    7
NY    9   11   13
> tapply(x$Value, list(x$State, x$Year, x$Month), sum)
, , Feb

   1996 1997 1998
NC   NA    5   NA
NY   NA   11   NA

, , Jan

   1996 1997 1998
NC    3   NA   NA
NY    9   NA   NA

, , Mar

   1996 1997 1998
NC   NA   NA    7
NY   NA   NA   13

On Thu, Dec 3, 2009 at 1:50 PM, Peng Cai wrote:
> Hi R Users,
>
> I'm wondering how can I calculate two (or three) way sum of a variable. A
> sample data is:
>
> State Month Year Value
> NC Jan 1996 1
> NC Jan 1996 2
> NC Feb 1997 2
> NC Feb 1997 3
> NC Mar 1998 3
> NC Mar 1998 4
> NY Jan 1996 4
> NY Jan 1996 5
> NY Feb 1997 5
> NY Feb 1997 6
> NY Mar 1998 6
> NY Mar 1998 7
>
> I'm trying to sum up "value" column by State*Month and by State*Month*Year.
> Also, I may need to calculate mean value along with "sum".
>
> Any help would be greatly appreciated,
>
> Thanks,
> Peng

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
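The question also asked for the State*Month grouping and for means, which the reply does not show. A small sketch of both follows, rebuilding the sample data from the question; the aggregate() call is an alternative added here for a long-format result and is not part of the original reply.

```r
# sample data from the question
x <- data.frame(
  State = rep(c("NC", "NY"), each = 6),
  Month = rep(rep(c("Jan", "Feb", "Mar"), each = 2), times = 2),
  Year  = rep(rep(1996:1998, each = 2), times = 2),
  Value = c(1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7)
)

# two-way sum and mean by State and Month
sum_state_month  <- tapply(x$Value, list(x$State, x$Month), sum)
mean_state_month <- tapply(x$Value, list(x$State, x$Month), mean)

# three-way sum in long format: one row per combination that occurs in the data
long <- aggregate(Value ~ State + Month + Year, data = x, FUN = sum)
long
```

The long format avoids the NA cells that the three-way tapply() array shows for combinations that never occur.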
Re: [R] dataset index
Does this do what you want:

> x <- matrix(c(
+ 0, 0, 0,
+ 0, 0, 0,
+ 0, 1, 0,
+ 0, 1, 0,
+ 0, 1, 0,
+ 1, 2, 1,
+ 1, 2, 1,
+ 1, 3, 1,
+ 1, 3, 1,
+ 1, 3, 1),
+ ncol = 3, byrow = T,
+ dimnames = list(1:10, c("gender", "race", "disease")))
> key <- apply(x, 1, paste, collapse=":")
> m.flags <- lapply(unique(key), function(.indx){
+     key == .indx
+ })
> # create the keys
> do.call(rbind, m.flags)
         1     2     3     4     5     6     7     8     9    10
[1,]  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[2,] FALSE FALSE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE
[3,] FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE FALSE
[4,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE

On Thu, Dec 3, 2009 at 5:07 PM, Lisa wrote:
> Hello, All,
>
> I have a dataset that looks like this:
>
> x <- matrix(c(
> 0, 0, 0,
> 0, 0, 0,
> 0, 1, 0,
> 0, 1, 0,
> 0, 1, 0,
> 1, 2, 1,
> 1, 2, 1,
> 1, 3, 1,
> 1, 3, 1,
> 1, 3, 1),
> ncol = 5, byrow = T,
> dimnames = list(1:10, c("gender", "race", "disease")))
>
> I want to write a function to produce several matrices including only “TRUE”
> and “FALSE” for the different levels of the variables (these matrices may be
> thought as index matrices), like
>
>> m1
> TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
>
>> m2
> FALSE FALSE TRUE TRUE TRUE FALSE FALSE FALSE FALSE FALSE
>
>> m3
> FALSE FALSE FALSE FALSE FALSE TRUE TRUE FALSE FALSE FALSE
>
>> m4
> FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE TRUE TRUE
>
> Can anyone please help how to get this done? Your help would be greatly
> appreciated.
>
> Lisa
>
> --
> View this message in context:
> http://n4.nabble.com/dataset-index-tp948049p948049.html
> Sent from the R help mailing list archive at Nabble.com.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
Re: [R] corrupted matrix data.. sporadic result appears to be pairs of decimal numbers glommed together
Those appear to be complex numbers; some place in your script you must be computing something that returns a complex number. Do an str on the matrix to see what it says; see if it says this:

> x.1
                 [,1]             [,2]
[1,] 0.1820848-0.032i 0.1820848-0.032i
[2,] 0.1820848-0.032i 0.1820848-0.032i
> str(x.1)
 cplx [1:2, 1:2] 0.182-0i 0.182-0i 0.182-0i ...

If it does, look closely at your script.

On Thu, Dec 3, 2009 at 2:54 PM, Stephen Grubb wrote:
> Hello,
>
> We are occasionally getting matrix results that appear to be corrupted...
> here are the last several rows of an example. These are supposed to be
> floating point numbers.
>
> [25015,] 1.820848e-01-3.2090e-06i
> [25016,] 2.178046e-01-4.8140e-06i
> [25017,] 1.820848e-01-3.2090e-06i
> [25018,] 1.820848e-01-3.2090e-06i
> [25019,] 1.144594e-01-1.6657e-06i
> [25020,] 1.820848e-01-3.2090e-06i
> [25021,] -1.293271e-01+4.3889e-06i
> [25022,] 1.144594e-01-1.6657e-06i
> [25023,] 1.820848e-01-3.2090e-06i
> [25024,] 1.820848e-01-3.2090e-06i
> [25025,] 1.173487e-01-4.4415e-07i
> [25026,] 1.820848e-01-3.2090e-06i
> [25027,] 1.375304e-01-3.6167e-06i
> [25028,] 1.820848e-01-3.2090e-06i
> [25029,] -1.293271e-01+4.3889e-06i
> [25030,] 1.820848e-01-3.2090e-06i
> [25031,] 1.820848e-01-3.2090e-06i
> [25032,] 1.820848e-01-3.2090e-06i
> [25033,] 1.820848e-01-3.2090e-06i
>
> Any general idea what may be going on here?
>
> It is a sporadic problem... it occurs maybe 2% or 3% of the time when running
> this particular script on various data.
>
> I apologize for not including a pared-down example that reproduces the
> problem; we are using an R script written elsewhere on large data sets.
> If someone wants more specifics please follow up.
>
> Steve Grubb

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
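To make the diagnosis above concrete, here is a small sketch of one way a complex result can appear sporadically and how to check for it. Using eigen() on a non-symmetric matrix as the source is only an assumption for illustration; the poster's script was not shown, and other functions such as fft() and polyroot() also return complex values.

```r
# a non-symmetric matrix whose eigenvalues are purely imaginary
rot <- matrix(c(0, 1, -1, 0), 2)
ev  <- eigen(rot)$values

is.complex(ev)     # TRUE: the result is stored as complex
max(abs(Im(ev)))   # size of the imaginary part

# one complex operand promotes a whole numeric matrix to complex
m <- matrix(c(0.18, 0.21, 0.11, -0.13), 2)
m_cplx <- m + 0i
is.complex(m_cplx)

# if the imaginary parts are known to be negligible, Re() drops them
m_real <- Re(m_cplx)
is.complex(m_real)
```

Placing a check such as stopifnot(!is.complex(result)) after each suspect step of a script is one way to find where the promotion happens.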
Re: [R] writing 'output.csv' file
The csv file is exactly as you describe it. ";35" represents two columns of data. If you read it in, you will probably get NA as the value in the first column. So what problem are you having in reading in the data?

> x <- read.csv(textConnection("8;32
+ 9;33
+ 10;34
+ ;35
+ ;36
+ ;37
+ ;38"), header=FALSE, sep=';')
> closeAllConnections()
> x
  V1 V2
1  8 32
2  9 33
3 10 34
4 NA 35
5 NA 36
6 NA 37
7 NA 38

On Fri, Dec 4, 2009 at 5:49 AM, Maithili Shiva wrote:
> Dear Mr Signer and Mr Cleland,
>
> Thanks a lot for your great help. However, the output which I am getting is
> as given below -
>
> x
> 1;25
> 2;26
> 3;27
> 4;28
> 5;29
> 6;30
> 7;31
> 8;32
> 9;33
> 10;34
> ;35
> ;36
> ;37
> ;38
> ;39
> ;40
> ;41
> ;42
> ;43
> ;44
> ;45
> ;46
> ;47
> ;48
> ;49
> ;50
>
> However, my requirement is I should get the csv file as
>
> M N
> 1 25
> 2 26
> 3 27
> ..
> 10 34
>    35
>    36
>    37
> ..
>    50
>
> So that I can carry out further calculations on this output file. Please
> guide.
>
> Regards
>
> Maithili
>
> --- On Fri, 4/12/09, Johannes Signer wrote:
>
> From: Johannes Signer
> Subject: Re: [R] writing 'output.csv' file
> To: "Maithili Shiva"
> Date: Friday, 4 December, 2009, 10:29 AM
>
> Hello,
>
> maybe that helps:
>
> write.csv(paste((c(m,rep(" ",length(N)-length(M,n, sep=";"),
> "output.csv", row.names=F)
>
> Johannes
>
> On Fri, Dec 4, 2009 at 11:12 AM, Maithili Shiva wrote:
> > Dear R helpers
> >
> > Suppose
> >
> > M <- c(1:10) # length(M) = 10
> > N <- c(25:50) # length(N) = 26
> >
> > I wish to have an output file giving M and N. So I have tried
> >
> > write.csv(data.frame(M, N), 'output.csv', row.names = FALSE)
> >
> > but I get the following error message
> >
> > Error in data.frame(M, N) :
> > arguments imply differing number of rows: 10, 26
> >
> > How do I modify my write.csv command to get my output in a single (csv)
> > file irrespective of lengths.
> >
> > Please guide
> >
> > Thanks in advance
> >
> > Maithili

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
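An alternative to pasting the values into strings, sketched here as a suggestion rather than as what anyone in the thread proposed, is to pad the shorter vector with NA and let write.csv() write the missing cells as empty fields through its na argument. The temporary file name stands in for 'output.csv'.

```r
M <- 1:10    # length 10
N <- 25:50   # length 26

# extend the shorter vector; assigning a longer length pads with NA
length(M) <- max(length(M), length(N))

out <- data.frame(M = M, N = N)

# write NA as an empty field so the file has two real columns
f <- tempfile(fileext = ".csv")
write.csv(out, f, row.names = FALSE, na = "")

# reading the file back restores the NAs in the first column
back <- read.csv(f)
tail(back)
```

Because the file then has separate M and N columns, further calculations can skip the padding with na.rm = TRUE, for example mean(back$M, na.rm = TRUE).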
Re: [R] selective subsetting of a correlation matrix
Will something like this work for you:

> x <- matrix(1:100,10)
> dimnames(x) <- list(letters[1:10], LETTERS[1:10])
> x
   A  B  C  D  E  F  G  H  I   J
a  1 11 21 31 41 51 61 71 81  91
b  2 12 22 32 42 52 62 72 82  92
c  3 13 23 33 43 53 63 73 83  93
d  4 14 24 34 44 54 64 74 84  94
e  5 15 25 35 45 55 65 75 85  95
f  6 16 26 36 46 56 66 76 86  96
g  7 17 27 37 47 57 67 77 87  97
h  8 18 28 38 48 58 68 78 88  98
i  9 19 29 39 49 59 69 79 89  99
j 10 20 30 40 50 60 70 80 90 100
> x[c('c','g','j'), c("B","E","I")]
   B  E  I
c 13 43 83
g 17 47 87
j 20 50 90

On Fri, Dec 4, 2009 at 8:18 AM, Lee William wrote:
> Dear All,
> I have a correlation matrix say 'M' (4000x4000) for 4000 genes and I want to
> subset it to 'N' (190x190) for 190 genes.
> The list of those 190 genes are in variable 't'. So the idea is to read the
> names of genes from variable 't' and subset the matrix M accordingly.
> Any thoughts are welcome!
>
> Best
> Lee

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
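Applied to the question as asked, the same name-based indexing could look like the sketch below. The gene names and the small 8x8 matrix are made up for illustration; only the names M, N and t come from the question. Checking t against the row names first avoids a "subscript out of bounds" error when a requested gene is absent.

```r
set.seed(1)
genes <- paste0("g", 1:8)                       # hypothetical gene names
M <- cor(matrix(rnorm(80), ncol = 8,
                dimnames = list(NULL, genes)))  # small stand-in for the 4000x4000 matrix

t <- c("g2", "g5", "g7")                        # genes to keep, as in the question

# names requested but not present in M (empty here)
missing_genes <- setdiff(t, rownames(M))

# subset rows and columns by name; drop = FALSE keeps a matrix even for one gene
keep <- intersect(t, rownames(M))
N <- M[keep, keep, drop = FALSE]
N
```

Note that naming a variable t masks the transpose function t() in the workspace only as a variable; calls such as t(M) still work.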
Re: [R] Class attributes
Here is a way of doing it:

for (i in 5:12){
    # convert to character so you can substitute 'x'
    a <- as.character(dd[,i])
    a[a == 'x'] <- '0'  # replace with zero
    dd[,i] <- as.numeric(a)
}

On Fri, Dec 4, 2009 at 11:55 AM, Allen L wrote:
> Dear R forum,
> I want to replace all the elements in a data frame (dd) which match the
> character "x" with "0".
> What's the most elegant way of doing this (there must be an easy way which
> I've missed)? I settled on the following loop:
>
> >for(i in 5:12){   # These are the columns of dd I am interested in
> >dd[which(dd[,i]=="x"),i]<-0
> >}
>
> The problem with this is that the columns which used to contain "x" are
> still considered factors and I am unable to coerce them into numeric:
>
> > mean.species.biomass<-colMeans(as.numeric(dd.p[,5:12]))
> >Error in inherits(x, "data.frame") :
>  (list) object cannot be coerced to type 'double'
>
> I've tried unclassing & reclassing, other functions etc. but nothing seems to
> work. What is wrong?
> Thanks in advance,
> Allen
> --
> View this message in context:
> http://n4.nabble.com/Class-attributes-tp948693p948693.html
> Sent from the R help mailing list archive at Nabble.com.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
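The same conversion can be written without an explicit loop. The sketch below uses a small made-up data frame (the poster's dd was not shown) and converts the chosen factor columns through character, which is the step the original attempt was missing: as.numeric() applied directly to a factor returns the internal level codes, not the printed values.

```r
# hypothetical stand-in for the poster's data frame
dd <- data.frame(id = 1:3,
                 a  = factor(c("1.5", "x", "2")),
                 b  = factor(c("x", "x", "4")))
cols <- c("a", "b")   # the columns of interest (5:12 in the question)

dd[cols] <- lapply(dd[cols], function(col) {
  v <- as.character(col)   # factor -> character, keeping the printed values
  v[v == "x"] <- "0"       # replace the marker with zero
  as.numeric(v)            # character -> numeric
})

colMeans(dd[cols])
```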
Re: [R] grep() exclude certain patterns?
use !grepl

On Fri, Dec 4, 2009 at 2:43 PM, Peng Yu wrote:
> On Fri, Dec 4, 2009 at 11:54 AM, Duncan Murdoch wrote:
> > On 04/12/2009 12:52 PM, Peng Yu wrote:
> >>
> >> The external grep program has an option -v to select non-matching
> >> lines. I'm wondering if how to exclude certain patterns in grep() in
> >> R?
> >>
> >
> > ?grep
>
> I don't see which argument to use.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
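A short illustration of the suggestion above, with made-up strings. The second form uses the invert argument of grep(), which is probably what the pointer to ?grep referred to; that reading is an assumption.

```r
x <- c("apple", "banana", "cherry", "avocado")

# logical negation of grepl(): keep elements that do NOT match
no_a_1 <- x[!grepl("^a", x)]

# grep() with invert = TRUE selects non-matching elements, like grep -v
no_a_2 <- grep("^a", x, invert = TRUE, value = TRUE)

no_a_1
no_a_2
```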
Re: [R] plot data from tapply
Here is one way of doing it:

x=c(1,2,3,1)
y=c(1,2,3,1)
ss=c(55,NA,55,88)
ss_byxy_test=tapply( ss, list( x, y), mean, na.rm=TRUE)
# use the 'reshape' package
ss_byxy_test
# now 'melt' the data to get it into a format for plotting
(ss_melt <- melt(ss_byxy_test))
# create the plot area so you can add the 'ss' as text
plot(0, type='n', xlim=range(ss_melt$X1), ylim=range(ss_melt$X2),
    xlab="X", ylab="Y")
text(ss_melt$X1, ss_melt$X2, ss_melt$value, font=2, col='red')

On Sat, Dec 5, 2009 at 4:49 PM, dwwc wrote:
> i have three data, x coordinate, y coordinate and signal strength
>
> i use tapply() function to get the average ss in the give x,y location
> x=c(1,2,3,1)
> y=c(1,2,3,1)
> ss=c(55,NA,55,88)
> ss_byxy_test=tapply( ss, list( x, y), mean)
> and I get this table
>      1  2  3
> 1 71.5 NA NA
> 2   NA NA NA
> 3   NA NA 55
> but i don't know how to plot different the ss with the xy location,
> can anyone help me
> --
> View this message in context:
> http://n4.nabble.com/plot-data-from-tapply-tp949436p949436.html
> Sent from the R help mailing list archive at Nabble.com.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
Re: [R] plot data from tapply
I left off the statement to load the reshape package. If you don't have it, install it from CRAN:

x=c(1,2,3,1)
y=c(1,2,3,1)
ss=c(55,NA,55,88)
ss_byxy_test=tapply( ss, list( x, y), mean, na.rm=TRUE)
ss_byxy_test
# use the 'reshape' package
library(reshape)
# now 'melt' the data to get it into a format for plotting
(ss_melt <- melt(ss_byxy_test))
# create the plot area so you can add the 'ss' as text
plot(0, type='n', xlim=range(ss_melt$X1), ylim=range(ss_melt$X2),
    xlab="X", ylab="Y")
text(ss_melt$X1, ss_melt$X2, ss_melt$value, font=2, col='red')

On Sat, Dec 5, 2009 at 4:49 PM, dwwc wrote:
> i have three data, x coordinate, y coordinate and signal strength
>
> i use tapply() function to get the average ss in the give x,y location
> x=c(1,2,3,1)
> y=c(1,2,3,1)
> ss=c(55,NA,55,88)
> ss_byxy_test=tapply( ss, list( x, y), mean)
> and I get this table
>      1  2  3
> 1 71.5 NA NA
> 2   NA NA NA
> 3   NA NA 55
> but i don't know how to plot different the ss with the xy location,
> can anyone help me
> --
> View this message in context:
> http://n4.nabble.com/plot-data-from-tapply-tp949436p949436.html
> Sent from the R help mailing list archive at Nabble.com.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
Re: [R] data manipulation/subsetting and relation matrix
try this:

myDat <- read.table(textConnection("group id
1 101
1 201
1 301
2 401
2 501
2 601
3 701
3 801
3 901"),header=TRUE)
closeAllConnections()

corr_mat <- as.matrix(read.table(textConnection("1 1 .5 0 0 0 0 0 0 0
2 .5 1 0 0 0 0 0 0 0
3 0 0 1.0 0 0 0 0 0 0
4 0 0 0 1 .5 .5 0 0 0
5 0 0 0 .5 1 .5 0 0 0
6 0 0 0 .5 .5 1 0 0 0
7 0 0 0 0 0 0 1 0 0
8 0 0 0 0 0 0 0 1 .5
9 0 0 0 0 0 0 0 .5 1"),header=FALSE))
closeAllConnections()

corr_mat <- corr_mat[,-1]
colnames(corr_mat) <- myDat$id
rownames(corr_mat) <- myDat$id

# split out the groups
groups <- split(as.character(myDat$id), myDat$group)

# process each subgroup
result <- lapply(groups, function(.grp){
    subgroup <- corr_mat[.grp, .grp]
    output <- NULL
    # zero the diag
    diag(subgroup) <- 0
    same <- apply(subgroup, 1, function(x) any(x != 0))
    if (any(same)){  # some match, choose one
        output <- sample(same[same], 1)
    }
    if (any(!same)){  # get all that don't correlate
        output <- c(output, same[!same])
    }
    output
})

# output as matrix
do.call(rbind, lapply(names(result), function(x) cbind(x, names(result[[x]]))))

On Mon, Dec 7, 2009 at 7:38 PM, Juliet Hannah wrote:
> Hi List,
>
> Here is some example data.
>
> myDat <- read.table(textConnection("group id
> 1 101
> 1 201
> 1 301
> 2 401
> 2 501
> 2 601
> 3 701
> 3 801
> 3 901"),header=TRUE)
> closeAllConnections()
>
> corr_mat <-read.table(textConnection("1 1 .5 0 0 0 0 0 0 0
> 2 .5 1 0 0 0 0 0 0 0
> 3 0 0 1.0 0 0 0 0 0 0
> 4 0 0 0 1 .5 .5 0 0 0
> 5 0 0 0 .5 1 .5 0 0 0
> 6 0 0 0 .5 .5 1 0 0 0
> 7 0 0 0 0 0 0 1 0 0
> 8 0 0 0 0 0 0 0 1 .5
> 9 0 0 0 0 0 0 0 .5 1"),header=FALSE)
> closeAllConnections()
>
> corr_mat <- corr_mat[,-1]
> colnames(corr_mat) <- myDat$id
> rownames(corr_mat) <- myDat$id
>
> I need to subset this data such that observations within a group are not
> related, which is indicated by a 0 in corr_mat.
>
> For example, within group 1, 101 and 201 are related, so one of these
> has to be selected, say
> 101. 301 is not related to 101 or 201, so the final set for group 1
> consists of 101 and 301. There will always be at least 2 members in
> each group. I need to carry this task on all groups.
>
> One possible final data set looks like:
>
>   group  id
> 1     1 101
> 3     1 301
> 4     2 401
> 7     3 701
> 8     3 801
>
> Any suggestions? Thanks!
>
> Juliet

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
Re: [R] problem with split eating giga-bytes of memory
size of the attributes that get copied, I guess. > > >> > > >> > > >> > > >> > > >>> myDataFrame <- data.frame(matrix(LETTERS, ncol = 7, nrow = 399000)) > > >>> mySplitVar <- factor(as.character(1:1400)) > > >>> myDataFrame <- cbind(myDataFrame, mySplitVar) > > >>> object.size(myDataFrame) > > >>> ## 12860880 bytes # ~ 13MB > > >>> myDataFrame.split <- split(myDataFrame, myDataFrame$mySplitVar) > > >>> object.size(myDataFrame.split) > > >>> ## 144524992 bytes # ~ 144MB > > >>> > > >> > > >> Note: > > >> > > >> only.attr <- lapply(myDataFrame.split,function(x) > sapply(x,attributes)) > > >>> > > >>> > > > (object.size(myDataFrame.split)-object.size(myDataFrame))/object.size(only.attr) > > >>> > > >> 1.03726179240978 bytes > > >> > > >> > > >>> > > >> > > >> object.size(selectSubAct.df) > > >>> ## 52,348,272 bytes # ~ 52MB > > >>> > > >> > > >> What was this?? > > >> > > >> > > >> Chuck > > >> > > >> > > >>> sessionInfo() > > >>>> > > >>> R version 2.10.0 Patched (2009-10-27 r50222) > > >>> x86_64-unknown-linux-gnu > > >>> > > >>> locale: > > >>> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C > > >>> [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8 > > >>> [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8 > > >>> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C > > >>> [9] LC_ADDRESS=C LC_TELEPHONE=C > > >>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C > > >>> > > >>> attached base packages: > > >>> [1] stats graphics grDevices datasets utils methods base > > >>> > > >>> loaded via a namespace (and not attached): > > >>> [1] tools_2.10.0 > > >>> > > >>> Mark W. Kimpel MD ** Neuroinformatics ** Dept. 
of Psychiatry > > >>> Indiana University School of Medicine > > >>> > > >>> 15032 Hunter Court, Westfield, IN 46074 > > >>> > > >>> (317) 490-5129 Work, & Mobile & VoiceMail > > >>> (317) 399-1219 Skype No Voicemail please > > >>> > > >>>[[alternative HTML version deleted]] > > >>> > > >>> > > >>> __ > > >>> R-help@r-project.org mailing list > > >>> https://stat.ethz.ch/mailman/listinfo/r-help > > >>> PLEASE do read the posting guide > > >>> http://www.R-project.org/posting-guide.html<http://www.r-project.org/posting-guide.html> > > >>> and provide commented, minimal, self-contained, reproducible code. > > >>> > > >>> > > >> Charles C. Berry(858) 534-2098 > > >>Dept of Family/Preventive > > >> Medicine > > >> E mailto:cbe...@tajo.ucsd.edu UC San Diego > > >> http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego > > 92093-0901 > > >> > > >> > > >> > > > > > >[[alternative HTML version deleted]] > > > > > > __ > > > R-help@r-project.org mailing list > > > https://stat.ethz.ch/mailman/listinfo/r-help > > > PLEASE do read the posting guide > > http://www.R-project.org/posting-guide.html<http://www.r-project.org/posting-guide.html> > > > and provide commented, minimal, self-contained, reproducible code. > > > > > > > > > > > -- > > http://had.co.nz/ > > > >[[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html<http://www.r-project.org/posting-guide.html> > and provide commented, minimal, self-contained, reproducible code. > -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] problem with split eating giga-bytes of memory
Here is an example:

> # create test data
> N <- 1000000
> x <- data.frame(a=sample(LETTERS, N, TRUE), b=sample(letters, N, TRUE),
+     c=as.numeric(1:N), d=runif(N))
> system.time({
+     x.df <- split(x, x$a)  # split
+     print(sapply(x.df, function(a) sum(a$c)))
+ })
          A           B           C           D           E           F           G           H
19132375146 19261600080 19290064552 19355472666 19143448231 18973627622 19278423676 19362576931
          I           J           K           L           M           N           O           P
19405443596 19295695044 19052377988 19236047192 19143226220 19197703946 19297192525 19129252399
          Q           R           S           T           U           V           W           X
19272964991 19315856972 19355660155 19303178409 19242322477 19081573240 19309444512 19077003863
          Y           Z
19259313705 19228653862
   user  system elapsed
   1.27    0.02    1.28
> # now use indices
> system.time({
+     x.indx <- split(seq(nrow(x)), x$a)  # create list of indices
+     print(sapply(x.indx, function(a) sum(x$c[a])))
+ })
          A           B           C           D           E           F           G           H
19132375146 19261600080 19290064552 19355472666 19143448231 18973627622 19278423676 19362576931
          I           J           K           L           M           N           O           P
19405443596 19295695044 19052377988 19236047192 19143226220 19197703946 19297192525 19129252399
          Q           R           S           T           U           V           W           X
19272964991 19315856972 19355660155 19303178409 19242322477 19081573240 19309444512 19077003863
          Y           Z
19259313705 19228653862
   user  system elapsed
   0.23    0.00    0.23

On Tue, Dec 8, 2009 at 10:26 PM, Mark Kimpel wrote:
> Jim, could you provide a code snippet to illustrate what you mean?
>
> Hadley, good point, I did not know that.
>
> Mark
>
> Mark W. Kimpel MD ** Neuroinformatics ** Dept. of Psychiatry
> Indiana University School of Medicine
>
> 15032 Hunter Court, Westfield, IN 46074
>
> (317) 490-5129 Work, & Mobile & VoiceMail
> (317) 399-1219 Skype No Voicemail please
>
> On Tue, Dec 8, 2009 at 11:00 PM, jim holtman wrote:
>
>> Also instead of 'splitting' the data frame, I split the indices and then
>> use those to access the information in the original dataframe.
>>
>> On Tue, Dec 8, 2009 at 9:54 PM, Mark Kimpel wrote:
>>
>>> Hadley, Just as you were apparently writing I had the same thought and did
>>> exactly what you suggested, converting all columns except the one that I
>>> want split to character. Executed almost instantaneously without problem.
>>> Thanks! Mark
>>>
>>> On Tue, Dec 8, 2009 at 10:48 PM, hadley wickham wrote:
>>>
>>> > Hi Mark,
>>> >
>>> > Why are you using factors? I think for this case you might find
>>> > characters are faster and more space efficient.
>>> >
>>> > Alternatively, you can have a look at the plyr package which uses some
>>> > tricks to keep memory usage down.
>>> >
>>> > Hadley
>>> >
>>> > On Tue, Dec 8, 2009 at 9:46 PM, Mark Kimpel wrote:
>>> > > Charles, I suspect you are correct regarding copying of the attributes.
>>> > > First off, selectSubAct.df is my "real" data, which turns out to be of the
>>> > > same dim() as myDataFrame below, but each column is made up of strings, not
>>> > > simple letters, and there are many levels in each column, which I did not
>>> > > properly duplicate in my first example. I have amended that below and with
>>> > > the split the new object size is now not 10X the size of the original, but
>>> > > 100X. My "real" data is even more complex than this, so I suspect that is
>>> > > where the problem lies. I need to search for a better solution to my problem
>>> > > than split, for which I will start a separate thread if I can't figure
>>> > > something out.
Re: [R] What is the function to test if a vector is ordered or not?
Try

all(diff(order(yourVector)) == 1)

On Wed, Dec 9, 2009 at 10:10 PM, Peng Yu wrote:
> I did a search on www.rseek.org to look for the function to test if a
> vector is ordered or not. But I don't find it. Could somebody let me
> know what function I should use?

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
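Base R also has is.unsorted(), which answers the question directly and avoids computing a full ordering. The sketch below compares it with the order()-based test from the reply on a few made-up vectors; the helper names is_sorted and is_sorted_order are introduced here only for illustration.

```r
# TRUE when v is in non-decreasing order
is_sorted <- function(v) !is.unsorted(v)

# the test from the reply above, wrapped for comparison
is_sorted_order <- function(v) all(diff(order(v)) == 1)

a <- c(1, 2, 2, 5)   # sorted, with a tie
b <- c(3, 1, 2)      # not sorted

is_sorted(a); is_sorted_order(a)
is_sorted(b); is_sorted_order(b)

# strictly = TRUE also treats ties as unsorted
is.unsorted(a, strictly = TRUE)
```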
Re: [R] incorrect multiple outputs
If I read your code right, file.rows is equal to 1 and your 'for' loop will only iterate once. Is that what you were expecting? No reproducible code provided, so that is my best guess.

>file.rows<- c(nrow(file)/288)  # "input_file.txt" contains 288 reformatted lines for each original data file
...
>for (k in 1:file.rows){  # iterates code for each 288 line block of "input_file.txt"
...

On Thu, Dec 10, 2009 at 11:39 AM, biscuit wrote:
> HI,
> I'm having trouble with a piece of Rscript which keeps outputting
> incorrectly. it's something like this: the code reads in from a file which
> contains (reformatted) input
>
> >file<-read.table(file="input_file.txt",sep="\t")[,c(1,3:5)]
> >
> >file.rows<- c(nrow(file)/288)  # "input_file.txt" contains 288 reformatted lines for each original data file
> ...
> >for (k in 1:file.rows){  # iterates code for each 288 line block of "input_file.txt"
> ...
> >cv[k] <- 100*(sd(x.blank)/mean(x.blank))
> >t[k] <- (mean(x.note)-mean(x.blank))/sqrt(((sd(x.note)^2)/8)+((sd(x.blank)^2)/16))
> >t11[k] <- (sqrt(8)*(mean(x.note11)-mean(x.blank)))/sqrt(sd(x.note11)^2+sd(x.blank)^2)
> >}
> >
> >all.data<-data.frame(barcodes,t=format(as.numeric(t),digits=3),t11=format(as.numeric(t11),digits=3),cv=format(as.numeric(cv),digits=3))
> >write.table(all.data, file="R_drug_plot.log",append=TRUE,sep="\t",row.names=FALSE)
>
> this all works correctly except that I believed it would output to file
> after completing the loop, instead it's writing to file every iteration. so
> the output file looks like:
>
> headers
> a1
> headers
> a1
> a2
> headers
> a1
> a2
> a3
> ...
>
> I have checked the missing sections of code and can confirm there are no
> missing/additional brackets. Has anyone any idea why this is happening and
> what I can do about it?
> --
> View this message in context:
> http://n4.nabble.com/incorrect-multiple-outputs-tp957192p957192.html
> Sent from the R help mailing list archive at Nabble.com.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
Re: [R] About R memory management?
If you really want to code like a C++ coder in R, then create your own object and extend it when necessary:

# take a variation of this; preallocate and then extend when you reach a limit
x <- numeric(2)
for (i in 1:100){
    if (i > length(x)){
        # double the length (or whatever you want)
        length(x) <- length(x) * 2
    }
    x[i] <- i
}

On Thu, Dec 10, 2009 at 11:30 AM, Peng Yu wrote:
> I have a situation where I cannot predict the final result's dimension.
>
> In C++, I believe that the class valarray can preallocate more memory
> than is actually needed (maybe 2 times more). The runtime for a C++
> equivalent (using append) to the R code would still be C*n, where C is
> a constant and n is the length of the vector. However, if it allocates
> just enough memory, the run time will be C*n^2.
>
> Based on your reply, I suspect that R doesn't allocate more memory
> than is currently needed, right?
>
> On Fri, Dec 11, 2009 at 11:22 AM, Henrik Bengtsson wrote:
> > Related...
> >
> > Rule of thumb:
> > Pre-allocate your object of the *correct* data type, if you know the
> > final dimensions.
> >
> > /Henrik
> >
> > On Thu, Dec 10, 2009 at 8:26 AM, Peng Yu wrote:
> >> I'm wondering where I can find detailed descriptions of R memory
> >> management. Understanding this could help me understand the runtime of
> >> an R program. For example, depending on how memory is allocated (either
> >> allocate a chunk of memory that is more than necessary for the current
> >> use, or allocate the memory that is just enough for the current use),
> >> the performance of the following program could be very different.
> >> Could somebody let me know some good references?
> >>
> >> unsorted_index=NULL
> >> for(i in 1:100) {
> >>     unsorted_index=c(unsorted_index, i)
> >> }
> >> unsorted_index

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
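One detail of the doubling approach is that the vector usually ends up longer than the data, with the unused tail filled by padding. A minimal sketch of wrapping the idea in a function and trimming at the end follows; the function name `grow_fill` is made up for illustration and is not from the thread.

```r
# Hypothetical helper (name is an assumption): fill a vector of unknown
# final size by doubling its capacity, then trim the unused padding.
grow_fill <- function(n) {
  x <- numeric(2)                    # small initial capacity
  for (i in seq_len(n)) {
    if (i > length(x)) {
      length(x) <- length(x) * 2     # doubling pads the tail with NA
    }
    x[i] <- i
  }
  length(x) <- n                     # drop the padding
  x
}

res <- grow_fill(100)
length(res)
```

Without the final `length(x) <- n`, a request for 100 elements would return a vector of length 128 whose last 28 entries are NA.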
Re: [R] How to draw three line on the same picture ?
try this:

x <- read.table(textConnection("No V1 V2 V3
1 0.23 0.12 0.89
2 0.11 0.56 0.12"), header=TRUE)
matplot(x[,1], x[,-1], type='l')

On Fri, Dec 11, 2009 at 3:39 AM, z_axis wrote:
>
> Thanks for your answer! Would you mind giving me an example using my data?
>
> Sincerely!
>
> Patrick Connolly-4 wrote:
> >
> > On Thu, 10-Dec-2009 at 10:14PM -0800, z_axis wrote:
> >
> > |> The following is sampling data:
> > |> No V1 V2 V3
> > |> 1 0.23 0.12 0.89
> > |> 2 0.11 0;56 0.12
> > |> ...
> > |>
> > |> I just want to draw three lines on same picture according to value of
> > |> V1, V2 and V3.
> >
> > ?lines
> >
> > --
> > Patrick Connolly
>
> --
> View this message in context:
> http://n4.nabble.com/How-to-draw-three-line-on-the-same-picture-tp960823p960897.html

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
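For comparison with matplot(), the ?lines hint quoted above can be followed directly. This is only a sketch: the colours, axis labels and legend position are arbitrary choices, and the data use the 0.56 reading of the value typed as "0;56" in the question.

```r
# Same two rows as in the thread, read from an inline string.
x <- read.table(text = "No V1 V2 V3
1 0.23 0.12 0.89
2 0.11 0.56 0.12", header = TRUE)

# Draw the first series, sizing the y axis to cover all three columns,
# then add the remaining series with lines().
plot(x$No, x$V1, type = "l", ylim = range(x[, -1]),
     xlab = "No", ylab = "value")
lines(x$No, x$V2, col = "red")
lines(x$No, x$V3, col = "blue")
legend("topright", legend = names(x)[-1],
       col = c("black", "red", "blue"), lty = 1)
```

matplot() does the same in one call and picks colours itself, so the explicit version is mainly useful when each line needs individual styling.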
Re: [R] Recoding factor labels that are lists into first element of list
try this:

> x <- data.frame(a=c('cat', 'cat,dog', 'dog', 'dog,cat'))
> x
        a
1     cat
2 cat,dog
3     dog
4 dog,cat
> levels(x$a)
[1] "cat"     "cat,dog" "dog"     "dog,cat"
> # change the factors
> x$a <- factor(sapply(strsplit(as.character(x$a), ','), '[[', 1))
> x
    a
1 cat
2 cat
3 dog
4 dog
> levels(x$a)
[1] "cat" "dog"

On Thu, Dec 10, 2009 at 10:53 PM, Jennifer Walsh wrote:
> Hi all,
>
> I've Googled far and wide but don't think I know the correct terms to
> search for to find an answer.
>
> I have a massive dataset where one of the factors is made up of both
> individual items and lists of items (for example, "cat" and "cat, dog,
> bird"). I would like to recode this factor somehow into only the first
> element of the list (so every list starting with "cat," plus the
> observations that were already just "cat" would all be set equal to "cat").
> I would ideally like to do this in some simple way that does not require me
> to write hundreds of different sets of code (since the lists probably start
> with 300+ different items). Is this possible? Extremely complicated?
>
> Also, I am sure this is much simpler, but I cannot seem to get rid of
> levels of a factor that have no observations. I have tried setting the
> levels of the factor to only the ones with observations that I am interested
> in, but every time I summarize the variable there are still 100+ labels all
> with "0" as their count. This hasn't happened to me before; is there an
> explanation for it?
>
> Thanks very much,
> Jen
>
> ---
> Jennifer Walsh
> Graduate Student, Developmental Psychology
> University of Michigan
> 2020 East Hall, 530 Church St.
> Ann Arbor, MI 48109-1043

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
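The second question in the quoted message (levels with a count of 0 that refuse to go away) is not covered by the reply above. A likely explanation is that subsetting a factor keeps its full level set; re-creating the factor drops the unused levels. The sketch below uses made-up data, and `sub()` is swapped in as an alternative to the strsplit() approach.

```r
# Made-up data in the same shape as the question.
x <- data.frame(a = c("cat", "cat, dog, bird", "dog", "dog,cat"),
                stringsAsFactors = TRUE)

# Keep only the text before the first comma, then rebuild the factor.
x$a <- factor(sub(",.*$", "", as.character(x$a)))
levels(x$a)

# Subsetting keeps all levels, including those with no observations ...
sub_x <- x[x$a == "cat", , drop = FALSE]
levels(sub_x$a)

# ... so call factor() again on the subset to drop the empty ones.
sub_x$a <- factor(sub_x$a)
levels(sub_x$a)
```

table() and summary() report every level of a factor, which would explain the long list of labels with a count of 0.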
Re: [R] match problem
?merge

What is the problem you are trying to solve?

Sent from my iPhone.

On Dec 15, 2009, at 4:50, "Bunny, lautloscrew.com" wrote:

> Hi all,
>
> I don't know if match is the right approach here. I'd like to match two
> data.frames: one big dataframe and one small dataframe. In SQL, what I
> want to do would only be a simple relation.
>
> The first consists of two columns:
> a) some value
> b) some key that is explained in the other dataframe.
>
> What I want to do is combine (cbind) both into one dataframe, like:
>
> df1:
> a b
> 1 2
> 2 3
> 3 2
> 4 2
> 5 1
>
> df2:
> 1 class1
> 2 class2
> 3 class3
>
> to newdf:
> a b class
> 1 2 class2
> 2 3 class3
> 3 2 class2
> ... and so forth
>
> I have connected R to several relational databases, but I don't think
> it's necessary here and there should be a simpler solution for my
> problem. Maybe some combination of do.call, mapply, lapply or match can
> do the job. Unfortunately I am not really familiar with these and still
> keep trying.
>
> Thanks in advance for any help.
>
> Best regards
>
> matt
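To make the ?merge pointer concrete, here is a small sketch with the data from the question. The column names `key` and `class` for the second data frame are assumptions, since the original post shows it without a header. A match() lookup is included because it keeps the row order of df1, which merge() does not guarantee.

```r
df1 <- data.frame(a = 1:5, b = c(2, 3, 2, 2, 1))
# Column names 'key' and 'class' are assumed for illustration.
df2 <- data.frame(key = 1:3,
                  class = c("class1", "class2", "class3"),
                  stringsAsFactors = FALSE)

# Lookup with match(): for each df1$b, find its position in df2$key.
newdf <- df1
newdf$class <- df2$class[match(df1$b, df2$key)]
newdf

# The same join with merge(); rows come back sorted by the key column.
merged <- merge(df1, df2, by.x = "b", by.y = "key")
merged
```

Keys in df1 that have no partner in df2 become NA with match(), while merge() drops those rows unless all.x = TRUE is given.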
Re: [R] comparison of these types is not implemented
you need:

r_squared[[i]]

What is the problem you are trying to solve?

Sent from my iPhone.

On Dec 15, 2009, at 2:29, Tom Pitt wrote:

> Hi All,
>
> Can you tell me why I get the error message below? It's driving me nuts.
>
> Thanks,
>
> Tom
>
> r_squared
> [[1]]
> [1] 0.9083936
> [[2]]
> [1] 0.8871647
> [[3]]
> [1] 0.8193883
> [[4]]
> [1] 0.728157
> [[5]]
> [1] 0.8849525
> [[6]]
> [1] 0.8459416
> [[7]]
> [1] 0.6702318
> [[8]]
> [1] 0.02997816
> [[9]]
> [1] 0.8974268
> [[10]]
> [1] 0.881217
> [[11]]
> [1] 0.8006688
> [[12]]
> [1] 0.7207697
> [[13]]
> [1] 0.8703734
> [[14]]
> [1] 0.8384346
> [[15]]
> [1] 0.6237472
>
> biggest=c(0,0)
> for (i in 1:15) {
> +
> + if (r_squared[i]>biggest[1]) biggest=c(r_squared[i],i)}
> Error in r_squared[i] > biggest[1] :
>   comparison of these types is not implemented
>
> --
> View this message in context:
> http://n4.nabble.com/comparison-of-these-types-is-not-implemented-tp964195p964195.html
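The reason for the fix: on a list, `r_squared[i]` returns a one-element list, while `r_squared[[i]]` returns the number stored inside it, and `>` is not defined between a list and a number. A short sketch using the first four values from the post shows both the corrected loop and a loop-free alternative with which.max().

```r
r_squared <- list(0.9083936, 0.8871647, 0.8193883, 0.728157)

class(r_squared[1])    # "list": single brackets keep the list wrapper
class(r_squared[[1]])  # "numeric": double brackets extract the element

# Corrected version of the loop from the question.
biggest <- c(0, 0)
for (i in seq_along(r_squared)) {
  if (r_squared[[i]] > biggest[1]) biggest <- c(r_squared[[i]], i)
}

# Loop-free alternative: flatten the list, then ask for the maximum.
r_vec <- unlist(r_squared)
biggest2 <- c(max(r_vec), which.max(r_vec))
```

Both `biggest` and `biggest2` hold the largest value followed by its position.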
Re: [R] Problem with spliting a dataframe values
Does this do what you want?

> x <- "a,b,c|1,2,3|4,5,6|7,8,8"
> x.1 <- strsplit(x, "[|]")
> x.1
[[1]]
[1] "a,b,c" "1,2,3" "4,5,6" "7,8,8"

> x.2 <- lapply(x.1, strsplit, ',')
> x.2
[[1]]
[[1]][[1]]
[1] "a" "b" "c"

[[1]][[2]]
[1] "1" "2" "3"

[[1]][[3]]
[1] "4" "5" "6"

[[1]][[4]]
[1] "7" "8" "8"


> do.call(rbind, x.2[[1]])
     [,1] [,2] [,3]
[1,] "a"  "b"  "c"
[2,] "1"  "2"  "3"
[3,] "4"  "5"  "6"
[4,] "7"  "8"  "8"
>

On Thu, Dec 17, 2009 at 9:11 AM, venkata kirankumar wrote:
> Hi all,
> Hi this is kiran
> I am facing a problem splitting a dataframe,
>
> that is,
> I have a string like: "a,b,c|1,2,3|4,5,6|7,8,8"
> First I have to split with respect to "|".
> I did it with the command
>
> unlist(strsplit("a,b,c|1,2,3|4,5,6|7,8,8", "\\,"))
>
> After getting that set I made it a dataframe and it comes out like
>
> a,b,c
> 1,2,3
> 4,5,6
> 7,8,8
>
> Now I have to split this dataframe with respect to "," and I have to get
> it like
>
> a b c
> 1 2 3
> 4 5 6
> 7 8 8
>
> This one I am not able to find out.
> Can anyone help me to get it done?
>
> Thanks in advance
> kiran

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
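If the goal is a data frame with a, b, c as column names rather than a character matrix, one possible shortcut is to turn each "|" into a newline and let read.table() do the parsing. This is a sketch rather than what the reply above does, and it relies on the `text` argument of read.table(), which is only available in R versions newer than the one current at the time of the thread.

```r
x <- "a,b,c|1,2,3|4,5,6|7,8,8"

# Replace the record separator with a newline, then parse as CSV text.
df <- read.table(text = gsub("|", "\n", x, fixed = TRUE),
                 sep = ",", header = TRUE)
df
str(df)
```

The numeric columns come back as integers here, whereas the strsplit() route keeps everything as character strings.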
Re: [R] to remove an error with log(zero)
Does this do what you want:

> x
 [1]  1  2  4  0  7  5  0  0  0  9 11 12
> # create a matrix with the first column being a sequence number
> x.mat <- cbind(seq(length(x)), x)
> # remove zeros in second column
> x.mat <- x.mat[x.mat[,2] != 0,]
> x.mat
         x
[1,]  1  1
[2,]  2  2
[3,]  3  4
[4,]  5  7
[5,]  6  5
[6,] 10  9
[7,] 11 11
[8,] 12 12
> # now create an approxfun to interpolate missing values
> x.fun <- approxfun(x.mat[,1], x.mat[,2])
> # now fill out a new matrix with interpolated values
> x.new <- cbind(seq(length(x)), x.fun(seq(length(x))))
> x.new
      [,1] [,2]
 [1,]    1  1.0
 [2,]    2  2.0
 [3,]    3  4.0
 [4,]    4  5.5
 [5,]    5  7.0
 [6,]    6  5.0
 [7,]    7  6.0
 [8,]    8  7.0
 [9,]    9  8.0
[10,]   10  9.0
[11,]   11 11.0
[12,]   12 12.0
>

On Thu, Dec 17, 2009 at 8:16 PM, Moohwan Kim wrote:
> Dear R family
>
> I have an arbitrary column vector.
> 1
> 2
> 4
> 0
> 7
> 5
> 0
> 0
> 0
> 9
> 11
> 12
> When I attempt to take the natural logarithm of the series, as you can
> guess, there is an error message. To overcome this problem, my idea is to
> replace a zero or zeros in a row with appropriate numbers.
> In order to implement it, I need to detect where the zeros are.
> Then I am going to take the average of the two adjacent neighbors. In the
> case of zeros in a row, I guess I might apply the above idea
> sequentially.
>
> Would you help me out to escape from this jungle?
> Thanks in advance.
>
> Best
> Moohwan

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
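The same interpolation can be written without the intermediate matrix by calling approx() directly. This is a compact sketch of the idea above, not a different method; the variable names are chosen for illustration. Note that a zero at the very start or end of the series has no neighbour on one side and would come back as NA.

```r
x <- c(1, 2, 4, 0, 7, 5, 0, 0, 0, 9, 11, 12)

idx  <- seq_along(x)
keep <- x != 0                       # positions holding real observations

# Linear interpolation at every position, using only the non-zero points.
x_filled <- approx(idx[keep], x[keep], xout = idx)$y
x_filled

log_x <- log(x_filled)               # no zeros left, so no -Inf values
```

For a single isolated zero this equals the average of the two neighbours. For a run of zeros it places the values on a straight line between the surrounding observations, which is not the same as averaging neighbours one step at a time.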
Re: [R] some help regarding combining columns from different files
In your function, you have

temp <- read.table(fnames,header=T,sep="\t",stringsAsFactors=F,quote="\"")

I think you mean:

temp <- read.table(i,header=T,sep="\t",stringsAsFactors=F,quote="\"")

Also 'files' is a parameter, but you are using 'fnames' in the 'for' loop; shouldn't that be 'files'?

On Thu, Dec 17, 2009 at 3:51 PM, Harikrishnadhar wrote:
> Dear all,
>
> Here is the code I am using to combine the 5th column from different
> data sets.
>
> Here is the function to do my job:
>
> genesymbol.append.file <-NULL
> gene.column <- NULL
> readGeneSymbol <- function(files,genesymbol.column=5){
>   for(i in fnames){
>     temp <- read.table(fnames,header=T,sep="\t",stringsAsFactors=F,quote="\"")
>     gene.column<-cbind(gene.column,temp[,genesymbol.column])
>     genesymbol.append.file$genecolumns <- gene.column
>     genesymbol.append.file
>   }
> }
>
> test <- readGeneSymbol(fnames,genesymbol.column=5)
>
> Here is the warning message I am getting; only the 5th column from the
> first file is taken:
>
> Warning messages:
> 1: In file(file, "r") : only first element of 'description' argument used
> 2: In file(file, "r") : only first element of 'description' argument used
>
> Please help me to solve this.
>
> --
> Thanks
> Hari

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
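Putting both corrections together, a possible rewrite of the function is sketched below. It also returns its result explicitly, because a `for` loop in R evaluates to NULL, so the original function would return nothing useful even after the two fixes. The sketch assumes every file has the same number of rows; the temporary files and their contents are made up so the example can run on its own.

```r
readGeneSymbol <- function(files, genesymbol.column = 5) {
  # Read the requested column from each file in turn.
  cols <- lapply(files, function(f) {
    temp <- read.table(f, header = TRUE, sep = "\t",
                       stringsAsFactors = FALSE, quote = "\"")
    temp[, genesymbol.column]
  })
  # Bind the columns side by side (assumes equal row counts).
  list(genecolumns = do.call(cbind, cols))
}

# Made-up demo input: two small tab-separated files with five columns.
make_file <- function(symbols) {
  f <- tempfile(fileext = ".txt")
  d <- data.frame(c1 = 1:3, c2 = 1:3, c3 = 1:3, c4 = 1:3, gene = symbols)
  write.table(d, f, sep = "\t", quote = FALSE, row.names = FALSE)
  f
}
fnames <- c(make_file(c("TP53", "BRCA1", "EGFR")),
            make_file(c("MYC", "KRAS", "PTEN")))

test <- readGeneSymbol(fnames, genesymbol.column = 5)
test$genecolumns
```

Each column of `test$genecolumns` then corresponds to one input file, in the order given in `fnames`.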
Re: [R] write.csv and col.names=F
In R 2.9.2 I get the following warning message if setting col.names=FALSE:

> write.csv(x, '', col.names=FALSE)
"","a","b"
"1",1,1
"2",2,2
"3",3,3
"4",4,4
"5",5,5
"6",6,6
"7",7,7
"8",8,8
"9",9,9
"10",10,10
Warning message:
In write.csv(x, "", col.names = FALSE) :
  attempt to set 'col.names' ignored

You have to use write.table if you don't want the column names; it is on the help page:

> write.table(x,sep=',', col.names=FALSE)
"1",1,1
"2",2,2
"3",3,3
"4",4,4
"5",5,5
"6",6,6
"7",7,7
"8",8,8
"9",9,9
"10",10,10
>

On Fri, Dec 18, 2009 at 8:37 AM, Reeyarn wrote:
> On Fri, Dec 18, 2009 at 7:52 AM, kayj wrote:
> >
> > Hi All,
> >
> > I always have a problem with write.csv when I want the column names to be
> > ignored, when I specify col.names=F, I get a header of V1 V2 V3 V4 etc.
> >
>
> I tried that and found the same problem, however, I found
> write.table(mydata, file="data.csv",col.names=F)
> works.
>
> write.csv calls write.table to save data, is there something wrong with it?
>
> --
> Best Regards,
> Reeyarn T. Lee

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
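For a file with neither a header line nor the leading column of row names, write.table() can be given both switches. A minimal sketch with a made-up three-row data frame and a temporary output file:

```r
x <- data.frame(a = 1:3, b = 1:3)
out <- tempfile(fileext = ".csv")

# sep = "," gives CSV output; both name switches are turned off.
write.table(x, file = out, sep = ",",
            col.names = FALSE, row.names = FALSE)

readLines(out)
```

write.csv() deliberately ignores attempts to change `col.names`, `sep` and a few other arguments so that its output always stays a valid CSV file, which is why the warning in the reply above appears.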
Re: [R] integer(0) and NA do not equal FALSE
try using 'grepl'

> if( grepl("hi", "hop", fixed = TRUE) ){
+     print('yes, your substring is in your string')
+ } else print('no, your substring is not in your string')
[1] "no, your substring is not in your string"
>

On Sat, Dec 19, 2009 at 3:47 PM, Jonathan wrote:
> Hi,
> A noobie question: I'm simply trying to run a conditional statement that
> evaluates if a substring is found within a larger string. I find that if
> it IS found, my function returns TRUE (great!), but if not, the condition
> does not evaluate to FALSE.
>
> ex):
>
> if( grep("hi", "hop", fixed = TRUE) )
>     print('yes, your substring is in your string')
> else print('no, your substring is not in your string')
>
> alternatively, I could replace grep with pmatch:
>
> if (pmatch('hi','hop'))
>     print('yes, your substring is in your string')
> else print('no, your substring is not in your string')
>
> The first example, using grep, returns logical(0). The second, using
> pmatch, returns NA. Any idea how to convert either of those to FALSE, or
> else a different function that would do the trick?
>
> Thanks,
> Jon

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
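For R versions without grepl(), or to see why the original attempts fail: grep() returns the positions of matching elements, so no match gives a zero-length vector, and pmatch() returns NA when nothing matches. Neither is a usable condition for `if`. A small sketch follows; the helper name `has_substring` is made up for illustration.

```r
# Hypothetical helper: TRUE if 'pattern' occurs anywhere in 'string'.
has_substring <- function(pattern, string) {
  length(grep(pattern, string, fixed = TRUE)) > 0
}

has_substring("hi", "hop")   # no match, so grep() returned integer(0)
has_substring("ho", "hop")

# pmatch() gives NA on failure; test for that explicitly.
starts_with <- !is.na(pmatch("hi", "hop"))
```

pmatch() only does partial matching from the start of a string, so it answers a different question from substring search and would also miss "op" in "hop".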
Re: [R] read.table: mysterious line omissions
Most likely an unbalanced quote. Put the following options in the read.table call:

quote='', comment.char=''

On Sat, Dec 19, 2009 at 11:42 PM, Jonathan wrote:
> Hello again,
> I am simply trying to import a rectangular table of strings. The
> table's dimensions are 1990 x 2, yet my read.table() command can only find
> 362 of the rows (and they're not the first 362). I would've taken the time
> to figure out how to use scan, readLines, or some other tool that can read
> in character strings, and then parse and input to a table, but that seems
> like overkill, and probably it would be good to understand what's wrong
> with my text file.
>
> The file is here.
>
> https://regtransfers-sth-se.diino.com/download/jonsleepy/_mydropbox_/finalInput.xls
>
> The code is here:
> temp <- as.matrix(read.table('finalInput.xls', header=FALSE, sep = "\t"))
> dim(temp) #expect 1990 x 2; but find 362 x 2
>
> Sorry to require a download (this probably won't make people happy), but
> since my problem is file-specific, the file is needed for troubleshooting.
>
> I generated it with some grep, gawk commands using Cygwin in a Windows
> environment (though subsequently converted it to Windows format - R loads
> it exactly the same way, regardless of whether it's in linux or windows
> format)
>
> Regards,
> Jonathan

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
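By default read.table() treats both `"` and `'` as quote characters and `#` as the start of a comment, so apostrophes or hash signs in free text can change how rows are parsed. The sketch below applies the two suggested options to made-up tab-separated text (not the file from the thread) that contains both kinds of character.

```r
# Made-up input: four rows, two tab-separated fields, with apostrophes
# and a '#' inside the text fields.
txt <- "id1\tdon't stop\nid2\tplain text\nid3\t# not a comment\nid4\tit's fine"

tab <- read.table(text = txt, header = FALSE, sep = "\t",
                  quote = "", comment.char = "",
                  stringsAsFactors = FALSE)
dim(tab)
tab
```

With quoting and comments switched off, every line is split on tabs only, so all four rows arrive intact.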
Re: [R] "Object is not a matrix" Error
Where is the object 'write'? Shouldn't you be using:

lm(visits ~ (day.f))

On Sun, Dec 20, 2009 at 5:59 PM, John Paul Telthorst wrote:
> I'm trying to follow this guide here:
> http://www.ats.ucla.edu/stat/r/modules/dummy_vars.htm
>
> In which I'm creating categorical variables using the factor function.
>
> I am able to go through the example listed above and have everything work,
> however, when I try to input my own numbers, I get an error. I input the
> following:
>
> > hits = read.csv(file.choose())
> > attach(hits)
> > day.f <- factor(day)
> > lm(write ~ (day.f))
> Error in model.frame.default(formula = write ~ (day.f), drop.unused.levels =
> TRUE) :
>   object is not a matrix
>
> So I import "hits = read.csv(file.choose())" a .csv file, which has the
> columns "visits" and "day" where "visits" is the number of hits to a
> website, and "day" is a number 1-7, for example 1 corresponds to Sunday and
> 7 corresponds to Saturday. I understand that the day variable needs to be
> a categorical variable, and I'm trying to use the factor function to do
> this. I would like to be able to run a regression that will correlate the
> day with the number of hits.
>
> Any help would be much appreciated.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
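The odd error message arises because `write` in the tutorial is a column of the tutorial's data set, whereas here no such column exists and R finds the base function write() instead. A self-contained sketch of the corrected model follows; the `hits` data are simulated stand-ins for the CSV file, so the numbers themselves mean nothing.

```r
set.seed(1)
# Simulated stand-in for the CSV: four weeks of daily visit counts.
hits <- data.frame(day = rep(1:7, times = 4),
                   visits = rpois(28, lambda = 100))

hits$day.f <- factor(hits$day)       # treat day as categorical

# Passing data = hits avoids attach() and accidental name clashes.
fit <- lm(visits ~ day.f, data = hits)
coef(fit)
```

The intercept is the mean for day 1, and each `day.f` coefficient is the difference between that day's mean and day 1.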
Re: [R] Column naming issues using read.table
This reads in your posted data:

> x <- read.table(textConnection("Samplerate = 2 samps/sec
+ Nr Cnt1X Cnt1Y Cnt2X Cnt2Y sec100 hour
+ 1 53 84 43 2 22 12
+ 2 90 155 74 0 72 12
+ 3 90 155 74 0 121 12"), skip=1, header=TRUE)
> closeAllConnections()
>
> x
  Nr Cnt1X Cnt1Y Cnt2X Cnt2Y sec100 hour
1  1    53    84    43     2     22   12
2  2    90   155    74     0     72   12
3  3    90   155    74     0    121   12

On Wed, Dec 23, 2009 at 8:31 PM, arthurbeer01 wrote:
>
> Hi, this is my first post so please be gentle.
> I'm quite new to R and using it for my biology degree.
>
> My problem is: I'm trying to import data from a .csv file using the
> read.table command. The .csv file header starts on row 2 but is contained
> in column 1. I have 600 data files and for future ease would rather not
> edit each file separately. The data starts on row three and I only need
> the first 381 data points.
>
> The R error message using the code I've got so far is
>
> Error in read.table(file("s1-2c83.csv"), header = FALSE, sep = ",", quote =
> "", :
>   more columns than column names
>
> The code I have so far is
>
> framename<-read.table(file ("s1-2c83.csv"),
>   header = FALSE, # FALSE indicates headers are not included in input file
>   sep = ",", # must have "," otherwise errors in table
>   quote = "",
>   dec = ".",
>   row.names = 1, # must = 1 or extra column of row numbering is entered
>   col.names = ("Nr2sec,Cnt1X,Cnt1Y,Cnt2X,Cnt2Y,sec100,hour"),
>   as.is = FALSE,
>   na.strings = "NA",
>   colClasses = NULL,
>   nrows = 381, # rows to stop data.table recording (not input file row number!)
>   skip = 2, # number of rows to skip before reading data from input file
>   strip.white = FALSE,
>   comment.char = "")
>
> write.csv(framename, file = "s1-2c83-ok.csv")
>
> If I delete the line col.names, I've managed to get the data read and
> saved to a new .csv file but cannot work out how to get the column headers
> renamed. The read.table (framename) displays the headers as v1,v2,v3 etc.;
> this is what I can't change. Also it has the first column without a header
> (I think it's the row number) which I don't want in the output file.
>
> The read data file example s1-2c83.csv
>
> 1:Samplerate = 2 samps/sec
> 2: Nr Cnt1X Cnt1Y Cnt2X Cnt2Y sec100 hour
> 3: 1 53 84 43 2 22 12
> 4: 2 90 155 74 0 72 12
> 5: 3 90 155 74 0 121 12
>
> Any help will be greatly appreciated after the 5hrs I've spent already on
> this problem.
>
> Many thanks in advance
>
> Adam
>
> --
> View this message in context:
> http://n4.nabble.com/Column-naming-issues-using-read-table-tp978241p978241.html

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
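The remaining parts of the question (only the first 381 data rows, and no unnamed row-number column in the output) can be handled with `nrows` and `row.names = FALSE`. The sketch below assumes the file is whitespace-separated with seven values per data row, which is an interpretation of the posted sample rather than something the thread confirms; the temporary file names are placeholders.

```r
# Stand-in for one input file, written to a temporary location.
raw <- "Samplerate = 2 samps/sec
Nr Cnt1X Cnt1Y Cnt2X Cnt2Y sec100 hour
1 53 84 43 2 22 12
2 90 155 74 0 72 12
3 90 155 74 0 121 12"
infile <- tempfile(fileext = ".csv")
writeLines(raw, infile)

# Skip the sample-rate line, take the header from the next line,
# and stop after at most 381 data rows.
dat <- read.table(infile, skip = 1, header = TRUE, nrows = 381)

# row.names = FALSE keeps the row numbers out of the output file.
outfile <- tempfile(fileext = ".csv")
write.csv(dat, file = outfile, row.names = FALSE)
readLines(outfile)[1]
```

For 600 files, the read and write steps could be wrapped in a function and applied over the result of list.files().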
Re: [R] by-group processing
This should do it:

> do.call(rbind, lapply(split(x, x$ID), tail, 1))
         ID Type N
45900 45900    I 7
46550 46550    I 7
49270 49270    E 3

On Wed, May 6, 2009 at 6:09 PM, Max Webber wrote:
> Given a dataframe like
>
> > data
>       ID Type N
> 1  45900    A 1
> 2  45900    B 2
> 3  45900    C 3
> 4  45900    D 4
> 5  45900    E 5
> 6  45900    F 6
> 7  45900    I 7
> 8  49270    A 1
> 9  49270    B 2
> 10 49270    E 3
> 18 46550    A 1
> 19 46550    B 2
> 20 46550    C 3
> 21 46550    D 4
> 22 46550    E 5
> 23 46550    F 6
> 24 46550    I 7
> >
>
> containing an identifier (ID), a variable type code (Type), and
> a running count of the number of records per ID (N), how can I
> return a dataframe of only those records with the maximum value
> of N for each ID? For instance,
>
> > data
>       ID Type N
> 7  45900    I 7
> 10 49270    E 3
> 24 46550    I 7
>
> I know that I can use
>
> > tapply ( data $ N , data $ ID , max )
> 45900 46550 49270
>     7     7     3
> >
>
> to get the values of the maximum N for each ID, but how is it
> that I can find the index of these values to subsequently use to
> subscript data?
>
> --
> maxine-webber

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
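The tail() solution relies on the rows within each ID already being sorted by N. An alternative that does not depend on row order compares each N with the maximum of its own group, computed by ave(). A sketch with the data from the question (row names are not reproduced):

```r
x <- data.frame(
  ID   = c(rep(45900, 7), rep(49270, 3), rep(46550, 7)),
  Type = c("A", "B", "C", "D", "E", "F", "I",
           "A", "B", "E",
           "A", "B", "C", "D", "E", "F", "I"),
  N    = c(1:7, 1:3, 1:7),
  stringsAsFactors = FALSE
)

# ave() returns, for every row, the maximum N within that row's ID.
is_max <- x$N == ave(x$N, x$ID, FUN = max)
top <- x[is_max, ]
top
```

This also answers the indexing part of the question: `which(is_max)` gives the row positions. If several rows in a group tie for the maximum, all of them are kept.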
Re: [R] convert large integers to hex
You can use the 'bc' command (use Cygwin if on Windows):

/cygdrive/c: bc
bc 1.06
Copyright 1991-1994, 1997, 1998, 2000 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'.
x=6595137340052185552
obase=16
x
5B86A277DEB9A1D0

You can call this from R.

On Wed, May 6, 2009 at 3:26 PM, Sundar Dorai-Raj wrote:
> Hi,
>
> I'm wondering if someone has solved the problem of converting very
> large integers to hex. I know about format.hexmode and as.hexmode, but
> these rely on integers. The numbers I'm working with are overflowing
> and losing precision. Here's an example:
>
> x <- "6595137340052185552" # stored as character
> as.integer(x) # warning about inaccurate conversion
> format.hexmode(as.numeric(x)) # warnings about loss of precision
> as.hexmode(x) # more warnings and does not do what I expected
>
> I'm planning on writing a function that will do this, but would like
> to know if anybody already has a solution. Basically, I would like the
> functionality of format.hexmode on arbitrarily large integers.
>
> Thanks,
>
> --sundar

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
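If calling an external program is not an option, the conversion can also be done in plain R by treating the number as a vector of decimal digits and repeatedly dividing by 16 (schoolbook long division). This is a swapped-in technique, not what the reply above proposes; the function name `dec_to_hex` is made up, and the sketch assumes a non-negative integer given as a string of digits.

```r
# Hypothetical helper: convert a decimal digit string to upper-case hex.
dec_to_hex <- function(s) {
  digits <- as.integer(strsplit(s, "")[[1]])
  hex_chars <- c(0:9, LETTERS[1:6])          # "0".."9", "A".."F"
  hex <- character(0)
  while (length(digits) > 0) {
    rem <- 0L
    quot <- integer(length(digits))
    # One pass of long division of the digit vector by 16.
    for (k in seq_along(digits)) {
      cur <- rem * 10L + digits[k]
      quot[k] <- cur %/% 16L
      rem <- cur %% 16L
    }
    hex <- c(hex_chars[rem + 1L], hex)       # remainder is the next hex digit
    nz <- which(quot != 0L)                  # strip leading zeros of quotient
    digits <- if (length(nz) > 0) quot[nz[1]:length(quot)] else integer(0)
  }
  paste(hex, collapse = "")
}

dec_to_hex("6595137340052185552")
```

Because every intermediate value stays below 160, nothing overflows regardless of how long the input is; the cost grows with the square of the number of digits.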
Re: [R] Matching multiple columns in a data frame
?merge

> merge(A,B)
  C1  C2
1  A 200

On Thu, May 7, 2009 at 2:19 AM, Raghavan, Nandini [PRDUS] <nragh...@its.jnj.com> wrote:
> Hello,
>
> I am trying to extract a subset of a dataframe A (2 columns) by
> extracting all entries in A (several repeated entries) that match
> dataframe B in both columns. For example, part of A and B are shown
> below.
>
> The following does not seem to work correctly. This only seems to select
> on the first component and all instances of the second.
>
> ind <- A$C1 %in% B[,1] & A$C2 %in% B[,2]
>
> Any suggestions as to how to do this in general (even for matches in
> multiple columns) would be appreciated.
>
> Regards,
>
> Nandini
>
> A:
>    C1   C2
> 1   F 1500
> 2   P  120
> 4   F  250
> 5   I  200
> 6   D 2010
> 7   F 1000
> 8   V    0
> 9   F 2100
> 10  F  500
> 11  E 1800
> 12  A  500
> 13  V    0
> 14  I  125
> 15  I   30
> 16  M  300
> 17  D   75
> 18  V  500
> 19  A  200
> 20  M 1000
> 21  P  225
>
> B:
>    C1   C2
> 1   A  200
> 2   A  600
> 3   A 1500
> 4   B  100
> 5   B 1000
> 6   C 5000
> 7   C  225
> 8   C  150
> 9   C  150
> 10  C  200

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
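The two separate %in% tests fail because each column is checked against B independently, so a row of A passes whenever its C1 appears somewhere in B and its C2 appears somewhere in B, not necessarily in the same row. A sketch with small made-up tables shows the pitfall, a row-wise index built from a combined key, and the merge() route.

```r
# Made-up tables, smaller than those in the question.
A <- data.frame(C1 = c("F", "P", "A", "I", "A"),
                C2 = c(1500, 120, 500, 200, 200),
                stringsAsFactors = FALSE)
B <- data.frame(C1 = c("A", "A", "C"),
                C2 = c(200, 600, 500),
                stringsAsFactors = FALSE)

# Column-by-column test: wrongly accepts (A, 500), since "A" and 500
# both occur in B, just not together.
wrong <- A$C1 %in% B$C1 & A$C2 %in% B$C2

# Row-wise test: compare a combined key built from both columns.
ind <- paste(A$C1, A$C2) %in% paste(B$C1, B$C2)
A[ind, ]

# merge() on all common columns; unique(B) avoids duplicated matches.
merge(A, unique(B))
```

The paste() key uses a space as separator, which is safe here but could be ambiguous if the values themselves contain spaces; a separator that cannot occur in the data is a safer choice in general.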
Re: [R] creation of a matrix
Is this what you want:

> x <- data.frame(n=sample(10, n, TRUE), text=sample(LETTERS, n, TRUE))
> table(x$text, x$n)

     1  2  3  4  5  6  7  8  9 10
  A  6  5  2  0  8  1  5  3  6  4
  B  2  2  5  2  2  7  5  4  4  5
  C  7  4  6  4  3  6  3  6  5  4
  D  9  5  1  6  3  1  3  2  6  3
  E  2  6  4  3  3  5  2  7  6  3
  F  6  5  3  5  3  5  1  2  2 10
  G  4  4  2  5  5  3  2  7  3  3
  H  4  4  4  5  3  3  3  6  3  4
  I  9  3  6  1  4  4  3  4  3  4
  J  4  7  3  4  3  3  4  1  2  5
  K  2  5  5  3  3  6  9  6  5  3
  L  3  3  5  4  3  3  3  3  5  5
  M  3  9  2  3  2  0  2  3  5  6
  N  4  1  0  5  8  4  4  3  6  2
  O  3  4  3  4  8  4  2  5  5  4
  P  3  6  2  6  4  4  3  4  3  6
  Q  5  2  2  5  3  3  0  2  5  4
  R  1  5  6  4  5  4  2  2  4  4
  S  6  2  4  2  1  7  0  1  1  2
  T  4  3  1  7  2  3  4  1  8  1
  U  4  5 11  8  3  2  5  3  4  5
  V  6  3  1  1  1  0  2  5  5  3
  W  3  5  1  4  4  5  6  3  4  2
  X  5  4  3  5  5  6  3  3  3  6
  Y  6  6  6  3  2  1  3  4  4  1
  Z  3  6  1  5  6  1  8  1  3  4
>

On Fri, May 8, 2009 at 4:48 AM, Erika Ahl wrote:
> Hi all,
>
> I have a relatively large amount (several thousand rows, but a small
> amount of unique objects) of data in a format like this:
>
> 1 text_string
> 1 text_string
> 1 text_string
> 2 text_string
> 2 text_string
> 3 text_string
> 3 text_string
> 3 text_string
> 3 text_string
> 3 text_string
> .
> .
> .
> n text_string
>
> I want to create an n x p matrix, n objects (=40) and p unique text
> strings. Nij is the number of occurrences of a text string j in object i.
>
> What is the most efficient way of creating this matrix?
>
> Best regards,
>
> Erika Ahl

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
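The session above puts the text strings in the rows; for the n x p layout asked for (objects in rows, strings in columns), the two arguments to table() are simply given in the other order. A small sketch with made-up data in the same shape as the question:

```r
# Made-up data: an object id and a text string per record.
obj <- c(1, 1, 1, 2, 2, 3, 3, 3, 3, 3)
txt <- c("foo", "bar", "foo", "foo", "baz",
         "bar", "bar", "baz", "foo", "foo")

# Rows are objects, columns are the unique strings (sorted).
counts <- table(object = obj, text = txt)
counts

# Strip the table class to get a plain integer matrix if needed.
m <- unclass(counts)
```

Entry [i, j] is the number of times string j occurs in object i, and combinations that never occur are filled with 0.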
Re: [R] extending strsplit to handle missing text that doesn't have the target on which to split
Find the values that are missing a comma and add it: > dat <- c("Tue, 15 Nov 2005 09:44:50 EST", + "15 Nov 2005 09:10:00 +0100", + "Tue, 15 Nov 2005 09:44:50 EST", + "Tue, 15 Nov 2005 16:29:57 +", + "Wed, 16 Nov 2005 07:00:45 EST", + "Wed, 16 Nov 2005 05:28:00 -0800", + "Wed, 16 Nov 2005 14:06:21 +", + "15 Nov 2005 09:10:00 +0100") > # add comma if missing > missing <- !grepl(',', dat) > dat[missing] <- paste('', dat[missing], sep=',') > tmp.dat.data <- matrix(unlist(strsplit(dat,",")),ncol = 2, byrow = TRUE) > > tmp.dat.data [,1] [,2] [1,] "Tue" " 15 Nov 2005 09:44:50 EST" [2,] """15 Nov 2005 09:10:00 +0100" [3,] "Tue" " 15 Nov 2005 09:44:50 EST" [4,] "Tue" " 15 Nov 2005 16:29:57 +" [5,] "Wed" " 16 Nov 2005 07:00:45 EST" [6,] "Wed" " 16 Nov 2005 05:28:00 -0800" [7,] "Wed" " 16 Nov 2005 14:06:21 +" [8,] """15 Nov 2005 09:10:00 +0100" > On Thu, May 7, 2009 at 9:30 AM, Chris Evans wrote: > I am sure there is an obvious answer to this that I'm missing but I > can't find it. I'm parsing headers of Emails and most have a date like > this: > "Wed, 16 Nov 2005 05:28:00 -0800" > and I can parse that using: > > tmp.dat.data <- matrix(unlist(strsplit(headers$Date.line,",")), >ncol = 2, byrow = TRUE) > before going on to look at the day and date/time data. > > However, a very few headers I want to parse are missing the initial day > of the week and look like this: > "15 Nov 2005 09:10:00 +0100" > > That means that my use of strsplit() results in that date/time part > being all of the item in the list for those entries so the effect of > matrix(unlist()) is to pull the next list entry "up" in the matrix. > Because I happened to have only two errant entries I didn't see what was > happening for a moment. (An odd number gives a warning message about > dimensions not fitting but an odd number has silently moved things > up/left so doesn't: no quarrel with that from me, my stupidity that I > was slow to see what was happening!) 
> > I'm sure I should be able to find a simple way to get around this but at > the moment I can't. > > Here's a simple, reproducible example: > > dat <- c("Tue, 15 Nov 2005 09:44:50 EST", > "15 Nov 2005 09:10:00 +0100", > "Tue, 15 Nov 2005 09:44:50 EST", > "Tue, 15 Nov 2005 16:29:57 +", > "Wed, 16 Nov 2005 07:00:45 EST", > "Wed, 16 Nov 2005 05:28:00 -0800", > "Wed, 16 Nov 2005 14:06:21 +", > "15 Nov 2005 09:10:00 +0100") > tmp.dat.data <- matrix(unlist(strsplit(dat,",")),ncol = 2, byrow = TRUE) > > > tmp.dat.data comes out as a 7x2 matrix contents: > > [,1] [,2] > [1,] "Tue" " 15 Nov 2005 09:44:50 EST" > [2,] "15 Nov 2005 09:10:00 +0100" "Tue" > [3,] " 15 Nov 2005 09:44:50 EST" "Tue" > [4,] " 15 Nov 2005 16:29:57 +" "Wed" > [5,] " 16 Nov 2005 07:00:45 EST" "Wed" > [6,] " 16 Nov 2005 05:28:00 -0800" "Wed" > [7,] " 16 Nov 2005 14:06:21 +" "15 Nov 2005 09:10:00 +0100" > > I'd like an 8x2 matrix with tmp.dat.data[2,1] == "" and > tmp.dat.data[8,1] == "" > > I'm sure there must be a simple way to achieve this by rolling a > slightly different variant of strsplit that pads things and then > applying that to the input vector but I'm failing to see how to do this > at the moment. > > TIA, > > Chris > > -- > Applied researcher, neither statistician nor programmer! > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html<http://www.r-project.org/posting-guide.html> > and provide commented, minimal, self-contained, reproducible code. > -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
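An alternative that avoids strsplit() altogether is to cut each string at the first comma with sub(). This is only a sketch on a shortened version of the example vector; the names `has_day`, `day`, `rest` and `res` are made up here:

```r
dat <- c("Tue, 15 Nov 2005 09:44:50 EST",
         "15 Nov 2005 09:10:00 +0100",
         "Wed, 16 Nov 2005 05:28:00 -0800")

has_day <- grepl(",", dat, fixed = TRUE)
# Text before the first comma, or "" when there is no comma
day  <- ifelse(has_day, sub(",.*$", "", dat), "")
# Text after the first comma; entries without a comma are left unchanged
rest <- sub("^[^,]*,", "", dat)

res <- cbind(day, rest)
res
```

Because each element is handled on its own, a missing day of the week cannot shift later entries up or left in the matrix.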
Re: [R] sscanf
You can always use regular expressions:

> x <- "Condition: 311"
> as.integer(sub(".*?(\\d+).*", "\\1", x, perl=TRUE))
[1] 311

On Fri, May 8, 2009 at 10:16 AM, Matthias Gondan wrote:

> Dear list,
>
> Apparently, there is no function like sscanf in R.
>
> I have a string, "Condition: 311", and I would like
> to read out the number and store it to a numeric
> variable. Is there an easy way to do this?
>
> Best wishes,
>
> Matthias

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
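Another way to pull the number out, without a capture group, is regexpr() together with regmatches(). A small sketch; the variable name `num` is just for illustration:

```r
x <- "Condition: 311"

# regexpr() locates the first run of digits; regmatches() extracts it
num <- as.integer(regmatches(x, regexpr("[0-9]+", x)))
num
```

If the string contains no digits at all, regmatches() returns a zero-length vector, so it may be worth checking length(num) before using it.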
Re: [R] Reading large files quickly
First, 'wc' and readLines are doing vastly different things. 'wc' just reads through the file without having to allocate memory for it; 'readLines' actually stores the data in memory. I have a 150MB file I was trying it on, and here is what 'wc' did on my Windows system:

/cygdrive/c: time wc tempxx.txt
1055808 13718468 151012320 tempxx.txt

real    0m2.343s
user    0m1.702s
sys     0m0.436s

If I multiply that by 25 to extrapolate to a 3.5GB file, it should take a little less than one minute to process on my relatively slow laptop. 'readLines' on the same file takes:

> system.time(x <- readLines('/tempxx.txt'))
   user  system elapsed
  37.82    0.47   39.23
> object.size(x)
84814016 bytes

If I extrapolate that to 3.5GB, it would take about 16 minutes. Now, considering that I only have 2GB on my system, I would not be able to read the whole file in at once. You never did specify what type of system you were running on and how much memory you had. Were you 'paging' due to lack of memory?

On Sat, May 9, 2009 at 12:25 PM, Rob Steele wrote:

> I'm finding that readLines() and read.fwf() take nearly two hours to
> work through a 3.5 GB file, even when reading in large (100 MB) chunks.
> The unix command wc by contrast processes the same file in three
> minutes. Is there a faster way to read files in R?
>
> Thanks!

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
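When the file does not fit in memory, one option is to keep a connection open and call readLines() repeatedly with a fixed `n`, so that only one chunk is held at a time. A minimal sketch; the helper name `count_lines_chunked` and the chunk size are assumptions, and the loop body would be replaced by the real processing:

```r
# Hypothetical helper: walk through a text file in chunks of `chunk_size` lines
count_lines_chunked <- function(path, chunk_size = 100000L) {
  con <- file(path, open = "r")
  on.exit(close(con))
  total <- 0
  repeat {
    # readLines() on an open connection continues where the last call stopped
    lines <- readLines(con, n = chunk_size)
    if (length(lines) == 0L) break
    # ... process `lines` here; this sketch only counts them ...
    total <- total + length(lines)
  }
  total
}

# Small self-contained demonstration on a temporary file
tmp <- tempfile()
writeLines(sprintf("row %d", 1:2500), tmp)
n_read <- count_lines_chunked(tmp, chunk_size = 1000L)
n_read
```

This does not make readLines() itself faster; it only keeps memory use bounded, which matters if paging is the cause of the slowdown.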
Re: [R] Generating a "conditional time" variable
Here is yet another way of doing it (always the case in R): #Simulated data frame: year from 1990 to 2003, for 5 different ids, each having one or two eif "events" test<-data.frame(year=rep(1990:2003,5),id=gl(5,length(1990:2003)), eif=as.vector(sapply(1:5,function(z){ a<-rep(0,length(1990:2003)) a[sample(1:length(1990:2003),sample(1:2,1))]<-1 a }))) # partition by 'id' and then by 'eif' changes test.new <- do.call(rbind, lapply(split(test, test$id), function(.id){ # now by 'eif' changes do.call(rbind, lapply(split(.id, cumsum(.id$eif)), function(.eif){ # create new dataframe with column cbind(.eif, conditional_time=seq(nrow(.eif))) })) })) On Sat, May 9, 2009 at 1:40 PM, Vincent Arel-Bundock wrote: > Hi everyone, > > Please forgive me if my question is simple and my code terrible, I'm new to > R. I am not looking for a ready-made answer, but I would really appreciate > it if someone could share conceptual hints for programming, or point me > toward an R function/package that could speed up my processing time. > > Thanks a lot for your help! > > ## > > My dataframe includes the variables 'year', 'id', and 'eif' and has +/- 1.9 > million id-year observations > > I would like to do 2 things: > > -1- I want to create a 'conditional_time' variable, which increases in > increments of 1 every year, but which resets during year(t) if event 'eif' > occured for this 'id' at year(t-1). It should also reset when we switch to > a > new 'id'. 
For example: > > dataframe = test > yearid eif conditional_time > > 1990 1010 01 > 1991 1010 02 > 1992 1010 13 > 1993 1010 01 > 1994 1010 02 > 1995 1010 03 > 1996 1010 04 > 1997 1010 15 > 1998 1010 01 > 1999 1010 02 > 2000 1010 03 > 2001 1010 04 > 2002 1010 05 > 2003 1010 06 > 1990 2010 01 > 1991 2010 02 > 1992 2010 03 > 1993 2010 04 > 1994 2010 05 > 1995 2010 06 > 1996 2010 07 > 1997 2010 08 > 1998 2010 09 > 1999 2010 010 > 2000 2010 011 > 2001 2010 112 > 2002 2010 01 > 2003 2010 02 > > -2- In a copy of the original dataframe, drop all id-year rows that > correspond to years after a given id has experienced his first 'eif' event. > > I have written the code below to take care of -1-, but it is incredibly > inefficient. Given the size of my database, and considering how slow my > computer is, I don't think it's practical to use it. Also, it depends on > correct sorting of the dataframe, which might generate errors. > > ## > > for (i in 1:nrow(test)) { >if (i == 1) {# If first id-year >cond_time <- 1 >test[i, 4] <- cond_time > >} else if ((test[i-1, 1]) != (test[i, 4])) { # If new id >cond_time <- 1 >test[i, 4] <- cond_time > } else {# Same id as previous row >if (test[i, 3] == 0) { >test[i, 4] <- sum(cond_time, 1) >cond_time <- test[i, 6] >} else { >test[i, 4] <- sum(cond_time, 1) >cond_time <- 0 >} >} > } > > -- > Vincent Arel > M.A. Student, McGill University > >[[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html<http://www.r-project.org/posting-guide.html> > and provide commented, minimal, self-contained, reproducible code. > -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? 
Re: [R] Generating a "conditional time" variable
Corrected version. I forgot that the count had to change 'after' eif==1:

# Simulated data frame: year from 1990 to 2003, for 5 different ids,
# each having one or two eif "events"
test <- data.frame(year=rep(1990:2003,5), id=gl(5,length(1990:2003)),
    eif=as.vector(sapply(1:5, function(z){
        a <- rep(0, length(1990:2003))
        a[sample(1:length(1990:2003), sample(1:2,1))] <- 1
        a
    })))

# partition by 'id' and then by 'eif' changes
test.new <- do.call(rbind, lapply(split(test, test$id), function(.id){
    # now by 'eif' changes: a new group starts in the row after eif==1
    do.call(rbind, lapply(split(.id, cumsum(c(0, diff(.id$eif) == -1))), function(.eif){
        # create new dataframe with the counter column
        cbind(.eif, conditional_time=seq(nrow(.eif)))
    }))
}))

On Sat, May 9, 2009 at 1:40 PM, Vincent Arel-Bundock wrote:

> Hi everyone,
>
> Please forgive me if my question is simple and my code terrible, I'm new to
> R. I am not looking for a ready-made answer, but I would really appreciate
> it if someone could share conceptual hints for programming, or point me
> toward an R function/package that could speed up my processing time.
>
> Thanks a lot for your help!
>
> ##
>
> My dataframe includes the variables 'year', 'id', and 'eif' and has +/- 1.9
> million id-year observations
>
> I would like to do 2 things:
>
> -1- I want to create a 'conditional_time' variable, which increases in
> increments of 1 every year, but which resets during year(t) if event 'eif'
> occured for this 'id' at year(t-1). It should also reset when we switch to
> a new 'id'.
For example: > > dataframe = test > yearid eif conditional_time > > 1990 1010 01 > 1991 1010 02 > 1992 1010 13 > 1993 1010 01 > 1994 1010 02 > 1995 1010 03 > 1996 1010 04 > 1997 1010 15 > 1998 1010 01 > 1999 1010 02 > 2000 1010 03 > 2001 1010 04 > 2002 1010 05 > 2003 1010 06 > 1990 2010 01 > 1991 2010 02 > 1992 2010 03 > 1993 2010 04 > 1994 2010 05 > 1995 2010 06 > 1996 2010 07 > 1997 2010 08 > 1998 2010 09 > 1999 2010 010 > 2000 2010 011 > 2001 2010 112 > 2002 2010 01 > 2003 2010 02 > > -2- In a copy of the original dataframe, drop all id-year rows that > correspond to years after a given id has experienced his first 'eif' event. > > I have written the code below to take care of -1-, but it is incredibly > inefficient. Given the size of my database, and considering how slow my > computer is, I don't think it's practical to use it. Also, it depends on > correct sorting of the dataframe, which might generate errors. > > ## > > for (i in 1:nrow(test)) { >if (i == 1) {# If first id-year >cond_time <- 1 >test[i, 4] <- cond_time > >} else if ((test[i-1, 1]) != (test[i, 4])) { # If new id >cond_time <- 1 >test[i, 4] <- cond_time > } else {# Same id as previous row >if (test[i, 3] == 0) { >test[i, 4] <- sum(cond_time, 1) >cond_time <- test[i, 6] >} else { >test[i, 4] <- sum(cond_time, 1) >cond_time <- 0 >} >} > } > > -- > Vincent Arel > M.A. Student, McGill University > >[[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html<http://www.r-project.org/posting-guide.html> > and provide commented, minimal, self-contained, reproducible code. > -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? 
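With 1.9 million rows, the nested split()/rbind() approach may itself be slow. A vectorised sketch using ave() is given below; it resets the counter directly on the previous year's eif, so it should also cope with events in two consecutive years. The small data frame and the names `prev_eif`, `spell` and `first_spell` are made up for illustration, and the data are assumed to be sorted by id and year:

```r
# Small hand-made example (hypothetical data, two ids)
test <- data.frame(
  year = rep(1990:1997, 2),
  id   = rep(c(1010, 2010), each = 8),
  eif  = c(0, 0, 1, 0, 0, 0, 0, 1,
           0, 0, 0, 0, 0, 1, 0, 0)
)
test <- test[order(test$id, test$year), ]

# eif of the previous year within each id (0 for the first year of an id)
prev_eif <- ave(test$eif, test$id, FUN = function(e) c(0, head(e, -1)))

# spell counter: goes up by one in the year after each event
spell <- ave(prev_eif, test$id, FUN = cumsum)

# -1- running count within each id/spell combination
test$conditional_time <- ave(seq_along(spell), test$id, spell, FUN = seq_along)

# -2- keep rows up to and including the first event of each id
first_spell <- test[spell == 0, ]
test
```

Here `spell == 0` marks all rows of an id before any event has been carried over from a previous year, which is why the same vector answers the second part of the question.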
Re: [R] Reading large files quickly
Since you are reading it in chunks, I assume that you are writing out each segment as you read it in. How are you writing it out to save it? Is the time you are quoting both the reading and the writing? If so, can you break down the differences in what these operations are taking? How do you plan to use the data? Is it all numeric? Are you keeping it in a dataframe? Have you considered using 'scan' to read in the data and to specify what the columns are? If you would like some more help, the answer to these questions will help. On Sat, May 9, 2009 at 10:09 PM, Rob Steele wrote: > Thanks guys, good suggestions. To clarify, I'm running on a fast > multi-core server with 16 GB RAM under 64 bit CentOS 5 and R 2.8.1. > Paging shouldn't be an issue since I'm reading in chunks and not trying > to store the whole file in memory at once. Thanks again. > > Rob Steele wrote: > > I'm finding that readLines() and read.fwf() take nearly two hours to > > work through a 3.5 GB file, even when reading in large (100 MB) chunks. > > The unix command wc by contrast processes the same file in three > > minutes. Is there a faster way to read files in R? > > > > Thanks! > > > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html<http://www.r-project.org/posting-guide.html> > and provide commented, minimal, self-contained, reproducible code. > -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] aggregate over x cases
Here is a way of doing it: > x block trial x y 1 1 1 605 150 2 1 1 603 148 3 1 1 604 140 4 1 1 600 140 5 1 1 590 135 6 1 1 580 135 7 1 2 607 148 8 1 2 605 152 10 1 2 600 158 > do.call(rbind, lapply(split(x, list(x$block, x$trial), drop=TRUE), head, 2)) block trial x y 1.1.1 1 1 605 150 1.1.2 1 1 603 148 1.2.7 1 2 607 148 1.2.8 1 2 605 152 On Mon, May 11, 2009 at 7:49 AM, Jens Bölte wrote: > Hello, > > I have been struggling for quite some time to find a solution for the > following problem. I have a data frame which is organized by block and > trial. Each trial is represented across several rows in this data frame. I'd > like to extract the first x rows per trial and block. > > For example >block trial x y > 1 1 1 605 150 > 2 1 1 603 148 > 3 1 1 604 140 > 4 1 1 600 140 > 5 1 1 590 135 > 6 1 1 580 135 > 7 1 2 607 148 > 8 1 2 605 152 > 10 1 2 600 158 > . > > Selecting only the the first two rows per trial should result in > block trial x y 1 1 605 150 > 1 1 603 148 > 1 2 607 148 > 1 2 605 152 > > The data I am dealing with a x-y coordinates (samples) from an eye-tracking > experiment. I receive the data in this format and need to eliminate unwanted > samples. > > Thanks Jens Bölte > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html<http://www.r-project.org/posting-guide.html> > and provide commented, minimal, self-contained, reproducible code. > > -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
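If the data frame is large, numbering the rows within each block/trial with ave() and subsetting once avoids building many small data frames. A sketch on the example data; `pos` and `first2` are names made up here:

```r
x <- data.frame(block = 1,
                trial = c(1, 1, 1, 1, 1, 1, 2, 2, 2),
                x = c(605, 603, 604, 600, 590, 580, 607, 605, 600),
                y = c(150, 148, 140, 140, 135, 135, 148, 152, 158))

# position of each row within its block/trial combination
pos <- ave(seq_len(nrow(x)), x$block, x$trial, FUN = seq_along)

# keep the first two samples of every trial
first2 <- x[pos <= 2, ]
first2
```

The original row names are kept, so it stays visible which samples were selected.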
Re: [R] readBin: read from defined offset TO defined offset?
Can you be more specific about how you want to "define the endpoint of that read"? What are the criteria you want to use? Can you read in a block and then search for the pattern?

On Mon, May 11, 2009 at 7:05 AM, Johannes Graumann wrote:

> Hello,
>
> With the help of "seek" I can start "readBin" from any byte offset within
> my file that I deem appropriate.
> What I would like to do is to be able to define the endpoint of that read
> as well. Is there any solution to that already out there?
>
> Thanks for any hints, Joh

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
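If the endpoint is simply a known byte offset, the length of the read can be passed to readBin() as `n`. A minimal sketch under that assumption; the helper `read_byte_range` is hypothetical and uses 0-based offsets with an exclusive end:

```r
# Hypothetical helper: read the bytes in the half-open range [start, end)
# (0-based offsets) from a binary file
read_byte_range <- function(path, start, end) {
  con <- file(path, open = "rb")
  on.exit(close(con))
  seek(con, where = start, origin = "start")
  # with what = "raw", n is the number of bytes to read
  readBin(con, what = "raw", n = end - start)
}

# Demonstration on a temporary file holding the bytes 0..255
tmp <- tempfile()
writeBin(as.raw(0:255), tmp)
chunk <- read_byte_range(tmp, 10, 15)
chunk
```

If the endpoint is instead defined by a pattern in the data, reading a block this way and searching the raw vector would be one way to proceed.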
Re: [R] Removing any text beginning with...
Is this what you want (using regular expressions):

> x <- "ENSG /// ENSGy /// ENSG"
> sub("^([[:alpha:]]+).*", "\\1", x)
[1] "ENSG"

On Mon, May 11, 2009 at 9:01 AM, Amélie Baud wrote:

> Hi !
>
> From an Ensembl annotation like ENSG /// ENSGy /// ENSG, I am
> trying to keep only the first part: ENSG. I wasn't able to find any
> helpful information about how to do it. Could you help me with that please ?
> Is the use of the equivalent to the Excel * (any text) a good way of doing
> it and how ?
> Your help will be very much appreciated.
>
> Amelie

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
Re: [R] Looking for a quick way to combine rows in a matrix
Try this:

> key <- rownames(a)
> key[key == "AT"] <- "TA"
> do.call(rbind, by(a, key, colSums))
   V2 V3 V4 V5
AA  1  5  9 13
TA  5 13 21 29
TT  4  8 12 16

On Mon, May 11, 2009 at 4:53 PM, Crosby, Jacy R wrote:

> I'm working with genotype data in a frequency table:
>
> > a=matrix(1:16, nrow=4)
> > rownames(a)=c("AA","AT","TA","TT")
> > a
>    [,1] [,2] [,3] [,4]
> AA    1    5    9   13
> AT    2    6   10   14
> TA    3    7   11   15
> TT    4    8   12   16
>
> 'AT' and 'TA' are essentially the same, and I'd like to combine (add) the
> rows to reflect this. The final matrix should be:
>
>    [,1] [,2] [,3] [,4]
> AA    1    5    9   13
> AT    5   13   21   29
> TT    4    8   12   16
>
> Is there a fast way to do this?
>
> Thanks in advance!
>
> Jacy Crosby
> jacy.r.cro...@uth.tmc.edu

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
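Base R also has rowsum(), which adds up the rows of a matrix by a grouping vector and keeps the result a matrix. A short sketch on the example from the question, relabelling TA as AT so the combined row carries the name asked for; `combined` is a name made up here:

```r
a <- matrix(1:16, nrow = 4)
rownames(a) <- c("AA", "AT", "TA", "TT")

key <- rownames(a)
key[key == "TA"] <- "AT"   # treat TA as AT

# sum the rows that share a key; one output row per distinct key
combined <- rowsum(a, group = key)
combined
```

rowsum() is implemented in compiled code, so it may be noticeably faster than by() on large tables.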
Re: [R] how to take away the same varible when I use "merge"
Can you provide commented, minimal, self-contained, reproducible code? You can check out 'duplicated' to remove duplicates.

On Tue, May 12, 2009 at 7:06 AM, Xin Shi wrote:

> Dear:
>
> I am trying to merge two tables by a common variable. However, there are a
> few same variables which are in both of two tables. How can I take them
> away when I merge the two tables?
>
> Thanks!
>
> Xin

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
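If the question is about columns (other than the key) that appear in both tables, one possibility is to drop them from the second table before merging. This is a sketch on invented data; the tables `a` and `b`, the key `id` and the other column names are all assumptions:

```r
# Two hypothetical tables sharing the key "id" and the extra column "city"
a <- data.frame(id = c(1, 2, 3), age = c(30, 41, 25), city = c("X", "Y", "Z"))
b <- data.frame(id = c(1, 2, 4), city = c("X", "Y", "W"), score = c(0.5, 0.7, 0.9))

key <- "id"
# keep the key plus only those columns of b that a does not already have
keep <- c(key, setdiff(names(b), names(a)))

merged <- merge(a, b[, keep, drop = FALSE], by = key)
merged
```

Without this step merge() would keep both copies and rename them city.x and city.y.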
Re: [R] (no subject)
?curve

Just create an R expression for the equation and then plot it. I am not sure exactly what your expression is supposed to be.

On Tue, May 12, 2009 at 10:22 PM, Debbie Zhang wrote:

> Dear R users,
>
> Does anyone know how to graph the function below?
>
> sqrt(2)Γ(n/2)/[sqrt(n - 1)Γ((n - 1)/2)]
>
> Please help.
>
> Debbie

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
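Assuming the garbled symbol in the question is the gamma function, the expression could be plotted along these lines. This is only a sketch: the function name `f` and the plotting range 2 to 50 are arbitrary choices, not something stated in the question:

```r
# Assumed reading of the expression, with Gamma = gamma function:
#   sqrt(2) * Gamma(n/2) / (sqrt(n - 1) * Gamma((n - 1)/2))
f <- function(n) {
  # lgamma() avoids overflow of gamma() for large n
  sqrt(2 / (n - 1)) * exp(lgamma(n / 2) - lgamma((n - 1) / 2))
}

# curve() evaluates f on a grid of n values and draws the line
curve(f, from = 2, to = 50, xlab = "n", ylab = "f(n)")
```

Working on the log scale with lgamma() keeps the ratio finite even where gamma(n/2) itself would overflow.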
Re: [R] Dates and arrays
On Wed, May 13, 2009 at 4:23 PM, myshare wrote: > hi, > > I have a and data frame with date-column and some other columns. > My first question is what is the fastest way to get the index of an > array if I know the value f.e > > > x = c(4,5,6,7,8) > > so i know the value is 6.. i.e. the index is 3. What I currently do is > loop over the array, I was thinking if there > is faster more direct way. which(x == 6) will give you the index. > > The next one...is I have a data frame one of the columns is Date based > (stored as string), as you may be guessed > I have the date and I want to find the index ;), but here is one more > complication. > The dates are not sequential, but only dates when the day is Mon-Fri > i.e. for Sat and Sun i don't store information. > > So I have first convert the date I have into the closest Monday. > Let me give you one example. Let say I have the date 2000/01/01 (Sat), > now to be able to find any information I have to find the nearest > Monday in this case it is 2000/01/03 (Mon).. > So now that I have this new date I can find the index of the element > in the array where it is stored and from this I can get the real data > I need. > In short conversation is from Data ==> nearest Monday ==> index of the > element in the array where it is stored. 
Here is a way of adjusting a date to the nearest Monday if it is a weekend: > x <- seq(as.Date('2009-05-01'), by='1 day', length=30) > x [1] "2009-05-01" "2009-05-02" "2009-05-03" "2009-05-04" "2009-05-05" "2009-05-06" "2009-05-07" [8] "2009-05-08" "2009-05-09" "2009-05-10" "2009-05-11" "2009-05-12" "2009-05-13" "2009-05-14" [15] "2009-05-15" "2009-05-16" "2009-05-17" "2009-05-18" "2009-05-19" "2009-05-20" "2009-05-21" [22] "2009-05-22" "2009-05-23" "2009-05-24" "2009-05-25" "2009-05-26" "2009-05-27" "2009-05-28" [29] "2009-05-29" "2009-05-30" > x.new <- x + ifelse(weekdays(x) == "Saturday", 2, ifelse(weekdays(x) == "Sunday", 1, 0)) > x.new [1] "2009-05-01" "2009-05-04" "2009-05-04" "2009-05-04" "2009-05-05" "2009-05-06" "2009-05-07" [8] "2009-05-08" "2009-05-11" "2009-05-11" "2009-05-11" "2009-05-12" "2009-05-13" "2009-05-14" [15] "2009-05-15" "2009-05-18" "2009-05-18" "2009-05-18" "2009-05-19" "2009-05-20" "2009-05-21" [22] "2009-05-22" "2009-05-25" "2009-05-25" "2009-05-25" "2009-05-26" "2009-05-27" "2009-05-28" [29] "2009-05-29" "2009-06-01" > > > > thank you very much > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html<http://www.r-project.org/posting-guide.html> > and provide commented, minimal, self-contained, reproducible code. > -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
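weekdays() returns day names in the current locale, so the comparison with "Saturday" may fail on a non-English system. A locale-independent sketch uses the numeric day of the week from format(x, "%u"), and match() then gives the index of the adjusted date in the stored column. The data frame `df` and its `date` and `value` columns are invented for illustration:

```r
x <- as.Date(c("2000-01-01", "2000-01-02", "2000-01-03", "2000-01-07"))

# ISO day of week as a number: 1 = Monday ... 7 = Sunday
dow <- as.integer(format(x, "%u"))

# Saturday moves forward 2 days, Sunday 1 day, weekdays stay put
x.new <- x + ifelse(dow == 6, 2, ifelse(dow == 7, 1, 0))

# Hypothetical data frame with dates stored as strings, weekdays only
df <- data.frame(date  = c("2000/01/03", "2000/01/04", "2000/01/05"),
                 value = c(10, 20, 30))

# index of the row holding the adjusted first date
idx <- match(format(x.new[1], "%Y/%m/%d"), df$date)
df$value[idx]
```

match() returns NA when the date is not present, which is easier to test for than the zero-length result of which().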
Re: [R] specify the number of decimal numbers
Depending on what you want to do, use 'sprintf':

> x <- 1.23456789
> x
[1] 1.234568
> as.character(x)
[1] "1.23456789"
> sprintf("%.1f %.3f %.5f", x,x,x)
[1] "1.2 1.235 1.23457"

On Thu, May 14, 2009 at 7:40 AM, lehe wrote:

> Hi,
> I was wondering how to specify the number of decimal numbers in my
> computation using R? I have too many decimal numbers for my result, when I
> convert them to string with as.character, the string will be too long.
> Thanks and regards!
> --
> View this message in context:
> http://www.nabble.com/specify-the-number-of-decimal-numbers-tp23538852p23538852.html
> Sent from the R help mailing list archive at Nabble.com.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
Re: [R] specify the number of decimal numbers
It all depends on what you want to do with the result. Here are some variations: > x <- matrix(runif(16), 4) > x [,1] [,2] [,3] [,4] [1,] 0.2655087 0.2016819 0.62911404 0.6870228 [2,] 0.3721239 0.8983897 0.06178627 0.3841037 [3,] 0.5728534 0.9446753 0.20597457 0.7698414 [4,] 0.9082078 0.6607978 0.17655675 0.4976992 > x[] <- sprintf("%.3f", x) > x [,1][,2][,3][,4] [1,] "0.266" "0.202" "0.629" "0.687" [2,] "0.372" "0.898" "0.062" "0.384" [3,] "0.573" "0.945" "0.206" "0.770" [4,] "0.908" "0.661" "0.177" "0.498" > print(x, quote=FALSE) [,1] [,2] [,3] [,4] [1,] 0.718 0.935 0.267 0.870 [2,] 0.992 0.212 0.386 0.340 [3,] 0.380 0.652 0.013 0.482 [4,] 0.777 0.126 0.382 0.600 > x <- matrix(runif(16), 4) > signif(x,3) [,1] [,2] [,3] [,4] [1,] 0.718 0.935 0.2670 0.870 [2,] 0.992 0.212 0.3860 0.340 [3,] 0.380 0.652 0.0134 0.482 [4,] 0.777 0.126 0.3820 0.600 > Can you specify what you want and how are you going to use it. Is it for generating a report? On Thu, May 14, 2009 at 8:03 AM, lehe wrote: > > Thanks! > In my case, I need to deal with a lot of such results, e.g. elements in a > matrix. If using sprintf, does it mean I have to apply to each result > individually? Is it possible to do it in a single command? > > > jholtman wrote: > > > > Depending on what you want to do, use 'sprintf': > > > >> x <- 1.23456789 > >> x > > [1] 1.234568 > >> as.character(x) > > [1] "1.23456789" > >> sprintf("%.1f %.3f %.5f", x,x,x) > > [1] "1.2 1.235 1.23457" > >> > > > > > > On Thu, May 14, 2009 at 7:40 AM, lehe wrote: > > > >> > >> Hi, > >> I was wondering how to specify the number of decimal numbers in my > >> computation using R? I have too many decimal numbers for my result, when > >> I > >> convert them to string with as.character, the string will be too long. > >> Thanks and regards! > >> -- > >> View this message in context: > >> > http://www.nabble.com/specify-the-number-of-decimal-numbers-tp23538852p23538852.html > >> Sent from the R help mailing list archive at Nabble.com. 
> >> > >> __ > >> R-help@r-project.org mailing list > >> https://stat.ethz.ch/mailman/listinfo/r-help > >> PLEASE do read the posting guide > >> http://www.R-project.org/posting-guide.html<http://www.r-project.org/posting-guide.html> > <http://www.r-project.org/posting-guide.html> > >> and provide commented, minimal, self-contained, reproducible code. > >> > > > > > > > > -- > > Jim Holtman > > Cincinnati, OH > > +1 513 646 9390 > > > > What is the problem that you are trying to solve? > > > > [[alternative HTML version deleted]] > > > > __ > > R-help@r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > > http://www.R-project.org/posting-guide.html<http://www.r-project.org/posting-guide.html> > > and provide commented, minimal, self-contained, reproducible code. > > > > > > -- > View this message in context: > http://www.nabble.com/specify-the-number-of-decimal-numbers-tp23538852p23539189.html > Sent from the R help mailing list archive at Nabble.com. > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html<http://www.r-project.org/posting-guide.html> > and provide commented, minimal, self-contained, reproducible code. > -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
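To answer the "single command" part: both round() and formatC() work on a whole matrix at once and keep its dimensions. A small sketch; the names `x.num` and `x.chr` are made up here:

```r
x <- matrix(c(0.2655087, 0.3721239, 0.5728534, 0.9082078), 2)

# Numeric result, rounded to 3 decimals; stays a numeric matrix
x.num <- round(x, 3)

# Character result with exactly 3 decimals; the dimensions are kept
x.chr <- formatC(x, format = "f", digits = 3)

x.num
x.chr
```

Which one fits depends on the use: round() if further computation follows, formatC() or sprintf() if the values only go into a report.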
Re: [R] Duplicates and duplicated
Don't think I have seen this one come across:

> x <- c(1,2,3,2,4,4,6,1)
> duplicated(x) | duplicated(x, fromLast=TRUE)
[1]  TRUE  TRUE FALSE  TRUE  TRUE  TRUE FALSE  TRUE

On Thu, May 14, 2009 at 12:09 PM, Bert Gunter wrote:
> ... or, similar in character to Gabor's solution:
>
> tbl <- table(x)
> (tbl[as.character(sort(x))]>1)+0
>
> Bert Gunter
> Nonclinical Biostatistics
> 467-7374
>
> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Gabor Grothendieck
> Sent: Thursday, May 14, 2009 7:34 AM
> To: christiaan pauw
> Cc: r-help@r-project.org
> Subject: Re: [R] Duplicates and duplicated
>
> Noting that:
>
> > ave(x, x, FUN = length) > 1
> [1] FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE
>
> try this:
>
> > rbind(x, dup = ave(x, x, FUN = length) > 1)
>     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
> x      1    2    3    4    4    5    6    7    8     9
> dup    0    0    0    1    1    0    0    0    0     0
>
> On Thu, May 14, 2009 at 2:16 AM, christiaan pauw wrote:
> > Hi everybody.
> > I want to identify not only the duplicate number but also the original
> > number that has been duplicated.
> > Example:
> > x=c(1,2,3,4,4,5,6,7,8,9)
> > y=duplicated(x)
> > rbind(x,y)
> >
> > gives:
> >   [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
> > x    1    2    3    4    4    5    6    7    8     9
> > y    0    0    0    0    1    0    0    0    0     0
> >
> > i.e. the second 4 [,5] is a duplicate.
> >
> > What I want is the first and second 4, i.e. [,4] and [,5], to be TRUE:
> >
> >   [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
> > x    1    2    3    4    4    5    6    7    8     9
> > y    0    0    0    1    1    0    0    0    0     0
> >
> > I assume it can be done by sorting the vector and then checking if the
> > next or the previous entry matches using identical(). I am just unsure
> > how to write such a loop, the logic of which (I think) is as follows:
> >
> > sort x
> > for every value of x, check if the next value is identical and return
> > TRUE (or 1) if it is and FALSE (or 0) if it is not
> > AND
> > check if the previous value is identical and return TRUE (or 1) if it is
> > and FALSE (or 0) if it is not
> >
> > Am I thinking correctly, and can someone help to write such a function?
> >
> > regards
> > Christiaan

--
Jim Holtman
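The two-sided `duplicated` idea can be wrapped in a small helper. This is only a sketch; the function name `all_duplicated` is made up for illustration, and it is applied to the vector from the original question.

```r
# Flag every member of a duplicated group, including the first occurrence,
# by combining a forward scan with a backward scan.
all_duplicated <- function(x) {
  duplicated(x) | duplicated(x, fromLast = TRUE)
}

x <- c(1, 2, 3, 4, 4, 5, 6, 7, 8, 9)
flags <- all_duplicated(x)

# Same layout as in the question: values on top, 0/1 flags below.
rbind(x, dup = as.integer(flags))

# Positions of all entries that belong to a duplicated group.
which(flags)
```

No sorting is needed with this approach, so the flags stay aligned with the original order of `x`.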
Re: [R] Importing data into R and combining 2 files
What have you tried? Check the Intro manual for hints.

?read.table

probably using sep='\t'

On Thu, May 14, 2009 at 1:30 PM, Sunita22 wrote:
> Hello
>
> I have to import 2 txt files into R. One file contains the data and the
> other contains the header, column headings, datatypes and labels for the
> data.
>
> I have 2 problems:
>
> 1) My data file has mixed types of data, e.g. 1 2 3 4 5 3-5 02/04/06 3 4 5
> and so on; the data file is tab separated. When I import it, the data is
> stored in one single variable, say V1. I need to separate it into rows and
> columns. How do I do this? Which commands in R would be useful for this?
>
> 2) The other file is also tab separated. The 6 lines contain the header and
> introduction (the name of the dataset, year, etc.), followed by the column
> names, their datatypes and labels. After importing, the data in this file
> also gets stored in one single variable. I need to separate it into rows
> and columns. How do I do this? Which commands in R would be useful for this?
>
> Thank you in advance
>
> Regards
> Sunita

--
Jim Holtman
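A minimal sketch of the `read.table` hint, under several assumptions: the two temporary files below are invented stand-ins for the poster's files, the description file is assumed to have exactly six introductory lines followed by name/type/label rows, and the column names are hypothetical.

```r
# Hypothetical stand-ins for the two tab-separated files.
data_file <- tempfile(fileext = ".txt")
meta_file <- tempfile(fileext = ".txt")
writeLines(c("1\t3-5\t02/04/06", "2\t4-6\t03/04/06"), data_file)
writeLines(c(paste("intro line", 1:6),
             "id\tinteger\tSubject id",
             "range\tcharacter\tAge range",
             "visit\tdate\tVisit date"), meta_file)

# sep = "\t" splits each line into columns instead of one long V1.
dat <- read.table(data_file, sep = "\t", header = FALSE,
                  stringsAsFactors = FALSE)

# skip = 6 jumps over the introductory lines before the column descriptions.
meta <- read.table(meta_file, sep = "\t", skip = 6, header = FALSE,
                   col.names = c("name", "type", "label"),
                   stringsAsFactors = FALSE)

# Use the names from the description file as column names of the data.
names(dat) <- meta$name
str(dat)
```

Mixed entries such as "3-5" or "02/04/06" come in as character strings and can be converted afterwards (for example with `as.Date`) once their meaning is known.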
Re: [R] Output of binary representation
Are you looking for how the floating point is represented in the IEEE-754 format? If so, you can use writeBin:

> writeBin(pi,raw(),endian='big')
[1] 40 09 21 fb 54 44 2d 18

On Sun, May 17, 2009 at 1:23 PM, Ted Harding wrote:
> I am interested in studying the binary representation of numerics
> (doubles) in R, so am looking for possibilities of output of the
> internal binary representations. sprintf() with format "a" or "A"
> is halfway there:
>
> sprintf("%A",pi)
> # [1] "0X1.921FB54442D18P+1"
>
> but it is in hex.
>
> The following illustrate the sort of thing I want:
>
> 1.1001 0010 0001 1011 0101 0100 0100 0100 0010 1101 0001 1000
> times 2
>
> 11.0010 0100 0011 0110 1010 1000 1000 1000 0101 1010 0011 000
>
> 0.1100 1001 1101 1010 1010 0010 0010 0001 0110 1000 1100 0
> times 4
>
> (without the spaces -- only put in above for clarity).
>
> While I could take the original output "0X1.921FB54442D18P+1" from
> sprintf() and parse it out into binary using gsub() or the like,
> or submit it to say an 'awk' script via an external file, this would
> be a tedious business!
>
> Is there some function already in R which outputs the bits in the
> binary representation directly?
>
> I see that David Hinds asked a similar question on 17 Aug 2005:
> "Raw data type transformations"
>
> http://finzi.psych.upenn.edu/R/Rhelp02/archive/59900.html
>
> (without, apparently, getting any response -- at any rate within
> the following 3 months).
>
> With thanks for any suggestions,
> Ted.
>
> E-Mail: (Ted Harding)
> Fax-to-email: +44 (0)870 094 0861
> Date: 17-May-09 Time: 18:23:49
> -- XFMail --

--
Jim Holtman
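To go from the raw bytes returned by `writeBin` to an actual string of bits, something along these lines may work. It is a sketch only: `double_bits` is a made-up helper name, and it relies on base R's `rawToBits`, which lists the bits of each byte least-significant first.

```r
# Return the 64 bits of a double as a string: sign, exponent, then mantissa.
double_bits <- function(x) {
  bytes <- writeBin(x, raw(), endian = "big")
  # rawToBits() gives 8 bits per byte, least-significant bit first,
  # so put one byte per column and reverse the rows.
  bits <- matrix(as.integer(rawToBits(bytes)), nrow = 8)
  paste(bits[8:1, ], collapse = "")
}

b <- double_bits(pi)
substr(b, 1, 1)    # sign bit
substr(b, 2, 12)   # 11 exponent bits
substr(b, 13, 64)  # 52 mantissa bits (the leading 1 is implicit)
```

The helper is written for a single number; for a vector it would have to be applied element by element, for example with `sapply`.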
Re: [R] Simple plotting errors
One way is to create a list of the dataframes and then use 'sapply' to extract the values:

df.list <- list(FeketeJAN, ..., FeketeDEC)
plot(sapply(df.list, function(a) a["AMAZON", "SUM_"]))

On Mon, May 18, 2009 at 7:17 AM, Steve Murray wrote:
> Dear R Users,
>
> I have 12 data frames, each of 12 rows and 2 columns.
>
> e.g. FeketeJAN
>                    MEAN    SUM_
> AMAZON      144.4997874 68348.4
> NILE          5.4701955  1394.9
> CONGO        71.3670036 21196.0
> MISSISSIPPI  18.9273250  6511.0
> AMUR          1.8426874   466.2
> PARANA       58.3835497 13486.6
> YENISEI       1.4668313   592.6
> OB            1.4239179   559.6
> LENA          0.9342164   387.7
> NIGER         4.7245709   826.8
> ZAMBEZI      76.6893794  8665.9
> YANGTZE      10.6759257  1729.5
>
> I want to do a line plot of the value of Amazon 'Sum' (in this case,
> 68348.4) for each of the 12 data frames. I've tried doing this as follows:
>
> plot(FeketeJAN[1,2], FeketeFEB[1,2], FeketeMAR[1,2], *through to December* type="l")
>
> but receive: Error in strsplit(log, NULL) : non-character argument
>
> I've also tried:
>
> plot(FeketeJAN$AMAZON[,2], FeketeFEB$AMAZON[,2], *through to December* type="l")
>
> but receive:
>
> Error in plot.window(...) : need finite 'xlim' values
> In addition: Warning messages:
> 1: In min(x) : no non-missing arguments to min; returning Inf
> 2: In max(x) : no non-missing arguments to max; returning -Inf
> 3: In min(x) : no non-missing arguments to min; returning Inf
> 4: In max(x) : no non-missing arguments to max; returning -Inf
>
> What is it that I'm doing wrong?!
>
> Many thanks for any advice,
>
> Steve

--
Jim Holtman
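A small runnable version of the list-plus-sapply suggestion. The three data frames and the `make_month` helper are invented for illustration; only the JAN value for AMAZON comes from the post. The original call probably failed because the extra positional arguments were matched to other formal parameters of plot (such as `log`) rather than being treated as more y values.

```r
# Three hypothetical monthly data frames with the same layout as FeketeJAN.
make_month <- function(amazon_sum) {
  data.frame(MEAN = c(144.5, 5.5), SUM_ = c(amazon_sum, 1394.9),
             row.names = c("AMAZON", "NILE"))
}
df.list <- list(JAN = make_month(68348.4),
                FEB = make_month(70000),   # made-up value
                MAR = make_month(65000))   # made-up value

# One value per data frame: the AMAZON row of the SUM_ column.
amazon <- sapply(df.list, function(a) a["AMAZON", "SUM_"])

# Pass a single vector as y; the month labels go on the x axis.
plot(seq_along(amazon), amazon, type = "l", xaxt = "n",
     xlab = "Month", ylab = "Amazon SUM_")
axis(1, at = seq_along(amazon), labels = names(amazon))
```

Naming the list elements is optional, but it gives `sapply` names to carry through to the axis labels.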
Re: [R] Split data frame based on Class
?split

new.df <- split(old.df, old.df$Class)

will create a list of dataframes split by Class

On Mon, May 18, 2009 at 7:23 AM, Chris Arthur wrote:
> Each row of my data frame is assigned to a class (e.g. country). Can you
> suggest how I break apart the data frame so that I create new data frames
> for each class?
>
> e.g.
>
> If Class = "US" put in new dataframe dataUS
>
> Thanks in advance for your help
>
> Chris

--
Jim Holtman
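For illustration, here is `split` on a toy data frame. The data and the `value` column are invented; `old.df`, `new.df` and `dataUS` follow the names used in the thread.

```r
# A toy data frame with a Class column, as described in the question.
old.df <- data.frame(Class = c("US", "UK", "US", "FR"),
                     value = c(10, 20, 30, 40),
                     stringsAsFactors = FALSE)

# One data frame per class, collected in a named list.
new.df <- split(old.df, old.df$Class)
names(new.df)

# Pull out a single piece by name if a separate object is really needed.
dataUS <- new.df[["US"]]
dataUS

# Keeping the list makes it easy to run the same step on every class.
sapply(new.df, function(d) sum(d$value))
```

Working with the list is usually more convenient than creating one variable per class, since the pieces can be processed with `lapply` or `sapply`.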
Re: [R] Parsing configuration files
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

There are regular expressions that can be used. It is very dependent upon the format of the configuration file; an example would help to show the way.

On Mon, May 18, 2009 at 6:10 AM, Marie Sivertsen wrote:
> Dear list,
>
> Is there any functionality in R that would allow me to parse config files?
> I have tried ??config and apropos('config') without success, and also
> searched the R package site.
>
> Kind regards,
> Marie

--
Jim Holtman
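Since no example file was posted, the following is only a sketch of the regular-expression route, assuming a simple "key = value" layout with "#" comment lines. The sample lines and key names are invented; in practice the lines would come from `readLines()` on the real file.

```r
# Hypothetical configuration contents (normally: cfg_lines <- readLines(path)).
cfg_lines <- c("# sample config",
               "name = demo",
               "threshold = 0.5",
               "",
               "path = /tmp/data")

# Drop blank lines and comment lines.
lines <- trimws(cfg_lines)
lines <- lines[nzchar(lines) & !grepl("^#", lines)]

# Everything before the first "=" is the key, everything after it the value.
key <- trimws(sub("=.*$", "", lines))
value <- trimws(sub("^[^=]*=", "", lines))

config <- setNames(as.list(value), key)
config$name
as.numeric(config$threshold)
```

All values come back as strings, so numeric settings need an explicit conversion as in the last line. For files in a "key: value" layout, base R's `read.dcf` may be worth a look instead.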
Re: [R] error in importing text files
>> each question, and three additional data points of no interest. The data
>> are arranged in an unstacked (long) text file such that each line contains
>> all of the above information and there are 34 (32 responses plus 2 extra
>> lines of meaningless data) lines per measurement occasion (up to 850 lines
>> of data if all 34 lines are present for all 25 measurement occasions).
>> Below is an example of how the data are arranged.
>>
>> 20080204131646 23256063 6 0 ""
>> 20080204131646 233152-1 7 0 ""
>> 20080204150043 2-32767 0 0 65535 ""
>> 20080204182117 2 1283-1 7 0 ""
>> 20080204182117 2 283834 6 0 ""
>> 20080204182117 2 326636 6 0 ""
>>
>> Columns: Year/Month/Day/Time, Palm ID, Response/Q#, Latency, Response,
>> 3 meaningless columns
>>
>> The dataset presented above begins with question 32 of one measurement
>> occasion on February 4, 2008 taken at 13:16:46. The next line (33) is in
>> the datafile because participants had to click a button to exit the
>> measurement occasion. You then see the beginning of another measurement
>> occasion (20080204192117) in which the participant did not respond
>> (-32767). The next measurement occasion begins on the next line, which
>> actually starts with response 2 because participants were required to read
>> a screen and click through prior to answering any questions. Thus, anytime
>> participants simply read an instruction page, responses are coded as a -1.
>>
>> What I would like to do is write code to automatically import these 107
>> files into R and structure them appropriately while importing them.
>> Furthermore, I would like the code to use conditional statements so that
>> whenever it encounters a -32767 it inserts 32 variables (columns) with
>> missing data, and whenever it encounters a -1 it deletes that column
>> altogether. I would also like the code to separate the combined
>> year/month/day/time column into 4 separate columns (year, month, day,
>> time). Finally, I would like the code to stack the 32 responses during
>> each measurement occasion so that I have 32 columns of responses plus
>> columns for year, month, day, and latency, but leave each measurement
>> occasion unstacked.
>>
>> Thanks!
>>
>> Eric S McKibben
>> Industrial-Organizational Psychology Graduate Student
>> Clemson University
>> Clemson, SC

--
Jim Holtman
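Two of the requested steps, splitting the combined timestamp and handling the special codes, can be sketched as follows. This assumes the timestamp is read as a 14-character string (for example via `colClasses = "character"`); the `response` vector is invented, since the exact column layout of the files is not clear from the post.

```r
# Timestamps as they appear in the first column of the sample data.
stamp <- c("20080204131646", "20080204150043", "20080204182117")

# Fixed-width pieces: YYYY MM DD HHMMSS.
occasion <- data.frame(
  year  = as.integer(substr(stamp, 1, 4)),
  month = as.integer(substr(stamp, 5, 6)),
  day   = as.integer(substr(stamp, 7, 8)),
  time  = paste(substr(stamp, 9, 10), substr(stamp, 11, 12),
                substr(stamp, 13, 14), sep = ":"),
  stringsAsFactors = FALSE
)
occasion

# Hypothetical response codes: -32767 marks a missed occasion,
# -1 marks an instruction screen that should be dropped.
response <- c(3, -32767, -1, 5)
response[response == -32767] <- NA
response <- response[is.na(response) | response != -1]
response
```

Looping over the 107 files would then be a matter of `list.files()` plus `lapply()` around whatever per-file function these steps end up in.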
Re: [R] how to calculate means of matrix elements
You can convert it to an array and then use apply:

> mat1
     [,1] [,2] [,3] [,4]
[1,]    3    2   12    4
[2,]   14   13   13    2
[3,]   15    9    6    9
[4,]    2   15   13   19
> mat2
     [,1] [,2] [,3] [,4]
[1,]    0   11   10    7
[2,]   12    9    3   13
[3,]   -4   13    0   14
[4,]   -2    0   -4   -1
> mat3
     [,1] [,2] [,3] [,4]
[1,]   20    6   16   23
[2,]   24    8   11   12
[3,]   15   13    6   16
[4,]    5   22   20   25
>
> x <- array(c(mat1,mat2,mat3), dim=c(4,4,3))
> apply(x,c(1,2),mean)
      [,1]  [,2]  [,3] [,4]
[1,]  7.67  6.33 12.67 11.3
[2,] 16.67 10.00  9.00  9.0
[3,]  8.67 11.67  4.00 13.0
[4,]  1.67 12.33  9.67 14.3

On Mon, May 18, 2009 at 8:40 PM, dxc13 wrote:
> useR's,
> I have several matrices of size 4x4 and I want to calculate the means of
> their respective positions. For example, consider I have 3 matrices given
> by the code:
> mat1 <- matrix(sample(1:20,16,replace=T),4,4)
> mat2 <- matrix(sample(-5:15,16,replace=T),4,4)
> mat3 <- matrix(sample(5:25,16,replace=T),4,4)
>
> The result I want is one matrix of size 4x4 in which position [1,1] is the
> mean of position [1,1] of the given three matrices. The same goes for all
> other positions of the matrix. If these three matrices are given in
> separate text files, how can I write code that will get the result I need?
>
> Thanks in advance,
> dxc13

--
Jim Holtman
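The array-and-apply approach generalises to a list of matrices, which is how they would probably arrive when read from separate files. A minimal sketch with three small made-up matrices, plus a `Reduce` alternative shown for comparison:

```r
# Three small matrices standing in for the ones read from files, e.g.
# mats <- lapply(files, function(f) as.matrix(read.table(f)))
# where 'files' would be a (hypothetical) vector of file names.
mats <- list(matrix(1:4, 2),
             matrix(5:8, 2),
             matrix(c(0, 3, 6, 9), 2))

# Stack into a 3-d array and average over the third dimension.
arr <- array(unlist(mats), dim = c(2, 2, length(mats)))
by_apply <- apply(arr, c(1, 2), mean)

# Alternative: add the matrices element by element and divide by their count.
by_reduce <- Reduce(`+`, mats) / length(mats)

by_apply
all.equal(by_apply, by_reduce)
```

Both routes assume that all matrices have the same dimensions and contain no missing values; with NAs, the `apply` version can pass `na.rm = TRUE` to `mean`.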
Re: [R] how to copy files from one direction to another?
?file.copy

On Tue, May 19, 2009 at 9:51 PM, XinMeng wrote:
> There are 10 files in c:\\
> I want to copy 3 of them to d:\\
>
> How to do it via R?
>
> Thanks!

--
Jim Holtman
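A minimal sketch of `file.copy` with a vector of source files and a destination directory. Temporary directories and invented file names stand in for c:\\ and d:\\ so that the example can be run anywhere.

```r
# Temporary stand-ins for the source and destination directories.
src <- file.path(tempdir(), "copy_src")
dst <- file.path(tempdir(), "copy_dst")
dir.create(src, showWarnings = FALSE)
dir.create(dst, showWarnings = FALSE)

# Ten empty files in the source directory.
invisible(file.create(file.path(src, paste0("file", 1:10, ".txt"))))

# Pick three of them and copy them into the destination directory.
wanted <- file.path(src, c("file1.txt", "file4.txt", "file7.txt"))
ok <- file.copy(from = wanted, to = dst, overwrite = TRUE)

ok               # one TRUE/FALSE per file
list.files(dst)
```

`file.copy` returns a logical vector, so it is worth checking `all(ok)` before relying on the copies.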
Re: [R] Replace / swap values of subset of a data.frame
Exactly what are you trying to do? Are you trying to just change a subset of the values? 'subset' does not have an 'assignment' operator. Maybe you want something like this (but it is not clear from your description). Also, it is not clear whether you have exactly the same set of matching values in the two data frames for the subset conditions. If you do, then this might work:

data1[(data1$Subject==25) & (data1$Session==1), 22] <- data2[(data2$Subject==25)&(data2$Session==1), 23]

On Tue, May 19, 2009 at 3:50 PM, tsunhin wong wrote:
> Dear R users,
>
> I have 1 data.frame of 1500x80 - data1. I found out that there are a
> few cells of data that I have misplaced, and I need to fix the ordering
> of them.
> In an attempt to swap columns 22 & 23 of the Subject with
> misplaced data, I did the following:
>
> data2 <- data1
>
> subset(data1,(Subject==25 & Session==1))[,22] <- subset(data2,(Subject==25 & Session==1))[,23]
>
> (error messages... "Could not find function "subset<-")
>
> subset(data1,(Subject==25 & Session==1))[,23] <- subset(data2,(Subject==25 & Session==1))[,22]
>
> (error messages... "Could not find function "subset<-")
>
> Please, please point me to some ways to achieve the swapping.
> Thanks a lot!
>
> Cheers,
>
> John

--
Jim Holtman
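If the aim really is to swap two columns for the matching rows only, logical indexing can do both directions in one assignment, without a second copy of the data frame. A sketch on a toy data frame; `colA` and `colB` are hypothetical stand-ins for columns 22 and 23.

```r
# Toy data: colA and colB play the role of columns 22 and 23.
data1 <- data.frame(Subject = c(25, 25, 30),
                    Session = c(1, 2, 1),
                    colA = c(1.1, 2.2, 3.3),
                    colB = c(9.9, 8.8, 7.7))

# Rows whose values were entered in the wrong columns.
rows <- data1$Subject == 25 & data1$Session == 1

# The right-hand side is evaluated first, so the two columns are
# exchanged without needing a temporary copy.
data1[rows, c("colA", "colB")] <- data1[rows, c("colB", "colA")]
data1
```

With numeric positions the same line would read `data1[rows, c(22, 23)] <- data1[rows, c(23, 22)]`.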
Re: [R] Too large a data set to be handled by R?
If your 1500 X 2 matrix is all numeric, it should take up about 240MB of memory. That should easily fit within the 2GB of your laptop and still leave room for several copies that might arise during the processing. Exactly what are you going to be doing with the data? A lot will depend on the functions/procedures that you will be calling, or the type of transformations you might be doing.

On Tue, May 19, 2009 at 11:59 PM, tsunhin wong wrote:
> Dear R users,
>
> I have been using a dynamic data extraction from raw files strategy at
> the moment, but it takes a long long time.
> In order to save time, I am planning to generate a data set of size
> 1500 x 2 with each data point a 9-digit decimal number, in order
> to save my time.
> I know R is limited to 2^31-1 and that my data set is not going to
> exceed this limit. But my laptop only has 2 Gb and is running 32-bit
> Windows / XP or Vista.
>
> I ran into R memory problem issue before. Please let me know your
> opinion according to your experience.
> Thanks a lot!
>
> - John

--
Jim Holtman
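The memory estimate is just rows times columns times 8 bytes per double. The dimensions in the archived message appear truncated, so the 20000 columns below are an assumption, chosen only because it reproduces the 240MB figure quoted in the reply.

```r
n_rows <- 1500
n_cols <- 20000          # assumption: not stated legibly in the thread
bytes_per_double <- 8

bytes <- n_rows * n_cols * bytes_per_double
bytes / 1e6              # size in megabytes

# Sanity check on a smaller matrix: object.size() reports the data
# plus a small fixed overhead for the object header.
small <- matrix(0, nrow = 1500, ncol = 20)
as.numeric(object.size(small))
```

Intermediate copies made during processing can multiply this figure, which is the reason for the question about what will be done with the data.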
Re: [R] turning off specific types of warnings
?suppressWarnings

On Wed, May 20, 2009 at 8:10 AM, Eleni Rapsomaniki wrote:
> Dear R users,
>
> I have a long function that among other things uses the "survest" function
> from the Design package. This function generates the warning:
>
> In survest.cph (...)
> S.E. and confidence intervals are approximate except at predictor means.
> Use cph(...,x=T,y=T) (and don't use linear.predictors=) for better
> estimates.
>
> I would like to turn this specific warning off, as it makes it difficult to
> detect other (potentially more crucial) warnings generated by other parts of
> my code.
>
> Is there a way to do this?
>
> Eleni Rapsomaniki
>
> Research Associate
> Strangeways Research Laboratory
> Department of Public Health and Primary Care
>
> University of Cambridge

--
Jim Holtman
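`suppressWarnings` silences every warning raised inside the call. To silence only one kind and let the rest through, a calling handler that matches on the message text is one option. This is a sketch: `noisy` is a made-up stand-in for the survest call and `muffle_matching` is a hypothetical helper name.

```r
# Stand-in for a function that raises one unwanted and one useful warning.
noisy <- function() {
  warning("S.E. and confidence intervals are approximate")
  warning("something else went wrong")
  42
}

# Muffle only warnings whose message matches 'pattern'; others pass through.
muffle_matching <- function(expr, pattern) {
  withCallingHandlers(
    expr,
    warning = function(w) {
      if (grepl(pattern, conditionMessage(w))) {
        invokeRestart("muffleWarning")
      }
    }
  )
}

result <- muffle_matching(noisy(), "confidence intervals are approximate")
result
```

Only the second warning should be reported here. Matching on message text is fragile if the package changes its wording, so the pattern is best kept short and specific.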
Re: [R] efficiency when processing ordered data frames
How much is it currently costing you in time to do the selection process? Is it having a large impact on your program? Is it the part that is really consuming the overall time? What is your concern in this area? Here is the time it takes to select, from 10M values, those that are less than a specific value. It takes no more than 0.2 seconds:

> x <- runif(1e7)
> system.time(y <- x < .5)
   user  system elapsed
   0.15    0.05    0.20
> x <- sort(x)
> system.time(y <- x < .5)
   user  system elapsed
   0.11    0.03    0.14
>

On Wed, May 20, 2009 at 8:54 AM, Brigid Mooney wrote:
> Hoping for a little insight into how to make sure I have R running as
> efficiently as possible.
>
> Suppose I have a data frame, A, with n rows and m columns, where col1
> is a date time stamp. Also suppose that when this data is imported
> (from a csv or SQL), the data is already sorted such that the
> time stamp in col1 is in ascending (or descending) order.
>
> If I then wanted to select only the rows of A where col1 <= a certain
> time, I am wondering if R has to read through the entirety of col1 to
> select those rows (all n of them). Is it possible for R to recognize
> (or somehow be told) that these rows are already in order, thus
> allowing the computation to be completed in ~log(n) row reads
> instead?
>
> Thanks!

--
Jim Holtman
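For completeness, base R's `findInterval` locates the cut point in an ascending vector with a binary-search style lookup. A sketch with random numbers standing in for the sorted time stamps; note that `findInterval` still checks that the vector is sorted, so the saving over a plain comparison may be small in practice.

```r
set.seed(1)
x <- sort(runif(1e5))   # stands in for the sorted col1
cutoff <- 0.5

# Number of elements <= cutoff, i.e. the index of the last row to keep.
n_keep <- findInterval(cutoff, x)

# Take the leading block instead of building a full logical vector.
selected <- x[seq_len(n_keep)]

# Same result as the straightforward comparison.
identical(selected, x[x <= cutoff])
```

For a data frame sorted on col1, the equivalent selection would be `A[seq_len(n_keep), ]`.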
Re: [R] Class for time of day?
If you want the hours from a POSIXct, here is one way of doing it; you can create a function for doing it:

> x <- Sys.time()
> x
[1] "2009-05-20 12:17:13 EDT"
> y <- difftime(x, trunc(x, units='days'), units='hours')
> y
Time difference of 12.28697 hours
> as.numeric(y)
[1] 12.28697
>

It depends on what type of computations you want to do with it. You can leave it as POSIXct and carry out a lot of them. Can you specify what you want?

On Wed, May 20, 2009 at 10:57 AM, Stavros Macrakis wrote:
> What is the recommended class for time of day (independent of calendar
> date)?
>
> And what is the recommended way to get the time of day from a POSIXct
> object? (Not a string representation, but a computable representation.)
>
> I have looked in the man page for DateTimeClasses, in the Time Series
> Analysis Task View and in Spector's Data Manipulation book but haven't
> found these. Clearly I can create my own Time class and hack around with
> the internal representation of POSIXct, e.g.
>
>    days <- unclass(d)/(24*3600)
>    days-floor(days)
>
> and write print.Time, `-.Time`, etc. etc. but I expect there is already a
> standard class or CRAN package.
>
> -s

--
Jim Holtman
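The difftime approach wrapped as a function, as suggested in the reply. This is a sketch: `hours_of_day` is a made-up name, and a fixed UTC time is used instead of `Sys.time()` so the result is reproducible.

```r
# Hours elapsed since local midnight, as a plain number.
hours_of_day <- function(x) {
  as.numeric(difftime(x, trunc(x, units = "days"), units = "hours"))
}

x <- as.POSIXct("2009-05-20 12:17:13", tz = "UTC")
hours_of_day(x)

# Alternative: read the clock fields from the POSIXlt representation.
lt <- as.POSIXlt(x)
lt$hour + lt$min / 60 + lt$sec / 3600
```

The two versions agree on ordinary days. On a day with a daylight-saving change the elapsed time since midnight and the wall-clock reading can differ, so which one is wanted depends on the application.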