Re: [R] split data with missing data condition
Try this:

    a <- read.table(textConnection("       x         y
    59.74889 3.1317081
    38.77629 1.7102589
    NA       2.2312962
    32.35268 1.3889621
    74.01394 1.5361227
    34.82584 1.1665412
    42.72262 2.7870875
    70.54999 3.3917257
    59.37573 2.6763249
    68.87422 1.9697770
    19.00898 2.0584415
    60.27915 2.5365194
    50.76850 2.3943836
    NA       2.2862790
    39.01229 1.7924957"), header = TRUE)
    a <- as.matrix(a)
    # good data
    a.good <- a[complete.cases(a), , drop = FALSE]
    a.bad <- a[!complete.cases(a), , drop = FALSE]
    a.good
                 x        y
     [1,] 59.74889 3.131708
     [2,] 38.77629 1.710259
     [3,] 32.35268 1.388962
     [4,] 74.01394 1.536123
     [5,] 34.82584 1.166541
     [6,] 42.72262 2.787088
     [7,] 70.54999 3.391726
     [8,] 59.37573 2.676325
     [9,] 68.87422 1.969777
    [10,] 19.00898 2.058441
    [11,] 60.27915 2.536519
    [12,] 50.76850 2.394384
    [13,] 39.01229 1.792496
    a.bad
          x        y
    [1,] NA 2.231296
    [2,] NA 2.286279

On Fri, Oct 15, 2010 at 8:45 AM, Jumlong Vongprasert jumlong.u...@gmail.com wrote:
> Dear all
> I have data like this:
>
>              x         y
>  [1,] 59.74889 3.1317081
>  [2,] 38.77629 1.7102589
>  [3,]       NA 2.2312962
>  [4,] 32.35268 1.3889621
>  [5,] 74.01394 1.5361227
>  [6,] 34.82584 1.1665412
>  [7,] 42.72262 2.7870875
>  [8,] 70.54999 3.3917257
>  [9,] 59.37573 2.6763249
> [10,] 68.87422 1.9697770
> [11,] 19.00898 2.0584415
> [12,] 60.27915 2.5365194
> [13,] 50.76850 2.3943836
> [14,]       NA 2.2862790
> [15,] 39.01229 1.7924957
>
> and I want to split the data into two sets: one with the non-missing rows and one with the missing rows. How can I do this?
> Many thanks.
> Jumlong
> --
> Jumlong Vongprasert
> Institute of Research and Development
> Ubon Ratchathani Rajabhat University
> Ubon Ratchathani THAILAND 34000
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem that you are trying to solve?
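The same split can also be done in one step with split(). This is only a sketch on a small made-up matrix (the values and the object names `ok` and `parts` are illustrative, not from the thread), and it assumes a data frame result is acceptable:

```r
# Small hypothetical matrix with one missing value in column x
a <- cbind(x = c(59.7, NA, 32.4), y = c(3.13, 2.23, 1.39))

# TRUE for rows that contain no NA in any column
ok <- complete.cases(a)

# split() returns a list with one element per level of 'ok':
# "FALSE" holds the incomplete rows, "TRUE" the complete ones
parts  <- split(as.data.frame(a), ok)
a.bad  <- parts[["FALSE"]]
a.good <- parts[["TRUE"]]
```

Computing complete.cases() once and reusing the logical vector also avoids scanning the data twice.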
Re: [R] Dealing with Non-Standard Hours
You could have posted an example of your data. You can use 'sub' to substitute one set of characters for another in your data. There are other ways of doing it if we had an example of your data.

On Fri, Oct 15, 2010 at 5:55 PM, Clint Bowman cl...@ecy.wa.gov wrote:
> A data set I obtained has the hours running from 01 through 24 rather than the conventional 00 through 23. My favorite, strptime, balks at hour 24.
> I thought it would be easy to correct but it must be too late on Friday for my brain and caffeine isn't helping.
> TIA for a hint,
> Clint
> --
> Clint Bowman          INTERNET: cl...@ecy.wa.gov
> Air Quality Modeler   INTERNET: cl...@math.utah.edu
> Department of Ecology VOICE: (360) 407-6815
> PO Box 47600          FAX: (360) 407-7534
> Olympia, WA 98504-7600

--
Jim Holtman
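Since no sample data was posted, here is one possible approach on made-up input. It assumes the date and the hour are available as separate character vectors (the names `dates` and `hours` are hypothetical) and that hour 24 should mean 00:00 on the following day; if the hours are "hour ending" labels, subtract one hour instead:

```r
# Hypothetical input: dates, and hours labelled "01" through "24"
dates <- c("2010-10-15", "2010-10-15", "2010-10-15")
hours <- c("01", "23", "24")

# Parse the date as midnight UTC, then add the hour as an offset in seconds,
# so that hour 24 rolls over to 00:00 of the next day
stamps <- as.POSIXct(dates, format = "%Y-%m-%d", tz = "UTC") +
  as.numeric(hours) * 3600

format(stamps, "%Y-%m-%d %H:%M")
```

Adding seconds to a POSIXct value sidesteps strptime's refusal to parse hour 24, at the cost of having to pick a time zone explicitly.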
Re: [R] delete data row
It is best to use 'all.equal', keeping in mind FAQ 7.31.

On Sat, Oct 16, 2010 at 1:30 PM, Joshua Wiley jwiley.ps...@gmail.com wrote:
> Dear IRD,
> One way is to select every row except those where y = y.j and then assign that to IR. In my example, which() returns a vector of the row numbers where the condition evaluated TRUE, then I used `-` to select not those rows.
>
>     IR <- IR[-which(IR$y == y.j), ]
>
> HTH,
> Josh
>
> On Sat, Oct 16, 2010 at 5:02 AM, IRD ird_u...@hotmail.com wrote:
>> Dear All
>> I have data like this:
>>
>>     IR
>>          x        y
>>     [1,] 5 2.865490
>>     [2,] 3 1.454611
>>     [3,] 3 2.258772
>>     [4,] 6 1.476128
>>     [5,] 4 2.771606
>>     y.j
>>            y
>>     2.865490
>>
>> and I want to delete the rows in IR where y = y.j. How can I do this?
>> IRD
>
> --
> Joshua Wiley
> Ph.D. Student, Health Psychology
> University of California, Los Angeles
> http://www.joshuawiley.com/

--
Jim Holtman
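To make the FAQ 7.31 point concrete, here is a sketch of a tolerance-based version. It assumes IR is a data frame (as the `IR$y` in Josh's code implies), and the tolerance `tol` is an arbitrary illustrative choice:

```r
# Data from the thread, assumed here to be a data frame
IR <- data.frame(x = c(5, 3, 3, 6, 4),
                 y = c(2.865490, 1.454611, 2.258772, 1.476128, 2.771606))
y.j <- 2.865490

# Compare with a tolerance instead of ==, because computed doubles
# that print identically may differ in the last bits (FAQ 7.31)
tol  <- 1e-8                      # assumed tolerance; adjust to the data
keep <- abs(IR$y - y.j) > tol     # TRUE for rows to keep

# Logical indexing also behaves sensibly when nothing matches,
# whereas IR[-which(...), ] would then drop every row
IR <- IR[keep, , drop = FALSE]
```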
Re: [R] Xlsx and R -read problem
Try library(RODBC) and odbcConnectExcel2007; it has worked for me.

On Sat, Oct 16, 2010 at 10:17 AM, ashz a...@walla.co.il wrote:
> Hi,
> I have an Excel 2007 file located in C:\know and called try.xlsx. When I try to read it I get this error:
>
>     file <- system.file("know", "try.xlsx", package = "xlsx")
>     res <- read.xlsx(file, 2) # read the second sheet
>     Error in .jnew("java/io/FileInputStream", file) :
>       java.io.FileNotFoundException:
>
> Can someone tell me what the problem is and how to solve it?
> Cheers,
> Ashz

--
Jim Holtman
Re: [R] Vector multiplication
?outer

    outer(1:2, 1:3, "*")
         [,1] [,2] [,3]
    [1,]    1    2    3
    [2,]    2    4    6

On Sun, Oct 17, 2010 at 3:25 AM, Ron Michael ron_michae...@yahoo.com wrote:
> Is there any operator in R which will multiply each possible combination of the elements of 2 vectors? Suppose I have 2 vectors (1,2) and (1,2,3). If I multiply those 2, I should get: (1,2,3,2,4,6)
> Thanks,

--
Jim Holtman
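outer() returns a matrix, while the question asked for a flat vector in the order (1,2,3,2,4,6). A short sketch of how to get from one to the other (the names `u`, `v`, `m` and `flat` are illustrative):

```r
u <- 1:2
v <- 1:3

# outer() multiplies every element of u with every element of v;
# "*" is the default function, so it can be omitted
m <- outer(u, v)

# R stores matrices column by column, so transpose first to
# read the products row by row
flat <- as.vector(t(m))
flat
```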
Re: [R] how to convert string to object?
Have you tried this:

    temp = "~aparch("
    temp1 = paste(temp, 1, sep = "")
    temp2 = paste(temp1, 1, sep = ",")
    temp3 = paste(temp2, ")", sep = "")
    temp3
    [1] "~aparch(1,1)"
    as.formula(temp3)
    ~aparch(1, 1)
    x <- as.formula(temp3)
    str(x)
    Class 'formula' length 2 ~aparch(1, 1)
      ..- attr(*, ".Environment")=<environment: R_GlobalEnv>

On Sun, Oct 17, 2010 at 2:53 PM, lord12 trexi...@yahoo.com wrote:
>     temp = "~aparch("
>     temp1 = paste(temp, 1, sep = "")
>     temp2 = paste(temp1, 1, sep = ",")
>     temp3 = paste(temp2, ")", sep = "")
>
> temp3 is a character string but I want to convert it to a formula object. How do I do this?

--
Jim Holtman
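The string can also be built in one call before conversion. A sketch, where `p` and `q` are hypothetical names for the two orders; note that as.formula() only parses the text, so aparch() does not need to exist at this point:

```r
# Hypothetical model orders
p <- 1L
q <- 1L

# Build the text in one step instead of three paste() calls
txt <- sprintf("~aparch(%d,%d)", p, q)

# Convert the character string to a formula object
f <- as.formula(txt)
class(f)
```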
Re: [R] Where precision change
FAQ 7.31

?all.equal

On Mon, Oct 18, 2010 at 5:58 AM, Alaios ala...@yahoo.com wrote:
> Hello everyone. I need some help to understand when number precision in R is set. For this please consider the following example:
>
>     for (i in c(2:length(final))){
>         sizex <- c(sizex, (round(final[i]-final[i-1], digits=2)))
>         # round is used to remove values that are too small like e-17.
>         print(round(final[i]-final[i-1], digits=2))
>     }
>
> final[2]-final[1] returns something like 4.440892e-16, which means that these two numbers are effectively the same. They were derived from different processes, so they are not identical in precision.
> Also the line
>
>     print(round(final[2]-final[1]),digits=2)
>
> returns 0, which is correct.
> When the above loop stops executing, I find the value 4.440892e-16 inside the sizex variable, which I was not expecting. As you can see from the small code snippet, I try to round the value before storing it in sizex. The print gives the right value, but for some reason it seems that inside the loop the precision in sizex is changed.
> Can you please help me clarify that?
> Best Regards
> Alex

--
Jim Holtman
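A small illustration of the FAQ 7.31 effect and of the two usual remedies. The vector `final` below is made up for the example and is not Alex's data:

```r
# Hypothetical values: 0.1 + 0.2 and 0.3 print the same but are not identical
final <- c(0.3, 0.1 + 0.2, 0.5)

raw <- diff(final)                 # successive differences, unrounded
raw[1] == 0                        # FALSE: a tiny non-zero residue remains

# Remedy 1: compare with a tolerance instead of ==
same <- isTRUE(all.equal(final[1], final[2]))

# Remedy 2: round the differences before storing them;
# diff() replaces the explicit loop over i
sizex <- round(diff(final), digits = 2)
sizex
```

If a tiny value still shows up in the stored vector, it is worth checking that the rounded value, rather than the raw difference, is what actually gets appended.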
Re: [R] function using values separated by a comma
Try this (I think your result in [2,2] is incorrect):

    dat <- read.table(tc <- textConnection(
    '0,1 1,3 40,10 0,0
    20,5 4,2 10,40 10,0
    0,11 1,2 120,10 0,0'), as.is = TRUE)
    closeAllConnections()
    # split the data and create new matrix
    newDat <- lapply(dat, function(.col){
        # split by comma, unlist, convert to numeric and divide
        x1 <- matrix(as.numeric(unlist(strsplit(.col, ','))), nrow = 2)
        x1[1, ] / colSums(x1)
    })
    do.call(cbind, newDat)
          V1    V2       V3  V4
    [1,] 0.0 0.250 0.80     NaN
    [2,] 0.8 0.667 0.20       1
    [3,] 0.0 0.333 0.923077 NaN

On Mon, Oct 18, 2010 at 2:37 AM, burgundy saub...@yahoo.com wrote:
> Hi,
> Thanks again for your help with this. I would like to use a variation of this function on a similar (numeric) dataset with elements separated by a comma, e.g.
>
>     dat <- read.table(tc <- textConnection(
>     '0,1 1,3 40,10 0,0
>     20,5 4,2 10,40 10,0
>     0,11 1,2 120,10 0,0'), sep="")
>
> to simply calculate the frequency of the first number divided by the total, i.e. x[1]/sum(x), to produce:
>
>          [,1] [,2] [,3] [,4]
>     [1,] 0    0.25 0.8  NaN
>     [2,] 0.8  0.33 0.2  1
>     [3,] 0    0.33 0.92 NaN
>
> My actual dataset is an enormous file (800,000 rows and 100 columns). Any advice on how I can do this, maybe using gsubfn?
> Thank you very much!

--
Jim Holtman
Re: [R] Extracting elements from a nested list
, 2L, 2L, 2L, 2L, 3L, 2L, 2L, 2L, 3L, 3L, 2L, 3L, 2L, 3L, 3L, 3L, 3L, 2L, 3L, 3L, 3L, 2L, 3L, 2L, 2L, 2L, 2L, 3L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 3L, 3L, 3L, 2L, 3L, 3L, 3L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 3L, 2L, 3L, 3L, 2L, 3L, 3L, 2L, 3L, 3L, 2L, 3L, 3L, 2L), .Label = c(0, 1, 2), class = factor), structure(c(3L, 2L, 3L, 2L, 3L, 3L, 1L, 3L, 2L, 2L, 1L, 3L, 3L, 2L, 1L, 3L, 3L, 2L, 3L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 3L, 3L, 2L, 1L, 2L, 1L, 1L, 3L, 1L, 2L, 1L, 3L, 1L, 2L, 3L, 2L, 2L, 2L, 2L, 3L, 2L, 2L, 3L, 3L, 3L, 3L, 1L, 3L, 1L, 2L, 1L, 3L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 1L, 3L, 1L, 3L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 2L, 3L, 3L, 2L, 3L, 1L, 2L, 1L, 1L, 2L, 3L, 3L, 2L), .Label = c(0, 1, 2), class = factor), structure(c(3L, 1L, 3L, 1L, 3L, 3L, 1L, 3L, 1L, 1L, 1L, 3L, 3L, 2L, 1L, 3L, 3L, 1L, 3L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 3L, 3L, 1L, 1L, 2L, 1L, 1L, 3L, 1L, 2L, 1L, 3L, 1L, 1L, 3L, 1L, 2L, 2L, 1L, 3L, 2L, 1L, 3L, 3L, 3L, 3L, 1L, 3L, 1L, 1L, 1L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 1L, 3L, 1L, 3L, 1L, 1L, 1L, 2L, 1L, 1L, 3L, 1L, 3L, 3L, 1L, 3L, 1L, 1L, 1L, 1L, 1L, 3L, 3L, 2L), .Label = c(0, 1, 2), class = factor))) On Oct 18, 2010, at 4:09 PM, jim holtman wrote: files did not make it through the mailer. How did you attach them? try outputting the data using 'dput' and then attaching a '.txt' file, or just pasting them in the email. On Mon, Oct 18, 2010 at 2:40 PM, Gregory Ryslik rsa...@comcast.net wrote: Hi Everyone, This is closer to what I need but this returns me a matrix where each element is a factor. Instead I would want a list of lists. The first entry of the list should equal the first column of the matrix that mapply makes, the second entry to the second column etc... I've attached the two files that have all.predicted.values and max.growth from dput to make for easy testing. Thanks again! 
Kind regards, Greg On Oct 18, 2010, at 1:33 PM, Erich Neuwirth wrote: You probably need mapply since you have 2 list of arguments which you want to use in sync mapply(function(x1,x2)x1[[x2]],all.predicted.values,max.growth) might be what you want. On Oct 18, 2010, at 5:17 PM, Gregory Ryslik wrote: Unfortunately, that gives me null everywhere. Here's the data I have for all.predicted.values and max.growth. Perhaps this will help. Thus I want all.predicted.values[[1]][[4]] then all.predicted.values[[2]][3]] and then all.predicted.values[[3]][[4]]. I've attached what your statement outputs at the end. Thanks again! Browse[2] max.growth [[1]] [1] 4 [[2]] [1] 3 [[3]] [1] 4 Browse[2] all.predicted.values [[1]] [[1]][[1]] [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 [55] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Levels: 0 1 2 [[1]][[2]] [1] 2 2 2 0 2 0 2 2 2 2 2 2 2 2 0 2 2 2 2 2 2 2 2 2 2 0 0 2 2 2 2 0 0 0 2 2 0 0 2 2 0 2 2 2 2 2 0 2 2 2 0 2 2 0 [55] 0 0 2 0 2 0 0 0 0 2 2 2 2 0 2 2 2 0 2 2 0 0 2 2 2 2 2 2 2 0 0 0 2 0 2 2 2 2 0 2 2 2 0 2 0 0 Levels: 0 1 2 [[1]][[3]] [1] 0 0 0 0 2 0 0 0 0 0 0 0 0 2 0 0 0 0 0 2 0 2 2 2 0 0 0 2 0 0 2 0 0 0 0 0 0 0 2 0 0 0 0 0 2 2 0 0 0 2 0 0 0 0 [55] 0 0 2 0 2 0 0 0 0 2 2 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 2 0 0 0 0 0 0 2 0 0 Levels: 0 1 2 [[1]][[4]] [1] 0 0 0 0 2 0 0 0 0 0 0 0 0 2 0 0 0 0 0 2 0 2 2 2 0 0 0 2 0 0 2 0 0 0 0 0 0 0 2 0 0 0 0 0 2 2 0 0 0 2 0 0 0 0 [55] 0 0 2 0 2 0 0 0 0 2 2 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 2 0 0 0 0 0 0 2 0 0 Levels: 0 1 2 [[2]] [[2]][[1]] [1] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 [55] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 Levels: 0 1 2 [[2]][[2]] [1] 2 2 2 2 1 2 2 2 2 2 1 2 2 2 2 1 2 1 2 2 2 2 2 2 2 2 2 2 1 2 2 2 2 2 1 2 2 2 1 2 2 1 1 2 2 2 2 
2 2 2 2 1 2 2 [55] 2 2 2 2 1 2 2 2 2 1 2 2 1 1 1 2 2 2 1 2 1 2 1 2 1 2 2 2 1 1 2 2 1 2 2 1 1 2 1 1 1 2 2 1 2 2 Levels: 0 1 2 [[2]][[3]] [1] 2 2 2 0 1 2 2 2 2 2 1 2 2 2 0 1 2 1 2 2 2 2 2 2 2 0 0 2 1 2 2 2 0 0 1 2 0 0 1 2 0 1 1 2 2 2 0 2 2 2 0 1 2 2 [55] 0 2 2 2 1 0 0 0 0 1 2 2 1 1 1 2 2 0 1 2 1 0 1 2 1 2 2 2 1 1 2 2 1 2 2 1 1 2 1 1 1 2 2 1 0 2 Levels: 0 1 2 [[3]] [[3]][[1]] [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 [55] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Levels: 0 1 2 [[3]][[2]] [1] 2 2 2 0 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 0 0 2 2 2 2 2 0 0 2 2 2 0 2 2 0 2 2 2 2 2 0 2 2 2 0 2 2 2 [55] 0 2 2 2 2 2 0 0 2 2 2 2 2 2 2 2 2 0 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
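For the list-of-results that Greg asks for, mapply() can be told not to simplify. The sketch below uses small made-up factors in place of the real all.predicted.values (only the nesting and the max.growth indices 4, 3, 4 follow the thread); it is an illustration, not a tested answer on the original data:

```r
# Made-up stand-in with the same nesting as in the thread:
# an outer list whose elements are lists of factors
all.predicted.values <- list(
  list(factor(c(0, 0)), factor(c(2, 0)), factor(c(0, 2)), factor(c(0, 1))),
  list(factor(c(2, 2)), factor(c(1, 2)), factor(c(0, 1))),
  list(factor(c(0, 0)), factor(c(2, 2)), factor(c(1, 1)), factor(c(2, 0)))
)
max.growth <- list(4, 3, 4)

# Walk both lists in parallel and pick element i of each inner list.
# SIMPLIFY = FALSE keeps the result as a list instead of letting
# mapply() collapse equal-length results into a matrix
picked <- mapply(function(preds, i) preds[[i]],
                 all.predicted.values, max.growth,
                 SIMPLIFY = FALSE)
str(picked)
```

With SIMPLIFY = FALSE each element of `picked` is still a factor, so picked[[1]] corresponds to all.predicted.values[[1]][[4]], and so on.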
Re: [R] Milliseconds and Time object
Is this what you are after:

    date <- '2010-10-19'
    as.POSIXct(date)
    [1] "2010-10-19 EDT"
    milli <- 3600000  # one hour in milliseconds
    as.POSIXct(date) + milli / 1000
    [1] "2010-10-19 01:00:00 EDT"

On Tue, Oct 19, 2010 at 3:24 AM, statquant2 statqu...@gmail.com wrote:
> Hello all,
> my question for today is the following. I have
> 1. a date (in a string, but straightforward to convert to any format)
> 2. the time as the number of milliseconds elapsed since hour 00:00:00.000 of this date.
> My question is: is there a built-in function that can give me the date+time (as a POSIX object, for instance) from what I have?

--
Jim Holtman
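If the sub-second part matters, the fractional seconds survive the addition and can be shown with the %OS format. A sketch with a made-up offset (the value of `milli` and the fixed UTC time zone are assumptions for the example):

```r
date  <- "2010-10-19"
milli <- 3600500          # hypothetical: one hour plus half a second

# POSIXct counts seconds, so divide the millisecond offset by 1000;
# a fixed time zone keeps the result independent of the local setting
stamp <- as.POSIXct(date, tz = "UTC") + milli / 1000

# %OS3 prints seconds with three decimal places
format(stamp, "%Y-%m-%d %H:%M:%OS3")
```

By default fractional seconds are not printed; options(digits.secs = 3) changes that for ordinary printing.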
Re: [R] ANOVA stuffs_How to save each result from FOR command?
Here is how you can get the results back in a list that you can then analyze:

    results_ezANOVA <- list()
    for (i in 1:90) {
        results_ezANOVA[[i]] <- ezANOVA(data = subset(ast.ast_coef, ast.ast_coef$coef_thr == i),
                                        dv = .(ast.values), between = .(gender),
                                        wid = .(subj), within = .(cond))
    }

On Tue, Oct 19, 2010 at 6:16 AM, BumSeok Jeong bumseok.je...@gmail.com wrote:
> Dear R experts,
> I'm new to R and a beginner in terms of statistics. It should be a simple question, but it is definitely difficult for me to solve by myself.
> I'd like to see the main effect of group (gender; the sample sizes differ, M:F = 23:18), the main effect of condition (cond), and their interaction in each subset of 90 datasets. So I perform ANOVA 90 times using a command like the one below:
>
>     for(i in 1:90) {results_ezANOVA = ezANOVA(data=subset(ast.ast_coef, ast.ast_coef$coef_thr==i), dv=.(ast.values), between=.(gender), wid=.(subj), within=.(cond))}
>
> But I got only the last (90th) result, not all of them. Here are my questions.
> 1) Is my command correct?
> 2) If correct, please let me know how I can get all 90 results.
> 3) What kind of post hoc test would be appropriate?
> Thank you,
> Jeong

--
Jim Holtman
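The same pattern can be written with split() and lapply(), which returns the list directly. The sketch below is only an illustration: it uses made-up data with 3 subsets instead of 90 and lm() as a stand-in for ezANOVA(), so that it runs without the ez package:

```r
# Hypothetical stand-in data: 3 subsets, two groups, random response
set.seed(1)
dat <- data.frame(coef_thr = rep(1:3, each = 10),
                  gender   = rep(c("M", "F"), times = 15),
                  value    = rnorm(30))

# One model per subset; split() makes one data frame per coef_thr value
# and lapply() collects the fitted objects in a named list
fits <- lapply(split(dat, dat$coef_thr),
               function(d) lm(value ~ gender, data = d))

names(fits)          # one entry per subset
```

Individual results are then available as fits[["2"]], and sapply() or lapply() over `fits` can pull out the same statistic from every fit.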
Re: [R] rowsum
Another option to consider:

    x
        A   B   C
    1  89   1 140
    2  89   6  20
    3  89  29 137
    4  89  52  13
    5  89  57  10
    6  89  97  23
    7  89   1  37
    8  89   1  12
    9  89   1   3
    10 52   1  11
    11 52   1  31
    12 52   1  16
    13 52   1   6
    14 52   1  10
    15 52   1  13
    16 52   1  10
    17 52   1  25
    18 52   1   2
    19 52  59  38
    20 52  97  75
    21 57   1  14
    22 57   1  13
    23 57   1  14
    24 57 114  12
    25 57   1  23
    26 57   6  26
    require(sqldf)
    sqldf("select A, B, sum(C) from x group by A, B")
        A   B sum(C)
    1  52   1    124
    2  52  59     38
    3  52  97     75
    4  57   1     64
    5  57   6     26
    6  57 114     12
    7  89   1    192
    8  89   6     20
    9  89  29    137
    10 89  52     13
    11 89  57     10
    12 89  97     23

On Wed, Oct 20, 2010 at 5:42 AM, xtracto b2017...@lhsdv.com wrote:
> Hello,
> I am trying to achieve something which I *think* is possible using rowsum, but a little help would be useful.
> Consider the following dataframe DF0:
>
>     A   B   C
>     89  1   140
>     89  06  20
>     89  29  137
>     89  52  13
>     89  57  10
>     89  97  23
>     89  1   37
>     89  1   12
>     89  1   3
>     52  1   11
>     52  1   31
>     52  1   16
>     52  1   6
>     52  1   10
>     52  1   13
>     52  1   10
>     52  1   25
>     52  1   2
>     52  59  38
>     52  97  75
>     57  1   14
>     57  1   13
>     57  1   14
>     57  114 12
>     57  1   23
>     57  06  26
>
> I need to create a new dataframe containing the sums of all the rows where B = 1 for the different values of A, keeping the rows with other B values the same. That is, for this data sample, the result I expect is something like this (the order of the rows does not matter):
>
>     A   B   C
>     89  1   192   # From adding up: [140 + 37 + 12 + 3]
>     89  06  20
>     89  29  137
>     89  52  13
>     89  57  10
>     89  97  23
>     52  1   124   # From adding up: [11 + 31 + 16 + 6 + 10 + 13 + 10 + 25 + 2]
>     52  59  38
>     52  97  75
>     57  1   64    # From adding up: [14 + 13 + 14 + 23]
>     57  114 12
>     57  06  26
>
> Now, I know it should be possible to first separate the data into two sets:
>
>     DF1 <- DF0[DF0$B != 1,]
>     DF2 <- DF0[DF0$B == 1,]
>
> Then I should apply rowsum to DF2 with some group vector, but I do not know where to go from here.
> Can anyone help? Thanks in advance!
--
Jim Holtman
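The same grouping can be done in base R with aggregate(), without the sqldf dependency. A sketch on a shortened, made-up subset of the data (six rows only, chosen so that two (A, B) pairs repeat):

```r
# Shortened hypothetical subset of the data in the thread
x <- data.frame(A = c(89, 89, 89, 52, 52, 57),
                B = c(1, 1, 6, 1, 1, 114),
                C = c(140, 37, 20, 11, 31, 12))

# Sum C within every combination of A and B; combinations that occur
# only once come back unchanged, which is what the question asked for
res <- aggregate(C ~ A + B, data = x, FUN = sum)
res
```

This relies on the same observation as the SQL version: in the posted data only the B = 1 rows repeat within a value of A, so grouping by both columns leaves every other row as it was.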
Re: [R] Plot help
I think that your first problem is that you have a very large range of values and the CIs are small in comparison, so you won't see any difference on the plots. Do you want to plot each of the 35 values showing the complete range and then where the actual value lies, either inside or outside the range? Maybe you need to scale your data so that at least everything is in a similar range. Are you just trying to show the spread of each individual group, or all together?

On Wed, Oct 20, 2010 at 5:34 AM, Peter Francis peterfran...@me.com wrote:
> Dear List,
> I am relatively new to R and am trying to create more attractive plots than Excel can manage! I have looked through the various packages (ggplot, lattice, Hmisc etc.) but my case does not seem to be mentioned. Maybe it is and I have not noticed; if so, I apologise.
>
>     # I have a series of simulated values, which are means
>     sim <- c(0.0012,0.0009,2,2,9,12,0.0009,2,19,1,1,0.0013,1,0.0009,0.0009,1,26,3,1,2,1,0.0009,1,0.2323,4,2,0.0009,0.0009,0.0009,52,49,1,3,7)
>     # and actual values
>     actual <- c(0,0,2,0,13,20,0,3,38,0,0,0,1,0,0,0,27,2,0,0,1,0,1,0,4,2,0,0,0,54,21,0,4,11)
>
> The x axis is family, ranging from 1-35, and the y axis shows the sim and actual values.
> What I want to do is plot the simulated values with their 95% CIs, and then plot the actual values and see if they fall within the CIs, which they do. The idea is that there is no significant difference between the actual values and the simulated values.
> I have the CIs for sim, and this is where the trouble begins!
>
>     simCI <- c(0.000908781,0.001248025,0.000928731,0.000885441,0.002384808,0.002700088,0.005377963,0.006202863,0.000918969,0.002566072,0.007687229,0.001593536,0.001578519,0.001299327,0.00217493,0.000908781,0.00090428,0.001550469,0.008840134,0.003300862,0.001546501,0.002775418,0.0014778,0.00090428,0.001546201,0.000898151,0.003446757,0.002854941,0.000863444,0.000918969,0.000924599,0.011732253,0.011488353,0.001788464)
>     # I then put this in a dataframe
>     simvsact <- data.frame(sim = sim, actual = actual,
>                            simCI.lower = sim - simCI, simCI.upper = sim + simCI,
>                            fam = factor(paste('Family', 1:34, sep = '')))
>
> As mentioned above, I was looking at getting an x/y scatter plot (I think this would be best; if not, other suggestions would be greatly appreciated) with the CI range highlighted as a block and the actual values as a line in a different colour running through the CI range.
> I hope this makes sense.
> Peter

--
Jim Holtman
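One way to draw what Peter describes in base graphics is points for the simulated means, arrows() for the interval, and a second set of points for the actual values. The sketch below uses five made-up families with deliberately wide intervals so that the bars are visible; the numbers and the helper name `plot_ci` are illustrative only:

```r
# Hypothetical data for five families (not the values from the thread)
sim    <- c(2, 9, 12, 19, 26)        # simulated means
simCI  <- c(0.5, 1.2, 2.0, 1.5, 3.0) # half-width of each interval
actual <- c(2, 13, 20, 38, 27)       # observed values
fam    <- seq_along(sim)

# Which actual values fall inside sim +/- simCI?
inside <- actual >= sim - simCI & actual <= sim + simCI

plot_ci <- function() {
  # Simulated means, with the y range wide enough for bars and actual values
  plot(fam, sim, pch = 16,
       ylim = range(sim - simCI, sim + simCI, actual),
       xlab = "Family", ylab = "Value")
  # Vertical bars with flat caps at both ends (code = 3, angle = 90)
  arrows(fam, sim - simCI, fam, sim + simCI,
         angle = 90, code = 3, length = 0.05)
  # Actual values as crosses, coloured by whether they lie inside the bar
  points(fam, actual, pch = 4, col = ifelse(inside, "blue", "red"))
}
```

Calling plot_ci() draws the figure on the current device. With intervals as narrow as those in the thread, the bars would collapse onto the points, which is the scaling problem raised in the reply.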
Re: [R] create a list fails
I do not see that it has been created yet. You may have defined it earlier in the 'list' expression, but as an object it is not available yet. You might have to do something like using 'within':

    x <- list()
    x <- within(x, {
        a = 1:10
        b = 11:20
        c = a + b
        d = a * b})
    x
    $d
     [1]  11  24  39  56  75  96 119 144 171 200

    $c
     [1] 12 14 16 18 20 22 24 26 28 30

    $b
     [1] 11 12 13 14 15 16 17 18 19 20

    $a
     [1]  1  2  3  4  5  6  7  8  9 10

On Wed, Oct 20, 2010 at 10:42 AM, Maas James Dr (MED) j.m...@uea.ac.uk wrote:
> I cannot understand why this fails:
>
>     faicoutput2 <- list(stuff21 = as.numeric(faicout$coefficients[2]),
>     +     stuff31=as.numeric(faicout$coefficients[3]),
>     +     stuff41=as.numeric(faicout$coefficients[4]),
>     +     stuff32=(stuff21-stuff31),
>     +     stuff42=(stuff21-stuff41),
>     +     stuff43=(stuff32-stuff42)
>     + )
>     Error: object 'stuff21' not found
>
> Why does it have to be found, i.e. exist previously? It is being created.
> But this works fine:
>
>     data <- list(Ntrials = numtritot, Ncomparisons=2,
>                  treat=c(rep(1, N.trials[1,2]), rep(2, N.trials[1,3])),
>                  total.patientnums.trt1=dat2[ ,2],
>                  total.patientnums.trt23=dat2[ ,2],
>                  num.countstrt1=dat2[ ,5],
>                  num.countstrt23=dat2[ ,6]
>     )
>
> ===
> Dr. Jim Maas
> University of East Anglia

--
Jim Holtman
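An alternative to 'within' is to create the list with the directly available values first and then add the derived elements one assignment at a time, so that each element exists before it is used. A sketch, where the vector `co` is a made-up stand-in for faicout$coefficients:

```r
# Hypothetical stand-in for faicout$coefficients
co <- c(NA, 2.0, 1.5, 0.5)

# Step 1: elements that depend only on existing objects
out <- list(stuff21 = co[2],
            stuff31 = co[3],
            stuff41 = co[4])

# Step 2: derived elements, each referring to elements already in 'out'
out$stuff32 <- out$stuff21 - out$stuff31
out$stuff42 <- out$stuff21 - out$stuff41
out$stuff43 <- out$stuff32 - out$stuff42

str(out)
```

Unlike the within() result shown above, this keeps the elements in the order in which they were defined.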
Re: [R] Initiating graphics recording in RGraphics window via a script
?windows

On Wed, Oct 20, 2010 at 11:19 AM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote:
> Hello!
> Sometimes I have to produce several graphs at a time, but need to be able to see them all one by one in the RGraphics window. I do it manually like this:
> I create some plot:
>
>     plot(1:5)
>
> It opens the RGraphics window. I click on the window, go (in the menu) to History-Recording, and then run my several graphs, e.g.:
>
>     for(i in 10:12){
>         plot(1:i)
>     }
>
> Is there any way to avoid doing this manually, i.e. to initiate the graphics recording in the RGraphics window from the script itself?
> Thanks a lot for your help!
> --
> Dimitri Liakhovitski
> Ninah Consulting
> www.ninah.com

--
Jim Holtman
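The help page referred to documents a `record` argument of the windows() device. A guarded sketch of how a script might use it; the wrapper name `open_recording_device` is made up for illustration, and the device itself exists only on Windows builds of R, so this has not been checked on other platforms:

```r
# Open a graphics window with plot history recording switched on.
# windows() exists only on Windows, so the call is guarded and the
# function reports whether the device was opened.
open_recording_device <- function() {
  if (.Platform$OS.type == "windows") {
    windows(record = TRUE)   # 'record' argument as documented in ?windows
    return(invisible(TRUE))
  }
  invisible(FALSE)
}

# Intended use at the top of a script (on Windows):
# open_recording_device()
# for (i in 10:12) plot(1:i)
# ...then page back and forth through the plots with PgUp / PgDn
```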
Re: [R] Plot creates a straigth line
'plot' is doing exactly what you are asking it to do. Take a close look at your data:

          cyto_std_conc    od
     [1,]     11.371777 10.00
     [2,]      9.786814  8.00
     [3,]      8.201852  6.00
     [4,]      6.616889  4.00
     [5,]      5.031927  2.00
     [6,]      3.446964  1.00
     [7,]     11.371777 10.50
     [8,]      9.786814  7.80
     [9,]      8.201852  6.40
    [10,]      6.616889  3.80
    [11,]      5.031927  2.10
    [12,]      3.446964  0.95

See, between the 6th and 7th values you are asking it to draw a line from the lowest to the highest. You want to split your plotting into 'plot(data[1:6])' and 'lines(data[7:12])' to avoid the straight line. But you should always look at your data to see if that is what you intended.

On Wed, Oct 20, 2010 at 11:18 AM, 1Rnwb sbpuro...@gmail.com wrote:
> Hello all,
> I am using 'plot' to create standard curves for ELISA data. When I use 'plot' with type='b' I get the points connected with lines, plus one straight line from the lowest data point to the highest data point. How can I avoid this or remove it from the figure? I am using R 2.9.1; below is an example of the data.
>
>     od<-c(10, 8, 6,4,2,1, 10.5,7.8,6.4,3.8,2.1,0.95)
>     cyto_conc=2650 # Highest cytokine concentration user defined
>     cyto_std_conc <-c(cyto_conc)
>     for (i in 1:5)
>     {
>         cyto_conc = cyto_conc /3
>         cyto_std_conc <-c(cyto_std_conc ,cyto_conc)
>     }
>     cyto_std_conc<-log2(rep(cyto_std_conc,2))
>     cyto<-cbind(cyto_std_conc,od)
>     plot(cyto_std_conc,od, type='b')
>
> I have searched the help using '?plot' in R as well as Google. All the examples available online give me the plot the way it is shown in the example, but when I use plot on my data it gives me a straight line.
> thanks
> sharad

--
Jim Holtman
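The advice above, written out as a sketch: set up the axes once, then draw each replicate as its own line so that nothing joins point 6 to point 7. The data are the ones posted in the thread; the grouping vector `replicate_id` and the helper name `plot_replicates` are assumptions added for the example:

```r
# Data from the thread: two replicates of a six-point, 3-fold dilution series
od   <- c(10, 8, 6, 4, 2, 1, 10.5, 7.8, 6.4, 3.8, 2.1, 0.95)
conc <- log2(rep(2650 / 3^(0:5), 2))

# Assumed grouping: first six points are replicate 1, last six replicate 2
replicate_id <- rep(1:2, each = 6)

plot_replicates <- function() {
  # Empty plot that only sets up the axes for the full data range
  plot(conc, od, type = "n",
       xlab = "log2(concentration)", ylab = "OD")
  # One separate line per replicate, so the two curves are never joined
  for (r in unique(replicate_id)) {
    idx <- replicate_id == r
    lines(conc[idx], od[idx], type = "b", col = r)
  }
}
```

Calling plot_replicates() draws both curves. Inserting an NA between the two replicates in both vectors and plotting once with type = 'b' has the same effect, because lines are broken at missing values.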
Re: [R] efficient test for missing values (NAs)
?complete.cases

On Oct 20, 2010, at 18:53, Ali Tofigh alix.tof...@gmail.com wrote:
> What is the best way to detect whether or not a (potentially large) matrix contains missing values (NAs)? I use
>
>     if (sum(is.na(x)) > 0) {...}
>
> Are there more efficient ways?
> /Ali
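For a plain yes/no answer, two shorter forms are available. Note that anyNA() was added to base R (in version 3.1.0) after this thread was written, so it is an addition here rather than something suggested in the thread:

```r
# Small hypothetical matrix with one missing value
x <- matrix(c(1, 2, NA, 4), nrow = 2)

# any() avoids summing the whole logical matrix
has_na_old <- any(is.na(x))

# anyNA() (R >= 3.1.0) can stop at the first NA and does not need to
# build the intermediate logical matrix
has_na_new <- anyNA(x)

c(has_na_old, has_na_new)
```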
Re: [R] rounding up (always)
Why don't you just use 'pretty'?

    pretty(c(-1225, 2224))
    [1] -1500 -1000  -500     0   500  1000  1500  2000  2500
    pretty(c(-4.28, 6.45))
    [1] -6 -4 -2  0  2  4  6  8

On Wed, Oct 20, 2010 at 8:38 PM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote:
> Thank you for your help, everyone. Actually, I am building a lot of graphs (in a loop), but the values on the y axes could range from [-5; 5] to [-10,000; 10,000] from graph to graph. So, I am trying to create a ylim ranging from ymin to ymax such that it looks appropriate for the range. For example, if the actual range is from -4.28 to 6.45, I'd like the range to be -5 to 7. But if the range is from -1225 to 2248, then I'd like it to be from -1500 to 2500 or from -2000 to 3000. Hence, my original question.
> Dimitri
>
> On Wed, Oct 20, 2010 at 5:55 PM, Ted Harding ted.hard...@wlandres.net wrote:
>> On 20-Oct-10 21:27:46, Duncan Murdoch wrote:
>>> On 20/10/2010 5:16 PM, Dimitri Liakhovitski wrote:
>>>> Hello! I am trying to round a number always up, i.e., whatever the positive number is, I would like to round it to the closest 10 that is higher than this number, the closest 100 that is higher than this number, etc. For example:
>>>>
>>>>     x <- 3241.388
>>>>
>>>> signif(x,1) rounds to the closest thousand, i.e., to 3,000, but I'd like to get 4,000 instead.
>>>> signif(x,2) rounds to the closest hundred, i.e., to 3,200, but I'd like to get 3,300 instead.
>>>> signif(x,3) rounds to the closest ten, i.e., to 3,240, but I'd like to get 3,250 instead.
>>>>
>>>> Of course, I could do:
>>>>
>>>>     floor(signif(x,1)+1000)
>>>>     floor(signif(x,2)+100)
>>>>     floor(signif(x,3)+10)
>>>>
>>>> But it's very manual, because in the problem I am facing the numbers sometimes have to be rounded to a 1000, sometimes to a 100, etc.
>>>
>>> Write a function. You have very particular needs, so it's unlikely there's already one out there that matches them.
>>> Duncan Murdoch
>>
>> As Duncan and Clint suggest, writing a function is straightforward: for the problem as you have stated it, on the lines of
>>
>>     function(x,k){floor(signif(x,k-as.integer(log(x,10)-1))) + 10^k}
>>
>> However, what do you *really* want to happen to 3000?
>> Ted.
>> E-Mail: (Ted Harding) ted.hard...@wlandres.net Fax-to-email: +44 (0)870 094 0861 Date: 20-Oct-10 Time: 22:55:47 -- XFMail --
>
> -- Dimitri Liakhovitski Ninah Consulting www.ninah.com

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
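For the "always round up" request as originally stated, a minimal sketch of such a function could look like the following. The name round_up_signif and its argument k (number of significant digits to keep) are assumptions made for this illustration, and it only handles positive numbers. It uses ceiling() rather than signif(), which also settles Ted's question in one particular way: a value that is already round, such as 3000, is left unchanged.

```r
# Round positive x up to k significant digits.
round_up_signif <- function(x, k) {
  stopifnot(all(x > 0), k >= 1)
  # Size of the last digit that is kept, e.g. 1000 for x = 3241.388, k = 1.
  mag <- 10^(floor(log10(x)) - k + 1)
  ceiling(x / mag) * mag
}

x <- 3241.388
r1 <- round_up_signif(x, 1)   # up to the next thousand
r2 <- round_up_signif(x, 2)   # up to the next hundred
r3 <- round_up_signif(x, 3)   # up to the next ten
```

For axis limits specifically, range(pretty(c(ymin, ymax))) is usually the simpler route, as suggested above.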
Re: [R] Help: Using vectorization method for vectors comparision
try this:

    a <- c(5, 10, 13, 19, 23)
    b <- c(1, 4, 7, 9, 15)
    # use outer for comparison
    z <- outer(a, b, ">")
    # use rowSums to get the indices (may have to check for zero)
    b[rowSums(z)]
    [1]  4  9  9 15 15

On Wed, Oct 20, 2010 at 10:41 PM, bruclee brouc...@gmail.com wrote:
> I am trying to compare two sorted vectors; no element is duplicated in either vector. Ex.
>
>     a = c(5, 10, 13, 19, 23)
>     b = c(1, 4, 7, 9, 15)
>
> For each element in a, I need to find the max element in b which is smaller than it, so the short answer will look like [4, 9, 9, 15, 15]. I don't want to use any loop since my real project contains millions of elements. Is there any way to speed up my operation? Many thanks.
> -- View this message in context: http://r.789695.n4.nabble.com/Help-Using-vectorization-method-for-vectors-comparision-tp3004952p3004952.html

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
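With millions of elements, outer() builds a length(a) x length(b) matrix, which may not fit in memory. A hedged alternative sketch uses findInterval(), which works on a sorted b without forming that matrix. The left.open argument assumes R >= 3.3.0, and the names idx and largest_below are made up for this illustration.

```r
a <- c(5, 10, 13, 19, 23)
b <- c(1, 4, 7, 9, 15)   # must be sorted in increasing order

# For each a[i], count how many elements of b are strictly smaller.
idx <- findInterval(a, b, left.open = TRUE)

# An index of 0 means no element of b is smaller; report NA in that case.
idx[idx == 0] <- NA
largest_below <- b[idx]
```

The NA step covers the "may have to check for zero" caveat in the outer() version.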
Re: [R] problem on using read.csv function
An example would be useful. Look at what is causing values that you think are numbers to be interpreted as character strings and therefore converted to factors. For example, are there commas in the numbers? Are some values missing and replaced by some character sequence that represents missing values? You can always convert the column to numeric:

    yourData$col <- as.numeric(as.character(yourData$col))

So there is something in your data that is causing the conversion. After you do the conversion above, look for NAs in the data. This might show you where the problem is.

On Thu, Oct 21, 2010 at 10:17 PM, mou sonia paperh...@gmail.com wrote:
> Hi, I'm using read.csv to import a table, but several columns are changed to factor variables automatically. They are actually numbers, not factor levels. Why did this happen? How can I get the correct table? Thanks a lot.
> Sonia

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
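The diagnosis above can be sketched on a made-up three-row CSV in which one entry, "n/a", is not a number. The data and the variable names are invented for this illustration. stringsAsFactors = TRUE is set explicitly because it stopped being the default in R 4.0.0.

```r
csv_text <- "id,amount
1,10
2,n/a
3,30"

# The stray "n/a" makes the whole column non-numeric, so it becomes a factor.
raw <- read.csv(text = csv_text, stringsAsFactors = TRUE)

# Going through as.character() recovers the numbers; "n/a" becomes NA.
values <- suppressWarnings(as.numeric(as.character(raw$amount)))

# The NA positions point at the offending rows.
bad_rows <- which(is.na(values))

# Declaring the missing-value marker up front avoids the problem.
clean <- read.csv(text = csv_text, na.strings = "n/a")
```

Calling as.numeric() directly on the factor would return the internal level codes rather than the values, which is why the as.character() step matters.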
Re: [R] Cbind query
This comes about because read.table (which I assume you used, but you did not show us what commands you were using) converts character columns to factors. If you are not using factors, then you probably want the data read in as characters. You should learn to use 'str' to look at the structure of your objects to understand what might be happening. See the example below:

    tab <- read.table(textConnection("  one   two
    1 apple fruit
    2 ball  game
    3 chair wood
    4 wood  plain
    5 fruit banana
    6 cloth silk"))
    closeAllConnections()
    str(tab)  # note that you have 'factors'
    'data.frame':   6 obs. of  2 variables:
     $ one: Factor w/ 6 levels "apple","ball",..: 1 2 3 6 5 4
     $ two: Factor w/ 6 levels "banana","fruit",..: 2 3 6 4 1 5
    cbind(tab$one, tab$two)  # this gives numeric values of the factors
         [,1] [,2]
    [1,]    1    2
    [2,]    2    3
    [3,]    3    6
    [4,]    6    4
    [5,]    5    1
    [6,]    4    5
    # now read in data and do not convert to factors (note: as.is=TRUE)
    tab <- read.table(textConnection("  one   two
    1 apple fruit
    2 ball  game
    3 chair wood
    4 wood  plain
    5 fruit banana
    6 cloth silk"), as.is = TRUE)
    closeAllConnections()
    str(tab)  # now you have characters
    'data.frame':   6 obs. of  2 variables:
     $ one: chr  "apple" "ball" "chair" "wood" ...
     $ two: chr  "fruit" "game" "wood" "plain" ...
    cbind(tab$one, tab$two)  # this gives character values
         [,1]    [,2]
    [1,] "apple" "fruit"
    [2,] "ball"  "game"
    [3,] "chair" "wood"
    [4,] "wood"  "plain"
    [5,] "fruit" "banana"
    [6,] "cloth" "silk"

On Fri, Oct 22, 2010 at 7:06 AM, karthicklakshman karthick.laksh...@gmail.com wrote:
> I am new to R and request your kind help. I have a table like the one below:
>
>       one   two
>     1 apple fruit
>     2 ball  game
>     3 chair wood
>     4 wood  plain
>     5 fruit banana
>     6 cloth silk
>
> Note: there are duplicate entries. The task is to create relations between the entries of each row, like apple - fruit.
> When I tried to combine column 1 with column 2 (one, two) using cbind, the strings were changed to numerical values, something like this:
>
>           [,1] [,2]
>      [1,]   10   53
>      [2,]   25  562
>      [3,]   25  462
>      [4,]   25 1045
>      [5,]   25  488
>      [6,]   26 1062
>      [7,]   27  951
>      [8,]   27  144
>      [9,]   27  676
>     [10,]   27  486
>
> Please suggest how to get the string names back in the output, as in the first table, using cbind. Thanks in advance.
> regards
> kaarz
> -- View this message in context: http://r.789695.n4.nabble.com/Cbind-query-tp3006988p3006988.html

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
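If the data frame already holds factors and re-reading the file is inconvenient, a small sketch of converting on the spot follows. The three-row table is a shortened stand-in for the data above, and stringsAsFactors = TRUE is set explicitly because newer R versions (4.0.0 and later) no longer create factors by default.

```r
tab <- data.frame(one = c("apple", "ball", "chair"),
                  two = c("fruit", "game", "wood"),
                  stringsAsFactors = TRUE)

# cbind() on factors binds their integer codes ...
codes <- cbind(tab$one, tab$two)

# ... so convert to character first to keep the labels.
labels <- cbind(as.character(tab$one), as.character(tab$two))

# One way to express each row as a relation such as "apple - fruit".
relations <- paste(tab$one, tab$two, sep = " - ")
```

paste() converts factors to their labels on its own, so no explicit as.character() is needed there.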
Re: [R] Controlling number of numbers before R rewrites to +e18 etc
Your best bet is to make sure that you read the IDs in as characters. If they are being read in as floating point numbers, then there are only about 15 significant digits of accuracy, so if you have IDs of 18-22 digits, you will be losing data. So if you are using read.table, then look at colClasses to see how to do this. Provide a subset of your data and the statements that you are using to read in the data.

On Fri, Oct 22, 2010 at 1:15 PM, ZeMajik zema...@gmail.com wrote:
> Hey, I'm using R as a pre-processor for a large dataset with IDs which are numeric (but have no numeric meaning, so they can be seen as factors). I do some data formatting and then write it out to a csv file. However, the problem is that the IDs are very long, 18-22 chars long more precisely. R is constantly rewriting these IDs to the abbreviated +eX form, which hinders me from exporting the data to the csv since the IDs are no longer intact. I've tried telling R that the ID column is a factor, but this results in two problems: 1) Since I have millions of rows and R is slower handling factors than numbers, my comp can't run the process in any kind of reasonable time, and 2) some IDs STILL seem to be rewritten somehow. The second point made me believe that perhaps R is rewriting upon import? Does anyone have any tips on how to solve this problem?
> Thanks, Mike

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
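A minimal sketch of the colClasses suggestion, using an invented two-column CSV; the column names id and value and the sample IDs are assumptions for this illustration only.

```r
csv_text <- "id,value
1234567890123456789012,1.5
9876543212345678987654,2.5"

# Read the id column as text so that no digits are lost on import.
dat <- read.csv(text = csv_text,
                colClasses = c(id = "character", value = "numeric"))

# For comparison: a double cannot hold all 22 digits exactly.
as_number <- format(as.numeric(dat$id[1]), scientific = FALSE)

# Writing the character column back out leaves the IDs intact.
out <- tempfile(fileext = ".csv")
write.csv(dat, out, row.names = FALSE, quote = FALSE)
```

Comparing as_number with dat$id[1] shows the damage that happens when the ID passes through a numeric type, which is consistent with the suspicion that the rewriting occurs at import.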
Re: [R] read.table input array
You need to make sure that your data is read in as characters and not factors:

    x <- read.table(textConnection("ktot attractors pctstatesinattractors t lengths
    1.0 2.0 3.8146973E-4 17 c(2,2)
    1.0 1.0 5.722046E-4 28 c(2)
    1.0 2.0 9.536743E-4 18 c(2,2)
    1.0 1.0 0.0010490417 14 c(1)"), as.is = TRUE, header = TRUE)
    closeAllConnections()
    str(x)
    'data.frame':   4 obs. of  5 variables:
     $ ktot                 : num  1 1 1 1
     $ attractors           : num  2 1 2 1
     $ pctstatesinattractors: num  0.000381 0.000572 0.000954 0.001049
     $ t                    : int  17 28 18 14
     $ lengths              : chr  "c(2,2)" "c(2)" "c(2,2)" "c(1)"
    x
      ktot attractors pctstatesinattractors  t lengths
    1    1          2          0.0003814697 17  c(2,2)
    2    1          1          0.0005722046 28    c(2)
    3    1          2          0.0009536743 18  c(2,2)
    4    1          1          0.0010490417 14    c(1)
    x$varList <- lapply(x$lengths, function(a) mean(eval(parse(text = a))))
    x
      ktot attractors pctstatesinattractors  t lengths varList
    1    1          2          0.0003814697 17  c(2,2)       2
    2    1          1          0.0005722046 28    c(2)       2
    3    1          2          0.0009536743 18  c(2,2)       2
    4    1          1          0.0010490417 14    c(1)       1

On Fri, Oct 22, 2010 at 10:19 PM, Balpo ba...@gmx.net wrote:
> Hello again Jim (and everyone). I am having a weird problem here with the same parsing thing. For example, for the first row I have the following 5 columns:
>
>     1.0 2.0 3.8146973E-4 17 c(2,2)
>
> I need to convert that c(2,2) into a list and get its mean, in this particular case mean=2. My program does:
>
>     t1 <- read.table(file="file.dat", header=T, colClasses=c("numeric", "numeric", "numeric", "numeric", "factor"))
>     t1$lengthz <- lapply(t1$lengths, function(a) eval(parse(text=a)))  # As Jim taught me
>     t1$avglen <- as.vector(mode="numeric", lapply(t1$lengthz, function(i) mean(i)))
>
> but the 6th column is strangely getting 780 instead of 2. This solution used to work! :-( Do you have any idea about what is going on? I attach file.dat. Thank you for your support.
> Balpo
>
> On 19/07/10 16:38, Balpo wrote:
>> Thank you a lot, Jim. Issue solved.
>> Balpo
>>
>> On 16/07/10 11:27, jim holtman wrote:
>>> Here is a way of creating a separate list of variable-length vectors that you can use in your processing:
>>>
>>>     # read into a dataframe
>>>     x <- read.table(textConnection("A B C T Lengths
>>>     1 4.0 0.0015258789 18 c(1,2,3)
>>>     1 4.0 0.0015258789 18 c(1,2,6,7,8,3)
>>>     1 4.0 0.0015258789 18 c(1,2,3,1,2,3,4,5,6,7,9)
>>>     1 4.0 0.0015258789 18 c(1,2,3)
>>>     1 1.0 0.0017166138 24 c(1,1,4)"), header=TRUE)
>>>     # create a 'list' with the variable length vectors
>>>     # assuming that the Lengths are legal R expressions using 'c'
>>>     x$varList <- lapply(x$Lengths, function(a) eval(parse(text=a)))
>>>     x
>>>       A B           C  T                  Lengths                         varList
>>>     1 1 4 0.001525879 18                 c(1,2,3)                         1, 2, 3
>>>     2 1 4 0.001525879 18           c(1,2,6,7,8,3)                1, 2, 6, 7, 8, 3
>>>     3 1 4 0.001525879 18 c(1,2,3,1,2,3,4,5,6,7,9) 1, 2, 3, 1, 2, 3, 4, 5, 6, 7, 9
>>>     4 1 4 0.001525879 18                 c(1,2,3)                         1, 2, 3
>>>     5 1 1 0.001716614 24                 c(1,1,4)                         1, 1, 4
>>>     str(x)
>>>     'data.frame':   5 obs. of  6 variables:
>>>      $ A      : int  1 1 1 1 1
>>>      $ B      : num  4 4 4 4 1
>>>      $ C      : num  0.00153 0.00153 0.00153 0.00153 0.00172
>>>      $ T      : int  18 18 18 18 24
>>>      $ Lengths: Factor w/ 4 levels "c(1,1,4)","c(1,2,3)",..: 2 4 3 2 1
>>>      $ varList:List of 5
>>>       ..$ : num  1 2 3
>>>       ..$ : num  1 2 6 7 8 3
>>>       ..$ : num  1 2 3 1 2 3 4 5 6 7 ...
>>>       ..$ : num  1 2 3
>>>       ..$ : num  1 1 4
>>>
>>> On Fri, Jul 16, 2010 at 10:51 AM, Balpo ba...@gmx.net wrote:
>>>> Hello to all! I am new to R and I need your help. I'm trying to read a file whose contents are similar to this:
>>>>
>>>>     A B C T Lengths
>>>>     1 4.0 0.0015258789 18 c(1,2,3)
>>>>     1 1.0 0.0017166138 24 c(1,1,4)
>>>>
>>>> So all the columns are numeric values, except Lengths, which is supposed to be a variable-length array of integers. How can I make R read them as arrays of integers? Or otherwise, convert the character array to an array of integers.
>>>> When I read the file, I do it like this:
>>>>
>>>>     t1 = read.table(file=paste("./borrar.dat", sep=""), header=T, colClasses=c("numeric", "numeric", "numeric", "numeric", "array"))
>>>>
>>>> But the 5th column is treated as an array of characters, and when trying to convert it to another class of data, I either get two strings "c(1,2,3)" and "c(1,1,4)" or, using a toRaw converter, I get the corresponding ASCII (?) values. Should the input be modified in order to be able to read it as an array of integers? Thank you for your help.
>>>> Balpo
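A hedged sketch of a parser that avoids eval(parse()) altogether and does not care whether the column arrives as character or factor. It assumes that every entry has the plain form c(n1,n2,...); the helper name parse_vec and the sample strings are made up for this illustration. The as.character() call is the important part: it is an assumption, not something confirmed in the thread, that the 780 above came from a factor's internal integer code being evaluated instead of its label.

```r
lengths_chr <- c("c(2,2)", "c(2)", "c(1,2,6,7,8,3)")

# Hypothetical helper: turn a string such as "c(1,2,3)" into a numeric
# vector by stripping the "c(" and ")" and splitting on commas.
parse_vec <- function(s) {
  inner <- gsub("^c\\(|\\)$", "", as.character(s))
  as.numeric(strsplit(inner, ",", fixed = TRUE)[[1]])
}

vec_list <- lapply(lengths_chr, parse_vec)
avg_len  <- vapply(vec_list, mean, numeric(1))
```

Because nothing is evaluated as code, a malformed entry yields NA with a warning instead of running arbitrary expressions from the data file.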
Re: [R] Summarizing For Values with Multiple categories
Here is another way of doing it using some of the functions in a step-by-step manner:

    # had to put some separators in since data format was not apparent
    # best to provide sample data with 'dput'
    x <- read.table(textConnection("Cat1|Cat2 |Cat3 | COG |Counts
    A | B | C |COG1 |10
    B | D ||COG2 | 20
    C |||COG3 | 30
    D |||COG4 | 40")
        , header = TRUE
        , as.is = TRUE
        , strip.white = TRUE
        , sep = "|"
        )
    closeAllConnections()
    x
      Cat1 Cat2 Cat3  COG Counts
    1    A    B    C COG1     10
    2    B    D      COG2     20
    3    C           COG3     30
    4    D           COG4     40
    # pull out the data into a 'long' format based on the first 3 columns
    # iterate over the first three columns combining with Counts
    long <- do.call(rbind, lapply(x[1:3], function(.col){
        cbind(.col, x[['Counts']])
    }))
    # remove blanks
    long <- long[long[,1] != "", ]
    # now aggregate converting the character 'counts' to numeric
    tapply(as.numeric(long[,2]), long[,1], sum)
     A  B  C  D
    10 30 40 60

On Sat, Oct 23, 2010 at 7:03 PM, Alison Waller alison.wal...@embl.de wrote:
> Thanks! I tried reading the help for aggregate and can't figure out which form of the formula I am using, and therefore the syntax. I'm getting the error below.
>
>     aggregate(counts ~ ind, merge(stack(CAT2COG), df, by = 1), sum)
>     Error in as.data.frame.default(x) : cannot coerce class "formula" into a data.frame
>     aggregate(counts ~ Cats, merge(stack(CAT2COG), df, by = 1), sum)
>     Error in as.data.frame.default(x) : cannot coerce class "formula" into a data.frame
>     Cats
>     [1] A B C D E
>     Levels: A B C D E
>     aggregate(counts ~ COGs, merge(stack(CAT2COG), df, by = 1), sum)
>     Error in as.data.frame.default(x) : cannot coerce class "formula" into a data.frame
>
> On 24-Oct-10, at 12:50 AM, Gabor Grothendieck wrote:
>>     aggregate(counts ~ ind, merge(stack(CAT2COG), df, by = 1), sum)
-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
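The same reshaping can be sketched so that the counts stay numeric throughout, which avoids the character round trip that cbind() causes. The data frame below re-creates the sample table by hand; the names long and totals are assumptions for this illustration.

```r
x <- data.frame(Cat1 = c("A", "B", "C", "D"),
                Cat2 = c("B", "D", "", ""),
                Cat3 = c("C", "", "", ""),
                COG = c("COG1", "COG2", "COG3", "COG4"),
                Counts = c(10, 20, 30, 40),
                stringsAsFactors = FALSE)

# Stack the three category columns on top of each other and repeat the
# counts once per column so that each category keeps its row's count.
long <- data.frame(cat = unlist(x[c("Cat1", "Cat2", "Cat3")], use.names = FALSE),
                   counts = rep(x$Counts, times = 3))

# Drop the empty category slots, then sum the counts per category.
long <- long[long$cat != "", ]
totals <- tapply(long$counts, long$cat, sum)
```

Here totals is a named vector with one entry per category, matching the tapply() result in the step-by-step version.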
Re: [R] If Statement Help
You need to understand how 'indexing' is done in R:

    x <- read.table(textConnection("           Price
    2010-10-11    99
    2010-10-12   101
    2010-10-13   102
    2010-10-14   103
    2010-10-15    99
    2010-10-18    98
    2010-10-19    97
    2010-10-20   101
    2010-10-21   101
    2010-10-22   101"), header = TRUE)
    closeAllConnections()
    x
               Price
    2010-10-11    99
    2010-10-12   101
    2010-10-13   102
    2010-10-14   103
    2010-10-15    99
    2010-10-18    98
    2010-10-19    97
    2010-10-20   101
    2010-10-21   101
    2010-10-22   101
    x[x$Price > 100, , drop = FALSE]
               Price
    2010-10-12   101
    2010-10-13   102
    2010-10-14   103
    2010-10-20   101
    2010-10-21   101
    2010-10-22   101

On Sat, Oct 23, 2010 at 9:56 PM, Jason Kwok jayk...@gmail.com wrote:
>                Price
>     2010-10-11    99
>     2010-10-12   101
>     2010-10-13   102
>     2010-10-14   103
>     2010-10-15    99
>     2010-10-18    98
>     2010-10-19    97
>     2010-10-20   101
>     2010-10-21   101
>     2010-10-22   101
>
> I have this dataset and I only want to return instances when the Price is > 100. If I use the code
>
>     Price > 100
>
> then it will evaluate each entry as TRUE or FALSE. What is the code to return only the TRUE results?
> Thanks, Jay

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
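The indexing idea can be broken into its parts: the comparison produces a logical vector, and that vector is then used as a row index. A small sketch on the same prices, built here as a plain data frame; the names above and hits are made up for this illustration.

```r
x <- data.frame(
  Price = c(99, 101, 102, 103, 99, 98, 97, 101, 101, 101),
  row.names = c("2010-10-11", "2010-10-12", "2010-10-13", "2010-10-14",
                "2010-10-15", "2010-10-18", "2010-10-19", "2010-10-20",
                "2010-10-21", "2010-10-22")
)

# The comparison gives one TRUE/FALSE per row.
above <- x$Price > 100

# Using it as a row index keeps only the TRUE rows;
# drop = FALSE keeps the result a data frame.
hits <- x[above, , drop = FALSE]

# which() turns the logical vector into row positions instead.
positions <- which(above)
```

subset(x, Price > 100) is an equivalent shorthand for interactive use.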
Re: [R] Find index of a string inside a string?
I think what you want is 'regexpr':

    regexpr("bcd", "aabcd")
    [1] 3
    attr(,"match.length")
    [1] 3

On Mon, Oct 25, 2010 at 7:27 AM, yoav baranan ybara...@hotmail.com wrote:
> Hi, I am searching for the equivalent of the function Index from SAS. In SAS, index("abcd", "bcd") will return 2 because "bcd" is located in the 2nd cell of the "abcd" string. The equivalent in R should do this:
>
>     myIndex <- foo("abcd", "bcd")  # return 2
>
> What is the function that I am looking for? I want to use the return value in substr, like I do in SAS.
> thanks, y. baranan.

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
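A small wrapper sketch that behaves like the SAS function described in the question: it returns the start position as a plain integer and 0 when there is no match. The name str_index is hypothetical; fixed = TRUE is used so that the search string is taken literally rather than as a regular expression.

```r
# Hypothetical helper mimicking SAS index(): position of the first
# occurrence of needle in haystack, or 0 if it does not occur.
str_index <- function(haystack, needle) {
  pos <- as.integer(regexpr(needle, haystack, fixed = TRUE))
  ifelse(pos < 0, 0L, pos)
}

s <- "abcd"
start <- str_index(s, "bcd")

# Use the position with substr(), as in the original question.
tail_part <- substr(s, start, nchar(s))
```

regexpr() itself reports -1 for "not found", so code that uses its result directly should test for that value.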
Re: [R] plot does not work
Try 'graphics.off()' to close any device that might be open, then see if you get any output.

On Mon, Oct 25, 2010 at 5:44 AM, Alaios ala...@yahoo.com wrote:
> Hello everyone,
> The following two commands
>
>     plot.default(seq(1,5),seq(2,6))
>     plot(seq(1,5),seq(2,6))
>
> plot nothing. One day ago this would create a simple plot, but unfortunately right now no plot appears. ?plot returns
>
>     Help on topic 'plot' was found in the following packages:
>       Plot a Raster* object (in package raster in library /home/apa/R/x86_64-unknown-linux-gnu-library/2.11)
>       Generic X-Y Plotting (in package graphics in library /usr/lib64/R/library)
>
> What do you think I should blame for that?
> Best Regards
> Alex

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
Re: [R] Controlling number of numbers before R rewrites to +e18 etc
You can always read a portion of the file and then write it out. For large files, I will read in 10,000 lines, fix them up, write them out, and then go back and process the next batch of lines. You haven't shown us what a sample of your input/output is, or how you are processing it. Depending on what type of preprocessing needs to be done to the data, Perl is also an option. But most things I used to use Perl for, I can do within R these days. Here is an example of reading in your IDs:

    x <- read.table(textConnection("1234567890123456789012 987654321234567898765432 98765432123456789876543
    1234567890123456789012 987654321234567898765432 98765432123456789876543
    1234567890123456789012 987654321234567898765432 98765432123456789876543
    1234567890123456789012 987654321234567898765432 98765432123456789876543
    1234567890123456789012 987654321234567898765432 98765432123456789876543
    1234567890123456789012 987654321234567898765432 98765432123456789876543
    1234567890123456789012 987654321234567898765432 98765432123456789876543")
        , colClasses = rep('character', 3))
    closeAllConnections()
    str(x)
    'data.frame':   7 obs. of  3 variables:
     $ V1: chr  "1234567890123456789012" "1234567890123456789012" "1234567890123456789012" "1234567890123456789012" ...
     $ V2: chr  "987654321234567898765432" "987654321234567898765432" "987654321234567898765432" "987654321234567898765432" ...
     $ V3: chr  "98765432123456789876543" "98765432123456789876543" "98765432123456789876543" "98765432123456789876543" ...
    x
                          V1                       V2                      V3
    1 1234567890123456789012 987654321234567898765432 98765432123456789876543
    2 1234567890123456789012 987654321234567898765432 98765432123456789876543
    3 1234567890123456789012 987654321234567898765432 98765432123456789876543
    4 1234567890123456789012 987654321234567898765432 98765432123456789876543
    5 1234567890123456789012 987654321234567898765432 98765432123456789876543
    6 1234567890123456789012 987654321234567898765432 98765432123456789876543
    7 1234567890123456789012 987654321234567898765432 98765432123456789876543

On Mon, Oct 25, 2010 at 4:41 AM, ZeMajik zema...@gmail.com wrote:
> Thanks Jim, but I still have the problem that the pre-processing becomes way too computationally expensive. R seems to handle characters and factors much, much worse than numeric IDs. I don't have enough RAM to even write the file when they are viewed as chars instead of numeric values! Anyone have any other ideas? Is it not possible to tell R not to rewrite upon import? It wouldn't matter if it would only write the correct IDs to the exported csv file, but it exports the abbreviated version, which is of no use.
> Mike

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
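The batch approach described above (read some lines, fix them up, write them out, repeat) can be sketched with open file connections, so that only one chunk is in memory at a time. The temporary files, the two-column layout and the tiny chunk size are assumptions made for this demonstration; real data would use its own paths and a chunk size such as 10000.

```r
# Build a small demo input file (a stand-in for the real data).
infile  <- tempfile(fileext = ".txt")
outfile <- tempfile(fileext = ".csv")
ids <- c("1234567890123456789012", "987654321234567898765432",
         "98765432123456789876543")
writeLines(paste(ids, seq_along(ids)), infile)

chunk_size <- 2   # deliberately tiny here
con_in  <- file(infile, open = "r")
con_out <- file(outfile, open = "w")
repeat {
  lines <- readLines(con_in, n = chunk_size)
  if (length(lines) == 0) break
  # Parse only this chunk, keeping the IDs as text.
  chunk <- read.table(text = lines, colClasses = c("character", "numeric"))
  # ... any per-chunk processing would go here ...
  write.table(chunk, con_out, sep = ",", quote = FALSE,
              row.names = FALSE, col.names = FALSE)
}
close(con_in)
close(con_out)
```

Because both connections stay open across iterations, each readLines() call continues where the previous one stopped and each write.table() call appends to the output.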
Re: [R] dataframe, transform, strsplit
try this:

    df
       have want
    1 a.b.c    a
    2 d.e.f    d
    3 g.h.i    g
    df$get <- gsub("^([^.]+).*", "\\1", df$have)
    df
       have want get
    1 a.b.c    a   a
    2 d.e.f    d   d
    3 g.h.i    g   g

On Mon, Oct 25, 2010 at 12:53 PM, Matthew Pettis matthew.pet...@gmail.com wrote:
> Hi, I have a dataframe with a column of character strings, and I need to extract the part of each string before the first '.' character and put it into a separate column. I thought I could use 'strsplit' for it within 'transform', but I can't seem to get the right invocation. Here is a sample dataframe that has what I have, what I want, and what I get. Can someone tell me how to get what is in the 'want' column from the 'have' column programmatically?
> tia, Matt
>
>     df <- data.frame(have=c("a.b.c", "d.e.f", "g.h.i"), want=c("a","d","g"))
>     df.xform <- transform(df, get=strsplit(as.character(have), split=".", fixed=TRUE)[[1]][1])
>     df.xform

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
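For completeness, a sketch of how the strsplit() attempt in the question can be made to work. The expression strsplit(...)[[1]][1] picks the first piece of the first row only, and transform() then recycles that single value down the column; taking the first piece of every list element fixes that. The column names get and get2 are just for this illustration.

```r
df <- data.frame(have = c("a.b.c", "d.e.f", "g.h.i"),
                 stringsAsFactors = FALSE)

# strsplit() returns one character vector per row;
# take the first piece of each.
pieces <- strsplit(df$have, ".", fixed = TRUE)
df$get <- vapply(pieces, function(p) p[1], character(1))

# Equivalent regular-expression form: drop everything from the first dot on.
df$get2 <- sub("\\..*$", "", df$have)
```

Both forms are vectorized over the column, so neither needs an explicit loop.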
Re: [R] non-numeric argument to binary operator error while reading ncdf file
put:

    options(error=utils::recover)

in your script so that when an error occurs you are dropped into the 'browser', where you can examine the variables at that point in time. There are several references on how to use the debugging tools in R that will help you resolve your problem. We cannot tell from the information you provided what the problem is. You need to at least provide commented, minimal, self-contained, reproducible code so others can reproduce the error and provide feedback.

On Wed, Oct 27, 2010 at 9:27 AM, Charles Novaes de Santana charles.sant...@imedea.uib-csic.es wrote:
> Hi everyone, I am a newbie in R and on this discussion list. I am trying to use the R package ncdf to read values of temperature from a NCDF file. I did it before with another file using the function get.var.ncdf, but now there is an error that I cannot solve, and I would really appreciate it if you could help me. I am using R version 2.11.1 (2010-05-31) on a machine with Linux 2.6.26-2-amd64.
>
>     library(ncdf)
>     file_temp <- open.ncdf("File.nc")
>     temp <- get.var.ncdf(file_temp, "tasmax", verbose=TRUE)
>     [1] get.var.ncdf: entering. Here is varid:
>     [1] tasmax
>     [1] checking to see if passed varid is actually a dimvar
>     [1] entering vobjtodimname with varid= tasmax
>     [1] vobjtodimname: is a character type varid. This file has 3 dims
>     [1] vobjtodimname: no cases found, returning FALSE
>     [1] get.var.ncdf: isdimvar: FALSE
>     [1] vobjtovarid: entering with varid=tasmax
>     [1] Variable named tasmax found in file with varid= 4
>     [1] vobjtovarid: returning with varid deduced from name; varid= 4
>     [1] get.var.ncdf: ending up using varid= 4
>     [1] ndims: 3
>     [1] get.var.ncdf: varsize:
>     [1] 68 40 21275
>     [1] get.var.ncdf: start:
>     [1] 1 1 1
>     [1] get.var.ncdf: count:
>     [1] 68 40 21275
>     [1] get.var.ncdf: totvarsize: 57868000
>     [1] Getting var of type 4 (1=short, 2=int, 3=float, 4=double, 5=char, 6=byte)
>     [1] get.var.ncdf: C call returned 0
>     [1] count.nodegen: 68 Length of data: 57868000
>     [2] count.nodegen: 40 Length of data: 57868000
>     [3] count.nodegen: 21275 Length of data: 57868000
>     [1] get.var.ncdf: final dims of returned array:
>     [1] 68 40 21275
>     [1] varid: 4
>     [1] nc$varid2Rindex: 0 nc$varid2Rindex: 0 nc$varid2Rindex: 0
>     [4] nc$varid2Rindex: 1
>     [1] nc$varid2Rindex[varid]: 1
>     [1] get.var.ncdf: setting missing values to NA
>     Error in mv * 1e-05 : non-numeric argument to binary operator
>
> Thank you very much for your attention! Cheers, Charles
> -- Um axé! :)
> -- Charles Novaes de Santana, PhD student - Global Change, Laboratorio Internacional de Cambio Global, Department of Global Change Research, Instituto Mediterráneo de Estudios Avanzados (CSIC/UIB), Calle Miquel Marques 21, 07006 Esporles - Islas Baleares - España

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
Re: [R] Alter character attribute
try this:

    x <- read.table(textConnection("  ID date time
    1 22 9/10/2007 0:00:00
    2 44 2/2/2006 0:00:00"), header = TRUE)
    closeAllConnections()
    x
      ID      date    time
    1 22 9/10/2007 0:00:00
    2 44  2/2/2006 0:00:00
    x$month <- sub("^([[:digit:]]+).*", "\\1", x$date)
    x$year <- sub(".*?([[:digit:]]+)$", "\\1", x$date)
    x
      ID      date    time month year
    1 22 9/10/2007 0:00:00     9 2007
    2 44  2/2/2006 0:00:00     2 2006

On Thu, Oct 28, 2010 at 6:40 PM, LCOG1 jr...@lcog.org wrote:
> Hi everyone. I have some records that include a date attribute for the date and time, but I need to separate the data and analyze it separately in GIS by month and year, so I need to pull these attributes out and create their own attribute fields. So the input RawData2.. returns
>
>       ID   period_end_date
>     1 22 9/10/2007 0:00:00
>     2 44  2/2/2006 0:00:00
>
> and I need to get
>
>     ID   period_end_date Month Year
>     22 9/10/2007 0:00:00     9 2007
>     44  2/2/2006 0:00:00     2 2006
>
> The code below gets me this in list form, which I can then add back into the initial data frame, BUT I have over 4.5 million records, and when I ran the code below it ran for more than 18 hours and only got through about 2.7 million records before I gave up and ended the process. So how can I make this more efficient and possibly add the new attributes (month/year) to the data frame on the fly?
> Thanks guys
>
>     #Create sample data
>     RawData2.. <- data.frame(ID=c(22,44), period_end_date=c("9/10/2007 0:00:00","2/2/2006 0:00:00"))
>     #Create lists to store month and year results
>     Data.Month_ <- list()
>     Data.Year_ <- list()
>     #pull out year/month attribute and put in own column
>     for(i in 1:length(RawData2..$ID)){
>       #Select Record
>       Data.X <- RawData2..[i,]
>       #Separate date into month, day, and year
>       DateSplit <- strsplit(Data.X$period_end_date, "/")
>       #Select month
>       Month <- unlist(DateSplit)[1]
>       #Separate year from time attribute
>       Year.X <- strsplit(unlist(DateSplit)[3], " ")
>       Year.Y <- unlist(Year.X)[1]
>       Data.Month_[[i]] <- Month
>       Data.Year_[[i]] <- Year.Y
>     }
>
> -- View this message in context: http://r.789695.n4.nabble.com/Alter-character-attribute-tp3018202p3018202.html

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
Re: [R] Alter character attribute
I didn't see your test data, so here is the solution with your data:

> RawData2.. <- data.frame(ID=c(22,44), period_end_date=c("9/10/2007 0:00:00",
+ "2/2/2006 0:00:00"))
> RawData2..$month <- sub("^([[:digit:]]+).*", "\\1", RawData2..$period_end_date)
> RawData2..$year <- sub(".*/([[:digit:]]+) .*", "\\1", RawData2..$period_end_date)
> RawData2..
  ID   period_end_date month year
1 22 9/10/2007 0:00:00     9 2007
2 44  2/2/2006 0:00:00     2 2006

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
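The vectorised sub() calls above avoid the row-by-row loop. As an alternative sketch (not what was posted in the thread), the same two columns can be obtained by parsing the text as a date, assuming every value really is in month/day/year order. The data frame name RawData is a stand-in for the poster's RawData2.. object.

```r
# Hypothetical stand-in for the poster's data
RawData <- data.frame(ID = c(22, 44),
                      period_end_date = c("9/10/2007 0:00:00", "2/2/2006 0:00:00"))

# Parse the leading month/day/year part; trailing characters (the time)
# are ignored by the format-based conversion
d <- as.Date(as.character(RawData$period_end_date), format = "%m/%d/%Y")

# Whole-column operations, no explicit loop
RawData$Month <- as.integer(format(d, "%m"))
RawData$Year  <- as.integer(format(d, "%Y"))
RawData
```

Any value that does not match the assumed format comes back as NA, which makes malformed records easy to find with is.na(d).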
Re: [R] check RAM usage
?memory.size

Use it before and after a sequence of commands to get an idea of the memory usage.

On Fri, Oct 29, 2010 at 5:21 AM, Joel joda2...@student.uu.se wrote:

Hi,

Is there any way to check a certain command's or procedure's RAM usage? I'm after something similar to system.time(bla), which gives me the time the command took to perform, but for RAM usage. Hope you understand what I mean.

Best regards
Joel

-- View this message in context: http://r.789695.n4.nabble.com/check-RAM-usage-tp3018753p3018753.html

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
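memory.size is only available on Windows (and in recent R versions it is reportedly just a stub). A portable sketch of the same before/after idea can be built on gc(), which reports a "max used" figure that can be reset. The helper name peak_mb below is made up for this illustration, and the number it returns is R's own accounting rather than what the operating system sees.

```r
# Hypothetical helper: rough peak memory (in Mb) reached while evaluating expr
peak_mb <- function(expr) {
  gc(reset = TRUE)                 # reset the "max used" statistics
  force(expr)                      # evaluate the (lazily passed) expression now
  g <- gc()                        # collect again and read the statistics
  mb_col <- which(colnames(g) == "max used") + 1   # the "(Mb)" column next to it
  sum(g[, mb_col])                 # Ncells + Vcells, in Mb
}

m <- peak_mb(numeric(1e6))         # about 8 Mb of doubles plus R's baseline
m
```

Because the figure includes whatever is already in the workspace, comparing two calls is more informative than reading a single value.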
Re: [R] Reading multiple .csv-files and assigning them to variable names
Read them into a list; much easier to handle:

myList <- lapply(filenames, read.csv)

On Fri, Oct 29, 2010 at 5:16 AM, Sarah Moens sara...@telenet.be wrote:

Hi all,

I've been trying to find a solution for the problem of reading multiple files and storing them in a variable that contains the names by which I want to call the datasets later on.

For example (5 filenames):

- The filenames are stored in one variable:

  filenames = paste(paste('name', '_', 1:5, sep = ''), '.csv', sep = '')

- Subsequently I have a variable just containing the meaningful names for the datasets:

  meaningfulnames = c('name1','name2'...,'name5')

- I want to link each of these names to the data that is read:

  for (i in 1:5) {
    meaningfulnames[i] = read.csv(filenames[i], header = TRUE, sep = ',')
  }

I need to read in quite a lot of datafiles. I have code doing this one at a time, but since the number of datafiles I need to read will increase in the future, I want to make sure I have a more flexible solution for this.

Thanks a lot for your help. I have tried to look in the help pages and also came across dbfread, but I can't seem to find something I can use or understand at this point.

Sarah

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
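To connect the list suggestion with the "meaningful names" in the question, the list elements can simply be named after reading. This is a small sketch: the three temporary files and their contents are invented here so that the example runs on its own.

```r
# Create three small throwaway CSV files (illustration only)
filenames <- file.path(tempdir(), paste0("name_", 1:3, ".csv"))
for (f in filenames) {
  write.csv(data.frame(a = 1:2, b = 3:4), f, row.names = FALSE)
}
meaningfulnames <- paste0("name", 1:3)

# Read every file into one list, then attach the names
myList <- lapply(filenames, read.csv)
names(myList) <- meaningfulnames

# Each dataset is now available by name
myList[["name2"]]
```

Adding more files only means extending filenames and meaningfulnames; the reading code stays the same.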
Re: [R] Printing data.frame data: alternatives to print?
Is this what you want:

> df
       f1      f2
1   Maj I Minor A
2   Maj I Minor A
3   Maj I Minor A
4  Maj II Minor A
5  Maj II Minor B
6  Maj II Minor B
7 Maj III Minor B
8 Maj III Minor C
9 Maj III Minor C
> df[!duplicated(df),]
       f1      f2
1   Maj I Minor A
4  Maj II Minor A
5  Maj II Minor B
7 Maj III Minor B
8 Maj III Minor C

On Fri, Oct 29, 2010 at 9:53 AM, Matthew Pettis matthew.pet...@gmail.com wrote:

Hi,

I have a data frame with two factors (well, more, but 2 for simple consideration), and I want to display the different combinations of them that actually occur in the data. In reality, there are too many of them to do a 'table' call and have one col vertical and one col horizontal (I don't want any of the factors listed horizontally).

Before I try to write a function to do this for me, I was wondering if there were alternate printing styles for data that already exist, and if someone could direct me to them? Included is sample code and 2 possibilities (others welcome for consideration) of how I want to display some data.

Thanks,
Matt

-
df <- data.frame(
  f1=rep(c("Maj I", "Maj II", "Maj III"), each=3),
  f2=c("Minor A", "Minor A", "Minor A", "Minor A", "Minor B", "Minor B", "Minor B", "Minor C", "Minor C")
)
-

What I want printed is something like:

---
f1      f2
Maj I   Minor A
Maj II  Minor A
        Minor B
Maj III Minor B
        Minor C
---

or

---
f1      f2
Maj I   Minor A
Maj II  Minor A
Maj II  Minor B
Maj III Minor B
Maj III Minor C
---

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
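The duplicated() answer gives the second layout asked for. A possible way to get the first layout (repeated f1 values blanked out) is sketched below; the object names combos and grouped are made up, and the blanking step is an addition that was not part of the reply.

```r
df <- data.frame(
  f1 = rep(c("Maj I", "Maj II", "Maj III"), each = 3),
  f2 = c("Minor A", "Minor A", "Minor A", "Minor A", "Minor B",
         "Minor B", "Minor B", "Minor C", "Minor C")
)

# Combinations that actually occur (same as the reply above)
combos <- df[!duplicated(df), ]

# Blank out f1 where it repeats the value already shown further up
grouped <- combos
grouped$f1 <- as.character(grouped$f1)
grouped$f1[duplicated(grouped$f1)] <- ""

print(grouped, row.names = FALSE)
```

The as.character() call is there so the assignment of "" also works when f1 is stored as a factor.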
Re: [R] grouping question
try this:

> x
 [1]  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
> y <- cut(x, breaks=c(-Inf,6,18, Inf), labels=c('a','b','c'))
> levels(y) <- c('night','day','night')
> y
 [1] night night night night night night night day   day   day   day   day   day   day   day   day   day   day
[19] day   night night night night night night
Levels: night day

On Fri, Oct 29, 2010 at 8:56 PM, will phillips will.phill...@q.com wrote:

Hello,

I have what is probably a very simple grouping question. However, given my limited exposure to R, I have not found a solution yet despite my research efforts and wild attempts at what I thought might produce some sort of result.

I have a very simple list of integers that range between 1 and 24. These correspond to hours of the day. I am trying to create a grouping of Day and Night with

Day = 6 to 17.99
Night = 1 to 5.59 and 18 to 24

Using the cut command I can create the segments, but I have not found a combine type of command to merge the two night segments. No luck with if/else either.

Any help would be greatly appreciated.

Thank you
Will

-- View this message in context: http://r.789695.n4.nabble.com/grouping-question-tp3019922p3019922.html

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
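In the output above, hour 6 falls in "night" and hour 18 in "day", because cut() closes its intervals on the right by default. If the boundaries should follow the question literally (day starting at 6, night starting at 18), a variant with right = FALSE is one way to do it. This is a sketch of that adjustment, not something posted in the thread.

```r
x <- 0:24

# right = FALSE makes the intervals [-Inf, 6), [6, 18), [18, Inf)
y <- cut(x, breaks = c(-Inf, 6, 18, Inf), labels = c("a", "b", "c"), right = FALSE)

# Assigning a repeated level name merges the first and third segments
levels(y) <- c("night", "day", "night")

table(y)
```

With this variant hours 6 to 17 are "day" and everything else is "night".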
Re: [R] Possible memory leak in loop.
If you are running on Windows, you might want to use 'perfmon' to look at the memory usage of the process over time. You might also want to put calls to memory.size in your looping code to see if there are things you are doing in the code that might temporarily use a lot of space and maybe fragment memory.

On Mon, Nov 1, 2010 at 8:35 AM, Jonathan P Daily jda...@usgs.gov wrote:

I was trying to use memory.size() to determine whether a code loop I am executing created a memory leak, since one replicate of the simulation takes 670.98 seconds according to proc.time(), while 5 replicates takes 170762 seconds. So I set it up as:

memA <- memory.size()
looping code...
memB <- memory.size()

memA returns as 9.3, and memB returns 11.3. I'm not familiar with fluctuation patterns in RAM usage (if there are any). Does anyone with more experience know if this is indicative of a memory leak?

Thanks,
Jon

-- Jonathan P. Daily
Technician - USGS Leetown Science Center
11649 Leetown Road Kearneysville WV, 25430
(304) 724-4480
Is the room still a room when its empty? Does the room, the thing itself have purpose? Or do we, what's the word... imbue it. - Jubal Early, Firefly

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
Re: [R] write.csv changes the format of the date
works fine on 2.11.1 Windows:

> x <- structure(list(Last_Successful_Run = structure(1L, .Label = "30/10/2010",
+ class = "factor")), .Names = "Last_Successful_Run", class = "data.frame",
+ row.names = c(NA,
+ -1L))
> x
  Last_Successful_Run
1          30/10/2010
> str(x)
'data.frame': 1 obs. of 1 variable:
 $ Last_Successful_Run: Factor w/ 1 level "30/10/2010": 1
> write.csv(x, file='x.csv')
> z <- read.csv('x.csv')
> z
  X Last_Successful_Run
1 1          30/10/2010

Your data is a 'factor' so it should not be doing any date conversion.

On Mon, Nov 1, 2010 at 9:14 AM, Santosh Srinivas santosh.srini...@gmail.com wrote:

Dear Group,

Why does write.csv modify the date format when it writes to a file? I have the following variable Param_Dat:

dput(Param_Dat)
structure(list(Last_Successful_Run = structure(1L, .Label = "30/10/2010", class = "factor")), .Names = "Last_Successful_Run", class = "data.frame", row.names = c(NA, -1L))

When I do:

write.csv(Param_Dat,"Param.csv",quote=F,row.names=F)

the format of the info in the file is:

Last_Successful_Run
31OCT2010

I want to retain the dd/mm/ format ... Please advise.

-- Thanks R-Helpers. Yes, this is a silly question and it will not be repeated! :-)

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
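One way to check what write.csv really put on disk, independent of any program that later opens the file, is to read the raw lines back. The sketch below uses a temporary file and the same arguments as in the question. If the raw text is unchanged, the reformatting is presumably introduced by whatever opens the file afterwards; that is an assumption, since the thread does not confirm it.

```r
x <- data.frame(Last_Successful_Run = factor("30/10/2010"))

f <- tempfile(fileext = ".csv")
write.csv(x, f, quote = FALSE, row.names = FALSE)

# Inspect the file as plain text
raw_lines <- readLines(f)
raw_lines
```

Here raw_lines holds the header line followed by the text 30/10/2010 exactly as stored in the factor.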
Re: [R] foreloop? aggregating time series data into groups
you can use na.locf in the zoo package:

> require(zoo)
> x <- c(0,2,0,1,0,0,0,0,1,0,1,0,0,0,2,1,0,0,0,2,0,0,0,1)
> # replace zeros with NA
> x[x == 0] <- NA
> x
 [1] NA  2 NA  1 NA NA NA NA  1 NA  1 NA NA NA  2  1 NA NA NA  2 NA NA NA  1
> na.locf(x, fromLast = TRUE)
 [1] 2 2 1 1 1 1 1 1 1 1 1 2 2 2 2 1 2 2 2 2 1 1 1 1

On Mon, Nov 1, 2010 at 3:34 PM, blurg ian.jh...@gmail.com wrote:

I have a data set similar to the set below, where 1 and 2 indicate test results and 0 indicates time points in between where there are no test results. I would like to allocate the time points leading up to a test result with the value of the test result.

What I have:  What I want:
1             1
0             1
0             1
0             1
1             1
0             2
0             2
2             2
0             1
0             1
1             1
0             2
2             2

I have attempted methods creating a data.frame of the breaks/changes of values from 0 to 1 or to 2.

x <- c(0,2,0,1,0,0,0,0,1,0,1,0,0,0,2,1,0,0,0,2,0,0,0,1)
x1 <- which(diff(x) == 1)
x2 <- which(diff(x) == 2)

Whatever the solution, it can't be entered by hand due to the size of the dataset (10 million and change). Any ideas?

This is my first time posting to this forum and I am relatively new to R, so please don't flame me too hard. Desperate times call for desperate measures.

Thanks.

-- View this message in context: http://r.789695.n4.nabble.com/foreloop-aggregating-time-series-data-into-groups-tp3022667p3022667.html

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
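If installing zoo is not an option, the same backward fill can be sketched in base R with findInterval(). This is an alternative to the reply above, not the poster's method; the names idx and filled are made up for the example.

```r
x <- c(0,2,0,1,0,0,0,0,1,0,1,0,0,0,2,1,0,0,0,2,0,0,0,1)

# Positions that hold an actual test result
idx <- which(x != 0)

# For each position i, count the results strictly before i;
# the next result at or after i is then idx[count + 1]
nxt <- idx[findInterval(seq_along(x) - 1, idx) + 1]

# Look up the value of that next result (NA if no result follows)
filled <- x[nxt]
filled
```

Everything here is vectorised, so it should scale to millions of rows without an explicit loop; trailing zeros with no later result come out as NA.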
Re: [R] Sorting data from one column with strings
try sqldf:

> x
  Sample_no Species Nitrogen Carbon
1         1     Cod     15.2  -19.0
2         2 Haddock     14.8  -20.2
3         3     Cod     15.6  -18.5
4         4     Cod     13.2  -20.1
5         5 Haddock     14.3  -18.8
> require(sqldf)
> sqldf("select Species, avg(Nitrogen) Nitrogen, avg(Carbon) Carbon from x group by Species")
  Species Nitrogen Carbon
1     Cod 14.66667  -19.2
2 Haddock 14.55000  -19.5

On Thu, Nov 4, 2010 at 8:28 AM, Ramsvatn Silje silje.ramsv...@uit.no wrote:

Hello,

I have tried to find this out some other way, but having been unsuccessful I have to try this list. I assume this should be quite simple.

I have a dataset with 4 columns, Sample_no, Species, Nitrogen, Carbon, in csv format. In the Species column I have many different species with a varying number of observations per species, e.g.

Sample_no Species Nitrogen Carbon
1         Cod     15.2     -19.0
2         Haddock 14.8     -20.2
3         Cod     15.6     -18.5
4         Cod     13.2     -20.1
5         Haddock 14.3     -18.8
etc.

I want to calculate mean, standard deviation etc. per species for the observations Nitrogen and Carbon, and later do plots and stats with the different species. I will in the end have many species, so I need it to be automatic; I can't enter code for every species separately.

Can anyone help me with this? Or if this is the wrong list to send this question to, where do I send it?

Thank you very much in advance.

Best regards
Silje Ramsvatn
PhD-candidate
University of Tromsø
Norway

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
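The question also asks for standard deviations, which the SQL above does not compute. A base R sketch with aggregate() covers both statistics without extra packages; the object names means and sds are made up here, and the data are the five sample rows from the question.

```r
x <- data.frame(
  Sample_no = 1:5,
  Species   = c("Cod", "Haddock", "Cod", "Cod", "Haddock"),
  Nitrogen  = c(15.2, 14.8, 15.6, 13.2, 14.3),
  Carbon    = c(-19.0, -20.2, -18.5, -20.1, -18.8)
)

# One row per species, one column per measured variable
means <- aggregate(cbind(Nitrogen, Carbon) ~ Species, data = x, FUN = mean)
sds   <- aggregate(cbind(Nitrogen, Carbon) ~ Species, data = x, FUN = sd)

means
sds
```

New species in the data are picked up automatically, since the grouping comes from the Species column rather than from the code.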
Re: [R] Matrix Manipulation
try this:

> x
     V2 V3 V4
[1,]  1  1  1
[2,]  2  2  2
[3,]  3  3  3
[4,]  4  4  4
[5,]  5 NA  5
[6,] NA NA  6
[7,] NA NA NA
> offset <- c(0,2,1)
> # add the control to the data and make two copies so we can offset
> x.new <- rbind(offset, x, x)
> result <- apply(x.new, 2, function(.col){
+     .col[seq(nrow(x) - .col[1L] + 2L, length = nrow(x))]
+ })
> result
 V2 V3 V4
  1 NA NA
  2 NA  1
  3  1  2
  4  2  3
  5  3  4
 NA  4  5
 NA NA  6

On Thu, Nov 4, 2010 at 11:47 AM, emj83 stp08...@shef.ac.uk wrote:

Hi,

Is there a quick way to go from this matrix:

A
     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    2    2    2
[3,]    3    3    3
[4,]    4    4    4
[5,]    5   NA    5
[6,]   NA   NA    6
[7,]   NA   NA   NA

to this matrix:

B
     [,1] [,2] [,3]
[1,]    1   NA   NA
[2,]    2   NA    1
[3,]    3    1    2
[4,]    4    2    3
[5,]    5    3    4
[6,]   NA    4    5
[7,]   NA   NA    6

without using a loop? For example using a vector which describes how many NA's are required from the top of the matrix, so in this case it would be c(0,2,1).

Many thanks
Emma

-- View this message in context: http://r.789695.n4.nabble.com/Matrix-Manipulation-tp3027266p3027266.html

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
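The rbind trick above rotates each column through a doubled copy of the data. A more literal sketch of "put offset[j] NAs on top of column j and drop the same number of values from the bottom" is shown below. It is an alternative formulation, not the code from the reply, and it assumes every offset is between 0 and the number of rows.

```r
A <- matrix(c(1, 2, 3, 4, 5, NA, NA,
              1, 2, 3, 4, NA, NA, NA,
              1, 2, 3, 4, 5, 6, NA), ncol = 3)
offset <- c(0, 2, 1)

# For each column: prepend the NAs, keep only as many values as still fit
B <- sapply(seq_along(offset), function(j) {
  c(rep(NA, offset[j]), head(A[, j], nrow(A) - offset[j]))
})
B
```

sapply still iterates over the columns internally, but no explicit for loop is needed and each column is handled with vector operations.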
Re: [R] how to work with long vectors
Is this what you want:

> x
            id reads
1   Contig79:1     4
2   Contig79:2     8
3   Contig79:3    13
4   Contig79:4    14
5   Contig79:5    17
6   Contig79:6    20
7   Contig79:7    25
8   Contig79:8    27
9   Contig79:9    32
10 Contig79:10    33
11 Contig79:11    34
> x$percent <- x$reads / max(x$reads) * 100
> x
            id reads   percent
1   Contig79:1     4  11.76471
2   Contig79:2     8  23.52941
3   Contig79:3    13  38.23529
4   Contig79:4    14  41.17647
5   Contig79:5    17  50.00000
6   Contig79:6    20  58.82353
7   Contig79:7    25  73.52941
8   Contig79:8    27  79.41176
9   Contig79:9    32  94.11765
10 Contig79:10    33  97.05882
11 Contig79:11    34 100.00000

On Thu, Nov 4, 2010 at 11:46 AM, Changbin Du changb...@gmail.com wrote:

Hi, dear R community,

I have one data set like this. What I want to do is to calculate the cumulative coverage. The following code works for a small data set (#rows = 100), but when fed the whole data set, it is still running after 24 hours. Can someone give some suggestions for long vectors?

id          reads
Contig79:1  4
Contig79:2  8
Contig79:3  13
Contig79:4  14
Contig79:5  17
Contig79:6  20
Contig79:7  25
Contig79:8  27
Contig79:9  32
Contig79:10 33
Contig79:11 34

matt<-read.table("/house/groupdirs/genetic_analysis/mjblow/ILLUMINA_ONLY_MICROBIAL_GENOME_ASSEMBLY/4083340/STANDARD_LIBRARY/GWZW.994.5.1129.trim_69.fastq.19621832.sub.sorted.bam.clone.depth", sep="\t", skip=0, header=F,fill=T)
# dim(matt)
[1] 3384766 2

matt_plot<-function(matt, outputfile) {
names(matt)<-c("id","reads")
cover<-matt$reads
#calculate the cumulative coverage.
cover_per<-function (data) {
  output<-numeric(0)
  for (i in data) {
    x<-(100*sum(ifelse(data = i, 1, 0))/length(data))
    output<-c(output, x)
  }
  return(output)
}
result<-cover_per(cover)

Thanks so much!

-- Sincerely, Changbin --

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
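The loop in the question compares every value against the whole vector and grows output one element at a time, so the work grows roughly with the square of the vector length. The comparison operator inside ifelse() was lost in the archive; the sketch below assumes it was >= (percentage of positions with at least that many reads), which is a guess. The function names cover_per_slow and cover_per_fast are made up for this illustration.

```r
# Loop version, with the ASSUMED comparison (>=) filled in
cover_per_slow <- function(data) {
  sapply(data, function(i) 100 * sum(data >= i) / length(data))
}

# Vectorised version: rank(..., ties.method = "min") is one plus the number
# of strictly smaller entries, so n - rank + 1 counts the entries >= each value
cover_per_fast <- function(data) {
  n <- length(data)
  100 * (n - rank(data, ties.method = "min") + 1) / n
}

reads <- c(4, 8, 13, 14, 14, 20, 25, 27, 32, 33, 34)
slow <- cover_per_slow(reads)
fast <- cover_per_fast(reads)
all.equal(slow, fast)
```

If the original comparison was <= instead, the analogous shortcut would be 100 * rank(data, ties.method = "max") / length(data).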
Re: [R] Loop
in comparison to `[[`, so he was showing you a method that would be more effective. BTW the as.data.frame step is unnecessary, since the first thing write.table does is coerce an object to a data.frame. The write.table name is misleading. It should be write.data.frame. You cannot really write tables with write.table.

You would also use:

file=paste(vari, "csv", sep=".")

as the file argument to write.table

write.table(w1,file="w1.csv",sep=";",row.names=T, dec=".")

What are these next actions supposed to do after the file is written? Are you trying to store a group of related w objects that will later be indexed in sequence? If so, then a list would make more sense.

-- David.

w1 <- w1[order(w1$Freq, decreasing=TRUE),]
w1 <- head(w1, 20)

20 times, where W1-20 (capital letters) are the fields in a data.frame called lit and w1-20 are the data.frames being created.

Hope that explains it better,
m

-Original Message-
From: Patrick Burns [mailto:pbu...@pburns.seanet.com]
Subject: Re: [R] Loop

If I understand properly, you'll want something like:

lit[["w2"]]

instead of

lit$w2

more accurately:

for(i in 1:20) {
  vari <- paste("w", i)
  lit[[vari]]
  ...
}

The two documents mentioned in my signature may help you.

On 03/11/2010 20:23, Matevž Pavlič wrote:

Hi all,

I managed to do what I want (with the great help of this mailing list) manually. Now I would like to automate it. I would probably need a for loop to help me with this, but of course I have no idea how to do that in R. Below is the code that I would like to be replicated a number of times (let's say 20). I would like to achieve that w1 would change to w2, w3, w4 ... up to w20 and by that create 20 data.frames that I would then bind together with cbind.

(I did it as shown below, manually)

w1 <- table(lit$W1)
w1 <- as.data.frame(w1)
write.table(w1,file="w1.csv",sep=";",row.names=T, dec=".")
w1 <- w1[order(w1$Freq, decreasing=TRUE),]
w1 <- head(w1, 20)

w2 <- table(lit$W2)
w2 <- as.data.frame(w2)
write.table(w2,file="w2.csv",sep=";",row.names=T, dec=".")
w2 <- w2[order(w2$Freq, decreasing=TRUE),]
w2 <- head(w2, 20)
. . .

Thanks for the help,
m

David Winsemius, MD
West Hartford, CT

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
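Putting the advice from this thread together (index with `[[`, build the name with paste, keep the results in a list), a loop along the following lines might do what was asked. It is a sketch on invented data: the small lit data frame with columns W1 and W2 and the list name top are assumptions, and the files are written to a temporary directory.

```r
# Invented stand-in for the poster's 'lit' data frame
lit <- data.frame(W1 = c("a", "b", "a", "c"),
                  W2 = c("x", "x", "y", "x"),
                  stringsAsFactors = FALSE)

top <- list()
for (i in 1:2) {                          # 1:20 for the real data
  vari <- paste0("W", i)                  # "W1", "W2", ... with no space
  w <- as.data.frame(table(lit[[vari]]))  # columns Var1 and Freq
  w <- w[order(w$Freq, decreasing = TRUE), ]
  top[[vari]] <- head(w, 20)              # keep the 20 most frequent values
  write.table(top[[vari]],
              file = file.path(tempdir(), paste(vari, "csv", sep = ".")),
              sep = ";", dec = ".")
}
top$W1
```

Note that paste("W", i) with the default separator would produce "W 1"; paste0 (or sep = "") avoids the space. Keeping the pieces in a list also makes a later do.call(cbind, top) straightforward when all pieces have the same number of rows.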
Re: [R] creating vectors with three variables out of three datasets
Is this what you want:

> x
       V1 V2 V3 V4
1 ascii1: 11 12 13
2 ascii2: 14 15 16
3 ascii3: 17 18 19
> z <- as.matrix(x[,-1])
> z
     V2 V3 V4
[1,] 11 12 13
[2,] 14 15 16
[3,] 17 18 19
> as.vector(z)
[1] 11 14 17 12 15 18 13 16 19

On Thu, Nov 4, 2010 at 6:05 PM, DomDom realown...@msn.com wrote:

okay sorry. I've got three ascii files with pixel values without any header information. So if the first lines of the three ascii files are:

ascii1: 11 12 13
ascii2: 14 15 16
ascii3: 17 18 19

I would like a new matrix with:

11,14,17;12,15,18;13,16,19;

thx

-- View this message in context: http://r.789695.n4.nabble.com/creating-vectors-with-three-variables-out-of-three-datasets-tp3027852p3027880.html

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
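as.vector() works here because R stores matrices column by column. If a matrix rather than a flat vector is wanted, with one row per pixel position and one column per file, binding the three vectors as columns gives it directly. The vectors a1, a2, a3 below are stand-ins for the first lines of the three files.

```r
a1 <- c(11, 12, 13)   # first line of ascii1
a2 <- c(14, 15, 16)   # first line of ascii2
a3 <- c(17, 18, 19)   # first line of ascii3

# One column per file, so each row holds the three values of one pixel
m <- cbind(a1, a2, a3)
m

# Reading the matrix row by row gives the order asked for in the question
pixel_order <- as.vector(t(m))
pixel_order
```

Each row of m is then one "vector with three variables" in the sense of the thread title.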
Re: [R] Loop
try this: The top half of the matrix is the counts and the bottom is the value:

> x <- apply(mat, 2, function(a) c(sort(table(a)), as.integer(names(sort(table(a))))))
> x
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 4856 4906 4857 4877 4788 4880 4861 4851 4878 4841
[2,] 4907 4917 4863 4879 4853 4882 4922 4890 4907 4927
[3,] 4942 4938 4930 4934 4951 4921 4935 4909 4912 4929
[4,] 4943 4963 4951 4951 4962 4930 4939 4947 4944 4931
[5,] 4952 4966 4956 4961 4965 4942 4948 4950 4972 4951
[6,] 4973 4969 4970 4971 4974 4960 4965 4965 4981 4955
[7,] 4977 4974 4979 4976 4983 4980 4965 4980 4985 4962
[8,] 4980 4978 4985 4978 4994 4980 4986 4981 5000 4981
[9,] 4986 4983 5001 4981 5004 4983 5003 4987 5003 4987
[10,] 4995 4989 5015 4996 5011 5005 5007 4988 5006 5001
[11,] 5000 4991 5016 5000 5015 5009 5017 4997 5015 5002
[12,] 5024 5010 5021 5012 5017 5012 5018 5015 5021 5021
[13,] 5030 5029 5022 5022 5017 5013 5022 5031 5023 5030
[14,] 5040 5031 5034 5027 5028 5015 5031 5033 5024 5031
[15,] 5046 5039 5041 5028 5028 5036 5045 5038 5033 5037
[16,] 5061 5040 5042 5028 5056 5039 5048 5045 5034 5038
[17,] 5066 5043 5042 5036 5068 5044 5051 5064 5035 5058
[18,] 5070 5043 5067 5054 5074 5074 5054 5085 5056 5058
[19,] 5074 5077 5080 5106 5090 5138 5057 5114 5073 5123
[20,] 5078 5114 5128 5183 5122 5157 5126 5130 5098 5137
[21,] 155 194 19 1591 1716
[22,]3 115 2028 2029 8
[23,]1 14 14 12 20 12 17 14514
[24,] 1647 17 1523 18 1810
[25,]9 17 15 15535 17 1215
[26,] 10 18 13 139 10 114 20 2
[27,]4 20 20 198 17 15 19 1311
[28,] 18 1347 17 19 13 11312
[29,] 11 12 1297 14 1477 1
[30,] 19 16933946 15 5
[31,] 1271 11 115 16 13813
[32,]23 1854 118 16418
[33,]7 19 172 124 103 19 3
[34,]81 10 16 10 1828 1017
[35,] 17 15 11 14 1411 20 1419
[36,] 2086 18 1366 101 9
[37,] 139 16 10 1877 126 4
[38,]6 10266 16 129220
[39,] 142811 20 195 16 7
[40,]5638 16 13 18 15 11 6

2010/11/4 Matevž Pavlič matevz.pav...@gi-zrmk.si:

Hi Jim,

Actually, this is better, but both values are what I am looking for: the count and the value of the count. Is there a way to just paste those two together?

Thanks,
m

-Original Message-
From: jim holtman [mailto:jholt...@gmail.com]
Sent: Thursday, November 04, 2010 9:59 PM
To: Matevž Pavlič
Cc: Petr PIKAL; r-help@r-project.org
Subject: Re: [R] Loop

Is this closer to what you want, assuming that it is the value of the most frequently occurring:

> apply(mat, 2, function(x) head(names(sort(table(x), decreasing=T)),5))
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    1   14    5    1    4   14    6   18   11    19
[2,]    3    3   13   12    3   11   14    9   18    12
[3,]    2   18   20    8   11   12   17   14   14     7
[4,]    5   11    8   19    5   18   18   15   16    10
[5,]   18   13   11   11   17    3    4   16    8    16

2010/11/4 Matevž Pavlič matevz.pav...@gi-zrmk.si:

Hi again,

I still don't quite get it... Here's what I did:

mat<-read.csv("litologija.csv", dec=".", sep=";")
apply(mat, 2, function(x) head(sort(table(x),decreasing=T),10))

With that I get a table (list/matrix...) which gives the highest counts of occurrences of each value in a table (at least I think so). But the problem is that it does not tell which value occurs the most (has the highest count).

If written like this:

apply(mat, 2, function(x) sort(table(x),decreasing=T))

I get decreasingly sorted counts of occurrences of a specific field and the value of that field for each column:

$W2
x
PEŠČEN GRADUIRAN IN PROD DO GLINAST PROD, PREPEREL MELJAST GRUŠČ GLINA Z MALO GRANULIRANA
1872 1542 552 519 458 214 175 174 132 114 62 53 47 45
ZELO PEŠČENA ZAGLINJEN KARBONATNI SKRILAVCA, S SKRILAVCA GRANULIRAN PEČŠEN VEZAN ZAOBLJEN GR. DROBEN SLABO
40 34 31 26 26 25 25 24 17 17 17 15 12 12
GRUŠČ, MELJASTO PEŠEEN DOBRO GRAN. PEŠČENJAKA HUDOURNIŠKI MELJNA PEŠČN GIRADUIRAN
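On the question of pasting the value and its count together: one possible sketch is to build the sorted table once per column and paste its names and counts. The random matrix below is invented test data standing in for the poster's file, and the "value: count" label format and the name top5 are arbitrary choices.

```r
set.seed(1)
# Invented stand-in data: 10 columns of values between 1 and 20
mat <- matrix(sample(1:20, 1000, replace = TRUE), ncol = 10)

top5 <- apply(mat, 2, function(x) {
  tab <- head(sort(table(x), decreasing = TRUE), 5)  # five most frequent values
  paste(names(tab), tab, sep = ": ")                 # e.g. "<value>: <count>"
})
top5
```

The result is a character matrix with one column per input column, each cell holding a value and how often it occurs.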
Re: [R] Memory Management under Linux
It would be very useful if you would post some information about what exactly you are doing. There is something about the size of the data object you are processing ('str' would help us understand it), and then a portion of the script (both before and after the error message) so we can understand the transformation that you are doing. It is very easy to generate a similar message:

    x <- matrix(0, 20000, 20000)
    Error: cannot allocate vector of size 3.0 Gb

but unless you know the context, it is almost impossible to give advice. It also depends on whether you are in some function calls where copies of objects may have been made, etc.

On Thu, Nov 4, 2010 at 7:52 PM, ricardo souza ricsouz...@yahoo.com.br wrote:

Dear all,

I am using Ubuntu Linux 32 with 4 Gb. I am running a very small script and I always get the same error message: CAN NOT ALLOCATE A VECTOR OF SIZE 231.8 Mb. I have read the instructions in ?Memory carefully. Using the function gc() I get very low memory numbers (please see below). I know that this has been posted several times on r-help (http://tolstoy.newcastle.edu.au/R/help/05/06/7565.html#7627qlink2). However, I have not yet found a solution to my memory issue in Linux. Could somebody please give some instructions on how to improve my memory under Linux?

    gc()
             used (Mb) gc trigger (Mb) max used (Mb)
    Ncells 170934  4.6         35  9.4       35  9.4
    Vcells 195920  1.5     786432  6.0   781384  6.0

INCREASING THE R MEMORY FOLLOWING THE INSTRUCTIONS IN ?Memory

I started R with:

    R --min-vsize=10M --max-vsize=4G --min-nsize=500k --max-nsize=900M

    gc()
             used (Mb) gc trigger (Mb) limit (Mb) max used (Mb)
    Ncells 130433  3.5         50 13.4      25200       50 13.4
    Vcells  81138  0.7    1310720 10.0         NA   499143  3.9

It increased, but not by much! Please, please let me know. I have read everything on r-help about this matter, but found no solution. Thanks for your attention!

Ricardo

--
Jim Holtman
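As a rough illustration of why the size of the object matters, the sketch below estimates the memory a numeric matrix needs (8 bytes per cell) and compares that with what object.size() reports for a small matrix. The helper name est_gb and the dimensions are assumptions chosen for illustration; they are not taken from the poster's script.

```r
# Hypothetical helper: approximate size in Gb of a numeric (double) matrix,
# assuming 8 bytes per cell and ignoring the small object header.
est_gb <- function(nrow, ncol, bytes_per_cell = 8) {
  nrow * ncol * bytes_per_cell / 2^30
}

# A 20000 x 20000 numeric matrix (a size picked to land near 3 Gb)
big_estimate <- est_gb(20000, 20000)
print(round(big_estimate, 2))

# For an object that does fit, object.size() reports the actual footprint
small <- matrix(0, 1000, 1000)
small_bytes <- as.numeric(object.size(small))
print(object.size(small), units = "Mb")
```

Running str() and object.size() on the objects in the failing script gives the kind of context the reply asks for.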
Re: [R] Detect the Warning Message
?options

and then you will find the following:

warn: sets the handling of warning messages. If warn is negative all warnings are ignored. If warn is zero (the default) warnings are stored until the top-level function returns. If fewer than 10 warnings were signalled they will be printed otherwise a message saying how many (max 50) were signalled. An object called last.warning is created and can be printed through the function warnings. If warn is one, warnings are printed as they occur. If warn is two or larger all warnings are turned into errors.

Setting

    options(warn = 2)

will cause the system to halt at that point. Also, setting

    options(error=utils::recover)

will drop you into the 'browser' (?browser) so you can see the values of objects when the error occurred. Google for 'debugging R' to get some more information.

On Fri, Nov 5, 2010 at 4:00 AM, Yen Lee b88207...@ntu.edu.tw wrote:

Dear all,

I've written a function and run it 5000 times in a loop with different values; the messages returned are the output I set and 15 warnings. I would like to trace the warnings by stopping the loop when a warning occurs. Does anyone know how to do this? Thanks a lot for your help.

Yen

--
Jim Holtman
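A minimal sketch of both ideas, using a made-up function f and made-up input values (neither is from the original post). The first part uses tryCatch(), which is a different technique from options(warn = 2) but also stops the loop at the first warning and records which iteration raised it; the second part shows options(warn = 2) turning the warning into an error.

```r
# Hypothetical function that warns on negative input
f <- function(v) {
  if (v < 0) warning("negative input: ", v)
  sqrt(abs(v))
}

values <- c(4, 9, -1, 16)

# Approach 1: catch the warning and stop the loop at that iteration
first_warning <- NULL
for (i in seq_along(values)) {
  res <- tryCatch(f(values[i]), warning = function(w) w)
  if (inherits(res, "warning")) {
    first_warning <- list(index = i, message = conditionMessage(res))
    break
  }
}
print(first_warning)

# Approach 2: options(warn = 2) converts warnings into errors
old <- options(warn = 2)
res2 <- try(f(-1), silent = TRUE)
options(old)                      # restore the previous setting
print(inherits(res2, "try-error"))
```

In an interactive session, combining options(warn = 2) with options(error = utils::recover) lets you inspect the objects at the point where the warning was raised.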
Re: [R] assignment operator saving factor level as number
Your example looks like you are assigning back to the first column of df2 (Num). Is this what you are really doing in your code? You need to follow the posting guide:

PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

On Fri, Nov 5, 2010 at 3:54 PM, Wade Wall wade.w...@gmail.com wrote:

Hi all,

I have a dataframe (df1) from which I am trying to copy selected values to a second dataframe (df2), which at the moment holds only the items selected from df1. The values that I am trying to save from df1 are factors with alphanumeric names. df1 looks like this:

    'data.frame': 3014 obs. of 13 variables:
     $ Num        : int 1 1 1 2 2 2 3 3 3 4 ...
     $ Tag_Num    : int 1195 1195 1195 1162 1162 1162 1106 1106 1106 1173 ...
     $ Site       : Factor w/ 25 levels PYBR002A,PYBR003B,..: 1 1 1 1 1 1 1 1 1 1 ...
     $ Site_IndNum: Factor w/ 1044 levels PYBR002A_001,..: 1 1 1 2 2 2 3 3 3 4 ...
     ...
     $ Area       : num 463.3 29.5 101.8 152.9 34.6 ...

However, whenever I try to assign values, like this

    df2[j,1] <- df2$Site[i]

the values are changed from alphanumeric (e.g. PYBR003A) to numerals (e.g. 1). Does anyone know why this is happening and how I can assign the actual values from df1 to df2?

Thanks in advance,
Wade

--
Jim Holtman
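One likely explanation, sketched below under assumptions: a factor stores integer codes plus a table of labels, so copying a factor element into a non-factor column can carry over the code rather than the label. The tiny data frames here are invented stand-ins for df1 and df2; converting with as.character() before assigning keeps the label.

```r
# Invented stand-in for df1: Site is a factor with two levels
df1 <- data.frame(Site = factor(c("PYBR002A", "PYBR003B", "PYBR003B")),
                  Area = c(463.3, 29.5, 101.8))

code  <- as.integer(df1$Site[2])    # the internal integer code of the level
label <- as.character(df1$Site[2])  # the label the user actually wants
print(code)
print(label)

# Invented stand-in for df2: a character column to receive the labels
df2 <- data.frame(Site = character(2), stringsAsFactors = FALSE)
df2[1, "Site"] <- as.character(df1$Site[2])
print(df2)
```

If the target column should itself be a factor, creating it with the same levels as the source column is another option.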
Re: [R] Memory Management under Linux
I would do some monitoring (debugging) of the script by placing some 'gc()' calls in the sequence of statements leading to the problem, to see what the memory usage is at that point. Take a close look at the sizes of your objects. If it is happening in some function you have called, you may have to take a look and understand whether multiple copies are being made. Most problems of this type require that you put hooks in your code (most of the stuff that I write has them so I can isolate performance problems) to gain an understanding of what is happening when.

To improve memory allocation, you first have to understand what is causing the problem, and not enough information has been provided for me to comment on it. There are lots of rules of thumb that can be used, but many depend on exactly what you are trying to do.

On Fri, Nov 5, 2010 at 2:59 PM, ricardo souza ricsouz...@yahoo.com.br wrote:

Dear Jim,

Thanks for your attention. I am running a geostatistical analysis with geoR that is computationally intensive. At the end of my analysis I call the functions krige.control and krige.conv. Do you have any idea how to improve memory allocation in Linux?

Thanks,
Ricardo

From: jim holtman jholt...@gmail.com
Subject: Re: [R] Memory Management under Linux
To: ricardo souza ricsouz...@yahoo.com.br
Cc: r-help@r-project.org
Date: Friday, November 5, 2010, 10:21

--
Jim Holtman
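The "hooks" idea can be sketched as a small checkpoint function that calls gc() and prints the memory in use. The function name mem_checkpoint and the allocation used to exercise it are assumptions for illustration, not part of the poster's geoR script.

```r
# Hypothetical checkpoint helper: report memory in use at a labelled point.
# gc() returns a matrix whose second column is the "(Mb)" currently used
# by Ncells and Vcells; their sum is a rough total.
mem_checkpoint <- function(label) {
  g <- gc()
  used_mb <- sum(g[, 2])
  cat(sprintf("%-20s %8.1f Mb in use\n", label, used_mb))
  invisible(used_mb)
}

before <- mem_checkpoint("start")
big <- numeric(5e6)               # about 38 Mb of doubles
after <- mem_checkpoint("after allocation")
```

Placing such calls before and after the expensive steps (for example, around the kriging calls) shows where the memory use jumps.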
Re: [R] slow dget
dput/dget were not intended to save/restore large objects. Understand what is happening in the use of dput/dget: dput creates a text file from which dget can reconstitute the object. dget has to read the file in and then parse it:

    dget
    function (file)
    eval(parse(file = file))
    <environment: namespace:base>

This can be a complex process if the object is large and complex. save/load basically take the binary object and save it with little additional processing, and the load is just as fast.

In general, most of the functions can be used both correctly and incorrectly. So should a warning for every potential condition/criterion be put in the help file? Probably not. It is hard to protect the user against him/herself. So what you are doing in seeing how long alternatives take is a good learning tool and will help you improve your use of the features.

On Fri, Nov 5, 2010 at 11:16 PM, Jack Tanner i...@hotmail.com wrote:

I have a data structure that is fast to dput(), but very slow to dget(). On disk, the file is about 35MB.

    system.time(dget("r.txt"))
       user  system elapsed
     142.93    1.27  192.84

The same data structure is fast to save() and fast to load(). The .RData file on disk is about 12MB.

    system.time(load("r.RData"))
       user  system elapsed
       4.89    0.08    7.82

I imagine that this is a known speed issue with dget, and that the recommended solution is to use load, which is fine with me. If so, perhaps a note to this effect could be added to the dget help page.

All timings above using R version 2.12.0 (2010-10-15), Platform: i386-pc-mingw32/i386 (32-bit)

--
Jim Holtman
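The two round trips can be compared on a small, invented object as in the sketch below. The object and the temporary file names are assumptions; the timings on such a small object will not reproduce the gap reported in the thread, only the mechanics.

```r
set.seed(1)
# Invented test object
obj <- list(id = 1:1000,
            value = rnorm(1000),
            label = sample(letters, 1000, replace = TRUE))

txt_file <- tempfile(fileext = ".txt")
bin_file <- tempfile(fileext = ".RData")

# Text round trip: dput writes R source, dget parses and evaluates it
dput(obj, file = txt_file)
t_text <- system.time(from_text <- dget(txt_file))

# Binary round trip: save writes a serialized image, load restores it
save(obj, file = bin_file)
restored <- new.env()
t_bin <- system.time(load(bin_file, envir = restored))

print(t_text)
print(t_bin)
```

Note that the text route reproduces doubles only to the printed precision, while the binary route restores the object exactly.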
Re: [R] using variable in rmysql query
?paste

    id <- dbGetQuery(con1, paste("SELECT id FROM tenants WHERE name LIKE '%", str, "%'", sep = ''))

On Sun, Nov 7, 2010 at 4:51 AM, Mohan L l.mohanphys...@gmail.com wrote:

Dear All,

I am using this query; it returns id:

    id <- dbGetQuery(con1, "SELECT id FROM tenants WHERE name LIKE '%consim%'")

But in my case the string consim is held in another variable (it comes from a configuration file):

    str <- "consim"

I am trying to substitute the string like this, but it is not working:

    id <- dbGetQuery(con1, "SELECT id FROM tenants WHERE name LIKE '%str%'")

I need help to substitute the value of str. Any help will be really appreciated. Thanks for your time.

Mohan L

--
Jim Holtman
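The string-building step can be checked on its own, without a database connection. In this sketch the helper name build_like_query is an assumption; the table and column names are the ones from the post. The result is what would be passed as the second argument of dbGetQuery().

```r
# Hypothetical helper: splice a search pattern into the LIKE clause
build_like_query <- function(pattern) {
  paste("SELECT id FROM tenants WHERE name LIKE '%", pattern, "%'", sep = "")
}

tenant <- "consim"                 # e.g. a value read from a configuration file
query <- build_like_query(tenant)
print(query)
```

If the pattern can come from untrusted input, pasting it straight into SQL is risky; escaping or validating it first would be a sensible precaution.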
Re: [R] remove data frame from list of data frames
Is this what you are asking? This accepts any dataframe that has at least one Acc == 1; changing 'any' to 'all' means all Acc == 1. Play around and get what you need:

    ls <- list(a,b)
    ls
    [[1]]
                x         y Acc
    1  0.26550866 0.2059746   1
    2  0.37212390 0.1765568   1
    3  0.57285336 0.6870228   1
    4  0.90820779 0.3841037   1
    5  0.20168193 0.7698414   1
    6  0.89838968 0.4976992   1
    7  0.94467527 0.7176185   1
    8  0.66079779 0.9919061   1
    9  0.62911404 0.3800352   1
    10 0.06178627 0.7774452   1

    [[2]]
                x         y Acc
    1  0.93470523 0.4820801   0
    2  0.21214252 0.5995658   0
    3  0.65167377 0.4935413   0
    4  0.1210     0.1862176   0
    5  0.26722067 0.8273733   0
    6  0.38611409 0.6684667   0
    7  0.01339033 0.7942399   0
    8  0.38238796 0.1079436   0
    9  0.86969085 0.7237109   0
    10 0.34034900 0.4112744   0

    sapply(ls, function(x) any(x$Acc == 1))
    [1]  TRUE FALSE

    ls[sapply(ls, function(x) any(x$Acc == 1))]
    [[1]]
                x         y Acc
    1  0.26550866 0.2059746   1
    2  0.37212390 0.1765568   1
    3  0.57285336 0.6870228   1
    4  0.90820779 0.3841037   1
    5  0.20168193 0.7698414   1
    6  0.89838968 0.4976992   1
    7  0.94467527 0.7176185   1
    8  0.66079779 0.9919061   1
    9  0.62911404 0.3800352   1
    10 0.06178627 0.7774452   1

On Sun, Nov 7, 2010 at 6:07 AM, Matthew Finkbeiner matthew.finkbei...@mq.edu.au wrote:

I have a list of data frames like this:

    a <- data.frame(x=runif(10), y = runif(10), Acc = 1)
    b <- data.frame(x=runif(10), y = runif(10), Acc = 0)
    ls <- list(a,b)

and I want to remove the data frames from ls that have Acc values other than 1. How do I do that?

Thanks for any help!
Matthew

--
Jim Holtman
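The difference between 'any' and 'all' shows up once a data frame contains a mix of Acc values. The sketch below adds such a frame (named mixed here, an invented addition) and uses Filter(), which is an alternative to indexing with the sapply() result.

```r
set.seed(1)
a     <- data.frame(x = runif(10), y = runif(10), Acc = 1)
b     <- data.frame(x = runif(10), y = runif(10), Acc = 0)
mixed <- data.frame(x = runif(10), y = runif(10), Acc = rep(c(1, 0), 5))  # invented case

frames <- list(a = a, b = b, mixed = mixed)

# Keep frames where every Acc is 1 (drops anything with another value)
keep_all <- Filter(function(d) all(d$Acc == 1), frames)

# Keep frames where at least one Acc is 1
keep_any <- Filter(function(d) any(d$Acc == 1), frames)

print(names(keep_all))
print(names(keep_any))
```

For the question as asked (remove frames that have Acc values other than 1), the all() version matches the stated requirement.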
Re: [R] help to sum up data frame
Nice thing about R is there is more than one way of doing something:

    x
       name ip Bsent Breceived
    1     a  1  0.00      0.00
    2     a  2  1.43     19.83
    3     a  1  0.00      0.00
    4     a  2  1.00      1.00
    5     b  1  0.00      2.00
    6     b  3  0.00      2.00
    7     b  2  2.00      0.00
    8     b  2  2.00      0.00
    9     b  1 24.40     22.72
    10    c  1  1.00      1.00
    11    c  1  2.00      1.00
    12    c  1  2.00      1.00
    13    c  1 90.97     15.70
    14    d  0  0.00      0.00
    15    d  1 30.00     17.14

    require(sqldf)
    sqldf('select name, sum(ip) as ip, sum(Bsent) as Bsent,
           sum(Breceived) as Breceived
           from x
           group by name')
      name ip Bsent Breceived
    1    a  6  2.43     20.83
    2    b  9 28.40     26.72
    3    c  4 95.97     18.70
    4    d  1 30.00     17.14

On Sun, Nov 7, 2010 at 8:59 AM, Mohan L l.mohanphys...@gmail.com wrote:

Dear All,

I have a data frame like this:

    name ip Bsent Breceived
    a     1  0.00      0.00
    a     2  1.43     19.83
    a     1  0.00      0.00
    a     2  1.00      1.00
    b     1  0.00      2.00
    b     3  0.00      2.00
    b     2  2.00      0.00
    b     2  2.00      0.00
    b     1 24.40     22.72
    c     1  1.00      1.00
    c     1  2.00      1.00
    c     1  2.00      1.00
    c     1 90.97     15.70
    d     0  0.00      0.00
    d     1 30.00     17.14

I want to sum up the rows with the same name into one row, like:

    name ip Bsent Breceived
    a     6  2.43     20.83
    b     9 28.40     26.72
    c
    d

I need help to sum up. Thanks for your time.

Thanks & Rg
Mohan L

--
Jim Holtman
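In the same spirit of "more than one way", here is a base-R sketch using aggregate() with a formula, which is a different technique from the sqldf call but needs no extra package. The data frame is retyped from the table in the post.

```r
x <- data.frame(
  name      = rep(c("a", "b", "c", "d"), times = c(4, 5, 4, 2)),
  ip        = c(1, 2, 1, 2,  1, 3, 2, 2, 1,  1, 1, 1, 1,  0, 1),
  Bsent     = c(0, 1.43, 0, 1,  0, 0, 2, 2, 24.40,  1, 2, 2, 90.97,  0, 30),
  Breceived = c(0, 19.83, 0, 1,  2, 2, 0, 0, 22.72,  1, 1, 1, 15.70,  0, 17.14)
)

# Sum every numeric column within each name
totals <- aggregate(cbind(ip, Bsent, Breceived) ~ name, data = x, FUN = sum)
print(totals)
```

rowsum(x[, -1], x$name) is another base-R route to the same totals.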
Re: [R] Rserve causes Perl error
Could be both. Do you have Perl installed, and is it on a path that R can find?

On Mon, Nov 8, 2010 at 1:32 AM, Ralf B ralf.bie...@gmail.com wrote:

Hi all,

I tried to run Rserve: I installed it from CRAN using

    install.packages("Rserve")

and tried to run it from the command line using:

    R CMD Rserve

I am getting an error telling me that the command perl cannot be found. What is wrong and what can I do to fix this? Do I need to install any other packages or is it just a path problem?

Ralf

--
Jim Holtman
Re: [R] Help with getting ?match to not sort
Missing: a closer reading of the help page --

Value

A data frame. The rows are by default lexicographically sorted on the common columns, but for sort = FALSE are in an unspecified order.

So sort = FALSE says unspecified. If you want the original order, then add a column to the dataframe with the order and then sort the result.

On Mon, Nov 8, 2010 at 4:09 PM, Tal Galili tal.gal...@gmail.com wrote:

Hello all,

I think I am missing something about the sorting parameter in the match command. Here is an example:

    a1 <- data.frame(name = c("D", "B", "C", "A", "A", "C"))
    a2 <- data.frame(name = c("A", "B", "C", "D"), num = 1:4)
    a1
    a2
    merge(a1, a2, sort = F, by.x = T)

The result is:

      name num
    1    D   4
    2    B   2
    3    C   3
    4    C   3
    5    A   1
    6    A   1

While I want my rows to be in the same order as in a1, they come out in some other order. What am I missing here?

Thanks.

Contact me: tal.gal...@gmail.com | 972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English)

--
Jim Holtman
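The suggestion to add an order column and sort afterwards can be sketched as follows, using the a1 and a2 from the post. The column name row_order is an assumption chosen for illustration.

```r
a1 <- data.frame(name = c("D", "B", "C", "A", "A", "C"), stringsAsFactors = FALSE)
a2 <- data.frame(name = c("A", "B", "C", "D"), num = 1:4, stringsAsFactors = FALSE)

# Remember the original row positions of a1 (hypothetical helper column)
a1$row_order <- seq_len(nrow(a1))

merged <- merge(a1, a2, by = "name")

# Restore the original order of a1, then drop the helper column
merged <- merged[order(merged$row_order), ]
merged$row_order <- NULL
rownames(merged) <- NULL
print(merged)
```

When only a lookup of one column is needed, a2$num[match(a1$name, a2$name)] keeps the order of a1 without a merge.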
Re: [R] how do i plot this hist?
try this:

    x <- read.table('clipboard')
    x
        V1  V2   V3  V4  V5
    1   50   0    1   0   0
    2   55   1   14   0   1
    3   60   7   86   0   3
    4   65  22  324   2   3
    5   70  58 1035   1   7
    6   75  30 2568   0  34
    7   80   9 2936  15 162
    8   85  27 2169  46 365
    9   90  80 1439 212 432
    10  95 236 1670 521 281
    11 100 332  827 709 172
    12 105 156  311 556 103
    13 110  69   49 144  44
    14 115  26   10  36  17
    15 120   2    9   3   3
    16 125   1    6   1   1
    17 130   0   14   0   0
    18 135   0    5   0   0
    19 140   0    0   0   0
    20 145   0    0   0   0
    21 150   0    0   0   0
    22 155   0    0   0   0
    23 160   0    0   0   0
    24 165   0    0   0   0
    25 170   0    0   0   0
    26 175   0    0   0   0
    27 180   0    0   0   0
    28 185   0    0   0   0
    29 190   0    0   0   1
    30 195   0    0   0   0
    31 200   0    0   0   0
    32 205   0    0   0   0
    33 210   0    0   0   0

    x.m <- as.matrix(x)  # dataframe -> matrix
    barplot(t(x.m), names.arg = x.m[,1], las=2)

On Mon, Nov 8, 2010 at 5:42 PM, casperyc caspe...@hotmail.co.uk wrote:

Hi all,

I have the following data in abc.dat

    ===
    50 0 1 0 0
    55 1 14 0 1
    60 7 86 0 3
    65 22 324 2 3
    70 58 1035 1 7
    75 30 2568 0 34
    80 9 2936 15 162
    85 27 2169 46 365
    90 80 1439 212 432
    95 236 1670 521 281
    100 332 827 709 172
    105 156 311 556 103
    110 69 49 144 44
    115 26 10 36 17
    120 2 9 3 3
    125 1 6 1 1
    130 0 14 0 0
    135 0 5 0 0
    140 0 0 0 0
    145 0 0 0 0
    150 0 0 0 0
    155 0 0 0 0
    160 0 0 0 0
    165 0 0 0 0
    170 0 0 0 0
    175 0 0 0 0
    180 0 0 0 0
    185 0 0 0 0
    190 0 0 0 1
    195 0 0 0 0
    200 0 0 0 0
    205 0 0 0 0
    210 0 0 0 0
    ===

which I have read into R using

    abc = read.table("abc.dat")

There are two problems:

1- I want the first column of the data to be the 'column names'; how should I read the data?

2- I want to plot the histogram, using the first column as 'x' values, and the 2nd, 3rd, 4th and 5th columns as the frequencies. How do I plot it?

I have tried adding a 'row' of variable names and then reading with 'header=T'; the first column then became 'col.names', as I was expecting. However, when I plot it using 'hist', R uses the 2nd column as the 'x value', where it should be used as 'frequency' (the 50,55,60,65,70... should be on the x-axis).

Thanks!
Casper

--
View this message in context: http://r.789695.n4.nabble.com/how-do-i-plot-this-hist-tp3032796p3032796.html
Sent from the R help mailing list archive at Nabble.com.

--
Jim Holtman
Re: [R] how do i plot this hist?
Typing too fast; last line should be:

    barplot(t(x.m[, 2:5]), names.arg = x.m[,1], las=2)

--
Jim Holtman
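A self-contained sketch of the corrected call, using five rows copied from the posted abc.dat so it runs without the file or the clipboard. read.table(text = ...) and the legend labels are conveniences assumed here, not part of the original reply.

```r
# Five rows from the posted data: first column = bin label, columns 2-5 = frequencies
txt <- "70 58 1035 1 7
75 30 2568 0 34
80 9 2936 15 162
85 27 2169 46 365
90 80 1439 212 432"

x <- read.table(text = txt)
x.m <- as.matrix(x)

# One row per frequency series, one column per bin; drop the label column
counts <- t(x.m[, 2:5])
colnames(counts) <- x.m[, 1]

# Stacked bars, one bar per bin, with the bin labels on the x-axis
bar_mid <- barplot(counts, las = 2, legend.text = rownames(counts))
```

Adding beside = TRUE draws the four series side by side instead of stacked.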
Re: [R] new column from column in another df
?merge

    df1
                     V1 V2
    1   Populus tremula  1
    2   Populus tremula  2
    3   Populus tremula  3
    4  Calluna vulgaris  1
    5  Calluna vulgaris  2
    6       Betula alba  1
    7       Betula alba  2
    8       Betula alba  3
    9     Primula veris  1
    10    Primula veris  2

    df2 <- read.table('clipboard', sep=',')
    df2
                    V1          V2
    1  Populus tremula        tree
    2 Acer platanoides        tree
    3     Ribes rubrum       shrub
    4 Calluna vulgaris dwarf_shrub
    5      Betula alba        tree
    6    Primula veris        herb

    merge(df1, df2, by = "V1")
                     V1 V2.x        V2.y
    1       Betula alba    1        tree
    2       Betula alba    2        tree
    3       Betula alba    3        tree
    4  Calluna vulgaris    1 dwarf_shrub
    5  Calluna vulgaris    2 dwarf_shrub
    6   Populus tremula    1        tree
    7   Populus tremula    2        tree
    8   Populus tremula    3        tree
    9     Primula veris    1        herb
    10    Primula veris    2        herb

On Tue, Nov 9, 2010 at 7:28 AM, fugelpitch jo...@runtimerecords.net wrote:

If I have a data frame where a species occupies several rows with different phases, such as (both columns are factors):

    species,phase
    Populus tremula,1
    Populus tremula,2
    Populus tremula,3
    Calluna vulgaris,1
    Calluna vulgaris,2
    Betula alba,1
    Betula alba,2
    Betula alba,3
    Primula veris,1
    Primula veris,2

and another df where each species has only one row:

    species,growth_form
    Populus tremula,tree
    Acer platanoides,tree
    Ribes rubrum,shrub
    Calluna vulgaris,dwarf_shrub
    Betula alba,tree
    Primula veris,herb

...how can I create a new column in the first data frame where growth form is picked up from the second data frame (also factors) and entered into all rows for a species, as follows:

    species,phase,growth_form
    Populus tremula,1,tree
    Populus tremula,2,tree
    Populus tremula,3,tree
    Calluna vulgaris,1,dwarf_shrub
    Calluna vulgaris,2,dwarf_shrub
    Betula alba,1,tree
    Betula alba,2,tree
    Betula alba,3,tree
    Primula veris,1,herb
    Primula veris,2,herb

This will be done for data frames a lot larger than this one, so it needs to be automated in some way. Also, as you can see, the second data frame contains more species than the first one, so I need to pick them out by name, not only by row number...

(I tried something like:

    subset(dataframe2.df, dataframe2.df$species==as.character(unique(dataframe1.df$species)))

in a for loop but I got an error about different factor levels, which is true.)

Any help is much appreciated!
Jonas

--
View this message in context: http://r.789695.n4.nabble.com/new-column-from-column-in-another-df-tp3033619p3033619.html
Sent from the R help mailing list archive at Nabble.com.

--
Jim Holtman
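With named columns the merge reads a little more clearly. In this sketch the data frame names phases and forms are assumptions, and all.x = TRUE is an added choice (not in the reply) that keeps every row of the first data frame even if a species has no entry in the second.

```r
phases <- data.frame(
  species = rep(c("Populus tremula", "Calluna vulgaris", "Betula alba", "Primula veris"),
                times = c(3, 2, 3, 2)),
  phase   = c(1, 2, 3, 1, 2, 1, 2, 3, 1, 2),
  stringsAsFactors = FALSE
)

forms <- data.frame(
  species     = c("Populus tremula", "Acer platanoides", "Ribes rubrum",
                  "Calluna vulgaris", "Betula alba", "Primula veris"),
  growth_form = c("tree", "tree", "shrub", "dwarf_shrub", "tree", "herb"),
  stringsAsFactors = FALSE
)

# Look up growth_form by species name; extra species in 'forms' are ignored
combined <- merge(phases, forms, by = "species", all.x = TRUE)
print(combined)
```

Because the match is by name, the species that appear only in the second data frame (Acer platanoides, Ribes rubrum) do not show up in the result.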
Re: [R] Which - value not present
length(result) == 0

    result <- integer(0)
    result
    integer(0)
    length(result) == 0
    [1] TRUE

On Tue, Nov 9, 2010 at 9:55 PM, vioravis viora...@gmail.com wrote:

I am trying to use the which function to obtain the index of a value in a dataframe. Depending on whether the value is present in the dataframe or not, I perform further operations on the dataframe. However, if the value is not present in the dataframe, I get an integer(0). How do I check for integer(0)? Something like is.na???

Thank you.
Ravishankar

--
View this message in context: http://r.789695.n4.nabble.com/Which-value-not-present-tp3035455p3035455.html
Sent from the R help mailing list archive at Nabble.com.

--
Jim Holtman
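A short sketch of the check in context, with an invented data frame and an invented helper describe_hit: which() returns integer(0) when nothing matches, and length() is the test that distinguishes the two cases.

```r
df <- data.frame(id = c(10, 20, 30))   # invented example data

hit  <- which(df$id == 20)   # position of a value that is present
miss <- which(df$id == 99)   # no match: integer(0)

# Hypothetical helper that branches on whether anything was found
describe_hit <- function(idx) {
  if (length(idx) == 0) "not found" else paste("found at row", idx[1])
}

print(describe_hit(hit))
print(describe_hit(miss))
```

If only presence matters, any(df$id == 99) or 99 %in% df$id avoids the index altogether.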
Re: [R] Parsing txt file
Here is a start:

    # read the input file
    input <- readLines('/tempxx.txt')
    # process the file starting at each Book
    result <- lapply(which(grepl("^Book", input)), function(.line){
        contents <- NULL  # initialize
        name <- strsplit(input[.line], '\t')[[1]][2]  # book name
        # process succeeding lines as long as they are CD
        while (grepl("^CD", input[.line + 1L])){
            contents <- c(contents, strsplit(input[.line + 1L], '\t')[[1]][3])
            .line <- .line + 1L
        }
        c(bookname = name, contents = paste(contents, collapse = ','))
    })
    do.call(rbind, result)
         bookname         contents
    [1,] bioR             chapter5
    [2,] bioc++           workexamples, experiments
    [3,] management tools

On Wed, Nov 10, 2010 at 5:30 AM, Santosh Srinivas santosh.srini...@gmail.com wrote:

You could use the following to achieve your objective. To start with:

    ?readLines
    ?strsplit
    ?for
    ?ifelse

As you try, you may receive more specific answers for the issues you come up with.

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of karthicklakshman
Sent: 10 November 2010 15:06
To: r-help@r-project.org
Subject: [R] Parsing txt file

Hello,

I have a tab-delimited text document with multiple lines, as shown below:

    #FILE FORMAT
    #Book bookname author publisher pages
    #CD name content
    --
    Book bioR xxx abc publishers 230
    CD biorexamples chapter5
    --
    Book bioc++ mmm tata publishers 400
    CD samples workexamples
    CD data experiments
    --
    Book management tools aaa some publishers 200
    --

Here the texts Book and CD are present in each block. I am interested in creating a data frame with two columns, with column names bookname and content. Using grep it is possible to pick specific rows (grep("^book", filename)), but my programming expertise is too limited to create the data frame described.

Note: the row name Book is present in all blocks but CD is variable (i.e., some blocks have two CD rows and some have none, as shown above).

Please help me create something like this:

        bookname         content
    [1] bioR             chapter5
    [2] bioc++           workexamples, experiments
    [3] management tools NA

Thanks in advance,
karthick

--
View this message in context: http://r.789695.n4.nabble.com/Parsing-txt-file-tp3035749p3035749.html
Sent from the R help mailing list archive at Nabble.com.

--
Jim Holtman
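A vectorised variant of the same idea, sketched on an in-memory copy of the sample file so it runs without '/tempxx.txt'. It differs from the reply's loop in two assumed choices: blocks are identified with cumsum() over the Book lines, and a book without any CD row gets NA, as the poster requested.

```r
# In-memory copy of the sample file (tab-separated fields)
raw <- c(
  "#FILE FORMAT",
  "--",
  "Book\tbioR\txxx\tabc publishers\t230",
  "CD\tbiorexamples\tchapter5",
  "--",
  "Book\tbioc++\tmmm\ttata publishers\t400",
  "CD\tsamples\tworkexamples",
  "CD\tdata\texperiments",
  "--",
  "Book\tmanagement tools\taaa\tsome publishers\t200",
  "--"
)

# Keep only Book and CD records, dropping comments and separators
input  <- raw[grepl("^(Book|CD)\t", raw)]
fields <- strsplit(input, "\t", fixed = TRUE)

is_book <- grepl("^Book", input)
block   <- cumsum(is_book)                 # which book each line belongs to

books      <- sapply(fields[is_book], `[`, 2)    # second field of Book lines
cd_content <- sapply(fields[!is_book], `[`, 3)   # third field of CD lines
cd_block   <- block[!is_book]

# Collapse the CD contents per book; NA when a book has no CD row
content <- sapply(seq_along(books), function(b) {
  items <- cd_content[cd_block == b]
  if (length(items) == 0) NA_character_ else paste(items, collapse = ", ")
})

result <- data.frame(bookname = books, content = content, stringsAsFactors = FALSE)
print(result)
```

For a real file, raw would come from readLines() on the file path instead of the inline vector.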
Re: [R] Parallel code runs slower!
Can you provide a little more information? What operating system are you using? Have you monitored the CPU and memory utilization of the processes? Do you have enough physical memory; e.g., are you paging? How big are the matrices that you are processing; e.g., str(tTA) and object.size(tTA)? This is the type of information that would be required to make an informed guess at what is happening.

On Wed, Nov 10, 2010 at 9:07 AM, Santosh Srinivas santosh.srini...@gmail.com wrote:

My parallel code is running slower than my non-parallel code! Can someone please advise what I am doing wrong here? t and tTA are simple matrices of equal dimensions.

    # NON PARALLEL CODE
    nCols = ncol(t)
    nRows = nrow(t)
    tTA = matrix(nrow=nRows, ncol=nCols)
    require(TTR)
    system.time(
      for (i in 1:nCols) {
        x = t[,i]
        xROC = ROC(x)
        tTA[,i] = xROC
      }
    )
       user  system elapsed
     123.24    0.07  123.47

    # PARALLEL CODE
    nCols = ncol(t)
    nRows = nrow(t)
    tTA = matrix(nrow=nRows, ncol=nCols)
    require(doSMP)
    workers <- startWorkers(4)  # My computer has 4 cores
    registerDoSMP(workers)
    system.time(
      foreach (i=1:nCols) %dopar% {
        x = t[,i]
        xROC = ROC(x)
        tTA[,i] = xROC
      }
    )
    # stop workers
    stopWorkers(workers)

It is taking ages!

Thanks,
S

--
Jim Holtman
Re: [R] Plot Axes
Apply the xlim/ylim in the initial plot:

    plot(..., xlim=range(H2, H.10, H.20, H.50, H.100), ylim=range(D2, D.10, D.20, D.50, D.100))

On Wed, Nov 10, 2010 at 12:50 PM, dpender d.pen...@civil.gla.ac.uk wrote:

R community,

I am creating a bivariate return level plot by adding calculated return period values as lines onto an existing plot, using the following code, with the points representing the return periods.

    plot(H2, D2, pch="+", axes=TRUE)
    points(H.10, D.10, type="l", col="blue")
    points(H.20, D.20, type="l", col="green")
    points(H.50, D.50, type="l", col="red")
    points(H.100, D.100, type="l", col="orange")

The problem is that my return period values are greater than the data values and are therefore partially cut out of the plot. How can I increase the axis limits in order to include all of the return period lines? I've tried

    axis(2, at=seq(35, max(D.100), by=20))

but it doesn't work.

Thanks,
Doug

--
View this message in context: http://r.789695.n4.nabble.com/Plot-Axes-tp3036571p3036571.html
Sent from the R help mailing list archive at Nabble.com.

--
Jim Holtman
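A runnable sketch of the range() approach with simulated values standing in for the poster's H2, D2, H.100 and D.100 (the numbers are invented; only the variable names come from the post). The limits are computed over everything that will be drawn before the first plot() call.

```r
set.seed(1)
# Invented stand-ins for the observed data and one return-level curve
H2    <- runif(50, 1, 5)
D2    <- runif(50, 35, 80)
H.100 <- seq(4, 8, length.out = 20)
D.100 <- seq(120, 60, length.out = 20)

# Axis limits that cover both the data and the curve
xlim <- range(H2, H.100)
ylim <- range(D2, D.100)

plot(H2, D2, pch = "+", xlim = xlim, ylim = ylim)
lines(H.100, D.100, col = "orange")
```

Additional curves (H.10, H.20, H.50 and their D counterparts) would simply be added as further arguments to the two range() calls.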
Re: [R] Extending a plot in a loop
If you want to read in all the files and then set the range so you can plot a parameter from each one on a single chart, there is some information in the archives about how to do this. A brief outline is below (definitely untested):

    allFiles <- lapply(fileList, read.table, header=TRUE, ... other parameters)
    xRange <- range(sapply(allFiles, '[[', 'xParm'))  # column with data of interest
    yRange <- range(sapply(allFiles, '[[', 'yParm'))
    plot(0, xlim=xRange, ylim=yRange, type='n')  # set up the plot
    lapply(allFiles, function(.file) lines(.file$xParm, .file$yParm))  # plot all the lines

On Wed, Nov 10, 2010 at 1:38 PM, Sebastian Gibb li...@sebastiangibb.de wrote:

On Wednesday, 10 November 2010, 19:22:38, Nasrin Pak wrote:

My problem is that I have a data set for every day of measurement in a separate file and I want to plot one parameter of the data for all the days in one graph. I tried to use a for loop, but only the last data set remains in memory. I don't know how to plot each day's data continuously after the others (or how to extend the x axis). Would you please help me with it?

This is a plot for one day:

    radiation.data <- read.table("C:/updated_CFL_Rad_files/2008/RAD_2008_JD101_0410.dat", header = TRUE, sep = ",", quote = "", dec = ".")
    attach(radiation.data)

    The following object(s) are masked from 'radiation.data (position 3)':
    Batt_avg, Batt_st, Day, Hour, Kdown_avg, Kdown_st, LW_in, LW_in_st, Minute, Month, PanelT_avg, PanelT_st, PAR_avg, PAR_st, Sec, Tcase_avg, Tcase_st, Tdome_avg, Tdome_st, Thermopile_avg, Thermopile_st, Tuv_avg, Tuv_st, Uva_avg, Uva_st, Uvb_avg, Uvb_st, Year

    names(radiation.data)
     [1] "Year"           "Month"          "Day"            "Hour"
     [5] "Minute"         "Sec"            "Batt_avg"       "PanelT_avg"
     [9] "Batt_st"        "PanelT_st"      "Kdown_avg"      "Thermopile_avg"
    [13] "Tcase_avg"      "Tdome_avg"      "LW_in"          "PAR_avg"
    [17] "Tuv_avg"        "Uvb_avg"        "Uva_avg"        "Kdown_st"
    [21] "Thermopile_st"  "Tcase_st"       "Tdome_st"       "LW_in_st"
    [25] "PAR_st"         "Tuv_st"         "Uvb_st"         "Uva_st"

    plot(((PAR_avg*0.216)/Uvb_avg), main="Par/UVB", xlab="minutes", ylab="Par/UVB")

and this is the algorithm I tried for plotting all the data in one plot:

    x <- matrix(list.files("C:/updated_CFL_Rad_files", full=TRUE))  # putting all data sets in a matrix
    for (i in 1:100) {
      if (i > 101) next
      radiation.data <- read.table(x[i], header = TRUE, sep = ",", quote = "", dec = ".")
      attach(radiation.data)
      plot(i*Hour*60+Minute, PAR_avg, main="PAR", xlab="Hour", ylab="Par")
      dev.print(device=postscript, "C:/graph5.eps", onefile=FALSE, horizontal=FALSE)
    }

The plot I see is the last file's plot. I don't know how to keep the previous data and continue within the same plot.

Hello,

use something like this:

    plot(0, 0, type="n", xlim=c(0, maxTime), ylim=c(minY, maxY))
    for (i in 1:100) {
      lines(x[i], y[i]);
    }

    ?plot
    ?lines
    ?points

Bye, Sebastian

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
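The outline in this thread (read everything, compute common ranges, open an empty plot, add one line per file) might look like the sketch below. The data frames are generated in place of the poster's files, and the column name `minute` is hypothetical; in practice the list would come from something like lapply(list.files(...), read.table, header = TRUE, ...).

```r
# Hypothetical per-day data frames standing in for the files on disk.
set.seed(1)
days <- lapply(1:3, function(d) {
  data.frame(minute  = (d - 1) * 1440 + seq(0, 1380, by = 60),  # minutes since first day
             PAR_avg = runif(24, 0, 100))
})

# Common axis ranges across all days.
xRange <- range(unlist(lapply(days, `[[`, "minute")))
yRange <- range(unlist(lapply(days, `[[`, "PAR_avg")))

pdf(NULL)  # null device so the sketch also runs non-interactively
plot(0, 0, type = "n", xlim = xRange, ylim = yRange,
     xlab = "Minute", ylab = "PAR")          # empty frame first
for (d in days) lines(d$minute, d$PAR_avg)   # then one line per day
invisible(dev.off())
```

Calling plot() once and lines() inside the loop is what keeps earlier days on the chart; calling plot() inside the loop starts a fresh chart each time.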
Re: [R] comma separated format
Is this what you want:

    paste("$", format(1e6, big.mark = ',', format = 'f'), sep='')
    [1] "$1,000,000"

On Thu, Nov 11, 2010 at 5:51 PM, sachinthaka.abeyward...@allianz.com.au wrote:

Hi All,

I'm trying to create labels for a plot such that they don't show up in scientific notation. So, for example, how do I get R to show 1e06 as $1,000,000? I was wondering if there is a single function which allows you to do that, the same way that as.Date() allows you to show dates on a plot.

Thanks, Sachin

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
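A small sketch of a label helper along the same lines, using formatC() with fixed notation so the result does not depend on the scientific-notation settings of the session. The function name `dollars` is made up for illustration.

```r
# Format numbers as whole-dollar labels with thousands separators.
dollars <- function(x) {
  paste0("$", formatC(x, format = "f", digits = 0, big.mark = ","))
}

dollars(1e6)
dollars(c(1500, 2.5e9))
```

For a plot, the helper could be applied to the tick positions, for example axis(2, at = at, labels = dollars(at)), where `at` is whatever vector of tick positions is in use.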
Re: [R] can not produce graph using splom
Did you have 'dev.off()' at the end?

On Thu, Nov 11, 2010 at 7:27 PM, Xiaoqi Cui x...@mtu.edu wrote:

Hi,

I wrote a function that first reads an input data file, then opens a PDF file and draws a graph using splom. When testing, I ran the function line by line; it produced a nice plot, but with about 50 warnings. However, whenever I run the function as a whole, it does not produce any plot; the PDF file has nothing in it. It seems the splom function hasn't even been run. Has anybody encountered similar problems, and would you kindly share any possible reasons?

Thanks, Xiaoqi

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
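A minimal sketch of a function that writes a plot to a PDF and always closes the device. It uses base graphics so that it runs without extra packages; the function name is hypothetical. One further point that often applies to this symptom: lattice functions such as splom() return an object that is only drawn when it is printed, which happens automatically at the top level but not inside a function, so there the call would need to be wrapped in print().

```r
make_pdf <- function(path) {
  pdf(path)
  on.exit(dev.off())  # close the device even if plotting fails
  plot(1:10)
  # With lattice, an explicit print() would be needed here,
  # e.g. print(splom(...)), otherwise the page stays empty.
  invisible(path)
}

out <- make_pdf(tempfile(fileext = ".pdf"))
file.info(out)$size  # non-zero once the device has been closed
```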
Re: [R] writing to a file
The HELP page for 'sink' is pretty clear about this:

    sink() or sink(file=NULL) ends the last diversion (of the specified type). There is a stack of diversions for normal output, so output reverts to the previous diversion (if there was one). The stack is of up to 21 connections (20 diversions).

On Sat, Nov 13, 2010 at 11:12 PM, Gregory Ryslik rsa...@comcast.net wrote:

Hi,

I have a fairly complex object that I have written a print function for. Thus when I do print(results), the R console shows me a whole bunch of stuff already formatted. What I want to do is to take whatever print(results) shows on the console and put that in a file. I am doing this using the sink command. However, I am unsure how to unsink, i.e., how do I restore output to the normal console?

Thanks, Greg

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
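A short sketch of the sink()/unsink round trip described on that help page. The object being printed and the temporary file are placeholders.

```r
out_file <- tempfile(fileext = ".txt")
results  <- summary(c(1, 2, 3, 4))  # placeholder for the object to print

sink(out_file)   # start diverting console output to the file
print(results)
sink()           # end the diversion; output goes back to the console

sink.number()    # 0 when no diversions remain
readLines(out_file)
```

If only the output of one expression is needed, capture.output(print(results), file = out_file) does the same without leaving a diversion open.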
Re: [R] cannot see the y-labels (getting cut-off)
Increase the margins on the plot:

    par(mar=c(4,7,2,1))
    plot(1:5, y, ylab='', yaxt='n');
    axis(2, at=y, labels=formatC(y, big.mark=",", format="fg"), las=2, cex=0.1);

On Sun, Nov 14, 2010 at 6:03 PM, sachinthaka.abeyward...@allianz.com.au wrote:

Hi All,

When I run the following code, I cannot see the entire number. Instead of 1,000,000,000 I only see 000,000 because the rest is cut off. The cex option doesn't seem to be doing anything at all.

    y <- seq(1e09, 5e09, 1e09);
    plot(1:5, y, ylab='', yaxt='n');
    axis(2, at=y, labels=formatC(y, big.mark=",", format="fg"), las=2, cex=0.1);

Any thoughts?

Thanks, Sachin

p.s. sorry about corporate notice.

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
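A sketch of the margin fix, run against a null device so it works non-interactively. It also uses cex.axis, which is the graphical parameter that controls the size of axis annotation; the value 0.8 is an arbitrary choice for illustration.

```r
y    <- seq(1e9, 5e9, by = 1e9)
labs <- formatC(y, big.mark = ",", format = "fg")

pdf(NULL)                    # null device
par(mar = c(4, 7, 2, 1))     # bottom, left, top, right margins, in lines of text
plot(1:5, y, ylab = "", yaxt = "n")
axis(2, at = y, labels = labs, las = 2, cex.axis = 0.8)
left_margin <- par("mar")[2]
invisible(dev.off())
```

The left margin is the second element of mar, so widening it is what makes room for long horizontal labels.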
Re: [R] Problems loading xlsx
If you are running on Windows, I would suggest that you use the RODBC package and the odbcConnectExcel2007 function. I have had reasonable success with this.

On Sun, Nov 14, 2010 at 3:23 PM, Paolo Rossi statmailingli...@googlemail.com wrote:

Hi all,

I am trying to run the package xlsx to read Excel 2007 files and I am getting the error below.

    library(xlsx)
    Loading required package: xlsxjars
    Loading required package: rJava
    Error : .onLoad failed in loadNamespace() for 'xlsxjars', details:
      call: .jinit()
      error: cannot obtain Class.getSimpleName method ID
    Error: package 'xlsxjars' could not be loaded

By looking this up in the mailing list I have seen that it is an error related to the configuration of the path. I was also made aware that the path read into R gets truncated if it is too long. To avoid any issue I have added the JRE at the very beginning of the path; see below.

    p = Sys.getenv("PATH")
    strsplit(p, ";")
    $PATH
    [1] "c:\\Program Files\\Java\\j2re1.4.2_06\\bin\\client\\"
    [2] "c:\\Program Files\\Java\\jre1.5.0_06\\bin\\client\\"
    [3] "c:\\oracle\\ora92\\bin\\"
    [4] "c:\\oracle\\ora92\\jre\\1.4.2\\bin\\"
    [5] "c:\\oracle\\ora92\\jre\\1.4.2\\bin\\client\\"
    [6] "c:\\program files\\oracle\\jre\\1.3.1\\bin\\"
    [7] "C:\\WINDOWS\\system32"
    [8] "C:\\WINDOWS"

In the path variable the items have been pasted like this:

    c:\Program Files\Java\j2re1.4.2_06\bin\client\;c:\Program Files\Java\jre1.5.0_06\bin\client\;

The issue still persists. Can you please help?

Thanks, Paolo

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
Re: [R] Null values in R
Are you talking about NULL or NA values that may occur in some data? Can you give an example of what your concern is and what set of operations you want to do? If they are NAs, there are some standard ways that they can be handled.

On Mon, Nov 15, 2010 at 9:55 AM, Raji raji.sanka...@gmail.com wrote:

Hi R-helpers,

can you please let me know the methods by which NULL values can be handled in R? Are there any generic commands/functions that can be given in a workspace, so that the NULL values occurring in that workspace (for any datasets that are loaded, any output that is calculated) are treated in the same way?

Thanks in advance.

-- View this message in context: http://r.789695.n4.nabble.com/Null-values-in-R-tp3043184p3043184.html

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
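For readers unsure which of the two they have, a small illustration of the difference: NA marks a missing value inside a vector, while NULL is the empty object. The values are made up.

```r
x <- c(1, NA, 3)
is.na(x)                # flags the missing element
mean(x, na.rm = TRUE)   # many functions can skip NAs on request

length(NULL)            # NULL has length 0
is.null(NULL)           # TRUE

lst <- list(a = 1, b = NULL)
length(lst)             # a NULL given to list() is kept as an element
```

Missing entries read from data files normally arrive as NA, which is why functions such as is.na(), complete.cases() and the na.rm argument are the usual tools.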
Re: [R] Question about readLines
    x <- readLines('yourFile')
    x <- x[grepl('myWord', x)]

On Tue, Nov 16, 2010 at 7:37 AM, romzero romz...@yahoo.it wrote:

Can I use readLines to extract only the lines that contain a specific word? If yes, how? Thanks for the help.

-- View this message in context: http://r.789695.n4.nabble.com/Question-about-readLines-tp3044701p3044701.html

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
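A self-contained version of the same idea, using a temporary file with made-up contents so that it can be run as is. fixed = TRUE is an optional addition that treats the search word literally rather than as a regular expression.

```r
path <- tempfile()
writeLines(c("alpha myWord one", "beta two", "gamma myWord three"), path)

x    <- readLines(path)
hits <- x[grepl("myWord", x, fixed = TRUE)]  # keep only the matching lines
hits
```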
Re: [R] : plot different axis, same plot position
I think this is what you want for the 'axis' command:

    axis(1, at = x, labels = T, las=2)

On Wed, Nov 17, 2010 at 12:19 PM, Pelt van, Saskia (KNMI) saskia.van.p...@knmi.nl wrote:

Dear R-users,

I am trying to make a plot in R where x and y are plotted in the regular way, but the x axis corresponds to another set of values. For example, I have x, y and T (all 29 values):

    x <- c(-1.31846232, -1.04744756, -0.87034853, -0.72883370, -0.60618971, -0.49501845, -0.39128988, -0.29250120, -0.19694055, -0.10334039, -0.01069355, 0.08185470, 0.17507665, 0.26971270, 0.36651292, 0.46627625, 0.56989300, 0.67839644, 0.79303127, 0.91535108, 1.04736522, 1.19177282, 1.35235778, 1.53470330, 1.74760041, 2.00616370, 2.33996397, 2.82073311, 3.72564504)
    y <- c(51.85177, 53.67026, 60.64062, 62.33320, 62.81224, 63.20116, 76.10719, 78.07620, 78.83859, 80.06188, 84.53568, 85.15358, 87.39279, 87.49965, 89.88347, 90.73792, 90.92971, 92.17759, 92.84064, 93.17964, 97.51360, 97.64690, 98.20756, 101.64150, 104.91425, 112.88917, 116.90400, 121.50099, 126.43808)
    T <- c(1.190283, 1.240506, 1.295154, 1.354839, 1.420290, 1.492386, 1.572193, 1.661017, 1.760479, 1.872611, 2.00, 2.145985, 2.314961, 2.512821, 2.747664, 3.030928, 3.379310, 3.818182, 4.388060, 5.157895, 6.255319, 7.945946, 10.89, 17.294118, 42.00)
    plot(x, y, xlab="x", ylab="10 day maximum (mm)", col="blue", type="b", pch=1, lty=2)

This plot is the correct plot, but I want the x-axis to display the values of T. If I do this:

    plot(x, y, xlab="x", ylab="10 day maximum (mm)", xaxt="n", col="blue", type="b", pch=1, lty=2)
    axis(1, las=0, T)

then T is plotted instead of x, but on the same scale. So the tick marks start at 1 and stop at 3. I would like each point of the graph to correspond to a value of T on the x-axis, but the plot position should still correspond to x. This means that y corresponds to the x values in plot position, but to the T value on the x-axis. I want this because the x values have no explanatory meaning (Gumbel variates), while the T values (return period) have, so I can use it to communicate what is happening in this graph. I hope somebody can help me with this.

Kind regards, Saskia van Pelt

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
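A reduced sketch of the axis() suggestion with five made-up points: the ticks are placed at the x positions and labelled with the return periods. The label vector is called `Tr` here, an assumed name chosen so that it does not mask T, the built-in shorthand for TRUE.

```r
x  <- c(-1.3, -0.4, 0.4, 1.2, 2.0)   # plotting positions (made up)
y  <- c(52, 63, 80, 93, 113)
Tr <- c(1.2, 1.6, 2.3, 3.8, 7.9)     # return periods used as labels (made up)

labs <- signif(Tr, 2)                # short labels for the ticks

pdf(NULL)  # null device so the sketch also runs non-interactively
plot(x, y, xaxt = "n", xlab = "Return period", type = "b")
axis(1, at = x, labels = labs, las = 2)  # tick at each x, labelled with Tr
invisible(dev.off())
```

With many closely spaced points the labels will overlap, so labelling only a subset of positions, for example at = x[seq(1, length(x), by = 3)] with the matching labels, may read better.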
Re: [R] Looking up the directory a file is located in
What do you expect it to look up? How do you want to specify the directory? Is it supposed to search through some sequence or use ESP?

On Wed, Nov 17, 2010 at 4:08 PM, Cliff Clive cliffcl...@gmail.com wrote:

Hello everyone,

This should be an easy question, I think. I'd like to write a command in a program to set the working directory to whatever directory the file is currently stored in.

Suppose I have a file called myRscript.r, and it's stored in C:\Rprojects\myRscript.r, and it references other R scripts and data files in the same directory. If I enter the command

    setwd("C:/Rprojects")

I can then access all the files I need without typing the path. But suppose I want to move all of those files into another folder, say, C:\NewFolder. And suppose I might do this fairly often, or make copies of the script in several folders. Is there a command that looks something like this:

    setwd( look up current directory )

that will work no matter where I move my project, without having to go in and re-type the new directory path?

Thanks in advance, Cliff

-- View this message in context: http://r.789695.n4.nabble.com/Looking-up-the-directory-a-file-is-located-in-tp3047649p3047649.html

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
Re: [R] Counting things in a time series.
You never did show us what your data looks like. You could convert them to POSIXct, then use 'cut' to split them into the bins and then 'table' to count them. Try something like this:

    timeStamp <- as.POSIXct('2010-11-21 00:00') + runif(50, 0, 86400)  # day's worth of time
    # bin into 1 hour buckets starting at midnight
    bins <- cut(timeStamp, breaks = seq(as.POSIXct('2010-11-21 00:00'),
                by = '1 hour', length = 25))
    table(bins)
    bins
    2010-11-21 00:00:00 2010-11-21 01:00:00 2010-11-21 02:00:00 2010-11-21 03:00:00 2010-11-21 04:00:00
                      1                   5                   4                   5                   5
    2010-11-21 05:00:00 2010-11-21 06:00:00 2010-11-21 07:00:00 2010-11-21 08:00:00 2010-11-21 09:00:00
                      2                   2                   4                   1                   3
    2010-11-21 10:00:00 2010-11-21 11:00:00 2010-11-21 12:00:00 2010-11-21 13:00:00 2010-11-21 14:00:00
                      0                   3                   3                   0                   1
    2010-11-21 15:00:00 2010-11-21 16:00:00 2010-11-21 17:00:00 2010-11-21 18:00:00 2010-11-21 19:00:00
                      0                   3                   0                   3                   1
    2010-11-21 20:00:00 2010-11-21 21:00:00 2010-11-21 22:00:00 2010-11-21 23:00:00
                      0                   1                   1                   2

On Sun, Nov 21, 2010 at 12:29 AM, Noah Silverman n...@smartmediacorp.com wrote:

Hi,

I have a process (not in R) that records events with a time stamp. So, I have a huge series of maybe 100,000 time stamps. I'd like to break it up into hourly (or daily) intervals and then count how many events occurred in each interval. That way I can graph it. Ideally, converting this into a time series in R would let me do some interesting analysis.

The data is just a list of epoch timestamps. Importing into R is trivial but I'm stuck from there.

1) How can I bin the counts by hour? (One thought would be to divide each timestamp by 3600, then multiply back by 1000. This would effectively convert them to hours with multiple entries per hour.) But I then don't know how to count them.

2) Once I figure out the counts, I'll then have a data structure with a column of epoch seconds and a second column of counts. How can I then convert that into a ts object?

Thanks!

-N

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
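A sketch that starts from epoch seconds, since that is what the question describes, and ends with a ts object. The five timestamps are made up, and treating one day as one cycle (frequency = 24) is an assumption about how the series would be used.

```r
# Hypothetical epoch timestamps (seconds since 1970-01-01 UTC).
epoch <- c(1290297600, 1290297900, 1290301300, 1290301400, 1290308700)
stamp <- as.POSIXct(epoch, origin = "1970-01-01", tz = "UTC")

# Hourly bins covering 2010-11-21; 25 break points give 24 bins.
start  <- as.POSIXct("2010-11-21 00:00", tz = "UTC")
breaks <- seq(start, by = "1 hour", length.out = 25)
counts <- table(cut(stamp, breaks = breaks))

# Regular hourly series; empty hours are kept as zero counts.
hourly <- ts(as.vector(counts), frequency = 24)
```

Because cut() returns a factor with one level per bin, table() reports hours without events as 0, which is what a regularly spaced ts needs.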
Re: [R] how to change number of characters per line for print() to sink()?
See ?options, in particular 'width':

    options(width = 1000)

On Mon, Nov 22, 2010 at 7:22 AM, Nevil Amos nevil.a...@gmail.com wrote:

I am using R to read and reformat data that is then saved to a text file using sink(). The file has a number of initial lines of comments and summary data, followed by a print of a data.frame with no row names. For example:

    a <- c(100:120)
    b <- c(rnorm(100:120))
    c <- c(rnorm(200:220))
    mydata <- data.frame(rbind(a, b, c))
    sink("datafile.txt")
    cat("comments about my data \n")
    cat("other calculations returned as separate text comments on a line \n")
    print(mydata, row.names=F)
    sink()

I need the content of the text file to keep each row of the data frame on a single line, thus (with intervening columns present, of course):

datafile.txt

    comments about my data
    other calculations returned as separate text comments on a line
    X1 X2 X3 X4 X5 X6 . X19 X20 X21
    100.000 101.00 102.000 103.000 104.000 105.000 ..118.000 119.000 120.000
    -0.3380570 -1.400905 1.0396499 -0.5802181 -0.2340614 0.6044928 ...-0.4854702 -0.3677461 -1.2033173
    -0.9002824 1.544242 -0.8668653 0.3066256 0.2490254 -1.6429223 . 0.0861146 0.4276929 -0.3408604

How do I change the settings for print(), or use another function, to keep each row of the data frame as a single line (of greater length, up to approx. 300 characters) instead of wrapping the data frame into multiple lines of text?

The problem: I end up with the data frame split into several sections, one under another, thus:

datafile.txt

    comments about my data
    other calculations returned as separate text comments on a line
    X1 X2 X3 X4 X5 X6
    100.000 101.00 102.000 103.000 104.000 105.000
    -0.3380570 -1.400905 1.0396499 -0.5802181 -0.2340614 0.6044928
    -0.9002824 1.544242 -0.8668653 0.3066256 0.2490254 -1.6429223
    X7 X8 X9 X10 X11 X12
    106.000 107. 108.000 109.000 110.000 111.000
    0.3152427 0.15093494 -0.3316172 -0.3603724 -2.0516402 -0.4556241
    -0.6502265 -0.08842649 -0.3775335 -0.4942572 -0.0976565 -0.7716651
    X13 X14 X15 X16 X17 X18
    112.000 113.000 114.000 115.000 116. 117.00
    0.8829135 0.8851043 -0.7687383 -0.9573476 -0.03041968 1.425754
    0.2666777 0.6405255 0.2342905 -0.7705545 -1.18028004 1.303601
    X19 X20 X21
    118.000 119.000 120.000
    -0.4854702 -0.3677461 -1.2033173
    0.0861146 0.4276929 -0.3408604

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
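A sketch showing the effect of the width option on a data frame shaped like the one in the question. capture.output() stands in for sink() here so that the number of printed lines can be inspected; the previous option value is restored afterwards.

```r
set.seed(1)
mydata <- data.frame(rbind(a = 100:120, b = rnorm(21), c = rnorm(21)))

old <- options(width = 1000)   # wide enough that rows are not wrapped
lines_wide <- capture.output(print(mydata, row.names = FALSE))
options(old)                   # restore the previous width

length(lines_wide)             # one header line plus one line per row
```

If the file is meant to be read back by a program rather than by eye, write.table(mydata, file, row.names = FALSE) is another option that never wraps rows.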
Re: [R] Reading parts of data files
    ?file        - how to use connections
    ?read.table  - 'skip' parameter, colClasses to only read columns you want

That is not a large file. Read the whole thing in and then extract the data you need.

On Tue, Nov 23, 2010 at 6:05 AM, fbielejec fbiele...@gmail.com wrote:

Dear all,

I'm doing an analysis where I need to work on relatively large (50-60 MB) text files, though I'm really interested only in the parts with binary variables (named indicators1, indicators2, ... etc.). Every text file contains other numeric columns, but not always the same ones and not always in the same order. Therefore I would rather need a method that connects to the file and reads only the columns matching a name pattern (i.e. indicators + number). That should speed things up (at the moment I have to clean the data by hand) and also leave a smaller memory footprint. Could you point me towards something?

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
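One way the colClasses hint could be used for the name-pattern case: read just the header, decide which columns to keep, and mark the rest as "NULL" so read.table() skips them. The file contents below are invented for the sketch.

```r
path <- tempfile(fileext = ".txt")
writeLines(c("id indicators1 weight indicators2",
             "1 0 2.5 1",
             "2 1 3.1 0"), path)

# Read only the first data row to learn the column names.
hdr  <- names(read.table(path, header = TRUE, nrows = 1))
keep <- grepl("^indicators[0-9]+$", hdr)

cc <- rep("NULL", length(hdr))  # "NULL" tells read.table to skip a column
cc[keep] <- NA                  # NA lets read.table choose the type

ind <- read.table(path, header = TRUE, colClasses = cc)
names(ind)
```

The simpler route from the reply, reading everything and then subsetting with dat[, grepl("^indicators[0-9]+$", names(dat))], gives the same columns at the cost of holding the full table in memory for a moment.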
Re: [R] Executing multiple .R files
?source

On Tue, Nov 23, 2010 at 10:04 AM, Santosh Srinivas santosh.srini...@gmail.com wrote:

Hello R-Helpers,

I have a directory with some .R files that I execute every day. I want to write a script that executes each one of them sequentially. Is there a statement for this?

Thank you.

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
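A small sketch of running every .R file in a directory with source(). The directory and the two scripts are created on the fly here, and running them in sorted file-name order is an assumption about what "sequentially" should mean.

```r
script_dir <- tempfile("scripts")
dir.create(script_dir)
writeLines("a <- 1",     file.path(script_dir, "01_first.R"))
writeLines("b <- a + 1", file.path(script_dir, "02_second.R"))

# All .R files in the directory, in name order.
scripts <- sort(list.files(script_dir, pattern = "\\.R$", full.names = TRUE))
for (f in scripts) source(f)   # each script runs in the global environment

b
```

Numeric prefixes in the file names are a simple way to make the run order explicit.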
Re: [R] Question about list function
Try this:

    input <- textConnection("var1 a
    var2 b
    var3 c
    var4 d
    var5 e")
    x <- read.table(input, as.is = TRUE)
    close(input)
    # create a list
    xList <- as.list(x$V2)
    names(xList) <- x$V1
    xList
    $var1
    [1] "a"

    $var2
    [1] "b"

    $var3
    [1] "c"

    $var4
    [1] "d"

    $var5
    [1] "e"

On Tue, Nov 23, 2010 at 12:05 PM, Guido Leoni guido.le...@gmail.com wrote:

Dear List,

I'm a newbie R user. I'm using the list function in order to make a variable like this:

    clusters <- list(a=var1, b=var2)

My problem is that the total number of variables that I need to include in my list is up to 200. I have the text string with the complete list of my variables, but it is too long to cut and paste into my bash shell. So is there a way to import the list from a text file?

Thank you very much for any kind of help.

Guido

-- Guido Leoni National Research Institute on Food and Nutrition (I.N.R.A.N.) via Ardeatina 546 00178 Rome Italy tel + 39 06 51 49 41 (operator) + 39 06 51 49 498 (direct)

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
Re: [R] Custom ticks on x axis when dates are involved
Try this: d - c(4/6/1984, 9/29/1984, 1/19/1985, 3/27/1986, 10/3/1987, 10/8/1987, 1/28/1988, 12/16/1989, 10/11/1991, 10/5/1992, 11/15/1995, 4/7/1996, 10/3/1997, 2/28/1998, 10/11/2000, 10/30/2001, 2/27/2002, 12/28/2002, 10/20/2003, 10/20/2003, 10/20/2003, 11/7/2004, 10/9/2005, 10/9/2005, 10/28/2006, 3/7/2007, 4/6/2007, 10/1/2008, 11/2/2008, 9/2/2009) land - c(3094.083, 3173.706, 3032.062, 3110.191, 3013.832, 3013.843, 3030.776, 3111.819, 3131.474, 3104.857, 2992.511, 3018.579, 2994.332, 2992.453, 3065.483, 3077.917, 3096.034, 3057.518, 3089.202, 3082.897, 3086.080, 3071.480, 3106.573, 3109.163, 3124.328, 3118.239, 3119.106, 3107.055, 3113.695, 3113.021) #transform d in dates: mdY.format - %m/%d/%Y d1 - as.POSIXct(strptime(d, format = mdY.format)) #Do the simple plot, supress ticks plot(d1, land, xlab=Date, ylab = Sq. Km., xaxt = n) # get minimum date dMin - min(d1) # get Jan-1 of that year dMinJan - as.POSIXct(format(dMin, %Y-1-1)) # now the sequence by year seqYear - seq(dMinJan, by = 1 year, to = max(d1) + (86400 * 365)) # add year to end axis(1, at = seqYear, labels = format(seqYear, %Y), las = 2) On Wed, Nov 24, 2010 at 2:27 PM, Monica Pisica pisican...@hotmail.com wrote: Hi, I have a set of irregular time series and i want to produce a simple plot, with dates on x axis and attribute value of y axis. This is simple enough but my x axis is divided automatically by ticks every 5 years. I would like to have a tick every year at January 1st. I am not sure how i can do that - i end up with something very close to what i want, but it is clunky and not very correct. I know that my dates are somehow internally converted in an integer that represents the number of seconds passed since an origin (i suppose it is 1st of January 1900). I think it would be easier to show you my example what i've done. I would be very happy to find out the correct way of actually doing this. 
d - c(4/6/1984, 9/29/1984, 1/19/1985, 3/27/1986, 10/3/1987, 10/8/1987, 1/28/1988, 12/16/1989, 10/11/1991, 10/5/1992, 11/15/1995, 4/7/1996, 10/3/1997, 2/28/1998, 10/11/2000, 10/30/2001, 2/27/2002, 12/28/2002, 10/20/2003, 10/20/2003, 10/20/2003, 11/7/2004, 10/9/2005, 10/9/2005, 10/28/2006, 3/7/2007, 4/6/2007, 10/1/2008, 11/2/2008, 9/2/2009) land - c(3094.083, 3173.706, 3032.062, 3110.191, 3013.832, 3013.843, 3030.776, 3111.819, 3131.474, 3104.857, 2992.511, 3018.579, 2994.332, 2992.453, 3065.483, 3077.917, 3096.034, 3057.518, 3089.202, 3082.897, 3086.080, 3071.480, 3106.573, 3109.163, 3124.328, 3118.239, 3119.106, 3107.055, 3113.695, 3113.021) #transform d in dates: mdY.format - %m/%d/%Y d1 - as.POSIXct(strptime(d, format = mdY.format)) #Do the simple plot, supress ticks plot(d1, land, xlab=Date, ylab = Sq. Km., xaxt = n) #Now dealing with my ticks and labels - the ugly part: s1 - (31+29+31+6)*24*60*60 # to be subtracted from earliest date to get 1st of January s2 - (29+31+30+31)*24*60*60 # to be added to the latest date to get 1st of January 2010 # number of seconds in a year - but does not take into consideration the leap years t - 365*24*60*60 axis(1, at = seq((range(d1)[1]-s1), (range(d1)[2]+s2), t), las = 2, labels = paste(seq(1984, 2010, 1))) abline(v= seq((range(d1)[1]-s1), (range(d1)[2]+s2), t), lty = 2, col = darkgrey) Now the graph looks very close to what i want, but i know that my ticks actually are not exactly at 01/01/ as i would like, although i suppose my error is not that much in this instance. However i would really appreciate if i can get the ticks on my x axis how i want in a much more elegant way - if possible (and if not at least in the correct way). Thanks, and Happy Thanksgiving for those who celebrate ;-) Monica __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
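Since the observations are daily, the same tick placement can also be sketched with the Date class, which avoids counting seconds and leap years altogether. (For reference, POSIXct values are stored as seconds since 1970-01-01 UTC, not 1900.) The sketch below uses a shortened, illustrative subset of the dates and areas from the thread; the names first_jan, last_jan and ticks are my own.

```r
# Illustrative subset of the dates and areas from the thread
d    <- c("4/6/1984", "10/3/1987", "10/11/1991", "2/28/1998", "9/2/2009")
land <- c(3094.083, 3013.832, 3131.474, 2992.453, 3113.021)

d1 <- as.Date(d, format = "%m/%d/%Y")

# January 1 of the first year through January 1 after the last date
first_jan <- as.Date(format(min(d1), "%Y-01-01"))
last_jan  <- as.Date(paste0(as.integer(format(max(d1), "%Y")) + 1, "-01-01"))
ticks     <- seq(first_jan, last_jan, by = "year")

plot(d1, land, xlab = "Date", ylab = "Sq. Km.", xaxt = "n")
axis(1, at = ticks, labels = format(ticks, "%Y"), las = 2)
abline(v = ticks, lty = 2, col = "darkgrey")
```

Because seq() steps by calendar year, every tick falls exactly on January 1, leap years included.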
Re: [R] Cumsum with a max and min value
Does this do it?

pmin(2, pmax(-2, cumsum(a)))
 [1]  0  1  1  2  2  2  1  0  0 -1 -2

On Thu, Nov 25, 2010 at 3:44 PM, henrique henri...@allianceasset.com.br wrote:

I have a vector of values -1, 0, 1, say

a <- c(0, 1, 0, 1, 1, -1, -1, -1, 0, -1, -1)

I want to create a vector of the cumulative sum of this, but I need to set a maximum and minimum value for the cumsum, for example:

max_value <- 2
min_value <- -2

The expected result would be (0, 1, 1, 2, 2, 1, 0, -1, -1, -2, -2).

The only way I managed to do it was

res <- vector(length=length(a))
res[1] <- a[1]
for ( i in 2:length(a)) res[i] <- res[i-1] + a[i] * (( res[i-1] < max_value & a[i] > 0 ) | ( res[i-1] > min_value & a[i] < 0 ))

This is certainly not the best way to do it, so any suggestions?

Henrique

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
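Note that clipping the ordinary cumulative sum gives 0 1 1 2 2 2 1 0 0 -1 -2, which differs from the expected result from the sixth element on: the unclipped sum reaches 3, so later decrements start from 3 rather than from the cap of 2. A minimal sketch of a running total that is clamped at every step, assuming a start value of 0; the helper name clamp_step is mine.

```r
a <- c(0, 1, 0, 1, 1, -1, -1, -1, 0, -1, -1)
max_value <- 2
min_value <- -2

# One step of the running total, clamped to [min_value, max_value]
clamp_step <- function(total, x) min(max_value, max(min_value, total + x))

# Start from 0, keep every intermediate total, then drop the starting 0
res <- Reduce(clamp_step, a, init = 0, accumulate = TRUE)[-1]
res
```

Reduce() still loops internally, so this is mainly a more compact way of writing the original for loop, not a faster one.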
Re: [R] How to bin/average time points?
try this:

# create times 9 minutes apart
time <- seq(as.POSIXct('2010-11-25 00:00'), by = '9 min', length = 480)
mySamp <- data.frame(time = time, value = sample(1:100, length(time), TRUE))
# add column to split by hour
mySamp$hour <- format(mySamp$time, '%Y-%m-%d %H:30')
# compute the mean for each hour
tapply(mySamp$value, mySamp$hour, mean)
2010-11-25 00:30 2010-11-25 01:30 2010-11-25 02:30 2010-11-25 03:30 2010-11-25 04:30 2010-11-25 05:30
        54.42857         59.85714             47.5         45.71429         40.28571             56.5
2010-11-25 06:30 2010-11-25 07:30 2010-11-25 08:30 2010-11-25 09:30 2010-11-25 10:30 2010-11-25 11:30
        46.57143         47.14286             34.0         53.85714         50.28571             36.0
2010-11-25 12:30 2010-11-25 13:30 2010-11-25 14:30 2010-11-25 15:30 2010-11-25 16:30 2010-11-25 17:30
        31.57143         44.57143             42.5         52.42857         54.14286             44.7
2010-11-25 18:30 2010-11-25 19:30 2010-11-25 20:30 2010-11-25 21:30 2010-11-25 22:30 2010-11-25 23:30
        50.28571         60.57143             36.0         42.14286         65.14286             37.5
2010-11-26 00:30 2010-11-26 01:30 2010-11-26 02:30 2010-11-26 03:30 2010-11-26 04:30 2010-11-26 05:30
        51.71429         58.85714             48.5             45.0             44.0             38.0
2010-11-26 06:30 2010-11-26 07:30 2010-11-26 08:30 2010-11-26 09:30 2010-11-26 10:30 2010-11-26 11:30
            56.0         34.14286             64.7         51.42857         57.57143             44.5
2010-11-26 12:30 2010-11-26 13:30 2010-11-26 14:30 2010-11-26 15:30 2010-11-26 16:30 2010-11-26 17:30
            65.0         59.57143             63.5         52.57143         36.85714             63.3
2010-11-26 18:30 2010-11-26 19:30 2010-11-26 20:30 2010-11-26 21:30 2010-11-26 22:30 2010-11-26 23:30
        44.85714         64.85714             63.0         62.57143             62.0             57.0
2010-11-27 00:30 2010-11-27 01:30 2010-11-27 02:30 2010-11-27 03:30 2010-11-27 04:30 2010-11-27 05:30
        26.71429         33.57143             37.5             67.0         47.85714             63.0
2010-11-27 06:30 2010-11-27 07:30 2010-11-27 08:30 2010-11-27 09:30 2010-11-27 10:30 2010-11-27 11:30
        40.28571         46.42857             54.5             41.0             51.0             58.3
2010-11-27 12:30 2010-11-27 13:30 2010-11-27 14:30 2010-11-27 15:30 2010-11-27 16:30 2010-11-27 17:30
        62.14286         52.28571             75.3         43.71429         53.14286             27.5
2010-11-27 18:30 2010-11-27 19:30 2010-11-27 20:30 2010-11-27 21:30 2010-11-27 22:30 2010-11-27 23:30
        33.42857         56.85714             57.8             51.0         57.71429             38.7

On Thu, Nov 25, 2010 at 3:49 PM, DonDolowy kevin.dalga...@gmail.com wrote:

Dear all,

I am pretty new to R, having had only an introductory course, so please bear with me. I am doing my PhD at The Max Planck Institute of Immunobiology, where I am analyzing some calorimetry data from some mice. I have a spreadsheet consisting of measurements of the respiratory exchange rate at different time points, measured every 9 minutes over some days. My goal is to bin/average the time points of each hour into only one measurement. E.g.

[Time] - [Measurement]
12.09 - 0.730
12.18 - 0.732
12.27 - 0.743
12.36 - 0.757
12.45 - 0.781
12.54 - 0.731

should be averaged to, for example, one time point and one value:

12.30 - [average of the six measurements]

I know how to average the measurements in a whole column, but how do I average every six measurements automatically, and also average every six time points and make a new sheet consisting of these data? I hope you guys are able to help, since we are really stuck here. I can of course do it manually, but with 8000 measurements it will take lots of time.

Thank you very much.

Best regards,
Kevin Dalgaard

--
View this message in context: http://r.789695.n4.nabble.com/How-to-bin-average-time-points-tp3059509p3059509.html
Sent from the R help mailing list archive at Nabble.com.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
Re: [R] How to catch error message
?try

On Fri, Nov 26, 2010 at 9:26 AM, Alla Bulashevska alla.bullashev...@fdm.uni-freiburg.de wrote:

Dear R users,

I would like to catch an error message (coming after an unsuccessful database query) so that the script will process further. How can I manage this?

Thank you in advance,
Alla.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
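As an illustration of the idea, here is a minimal sketch using tryCatch(), the more general relative of try(). The function run_query() is a hypothetical stand-in for the database call, which the original post does not show; it fails on purpose for one input so that the handler has something to catch.

```r
# Hypothetical stand-in for a database query that sometimes fails
run_query <- function(id) {
  if (id == 2) stop("query failed for id ", id)
  paste("result", id)
}

results <- list()
for (id in 1:3) {
  results[[id]] <- tryCatch(
    run_query(id),
    error = function(e) {
      message("Caught: ", conditionMessage(e))  # report the error and carry on
      NA_character_                             # placeholder for the failed query
    }
  )
}
```

The loop finishes all three iterations; the failed query leaves an NA in its slot instead of stopping the script.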
Re: [R] unused argument error
?nlm

It does not seem to have a parameter 'data', so that is the cause of your error message.

On Fri, Nov 26, 2010 at 4:36 PM, Mike Gibson megalop...@hotmail.com wrote:

I want to change my parameter g to maximize the sum of my model. I keep getting an unused argument error and I don't know why. Here are the details of my problem.

g <- 0.2                  #initial value for g
Qt <- exp(-g*tagdat$t)    #model building
PTT <- Qt*Qt              #model building
PT <- 2*Qt*(1-Qt)         #model building
P0 <- (1-Qt)^2            #model building
pTT <- PTT/(1-P0)         #model building
pT <- PT/(1-P0)           #model building
lamda <- c(g=g)           #make a list of values for my g parameter.
-sum(tagdat$N2*log(pTT)+tagdat$N1*log(pT))    #here is the sum of my model. It works.
f <- function(g) sum(tagdat$N2*log(pTT)+tagdat$N1*log(pT))    #now I name my model and make it a function called g
tag.fit <- nlm(f, p=lamda, data=tagdat, hessian=T)    #this is where I am stuck. I am running the nlm procedure but it keeps telling me I have the unused argument of data=list.

??? Any help would be greatly appreciated.

Mike

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
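To expand on the reply: nlm() forwards any argument it does not recognise to the objective function, so data=tagdat ends up being passed to f, which only accepts g. The posted f also never uses g, because pTT and pT were computed once outside it. A minimal sketch of one way to restructure this, assuming the goal is a maximum-likelihood fit (nlm() minimises, so the function returns the negative sum); the tagdat values below are invented for illustration and the names negloglik and dat are mine.

```r
# Hypothetical data standing in for 'tagdat' (not from the original post)
tagdat <- data.frame(t  = 1:5,
                     N2 = c(59, 38, 26, 18, 13),
                     N1 = c(41, 62, 74, 82, 87))

# Negative log-likelihood; everything that depends on g is computed inside
negloglik <- function(g, dat) {
  Qt  <- exp(-g * dat$t)
  PTT <- Qt * Qt
  PT  <- 2 * Qt * (1 - Qt)
  P0  <- (1 - Qt)^2
  pTT <- PTT / (1 - P0)
  pT  <- PT / (1 - P0)
  -sum(dat$N2 * log(pTT) + dat$N1 * log(pT))
}

# 'dat' is not an nlm() argument, so nlm() passes it on to negloglik()
tag.fit <- nlm(negloglik, p = 0.2, dat = tagdat, hessian = TRUE)
tag.fit$estimate
```

The extra argument works here only because negloglik() declares a matching parameter named dat.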
Re: [R] How to add multiple ablines
just add more 'abline' calls, or pass multiple values; e.g.,

abline(v=c(1,2,3,4), h=c(10,20))

On Fri, Nov 26, 2010 at 9:41 PM, Stephen Liu sati...@yahoo.com wrote:

Hi folks,

Run:

ToothGrowth
attach(ToothGrowth)
toothgrowth=lm(len~dose)

adding abline:

abline(toothgrowth)

I got it done adding a single abline. How do I add more ablines on the same diagram?

I found the following thread, applying the mapply command:

Plotting multiple ablines
http://www.mail-archive.com/r-help@r-project.org/msg51543.html

mapply(abline, (converge$kY + tan((90-converge$kT) * pi / 180)*(-converge$kX)), tan((90-converge$kT) * pi / 180))

But I couldn't resolve it. Substituting converge with toothgrowth didn't work. I also looked at ?mapply. Please help.

I don't have parameters for the other ablines to be added. This is only a learning example. Any data can fit to them.

TIA

B.R.
Stephen L

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
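As a learning example along the lines of the reply, the sketch below fits one regression per supplement type in the built-in ToothGrowth data and draws each fitted line with its own abline() call, followed by several reference lines in a single call. Splitting by supp is my choice of illustration, not something the original post asked for.

```r
# Fit one regression per supplement type in the built-in ToothGrowth data
fits <- lapply(split(ToothGrowth, ToothGrowth$supp),
               function(d) lm(len ~ dose, data = d))

plot(len ~ dose, data = ToothGrowth)
# One abline() call per fitted model, each in its own colour
for (i in seq_along(fits)) abline(fits[[i]], col = i + 1)
# Several vertical and horizontal reference lines in one call
abline(v = c(0.5, 1, 2), h = c(10, 20), lty = 3)
```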
Re: [R] How to load data file without attribute names?
You need to follow the posting guide and show exactly what you did. I read in your data and can print it. You did not show what 'x' and 'y' were in your data:

myData <- read.table('clipboard', sep=',')
myData
  V1 V2 V3      V4     V5
1  1 72  0  5.6431 28.199
2  1 72  0 12.6660 28.447
3  1 72  0 19.6810 28.695
4  1 72  0 25.6470 28.905
plot(myData$V4, myData$V5)

You need to at least do

str(x)
str(y)

to show what you are working with.

On Sat, Nov 27, 2010 at 7:45 AM, 44whyfrog 44whyf...@163.com wrote:

Hi all,

I'm new to R and I'm struggling with loading a data file without attribute names, like:

1,72,0,5.6431,28.199
1,72,0,12.666,28.447
1,72,0,19.681,28.695
1,72,0,25.647,28.905

It has no names for the columns nor the rows. I tried

data <- read.table(path, header = FALSE, sep = ",")

and it seems to work. But later, when I try qqnorm to plot the graph, it gives me the error message:

xy.coords(x, y, xlabel, ylabel, log) : 'x' and 'y' lengths differ

I think the reason might be that I load the file wrongly. What should I do in this case?

--
View this message in context: http://r.789695.n4.nabble.com/How-to-load-data-file-without-attribute-names-tp3061489p3061489.html
Sent from the R help mailing list archive at Nabble.com.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
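For readers without the original file, the four sample rows can be read from an inline string to reproduce the reply. This is a minimal sketch; the choice of V4 and V5 for plotting follows the reply, and which columns the poster actually passed to qqnorm is not stated.

```r
# The four sample rows from the post, supplied inline instead of a file
txt <- "1,72,0,5.6431,28.199
1,72,0,12.666,28.447
1,72,0,19.681,28.695
1,72,0,25.647,28.905"

myData <- read.table(text = txt, header = FALSE, sep = ",")
str(myData)                    # columns are named V1..V5 automatically

# Both arguments come from the same data frame, so their lengths match
plot(myData$V4, myData$V5)
qqnorm(myData$V5)
```

A "lengths differ" error usually means the two vectors handed to the plotting function came from different objects, which str() on each would reveal.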
Re: [R] Two time measures
Try this:

x <- structure(list(Date = c("01/09/2009", "01/09/2009", "01/09/2009",
"01/09/2009", "01/09/2009", "01/09/2009", "02/09/2009", "02/09/2009",
"02/09/2009", "02/09/2009", "02/09/2009", "02/09/2009"), Time = c("10:00",
"10:05", "10:10", "16:45", "16:50", "16:55", "10:00", "10:05",
"10:10", "16:45", "16:50", "16:55"), Close = c(56567L, 56463L,
56370L, 55771L, 55823L, 55814L, 55626L, 55723L, 55659L, 55742L,
55717L, 55385L)), .Names = c("Date", "Time", "Close"), class = "data.frame", row.names = c(NA,
-12L))
# convert the time
x$time <- as.POSIXct(paste(x$Date, x$Time), format='%d/%m/%Y %H:%M')
# create list
x.list <- split(x[, c('time','Close')], format(x$time, format = "%Y%m%d"))
str(x.list)
List of 2
 $ 20090901:'data.frame': 6 obs. of 2 variables:
  ..$ time : POSIXct[1:6], format: "2009-09-01 10:00:00" "2009-09-01 10:05:00" "2009-09-01 10:10:00" ...
  ..$ Close: int [1:6] 56567 56463 56370 55771 55823 55814
 $ 20090902:'data.frame': 6 obs. of 2 variables:
  ..$ time : POSIXct[1:6], format: "2009-09-02 10:00:00" "2009-09-02 10:05:00" "2009-09-02 10:10:00" ...
  ..$ Close: int [1:6] 55626 55723 55659 55742 55717 55385

On Sat, Nov 27, 2010 at 5:02 PM, Eduardo de Oliveira Horta eduardo.oliveiraho...@gmail.com wrote:

Hello!

I have a csv file of intra-day financial data (5-min closing prices) that looks like this (note: the dates are formatted as day/month/year, as is usual here in Brazil):

Date;Time;Close
01/09/2009;10:00;56567
01/09/2009;10:05;56463
01/09/2009;10:10;56370
##(goes on all day)
01/09/2009;16:45;55771
01/09/2009;16:50;55823
01/09/2009;16:55;55814
##(jumps to the subsequent day)
02/09/2009;10:00;55626
02/09/2009;10:05;55723
02/09/2009;10:10;55659
##(goes on all day)
02/09/2009;16:45;55742
02/09/2009;16:50;55717
02/09/2009;16:55;55385
## (and so on to the next day)

I would like to store the intra-day 5-min prices in a list, where each element would represent one day, something like

list[[1]]
price at time 1, day 1
price at time 2, day 1
...
price at time n_1, day 1

list[[2]]
price at time 1, day 2
price at time 2, day 2
...
price at time n_2, day 2

and so on.

As the n_1, n_2, etc. suggest, each day has its own number of observations (this reflects the fact that the market may open and close at varying times of day). I have guessed that a list would be a better way to store my data, since a matrix would be filled with NA's and that is not exactly what I'm looking for.

Thanks in advance, and best regards,

Eduardo Horta

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
Re: [R] Help Please!!!!!!!!!
Your data seems to read in just fine, so what is the problem you are trying to solve?

x <- read.table('clipboard', sep='\t', header=TRUE)
str(x)
'data.frame': 5 obs. of 5 variables:
 $ X     : Factor w/ 5 levels "JE","JM","S",..: 5 2 4 1 3
 $ None  : int 4 4 25 18 10
 $ Light : int 2 3 10 24 6
 $ Medium: int 3 7 12 33 7
 $ Heavy : int 2 4 4 13 2
summary(x)
  X          None          Light        Medium         Heavy
 JE:1   Min.   : 4.0   Min.   : 2   Min.   : 3.0   Min.   : 2
 JM:1   1st Qu.: 4.0   1st Qu.: 3   1st Qu.: 7.0   1st Qu.: 2
 S :1   Median :10.0   Median : 6   Median : 7.0   Median : 4
 SE:1   Mean   :12.2   Mean   : 9   Mean   :12.4   Mean   : 5
 SM:1   3rd Qu.:18.0   3rd Qu.:10   3rd Qu.:12.0   3rd Qu.: 4
        Max.   :25.0   Max.   :24   Max.   :33.0   Max.   :13

On Mon, Nov 29, 2010 at 12:29 AM, Melissa Waldman melissawald...@gmail.com wrote:

Hi,

I have been working with R for my stats class and I keep coming upon the same error. I have read so many sites about inputting data from a text file into R, and I'm using the data to do a correspondence analysis. I feel like I have read everything and it is still not explaining why the error message keeps coming up. I have used the exact examples I have seen in articles and the same error keeps popping up:

Error in sum(N) : invalid 'type' (character) of argument

I have spent so long trying to figure this out without success. I am sure it has to do with the fact that my rows have names in them. I have attached the text file I have been using, and if you have any ideas as to how I can get R to plot the data using correspondence analysis with the column and row names, that would be really helpful! Or if you could pass this email to someone who may know how to help me, that would be much appreciated.

Thank you,
Melissa Waldman
my email: melissawald...@gmail.com
--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
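One plausible cause of the sum(N) error, offered as a guess since the poster's own code is not shown, is that the label column is still part of the table when it is handed to the correspondence analysis function. A minimal sketch that moves the labels into the row names, so the remaining table is purely numeric; the counts are reconstructed from the str() output in the reply.

```r
# Counts reconstructed from the str() output in the reply
txt <- "X None Light Medium Heavy
SM 4 2 3 2
JM 4 3 7 4
SE 25 10 12 4
JE 18 24 33 13
S 10 6 7 2"

# row.names = 1 moves the labels out of the data and into the row names
x <- read.table(text = txt, header = TRUE, row.names = 1)
N <- as.matrix(x)
sum(N)        # every column is numeric, so this no longer fails
rownames(N)   # the labels are kept for plotting
```

A matrix like N, with row and column names attached, is the usual input for correspondence analysis functions.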
Re: [R] how to know if a file exists on a remote server?
?file.exists

On Tue, Nov 30, 2010 at 11:10 AM, Baoqiang Cao bqcaom...@gmail.com wrote:

Hi,

I'd like to download some data files from a remote server. The problem is that some of the files don't actually exist, which I don't know before I try. Just wondering if a function in R could tell me whether a file exists on a remote server? I searched this mailing list and, after reading several mails, am still clueless. Any help will be highly appreciated.

B.C.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
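A caveat on the reply: file.exists() checks paths visible to the local file system, so it helps only if the remote location is mounted locally. For files addressed by URL, one option is to try opening a connection and treat a failure as "not there". The sketch below is an assumption-laden illustration: url_exists is a helper name of my own, not a base R function, and it is exercised here with file:// URLs built from a Unix-style temporary path so that it runs without network access.

```r
# Returns TRUE if the URL can be opened for reading, FALSE otherwise.
# 'url_exists' is an illustrative helper name, not a base R function.
url_exists <- function(u) {
  con <- url(u)
  on.exit(try(close(con), silent = TRUE))
  tryCatch({
    open(con, "rb")
    TRUE
  },
  error   = function(e) FALSE,
  warning = function(w) FALSE)
}

# Demonstration with local file:// URLs (no network needed)
tmp <- tempfile()
writeLines("some data", tmp)
url_exists(paste0("file://", tmp))              # the file was just created
url_exists(paste0("file://", tmp, ".missing"))  # this one does not exist
```

With http:// addresses the same function would report FALSE for any failure to open, including a server that is merely unreachable, so a FALSE is not proof that the file is absent.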
Re: [R] repeat write.table with the same code many times
Try using a connection:

output <- file('boothd10.txt', 'w')
write.table(boothd(10), output, sep = '\t', col.names = FALSE)
close(output)

On Tue, Nov 30, 2010 at 9:57 AM, Laura Bonnett l.j.bonn...@gmail.com wrote:

Dear all,

I am using R version 2.9.2 in Windows. I would like to output the results of a function I have written to a .txt file. I know that I can do this by using the code

write.table(boothd(10),"boothd10.txt",sep="\t",append=TRUE)

etc. However, I would like to bootstrap my function 'boothd' several times and get each vector of results as a new line in my text file. Is there a way to do this?

I usually just set the code up to do bootstrapping around the function (i.e. I perform the replications within the function and output a matrix of results). However, in the case of 'boothd' I am dealing with rare events and so sometimes I get an empty vector as output, which makes mathematical sense. Unfortunately this causes the bootstrapping code to crash. I'm hoping that writing the results out line by line will remove this problem. I have tried rep(write.table(...),15), say, but because of the occasional null vector the table is not written.

Thank you for any help you can give.

By the way,

write.table(boothd(10),"boothd10.txt",sep="\t",append=TRUE)
write.table(boothd(10),"boothd10.txt",sep="\t",append=TRUE)
write.table(boothd(10),"boothd10.txt",sep="\t",append=TRUE)
write.table(boothd(10),"boothd10.txt",sep="\t",append=TRUE)
write.table(boothd(10),"boothd10.txt",sep="\t",append=TRUE)

etc. works, but if I want to look at 1000 replications this is very time consuming!

Thanks,
Laura

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
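The connection approach extends naturally to a loop: open the file once, write one line per replicate, and close it at the end. A minimal sketch under stated assumptions: the boothd() below is a hypothetical stand-in that sometimes returns an empty vector, the output goes to a temporary file, and empty replicates are simply skipped (writing a blank line instead would be an equally reasonable choice).

```r
# Hypothetical stand-in for boothd(): a numeric vector, sometimes empty
boothd <- function(n) if (runif(1) < 0.3) numeric(0) else round(rnorm(3), 3)

set.seed(1)                            # only to make the sketch repeatable
out_file <- tempfile(fileext = ".txt")
output <- file(out_file, "w")
for (i in 1:20) {
  res <- boothd(10)
  if (length(res) == 0) next           # skip replicates with no events
  # t() turns the vector into a one-row matrix, so each replicate is one line
  write.table(t(res), output, sep = "\t", row.names = FALSE, col.names = FALSE)
}
close(output)

readLines(out_file)                    # one tab-separated line per kept replicate
```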
Re: [R] reference text variables as column name to plot
Where is the column? Is it in a matrix/dataframe? Why are you trying to plot 12 points against 'z', which has 24 values? You can reference the column by:

plot(yourDF[, s])

where 's' is what you created as the column name. You need to refer to the section on indexing in the Intro to R.

On Tue, Nov 30, 2010 at 11:40 AM, Graves, Gregory ggra...@sfwmd.gov wrote:

Having given myself carpal tunnel looking for an answer to this ...

I have a dataset, each column of which has 12 rows in it. I created a variable 'z' as follows:

z=1:24

Since I have a large number of these plots to make, and they are a bit complex, I want to reference the column I want to plot via a variable containing the name of that column, as follows:

similar='1978'
s=paste('Y',similar,sep='')

Variable s now contains 'Y1978', which is the name of one of the columns. However, when I try to plot

plot(z,s,type='l')

I get an 'x and y lengths differ' error because variable s is being recognized as 'Y1978', length=1, rather than the contents of the column Y1978, length=12.

I tried all the usual tricks I know like s. How do you get R to reference a variable as a column name?

Thank you.

Gregory A. Graves, Lead Scientist
Everglades REstoration COordination and VERification (RECOVER)
Restoration Sciences Department
South Florida Water Management District
Phones: DESK: 561 / 682 - 2429 CELL: 561 / 719 - 8157

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
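To make the indexing concrete, here is a minimal sketch with an invented data frame (yourDF and its values are hypothetical). The column is looked up by the string held in s, and z is given the same length as the column so that the plot call succeeds.

```r
# Hypothetical data frame with one column per year, 12 rows each
yourDF <- data.frame(Y1977 = 1:12, Y1978 = (1:12)^2)

similar <- "1978"
s <- paste("Y", similar, sep = "")   # "Y1978"

z <- 1:12                            # same length as the column
plot(z, yourDF[[s]], type = "l", ylab = s)
```

Both yourDF[[s]] and yourDF[, s] use the value of s as the column name, whereas yourDF$s would look for a column literally called "s".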
Re: [R] excluding factors from sampling
You could just sample the complete group once and then draw the number you want off the top; the next time you go to the well, those items are gone and you get something from the remainder. You can also use 'setdiff' to get the difference.

On Tue, Nov 30, 2010 at 12:45 PM, Emma Moran emma.r.mo...@gmail.com wrote:

Hello,

I am trying to write a function that first requires randomly sampling items from a set of factors. I need to be able to sample from that same set of factors, but exclude the ones that have already been sampled previously. For example, suppose I have a set of items a-j (a,b,c,d,e,f,g,h,i, and j) and randomly sample a, c, and f from that group. How do I sample again from the larger group (a-j) but exclude the items (a,c,f) that I have already sampled? I want this to be a function, so I don't want to just manually exclude a, c, and f.

Thanks!

--
Emma Moran
Washington University in St Louis
Biology Department
McDonnell Hall Rm 419
One Brookings Drive, St. Louis, MO 63130
emo...@wustl.edu

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
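Both suggestions can be sketched in a few lines. The item set a through j comes from the question; the draw size of 3 and the seed are arbitrary choices for the illustration.

```r
items <- letters[1:10]                 # a through j

set.seed(42)                           # only to make the sketch repeatable

# Option 1: setdiff() removes what was already drawn before sampling again
first  <- sample(items, 3)
second <- sample(setdiff(items, first), 3)

# Option 2: shuffle once, then take consecutive chunks "off the top"
shuffled <- sample(items)
draw1 <- shuffled[1:3]
draw2 <- shuffled[4:6]
```

Either way, no item can appear in both draws, and nothing has to be excluded by hand.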
Re: [R] a small puzzle?
You probably want to use ifelse:

s <- ifelse(news1o > s2o, 1, -1)

'if' only handles a single logical expression.

On Mon, Jul 12, 2010 at 10:02 AM, Raghu r.raghura...@gmail.com wrote:

I know the following may sound too basic, but I thought the mailing list is for the benefit of people of all levels. I ran a simple if statement on two numeric vectors (news1o and s2o) which are of equal length. I have done an str on both of them for your kind perusal below. I am trying to compare the numbers in both and initiate a new vector s as 1 or 0 depending on whether the elements in the arrays are greater or lesser than each other. When I do a simple s=(news1o>s2o) I get the values of s as a string of TRUEs and FALSEs, but when I try to override using the if statement this cribs. I get only one element in s and that is a puzzle. Any ideas on this please? Many thanks.

if(news1o>s2o)(s<-1) else
+ (s<--1)
[1] -1
Warning message:
In if (news1o > s2o) (s <- 1) else (s <- -1) :
  the condition has length > 1 and only the first element will be used
s
[1] -1
length(s)
[1] 1
str(news1o)
 num [1:3588] 891 890 890 888 886 ...
str(s2o)
 num [1:3588] 895 892 890 888 885 ...

--
'Raghu'

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
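A small worked example shows the difference. The vectors below hold only the first five values as displayed by str() in the post (the real vectors have 3588 elements and the displayed values may be rounded).

```r
news1o <- c(891, 890, 890, 888, 886)   # first five values shown by str()
s2o    <- c(895, 892, 890, 888, 885)

# ifelse() tests every element and returns a vector of the same length
s <- ifelse(news1o > s2o, 1, -1)
s
length(s)
```

Only the last pair satisfies news1o > s2o, so s is -1 everywhere except the final element. In recent R versions (4.2.0 and later), passing a condition of length greater than one to if() is an error rather than a warning.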
Re: [R] exercise in frustration: applying a function to subsamples
try 'drop=TRUE' on the split function call. This will prevent the NULL set from being sent to the function.

On Mon, Jul 12, 2010 at 3:10 PM, Ted Byers r.ted.by...@gmail.com wrote:

From the documentation I have found, it seems that one of the functions from package plyr, or a combination of functions like split and lapply, would allow me to have a really short R script to analyze all my data (I have reduced it to a couple hundred thousand records with about half a dozen records).

I get the same result from ddply and split/lapply:

ddply(moreinfo, c("m_id","sale_year","sale_week"),
+ function(df) data.frame(res = fitdist(df$elapsed_time,"exp"), est = res$estimate, sd = res$sd))
Error in fitdist(df$elapsed_time, "exp") :
  data must be a numeric vector of length greater than 1

and

lapply(split(moreinfo, list(moreinfo$m_id,moreinfo$sale_year,moreinfo$sale_week)),
+ function(df) fitdist(df$elapsed_time,"exp"))
Error in fitdist(df$elapsed_time, "exp") :
  data must be a numeric vector of length greater than 1

Now, in retrospect, unless I misunderstood the properties of a data.frame, I suppose a data.frame might not have been entirely appropriate, as the m_id samples start and end on very different dates, but I would have thought a list data structure should have been able to handle that. It would seem that split is making groups that have the same start and end dates (or that if, for example, I have sale data for precisely the last year, split would insist on both 2009 and 2010 having weeks from 0 through 52 instead of just the weeks in each year that actually have data: 26 through 52 for last year and 1 through 25 for this year). I don't see how else the data passed to fitdist could have a sample size of 0.

I'd appreciate understanding how to resolve this. However, it isn't a show stopper, as it now seems trivial to just break it out into a loop (followed by a lapply/split combo using only sale year and sale month).

While I am asking, is there a better way to split such temporally ordered data into weekly samples that respect the year in which the sample is taken as well as the week in which it is taken?

Thanks

Ted

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
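The effect of drop=TRUE is easiest to see on a toy table. In the sketch below the data frame is invented and mean() stands in for the distribution fit. Note also that the quoted error message asks for a vector of length greater than 1, so groups with a single record may need filtering as well; that extra guard is my addition, not part of the reply.

```r
# Hypothetical sales records: not every year/week combination occurs
moreinfo <- data.frame(
  sale_year    = c(2009, 2009, 2009, 2010, 2010, 2010),
  sale_week    = c(51, 51, 52, 1, 1, 1),
  elapsed_time = c(3.1, 4.7, 2.2, 5.0, 1.3, 2.8)
)
groups <- list(moreinfo$sale_year, moreinfo$sale_week)

all_cells  <- split(moreinfo$elapsed_time, groups)               # 2 x 3 = 6 cells, 3 empty
used_cells <- split(moreinfo$elapsed_time, groups, drop = TRUE)  # 3 non-empty cells

# Guard against groups too small to fit a distribution to
big_enough <- Filter(function(v) length(v) > 1, used_cells)
sapply(big_enough, mean)
```

Without drop=TRUE, split() creates a cell for every combination of the grouping levels, which is where the empty samples come from.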
Re: [R] Printing status updates in while-loop
try:

print(counter)
flush.console()  # force the output

On Wed, Jul 14, 2010 at 2:31 PM, Michael Haenlein haenl...@escpeurope.eu wrote:

Dear all,

I'm using a while loop in the context of an iterative optimization procedure. Within my while loop I have a counter variable that helps me to determine how long the loop has been running. Before the loop I initialize it as counter <- 0, and the last statement within my loop is counter <- counter + 1.

I'd like to print out the current status of counter while the loop is running, to know where the optimization routine is standing. I tried to do so by adding print(counter) within the while loop. This does, however, not seem to work: instead of printing regular updates, all print commands are executed only after the loop is finished.

Is there some easy way to print regular status updates while the while loop is still running?

Thanks,

Michael

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
Re: [R] read.table input array
Here is a way of creating a separate list of variable length vectors that you can use in your processing:

# read into a dataframe
x <- read.table(textConnection("A B C T Lengths
1 4.0 0.0015258789 18 c(1,2,3)
1 4.0 0.0015258789 18 c(1,2,6,7,8,3)
1 4.0 0.0015258789 18 c(1,2,3,1,2,3,4,5,6,7,9)
1 4.0 0.0015258789 18 c(1,2,3)
1 1.0 0.0017166138 24 c(1,1,4)"), header=TRUE)
# create a 'list' with the variable length vectors
# assuming that the Lengths are legal R expressions using 'c'
x$varList <- lapply(x$Lengths, function(a) eval(parse(text=a)))
x
  A B           C  T                  Lengths                         varList
1 1 4 0.001525879 18                 c(1,2,3)                         1, 2, 3
2 1 4 0.001525879 18           c(1,2,6,7,8,3)                1, 2, 6, 7, 8, 3
3 1 4 0.001525879 18 c(1,2,3,1,2,3,4,5,6,7,9) 1, 2, 3, 1, 2, 3, 4, 5, 6, 7, 9
4 1 4 0.001525879 18                 c(1,2,3)                         1, 2, 3
5 1 1 0.001716614 24                 c(1,1,4)                         1, 1, 4
str(x)
'data.frame': 5 obs. of 6 variables:
 $ A      : int 1 1 1 1 1
 $ B      : num 4 4 4 4 1
 $ C      : num 0.00153 0.00153 0.00153 0.00153 0.00172
 $ T      : int 18 18 18 18 24
 $ Lengths: Factor w/ 4 levels "c(1,1,4)","c(1,2,3)",..: 2 4 3 2 1
 $ varList:List of 5
  ..$ : num 1 2 3
  ..$ : num 1 2 6 7 8 3
  ..$ : num 1 2 3 1 2 3 4 5 6 7 ...
  ..$ : num 1 2 3
  ..$ : num 1 1 4

On Fri, Jul 16, 2010 at 10:51 AM, Balpo ba...@gmx.net wrote:

Hello to all!

I am new to R and I need your help. I'm trying to read a file whose contents are similar to this:

A B C T Lengths
1 4.0 0.0015258789 18 c(1,2,3)
1 1.0 0.0017166138 24 c(1,1,4)

So all the columns are numeric values, except Lengths, which is supposed to be a variable length array of integers. How can I make R read them as arrays of integers? Or otherwise, convert the character array to an array of integers.

When I read the file, I do it like this:

t1 = read.table(file=paste("./borrar.dat",sep=""), header=T, colClasses=c("numeric", "numeric", "numeric", "numeric", "array"))

But the 5th column is treated as an array of characters, and when trying to convert it to another class of data, I either get two strings c(1,2,3) and c(1,1,4) or, using a toRaw converter, I get the corresponding ASCII ¿? values.

Should the input be modified in order to be able to read it as an array of integers?

Thank you for your help.

Balpo

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
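If evaluating text from a file with eval(parse()) is a concern, the same vectors can be obtained by plain string handling. A minimal sketch, assuming every entry has exactly the form c(n1,n2,...) with integer values; the helper name parse_vec is mine.

```r
lengths_txt <- c("c(1,2,3)", "c(1,2,6,7,8,3)", "c(1,1,4)")

# Strip the leading "c(" and trailing ")", split on commas, convert to integer
parse_vec <- function(s) {
  inner <- gsub("^c\\(|\\)$", "", s)
  as.integer(strsplit(inner, ",", fixed = TRUE)[[1]])
}

varList <- lapply(lengths_txt, parse_vec)
str(varList)
```

If the column was read as a factor, wrap it in as.character() before applying parse_vec().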
Re: [R] Simple question regarding name of column headers
subset(miceTrainSample, select = -pID50)

On Fri, Jul 16, 2010 at 11:22 AM, Addi Wei addi...@gmail.com wrote:
> names(miceTrainSample)
> [1] "b_double"  "KierA2"    "KierFlex"  "Q_VSA_POS" "pID50"
>
> In the above code, how do I delete the pID50 column and store the resulting object without indicating column 5? The code below does the trick, but I wish to delete the column by specifying -pID50 instead of 5.
>
> names(miceTrainSample)[-5]
> [1] "b_double"  "KierA2"    "KierFlex"  "Q_VSA_POS"
> --
> View this message in context: http://r.789695.n4.nabble.com/Simple-question-regarding-name-of-column-headers-tp2291534p2291534.html

--
Jim Holtman
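When the column name is held in a character variable rather than typed literally, the same result can be obtained by indexing on names. This is only a sketch: the data frame values below are invented, and only the column names are taken from the question.

```r
# Made-up data; only the column names come from the question
miceTrainSample <- data.frame(
  b_double  = 1:3,
  KierA2    = 4:6,
  KierFlex  = 7:9,
  Q_VSA_POS = 10:12,
  pID50     = 13:15
)

drop_col <- "pID50"  # name of the column to remove, as a string

# keep every column whose name is not in drop_col
reduced <- miceTrainSample[, setdiff(names(miceTrainSample), drop_col), drop = FALSE]
names(reduced)
```

The drop = FALSE argument keeps the result a data frame even if only one column is left.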
Re: [R] Help with a problem
0
15 2010-05-13 9030 16
16 2010-05-14 8682 16
17 2010-05-15 8440 15

What I am trying to do is sort by ds and take rows 1,7 to see if c1 is at least 100 AND c2 is at least 8. If it is not, check rows 2,8, and if not there, rows 3,9, and so on until it loops over the entire file. If it finds a set that matches, set a new variable equal to 1; if it never finds a match, set it equal to 0.

I have done this in Stata, but on this project we are trying to use R. Is this something that can be done in R? If so, could someone point me in the correct direction?

Thanks,
Michael Hess
University of Michigan Health System

**
Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues

--
Jim Holtman
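The question leaves some room for interpretation, so the following is only a sketch under stated assumptions: "rows 1,7" is read as a window of 7 consecutive rows (1 to 7, then 2 to 8, and so on), and a window matches when every row in it has c1 >= 100 and c2 >= 8. The data values are invented; only the column names ds, c1 and c2 come from the question.

```r
# Made-up data: column names follow the question, the values are invented
dat <- data.frame(
  ds = as.Date("2010-05-01") + 0:11,
  c1 = rep(9000, 12),
  c2 = c(16, 16, 5, 16, 16, 16, 16, 16, 16, 16, 5, 16)
)
dat <- dat[order(dat$ds), ]            # sort by date first

win <- 7                               # ASSUMPTION: a window is 7 consecutive rows
ok  <- dat$c1 >= 100 & dat$c2 >= 8     # ASSUMPTION: every row in the window must pass

# first row of each possible window (none if there are fewer than 'win' rows)
starts <- seq_len(max(0, nrow(dat) - win + 1))
hits <- vapply(starts, function(i) all(ok[i:(i + win - 1)]), logical(1))

flag <- as.integer(any(hits))          # 1 if some window matches, 0 otherwise
flag
```

If the intended rule is different (for example, a condition on the window total), only the line that builds hits needs to change.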
Re: [R] Missing value
Do 'str(t1)' to see what the returned value is. Most likely one of the comparisons in the 'if' statement is evaluating to NA.

Also PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

2010/7/19 A. Fırat ÖZDEMİR firat.ozde...@deu.edu.tr:
> Hi,
> I have code like this:
>
> tau <- 0
> for (i in 1:10) {
>   x <- rnorm(20,0,1)
>   #
>   t1 = onesampb(x,est=tauloc,SEED=F)$conf.interval
>   if (t1[1] > 0 || t1[2] < 0) tau = tau + 1
> }
> print(tau)
>
> This code gives me
>
> Error in if (t1[1] > 0 || t1[2] < 0) tau = tau + 1 :
>   missing value where TRUE/FALSE needed
>
> What can be done with such a warning message? I tried x <- x[!is.na(x)] but it didn't work.
>
> Best regards,
> Fırat
>
> Asst. Prof. Dr. A. Fırat ÖZDEMİR
> DEÜ Faculty of Arts and Sciences, Department of Statistics
> Tel: 232-412 85 52
> Fax: 232-453 42 65

--
Jim Holtman
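One way to keep the loop from stopping when an interval bound is NA is to test for missing values before the comparison. The sketch below uses hand-made intervals as a stand-in for the output of onesampb, which is not reproduced here; the comparison directions (lower bound above 0 or upper bound below 0) are an assumption, since the operators are not legible in the archived message.

```r
# Stand-in intervals (lower, upper); these replace onesampb()$conf.interval
intervals <- list(c(0.1, 0.9), c(NA, 0.4), c(-0.5, 0.3), c(-0.8, -0.2))

tau     <- 0   # number of intervals that exclude 0
skipped <- 0   # number of intervals with a missing bound

for (t1 in intervals) {
  if (anyNA(t1)) {          # 'if' cannot handle NA, so check first
    skipped <- skipped + 1
    next
  }
  # ASSUMPTION: count the interval when it lies entirely above or below 0
  if (t1[1] > 0 || t1[2] < 0) tau <- tau + 1
}

print(tau)
print(skipped)
```

Counting the skipped cases separately makes it visible how often the interval could not be computed, which is usually worth investigating rather than ignoring.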
Re: [R] error when copy and transform within a data frame
Look at the error messages generated and you will see:

Error in data.frame(list(a = c(1, 2, 3), b = c(2, 3, 4), c = c(2L, 1L,  :
  arguments imply differing number of rows: 3, 2
In addition: Warning messages:
1: In if (x == "Yes") { :
  the condition has length > 1 and only the first element will be used
2: In if (x == "Yes") { :
  the condition has length > 1 and only the first element will be used
3: In if (x == "No") { :
  the condition has length > 1 and only the first element will be used

This indicates that you had an 'if' statement whose condition contained more than one logical value. Take a close look at the help page for 'if'. You need to use 'ifelse' for vectorized testing:

a=c(1,2,3)
b=c(2,3,4)
c=c("Yes","No","Yes")
d=c("No","Yes","No")
df=data.frame(a,b,c,d)
# the following works fine!
df1 = transform(df, new=sapply(df[,c(1,2)], FUN = function(x) { x^2 } ))
# with 'ifelse', the following now works as well:
num_value = function(x) {
  ifelse(x == "Yes", 1, ifelse(x == "No", 0, NA))
}
df2 = transform(df, new=sapply(df[,c(3,4)], FUN = num_value ))

On Mon, Jul 19, 2010 at 4:46 AM, Al R aneva...@yahoo.com wrote:
> # trying to do a copy and a transform within a data frame, but getting the
> # "arguments imply differing number of rows" error, and I'm not sure why
> a=c(1,2,3)
> b=c(2,3,4)
> c=c("Yes","No","Yes")
> d=c("No","Yes","No")
> df=data.frame(a,b,c,d)
> # the following works fine!
> df = transform(df, new=sapply(df[,c(1,2)], FUN = function(x) { x^2 } ))
> # but the following doesn't work:
> num_value = function(x) {
>   if (x == "Yes") { return(1) }
>   else if (x == "No") { return(0) }
>   else return(NA)
> }
> df = transform(df, new=sapply(df[,c(3,4)], FUN = num_value ))
> # generates this error:
> Error in data.frame(list(a = c(1, 2, 3), b = c(2, 3, 4), c = c(2L, 1L,  :
>   arguments imply differing number of rows: 3, 2
> # thanks for the help!
> --
> View this message in context: http://r.789695.n4.nabble.com/error-when-copy-and-transform-within-a-data-frame-tp2293686p2293686.html
--
Jim Holtman
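To make the difference concrete, here is a small sketch comparing the vectorized 'ifelse' with a named lookup vector, which is another common way to recode values. The input vector and the "Maybe" value are invented for illustration.

```r
x <- c("Yes", "No", "Maybe")   # "Maybe" stands for any unexpected value

# vectorized test: one result per element of x
via_ifelse <- ifelse(x == "Yes", 1, ifelse(x == "No", 0, NA))

# alternative: look each value up in a named vector;
# names that are not present give NA
codes <- c(Yes = 1, No = 0)
via_lookup <- unname(codes[x])

via_ifelse
via_lookup
```

Both return one value per input element, which is what 'sapply' over the columns of a data frame needs in order to produce columns of the right length.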
Re: [R] ACCTGMX to 1223400 in R?
Here is another way of doing it with 'chartr'. I assume that you only have upper-case characters, but you can add to both strings to cover any others:

tst <- rep("ACCTGMX", 5)
chartr("ACTGBDEFHIJKLMNOPQRSUVWXYZ", "12340000000000000000000000", tst)
[1] "1223400" "1223400" "1223400" "1223400" "1223400"

On Mon, Jul 19, 2010 at 5:31 PM, John1983 sandhya_prabhaka...@yahoo.com wrote:
> Hi,
> I am a newbie in R and was working on some DNA data represented as strings of A, C, T and G (also wildcard characters like M and X). I use the Bioconductor package in R.
> Currently I need to convert a string of the form ACCTGMX to 1223400, i.e. A is replaced by 1, C by 2, T by 3, G by 4 and any other character by 0.
> I checked 'replace' and also a function called 'copySubstitute' found in the Biobase package, but this is only for files. The data here is a string (ACCTGMX) and we need to convert it to yet another string (1223400).
> Now I use the strsplit function to split ACCTGM into A C C T G M and then use 'which' to assign the corresponding numbers.
> Is there a faster way to do this or some function I can make use of?
> Please advise.
> Thank you.
> --
> View this message in context: http://r.789695.n4.nabble.com/ACCTGMX-to-1223400-in-R-tp2294636p2294636.html

--
Jim Holtman
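If listing every other letter is inconvenient, a two-step variant is possible: translate only A, C, T and G with 'chartr', then replace whatever is left with 0 using 'gsub'. This is a sketch, not the method from the reply; the function name dna_to_digits is made up, and it assumes the input never contains the digits 1 to 4 to begin with.

```r
# Hypothetical helper: A->1, C->2, T->3, G->4, anything else -> 0
# ASSUMPTION: the input strings contain no digits 1-4 of their own
dna_to_digits <- function(x) {
  mapped <- chartr("ACTG", "1234", x)   # translate the four known bases
  gsub("[^1234]", "0", mapped)          # every remaining character becomes 0
}

tst <- rep("ACCTGMX", 2)
dna_to_digits(tst)
```

Because the second step uses a negated character class, lower-case letters, N, or gap characters are also mapped to 0 without having to be listed.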