Re: [R] User R to create MySQL database and table
On Sat, 22 May 2010, Waverley @ Palo Alto wrote:

> Hi, I am thinking about using R to create a database, then create table in MySQL server. Can I do that using RMySQL package?

Maybe: it is done by SQL commands which you can use *if* you have the correct privileges. However, this is R-help, not R-sig-db and discussion of non-R programming questions in detail would not be appropriate. If you do want to follow up on R-sig-db, do first study the R posting guide and provide the 'at a minimum' information requested.

> I am familiar with RMySQL, and in the online help most of the sample code assumes the database exists and transact with the table inside the database. Can someone provide me some sample code to create a database and table? Specifically create a database first, then create a table inside the database. Thanks a lot in advance.
> -- Waverley @ Palo Alto
> __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

-- Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
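For readers who land on this thread: the "SQL commands" mentioned above can be sent from R through the DBI interface that RMySQL implements. What follows is only a minimal sketch under stated assumptions. The database name "mydb", the table name "mytable", its two columns, and the credentials are hypothetical placeholders, and the account must hold the CREATE privilege. The SQL is built as plain strings first so it can be inspected before anything is sent to a server.

```r
# Hypothetical names throughout: "mydb", "mytable" and the column layout
# are placeholders for illustration, not taken from the original post.

# Build the SQL statements as plain strings so they can be inspected first.
create_db_sql <- function(db) {
  sprintf("CREATE DATABASE IF NOT EXISTS `%s`", db)
}

create_table_sql <- function(db, table) {
  sprintf(
    "CREATE TABLE IF NOT EXISTS `%s`.`%s` (id INT PRIMARY KEY, label VARCHAR(50))",
    db, table
  )
}

# Send both statements to the server. Needs the DBI and RMySQL packages,
# a running MySQL server and sufficient privileges; it is only defined
# here, not called.
create_db_and_table <- function(user, password, host = "localhost",
                                db = "mydb", table = "mytable") {
  # Connect without selecting a database, since it may not exist yet.
  con <- DBI::dbConnect(RMySQL::MySQL(),
                        user = user, password = password, host = host)
  on.exit(DBI::dbDisconnect(con))
  DBI::dbExecute(con, create_db_sql(db))
  DBI::dbExecute(con, create_table_sql(db, table))
  invisible(TRUE)
}

print(create_db_sql("mydb"))
print(create_table_sql("mydb", "mytable"))
```

Once the table exists, the usual calls such as dbWriteTable() and dbGetQuery() apply, as in the RMySQL help pages.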
[R] Selecting first 7 elements
Hi, I have a list of 100, each list has 20 elements, and I would like to select the first 7 elements in each list. Let's take the alphabet as an example.

x <- lapply(1:100, function(i) sample(LETTERS))

I tried x[[1:7]], but it doesn't work. Can anyone enlighten me on how to do such selections? Thank you.

Kang Min
Re: [R] Error in FUN(X[[1L]], ...) : STRING_ELT() can only be applied to a 'character vector', not a 'integer'
Sorry - I figured this to be a more commonly defined error than anything specific to the data/function... Thanks for looking at this. The data and function are below.

Creating a single line of the data.frame at a time will work (i.e. fold(s)). For multiple-line data.frames, an error is generated. Ideally I would like to record the output from fold(sq) in a two-column data.frame, whether it requires reading in the data to fold one line at a time or in bulk.

library(GeneRfold)
s <- "ATTATGCATCGACTAGCATCACTAG"
fold(s)
[[1]]
[1] .....
[[2]]
[1] -2.3

sq <- data.frame(c("ATGTGTGATATGCATGTACAGCATCGAC",
                   "ACTAGCACTAGCATCAGCTGTAGATAGA",
                   "ACTAGCATCGACATCATCGACATGATAG",
                   "CATCGACTACGACTACGTAGATAGATAG",
                   "ATCAGCACTACGACACATAGATAGAATA"))
fold(sq)
Error in fold(sq) : STRING_ELT() can only be applied to a 'character vector', not a 'list'

struct <- t(as.data.frame(sapply(sq[,1], fold, t=37)))
Error in FUN(X[[1L]], ...) : STRING_ELT() can only be applied to a 'character vector', not a 'integer'

dput(fold, file="fred123")
function (s, t = 37) { .Call("foldR", s, t, PACKAGE = "GeneRfold") }

dput(sq)
structure(list(c..ATGTGTGATATGCATGTACAGCATCGACACTAGCACTAGCATCAGCTGTAGATAGA... = structure(c(4L, 1L, 2L, 5L, 3L), .Label = c("ACTAGCACTAGCATCAGCTGTAGATAGA", "ACTAGCATCGACATCATCGACATGATAG", "ATCAGCACTACGACACATAGATAGAATA", "ATGTGTGATATGCATGTACAGCATCGAC", "CATCGACTACGACTACGTAGATAGATAG"), class = "factor")), .Names = "c..ATGTGTGATATGCATGTACAGCATCGACACTAGCACTAGCATCAGCTGTAGATAGA...", row.names = c(NA, -5L), class = "data.frame")

dput(s)
"ATTATGCATCGACTAGCATCACTAG"

sessionInfo()
R version 2.11.0 (2010-04-22)
i386-apple-darwin9.8.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] GeneRfold_1.6.0 GeneR_2.18.0
loaded via a namespace (and not attached):
[1] tools_2.11.0
Re: [R] Increasing the maximum number of rows
Might there be a limit?

c <- matrix(1:10000, ncol=200)
dim(c)
[1] 50 200
c <- matrix(1:1000000000, ncol=200)
Error: cannot allocate vector of size 3.7 Gb

- A R learner.
[R] need help in understanding R code, and maybe some math
Hi, I am trying to implement Higham's algorithm for correcting a non positive definite covariance matrix. I found this code in R: http://projects.cs.kent.ac.uk/projects/cxxr/trac/browser/trunk/src/library/Recommended/Matrix/R/nearPD.R?rev=637

I managed to understand most of it, the only line I really don't understand is this one:

X <- tcrossprod(Q * rep(d[p], each=nrow(Q)), Q)

This line is supposed to calculate the matrix product Q*D*Q^T, Q is an n by m matrix and R is a diagonal n by n matrix. What does this mean? I also don't understand the meaning of a cross product between matrices, I only know it between vectors.

Thanks, Barisdad.
[R] importing columns as factors
I have a large csv table I am trying to read into R. I would like each column to be of type factor. However, most columns have only numeral entries (e.g. likert scales), so are automatically imported as type numeric. Is there a way to convert ALL columns to be of type factor, without having to convert each column manually? Cheers, Caitlin
Re: [R] Selecting first 7 elements
Kang Min wrote:

> Hi, I have a list of 100, each list has 20 elements, and I would like to select the first 7 elements in each list. Let's take the alphabet as an example.
> x <- lapply(1:100, function(i) sample(LETTERS))
> I tried x[[1:7]], but it doesn't work. Can anyone enlighten me on how to do such selections?

"[" is a function, and you want to use it on each element of the list, so...

lapply(x, "[", c(1:7))
Re: [R] Selecting first 7 elements
> "[" is a function, and you want to use it on each element of the list, so...
> lapply(x, "[", c(1:7))

and the call to c() is of course not necessary, since : will generate a vector.
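A small self-contained check of the suggestion above, for anyone who wants to try it. The seed and the head() variant are additions for illustration only; head() is one alternative that some find easier to read.

```r
set.seed(1)  # arbitrary seed, only to make the example repeatable

# 100 list elements, each a random permutation of the 26 capital letters
x <- lapply(1:100, function(i) sample(LETTERS))

# "[" is the subsetting function; the extra argument 1:7 is passed on to it
first7 <- lapply(x, "[", 1:7)

# Equivalent alternative using head()
first7_head <- lapply(x, head, 7)

print(first7[[1]])
print(identical(first7, first7_head))
```

Note that x[[1:7]] fails because [[ with a vector index tries to index recursively into nested lists rather than selecting seven elements.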
[R] How sample without replacement on more than one variables?
Hello All, sample() only samples from one variable x. But I'm interested in sampling from more than one variable without replacement. Suppose I have 3 vectors x, y, z. I want to draw samples from all three vectors such that the combination of the three elements in each draw is not the same as in any previous draw. I could use expand.grid to generate all combinations of the three vectors. But when the number of vectors is large and the number of elements in some vectors is large, it will be infeasible to do so. If you know of a method for sampling from more than one variable, would you please let me know? Thank you! -- Tom
Re: [R] importing columns as factors
Caitlin Sadowski wrote:

> I have a large csv table I am trying to read into R. I would like each column to be of type factor. However, most columns have only numeral entries (e.g. likert scales), so are automatically imported as type numeric. Is there a way to convert ALL columns to be of type factor, without having to convert each column manually?

It's in the help file for ?read.csv, the colClasses argument:

colClasses: character. A vector of classes to be assumed for the columns. Recycled as necessary, or if the character vector is named, unspecified values are taken to be ‘NA’.
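A short sketch of how the colClasses argument can be used here. The inline CSV text and its column names (q1, q2, q3) are made up for the example. Because colClasses is recycled, a single "factor" applies to every column. The second half shows one way to convert a data frame that has already been read.

```r
# Made-up data standing in for the real file
csv_text <- "q1,q2,q3\n1,5,a\n2,4,b\n1,3,a\n"

# Read every column as a factor; the single value is recycled across columns
d <- read.csv(text = csv_text, colClasses = "factor")
str(d)

# Alternative for a data frame that is already in memory:
# d2[] keeps the data.frame structure while replacing each column
d2 <- read.csv(text = csv_text)
d2[] <- lapply(d2, factor)
str(d2)
```

With a real file, the call would be read.csv("yourfile.csv", colClasses = "factor"), where the file name is whatever yours happens to be.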
Re: [R] Selecting first 7 elements
Thanks a lot, it works!

On May 23, 3:10 pm, Erik Iverson er...@ccbr.umn.edu wrote:

> > "[" is a function, and you want to use it on each element of the list, so...
> > lapply(x, "[", c(1:7))
> and the call to c() is of course not necessary, since : will generate a vector.
Re: [R] Error in FUN(X[[1L]], ...) : STRING_ELT() can only be applied to a 'character vector', not a 'integer'
Hello,

sedm1000 wrote:

> Sorry - I figured that this to be a more common defined error than anything specific to the data/function... Thanks for looking at this. The data and function are below. Creating a single line of the data.frame at a time will work (i.e. fold(s)) For multiple line data.frames, an error is generated. Ideally I would like to record the output from fold(sq) in a two column data.frame, whether it requires reading in the data to fold one line at a time or in bulk.
> library(GeneRfold)
> s <- "ATTATGCATCGACTAGCATCACTAG"
> fold(s)
> [[1]]
> [1] .....
> [[2]]
> [1] -2.3
> sq <- data.frame(c("ATGTGTGATATGCATGTACAGCATCGAC",
>                    "ACTAGCACTAGCATCAGCTGTAGATAGA",
>                    "ACTAGCATCGACATCATCGACATGATAG",
>                    "CATCGACTACGACTACGTAGATAGATAG",
>                    "ATCAGCACTACGACACATAGATAGAATA"))
> fold(sq)
> Error in fold(sq) : STRING_ELT() can only be applied to a 'character vector', not a 'list'
> struct <- t(as.data.frame(sapply(sq[,1], fold, t=37)))
> Error in FUN(X[[1L]], ...) : STRING_ELT() can only be applied to a 'character vector', not a 'integer'

This appears to be a Bioconductor package, so if this doesn't help, I'd ask on the specific bioconductor mailing list. I don't have the package installed, so take the following advice with that in mind.

Did you look at the str(sq)? It is not a character vector, it is a factor, so you might need to convert or see stringsAsFactors in ?options. Try

lapply(sq[, 1], function(x) fold(as.character(x)))

If that doesn't work, try the other list.

Good luck, Erik
Re: [R] How sample without replacement on more than one variables?
thmsfuller...@gmail.com wrote:

> Hello All, sample() only sample on one variable x. But I'm interested in sampling more than one variable without replacement. Suppose I have 3 vectors x, y, z. I want to draw samples from all three vectors such that the combination of the three elements in each draw is not the same as any previous draws. I could use expand.grid to generate a vector out of the three vectors. But when the number of vectors are large and the number of elements in some vectors are large, it will be infeasible to do so. If you know there is a method on sampling on more than one variables, would you please let me know? Thank you!

Can you give a reproducible example? Since you suggested the method that is most reasonable, but it will not work in large cases, I suppose you'll have to draw independently from each vector one at a time, then somehow concatenate the results, perhaps as a character vector, even if the vectors are, say, integers. Then repeat this process, checking each time if your new vector is %in% the vector of previous draws. There may be a much better way, too, see if anyone else responds.

Also, you'll have to think about what a unique sample is. If x <- 1:3 and y <- 2:4, is x = 2, y = 3 the same as x = 3, y = 2?

Good luck, Erik
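The draw-and-check idea described above could look roughly like the following. This is a sketch, not a tuned implementation: sample_unique_combos is a hypothetical helper name, the three example vectors are invented, and a combination is treated as ordered (x = 2, y = 3 differs from x = 3, y = 2). It draws one position per vector, pastes the positions into a key, and rejects keys already seen, so the full grid is never built.

```r
# Hypothetical helper: draw n distinct combinations, one element per vector,
# by rejection sampling instead of enumerating expand.grid().
sample_unique_combos <- function(vectors, n) {
  total <- prod(sapply(vectors, length))   # number of possible combinations
  stopifnot(n <= total)
  seen <- character(0)                     # keys of combinations already drawn
  draws <- vector("list", n)
  k <- 0
  while (k < n) {
    # one random position per vector
    idx <- sapply(vectors, function(v) sample.int(length(v), 1))
    key <- paste(idx, collapse = "-")
    if (!(key %in% seen)) {                # keep only combinations not seen before
      k <- k + 1
      seen[k] <- key
      draws[[k]] <- idx
    }
  }
  do.call(rbind, draws)                    # n x length(vectors) matrix of positions
}

set.seed(123)  # arbitrary seed for repeatability
vecs <- list(x = 1:3, y = 2:4, z = c("a", "b"))
pos <- sample_unique_combos(vecs, 10)

# Translate positions back into the actual values
picked <- data.frame(x = vecs$x[pos[, "x"]],
                     y = vecs$y[pos[, "y"]],
                     z = vecs$z[pos[, "z"]])
print(picked)
```

Rejection sampling stays cheap while n is small relative to the number of possible combinations; as n approaches that total, most draws get rejected and enumerating the grid becomes the better option again.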
Re: [R] How sample without replacement on more than one variables?
This might help, depending on your exact needs:

v1 <- sample(letters[1:2], 10, replace=TRUE)
v2 <- sample(letters[3:4], 10, replace=TRUE)
v3 <- sample(letters[5:6], 10, replace=TRUE)
aa <- data.frame(v1=v1, v2=v2, v3=v3)
aa
   v1 v2 v3
1   a  d  e
2   a  d  e
3   a  c  e
4   b  d  e
5   b  d  f
6   a  c  f
7   a  c  f
8   a  c  f
9   a  c  e
10  b  c  e

bb <- unique(aa)
bb
   v1 v2 v3
1   a  d  e
3   a  c  e
4   b  d  e
5   b  d  f
6   a  c  f
10  b  c  e

You can sample from the bb dataframe, or from the corresponding rows of the aa dataframe that are unique (1, 3, 4, 5, 6 and 10) which can be obtained via rownames(bb).

Hth, Adrian
Re: [R] Increasing the maximum number of rows
You are trying to create an object with 1G elements. Given that these are integers, this will require about 4GB of space. If you are running on a 32-bit system, which has a total physical limit of 2-3GB depending on what options you are running (at least on Windows), then you have exceeded the limits. It is a good idea to limit your largest object to about 25% of physical memory in case copies have to be made during some of the analysis.

On Sat, May 22, 2010 at 10:31 PM, Wu Gong gho...@gmail.com wrote:

> Might there be a limit?
> c <- matrix(1:10000, ncol=200)
> dim(c)
> [1] 50 200
> c <- matrix(1:1000000000, ncol=200)
> Error: cannot allocate vector of size 3.7 Gb
> - A R learner.

-- Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem that you are trying to solve?
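The figure in the error message can be checked with simple arithmetic. The sketch below assumes 4 bytes per R integer, which holds on current platforms, and ignores the small fixed per-object header.

```r
n_elements <- 1e9          # "1G elements", as described above
bytes_per_integer <- 4     # R stores integers in 4 bytes

# Size in binary gigabytes (2^30 bytes), the unit used in R's error message
size_gib <- n_elements * bytes_per_integer / 2^30
print(round(size_gib, 1))

# For comparison, a 10000-element integer vector is tiny
small <- integer(10000)
print(object.size(small), units = "Kb")
```

This is where the "3.7 Gb" in the error comes from, and why the allocation cannot succeed within a 2-3GB address space.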
Re: [R] need help in understanding R code, and maybe some math
On 2010-05-23 0:56, john smith wrote:

> Hi, I am trying to implement Higham's algorithm for correcting a non positive definite covariance matrix. I found this code in R: http://projects.cs.kent.ac.uk/projects/cxxr/trac/browser/trunk/src/library/Recommended/Matrix/R/nearPD.R?rev=637
> I managed to understand most of it, the only line I really don't understand is this one:
> X <- tcrossprod(Q * rep(d[p], each=nrow(Q)), Q)
> This line is supposed to calculate the matrix product Q*D*Q^T, Q is an n by m matrix and R is a diagonal n by n matrix. What does this mean? I also don't understand the meaning of a cross product between matrices, I only know it between vectors.

You could have a look at the help page for crossprod which gives the definitions of crossprod and tcrossprod. Perhaps this will help:

Q <- matrix(1:12, ncol=3)
v <- rep(1:3, each=nrow(Q))
Q
v
Q * v
(Q * v) %*% t(Q)
tcrossprod(Q * v, Q)

-Peter Ehlers

> Thanks, Barisdad.
Re: [R] Increasing the maximum number of rows
Hello Jim,

It sounds like a good time to go read about the packages bigmemory and/or ff.

Best, Tal

Contact Details:
Contact me: tal.gal...@gmail.com | 972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English)

On Sun, May 23, 2010 at 12:31 PM, jim holtman jholt...@gmail.com wrote:

> You are trying to create an object with 1G elements. Given that these are integers, this will require about 4GB of space. If you are running on a 32-bit system, which has a total physical limit of 2-3GB depending on what options you are running (at least on Windows), then you have exceeded the limits. It is a good idea to limit your largest object to about 25% of physical memory in case copies have to be made during some of the analysis.
>
> On Sat, May 22, 2010 at 10:31 PM, Wu Gong gho...@gmail.com wrote:
> > Might there be a limit?
> > c <- matrix(1:10000, ncol=200)
> > dim(c)
> > [1] 50 200
> > c <- matrix(1:1000000000, ncol=200)
> > Error: cannot allocate vector of size 3.7 Gb
> > - A R learner.
>
> -- Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
> What is the problem that you are trying to solve?
Re: [R] How sample without replacement on more than one variables?
On Sun, 2010-05-23 at 00:56 -0700, dusadrian wrote:

> This might help, depending on your exact needs:
> v1 <- sample(letters[1:2], 10, replace=TRUE)
> v2 <- sample(letters[3:4], 10, replace=TRUE)
> v3 <- sample(letters[5:6], 10, replace=TRUE)
> aa <- data.frame(v1=v1, v2=v2, v3=v3)

And now it is simple: sample the rows of the data frame.

aa[sample(1:nrow(aa), 3), ]

-- Bernardo Rangel Tura, M.D, MPH, Ph.D
National Institute of Cardiology
Brazil
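Putting the two replies together: sampling rows of aa directly can still return repeated combinations, because aa itself contains duplicate rows, so sampling from unique(aa) is the safer variant. A minimal sketch follows, with an arbitrary seed and a guard (n_draw) added in case fewer than three distinct rows exist.

```r
set.seed(2010)  # arbitrary seed for repeatability
v1 <- sample(letters[1:2], 10, replace = TRUE)
v2 <- sample(letters[3:4], 10, replace = TRUE)
v3 <- sample(letters[5:6], 10, replace = TRUE)
aa <- data.frame(v1 = v1, v2 = v2, v3 = v3)

# Keep each observed combination once
bb <- unique(aa)

# Draw up to 3 distinct combinations without replacement
n_draw <- min(3, nrow(bb))
picked <- bb[sample(nrow(bb), n_draw), ]
print(picked)
```

Note that this samples only from combinations that happen to occur in aa, not from all combinations that are theoretically possible.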
[R] Re : Re : Re : Nomogram with multiple interactions (package rms)
Thanks for the answer. Unfortunately, I'm not yet skilled enough to do such a thing. I had a look at the code and I'll try to understand it, as a good exercise.

I thought about sending fake fit objects to nomogram(), derived from the original one:
- original : f2 <- cph(Surv(d.time,death) ~ sex*(rcs(cholesterol,4)+blood.pressure))
- manually derived :
  * fMale : with coef rcs(cholesterol,4) and blood.pressure from f2, no sex effect
  * fFemale : with aggregated coef sex:rcs(cholesterol,4) for cholesterol and sex:blood.pressure for BP and an obligatory sex effect.

But I failed to fool your function. Had to try though...

Marc

- Original message
From: Frank E Harrell Jr f.harr...@vanderbilt.edu
To: Marc Carpentier marc.carpent...@ymail.com
Cc: r-help-request Mailing List r-help@r-project.org
Sent: Thu 20 May 2010, 15:30:27
Subject: Re: Re : Re : [R] Nomogram with multiple interactions (package rms)

> On 05/20/2010 01:42 AM, Marc Carpentier wrote:
> > Thank you for your responses, but I don't think you're right about the doc... I carefully looked at it before posting and ran the examples, looked in Vanderbilt Biostat doc, and just looked again at example(nomogram) :
> > 1st example : categorical*continuous : two axes for each sex
> > f <- lrm(y ~ lsp(age,50)+sex*rcs(cholesterol,4)+blood.pressure)
>
> Hi Marc, My apologies; I misread my own example. This will take some digging into the code. If you have time to do this before I do, code change suggestions welcomed.
> Frank
>
> > 2nd : continuous*continuous : one age axis for each specified value of cholesterol
> > g <- lrm(y ~ sex + rcs(age,3)*rcs(cholesterol,3))
> > 3rd and 4th : categorical*continuous : two axes for each sex (4th with fun)
> > f <- psm(Surv(d.time,death) ~ sex*age, dist='lognormal')
> > 5th : categorical*continuous : two axes for each sex (with fun)
> > g <- lrm(Y ~ age+rcs(cholesterol,4)*sex)
> > I'm desperately trying to represent a case of categorical*(continuous+continuous) :
> > f2 <- cph(Surv(d.time,death) ~ sex*(rcs(cholesterol,4)+blood.pressure))
> > The best solution I can think of is to draw one nomogram for each sex. Assuming 'male' is the ref level of sex :
> > 1st nomogram : one axis for rcs(cholesterol,4), one axis for blood.pressure
> > 2nd nomogram : one axis for sex:rcs(cholesterol,4), one axis for sex:blood.pressure, both shifted because of the effect of sex itself. (I drew it badly in my previous mail.)
> > I didn't see any example of this adjustment of a nomogram to 'male' or 'female'... I hope I gave a clearer explanation and I'm not wrong about this unmentioned case.
> > Marc

- Original message
From: Frank E Harrell Jr f.harr...@vanderbilt.edu
To: Marc Carpentier marc.carpent...@ymail.com
Cc: r-help-request Mailing List r-help@r-project.org
Sent: Thu 20 May 2010, 0:55:32
Subject: Re: Re : [R] Nomogram with multiple interactions (package rms)

> On 05/19/2010 04:36 PM, Marc Carpentier wrote:
> > I'm sorry. I don't understand the omit solution, and maybe I misled you with my explanation. With the data from the f example of nomogram() : Let's declare :
> > f2 <- cph(Surv(d.time,death) ~ sex*(age+blood.pressure))
> > I guess the best (and maybe the only) way to represent it with a nomogram is to plot two nomograms (I couldn't find better). Is there a way to have :
> > Nomogram1 : male :
> > - points 1-100 ---
> > - age (men) ---
> > - blood.pressure (men) ---
> > - linear predictor ---
> > And nomogram2 : female :
> > - points 1-100 ---
> > - age (female) ---
> > - blood.pressure (female) ---
> > - linear predictor ---
> > As I said, I tried and failed (nomogram() still wants me to define interact=list(...)) with :
> > plot(nomogram(f2, adj.to=list(sex="male"))) # and "female" for the other one
> > Marc
>
> I think the documentation tells you how to do this. But you failed to look at the output from example(nomogram). In one of the examples two continuous predictors have two axes each, with male and female in close proximity. Or maybe I'm just missing your point.
> Frank

- Original message
From: Frank E Harrell Jr f.harr...@vanderbilt.edu
To: Marc Carpentier marc.carpent...@ymail.com; r-help-request Mailing List r-help@r-project.org
Sent: Wed 19 May 2010, 22:28:51
Subject: Re: [R] Nomogram with multiple interactions (package rms)

> On 05/19/2010 03:17 PM, Marc Carpentier wrote:
> > Dear list, I'm facing the following problem : A cox model with my sex variable interacting with several continuous variables : cph(S~sex*(x1+x2+x3)) And I'd like to make a nomogram. I know it's a bit tricky and one might argue that a nomogram is not a good choice... I could use the parameter interact=list(sex=c("male","female"),x1=c(a,b,c))... but with rcs or pol transformations of x1, x2 and x3, the choice of the categorization (a,b,c,...) is arbitrary and the nomogram not so useful... Considering that sex is the problem, I thought I could draw two nomograms, one for each
Re: [R] need help in understanding R code, and maybe some math
On Sun, May 23, 2010 at 5:09 AM, Peter Ehlers ehl...@ucalgary.ca wrote:

> On 2010-05-23 0:56, john smith wrote:
> > Hi, I am trying to implement Higham's algorithm for correcting a non positive definite covariance matrix. I found this code in R: http://projects.cs.kent.ac.uk/projects/cxxr/trac/browser/trunk/src/library/Recommended/Matrix/R/nearPD.R?rev=637
> > I managed to understand most of it, the only line I really don't understand is this one:
> > X <- tcrossprod(Q * rep(d[p], each=nrow(Q)), Q)
> > This line is supposed to calculate the matrix product Q*D*Q^T, Q is an n by m matrix and R is a diagonal n by n matrix. What does this mean? I also don't understand the meaning of a cross product between matrices, I only know it between vectors.

In the original S language, on which R is based, the function named crossprod was used for what statisticians view as the cross-product of the columns of a matrix, such as a multivariate data matrix or a model matrix. That is

crossprod(X) := X'X

This is a special case of the cross-product of the columns of two matrices with the same number of rows

crossprod(X, Y) := X'Y

The tcrossprod function was introduced more recently to mean the crossprod of the transpose of X. That is

tcrossprod(X) := crossprod(t(X)) := X %*% t(X)

These definitions are unrelated to the cross-product of vectors used in Physics and related disciplines. The reason for creating such functions is that these are common operations in statistical computing and it helps to know the special structure (e.g. the result of crossprod(X) or tcrossprod(X) is a symmetric, positive semidefinite matrix).

> You could have a look at the help page for crossprod which gives the definitions of crossprod and tcrossprod. Perhaps this will help:
> Q <- matrix(1:12, ncol=3)
> v <- rep(1:3, each=nrow(Q))
> Q
> v
> Q * v
> (Q * v) %*% t(Q)
> tcrossprod(Q * v, Q)
> -Peter Ehlers
> > Thanks, Barisdad.
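To tie the definitions above back to the line in nearPD.R, here is a small numerical check. The matrix Q and the diagonal entries d are made-up example values. Multiplying Q elementwise by rep(d, each = nrow(Q)) scales column j of Q by d[j], which is the same as Q %*% diag(d), so the tcrossprod call yields Q D Q' without ever forming the diagonal matrix.

```r
Q <- matrix(1:12, ncol = 3)        # example 4 x 3 matrix
d <- c(2, 5, 7)                    # example diagonal of D, one entry per column of Q

# Elementwise product: rep(d, each = nrow(Q)) lines up d[j] with column j
scaled <- Q * rep(d, each = nrow(Q))

# tcrossprod(A, B) is A %*% t(B), so this is (Q D) Q'
X_fast <- tcrossprod(scaled, Q)

# The textbook form, building the diagonal matrix explicitly
X_plain <- Q %*% diag(d) %*% t(Q)

print(all.equal(X_fast, X_plain))
print(isSymmetric(X_fast))

# The one-argument forms from the definitions above
print(all.equal(crossprod(Q), t(Q) %*% Q))
print(all.equal(tcrossprod(Q), Q %*% t(Q)))
```

The elementwise form avoids both the explicit transpose and the mostly-zero diagonal matrix, which matters when Q is large.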
Re: [R] Regression with sparse matricies
As Frank mentioned in his reply, expecting to estimate tens of thousands of fixed-effects parameters in a logistic regression is optimistic. You could start with a generalized linear mixed model instead

library(lme4)
fm1 <- glmer(resp ~ 1 + (1|f1) + (1|f2) + (1|f1:f2), mydata, binomial)

If you have difficulty with that it might be best to switch the discussion to the r-sig-mixed-mod...@r-project.org mailing list.

On Sat, May 22, 2010 at 2:19 PM, Robin Jeffries rjeffr...@ucla.edu wrote:

> I would like to run a logistic regression on some factor variables (main effects and eventually an interaction) that are very sparse. I have a moderately large dataset, ~100k observations with 1500 factor levels for one variable (x1) and 600 for another (X2), creating ~19000 levels for the interaction (X1:X2). I would like to take advantage of the sparseness in these factors to avoid using GLM. Actually glm is not an option given the size of the design matrix. I have looked through the Matrix package as well as other packages without much help. Is there some option, some modification of glm, some way that it will recognize a sparse matrix and avoid large matrix inversions?
> -Robin
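On the storage side of the question only: the Matrix package can build the design matrix for factor predictors in sparse form. The sketch below uses small simulated factors (the sizes, seed, and names f1 and f2 are made up) and does not fit any model; it only shows that the sparse and dense design matrices hold the same values while the sparse one takes less memory. It assumes the Matrix package is installed, which is the case for standard R installations.

```r
library(Matrix)

set.seed(42)  # arbitrary seed for repeatability
d <- data.frame(
  f1 = factor(sample(1:50, 500, replace = TRUE)),
  f2 = factor(sample(1:30, 500, replace = TRUE))
)

# Sparse design matrix: only the non-zero indicator entries are stored
X_sparse <- sparse.model.matrix(~ f1 + f2, data = d)

# Ordinary dense design matrix for comparison
X_dense <- model.matrix(~ f1 + f2, data = d)

print(dim(X_sparse))
print(object.size(X_sparse), units = "Kb")
print(object.size(X_dense), units = "Kb")
```

This does not change the statistical point made above: with tens of thousands of levels, random effects are likely to be the more workable formulation than fixed effects, however the matrix is stored.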
Re: [R] Error in FUN(X[[1L]], ...) : STRING_ELT() can only be applied to a 'character vector', not a 'integer'
On May 23, 2010, at 3:27 AM, Erik Iverson wrote:

> Hello,
> sedm1000 wrote:
> > Sorry - I figured that this to be a more common defined error than anything specific to the data/function... Thanks for looking at this. The data and function are below. Creating a single line of the data.frame at a time will work (i.e. fold(s)) For multiple line data.frames, an error is generated. Ideally I would like to record the output from fold(sq) in a two column data.frame, whether it requires reading in the data to fold one line at a time or in bulk.
> > library(GeneRfold)
> > s <- "ATTATGCATCGACTAGCATCACTAG"
> > fold(s)
> > [[1]]
> > [1] .....
> > [[2]]
> > [1] -2.3
> > sq <- data.frame(c("ATGTGTGATATGCATGTACAGCATCGAC",
> >                    "ACTAGCACTAGCATCAGCTGTAGATAGA",
> >                    "ACTAGCATCGACATCATCGACATGATAG",
> >                    "CATCGACTACGACTACGTAGATAGATAG",
> >                    "ATCAGCACTACGACACATAGATAGAATA"))
> > fold(sq)

Building on Erik's comments, perhaps trying:

sq <- data.frame(s1 = c("ATGTGTGATATGCATGTACAGCATCGAC",
                        "ACTAGCACTAGCATCAGCTGTAGATAGA",
                        "ACTAGCATCGACATCATCGACATGATAG",
                        "CATCGACTACGACTACGTAGATAGATAG",
                        "ATCAGCACTACGACACATAGATAGAATA"),
                 stringsAsFactors=FALSE)

stringsAsFactors=FALSE leaves the character vector unfactored.

str(sq)
'data.frame': 5 obs. of 1 variable:
 $ s1: chr "ATGTGTGATATGCATGTACAGCATCGAC" "ACTAGCACTAGCATCAGCTGTAGATAGA" "ACTAGCATCGACATCATCGACATGATAG" "CATCGACTACGACTACGTAGATAGATAG" ...

Passing sq would still be passing a list. You probably want just the first and only column.

str(sq$s1)
 chr [1:5] "ATGTGTGATATGCATGTACAGCATCGAC" "ACTAGCACTAGCATCAGCTGTAGATAGA" ...

fold(sq$s1) # passing a character vector, which is what the error message says is needed.

-- David.

> > Error in fold(sq) : STRING_ELT() can only be applied to a 'character vector', not a 'list'
> > struct <- t(as.data.frame(sapply(sq[,1], fold, t=37)))
> > Error in FUN(X[[1L]], ...) : STRING_ELT() can only be applied to a 'character vector', not a 'integer'
>
> This appears to be a Bioconductor package, so if this doesn't help, I'd ask on the specific bioconductor mailing list. I don't have the package installed, so take the following advice with that in mind.
>
> Did you look at the str(sq)? It is not a character vector, it is a factor, so you might need to convert or see stringsAsFactors in ?options. Try
> lapply(sq[, 1], function(x) fold(as.character(x)))
> If that doesn't work, try the other list.
> Good luck, Erik

David Winsemius, MD
West Hartford, CT
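The factor-versus-character point can be tried without GeneRfold installed. In the sketch below, needs_character is a hypothetical stand-in that, like fold(), accepts only character input; it simply returns the string length and says nothing about how fold() itself behaves. The example also shows one way to collect per-sequence results in a two-column data frame, which was the original goal.

```r
# Hypothetical stand-in for fold(): rejects anything but character input
needs_character <- function(s) {
  stopifnot(is.character(s))
  nchar(s)
}

seqs <- c("ATGTGTGATATGCATGTACAGCATCGAC", "ACTAGCACTAGCATCAGCTGTAGATAGA")

# The 2010 default (stringsAsFactors = TRUE) turns strings into a factor
sq_factor <- data.frame(s1 = seqs, stringsAsFactors = TRUE)
# Keeping the column as character avoids the problem at the source
sq_char <- data.frame(s1 = seqs, stringsAsFactors = FALSE)

print(class(sq_factor$s1))
print(class(sq_char$s1))

# Apply the function to one sequence at a time and keep input and result together
res <- data.frame(
  sequence = sq_char$s1,
  value = sapply(sq_char$s1, needs_character, USE.NAMES = FALSE),
  stringsAsFactors = FALSE
)
print(res)

# If the data frame already holds a factor, convert before calling
res_from_factor <- sapply(as.character(sq_factor$s1), needs_character,
                          USE.NAMES = FALSE)
```

Since R 4.0.0 the default is stringsAsFactors = FALSE, so data.frame() no longer converts strings to factors unless asked to.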
Re: [R] Re : Re : Re : Nomogram with multiple interactions (package rms)
On 05/23/2010 06:29 AM, Marc Carpentier wrote:
> Thanks for the answer. Unfortunately, I'm not yet skilled enough to do
> such a thing. I had a look at the code and I'll try to understand it,
> as a good exercise.
> I thought about sending fake fit objects to nomogram(), derived from
> the original one:
> - original:
>     f2 <- cph(Surv(d.time,death) ~ sex*(rcs(cholesterol,4)+blood.pressure))
> - manually derived:
>   * fMale: with coef rcs(cholesterol,4) and blood.pressure from f2, no
>     sex effect
>   * fFemale: with aggregated coef sex:rcs(cholesterol,4) for cholesterol
>     and sex:blood.pressure for BP, and an obligatory sex effect.
> But I failed to fool your function. Had to try though...
> Marc

Marc,

Although this feature should really be implemented or fixed in
nomogram(), you can always use ols to predict (with an R^2 of 1.0) the
linear predictor from predict(cph fit), setting a variable to a constant
in the newdata argument to predict, and not using that variable to
predict the linear predictor. Then you can make a nomogram from the ols
model.

Frank

----- Original message -----
From: Frank E Harrell Jr <f.harr...@vanderbilt.edu>
To: Marc Carpentier <marc.carpent...@ymail.com>
Cc: r-help-request Mailing List <r-help@r-project.org>
Sent: Thu 20 May 2010, 15:30:27
Subject: Re: Re : Re : [R] Nomogram with multiple interactions (package rms)

On 05/20/2010 01:42 AM, Marc Carpentier wrote:
> Thank you for your responses, but I don't think you're right about the
> doc... I carefully looked at it before posting and ran the examples,
> looked in the Vanderbilt Biostat doc, and just looked again at
> example(nomogram):
> 1st example: categorical*continuous: two axes for each sex
>   f <- lrm(y ~ lsp(age,50)+sex*rcs(cholesterol,4)+blood.pressure)

Hi Marc,

My apologies; I misread my own example. This will take some digging into
the code. If you have time to do this before I do, code change
suggestions welcomed.

Frank

> 2nd: continuous*continuous: one age axis for each specified value of
> cholesterol
>   g <- lrm(y ~ sex + rcs(age,3)*rcs(cholesterol,3))
> 3rd and 4th: categorical*continuous: two axes for each sex (4th with fun)
>   f <- psm(Surv(d.time,death) ~ sex*age, dist='lognormal')
> 5th: categorical*continuous: two axes for each sex (with fun)
>   g <- lrm(Y ~ age+rcs(cholesterol,4)*sex)
>
> I'm desperately trying to represent a case of
> categorical*(continuous+continuous):
>   f2 <- cph(Surv(d.time,death) ~ sex*(rcs(cholesterol,4)+blood.pressure))
> The best solution I can think of is to draw one nomogram for each sex.
> Assuming 'male' is the ref level of sex:
> 1st nomogram: one axis for rcs(cholesterol,4), one axis for
> blood.pressure
> 2nd nomogram: one axis for sex:rcs(cholesterol,4), one axis for
> sex:blood.pressure, both shifted because of the effect of sex itself.
> (I drew it badly in my previous mail.)
> I didn't see any example of this adjustment of a nomogram to 'male' or
> 'female'...
> I hope I gave a clearer explanation and I'm not wrong about this
> unmentioned case.
> Marc

----- Original message -----
From: Frank E Harrell Jr <f.harr...@vanderbilt.edu>
To: Marc Carpentier <marc.carpent...@ymail.com>
Cc: r-help-request Mailing List <r-help@r-project.org>
Sent: Thu 20 May 2010, 00:55:32
Subject: Re: Re : [R] Nomogram with multiple interactions (package rms)

On 05/19/2010 04:36 PM, Marc Carpentier wrote:
> I'm sorry. I don't understand the omit solution, and maybe I misled you
> with my explanation.
> With the data from the f example of nomogram(), let's declare:
>   f2 <- cph(Surv(d.time,death) ~ sex*(age+blood.pressure))
> I guess the best (and maybe the only) way to represent it with a
> nomogram is to plot two nomograms (I couldn't find better). Is there a
> way to have:
> Nomogram 1: male:
>   - points 1-100 ---
>   - age (men) ---
>   - blood.pressure (men) ---
>   - linear predictor ---
> And nomogram 2: female:
>   - points 1-100 ---
>   - age (female) ---
>   - blood.pressure (female) ---
>   - linear predictor ---
> As I said, I tried and failed (nomogram() still wants me to define
> interact=list(...)) with:
>   plot(nomogram(f2, adj.to=list(sex="male")))  # and female for the other one
> Marc

I think the documentation tells you how to do this. But you failed to
look at the output from example(nomogram). In one of the examples two
continuous predictors have two axes each, with male and female in close
proximity. Or maybe I'm just missing your point.

Frank

----- Original message -----
From: Frank E Harrell Jr <f.harr...@vanderbilt.edu>
To: Marc Carpentier <marc.carpent...@ymail.com>; r-help-request Mailing List <r-help@r-project.org>
Sent: Wed 19 May 2010, 22:28:51
Subject: Re: [R] Nomogram with multiple interactions (package rms)

On 05/19/2010 03:17 PM, Marc Carpentier wrote:
> Dear list,
> I'm facing the following problem:
> A cox model with my sex variable interacting with several continuous
> variables: cph(S~sex*(x1+x2+x3))
> And I'd like to make a nomogram. I know it's a bit tricky and one mights
Re: [R] Re : Indexing array to 1000
On May 22, 2010, at 10:48 PM, Mohan L wrote:

> Dear All,
>
> I have an array something like this:
>
>   avglog
>     January  February     March     April       May      June      July    August September
>       60102     83397     56774     48785     49010     40572     38175     47037     51402
>
> The class of the avglog array:
>
>   class(avglog)
>   [1] "array"
>   str(avglog)
>    num [1:9(1d)] 60102 83397 56774 48785 49010 ...
>    - attr(*, "dimnames")=List of 1
>     ..$ : chr [1:9] "January" "February" "March" "April" ...
>
> I have to normalize this avglog array to 1000. I mean, I need to
> compute 1000/avglog[1], multiply all the elements in the array by it,
> and plot a graph of Month vs Index. To achieve this I am using the code
> below. I feel there may be a simpler way to do this.

This would accomplish those two goals in two lines:

  plot(normedavglog <- 1000*avglog/avglog[1], xaxt="n")
  axis(1, at=1:9, labels=names(avglog))

-- 
David.

>   value <- matrix(avglog)
>   value
>          [,1]
>    [1,] 60102
>    [2,] 83397
>    [3,] 56774
>    [4,] 48785
>    [5,] 49010
>    [6,] 40572
>    [7,] 38175
>    [8,] 47037
>    [9,] 51402
>   day1Avg <- value[1]
>   day1Avg
>   [1] 60102
>   ID <- (1000/day1Avg)
>   ID
>   [1] 0.01663838
>   index <- value*ID
>   index
>              [,1]
>    [1,] 1000.0000
>    [2,] 1387.5911
>    [3,]  944.6275
>    [4,]  811.7034
>    [5,]  815.4471
>    [6,]  675.0524
>    [7,]  635.1702
>    [8,]  782.6195
>    [9,]  855.2461
>   monthcount <- length(avglog)
>   Month <- c(1:monthcount)
>   trend <- cbind(Month, c(index))
>   colnames(trend) <- c("Month", "Index")
>   trend
>         Month     Index
>    [1,]     1 1000.0000
>    [2,]     2 1387.5911
>    [3,]     3  944.6275
>    [4,]     4  811.7034
>    [5,]     5  815.4471
>    [6,]     6  675.0524
>    [7,]     7  635.1702
>    [8,]     8  782.6195
>    [9,]     9  855.2461
>
> Any help will be greatly appreciated.
>
> Thanks & Rg
> Mohan L

David Winsemius, MD
West Hartford, CT
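
For readers who want to run the normalization end to end, here is a minimal sketch. It assumes the monthly values can be held in a plain named numeric vector rather than the 1-d array of the original post; the names index and trend follow the poster's code, and the plot settings are only one reasonable choice.

```r
# Monthly averages as given in the post, stored as a named numeric vector.
avglog <- c(January = 60102, February = 83397, March = 56774,
            April = 48785, May = 49010, June = 40572,
            July = 38175, August = 47037, September = 51402)

# Index every month to the first one (first month = 1000).
index <- 1000 * avglog / avglog[1]
round(index, 4)

# Two-column table of month number and index.
trend <- data.frame(Month = seq_along(index), Index = as.numeric(index))

# Plot Month vs Index with month names on the x axis.
plot(trend$Month, trend$Index, type = "b", xaxt = "n",
     xlab = "Month", ylab = "Index")
axis(1, at = trend$Month, labels = names(avglog))
```

Because the arithmetic is vectorized, the intermediate matrix() and cbind() steps from the original code are not needed.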
[R] Subsetting with a list of vectors
Hi,

I have a dataset that looks like the one below.

data
  plot  plantno.  species
  H     31        ABC
  D     2         DEF
  Y     54        GFE
  E     12        ERF
  Y     98        FVD
  H     4         JKU
  J     7         JFG
  A     55        EGD
  .     .         .
  .     .         .
  .     .         .

I want to select rows belonging to 7 random plots for 100 times. (There
are 50 plots in total.) So I created a list of 100 vectors, each vector
has 7 elements.

  samp <- lapply(1:100, function(i) sample(LETTERS))
  samp2 <- lapply(samp, "[", 1:7)

How can I select the 26 plots from 'data' using 'samp'?

  samp3 <- sample(LETTERS, 7)
  samp4 <- subset(data, plot %in% samp3)       # this works
  samp5 <- subset(data, plot %in% samp2[[1]])  # this works as well, but I
                                               # used a for loop to get it
                                               # to select 7 plots 100 times.
  for (i in nrow(samp2)) {
    samp6 <- subset(data, plot %in% samp2[[i]])
  }                                            # this doesn't work

Am I missing something, or is there a better solution?

Thanks.
Kang Min
Re: [R] Subsetting with a list of vectors
try this:

> x <- read.table(textConnection("plot plantno. species
+ H 31 ABC
+ D 2 DEF
+ Y 54 GFE
+ E 12 ERF
+ Y 98 FVD
+ H 4 JKU
+ J 7 JFG
+ A 55 EGD"), header=TRUE, as.is=TRUE)
> closeAllConnections()
> # chose 10 groups of 3 sample
> choice <- lapply(1:10, function(.dummy){
+     x[sample(nrow(x),3),]
+ })
> choice
[[1]]
  plot plantno. species
3    Y       54     GFE
8    A       55     EGD
4    E       12     ERF

[[2]]
  plot plantno. species
8    A       55     EGD
2    D        2     DEF
6    H        4     JKU

[[3]]
  plot plantno. species
8    A       55     EGD
5    Y       98     FVD
4    E       12     ERF

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
Re: [R] Subsetting with a list of vectors
Thanks, but what I want is not 100 groups of 7 samples. Let's say in my
samp2 I get

[[1]]
D H K S E U O

[[2]]
H S R V A L B

etc...

I want to select all rows from 'data' containing D H K S E U O first,
then H S R V A L B and so on.
Re: [R] Subsetting with a list of vectors
On May 23, 2010, at 10:00 AM, Kang Min wrote:

> Hi, I have a dataset that looks like the one below.
>
> data
>   plot  plantno.  species
>   H     31        ABC
>   D     2         DEF
>   Y     54        GFE
>   E     12        ERF
>   Y     98        FVD
>   H     4         JKU
>   J     7         JFG
>   A     55        EGD
>   .     .         .
>
> I want to select rows belonging to 7 random plots for 100 times.

So you should be thinking about a function that will do what you want
exactly once and then wrapping it in replicate().

> (There are 50 plots in total.) So I created a list of 100 vectors, each
> vector has 7 elements.
>
>   samp <- lapply(1:100, function(i) sample(LETTERS))

Please. Minimal!!! 5 samples should be enough for testing.

>   samp2 <- lapply(samp, "[", 1:7)
>
> How can I select the 26 plots from 'data' using 'samp'?
>
>   samp3 <- sample(LETTERS, 7)

You do not want to sample from LETTERS but rather from the vector of
data named plot. Otherwise you will not be creating a representative
sample. And ... plot is a really crappy name for a column. Try to avoid
naming your columns with names that are common functions. Confusion of
the humans reading your code is the predictable result, and occasional
confusion of the R interpreter also may occur.

[After reading your reply to Holtman: Or maybe you do want to sample
from LETTERS. The fix would be obvious.]

>   samp4 <- subset(data, plot %in% samp3)  # this works

So this is what you want to do once:

  samp1 <- function() subset(data, plot %in% sample(data$plot, 7))
  samp10 <- replicate(10, samp1())

samp10[,1] will be one sampled subset. (samp10 is now an array of
lists.)

Unfortunately, I noticed that even with the minimal data example you
provided (not in reproducible form, unfortunately) I was getting 7 or 8
samples, and realized that using letters to subset was creating some
overlaps whenever H was sampled. So this is safer:

  samp1 <- function() data[sample(1:nrow(data), 7), ]
  samp5 <- replicate(5, samp1())
  for (i in 1:5) print(samp5[,i])

Then I noticed your reply to Holtman, so perhaps you really do want the
first solution. Just so you understand, it might not be statistically
correct.

-- 
David.

>   samp5 <- subset(data, plot %in% samp2[[1]])  # this works as well, but I
>                                                # used a for loop to get it
>                                                # to select 7 plots 100 times.
>   for (i in nrow(samp2)) {
>     samp6 <- subset(data, plot %in% samp2[[i]])
>   }                                            # this doesn't work
>
> Am I missing something, or is there a better solution?
>
> Thanks.
> Kang Min

David Winsemius, MD
West Hartford, CT
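
For the clarified request in this thread (all rows matching each pre-drawn vector of plot labels), a minimal sketch follows. The toy data frame dat and the column name plot_id are illustrative assumptions, chosen to avoid a column called plot. As far as I can tell, the original loop did nothing because nrow() of a list is NULL, so the loop body never ran, and samp6 would have been overwritten on every pass anyway.

```r
set.seed(42)  # only so the sketch is repeatable

# Toy data in the shape of the post; 'plot_id' replaces 'plot' to avoid
# clashing with the plot() function.
dat <- data.frame(plot_id = c("H", "D", "Y", "E", "Y", "H", "J", "A"),
                  plantno = c(31, 2, 54, 12, 98, 4, 7, 55),
                  species = c("ABC", "DEF", "GFE", "ERF",
                              "FVD", "JKU", "JFG", "EGD"),
                  stringsAsFactors = FALSE)

# Five draws of 7 plot labels each (5 rather than 100, to keep it small).
samp2 <- lapply(1:5, function(i) sample(LETTERS, 7))

# One subset per draw; the result is a list of data frames.
subsets <- lapply(samp2, function(p) dat[dat$plot_id %in% p, ])

# Equivalent loop: iterate over seq_along() and store each result.
subsets_loop <- vector("list", length(samp2))
for (i in seq_along(samp2)) {
  subsets_loop[[i]] <- dat[dat$plot_id %in% samp2[[i]], ]
}
```

Each element of subsets can have a different number of rows, because a plot label may occur several times in the data or not at all.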
[R] creating a reverse geometric sequence
Hello,

Can anyone think of a non-iterative way to generate a decreasing
geometric sequence in R? For example, for a hypothetical function dg, I
would like:

> dg(20)
[1] 20 10  5  2  1

where I am using integer division by 2 to get each subsequent value in
the sequence. There is of course:

dg <- function(x) {
  res <- integer()
  while(x >= 1) {
    res <- c(res, x)
    x <- x %/% 2
  }
  res
}

> dg(20)
[1] 20 10  5  2  1

This implementation of 'dg' uses an iterative 'while' loop. I'm simply
wondering if there is a way to vectorize this process?

Thanks,
Erik
Re: [R] creating a reverse geometric sequence
Erik Iverson wrote:
> Hello,
>
> Can anyone think of a non-iterative way to generate a decreasing
> geometric sequence in R? For example, for a hypothetical function dg, I
> would like:
>
>   dg(20)
>   [1] 20 10  5  2  1
>
> where I am using integer division by 2 to get each subsequent value in
> the sequence. There is of course:
>
>   dg <- function(x) {
>     res <- integer()
>     while(x >= 1) {
>       res <- c(res, x)
>       x <- x %/% 2
>     }
>     res
>   }
>
>   dg(20)
>   [1] 20 10  5  2  1
>
> This implementation of 'dg' uses an iterative 'while' loop. I'm simply
> wondering if there is a way to vectorize this process?

Something like this should work, at least for integer bases:

  base <- 2
  len <- ceiling(log(x, base))
  floor(x/base^(seq_len(len)-1))

Duncan Murdoch
[R] plotCI overlay
I'm using the plotCI function and I'd like to overlay additional means with CIs onto an existing plotCI-created plot in a different color. Is this possible? Thanks. Rick
Re: [R] creating a reverse geometric sequence
Erik Iverson <er...@ccbr.umn.edu> writes:

> Hello,
>
> Can anyone think of a non-iterative way to generate a decreasing
> geometric sequence in R? For example, for a hypothetical function dg, I
> would like:
>
>   dg(20)
>   [1] 20 10  5  2  1
>
> where I am using integer division by 2 to get each subsequent value in
> the sequence. There is of course:
>
>   dg <- function(x) {
>     res <- integer()
>     while(x >= 1) {
>       res <- c(res, x)
>       x <- x %/% 2
>     }
>     res
>   }
>
>   dg(20)
>   [1] 20 10  5  2  1
>
> This implementation of 'dg' uses an iterative 'while' loop. I'm simply
> wondering if there is a way to vectorize this process?

Hi Erik,

How about

  dg <- function(x) {
    maxi <- floor(log(x)/log(2))
    floor(x / (2^(0:maxi)))
  }

I don't think the remainders cause a problem.

Dan

> Thanks,
> Erik
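
One way to check a vectorized version against the original loop is sketched below. This is a variant rather than anyone's posted answer: dg_vec is a hypothetical name, and instead of computing the sequence length from a logarithm it divides by every power of 2 up to 2^30 and keeps the terms that are still at least 1, which sidesteps any floating-point rounding in the log. It assumes x is a whole number with 1 <= x < 2^31.

```r
# Loop version from the original question, kept for comparison.
dg_loop <- function(x) {
  res <- numeric()
  while (x >= 1) {
    res <- c(res, x)
    x <- x %/% 2
  }
  res
}

# Vectorized variant: integer-divide by each power of 2 that could
# matter, then drop the terms that have reached zero.
dg_vec <- function(x) {
  steps <- x %/% 2^(0:30)   # assumes 1 <= x < 2^31
  steps[steps >= 1]
}

dg_vec(20)

# Compare the two versions over a range of inputs.
same <- vapply(1:200, function(n) {
  a <- dg_vec(n)
  b <- dg_loop(n)
  length(a) == length(b) && all(a == b)
}, logical(1))
all(same)
```

The approach relies on the fact that repeated integer division by 2 gives the same values as a single integer division by 2^k for positive whole numbers.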
Re: [R] plotCI overlay
Rick Reiss <rreiss at exponent.com> writes:

> I'm using the plotCI function and I'd like to overlay additional means
> with CIs onto an existing plotCI-created plot in a different color. Is
> this possible?
>
> Thanks.
> Rick

Assuming you mean the one from the plotrix package: use add=TRUE, e.g.:

  library(plotrix)
  plotCI(1:5,1:5,1,xlim=c(0,6))
  plotCI((1:5)+0.2,rep(4,5),0.5,col=2,add=TRUE)
Re: [R] creating a reverse geometric sequence
Erik Iverson <eriki at ccbr.umn.edu> writes:

> Can anyone think of a non-iterative way to generate a decreasing
> geometric sequence in R?

  Reduce("%/%", rep(2,4), init=20, accumulate=TRUE)
Re: [R] creating a reverse geometric sequence
On May 23, 2010, at 1:43 PM, Erik Iverson wrote:

> Hello,
>
> Can anyone think of a non-iterative way to generate a decreasing
> geometric sequence in R? For example, for a hypothetical function dg, I
> would like:
>
>   dg(20)
>   [1] 20 10  5  2  1
>
> where I am using integer division by 2 to get each subsequent value in
> the sequence.

> dg <- function(ratio, len) (ratio)^( 0:(len-1) )
> 20*dg(.5, 20)
 [1] 2.000000e+01 1.000000e+01 5.000000e+00 2.500000e+00 1.250000e+00 6.250000e-01
 [7] 3.125000e-01 1.562500e-01 7.812500e-02 3.906250e-02 1.953125e-02 9.765625e-03
[13] 4.882812e-03 2.441406e-03 1.220703e-03 6.103516e-04 3.051758e-04 1.525879e-04
[19] 7.629395e-05 3.814697e-05
> (20*dg(.5, 20))[1:19] / (20*dg(.5, 20))[2:20]
 [1] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2

> There is of course:
>
>   dg <- function(x) {
>     res <- integer()
>     while(x >= 1) {
>       res <- c(res, x)
>       x <- x %/% 2
>     }
>     res
>   }
>
>   dg(20)
>   [1] 20 10  5  2  1
>
> This implementation of 'dg' uses an iterative 'while' loop. I'm simply
> wondering if there is a way to vectorize this process?
>
> Thanks,
> Erik

David Winsemius, MD
West Hartford, CT
[R] order issue
Hi everybody,

this is a real dummy thing. I sorted a matrix based on a given column,
and what I get is right, until it comes to columns of negative and
positive values; then, order orders everything from max to min in the
negative values, and then AGAIN from max to min in the positive
values!!! Why isn't everything ordered from max to min, and that's it?

Thank you!!!

Attached is the txt file I use; try:

  x=x[order(x[,2]),]

What I get is:

> print(x)
   Product               A        B  Tissue
44 ME:MDA_MB_435   -0.1915 -0.16744  Melanoma
17 CNS:SNB_75     -0.23183  1.03945  CNS
37 LE:K_562       -0.58218   1.8581  Leukemia
43 ME:MALME_3M    -0.67327 -1.33493  Melanoma
49 ME:UACC_257    -0.72431 -1.84753  Melanoma
42 ME:M14         -0.73942 -0.73904  Melanoma
40 LE:SR          -0.93541  2.95346  Leukemia
25 CO:SW_620      -1.53265 -1.35446  Colon
63 RE:CAKI_1      -2.48443  0.43245  Renal
39 LE:RPMI_8226   -2.59561  -1.9448  Leukemia
26 LC:A549        -2.66221  0.71215  Lung
61 RE:A498        -2.89402  0.93287  Renal
 9 BR:HS578T      -2.94118   1.1217  Breast
34 LC:NCI_H522    -2.94381   0.3859  Lung
66 RE:TK_10       -2.95281  1.26245  Renal
52 OV:NCI_ADR_RES -3.04456  0.17046  Ovarian
57 OV:SK_OV_3     -3.04477  2.15405  Ovarian
53 OV:OVCAR_3      -3.0705 -0.31743  Ovarian
14 CNS:SF_295     -3.09348 -1.00095  CNS
54 OV:OVCAR_4     -3.13137 -0.47497  Ovarian
36 LE:HL_60       -3.16745 -3.16745  Leukemia
38 LE:MOLT_4      -3.20055 -1.72841  Leukemia
11 BR:MDA_MB_231  -3.24907  1.58326  Breast
59 PR:PC_3        -3.36612  1.39328  Prostate
19 CO:HCT_116     -3.39764  0.43061  Colon
12 BR:T47D        -3.41228  1.13818  Breast
22 CO:HCT_15      -3.45342  0.16357  Colon
64 RE:RXF_393     -3.49615  2.59144  Renal
28 LC:HOP_62       -3.4968  0.67884  Lung
60 RE:786_0        -3.5086  1.75056  Renal
35 LE:CCRF_CEM    -3.54526 -2.09262  Leukemia
29 LC:HOP_92      -3.60636  0.87116  Lung
21 CO:HCC_2998    -3.61457 -0.32362  Colon
13 CNS:SF_268     -3.63916  2.54378  CNS
20 CO:COLO205     -3.64656  0.54344  Colon
56 OV:OVCAR_8     -3.66053  -0.9594  Ovarian
24 CO:KM12        -3.68703  2.19991  Colon
55 OV:OVCAR_5      -3.7852  2.43038  Ovarian
 8 BR:BT_549      -3.80239 -0.43099  Breast
15 CNS:SF_539     -3.86184  1.39114  CNS
65 RE:SN12C       -3.90776  0.85244  Renal
31 LC:NCI_H23     -3.91625 -1.14955  Lung
62 RE:ACHN        -3.96246 -0.62365  Renal
67 RE:UO_31       -3.99791 -1.09215  Renal
10 BR:MCF7        -4.00187  1.46303  Breast
51 OV:IGROV1      -4.02758  2.04324  Ovarian
23 CO:HT29        -4.11624 -0.02799  Colon
41 ME:LOXIMVI      -4.2572  0.37259  Melanoma
32 LC:NCI_H322M   -4.28534  1.66783  Lung
27 LC:EKVX        -4.32847  1.66042  Lung
58 PR:DU_145      -4.33961  1.57548  Prostate
30 LC:NCI_H226    -4.37408 -0.22311  Lung
33 LC:NCI_H460      0.0042  -0.6023  Lung
18 CNS:U251        0.01263  1.66389  CNS
16 CNS:SNB_19      0.16583  0.03737  CNS
45 ME:MDA_N        0.21077  0.05502  Melanoma
50 ME:UACC_62      0.52503   0.1605  Melanoma
46 ME:SK_MEL_2     0.55255  -1.6667  Melanoma
47 ME:SK_MEL_28     1.7425  1.45266  Melanoma
48 ME:SK_MEL_5     1.74749 -1.47817  Melanoma

Gabriele Zoppoli, MD
Ph.D. Fellow, Experimental and Clinical Oncology and Hematology, University of Genova, Genova, Italy
Guest Researcher, LMP, NCI, NIH, Bethesda MD
Work: 301-451-8575
Mobile: 301-204-5642
Email: zoppo...@mail.nih.gov
Re: [R] order issue
do 'str' on your object to see if you have factors where you think you have numerics.

What is the problem you are trying to solve?

Sent from my iPhone.
Re: [R] order issue
I tried this, but it doesn't change the division between negative and
positive values (see that you have first positive from max to min, and
then negative from min to max, as if order considered only the absolute
values...)

   Product        hsa.miR.204 hsa.miR.210  Tissue
48 ME:SK_MEL_5        1.74749    -1.47817  Melanoma
47 ME:SK_MEL_28        1.7425     1.45266  Melanoma
46 ME:SK_MEL_2        0.55255     -1.6667  Melanoma
50 ME:UACC_62         0.52503      0.1605  Melanoma
45 ME:MDA_N           0.21077     0.05502  Melanoma
16 CNS:SNB_19         0.16583     0.03737  CNS
18 CNS:U251           0.01263     1.66389  CNS
33 LC:NCI_H460         0.0042     -0.6023  Lung
30 LC:NCI_H226       -4.37408    -0.22311  Lung
58 PR:DU_145         -4.33961     1.57548  Prostate
27 LC:EKVX           -4.32847     1.66042  Lung
32 LC:NCI_H322M      -4.28534     1.66783  Lung
41 ME:LOXIMVI         -4.2572     0.37259  Melanoma
23 CO:HT29           -4.11624    -0.02799  Colon
51 OV:IGROV1         -4.02758     2.04324  Ovarian
10 BR:MCF7           -4.00187     1.46303  Breast
67 RE:UO_31          -3.99791    -1.09215  Renal
62 RE:ACHN           -3.96246    -0.62365  Renal
31 LC:NCI_H23        -3.91625    -1.14955  Lung
65 RE:SN12C          -3.90776     0.85244  Renal
15 CNS:SF_539        -3.86184     1.39114  CNS
 8 BR:BT_549         -3.80239    -0.43099  Breast
55 OV:OVCAR_5         -3.7852     2.43038  Ovarian
24 CO:KM12           -3.68703     2.19991  Colon
56 OV:OVCAR_8        -3.66053     -0.9594  Ovarian
20 CO:COLO205        -3.64656     0.54344  Colon
13 CNS:SF_268        -3.63916     2.54378  CNS
21 CO:HCC_2998       -3.61457    -0.32362  Colon
29 LC:HOP_92         -3.60636     0.87116  Lung
35 LE:CCRF_CEM       -3.54526    -2.09262  Leukemia
60 RE:786_0           -3.5086     1.75056  Renal
28 LC:HOP_62          -3.4968     0.67884  Lung
64 RE:RXF_393        -3.49615     2.59144  Renal
22 CO:HCT_15         -3.45342     0.16357  Colon
12 BR:T47D           -3.41228     1.13818  Breast
19 CO:HCT_116        -3.39764     0.43061  Colon
59 PR:PC_3           -3.36612     1.39328  Prostate
11 BR:MDA_MB_231     -3.24907     1.58326  Breast
38 LE:MOLT_4         -3.20055    -1.72841  Leukemia
36 LE:HL_60          -3.16745    -3.16745  Leukemia
54 OV:OVCAR_4        -3.13137    -0.47497  Ovarian
14 CNS:SF_295        -3.09348    -1.00095  CNS
53 OV:OVCAR_3         -3.0705    -0.31743  Ovarian
57 OV:SK_OV_3        -3.04477     2.15405  Ovarian
52 OV:NCI_ADR_RES    -3.04456     0.17046  Ovarian
66 RE:TK_10          -2.95281     1.26245  Renal
34 LC:NCI_H522       -2.94381      0.3859  Lung
 9 BR:HS578T         -2.94118      1.1217  Breast
61 RE:A498           -2.89402     0.93287  Renal
26 LC:A549           -2.66221     0.71215  Lung
39 LE:RPMI_8226      -2.59561     -1.9448  Leukemia
63 RE:CAKI_1         -2.48443     0.43245  Renal
25 CO:SW_620         -1.53265    -1.35446  Colon
40 LE:SR             -0.93541     2.95346  Leukemia
42 ME:M14            -0.73942    -0.73904  Melanoma
49 ME:UACC_257       -0.72431    -1.84753  Melanoma
43 ME:MALME_3M       -0.67327    -1.33493  Melanoma
37 LE:K_562          -0.58218      1.8581  Leukemia
17 CNS:SNB_75        -0.23183     1.03945  CNS
44 ME:MDA_MB_435      -0.1915    -0.16744  Melanoma

Gabriele Zoppoli, MD
Ph.D. Fellow, Experimental and Clinical Oncology and Hematology, University of Genova, Genova, Italy
Guest Researcher, LMP, NCI, NIH, Bethesda MD
Work: 301-451-8575
Mobile: 301-204-5642
Email: zoppo...@mail.nih.gov

________________________________________
From: Jorge Ivan Velez [jorgeivanve...@gmail.com]
Sent: Sunday, May 23, 2010 6:09 PM
To: Zoppoli, Gabriele (NIH/NCI) [G]
Subject: Re: [R] order issue

Hi Gabriele,

Take a look at the decreasing argument in ?order

> xx <- c(1,3,4,10,5,2,3,8,9)
> xx
[1]  1  3  4 10  5  2  3  8  9
> xx[order(xx)]
[1]  1  2  3  3  4  5  8  9 10
> xx[order(xx, decreasing = TRUE)]
[1] 10  9  8  5  4  3  3  2  1

HTH,
Jorge
Re: [R] order issue
On 23-May-10 21:39:06, Zoppoli, Gabriele (NIH/NCI) [G] wrote:
> Hi everybody,
>
> this is a real dummy thing. I sorted a matrix based on a given column, and what I get is right, until it comes to columns of negative and positive values; then, order orders everything from max to min in the negative values, and then AGAIN from max to min in the positive values!!! Why isn't everything ordered from max to min, and that's it?
>
> Thank you!!!
>
> Attached is the txt file I use; try:
>
> x=x[order(x[,2]),]
>
> What I get is:
>
> print(x)
>    Product               A        B Tissue
> 44 ME:MDA_MB_435   -0.1915 -0.16744 Melanoma
> 17 CNS:SNB_75     -0.23183  1.03945 CNS
> 37 LE:K_562       -0.58218   1.8581 Leukemia
> 43 ME:MALME_3M    -0.67327 -1.33493 Melanoma
> 49 ME:UACC_257    -0.72431 -1.84753 Melanoma
> 42 ME:M14         -0.73942 -0.73904 Melanoma
> 40 LE:SR          -0.93541  2.95346 Leukemia
> 25 CO:SW_620      -1.53265 -1.35446 Colon
> 63 RE:CAKI_1      -2.48443  0.43245 Renal
> 39 LE:RPMI_8226   -2.59561  -1.9448 Leukemia
> 26 LC:A549        -2.66221  0.71215 Lung
> 61 RE:A498        -2.89402  0.93287 Renal
>  9 BR:HS578T      -2.94118   1.1217 Breast
> 34 LC:NCI_H522    -2.94381   0.3859 Lung
> 66 RE:TK_10       -2.95281  1.26245 Renal
> 52 OV:NCI_ADR_RES -3.04456  0.17046 Ovarian
> 57 OV:SK_OV_3     -3.04477  2.15405 Ovarian
> 53 OV:OVCAR_3      -3.0705 -0.31743 Ovarian
> 14 CNS:SF_295     -3.09348 -1.00095 CNS
> 54 OV:OVCAR_4     -3.13137 -0.47497 Ovarian
> 36 LE:HL_60       -3.16745 -3.16745 Leukemia
> 38 LE:MOLT_4      -3.20055 -1.72841 Leukemia
> 11 BR:MDA_MB_231  -3.24907  1.58326 Breast
> 59 PR:PC_3        -3.36612  1.39328 Prostate
> 19 CO:HCT_116     -3.39764  0.43061 Colon
> 12 BR:T47D        -3.41228  1.13818 Breast
> 22 CO:HCT_15      -3.45342  0.16357 Colon
> 64 RE:RXF_393     -3.49615  2.59144 Renal
> 28 LC:HOP_62       -3.4968  0.67884 Lung
> 60 RE:786_0        -3.5086  1.75056 Renal
> 35 LE:CCRF_CEM    -3.54526 -2.09262 Leukemia
> 29 LC:HOP_92      -3.60636  0.87116 Lung
> 21 CO:HCC_2998    -3.61457 -0.32362 Colon
> 13 CNS:SF_268     -3.63916  2.54378 CNS
> 20 CO:COLO205     -3.64656  0.54344 Colon
> 56 OV:OVCAR_8     -3.66053  -0.9594 Ovarian
> 24 CO:KM12        -3.68703  2.19991 Colon
> 55 OV:OVCAR_5      -3.7852  2.43038 Ovarian
>  8 BR:BT_549      -3.80239 -0.43099 Breast
> 15 CNS:SF_539     -3.86184  1.39114 CNS
> 65 RE:SN12C       -3.90776  0.85244 Renal
> 31 LC:NCI_H23     -3.91625 -1.14955 Lung
> 62 RE:ACHN        -3.96246 -0.62365 Renal
> 67 RE:UO_31       -3.99791 -1.09215 Renal
> 10 BR:MCF7        -4.00187  1.46303 Breast
> 51 OV:IGROV1      -4.02758  2.04324 Ovarian
> 23 CO:HT29        -4.11624 -0.02799 Colon
> 41 ME:LOXIMVI      -4.2572  0.37259 Melanoma
> 32 LC:NCI_H322M   -4.28534  1.66783 Lung
> 27 LC:EKVX        -4.32847  1.66042 Lung
> 58 PR:DU_145      -4.33961  1.57548 Prostate
> 30 LC:NCI_H226    -4.37408 -0.22311 Lung
> 33 LC:NCI_H460      0.0042  -0.6023 Lung
> 18 CNS:U251        0.01263  1.66389 CNS
> 16 CNS:SNB_19      0.16583  0.03737 CNS
> 45 ME:MDA_N        0.21077  0.05502 Melanoma
> 50 ME:UACC_62      0.52503   0.1605 Melanoma
> 46 ME:SK_MEL_2     0.55255  -1.6667 Melanoma
> 47 ME:SK_MEL_28     1.7425  1.45266 Melanoma
> 48 ME:SK_MEL_5     1.74749 -1.47817 Melanoma
>
> Gabriele Zoppoli, MD

Somewhat strange indeed! The only further question I can think of is to ask what x looked like before you re-ordered it.

Using the x.txt file you supplied, I get:

    x <- read.table("x.txt")
    str(x)
    # 'data.frame':   60 obs. of  4 variables:
    #  $ Product: Factor w/ 60 levels "BR:BT_549","BR:HS578T",..: 37 10 30
    #             36 42 35 33 18 56 32 ...
    #  $ A      : num  -0.192 -0.232 -0.582 -0.673 -0.724 ...
    #  $ B      : num  -0.167 1.039 1.858 -1.335 -1.848 ...
    #  $ Tissue : Factor w/ 9 levels "Breast","CNS",..: 6 2 4 6 6 6 4 3 9 4
    #             ...

so x[,2] and x[,3] are indeed numeric. Then (similar to yours):

    X <- x[order(x[,2]),]
    print(X)
    #    Product               A        B Tissue
    # 30 LC:NCI_H226    -4.37408 -0.22311 Lung
    # 58 PR:DU_145      -4.33961  1.57548 Prostate
    # 27 LC:EKVX        -4.32847  1.66042 Lung
    # 32 LC:NCI_H322M   -4.28534  1.66783 Lung
    # 41 ME:LOXIMVI     -4.25720  0.37259 Melanoma
    # 23 CO:HT29        -4.11624 -0.02799 Colon
    # 51 OV:IGROV1      -4.02758  2.04324 Ovarian
    # 10 BR:MCF7        -4.00187  1.46303 Breast
    # 67 RE:UO_31       -3.99791 -1.09215 Renal
    # 62 RE:ACHN        -3.96246 -0.62365 Renal
    # 31 LC:NCI_H23     -3.91625 -1.14955 Lung
    # 65 RE:SN12C       -3.90776  0.85244 Renal
Re: [R] order issue
This is what I get:

    > str(x)
     chr [1:60, 1:4] "ME:SK_MEL_5" "ME:SK_MEL_28" "ME:SK_MEL_2" ...
     - attr(*, "dimnames")=List of 2
      ..$ : chr [1:60] "48" "47" "46" "50" ...
      ..$ : chr [1:4] "Product" "hsa.miR.204" "hsa.miR.210" "Tissue"

It doesn't make much sense to me... I would like to have the second column ordered from min to max, or from max to min (with the argument decreasing=TRUE), but order seems to reorder everything without treating negative numbers as smaller than positive ones...

Gabriele Zoppoli, MD
Ph.D. Fellow, Experimental and Clinical Oncology and Hematology, University of Genova, Genova, Italy
Guest Researcher, LMP, NCI, NIH, Bethesda MD
Work: 301-451-8575
Mobile: 301-204-5642
Email: zoppo...@mail.nih.gov

> From: Jim Holtman [jholt...@gmail.com]
> Sent: Sunday, May 23, 2010 6:07 PM
> To: Zoppoli, Gabriele (NIH/NCI) [G]
> Cc: R help
> Subject: Re: [R] order issue
>
> do 'str' on your object to see if you have factors where you think you have numerics.
>
> What is the problem you are trying to solve?
>
> Sent from my iPhone.
Re: [R] order issue
crazy stuff!!! I tried to reload the txt file, and now it's working...

this is the original (attached)

thanks!

Gabriele Zoppoli, MD
Ph.D. Fellow, Experimental and Clinical Oncology and Hematology, University of Genova, Genova, Italy
Guest Researcher, LMP, NCI, NIH, Bethesda MD
Work: 301-451-8575
Mobile: 301-204-5642
Email: zoppo...@mail.nih.gov

> From: Ted Harding [ted.hard...@manchester.ac.uk]
> Sent: Sunday, May 23, 2010 6:31 PM
> To: Zoppoli, Gabriele (NIH/NCI) [G]
> Cc: R help
> Subject: RE: [R] order issue
>
> Somewhat strange indeed! The only further question I can think of is to ask what x looked like before you re-ordered it.
>
> Using the x.txt file you supplied, I get:
>
>     x <- read.table("x.txt")
>     str(x)
>     # 'data.frame':   60 obs. of  4 variables:
>     #  $ Product: Factor w/ 60 levels "BR:BT_549","BR:HS578T",..: 37 10 30
>     #             36 42 35 33 18 56 32 ...
>     #  $ A      : num  -0.192 -0.232 -0.582 -0.673 -0.724 ...
>     #  $ B      : num  -0.167 1.039 1.858 -1.335 -1.848 ...
>     #  $ Tissue : Factor w/ 9 levels "Breast","CNS",..: 6 2 4 6 6 6 4 3 9 4
>     #             ...
>
> so x[,2] and x[,3] are indeed numeric. Then (similar to yours):
>
>     X <- x[order(x[,2]),]
>     print(X)
Re: [R] order issue
> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Zoppoli, Gabriele (NIH/NCI) [G]
> Sent: Sunday, May 23, 2010 3:44 PM
> To: ted.hard...@manchester.ac.uk
> Cc: R-help@r-project.org
> Subject: Re: [R] order issue
>
> crazy stuff!!! I tried to reload the txt file, and now it's working...

When you reloaded the txt file (with what function?) it was probably made into a data.frame, with some columns factors or characters and some columns numerics. It looks like your original problem arose after you converted that data.frame into a matrix, all of whose columns must be of the same type (character in this case). Sorting character representations of numbers is different from sorting the numbers as numbers.

    > sort(c(1, 0.05, 0., -0.10, -2))
    [1] -2.00 -0.10  0.00  0.05  1.00
    > sort(as.character(c(1, 0.05, 0., -0.10, -2)))
    [1] "-0.1" "-2"   "0"    "0.05" "1"

Use str(x) again to see if this is what is happening.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> this is the original (attached)
>
> thanks!
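To make the point above concrete, here is a small sketch with made-up values (not the data from the thread). It passes method = "radix" so that the character sort follows C-locale byte order on every machine; that choice is an assumption made for reproducibility, and in other collation locales the string order may come out differently.

```r
# Made-up values for illustration only.
vals  <- c(1.7, 0.5, -0.2, -4.3, -3.1)
chars <- as.character(vals)

# Numeric sort: ordered by value, most negative first.
num_sorted <- sort(vals)

# Character sort in C-locale byte order: strings are compared one
# character at a time, so "-0.2" < "-3.1" < "-4.3" < "0.5" < "1.7".
chr_sorted <- sort(chars, method = "radix")

print(num_sorted)
print(chr_sorted)
```

The negative strings end up ordered by their digits, which is why they look sorted by absolute value.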
Re: [R] order issue
On May 23, 2010, at 6:32 PM, Zoppoli, Gabriele (NIH/NCI) [G] wrote:

> This is what I get:
>
>     > str(x)
>      chr [1:60, 1:4] "ME:SK_MEL_5" "ME:SK_MEL_28" "ME:SK_MEL_2" ...
>      - attr(*, "dimnames")=List of 2
>       ..$ : chr [1:60] "48" "47" "46" "50" ...
>       ..$ : chr [1:4] "Product" "hsa.miR.204" "hsa.miR.210" "Tissue"
>
> It doesn't make much sense to me...

How did you bring that text file into R? Both Ted and I are getting:

    > str(x)
    'data.frame':   60 obs. of  4 variables:
     $ Product: Factor w/ 60 levels "BR:BT_549","BR:HS578T",..: 37 10 30 36 42 35 33 18 56 32 ...
     $ A      : num  -0.192 -0.232 -0.582 -0.673 -0.724 ...
     $ B      : num  -0.167 1.039 1.858 -1.335 -1.848 ...
     $ Tissue : Factor w/ 9 levels "Breast","CNS",..: 6 2 4 6 6 6 4 3 9 4 ...

Your x is a 60 x 4 matrix of all character elements. If I try:

    x[ order(as.character(x[,2])),]

I get the same behavior as you describe.

-- 
David.

> I would like to have the second column ordered from min to max, or from max to min (with the argument decreasing=TRUE), but order seems to reorder everything without treating negative numbers as smaller than positive ones...
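The diagnosis above can be reproduced on a tiny scale. The sketch below builds a hypothetical four-row character matrix directly (the names m, by_string and by_number are illustrative, not from the thread) and orders it once on the strings and once on the numeric values of the same column. As before, method = "radix" is used only to pin the string comparison to C-locale byte order.

```r
# Hypothetical miniature of the problem: every cell is a character string.
m <- cbind(Product = c("a", "b", "c", "d"),
           A       = c("0.5", "-4.3", "-0.2", "1.7"))

# Ordering on the character column compares strings, not numbers.
by_string <- m[order(m[, "A"], method = "radix"), ]

# Converting just the ordering key to numeric gives the numeric order,
# while the matrix itself stays character.
by_number <- m[order(as.numeric(m[, "A"])), ]

print(by_string)
print(by_number)
```

Only the key passed to order() is converted here, so the text columns in the matrix are left untouched.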
[R] Split-plot design in GLM with only fixed factors.
Good evening gentlemen!

I have a split-plot experiment in a randomized block design where my response is a binomial variable. I wonder whether there is any way to calculate the probability for my factors while taking the design's error terms into account (there are two in this case).

I looked at various threads here and elsewhere, and unfortunately none of them answers my problem directly, although it is very simple. I am not interested in estimating variance components, so I see no reason to use functions like LM as found in most topics.

If anyone knows a way to calculate the probability for my factors as we do in an LM model, i.e., declaring the types of errors so that the probability for the factor of interest is not biased, I'd be grateful.
-- 
View this message in context: http://r.789695.n4.nabble.com/Split-plot-design-in-GLM-with-only-fixed-factors-tp2228126p2228126.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] order issue
After read.delim:

    'data.frame':   60 obs. of  4 variables:
     $ Cell       : Factor w/ 60 levels "BR:BT_549","BR:HS578T",..: 23 51 20 25 34 16 44 3 60 55 ...
     $ hsa-miR-204: num  -4.37 -4.34 -4.33 -4.29 -4.26 ...
     $ hsa-miR-210: num  -0.223 1.575 1.66 1.668 0.373 ...
     $ Tissue     : Factor w/ 9 levels "Breast","CNS",..: 5 8 5 5 6 3 7 1 9 9 ...

Before:

     chr [1:60, 1:4] "ME:SK_MEL_5" "ME:SK_MEL_28" "ME:SK_MEL_2" ...
     - attr(*, "dimnames")=List of 2
      ..$ : chr [1:60] "48" "47" "46" "50" ...
      ..$ : chr [1:4] "Product" "hsa.miR.204" "hsa.miR.210" "Tissue"

It looks like the issue is that, the first time I read the txt file with read.delim, I removed the first three rows by doing x=x[-c(1:3),], because the first three rows were characters (parameters like probe name, chromosomal position, etc.). So maybe R remembers that the columns were character and not numeric...

How would you tell R (sorry for the naive wording, but I've learnt R by myself over time and I misuse some words; I hope it's clear anyway) that a matrix is all numeric? Doing as.numeric(x) turns everything into one long column of numbers, but loses the matrix structure...

Thank you all, guys! You're really precious!

Gabriele Zoppoli, MD
Ph.D. Fellow, Experimental and Clinical Oncology and Hematology, University of Genova, Genova, Italy
Guest Researcher, LMP, NCI, NIH, Bethesda MD
Work: 301-451-8575
Mobile: 301-204-5642
Email: zoppo...@mail.nih.gov

> From: William Dunlap [wdun...@tibco.com]
> Sent: Sunday, May 23, 2010 7:05 PM
> To: Zoppoli, Gabriele (NIH/NCI) [G]; ted.hard...@manchester.ac.uk
> Cc: R-help@r-project.org
> Subject: RE: [R] order issue
>
> > crazy stuff!!! I tried to reload the txt file, and now it's working...
>
> When you reloaded the txt file (with what function?) it was probably made into a data.frame, with some columns factors or characters and some columns numerics. It looks like your original problem arose after you converted that data.frame into a matrix, all of whose columns must be of the same type (character in this case). Sorting character representations of numbers is different from sorting the numbers as numbers.
>
>     > sort(c(1, 0.05, 0., -0.10, -2))
>     [1] -2.00 -0.10  0.00  0.05  1.00
>     > sort(as.character(c(1, 0.05, 0., -0.10, -2)))
>     [1] "-0.1" "-2"   "0"    "0.05" "1"
>
> Use str(x) again to see if this is what is happening.
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
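On the question of turning a character matrix into a numeric one without flattening it, one possible approach is sketched below. The matrix m is hypothetical and holds only the columns that really contain numbers; text columns such as Product or Tissue cannot be converted this way, which is one reason a data frame (where each column keeps its own type) is often the easier container.

```r
# Hypothetical character matrix holding only numeric-looking strings.
m <- matrix(c("1.5", "-2", "0.25", "10", "-0.5", "3"), nrow = 3,
            dimnames = list(c("r1", "r2", "r3"), c("A", "B")))

# as.numeric() returns a plain vector: dim and dimnames are dropped.
flat <- as.numeric(m)

# Changing the mode converts the values but keeps dim and dimnames.
num <- m
mode(num) <- "numeric"

str(flat)
str(num)
```

After the conversion, num[order(num[, "A"]), ] sorts the rows numerically.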
[R] sum of certain length
Hi r-users,

I have this data below. I would like to obtain the weekly rainfall sum. That is, I would like to find the sum for day 1 to day 7, day 8 to day 14, and so on.

   year month day rain
1  1922     1   1  0.0
2  1922     1   2  0.0
3  1922     1   3  0.0
4  1922     1   4  0.0
5  1922     1   5  0.0
6  1922     1   6  0.0
7  1922     1   7  0.0
8  1922     1   8  6.6
9  1922     1   9  1.5
10 1922     1  10  0.0
11 1922     1  11  0.0
12 1922     1  12  4.8
13 1922     1  13 14.7
14 1922     1  14  0.0
15 1922     1  15  0.0
16 1922     1  16  0.0
17 1922     1  17  0.0
18 1922     1  18  0.0
19 1922     1  19  0.0
20 1922     1  20  0.8
21 1922     1  21  0.0
22 1922     1  22  0.0
23 1922     1  23  0.0
24 1922     1  24  0.0
25 1922     1  25  0.0
26 1922     1  26  0.0
27 1922     1  27  0.0
28 1922     1  28  0.0
29 1922     1  29  0.0
30 1922     1  30  0.0
31 1922     1  31  0.0
32 1922     2   1  0.0
33 1922     2   2  0.0
34 1922     2   3  0.0
35 1922     2   4  0.0
36 1922     2   5  0.0
37 1922     2   6  0.0
38 1922     2   7  0.0
39 1922     2   8  0.0
40 1922     2   9  0.0

Thank you.
[R] retrieve path analysis coefficients (package agricolae)
Dear list,

I'd like to use path.analysis in the package agricolae in batch mode on many files, retrieving the path coefficients for each run and appending them to a table. I don't see any posts in the help archives about this package or the path.analysis function. I've tried assigning the result of the call to path.analysis to an object, but no matter what I try, the function automatically prints the result.

I'll be grateful for any assistance.

Thanks in advance,
Zack
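One general R idiom for a function that insists on printing is to wrap the call in capture.output() and keep the returned value. The sketch below uses a hypothetical stand-in, noisy_fit(), which is not the agricolae function; whether path.analysis actually returns its coefficients is not stated in the post, so this only illustrates the pattern. If a function returned nothing and only printed, the captured text would have to be parsed instead.

```r
# Hypothetical stand-in for a function that prints its results and also
# returns them. It is NOT agricolae's path.analysis.
noisy_fit <- function(x) {
  coefs <- c(slope = mean(x), offset = min(x))
  print(coefs)        # side effect: output on the console
  invisible(coefs)    # value handed back to the caller
}

# Hypothetical inputs, one element per "file".
inputs <- list(run1 = c(1, 2, 3), run2 = c(10, 20, 30))

# capture.output() swallows the printed text; the assignment inside the
# call still stores the returned value in the calling function's frame.
results <- lapply(inputs, function(x) {
  value <- NULL
  printed <- capture.output(value <- noisy_fit(x))
  value
})

# One row per run, ready to be appended to a larger table.
coef_table <- do.call(rbind, results)
print(coef_table)
```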
Re: [R] sum of certain length
This is one way to do it. Suppose your data is in the file "rainfall.txt", as set out below. Then

    > dat <- read.table("rainfall.txt", header = TRUE)
    > dat <- within(dat, {
    +   date <- as.Date(paste(year, month, day, sep="-"))
    +   week <- factor(as.numeric(date - date[1]) %/% 7)
    + })
    > wRain <- with(dat, tapply(rain, week, sum))
    > wRain
       0    1    2    3    4    5 
     0.0 27.6  0.8  0.0  0.0  0.0 

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Roslina Zakaria
> Sent: Monday, 24 May 2010 10:09 AM
> To: r-help@r-project.org
> Subject: [R] sum of certain length
>
> Hi r-users,
>
> I have this data below. I would like to obtain the weekly rainfall sum. That is, I would like to find the sum for day 1 to day 7, day 8 to day 14, and so on.
>
>    year month day rain
> 1  1922     1   1  0.0
> 2  1922     1   2  0.0
> 3  1922     1   3  0.0
> 4  1922     1   4  0.0
> 5  1922     1   5  0.0
> 6  1922     1   6  0.0
> 7  1922     1   7  0.0
> 8  1922     1   8  6.6
> 9  1922     1   9  1.5
> 10 1922     1  10  0.0
> 11 1922     1  11  0.0
> 12 1922     1  12  4.8
> 13 1922     1  13 14.7
> 14 1922     1  14  0.0
> 15 1922     1  15  0.0
> 16 1922     1  16  0.0
> 17 1922     1  17  0.0
> 18 1922     1  18  0.0
> 19 1922     1  19  0.0
> 20 1922     1  20  0.8
> 21 1922     1  21  0.0
> 22 1922     1  22  0.0
> 23 1922     1  23  0.0
> 24 1922     1  24  0.0
> 25 1922     1  25  0.0
> 26 1922     1  26  0.0
> 27 1922     1  27  0.0
> 28 1922     1  28  0.0
> 29 1922     1  29  0.0
> 30 1922     1  30  0.0
> 31 1922     1  31  0.0
> 32 1922     2   1  0.0
> 33 1922     2   2  0.0
> 34 1922     2   3  0.0
> 35 1922     2   4  0.0
> 36 1922     2   5  0.0
> 37 1922     2   6  0.0
> 38 1922     2   7  0.0
> 39 1922     2   8  0.0
> 40 1922     2   9  0.0
>
> Thank you.
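For readers without the file at hand, the same weekly totals can be reproduced from the 40 rainfall values typed in directly. This is a sketch of a simpler variant, not the method above: it assumes one row per day with no missing days, so the row position alone decides the week and no dates are needed. The names rain, week and weekly are illustrative.

```r
# Rainfall for 1 Jan to 9 Feb 1922, retyped from the question (40 days).
rain <- c(rep(0, 7), 6.6, 1.5, 0, 0, 4.8, 14.7, rep(0, 6), 0.8, rep(0, 20))

# Assumes consecutive days without gaps: rows 1-7 are week 0,
# rows 8-14 are week 1, and so on.
week <- (seq_along(rain) - 1) %/% 7

# Sum the rainfall within each week; the last week holds only 5 days.
weekly <- tapply(rain, week, sum)
print(weekly)
```

With gaps in the record, the date-based grouping shown in the reply is the safer choice, because it keeps each observation in its calendar week.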
[R] library location and error messages when loading packages
Hello,

I am running R on a server that several people share. Previously we all had separate libraries for R. I have now set up R so that everyone on the server shares the same library: I downloaded the latest version of R and installed it on the main drive of our server in the Program Files folder (obvious enough). I changed the environment variables in the advanced system settings so that R_LIBS is C:\\RLIBRARY, and restarted the server.

The commands give:

    > .libPaths()
    [1] "C:\\RLIBRARY"                    "C:/PROGRA~1/R/R-211~1.0/library"
    > .Library
    [1] "C:/PROGRA~1/R/R-211~1.0/library"

When I try to load several packages, R says it cannot load them (although many packages do work). So I tried to install the packages again (I deleted the old ones and downloaded zip files of the new ones), and this is what happened:

    > library(Hmisc)
    Error in library.dynam(lib, package, package.lib) :
      shared library 'cluster' not found
    Error: package/namespace load failed for 'Hmisc'
    > utils:::menuInstallPkgs()
    trying URL 'http://cran.ms.unimelb.edu.au/bin/windows/contrib/2.11/cluster_1.12.3.zip'
    Content type 'application/zip' length 340188 bytes (332 Kb)
    opened URL
    downloaded 332 Kb

    package 'cluster' successfully unpacked and MD5 sums checked

    The downloaded packages are in
            C:\Users\Daisy Englert\AppData\Local\Temp\2\RtmpWGZV31\downloaded_packages
    > library(cluster)
    Error in get(Info[i, 1], envir = env) :
      internal error -3 in R_decompress1
    Error: package/namespace load failed for 'cluster'

** But I can fix this by setting lib.loc:

    > library(cluster, lib.loc = "C:/PROGRA~1/R/R-211~1.0/library")

** Unfortunately this is not where the updated packages went; the updated package went to C:\\RLIBRARY.

I have messed something up and I do not know how to fix it. Any advice would be welcome.

Thanks,
Daisy

-- 
Daisy Englert Duursma

Room E8C156
Dept. Biological Sciences
Macquarie University NSW 2109
Australia
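When two libraries are on the search path, it can help to see which of them actually holds a copy of a given package. The sketch below is only a diagnostic aid and does not explain the decompress error itself. The package name "stats" is a placeholder chosen because it exists in every installation; on the server in question one would presumably look at "cluster" instead.

```r
# Placeholder package name; substitute the package being investigated.
pkg  <- "stats"
libs <- .libPaths()

# A package is installed in a library if its folder there contains a
# DESCRIPTION file.
present <- file.exists(file.path(libs, pkg, "DESCRIPTION"))
print(data.frame(library = libs, has_package = present))

# library() searches .libPaths() in order, so the first TRUE row is the
# copy that gets loaded unless lib.loc is given explicitly.
```

If a stale or half-installed copy shows up in the first library, that would be consistent with the behaviour described, since library() stops at the first match.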
Re: [R] Error in FUN(X[[1L]], ...) : STRING_ELT() can only be applied to a 'character vector', not a 'integer'
Thanks for your time with this.

Erik's solution works best to deal with the input... I'll try to reshape the output back into the appropriate columns.

David, fold(sq$s1) only outputs the result for the first sequence in the list, I'm afraid. The 'fold' function doesn't deal well with spaces...

Thanks again.
-- 
View this message in context: http://r.789695.n4.nabble.com/Error-in-FUN-X-1L-STRING-ELT-can-only-be-applied-to-a-character-vector-not-a-integer-tp2226811p2228157.html
Sent from the R help mailing list archive at Nabble.com.
[R] high-dimensional contingency table
Dear Friends,

I am just starting to use R. On this occasion I want to construct a high-dimensional contingency table, because I want to create a mosaic plot with the vcd package. My table is in this format:

    año ac.rep    cat.gru conteos
1  2005      R    parejas     253
2  2005      N    parejas      23
3  2006      R    parejas     347
4  2006      N    parejas      39
5  2007      R    parejas     266
6  2007      N    parejas      83
7  2005      R solitarios      53
8  2005      N solitarios       1
9  2006      R solitarios     109
10 2006      N solitarios       8
11 2007      R solitarios      85
12 2007      N solitarios      34
13 2005      R      trios      29
14 2005      N      trios       1
15 2006      R      trios      62
16 2006      N      trios      19
17 2007      R      trios      48
18 2007      N      trios       3

How can I do this? I read the help for the mosaic command and found that its example data are high-dimensional contingency tables (for example the Titanic data), but I was unable to convert my data into that form.

Thank you very much!!! With best wishes

--
Claudia I. Rodríguez-Flores
Maestra en Ciencias Biológicas
Laboratorio de Ecología, UBIPRO
UNAM FES-Iztacala
52-55-56231130
[R] AM/PM strptime %p failing 2.11.0 WinXP
I am attempting to import dates in the following format to R:

  5/20/2010 6:45:32 PM

Unfortunately I am unable to get the AM/PM conversion specification (%p) to work correctly under either 2.11.0 or 2.8.1.

  > strptime("5/20/2010 6:45:32 PM", "%m/%d/%Y %I:%M:%S %p")
  [1] NA

but

  > strptime("5/20/2010 6:45:32", "%m/%d/%Y %I:%M:%S")
  [1] "2010-05-20 06:45:32"

showing that the problem is with %p.

I could only find one previous mention of this issue in the archives (http://tolstoy.newcastle.edu.au/R/e2/help/06/11/6272.html), which provided no solution beyond upgrading R (which I have done), and just suggested it was a problem with that particular installation of R and Windows.

What could I do to get this working on my Windows XP machine?

Thank you,
Samuel Dennis
sjdenn...@gmail.com
Re: [R] AM/PM strptime %p failing 2.11.0 WinXP
I know it is not very useful to you, but on Vista with 2.11.patched it works:

  > strptime("5/20/2010 6:45:32 PM", "%m/%d/%Y %I:%M:%S %p")
  [1] "2010-05-20 18:45:32"
  > strptime("5/20/2010 6:45:32", "%m/%d/%Y %I:%M:%S")
  [1] "2010-05-20 06:45:32"
  > sessionInfo()
  R version 2.11.0 Patched (2010-04-26 r51822)
  i386-pc-mingw32

  locale:
  [1] LC_COLLATE=English_United States.1252
  [2] LC_CTYPE=English_United States.1252
  [3] LC_MONETARY=English_United States.1252
  [4] LC_NUMERIC=C
  [5] LC_TIME=English_United States.1252

  attached base packages:
  [1] stats     graphics  grDevices utils     datasets  methods   base

Maybe you can try to set the LANGUAGE to English.

Good luck!
mario

On 24-May-10 2:59, Samuel Dennis wrote:
> I am attempting to import dates in the following format to R:
>
>   5/20/2010 6:45:32 PM
>
> Unfortunately I am unable to get the AM/PM function (%p) to work correctly under either 2.11.0 or 2.8.1.
>
>   > strptime("5/20/2010 6:45:32 PM", "%m/%d/%Y %I:%M:%S %p")
>   [1] NA
>
> but
>
>   > strptime("5/20/2010 6:45:32", "%m/%d/%Y %I:%M:%S")
>   [1] "2010-05-20 06:45:32"
>
> showing that the problem is with %p.
>
> I could only find one previous mention of this issue in the archives (http://tolstoy.newcastle.edu.au/R/e2/help/06/11/6272.html), which provided no solution beyond upgrading R (which I have done), and just suggested it was a problem with that particular installation of R and Windows.
>
> What could I do to get this function working on my Windows XP machine?
>
> Thankyou,
> Samuel Dennis
> sjdenn...@gmail.com

--
Ing. Mario Valle
Data Analysis and Visualization Group       | http://www.cscs.ch/~mvalle
Swiss National Supercomputing Centre (CSCS) | Tel: +41 (91) 610.82.60
v. Cantonale Galleria 2, 6928 Manno, Switzerland | Fax: +41 (91) 610.82.82
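[Editorial sketch, not part of the original thread.] The reply above works in an English locale, and the strptime help page describes %p as the AM/PM indicator "in the locale", so one thing worth trying is to parse with LC_TIME set explicitly. This is only a sketch: nothing in the thread confirms that the locale was the cause on the original poster's machine. It switches LC_TIME to "C", parses the posted timestamp, and then restores the previous setting.

```r
# Remember the current time locale so it can be restored afterwards.
old_lc_time <- Sys.getlocale("LC_TIME")

# In the "C" locale the AM/PM markers are the English "AM" and "PM".
Sys.setlocale("LC_TIME", "C")

x <- strptime("5/20/2010 6:45:32 PM", "%m/%d/%Y %I:%M:%S %p")
parsed <- format(x, "%Y-%m-%d %H:%M:%S")
print(parsed)   # "2010-05-20 18:45:32"

# Put the previous time locale back.
Sys.setlocale("LC_TIME", old_lc_time)
```

Checking Sys.getlocale("LC_TIME") on the machine where %p returns NA would show whether its AM/PM markers could differ from the "AM"/"PM" strings in the data.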
Re: [R] SAS for R-users
Thanks for the suggestions! This will keep me busy for a while.

Tom

2010/5/15 Muenchen, Robert A (Bob) muenc...@utk.edu:
> Thomas Levine wrote:
>> Bob Muenchen says that 'Ralph O’Brien says that in a few years there will be so many students graduating knowing mainly R that [he]’ll need to write, “SAS for R Users.” That’ll be the day!'
>
> Heh! I quite agree. I've had a few people write me saying they had used my book R for SAS and SPSS Users to learn SAS, but I certainly didn't aim for that when writing it.
>
> For R programmers wanting to learn SAS, here's what I recommend:
>
> 1. Read the text of the free version of R for SAS and SPSS Users at http://r4stats.com. That version has extremely short explanations of the differences by topic. Most of the explanation about R is in the form of comments in the R programs, which you can skip of course. The SAS programs will give you an idea of the basics. The book version adds lots of explanation but it's all about R, so skip that.
>
> 2. Read The Little SAS Book
>    http://www.amazon.com/Little-SAS-Book-Primer-Third/dp/1590473337/ref=sr_1_1?ie=UTF8&s=books&qid=1273963558&sr=8-1
>    This is a quick and easy read that covers the basics well.
>
> 3. Read SAS and R
>    http://www.amazon.com/SAS-Management-Statistical-Analysis-Graphics/dp/1420070576/ref=sr_1_1?ie=UTF8&s=books&qid=1273963594&sr=1-1
>    SAS and R is a good book that covers both SAS and R. The explanations are very brief but well written. That brevity allows it to cover a lot of ground.
>
> 4. For in-depth topics, the SAS documentation is well written and all online:
>    http://support.sas.com/documentation/index.html
>    Although the SAS manuals are online, knowing what to look up is the challenge for an R user. That's where 1 and 3 will help.
>
> Get ready for a whole different kind of world!
>
> Cheers,
> Bob
>
> =========================================================
> Bob Muenchen (pronounced Min'-chen), Manager
> Research Computing Support
> Voice: (865) 974-5230
> Email: muenc...@utk.edu
> Web: http://oit.utk.edu/research,
> News: http://oit.utk.edu/research/news.php
> Feedback: http://oit.utk.edu/feedback/
> =========================================================
Re: [R] high-dimensional contingency table
On May 23, 2010, at 8:41 PM, Claudia Rodriguez wrote:

> Dear Friends. I am just starting to use R. On this occasion I want to construct a high-dimensional contingency table, because I want to create a mosaic plot with the vcd package. My table is in this format:
>
>     año ac.rep    cat.gru conteos
> 1  2005      R    parejas     253
> 2  2005      N    parejas      23
> 3  2006      R    parejas     347
> 4  2006      N    parejas      39
> 5  2007      R    parejas     266
> 6  2007      N    parejas      83
> 7  2005      R solitarios      53
> 8  2005      N solitarios       1
> 9  2006      R solitarios     109
> 10 2006      N solitarios       8
> 11 2007      R solitarios      85
> 12 2007      N solitarios      34
> 13 2005      R      trios      29
> 14 2005      N      trios       1
> 15 2006      R      trios      62
> 16 2006      N      trios      19
> 17 2007      R      trios      48
> 18 2007      N      trios       3
>
> How can I do this? I saw the help of the mosaic command, and I found that the files are like a high-dimensional contingency table (for example Titanic data), but I was unable to do the change.

mosaic's help page says you need to supply a data.frame or a contingency table. Given that you do not have separate records for each individual, but rather have counts in the last column, you can use xtabs to create a table object. Note the help page of xtabs says:

  ## xtabs() <-> as.data.frame.table()

You need to tell xtabs which column has the counts:

  xtabs(conteos ~ ., dta)
  mosaic(cat.gru ~ año, data = xtabs(conteos ~ ., dta))

--
David.

> Thank you very much!!! With best wishes
>
> --
> Claudia I. Rodríguez-Flores
> Maestra en Ciencias Biológicas
> Laboratorio de Ecología, UBIPRO
> UNAM FES-Iztacala
> 52-55-56231130

David Winsemius, MD
West Hartford, CT
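[Editorial sketch, not part of the original thread.] The xtabs() step above can be tried on the posted counts. This is a sketch under two assumptions: the data frame is called dta as in the reply, and the year column is named year rather than año to keep the code ASCII-only. The mosaic call is left commented out because it needs the vcd package.

```r
# Rebuild the 18 posted rows. expand.grid() varies its first argument
# fastest, which matches the posted row order (R/N within year within group).
dta <- expand.grid(ac.rep  = c("R", "N"),
                   year    = c(2005, 2006, 2007),
                   cat.gru = c("parejas", "solitarios", "trios"))
dta$conteos <- c(253, 23, 347, 39, 266, 83,
                 53,  1, 109,  8,  85, 34,
                 29,  1,  62, 19,  48,  3)

# 'conteos' on the left-hand side tells xtabs() that this column holds
# the counts; '.' cross-classifies by all remaining columns.
tab <- xtabs(conteos ~ ., data = dta)

dim(tab)      # ac.rep x year x cat.gru
ftable(tab)   # flat printout of the three-way table

# With the vcd package installed, the table can be passed to mosaic():
# library(vcd)
# mosaic(cat.gru ~ year, data = tab)
```

The result is a three-way table of the same kind as the built-in Titanic data, so it can go wherever a contingency table is expected.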
[R] ROC curve
Hi, dear R community,

How do I select the optimal decision threshold from a ROC curve? Which threshold gives the highest accuracy?

Thanks!

--
Sincerely,
Changbin
Re: [R] sum of certain length
If the days are consecutive with no missing rows then the dates don't need to be calculated, and the data can be represented as a ts series with a frequency of 7. Just aggregate it down to a frequency of 1:

  rain <- ts(dat$rain, freq = 7)
  aggregate(rain, 1)

If there are missing rows (or even if there are none missing) you could use zoo. This time let's use dates:

  library(zoo)
  rain <- with(dat, zoo(rain, as.Date(paste(year, month, day, sep = "-"))))
  week <- 7 * (as.numeric(time(rain) - start(rain)) %/% 7) + start(rain) + 6
  aggregate(rain, week)

Each point in the aggregated series is associated with the date of the end of its week.

On Sun, May 23, 2010 at 8:09 PM, Roslina Zakaria zrosl...@yahoo.com wrote:
> Hi r-users,
>
> I have this data below. I would like to obtain the weekly rainfall sum. That is, I would like to find the sum for day 1 to day 7, day 8 to day 14, and so on.
>
>    year month day rain
> 1  1922     1   1  0.0
> 2  1922     1   2  0.0
> 3  1922     1   3  0.0
> 4  1922     1   4  0.0
> 5  1922     1   5  0.0
> 6  1922     1   6  0.0
> 7  1922     1   7  0.0
> 8  1922     1   8  6.6
> 9  1922     1   9  1.5
> 10 1922     1  10  0.0
> 11 1922     1  11  0.0
> 12 1922     1  12  4.8
> 13 1922     1  13 14.7
> 14 1922     1  14  0.0
> 15 1922     1  15  0.0
> 16 1922     1  16  0.0
> 17 1922     1  17  0.0
> 18 1922     1  18  0.0
> 19 1922     1  19  0.0
> 20 1922     1  20  0.8
> 21 1922     1  21  0.0
> 22 1922     1  22  0.0
> 23 1922     1  23  0.0
> 24 1922     1  24  0.0
> 25 1922     1  25  0.0
> 26 1922     1  26  0.0
> 27 1922     1  27  0.0
> 28 1922     1  28  0.0
> 29 1922     1  29  0.0
> 30 1922     1  30  0.0
> 31 1922     1  31  0.0
> 32 1922     2   1  0.0
> 33 1922     2   2  0.0
> 34 1922     2   3  0.0
> 35 1922     2   4  0.0
> 36 1922     2   5  0.0
> 37 1922     2   6  0.0
> 38 1922     2   7  0.0
> 39 1922     2   8  0.0
> 40 1922     2   9  0.0
>
> Thank you.
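[Editorial sketch, not part of the original thread.] The week-index idea in the reply above (integer division of the day offset by 7) can also be tried in base R without ts or zoo. This is a swapped-in alternative, not the poster's method; the variable names are chosen for illustration, and the 40 values are typed in from the posted table. It assumes consecutive days with no missing rows.

```r
# The 40 posted daily values, 1 Jan 1922 to 9 Feb 1922.
rain <- c(rep(0, 7),                    # days 1-7
          6.6, 1.5, 0, 0, 4.8, 14.7,    # days 8-13
          rep(0, 6),                    # days 14-19
          0.8,                          # day 20
          rep(0, 20))                   # days 21-40
dates <- seq(as.Date("1922-01-01"), by = "day", length.out = length(rain))

# Week index: 0 for days 1-7, 1 for days 8-14, and so on.
week <- (seq_along(rain) - 1) %/% 7

# Sum the rainfall within each week.
weekly <- vapply(split(rain, week), sum, numeric(1))

# Label each sum with the last date present in that week
# (the final week here is incomplete and ends on 9 Feb).
names(weekly) <- as.character(dates[!duplicated(week, fromLast = TRUE)])
print(weekly)
```

Unlike this version, the zoo approach in the reply derives the week from the actual dates, so it stays correct when some days are missing from the data.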