Re: [R] counting the duplicates in an object of list
Hmm, frustrating. BTW, unique() works all right. It seems that it either does not use deparse() or uses it differently.

On Wed, Sep 7, 2011 at 11:27 PM, William Dunlap wrote:
> I don't think you can increase width.cutoff above 500 and
> it isn't an argument to as.character or match. The best
> solution would be to avoid the internal use of deparse
> when using match() or unique() on lists and to hash the
> list element directly, but that is a fair bit of work.
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
> From: zhenjiang xu [mailto:zhenjiang...@gmail.com]
> Sent: Wednesday, September 07, 2011 8:04 PM
> To: William Dunlap
> Cc: r-help
> Subject: Re: [R] counting the duplicates in an object of list
>
> I tried converting the elements to strings before, but due to the large
> data size it took forever to finish with paste(). Is there any way to set the
> default width.cutoff longer and pass it to match()?
>
> On Wed, Sep 7, 2011 at 10:42 PM, William Dunlap wrote:
> > match(aList, aList) probably does what as.character(aList) does:
> > cut off the character strings at 500 characters (because deparse(x,
> > nlines=1, width.cutoff) requires that width.cutoff<=500). Try
> > converting the elements to character strings yourself before passing them
> > to match. E.g.,
> >   ac <- sapply(a, function(ai) paste(collapse="\n", deparse(ai)))
> > and use match on that. You can use the indices it returns on
> > the original list.
> >
> > Bill Dunlap
> > Spotfire, TIBCO Software
> > wdunlap tibco.com
> >
> > From: zhenjiang xu [mailto:zhenjiang...@gmail.com]
> > Sent: Wednesday, September 07, 2011 7:25 PM
> > To: William Dunlap
> > Cc: r-help
> > Subject: Re: [R] counting the duplicates in an object of list
> >
> > Now I nailed down the problem, but I am still confused why match() treats
> > the first two components and the last two as the same.
> > > > > match(a,a) > > [1] 1 2 3 1 2 > > > > > a > > [[1]] > > [1] "YARCTy1-1" "YAR009C" "YBLWTy1-1" "YBL005W-B" "YBRWTy1-2" > "YBR012W-B" > > [7] "YDRCTy1-1" "YDR098C-B" "YDRCTy1-2" "YDR210C-D" "YDRCTy1-3" > "YDR261C-D" > > [13] "YDRWTy1-4" "YDR316W-B" "YDRWTy1-5" "YDR365W-B" "YERCTy1-1" "YER138C" > > > [19] "YERCTy1-2" "YER160C" "YGRWTy1-1" "YGR027W-B" "YGRCTy1-2" > "YGR038C-B" > > [25] "YGRCTy1-3" "YGR161C-D" "YHRCTy1-1" "YHR214C-B" "YJRWTy1-1" "YJR027W" > > > [31] "YJRWTy1-2" "YJR029W" "YLR035C-A" "YLRCTy1-1" "YLR157C-B" > "YLRWTy1-3" > > [37] "YMLWTy1-1" "YML045W" "YMLWTy1-2" "YML039W" "YMRCTy1-3" "YMR045C" > > > [43] "YMRCTy1-4" "YMR050C" "YNLCTy1-1" "YNL284C-B" "YNLWTy1-2" > "YNL054W-B" > > [49] "YOLWTy1-1" "YOL103W-B" "YORWTy1-2" "YOR142W-B" "YPLWTy1-1" > "YPL257W-B" > > [55] "YPRCTy1-2" "YPR137C-B" "YPRWTy1-3" "YPR158W-B" > > > > [[2]] > > [1] "YARCTy1-1" "YAR009C" "YBLWTy1-1" "YBL005W-B" "YBRWTy1-2" > "YBR012W-B" > > [7] "YDRCTy1-1" "YDR098C-B" "YDRCTy1-2" "YDR210C-D" "YDRCTy1-3" > "YDR261C-D" > > [13] "YDRWTy1-4" "YDR316W-B" "YDRWTy1-5" "YDR365W-B" "YERCTy1-1" "YER138C" > > > [19] "YERCTy1-2" "YER160C" "YGRWTy1-1" "YGR027W-B" "YGRCTy1-2" > "YGR038C-B" > > [25] "YGRCTy1-3" "YGR161C-D" "YHRCTy1-1" "YHR214C-B" "YJRWTy1-1" "YJR027W" > > > [31] "YJRWTy1-2" "YJR029W" "YLR035C-A" "YLRCTy1-1" "YLR157C-B" > "YLRWTy1-2" > > [37] "YLR227W-B" "YLRWTy1-3" "YMLWTy1-1" "YML045W" "YMLWTy1-2" "YML039W" > >
Re: [R] counting the duplicates in an object of list
I tried converting the elements to strings before, but due to the large data size it took forever to finish with paste(). Is there any way to set the default width.cutoff longer and pass it to match()?

On Wed, Sep 7, 2011 at 10:42 PM, William Dunlap wrote:
> match(aList, aList) probably does what as.character(aList) does:
> cut off the character strings at 500 characters (because deparse(x,
> nlines=1, width.cutoff) requires that width.cutoff<=500). Try
> converting the elements to character strings yourself before passing them
> to match. E.g.,
>   ac <- sapply(a, function(ai) paste(collapse="\n", deparse(ai)))
> and use match on that. You can use the indices it returns on
> the original list.
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
> From: zhenjiang xu [mailto:zhenjiang...@gmail.com]
> Sent: Wednesday, September 07, 2011 7:25 PM
> To: William Dunlap
> Cc: r-help
> Subject: Re: [R] counting the duplicates in an object of list
>
> Now I nailed down the problem, but I am still confused why match() treats
> the first two components and the last two as the same.
> > ** ** > > > match(a,a) > > [1] 1 2 3 1 2 > > ** ** > > > a > > [[1]] > > [1] "YARCTy1-1" "YAR009C" "YBLWTy1-1" "YBL005W-B" "YBRWTy1-2" > "YBR012W-B" > > [7] "YDRCTy1-1" "YDR098C-B" "YDRCTy1-2" "YDR210C-D" "YDRCTy1-3" > "YDR261C-D" > > [13] "YDRWTy1-4" "YDR316W-B" "YDRWTy1-5" "YDR365W-B" "YERCTy1-1" "YER138C" > > > [19] "YERCTy1-2" "YER160C" "YGRWTy1-1" "YGR027W-B" "YGRCTy1-2" > "YGR038C-B" > > [25] "YGRCTy1-3" "YGR161C-D" "YHRCTy1-1" "YHR214C-B" "YJRWTy1-1" "YJR027W" > > > [31] "YJRWTy1-2" "YJR029W" "YLR035C-A" "YLRCTy1-1" "YLR157C-B" > "YLRWTy1-3" > > [37] "YMLWTy1-1" "YML045W" "YMLWTy1-2" "YML039W" "YMRCTy1-3" "YMR045C" > > > [43] "YMRCTy1-4" "YMR050C" "YNLCTy1-1" "YNL284C-B" "YNLWTy1-2" > "YNL054W-B" > > [49] "YOLWTy1-1" "YOL103W-B" "YORWTy1-2" "YOR142W-B" "YPLWTy1-1" > "YPL257W-B" > > [55] "YPRCTy1-2" "YPR137C-B" "YPRWTy1-3" "YPR158W-B" > > ** ** > > [[2]] > > [1] "YARCTy1-1" "YAR009C" "YBLWTy1-1" "YBL005W-B" "YBRWTy1-2" > "YBR012W-B" > > [7] "YDRCTy1-1" "YDR098C-B" "YDRCTy1-2" "YDR210C-D" "YDRCTy1-3" > "YDR261C-D" > > [13] "YDRWTy1-4" "YDR316W-B" "YDRWTy1-5" "YDR365W-B" "YERCTy1-1" "YER138C" > > > [19] "YERCTy1-2" "YER160C" "YGRWTy1-1" "YGR027W-B" "YGRCTy1-2" > "YGR038C-B" > > [25] "YGRCTy1-3" "YGR161C-D" "YHRCTy1-1" "YHR214C-B" "YJRWTy1-1" "YJR027W" > > > [31] "YJRWTy1-2" "YJR029W" "YLR035C-A" "YLRCTy1-1" "YLR157C-B" > "YLRWTy1-2" > > [37] "YLR227W-B" "YLRWTy1-3" "YMLWTy1-1" "YML045W" "YMLWTy1-2" "YML039W" > > > [43] "YMRCTy1-3" "YMR045C" "YMRCTy1-4" "YMR050C" "YNLCTy1-1" > "YNL284C-B" > > [49] "YNLWTy1-2" "YNL054W-B" "YOLWTy1-1" "YOL103W-B" "YORWTy1-2" > "YOR142W-B" > > [55] "YPLWTy1-1" "YPL257W-B" "YPRCTy1-2" "YPR137C-B" "YPRWTy1-3" > "YPR158W-B" > > [61] "YPRCTy1-4" "YPR158C-D" > > ** ** > > [[3]] > > [1] "YARCTy1-1" "YAR009C" "YDRCTy1-1" "YDR098C-B" "YDRCTy1-2" > "YDR210C-D" > > [7] "YDRCTy1-3" "YDR261C-D" "YDRWTy1-4" "YDR316W-B" "YDRWTy1-5" > "YDR365W-B" > > [13] "YERCTy1-1" &
Re: [R] how to create data.frames from vectors with duplicates
Thanks for benchmarking them. data.table is indeed worth looking at.

On Wed, Sep 7, 2011 at 9:55 PM, Dennis Murphy wrote:
> Hi:
>
> Here are a few informal timings on my machine with the following
> example. The data.table package is worth investigating, particularly
> in problems where its advantages can scale with size.
>
> library(data.table)
> dt <- data.table(x = sample(1:50, 100, replace = TRUE),
>                  y = sample(letters[1:26], 100, replace = TRUE),
>                  key = 'y')
> system.time(dt[, list(count = sum(x)), by = 'y'])
>    user  system elapsed
>    0.02    0.00    0.02
>
> # Data tables are also data frames, so we can use them as such:
>
> system.time(with(dt, tapply(x, y, sum)))
>    user  system elapsed
>    0.39    0.00    0.39
> system.time(with(dt, rowsum(x, y)))
>    user  system elapsed
>    0.04    0.00    0.03
> system.time(aggregate(x ~ y, data = dt, FUN = sum))
>    user  system elapsed
>    1.87    0.00    1.87
>
> So rowsum() is good, but data.table is a little better for this task.
> Increasing the size of the problem is to the advantage of both
> data.table and rowsum(), but tapply() takes a fair bit longer,
> relatively speaking (appx. 10x rowsum() in the first example, 20x in
> the second example). The ratios of rowsum() to data.table are about
> the same (appx. 2x).
>
> # 10M observations, 1000 groups
> > dt <- data.table(x = sample(1:100, 1000, replace = TRUE),
> +                  y = sample(1:1000, 1000, replace = TRUE),
> +                  key = 'y')
> > system.time(dt[, list(count = sum(x)), by = 'y'])
>    user  system elapsed
>    0.16    0.03    0.18
> > system.time(with(dt, rowsum(x, y)))
>    user  system elapsed
>    0.36    0.04    0.40
> > system.time(with(dt, tapply(x, y, sum)))
>    user  system elapsed
>    8.77    0.33    9.11
>
> HTH,
> Dennis
>
> On Wed, Sep 7, 2011 at 6:18 PM, zhenjiang xu wrote:
> > Thanks for all your replies. I am using rowsum() and it looks efficient. I
> > hope I could do some benchmark sometime in near future and let people know.
> > Or is there any benchmark result available?
> > > > On Wed, Aug 31, 2011 at 12:58 PM, Bert Gunter >wrote: > > > >> Inline below: > >> > >> On Wed, Aug 31, 2011 at 9:50 AM, Jorge I Velez < > jorgeivanve...@gmail.com> > >> wrote: > >> > Hi Zhenjiang, > >> > > >> > Try > >> > > >> > table(unlist(mapply(function(x, y) rep(x, y), y, x))) > >> > >> Yikes! How about simply tapply(x,y,sum) ?? > >> ?tapply > >> > >> -- Bert > >> > > >> > HTH, > >> > Jorge > >> > > >> > > >> > On Wed, Aug 31, 2011 at 12:45 PM, zhenjiang xu <> wrote: > >> > > >> >> Hi R users, > >> >> > >> >> suppose I have two vectors, > >> >> > x=c(1,2,3,4,5) > >> >> > y=c('a','b','c','a','c') > >> >> How can I get a data.frame like this? > >> >> > xy > >> >> count > >> >> a 5 > >> >> b 2 > >> >> c 8 > >> >> > >> >> I know a few ways to fulfill the task. However, I have a huge number > >> >> of this kind calculations, so I'd like an efficient solution. Thanks > >> >> > >> >> -- > >> >> Best, > >> >> Zhenjiang > >> >> > >> >> __ > >> >> R-help@r-project.org mailing list > >> >> https://stat.ethz.ch/mailman/listinfo/r-help > >> >> PLEASE do read the posting guide > >> >> http://www.R-project.org/posting-guide.html > >> >> and provide commented, minimal, self-contained, reproducible code. > >> >> > >> > > >> >[[alternative HTML version deleted]] > >> > > >> > __ > >> > R-help@r-project.org mailing list > >> > https://stat.ethz.ch/mailman/listinfo/r-help > >> > PLEASE do read the posting guide > >> http://www.R-project.org/posting-guide.html > >> > and provide commented, minimal, self-contained, reproducible code. > >> > > >> > > > > > > > > -- > > Best, > > Zhenjiang > > > >[[alternative HTML version deleted]] > > > > __ > > R-help@r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. 
--
Best,
Zhenjiang
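For anyone who wants to repeat this kind of comparison without extra packages, here is a minimal base-R sketch. The problem size, the seed, and the variable names are assumptions chosen for illustration, not the settings used in the timings above, and data.table is left out so that the snippet runs on a plain R installation.

```r
set.seed(1)          # assumed seed, only so the run is reproducible
n_obs <- 1e5         # assumed problem size; increase to see the gap widen
x <- sample(1:100, n_obs, replace = TRUE)
y <- sample(1:1000, n_obs, replace = TRUE)
d <- data.frame(x = x, y = y)

# Time three base-R ways of summing x within the groups defined by y
t_tapply    <- system.time(r1 <- tapply(x, y, sum))
t_rowsum    <- system.time(r2 <- rowsum(x, y))
t_aggregate <- system.time(r3 <- aggregate(x ~ y, data = d, FUN = sum))

# Collect the elapsed times side by side
timings <- rbind(tapply = t_tapply, rowsum = t_rowsum,
                 aggregate = t_aggregate)[, "elapsed"]
print(timings)

# Sanity check: all three approaches should give the same group sums
same_12 <- isTRUE(all.equal(as.numeric(r1), as.numeric(r2[, 1])))
same_13 <- isTRUE(all.equal(as.numeric(r1), as.numeric(r3$x)))
```

Absolute timings depend on the machine and R version, so only the relative ordering is worth reading into.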
Re: [R] counting the duplicates in an object of list
Now I nailed down the problem, but I am still confused why match() takes the 1st two components and the last two the same. > match(a,a) [1] 1 2 3 1 2 > a [[1]] [1] "YARCTy1-1" "YAR009C" "YBLWTy1-1" "YBL005W-B" "YBRWTy1-2" "YBR012W-B" [7] "YDRCTy1-1" "YDR098C-B" "YDRCTy1-2" "YDR210C-D" "YDRCTy1-3" "YDR261C-D" [13] "YDRWTy1-4" "YDR316W-B" "YDRWTy1-5" "YDR365W-B" "YERCTy1-1" "YER138C" [19] "YERCTy1-2" "YER160C" "YGRWTy1-1" "YGR027W-B" "YGRCTy1-2" "YGR038C-B" [25] "YGRCTy1-3" "YGR161C-D" "YHRCTy1-1" "YHR214C-B" "YJRWTy1-1" "YJR027W" [31] "YJRWTy1-2" "YJR029W" "YLR035C-A" "YLRCTy1-1" "YLR157C-B" "YLRWTy1-3" [37] "YMLWTy1-1" "YML045W" "YMLWTy1-2" "YML039W" "YMRCTy1-3" "YMR045C" [43] "YMRCTy1-4" "YMR050C" "YNLCTy1-1" "YNL284C-B" "YNLWTy1-2" "YNL054W-B" [49] "YOLWTy1-1" "YOL103W-B" "YORWTy1-2" "YOR142W-B" "YPLWTy1-1" "YPL257W-B" [55] "YPRCTy1-2" "YPR137C-B" "YPRWTy1-3" "YPR158W-B" [[2]] [1] "YARCTy1-1" "YAR009C" "YBLWTy1-1" "YBL005W-B" "YBRWTy1-2" "YBR012W-B" [7] "YDRCTy1-1" "YDR098C-B" "YDRCTy1-2" "YDR210C-D" "YDRCTy1-3" "YDR261C-D" [13] "YDRWTy1-4" "YDR316W-B" "YDRWTy1-5" "YDR365W-B" "YERCTy1-1" "YER138C" [19] "YERCTy1-2" "YER160C" "YGRWTy1-1" "YGR027W-B" "YGRCTy1-2" "YGR038C-B" [25] "YGRCTy1-3" "YGR161C-D" "YHRCTy1-1" "YHR214C-B" "YJRWTy1-1" "YJR027W" [31] "YJRWTy1-2" "YJR029W" "YLR035C-A" "YLRCTy1-1" "YLR157C-B" "YLRWTy1-2" [37] "YLR227W-B" "YLRWTy1-3" "YMLWTy1-1" "YML045W" "YMLWTy1-2" "YML039W" [43] "YMRCTy1-3" "YMR045C" "YMRCTy1-4" "YMR050C" "YNLCTy1-1" "YNL284C-B" [49] "YNLWTy1-2" "YNL054W-B" "YOLWTy1-1" "YOL103W-B" "YORWTy1-2" "YOR142W-B" [55] "YPLWTy1-1" "YPL257W-B" "YPRCTy1-2" "YPR137C-B" "YPRWTy1-3" "YPR158W-B" [61] "YPRCTy1-4" "YPR158C-D" [[3]] [1] "YARCTy1-1" "YAR009C" "YDRCTy1-1" "YDR098C-B" "YDRCTy1-2" "YDR210C-D" [7] "YDRCTy1-3" "YDR261C-D" "YDRWTy1-4" "YDR316W-B" "YDRWTy1-5" "YDR365W-B" [13] "YERCTy1-1" "YER138C" "YGRWTy1-1" "YGR027W-B" "YGRCTy1-2" "YGR038C-B" [19] "YJRWTy1-1" "YJR027W" "YJRWTy1-2" "YJR029W" "YLRCTy1-1" "YLR157C-B" [25] "YLRWTy1-3" 
"YMLWTy1-1" "YML045W" "YMLWTy1-2" "YML039W" "YMRCTy1-4" [31] "YMR050C" "YOLWTy1-1" "YOL103W-B" "YORWTy1-2" "YOR142W-B" "YPLWTy1-1" [37] "YPL257W-B" "YPRCTy1-2" "YPR137C-B" "YPRWTy1-3" "YPR158W-B" [[4]] [1] "YARCTy1-1" "YAR009C" "YBLWTy1-1" "YBL005W-B" "YBRWTy1-2" "YBR012W-B" [7] "YDRCTy1-1" "YDR098C-B" "YDRCTy1-2" "YDR210C-D" "YDRCTy1-3" "YDR261C-D" [13] "YDRWTy1-4" "YDR316W-B" "YDRWTy1-5" "YDR365W-B" "YERCTy1-1" "YER138C" [19] "YERCTy1-2" "YER160C" "YGRWTy1-1" "YGR027W-B" "YGRCTy1-2" "YGR038C-B" [25] "YGRCTy1-3" "YGR161C-D" "YHRCTy1-1" "YHR214C-B" "YJRWTy1-1" "YJR027W" [31] "YJRWTy1-2" "YJR029W" "YLR035C-A" "YLRCTy1-1" "YLR157C-B" "YLRWTy1-3" [37] "YMLWTy1-1" "YML045W" "YMLWTy1-2" "YML039W" "YMRCTy1-3" "YMR045C" [43] "YMRCTy1-4" "YMR050C" "YOLWTy1-1" "YOL103W-B" "YORWTy1-2" "YOR142W-B" [49] "YPLWTy1-1" "YPL257W-B&q
Re: [R] read.table truncated data?
Thanks for the suggestion. I guess that's the only thing I could do On Fri, Aug 26, 2011 at 4:22 AM, Petr PIKAL wrote: > Hi > > > > > Thanks, Jim. quote='' works. And then I found a single quote in each of > > these lines: > > 3262 > > 10403 > > 17544 > > 24685 > > 31826 > > 38967 > > > > None of them near the position the table got truncated. Why is it? > > > > And read.table is a great function. Is it possible for it to give a > warning > > message when the data gets truncated? In my case I almost looked over > the > > truncation... > > When I read in some big data I usually do > > str(data) > > which tells me if there is some problem with data types (conversion of > numeric to factor due to any problematic item) > > and/or > > dim(data) > > to see that size is as expected. > > Regards > Petr > > > > > On Thu, Aug 25, 2011 at 11:57 AM, jim holtman > wrote: > > > > > But did you try the following: > > > > > > x <- read.table(, comment.char = '', quote = '') > > > > > > Most cases is that there is a missing quote somewhere in your data. > > > use a text editor and search for single and double quotes. > > > > > > On Thu, Aug 25, 2011 at 11:49 AM, zhenjiang xu > > > > wrote: > > > > Thanks for your replies. I looked at those lines and didn't spot > anything > > > > unusual. > > > > > > > >> tail(a) > > > >test_id gene_id gene locus sample_1 sample_2 > status > > > > 21418 tY(GUA)J1 - SUP7 chr10:354243-354332 air1rrp6 air2rrp6 > OK > > > > 21419 tY(GUA)J2 - SUP4 chr10:542955-543044 air1rrp6 air2rrp6 > OK > > > > 21420 tY(GUA)M1 - SUP5 chr13:168794-168883 air1rrp6 air2rrp6 > OK > > > > 21421 tY(GUA)M2 - SUP8 chr13:837927-838016 air1rrp6 air2rrp6 > OK > > > > 21422 tY(GUA)O - SUP3 chr15:288191-288280 air1rrp6 air2rrp6 > OK > > > > 21423 tY(GUA)Q -- chrmt:70823-70907 air1rrp6 air2rrp6 > > > OK > > > > value_1 value_2 ln.fold_change. 
test_stat p_value q_value > > > > significant > > > > 21418 0.0 0.0.00 0.0 1.00 1.011650 > > > > no > > > > 21419 0.0 0.0.00 0.0 1.00 1.011480 > > > > no > > > > 21420 0.0 0.0.00 0.0 1.00 1.011500 > > > > no > > > > 21421 0.0 0.0.00 0.0 1.00 1.011520 > > > > no > > > > 21422 0.0 0.0.00 0.0 1.00 1.011550 > > > > no > > > > 21423 6.68356 10.73970.474301 -1.08614 0.277417 0.455917 > > > > no > > > > > > > > > > > > tY(GUA)J1 - SUP7chr10:354243-354332 rrp6 > air1rrp6 > > > > OK 0 0 0 0 11.00404 no > > > > tY(GUA)J2 - SUP4chr10:542955-543044 rrp6 > air1rrp6 > > > > OK 0 0 0 0 11.00497 no > > > > tY(GUA)M1 - SUP5chr13:168794-168883 rrp6 > air1rrp6 > > > > OK 0 0 0 0 11.00492 no > > > > tY(GUA)M2 - SUP8chr13:837927-838016 rrp6 > air1rrp6 > > > > OK 0 0 0 0 11.00488 no > > > > tY(GUA)O- SUP3chr15:288191-288280 rrp6 > air1rrp6 > > > > OK 0 0 0 0 11.00485 no > > > > tY(GUA)Q- - chrmt:70823-70907 rrp6 > air1rrp6 > > > > OK 4.49644 6.68356 0.396365-0.766052 0.443645 > > > > 0.634724no > > > > 15S_rRNA- 15S_RRNAchrmt:6545-8194 WT air2rrp6 > > > > OK 2288.88 711.697 -1.168172.78772 0.00530801 > > > > 0.0167772 yes > > > > 21S_rRNA- 21S_RRNAchrmt:58008-62447 WT > > > > air2rrp6OK 4134.59 1927.04 -0.7634 1.58991 0.111855 > > > > 0.22339 no > > > > ETS1-1 - ETS1-1 chr12:457732-458432 WT air2rrp6 > > > OK > > > > 3258.97 1114.76 -1.072772.91211 0.00359 0.0121587 > > > yes > > > > ETS1-2 - ETS1-2 chr12:466869-467569 WT air2rrp6 > > >
Re: [R] how to create data.frames from vectors with duplicates
Thanks for all your replies. I am using rowsum() and it looks efficient. I hope to run some benchmarks in the near future and let people know. Or are any benchmark results already available?

On Wed, Aug 31, 2011 at 12:58 PM, Bert Gunter wrote:
> Inline below:
>
> On Wed, Aug 31, 2011 at 9:50 AM, Jorge I Velez wrote:
> > Hi Zhenjiang,
> >
> > Try
> >
> > table(unlist(mapply(function(x, y) rep(x, y), y, x)))
>
> Yikes! How about simply tapply(x,y,sum) ??
> ?tapply
>
> -- Bert
>
> > HTH,
> > Jorge
> >
> > On Wed, Aug 31, 2011 at 12:45 PM, zhenjiang xu <> wrote:
> >
> >> Hi R users,
> >>
> >> suppose I have two vectors,
> >> > x=c(1,2,3,4,5)
> >> > y=c('a','b','c','a','c')
> >> How can I get a data.frame like this?
> >> > xy
> >>   count
> >> a     5
> >> b     2
> >> c     8
> >>
> >> I know a few ways to fulfill the task. However, I have a huge number
> >> of this kind calculations, so I'd like an efficient solution. Thanks
> >>
> >> --
> >> Best,
> >> Zhenjiang

--
Best,
Zhenjiang
Re: [R] counting the duplicates in an object of list
Thanks, Bill. match() is nice and efficient. However, I ran into a problem. My real data is a large _list_ named "read.genes". I found conflicting results between match() and unique(): the lengths of the outcomes are different (and my final results are wrong too). I suspect that some different list components are regarded as the same when they are converted to character vectors (the help page of match() says "Factors, raw vectors and lists are converted to character vectors"). Is that possible? And, just as important, how can I fix this?

> read.genes[[1]]
[1] "YAL065C" "YAL063C" "YAR050W" "YHR211W"
> duplicates <- as.vector(table(match(read.genes, read.genes)))
> length(duplicates)
[1] 1424
> read.genes.uniq <- unique(read.genes)
> length(read.genes.uniq)
[1] 1469
> sum(duplicates)
[1] 9945348
> length(read.genes)
[1] 9945348

On Wed, Aug 31, 2011 at 12:42 PM, William Dunlap wrote:
> table(match(x, x)) gives you the numbers but the labels are
> a bit more work.
>
> E.g., I'll define another list
>   > x <- list(c("1", "2", "4"), c("1", "2", "4"), 2^(0:4), 3^(1:2), 2^(0:4))
>   > tb <- table(m <- match(x, x))
>   > m
>   [1] 1 1 3 4 3
>   > tb
>
>   1 3 4
>   2 2 1
> which says that the first element of x is seen twice,
> the third twice, and the fourth once. How to organize
> that the best depends on what you want to do with the
> data.
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
> > -----Original Message-----
> > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of zhenjiang xu
> > Sent: Wednesday, August 31, 2011 9:25 AM
> > To: r-help
> > Subject: [R] counting the duplicates in an object of list
> >
> > Hi all,
> >
> > I have a list x:
> >
> > > x=list(a=c('1','2'),b=c('2','3'),c=c('1','2'),d=c('2','3'))
> >
> > I can get the unique elements with unique(), but how can I get the
> > number of duplicates for each unique elements?
> >
> > > unique(x)
> > [[1]]
> > [1] "1" "2"
> >
> > [[2]]
> > [1] "2" "3"
> >
> > Thanks
> >
> > --
> > Best,
> > Zhenjiang

--
Best,
Zhenjiang
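One hedged way to check for, and work around, the mismatch described in this message is to build an explicit string key for every list element and call match() on the keys rather than on the list itself. The sketch below uses made-up data (the `GENE...` names and the helper variable names are illustrative only); whether match() on a list actually shortens long elements may depend on the R version, so the code does not rely on that behaviour.

```r
# Hypothetical data: three long character vectors. The first and third are
# identical; the second differs from them only in its last element.
base <- sprintf("GENE%04d", 1:200)
a <- list(base, c(base[-200], "OTHER"), base)

# One string key per element, built from the complete deparsed text so that
# no part of a long element is dropped before the comparison.
keys <- vapply(a, function(ai) paste(deparse(ai), collapse = "\n"),
               character(1))

idx    <- match(keys, keys)               # first occurrence of each element
counts <- table(idx)                      # how often each distinct element occurs
uniq   <- a[as.integer(names(counts))]    # distinct elements, same order as counts

# Consistency check against unique(): both should see the same number of
# distinct elements.
n_distinct_match  <- length(counts)
n_distinct_unique <- length(unique(a))
```

The indices in `idx` refer to positions in the original list, so they can be used directly to pull out the elements. Building the keys costs one pass over the data, which may be slow for very large lists.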
[R] how to create data.frames from vectors with duplicates
Hi R users,

Suppose I have two vectors:

> x=c(1,2,3,4,5)
> y=c('a','b','c','a','c')

How can I get a data.frame like this?

> xy
  count
a     5
b     2
c     8

I know a few ways to accomplish the task. However, I have a huge number of calculations of this kind, so I'd like an efficient solution. Thanks.

--
Best,
Zhenjiang
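As a small illustration of the task, here is a sketch using two base-R functions that come up in the replies, rowsum() and tapply(). The result names `xy` and `xy2` are only illustrative.

```r
x <- c(1, 2, 3, 4, 5)
y <- c("a", "b", "c", "a", "c")

# rowsum() returns a one-column matrix whose row names are the groups
sums <- rowsum(x, y)
xy <- data.frame(count = sums[, 1], row.names = rownames(sums))

# tapply() gives the same totals as a named array
totals <- tapply(x, y, sum)
xy2 <- data.frame(count = as.vector(totals), row.names = names(totals))

print(xy)
```

Both produce a data.frame with one row per distinct value of y and the summed x in the `count` column.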
Re: [R] sum of two lists
Thanks, Henrique. It works.

On Mon, Aug 29, 2011 at 6:45 PM, Henrique Dallazuanna wrote:
> Try this:
>
> as.list(colSums(merge(m, n, all = TRUE), na.rm = TRUE))
>
> On Mon, Aug 29, 2011 at 7:39 PM, zhenjiang xu wrote:
>>
>> Hi R users,
>>
>> Suppose I have two lists and the names of list 'm' are a subset of those
>> of 'n', how can I sum the two lists with corresponding elements added
>> together to get list 'o'?
>>
>> > n = list("a"=1,"b"=3,"c"=5)
>> > m = list('b'=4)
>> > o
>> $a
>> [1] 1
>>
>> $b
>> [1] 7
>>
>> $c
>> [1] 5
>>
>> Thanks
>>
>> --
>> Best,
>> Zhenjiang
>
> --
> Henrique Dallazuanna
> Curitiba-Paraná-Brasil
> 25° 25' 40" S 49° 16' 22" O

--
Best,
Zhenjiang
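For reference, here is a small self-contained sketch of the merge()/colSums() suggestion, followed by an alternative that only touches the shared names. The alternative (Map() over names(m)) is not from the thread; it assumes that every name in m also occurs in n and that all elements are single numbers.

```r
n <- list(a = 1, b = 3, c = 5)
m <- list(b = 4)

# Suggestion from the thread: both lists are coerced to one-row data frames,
# merged on their common column, and the columns are then summed.
o1 <- as.list(colSums(merge(m, n, all = TRUE), na.rm = TRUE))

# Alternative sketch: copy n and add the elements of m to the matching names.
o2 <- n
o2[names(m)] <- Map(`+`, n[names(m)], m)

str(o2)
```

The second version keeps the element order of n, whereas the merge-based result puts the shared column first.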
[R] counting the duplicates in an object of list
Hi all,

I have a list x:

> x=list(a=c('1','2'),b=c('2','3'),c=c('1','2'),d=c('2','3'))

I can get the unique elements with unique(), but how can I get the number of duplicates for each unique element?

> unique(x)
[[1]]
[1] "1" "2"

[[2]]
[1] "2" "3"

Thanks

--
Best,
Zhenjiang
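A minimal sketch of one way to get these counts with match(), in the spirit of the replies to this post. The names `uniq` and `counts` are illustrative, and the approach assumes the list elements are short enough to be compared reliably.

```r
x <- list(a = c("1", "2"), b = c("2", "3"), c = c("1", "2"), d = c("2", "3"))

# Distinct elements, in order of first appearance
uniq <- x[!duplicated(x)]

# For every element of x, find its position among the distinct elements,
# then count how often each position occurs.
counts <- tabulate(match(x, uniq), nbins = length(uniq))

print(counts)
```

Here `counts[i]` is the number of times `uniq[[i]]` occurs in x.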
[R] sum of two lists
Hi R users,

Suppose I have two lists, and the names of list 'm' are a subset of those of 'n'. How can I sum the two lists, with corresponding elements added together, to get list 'o'?

> n = list("a"=1,"b"=3,"c"=5)
> m = list('b'=4)
> o
$a
[1] 1

$b
[1] 7

$c
[1] 5

Thanks

--
Best,
Zhenjiang
Re: [R] read.table truncated data?
Thanks, Jim. quote='' works. And then I found a single quote in each of these lines: 3262 10403 17544 24685 31826 38967 None of them near the position the table got truncated. Why is it? And read.table is a great function. Is it possible for it to give a warning message when the data gets truncated? In my case I almost looked over the truncation... On Thu, Aug 25, 2011 at 11:57 AM, jim holtman wrote: > But did you try the following: > > x <- read.table(, comment.char = '', quote = '') > > Most cases is that there is a missing quote somewhere in your data. > use a text editor and search for single and double quotes. > > On Thu, Aug 25, 2011 at 11:49 AM, zhenjiang xu > wrote: > > Thanks for your replies. I looked at those lines and didn't spot anything > > unusual. > > > >> tail(a) > >test_id gene_id gene locus sample_1 sample_2 status > > 21418 tY(GUA)J1 - SUP7 chr10:354243-354332 air1rrp6 air2rrp6 OK > > 21419 tY(GUA)J2 - SUP4 chr10:542955-543044 air1rrp6 air2rrp6 OK > > 21420 tY(GUA)M1 - SUP5 chr13:168794-168883 air1rrp6 air2rrp6 OK > > 21421 tY(GUA)M2 - SUP8 chr13:837927-838016 air1rrp6 air2rrp6 OK > > 21422 tY(GUA)O - SUP3 chr15:288191-288280 air1rrp6 air2rrp6 OK > > 21423 tY(GUA)Q -- chrmt:70823-70907 air1rrp6 air2rrp6 > OK > > value_1 value_2 ln.fold_change. 
test_stat p_value q_value > > significant > > 21418 0.0 0.0.00 0.0 1.00 1.011650 > > no > > 21419 0.0 0.0.00 0.0 1.00 1.011480 > > no > > 21420 0.0 0.0.00 0.0 1.00 1.011500 > > no > > 21421 0.0 0.0.00 0.0 1.00 1.011520 > > no > > 21422 0.0 0.0.00 0.0 1.00 1.011550 > > no > > 21423 6.68356 10.73970.474301 -1.08614 0.277417 0.455917 > > no > > > > > > tY(GUA)J1 - SUP7chr10:354243-354332 rrp6air1rrp6 > > OK 0 0 0 0 11.00404 no > > tY(GUA)J2 - SUP4chr10:542955-543044 rrp6air1rrp6 > > OK 0 0 0 0 11.00497 no > > tY(GUA)M1 - SUP5chr13:168794-168883 rrp6air1rrp6 > > OK 0 0 0 0 11.00492 no > > tY(GUA)M2 - SUP8chr13:837927-838016 rrp6air1rrp6 > > OK 0 0 0 0 11.00488 no > > tY(GUA)O- SUP3chr15:288191-288280 rrp6air1rrp6 > > OK 0 0 0 0 11.00485 no > > tY(GUA)Q- - chrmt:70823-70907 rrp6air1rrp6 > > OK 4.49644 6.68356 0.396365-0.766052 0.443645 > > 0.634724no > > 15S_rRNA- 15S_RRNAchrmt:6545-8194 WT air2rrp6 > > OK 2288.88 711.697 -1.168172.78772 0.00530801 > > 0.0167772 yes > > 21S_rRNA- 21S_RRNAchrmt:58008-62447 WT > > air2rrp6OK 4134.59 1927.04 -0.7634 1.58991 0.111855 > > 0.22339 no > > ETS1-1 - ETS1-1 chr12:457732-458432 WT air2rrp6 > OK > > 3258.97 1114.76 -1.072772.91211 0.00359 0.0121587 > yes > > ETS1-2 - ETS1-2 chr12:466869-467569 WT air2rrp6 > OK > > 3258.97 1114.76 -1.072772.91211 0.00359 0.0121597 > yes > > > > > > On Wed, Aug 24, 2011 at 2:34 PM, Sarah Goslee >wrote: > > > >> Hi, > >> > >> On Wed, Aug 24, 2011 at 2:18 PM, zhenjiang xu > >> wrote: > >> > Hi R users, > >> > > >> > I was using read.table to read a file. The data.fame looked alright, > but > >> I > >> > found not all rows are read by the read.table. What's wrong with it? > It > >> > didn't give me any warning or error messages. Why the data are > truncated? > >> > Thanks. > >> > > >> > $ wc -l all/isoform_exp.diff > >> > 42847 all/isoform_exp.diff > >> > > >> >> a=read.table('all/isoform_exp.diff', header=T, sep='\t') > >> >> nrow(a) > >> > [1] 21423 > >> > >> This is a common problem. 
You need to take a look at the last row that > >> was imported, and the rows around 21423 in the original file. > >> > >> Common causes include stray single or double quotation marks, and > >> other sp
Re: [R] read.table truncated data?
Thanks for your replies. I looked at those lines and didn't spot anything unusual. > tail(a) test_id gene_id gene locus sample_1 sample_2 status 21418 tY(GUA)J1 - SUP7 chr10:354243-354332 air1rrp6 air2rrp6 OK 21419 tY(GUA)J2 - SUP4 chr10:542955-543044 air1rrp6 air2rrp6 OK 21420 tY(GUA)M1 - SUP5 chr13:168794-168883 air1rrp6 air2rrp6 OK 21421 tY(GUA)M2 - SUP8 chr13:837927-838016 air1rrp6 air2rrp6 OK 21422 tY(GUA)O - SUP3 chr15:288191-288280 air1rrp6 air2rrp6 OK 21423 tY(GUA)Q -- chrmt:70823-70907 air1rrp6 air2rrp6 OK value_1 value_2 ln.fold_change. test_stat p_value q_value significant 21418 0.0 0.0.00 0.0 1.00 1.011650 no 21419 0.0 0.0.00 0.0 1.00 1.011480 no 21420 0.0 0.0.00 0.0 1.00 1.011500 no 21421 0.0 0.0.00 0.0 1.00 1.011520 no 21422 0.0 0.0.00 0.0 1.00 1.011550 no 21423 6.68356 10.73970.474301 -1.08614 0.277417 0.455917 no tY(GUA)J1 - SUP7chr10:354243-354332 rrp6air1rrp6 OK 0 0 0 0 11.00404 no tY(GUA)J2 - SUP4chr10:542955-543044 rrp6air1rrp6 OK 0 0 0 0 11.00497 no tY(GUA)M1 - SUP5chr13:168794-168883 rrp6air1rrp6 OK 0 0 0 0 11.00492 no tY(GUA)M2 - SUP8chr13:837927-838016 rrp6air1rrp6 OK 0 0 0 0 11.00488 no tY(GUA)O- SUP3chr15:288191-288280 rrp6air1rrp6 OK 0 0 0 0 11.00485 no tY(GUA)Q- - chrmt:70823-70907 rrp6air1rrp6 OK 4.49644 6.68356 0.396365-0.766052 0.443645 0.634724no 15S_rRNA- 15S_RRNAchrmt:6545-8194 WT air2rrp6 OK 2288.88 711.697 -1.168172.78772 0.00530801 0.0167772 yes 21S_rRNA- 21S_RRNAchrmt:58008-62447 WT air2rrp6OK 4134.59 1927.04 -0.7634 1.58991 0.111855 0.22339 no ETS1-1 - ETS1-1 chr12:457732-458432 WT air2rrp6OK 3258.97 1114.76 -1.072772.91211 0.00359 0.0121587 yes ETS1-2 - ETS1-2 chr12:466869-467569 WT air2rrp6OK 3258.97 1114.76 -1.072772.91211 0.00359 0.0121597 yes On Wed, Aug 24, 2011 at 2:34 PM, Sarah Goslee wrote: > Hi, > > On Wed, Aug 24, 2011 at 2:18 PM, zhenjiang xu > wrote: > > Hi R users, > > > > I was using read.table to read a file. The data.fame looked alright, but > I > > found not all rows are read by the read.table. 
What's wrong with it? It > > didn't give me any warning or error messages. Why the data are truncated? > > Thanks. > > > > $ wc -l all/isoform_exp.diff > > 42847 all/isoform_exp.diff > > > >> a=read.table('all/isoform_exp.diff', header=T, sep='\t') > >> nrow(a) > > [1] 21423 > > This is a common problem. You need to take a look at the last row that > was imported, and the rows around 21423 in the original file. > > Common causes include stray single or double quotation marks, and > other special characters in your file like the default comment.char # > > Sarah > -- > Sarah Goslee > http://www.functionaldiversity.org > -- Best, Zhenjiang [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
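The checks suggested in this thread can be sketched as follows. The file contents, the helper name `read_checked`, and the assumption that the file has a header line and no blank lines are all made up for illustration; only `read.table()`, `count.fields()`, `readLines()`, and `grep()` are standard R.

```r
# Hypothetical tab-separated file in which two data rows contain a stray
# single quote.
path <- tempfile(fileext = ".tsv")
writeLines(c("id\tgene",
             "1\tSUP7",
             "2\t5' end",
             "3\tSUP4",
             "4\t3' end",
             "5\tSUP5"), path)

# Locate lines that contain a single or double quote character
lines <- readLines(path)
quote_lines <- grep("['\"]", lines)

# With quoting switched off, every line should have the same number of fields
fields <- count.fields(path, sep = "\t", quote = "", comment.char = "")

# Hypothetical helper: read the file and warn if the number of rows does not
# match the number of data lines (assumes a header and no blank lines).
read_checked <- function(file, ...) {
  dat <- read.table(file, header = TRUE, sep = "\t", ...)
  n_expected <- length(readLines(file)) - 1L
  if (nrow(dat) != n_expected) {
    warning("read ", nrow(dat), " rows but expected ", n_expected)
  }
  dat
}

good <- read_checked(path, quote = "", comment.char = "")
```

Calling the helper without `quote = ""` on a file like this would be expected to raise the warning, since quote characters in the data can make several lines be read as one field.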
[R] read.table truncated data?
Hi R users,

I was using read.table to read a file. The data.frame looked all right, but I found that not all rows were read by read.table. What's wrong with it? It didn't give me any warning or error messages. Why are the data truncated? Thanks.

$ wc -l all/isoform_exp.diff
42847 all/isoform_exp.diff

> a=read.table('all/isoform_exp.diff', header=T, sep='\t')
> nrow(a)
[1] 21423

--
Best,
Zhenjiang
Re: [R] a question on list manipulation
Unfortunately the list names of my real data are irregular with mixed digit and letters at the end. This is good idea though. It inspired me to give another solution based on that: > x <- list(A=c("d", "e", "f"), B=c("d", "e"), C=c("d","g")) > tmp <- unlist(x, use.names=F) > a = unlist(lapply(x, length)) > tmp2 = rep(names(a), a) > x.new = split(tmp2, tmp) And I tested it on my data. It took over an hour using for loops while finishing in a second with the vectorization. Thanks all of you. Hooray~ On Fri, Aug 5, 2011 at 3:31 PM, Greg Snow wrote: > Here is one approach, whether it is better than the basic loop or not is up > to you: > >> x <- list(A=c("d", "e", "f"), B=c("d", "e"), C=c("d")) >> >> tmp <- unlist(x) >> tmp2 <- sub( '[0-9]+$', '', names(tmp) ) >> >> x.new <- split( tmp2, tmp ) >> x.new > $d > [1] "A" "B" "C" > > $e > [1] "A" "B" > > $f > [1] "A" > > > Of course this version will have some problems if the names of your list > elements end with digits that you don't want stripped off (but you can work > around that by preprocessing the list names). > > -- > Gregory (Greg) L. Snow Ph.D. > Statistical Data Center > Intermountain Healthcare > greg.s...@imail.org > 801.408.8111 > > >> -Original Message- >> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- >> project.org] On Behalf Of zhenjiang xu >> Sent: Friday, August 05, 2011 11:04 AM >> To: Duncan Murdoch >> Cc: r-help >> Subject: Re: [R] a question on list manipulation >> >> Exactly! Sorry I get others misunderstood. The uppercase/lowercase is >> only a toy example (and a bad one; yours is better than mine). My >> question is a more general one: a list is basically a one-to-many >> matching, from the names of a list to the elements belonging to each >> name. I'd like to reverse the matching, from all the elements to the >> names of the list. 
>> >> On Fri, Aug 5, 2011 at 12:53 PM, Duncan Murdoch >> wrote: >> > On 05/08/2011 12:05 PM, zhenjiang xu wrote: >> >> >> >> Hi R users, >> >> >> >> I have a list: >> >> > x >> >> $A >> >> [1] "a" "b" "c" >> >> $B >> >> [1] "b" "c" >> >> $C >> >> [1] "c" >> >> >> >> I want to convert it to a lowercase-to-uppercase list like this: >> >> > y >> >> $a >> >> [1] "A" >> >> $b >> >> [1] "A" "B" >> >> $c >> >> [1] "A" "B" "C" >> >> >> >> In a word, I want to reverse the list names and the elements under >> >> each list name. Is there any quick way to do that? Thanks >> > >> > I interpreted this question differently from the others, and your >> example is >> > ambiguous as to which is the right interpretation. I thought you >> wanted to >> > swap names and elements, so >> > >> >> x <- list(A=c("d", "e", "f"), B=c("d", "e"), C=c("d")) >> >> x >> > $A >> > [1] "d" "e" "f" >> > >> > $B >> > [1] "d" "e" >> > >> > $C >> > [1] "d" >> > >> > would become >> > >> >> list(d=c("A", "B", "C"), e=c("A", "B"), f="A") >> > $d >> > [1] "A" "B" "C" >> > >> > $e >> > [1] "A" "B" >> > >> > $f >> > [1] "A" >> > >> > I don't know a slick way to do this; I'd just do it by brute force, >> looping >> > over the names of x. >> > >> > Duncan Murdoch >> > >> >> >> >> -- >> Best, >> Zhenjiang >> >> __ >> R-help@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting- >> guide.html >> and provide commented, minimal, self-contained, reproducible code. > -- Best, Zhenjiang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
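The vectorized approach from the top of this message can be wrapped in a small helper. This is a minimal sketch; the function name invert_list is made up for illustration, and it assumes every element of x is an atomic vector and that x has names:

```r
# Invert a named list: map each element value to the names it appears under.
invert_list <- function(x) {
  values <- unlist(x, use.names = FALSE)   # all elements, in list order
  owners <- rep(names(x), lengths(x))      # the list name of each element
  split(owners, values)                    # group names by element value
}

x <- list(A = c("d", "e", "f"), B = c("d", "e"), C = c("d", "g"))
inv <- invert_list(x)
inv
```

Because the names are taken from names(x) directly rather than from the suffixed names that unlist() generates, no digit stripping is needed, so list names that end in digits are safe.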
Re: [R] a question on list manipulation
This is a nice solution. Thanks, Dennis. But I am afraid if the length of the list x isn't equal to the length of x2, there will be errors since lapply returns a list of the same length. > x <- list(A=c("d", "e", "f"), B=c("d", "e"), C=c("d","g")) > x2 <- unique(unlist(x)) > w <- lapply(x, function(u) names(x)[which(x2 %in% u)]) > names(w) <- x2 Error in names(w) <- x2 : 'names' attribute [4] must be the same length as the vector [3] On Fri, Aug 5, 2011 at 3:23 PM, Dennis Murphy wrote: > Hi: > > Your clarification suggests Duncan was on the right track, so how about this: > > x <- list(A=c("d", "e", "f"), B=c("d", "e"), C=c("d")) > x2 <- unique(unlist(x)) > w <- lapply(x, function(u) names(x)[which(x2 %in% u)]) > names(w) <- x2 > w > $d > [1] "A" "B" "C" > > $e > [1] "A" "B" > > $f > [1] "A" > > HTH, > Dennis > > On Fri, Aug 5, 2011 at 10:04 AM, zhenjiang xu wrote: >> Exactly! Sorry I get others misunderstood. The uppercase/lowercase is >> only a toy example (and a bad one; yours is better than mine). My >> question is a more general one: a list is basically a one-to-many >> matching, from the names of a list to the elements belonging to each >> name. I'd like to reverse the matching, from all the elements to the >> names of the list. >> >> On Fri, Aug 5, 2011 at 12:53 PM, Duncan Murdoch >> wrote: >>> On 05/08/2011 12:05 PM, zhenjiang xu wrote: >>>> >>>> Hi R users, >>>> >>>> I have a list: >>>> > x >>>> $A >>>> [1] "a" "b" "c" >>>> $B >>>> [1] "b" "c" >>>> $C >>>> [1] "c" >>>> >>>> I want to convert it to a lowercase-to-uppercase list like this: >>>> > y >>>> $a >>>> [1] "A" >>>> $b >>>> [1] "A" "B" >>>> $c >>>> [1] "A" "B" "C" >>>> >>>> In a word, I want to reverse the list names and the elements under >>>> each list name. Is there any quick way to do that? Thanks >>> >>> I interpreted this question differently from the others, and your example is >>> ambiguous as to which is the right interpretation. 
I thought you wanted to >>> swap names and elements, so >>> >>>> x <- list(A=c("d", "e", "f"), B=c("d", "e"), C=c("d")) >>>> x >>> $A >>> [1] "d" "e" "f" >>> >>> $B >>> [1] "d" "e" >>> >>> $C >>> [1] "d" >>> >>> would become >>> >>>> list(d=c("A", "B", "C"), e=c("A", "B"), f="A") >>> $d >>> [1] "A" "B" "C" >>> >>> $e >>> [1] "A" "B" >>> >>> $f >>> [1] "A" >>> >>> I don't know a slick way to do this; I'd just do it by brute force, looping >>> over the names of x. >>> >>> Duncan Murdoch >>> >> >> >> >> -- >> Best, >> Zhenjiang >> >> __ >> R-help@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > -- Best, Zhenjiang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
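The length mismatch reported at the top of this message arises because lapply() runs over x (three components) while the result is named with x2 (four unique values). One possible repair, offered as a sketch rather than as Dennis's intended code, is to iterate over the unique values instead and ask, for each value, which components of x contain it:

```r
x  <- list(A = c("d", "e", "f"), B = c("d", "e"), C = c("d", "g"))
x2 <- unique(unlist(x))                      # "d" "e" "f" "g"

# For each unique value v, keep the names of the components that contain v.
w <- lapply(x2, function(v) {
  names(x)[vapply(x, function(u) v %in% u, logical(1))]
})
names(w) <- x2                               # lengths now agree by construction
w
```

This does one pass over x per unique value, so for large data the split() based solution from the other reply in this thread is likely to be faster.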
Re: [R] a question on list manipulation
Exactly! Sorry that I caused the misunderstanding. The uppercase/lowercase list is only a toy example (and a bad one; yours is better than mine). My question is a more general one: a list is basically a one-to-many mapping from the names of the list to the elements belonging to each name. I'd like to reverse that mapping, so that it goes from each element to the names of the list that contain it.

On Fri, Aug 5, 2011 at 12:53 PM, Duncan Murdoch wrote:
> On 05/08/2011 12:05 PM, zhenjiang xu wrote:
>>
>> Hi R users,
>>
>> I have a list:
>> > x
>> $A
>> [1] "a" "b" "c"
>> $B
>> [1] "b" "c"
>> $C
>> [1] "c"
>>
>> I want to convert it to a lowercase-to-uppercase list like this:
>> > y
>> $a
>> [1] "A"
>> $b
>> [1] "A" "B"
>> $c
>> [1] "A" "B" "C"
>>
>> In a word, I want to reverse the list names and the elements under
>> each list name. Is there any quick way to do that? Thanks
>
> I interpreted this question differently from the others, and your example is
> ambiguous as to which is the right interpretation. I thought you wanted to
> swap names and elements, so
>
>> x <- list(A=c("d", "e", "f"), B=c("d", "e"), C=c("d"))
>> x
> $A
> [1] "d" "e" "f"
>
> $B
> [1] "d" "e"
>
> $C
> [1] "d"
>
> would become
>
>> list(d=c("A", "B", "C"), e=c("A", "B"), f="A")
> $d
> [1] "A" "B" "C"
>
> $e
> [1] "A" "B"
>
> $f
> [1] "A"
>
> I don't know a slick way to do this; I'd just do it by brute force, looping
> over the names of x.
>
> Duncan Murdoch

--
Best,
Zhenjiang
Re: [R] how to control to save plots to which dev
Yes, but I thought the parameter to dev.set() should only be the value returned by dev.next()/dev.prev(). So I read the help page again. It's a little embarrassing - I missed the sentence "Devices are associated with ... a number in the range 1 to 63". I should have read the help page more carefully. Thanks. On Fri, Aug 5, 2011 at 12:02 PM, Duncan Murdoch wrote: > On 05/08/2011 11:49 AM, zhenjiang xu wrote: >> >> Thanks, Prof Ripley. I was using dev.next(), dev.prev(),, but I am >> wondering, instead of switching the current dev, is there a way to >> more directly print plot A into file connection A, plot B into file >> connection B...? Because if coding with more then two dev >> simultaniously, one could easily get confused which dev is the current >> one. > > dev.set() will do exactly that (and Prof. Ripley did point you to it). > > Duncan Murdoch >> >> On Tue, Aug 2, 2011 at 1:28 AM, Prof Brian Ripley >> wrote: >> > On Tue, 2 Aug 2011, David Winsemius wrote: >> > >> >> >> >> On Aug 1, 2011, at 11:14 PM, zhenjiang xu wrote: >> >> >> >>> Hi, >> >>> >> >>> I have a for loop to make 2 types of plots and I'd like to save one >> >>> type of plots to a pdf file and the other to another pdf file. How >> >>> can >> >>> I control which plot will be saved to which pdf? Thanks >> >> >> >> Why not give them file names that identify the type? >> > >> > I think he wants >> > >> > pdf("a.pdf") >> > pdf("b.pdf") >> > for(i in 1:n) { >> > plot something on a.pdf >> > plot something on b.pdf >> > } >> > >> > This is done using dev.prev/dev.next/dev.set: see their help for >> > details. >> > >> >> >> >> -- >> >> >> >> David Winsemius, MD >> >> West Hartford, CT >> >> >> >> __ >> >> R-help@r-project.org mailing list >> >> https://stat.ethz.ch/mailman/listinfo/r-help >> >> PLEASE do read the posting guide >> >> http://www.R-project.org/posting-guide.html >> >> and provide commented, minimal, self-contained, reproducible code. >> >> >> > >> > -- >> > Brian D. 
Ripley, rip...@stats.ox.ac.uk >> > Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ >> > University of Oxford, Tel: +44 1865 272861 (self) >> > 1 South Parks Road, +44 1865 272866 (PA) >> > Oxford OX1 3TG, UK Fax: +44 1865 272595 >> > >> >> >> > > -- Best, Zhenjiang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
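The dev.set() approach discussed in this thread can be sketched as follows. The file names are temporary placeholders and the plots are dummy data. The idea is to record each device number with dev.cur() right after opening the device, so the loop never has to track which device happens to be current:

```r
# Temporary output files; the names are placeholders.
file_a <- tempfile(fileext = ".pdf")
file_b <- tempfile(fileext = ".pdf")

pdf(file_a); dev_a <- dev.cur()   # remember the device number right after opening
pdf(file_b); dev_b <- dev.cur()

for (i in 1:3) {
  dev.set(dev_a)                  # the next plot goes to file_a
  plot(rnorm(10), main = paste("Type A, iteration", i))

  dev.set(dev_b)                  # the next plot goes to file_b
  hist(rnorm(100), main = paste("Type B, iteration", i))
}

# Close each device explicitly by number.
invisible(dev.off(dev_a))
invisible(dev.off(dev_b))
```

Each PDF ends up with one page per iteration, since every high-level plot call starts a new page on whichever device is active at that moment.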
[R] a question on list manipulation
Hi R users,

I have a list:

> x
$A
[1] "a" "b" "c"
$B
[1] "b" "c"
$C
[1] "c"

I want to convert it to a lowercase-to-uppercase list like this:

> y
$a
[1] "A"
$b
[1] "A" "B"
$c
[1] "A" "B" "C"

In short, I want to swap the list names and the elements under each list name. Is there a quick way to do that? Thanks.

--
Best,
Zhenjiang
Re: [R] how to control to save plots to which dev
Thanks, Prof Ripley. I was using dev.next(), dev.prev(),, but I am wondering, instead of switching the current dev, is there a way to more directly print plot A into file connection A, plot B into file connection B...? Because if coding with more then two dev simultaniously, one could easily get confused which dev is the current one. On Tue, Aug 2, 2011 at 1:28 AM, Prof Brian Ripley wrote: > On Tue, 2 Aug 2011, David Winsemius wrote: > >> >> On Aug 1, 2011, at 11:14 PM, zhenjiang xu wrote: >> >>> Hi, >>> >>> I have a for loop to make 2 types of plots and I'd like to save one >>> type of plots to a pdf file and the other to another pdf file. How can >>> I control which plot will be saved to which pdf? Thanks >> >> Why not give them file names that identify the type? > > I think he wants > > pdf("a.pdf") > pdf("b.pdf") > for(i in 1:n) { > plot something on a.pdf > plot something on b.pdf > } > > This is done using dev.prev/dev.next/dev.set: see their help for details. > >> >> -- >> >> David Winsemius, MD >> West Hartford, CT >> >> __ >> R-help@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > -- > Brian D. Ripley, rip...@stats.ox.ac.uk > Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ > University of Oxford, Tel: +44 1865 272861 (self) > 1 South Parks Road, +44 1865 272866 (PA) > Oxford OX1 3TG, UK Fax: +44 1865 272595 > -- Best, Zhenjiang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] how to control to save plots to which dev
Hi,

I have a for loop that makes two types of plots, and I'd like to save one type to one PDF file and the other type to a second PDF file. How can I control which plot is saved to which PDF? Thanks.

--
Best,
Zhenjiang
Re: [R] how to add two data.frame with the same column but different row numbers
Thanks, Gabor. It's a nice workaround. I'll look more into the zoo package.

On Fri, Apr 15, 2011 at 7:10 PM, Gabor Grothendieck wrote:
> On Fri, Apr 15, 2011 at 6:10 PM, zhenjiang xu wrote:
> > Thanks, Dennis! I'll go with it. It's surprising there is no ready way to do
> > that. I imagine it should be a common data manipulation to add two
> > data.frame from two different sources. It could happen that one data.frame
> > is missing some rows while the other have some more.
>
> If you represent them as zoo series then you can do it using +
> (although the definition of + is different than in your post). Here
> "a", "b" and "c" are the "times":
>
> library(zoo)
> a <- zoo(1:3, letters[1:3])
> b <- zoo(c(6, 1), c("a", "c"))
> a+b
>
> The last line gives:
>
> > a+b
> a c
> 7 4
>
> To use the definition in your post one could do this (which has the
> effect of modifying b so that a+b works as in your post):
>
> merge(a, b, fill = 0, retclass = NULL)
> a+b
>
> The last line gives:
>
> > a+b
> a b c
> 7 2 4
>
> --
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com

--
Best,
Zhenjiang
Re: [R] how to add two data.frame with the same column but different row numbers
Thanks, Dennis! I'll go with it. It's surprising that there is no ready-made way to do that. I imagine adding two data.frames from two different sources is a common data manipulation, and it can easily happen that one data.frame is missing some rows that the other one has.

On Fri, Apr 15, 2011 at 5:10 PM, Dennis Murphy wrote:
> Hi:
>
> Here's one approach:
>
> > df1 <- data.frame(x = letters[1:3], y = 1:3)
> > df2 <- data.frame(x = c('a', 'c'), z = c(6, 1))
> > dfm <- merge(df1, df2, all.x = TRUE)
> > dfm
>   x y  z
> 1 a 1  6
> 2 b 2 NA
> 3 c 3  1
> sumdf <- data.frame(x = dfm$x, y = rowSums(dfm[, -1], na.rm = TRUE))
>   x y
> 1 a 7
> 2 b 2
> 3 c 4
>
> HTH,
> Dennis
>
> On Fri, Apr 15, 2011 at 1:31 PM, zhenjiang xu wrote:
> > Hi all,
> >
> > Suppose I have 2 data.frame , a and b, how can I add them together to get c?
> > Thanks
> >> a
> >   A
> > a 1
> > b 2
> > c 3
> >
> >> b
> >   A
> > a 6
> > c 1
> >
> >> c
> >   A
> > a 7
> > b 2
> > c 4
> >
> > --
> > Best,
> > Zhenjiang

--
Best,
Zhenjiang
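When the rows are identified by row names, as in the original question, a base R alternative to the merge() approach is to stack the two data frames and sum by row name with rowsum(). This is a sketch under the assumption that both data frames have the same numeric columns; the variable names keys and s are made up for illustration:

```r
a <- data.frame(A = c(1, 2, 3), row.names = c("a", "b", "c"))
b <- data.frame(A = c(6, 1),    row.names = c("a", "c"))

# Collect the row names before rbind(), which would otherwise make them unique.
keys <- c(rownames(a), rownames(b))

# Sum all rows that share a key; rows present in only one input pass through.
s <- rowsum(rbind(a, b), group = keys)
s
```

A row that is missing from one input simply contributes nothing to its sum, which matches the behavior asked for in the question.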
[R] how to add two data.frame with the same column but different row numbers
Hi all,

Suppose I have two data.frames, a and b. How can I add them together to get c? Thanks.

> a
  A
a 1
b 2
c 3

> b
  A
a 6
c 1

> c
  A
a 7
b 2
c 4

--
Best,
Zhenjiang
[R] how to reshape the data.frame from long to wide in a specific order
Hi,

For example, given a data.frame like this:

origdata.long <- read.table(header=T, con <- textConnection('
 subject sex condition measurement
       1   M   control         7.9
       1   M     first        12.3
       1   M    second        10.7
       2   F   control         6.3
       2   F     first        10.6
       2   F    second        11.1
       3   F   control         9.5
       3   F     first        13.1
       3   F    second        13.8
       4   M   control        11.5
       4   M     first        13.4
       4   M    second        12.9
'))
close(con)

Given a vector c('first', 'second', 'control'), how can I reshape the data.frame to this?

# subject sex first second control
#       1   M  12.3   10.7     7.9
#       2   F  10.6   11.1     6.3
#       3   F  13.1   13.8     9.5
#       4   M  13.4   12.9    11.5

I know reshape() can transform the data.frame from long to wide, but it does not seem to be able to control the order of the columns. Thanks in advance.

--
Best,
Zhenjiang
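One way to get the requested column order, sketched here with base R only, is to let reshape() produce the wide table in whatever order it chooses and then select the columns by name in the desired order. The vector name ord is an assumption made for illustration:

```r
origdata.long <- data.frame(
  subject     = rep(1:4, each = 3),
  sex         = rep(c("M", "F", "F", "M"), each = 3),
  condition   = rep(c("control", "first", "second"), times = 4),
  measurement = c(7.9, 12.3, 10.7,  6.3, 10.6, 11.1,
                  9.5, 13.1, 13.8, 11.5, 13.4, 12.9),
  stringsAsFactors = FALSE
)

ord <- c("first", "second", "control")   # desired column order

# Long to wide: columns come out as measurement.control, measurement.first, ...
wide <- reshape(origdata.long, idvar = c("subject", "sex"),
                timevar = "condition", direction = "wide")

# Reorder by name, then drop the "measurement." prefix.
wide <- wide[, c("subject", "sex", paste0("measurement.", ord))]
names(wide) <- sub("^measurement\\.", "", names(wide))
rownames(wide) <- NULL
wide
```

Selecting by name rather than by position keeps the result correct even if the conditions appear in a different order in the input.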
[R] ggplot2 problem in interacting mode
Hi all,

When running R interactively, I have the following problem:

> library(ggplot2)
Loading required package: reshape
Loading required package: plyr

Attaching package: 'reshape'

The following object(s) are masked from 'package:plyr':

    round_any

Loading required package: grid
Loading required package: proto
> data(VADeaths)
> pg <- ggplot(melt(VADeaths), aes(value, X1)) + geom_point() +
+   facet_wrap(~X2) + ylab("")
> print(pg)
Error in get("transform", env = ., inherits = TRUE)(., ...) :
  attempt to apply non-function

My R session information is:

> library(plyr)
> sessionInfo()
R version 2.11.1 (2010-05-31)
x86_64-pc-linux-gnu

locale:
 [1] LC_CTYPE=zh_CN.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] lattice_0.18-8 ggplot2_0.8.8  proto_0.3-8    reshape_0.8.3  plyr_1.2.1

loaded via a namespace (and not attached):
[1] tools_2.11.1

The interesting thing is that when I put the code into an R script and run it with the command "R CMD BATCH XX.R", it works all right. Does anyone have any idea what the problem is? Thanks~

--
Best,
Zhenjiang
[R] error bars in lattice barchart
Hi all,

I've read the emails from Dan, Deepayan and Sundar about adding error bars to lattice plots (https://stat.ethz.ch/pipermail/r-help/2006-October/114883.html), but I still have a problem when I try to add error bars to a barchart. I tried both Deepayan's and Sundar's solutions, without luck. Here is my code (I changed prepanel.ci and panel.ci a little to plot the bars vertically):

## Sundar's solution ###
prepanel.ci <- function(x, y, ly, uy, subscripts, ...) {
  y <- as.numeric(y)
  ly <- as.numeric(ly[subscripts])
  uy <- as.numeric(uy[subscripts])
  list(ylim = range(y, uy, ly, finite = TRUE))
}

panel.ci <- function(x, y, ly, uy, subscripts, groups = NULL, pch = 16, ...) {
  x <- as.numeric(x)
  y <- as.numeric(y)
  ly <- as.numeric(ly[subscripts])
  uy <- as.numeric(uy[subscripts])
  par <- if(is.null(groups)) "plot.symbol" else "superpose.symbol"
  sym <- trellis.par.get(par)
  col <- sym$col
  groups <- if(!is.null(groups)) {
    groups[subscripts]
  } else {
    rep(1, along = x)
  }
  ug <- unique(groups)
  for(i in seq(along = ug)) {
    subg <- groups == ug[i]
    y.g <- y[subg]
    x.g <- x[subg]
    ly.g <- ly[subg]
    uy.g <- uy[subg]
    panel.abline(h = unique(y.g), col = "grey")
    panel.arrows(ly.g, y.g, uy.g, y.g, col = 'black', length = 0.25, unit = "native", angle = 90, code = 3)
    panel.barchart(x.g, y.g, pch = pch, col = col[i], ...)
  }
}

all = barchart(
  Score ~ Methods | Score.Name * RNA.Type,
  data = benchmark,
  box.ratio = 1.2,
  xlab = 'Methods',
  ylab = 'Percentage',
  groups = Seq.Number,
  layout = c(2, 5), # 2 columns per row
  between = list( y = 0.5, x = 0 ),
  # par.settings = list(fontsize=list(text=8)),
  ## specify the colors used for bars
  par.settings = list(fontsize=list(text=8), superpose.polygon = list(border = 'black', col = c('white', 'gray', 'black'))),
  par.strip.text = list(cex=0.9),
  auto.key = list(space = 'top', columns = 3, cex = 0.7),
  # key = key.variety,
  # index.cond = list(c('tRNA', '5S rRNA', 'SRP RNA', 'RNase P', '16S rRNA')),
  # index.cond = list(rep(1,6)),
  # ylim = my.ylim,
  scales = list(x = list(rot = 45), y=list(tck = 0.4, rot = 0, relation = 'free')),
  ly = benchmark$Score - benchmark$Error,
  uy = benchmark$Score + benchmark$Error,
  prepanel = prepanel.ci,
  panel.groups = panel.ci
)

## Deepayan's solution ###
prepanel.ci <- function(x, y, ly, uy, subscripts, ...) {
  y <- as.numeric(y)
  ly <- as.numeric(ly[subscripts])
  uy <- as.numeric(uy[subscripts])
  list(ylim = range(y, uy, ly, finite = TRUE))
}

panel.ci <- function(x, y, ly, uy, subscripts, ...) {
  x <- as.numeric(x)
  y <- as.numeric(y)
  ly <- as.numeric(ly[subscripts])
  uy <- as.numeric(uy[subscripts])
  panel.barchart(x, y, ...)
  panel.arrows(x, ly, x, uy, col = 'black', length = 0.1, unit = "native", angle = 90, code = 3)
}

all = barchart(
  Score ~ Methods | Score.Name * RNA.Type,
  data = benchmark,
  box.ratio = 1.2,
  xlab = 'Methods',
  ylab = 'Percentage',
  groups = Seq.Number,
  layout = c(2, 5), # 2 columns per row
  between = list( y = 0.5, x = 0 ),
  # par.settings = list(fontsize=list(text=8)),
  ## specify the colors used for bars
  par.settings = list(fontsize=list(text=8), superpose.polygon = list(border = 'black', col = c('white', 'gray', 'black'))),
  par.strip.text = list(cex=0.9),
  auto.key = list(space = 'top', columns = 3, cex = 0.7),
  # key = key.variety,
  # index.cond = list(c('tRNA', '5S rRNA', 'SRP RNA', 'RNase P', '16S rRNA')),
  # index.cond = list(rep(1,6)),
  # ylim = my.ylim,
  scales = list(x = list(rot = 45), y=list(tck = 0.4, rot = 0, relation = 'free')),
  ly = benchmark$Score - benchmark$Error,
  uy = benchmark$Score + benchmark$Error,
  prepanel = prepanel.ci,
  panel.groups = panel.ci,
  panel = panel.superpose
)

Sundar's solution gives me exactly the same plot as the original, without error bars, and Deepayan's solution gives me a messy plot. Did I mess up anything in these two solutions? I'd appreciate any help from you experts. Thanks

--
Best,
Zhenjiang
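The benchmark data is not available, so the following is only a reduced sketch of the idea on made-up data, with one bar per category and no groups. In that simpler case the bar centres are at as.numeric(x), so vertical arrows drawn there line up with the bars. With groups, lattice shifts the bars of each group away from those positions, which is probably why the grouped attempts above look wrong. The names d and panel.bar.ci are assumptions made for illustration:

```r
library(lattice)

# Made-up summary data: one bar per method, with a symmetric error.
d <- data.frame(method = factor(c("A", "B", "C")),
                score  = c(10, 12, 8),
                err    = c(1, 2, 0.5))

panel.bar.ci <- function(x, y, ly, uy, subscripts, ...) {
  panel.barchart(x, y, ...)                   # draw the bars first
  xx <- as.numeric(x)                         # bar centres on the x axis
  panel.arrows(xx, ly[subscripts], xx, uy[subscripts],
               angle = 90, code = 3, length = 0.05, col = "black")
}

p <- barchart(score ~ method, data = d, origin = 0,
              ylim = c(0, max(d$score + d$err) * 1.1),
              ly = d$score - d$err, uy = d$score + d$err,
              panel = panel.bar.ci)
print(p)
```

Drawing the bars before the arrows matters: in the first attempt above the arrows are drawn first and horizontally, so the bars paint over them. Extending this to grouped bars would require computing the per-group horizontal offsets, which is not attempted here.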
[R] matrix problem
Hi,

I have a file like this:

1 2 0.1
2 3 0.2
3 1 0.3

And I want to read it to create a matrix like this:

     [,1] [,2] [,3]
[1,]    0  0.1    0
[2,]    0    0  0.2
[3,]  0.3    0    0

How can I do it efficiently? Thanks.

--
Best,
Zhenjiang
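One vectorized way to do this, sketched here with the three example lines supplied inline instead of read from a file, is matrix indexing: a two-column matrix of (row, column) pairs can be used to assign all values in a single step. The variable names trip, n and m are made up for illustration:

```r
# The example triplets (row, column, value); a real file name would go in `file =`.
trip <- read.table(text = "1 2 0.1
2 3 0.2
3 1 0.3")

n <- max(trip$V1, trip$V2)          # matrix dimension taken from the largest index
m <- matrix(0, nrow = n, ncol = n)

# Index with a two-column matrix: each row of the index is one (i, j) cell.
m[cbind(trip$V1, trip$V2)] <- trip$V3
m
```

For very large and mostly empty matrices a sparse representation (for example from the Matrix package) may be a better fit than a dense matrix filled with zeros.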
[R] how to display the value of each data points on the levelplot
Hi R users,

How can I display the corresponding value inside each little square of the level plot produced by the following code?

> data(Cars93, package = "MASS")
> cor.Cars93 <- cor(Cars93[, !sapply(Cars93, is.factor)], use = "pair")
> levelplot(cor.Cars93, aspect = 1, scales = list(x = list(rot = 90)))

This is an example from the book "Lattice: Multivariate Data Visualization with R". I know there is an example (Fig 13.5) showing how to do a levelplot with data labels and ellipse shapes, but here I want to keep the square shape.

--
Best,
Zhenjiang
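A common way to do this is a custom panel function that first draws the usual coloured squares with panel.levelplot() and then writes the values on top with panel.text(). The sketch below assumes the MASS package is installed for the Cars93 data; the rounding and the cex value are arbitrary choices:

```r
library(lattice)

data(Cars93, package = "MASS")
cor.Cars93 <- cor(Cars93[, !sapply(Cars93, is.factor)], use = "pair")

p <- levelplot(cor.Cars93, aspect = 1,
               scales = list(x = list(rot = 90)),
               panel = function(x, y, z, subscripts, ...) {
                 # Standard squares, unchanged.
                 panel.levelplot(x, y, z, subscripts = subscripts, ...)
                 # Overlay the value of each cell at its centre.
                 panel.text(x[subscripts], y[subscripts],
                            labels = sprintf("%.2f", z[subscripts]),
                            cex = 0.5)
               })
print(p)
```

With this many variables the labels are small; a larger device or fewer columns makes them easier to read.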
Re: [R] a question on autocorrelation acf
Thanks, Duncan, but there is no reference in ?acf. The only possibly related part is "Author(s): Original: Paul Gilbert, Martyn Plummer. Extensive modifications and univariate case of 'pacf' by B.D. Ripley." And I didn't find anything with a Google search for it.

On Thu, Apr 29, 2010 at 7:08 PM, Duncan Murdoch wrote:
> On 29/04/2010 6:22 PM, zhenjiang xu wrote:
>> Hi R users,
>>
>> where can I find the equations used by acf function to calculate
>> autocorrelation?
>
> See the reference listed in ?acf.
>
> Duncan Murdoch
>
>> I think I misunderstand acf. Doesn't acf use following
>> equation to calculate autocorrelation?
>>   R(\tau) = \frac{\operatorname{E}[(X_t - \mu)(X_{t+\tau} - \mu)]}{\sigma^2}
>> If it does, then the autocorrelation of a sine function should give a
>> cosine; however, the following code gives a cosine-shape function with its
>> magnitude decreasing along the lag.
>> x = c(1:500)
>> x = x/10
>> x = sin(x)
>> acf(x, type='correlation', lag.max=length(x)-1)

--
Best,
Zhenjiang
[R] a question on autocorrelation acf
Hi R users,

Where can I find the equations used by the acf function to calculate autocorrelation? I think I misunderstand acf. Doesn't acf use the following equation to calculate autocorrelation?

  R(\tau) = \frac{\operatorname{E}[(X_t - \mu)(X_{t+\tau} - \mu)]}{\sigma^2}

If it does, then the autocorrelation of a sine function should give a cosine; however, the following code gives a cosine-shaped function whose magnitude decreases with the lag.

x = c(1:500)
x = x/10
x = sin(x)
acf(x, type='correlation', lag.max=length(x)-1)

--
Best,
Zhenjiang
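For a finite sample, acf() estimates the expectation with a sum divided by n, the full series length, rather than by the n - k terms actually available at lag k. Writing c(k) = (1/n) * sum over t from 1 to n-k of (x_t - xbar)(x_{t+k} - xbar), the reported value is c(k) / c(0). Since fewer and fewer terms enter the sum while the divisor stays at n, the estimate shrinks as the lag grows, which accounts for the decaying envelope. The sketch below recomputes that estimator by hand and compares it with acf(); the helper name acov is made up for illustration:

```r
x  <- sin((1:500) / 10)
n  <- length(x)
xc <- x - mean(x)                      # acf() demeans by default

# Sample autocovariance at lag k with divisor n (not n - k).
acov <- function(k) sum(xc[1:(n - k)] * xc[(1 + k):n]) / n

lags     <- 0:(n - 1)
r_manual <- vapply(lags, acov, numeric(1)) / acov(0)

# The same quantity from acf(), without plotting.
r_acf <- drop(acf(x, type = "correlation", lag.max = n - 1, plot = FALSE)$acf)

all.equal(r_manual, r_acf)
```

Dividing each c(k) by (n - k) / n instead would remove most of the decay, at the cost of much noisier estimates at large lags.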
Re: [R] how to reorder of groups and specify ylim for each row in lattice barchart
Yes. I put the real ranges instead of '...'. But I tried the following code and it works. This is great! Thank you. Previously I thought you said ylim was put inside the scales(). library(lattice) barchart(yield ~ variety | site,data=barley, groups = year, layout = c(1,6),auto.key = list(points = FALSE, rectangles = TRUE, space = "right"),ylab = "Barley Yield (bushels/acre)",scales = list(x = list(rot = 45), y=list(relation='free')), ylim = mylist) On Fri, Apr 23, 2010 at 11:51 AM, Peter Ehlers wrote: > Works for me. Did you replace the '' in mylist() > with appropriate c(,) code? For example: > > mylist <- list(c(0,30), c(40,80), c(0,50), > c(0,50), c(0,50), c(0,50)) > > -Peter Ehlers > > > On 2010-04-23 9:22, zhenjiang xu wrote: > >> Peter, thanks, but that doesn't work. Did I missed something? >> >> library(lattice) >> mylist<- list(c(0,30), c(40,80), ) >> barchart(yield ~ variety | site,data=barley, groups = year, layout = >> c(1,6),auto.key = list(points = FALSE, rectangles = TRUE, space = >> "right"),ylab = "Barley Yield (bushels/acre)",scales = list(x = list(rot = >> 45), y=list(relation='free', ylim=mylist))) >> >> On Thu, Apr 22, 2010 at 7:54 PM, Peter Ehlers wrote: >> >> On 2010-04-21 21:13, zhenjiang xu wrote: >>> >>> R experts, >>>> >>>> Is there anyway to reorder inside each group? In the following example, >>>> the >>>> bar of year 1932 is always plotted before the bar of year 1931, may I >>>> change >>>> the order inside each groups of bars? >>>> >>>> >>>> Do you mean a different order in different panels? That seems to >>> me to defeat the purpose of panels. I can't think of an easy way >>> to do that. 
>>> >>> >>> library(lattice) >>> >>>> barchart(yield ~ variety | site,data=barley, groups = year, layout = >>>> c(1,6),auto.key = list(points = FALSE, rectangles = TRUE, space = >>>> "right"),ylab = "Barley Yield (bushels/acre)",scales = list(x = list(rot >>>> = >>>> 45))) >>>> >>>> >>>> Another questions is: may I specify the various ranges of y axis for >>>> each >>>> row of the panels? In the example of above, how can I change the y range >>>> of >>>> "Waseca" panel to (40,70) from the default (0,80)? Please notice I don't >>>> want to set arguement "scales = list(y=list(relation='free'))", for the >>>> automatic various setting of ranges for different panels isn't good >>>> enough >>>> for me. Basically I'd like to manually control y ranges. >>>> >>>> >>> You can use scales() with >>> y=list(relation='free', ylim=mylist) >>> >>> where mylist is a list of ylims: >>> mylim<- list(c(0,30), c(40,80), ) >>> >>> >> Peter, thanks, but that doesn't work. Did I missed something? >> >> library(lattice) >> mylist<- list(c(0,30), c(40,80), ) >> barchart(yield ~ variety | site,data=barley, groups = year, layout = >> c(1,6),auto.key = list(points = FALSE, rectangles = TRUE, space = >> "right"),ylab = "Barley Yield (bushels/acre)",scales = list(x = list(rot = >> 45), y=list(relation='free', ylim=mylist))) >> >> >> >>> -Peter Ehlers >>> >>> >>> Thank you! >>>> >>>> >>> -- >>> Peter Ehlers >>> University of Calgary >>> >>> >> >> >> > -- > Peter Ehlers > University of Calgary > -- Best, Zhenjiang [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
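The attempt with ylim inside scales fails because the component of scales that holds axis limits is called limits, not ylim; passing ylim as a top-level argument, as in the working call at the top of this message, is the other accepted spelling. The sketch below uses the limits form with the same per-panel ranges that appear in the thread. The releveling of year is an extra assumption, shown as one possible way to change the order of the bars within each group:

```r
library(lattice)

# One y range per panel, in panel order (values taken from the thread).
mylist <- list(c(0, 30), c(40, 80), c(0, 50),
               c(0, 50), c(0, 50), c(0, 50))

p <- barchart(yield ~ variety | site, data = barley,
              # Put 1931 before 1932 within each group of bars.
              groups = factor(year, levels = c("1931", "1932")),
              layout = c(1, 6),
              auto.key = list(points = FALSE, rectangles = TRUE,
                              space = "right"),
              ylab = "Barley Yield (bushels/acre)",
              scales = list(x = list(rot = 45),
                            y = list(relation = "free", limits = mylist)))
print(p)
```

A list of limits is only honoured when relation is "free" or "sliced"; with the default relation = "same" all panels share one range.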
[R] a question related to table output
Hi,

I have a data.frame object:

> a.df
   Methods     Score
1 Northern 1.3544227
2 Northern 0.8302436
3   RT-PCR 1.0011360
4   RT-PCR 1.1149423

If I write it out with write.table,

> write.table(a.df, file = 'data.txt', quote = FALSE, sep = '\t', row.names = FALSE)

then data.txt looks like this:

Methods	Score
Northern	1.35442268939541
Northern	0.830243615689926
RT-PCR	1.00113601434407
RT-PCR	1.11494230904995

My question is: can I merge the two "Northern" entries into one cell, like "Merge Cells" in MS Excel?

Best,
Zhenjiang
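A tab-separated text file has no notion of merged cells, so the closest plain-text equivalent is to blank out a label when it repeats the one directly above it. The sketch below does that on a copy of the data before writing; the output path is a temporary placeholder and the names out and same_as_prev are made up for illustration:

```r
a.df <- data.frame(Methods = c("Northern", "Northern", "RT-PCR", "RT-PCR"),
                   Score   = c(1.3544227, 0.8302436, 1.0011360, 1.1149423),
                   stringsAsFactors = FALSE)

out <- a.df                                   # leave the original untouched

# TRUE where a label equals the label on the previous row.
same_as_prev <- c(FALSE, out$Methods[-1] == out$Methods[-nrow(out)])
out$Methods[same_as_prev] <- ""

outfile <- tempfile(fileext = ".txt")         # placeholder path
write.table(out, file = outfile, quote = FALSE, sep = "\t", row.names = FALSE)
out
```

Real merged cells would need a spreadsheet format and a package that writes it, which is outside what write.table() can do.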
Re: [R] the bar width of barchart plot in lattice package
Probably yes. I plotted each row individually instead. Thanks.

On Fri, Apr 23, 2010 at 11:14 AM, Deepayan Sarkar wrote:
> On Wed, Apr 21, 2010 at 8:55 PM, David Winsemius wrote:
> >
> > On Apr 21, 2010, at 9:51 PM, zhenjiang xu wrote:
> >
> >> I tried that. It seems the bar width is already maximized, although there
> >> is a lot of space between groups of bars. Thank you anyway.
> >
> > I apologize. It was reproducible code. I missed the "values" assignment.
> > There is also a box.width argument which does affect how the plot gets
> > drawn, but the effects do not appear salutary. It appears that the alignment
> > of the bars gets shifted relative to the labels. The barchart function
> > cannot seem to deal with the complexity of the 2 * 5 factor crossed with a
> > c(3,3,4) factor. On the other hand that problem seems to be present in the
> > original plot as well. Maybe you should re-think the structure of the data?
>
> The problem is that levels is nested within factors:
>
> > xtabs(~levels + factors, a)
>                            factors
> levels                      Cycles MaxPairs Order
>   Cycle 1                       10        0     0
>   Cycle 2                       10        0     0
>   Cycle 3                       10        0     0
>   Cycle 4                       10        0     0
>   Order 1                        0        0    10
>   Order 2                        0        0    10
>   Order 3                        0        0    10
>   MaxPairs = 20                  0       10     0
>   MaxPairs = Average Length      0       10     0
>   MaxPairs = 500                 0       10     0
>
> I can't think of a meaningful design that would give the desired result here.
>
> -Deepayan

--
Best,
Zhenjiang
Re: [R] how to reorder of groups and specify ylim for each row in lattice barchart
Peter, thanks, but that doesn't work. Did I miss something?

library(lattice)
mylist <- list(c(0,30), c(40,80), )
barchart(yield ~ variety | site, data=barley, groups = year,
         layout = c(1,6),
         auto.key = list(points = FALSE, rectangles = TRUE, space = "right"),
         ylab = "Barley Yield (bushels/acre)",
         scales = list(x = list(rot = 45),
                       y = list(relation='free', ylim=mylist)))

On Thu, Apr 22, 2010 at 7:54 PM, Peter Ehlers wrote:
> On 2010-04-21 21:13, zhenjiang xu wrote:
>
>> R experts,
>>
>> Is there any way to reorder inside each group? In the following example, the
>> bar of year 1932 is always plotted before the bar of year 1931; may I change
>> the order inside each group of bars?
>
> Do you mean a different order in different panels? That seems to
> me to defeat the purpose of panels. I can't think of an easy way
> to do that.
>
>> library(lattice)
>> barchart(yield ~ variety | site, data=barley, groups = year,
>>          layout = c(1,6),
>>          auto.key = list(points = FALSE, rectangles = TRUE, space = "right"),
>>          ylab = "Barley Yield (bushels/acre)",
>>          scales = list(x = list(rot = 45)))
>>
>> Another question is: may I specify different y axis ranges for each
>> row of the panels? In the example above, how can I change the y range of the
>> "Waseca" panel to (40,70) from the default (0,80)? Please notice I don't
>> want to set the argument "scales = list(y=list(relation='free'))", for the
>> automatic setting of ranges for different panels isn't good enough
>> for me. Basically I'd like to manually control the y ranges.
>
> You can use scales() with
>   y=list(relation='free', ylim=mylist)
>
> where mylist is a list of ylims:
>   mylist <- list(c(0,30), c(40,80), )
>
> -Peter Ehlers
>
>> Thank you!
>
> --
> Peter Ehlers
> University of Calgary

--
Best,
Zhenjiang
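One possible reason the call above fails, offered as a guess rather than a confirmed diagnosis: as far as I can tell from ?xyplot, the per-panel limits go either in the top-level `ylim` argument or in the `limits` component of `scales$y`, not in a component called `ylim` inside `scales`. The list also needs complete entries (the trailing comma in `mylist` leaves an empty argument). The sketch below assumes one `c(min, max)` pair per site; all ranges except the (40, 70) requested for Waseca are placeholders, and `site_levels` and `my_ylim` are made-up names.

```r
library(lattice)

# One c(min, max) pair per panel, in the order of the site levels.
site_levels <- levels(barley$site)
my_ylim <- rep(list(c(0, 80)), length(site_levels))  # placeholder ranges

# The range asked for in the original question.
my_ylim[[which(site_levels == "Waseca")]] <- c(40, 70)

p <- barchart(yield ~ variety | site, data = barley, groups = year,
              layout = c(1, 6),
              auto.key = list(points = FALSE, rectangles = TRUE,
                              space = "right"),
              ylab = "Barley Yield (bushels/acre)",
              ylim = my_ylim,  # list form is meant for relation = "free"
              scales = list(x = list(rot = 45),
                            y = list(relation = "free")))

# Draw it with print(p)
```

Note that barchart draws bars from an origin of 0 by default, so with a range of (40, 70) the lower part of each bar is simply clipped.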
[R] how to reorder of groups and specify ylim for each row in lattice barchart
R experts,

Is there any way to reorder inside each group? In the following example, the bar of year 1932 is always plotted before the bar of year 1931; may I change the order inside each group of bars?

library(lattice)
barchart(yield ~ variety | site, data=barley, groups = year,
         layout = c(1,6),
         auto.key = list(points = FALSE, rectangles = TRUE, space = "right"),
         ylab = "Barley Yield (bushels/acre)",
         scales = list(x = list(rot = 45)))

Another question is: may I specify different y axis ranges for each row of the panels? In the example above, how can I change the y range of the "Waseca" panel to (40,70) from the default (0,80)? Please notice I don't want to set the argument "scales = list(y=list(relation='free'))", for the automatic setting of ranges for different panels isn't good enough for me. Basically I'd like to manually control the y ranges.

Thank you!

--
Best,
Zhenjiang
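On the first question, the order of bars within each group follows the level order of the grouping factor, so the usual approach is to re-level that factor before plotting. The sketch below assumes year is stored as a factor whose first level is 1932, which would explain the order seen in the plot; `barley2` is a made-up copy so that the built-in dataset is left untouched.

```r
library(lattice)

# Work on a copy and put 1931 first in the level order.
barley2 <- barley
barley2$year <- factor(barley2$year, levels = c("1931", "1932"))

p <- barchart(yield ~ variety | site, data = barley2, groups = year,
              layout = c(1, 6),
              auto.key = list(points = FALSE, rectangles = TRUE,
                              space = "right"),
              ylab = "Barley Yield (bushels/acre)",
              scales = list(x = list(rot = 45)))

# Draw it with print(p)
```

This changes the order in every panel at once; the legend drawn by auto.key should follow the same level order.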
Re: [R] the bar width of barchart plot in lattice package
I tried that. It seems the bar width is already maximized, although there is a lot of space between groups of bars. Thank you anyway.

On Tue, Apr 20, 2010 at 10:16 AM, David Winsemius wrote:
>
> On Apr 20, 2010, at 9:46 AM, zhenjiang xu wrote:
>
>> Dear R users,
>>
>> I am trying to use the following code to make a barchart plot. The bars in
>> the plot turn out to be a little narrow. Is there any way to modify the
>> width of the bars? Thank you!
>>
>> library(lattice)
>> scores = gl(2, 5, label=c('Sensitivity', 'PPV'), length = 100)
>> sequences = gl(5, 1, label=c('Lemna minor', 'Dugesia japonica A',
>>     'Gymnosporangium sabinae', 'Hymeniacidon sanguinea',
>>     'Streptomyces griseus'), length = 100)
>> levels = gl(10, 10, label = c('Cycle 1', 'Cycle 2', 'Cycle 3', 'Cycle 4',
>>     'Order 1', 'Order 2', 'Order 3', 'MaxPairs = 20',
>>     'MaxPairs = Average Length', 'MaxPairs = 500'))
>> factors = c(rep('Cycles', 40), rep('Order', 30), rep('MaxPairs', 30))
>> values = rnorm(100) # this is toy data
>> a = data.frame(values, scores, sequences, levels, factors)
>> bc.factors =
>>   barchart(values ~ sequences | scores * factors, data = a,
>>            groups = levels,
>>            layout = c(2,3),
>>            between = list(y=0.5),
>>            clip = list(strip = 'off'),
>>            par.strip.text = list(cex=0.7),
>>            par.settings = list(fontsize=list(text=8)),
>>            auto.key = list(rectangles = TRUE, space = 'right', columns = 1),
>>            draw.key = TRUE,
>>            scales = list(x = list(rot = 45)))
>
> ?barchart
>
> Looking at the arguments to barchart in the help page I would have guessed
> that box.ratio would do what you want. Since that is clearly not
> reproducible code (in the absence of a test dataset of the appropriate
> structure), I suppose guessing will remain the level of my knowledge in this
> instance.
>
>> --
>> Best,
>> Zhenjiang
>
> David Winsemius, MD
> West Hartford, CT

--
Best,
Zhenjiang
[R] the bar width of barchart plot in lattice package
Dear R users,

I am trying to use the following code to make a barchart plot. The bars in the plot turn out to be a little narrow. Is there any way to modify the width of the bars? Thank you!

library(lattice)
scores = gl(2, 5, label=c('Sensitivity', 'PPV'), length = 100)
sequences = gl(5, 1, label=c('Lemna minor', 'Dugesia japonica A',
    'Gymnosporangium sabinae', 'Hymeniacidon sanguinea',
    'Streptomyces griseus'), length = 100)
levels = gl(10, 10, label = c('Cycle 1', 'Cycle 2', 'Cycle 3', 'Cycle 4',
    'Order 1', 'Order 2', 'Order 3', 'MaxPairs = 20',
    'MaxPairs = Average Length', 'MaxPairs = 500'))
factors = c(rep('Cycles', 40), rep('Order', 30), rep('MaxPairs', 30))
values = rnorm(100) # this is toy data
a = data.frame(values, scores, sequences, levels, factors)
bc.factors =
  barchart(values ~ sequences | scores * factors, data = a,
           groups = levels,
           layout = c(2,3),
           between = list(y=0.5),
           clip = list(strip = 'off'),
           par.strip.text = list(cex=0.7),
           par.settings = list(fontsize=list(text=8)),
           auto.key = list(rectangles = TRUE, space = 'right', columns = 1),
           draw.key = TRUE,
           scales = list(x = list(rot = 45)))

--
Best,
Zhenjiang