[R] Grouped bar plot
Hi,

I am trying to produce a grouped bar plot from a data.frame and I'm having difficulties figuring out how to do so. My data is 500 rows by 4 columns and basically looks like so:

    head(x)
        V1        V2        V3        V4
    1  XOM 0.2317915 0.1610068 1.6941637
    2 AAPL 0.6735488 0.7433611 0.1594102
    3   GE 1.2554160 0.9237384 1.6767711
    4  IBM 1.6296938 0.3730387 0.5858115
    5  CVX 0.9194169 0.4785705 0.1803601
    6   PG 0.7768241 1.7622060 0.7640163
    . . .

I would like to produce something similar to what is found at:

http://www.statmethods.net/graphs/bar.html # the grouped barplot example

or

http://had.co.nz/ggplot2/geom_bar.html # the Dodged bar charts example

Across the X-axis, for each set (row) of 3 data points (V2, V3, V4) associated with a symbol (V1), I would like to create a group of 3 bars reflecting their values. So the Y-axis will represent the magnitude of values in the columns (V2, V3, V4), and X-axis will have 500 groups of 3 bars, for a total of 1500 bars. I would like the color of each bar to reflect the column of data it represents, and to label each group of 3 with the corresponding symbol in column V1.

I was trying to get this to work using ggplot but the y-axis in the example is the count, which is not what I'm after. Any suggestions to get me started down the right path would be appreciated.

Thank you.

James

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
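A minimal sketch of one way to get grouped bars from data shaped like this, using base R's barplot() with beside = TRUE. The six rows are copied from the head(x) output in the question; the colours and the axis label are arbitrary choices made for illustration, not part of the original post.

```r
# Toy stand-in for the poster's data: one symbol column, three numeric columns.
x <- data.frame(
  V1 = c("XOM", "AAPL", "GE", "IBM", "CVX", "PG"),
  V2 = c(0.2317915, 0.6735488, 1.2554160, 1.6296938, 0.9194169, 0.7768241),
  V3 = c(0.1610068, 0.7433611, 0.9237384, 0.3730387, 0.4785705, 1.7622060),
  V4 = c(1.6941637, 0.1594102, 1.6767711, 0.5858115, 0.1803601, 0.7640163),
  stringsAsFactors = FALSE
)

# barplot() draws one group per matrix column, so transpose the numeric part:
# rows become V2, V3, V4 and columns become the symbols.
heights <- t(as.matrix(x[, c("V2", "V3", "V4")]))
colnames(heights) <- x$V1

# beside = TRUE places the three bars of each symbol side by side
# instead of stacking them; the y-axis shows the values themselves.
mids <- barplot(heights, beside = TRUE,
                col = c("steelblue", "orange", "darkgreen"),  # arbitrary colours
                legend.text = rownames(heights),
                ylab = "Value", las = 2)
```

With all 500 symbols the 1500 bars and their labels would likely be too dense to read, so plotting a subset at a time may be more practical. In ggplot2, the count on the y-axis can be avoided by reshaping the data to long format and using geom_bar(stat = "identity", position = "dodge").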
[R] taking rows from data.frames in list to form new data.frame?
Hi,

I am having a problem figuring out how to extract a subset of rows. I have a list with 68 similar data.frames. Each data.frame is 500 rows by 5 columns. I want to take one row from each data.frame based upon the data in a particular column (i.e. it matches a symbol). For example:

    str(database)
    List of 68
     $ X2011.01.11:'data.frame': 500 obs. of 5 variables:
      ..$ Symbol    : chr [1:500] "MMM" "ACE" "AES" "AFL" ...
      ..$ Price     : num [1:500] 87.7 60.7 13.1 55.7 15.6 ...
      ..$ Shares.Out: num [1:500] 7.15e+08 3.39e+08 7.88e+08 4.71e+08 1.10e+08 ...
      ..$ Float     : num [1:500] 7.13e+08 3.38e+08 6.61e+08 4.60e+08 1.09e+08 ...
      ..$ Market.Cap: num [1:500] 6.27e+10 2.06e+10 1.04e+10 2.62e+10 1.72e+09 ...
     $ X2011.01.12:'data.frame': 500 obs. of 5 variables:
      ..$ Symbol    : chr [1:500] "MMM" "ACE" "AES" "AFL" ...
      ..$ Price     : num [1:500] 88.7 60.9 12.9 57.1 15.2 ...
      ..$ Shares.Out: num [1:500] 7.15e+08 3.39e+08 7.88e+08 4.71e+08 1.10e+08 ...
      ..$ Float     : num [1:500] 7.13e+08 3.38e+08 6.61e+08 4.60e+08 1.09e+08 ...
      ..$ Market.Cap: num [1:500] 6.34e+10 2.06e+10 1.02e+10 2.69e+10 1.68e+09 ...
    . . .

    lapply(database, function(x) which(x == "IBM"))
    $X2011.01.11
    [1] 234

    $X2011.01.12
    [1] 234
    . . .

    lapply(database, function(x) x[which(x == "IBM"), ])
    $X2011.01.11
        Symbol  Price Shares.Out    Float Market.Cap
    234    IBM 147.28   1.24e+09 1.24e+09 1.8297e+11

    $X2011.01.12
        Symbol Price Shares.Out    Float Market.Cap
    234    IBM 149.1   1.24e+09 1.24e+09 1.8524e+11
    . . .

What I would like to do is create a new data.frame with 68 rows by 5 columns of data, perhaps using the old data.frame names as the new row.names. I can get to the subset of data that I want, I just can't get it from list form into one new data.frame. Any suggestions?

Thank you.

James
Re: [R] taking rows from data.frames in list to form new data.frame?
Dennis,

Thanks, the first example works perfectly.

    do.call(rbind, lapply(database, function(df) subset(df, Symbol == 'IBM')))

I haven't tried the second, but will look into plyr.

Thanks, again,

James

On Wed, Apr 20, 2011 at 6:36 PM, Dennis Murphy djmu...@gmail.com wrote:

Hi:

Perhaps you're looking for subset()? I'm not sure I understand the problem completely, but is

    do.call(rbind, lapply(database, function(df) subset(df, Symbol == 'IBM')))

or

    library(plyr)
    ldply(lapply(database, function(df) subset(df, Symbol == 'IBM')), rbind)

in the vicinity of what you're looking for? [Obviously untested for the usual reasons...]

HTH,
Dennis

On Wed, Apr 20, 2011 at 4:13 PM, jctoll jct...@gmail.com wrote:

Hi, I am having a problem figuring out how to extract a subset of rows. I have a list with 68 similar data.frames. Each data.frame is 500 rows by 5 columns. I want to take one row from each data.frame based upon the data in a particular column (i.e. it matches a symbol).
. . .
What I would like to do is create a new data.frame with 68 rows by 5 columns of data, perhaps using the old data.frame names as the new row.names. I can get to the subset of data that I want, I just can't get it from list form into one new data.frame. Any suggestions?
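The do.call(rbind, lapply(...)) idiom from this thread can be tried on a small example. This is only a sketch: the two-row data frames below are a made-up stand-in for the 68 data frames of 500 rows each, with prices borrowed from the output in the question.

```r
# Hypothetical stand-in for `database`: a named list of small data frames.
database <- list(
  X2011.01.11 = data.frame(Symbol = c("MMM", "IBM"), Price = c(87.7, 147.28),
                           stringsAsFactors = FALSE),
  X2011.01.12 = data.frame(Symbol = c("MMM", "IBM"), Price = c(88.7, 149.10),
                           stringsAsFactors = FALSE)
)

# Step 1: lapply() pulls the matching row out of every data frame,
#         giving a list of one-row data frames.
# Step 2: do.call(rbind, ...) passes that whole list to rbind() as its
#         arguments, stacking the rows into a single data frame.
ibm <- do.call(rbind, lapply(database, function(df) df[df$Symbol == "IBM", ]))
ibm
```

Because each piece has exactly one row and the list is named, rbind() uses the list names as row names of the result, which matches the "old data.frame names as the new row.names" wish in the question.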
Re: [R] Need to abstract changing name of column within loop
On Thu, Mar 17, 2011 at 10:32 AM, Joshua Ulrich josh.m.ulr...@gmail.com wrote:

On Wed, Mar 16, 2011 at 6:58 PM, jctoll jct...@gmail.com wrote:

Hi, I'm struggling to figure out the way to change the name of a column from within a loop. The problem is I can't refer to the object by its actual variable name, since that will change each time through the loop.
. . .
But how do I make the assignment? This clearly doesn't work:

    colnames(get(i))[7] <- paste(i, ".MarketCap", sep = "")
    Error in colnames(get(i))[7] <- paste(i, ".MarketCap", sep = "") :
      could not find function "get<-"

How can I change the name of that column within my big loop? Any ideas? Thanks!

I usually make a copy of the object, change it, then overwrite the original:

    tmp <- get(i)
    colnames(tmp)[7] <- "foo"
    assign(i, tmp)

Hope that helps.

Best regards,
--
Joshua Ulrich | FOSS Trading: www.fosstrading.com

Thank you, that does help. It sounds like your way is similar to the way David suggested. At least now I can just resign myself to doing it that way rather than torturing my mind trying to figure out a shorter way.

Thanks,

James
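The copy, modify, write-back pattern suggested in this thread can be sketched inside a loop as follows. The two small matrices are made-up stand-ins for the xts objects, and column 2 plays the role of column 7 in the original question.

```r
# Toy objects standing in for the per-symbol xts objects (illustrative only).
A <- matrix(1:4, nrow = 2, dimnames = list(NULL, c("A.Close", "A.Adjusted.1")))
B <- matrix(5:8, nrow = 2, dimnames = list(NULL, c("B.Close", "B.Adjusted.1")))

for (i in c("A", "B")) {
  tmp <- get(i)                                         # copy the object whose name is stored in i
  colnames(tmp)[2] <- paste(i, ".MarketCap", sep = "")  # rename the column on the copy
  assign(i, tmp)                                        # overwrite the original under the same name
}
rm(tmp)  # drop the temporary copy

colnames(A)
colnames(B)
```

The direct form colnames(get(i))[7] <- value fails because R evaluates a nested replacement by also assigning back into the innermost call, which would need a replacement function named `get<-`; as the error message in the question shows, no such function exists.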
[R] Need to abstract changing name of column within loop
Hi,

I'm struggling to figure out the way to change the name of a column from within a loop. The problem is I can't refer to the object by its actual variable name, since that will change each time through the loop. My xts object is A.

    head(A)
               A.Open A.High A.Low A.Close A.Volume A.Adjusted A.Adjusted.1
    2007-01-03  34.99  35.48 34.05   34.30  2574600      34.30      1186780
    2007-01-04  34.30  34.60 33.46   34.41  2073700      34.41      1190586
    2007-01-05  34.30  34.40 34.00   34.09  2676600      34.09      1179514
    2007-01-08  33.98  34.08 33.68   33.97  1557200      33.97      1175362
    2007-01-09  34.08  34.32 33.63   34.01  1386200      34.01      1176746
    2007-01-10  34.04  34.04 33.37   33.70  2157400      33.70      1166020

Its column names are:

    colnames(A)
    [1] "A.Open" "A.High" "A.Low" "A.Close" "A.Volume" "A.Adjusted" "A.Adjusted.1"

I want to change the 7th column name:

    colnames(A)[7]
    [1] "A.Adjusted.1"

I need to do that through a reference to i:

    i
    [1] "A"

This works:

    colnames(get(i))[7]
    [1] "A.Adjusted.1"

And this is what I want to change the column name to:

    paste(i, ".MarketCap", sep = "")
    [1] "A.MarketCap"

But how do I make the assignment? This clearly doesn't work:

    colnames(get(i))[7] <- paste(i, ".MarketCap", sep = "")
    Error in colnames(get(i))[7] <- paste(i, ".MarketCap", sep = "") :
      could not find function "get<-"

Nor does this (it creates a new object A.Adjusted.1 with a value of "A.MarketCap"):

    assign(colnames(get(i))[7], paste(i, ".MarketCap", sep = ""))

How can I change the name of that column within my big loop? Any ideas? Thanks!

Best regards,

James
Re: [R] Need to abstract changing name of column within loop
On Wed, Mar 16, 2011 at 8:20 PM, David Winsemius dwinsem...@comcast.net wrote:

On Mar 16, 2011, at 7:58 PM, jctoll wrote:

Hi, I'm struggling to figure out the way to change the name of a column from within a loop. The problem is I can't refer to the object by its actual variable name, since that will change each time through the loop.
. . .
I need to do that through a reference to i:

    i
    [1] "A"

It's not pretty and there may be a more direct way:

    a
          [,1] [,2]
    [1,] 1e+20 1000
    [2,] 1e+02 1000
    b <- "a"
    assign("tmp", eval(parse(text=b)))
    tmp
          [,1] [,2]
    [1,] 1e+20 1000
    [2,] 1e+02 1000
    colnames(tmp) <- c("one","two")
    assign( b, tmp)
    a
           one  two
    [1,] 1e+20 1000
    [2,] 1e+02 1000

Trying to short circuit the process without an intermediate temporary structure fails:

    colnames(eval(parse(text=b)) ) <- c("two", "three")
    Error in parse(`*tmp*`) : EOF whilst reading MBCS char at line 1

--
David

Thank you. This seems to work, but as you observed it's not exactly elegant.

Thanks,

James
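A different arrangement, not proposed by anyone in this thread, sidesteps the temporary copy: if the per-symbol objects are kept in one named list rather than as separate variables, the nested replacement works directly, because `[[<-` exists where `get<-` does not. The sketch below assumes such a list; objs and its contents are hypothetical.

```r
# Hypothetical setup: the per-symbol objects live in one named list
# instead of separate top-level variables.
objs <- list(
  A = matrix(1:4, nrow = 2, dimnames = list(NULL, c("A.Close", "A.Adjusted.1"))),
  B = matrix(5:8, nrow = 2, dimnames = list(NULL, c("B.Close", "B.Adjusted.1")))
)

for (i in names(objs)) {
  # objs[[i]] is a valid assignment target, so the nested replacement
  # colnames(...)[2] <- ... is accepted without get()/assign().
  colnames(objs[[i]])[2] <- paste(i, ".MarketCap", sep = "")
}

lapply(objs, colnames)
```

Whether this fits depends on how the objects are created in the first place; when they must remain separate variables, the copy-and-assign approach shown in the replies is the usual route.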
[R] Finding pairs with least magnitude difference from mean
Hi,

I have what I think is some kind of linear programming question. Basically, what I want to figure out is if I have a vector of numbers,

    x <- rnorm(10)
    x
    [1] -0.44305959 -0.26707077 0.07121266 0.44123714 -1.10323616 -0.19712807 0.20679494 -0.98629992 0.97191659 -0.77561593
    mean(x)
    [1] -0.2081249

Using each number only once, I want to find the set of five pairs where the magnitude of the differences between the mean(x) and each pair's sum is least.

    y <- outer(x, x, "+") - (2 * mean(x))
    y
                 [,1]        [,2]        [,3]        [,4]       [,5]        [,6]       [,7]       [,8]       [,9]       [,10]
     [1,] -0.46986936 -0.29388054  0.04440289  0.41442737 -1.1300459 -0.22393784  0.1799852 -1.0131097  0.9451068 -0.80242569
     [2,] -0.29388054 -0.11789173  0.22039171  0.59041619 -0.9540571 -0.04794902  0.3559740 -0.8371209  1.1210956 -0.62643688
     [3,]  0.04440289  0.22039171  0.55867514  0.92869962 -0.6157737  0.29033441  0.6942574 -0.4988374  1.4593791 -0.28815345
     [4,]  0.41442737  0.59041619  0.92869962  1.29872410 -0.2457492  0.66035889  1.0642819 -0.1288130  1.8294035  0.08187104
     [5,] -1.13004593 -0.95405711 -0.61577368 -0.24574920 -1.7902225 -0.88411441 -0.4801914 -1.6732863  0.2849302 -1.46260226
     [6,] -0.22393784 -0.04794902  0.29033441  0.66035889 -0.8841144  0.02199368  0.4259167 -0.7671782  1.1910383 -0.55649417
     [7,]  0.17998518  0.35597399  0.69425742  1.06428191 -0.4801914  0.42591670  0.8298397 -0.3632552  1.5949614 -0.15257116
     [8,] -1.01310969 -0.83712087 -0.49883744 -0.12881296 -1.6732863 -0.76717817 -0.3632552 -1.5563500  0.4018665 -1.34566603
     [9,]  0.94510682  1.12109563  1.45937907  1.82940355  0.2849302  1.19103834  1.5949614  0.4018665  2.3600830  0.61255048
    [10,] -0.80242569 -0.62643688 -0.28815345  0.08187104 -1.4626023 -0.55649417 -0.1525712 -1.3456660  0.6125505 -1.13498203

With this matrix, if I put together a combination of pairs which uses each number only once, the sum of the corresponding numbers is 0. For example, compare the SD between this set of 5 pairs

    y[10,1] + y[9,2] + y[8,3] + y[7,4] + y[6,5]
    [1] 0
    sum(c(y[10,1], y[9,2], y[8,3], y[7,4], y[6,5]))
    [1] 5.551115e-17    # basically 0, I assume this is round-off error
    mean(c(y[10,1], y[9,2], y[8,3], y[7,4], y[6,5]))
    [1] 1.111307e-17    # basically 0, I assume this is round-off error
    sd(c(y[10,1], y[9,2], y[8,3], y[7,4], y[6,5]))
    [1] 1.007960

versus this hand-selected, possibly lowest SD combination of pairs

    sum(c(y[3,1], y[6,2], y[10,4], y[9,5], y[8,7]))
    [1] -1.665335e-16   # basically 0, I assume this is round-off error
    mean(c(y[3,1], y[6,2], y[10,4], y[9,5], y[8,7]))
    [1] -3.330669e-17   # basically 0, I assume this is round-off error
    sd(c(y[3,1], y[6,2], y[10,4], y[9,5], y[8,7]))
    [1] 0.2367030

I believe that if I could test all the various five pair combinations, the combination with the lowest SD of values from the table would give me my answer. I believe I have 3 questions regarding my problem.

1) How can I find all the 5 pair combinations of my 10 numbers so that I can perform a brute force test of each set of combinations? I believe there are 45 different pairs (i.e. choose(10,2)). I found combinations from the {Combinations} package but I can't figure out how to get it to provide pairs.

2) Will my brute force strategy of testing the SD of each of these 5 pair combinations actually give me the answer I'm searching for?

3) Is there a better way of doing this? Probably something to do with real linear programming, rather than this method I've concocted.

Thanks for any help you can provide regarding my question.

Best regards,

James
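One way to brute-force question 1, sketched under the assumption that "using each number only once" means splitting the 10 numbers into 5 disjoint pairs. all_pairings() is a hypothetical helper written for this illustration, not a function from any package, and set.seed(1) is an arbitrary choice, so the values differ from those in the post. The individual pairs themselves are available from combn(), which ships with base R.

```r
# The 45 individual pairs: combn() returns one pair per column.
ncol(combn(10, 2))

# All ways to split the indices in `idx` into unordered, disjoint pairs.
# Returns a list of 2-column matrices, one matrix (one row per pair) per pairing.
all_pairings <- function(idx) {
  if (length(idx) == 0) return(list(matrix(integer(0), ncol = 2)))
  first <- idx[1]
  rest <- idx[-1]
  out <- list()
  for (k in seq_along(rest)) {
    # Pair the first index with rest[k], then pair up whatever is left over.
    for (sub in all_pairings(rest[-k])) {
      out[[length(out) + 1]] <- rbind(c(first, rest[k]), sub)
    }
  }
  out
}

set.seed(1)   # arbitrary seed so the sketch is repeatable
x <- rnorm(10)

pairings <- all_pairings(seq_along(x))
length(pairings)   # 945 = 9 * 7 * 5 * 3 * 1 pairings of 10 items

# Score each pairing by the SD of its five pair sums. Subtracting
# 2 * mean(x) from every sum, as in the y matrix, would not change the SD.
pair_sd <- sapply(pairings, function(p) sd(x[p[, 1]] + x[p[, 2]]))

best <- pairings[[which.min(pair_sd)]]
best           # index pairs of the lowest-SD pairing
min(pair_sd)
```

Because every pairing uses each number exactly once, the five pair sums always add up to sum(x), which is why the deviations taken from y sum to zero for any valid pairing. The SD criterion therefore ranks pairings by squared deviations; if "magnitude" is meant as absolute deviation, scoring with sum(abs(x[p[, 1]] + x[p[, 2]] - 2 * mean(x))) instead would be the corresponding choice and may select a different pairing.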
Re: [R] Applying multiple functions to one object
On Wed, Feb 2, 2011 at 7:59 AM, Karl Ove Hufthammer k...@huftis.org wrote:

Dear list members,

I recall seeing a convenience function for applying multiple functions to one object (i.e., almost the opposite of 'mapply') somewhere. Example: If the function was named 'fun' the output of

    fun(3.14, mode, typeof, class)

would be identical to the output of

    c(mode(3.14), typeof(3.14), class(3.14))

Is my memory failing me, or does such a function already exist in a package? Of course, it's not difficult to define a summary function and apply this to the object, but writing, for example,

    fun(x, mean, median, sd, mad)

to quickly show the relevant information is much more *convenient*.

It would be even nicer with a function that could also handle vectors and lists of values, and output the result as data frames or matrices. Example:

    x = c("foo", "bar", "foobar")
    fun(x, nchar, function(st) substr(st, 1, 2))

    y = list(3, 3L, 3.14, factor(3))
    fun(x, mode, typeof, class)

--
Karl Ove Hufthammer

Karl,

Perhaps you're thinking of the Reduce function? There's an example from the help page that you might be able to adapt to your purpose.

    ## Iterative function application:
    Funcall <- function(f, ...) f(...)
    ## Compute log(exp(acos(cos(0))
    Reduce(Funcall, list(log, exp, acos, cos), 0, right = TRUE)

HTH,

James
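For comparison, here is a minimal sketch of the kind of helper the question describes. apply_each() is a hypothetical name chosen for this illustration and is not an existing package function; it simply loops over the supplied functions with lapply(). Note that the Reduce example above chains the functions, feeding each result into the next, rather than applying each one separately to the same object.

```r
# Hypothetical helper: apply every function passed in ... to the same object
# and return the results as a list, named after the arguments.
apply_each <- function(x, ...) {
  fs <- list(...)
  lapply(fs, function(f) f(x))
}

# Scalar input: collapse the list into a named character vector.
res <- unlist(apply_each(3.14, mode = mode, typeof = typeof, class = class))
res

# Vector input: equal-length results can be combined into a data frame.
x <- c("foo", "bar", "foobar")
tab <- as.data.frame(
  apply_each(x, nchar = nchar, first2 = function(st) substr(st, 1, 2)),
  stringsAsFactors = FALSE
)
tab
```

Naming the arguments (mode = mode, and so on) is a design choice made here so that the results carry labels; unnamed functions work too, but the output is then unlabelled.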