Re: [R] Recursive function calls

2012-08-03 Thread Hadley Wickham
It's nice that R keeps the base function list short enough that you can look at it, but it would be nice to have a few more convenience functions included, especially ones that mirror common functions, like trim sum(sapply(search(), function(x) length(ls(x)))) [1] 2376 Over two thousand

Re: [R] On Reproducible Code

2012-07-27 Thread Hadley Wickham
That assumes: * Everyone reads the mailing list before making the first posting * Everyone reads every part of every email. I'd argue that both assumptions are false. People are particularly well trained to skip over boilerplate text at the bottom of emails. I'd suggest an alternative approach

Re: [R] Get XML or JSON data from api into data frame

2012-07-26 Thread Hadley Wickham
On Thu, Jul 26, 2012 at 4:18 AM, Richard Ohrvall richard.ohrv...@gmail.com wrote: Dear all, I am new to R in general and ways to retrieve XML or JSON data in particular. I have tried to get information through the XML package and various websites without being able to do exactly what I want.

Re: [R] turning R expressions into functions?

2012-07-23 Thread Hadley Wickham
One of the things I would love to add to my package would be the ability to compare more than two expressions in one call. But unfortunately, I haven't found out so far whether (and if so, how) it is possible to extract the elements of a ... object without evaluating them. Have a look at
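A minimal sketch of one common way to get at the elements of `...` without evaluating them, using `substitute()`. The helper name `capture_dots` is hypothetical, and this is not necessarily the approach the truncated reply goes on to recommend.

```r
# Capture the expressions passed through ... without evaluating them.
capture_dots <- function(...) {
  # substitute(list(...)) yields the unevaluated call list(expr1, expr2, ...);
  # dropping the first element (the symbol `list`) leaves only the expressions.
  as.list(substitute(list(...)))[-1]
}

# None of these expressions is evaluated, so the undefined names and the
# stop() call are harmless here.
exprs <- capture_dots(a + b, sqrt(x), stop("never run"))
length(exprs)
exprs[[1]]
```

Each element is a language object that can later be evaluated with `eval()` or timed individually.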

Re: [R] turning R expressions into functions?

2012-07-23 Thread Hadley Wickham
On Mon, Jul 23, 2012 at 2:12 PM, S Ellison s.elli...@lgcgroup.com wrote: One of the things I would love to add to my package would be the ability to compare more than two expressions in one call. But unfortunately, I haven't found out so far whether (and if so, how) it is possible to extract

Re: [R] complexity of operations in R

2012-07-19 Thread Hadley Wickham
On Thu, Jul 19, 2012 at 8:02 AM, Jan van der Laan rh...@eoos.dds.nl wrote: Johan, Your 'list' and 'array doubling' code can be written much more efficiently. The following function is faster than your g and easier to read: g2 <- function(dotot) { v <- list() for (i in seq_len(dotot)) {

Re: [R] complexity of operations in R

2012-07-19 Thread Hadley Wickham
On Thu, Jul 19, 2012 at 9:21 AM, William Dunlap wdun...@tibco.com wrote: Preallocation of lists does speed things up. The following shows time quadratic in size when there is no preallocation and linear growth when there is, for size in the c. 10^4 to 10^6 region: Interesting, thanks! I wish
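To make the comparison concrete, here is a small sketch of the two strategies being discussed: growing a list one element at a time versus filling a preallocated one. The function names `grow` and `prealloc` are made up for illustration, and the timings depend heavily on the R version and the size chosen, so no numbers are claimed.

```r
# Grow a list element by element (no preallocation).
grow <- function(n) {
  v <- list()
  for (i in seq_len(n)) v[[i]] <- i
  v
}

# Allocate the full list first, then fill it in.
prealloc <- function(n) {
  v <- vector("list", n)
  for (i in seq_len(n)) v[[i]] <- i
  v
}

# Both produce the same result; only the allocation pattern differs.
n <- 1e4
t_grow <- system.time(a <- grow(n))
t_pre <- system.time(b <- prealloc(n))
identical(a, b)
```

Repeating the timing over a range of `n` is what reveals whether the cost grows linearly or quadratically.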

[R] [R-pkgs] Devtools 0.7

2012-06-20 Thread Hadley Wickham
# devtools The aim of `devtools` is to make your life as a package developer easier by providing R functions that simplify many common tasks. Devtools is opinionated about how to do package development, and requires that you use `roxygen2` for documentation and `testthat` for testing. Future

[R] R development master class: NYC June 21-22, Bay Area June 28-29

2012-05-21 Thread Hadley Wickham
Hi all, I'm going to be teaching R development master classes in NYC June 21-22 and in the Bay Area June 28-29. The basic idea of the class is to help you write better code, focused on the mantra of do not repeat yourself. In day one you will learn powerful new tools of abstraction, allowing

Re: [R] trouble loading ggplot2 using R

2012-04-25 Thread Hadley Wickham
On Wed, Apr 25, 2012 at 6:27 AM, Ramon Ovelar ramon.ove...@gmail.com wrote: I don't think I have touched anything at all. I'm very new to R and to be honest I don't know what .Random.seed is. I will try to find out. I have seen other messages about restoring random.seed, but in order to

Re: [R] trouble loading ggplot2 using R

2012-04-24 Thread Hadley Wickham
I have a similar error, running R in Snow Leopard too library(ggplot2) Error : .onAttach failed in attachNamespace() for 'ggplot2', details:  call: stats::runif(1)  error: .Random.seed no es un vector de números enteros pero es de tipo 'list' Error: package/namespace load failed for

Re: [R] introducing R to high school students

2012-04-18 Thread Hadley Wickham
Now I have to put my money where my mouth is. I've offered to visit a high school and introduce R to some fairly advanced students participating in a longitudinal 3-year science research class. I anticipate keeping things very simple: --objects and the fact that there is stuff inside them.

Re: [R] introducing R to high school students

2012-04-18 Thread Hadley Wickham
If the students are in a science research class, does that mean they have data from their own research that they would want to understand better?  I think that would be much more motivating than anything else. It might depend on the class - most high school science experiments aren't that

Re: [R] Faceted bar plot shows wrong counts (ggplot2)

2012-04-11 Thread Hadley Wickham
And it's now fixed in the dev version. Hadley On Tue, Mar 13, 2012 at 11:37 AM, Helios de Rosario helios.derosa...@ibv.upv.es wrote: Michael, Thanks for the pointer to the discussion in the ggplot list. It seems that the reason of this behaviour of facet_grid() is already known and being

Re: [R] geom_plot creates Area Instead Of Lines

2012-04-11 Thread Hadley Wickham
What I would have liked is something like a cloud of lines, similar to what I get when I convert the data into a matrix (why do I not just use a matrix? I come from MATLAB and this seems natural, however, my data is large and a data frame seems to be an advantageous way to handle that). It's

Re: [R] extend data frame for plotting heat map in ggplot2

2012-04-11 Thread Hadley Wickham
On Sun, Apr 1, 2012 at 9:16 AM, Till Bayer till.ba...@kaust.edu.sa wrote: Hi all! I want to generate a heat map from an all-vs-all comparison. I have the data, already scaled to 0-1. However, I have the values only for the comparisons in one way, and not for the comparisons between the same

Re: [R] Appropriate method for sharing data across functions

2012-04-09 Thread Hadley Wickham
Make OPCON an environment and pass it into the functions that may read it or alter it.  There is no real need to pass it out, since environments are changed in-place (unlike lists).  E.g.,   x <- list2env(list(one=1, two="ii", three=3))   x  <environment: 0x03110890>   objects(x)  
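A short sketch of the in-place behaviour described here, contrasted with a list. The names `bump_env`, `bump_list` and the `count` field are hypothetical and only serve the illustration.

```r
# Modifying an environment inside a function changes the caller's object.
bump_env <- function(env) {
  env$count <- env$count + 1
  invisible(NULL)
}

# Modifying a list inside a function only changes a local copy.
bump_list <- function(lst) {
  lst$count <- lst$count + 1
  invisible(NULL)
}

e <- new.env()
e$count <- 0
bump_env(e)
bump_env(e)

l <- list(count = 0)
bump_list(l)

e$count  # updated twice, nothing had to be returned
l$count  # unchanged, because lists are copied on modification
```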

Re: [R] Appropriate method for sharing data across functions

2012-04-05 Thread Hadley Wickham
Why not pass around a reference class? Hadley On Thu, Apr 5, 2012 at 3:20 PM, John C Nash nas...@uottawa.ca wrote: In trying to streamline various optimization functions, I would like to have a scratch pad of working data that is shared across a number of functions. These can be called from
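As a rough illustration of the suggestion, a reference class object can serve as a shared scratch pad because it is modified in place. The class name `Scratch`, its `evals` field and the `objective` function are assumptions made up for this sketch, not part of the original thread.

```r
library(methods)

# A hypothetical scratch pad that counts function evaluations.
Scratch <- setRefClass(
  "Scratch",
  fields = list(evals = "numeric"),
  methods = list(
    record = function() {
      # <<- updates the field on the object itself, not on a copy
      evals <<- evals + 1
      invisible(.self)
    }
  )
)

# Any function that receives the object can update the shared state.
objective <- function(x, pad) {
  pad$record()
  sum(x^2)
}

pad <- Scratch$new(evals = 0)
objective(c(1, 2), pad)
objective(c(3, 4), pad)
pad$evals
```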

Re: [R] Year of data collection for 'diamonds' dataset in ggplot2

2012-03-27 Thread Hadley Wickham
I believe it was 2008. Hadley On Mon, Mar 26, 2012 at 11:46 AM, Marina Doucerain marinadoucer...@gmail.com wrote: Hello, I'm wondering what was the year (or year range) of collection for the data included in the 'diamonds' dataset in ggplot2. This information would be very helpful in

Re: [R] Improving help in R

2012-03-17 Thread Hadley Wickham
One difficulty in getting the help pages to look beautiful is that the original input is so inconsistent, and package authors (naturally) get upset when CRAN starts rejecting packages because of errors that used to be ignored.  The current output is definitely a compromise aimed at making most

[R] [R-pkgs] devtools 0.6

2012-03-03 Thread Hadley Wickham
# devtools The aim of `devtools` is to make your life as a package developer easier by providing R functions that simplify many common tasks. Devtools is opinionated about how to do package development, and requires that you use `roxygen2` for documentation and `testthat` for testing. Future

Re: [R] assigning NULL to a list element

2012-02-18 Thread Hadley Wickham
On Fri, Feb 17, 2012 at 7:51 PM, Benilton Carvalho beniltoncarva...@gmail.com wrote: Hi everyone, For reasons beyond the scope of this message, I'd like to append a NULL element to the end of a list. tmp0 <- list(a=1, b=NULL, c=3) append(tmp0, c(d=4)) ## works as expected append(tmp0,
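A sketch of two base R idioms that append a NULL element to a list, offered as one possible answer rather than the one given in the truncated reply. The key point is that the NULL has to be wrapped in `list()`.

```r
tmp0 <- list(a = 1, b = NULL, c = 3)

# Concatenating with a one-element list keeps the NULL.
tmp1 <- c(tmp0, list(d = NULL))

# Single-bracket assignment with list(NULL) also keeps it.
tmp2 <- tmp0
tmp2["d"] <- list(NULL)

# By contrast, assigning NULL with $ or [[ ]] removes (or never creates)
# the element, so this leaves the list unchanged.
tmp3 <- tmp0
tmp3$d <- NULL

names(tmp1)
length(tmp3)
```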

Re: [R] ggplot using scale_x_date gives Error in seq.int(r1$year, to$year, by)

2012-02-10 Thread Hadley Wickham
Hi Aidan, str is your friend: str(g) 'data.frame': 9 obs. of 3 variables: $ Date: chr "2011-12-23" "2011-12-30" "2012-01-06" "2011-12-23" ... $ variable: Factor w/ 3 levels "Price","Yield",..: 1 1 1 2 2 2 3 3 3 $ value : num 86.78 86.04 86.44 9.74 9.54 ... You haven't turned the Date
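The diagnosis here is that `Date` is still a character column. A minimal sketch of the conversion, using a cut-down stand-in for the data frame `g` (the values are taken from the `str()` output, the rest is assumed):

```r
# A small stand-in for g, with Date stored as character.
g <- data.frame(
  Date = c("2011-12-23", "2011-12-30", "2012-01-06"),
  value = c(86.78, 86.04, 86.44),
  stringsAsFactors = FALSE
)

# Convert the character column to a real Date column so that date
# scales and date arithmetic work.
g$Date <- as.Date(g$Date)

str(g)
diff(g$Date)
```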

Re: [R] GGplot controlling point size across range

2012-02-10 Thread Hadley Wickham
On Thu, Jan 12, 2012 at 5:27 PM, Darran King darran.k...@csiro.au wrote: Hi all New to R and GGplot2 but loving the potential. I am trying to plot four separate point plots by looping over the data and plotting a different subset each time. When I plot the data as a point plot, the size of

Re: [R] pretty(range(data$z),10) Error

2012-02-10 Thread Hadley Wickham
Hi Mario, If you're still having problems, I'd suggest sending a small reproducible example (https://github.com/hadley/devtools/wiki/Reproducibility) to the ggplot2 mailing list. Hadley On Tue, Jan 17, 2012 at 12:32 PM, Mario Giesel rr.gie...@yahoo.de wrote: Hello, R-List, I'm getting error

Re: [R] Plotting bar graph over a geographical map

2012-02-10 Thread Hadley Wickham
Hi Simon, You might want to try sending a small reproducible example (https://github.com/hadley/devtools/wiki/Reproducibility) to the ggplot2 mailing list. Hadley On Tue, Jan 31, 2012 at 10:53 PM, sjlabrie sjlab...@mit.edu wrote: Hi, I am looking for a way to plot bar on a map instead of the

Re: [R] ggplot2 geom_polygon fill

2012-02-10 Thread Hadley Wickham
Hi Raimund, To increase your chances of getting help, I'd recommend using the ggplot2 mailing list, and reducing your example down to the essence of the problem. For example, the theme components don't affect the problem, but make the code longer, and so harder to understand. Hadley On Mon,

Re: [R] nice report generator?

2012-02-10 Thread Hadley Wickham
To be strictly correct, shouldn't that be: formula <- eval(substitute( value*v*LEFT ~ RIGHT, list(LEFT=LEFT, RIGHT=RIGHT))) ? I think it probably doesn't matter.  The difference is that mine gives a pure language object, whereas yours gives a formula object.  The formula object has a

Re: [R] Version control (git, mercurial) for R packages

2012-02-09 Thread Hadley Wickham
Same warning here. Which made me think that R CMD build will probably tar up the git repository along with the package, which is not something I would like to do, and which CRAN people most likely won't tolerate in a package on CRAN. It doesn't. And you can always use .Rbuildignore to ignore

Re: [R] Version control (git, mercurial) for R packages

2012-02-09 Thread Hadley Wickham
I'm exploring using a version control system to keep better track of changes to the packages I maintain. I'm leaning towards git (although mercurial also looks good) but am not sure what is the best way to set up the repository. It seems I can't set the repository directly within the R

Re: [R] ggplot2(0.9.0): could not find function ==

2012-02-07 Thread Hadley Wickham
I'm curious if you have a guess whether the issue I was having is a result of the problems with the 0.9.0 version or if they're due to fundamental changes in ggplot2? It looks like a bug - I'll add it to the to do list. Hadley -- Assistant Professor / Dobelman Family Junior Chair Department

Re: [R] nice report generator?

2012-02-06 Thread Hadley Wickham
2. It's more flexible to construct the language object as a language object, rather than pasting something together and parsing it.  For one thing, that allows non-syntactic variable names; I think it's also easier to read.  So your code txt <- paste("tabular(value*v*", LEFT , "~" ,RIGHT ,", data =
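To illustrate the point about building language objects directly, here is a small sketch using `substitute()` with a non-syntactic variable name. The variable names are invented for the example, and `tabular()` itself is left out so the code runs in base R.

```r
# Symbols to splice in; "my var" is not a syntactic name, which would
# break a paste-and-parse approach unless it were backquoted by hand.
LEFT <- as.name("my var")
RIGHT <- as.name("group")

# substitute() alone gives a pure language object (an unevaluated call).
lang <- substitute(value * v * LEFT ~ RIGHT, list(LEFT = LEFT, RIGHT = RIGHT))

# Evaluating that call runs `~`, which produces a formula object
# (with class and environment attributes).
fml <- eval(lang)

class(lang)
class(fml)
all.vars(fml)
```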

Re: [R] ggplot2(0.9.0): could not find function ==

2012-02-01 Thread Hadley Wickham
Hi James, There were a few problems with the 0.9.0 version, which is why it was pulled from CRAN. I'd recommend re-installing 0.8.9: install.packages("ggplot2", type = "source") Hadley On Wed, Feb 1, 2012 at 2:10 PM, J Toll jct...@gmail.com wrote: Hi, I have a question related to the newest

Re: [R] Subsetting for the ten highest values by group in a dataframe

2012-01-28 Thread Hadley Wickham
On Fri, Jan 27, 2012 at 1:26 PM, Sam Albers tonightstheni...@gmail.com wrote: Hello, I am looking for a way to subset a data frame by choosing the top ten maximum values from that dataframe. As well this occurs within some factor levels. ## I've used plyr here but I'm not married to this
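One way to approach this in base R, sketched under the assumption of a data frame with a grouping column `group` and a numeric column `value` (both names hypothetical): split by group, sort each piece, keep the first ten rows, and bind the pieces back together.

```r
set.seed(1)
df <- data.frame(
  group = rep(c("a", "b"), each = 25),
  value = rnorm(50)
)

# Keep the n rows with the largest `value` within each level of `group`.
top_n_by_group <- function(d, n = 10) {
  pieces <- lapply(split(d, d$group), function(g) {
    head(g[order(g$value, decreasing = TRUE), ], n)
  })
  do.call(rbind, pieces)
}

top <- top_n_by_group(df)
table(top$group)
```

The same split, apply, combine shape is what `plyr::ddply` automates.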

Re: [R] What is a 'closure'?

2012-01-19 Thread Hadley Wickham
On Thu, Jan 19, 2012 at 1:45 PM, Ajay Askoolum aa2e...@yahoo.co.uk wrote: The R Language Definition at http://cran.r-project.org/doc/manuals/R-lang.html states in the following section 4.3.2 Argument matching This subsection applies to closures but not to primitive functions. What are
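A brief sketch of the distinction being asked about: ordinary R functions are closures (code plus the environment they were created in), while primitives are implemented internally. The `make_counter` example is invented for illustration.

```r
# Ordinary functions are closures; primitives are not.
typeof(mean)        # "closure"
typeof(sum)         # "builtin", one kind of primitive
is.primitive(sum)
is.primitive(mean)

# A closure remembers the environment where it was defined.
make_counter <- function() {
  i <- 0
  function() {
    i <<- i + 1  # updates i in the enclosing environment
    i
  }
}

counter <- make_counter()
first <- counter()
second <- counter()
c(first, second)
```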

Re: [R] ggplot- using geom_point and geom_line at the same time

2012-01-17 Thread Hadley Wickham
On Mon, Jan 16, 2012 at 6:05 PM, Mary Kindall mary.kind...@gmail.com wrote: Thanks for reply I wanted to have legend name with spaces. Right now I am using the following code but it produce two legends. I have to use Gimp to cut the redundant legend. Your basic problem is that you're using

Re: [R] New PLYR issue

2012-01-17 Thread Hadley Wickham
Note that although ddply does a lot for you, it doesn't reproduce all of your calculations on all of the data columns like summaryBy does... you have to explicitly create every calculated column in your function. Well, ddply doesn't, but colwise will. Hadley -- Assistant Professor /

Re: [R] parallel computation in plyr 1.7

2012-01-16 Thread Hadley Wickham
Please see https://github.com/hadley/plyr/issues/60 Hadley On Thu, Jan 12, 2012 at 11:54 AM, abhagwat bhagwatadi...@gmail.com wrote: The code below shows that (1) the way to activate the parallel backend indeed is to use 'registerDoMC' (2) the function d_ply does NOT accept the argument

Re: [R] list: index of the element, that is TRUE

2012-01-16 Thread Hadley Wickham
On Mon, Jan 16, 2012 at 10:15 AM, Duncan Murdoch murdoch.dun...@gmail.com wrote: On 12-01-16 10:34 AM, Marion Wenty wrote: Dear People, I have got the following example for a vector and the index of the TRUE element: Myvector <- c(FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,TRUE)

Re: [R] Aggregate by minimum

2012-01-10 Thread Hadley Wickham
On Mon, Jan 9, 2012 at 8:00 PM, jim holtman jholt...@gmail.com wrote: try this: x <- structure(list(speed = c(3,9,14,8,7,6), result = c(0.697, 0.011, 0.015, 0.012, 0.018, 0.019), house = c(1, 1, 1, 1, 1, 1), date = c(719, 1027, 1027, 1027, 1030, 1030), id = c(1000, 1, 10001,

Re: [R] Is it possible to right align text in R graphics?

2012-01-03 Thread Hadley Wickham
FYI, if you're looking for the technical term for this type of text it's bidi: http://en.wikipedia.org/wiki/Bi-directional_text Hadley On Tue, Jan 3, 2012 at 4:32 PM, Tal Galili tal.gal...@gmail.com wrote: And I forgot to include the link to the image, here it is:

Re: [R] Base function for flipping matrices

2012-01-02 Thread Hadley Wickham
But if not,  it seems to me that it should be added as an array method to ?rev with an argument specifying which indices to rev() over. Yes, agreed. Sometimes arrays seem like something bolted onto R that is missing a lot of functionality. Hadley -- Assistant Professor / Dobelman Family

Re: [R] Base function for flipping matrices

2012-01-02 Thread Hadley Wickham
Your request is reminding me of the analysis of array functions in Philip S. Abrams's dissertation http://www.slac.stanford.edu/cgi-wrap/getdoc/slac-r-114.pdf AN APL MACHINE The section that starts on page 17 with this paragraph is the one that immediately applies C. The Standard Form for

Re: [R] what's the command in R to completely clear the state of the console(including clearing up libraries, etc?)

2012-01-01 Thread Hadley Wickham
My understanding is that one only clears the variables... not functions/packages, etc... Not exactly true. rm should remove any function you defined in your current session. You need to look at ?unloadNamespace ?detach ...  in order to remove loaded packages. And read the caveats there.

[R] Base function for flipping matrices

2011-12-31 Thread Hadley Wickham
Hi all, Are there base functions that do the equivalent of this? fliptb <- function(x) x[nrow(x):1, ] fliplr <- function(x) x[, ncol(x):1] Obviously not hard to implement (although it needs some more checks), just wondering if it had already been implemented. Hadley -- Assistant Professor /
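A slightly more defensive sketch of the two helpers, adding a couple of the checks the post alludes to: `drop = FALSE` keeps single-row or single-column results as matrices, and `rev(seq_len())` behaves sensibly when a dimension is zero. Note that the left-right flip reverses the column index, so it uses `ncol()`.

```r
# Flip a matrix top to bottom (reverse the row order).
fliptb <- function(x) {
  stopifnot(is.matrix(x))
  x[rev(seq_len(nrow(x))), , drop = FALSE]
}

# Flip a matrix left to right (reverse the column order).
fliplr <- function(x) {
  stopifnot(is.matrix(x))
  x[, rev(seq_len(ncol(x))), drop = FALSE]
}

m <- matrix(1:6, nrow = 2)
fliptb(m)
fliplr(m)
```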

[R] [R-pkgs] Plyr 1.7

2011-12-30 Thread Hadley Wickham
# plyr plyr is a set of tools for a common set of problems: you need to __split__ up a big data structure into homogeneous pieces, __apply__ a function to each piece and then __combine__ all the results back together. For example, you might want to: * fit the same model each patient subsets of

[R] [R-pkgs] testthat 0.6

2011-12-30 Thread Hadley Wickham
# testthat Testing your code is normally painful and boring. `testthat` tries to make testing as fun as possible, so that you get a visceral satisfaction from writing tests. Testing should be fun, not a drag, so you do it all the time. To make that happen, `testthat`: * Provides functions that

Re: [R] Applyiing mode() or class() to each column of a data.frame XXXX

2011-12-30 Thread Hadley Wickham
But be careful because class is a character vector (not necessarily a character vector of length 1) On Fri, Dec 30, 2011 at 10:21 AM, Justin Haynes jto...@gmail.com wrote: there is also colwise in the plyr package. library(plyr) colwise(class)(data6)      v13     v14       v15     f4     v16
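A small sketch of why this caution matters: a date-time column has a class vector of length two, so code that assumes one class per column needs to pick an element explicitly. The data frame is made up for the example.

```r
df <- data.frame(
  n = 1:2,
  when = as.POSIXct(c("2011-12-30", "2011-12-31"), tz = "UTC")
)

# lapply() keeps the full class vector for each column.
classes <- lapply(df, class)
classes$when

# To get exactly one string per column, take the first class explicitly;
# vapply() then guarantees a character vector of the right shape.
first_class <- vapply(df, function(col) class(col)[1], character(1))
first_class
```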

Re: [R] cast in reshape and reshape2

2011-12-26 Thread Hadley Wickham
?plyr::summarise seems pretty helpful to me.  If you can do better, please submit a patch - they are very much appreciated. My failure to find it stemmed from it not being mentioned in any way in package reshape2's help files, but maybe I was mistaken that it was meant to be used in that

Re: [R] ggplot2: behaviour with empty datasets

2011-12-23 Thread Hadley Wickham
See https://github.com/hadley/ggplot2/issues/31 - I totally agree that it's annoying. Hadley PS. You are more likely to get helpful responses about ggplot2 on the ggplot mailing list. On Fri, Dec 23, 2011 at 7:08 AM, Casper Ti. Vector caspervec...@gmail.com wrote: For example, prepare like

Re: [R] black and white in qplot? layout 4 graphs in one screen

2011-12-23 Thread Hadley Wickham
You might find the ggplot mailing list a friendlier place to ask questions about ggplot2. Hadley On Wed, Dec 21, 2011 at 2:16 PM, rachaelohde cox.rach...@gmail.com wrote: Hello, I am trying to plot means and standard errors conditioned by a factor, using qplot.  I am successful at getting the

Re: [R] Stacked area plot for time series

2011-12-23 Thread Hadley Wickham
You are more likely to get a helpful response if you provide a reproducible example - without that I can only guess that you need to use approx so you get y values at same x values. Hadley On Wed, Dec 21, 2011 at 8:13 AM, UncleFish bpn...@ucsd.edu wrote: I wish to make a stacked area chart of a
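As a guess at what the `approx` suggestion might look like in practice, the sketch below interpolates two series, observed at different x values, onto a shared grid so they can be stacked. All data and names are invented; `rule = 2` (hold the end values constant outside the observed range) is one possible choice, not something the original message specifies.

```r
# Two series measured at different x positions.
x1 <- c(0, 2, 4); y1 <- c(0, 2, 4)
x2 <- c(1, 3);    y2 <- c(10, 30)

# A common grid containing every x value that occurs in either series.
grid <- sort(unique(c(x1, x2)))

# Linear interpolation of each series onto the common grid.
s1 <- approx(x1, y1, xout = grid)$y
s2 <- approx(x2, y2, xout = grid, rule = 2)$y

aligned <- data.frame(x = grid, s1 = s1, s2 = s2)
aligned
```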

Re: [R] cast in reshape and reshape2

2011-12-23 Thread Hadley Wickham
On Fri, Dec 23, 2011 at 1:58 PM, Kaiyin Zhong kindlych...@gmail.com wrote: library(reshape2) x = melt(airquality, id=c('month', 'day')) With reshape I can cast with multiple functions: library(reshape) cast(x, month+variable~., c(mean,sd))   month variable       mean         sd 1      5  

Re: [R] cast in reshape and reshape2

2011-12-23 Thread Hadley Wickham
Have you looked at the .summarise argument to dcast? That seems to deliver the same sort of results one gets with base::aggregate. Actually I see after looking at examples on the plyr-reshape-googlegroups group that it is not '.summarise' but rather 'summarise'. Unfortunately there are no

Re: [R] Is there a way force hiding of all messages when calling library()?

2011-12-21 Thread Hadley Wickham
You should be able to suppress them with suppressPackageStartupMessages() but not all packages produce startup messages in the approved manner. Hadley On Wed, Dec 21, 2011 at 5:36 PM, Saiwing Yeung saiw...@berkeley.edu wrote: For example, if I call library(spam), I would get messages like this
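A self-contained sketch of the caveat mentioned here. `suppressPackageStartupMessages()` only silences messages raised with `packageStartupMessage()`; output produced by other means still gets through. The function `noisy_attach` is a hypothetical stand-in for a package's startup code.

```r
# Stand-in for a package that announces itself "in the approved manner"
# and also in a way that is not covered.
noisy_attach <- function() {
  packageStartupMessage("Welcome to a hypothetical package")
  message("A plain message, not a startup message")
  invisible(TRUE)
}

# Collect whatever messages escape the suppression wrapper.
escaped <- character()
withCallingHandlers(
  suppressPackageStartupMessages(noisy_attach()),
  message = function(m) {
    escaped <<- c(escaped, conditionMessage(m))
    invokeRestart("muffleMessage")
  }
)
escaped
```

In real use the call is simply `suppressPackageStartupMessages(library(pkg))`; wrapping in `suppressMessages()` is broader and hides ordinary messages as well.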

[R] [R-pkgs] stringr 0.6

2011-12-09 Thread Hadley Wickham
# stringr Strings are not glamorous, high-profile components of R, but they do play a big role in many data cleaning and preparations tasks. R provides a solid set of string operations, but because they have grown organically over time, they can be inconsistent and a little hard to learn.

Re: [R] plotting and coloring longitudinal data with three time points (ggplot2)

2011-12-07 Thread Hadley Wickham
On Wed, Dec 7, 2011 at 4:02 AM, Eric Fail eric.f...@gmx.us wrote:  Dear list, I have been struggling with this for some time now, and for the last hour I have been struggling to make a working example for the list. I hope someone out there has some experience with plotting longitudinal

Re: [R] Project local libraries (reproducible research)

2011-12-05 Thread Hadley Wickham
On Sat, Dec 3, 2011 at 11:16 AM, Jim Lemon j...@bitwrit.com.au wrote: On 12/03/2011 06:04 AM, Hadley Wickham wrote: Hi all, I was wondering if anyone had scripts that they could share for capturing the current version of R packages used for a project. I'm interested in creating a project

Re: [R] Counting the occurences of a charater within a string

2011-12-03 Thread Hadley Wickham
On Thu, Dec 1, 2011 at 10:32 AM, Douglas Esneault douglas.esnea...@mecglobal.com wrote: I am new to R but am experienced SAS user and I was hoping to get some help on counting the occurrences of a character within a string at a row level. My dataframe, x,  is structured as below: Col1
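One base R way to count occurrences of a character in each string, sketched with made-up data since the original data frame is cut off. It finds all matches with `gregexpr()`, extracts them with `regmatches()`, and counts them per element.

```r
x <- c("a,b,c", "abc", "")

# fixed = TRUE treats the pattern as a literal character, not a regex.
matches <- regmatches(x, gregexpr(",", x, fixed = TRUE))

# One count per input string; strings with no match give 0.
counts <- lengths(matches)
counts
```

For a data frame, the same expression can be applied to a column, for example `df$n_commas <- lengths(regmatches(df$Col1, gregexpr(",", df$Col1, fixed = TRUE)))`, assuming `Col1` is a character column.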

[R] Project local libraries (reproducible research)

2011-12-02 Thread Hadley Wickham
Hi all, I was wondering if anyone had scripts that they could share for capturing the current version of R packages used for a project. I'm interested in creating a project local library so that you're safe if someone (e.g. the ggplot2 author) updates a package you're relying on and breaks your

Re: [R] tip: large plots

2011-11-18 Thread Hadley Wickham
You need: system.time(print(qplot(x,y,pch=I('.')))) Hadley On Fri, Nov 18, 2011 at 1:30 PM, Justin Haynes jto...@gmail.com wrote: Very cool.  Sadly, as far as I can tell, it doesn't work with ggplot though :( x <- runif(1e6) y <- runif(1e6) system.time(plot(x,y,pch='.'))   user  system elapsed

[R] [R-pkgs] Roxygen2: version 2.2

2011-11-13 Thread Hadley Wickham
# Roxygen2 The premise of `roxygen2` is simple: describe your functions in comments next to their definitions and `roxygen2` will process your source code and comments to produce R compatible Rd files. Here's a simple example from the `stringr` package:     #' The length of a string (in

[R] R development master class: NYC, Dec 12-13

2011-11-13 Thread Hadley Wickham
Hi all, I hope you don't mind the slightly off topic email, but I'm going to be teaching an R development master class in New York City on Dec 12-13. The basic idea of the class is to help you write better code, focused on the mantra of do not repeat yourself. In day one you will learn powerful

Re: [R] R development master class: NYC, Dec 12-13

2011-11-13 Thread Hadley Wickham
No seriously, as much as I'm for free enterprise, it feels awkward to see you promote an (expensive!) course in a list where people offer not only their knowledge, but also the tools you use, for free. You might have a point if I taught this course instead of offering knowledge and code for

Re: [R] help with ggplot backgrounds

2011-11-13 Thread Hadley Wickham
You are more likely to receive helpful responses if you: a) Provide a reproducible example (e.g. https://github.com/hadley/devtools/wiki/Reproducibility) b) Post to the ggplot2 mailing list. Hadley On Fri, Oct 28, 2011 at 5:03 PM, RanRL rnr...@gmail.com wrote: Hi, I have two questions

Re: [R] any updates w.r.t. lapply, sapply, apply retaining classes

2011-11-03 Thread Hadley Wickham
   In the example I give above, the impact might seem small, but the implications are *huge*.  This means that I am, in effect, not allowed to use *any* of the vectoring functions in 'R', which avoid performing loops thereby speeding up process time extraordinarily.  Many can sympathize that

Re: [R] any updates w.r.t. lapply, sapply, apply retaining classes

2011-11-03 Thread Hadley Wickham
   I agree that it is non-trivial to solve the cases I have posed.  However, I would wholeheartedly support having an error spit back for any function that does not explicitly support a class.  In this case, if I attempt to do   sapply(x, class), and 'x' is of class difftime, then I

Re: [R] Syntax Check: rshape2 melt()

2011-10-27 Thread Hadley Wickham
So I was using the reshape package rather than reshape2.  I don't know the relationship between those two packages and/or how they differ.  I am sure that there are others that can help you out here.  I, too, don't know how the two packages 'reshape, The Original' and 'reshape2, Rebooted'

Re: [R] Summary stats in table

2011-10-24 Thread Hadley Wickham
On Mon, Oct 24, 2011 at 5:39 AM, Duncan Murdoch murdoch.dun...@gmail.com wrote: Suppose I have data like this: A <- sample(letters[1:3], 1000, replace=TRUE) B <- sample(LETTERS[1:2], 1000, replace=TRUE) x <- rnorm(1000) I can get a table of means via tapply(x, list(A, B), mean) and I can

Re: [R] multicore by(), like mclapply?

2011-10-10 Thread Hadley Wickham
On Mon, Oct 10, 2011 at 4:14 PM, Joshua Wiley jwiley.ps...@gmail.com wrote: I could be waay off base here, but my concern about presplitting the data is that you will have your data, and a second copy of your data that is something like a list where each element contains the portion of the

Re: [R] Question about ggplot2 and stat_smooth

2011-10-04 Thread Hadley Wickham
On Mon, Oct 3, 2011 at 12:24 PM, Thomas Adams thomas.ad...@noaa.gov wrote:  I'm interested in creating a graphic -like- this: c <- ggplot(mtcars, aes(qsec, wt)) c + geom_point() + stat_smooth(fill="blue", colour="darkblue", size=2, alpha = 0.2) but I need to show 2 sets of bands (with different

Re: [R] ggplot2: expression() in legend labels?

2011-10-04 Thread Hadley Wickham
You need to set the labels... Hadley On Sat, Sep 24, 2011 at 3:49 AM, Casper Ti. Vector caspervec...@gmail.com wrote: Is there any way to use expression() in legend labels with ggplot2? It seems that things like scale_shape_manual(value = c(   x = expression(italic(x)),   y =

Re: [R] Question about ggplot2 and stat_smooth

2011-10-04 Thread Hadley Wickham
# Function to compute quantiles and return a data frame g <- function(d) {   qq <- as.data.frame(as.list(quantile(d$y, c(.05, .25, .50, .75, .95))))   names(qq) <- paste('Q', c(5, 25, 50, 75, 95), sep = '')   qq   } You could cut out the melt step by making this return a data frame: g <-

Re: [R] remove NaN from element in a vector in a list

2011-09-27 Thread Hadley Wickham
apply(mt, 1, function(x) x[!is.nan(x)] ) [[1]] [1] 1 3 [[2]] [1] 4 5 6 You need to be a little careful with apply: mt2 <- matrix(c(1,4,2,5,3,6),2,3) apply(mt2, 1, function(x) x[!is.nan(x)] ) [,1] [,2] [1,] 1 4 [2,] 2 5 [3,] 3 6 Depending on the input you will get a
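The pitfall can be reproduced and avoided as follows. `apply()` simplifies its result when every row yields the same length, so the return type depends on the data; looping over row indices with `lapply()` always gives a list. The helper name `drop_nan_rows` is hypothetical, and `mt` is reconstructed from the output shown in the message.

```r
mt <- matrix(c(1, 4, NaN, 5, 3, 6), 2, 3)   # one NaN: rows differ in length
mt2 <- matrix(c(1, 4, 2, 5, 3, 6), 2, 3)    # no NaN: rows have equal length

drop_nan <- function(x) x[!is.nan(x)]

# apply() returns a list for mt but a (transposed) matrix for mt2.
r1 <- apply(mt, 1, drop_nan)
r2 <- apply(mt2, 1, drop_nan)

# A version whose return type does not depend on the input: always a list.
drop_nan_rows <- function(m) {
  lapply(seq_len(nrow(m)), function(i) drop_nan(m[i, ]))
}

s1 <- drop_nan_rows(mt)
s2 <- drop_nan_rows(mt2)
```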

Re: [R] Quelplot

2011-09-22 Thread Hadley Wickham
. Best Wishes, Boris On Wed, Sep 21, 2011 at 3:11 PM, Hadley Wickham had...@rice.edu wrote: Hi all, Does anyone have an R implementation of the quelplot (K. M. Goldberg and B. Iglewicz. Bivariate extensions of the boxplot. Technometrics, 34(3):pp. 307–320, 1992)?  I'm struggling

[R] Quelplot

2011-09-21 Thread Hadley Wickham
Hi all, Does anyone have an R implementation of the quelplot (K. M. Goldberg and B. Iglewicz. Bivariate extensions of the boxplot. Technometrics, 34(3):pp. 307–320, 1992)? I'm struggling with the estimation of the asymmetry parameters. Hadley -- Assistant Professor / Dobelman Family Junior

Re: [R] x %% y as an alternative to which( x y)

2011-09-13 Thread Hadley Wickham
Because in coding, I often end up with big chunks looking like this: ((mydataframeName$myvariableName > 2 & !is.na(mydataframeName$myvariableName)) & (mydataframeName$myotherVariableName == "male" & !is.na(mydataframeName$myotherVariableName))) Which is much less

Re: [R] ddply from plyr package - any alternatives?

2011-08-25 Thread Hadley Wickham
z <- ddply(past, c("GEO_CNTRY_NAME","PROD_SEG_NAME"),  function(x) summary(lm(VAL~fy,x))$r.squared) But when ave is not exactly doing what I need. Above code runs under a minute for my data set where as ave runs over 8 mins. It's hard to know without a reproducible example, but I doubt that ddply

Re: [R] how to read a group of files into one dataset?

2011-08-25 Thread Hadley Wickham
# Method 2: Use the plyr package library('plyr') bdf <- ldply(mlply(files, read.csv, header = TRUE), rbind) Or just bdf <- ldply(files, read.csv, header = TRUE) Hadley -- Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University http://had.co.nz/
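For comparison, a base R sketch of the same read-and-stack pattern, which does not need plyr. The temporary directory and file contents are invented so that the example is runnable on its own, and it assumes all files share the same columns.

```r
# Create two small CSV files in a temporary directory.
csv_dir <- tempfile("csvs")
dir.create(csv_dir)
write.csv(data.frame(id = 1:2, v = c(0.1, 0.2)),
          file.path(csv_dir, "a.csv"), row.names = FALSE)
write.csv(data.frame(id = 3:4, v = c(0.3, 0.4)),
          file.path(csv_dir, "b.csv"), row.names = FALSE)

# List the files, read each into a data frame, and bind the rows together.
files <- list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE)
bdf <- do.call(rbind, lapply(files, read.csv, header = TRUE))
bdf
```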

Re: [R] How to apply a function to subsets of a data frame *and* obtain a data frame again?

2011-08-17 Thread Hadley Wickham
The following example does what you want using ddply: library(plyr) edfPerGroup = ddply(df, .(Group), summarise, edf = edf(Value), Value = Value) Or slightly more succinctly: ddply(df, .(Group), mutate, edf = edf(Value)) Hadley -- Assistant Professor / Dobelman Family Junior Chair

Re: [R] Utilizing column names to multiply over all columns

2011-08-16 Thread Hadley Wickham
You will get the warning that the last column is not going right but otherwise this returns what you asked for: sapply(1:length(mydf), function(i) mydf[[i]]* as.numeric(names(mydf)[i])  ) This suits my purposes well with a couple slight modifications: ## I made this into a data.frame so I

Re: [R] Reading name-value data

2011-08-01 Thread Hadley Wickham
if give non-data-frames. -s On Thu, Jul 28, 2011 at 19:30, Hadley Wickham had...@rice.edu wrote: Use plyr::rbind.fill? That does match up columns by name. Hadley On Thu, Jul 28, 2011 at 5:23 PM, Stavros Macrakis macra...@alum.mit.edu wrote: I have a file of data where each line

[R] [R-pkgs] plyr version 1.6

2011-07-30 Thread Hadley Wickham
# plyr plyr is a set of tools for a common set of problems: you need to __split__ up a big data structure into homogeneous pieces, __apply__ a function to each piece and then __combine__ all the results back together. For example, you might want to: * fit the same model each patient subsets of

Re: [R] 'breackpoints' (package 'strucchange'): 2 blocking error messages when using for multiple regression model testing

2011-07-29 Thread Hadley Wickham
struc.test <- breakpoints(y~x1+x2+x3+x3+x4, data=D) *I get an error message:*  Erreur dans chol2inv(qr.R(fm$qr)) :  l'élément (5, 5) est nul, donc l'inverse ne peut être calculé (sorry for the french version, I don't know how to get the message english translation in R). My first

Re: [R] http://www.r-project.org/contributors.html: display problem, because no character set is defined

2011-07-29 Thread Hadley Wickham
And I think Uwe is missing from that list! Hadley On Fri, Jul 29, 2011 at 3:34 PM, Paul Menzel paulepan...@users.sourceforge.net wrote: Dear R webmasters, my browser defaults to the charset UTF-8 and since [1] seems to be encoded in ISO-8859-1 the umlauts are not displayed correctly. It

Re: [R] Reading name-value data

2011-07-28 Thread Hadley Wickham
Use plyr::rbind.fill? That does match up columns by name. Hadley On Thu, Jul 28, 2011 at 5:23 PM, Stavros Macrakis macra...@alum.mit.edu wrote: I have a file of data where each line is a series of name-value pairs, but where the names are not necessarily the same from line to line, e.g.    

Re: [R] squared pie chart - is there such a thing?

2011-07-21 Thread Hadley Wickham
This is called a squarified pie chart or a waffle chart (if you want to keep the food metaphor going): http://eagereyes.org/communication/Engaging-readers-with-square-pie-waffle-charts.html Hadley On Thu, Jul 21, 2011 at 10:29 AM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote:

Re: [R] Odd behaviour of as.POSIXct

2011-07-16 Thread Hadley Wickham
Also, if we make days a list, the class attributes are kept when looping over the list, i.e. days <- list(as.Date(c("2000-01-01", "2000-01-02"))) Do you realise that that's a list with length one? I suspect you want days <- as.list(as.Date(c("2000-01-01", "2000-01-02"))) for (day in days) {
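To make the difference concrete, here is a small base R sketch. The variable names are mine, not from the thread; it only shows what `list()` and `as.list()` do with a Date vector and what a `for` loop sees in each case:

```r
dates <- as.Date(c("2000-01-01", "2000-01-02"))

one_element <- list(dates)     # a list holding the whole vector
per_day     <- as.list(dates)  # one list element per date

length(one_element)
length(per_day)

# Looping over the bare vector drops the class attribute ...
class_from_vector <- character(0)
for (day in dates) class_from_vector <- c(class_from_vector, class(day))

# ... while looping over the list keeps each element a Date.
class_from_list <- character(0)
for (day in per_day) class_from_list <- c(class_from_list, class(day))

print(class_from_vector)
print(class_from_list)
```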

Re: [R] Save generic plot to file (before rendering to device)

2011-07-16 Thread Hadley Wickham
Thank you, this is very helpful. One final question regarding this method: suppose a function prints multiple plots, i.e. multiple pages to a PDF. Is it possible to record all of these plots at once? The code below only records the final plot. I would like to record all of them, without

Re: [R] grey colored lines and overwriting labels i qqplot2

2011-07-15 Thread Hadley Wickham
You should only have one scale_ call for each scale type. Here, you have three scale_colour_ calls, the first selecting a grey scale, the second defining a single break with its label (and thus implicitly subsetting on that single break value), and a third which defines a different

Re: [R] Referencing a vector of data labels in ggplot function

2011-07-09 Thread Hadley Wickham
Maybe something like this? withNames <- function(dframe, lineNames, plotName, colors){ one_day <- subset(dframe, data == '1941-06-16') one_day$lineNames <- lineNames ggplot(dframe, aes(date, value, group = factor, color = factor)) + geom_line(size = 1) + facet_grid(Facet~., scales =

Re: [R] coefficients lm of data.frame

2011-07-07 Thread Hadley Wickham
On Thu, Jul 7, 2011 at 5:24 PM, Dennis Murphy djmu...@gmail.com wrote: Hi: Here's another approach using the plyr package: library(plyr) df <- data.frame(gp = factor(rep(1:3, each = 4)), x = rnorm(12), y = rnorm(12)) mylst <- split(df, df$gp) mycoefs <- ldply(mylst, function(d) coef(lm(y ~
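For comparison, the same split-and-fit idea can be sketched in base R without plyr. The quoted code is cut off inside the formula, so the model `y ~ x` below is an assumption, as is the seed:

```r
set.seed(1)  # arbitrary seed, only to make the sketch repeatable
df <- data.frame(gp = factor(rep(1:3, each = 4)),
                 x = rnorm(12), y = rnorm(12))

# Fit one model per group and collect the coefficients;
# sapply returns coefficients in rows, so transpose to one row per group.
coefs <- t(sapply(split(df, df$gp),
                  function(d) coef(lm(y ~ x, data = d))))
print(coefs)
```

The result is a matrix with one row per level of `gp`, rather than the data frame that ldply would return.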

[R] [R-pkgs] stringr 0.5

2011-07-01 Thread Hadley Wickham
# stringr Strings are not glamorous, high-profile components of R, but they do play a big role in many data cleaning and preparation tasks. R provides a solid set of string operations, but because they have grown organically over time, they can be inconsistent and a little hard to learn.

Re: [R] Points but no lines in qplot.

2011-06-30 Thread Hadley Wickham
On Thu, Jun 30, 2011 at 7:31 AM, Ashim Kapoor ashimkap...@gmail.com wrote: Dear R helpers, I have molten data which is : - t3   Year       variable        value 1  2005     ICICI.Bank 27488370 2  2006     ICICI.Bank 43166850 3  2007     ICICI.Bank 59515300 4  2008    

Re: [R] ddply to count frequency of combinations

2011-06-21 Thread Hadley Wickham
Here's a non plyr approach:

tab.df <- as.data.frame(table(df))
tab.df[tab.df$Freq > 0, ]
   x y Freq
1  1 1    2
4  4 1    1
7  2 2    2
13 3 3    1
18 3 4    1
19 4 4    1
25 5 5    1

But look at str(tab.df) - x and y are now factors (or characters). I wrote count to avoid this
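The factor issue can be seen in a short base R sketch. The input `df` is not shown in the thread, so the data below are hypothetical, picked only so that the counts match the output quoted above:

```r
# Hypothetical data, chosen to reproduce the counts shown in the post.
df <- data.frame(x = c(1, 1, 4, 2, 2, 3, 3, 4, 5),
                 y = c(1, 1, 1, 2, 2, 3, 4, 4, 5))

tab.df <- as.data.frame(table(df))
tab.df <- tab.df[tab.df$Freq > 0, ]

# table() has turned the numeric columns into factors.
was_factor <- is.factor(tab.df$x) && is.factor(tab.df$y)

# Going through character recovers the original numeric values.
tab.df$x <- as.numeric(as.character(tab.df$x))
tab.df$y <- as.numeric(as.character(tab.df$y))
str(tab.df)
```

Converting with `as.numeric()` directly on the factor would return the level codes instead of the values, which is why the detour through `as.character()` is needed.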

Re: [R] Reshape:cast; error using ... in formula expression.

2011-06-12 Thread Hadley Wickham
Yes, the basic problem is that you forgot to melt the data before trying to cast it. Hadley On Thursday, June 9, 2011, misterbray misterb...@gmail.com wrote: Dennis, doing some more research, and it seems you actually can include the ... term directly in the formula: cf. page 8 of

Re: [R] re-write plot function for ggplot

2011-06-02 Thread Hadley Wickham
Doesn't deal with what problems? Hadley On Thursday, June 2, 2011, rmje robinmje...@gmail.com wrote: I have been browsing the pages about ggplot and it really doesn't deal with such problems as far as I can see. -- View this message in context:

Re: [R] reshape::cast: invalid 'yinds' argument

2011-05-31 Thread Hadley Wickham
Hi Albert-Jan, It's impossible to know what went wrong without a reproducible example (https://github.com/hadley/devtools/wiki/Reproducibility). Without that, all I can recommend is trying out reshape2. Hadley On Tue, May 31, 2011 at 9:44 AM, Albert-Jan Roskam fo...@yahoo.com wrote: Hi, I'm

Re: [R] Question about ggplot2

2011-05-26 Thread Hadley Wickham
If a function uses substitute() or its equivalent to avoid evaluating its arguments in the normal way, you are pretty much forced to use eval() with the output of substitute() or call() or use do.call() to evaluate the arguments it will not evaluate for itself. Which is why I'd argue all
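A small base R sketch of the situation described here. `nse_filter` is a made-up function that captures its `cond` argument with substitute(); it is not from any package. It shows why a condition held in a variable has to be spliced in with do.call() or eval():

```r
# Hypothetical function using non-standard evaluation.
nse_filter <- function(df, cond) {
  cond_expr <- substitute(cond)               # capture, do not evaluate
  rows <- eval(cond_expr, df, parent.frame()) # evaluate inside the data
  df[rows, , drop = FALSE]
}

df <- data.frame(x = 1:5, y = c(10, 20, 30, 40, 50))

# Interactive use is convenient:
direct <- nse_filter(df, x > 3)

# Programmatic use: the condition lives in a variable, so it has to be
# inserted into the call before the function captures it.
cond <- quote(x > 3)
via_do_call <- do.call(nse_filter, list(df, cond))
via_eval <- eval(substitute(nse_filter(df, C), list(C = cond)))

print(via_do_call)
```

Calling `nse_filter(df, cond)` directly would capture the symbol `cond` rather than the expression stored in it, which is the awkwardness the post is pointing at.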

[R] R development master class: SF, June 8-9

2011-05-12 Thread Hadley Wickham
Hi all, I hope you don't mind the slightly off topic email, but I'm going to be teaching an R development master class in San Francisco on June 8-9. The basic idea of the class is to help you write better code, focused on the mantra of "do not repeat yourself". In day one you will learn powerful
