Re: [R] ddply with mean and max...

2011-05-11 Thread Hadley Wickham
That's the ticket!  So mean is already set up to operate on columns but max and min are not?  I guess it's not too important now I know ... but what's going on in the background that makes that happen? Basically, this: mean.data.frame function (x, ...) sapply(x, mean, ...) environment:
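The column-wise behaviour discussed in this thread can be written out explicitly with sapply(), which avoids relying on whether a particular function has a data frame method. A minimal sketch, assuming a made-up numeric data frame (the column names are illustrative only):

```r
# Hypothetical example data; not taken from the thread.
df <- data.frame(a = c(1, 2, 3), b = c(10, 20, 60))

# Apply each summary to every column; the result is a named vector.
col_means <- sapply(df, mean)
col_maxes <- sapply(df, max)
col_mins  <- sapply(df, min)
```

The same pattern works for any function that takes a vector and returns a single value.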

Re: [R] Simple loop

2011-05-07 Thread Hadley Wickham
Using paste(Site,Prof) when calling ave() is ugly, in that it forces you to consider implementation details that you expect ave() to take care of (how does paste convert various types to strings?).  It also courts errors since paste("A B", "C") and paste("A", "B C") give the same result but
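The collision described here is easy to reproduce. The sketch below uses a made-up data frame (only the column names Site and Prof come from the thread) and compares a pasted key with passing the grouping variables to ave() separately:

```r
# Both calls build the same key from different group labels.
same_key <- identical(paste("A B", "C"), paste("A", "B C"))

# Hypothetical data with two genuinely different Site/Prof groups.
d <- data.frame(
  Site  = c("A B", "A B", "A", "A"),
  Prof  = c("C", "C", "B C", "B C"),
  value = c(1, 3, 10, 30)
)

# Pasted key: the two groups are merged into one.
merged <- ave(d$value, paste(d$Site, d$Prof))

# Separate grouping arguments: the two groups stay distinct.
d$group_mean <- ave(d$value, d$Site, d$Prof)
```

With the pasted key every row gets the overall mean, while the second call returns one mean per Site/Prof combination.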

Re: [R] Empty Data Frame

2011-04-27 Thread Hadley Wickham
On Wed, Apr 27, 2011 at 4:58 AM, Dennis Murphy djmu...@gmail.com wrote: Hi: You could try something like df <- data.frame( expand.grid( Week = 1:52, Year = 2002:2011 )) expand.grid already returns a data frame... You might want KEEP.OUT.ATTRS = F though. Even if it feels like you are yelling
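A short sketch of the point made in the reply: expand.grid() can be used on its own, and KEEP.OUT.ATTRS = FALSE drops the extra "out.attrs" attribute it would otherwise attach. The variable name grid is just for illustration:

```r
# One row per Week/Year combination; no data.frame() wrapper needed.
grid <- expand.grid(Week = 1:52, Year = 2002:2011, KEEP.OUT.ATTRS = FALSE)

is_df  <- is.data.frame(grid)
n_rows <- nrow(grid)
```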

Re: [R] setting options only inside functions

2011-04-27 Thread Hadley Wickham
This has the side effect of ignoring errors and even hiding the error messages.  If you are concerned about multiple calls to on.exit() in one function you could define a new function like  withOptions <- function(optionList, expr) {   oldOpts <- options(optionList)  
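The definition in the snippet above is cut off, so the following is only a sketch of how such a helper could be completed, not the code from the original message. The name with_options_sketch is hypothetical. It relies on on.exit() so the old options are restored even when the expression fails, and the error still propagates:

```r
with_options_sketch <- function(option_list, expr) {
  old_opts <- options(option_list)        # set new values, remember the old ones
  on.exit(options(old_opts), add = TRUE)  # restore on normal exit or on error
  force(expr)                             # evaluate expr while the options are set
}

before <- getOption("digits")
inside <- with_options_sketch(list(digits = 3), getOption("digits"))
after  <- getOption("digits")
```

Because expr is a lazily evaluated argument, it is only evaluated at force(expr), after the new options are in place.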

Re: [R] MASS fitdistr with plyr or data.table?

2011-04-27 Thread Hadley Wickham
On Wed, Apr 27, 2011 at 3:55 PM, Justin Haynes jto...@gmail.com wrote: I am trying to extract the shape and scale parameters of a wind speed distribution for different sites.  I can do this in a clunky way, but I was hoping to find a way using data.table or plyr.  However, when I try I am met

Re: [R] setting options only inside functions

2011-04-27 Thread Hadley Wickham
Put together a list and we can see what might make sense.  If we did take this on it would be good to think about providing a reasonable mechanism for addressing the small flaw in this function as it is defined here. In devtools, I have: #' Evaluate code in specified locale. with_locale <-

Re: [R] Problem with ddply in the plyr-package: surprising output of a date-column

2011-04-25 Thread Hadley Wickham
If you need plyr for other tasks you ought to use a different class for your date data (or wait until plyr can deal with POSIXlt objects). How do you get POSIXlt objects into a data frame? df - data.frame(x = as.POSIXlt(as.Date(c(2008-01-01 str(df) 'data.frame': 1 obs. of 1 variable:

Re: [R] taking rows from data.frames in list to form new data.frame?

2011-04-21 Thread Hadley Wickham
On Wed, Apr 20, 2011 at 6:36 PM, Dennis Murphy djmu...@gmail.com wrote: Hi: Perhaps you're looking for subset()? I'm not sure I understand the problem completely, but is do.call(rbind, lapply(database, function(df) subset(df, Symbol == 'IBM'))) or library(plyr) ldply(lapply(database,

Re: [R] (no subject)

2011-04-18 Thread Hadley Wickham
Yes, it's fixed and a new version of plyr has been pushed up to cran - hopefully will be available for download soon. In the meantime, I think you can fix it by running library(stats) before library(ggplot2). Hadley On Sun, Apr 17, 2011 at 3:51 PM, Bryan Hanson han...@depauw.edu wrote: Is

Re: [R] Is there a better way to parse strings than this?

2011-04-14 Thread Hadley Wickham
I was trying strsplit(string,"\.\.\.") as per the suggestion in Venables and Ripley's book to (use '\.' to match '.'), which is in the Regular expressions section. I noticed that in the suggestions sent to me people used: strsplit(test,"\\.\\.\\.") Could anyone please explain why I should have
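The doubled backslashes come from two layers of escaping: the R parser turns "\\." into the two characters \. and the regular expression engine then reads \. as a literal dot. A single "\." is rejected by the parser as an unrecognized escape. A small sketch, using one of the strings from the related thread:

```r
x <- "A5.Brands.bought...Dulux"

# "\\." in R source is the regex \. which matches a literal dot.
parts_regex <- strsplit(x, "\\.\\.\\.")[[1]]

# fixed = TRUE treats the pattern as a plain string, so no escaping is needed.
parts_fixed <- strsplit(x, "...", fixed = TRUE)[[1]]
```

Both calls split the string at the three consecutive dots and leave the single dots alone.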

Re: [R] Is there a better way to parse strings than this?

2011-04-13 Thread Hadley Wickham
On Wed, Apr 13, 2011 at 5:18 AM, Dennis Murphy djmu...@gmail.com wrote: Hi: Here's one approach: strings <- c( "A5.Brands.bought...Dulux", "A5.Brands.bought...Haymes", "A5.Brands.bought...Solver", "A5.Brands.bought...Taubmans.or.Bristol", "A5.Brands.bought...Wattyl", "A5.Brands.bought...Other")

Re: [R] R plots pdf() does not allow spotcolors?

2011-04-13 Thread Hadley Wickham
Even so, this would depend on what your publisher/printer requires in what you submit. It would be important to obtain from them a full and exact specification of what they require for colour printing in files submitted to them for printing. No one else has mentioned this, but the publisher

[R] Line plots in base graphics

2011-04-13 Thread Hadley Wickham
Am I missing something obvious on how to draw multi-line plots in base graphics? In ggplot2, I can do: data(Oxboys, package = "nlme") library(ggplot2) qplot(age, height, data = Oxboys, geom = "line", group = Subject) But in base graphics, the best I can come up with is this: with(Oxboys,

Re: [R] Line plots in base graphics

2011-04-13 Thread Hadley Wickham
On Wed, Apr 13, 2011 at 2:58 PM, Ben Bolker bbol...@gmail.com wrote: Hadley Wickham hadley at rice.edu writes: Am I missing something obvious on how to draw multi-line plots in base graphics? In ggplot2, I can do: data(Oxboys, package = "nlme") library(ggplot2) qplot(age, height, data

Re: [R] Fwd: CRAN problem with plyr-1.4.1

2011-04-12 Thread Hadley Wickham
Then, can we have the ERROR message, please? Otherwise the only explanation I can guess is that a mirror grabs the contents of a repository exactly in the second the repository is updated and that is unlikely, particularly if more than one mirror is involved. Isn't one possible explanation

[R] [R-pkgs] plyr: version 1.5

2011-04-11 Thread Hadley Wickham
# plyr plyr is a set of tools for a common set of problems: you need to __split__ up a big data structure into homogeneous pieces, __apply__ a function to each piece and then __combine__ all the results back together. For example, you might want to: * fit the same model each patient subsets of

Re: [R] R licence

2011-04-07 Thread Hadley Wickham
If all you need is loess, I suspect it would be cheaper to re-write it in C# than to get a considered legal opinion on the matter. Hadley On Thu, Apr 7, 2011 at 2:45 AM, Stanislav Bek stanislav.pavel@gmail.com wrote: Hi, is it possible to use some statistic computing by R in proprietary

Re: [R] Windrose Percent Interval Frequencies Are Non Linear! Help!

2011-04-07 Thread Hadley Wickham
Does anyone with specific windrose experience know how to adjust the graphic such that the data and the percent intervals are evenly spaced? Hopefully I am making sense here How about giving us a reproducible example? Code is better than mere description; code + description is best.

Re: [R] merging data list in to single data frame

2011-04-04 Thread Hadley Wickham
filelist = list.files(pattern = "K*cd.txt") # the file names are K1cd.txt .to K200cd.txt It's very easy: names(filelist) <- basename(filelist) data_list <- ldply(filelist, read.table, header=T, comment=";", fill=T) Hadley -- Assistant Professor / Dobelman Family Junior Chair
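One detail worth keeping in mind with the call quoted above: the pattern argument of list.files() is a regular expression, not a shell wildcard, so "K*cd.txt" matches more loosely than it looks. The sketch below uses grepl() on made-up file names (hypothetical, chosen to show the difference) instead of touching the file system:

```r
# Hypothetical file names for illustration.
files <- c("K1cd.txt", "K200cd.txt", "Kcd_txt.bak", "notes.txt")

# Read as a regex, "K*cd.txt" means: zero or more K, "cd", any character, "txt".
loose <- grepl("K*cd.txt", files)

# Anchored regular expression: K, one or more digits, then the literal "cd.txt".
strict <- grepl("^K[0-9]+cd\\.txt$", files)

# glob2rx() converts a shell-style wildcard into an anchored regular expression.
from_glob <- grepl(glob2rx("K*cd.txt"), files)
```

Either the strict pattern or the glob2rx() result can be passed as pattern to list.files().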

Re: [R] subset and as.POSIXct / as.POSIXlt oddness

2011-03-24 Thread Hadley Wickham
On Thu, Mar 24, 2011 at 8:29 AM, Michael Bach pha...@gmail.com wrote: Dear R users, Given this data: x - seq(1,100,1) dx - as.POSIXct(x*900, origin=2007-06-01 00:00:00) dfx - data.frame(dx) Now to play around for example: subset(dfx, dx as.POSIXct(2007-06-01 16:00:00)) Ok. Now for

Re: [R] How create vector that sums correct responses for multiple subjects?

2011-03-24 Thread Hadley Wickham
On Thu, Mar 24, 2011 at 2:24 PM, Kevin Burnham kburn...@gmail.com wrote: I have a data file which indicates pretest scores for a linguistics experiment.  The data are in long form so for each of 33 subjects there are 400 rows, one for each item on the test, and there is a column called

Re: [R] Popularity of R, SAS, SPSS, Stata, Statistica, S-PLUS updated

2011-03-22 Thread Hadley Wickham
I don't doubt that R may be the most popular in terms of discussion group traffic, but you should be aware that the traffic for SAS comprises two separate lists that used to be mirrored, but are no longer linked Usenet --  news://comp.soft-sys.sas  (what you counted) listserve -- SAS-L

Re: [R] assigning to list element within target environment

2011-03-17 Thread Hadley Wickham
On Thu, Mar 17, 2011 at 7:25 AM, Richard D. Morey r.d.mo...@rug.nl wrote: I would like to assign a value to an element of a list contained in an environment. The list will contain vectors and matrices. Here's a simple example: # create toy environment testEnv = new.env(parent = emptyenv())

Re: [R] Strange R squared, possible error

2011-03-17 Thread Hadley Wickham
2) I don't want to fit data with linear model of zero intercept. 3) I don't know if I understand correctly. I'm 100% sure the model for my data should have zero intercept. The only coordinate which I'm 100% sure is correct. If I had measured quality Y of a same sample X0 number of times I would

Re: [R] Persistent storage between package invocations

2011-03-16 Thread Hadley Wickham
No.  First, please use path.expand("~") for this, and it does not necessarily mean the home directory (and in principle it might not expand at all).  In practice I think it will always be *a* home directory, but on Windows there may be more than one (and watch out for local/roaming profile

Re: [R] File Save As...

2011-03-16 Thread Hadley Wickham
No, defaults are evaluated in the evaluation frame of the function. That's why you can use local variables in them, e.g. the way rgamma uses 1/rate as a default for scale. Oops, yes, I was getting confused with promises - non-missing arguments are promises evaluated in the parent frame. But
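The distinction drawn in this exchange can be seen in a few lines. In the sketch below, f is a made-up function modelled on the rgamma example mentioned above: its default for scale is evaluated lazily inside the function, while a supplied argument is evaluated in the caller's frame:

```r
f <- function(rate = 2, scale = 1 / rate) {
  rate <- 10   # local change made before 'scale' is first used
  scale
}

# Default argument: evaluated in the function's own frame, so it sees rate = 10.
default_used <- f()

# Supplied argument: a promise evaluated in the calling frame, where rate = 4.
rate <- 4
supplied <- f(scale = 1 / rate)
```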

Re: [R] proportional symbol map ggplot

2011-03-16 Thread Hadley Wickham
On Mon, Mar 14, 2011 at 9:41 AM, Strategische Analyse CSD Hasselt csd...@fedpolhasselt.be wrote: Hello, we want to plot a proportional symbol map with ggplot. Symbols' area should have the same proportions as the scaled variable. Hereby an example we found on

Re: [R] Does R have a const object?

2011-03-16 Thread Hadley Wickham
It's useful for being able to set defaults for arguments that do not have defaults.  That cannot break existing programs. Until the next program decides to change those defaults and either can't or does and you end up with incompatible assumptions.  It also make the code with the added

[R] Persistent storage between package invocations

2011-03-15 Thread Hadley Wickham
Hi all, Does anyone have any advice or experience storing package settings between R runs? Can I rely on the user's home directory (e.g. tools::file_path_as_absolute("~")) to be available and writeable across platforms? Hadley -- Assistant Professor / Dobelman Family Junior Chair Department of

Re: [R] Changing colour of continuous time-series in ggplot2

2011-03-15 Thread Hadley Wickham
You need to specify the group aesthetic - that defines how observations are grouped into instances of a geom. Hadley On Tue, Mar 15, 2011 at 8:37 AM, joeP joseph.parr...@bt.com wrote: Hi, This seems like there should be a simple answer, but having spent most of the day trying to find it, I'm

Re: [R] File Save As...

2011-03-15 Thread Hadley Wickham
The bigger issue is that R can't tell the location of an open script, which makes it harder to create new versions of existing work But it can.  If you open a script and choose save, it will be saved to the same place.  Or do you mean an executing script?  There are indirect ways to find

Re: [R] File Save As...

2011-03-15 Thread Hadley Wickham
Could getSrcFilename() gain a default argument so that getSrcFilename() would by default return the path of the executing script? No, it needs to see a function defined in that script. But I thought default arguments were evaluated in the parent environment? Does that not follow for source

Re: [R] dataframe to a timeseries object

2011-03-14 Thread Hadley Wickham
Well, I'd start by removing all explicit use of environments, which makes your code very hard to follow. Hadley On Monday, March 14, 2011, Daniele Amberti daniele.ambe...@ors.it wrote: I found that plyr:::daply is more efficient than base:::by (am I doing something wrong?), below updated code

Re: [R] dataframe to a timeseries object

2011-03-14 Thread Hadley Wickham
))) res - as.timeSeries(cbind(t(res))) stopWorkers(w) -Original Message- From: h.wick...@gmail.com [mailto:h.wick...@gmail.com] On Behalf Of Hadley Wickham Sent: 14 March 2011 12:48 To: Daniele Amberti Cc: r-help@r-project.org Subject: Re: [R] dataframe to a timeseries object

Re: [R] increase a value by each group?

2011-03-14 Thread Hadley Wickham
On Mon, Mar 14, 2011 at 9:59 AM, ONKELINX, Thierry thierry.onkel...@inbo.be wrote: Something like this? my_data=read.table("clipboard", header=TRUE) my_data$s_name <- factor(my_data$s_name) library(plyr) ddply(my_data, .(s_name), function(x){        x$Im_looking <- x$Depth +

Re: [R] Need Assistance in Stacked Area plot

2011-03-13 Thread Hadley Wickham
You might try sending a reproducible example (https://github.com/hadley/devtools/wiki/Reproducibility) to the ggplot2 mailing list. Hadley On Wed, Feb 16, 2011 at 8:41 AM, Kishorenalluri kishorenalluri...@gmail.com wrote: Dear All, I need the assistance to plot the stacked area plot using

Re: [R] Vector of weekly dates starting with a given date

2011-03-09 Thread Hadley Wickham
On Wed, Mar 9, 2011 at 3:04 PM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote: Hello! I have a date (a Monday): date<-20081229 mydates<-as.Date(as.character(date),"%Y%m%d") What package would allow me to create a vector that starts with that date (mydates) and contains dates for
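The reply to this question is cut off in the archive, so the following is not necessarily what was suggested. It is a base R sketch showing that seq() on a Date can produce the weekly vector without an extra package. The length of 5 is an arbitrary choice for illustration:

```r
# Parse the starting Monday from the question.
start <- as.Date(as.character(20081229), format = "%Y%m%d")

# Five consecutive weekly dates, beginning with the start date.
weekly <- seq(start, by = "week", length.out = 5)
```

Using to = instead of length.out = would stop the sequence at a given end date.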

Re: [R] R usage survey

2011-03-04 Thread Hadley Wickham
Ok, I am very interested in what methods you plan to use that would be fit under the description suitably analyzed for voluntary response data.  From my training and experience the only suitable thing to do with voluntary response data is to put it through the shredder, into the recycle

Re: [R] The L Word

2011-02-24 Thread Hadley Wickham
Note however that I've never seen evidence for a *practical* difference in simple cases, and also of such cases as part of a larger computation. But I'm happy to see one if anyone has an interesting example. E.g., I would typically never use  0L:100L  instead of 0:100 in an R script because

Re: [R] monitor variable change

2011-02-16 Thread Hadley Wickham
You can replace the previous line by: browser(expr=(a!=old.a)) see ?browser for details. I don't understand why you'd want to do that - using if is much more readable to me (and is much more general!) Hadley -- Assistant Professor / Dobelman Family Junior Chair Department of Statistics /

Re: [R] monitor variable change

2011-02-16 Thread Hadley Wickham
One way to implement this functionality is with a task manager callback: watch <- function(varname) { old <- get(varname) changed <- function(...) { new <- get(varname) if (!identical(old, new)) { message(varname, " is now ", new) old <<- new } TRUE }

Re: [R] Error when modifying names of the object returned by get()

2011-02-15 Thread Hadley Wickham
You can probably do this by constructing a call to the `names<-` replacement function, but it's really bad style.  Don't write R code that has external side effects if you can avoid it.  In this case, you'll almost certainly get more maintainable code by writing your function to return a copy

Re: [R] how to order POSIXt objects ?

2011-02-14 Thread Hadley Wickham
It's a bit better to use xtfrm. Hadley On Monday, February 14, 2011, jim holtman jholt...@gmail.com wrote: 'unclass' it first(assuming that it is POSIXct) -unclass(mytime) On Mon, Feb 14, 2011 at 3:55 AM, JonC jon_d_co...@yahoo.co.uk wrote: I have a problem ordering by descending magnitude
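A brief sketch of the xtfrm() suggestion, using made-up timestamps. xtfrm() returns a numeric vector that sorts in the same order as the original object, so it can be negated to get a descending order:

```r
# Hypothetical timestamps; tz is fixed so the example does not depend on locale.
times <- as.POSIXct(
  c("2011-02-14 03:55:00", "2011-02-14 09:00:00", "2011-02-13 12:00:00"),
  tz = "UTC"
)

# Indices that put the most recent time first.
desc <- order(-xtfrm(times))
sorted_desc <- times[desc]
```

For a single sort key, order(times, decreasing = TRUE) gives the same result. The negated xtfrm() form is mainly useful when mixing ascending and descending keys in one order() call.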

Re: [R] aggregate function - na.action

2011-02-07 Thread Hadley Wickham
On Mon, Feb 7, 2011 at 5:54 AM, Matthew Dowle mdo...@mdowle.plus.com wrote: Looking at the timings by each stage may help :   system.time(dt <- data.table(dat))   user  system elapsed   1.20    0.28    1.48   system.time(setkey(dt, x1, x2, x3, x4, x5, x6, x7, x8))   # sort by the 8 columns

Re: [R] aggregate function - na.action

2011-02-07 Thread Hadley Wickham
Does FAQ 1.8 answer that ok ?   Ok, I'm starting to see what data.table is about, but why didn't you enhance data.frame in R? Why does it have to be a new package?   http://datatable.r-forge.r-project.org/datatable-faq.pdf Kind of. I think there are two sets of features data.table provides:

Re: [R] aggregate function - na.action

2011-02-06 Thread Hadley Wickham
There's definitely something amiss with aggregate() here since similar functions from other packages can reproduce your 'control' sum. I expect ddply() will have some timing issues because of all the subgrouping in your data frame, but data.table did very well and the summaryBy() function in

Re: [R] Counting number of rows with two criteria in dataframe

2011-01-26 Thread Hadley Wickham
On Wed, Jan 26, 2011 at 5:27 AM, Dennis Murphy djmu...@gmail.com wrote: Hi: Here are two more candidates, using the plyr and data.table packages: library(plyr) ddply(X, .(x, y), function(d) length(unique(d$z)))  x y V1 1 1 1  2 2 1 2  2 3 2 3  2 4 2 4  2 5 3 5  2 6 3 6  2 The

Re: [R] How to reshape wide format data.frame to long format?

2011-01-20 Thread Hadley Wickham
I think I should be able to do this using the reshape function, but I cannot get it to work. I think I need some help to understand this... (If I could split the variable into three separate columns splitting by ., that would be even better.) Use strsplit and [ Or colsplit, from reshape,

Re: [R] ggplot2, geom_hline and facet_grid

2011-01-20 Thread Hadley Wickham
...@gmail.com [h.wick...@gmail.com] On Behalf Of Hadley Wickham [had...@rice.edu] Sent: 19 January 2011 15:11 To: Small Sandy (NHS Greater Glasgow Clyde) Cc: r-help@r-project.org Subject: Re: [R] ggplot2, geom_hline and facet_grid Hi Sandy, It's difficult to know

Re: [R] ggplot2, geom_hline and facet_grid

2011-01-19 Thread Hadley Wickham
Hi Sandy, It's difficult to know what's going wrong without a small reproducible example (https://github.com/hadley/devtools/wiki/Reproducibility) - could you please provide one? You might also have better luck with an email directly to the ggplot2 mailing list. Hadley On Wed, Jan 19, 2011 at

Re: [R] dataframe: string operations on columns

2011-01-18 Thread Hadley Wickham
how can I perform a string operation like strsplit(x, )  on a column of a dataframe, and put the first or the second item of the split into a new dataframe column? (so that on each row it is consistent) Have a look at str_split_fixed in the stringr package. Hadley -- Assistant Professor /

Re: [R] how to cut a multidimensional array along a chosen dimension and store each piece into a list

2011-01-17 Thread Hadley Wickham
On Mon, Jan 17, 2011 at 2:20 PM, Sean Zhang seane...@gmail.com wrote: Dear R-Helpers, I wonder whether there is a function which cuts a multiple dimensional array along a chosen dimension and then store each piece (still an array of one dimension less) into a list. For example, arr <-

Re: [R] Summing data frame columns on identical data

2011-01-17 Thread Hadley Wickham
library(plyr) # Function to sum y by A-B combinations for a generic data frame dsum - function(d) ddply(d, .(A, B), summarise, sumY = sum(y)) See count in plyr 1.4 for a much much faster way of doing this. Hadley -- Assistant Professor / Dobelman Family Junior Chair Department of Statistics

Re: [R] rootogram for normal distributions

2011-01-16 Thread Hadley Wickham
The normal distribution is a continuous distribution, i.e., the frequency for each observed value will essentially be 1/n and not converge to the density function. Hence, you would need to look at histogram or smoothed densities. Rootograms, on the other hand, are intended for discrete

Re: [R] data prep question

2011-01-16 Thread Hadley Wickham
On Sun, Jan 16, 2011 at 5:48 AM, bill.venab...@csiro.au wrote: Here is one way Here is one way: con - textConnection( + ID              TIME    OBS + 001             2200    23 + 001             2400    11 + 001             3200    10 + 001             4500    22 + 003             3900

Re: [R] median by geometric mean

2011-01-15 Thread Hadley Wickham
exp(median(log(x)))? Hadley On Sat, Jan 15, 2011 at 10:26 AM, Skull Crossbones witch.of.agne...@gmail.com wrote: Hi All, I need to calculate the median for even number of data points.However instead of calculating the arithmetic mean of the two middle values,I need to calculate their
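A worked example of why the one-liner does what was asked: for an even number of points, median() averages the two middle values, and averaging on the log scale before exponentiating is the geometric mean of those two values. The data below are made up, and the approach assumes all values are positive:

```r
x <- c(1, 2, 8, 16)   # middle values are 2 and 8

# Median on the log scale, transformed back: exp((log(2) + log(8)) / 2)
geo_median <- exp(median(log(x)))

# Geometric mean of the two middle values, computed directly for comparison.
direct <- sqrt(2 * 8)
```

For an odd number of points the result is simply the ordinary median, because the log and exp cancel on the single middle value.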

Re: [R] Help with Data Transformation

2011-01-11 Thread Hadley Wickham
The data is initially extracted from an SQL database into Excel, then saved as a tab-delimited text file for use in R. You might also want to look at the SQL packages for R so you can skip this manual step. I'd recommend starting with

[R] [R-pkgs] plyr 1.4

2011-01-04 Thread Hadley Wickham
# plyr plyr is a set of tools for a common set of problems: you need to __split__ up a big data structure into homogeneous pieces, __apply__ a function to each piece and then __combine__ all the results back together. For example, you might want to: * fit the same model each patient subsets of

[R] [R-pkgs] reshape2 1.1

2011-01-04 Thread Hadley Wickham
Reshape2 is a reboot of the reshape package. It's been over five years since the first release of the package, and in that time I've learned a tremendous amount about R programming, and how to work with data in R. Reshape2 uses that knowledge to make a new package for reshaping data that is much

Re: [R] packagename:::functionname vs. importFrom

2011-01-03 Thread Hadley Wickham
Hi Frank, I think you mean packagename::functionname? The three colon form is for accessing non-exported objects. Otherwise, I think using :: vs importFrom is functionally identical - either approach delays package loading until necessary. Hadley On Mon, Jan 3, 2011 at 9:45 PM, Frank Harrell

Re: [R] packagename:::functionname vs. importFrom

2011-01-03 Thread Hadley Wickham
I think you mean packagename::functionname?  The three colon form is for accessing non-exported objects. Normally two colons suffice, but within a package you need three to access exported but un-imported objects :) Are you sure? Note that it is typically a design mistake to use ‘:::’

Re: [R] packagename:::functionname vs. importFrom

2011-01-03 Thread Hadley Wickham
Correct.  I'm doing this because of non-exported functions in other packages, so I need ::: But you really really shouldn't be doing that. Is there a reason that the package authors won't export the functions? I'd still appreciate any insight about whether importFrom in NAMESPACE defers

Re: [R] Writing a single output file

2010-12-30 Thread Hadley Wickham
It looks like you have csv files, so use read.csv instead of read.table. Hadley On Thu, Dec 30, 2010 at 12:18 AM, Amy Milano milano_...@yahoo.com wrote: Dear sir, At the outset I sincerely apologize for reverting back bit late as I was out of office. I thank you for your guidance extended by

Re: [R] pdf() Export Problem: Circles Interpreted as Fonts from ggplot2 Graphics

2010-12-30 Thread Hadley Wickham
The Inkscape user asked if there was any way that R could be coerced to use actual circles or paths for the points. I am not aware of a way to do this so any input from anyone here would be greatly appreciated. pdf(..., useDingbats = F) Hadley -- Assistant Professor / Dobelman Family Junior

[R] [R-pkgs] ggplot2 0.8.9 - Merry Christmas version

2010-12-24 Thread Hadley Wickham
ggplot2 ggplot2 is a plotting system for R, based on the grammar of graphics, which tries to take the good parts of base and lattice graphics and avoid bad parts. It takes care of many of the fiddly details that make plotting a hassle

Re: [R] Writing a single output file

2010-12-23 Thread Hadley Wickham
input <- do.call(rbind, lapply(fileNames, function(.name){ +     .data <- read.table(.name, header = TRUE, as.is = TRUE) +     # add file name to the data +     .data$file <- .name +     .data + })) You can simplify this a little with plyr: fileNames <- list.files(pattern = "file.*.csv")

Re: [R] Coding a new variable based on criteria in a dataset

2010-12-22 Thread Hadley Wickham
 It isn't quite convenient to read the data posted below into R (if it was originally tab-separated, that formatting got lost) but ddply from the plyr package is good for this: something like (untested)  d <- with(data,ddply(data,interaction(UniqueID,Reason),                    function(x) {

Re: [R] How to change the default location of x-axis in ggplot2?

2010-12-22 Thread Hadley Wickham
In ggplot2, by default the x-axis is in the bottom of the graph and y-axis is in the left of the graph. I wonder if it is possible to: 1. put the x axis in the top, or put the y axis in the right? 2. display x axis in both the top and bottom? These are on the to do list. 3. display x axis

Re: [R] ggplot2 histograms

2010-12-13 Thread Hadley Wickham
: 01412114592) From: h.wick...@gmail.com [h.wick...@gmail.com] On Behalf Of Hadley Wickham [had...@rice.edu] Sent: 01 December 2010 14:27 To: Small Sandy (NHS Greater Glasgow Clyde) Cc: ONKELINX, Thierry; r-help@r-project.org Subject: Re: [R] ggplot2

Re: [R] [plyr] Question regarding ddply: use of .(as.name(varname)) and varname in ddply function

2010-12-06 Thread Hadley Wickham
On Mon, Dec 6, 2010 at 3:58 AM, Sunny Srivastava research.b...@gmail.com wrote: Dear R-Helpers: I am trying to use *ddply* to extract min and max of a particular column in a data.frame. I am using two different forms of the function: ## var_name_to_split is a string -- something like

Re: [R] [plyr] Question regarding ddply: use of .(as.name(varname)) and varname in ddply function

2010-12-06 Thread Hadley Wickham
. I am sorry if it is a basic question. Thank you and others for your reply. Best Regards, S. On Mon, Dec 6, 2010 at 5:28 PM, Hadley Wickham had...@rice.edu wrote: On Mon, Dec 6, 2010 at 3:58 AM, Sunny Srivastava research.b...@gmail.com wrote: Dear R-Helpers: I am trying to use

Re: [R] ggplot2 histograms

2010-12-01 Thread Hadley Wickham
However if you do: ggplot(data=dafr, aes(x = d1, fill=d2)) + geom_histogram(binwidth = 1, position = position_dodge(width=0.99)) The position of first bin which goes from 0-2 appears to start at about 0.2 (I accept that there is some white space to the left of this) while the position of

Re: [R] ggplot2 histograms

2010-11-30 Thread Hadley Wickham
You may find it easier to use a frequency polygon, geom = "freqpoly". Hadley On Tue, Nov 30, 2010 at 2:36 PM, Small Sandy (NHS Greater Glasgow Clyde) sandy.sm...@nhs.net wrote: Hi With ggplot2 I can very easily create beautiful histograms but I would like to put two histograms on the same

Re: [R] Help on running regression by grouping firms

2010-11-25 Thread Hadley Wickham
res <- function(x) resid(x) ds_test$u <- do.call(c, llply(mods, res)) I'd be a little careful with this, because there's no guarantee the results will be ordered in the same way as the input (and I'd also prefer ds_test$u <- unlist(llply(mods, res)) or ds_test$u <- laply(mods, res)) In your case,

Re: [R] Go (back) from Rd to roxygen

2010-11-25 Thread Hadley Wickham
Since roxygen is a great help to document R packages, I am wondering if there exists an approach to go back from the raw Rd files to roxygen-documentation? E.g. turn \author{Somebody} into @author Somebody. This sounds ridiculous, but I believe it helps in the long term for me to maintain R

Re: [R] sum in vector

2010-11-17 Thread Hadley Wickham
rowsum(value, paste(factor1, factor2, factor3)) That is dangerous in general, and always inefficient. Imagine factor1 is c("a", "a b") and factor2 is c("b c", "c"). Use interaction with drop = T. Hadley -- Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University
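The example values given in the reply can be run directly. In the sketch below the value vector is made up. With a pasted key the two distinct groups collapse into one row, while interaction(..., drop = TRUE) keeps them apart:

```r
factor1 <- c("a", "a b")
factor2 <- c("b c", "c")
value   <- c(1, 100)   # hypothetical values to be summed per group

# Pasted key: both rows become "a b c", so the sums are merged.
collided <- rowsum(value, paste(factor1, factor2))

# interaction() combines the factor levels themselves, not their printed text.
separate <- rowsum(value, interaction(factor1, factor2, drop = TRUE))
```

drop = TRUE discards level combinations that never occur, which keeps the result small when there are many factors.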

Re: [R] Extending the accuracy of exp(1) in R

2010-11-09 Thread Hadley Wickham
Where the value of exp(1) as computed by R is concerned, you have been deceived by what R displays (prints) on screen. The default is to display any number to 7 digits of accuracy, but that is not the accuracy of the number held internally by R:  exp(1)  # [1] 2.718282  exp(1) - 2.718282  
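The difference between the printed and the stored value can be checked directly. A minimal sketch, assuming the default setting of options(digits = 7):

```r
e <- exp(1)

shown <- format(e)                # what default printing displays
more  <- format(e, digits = 15)   # more of the stored double
gap   <- e - 2.718282             # small, but not zero
```

print(e, digits = 15) has the same effect at the console without changing the global option.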

[R] How to detect if a vector is FP constant?

2010-11-08 Thread Hadley Wickham
Hi all, What's the equivalent to length(unique(x)) == 1 if I want to ignore small floating point differences? Should I look at diff(range(x)) or sd(x) or something else? What cut off should I use? If it helps to be explicit, I'm interested in detecting when a vector is constant for the purpose

Re: [R] How to detect if a vector is FP constant?

2010-11-08 Thread Hadley Wickham
I think this does what you want (borrowing from all.equal.numeric): all(abs((x - mean(x))) < .Machine$double.eps^0.5) with a vector of length 1 million, it took .076 seconds on a fairly old system. Hmmm, maybe I want: all.equal(min(x), max(x)) ? Hadley -- Assistant Professor / Dobelman
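The two ideas floated in this thread can be compared on a small made-up vector. The helper is_fp_constant is hypothetical, and its tolerance simply follows the .Machine$double.eps^0.5 value quoted above. Treat it as a sketch rather than a recommended cut-off:

```r
x <- c(0.1 + 0.2, 0.3, 0.3)   # equal in intent, not bit-for-bit

# Exact comparison: fails because 0.1 + 0.2 is not exactly 0.3.
exact <- length(unique(x)) == 1

# Tolerant comparison of the extremes, as suggested in the reply.
approx <- isTRUE(all.equal(min(x), max(x)))

# Hypothetical helper: spread of the values relative to their magnitude.
is_fp_constant <- function(v, tol = .Machine$double.eps^0.5) {
  diff(range(v)) <= tol * max(1, abs(mean(v)))
}
```

all.equal() returns a character description rather than FALSE when values differ, which is why it is wrapped in isTRUE().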

Re: [R] Heatmap construction problems

2010-11-07 Thread Hadley Wickham
It's hard to know without a minimal reproducible example, but you probably want scale_fill_gradient or scale_fill_gradientn. Hadley On Thu, Oct 28, 2010 at 9:42 AM, Struchtemeyer, Chris stru...@okstate.edu wrote: I am very new to R and don't have any computer program experience whatsoever.  I

Re: [R] ggplot2: facet_grid with only one level does not display the graph with the facet_grid level in title

2010-11-07 Thread Hadley Wickham
This is on my to do list: https://github.com/hadley/ggplot2/issues/labels/facet#issue/107 Hadley On Thu, Oct 28, 2010 at 11:51 AM, Matthew Pettis matthew.pet...@gmail.com wrote: Hi All, Here is the code that I'll be referring to: p <- ggplot(wastran.data, aes(PER_KEY, EVENTS)) (p <- p +    

Re: [R] avoiding too many loops - reshaping data

2010-11-04 Thread Hadley Wickham
Beware of facile comparisons of this sort -- they may be apples and nematodes. And they also imply that the main time sink is the computation. In my experience, figuring out how to solve the problem using takes considerably more time than 18 / 1000 seconds, and so investing your energy in

Re: [R] overloading the generic primitive functions + and [

2010-10-28 Thread Hadley Wickham
Note how S3 methods are dispatched only by reference to the first argument (on the left of the operator). I think S4 beats this by having signatures that can dispatch depending on both arguments. That's somewhat of a simplification for primitive binary operators. R actually looks up the method

Re: [R] Which version control system to learn for managing R projects?

2010-10-26 Thread Hadley Wickham
git is where the world is headed.  This video is a little old: http://www.youtube.com/watch?v=4XpnKHJAok8, but does a good job getting the point across. And lots of R users are using github already: http://github.com/languages/R/created Hadley -- Assistant Professor / Dobelman Family Junior

Re: [R] Forcing results from lm into datframe

2010-10-26 Thread Hadley Wickham
On Tue, Oct 26, 2010 at 11:55 AM, Dennis Murphy djmu...@gmail.com wrote: Hi: When it comes to split, apply, combine, think plyr. library(plyr) ldply(split(afvtprelvefs, afvtprelvefs$basestudy),         function(x) coef(lm (ef ~ quartile, data=x, weights=1/ef_std))) Or do it in two steps:

Re: [R] Which version control system to learn for managing R projects?

2010-10-26 Thread Hadley Wickham
1. What is everyone else using?  The network effect is important since you want people to be able to access your repository and you want to leverage your knowledge of the version control system for other projects' repositories.  To that extent Subversion is the clear choice since it's used on

Re: [R] Find index of a string inside a string?

2010-10-25 Thread Hadley Wickham
Or str_locate: library(stringr) str_locate("aabcd", "bcd") Hadley On Mon, Oct 25, 2010 at 5:53 AM, jim holtman jholt...@gmail.com wrote: I think what you want is 'regexpr': regexpr("bcd", "aabcd") [1] 3 attr(,"match.length") [1] 3 On Mon, Oct 25, 2010 at 7:27 AM, yoav baranan
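The base R call quoted in this thread can be unpacked as follows. This sketch sticks to regexpr() so it runs without extra packages. fixed = TRUE is an addition here, used so the search string is treated literally rather than as a regular expression:

```r
hit <- regexpr("bcd", "aabcd", fixed = TRUE)

start_pos <- as.integer(hit)            # position of the first match
match_len <- attr(hit, "match.length")  # number of characters matched

# A string that does not occur is reported as -1.
missing_pos <- as.integer(regexpr("xyz", "aabcd", fixed = TRUE))
```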

Re: [R] Query on save.image()

2010-10-14 Thread Hadley Wickham
On Thu, Oct 14, 2010 at 11:56 AM, Joshua Wiley jwiley.ps...@gmail.com wrote: Hi, I do not believe you can use the save.image() function in this case. save.image() is a wrapper for save() with defaults for the global environment (your workspace).  Try this instead, I believe it does what you
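
The original question is not visible in this excerpt, so the following is only a sketch of what calling `save()` directly can look like when the objects live somewhere other than the global environment. The environment and object names are made up for illustration.

```r
# Hypothetical environment holding the objects to be saved.
env <- new.env()
assign("a", 1:3, envir = env)
assign("b", "text", envir = env)

# save() accepts an explicit list of names and the environment to look in.
path <- tempfile(fileext = ".RData")
save(list = ls(env), envir = env, file = path)

# Load into a fresh environment to check what was written.
restored <- new.env()
load(path, envir = restored)
ls(restored)
```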

Re: [R] can't find and install reshape2??

2010-10-12 Thread Hadley Wickham
My guess is you are using an outdated R version for which the rather new reshape2 package has not been compiled. I wonder if install.packages() could detect this case (e.g. by also checking if the source version is not available), and offer a more informative error message. Hadley --

Re: [R] Looking for a book/tutorial with the following context:

2010-10-08 Thread Hadley Wickham
Do you also know more references about variables? Unfortunately this was a little bit short so I do not feel 100% sure I completely got it. Try here: http://github.com/hadley/devtools/wiki/Scoping It's a work in progress. Hadley -- Assistant Professor / Dobelman Family Junior Chair

Re: [R] R: Tools for thinking about data analysis and graphics

2010-10-06 Thread Hadley Wickham
On Wed, Oct 6, 2010 at 4:05 PM, Michael Friendly frien...@yorku.ca wrote:  I'm giving a talk about some aspects of language and conceptual tools for thinking about how to solve problems in several programming languages for statistical computing and graphics. I'm particularly interested in

Re: [R] plyr: a*ply with functions that return matrices-- possible bug in aaply?

2010-10-04 Thread hadley wickham
 That is, I want to define something like the following using an a*ply method, but aaply gives a result in which the applied .margin(s) do not appear last in the result, contrary to the documentation for ?aaply.  I think this is a bug, either in the function or the documentation, but perhaps

Re: [R] Issue with match.call

2010-10-04 Thread Hadley Wickham
RFF <- function(qtype, qOpt, ...){} i.e., I have two args that are compulsory and the rest are optional. Now when my user passes the function call, I need to see what optional args are defined and process accordingly...what I have so far is.. RFF <- function(qtype, qOpt, ...){        mc <-
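
One possible way to see which optional arguments were supplied, sketched with base R only. This is not necessarily the answer given in the thread; the argument names `maxRows` and `verbose` are invented for the example.

```r
RFF <- function(qtype, qOpt, ...) {
  # Unevaluated expressions passed through `...` (NULL if none were given).
  dots_unevaluated <- match.call(expand.dots = FALSE)$...
  # Evaluated values of the same arguments.
  dots <- list(...)
  list(supplied = names(dots),
       expressions = dots_unevaluated,
       values = dots)
}

res <- RFF("typeA", "optB", maxRows = 10, verbose = TRUE)
res$supplied   # names of the optional arguments the caller provided
```

Using `list(...)` is usually enough when only the values are needed; `match.call()` is mainly useful when the unevaluated expressions matter.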

Re: [R] Script auto-detecting its own path

2010-10-04 Thread Hadley Wickham
I'm not sure this will solve the issue because if I move the script, I would still have to go into the script and edit the /path/to/my/script.r, or do I misunderstand your workaround? I'm looking for something like: file.path.is.here(myscript.r) and which would return something like: [1]

Re: [R] function which can apply a function by a grouping variable and also hand over an additional variable, e.g. a weight

2010-10-01 Thread Hadley Wickham
You might want to check out the plyr package. Hadley On Fri, Oct 1, 2010 at 6:05 AM, Werner W. pensterfuz...@yahoo.de wrote: Hi, I was wondering if there is an easy way to accomplish the following in R: Often I want to apply a function, e.g. weighted.quantile from the Hmisc package to

Re: [R] Script auto-detecting its own path

2010-09-29 Thread Hadley Wickham
Forgive me if this question has been addressed, but I was unable to find anything in the r-help list or in cyberspace. My question is this: is there a function, or set of functions, that will enable a script to detect its own path? I have tried file.path() but that was not what I was
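
For reference, one commonly used approach is sketched below. It is an assumption that this fits the poster's setup, and it is not necessarily what the thread recommended: it only works when the script is run with `Rscript` (or `R --file=`), because it reads the `--file=` entry from the command line. The helper name `script_path` is hypothetical.

```r
# Hypothetical helper: returns the path of the running script, or NA
# when R was not started with a --file= argument (e.g. interactive use).
script_path <- function() {
  args <- commandArgs(trailingOnly = FALSE)
  file_arg <- grep("^--file=", args, value = TRUE)
  if (length(file_arg) == 0) {
    return(NA_character_)
  }
  normalizePath(sub("^--file=", "", file_arg[1]))
}

p <- script_path()
```

A script sourced from an interactive session has no `--file=` entry, so the function returns `NA` there rather than guessing.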

Re: [R] Problem with ggplot2 - Boxplot

2010-09-22 Thread Hadley Wickham
That implies you need to update your version of plyr. Hadley On Wed, Sep 22, 2010 at 4:10 AM, RaoulD raoul.t.dso...@gmail.com wrote: Hi, I am using ggplot2 to create a boxplot that summarizes a continuous variable. This code works fine for me on one PC however when I use it on another it

Re: [R] parallel computation with plyr 1.2.1

2010-09-16 Thread Hadley Wickham
Yes, this was a little bug that will be fixed in the next release. Hadley On Thu, Sep 16, 2010 at 1:11 PM, Dylan Beaudette debeaude...@ucdavis.edu wrote: Hi, I have been trying to use the new .parallel argument with the most recent version of plyr [1] to speed up some tasks. I can run the

Re: [R] Problems with reshape2 on Mac

2010-09-13 Thread Hadley Wickham
Hi Uwe, The problem is most likely because the original poster doesn't have the latest version of plyr. I correctly declare this dependency in the DESCRIPTION (http://cran.r-project.org/web/packages/reshape2/index.html), but unfortunately R doesn't seem to use this information at run time,

Re: [R] post

2010-09-13 Thread Hadley Wickham
Have a look at: Computing Thousands of Test Statistics Simultaneously in R by Holger Schwender and Tina Müller, in http://stat-computing.org/newsletter/issues/scgn-18-1.pdf Hadley On Mon, Sep 13, 2010 at 4:26 PM, Alexey Ush usha...@yahoo.com wrote: Hello, I have a question regarding how to
