On Mon, Jul 6, 2009 at 8:22 PM, Mark Knecht wrote:
> Hi,
> In the examples from the ReShape package there is a simple example
> of using melt followed by cast that produces a smallish amount of
> output about the chicks database. Here's the code:
>
> library(reshape)
>
> names(ChickWeight) <- tol
On Mon, Jul 6, 2009 at 12:12 AM, nyk wrote:
>
> Thanks for your reply! This is what I was looking for!
> I'm using
> nas1 <- apply(data_matrix,1,function(x)sum(is.na(x))/nrow(data_matrix))
> nas2 <- apply(data_matrix,2,function(x)sum(is.na(x))/ncol(data_matrix))
You can simplify this a little:
pe
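The simplification above is cut off; one common shortcut (an assumption on my part, not necessarily what the replier had in mind) is that rowMeans()/colMeans() on a logical matrix compute NA fractions directly:

```r
# is.na() gives a logical matrix; rowMeans()/colMeans() average the
# TRUE/FALSE values, yielding the fraction of NAs per row / per column
data_matrix <- matrix(c(1, NA, 3, 4, NA, 6), nrow = 2)  # toy data
nas_by_row <- rowMeans(is.na(data_matrix))
nas_by_col <- colMeans(is.na(data_matrix))
```

Note that, unlike the apply() version quoted above, these divide by the length of each row (or column), which is usually the intended denominator.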
Hi Malcolm,
You need to tell geom_boxplot not to use stat_boxplot:
geom_boxplot(aes(lower=y_q1, upper=y_q3, middle=y_med, ymin=y_min,
ymax=y_max), stat = "identity")
Hadley
On Mon, Jul 6, 2009 at 6:55 AM, Malcolm Ryan wrote:
> Is there anyway in ggplot2 to set the aesthetics for a geom_boxplot
>
> I think the root cause of a number of my coding problems in R right
> now is my lack of skills in reading and grabbing portions of the data
> out of arrays. I'm new at this. (And not a programmer) I need to find
> some good examples to read and test on that subject. If I could locate
> which co
Also make sure to check roxygen (from roxygen.org) - it makes package
documentation much much easier. Ironically, the documentation for
roxygen currently leaves something to be desired but I think Peter and
Manuel are working on it.
Hadley
On Sat, Jul 4, 2009 at 3:59 PM, Jason Rupert wrote:
>
>
> 2) Related to the above, how do I tell what packages are currently
> loaded at any given time so that I don't waste time loading things
> that are already loaded? search() tells me what's available, but
> what's loaded? The best I can find so far goes like this:
Loading something a second time t
On Sat, Jul 4, 2009 at 7:56 PM, Mark Kimpel wrote:
> I am using grep to locate colnames to automate a report build and have
> run into a problem when a colname is not found. The use of integer(0)
> in a conditional statement seems to be a no no as it has length 0.
> Below is a self-contained trivia
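The example is truncated, but the usual fix for integer(0) in a condition is to test length() rather than the match itself (a sketch with made-up column names):

```r
df <- data.frame(alpha = 1:3, beta = 4:6)   # hypothetical data
idx <- grep("gamma", colnames(df))          # no match: integer(0)
# `if (idx == ...)` fails because integer(0) has length 0;
# test the length of the result instead
if (length(idx) > 0) {
  message("found at column ", idx)
} else {
  message("column not found")
}
```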
On Thu, Jul 2, 2009 at 3:34 PM, Sebastien
Bihorel wrote:
> Dear R-users,
>
> I would like to know how expressions could be passed as arguments to do.call
> functions. As illustrated in the short example below, concatenating lists
> objects and an expression creates an expression object, which is no
On Thu, Jun 18, 2009 at 12:08 PM, Dirk Eddelbuettel wrote:
>
> On 18 June 2009 at 09:36, Bert Gunter wrote:
> | -- or Chapter 4 in S PROGRAMMING? (you'll need to determine if it's "reader
> | friendly")
>
> +1
>
> It helped me a lot too back in the day. But I am wondering if there are good
> curre
On Thu, Jul 2, 2009 at 8:15 AM, James Martin wrote:
> Hadley, Sunil, and list,
>
> This is not quite doing what I wanted it to do (as far as I can tell). I
> perhaps did not explain it thoroughly. It seems to be sampling one value
> for each day leaving ~200 observations. I need for it randomly ch
On Wed, Jul 1, 2009 at 2:10 PM, Sunil
Suchindran wrote:
> #Highlight the text below (without the header)
> # read the data in from clipboard
>
> df <- do.call(data.frame, scan("clipboard", what=list(id=0,
> date="",loctype=0 ,haptype=0)))
>
> # split the data by date, sample 1 observation from each
On Tue, Jun 30, 2009 at 2:12 PM, Barry
Rowlingson wrote:
> On Tue, Jun 30, 2009 at 8:05 PM, Mark Knecht wrote:
>
>> You could wrap it in a function of your own making, right?
>>
>> AddNewDev = function() {dev.new();AddNewDev=dev.cur()}
>>
>> histPlot=AddNewDev()
>>
>> Seems to work.
>
> You leaRn
ackage. That code is giving me
> the following error:
>
>> qplot(reorder(model,delta),delta,data=growthm.bic)
> Error in UseMethod("reorder") : no applicable method for "reorder"
>
> Cheers,
> Chris
>
> On 6/28/09 8:21 PM, hadley wickham wrote:
>
>
Hi Chris,
Try this:
qplot(reorder(model, delta), delta, data = growthm.bic)
Hadley
On Sun, Jun 28, 2009 at 9:53 AM, Christopher
Desjardins wrote:
> Hi,
> I have 45 models that I have named: 1, 2, 3, ... , 45 and I am trying to
> plot them in order of ascending BIC values. I am however unclear a
> Also consider ddply in the plyr package (although that's overkill if
> you only have two loops)
Maybe, but it sure is much simpler:
library(plyr)
ddply(data, c("industry","year"), summarise, avg = mean(X1))
Hadley
--
http://had.co.nz/
__
R-
On Fri, Jun 26, 2009 at 10:27 PM, Osman Al-Radi wrote:
> Dear Richard and David,
>
> Thanks for this reference. I looked into vcd and mosaic plot, it is a nice
> plot for investigating associations between two or more variables. However,
> I just need to plot the frequency of a single variable as t
Have a look at ddply from the plyr package, http://had.co.nz/plyr.
It's made for exactly this type of operation.
Hadley
On Wed, Jun 24, 2009 at 10:34 PM, Stephan Lindner wrote:
> Dear all,
>
>
> I have a code where I subset a data frame to match entries within
> levels of an factor (actually, the
You might also want to look at the plyr package,
http://had.co.nz/plyr. In particular, ddply + transform makes these
tasks very easy.
library(plyr)
ddply(mtcars, "cyl", transform, pos = seq_along(cyl), mpg_avg = mean(mpg))
Hadley
On Wed, Jun 24, 2009 at 11:48 AM, David
Hugh-Jones wrote:
> That
Hi Mark,
Have a look at colwise (and numcolwise and catcolwise) in the plyr package.
Hadley
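For reference, a minimal sketch of what colwise and friends do (my example, using the built-in mtcars data):

```r
library(plyr)
# colwise(f) turns f into a function applied to every column of a data
# frame; numcolwise()/catcolwise() restrict it to numeric/categorical
# columns, so mixed data frames are handled gracefully
numcolwise(mean)(mtcars)   # mean of every numeric column
```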
On Tue, Jun 23, 2009 at 4:23 PM, Mark Na wrote:
> Hi R-helpers,
>
> I have a dataframe with 60columns and I would like to convert several
> columns to factor, others to numeric, and yet others to dates. R
plyr is a set of tools for a common set of problems: you need to break
down a big data structure into manageable pieces, operate on each
piece and then put all the pieces back together. For example, you
might want to:
* fit the same model to subsets of a data frame
* quickly calculate summary
> I have been using R for a while. Recently, I have begun converting my
> package into S4 classes. I was previously using Rdoc for documentation.
> Now, I am looking to use the best tool for S4 documentation. It seems that
> the best choices for me are Roxygen and Sweave (I am fine with tex).
>
> In revising my book Regression Modeling Strategies for a second edition, I
> am seeking a dataset for exemplifying multiple regression using least
> squares. Ideally the dataset would have 5-40 variables and 40-1
> independent observations, and would generate significant interest for a wide
ck
> Sent: Thursday, June 18, 2009 9:17 AM
> To: Hadley Wickham
> Cc: r-help
> Subject: Re: [R] Learning S3
>
> There is a section on Object Orientation in MASS (I have 2nd ed).
>
> On Thu, Jun 18, 2009 at 12:06 PM, Hadley Wickham wrote:
>> Hi all,
>>
>> Do you k
Hi all,
Do you know of any good resources for learning how S3 works? I've
somehow become familiar with it by reading many small pieces, but now
that I'm teaching it to students I'm wondering if there are any good
resources that describe it completely, especially in a reader-friendly
way. So far
Hi all,
This is a little off-topic, but it is on the general topic of getting
data in R. I'm looking for an Excel macro / VBA script that will
export all spreadsheets in a directory (with one file per tab) into
csv. Does anyone have anything like this?
Thanks,
Hadley
--
http://had.co.nz/
Hi all,
Is there a cross-platform way to do this? On the Mac, I can do this by
saving an eps file, and then using pbcopy. Is it possible on other
platforms?
Hadley
--
http://had.co.nz/
On Mon, Jun 8, 2009 at 8:56 PM, Mao Jianfeng wrote:
> Dear R users,
>
> I ask for helps on how to substitute missing values (NAs) by mean of the
> group it is belonging to.
>
> my dummy dataframe is:
>
>> df
> group traits
> 1 BSPy01-10 NA
> 2 BSPy01-10 7.3
> 3 BSPy01-10 7.3
> 4
On Mon, Jun 8, 2009 at 10:29 AM, Herbert
Jägle wrote:
> Hi,
>
> I have a data frame representing data from a repeated experiment. PID is a
> subject identifier, Time are timepoints in an experiment which was repeated
> twice. For each subject and all three timepoints there are 2 sets of four
> va
On Sat, Jun 6, 2009 at 5:02 PM, Adam D. I. Kramer wrote:
> Dear Colleagues,
>
> Occasionally I deal with computer-generated (i.e., websurvey) data
> files that haven't quite worked correctly. When I try to read the data into
> R, I get something like this:
>
> Error in scan(file, what, nmax,
Is it really necessary to further advertise this company which already
spams R-help subscribers?
Hadley
On Thu, Jun 4, 2009 at 10:41 PM, Ajay ohri wrote:
> Dear All,
>
> Slightly off -non technical topic ( but hey it is Friday)
>
> Following last week's interview with REvolution Computing which m
On Mon, Jun 1, 2009 at 2:18 PM, stephen sefick wrote:
> library(ggplot2)
>
> melt.updn <- (structure(list(date = structure(c(11808, 11869, 11961, 11992,
> 12084, 12173, 12265, 12418, 12600, 12631, 12753, 12996, 13057,
> 13149, 11808, 11869, 11961, 11992, 12084, 12173, 12265, 12418,
> 12600, 12631,
>> Let's see if I understand this. Do I iterate through
>> x <- factor(x, levels = c(levels(x), NA), exclude = NULL)
>> for each of the few hundred variables (x) in my data frame?
>
>
> Yes, for all being factors.
Wouldn't addNA() be the preferred method?
To do it for all variables is pretty simp
.
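The "do it for all variables" part is cut off above; one simple way (a sketch, assuming a data frame df with some factor columns) is to lapply over the columns:

```r
df <- data.frame(f = factor(c("a", NA, "b")), x = 1:3)  # toy data
# apply addNA() only to the factor columns, leaving others untouched
df[] <- lapply(df, function(col) if (is.factor(col)) addNA(col) else col)
levels(df$f)   # NA is now an explicit level
```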
All proceeds go to the GGobi Foundation to support graphics research.
Find out more, and book your tickets online at
http://lookingatdata.com
Regards,
Hadley Wickham
Dianne Cook
You might have an out-of-date version of the plyr package - try
install.packages("plyr")
Hadley
On Mon, Jun 1, 2009 at 10:20 AM, Matt Frost wrote:
> I'm trying to plot a time series in ggplot, but a date column in my
> data frame is causing errors. Rather than provide my own data, I'll
> just re
Hi Paul,
Unfortunately that's not something that's currently possible with
ggplot2, but I am thinking about how to make it possible.
Hadley
On Sat, May 16, 2009 at 7:48 AM, Paul Emberson wrote:
> Hi Stephen,
>
> The problem is that the label on the graph doesn't get rendered with a
> superscrip
On Thu, May 14, 2009 at 2:14 PM, Garritt Page wrote:
> Hello,I am using xyplot to try and create a conditional plot. Below is a
> toy example of the type of data I am working with
>
> slevel <- rep(rep(c(0.5,0.9), each=2, times=2), times=2)
>
> tlevel <- rep(rep(c(0.5,0.9), each=4), times=2)
>
>
On Thu, May 14, 2009 at 6:21 PM, Ping-Hsun Hsieh wrote:
> Hi All,
>
> I have a 1000x100 matrix.
> The calculation I would like to do is actually very simple: for each row,
> calculate the frequency of a given pattern. For example, a toy dataset is as
> follows.
>
> Col1 Col2 Col3 Co
On Thu, May 14, 2009 at 12:16 PM, Lori Simpson
wrote:
> I am writing a custom function that uses an R-function from the
> reshape package: cast. However, my question could be applicable to
> any R function.
>
> Normally one writes the arguments directly into a function, e.g.:
>
> result=cast(tabl
> This does it more or less your way:
>
> ds <- split(df, df$Name)
> ds <- lapply(ds, function(x){x$Index <- seq_along(x[,1]); x})
> df2 <- unsplit(ds, df$Name)
> tapply(df2$X1, df2[,c("Name", "Index")], function(x) x)
>
> although there may exist much easier ways ...
Here's one way with the plyr a
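The plyr version is truncated; it presumably looks something like this (my reconstruction, on toy data):

```r
library(plyr)
df <- data.frame(Name = c("a", "a", "b"), X1 = c(10, 20, 30))  # toy data
# ddply splits df by Name, transform adds a within-group index,
# and the pieces are recombined automatically
ddply(df, "Name", transform, Index = seq_along(X1))
```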
On Sun, May 10, 2009 at 10:32 AM, Zeljko Vrba wrote:
> Searching the mail archives I found that using legend.position as in
> p.ring.3 + opts(legend.position="top")
>
> is a known bug. I tried doing
> p.ring.3 + opts(legend.position=c(0.8, 0.2))
>
> which works, but the legend background is trans
On Wed, May 6, 2009 at 8:12 PM, jim holtman wrote:
> Ths should do it:
>
>> do.call(rbind, lapply(split(x, x$ID), tail, 1))
> ID Type N
> 45900 45900 I 7
> 46550 46550 I 7
> 49270 49270 E 3
Or with plyr:
library(plyr)
ddply(x, "id", tail, 1)
plyr encapsulates the common split-
Hi Robert,
I'm organising one - sign up to the mailing list,
http://groups.google.com/group/houston-r. I'm hoping to organise our
first meeting this summer.
Hadley
On Wed, May 6, 2009 at 10:15 AM, Robert Sanford wrote:
> I'm looking for a Users Group in or near Houston, TX.
>
> Many thanks!
>
> Take a look at plyr and reshape packages (http://had.co.nz/), I have a hunch
> that they would have saved me a lot of headache had I found out about them
> earlier :)
As the author of these two packages, I'm admittedly biased, but I
think R is unparalleled for data preparation, manipulation, and
On Tue, May 5, 2009 at 3:55 PM, jwg20 wrote:
>
> Thanks for your help! I wasn't sure what the margins variable did, but I'm
> beginning to understand. I'm almost there, but with my data (and with ff_d)
> I tried to margin over two variable names, however it only does one of them.
> So with ff_d I
On Tue, May 5, 2009 at 3:03 PM, jwg20 wrote:
>
> I have a data set that I'm trying to melt and cast in a specific way using
> the reshape package. (I'll use the ff_d dataset from reshape so I don't have
> to post a toy data set here. )
>
> Lets say I'm looking for the interaction of treatment with
> If you do write your own, the hardest part will be picking the nice tick
> marks. They should be approximately evenly spaced, but at nice round values
> of the original variable: that's hard to do in general. R has the pretty()
> function for the linear scale, and doesn't do too badly on log a
On Thu, Apr 30, 2009 at 2:03 PM, MUHC-Research
wrote:
>
> Dear R-users,
>
> I recently began using the ggplot2 package and I am still in the process of
> getting used to it.
>
> My goal would be to plot on the same grid a number of curves derived from
> two distinct datasets. The first dataset (ca
Hi David,
I think the revolution blog is fantastic and a great service to the R
community. Thanks for all your hard work!
Hadley
On Fri, May 1, 2009 at 4:54 PM, David M Smith
wrote:
> I write about R every weekday at http://blog.revolution-computing.com
> . In case you missed them, here are so
On Fri, May 1, 2009 at 2:38 PM, Zeljko Vrba wrote:
> On Fri, May 01, 2009 at 01:06:34PM -0500, hadley wickham wrote:
>>
>> It should be trivial with ggplot2 too, but it's hard to provide
>> concrete advice without a concrete problem.
>>
> Elementary prob
> My issue is self-evident: using this method resulted in a 30 fold
> increase in time. My question is why? If I time the individual
> components separately, nothing is unusual. My hunch is the
> "interaction" between the model.matrix and nsga2 methods.
>
> Any ideas on how to speed this proces
> Is situation anything better with ggplot2? It seems rather easy to get e.g.
> line plots with error bars, provided that one feeds the data to some
> modeling/regression function and passes the result over for plotting.. but
> what
> if I have generated my own error bar data? This is almost tri
On Fri, May 1, 2009 at 12:22 PM, MUHC-Research
wrote:
>
> Dear R-users,
>
> I would have another question about the ggplot() function in the ggplot2
> package.
>
> All the examples I've read so far in the documentation make use of a single
> neatly formatted data.frame. However, sometimes, one may
It's hard to check without a reproducible example, but the following
code should give you a 3d array of lat x long x time:
library(reshape)
df$lat <- round_any(df$LATITUDE, 5)
df$long <- round_any(df$LONGITUDE, 5)
df$value <- df$TIME
cast(df, lat ~ long ~ time, mean)
On Thu, Apr 30, 2009 at 10
]
> library(ggplot2)
> qplot(year,value, data=data,label=countries, geom=c("line","text"),
> group=countries, col=countries)
>
> But I would like to have the text labels show only once - e.g. at 1990
> - and also control the size of the text. In my crude qplot, setting
> size=2 e.g. changes not onl
In statistics, a bumps chart is more commonly called a parallel
coordinates plot.
Hadley
On Sun, Apr 26, 2009 at 5:45 PM, Andreas Christoffersen
wrote:
> Hi there,
>
> I would like to make a 'bumps chart' like the ones described e.g.
> here: http://junkcharts.typepad.com/junk_charts/bumps_chart/
Have a look at the plyr package and associated documentation -
http://had.co.nz/plyr
Hadley
On Sun, Apr 26, 2009 at 12:42 PM, wrote:
> After a year my R programming style is still very "C like".
> I am still writing a lot of "for loops" and finding it difficult to recognize
> where, in place o
he original were changed; the sort of behavior that
> might be seen in a spreadsheet that had a copy "by reference".
>
> On Apr 26, 2009, at 11:28 AM, hadley wickham wrote:
>
>>>> I want to (1) create a deep copy of pop,
>>>
>>> I have already said *I*
>> I want to (1) create a deep copy of pop,
>
> I have already said *I* do not know how to create a "deep copy" in R.
Creating a deep copy is easy, because all copies are "deep" copies.
You need to try very hard to create a reference in R.
Hadley
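A minimal demonstration of R's copy-on-modify semantics (my example):

```r
x <- 1:5
y <- x        # no reference is created: y behaves as a full copy
y[1] <- 99L   # modifying y leaves x untouched
x[1]          # still 1
```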
--
http://had.co.nz/
On Fri, Apr 24, 2009 at 3:12 PM, sjaffe wrote:
>
> small example:
>
> a<-c(1.1, 2.1, 9.1)
> b<-cut(a,0:10)
> c<-data.frame(b,b)
> d<-table(c)
> dim(d)
> ##result: c(10, 10)
>
> But only 9 of the 100 cells are non-zero.
> If there were 10 columns, the table have 10 dimensions each of length 10, so
Hi Steve,
The general answer is yes, but the specific will depend on your
problem. Could you provide a small reproducible example to illustrate
your problem?
Hadley
On Fri, Apr 24, 2009 at 1:19 PM, sjaffe wrote:
>
> Perhaps this is a common question but I haven't been able to find the answer.
On Fri, Apr 24, 2009 at 5:50 AM, Duncan Murdoch wrote:
> Toby wrote:
>>
>> I'm trying to figure out how I can get a generalized 2D
>> list/array/matrix/whatever
>> working. Seems I can't figure out how to make the variables the right
>> type. I
>> always seem to get some sort of error... out of
On Thu, Apr 23, 2009 at 5:11 PM, ozan bakis wrote:
> Dear R Users,
> I have the following data frame:
>
> v1 <- c(rep(10,3),rep(11,2))
> v2 <- sample(5:10, 5, replace = T)
> v3 <- c(0,1,2,0,2)
> df <- data.frame(v1,v2,v3)
>> df
> v1 v2 v3
> 1 10 9 0
> 2 10 5 1
> 3 10 6 2
> 4 11 7 0
> 5 11
> "Have you read the posting guide and the FAQs? If you do not get a reply
> within two days, you may want to look at both and think about reformulating
> your query. Oh, and while you are at it, look through the archives, a lot of
> questions have already been asked and answered before."
As I say
ggplot2
ggplot2 is a plotting system for R, based on the grammar of graphics,
which tries to take the good parts of base and lattice graphics and
avoid bad parts. It takes care of many of the fiddly details
that make plotting a hassle (l
> Am I doing something wrong, here? If not, which are the real AIC and logLik
> values for the different models?
I don't think it's reasonable to expect that the log-likelihoods
computed by different functions should be comparable. Are the
constant terms included or dropped?
Hadley
--
http://ha
On Fri, Apr 17, 2009 at 2:07 PM, Paul Warren Simonin
wrote:
> Thank you all for your advice.
> I have received some good tips, but it was suggested I write back with a
> small simulated data set to better illustrate my needs. So, currently my
> data frame looks something like:
>
> ID (date) Temp
Look at the output of pal.cr((0:40)/40)
Hadley
On Fri, Apr 17, 2009 at 2:42 PM, Etienne B. Racine wrote:
>
> I try to use ColorRamp as ColorRampPalette (i.e. with the same gradient), but
> it seems there is a nuance that I've missed.
>
> pal.crp<-colorRampPalette( c("blue", "white", "red"), space
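The nuance Hadley points at: colorRamp() and colorRampPalette() return functions with different interfaces (a sketch):

```r
ramp <- colorRamp(c("blue", "white", "red"))
pal  <- colorRampPalette(c("blue", "white", "red"))
ramp(c(0, 0.5, 1))   # maps values in [0, 1] to an RGB matrix (0-255)
pal(3)               # returns n interpolated hex colours
# [1] "#0000FF" "#FFFFFF" "#FF0000"
```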
On Fri, Apr 17, 2009 at 12:19 PM, jim holtman wrote:
> try this:
>
>> matrixx<-function(A){
> + B=matrix(NaN,nrow=(A+1),ncol=4)
> + k <- 1
> + for (i in 3:A){
> + for (j in i:A) {
> + B[k,] <- c(NaN, i-2, i-1, j)
> + k <- k + 1
> + }
> + }
>
On Fri, Apr 17, 2009 at 9:59 AM, Paul Warren Simonin
wrote:
> Hello!
> Thanks for reading this request for assistance. I have a question regarding
> creating a histogram-like figure from data that are not currently in the
> correct format for the "hist" command.
> Specifically, my data have been
, namef)
> res <- c(res, get(namef))
> }
> names(res) <- namesf
> }
> return(res)
> }
>
> df <- data.frame(id = 1:50, x = sample(c(NA, 1), 50, T), y = sample(1:2, 50,
> T), z = sample(letters[1:2], 50, T))
>
>> freq1(df$x)
> $freq_1
>
> Levels: a
>>
>
> R. Raubertas
> Merck & Co
>
>
>> -Original Message-
>> From: r-help-boun...@r-project.org
>> [mailto:r-help-boun...@r-project.org] On Behalf Of hadley wickham
>> Sent: Wednesday, April 15, 2009 10:55 AM
>> To: r-help
>
In general, how can I increase a vector of length m (< n) to length n
by padding it with n - m missing values, without losing attributes?
The two approaches I've tried, using length<- and adding missings with
c, do not work in general:
> a <- as.Date("2008-01-01")
> c(a, NA)
[1] "2008-01-01" NA
>
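One approach (a sketch; not necessarily the solution settled on in the thread) exploits the fact that out-of-range indexing returns NA elements while preserving the class:

```r
pad_to <- function(x, n) x[seq_len(n)]  # indexing past the end gives NAs
a <- as.Date("2008-01-01")
pad_to(a, 3)   # a Date vector of length 3, NA in positions 2 and 3
```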
On Mon, Apr 13, 2009 at 4:15 AM, Peter Dalgaard
wrote:
> Stavros Macrakis wrote:
>
>> It would of course be nice if the existing difftime class could be fit
>> into this, as it is currently pretty much a second-class citizen. For
>> example, c of two time differences is currently a numeric vector
> I'm generating some images in R to put into a document that I'm producing
> using Latex. This document in Latex is following a predefined model, which
> does not accept compilation with pdflatex, so I have to compile with latex
> -> dvi -> pdf. Because of that, I have to generate the images in R
> pnorm(37:39,lower.tail=FALSE)
> [1] 5.725571e-300 0.00e+00 0.00e+00
>
> This is just a limitation of double precision floating-point arithmetic
> ...
>
> curve(pnorm(x,lower.tail=FALSE),from=30,to=40,log="y")
> .Machine$double.xmin
But note
curve(pnorm(x,lower.tail=FALSE, log=T),fr
On Tue, Apr 7, 2009 at 4:41 PM, Jorge Ivan Velez
wrote:
> Hi Eik,
> You're absolutely right. My bad.
>
> Here is the correction of the code I sent:
>
> apply(mydata[,-1], 2, tapply, mydata[,1], function(x) sum(x)/length(x))
Or more simply:
apply(mydata[,-1], 2, tapply, mydata[,1], mean)
Hadley
On Tue, Apr 7, 2009 at 8:44 AM, wrote:
>
> I am trying to use the "cast" function from the reshape package, where the
> formula is not passed in directly, but as the result of the as.formula()
> function.
>
> Using reshape v. 0.7.2
>
> I am able to properly melt() by data with:
>
>> molten <- mel
Have a look at ?gpar - it will tell you about lineheight.
Hadley
On Tue, Apr 7, 2009 at 3:28 AM, Mark Heckmann wrote:
> I am trying to change the inter-line spacing in grid.text(), but I just
> don't find how to do it.
>
> pushViewport(viewport())
> grid.text("The inter-line spacing\n is too big
On Mon, Apr 6, 2009 at 5:31 PM, Jun Shen wrote:
> This is a good example to compare different approaches. My understanding is
>
> aggregate() can apply one function to multiple columns
> summarize() can apply multiple functions to one column
> I am not sure if ddply() can actually apply multiple f
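For what it's worth, ddply() can apply multiple functions via summarise, one per output column (my example on mtcars):

```r
library(plyr)
# each named argument to summarise becomes a column of the result
ddply(mtcars, "cyl", summarise,
      avg = mean(mpg), sd = sd(mpg), n = length(mpg))
```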
On Mon, Apr 6, 2009 at 10:40 AM, baptiste auguie wrote:
> Here's one attempt with plyr, hopefully Hadley will give you a better
> solution ( I could not get cast() to do it either)
>
> test <-
> data.frame(a=c("A","A","A","A","B","B","B"),b=c(1,1,2,2,1,1,1),c=sample(1:7))
> ddply(test,.(a,b),.fun=
On Mon, Apr 6, 2009 at 9:34 AM, Stavros Macrakis wrote:
> There are various ways to do this in R.
>
> # sample data
> dd <- data.frame(a=1:10,b=sample(3,10,replace=T),c=sample(3,10,replace=T))
>
> Using the standard built-in functions, you can use:
>
> *** aggregate ***
>
> aggregate(dd,list(b=dd$
On Mon, Apr 6, 2009 at 8:49 AM, Daniel Brewer wrote:
> Hello,
>
> What is the best way to turn a list into a data.frame?
>
> I have a list with something like:
> $`3845`
> [1] "04010" "04012" "04360"
>
> $`1029`
> [1] "04110" "04115"
>
> And I would like to get a data frame like the following:
>
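The desired output is truncated above; assuming a two-column (id, value) layout, one common idiom is:

```r
x <- list(`3845` = c("04010", "04012", "04360"),
          `1029` = c("04110", "04115"))
# repeat each list name once per element, then flatten the values
data.frame(id    = rep(names(x), sapply(x, length)),
           value = unlist(x, use.names = FALSE),
           stringsAsFactors = FALSE)
```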
Hi Laura,
You might find the map_data function from the ggplot2 package helpful:
library(ggplot2)
library(maps)
head(map_data("state", "iowa"))
It formats the output of the map command into a self-documenting data frame.
Hadley
On Mon, Apr 6, 2009 at 7:00 AM, Laura Chihara wrote:
>
> I would
On Sat, Apr 4, 2009 at 12:28 PM, jim holtman wrote:
> Does this do what you want:
>
>> x <- read.table(textConnection("name wrist nLevel emot
> + 1 4094 3.34 1 frustrated
> + 2 4094 3.94
On Sat, Apr 4, 2009 at 12:09 PM, ds wrote:
>
> I have a data frame something like:
>   name wrist nLevel       emot
> 1 4094  3.34      1 frustrated
> 2 4094  3.94      1 frustra
On Fri, Apr 3, 2009 at 1:45 PM, wrote:
> I have a list of data.frames
>
>> str(bins)
>
> List of 19217
> $ 100026:'data.frame': 1 obs. of 6 variables:
> ..$ Sku : chr "100026"
> ..$ Bin : chr "T149C"
> ..$ Count: int 108
> ..$ X : int 20
> ..$ Y : int 149
> ..$ Z : chr "3"
> $
On Fri, Apr 3, 2009 at 8:43 AM, baptiste auguie wrote:
> That makes sense, so I can do something like,
>
> count <- function(x){
> as.integer(unclass(table(x)))
> }
>
> count(d$user_id)
>
> ddply(d, .(user_id), transform, count = count(user_id))
>
>> user_id website time count
>> 1 2
On Fri, Apr 3, 2009 at 4:43 AM, baptiste auguie wrote:
> Dear all,
>
> I'm puzzled by the following example inspired by a recent question on
> R-help,
>
>
> cc <- textConnection("user_id website time
> 20 google 0930
> 21 yahoo 0935
> 20 faceboo
> X1 X2
> 1 11 0
> 2 11 0
> 3 11 0
> 4 11 1
> 5 12 0
> 6 12 0
> 7 12 0
> 8 13 0
> 9 13 1
> 10 13 1
>
>
> and I want to select all rows pertaining to factor levels of X1 for
> which exists at least one "1" for X2. To be clear, I want rows 1:4
> (since there exists at least one o
On Thu, Apr 2, 2009 at 3:37 PM, Rowe, Brian Lee Yung (Portfolio
Analytics) wrote:
> Is this what you want:
>> d1[which(id != 4),]
Or just
d1[id != 4, ]
Hadley
--
http://had.co.nz/
> Earlier I posted a question about memory usage, and the community's input was
> very helpful. However, I'm now extending my dataset (which I use when
> running a regression using lm). As a result, I am continuing to run into
> problems with memory usage, and I believe I need to shift to impl
On Wed, Apr 1, 2009 at 11:00 AM, hadley wickham wrote:
>> I tried messing with the line df$FixTime[which.min(df$FixInx)] changing it
>> to df[which.min(df$FixInx)] or adding new lines with the additional columns
>> that I want to include, but nothing seemed to work. I'
> I tried messing with the line df$FixTime[which.min(df$FixInx)] changing it
> to df[which.min(df$FixInx)] or adding new lines with the additional columns
> that I want to include, but nothing seemed to work. I'll admit I only have a
> mild understanding of what is going on with the function .fun.
On Tue, Mar 31, 2009 at 5:01 PM, Marianne Promberger
wrote:
> Hi,
>
> I'm having problems with qplot and the order of numeric factor levels.
>
> Factors with numeric levels show up in the order in which they appear
> in the data, not in the order of the levels (as far as I understand
> factors!)
>
On Tue, Mar 31, 2009 at 11:12 AM, Steve Murray wrote:
>
> Dear R Users,
>
> I'm trying to use the reshape package to 'melt' my gridded data into column
> format. I've done this before on individual files, but this time I'm trying
> to do it on a directory of files (with variable file names) - th
On Tue, Mar 31, 2009 at 11:31 AM, baptiste auguie wrote:
> Not exactly the output you asked for, but perhaps you can consider,
>
> library(doBy)
>> summaryBy(x3~x2+x1,data=x,FUN=mean)
>>
>> x2 x1 x3.mean
>> 1 1 A 1.5
>> 2 1 B 2.0
>> 3 1 C 3.5
>> 4 2 A 4.0
>> 5 2 B 5.
> col2rgb("#0079", TRUE)
      [,1]
red      0
green    0
blue     0
alpha  121
> col2rgb("#0080", TRUE)
      [,1]
red    255
green  255
blue   255
alpha    0
> col2rgb("#0081", TRUE)
      [,1]
red      0
green    0
blue     0
alpha  129
Any ideas?
Thanks,
Hadley
--
http://had.co
On Mon, Mar 30, 2009 at 2:58 PM, Mike Lawrence wrote:
> I discovered Hadley Wickham's "plyr" package last week and have found
> it very useful in circumstances like this:
>
> library(plyr)
>
> firstfixtime = ddply(
> .data = data
> , .variables = c('Sub','Tr','IA')
> , .fun <- fu
On Mon, Mar 30, 2009 at 10:33 AM, Mike Lawrence wrote:
> To repent for my sins, I'll also suggest that Hadley Wickham's "plyr"
> package (http://had.co.nz/plyr/) is also useful/parsimonious in this
> context:
>
> a <- ldply(cust1_files,read.table)
You might also want to do
names(cust1_files) <-