RE: [R] dots expansion

2004-08-04 Thread Liaw, Andy
In addition to Gabor's comments:

There's a reason why I didn't coerce the grouping variable to a factor.
rbind()ing data frames is much more expensive than rbind()ing
arrays/matrices.  Unless your data really have different data types in
different columns, it would most likely be better to work with the matrix
version of them.  If you really want a data frame with the grouping variable
as a factor, you can do the coercion afterward.
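
A minimal sketch of that idea, not the code from earlier in the thread: the example matrices m1 and m2 and the column name "case" are assumptions made up for illustration. The pieces are rbind()ed as matrices with a numeric grouping column, and the coercion to a data frame and a factor happens once at the end.

```r
# Two example matrices with the same columns (made-up data).
m1 <- matrix(1:4,  ncol = 2, dimnames = list(NULL, c("x", "y")))
m2 <- matrix(5:10, ncol = 2, dimnames = list(NULL, c("x", "y")))
mats <- list(m1, m2)

# Numeric grouping variable: i repeated once per row of the ith matrix.
grp <- rep(seq_along(mats), sapply(mats, nrow))

# Everything stays a matrix while binding.
out <- cbind(do.call(rbind, mats), case = grp)

# Coerce afterward, only if a data frame with a factor is really wanted.
df <- as.data.frame(out)
df$case <- factor(df$case)
```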

Andy

 From: Gabor Grothendieck
 
 Viet Nguyen vietnguyen at fastmail.fm writes:
 
  
  Thanks to all who helped.
  
  I used your ideas and code samples to write the following (for the 
  benefit of people who will search this list later):
  
  rbind.case <- function(..., name="case", values) {
      dots <- list(...);
      if (missing(values)) values <- 1:length(dots);
      if (length(values)!=length(dots))
          stop("length(values)!=length(list(...))");
  
      eval(parse(text=
          paste("cbind(rbind(...), ",name,
                "=rep(values, sapply(dots, nrow)))",sep="")));
  }
  
  The function is to be used with data frames. It's not as good as it can
  be but it works for my purpose.
 
 Regarding improvements: eliminate the semicolons at the end of statements,
 place the default value for values= in the arg list to make it more readable,
 use stopifnot to check args (also for readability), add a check for
 data frames (which is mentioned after the code but not checked for),
 and eliminate the eval and rep calculations by simply lapplying over an
 index and appending the name column to each data frame in turn:
 
 
 rbind.case <- function(..., name = "case", values = seq(along = list(...)))
 # rbind the ... data frames together adding a column named name whose
 # value for rows from ith argument is values[i]
 {
     dots <- list(...)
     stopifnot(length(dots) == length(values),
               all(sapply(dots, inherits, "data.frame")))
 
     f <- function(i) { x <- dots[[i]]; x[,name] <- values[i]; x }
     do.call(rbind, lapply(seq(along = dots), f))
 }
 
 __
 [EMAIL PROTECTED] mailing list
 https://www.stat.math.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html
 




Re: [R] dots expansion

2004-08-04 Thread Gabor Grothendieck

That's a good point.  Your original solution is simpler than the poster's
and likely faster than mine.  

At any rate, continuing in the vein of simplicity rather than performance: if
one does want to allow matrices or data frames, then the solution I posted won't
work, even if you take away the check for data frames, since I was using the
following trick to append a column without using cbind:

x[, "newcol"] <- whatever

That appends whatever as a column named newcol if x is a data frame but
gives an error if x is a matrix.
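
A small illustration of that difference, using made-up objects df and m; tryCatch is used only so the sketch runs to the end instead of stopping at the error.

```r
df <- data.frame(a = 1:2)
m  <- matrix(1:4, nrow = 2, dimnames = list(NULL, c("a", "b")))

# Data frame: assigning to a column name that does not exist appends it.
df[, "newcol"] <- 9
names(df)

# Matrix: the same assignment is an error, because "newcol" is not an
# existing column name. res records which branch was taken.
res <- tryCatch({ m[, "newcol"] <- 9; "ok" },
                error = function(e) "error")
res
```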

In that case, we should use cbind, and I think the problem the poster was
having is that cbind(x, name = values) regards name as literally that and
not a variable.  To get around this we can temporarily define our own 
cbind which allows the column names to be specified.  The new cbind 
definition is elementary and neither the last line, which does the key 
processing, nor the rest of it involves any rep arithmetic, indices or evals:

rbind.case <- function(..., name = "case", values = seq(along = list(...)))
# rbind the ... data frames or matrices together adding a column named name
# whose value for rows from ith argument is values[i]
{

  dots <- list(...)
  stopifnot(length(dots) == length(values))

  # cbind x and y using indicated column names
  cbind <- function(x, y, names.x = colnames(x), names.y = colnames(y)) {
    z <- base::cbind(x, y)
    colnames(z) <- c(names.x, names.y)
    z
  }

  do.call(rbind, mapply(cbind, dots, values, names.y = name, SIMPLIFY = FALSE))

}







Re: [R] dots expansion

2004-08-03 Thread Viet Nguyen
Thanks to all who helped.
I used your ideas and code samples to write the following (for the 
benefit of people who will search this list later):

rbind.case <- function(..., name="case", values) {
    dots <- list(...);
    if (missing(values)) values <- 1:length(dots);
    if (length(values)!=length(dots))
        stop("length(values)!=length(list(...))");
    eval(parse(text=
        paste("cbind(rbind(...), ",name,
              "=rep(values, sapply(dots, nrow)))",sep="")));
}
The function is to be used with data frames. It's not as good as it can 
be but it works for my purpose.

Cheers
viet


[R] dots expansion

2004-08-02 Thread Viet Nguyen
Hi list,
I'm trying to write a function similar to rbind, except that it needs to 
add a factor to each component array before rbinding them together so 
that the rows from different arrays are distinguishable.

The problem that arose is how to loop through arguments in the dots 
... list. I need to get a handle on each of them but don't know how many 
of them there are and what their names are.

It'd be useful if I could look at how rbind(...) or c(...) do this but 
they are both Internal functions.

Thanks in anticipation of help!
Regards,
viet


Re: [R] dots expansion

2004-08-02 Thread Duncan Murdoch
On Tue, 03 Aug 2004 14:37:56 +1000, Viet Nguyen
[EMAIL PROTECTED] wrote:

> Hi list,
>
> I'm trying to write a function similar to rbind, except that it needs to 
> add a factor to each component array before rbinding them together so 
> that the rows from different arrays are distinguishable.
>
> The problem that arose is how to loop through arguments in the dots 
> ... list. I need to get a handle on each of them but don't know how many 
> of them there are and what their names are.
>
> It'd be useful if I could look at how rbind(...) or c(...) do this but 
> they are both Internal functions.

list(...) will create a list, one item per argument.  If the args are
named then the corresponding list item will be named.
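
For instance, a minimal sketch (dots.info is a hypothetical helper written only for this illustration) that reports how many arguments arrived through ... and what their names are:

```r
# Hypothetical helper, for illustration only.
dots.info <- function(...) {
  dots <- list(...)                    # one list element per argument
  list(n = length(dots), names = names(dots))
}

info <- dots.info(a = 1:3, b = letters, 42)
info$n        # 3
info$names    # "a" "b" ""  (unnamed arguments get an empty name)

# With no named arguments at all, names() of the list is NULL.
is.null(dots.info(1, 2)$names)
```

Once the arguments are in a list, they can be processed with lapply or a for loop over seq_along(dots).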

Duncan Murdoch
