[R] as.data.frame(cbind()) transforming numeric to factor?

2006-08-18 Thread Tom Boonen
Dear List,

why does as.data.frame(cbind()) transform numeric variables to
factors, once one of the other variables used is a character vector?

#
x.1 <- rnorm(10)
x.2 <- c(rep("Test",10))
Foo <- as.data.frame(cbind(x.1))
is.factor(Foo$x.1)

Foo <- as.data.frame(cbind(x.1,x.2))
is.factor(Foo$x.1)
#

I assume there is a good reason for this, can somebody explain? Thanks.

Best,
Tom

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] as.data.frame(cbind()) transforming numeric to factor?

2006-08-18 Thread Marc Schwartz (via MN)
On Fri, 2006-08-18 at 10:41 -0400, Tom Boonen wrote:
> Dear List,
> 
> why does as.data.frame(cbind()) transform numeric variables to
> factors, once one of the other variables used is a character vector?
> 
> #
> x.1 <- rnorm(10)
> x.2 <- c(rep("Test",10))
> Foo <- as.data.frame(cbind(x.1))
> is.factor(Foo$x.1)
> 
> Foo <- as.data.frame(cbind(x.1,x.2))
> is.factor(Foo$x.1)
> #
> 
> I assume there is a good reason for this, can somebody explain? Thanks.
> 
> Best,
> Tom

See the Note section of ?cbind, which states:

The method dispatching is not done via UseMethod(), but by C-internal
dispatching. Therefore, there is no need for, e.g., rbind.default.

The dispatch algorithm is described in the source file
('.../src/main/bind.c') as

 1. For each argument we get the list of possible class memberships
from the class attribute.
 2. We inspect each class in turn to see if there is an
applicable method.
 3. If we find an applicable method we make sure that it is
identical to any method determined for prior arguments. If it is
identical, we proceed, otherwise we immediately drop through to
the default code.

If you want to combine other objects with data frames, it may be
necessary to coerce them to data frames first. (Note that this algorithm
can result in calling the data frame method if the arguments are all
either data frames or vectors, and this will result in the coercion of
character vectors to factors.)
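
As a rough illustration of the dispatch rule quoted above (a sketch only; the object names m and d are made up for the example): when every argument is a plain vector, cbind() falls through to the default code and returns a matrix, but as soon as one argument is a data frame, the data frame method is used and a data frame comes back.

```r
x.1 <- rnorm(10)
x.2 <- rep("Test", 10)

# All arguments are atomic vectors: the default code builds a matrix.
m <- cbind(x.1, x.2)
is.matrix(m)        # TRUE

# One argument is a data frame: the data frame method is dispatched.
d <- cbind(data.frame(x.1), x.2)
is.data.frame(d)    # TRUE
is.numeric(d$x.1)   # TRUE, the numeric column is left alone
```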


Thus, note the result of:

> str(cbind(x.1, x.2))
 chr [1:10, 1:2] "-0.265756038510064" "2.13220714034528" ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:2] "x.1" "x.2"

Since a matrix can only contain a single data type, the numeric vector
is coerced to character.

Then as.data.frame() coerces each column of the character matrix to a
factor, which is the default behavior.
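
The two steps can be checked separately. The sketch below sets stringsAsFactors explicitly, because its default was TRUE when this thread was written and is FALSE from R 4.0.0 on; the names m and Foo are only illustrative.

```r
x.1 <- rnorm(10)
x.2 <- rep("Test", 10)

# Step 1: cbind() on vectors yields one matrix with a single type.
m <- cbind(x.1, x.2)
mode(m)                      # "character": the numbers became strings

# Step 2: each character column is turned into a factor.
Foo <- as.data.frame(m, stringsAsFactors = TRUE)
sapply(Foo, class)           # both columns are "factor"

# With a numeric-only matrix there is nothing to coerce.
is.factor(as.data.frame(cbind(x.1))$x.1)   # FALSE
```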

If you want to create a data frame, do it this way:

> str(data.frame(x.1, x.2))
`data.frame':   10 obs. of  2 variables:
 $ x.1: num  -0.266  2.132  2.096 -0.128 -0.466 ...
 $ x.2: Factor w/ 1 level "Test": 1 1 1 1 1 1 1 1 1 1

or if you want to retain the character vector, use I():

> str(data.frame(x.1, I(x.2)))
`data.frame':   10 obs. of  2 variables:
 $ x.1: num  -0.266  2.132  2.096 -0.128 -0.466 ...
 $ x.2:Class 'AsIs'  chr [1:10] "Test" "Test" "Test" "Test" ...


See ?data.frame for more information.

HTH,

Marc Schwartz



Re: [R] as.data.frame(cbind()) transforming numeric to factor?

2006-08-18 Thread Prof Brian Ripley
On Fri, 18 Aug 2006, Tom Boonen wrote:

> Dear List,
> 
> why does as.data.frame(cbind()) transform numeric variables to
> factors, once one of the other variables used is a character vector?
> 
> #
> x.1 <- rnorm(10)
> x.2 <- c(rep("Test",10))
> Foo <- as.data.frame(cbind(x.1))
> is.factor(Foo$x.1)
> 
> Foo <- as.data.frame(cbind(x.1,x.2))
> is.factor(Foo$x.1)
> #
> 
> I assume there is a good reason for this, can somebody explain? Thanks.

Only if you can explain the good reason why you did not just use 
data.frame(x.1, x.2)!

cbind() makes a matrix out of vectors, here a character matrix.  And then 
as.data.frame() converts character columns to factors.

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [R] as.data.frame(cbind()) transforming numeric to factor?

2006-08-18 Thread Gabor Grothendieck
In R version 2.4.0 Under development (unstable) (2006-08-08 r38825)
one can do this:

as.data.frame(cbind(x.1,x.2),stringsAsFactors = FALSE)
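
One caveat worth checking (a sketch, assuming an R version whose as.data.frame() accepts stringsAsFactors, i.e. 2.4.0 or later): this stops the conversion to factor, but the numbers were already turned into strings by cbind(), so x.1 comes back as character rather than numeric.

```r
x.1 <- rnorm(10)
x.2 <- rep("Test", 10)

Foo <- as.data.frame(cbind(x.1, x.2), stringsAsFactors = FALSE)
is.factor(Foo$x.1)      # FALSE
is.character(Foo$x.1)   # TRUE: still not numeric

# The values can be converted back if needed (possibly with rounding,
# since they went through a character representation).
Foo$x.1 <- as.numeric(Foo$x.1)
is.numeric(Foo$x.1)     # TRUE
```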




Re: [R] as.data.frame(cbind()) transforming numeric to factor?

2006-08-18 Thread Tom Boonen
Thanks everybody. I recognize my mistake now.

I think as.data.frame(cbind(x.1,x.2),stringsAsFactors = FALSE) would
be a good idea.

Tom



Re: [R] as.data.frame(cbind()) transforming numeric to factor?

2006-08-18 Thread Martin Maechler
>>>>> "Tom" == Tom Boonen [EMAIL PROTECTED]
>>>>>     on Fri, 18 Aug 2006 11:16:45 -0400 writes:

    Tom> Thanks everybody. I recognize my mistake now.
    Tom> I think as.data.frame(cbind(x.1,x.2),stringsAsFactors = FALSE)
    Tom> would be a good idea.

I think

data.frame(x.1, x.2 = I(x.2))

would be a considerably better idea.

[ The use of I(.) for preventing coercion to factors 
  is a much older and S-like way ]
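
For comparison, a short sketch of this form: data.frame() takes the vectors directly, so x.1 never passes through a character matrix, and I() marks x.2 to be kept as is. The name Foo is reused from the original question purely for illustration.

```r
x.1 <- rnorm(10)
x.2 <- rep("Test", 10)

Foo <- data.frame(x.1, x.2 = I(x.2))
is.numeric(Foo$x.1)         # TRUE
is.character(Foo$x.2)       # TRUE
inherits(Foo$x.2, "AsIs")   # TRUE: the column carries the 'AsIs' class
```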

Martin


