[R] Bug in colnames of data.frames?

2004-08-17 Thread Arne Henningsen
Hi,

I am using R 1.9.1 on an i686 PC with SuSE Linux 9.0.

I have a data.frame, e.g.:

> myData <- data.frame( var1 = c( 1:4 ), var2 = c( 5:8 ) )

If I add a new column by

> myData$var3 <- myData[ , "var1" ] + myData[ , "var2" ]

everything is fine, but if I omit the commas:

> myData$var4 <- myData[ "var1" ] + myData[ "var2" ]

the name shown above the 4th column is not var4:

> myData
  var1 var2 var3 var1
1    1    5    6    6
2    2    6    8    8
3    3    7   10   10
4    4    8   12   12

but names() and colnames() return the expected name:

> names( myData )
[1] "var1" "var2" "var3" "var4"
> colnames( myData )
[1] "var1" "var2" "var3" "var4"

And it is even worse: I am not able to change the name shown above the 4th
column:

> names( myData )[ 4 ] <- "var5"
> myData
  var1 var2 var3 var1
1    1    5    6    6
2    2    6    8    8
3    3    7   10   10
4    4    8   12   12

I guess that this is a bug, isn't it?

Arne

__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Bug in colnames of data.frames?

2004-08-17 Thread Uwe Ligges
Arne Henningsen wrote:
> myData$var4 <- myData[ "var1" ] + myData[ "var2" ]

This bug is the user ... ;-)

Type:  str(myData)
`data.frame':   4 obs. of  3 variables:
 $ var1: int  1 2 3 4
 $ var2: int  5 6 7 8
 $ var4:`data.frame':   4 obs. of  1 variable:
  ..$ var1: int  6 8 10 12

Aha! You have created a column that is itself a data.frame consisting of
one column! What you really mean is

 myData$var5 <- myData[[ "var1" ]] + myData[[ "var2" ]]
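
The three indexing forms discussed here differ in what they return. A minimal sketch, assuming a current R release (the object names vec_comma, df_single and vec_double are made up for illustration):

```r
myData <- data.frame(var1 = 1:4, var2 = 5:8)

# [ , "name"] drops to a plain vector (drop = TRUE is the default for one column)
vec_comma <- myData[, "var1"] + myData[, "var2"]

# ["name"] keeps the data.frame wrapper, so the sum is a one-column data.frame
df_single <- myData["var1"] + myData["var2"]

# [["name"]] extracts the column itself, again a plain vector
vec_double <- myData[["var1"]] + myData[["var2"]]

class(vec_comma)    # "integer"
class(df_single)    # "data.frame"
names(df_single)    # "var1" (taken from the first operand)
class(vec_double)   # "integer"
```

Assigning df_single to a new column is what produces the nested data.frame shown in the str() output.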

Uwe Ligges




Re: [R] Bug in colnames of data.frames?

2004-08-17 Thread Peter Dalgaard
Arne Henningsen [EMAIL PROTECTED] writes:

> I guess that this is a bug, isn't it?

Nope:

> str(myData)
`data.frame':   4 obs. of  4 variables:
 $ var1: int  1 2 3 4
 $ var2: int  5 6 7 8
 $ var3: int  6 8 10 12
 $ var4:`data.frame':   4 obs. of  1 variable:
  ..$ var1: int  6 8 10 12

It's slightly peculiar, but if a column of a data frame is itself a
rectangular structure (data frame or matrix), then the innermost names
are used. Cf.

> myData[,"var4"] <- cbind(xyzzy=5:2)
> myData
  var1 var2 var3 xyzzy
1    1    5    6     5
2    2    6    8     4
3    3    7   10     3
4    4    8   12     2


Arguably, one might prefer

  var1 var2 var3  var4
                 xyzzy
1    1    5    6     5
2    2    6    8     4
3    3    7   10     3
4    4    8   12     2

or something like that, but it's hardly a bug.
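
The split between the outer name and the innermost name can be made visible without printing the data frame. A sketch, under the assumption that a current R release still stores the nested column the way the str() output above shows:

```r
myData <- data.frame(var1 = 1:4, var2 = 5:8)
myData$var3 <- myData[, "var1"] + myData[, "var2"]
myData$var4 <- myData["var1"] + myData["var2"]   # column 4 is a data.frame

names(myData)          # outer names: "var1" "var2" "var3" "var4"
names(myData$var4)     # inner name:  "var1"

# Renaming the outer column leaves the inner name untouched
names(myData)[4] <- "var5"
names(myData)          # "var1" "var2" "var3" "var5"
names(myData$var5)     # still "var1"
```

This is consistent with the rename in the original post appearing to have no effect: names() changed the outer name, while the printed header is taken from the inner one.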


-- 
   O__   Peter Dalgaard Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics 2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark  Ph: (+45) 35327918
~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907



Re: [R] Bug in colnames of data.frames? -- NOT

2004-08-17 Thread Prof Brian Ripley
This is not a bug, and BTW data frames have names not colnames.
As I have said already today, don't confuse the printed representation of
an object with the object itself.

On Tue, 17 Aug 2004, Arne Henningsen wrote:

> I guess that this is a bug, isn't it?

No.  Take a look at the fourth column more carefully.

> myData[4]
  var1
1    6
2    8
3   10
4   12

> class(myData[4])
[1] "data.frame"

You included a single-column data frame in your data frame.
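
Since single-bracket indexing returns a data.frame for every column, the class of the column itself is easier to see with [[ ]] or sapply(). A minimal sketch (col_classes is an illustrative name, not something from the thread):

```r
myData <- data.frame(var1 = 1:4, var2 = 5:8)
myData$var3 <- myData[, "var1"] + myData[, "var2"]
myData$var4 <- myData["var1"] + myData["var2"]

class(myData[3])     # "data.frame": [ ] always wraps the result
class(myData[[3]])   # "integer": an ordinary column
class(myData[[4]])   # "data.frame": the nested column

# First class of every column at a glance
col_classes <- sapply(myData, function(col) class(col)[1])
col_classes
```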

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [R] Bug in colnames of data.frames?

2004-08-17 Thread Marc Schwartz
On Tue, 2004-08-17 at 09:01, Arne Henningsen wrote:
> I guess that this is a bug, isn't it?


Here is a hint:

# This returns an integer vector
> str(myData[ , "var1" ] + myData[ , "var2" ])
 int [1:4] 6 8 10 12


# This returns a data.frame
> str(myData[ "var1" ] + myData[ "var2" ])
`data.frame':   4 obs. of  1 variable:
 $ var1: int  6 8 10 12


> str(myData)
`data.frame':   4 obs. of  5 variables:
 $ var1: int  1 2 3 4
 $ var2: int  5 6 7 8
 $ var3: int  6 8 10 12
 $ var4:`data.frame':   4 obs. of  1 variable:
  ..$ var1: int  6 8 10 12


Take a look at the details, value and coercion sections of ?.data.frame
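
If the nested column has already been created, one possible repair is to overwrite it with its own inner vector. A sketch, not the only way to do it:

```r
myData <- data.frame(var1 = 1:4, var2 = 5:8)
myData$var4 <- myData["var1"] + myData["var2"]   # nested by mistake

# Replace the one-column data.frame with the vector it contains
myData$var4 <- myData$var4[[1]]

str(myData)   # var4 should now be listed as a plain int column
```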

HTH,

Marc Schwartz



Re: [R] Bug in colnames of data.frames?

2004-08-17 Thread Marc Schwartz
On Tue, 2004-08-17 at 09:34, Marc Schwartz wrote:

> Take a look at the details, value and coercion sections of
> ?.data.frame

This must be my week for typos. That should be:

?"[.data.frame" (in ESS)

or

?[.data.frame (otherwise)

Marc



Re: [R] Bug in colnames of data.frames?

2004-08-17 Thread David Forrest
On Tue, 17 Aug 2004, Arne Henningsen wrote:

> Thank you for all your answers!
>
> I agree with you that it is not a bug. My mistake was that I thought that a
> data frame is similar to a matrix, but as ?data.frame says, "they ... share
> many of the properties of matrices and of lists."
...

> I think the current presentation
>
>   > myData
>     var1 var2 var3 xyzzy
>   1    1    5    6     5
>   2    2    6    8     4
>   3    3    7   10     3
>   4    4    8   12     2
>
> is confusing because it is not directly apparent (without another command
> like str()) why myData[[ "var1" ]] works fine while myData[[ "xyzzy" ]]
> does not.

In some ways it is a bug -- in the documentation, print.data.frame, or
format.data.frame

Consider assigning a wider dataframe to var4:

myData <- data.frame(matrix(1:12,4), var4=I(data.frame(xyzzy=5:2,plugh=1:4)))
myData  # error
class(myData[["var4"]]) <- "data.frame"
myData  # gives indications of sub-variables by var4.xyzzy, var4.plugh
myData[["var4.plugh"]]  # NULL
myData[["var4"]][["plugh"]]

str(myData)

By the way, is there a way of making such an assignment in one step
without the I() class() hack?
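
One possible alternative to the I()/class() approach, assuming a current R release, is to build the outer data frame first and then assign the inner one with $<-, which stores it as a single column. This takes two statements rather than one, and how the result prints may differ between R versions:

```r
myData <- data.frame(matrix(1:12, 4))
myData$var4 <- data.frame(xyzzy = 5:2, plugh = 1:4)

names(myData)                   # "X1" "X2" "X3" "var4"
names(myData$var4)              # "xyzzy" "plugh"
myData[["var4.plugh"]]          # NULL: no top-level column of that name
myData[["var4"]][["plugh"]]     # 1 2 3 4
```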

dave
-- 
 Dave Forrest
 [EMAIL PROTECTED](804)684-7900w
 [EMAIL PROTECTED] (804)642-0662h
   http://maplepark.com/~drf5n/
