Re: [R] data frame vs. matrix

2014-03-17 Thread Göran Broström



On 2014-03-16 23:56, Duncan Murdoch wrote:

On 14-03-16 2:57 PM, Göran Broström wrote:

I have always known that matrices are faster than data frames, for
instance this function:


dumkoll <- function(n = 1000, df = TRUE){
   dfr <- data.frame(x = rnorm(n), y = rnorm(n))
   if (df){
      for (i in 2:NROW(dfr)){
         if (!(i %% 100)) cat("i = ", i, "\n")
         dfr$x[i] <- dfr$x[i-1]
      }
   }else{
      dm <- as.matrix(dfr)
      for (i in 2:NROW(dm)){
         if (!(i %% 100)) cat("i = ", i, "\n")
         dm[i, 1] <- dm[i-1, 1]
      }
      dfr$x <- dm[, 1]
   }
}


system.time(dumkoll())

  user  system elapsed
 0.046   0.000   0.045

system.time(dumkoll(df = FALSE))

  user  system elapsed
 0.007   0.000   0.008
--

OK, no big deal, but I stumbled over a data frame with one million
records. Then, with df = TRUE,

     user    system   elapsed
44677.141  1271.544 46016.754

This is around 12 hours.

With df = FALSE, it took only six seconds! About 7500 times faster.

I was really surprised by the huge difference, and I wonder if this is
to be expected, or if it is some peculiarity with my installation: I'm
running Ubuntu 13.10 on a MacBook Pro with 8 Gb memory, R-3.0.3.


I don't find it surprising.  The line

dfr$x[i] <- dfr$x[i-1]

will be executed about a million times.  It does the following:


Thanks for the explanation; I got the idea that dfr[1, i] <- might be
faster than dfr$x[i] <-, but it is in fact significantly slower.

Helpful experience.

Göran


1.  Get a pointer to the x element of dfr.  This requires R to look
through all the names of dfr to figure out which one is x.

2.  Extract the i-1 element from it.  Not particularly slow.

3.  Get a pointer to the x element of dfr again.  (R doesn't cache these
things.)

4.  Set the i element of it to a new value.  This could require the
entire column or even the entire dataframe to be copied, if R hasn't
kept track of the fact that it is really being changed in place.  In a
complex assignment like that, I wouldn't be surprised if that took
place.  (In the matrix equivalent, it would be easier to recognize that
it is safe to change the existing value.)

Luke Tierney is making some changes in R-devel that might help a lot in
cases like this, but I expect the matrix code will always be faster.

Duncan Murdoch



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] data frame vs. matrix

2014-03-17 Thread Göran Broström

On 2014-03-17 01:31, Jeff Newmiller wrote:

Did you really intend to make all of the x values the same?


Not at all; the code in the loop was in fact just nonsense. The point 
was to illustrate the huge difference in execution time. And that the 
relative difference seems to increase fast with the number of observations.



If so,
try one line instead of the for loop:

dfr$x[ 2:n ] <- dfr$x[ 1 ]

If that was merely an error in your example, then you could use a
different one-liner:

dfr$x[ 2:n ] <- dfr$x[ seq.int( n-1 ) ]

In either case, the speedup is considerable.


I know about all this, but sometimes you have situations where you 
cannot avoid an explicit loop.



I use data frames far more than matrices and don't feel I am
suffering for it, but then I also use creative indexing way more than
for loops.


I think that this example shows that you need both tools in your toolbox.

Göran





Re: [R] data frame vs. matrix

2014-03-17 Thread Göran Broström

On 2014-03-17 00:36, William Dunlap wrote:

Duncan's analysis suggests another way to do this:
extract the 'x' vector, operate on that vector in a loop,
then insert the result into the data.frame.


Thanks Bill, that is a good improvement.

Göran



Re: [R] data frame vs. matrix

2014-03-16 Thread Rui Barradas

Hello,

This is to be expected. Matrices can hold only one type of data, so the
problem is solved once and for all; data frames can hold many types of
data, so the code that handles them must determine which type it is
dealing with on every access.


Hope this helps,

Rui Barradas



Re: [R] data frame vs. matrix

2014-03-16 Thread Duncan Murdoch



I don't find it surprising.  The line

dfr$x[i] <- dfr$x[i-1]

will be executed about a million times.  It does the following:

1.  Get a pointer to the x element of dfr.  This requires R to look 
through all the names of dfr to figure out which one is x.


2.  Extract the i-1 element from it.  Not particularly slow.

3.  Get a pointer to the x element of dfr again.  (R doesn't cache these 
things.)


4.  Set the i element of it to a new value.  This could require the 
entire column or even the entire dataframe to be copied, if R hasn't 
kept track of the fact that it is really being changed in place.  In a 
complex assignment like that, I wouldn't be surprised if that took 
place.  (In the matrix equivalent, it would be easier to recognize that 
it is safe to change the existing value.)
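
The four steps above can be spelled out as explicit function calls. This is only a sketch of how the nested replacement is evaluated in principle; the toy data frame and the names value, new_x and dfr2 are made up for illustration, and it says nothing about when R actually copies.

```r
# Toy data; the values are arbitrary
dfr <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6))
i <- 2

# Steps 1-2: look up column x, then read element i - 1
value <- dfr$x[i - 1]

# Steps 3-4: look up column x again, build a modified version of it,
# then store that column back into the data frame
new_x <- `[<-`(dfr$x, i, value)
dfr2 <- `$<-`(dfr, "x", new_x)

# The ordinary nested assignment produces the same data frame
dfr$x[i] <- dfr$x[i - 1]
identical(dfr, dfr2)
```

Written this way it is easier to see that every pass through the loop goes through the data frame replacement method once, which is where a copy of the column (or more) can happen.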


Luke Tierney is making some changes in R-devel that might help a lot in 
cases like this, but I expect the matrix code will always be faster.


Duncan Murdoch



Re: [R] data frame vs. matrix

2014-03-16 Thread William Dunlap
Duncan's analysis suggests another way to do this:
extract the 'x' vector, operate on that vector in a loop,
then insert the result into the data.frame.  I added
a df="quicker" option to your df argument and made the test
dataset deterministic so we could verify that the algorithms
do the same thing:

dumkoll <- function(n = 1000, df = TRUE){
  dfr <- data.frame(x = log(seq_len(n)), y = sqrt(seq_len(n)))
  if (identical(df, "quicker")) {
    x <- dfr$x
    for(i in 2:length(x)) {
      x[i] <- x[i-1]
    }
    dfr$x <- x
  } else if (df){
    for (i in 2:NROW(dfr)){
      # if (!(i %% 100)) cat("i = ", i, "\n")
      dfr$x[i] <- dfr$x[i-1]
    }
  }else{
    dm <- as.matrix(dfr)
    for (i in 2:NROW(dm)){
      # if (!(i %% 100)) cat("i = ", i, "\n")
      dm[i, 1] <- dm[i-1, 1]
    }
    dfr$x <- dm[, 1]
  }
  dfr
}

Timings for 10^4, 2*10^4, and 4*10^4 show that the time is quadratic
in n for the df=TRUE case and close to linear in the other cases, with
the new method taking about 60% the time of the matrix method:

> n <- c("10k"=1e4, "20k"=2e4, "40k"=4e4)
> sapply(n, function(n)system.time(dumkoll(n, df=FALSE))[1:3])
           10k  20k  40k
user.self 0.11 0.22 0.43
sys.self  0.02 0.00 0.00
elapsed   0.12 0.22 0.44
> sapply(n, function(n)system.time(dumkoll(n, df=TRUE))[1:3])
           10k   20k   40k
user.self 3.59 14.74 78.37
sys.self  0.00  0.11  0.16
elapsed   3.59 14.91 78.81
> sapply(n, function(n)system.time(dumkoll(n, df="quicker"))[1:3])
           10k  20k  40k
user.self 0.06 0.12 0.26
sys.self  0.00 0.00 0.00
elapsed   0.07 0.13 0.27

I also timed the 2 faster cases for n=10^6 and the time still looks linear
in n, with vector approach still taking about 60% the time of the matrix
approach.

> system.time(dumkoll(n=10^6, df=FALSE))
   user  system elapsed
  11.65    0.12   11.82
> system.time(dumkoll(n=10^6, df="quicker"))
   user  system elapsed
   6.79    0.08    6.91

The results from each method are identical:

> identical(dumkoll(100,df=FALSE), dumkoll(100,df=TRUE))
[1] TRUE
> identical(dumkoll(100,df=FALSE), dumkoll(100,df="quicker"))
[1] TRUE
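
The scaling claims can be checked with a little arithmetic on the user times reported above. This is only a sketch: the numbers are copied from the tables, and the helper name growth is made up for illustration.

```r
# User times copied from the tables above (seconds)
user_matrix  <- c("10k" = 0.11, "20k" = 0.22, "40k" = 0.43)
user_df      <- c("10k" = 3.59, "20k" = 14.74, "40k" = 78.37)
user_quicker <- c("10k" = 0.06, "20k" = 0.12, "40k" = 0.26)

# Growth factor each time n doubles: about 2 suggests linear time,
# about 4 or more suggests quadratic time or worse
growth <- function(t) unname(t[-1] / t[-length(t)])
round(growth(user_matrix), 2)
round(growth(user_df), 2)
round(growth(user_quicker), 2)

# Vector method relative to the matrix method
round(user_quicker / user_matrix, 2)
```

The df=TRUE times grow by a factor of more than 4 per doubling, while the other two grow by a factor of about 2, which is what the quadratic versus linear reading predicts.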

If your data.frame has columns of various types, then as.matrix will
coerce them all to a common type (often character), so it may give
you the wrong result in addition to being unnecessarily slow.
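
A small illustration of that coercion, using two made-up data frames (all_numeric and mixed are hypothetical names, not from the thread):

```r
# Hypothetical data frames for illustration
all_numeric <- data.frame(x = c(1.5, 2.5), y = c(3L, 4L))
mixed       <- data.frame(x = c(1.5, 2.5), g = c("a", "b"))

# Numeric columns give a numeric matrix
typeof(as.matrix(all_numeric))

# A single character (or factor) column turns every cell into character
typeof(as.matrix(mixed))

# The numeric column is now text, so arithmetic on it would fail
m <- as.matrix(mixed)
is.character(m[, "x"])
```

Extracting just the column you need as a vector, as in the "quicker" branch above, sidesteps the coercion entirely.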

Bill Dunlap
TIBCO Software
wdunlap tibco.com



Re: [R] data frame vs. matrix

2014-03-16 Thread Jeff Newmiller
Did you really intend to make all of the x values the same? If so, try one line 
instead of the for loop:

dfr$x[ 2:n ] <- dfr$x[ 1 ]

If that was merely an error in your example, then you could use a different 
one-liner:

dfr$x[ 2:n ] <- dfr$x[ seq.int( n-1 ) ]

In either case, the speedup is considerable.
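
The two one-liners do different things, which a tiny example makes visible. This is a sketch with made-up values; x0 and the dfr_* names are chosen here for illustration only.

```r
n  <- 6
x0 <- c(10, 20, 30, 40, 50, 60)

# The original loop: each element copies its already-overwritten
# predecessor, so the first value spreads down the whole column
dfr_loop <- data.frame(x = x0)
for (i in 2:n) dfr_loop$x[i] <- dfr_loop$x[i - 1]

# First one-liner: fill positions 2..n with the first value
dfr_fill <- data.frame(x = x0)
dfr_fill$x[2:n] <- dfr_fill$x[1]

# Second one-liner: shift the original values down by one position
dfr_shift <- data.frame(x = x0)
dfr_shift$x[2:n] <- dfr_shift$x[seq.int(n - 1)]

identical(dfr_loop$x, dfr_fill$x)
dfr_shift$x
```

Only the first one-liner reproduces the loop exactly; the second is the vectorized form of a lag, where the right-hand side is evaluated in full before any element is replaced.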

I use data frames far more than matrices and don't feel I am suffering for it, 
but then I also use creative indexing way more than for loops.

---
Jeff Newmiller
DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.



Re: [R] Data frame vs matrix quirk: Hinky error message?

2012-05-01 Thread Ista Zahn
Hi Bert,

The failure itself is the documented behavior: ?'[.data.frame' says

Matrix indexing ('x[i]' with a logical or a 2-column integer
 matrix 'i') using '[' is not recommended, and barely supported.
 For extraction, 'x' is first coerced to a matrix.  For
 replacement, a logical matrix (only) can be used to select the
 elements to be replaced in the same way as for a matrix.

The error message may be a bit hinky, as obviously data.frames can be
indexed by things other than logical matrices. Or is there another
reason this strikes you as odd?

Best,
Ista

On Tue, May 1, 2012 at 1:33 PM, Bert Gunter gunter.ber...@gene.com wrote:
 AdvisoRs:

 Is the following a bug, feature, hinky error message, or dumb Bert?

 > mtest <- matrix(1:12,nr=4)
 > dftest <- data.frame(mtest)
 > ix <- cbind(1:2,2:3)
 > mtest[ix] <- NA
 > mtest
      [,1] [,2] [,3]
 [1,]    1   NA    9
 [2,]    2    6   NA
 [3,]    3    7   11
 [4,]    4    8   12

 ## But ...
 > dftest[ix] <- NA
 Error in `[<-.data.frame`(`*tmp*`, ix, value = NA) :
   only logical matrix subscripts are allowed in replacement

 Obviously, I was expecting matrix indexing for replacement to work
 similarly in both cases; however, I can see why it would be
 problematic for data frames (mixed types), but was a bit nonplussed by
 the error message, which seems hinky to me.

 Cheers,
 Bert



Re: [R] Data frame vs matrix quirk: Hinky error message?

2012-05-01 Thread Ted Harding
On 01-May-2012 17:33:23 Bert Gunter wrote:
 AdvisoRs:
 
 Is the following a bug, feature, hinky error message, or dumb Bert?
 
 > mtest <- matrix(1:12,nr=4)
 > dftest <- data.frame(mtest)
 > ix <- cbind(1:2,2:3)
 > mtest[ix] <- NA
 > mtest
      [,1] [,2] [,3]
 [1,]    1   NA    9
 [2,]    2    6   NA
 [3,]    3    7   11
 [4,]    4    8   12
 
 ## But ...
 > dftest[ix] <- NA
 Error in `[<-.data.frame`(`*tmp*`, ix, value = NA) :
   only logical matrix subscripts are allowed in replacement
 
 Obviously, I was expecting matrix indexing for replacement to
 work similarly in both cases; however, I can see why it would
 be problematic for data frames (mixed types), but was a bit
 nonplussed by the error message, which seems hinky to me.
 
 Cheers,
 Bert

Also interesting is that, prior to the substitution commands,

  mtest[ix]
  # [1]  5 10
  dftest[ix]
  # [1]  5 10

both as one would expect on Bert's naive grounds (which, I confess,
I also share[d]).

Ted.

-
E-Mail: (Ted Harding) ted.hard...@wlandres.net
Date: 01-May-2012  Time: 19:03:14
This message was sent by XFMail



Re: [R] Data frame vs matrix quirk: Hinky error message?

2012-05-01 Thread Bert Gunter
Many thanks, Ista:

I only looked in ].default so the answer is: Alternative 4: dumb
Bert. Rap knuckles with ruler.

Actually, indexing by a logical matrix doesn't make much sense to me
in either case, as it does not have the effect of selecting individual
elements, which is what numeric matrix indices do. But that's a matter
of usage, neither bug nor feature.

If I had gotten something like the error message: "Matrix indices not
allowed for replacement in data frames", I would not have been
surprised. But as you said, the behavior **IS** documented.

Best,
Bert






-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



Re: [R] Data frame vs matrix quirk: Hinky error message?

2012-05-01 Thread Duncan Murdoch

On 01/05/2012 2:12 PM, Bert Gunter wrote:

Many thanks, Ista:

I only looked in ].default so the answer is: Alternative 4: dumb
Bert. Rap knuckles with ruler.

Actually, indexing by a logical matrix doesn't make much  sense to me
in either case, as it does not have the effect of selecting individual
elements, which is what numeric matrix indices do. But that's a matter
of usage, neither bug nor feature.

If I had gotten something like the error message: "Matrix indices not
allowed for replacement in data frames", I would not have been
surprised. But as you said, the behavior **IS** documented.


Your version is not correct:  matrix indices *are* allowed for 
replacement, but only logical matrix indices, not two-column numerical 
ones.   The message might be clearer if instead of saying "only logical 
matrix subscripts are allowed in replacement" it said "matrix subscripts 
must be logical matrices in replacement", but I think the basic problem 
is the limitation.  I'll fix that.
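
Until the limitation is lifted, one possible workaround is to turn the two-column numeric index into a logical matrix first. This is a sketch, not something proposed in the thread; the name mask is made up, and it assumes the logical matrix has the same dimensions as the data frame.

```r
dftest <- data.frame(matrix(1:12, nrow = 4))
ix <- cbind(1:2, 2:3)              # (row, column) pairs: (1,2) and (2,3)

# Build a logical matrix with the same dimensions as the data frame,
# marking exactly the cells named by ix
mask <- matrix(FALSE, nrow = nrow(dftest), ncol = ncol(dftest))
mask[ix] <- TRUE

# Logical matrix replacement is the form the data frame method accepts
dftest[mask] <- NA
dftest
```

The numeric index is applied to an ordinary matrix, where it is fully supported, and only the resulting logical matrix touches the data frame.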


Duncan Murdoch





Re: [R] Data frame vs matrix quirk: Hinky error message?

2012-05-01 Thread Bert Gunter
Duncan:

Maybe there **is** a bug, then.

> zmat <- matrix(1:12,nr=4)
> zdf <- data.frame(zmat)
> ix <- cbind(c(FALSE,TRUE),c(TRUE,TRUE))
> zmat[ix]
[1]  2  3  4  6  7  8 10 11 12
> zdf[ix]
[1]  2  3  4  6  7  8 10 11 12
> zmat[ix] <- NA
> zmat
     [,1] [,2] [,3]
[1,]    1    5    9
[2,]   NA   NA   NA
[3,]   NA   NA   NA
[4,]   NA   NA   NA

## ??

> zdf[ix] <- NA
Error in `[<-.data.frame`(`*tmp*`, ix, value = NA) :
  only logical matrix subscripts are allowed in replacement

That matrix replacement should not work with (in general mixed type)
data frames seems reasonable, actually. Trying to fix things may not
be. But I leave this to you and your fellow expeRts,

Cheers,
Bert
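
A likely reading of the output above is that the 2 x 2 logical ix does not match the 4 x 3 shape of the data, so for the matrix it is treated as a plain logical vector and recycled. The sketch below checks that reading; the names recycled and full are made up, and the assumption that the data frame method wants a logical matrix with matching dimensions is mine, not something the thread confirms.

```r
zmat <- matrix(1:12, nrow = 4)
ix <- cbind(c(FALSE, TRUE), c(TRUE, TRUE))   # 2 x 2, not 4 x 3

# As a vector, ix is FALSE, TRUE, TRUE, TRUE; recycle it over 12 elements
recycled <- rep(as.vector(ix), length.out = length(zmat))
which(recycled)
identical(zmat[ix], zmat[recycled])

# Assumption: a logical matrix with the data frame's own dimensions
# is accepted for replacement
zdf <- data.frame(zmat)
full <- matrix(recycled, nrow = nrow(zdf))
zdf[full] <- NA
zdf
```

With the full-size logical matrix, the data frame ends up with the same pattern of NA values as zmat in the transcript above.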


On Tue, May 1, 2012 at 11:30 AM, Duncan Murdoch
murdoch.dun...@gmail.com wrote:
 On 01/05/2012 2:12 PM, Bert Gunter wrote:

 Many thanks, Ista:

 I only looked in ].default so the answer is: Alternative 4: dumb
 Bert. Rap knuckles with ruler.

 Actually, indexing by a logical matrix doesn't make much  sense to me
 in either case, as it does not have the effect of selecting individual
 elements, which is what numeric matrix indices do. But that's a matter
 of usage, neither bug nor feature.

 If I had gotten something like the error message: "Matrix indices not
 allowed for replacement in data frames", I would not have been
 surprised. But as you said, the behavior **IS** documented.


 Your version is not correct:  matrix indices *are* allowed for replacement,
 but only logical matrix indices, not two-column numerical ones.   The
 message might be clearer if instead of saying "only logical matrix
 subscripts are allowed in replacement"
 it said "matrix subscripts must be logical matrices in replacement", but I
 think the basic problem is the limitation.  I'll fix that.

 Duncan Murdoch


 Best,
 Bert



 On Tue, May 1, 2012 at 10:49 AM, Ista Zahn <istaz...@gmail.com> wrote:
   Hi Bert,
 
   The failure itself is the documented behavior: ?'[.data.frame' says
 
   Matrix indexing ('x[i]' with a logical or a 2-column integer
        matrix 'i') using '[' is not recommended, and barely supported.
        For extraction, 'x' is first coerced to a matrix.  For
        replacement, a logical matrix (only) can be used to select the
        elements to be replaced in the same way as for a matrix.
 
   The error message may be a bit hinky, as obviously data.frames can be
   indexed by things other than logical matrices. Or is there another
   reason this strikes you as odd?
 
   Best,
   Ista
 
   On Tue, May 1, 2012 at 1:33 PM, Bert Gunter <gunter.ber...@gene.com>
   wrote:
   AdvisoRs:
 
   Is the following a bug, feature, hinky error message, or dumb Bert?
 
   > mtest <- matrix(1:12,nr=4)
   > dftest <- data.frame(mtest)
   > ix <- cbind(1:2,2:3)
   > mtest[ix] <- NA
   > mtest
        [,1] [,2] [,3]
   [1,]    1   NA    9
   [2,]    2    6   NA
   [3,]    3    7   11
   [4,]    4    8   12

   ## But ...
   > dftest[ix] <- NA
   Error in `[<-.data.frame`(`*tmp*`, ix, value = NA) :
     only logical matrix subscripts are allowed in replacement
 
   Obviously, I was expecting matrix indexing for replacement to work
   similarly in both cases; however, I can see why it would be
   problematic for data frames (mixed types), but was a bit nonplussed by
   the error message, which seems hinky to me.
 
   Cheers,
   Bert
 
   --
 
   Bert Gunter
   Genentech Nonclinical Biostatistics
 
   Internal Contact Info:
   Phone: 467-7374
   Website:
 
   http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
 
   __
   R-help@r-project.org mailing list
   https://stat.ethz.ch/mailman/listinfo/r-help
   PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
   and provide commented, minimal, self-contained, reproducible code.







-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



Re: [R] Data frame vs matrix quirk: Hinky error message?

2012-05-01 Thread David L Carlson
The difference is in recycling. If the logical matrix has the same
dimensions, it seems to work:

> jx <- cbind(c(FALSE, TRUE, FALSE, TRUE), c(TRUE, FALSE, TRUE, FALSE),
+             c(FALSE, TRUE, FALSE, TRUE))
> zmat[jx] <- NA
> zmat
     [,1] [,2] [,3]
[1,]    1   NA    9
[2,]   NA    6   NA
[3,]    3   NA   11
[4,]   NA    8   NA
> zdf[jx] <- NA
> zdf
  X1 X2 X3
1  1 NA  9
2 NA  6 NA
3  3 NA 11
4 NA  8 NA

--
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352
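
The recycling can be made explicit. What follows is a minimal sketch, not code posted in the thread: it flattens the 2 x 2 logical index from Bert's example in column-major order and repeats it over the 12 cells of the matrix. The names `recycled` and `selected` are illustrative only.

```r
# Sketch: how a 2 x 2 logical index is recycled over a 4 x 3 matrix.
zmat <- matrix(1:12, nrow = 4)
ix <- cbind(c(FALSE, TRUE), c(TRUE, TRUE))

# Matrices are stored column by column, so the index flattens to
# FALSE, TRUE, TRUE, TRUE and is then repeated to cover all 12 cells.
recycled <- rep(as.vector(ix), length.out = length(zmat))
selected <- which(recycled)
selected

# The recycled positions pick the same cells as the logical matrix index.
identical(zmat[ix], zmat[selected])
```

Only the first cell of each column escapes selection, which is why rows 2 to 4 of every column become NA in the matrix case.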




Re: [R] Data frame vs matrix quirk: Hinky error message?

2012-05-01 Thread Duncan Murdoch

On 01/05/2012 2:45 PM, Bert Gunter wrote:

Duncan:

Maybe there **is** a bug, then.

> zmat <- matrix(1:12,nr=4)
> zdf <- data.frame(zmat)
> ix <- cbind(c(FALSE,TRUE),c(TRUE,TRUE))
> zmat[ix]
[1]  2  3  4  6  7  8 10 11 12
> zdf[ix]
[1]  2  3  4  6  7  8 10 11 12
> zmat[ix] <- NA
> zmat
     [,1] [,2] [,3]
[1,]    1    5    9
[2,]   NA   NA   NA
[3,]   NA   NA   NA
[4,]   NA   NA   NA

## ??

> zdf[ix] <- NA
Error in `[<-.data.frame`(`*tmp*`, ix, value = NA) :
  only logical matrix subscripts are allowed in replacement

That matrix replacement should not work with (in general mixed type)
data frames seems reasonable, actually. Trying to fix things may not
be. But I leave this to you and your fellow expeRts,


My intention is to allow two-column numeric indices, not to change the
behaviour for logical matrix indices.  I'm planning to leave the "not
recommended" note in the help page, because there will still be
surprises as above, but the error message will just say


"illegal matrix index in replacement"

The rule will remain that a logical matrix needs to be of the same
dimension as the original dataframe.  I'm not sure if this is documented
currently, but it will be.


Duncan Murdoch
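
The same-dimension rule can also be enforced by the caller before the assignment. This is a minimal sketch under stated assumptions: `replace_by_mask()` is a hypothetical wrapper, not part of base R and not Duncan's patch, and its error text is invented for illustration.

```r
# Hypothetical helper (not part of base R): replace the cells of a data
# frame selected by a logical matrix, insisting on matching dimensions.
replace_by_mask <- function(df, mask, value) {
  stopifnot(is.data.frame(df), is.matrix(mask), is.logical(mask))
  if (!identical(dim(mask), dim(df))) {
    stop("logical matrix index must have the same dimensions as the data frame")
  }
  df[mask] <- value
  df
}

zdf <- data.frame(matrix(1:12, nrow = 4))
mask <- matrix(FALSE, nrow(zdf), ncol(zdf))
mask[1, 2] <- TRUE
mask[2, 3] <- TRUE
out <- replace_by_mask(zdf, mask, NA)
out

# A 2 x 2 mask is rejected up front instead of being recycled.
small <- cbind(c(FALSE, TRUE), c(TRUE, TRUE))
res <- tryCatch(replace_by_mask(zdf, small, NA),
                error = function(e) conditionMessage(e))
res
```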





Re: [R] Data frame vs matrix quirk: Hinky error message?

2012-05-01 Thread Nordlund, Dan (DSHS/RDA)
Bert,

I think this is what is needed for the data frame

ix <- cbind(1:2,2:3)
ixm <- matrix(FALSE,4,3)
ixm[ix] <- TRUE
zdf[ixm] <- NA

Hope this is helpful,

Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA 98504-5204
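
As a quick check that this workaround mirrors the matrix behaviour, the sketch below applies both replacements to Bert's original `mtest` / `dftest` example and compares the results. Sizing the mask with `nrow()` and `ncol()` rather than the literal 4 and 3 is a small generalisation added here; the variable `same` is illustrative.

```r
mtest <- matrix(1:12, nrow = 4)
dftest <- data.frame(mtest)
ix <- cbind(1:2, 2:3)

# Build a logical mask with the data frame's dimensions and switch on the
# cells named by the two-column numeric index.
ixm <- matrix(FALSE, nrow(dftest), ncol(dftest))
ixm[ix] <- TRUE

mtest[ix] <- NA      # numeric matrix index on the matrix
dftest[ixm] <- NA    # logical mask on the data frame

# Compare cell by cell, ignoring the data frame's column names.
same <- isTRUE(all.equal(unname(as.matrix(dftest)), mtest))
same
```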



Re: [R] Data frame vs matrix quirk: Hinky error message?

2012-05-01 Thread Bert Gunter
Many thanks to all. I appreciate your kindness and patience.

The point is, of course, that matrix subscripting by logicals requires
different semantics than by numeric indices, as it must. I'd still say
this is a case of option 4, dumb Bert: I should have figured this out.

Duncan's proposed changes to both behavior and documentation would
certainly address all my points of confusion. However, I agree that
numeric replacement indices for data frames may be a can of worms:
presumably silent type conversion would be required when replacing
values in mixed type columns. Keeping the warnings in -- and maybe
issuing some more when the type conversion occurs -- is certainly a
good idea.

Best,
Bert
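
The type-conversion concern is easy to see on a small mixed-type data frame. This is a minimal sketch with made-up data (`mixed`, `id` and `label` are illustrative names): coercing to a matrix forces one type for every cell, and writing a string into an integer column converts the whole column without any warning.

```r
mixed <- data.frame(id = 1:3, label = c("a", "b", "c"),
                    stringsAsFactors = FALSE)

# Matrix-style extraction coerces the data frame to a matrix first, and a
# matrix can hold only one type: here everything becomes character.
typeof(as.matrix(mixed))

# Writing a string into the integer column silently converts the column.
before <- class(mixed$id)
mixed$id[2] <- "two"
after <- class(mixed$id)
c(before = before, after = after)
```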


Re: [R] Data frame vs matrix quirk: Hinky error message?

2012-05-01 Thread Duncan Murdoch

On 12-05-01 3:57 PM, Nordlund, Dan (DSHS/RDA) wrote:

Bert,

I think this is what is needed for the data frame

ix <- cbind(1:2,2:3)
ixm <- matrix(FALSE,4,3)
ixm[ix] <- TRUE
zdf[ixm] <- NA

Hope this is helpful,


That's essentially what I did in adding the numeric indexing.  The only
complication was handling the case where ix contains out-of-bounds
values; users don't want to hear that "ixm[ix] <- TRUE" failed.


Duncan Murdoch
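
How the actual patch reports out-of-bounds values is not shown in the thread. The sketch below is one assumption about what such a guard could look like: `logical_index_from_pairs()` is a hypothetical helper, and its error wording is invented. It validates the index pairs first so that the failure is reported in terms of the user's own index matrix rather than an internal assignment.

```r
# Hypothetical helper: turn a two-column (row, column) index into a logical
# mask of the given dimensions, with an explicit bounds check.
logical_index_from_pairs <- function(pairs, dims) {
  stopifnot(is.matrix(pairs), ncol(pairs) == 2L, length(dims) == 2L,
            !any(is.na(pairs)))
  out_of_range <- pairs[, 1] < 1 | pairs[, 1] > dims[1] |
                  pairs[, 2] < 1 | pairs[, 2] > dims[2]
  if (any(out_of_range)) {
    stop("matrix index out of range in row(s) ",
         paste(which(out_of_range), collapse = ", "),
         " of the index matrix")
  }
  mask <- matrix(FALSE, dims[1], dims[2])
  mask[pairs] <- TRUE
  mask
}

zdf <- data.frame(matrix(1:12, nrow = 4))
mask <- logical_index_from_pairs(cbind(1:2, 2:3), dim(zdf))
zdf[mask] <- NA

# Row 5 does not exist in a 4-row data frame, so this is rejected.
msg <- tryCatch(logical_index_from_pairs(cbind(5, 1), dim(zdf)),
                error = function(e) conditionMessage(e))
msg
```

Missing values in the index are simply refused here; a fuller implementation would need to decide what they should mean.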




Re: [R] Data frame vs matrix quirk: Hinky error message?

2012-05-01 Thread Rui Barradas
P.S.

The way the logical matrix is constructed is NOT general purpose.
Quoting myself quoting Bert,

 Actually, it works, as long as the logical index matrix has the same
 dimensions as the data frame.
 
 zmat <- matrix(1:12,nr=4)
 zdf <- data.frame(zmat)
 
 # Numeric index matrix.
 ix <- cbind(1:2,2:3)
 # Logical index matrix.
 ix2 <- row(zdf) == ix[, 1] & col(zdf) == ix[, 2]
 

Here the number of rows in zdf is a multiple of the lengths of the vectors
ix[, 1] and ix[, 2], so the recycling rules make it work. But if the numeric
index matrix has, say, 3 rows, another way of constructing the logical one
would be needed.

jx <- cbind(1:3, c(2:3, 3))
row(zdf) == jx[, 1] & col(zdf) == jx[, 2]
      [,1]  [,2]  [,3]
[1,] FALSE FALSE FALSE
[2,] FALSE FALSE FALSE
[3,] FALSE FALSE FALSE
[4,] FALSE FALSE FALSE

(Anyway, I don't believe that was the point.)

R.B.
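
One construction that does not depend on recycling is to convert each (row, column) pair to its column-major position and switch on exactly those cells. This is a minimal sketch, not code from the thread; `lin` and `jx2` are illustrative names, and the index is assumed to be in bounds.

```r
zdf <- data.frame(matrix(1:12, nrow = 4))
jx <- cbind(1:3, c(2:3, 3))

# Column-major position of cell (r, c) in an nrow x ncol layout:
# (c - 1) * nrow + r.
lin <- (jx[, 2] - 1) * nrow(zdf) + jx[, 1]

jx2 <- matrix(FALSE, nrow(zdf), ncol(zdf))
jx2[lin] <- TRUE
jx2

# The mask has the data frame's dimensions, so replacement is allowed.
zdf[jx2] <- NA
zdf
```

With three index rows this marks cells (1,2), (2,3) and (3,3), where the comparison against `row()` and `col()` marked none.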



--
View this message in context: 
http://r.789695.n4.nabble.com/Data-frame-vs-matrix-quirk-Hinky-error-message-tp4601254p4601558.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Data frame vs matrix quirk: Hinky error message?

2012-05-01 Thread Rui Barradas
Hello,


Bert Gunter wrote
 
 Duncan:
 
 Maybe there **is** a bug, then.
 
 > zmat <- matrix(1:12,nr=4)
 > zdf <- data.frame(zmat)
 > ix <- cbind(c(FALSE,TRUE),c(TRUE,TRUE))
 > zmat[ix]
 [1]  2  3  4  6  7  8 10 11 12
 > zdf[ix]
 [1]  2  3  4  6  7  8 10 11 12
 > zmat[ix] <- NA
 > zmat
      [,1] [,2] [,3]
 [1,]    1    5    9
 [2,]   NA   NA   NA
 [3,]   NA   NA   NA
 [4,]   NA   NA   NA
 
 ## ??
 
 > zdf[ix] <- NA
 Error in `[<-.data.frame`(`*tmp*`, ix, value = NA) :
   only logical matrix subscripts are allowed in replacement
 
 That matrix replacement should not work with (in general mixed type)
 data frames seems reasonable, actually. Trying to fix things may not
 be. But I leave this to you and your fellow expeRts,
 
 Cheers,
 Bert
 
 
 On Tue, May 1, 2012 at 11:30 AM, Duncan Murdoch
 lt;murdoch.duncan@gt; wrote:
 On 01/05/2012 2:12 PM, Bert Gunter wrote:

 Many thanks, Ista:

 I only looked in ].default so the answer is: Alternative 4: dumb
 Bert. Rap knuckles with ruler.

 Actually, indexing by a logical matrix doesn't make much  sense to me
 in either case, as it does not have the effect of selecting individual
 elements, which is what numeric matrix indices do. But that's a matter
 of usage, neither bug nor feature.

 If I had gotten something like the error message: Matrix indices not
 allowed for replacement in data frames, I would not have been
 surprised. But as you said, the behavior **IS** documented.


 Your version is not correct:  matrix indices *are* allowed for
 replacement,
 but only logical matrix indices, not two column numerical ones.   The
 message might be clearer if instead of saying only logical matrix
 subscripts are allowed in replacement
 it said matrix subscripts must be logical matrices in replacement, but
 I
 think the basic problem is the limitation.  I'll fix that.

 Duncan Murdoch


 Best,
 Bert



 On Tue, May 1, 2012 at 10:49 AM, Ista Zahnlt;istazahn@gt;  wrote:
   Hi Bert,
 
   The failure itself is the documented behavior: ?'[.data.frame' says
 
   Matrix indexing ('x[i]' with a logical or a 2-column integer
        matrix 'i') using '[' is not recommended, and barely supported.
        For extraction, 'x' is first coerced to a matrix.  For
        replacement, a logical matrix (only) can be used to select the
        elements to be replaced in the same way as for a matrix.
 
   The error message may be a bit hinky, as obviously data.frames can be
   indexed by things other than logical matricies. Or is there another
   reason this strikes you as odd?
 
   Best,
   Ista
 
   On Tue, May 1, 2012 at 1:33 PM, Bert Gunterlt;gunter.berton@gt;
   wrote:
   AdvisoRs:
 
   Is the following a bug, feature, hinky error message, or dumb Bert?
 
   mtest- matrix(1:12,nr=4)
   dftest- data.frame(mtest)
   ix- cbind(1:2,2:3)
   mtest[ix]- NA
   mtest
        [,1] [,2] [,3]
   [1,]    1   NA    9
   [2,]    2    6   NA
   [3,]    3    7   11
   [4,]    4    8   12
 
   ## But ...
   dftest[ix]- NA
   Error in `[-.data.frame`(`*tmp*`, ix, value = NA) :
     only logical matrix subscripts are allowed in replacement
 
   Obviously, I was expecting matrix indexing for replacement to work
   similarly in both cases; however, I can see why it would be
   problematic for data frames (mixed types), but was a bit nonplussed by
   the error message, which seems hinky to me.
 
   Cheers,
   Bert
 
   --
 
   Bert Gunter
   Genentech Nonclinical Biostatistics
 
   Internal Contact Info:
   Phone: 467-7374
   Website:
 
 
  http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
 
   __
   R-help@ mailing list
   https://stat.ethz.ch/mailman/listinfo/r-help
   PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
   and provide commented, minimal, self-contained, reproducible code.




 
 
 
 

Actually, it works, as long as the logical index matrix has the same
dimensions as the data frame.


zmat <- matrix(1:12, nr=4)
zdf <- data.frame(zmat)

# Numeric index matrix.
ix <- cbind(1:2, 2:3)
# Logical index matrix.
ix2 <- row(zdf) == ix[, 1] & col(zdf) == ix[, 2]

zmat[ix]
zmat[ix2]

zdf[ix]
zdf[ix2]

zmat[ix] <- NA
zmat
# So far so good,
# But now, as already seen, error
zdf[ix] <- NA
# Works
zdf[ix2] <- NA
zdf
zdf

It even makes sense...
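
One caveat, offered as a sketch rather than a correction: building the
logical matrix by comparing row(zdf) and col(zdf) against the columns of ix
relies on vector recycling, and it picks the right cells for this particular
ix. A more general construction is to start from an all-FALSE matrix and set
the indexed cells to TRUE. The helper name index_to_mask below is
hypothetical and does not come from the thread.

```r
# Hypothetical helper (not from the thread): turn a two-column numeric
# index matrix into a logical mask with the same dimensions as x.
index_to_mask <- function(x, ix) {
  mask <- matrix(FALSE, nrow = nrow(x), ncol = ncol(x))
  mask[ix] <- TRUE   # ordinary matrix indexing on a plain logical matrix
  mask
}

zdf <- data.frame(matrix(1:12, nrow = 4))

# An index for which the recycling-based construction misses a cell.
ix <- cbind(c(1, 3), c(2, 3))
recycled <- row(zdf) == ix[, 1] & col(zdf) == ix[, 2]
mask <- index_to_mask(zdf, ix)

which(recycled, arr.ind = TRUE)  # only cell (1, 2)
which(mask, arr.ind = TRUE)      # cells (1, 2) and (3, 3)

# Logical-matrix replacement on the data frame, as in the message above.
zdf[mask] <- NA
zdf
```

Because the mask is built on a plain matrix, the only data frame operation
involved is the logical-matrix replacement that the help page documents.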

Rui Barradas




Re: [R] Data frame vs matrix quirk: Hinky error message?

2012-05-01 Thread David Winsemius


On May 1, 2012, at 1:33 PM, Bert Gunter wrote:


AdvisoRs:

Is the following a bug, feature, hinky error message, or dumb Bert?


mtest <- matrix(1:12, nr=4)
dftest <- data.frame(mtest)
ix <- cbind(1:2, 2:3)
mtest[ix] <- NA
mtest

     [,1] [,2] [,3]
[1,]    1   NA    9
[2,]    2    6   NA
[3,]    3    7   11
[4,]    4    8   12

## But ...

dftest[ix] <- NA

Error in `[<-.data.frame`(`*tmp*`, ix, value = NA) :
  only logical matrix subscripts are allowed in replacement


I'm not sure _I_ would have expected '[<-.data.frame' to recognize
that a matrix was being offered, because the [.] formalism without a
comma (called i-indexing on the help page) would generally be
referencing only columns (i.e. list elements). I had not realized the
possibility of offering a logical matrix to a data frame, but it does
succeed, as predicted by


?"[.data.frame"

 For replacement, a logical matrix (only) can be used to select the  
elements to be replaced in the same way as for a matrix.


So how you want to characterize documented behavior is your call. I  
would never choose the label you offered.
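
To make the distinction concrete, here is a minimal sketch (the data frame
and its column names are made up for illustration): a single vector index
without a comma picks columns, while a logical matrix of the same dimensions
picks individual cells.

```r
# Illustrative data frame; the column names are assumptions.
df <- data.frame(num = 1:3,
                 chr = c("x", "y", "z"),
                 flag = c(TRUE, FALSE, TRUE))

# A single vector index selects columns (list elements), not cells.
cols <- df[2:3]
class(cols)   # still a data frame
names(cols)   # the selected columns, "chr" and "flag"

# A logical matrix with the same dimensions as df selects individual cells.
mask <- matrix(FALSE, nrow = nrow(df), ncol = ncol(df))
mask[2, 1] <- TRUE
df[mask] <- NA
df$num        # the second element is now NA
```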



mtest <- matrix(FALSE, 4, 4)
ix <- cbind(1:2, 2:3)
dftest <- data.frame(mtest)
mtest[ix] <- TRUE
dftest[mtest] <- "a"
dftest
     X1    X2    X3    X4
1 FALSE     a FALSE FALSE
2 FALSE FALSE     a FALSE
3 FALSE FALSE FALSE FALSE
4 FALSE FALSE FALSE FALSE

The nonassignment operation still succeeds:

dftest[ix]
[1] "a" "a"
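
The help page passage quoted earlier in the thread says that, for
extraction, the data frame is first coerced to a matrix. A small sketch of
what that implies for mixed-type columns (the data frame here is made up for
illustration):

```r
# Illustrative mixed-type data frame; the names are assumptions.
df <- data.frame(num = c(1.5, 2.5, 3.5), chr = c("x", "y", "z"))
ix <- cbind(1:2, 1:2)   # cells (1, 1) and (2, 2)

picked <- df[ix]
is.character(picked)    # TRUE: the numeric cell comes back as a string

# The same extraction, with the coercion spelled out.
identical(picked, as.matrix(df)[ix])
```

This is one reason matrix-style extraction from a data frame can surprise:
the result takes the common type of the coerced matrix, not the type of each
original column.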



Obviously, I was expecting matrix indexing for replacement to work
similarly in both cases; however, I can see why it would be
problematic for data frames (mixed types), but was a bit nonplussed by
the error message, which seems hinky to me.

Cheers,
Bert

--

--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
