Re: [Rd] Shouldn't vector indexing with negative out-of-range index give an error?

2015-05-06 Thread Martin Maechler
 John Chambers j...@stat.stanford.edu
 on Tue, 5 May 2015 08:39:46 -0700 writes:

> When someone suggests that we might have had a reason for some
> peculiarity in the original S, my usual reaction is "Or else we never
> thought of the problem".
> In this case, however, there is a relevant statement in the 1988 blue
> book.  In the discussion of subscripting (p 358) the definition for
> negative i says: "the indices consist of the elements of seq(along=x)
> that do not match any elements in -i".
>
> Suggesting that no bounds checking on -i takes place.
>
> John

Indeed!  
Thanks a lot John, for the perspective and clarification!

I'm committing a patch to the documentation now.
Martin



__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Shouldn't vector indexing with negative out-of-range index give an error?

2015-05-06 Thread Henrik Bengtsson
On Wed, May 6, 2015 at 1:33 AM, Martin Maechler
maech...@lynne.stat.math.ethz.ch wrote:
> John Chambers j...@stat.stanford.edu
> on Tue, 5 May 2015 08:39:46 -0700 writes:
>
>> When someone suggests that we might have had a reason for some
>> peculiarity in the original S, my usual reaction is "Or else we never
>> thought of the problem".
>> In this case, however, there is a relevant statement in the 1988 blue
>> book.  In the discussion of subscripting (p 358) the definition for
>> negative i says: "the indices consist of the elements of seq(along=x)
>> that do not match any elements in -i".
>>
>> Suggesting that no bounds checking on -i takes place.
>>
>> John
>
> Indeed!
> Thanks a lot John, for the perspective and clarification!
>
> I'm committing a patch to the documentation now.

Thank you both and also credits to Dongcan Jiang for pointing out to
me that errors were indeed not generated in this case.

I agree with the decision. It's interesting to notice that now the
only way an error is generated is when index-vector subsetting is done
using mixed positive and negative indices, e.g. x[c(-1,1)].
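
A minimal sketch of that remaining error case, assuming a current R session. The exact wording of the error message differs between R versions, so the condition is only captured here, not matched against a string:

```r
x <- 1:4

# Out-of-range negative indices are silently ignored.
x[-9L]          # same as x

# Mixing positive and negative indices is still an error.
res <- tryCatch(x[c(-1, 1)], error = function(e) e)
inherits(res, "error")
```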

/Henrik


Re: [Rd] Shouldn't vector indexing with negative out-of-range index give an error?

2015-05-06 Thread Hervé Pagès

Hi,

On 05/06/2015 09:04 AM, Henrik Bengtsson wrote:

> On Wed, May 6, 2015 at 1:33 AM, Martin Maechler
> maech...@lynne.stat.math.ethz.ch wrote:
>> John Chambers j...@stat.stanford.edu
>> on Tue, 5 May 2015 08:39:46 -0700 writes:
>>
>>> When someone suggests that we might have had a reason for some
>>> peculiarity in the original S, my usual reaction is "Or else we never
>>> thought of the problem".
>>> In this case, however, there is a relevant statement in the 1988 blue
>>> book.  In the discussion of subscripting (p 358) the definition for
>>> negative i says: "the indices consist of the elements of seq(along=x)
>>> that do not match any elements in -i".
>>>
>>> Suggesting that no bounds checking on -i takes place.
>>>
>>> John
>>
>> Indeed!
>> Thanks a lot John, for the perspective and clarification!
>>
>> I'm committing a patch to the documentation now.
>
> Thank you both and also credits to Dongcan Jiang for pointing out to
> me that errors were indeed not generated in this case.
>
> I agree with the decision. It's interesting to notice that now the
> only way an error is generated is when index-vector subsetting is done
> using mixed positive and negative indices, e.g. x[c(-1,1)].


This is why in situations where I need to extract a single element from
an atomic vector I use [[ instead of [. It's safer (performs bounds
checking), a little bit faster (at least last time I checked), and
drops the name of the element.
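
A small sketch of the bounds checking and name dropping mentioned above, assuming a named numeric vector chosen for illustration. The speed claim is not measured here:

```r
x <- c(a = 2, b = 6)

# `[` with a positive out-of-range index yields NA instead of failing.
x[3]

# `[[` checks bounds and signals an error.
res <- tryCatch(x[[3]], error = function(e) e)
inherits(res, "error")

# `[` keeps the element name, `[[` drops it.
names(x[1])     # "a"
names(x[[1]])   # NULL
```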

BTW did you know that one can use a negative index with [[ on a
vector of length 2?

  > c(a=2, b=6)[[-1]]
  [1] 6
  > c(a=2, b=6)[[-2]]
  [1] 2
  > list(a=22, b=6:5)[[-1]]
  [1] 6 5
  > list(a=22, b=6:5)[[-2]]
  [1] 22
  > list(a=22, b=6:5)[[c(-1, -2)]]
  [1] 6
  > list(a=22, b=6:5)[[c(-1, -1)]]
  [1] 5

Also works with [[<-:

  > x <- list(a=22, b=6:5)
  > x[[c(-1, -2)]] <- 99L
  > x
  $a
  [1] 22

  $b
  [1] 99  5

Not that I ever needed that feature though...

Cheers,
H.




Re: [Rd] Shouldn't vector indexing with negative out-of-range index give an error?

2015-05-05 Thread Martin Maechler
 Henrik Bengtsson henrik.bengts...@ucsf.edu
 on Mon, 4 May 2015 12:20:44 -0700 writes:

> In Section 'Indexing by vectors' of 'R Language Definition'
> (http://cran.r-project.org/doc/manuals/r-release/R-lang.html#Indexing-by-vectors)
> it says:
>
> "Integer. All elements of i must have the same sign. If they are
> positive, the elements of x with those index numbers are selected. If
> i contains negative elements, all elements except those indicated are
> selected.
>
> If i is positive and exceeds length(x) then the corresponding
> selection is NA. A negative out of bounds value for i causes an error.
>
> A special case is the zero index, which has null effects: x[0] is an
> empty vector and otherwise including zeros among positive or negative
> indices has the same effect as if they were omitted."

> However, that "A negative out of bounds value for i causes an error"
> in the second paragraph does not seem to apply.  Instead, R silently
> ignores negative indices that are out of range.  For example:
>
> > x <- 1:4
> > x[-9L]
> [1] 1 2 3 4
> > x[-c(1:9)]
> integer(0)
> > x[-c(3:9)]
> [1] 1 2
>
> > y <- as.list(1:4)
> > y[-c(1:9)]
> list()
>
> Is the observed non-error the correct behavior and therefore the
> documentation is incorrect, or is it vice versa?  (...or is it me
> missing something?)
>
> I get the above on R devel, R 3.2.0, and as far back as R 2.11.0
> (haven't checked earlier versions).

Thank you, Henrik!

I've checked further back: The change happened between R 2.5.1 and R 2.6.0.

The previous behavior was

  > (1:3)[-(3:5)]
  Error: subscript out of bounds

If you start reading NEWS.2, you see a *lot* of new features
(and bug fixes) in the 2.6.0 news, but from my browsing, none of
them mentions the new behavior as a feature.

Let's -- for a moment -- declare it a bug in the code, i.e., not
in the documentation:

- As 2.6.0  happened quite a while ago (Oct. 2007),  
  we could wonder how much R code will break if we fix the bug.

- Is the R package authors' community willing to do the necessary
  cleanup in their packages?

            


Now, after reading the source code for a while, and looking at
the changes, I've found the log entry


r42123 | ihaka | 2007-07-05 02:00:05 +0200 (Thu, 05 Jul 2007) | 4 lines

Changed the behaviour of out-of-bounds negative
subscripts to match that of S.  Such values are
now ignored rather than tripping an error.



So, it was changed on purpose, by one of the true Rs, very
much on purpose.

Making it a *warning* instead of the original error
may have been both more cautious and more helpful for
detecting programming errors.
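
To illustrate what such a warning could look like in user code, here is a rough sketch. The helper `drop_at()` is hypothetical and not part of base R; it takes the positions to drop as non-negative numbers and warns about those that do not exist in x:

```r
# Hypothetical helper (an assumption for illustration, not base R):
# drop elements of x by position, warning about positions beyond length(x).
drop_at <- function(x, i) {
  stopifnot(is.numeric(i), !anyNA(i), all(i >= 0))
  out_of_range <- i > length(x)
  if (any(out_of_range)) {
    warning("ignoring out-of-range positions: ",
            paste(i[out_of_range], collapse = ", "))
  }
  # Logical indexing avoids the x[-integer(0)] pitfall, which would
  # return an empty vector when nothing is left to drop.
  x[!(seq_along(x) %in% i)]
}

drop_at(1:4, 3:9)   # drops positions 3 and 4, warns about 5 to 9
```

Positions are matched with `%in%` instead of negated so that an empty set of valid positions leaves x unchanged.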

OTOH, John Chambers, the father of S and hence grandfather of R,
may have had good reasons why it seemed more logical to silently
ignore such out of bound negative indices:
One could argue that

   x[-5]  means  "leave away the 5-th element of x"

and if there is no 5-th element of x, leaving it away should be a no-op.

After all this musing and history detection, my gut decision
would be to only change the documentation which Ross forgot to change.

But of course, it may be interesting to hear other programmeRs' feedback on
this.

Martin



Re: [Rd] Shouldn't vector indexing with negative out-of-range index give an error?

2015-05-05 Thread John Chambers
When someone suggests that we might have had a reason for some peculiarity in
the original S, my usual reaction is "Or else we never thought of the problem".

In this case, however, there is a relevant statement in the 1988 blue book.
In the discussion of subscripting (p 358) the definition for negative i says:
"the indices consist of the elements of seq(along=x) that do not match any
elements in -i".

Suggesting that no bounds checking on -i takes place.
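
Read literally, that definition can be sketched in a few lines of R. This is only an illustration of the quoted sentence, not the actual implementation; `seq_along(x)` is used as the modern spelling of `seq(along=x)`, and the example values are assumptions:

```r
x <- 1:4
i <- -c(3:9)          # positions 5..9 do not exist in x

# Positions of x that do not match any element of -i.
keep <- seq_along(x)[!(seq_along(x) %in% -i)]

x[keep]               # 1 2
identical(x[keep], x[i])
```

Since matching against -i never fails for values beyond length(x), no bounds check arises.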

John


