[Rd] multiple issues with is.unsorted()

2013-04-24 Thread Hervé Pagès


Hi,

In the man page for is.unsorted():

  Value:

     A length-one logical value.  All objects of length 0 or 1 are
     sorted: the result will be ‘NA’ for objects of length 2 or more
     except for atomic vectors and objects with a class (where the ‘>=’
     or ‘>’ method is used to compare ‘x[i]’ with ‘x[i-1]’ for ‘i’ in
     ‘2:length(x)’).

This contains many incorrect statements:

  > length(NA)
  [1] 1
  > is.unsorted(NA)
  [1] NA
  > length(list(NA))
  [1] 1
  > is.unsorted(list(NA))
  [1] NA

==> Contradicts "all objects of length 0 or 1 are sorted".

  > is.unsorted(raw(2))
  Error in is.unsorted(raw(2)) : unimplemented type 'raw' in 'isUnsorted'

==> Doesn't agree with the doc (unless "except for atomic vectors"
    means "it might fail for atomic vectors").

  > setClass("A", representation(aa="integer"))
  > a <- new("A", aa=4:1)
  > length(a)
  [1] 1

  > is.unsorted(a)
  [1] FALSE
  Warning message:
  In is.na(x) : is.na() applied to non-(list or vector) of type 'S4'

==> Ok, but it's arguable whether the warning is useful/justified from a
    user point of view. The warning *seems* to suggest that defining an
    is.na method for my objects is required for is.unsorted() to
    work properly, but the doc doesn't make this clear.

Anyway, let's define one, so the warning goes away:

  > setMethod("is.na", "A", function(x) is.na(x@aa))
  [1] "is.na"

Let's define a length method:

  > setMethod("length", "A", function(x) length(x@aa))
  [1] "length"
  > length(a)
  [1] 4

  > is.unsorted(a)
  [1] FALSE

==> Is this correct? Hard to know. The doc is not clear about what
    should happen for objects of length 2 or more and with a class
    but with no >= or > methods.

Let's define [, >=, and >:

  > setMethod("[", "A", function(x, i, j, ..., drop=TRUE) new("A", aa=x@aa[i]))
  [1] "["
  > rev(a)
  An object of class "A"
  Slot "aa":
  [1] 1 2 3 4

  > setMethod(">=", c("A", "A"), function(e1, e2) {e1@aa >= e2@aa})
  [1] ">="
  > a >= a[3]
  [1]  TRUE  TRUE  TRUE FALSE

  > setMethod(">", c("A", "A"), function(e1, e2) {e1@aa > e2@aa})
  [1] ">"
  > a > a[3]
  [1]  TRUE  TRUE FALSE FALSE

  > is.unsorted(a)
  [1] FALSE

  > is.unsorted(rev(a))
  [1] FALSE

Still not working as expected. So what's required exactly for making
is.unsorted() work on an object with a class?

BTW, is.unsorted() would be *much* faster, at least on atomic vectors,
without those calls to is.na(). The C code could check for NAs itself,
without having to do this as a first pass over the full vector, as the
current implementation does. If the vector is unsorted, the
C code is typically able to bail out early, so the speed-up will
typically be 1x or more if the vector has millions of elements.
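
A minimal sketch of the single-pass idea described above, written in R purely for illustration (the real check lives in C). The name is_unsorted_onepass is hypothetical, and the NA handling is an assumption: it returns NA only when an NA is met before any out-of-order pair, which differs from the current behaviour of scanning the whole vector for NAs first.

```r
# Hypothetical helper, not part of base R: one pass with early exit.
is_unsorted_onepass <- function(x, strictly = FALSE) {
  n <- length(x)
  if (n <= 1L) return(FALSE)          # nothing to compare
  if (is.na(x[1L])) return(NA)
  for (i in 2:n) {
    if (is.na(x[i])) return(NA)       # NA seen before any inversion
    out_of_order <- if (strictly) x[i] <= x[i - 1L] else x[i] < x[i - 1L]
    if (out_of_order) return(TRUE)    # bail out at the first inversion
  }
  FALSE
}

is_unsorted_onepass(c(2, 1, 3))
is_unsorted_onepass(1:5)
```

With this shape, a vector such as c(2, 1, NA) is reported as unsorted after looking at a single pair, whereas a separate is.na() pass has to touch every element first.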

Thanks,
H.

> sessionInfo()
R version 3.0.0 (2013-04-03)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] tools_3.0.0

--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] multiple issues with is.unsorted()

2013-04-24 Thread Martin Maechler

Dear Herve,

Hervé Pagès <hpa...@fhcrc.org>
    on Tue, 23 Apr 2013 23:09:21 -0700 writes:

> Still not working as expected. So what's required exactly
> for making is.unsorted() work on an object with a class?

well, read the source code. :-) ;-)

More seriously: On another hidden help page, you find

  \code{.gt} and \code{.gtn} are callbacks from \code{\link{rank}} and
  \code{\link{is.unsorted}} used for classed objects.

In other words, you'd need to define a method for
 .gtn  for S4 objects in this case.

 yes, indeed I don't know why this is not at all documented.
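
Independently of the internal callbacks, the comparison rule quoted from the man page (compare x[i] with x[i-1] using >= or >) can be sketched directly with the class's own methods. The helper is_unsorted_by_methods below is hypothetical, not part of R, and it assumes the class provides length, [, >= and > methods as in the example from the original post.

```r
library(methods)

# Same toy class and methods as in the original post.
setClass("A", representation(aa = "integer"))
setMethod("length", "A", function(x) length(x@aa))
setMethod("[", "A", function(x, i, j, ..., drop = TRUE) new("A", aa = x@aa[i]))
setMethod(">=", c("A", "A"), function(e1, e2) e1@aa >= e2@aa)
setMethod(">", c("A", "A"), function(e1, e2) e1@aa > e2@aa)

# Hypothetical helper: relies only on the methods defined above.
is_unsorted_by_methods <- function(x, strictly = FALSE) {
  n <- length(x)
  if (n <= 1L) return(FALSE)
  prev <- x[-n]    # elements 1 .. n-1
  nxt  <- x[-1L]   # elements 2 .. n
  in_order <- if (strictly) nxt > prev else nxt >= prev
  !all(in_order)
}

a <- new("A", aa = 4:1)
is_unsorted_by_methods(a)        # TRUE: 4 3 2 1 is decreasing
is_unsorted_by_methods(a[4:1])   # FALSE: 1 2 3 4 is increasing
```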



> BTW, is.unsorted() would be *much* faster, at least on
> atomic vectors, without those calls to is.na().

Well, in all R versions, apart from R-devel as of yesterday,
the source of is.unsorted() has been

  is.unsorted <- function(x, na.rm = FALSE, strictly = FALSE)
  {
      if(is.null(x)) return(FALSE)
      if(!na.rm && any(is.na(x)))## FIXME is.na(<large>) is too slow
          return(NA)
      ## else
      if(na.rm && any(ii <- is.na(x)))
          x <- x[!ii]
      .Internal(is.unsorted(x, strictly))
  }

so you see the FIXME.

In R-devel  (and probably  R-patched  in the near future),
that line is

  if(!na.rm && anyMissing(x))

so there's no slow code anymore, at least not for the default
case of  na.rm = FALSE.
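
For readers on later versions: to my knowledge this helper was released as anyNA() (R >= 3.1.0) rather than under the anyMissing() name used above. A small sketch, under that assumption, of the equivalence being relied on: both forms give the same answer, but any(is.na(x)) first builds a full-length logical vector, which is the cost the FIXME refers to.

```r
# Assumes R >= 3.1.0, where anyNA() is available in base.
x <- c(NA, seq_len(1e6))

slow <- any(is.na(x))   # allocates a logical vector as long as x
fast <- anyNA(x)        # no intermediate vector needed

stopifnot(identical(slow, fast))
```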


> The C code
> could check for NAs, without having to do this as a first
> pass on the full vector like it is the case with the
> current implementation. If the vector is unsorted, the C
> code is typically able to bail out early so the speed-up
> will typically be 1x or more if the vector has millions
> of elements.

You are right (but again: the most important case, na.rm=FALSE,
has already been solved, I'd say),
but you know well that we gratefully accept good patches to
the R sources.



Re: [Rd] multiple issues with is.unsorted()

2013-04-24 Thread Martin Maechler

More comments .. see inline

Martin Maechler <maech...@stat.math.ethz.ch>
    on Wed, 24 Apr 2013 11:29:39 +0200 writes:

> Dear Herve,
> Hervé Pagès <hpa...@fhcrc.org>
>     on Tue, 23 Apr 2013 23:09:21 -0700 writes:

>> Hi, In the man page for is.unsorted():

>> Value:

>> A length-one logical value.  All objects of length 0 or 1
>> are sorted: the result will be ‘NA’ for objects of length
>> 2 or more except for atomic vectors and objects with a
>> class (where the ‘>=’ or ‘>’ method is used to compare
>> ‘x[i]’ with ‘x[i-1]’ for ‘i’ in ‘2:length(x)’).

>> This contains many incorrect statements:

>> > length(NA)
>> [1] 1
>> > is.unsorted(NA)
>> [1] NA
>> > length(list(NA))
>> [1] 1
>> > is.unsorted(list(NA))
>> [1] NA

>> ==> Contradicts "all objects of length 0 or 1 are sorted".

Ok.  I really think we should change the above.
If NA is for a missing number, it still cannot be unsorted if it
is of length one.

--> the above will give FALSE  real soon now.

>> > is.unsorted(raw(2))
>> Error in is.unsorted(raw(2)) : unimplemented type 'raw'
>> in 'isUnsorted'

>> ==> Doesn't agree with the doc (unless "except for atomic
>> vectors" means "it might fail for atomic vectors").

Well, the doc says about 'x'
|  \item{x}{an \R object with a class or a numeric, complex, character or
|logical vector.}
so strictly, is.unsorted() is not to be used on raw vectors.

However I think you have a point:
Raw vectors didn't exist when  is.unsorted()  was
invented, so were not considered back then.
Originally,  raw vectors were really almost only there for
storage, i.e. basically read and write, but now that we have
'<', '<=', '==', etc. working well for raw(),
we could allow  is.unsorted() to work, too.

Note however, that if you try to sort(raw) you also always get
an error about sort() not being implemented for raw(),...
something we could arguably reconsider, as we admitted the
relational operators ( <  <=  ==  >=  >  != ) to work.
{{anyone donating patches to R-devel for sort()ing raw ?}}
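
Until raw vectors are supported directly, one possible workaround is to go through integers, since the relational operators on raw already agree with integer ordering. This is only a sketch; the variable names are illustrative.

```r
r <- as.raw(c(3, 1, 2))

# Relational operators work on raw vectors.
r[1] > r[2]

# Check and sort via the integer representation, then convert back.
r_int    <- as.integer(r)
unsorted <- is.unsorted(r_int)
r_sorted <- as.raw(sort(r_int))
```

The round trip through as.integer() is lossless here because raw values are single bytes in the range 0 to 255.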


>> > setClass("A", representation(aa="integer"))
>> > a <- new("A", aa=4:1)
>> > length(a)
>> [1] 1

>> > is.unsorted(a)
>> [1] FALSE
>> Warning message: In is.na(x) : is.na() applied
>> to non-(list or vector) of type 'S4'

>> ==> Ok, but it's arguable the warning is useful/justified
>> from a user point of view. The warning *seems* to suggest
>> that defining an is.na method for my objects is
>> required for is.unsorted() to work properly but the doc
>> doesn't make this clear.

you are right.
We are going to improve on this, at least the documentation.


[.]

The S4 part I've already started addressing in the last reply.
(and we may get back to that.. )

[.]

Martin


Re: [Rd] multiple issues with is.unsorted()

2013-04-24 Thread Hervé Pagès


Hi Martin,

On 04/24/2013 02:29 AM, Martin Maechler wrote:

> Dear Herve,
>
>> Hervé Pagès <hpa...@fhcrc.org>
>>     on Tue, 23 Apr 2013 23:09:21 -0700 writes:


>> Still not working as expected. So what's required exactly
>> for making is.unsorted() work on an object with a class?
>
> well, read the source code. :-) ;-)
>
> More seriously: On another hidden help page, you find
>
>    \code{.gt} and \code{.gtn} are callbacks from \code{\link{rank}} and
>    \code{\link{is.unsorted}} used for classed objects.
>
> In other words, you'd need to define a method for
>   .gtn  for S4 objects in this case.


Ah, good to know.



> You are right (but again: the most important case, na.rm=FALSE,
> has already been solved, I'd say),
> but you know well that we gratefully accept good patches to
> the R sources.


Will do. Thanks!

H.





Re: [Rd] multiple issues with is.unsorted()

2013-04-24 Thread William Dunlap

>>  > is.unsorted(NA)
>>  [1] NA
>>  ==> Contradicts "all objects of length 0 or 1 are sorted".
>
> Ok.  I really think we should change the above.
> If NA is for a missing number, it still cannot be unsorted if it
> is of length one.
>
> --> the above will give FALSE  real soon now.

It depends what you are using the result of is.unsorted() for.  If you want
to know if you can save time by not calling x <- sort(x), then is.unsorted(NA)
should not say that NA is sorted, as sort(NA) has length 0.
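
A short illustration of the point about sort(NA), using only base R with its default arguments: skipping the call to sort() would leave a length-one vector where sort() would have returned an empty one.

```r
x <- NA

n_before <- length(x)          # 1
n_after  <- length(sort(x))    # 0: sort() drops NAs by default (na.last = NA)

# Keeping the NA requires asking for it explicitly.
kept <- sort(x, na.last = TRUE)
```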

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com



Re: [Rd] multiple issues with is.unsorted()

2013-04-24 Thread Hervé Pagès


Hi,

On 04/24/2013 09:27 AM, William Dunlap wrote:

>>>  > is.unsorted(NA)
>>>  [1] NA
>>>  ==> Contradicts "all objects of length 0 or 1 are sorted".
>>
>> Ok.  I really think we should change the above.
>> If NA is for a missing number, it still cannot be unsorted if it
>> is of length one.
>>
>> --> the above will give FALSE  real soon now.
>
> It depends what you are using the result of is.unsorted() for.  If you want
> to know if you can save time by not calling x <- sort(x), then is.unsorted(NA)
> should not say that NA is sorted, as sort(NA) has length 0.


Glad you mention this. This is related but actually a different issue
which is that by default is.unsorted() and sort() don't treat NAs
consistently: the former keeps them, the latter removes them. So if
you want to use is.unsorted() for deciding whether or not you're going
to call sort() (without specifying 'na.last'), you should do
'is.unsorted( , na.rm=TRUE)'.

This is why IMO 'is.unsorted( , na.rm=TRUE)' is an important use case
and should be as fast as possible.

If you want to keep NAs, you'll have to sort 'x' with either
na.last=TRUE or na.last=FALSE. So it makes a lot of sense that
is.unsorted(x) returns FALSE if x is a single NA, because, in that
case, 'x' doesn't need to be sorted.
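
The na.rm=TRUE use case can be sketched as a small guard around sort(). The name sort_if_needed is hypothetical; the sketch assumes a plain atomic vector and the default sort() behaviour of dropping NAs, so the "already sorted" branch drops them too in order to return the same thing sort() would.

```r
# Hypothetical helper: skip sort() when the non-NA values are already ordered.
sort_if_needed <- function(x) {
  if (is.unsorted(x, na.rm = TRUE)) {
    sort(x)            # default na.last = NA removes the NAs
  } else {
    x[!is.na(x)]       # already in order; just drop NAs as sort() would
  }
}

sort_if_needed(c(3, NA, 1))
sort_if_needed(c(1, NA, 3))
```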

Cheers,
H.




Re: [Rd] multiple issues with is.unsorted()

2013-04-24 Thread Hervé Pagès




On 04/24/2013 12:00 PM, Hervé Pagès wrote:


> If you want to keep NAs, you'll have to sort 'x' with either
> na.last=TRUE or na.last=FALSE. So it makes a lot of sense that
> is.unsorted(x) returns FALSE if x is a single NA, because, in that
> case, 'x' doesn't need to be sorted.


And I should add that, for that use case (want to keep NAs when
sorting), is.unsorted() is totally useless anyway because it will
return NA if 'x' has length >= 2 and contains NAs :-/
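
For that keep-the-NAs use case, a check can be built on top of is.unsorted(). The helper is_unsorted_na_last below is hypothetical and assumes the target order is the one produced by sort(x, na.last=TRUE): non-NA values in order, followed by all the NAs.

```r
# Hypothetical helper: is 'x' NOT already in sort(x, na.last = TRUE) order?
is_unsorted_na_last <- function(x, strictly = FALSE) {
  na   <- is.na(x)
  n_na <- sum(na)
  # Every NA must sit in the trailing n_na positions.
  if (n_na > 0L && !all(utils::tail(na, n_na))) return(TRUE)
  # The remaining values contain no NA, so is.unsorted() cannot return NA.
  is.unsorted(x[!na], strictly = strictly)
}

is_unsorted_na_last(c(1, 2, NA))
is_unsorted_na_last(c(1, NA, 2))
```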

Cheers,
H.


