[Rd] STRING_IS_SORTED claims as.character(1:100) is sorted

2018-11-15 Thread Michael Sannella via R-devel
If I have loaded the C code:
SEXP altrep_STRING_IS_SORTED(SEXP x)
{
    return ScalarInteger(STRING_IS_SORTED(x));
}
and defined the function:
issort <- function(x) .Call("altrep_STRING_IS_SORTED",x)

I am seeing the following results in R 3.5.1/Linux:
> issort(LETTERS)
[1] NA
> issort(as.character(1:100))  ## should return NA
[1] 1
> issort(as.character(100:1))  ## should return NA
[1] -1
> issort(as.character(1:100+1L))
[1] NA

issort(as.character(1:100)) should return NA, since the string vector
"1","2",..."10",... is not sorted as strings ("10" collates before
"2").  I suspect that the problem is that the Is_sorted method for
deferred_string is just calling the Is_sorted method for the source
object 1:100 (which _is_ a sorted integer vector).  It should probably
just return NA for any source object.
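
To make the mismatch concrete, here is a minimal standalone C sketch (not R's code). It assumes plain strcmp byte order as a stand-in for R's string collation, and the helper names ints_sorted and int_strings_sorted are made up for illustration. It shows that a sorted integer vector need not stay sorted once each element is formatted as text.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if the integers are in non-decreasing order, else 0. */
static int ints_sorted(const int *v, int n)
{
    assert(n >= 0);
    for (int i = 1; i < n; i++)
        if (v[i - 1] > v[i])
            return 0;
    return 1;
}

/* Formats each integer as decimal text and returns 1 if the resulting
   strings are in non-decreasing strcmp order, else 0.
   Assumption: strcmp stands in for R's collation. */
static int int_strings_sorted(const int *v, int n)
{
    char prev[16], cur[16];
    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        snprintf(cur, sizeof cur, "%d", v[i]);
        if (i > 0 && strcmp(prev, cur) > 0)
            return 0;
        strcpy(prev, cur);
    }
    return 1;
}
```

For the input {1, 2, 9, 10}, ints_sorted reports sorted while int_strings_sorted does not, because "9" compares greater than "10".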

  ~~ Michael Sannella

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] error unserializing ascii format (v2 or v3)

2018-11-07 Thread Michael Sannella via R-devel
I ran into an interesting error unserializing a file created with
ascii=TRUE:

R 3.5.1 (Windows or Linux):
> unserialize(serialize(list(raw=as.raw(c(39,41))), NULL, version=2,
+             ascii=TRUE))
Error in unserialize(serialize(list(raw = as.raw(c(39, 41))), NULL, version = 2,  :
  ReadItem: unknown type 29, perhaps written by later version of R

The same error happens when the serialization is done with version=2
or version=3.  It does not happen if the serialization is done with
ascii=FALSE.

Note that 0x29 == 41, the value of the second raw byte.  It looks like
unserialize is reading the wrong line, taking the text of that byte as
an item type.
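
The arithmetic behind that observation can be checked with a small standalone C sketch. It is not R's reader code: it assumes the ascii format writes each raw byte as two-digit hex text, and parse_in_base is a hypothetical helper. The same text "29" is 41 when read as hex and 29 when read as decimal, which matches the type number in the error message.

```c
#include <assert.h>
#include <stdlib.h>

/* Parses text as an integer in the given base (10 or 16). */
static long parse_in_base(const char *text, int base)
{
    assert(base == 10 || base == 16);
    return strtol(text, NULL, base);
}
```

Under that assumption, parse_in_base("29", 16) recovers the byte value 41, while parse_in_base("29", 10) gives the 29 that ReadItem complains about.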

I tried this in earlier versions of R on Windows, and the same error
happens in every version from R-2.15.3 (the earliest I have) on up.

  ~~ Michael Sannella



[Rd] R_ext/Altrep.h should be more C++-friendly

2018-10-09 Thread Michael Sannella via R-devel
I am not able to #include "R_ext/Altrep.h" from a C++ file.  I think
it needs two changes:

1. add the same __cplusplus check as most of the other header files:
#ifdef  __cplusplus
extern "C" {
#endif
...
#ifdef  __cplusplus
}
#endif

2. change the line
    R_new_altrep(R_altrep_class_t class, SEXP data1, SEXP data2);
to
    R_new_altrep(R_altrep_class_t cls, SEXP data1, SEXP data2);
since 'class' is a reserved keyword in C++ and cannot be used as an
argument name.
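
The two changes together can be sketched as a small file that compiles as either C or C++. The type demo_class_t and the function demo_new_object are hypothetical stand-ins, not R's real declarations.

```c
#include <assert.h>

/* Hypothetical stand-in for an opaque class handle; not R's real type. */
typedef struct { int id; } demo_class_t;

#ifdef __cplusplus
extern "C" {
#endif

/* The parameter is named 'cls' rather than 'class', so this
   declaration parses under both C and C++ compilers. */
int demo_new_object(demo_class_t cls, int data1, int data2);

#ifdef __cplusplus
}
#endif

/* Trivial definition so the sketch links: it just sums its inputs. */
int demo_new_object(demo_class_t cls, int data1, int data2)
{
    assert(cls.id >= 0);
    return cls.id + data1 + data2;
}
```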

  ~~ Michael Sannella



[Rd] bug with OutDec option and deferred_string altrep object

2018-10-09 Thread Michael Sannella via R-devel
While implementing R's new 'altrep' functionality in the TERR engine,
I discovered a bug in R's 'deferred_string' altrep object: it is not
using the correct value of the 'OutDec' option when it expands a
deferred_string.  See the following example:

R 3.5.1: (same results in R 3.6.0 devel engine built 10/5)
> options(scipen=0, OutDec=".")
> as.character(123.456)
[1] "123.456"
> options(scipen=-5, OutDec=",")
> as.character(123.456)
[1] "1,23456e+02"
> xx <- as.character(123.456)
> options(scipen=0, OutDec=".")
> xx
[1] "1.23456e+02"
>

In the example above, the variable 'xx' is set to a deferred_string
while OutDec is ','.  However, when the string is actually formatted
(when xx is printed), it uses the current option value OutDec='.' to
format the string.  I think that deferred_string should use the value
OutDec=',' from when as.character was called.

Note that the behavior is different with the 'scipen' option: The
deferred_string object records the scipen=-5 value when as.character
is called, and uses this value when xx is printed.  Looking at the
deferred_string object, it appears that CDR(R_altrep_data1()) is
set to a scalar integer containing the scipen value at the time the
deferred_string was created.

Ideally, the deferred_string object would save both the scipen and
OutDec option values.  I'd suggest saving these values as regular
pairlist values, say by setting the data1 field to pairlist(,
scipen=-5L, OutDec=',') for the value of xx above.  To save space, you
could avoid saving these values in the common case where scipen=0L,
OutDec='.'.  It would also be better if the data1 field was a
well-formed pairlist; the current value of the data1 field causes
R_inspect to segfault.
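
As a rough illustration of the "save the option at creation time" idea, here is a standalone C sketch. It is not the deferred_string implementation: the global current_out_dec stands in for the OutDec option, the deferred_num struct and both functions are hypothetical, and the fixed "%.5e" format is chosen only to reproduce the example value above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the session-wide OutDec option. */
static char current_out_dec = '.';

/* A deferred conversion that records the option in effect at creation. */
typedef struct {
    double value;
    char out_dec;   /* decimal mark captured when the object was made */
} deferred_num;

static deferred_num make_deferred(double value)
{
    deferred_num d = { value, current_out_dec };
    return d;
}

/* Expands to text using the captured mark, not current_out_dec.
   Assumes the default "C" locale, so snprintf emits '.'. */
static void expand_deferred(const deferred_num *d, char *buf, size_t size)
{
    assert(size > 0);
    snprintf(buf, size, "%.5e", d->value);
    char *dot = strchr(buf, '.');
    if (dot != NULL)
        *dot = d->out_dec;
}
```

If the object is made while current_out_dec is ',' and expanded after it is reset to '.', the expanded text still uses ',', which is the behavior suggested for xx.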

I understand that you probably wouldn't want to change the
deferred_string structure.  An alternative fix would be to avoid this
case by:
  1. Never create a deferred_string if OutDec is not '.'.
  2. When expanding an element of a deferred_string, temporarily set
OutDec to '.'.
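
Step 2 of that alternative amounts to a save/override/restore pattern, sketched here in standalone C. The global session_out_dec and both functions are hypothetical stand-ins, and the "%.5e" format is only for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the OutDec option. */
static char session_out_dec = '.';

/* Formats using whatever session_out_dec currently is.
   Assumes the default "C" locale, so snprintf emits '.'. */
static void format_current(double value, char *buf, size_t size)
{
    assert(size > 0);
    snprintf(buf, size, "%.5e", value);
    char *dot = strchr(buf, '.');
    if (dot != NULL)
        *dot = session_out_dec;
}

/* Expands an element with the mark forced to '.', then restores it. */
static void expand_with_dot(double value, char *buf, size_t size)
{
    char saved = session_out_dec;
    session_out_dec = '.';
    format_current(value, buf, size);
    session_out_dec = saved;
}
```

With this approach the expanded text does not depend on the option value at expansion time, and the caller's setting is left unchanged afterwards.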

  ~~ Michael Sannella
