[Rd] STRING_IS_SORTED claims as.character(1:100) is sorted
If I have loaded the C code:

    SEXP altrep_STRING_IS_SORTED(SEXP x)
    {
        return ScalarInteger(STRING_IS_SORTED(x));
    }

and defined the function:

    issort <- function(x) .Call("altrep_STRING_IS_SORTED", x)

I am seeing the following results in R 3.5.1/Linux:

    > issort(LETTERS)
    [1] NA
    > issort(as.character(1:100)) ## should return NA
    [1] 1
    > issort(as.character(100:1)) ## should return NA
    [1] -1
    > issort(as.character(1:100+1L))
    [1] NA

issort(as.character(1:100)) should return NA, since the string vector "1","2",..."10",... is not sorted.

I suspect that the problem is that the Is_sorted method for deferred_string is just calling the Is_sorted method for the source object 1:100 (which _is_ a sorted integer vector). It should probably just return NA for any source object.

~~ Michael Sannella

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
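To illustrate why the sortedness of the integer source says nothing about the sortedness of its string form, here is a minimal sketch in plain C. It does not use R's API: the helper names are hypothetical, strcmp stands in for R's string collation (an assumption), and 0 is used here to mean "neither ascending nor descending", where the report expects NA.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: 1 if x is non-decreasing, -1 if non-increasing,
   0 if neither. Not R's internal sortedness API. */
static int int_sortedness(const int *x, size_t n)
{
    int asc = 1, desc = 1;
    assert(x != NULL);
    for (size_t i = 1; i < n; i++) {
        if (x[i - 1] > x[i]) asc = 0;
        if (x[i - 1] < x[i]) desc = 0;
    }
    return asc ? 1 : (desc ? -1 : 0);
}

/* Same check applied to the decimal string form of each integer.
   strcmp is an assumed stand-in for R's collation order. */
static int decimal_string_sortedness(const int *x, size_t n)
{
    char prev[16] = "", cur[16];
    int asc = 1, desc = 1;
    assert(x != NULL);
    for (size_t i = 0; i < n; i++) {
        snprintf(cur, sizeof cur, "%d", x[i]);
        if (i > 0) {
            int c = strcmp(prev, cur);
            if (c > 0) asc = 0;   /* e.g. "9" sorts after "10" */
            if (c < 0) desc = 0;
        }
        strcpy(prev, cur);
    }
    return asc ? 1 : (desc ? -1 : 0);
}
```

For the values 1..100, int_sortedness reports ascending while decimal_string_sortedness reports neither, because "9" compares greater than "10". This is the mismatch that forwarding the source's answer would hide.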
[Rd] error unserializing ascii format (v2 or v3)
I ran into an interesting error unserializing a file created with ascii=TRUE:

R 3.5.1 (Windows or Linux):

    > unserialize(serialize(list(raw=as.raw(c(39,41))), NULL, version=2, ascii=TRUE))
    Error in unserialize(serialize(list(raw = as.raw(c(39, 41))), NULL, version = 2, :
      ReadItem: unknown type 29, perhaps written by later version of R

The same error happens when the serialization is done with version=2 or version=3. It does not happen if the serialization is done with ascii=FALSE.

Note that 0x29 == 41. It looks like unserialize is reading the wrong line.

I tried this in earlier versions of R on Windows, and the same error happens in every version from R-2.15.3 (the earliest I have) on up.

~~ Michael Sannella
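The "0x29 == 41" observation can be made concrete with a small sketch in plain C. It assumes, without checking R's source, that the ascii stream holds each raw byte as a two-digit hex token; the helper names are hypothetical. If such a token is then read as a decimal integer, byte 41 turns into the number 29 that appears in the error message.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Write one byte as a two-digit hex token (assumed on-disk form of a raw
   byte in the ascii format; this is an assumption, not R's code). */
static void byte_to_hex_token(unsigned char b, char out[3])
{
    assert(out != NULL);
    snprintf(out, 3, "%02x", (unsigned)b);
}

/* Read a token the way an integer field such as a type code would be read. */
static long token_as_decimal(const char *tok)
{
    assert(tok != NULL);
    return strtol(tok, NULL, 10);
}

/* Read a token the way a raw byte should be read. */
static long token_as_hex(const char *tok)
{
    assert(tok != NULL);
    return strtol(tok, NULL, 16);
}
```

With this sketch, byte 41 becomes the token "29"; token_as_hex recovers 41, while token_as_decimal yields 29, which would be consistent with a reader that consumed the byte's line where it expected the next item's type.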
[Rd] R_ext/Altrep.h should be more C++-friendly
I am not able to #include "R_ext/Altrep.h" from a C++ file. I think it needs two changes:

1. add the same __cplusplus check as most of the other header files:

    #ifdef __cplusplus
    extern "C" {
    #endif
    ...
    #ifdef __cplusplus
    }
    #endif

2. change the line

    R_new_altrep(R_altrep_class_t class, SEXP data1, SEXP data2);

   to

    R_new_altrep(R_altrep_class_t cls, SEXP data1, SEXP data2);

   since C++ doesn't like an argument named 'class'

~~ Michael Sannella
[Rd] bug with OutDec option and deferred_string altrep object
While implementing R's new 'altrep' functionality in the TERR engine, I discovered a bug in R's 'deferred_string' altrep object: it is not using the correct value of the 'OutDec' option when it expands a deferred_string. See the following example:

R 3.5.1: (same results in R 3.6.0 devel engine built 10/5)

    > options(scipen=0, OutDec=".")
    > as.character(123.456)
    [1] "123.456"
    > options(scipen=-5, OutDec=",")
    > as.character(123.456)
    [1] "1,23456e+02"
    > xx <- as.character(123.456)
    > options(scipen=0, OutDec=".")
    > xx
    [1] "1.23456e+02"
    >

In the example above, the variable 'xx' is set to a deferred_string while OutDec is ','. However, when the string is actually formatted (when xx is printed), it uses the current option value OutDec='.' to format the string. I think that deferred_string should use the value OutDec=',' from when as.character was called.

Note that the behavior is different with the 'scipen' option: The deferred_string object records the scipen=-5 value when as.character is called, and uses this value when xx is printed. Looking at the deferred_string object, it appears that CDR(R_altrep_data1()) is set to a scalar integer containing the scipen value at the time the deferred_string was created.

Ideally, the deferred_string object would save both the scipen and OutDec option values. I'd suggest saving these values as regular pairlist values, say by setting the data1 field to pairlist(, scipen=-5L, OutDec=',') for the value of xx above. To save space, you could avoid saving these values in the common case where scipen=0L, OutDec='.'. It would also be better if the data1 field was a well-formed pairlist; the current value of the data1 field causes R_inspect to segfault.

I understand that you probably wouldn't want to change the deferred_string structure. An alternative fix would be to avoid this case by:

1. Never create a deferred_string if OutDec is not '.'.
2. When expanding an element of a deferred_string, temporarily set OutDec to '.'.
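
The difference between capturing OutDec at creation time and reading it at expansion time can be sketched in plain C. This is only an illustration under simplifying assumptions, not R's implementation: current_outdec, struct deferred_number and the helper functions are hypothetical names, scipen is not modeled, and the output is fixed to "%.5e" so that it matches the example value above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the session-wide OutDec option. */
static char current_outdec = '.';

/* A deferred conversion that remembers the decimal mark in effect
   when it was created. */
struct deferred_number {
    double value;
    char outdec; /* captured at creation time */
};

static struct deferred_number defer_number(double value)
{
    struct deferred_number d = { value, current_outdec };
    return d;
}

/* Format in scientific notation, then swap in the requested decimal mark.
   Assumes the default "C" locale, where printf emits '.'. */
static void format_sci(double value, char outdec, char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    snprintf(buf, size, "%.5e", value);
    char *dot = strchr(buf, '.');
    if (dot != NULL)
        *dot = outdec;
}

/* Suggested behavior: expand using the mark captured at creation. */
static void expand_captured(const struct deferred_number *d,
                            char *buf, size_t size)
{
    format_sci(d->value, d->outdec, buf, size);
}

/* Reported behavior: expand using whatever the option is right now. */
static void expand_current(const struct deferred_number *d,
                           char *buf, size_t size)
{
    format_sci(d->value, current_outdec, buf, size);
}
```

If the object is created while current_outdec is ',' and the option is then reset to '.', expand_captured gives "1,23456e+02" and expand_current gives "1.23456e+02", which mirrors the two results in the session above.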

~~ Michael Sannella