Dear R developers,

I have discovered a bug in the implementation of lzma decompression in memDecompress(). It is only triggered if the uncompressed size of the content is more than 3 times as large as the compressed content. Here's a simple example to reproduce it:

    n <- 200
    char <- paste(replicate(n, "1234567890"), collapse="")

    # Round trip of a character string; the two lengths should be equal.
    char.comp <- memCompress(char, type="xz")
    char.dec <- memDecompress(char.comp, type="xz", asChar=TRUE)
    nchar(char.dec) == nchar(char)

    # Round trip of a serialized object; the two lengths should be equal
    # and unserialize() should be able to read the result.
    raw <- serialize(char, connection=NULL)
    raw.comp <- memCompress(raw, type="xz")
    raw.dec <- memDecompress(raw.comp, type="xz")
    length(raw.dec) == length(raw)
    char.uns <- unserialize(raw.dec)

The root cause seems to be that lzma_code() returns LZMA_OK even if it could not decompress the whole content. In that case strm.avail_in is greater than zero. The following patch changes the respective if statements:

http://www.statistik.tu-dortmund.de/~olafm/temp/memdecompress.patch

It also contains a small fix from the xz upstream for an uninitialized field in lzma_stream.

Cheers,
Olaf

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
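
For illustration, the failure mode described in the report (a decoder that reports success although part of the stream is still undecoded) can be reproduced outside R. The sketch below is not R's code and not the patch; it uses Python's standard lzma module, which wraps liblzma, as a stand-in. The output cap of 3 times the compressed size is an assumption taken from the threshold mentioned in the report, and the variable names are hypothetical.

```python
import lzma

# Same payload as the R example: 2000 highly repetitive bytes.
data = b"1234567890" * 200
comp = lzma.compress(data, format=lzma.FORMAT_XZ)

# Assumption for illustration: the output buffer is capped at 3x the
# compressed size, mirroring the threshold reported above.
limit = 3 * len(comp)

dec = lzma.LZMADecompressor(format=lzma.FORMAT_XZ)
first = dec.decompress(comp, max_length=limit)

# No error was raised, yet the stream is not finished: treating this
# first result as the complete output would silently truncate the data.
truncated = (not dec.eof) and len(first) < len(data)

# Keep draining until the decoder reports the end of the stream.
# needs_input is False while the decoder still holds undecoded data.
chunks = [first]
while not dec.eof and not dec.needs_input:
    chunks.append(dec.decompress(b"", max_length=limit))
result = b"".join(chunks)
```

The point of the sketch is the loop condition: a successful return from a single decode call is not enough, and the caller has to check for the end-of-stream state (or for remaining input) before trusting the output.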