Thanks for the replies. I take the point, although it does seem like a substantial regression (on non-Windows platforms).
I like to keep the external dependencies of my packages minimal, but I will look into the mmap package - thanks, Jeff, for the tip.

Aside from that, though, what is the alternative to using seek? If I want to read something at (original, uncompressed) byte offset 352, as here, do I have to read and discard everything that comes before it first? That seems inelegant at best...

Regards,
Jon

On 23 September 2011 16:54, Jeffrey Ryan <jeffrey.r...@lemnica.com> wrote:
> seek() in general is a bad idea IMO if you are writing cross-platform code.
>
> ?seek
>
> Warning:
>
> Use of ‘seek’ on Windows is discouraged. We have found so many
> errors in the Windows implementation of file positioning that
> users are advised to use it only at their own risk, and asked not
> to waste the R developers' time with bug reports on Windows'
> deficiencies.
>
> Aside from making me laugh, the above highlights the core reason to not use
> IMO.
>
> For not zipped files, you can try the mmap package. ?mmap and ?types
> are good starting points. Allows for accessing binary data on disk
> with very simple R-like semantics, and is very fast. Not as fast as a
> sequential read... but fast. At present this is 'little endian' only
> though, but that describes most of the world today.
>
> Best,
> Jeff
>
> On Fri, Sep 23, 2011 at 8:58 AM, Jon Clayden <jon.clay...@gmail.com> wrote:
>> Dear all,
>>
>> In R-devel (2011-09-23 r57050), I'm running into a serious problem
>> with seek()ing on connections opened with gzfile(). A warning is
>> generated and the file position does not seek to the requested
>> location. It doesn't seem to occur all the time - I tried to create a
>> small example file to illustrate it, but the problem didn't occur.
>> However, it can be seen with a file I use for testing my packages,
>> which is available through the URL
>> <https://github.com/jonclayden/tractor/blob/master/tests/data/nifti/maskedb0_lia.nii.gz?raw=true>:
>>
>> > con <- gzfile("~/Downloads/maskedb0_lia.nii.gz","rb")
>> > seek(con, 352)
>> [1] 0
>> Warning message:
>> In seek.connection(con, 352) :
>> seek on a gzfile connection returned an internal error
>> > seek(con, NA)
>> [1] 190
>>
>> The same commands with the same file work as expected in R 2.13.1, and
>> have worked over many previous versions of R.
>>
>> > con <- gzfile("~/Downloads/maskedb0_lia.nii.gz","rb")
>> > seek(con, 352)
>> [1] 0
>> > seek(con, NA)
>> [1] 352
>>
>> My sessionInfo() output is:
>>
>> R Under development (unstable) (2011-09-23 r57050)
>> Platform: x86_64-apple-darwin11.1.0 (64-bit)
>>
>> locale:
>> [1] en_GB.UTF-8/en_US.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
>>
>> attached base packages:
>> [1] splines stats graphics grDevices utils datasets methods
>> [8] base
>>
>> other attached packages:
>> [1] tractor.nt_2.0.1 tractor.session_2.0.3 tractor.utils_2.0.0
>> [4] tractor.base_2.0.3 reportr_0.2.0
>>
>> This seems to occur whether or not R is compiled with
>> "--with-system-zlib". I see some zlib-related changes mentioned in the
>> NEWS, but I don't see any indication that this is expected. Could
>> anyone shed any light on it, please?
>>
>> Thanks and all the best,
>> Jon
>>
>> ______________________________________________
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>
> --
> Jeffrey Ryan
> jeffrey.r...@lemnica.com
>
> www.lemnica.com
> www.esotericR.com

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
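
For illustration, the read-and-discard approach asked about in the reply at the top of this thread could look roughly like the following. This is only a minimal sketch, not something proposed by anyone in the thread: `skip_bytes` is a hypothetical helper name, the chunk size is an arbitrary choice, and a small temporary gzip file stands in for the NIfTI test file.

```r
# Hypothetical helper (not part of base R or any package named in this thread):
# advance a read-mode connection by `offset` uncompressed bytes without seek(),
# by reading the bytes in chunks and throwing them away.
skip_bytes <- function(con, offset, chunk_size = 65536L) {
  remaining <- offset
  while (remaining > 0) {
    n <- min(remaining, chunk_size)
    got <- length(readBin(con, what = "raw", n = n))
    if (got == 0L) stop("reached end of file before the requested offset")
    remaining <- remaining - got
  }
  invisible(offset)
}

# Stand-in data: 512 bytes in which each byte's value is its offset modulo 256.
path <- tempfile(fileext = ".gz")
out <- gzfile(path, "wb")
writeBin(as.raw(0:255), out)
writeBin(as.raw(0:255), out)
close(out)

# Read one byte at uncompressed offset 352 without calling seek().
con <- gzfile(path, "rb")
skip_bytes(con, 352)
value <- readBin(con, what = "raw", n = 1L)
close(con)

print(value)
```

Reading in bounded chunks keeps memory use flat for large offsets. The cost is that everything before the offset still has to be decompressed, which is exactly the inelegance the question raises.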