Thank you, Henrik! This saves us a lot of time!

Uwe

Henrik Bengtsson wrote:
> On 01/01/2008, Henrik Bengtsson <[EMAIL PROTECTED]> wrote:
>> Also make sure the problem is not due to downloading a gzip file in
>> text mode, because to the best of my understanding that is platform
>> dependent. That is, use download.file(..., mode="wb") instead of the
>> default, which is mode="w". (This is such a common error that I would
>> like to suggest mode="wb" to become the default.)
>
> Ok, that solves the problem with your example file. On WinXP/R v2.6.1:
>
> > library(R.utils)
> > uri <- "ftp://ftp.ncbi.nih.gov/pub/geo/DATA/SeriesMatrix/GSE1/GSE1_series_matrix.txt.gz"
>
> > download.file(uri, "test.txt.gz")  # mode="w"
> trying URL 'ftp://ftp.ncbi.nih.gov/pub/geo/DATA/SeriesMatrix/GSE1/GSE1_series_matrix.txt.gz'
> ftp data connection made, file length 918804 bytes
> opened URL
> downloaded 897 Kb
> > file.info("test.txt.gz")$size
> [1] 922243
>
> > download.file(uri, "test2.txt.gz", mode="wb")
> ftp data connection made, file length 918804 bytes
> opened URL
> downloaded 897 Kb
> > file.info("test2.txt.gz")$size
> [1] 918804
>
> > gunzip("test.txt.gz")
> Error in readBin(inn, what = raw(0), size = 1, n = BFR.SIZE) :
>   negative length vectors are not allowed
> > gunzip("test2.txt.gz")
> > file.info("test2.txt")$size
> [1] 3338362
>
> /H
>
>> /Henrik
>>
>> On 01/01/2008, Uwe Ligges <[EMAIL PROTECTED]> wrote:
>>> I see. It is either a bug or something related to the following
>>> paragraph from ?seek:
>>>
>>>   We have found so many errors in the Windows implementation of file
>>>   positioning that users are advised to use it only at their own
>>>   risk, and asked not to waste the R developers' time with bug
>>>   reports on Windows' deficiencies.
>>>
>>> I will investigate more closely when I am back in the office at the
>>> end of this week.
>>>
>>> Best,
>>> Uwe
>>>
>>> Sean Davis wrote:
>>>> Sorry, Uwe.
>>>> Of course:
>>>>
>>>> Both in relatively recent R-devel (one mac, one windows):
>>>>
>>>> ### gunzip pulled from R.utils to be a simple function
>>>> ### In R.utils, implemented as a method
>>>> gunzip <- function(filename, destname=gsub("[.]gz$", "", filename),
>>>>                    overwrite=FALSE, remove=TRUE, BFR.SIZE=1e7) {
>>>>   if (filename == destname)
>>>>     stop(sprintf("Argument 'filename' and 'destname' are identical: %s",
>>>>                  filename));
>>>>   if (!overwrite && file.exists(destname))
>>>>     stop(sprintf("File already exists: %s", destname));
>>>>
>>>>   inn <- gzfile(filename, "rb");
>>>>   on.exit(if (!is.null(inn)) close(inn));
>>>>
>>>>   out <- file(destname, "wb");
>>>>   on.exit(close(out), add=TRUE);
>>>>
>>>>   nbytes <- 0;
>>>>   repeat {
>>>>     bfr <- readBin(inn, what=raw(0), size=1, n=BFR.SIZE);
>>>>     n <- length(bfr);
>>>>     if (n == 0)
>>>>       break;
>>>>     nbytes <- nbytes + n;
>>>>     writeBin(bfr, con=out, size=1);
>>>>   };
>>>>
>>>>   if (remove) {
>>>>     close(inn);
>>>>     inn <- NULL;
>>>>     file.remove(filename);
>>>>   }
>>>>
>>>>   invisible(nbytes);
>>>> }
>>>> download.file('ftp://ftp.ncbi.nih.gov/pub/geo/DATA/SeriesMatrix/GSE1/GSE1_series_matrix.txt.gz',
>>>>               'test.txt.gz')
>>>> gunzip('test.txt.gz')
>>>>
>>>> Under windows, this results in the error reported below. Under mac and
>>>> linux, it results in test.txt being created in the current working
>>>> directory. The actual gunzip function is pretty bare bones, so I don't
>>>> think it complicates matters much to use it in this example.
>>>>
>>>> Sean
>>>>
>>>> On Dec 31, 2007 1:24 PM, Uwe Ligges <[EMAIL PROTECTED]> wrote:
>>>>
>>>> Can you give a reproducible example, please?
>>>>
>>>> Uwe Ligges
>>>>
>>>> Sean Davis wrote:
>>>> > I have been trying to use the gunzip function in the R.utils package.
>>>> > It opens a connection to a gzfile, uses readBin to read from that
>>>> > connection, and then uses writeBin to write out the raw data to a
>>>> > new file. This works as expected under linux/mac, but under
>>>> > Windows, I get:
>>>> >
>>>> > Error in readBin(inn, what= raw(0), size = 1, n=BFR.SIZE) :
>>>> >   negative length vectors are not allowed
>>>> >
>>>> > A simple traceback shows the error in readBin. I wouldn't be
>>>> > surprised if this is a programming issue not located in readBin,
>>>> > but I am confused about the difference in behaviors on Windows
>>>> > versus mac/linux. Any insight into what I can do to remedy the
>>>> > issue and have a cross-platform gunzip()?
>>>> >
>>>> > Thanks,
>>>> > Sean
>>>> >
>>>> > ______________________________________________
>>>> > R-devel@r-project.org mailing list
>>>> > https://stat.ethz.ch/mailman/listinfo/r-devel
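
For anyone reading this thread later: the size difference reported above (922243 versus 918804 bytes, i.e. 3439 extra bytes) is consistent with a text-mode write on Windows turning each LF byte in the compressed stream into CR LF. The sketch below illustrates that effect on a small locally generated gzip file, so it needs no network access. The helper `to_crlf()` and the temporary file are hypothetical names introduced only for this illustration; it mimics the translation and is not what download.file() does internally.

```r
# Build a gzip file locally from reproducible pseudo-random text.
set.seed(1)
src <- tempfile(fileext = ".txt.gz")
con <- gzfile(src, "w")
writeLines(format(runif(20000), digits = 15), con)
close(con)

# Read the compressed bytes exactly as stored on disk.
good <- readBin(src, what = "raw", n = file.info(src)$size)

# Hypothetical helper: mimic a Windows text-mode write by inserting a
# CR byte (0x0d) in front of every LF byte (0x0a).
to_crlf <- function(bytes) {
  is_lf <- bytes == as.raw(0x0a)
  out <- raw(length(bytes) + sum(is_lf))
  pos <- seq_along(bytes) + cumsum(is_lf)  # new position of each original byte
  out[pos] <- bytes
  out[pos[is_lf] - 1L] <- as.raw(0x0d)     # fill the slot before each LF
  out
}

bad <- to_crlf(good)

# The "text mode" copy grows by one byte per LF in the compressed data,
# which is enough to break the gzip stream.
cat("original bytes:  ", length(good), "\n")
cat("after LF -> CRLF:", length(bad), "\n")
cat("LF bytes found:  ", sum(good == as.raw(0x0a)), "\n")
```

In practice the remedy is the one Henrik gives: pass mode="wb" to download.file(), and compare file.info(destfile)$size with the length reported by the server when in doubt.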