Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-12 Thread Joris Meys
I can confirm that Excel does all kinds of strange things when opening a csv
file and saving it from Excel, including unnecessarily adding another set
of quotes around already quoted text fields. But I never had problems with
Excel not handling Linux-type line endings correctly. I'll see if I can make
Excel mess it up, but given the amount of Excel crap I had to endure over
the years, I'd be surprised if I missed such behaviour until now.

Cheers
Joris

On Wed, May 9, 2018 at 3:09 PM, peter dalgaard  wrote:

> There was a hint in the Twitterverse that Excel has issues with line
> endings in .csv. Can anyone elaborate on that? Then again, Excel goes
> belly-up on comma separators in central European locales anyway...
>
> -pd
>
> > On 8 May 2018, at 22:47 , Hadley Wickham  wrote:
> >
> >
> > Also note that MS just announced support for unix line endings in notepad
> >
> > https://blogs.msdn.microsoft.com/commandline/2018/05/08/extended-eol-in-notepad/
> >
> > Hadley
> >
>
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd@cbs.dk  Priv: pda...@gmail.com
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>



-- 
Joris Meys
Statistical consultant

Department of Data Analysis and Mathematical Modelling
Ghent University
Coupure Links 653, B-9000 Gent (Belgium)


---
Biowiskundedagen 2017-2018
http://www.biowiskundedagen.ugent.be/

---
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-09 Thread Dirk Eddelbuettel

On 9 May 2018 at 10:37, Tomas Kalibera wrote:
| And for that reason the behavior should be as intuitive as possible when 
| designed. What was intuitive 15-20 years ago may not be intuitive now, 
| but that should probably not be a justification for a change in 
| documented behavior.

Time for downloadFile() (or download_file()) to complement the existing
download.file() but providing what we now think of as intuitive behaviour?

Dirk

-- 
http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org



Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-09 Thread peter dalgaard
There was a hint in the Twitterverse that Excel has issues with line endings in 
.csv. Can anyone elaborate on that? Then again, Excel goes belly-up on comma 
separators in central European locales anyway...

-pd

> On 8 May 2018, at 22:47 , Hadley Wickham  wrote:
> 
> 
> Also note that MS just announced support for unix line endings in notepad
> 
> https://blogs.msdn.microsoft.com/commandline/2018/05/08/extended-eol-in-notepad/
> 
> Hadley
> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-09 Thread Duncan Murdoch

On 08/05/2018 4:47 PM, Hadley Wickham wrote:

On Tue, May 8, 2018 at 8:15 AM, Hadley Wickham  wrote:

On Thu, May 3, 2018 at 11:34 PM, Tomas Kalibera
 wrote:

On 05/03/2018 11:14 PM, Henrik Bengtsson wrote:


Also, as mentioned in my
https://stat.ethz.ch/pipermail/r-devel/2012-August/064739.html, when
not specifying the mode argument, the default on Windows is mode = "w"
*except* for certain, case-sensitive, filename extensions:

  if(missing(mode) && length(grep("\\.(gz|bz2|xz|tgz|zip|rda|RData)$", url)))
      mode <- "wb"

Just like the need for mode = "wb" on Windows, the above
special-file-extension-hack is only happening on Windows, and is only
documented in ?download.file if you're on Windows; so someone who's on
Linux/macOS trying to help someone on Windows may not be aware of
this. This adds to even more confusions, e.g. "works for me".


If we were designing the API today, it would probably make more sense not to
convert any line endings by default. Today's editors _usually_ can cope with
different line endings and it is probably easier to detect that a text file
has incorrect line endings rather than detecting that a binary file has been
corrupted by an attempt to convert line endings. But whether to change
existing, documented behavior is a different question. In order to help
users and programmers who do not read the documentation carefully we would
create problems for users and programmers who do. The current heuristic/hack
is in line with the compatibility approach: it detects files that are
obviously binary, so it changes the default behavior only for cases when it
would obviously cause damage.


 From a purely utilitarian standpoint, there are far more users who do
not carefully read the documentation than users who do ;)

(I'd also argue that basing the decision on the file extension is
suboptimal, and it would be better to use the mime type if provided by
the server)


Also note that MS just announced support for unix line endings in notepad

https://blogs.msdn.microsoft.com/commandline/2018/05/08/extended-eol-in-notepad/


Perhaps soon RStudio will follow Notepad's lead, and not convert line 
endings when it saves a non-native file.


Duncan Murdoch



Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-09 Thread Tomas Kalibera

On 05/08/2018 05:15 PM, Hadley Wickham wrote:

On Thu, May 3, 2018 at 11:34 PM, Tomas Kalibera
 wrote:

On 05/03/2018 11:14 PM, Henrik Bengtsson wrote:

Also, as mentioned in my
https://stat.ethz.ch/pipermail/r-devel/2012-August/064739.html, when
not specifying the mode argument, the default on Windows is mode = "w"
*except* for certain, case-sensitive, filename extensions:

  if(missing(mode) && length(grep("\\.(gz|bz2|xz|tgz|zip|rda|RData)$", url)))
      mode <- "wb"

Just like the need for mode = "wb" on Windows, the above
special-file-extension-hack is only happening on Windows, and is only
documented in ?download.file if you're on Windows; so someone who's on
Linux/macOS trying to help someone on Windows may not be aware of
this. This adds to even more confusions, e.g. "works for me".

If we were designing the API today, it would probably make more sense not to
convert any line endings by default. Today's editors _usually_ can cope with
different line endings and it is probably easier to detect that a text file
has incorrect line endings rather than detecting that a binary file has been
corrupted by an attempt to convert line endings. But whether to change
existing, documented behavior is a different question. In order to help
users and programmers who do not read the documentation carefully we would
create problems for users and programmers who do. The current heuristic/hack
is in line with the compatibility approach: it detects files that are
obviously binary, so it changes the default behavior only for cases when it
would obviously cause damage.

 From a purely utilitarian standpoint, there are far more users who do
not carefully read the documentation than users who do ;)
And for that reason the behavior should be as intuitive as possible when 
designed. What was intuitive 15-20 years ago may not be intuitive now, 
but that should probably not be a justification for a change in 
documented behavior.

(I'd also argue that basing the decision on the file extension is
suboptimal, and it would be better to use the mime type if provided by
the server)
Yes, that would be nice. Also some binary files could be detected via 
magic numbers (yet not all, e.g. RDS do not have them). It won't be as 
trivial as decoding the URL, though.
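
A minimal sketch of the magic-number idea, assuming the check is done on the
first bytes of a file that is already on disk. The helper name detect_magic()
and the signature table are hypothetical and cover only a few well-known
formats; a file without a signature simply yields NA.

```r
# Hypothetical helper: guess a file format from its leading bytes.
# Only a few well-known signatures are listed; this is not exhaustive.
magic <- list(
  gzip  = as.raw(c(0x1f, 0x8b)),
  bzip2 = charToRaw("BZh"),
  xz    = as.raw(c(0xfd, 0x37, 0x7a, 0x58, 0x5a, 0x00)),
  zip   = as.raw(c(0x50, 0x4b, 0x03, 0x04)),
  png   = as.raw(c(0x89, 0x50, 0x4e, 0x47))
)

detect_magic <- function(path) {
  head_bytes <- readBin(path, what = "raw", n = 8L)
  for (fmt in names(magic)) {
    sig <- magic[[fmt]]
    if (length(head_bytes) >= length(sig) &&
        identical(head_bytes[seq_along(sig)], sig))
      return(fmt)
  }
  NA_character_   # no known signature found
}

# Usage: a gzip-compressed file and a plain text file with the same extension.
gz <- tempfile(fileext = ".dat")
con <- gzfile(gz, "wb"); writeLines("hello", con); close(con)
txt <- tempfile(fileext = ".dat")
writeLines("hello", txt)
detect_magic(gz)    # "gzip"
detect_magic(txt)   # NA
```

Note that this only works once some bytes have been received, which is
presumably part of why it is less trivial than looking at the URL.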


Tomas



Hadley





Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-08 Thread Hadley Wickham
On Tue, May 8, 2018 at 8:15 AM, Hadley Wickham  wrote:
> On Thu, May 3, 2018 at 11:34 PM, Tomas Kalibera
>  wrote:
>> On 05/03/2018 11:14 PM, Henrik Bengtsson wrote:
>>>
>>> Also, as mentioned in my
>>> https://stat.ethz.ch/pipermail/r-devel/2012-August/064739.html, when
>>> not specifying the mode argument, the default on Windows is mode = "w"
>>> *except* for certain, case-sensitive, filename extensions:
>>>
>>>  if(missing(mode) && length(grep("\\.(gz|bz2|xz|tgz|zip|rda|RData)$", url)))
>>>      mode <- "wb"
>>>
>>> Just like the need for mode = "wb" on Windows, the above
>>> special-file-extension-hack is only happening on Windows, and is only
>>> documented in ?download.file if you're on Windows; so someone who's on
>>> Linux/macOS trying to help someone on Windows may not be aware of
>>> this. This adds to even more confusions, e.g. "works for me".
>>
>> If we were designing the API today, it would probably make more sense not to
>> convert any line endings by default. Today's editors _usually_ can cope with
>> different line endings and it is probably easier to detect that a text file
>> has incorrect line endings rather than detecting that a binary file has been
>> corrupted by an attempt to convert line endings. But whether to change
>> existing, documented behavior is a different question. In order to help
>> users and programmers who do not read the documentation carefully we would
>> create problems for users and programmers who do. The current heuristic/hack
>> is in line with the compatibility approach: it detects files that are
>> obviously binary, so it changes the default behavior only for cases when it
>> would obviously cause damage.
>
> From a purely utilitarian standpoint, there are far more users who do
> not carefully read the documentation than users who do ;)
>
> (I'd also argue that basing the decision on the file extension is
> suboptimal, and it would be better to use the mime type if provided by
> the server)

Also note that MS just announced support for unix line endings in notepad

https://blogs.msdn.microsoft.com/commandline/2018/05/08/extended-eol-in-notepad/

Hadley

-- 
http://hadley.nz



Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-08 Thread Hadley Wickham
On Thu, May 3, 2018 at 11:34 PM, Tomas Kalibera
 wrote:
> On 05/03/2018 11:14 PM, Henrik Bengtsson wrote:
>>
>> Also, as mentioned in my
>> https://stat.ethz.ch/pipermail/r-devel/2012-August/064739.html, when
>> not specifying the mode argument, the default on Windows is mode = "w"
>> *except* for certain, case-sensitive, filename extensions:
>>
>>  if(missing(mode) && length(grep("\\.(gz|bz2|xz|tgz|zip|rda|RData)$", url)))
>>      mode <- "wb"
>>
>> Just like the need for mode = "wb" on Windows, the above
>> special-file-extension-hack is only happening on Windows, and is only
>> documented in ?download.file if you're on Windows; so someone who's on
>> Linux/macOS trying to help someone on Windows may not be aware of
>> this. This adds to even more confusions, e.g. "works for me".
>
> If we were designing the API today, it would probably make more sense not to
> convert any line endings by default. Today's editors _usually_ can cope with
> different line endings and it is probably easier to detect that a text file
> has incorrect line endings rather than detecting that a binary file has been
> corrupted by an attempt to convert line endings. But whether to change
> existing, documented behavior is a different question. In order to help
> users and programmers who do not read the documentation carefully we would
> create problems for users and programmers who do. The current heuristic/hack
> is in line with the compatibility approach: it detects files that are
> obviously binary, so it changes the default behavior only for cases when it
> would obviously cause damage.

From a purely utilitarian standpoint, there are far more users who do
not carefully read the documentation than users who do ;)

(I'd also argue that basing the decision on the file extension is
suboptimal, and it would be better to use the mime type if provided by
the server)
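
A rough sketch of what a MIME-type based decision could look like, assuming
the Content-Type value has already been obtained from the server (for example
from the response headers; fetching them is not shown). The helper name
mode_from_content_type() is hypothetical, and treating "text/*" as text and
everything else as binary is an assumption, not existing download.file()
behaviour.

```r
# Hypothetical helper: decide the download mode from a Content-Type value.
# Assumption: "text/*" means text; anything else (or nothing) means binary.
mode_from_content_type <- function(content_type) {
  if (is.null(content_type) || is.na(content_type) || !nzchar(content_type))
    return("wb")                    # no information: leave the bytes alone
  # Drop parameters such as "; charset=utf-8" and normalise the case.
  type <- tolower(trimws(sub(";.*$", "", content_type)))
  if (startsWith(type, "text/")) "w" else "wb"
}

mode_from_content_type("text/csv; charset=utf-8")   # "w"
mode_from_content_type("application/gzip")          # "wb"
mode_from_content_type(NA_character_)               # "wb"
```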

Hadley

-- 
http://hadley.nz



Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-07 Thread Gabe Becker
Hey all,

I don't have a strong opinion about whether the default should ultimately
eventually change or not. Many people who use windows (a set which does not
include me) seem to think it would be better.

I will say that like Hugh, I'm strongly against making the argument
mandatory as an interim step. That is much less backwards compatible (ie it
will break much more existing code) than just changing the default would. I
would be for smarter heuristics, perhaps a warning, and eventually a change
instead if the change is ultimately decided on as the way forward.

Best,
~G

On Mon, May 7, 2018 at 5:32 AM, Hugh Parsonage 
wrote:

> I'd add my support for mode = "wb" to (eventually) become the default,
> though I respect Tomas's comments about backwards-compatibility.
>
> Instead of making the argument mandatory (which would immediately
> break scripts -- even ones that won't be helped by changing to mode =
> 'wb') or otherwise changing behaviour, perhaps download.file could
> start to emit a message (not a warning) whenever the argument is
> missing on Windows. The message could say something like 'Using `mode
> = 'w'` which will corrupt non-text files. Set `mode = 'wb'` for binary
> downloads or see the help page for other options.' Emitting a message
> has the lightest impact on existing scripts, while alerting new users
> to future mistakes.
>
> On 7 May 2018 at 18:49, Joris Meys  wrote:
> > Martin, also from me a heartfelt thank you for taking care of this. Some
> > thoughts on Henrik's response:
> >
> > On Mon, May 7, 2018 at 2:28 AM, Henrik Bengtsson
> > <henrik.bengts...@gmail.com> wrote:
> >
> >>
> >> I still argue that the current behavior causes more harm than good.
> >>
> >
> > I agree with your analysis of the problems this legacy behaviour causes.
> >
> > Deprecating the default mode="w" on Windows can be done in steps, e.g.
> >> by making the argument mandatory for a while. This could be done on
> >> all platforms because we're already all affected, i.e. we need to
> >> specify 'mode' to avoid surprises.
> >>
> >
> > That sounds like a reasonable way to move away from this discrepancy
> > between OS.
> >
> >
> >> What about case-insensitive matching, e.g. data.ZIP and data.Rdata?
> >>
> >
> > Totally agree, and easily solved by eg adding ignore.case = TRUE to the
> > grep() call.
> >
> >
> >> A quick scan of the R source code suggests that R is also working with
> >> the following filename extensions (using various case styles):
> >>
> >> What about all the other file extensions that we know for sure are binary?
> >>
> >
> > If the default isn't changed, doesn't it make more sense to actually turn
> > the logic around? Text files that are downloaded over the internet are
> > almost always .txt, .csv, or a few other extensions used for text data.
> > Those are actually the only files where some people with very old Windows
> > programs for text processing can get into trouble. So instead of adding
> > every possible binary extension, one can put "wb" as default and change to
> > "w" if it is a text file instead of the other way around. That would not
> > change the concept of the behaviour, but ensures that the function doesn't
> > fail to detect a binary file. Not detecting a text file is far less of a
> > problem, as not converting the line endings doesn't destruct the file.
> >
> > Cheers
> > Joris
> >
> > --
> > Joris Meys
> > Statistical consultant
> >
> > Department of Data Analysis and Mathematical Modelling
> > Ghent University
> > Coupure Links 653, B-9000 Gent (Belgium)
> >
> > ---
> > Biowiskundedagen 2017-2018
> > http://www.biowiskundedagen.ugent.be/
> >
> > ---
> > Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
> >


-- 
Gabriel Becker, Ph.D
Scientist
Bioinformatics and Computational Biology
Genentech Research



Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-07 Thread Hugh Parsonage
I'd add my support for mode = "wb" to (eventually) become the default,
though I respect Tomas's comments about backwards-compatibility.

Instead of making the argument mandatory (which would immediately
break scripts -- even ones that won't be helped by changing to mode =
'wb') or otherwise changing behaviour, perhaps download.file could
start to emit a message (not a warning) whenever the argument is
missing on Windows. The message could say something like 'Using `mode
= 'w'` which will corrupt non-text files. Set `mode = 'wb'` for binary
downloads or see the help page for other options.' Emitting a message
has the lightest impact on existing scripts, while alerting new users
to future mistakes.
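
A minimal sketch of the message idea, assuming a small helper that picks the
mode. The name choose_mode() and its 'os_type' argument are hypothetical and
only exist to make the behaviour testable on any platform; this is not the
actual download.file() code.

```r
# Hypothetical helper, not the real download.file() internals: fall back to
# mode = "w" and emit a message when 'mode' was not supplied on Windows.
choose_mode <- function(mode, os_type = .Platform$OS.type) {
  if (missing(mode)) {
    mode <- "w"
    if (os_type == "windows")
      message("Using `mode = 'w'` which will corrupt non-text files. ",
              "Set `mode = 'wb'` for binary downloads or see the help page ",
              "for other options.")
  }
  mode
}

choose_mode(os_type = "windows")         # emits the message, returns "w"
choose_mode("wb", os_type = "windows")   # silent, returns "wb"
```

Because message() is used rather than warning(), existing scripts keep
running unchanged and the output can be silenced with suppressMessages().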

On 7 May 2018 at 18:49, Joris Meys  wrote:
> Martin, also from me a heartfelt thank you for taking care of this. Some
> thoughts on Henrik's response:
>
> On Mon, May 7, 2018 at 2:28 AM, Henrik Bengtsson wrote:
>
>>
>> I still argue that the current behavior causes more harm than good.
>>
>
> I agree with your analysis of the problems this legacy behaviour causes.
>
> Deprecating the default mode="w" on Windows can be done in steps, e.g.
>> by making the argument mandatory for a while. This could be done on
>> all platforms because we're already all affected, i.e. we need to
>> specify 'mode' to avoid surprises.
>>
>
> That sounds like a reasonable way to move away from this discrepancy
> between OS.
>
>
>> What about case-insensitive matching, e.g. data.ZIP and data.Rdata?
>>
>
> Totally agree, and easily solved by eg adding ignore.case = TRUE to the
> grep() call.
>
>
>> A quick scan of the R source code suggests that R is also working with
>> the following filename extensions (using various case styles):
>>
>> What about all the other file extensions that we know for sure are binary?
>>
>
> If the default isn't changed, doesn't it make more sense to actually turn
> the logic around? Text files that are downloaded over the internet are
> almost always .txt, .csv, or a few other extensions used for text data .
> Those are actually the only files where some people with very old Windows
> programs for text processing can get into trouble. So instead of adding
> every possible binary extension, one can put "wb" as default and change to
> "w" if it is a text file instead of the other way around. That would not
> change the concept of the behaviour, but ensures that the function doesn't
> fail to detect a binary file. Not detecting a text file is far less of a
> problem, as not converting the line endings doesn't destruct the file.
>
> Cheers
> Joris
>
> --
> Joris Meys
> Statistical consultant
>
> Department of Data Analysis and Mathematical Modelling
> Ghent University
> Coupure Links 653, B-9000 Gent (Belgium)
> 
>
> ---
> Biowiskundedagen 2017-2018
> http://www.biowiskundedagen.ugent.be/
>
> ---
> Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
>


Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-07 Thread Joris Meys
Martin, also from me a heartfelt thank you for taking care of this. Some
thoughts on Henrik's response:

On Mon, May 7, 2018 at 2:28 AM, Henrik Bengtsson  wrote:

>
> I still argue that the current behavior causes more harm than good.
>

I agree with your analysis of the problems this legacy behaviour causes.

Deprecating the default mode="w" on Windows can be done in steps, e.g.
> by making the argument mandatory for a while. This could be done on
> all platforms because we're already all affected, i.e. we need to
> specify 'mode' to avoid surprises.
>

That sounds like a reasonable way to move away from this discrepancy
between OS.


> What about case-insensitive matching, e.g. data.ZIP and data.Rdata?
>

Totally agree, and easily solved by eg adding ignore.case = TRUE to the
grep() call.
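
To illustrate with the pattern quoted earlier in the thread (the file names
are just the examples from Henrik's question plus one lower-case control):

```r
# The extension pattern from the Windows branch of download.file(), as quoted
# in this thread, applied with and without case-insensitive matching.
pattern <- "\\.(gz|bz2|xz|tgz|zip|rda|RData)$"
files <- c("data.ZIP", "data.Rdata", "data.zip")

grepl(pattern, files)                       # FALSE FALSE  TRUE
grepl(pattern, files, ignore.case = TRUE)   #  TRUE  TRUE  TRUE
```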


> A quick scan of the R source code suggests that R is also working with
> the following filename extensions (using various case styles):
>
> What about all the other file extensions that we know for sure are binary?
>

If the default isn't changed, doesn't it make more sense to actually turn
the logic around? Text files that are downloaded over the internet are
almost always .txt, .csv, or a few other extensions used for text data .
Those are actually the only files where some people with very old Windows
programs for text processing can get into trouble. So instead of adding
every possible binary extension, one can put "wb" as default and change to
"w" if it is a text file instead of the other way around. That would not
change the concept of the behaviour, but ensures that the function doesn't
fail to detect a binary file. Not detecting a text file is far less of a
problem, as not converting the line endings doesn't destruct the file.
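
A small sketch of the inverted logic, assuming a short list of text
extensions. The helper name default_mode() and the extension list are
illustrative only; which extensions should count as text would need to be
decided.

```r
# Hypothetical helper: binary by default, text mode only for known text
# extensions. The list of extensions below is an assumption.
text_ext <- "\\.(txt|csv|tsv)$"

default_mode <- function(url) {
  if (grepl(text_ext, url, ignore.case = TRUE)) "w" else "wb"
}

urls <- c("a.csv", "b.tar.gz", "c.unknown")
vapply(urls, default_mode, character(1), USE.NAMES = FALSE)
# "w" "wb" "wb"
```

An unrecognised extension falls through to "wb", so the worst case is a text
file with unconverted line endings rather than a corrupted binary file.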

Cheers
Joris

-- 
Joris Meys
Statistical consultant

Department of Data Analysis and Mathematical Modelling
Ghent University
Coupure Links 653, B-9000 Gent (Belgium)


---
Biowiskundedagen 2017-2018
http://www.biowiskundedagen.ugent.be/

---
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php



Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-06 Thread Henrik Bengtsson
Thanks for the comments, feedback, and improvements.

I still argue that the current behavior causes more harm than good.

First of all, it increases the risk for code that does not work on all
platforms, which I'd say is one of the strengths and design goals of
R.  To write cross-platform code, a developer basically needs to
specify argument 'mode'.

A second problem is that people who work on non-Windows platforms will
not be aware of this problem.  Yes, adding this Windows-specific
behavior to the help on all platforms will help a bit (thanks for
doing that).  However, since there are so many non-Windows users out
there that write documentation, vignettes, blog posts, host classes
and workshops, it is quite likely that you'll see things like
"Download the data file using `download.file(url, file)` and then
...".  Boom, a "beginner" on Windows will have problems, and even the
non-Windows instructor may not know what's going on, so a lot of
time is quickly wasted.

A third problem is wasted bandwidth because the same file has to be
downloaded a second time.  If the default is changed to mode="wb" and
someone truly needs mode="w", the penalty should be smaller because
such text-based files are likely to be much smaller than binary files,
which are often several GiB these days.

What could lower the risk for the above, and help the user and helpers,
is to give an informative warning whenever 'mode' is not specified,
e.g.

   The file 'NNN' is downloaded as a text file (mode = "w"). If you
meant to download it as a binary file, specify mode = "wb".

Deprecating the default mode="w" on Windows can be done in steps, e.g.
by making the argument mandatory for a while. This could be done on
all platforms because we're already all affected, i.e. we need to
specify 'mode' to avoid surprises.

Even if the default won't change, below are some more
comments/observations that are related to the current implementation of
download.file() on Windows:

ADD MORE EXTENSIONS?

What about case-insensitive matching, e.g. data.ZIP and data.Rdata?

A quick scan of the R source code suggests that R is also working with
the following filename extensions (using various case styles):

* Rbin (src/library/tools/R/install.R)
* rda, Rda (tests/reg-tests-1a.R)
* rdb (src/library/tools/R/install.R)
* rds, RDS, Rds (src/library/tools/R/install.R)
* rdx (src/library/tools/R/install.R)
* RData, Rdata, rdata (src/library/tools/R/install.R)

Should the tar extension also be added?

What about binary image formats that R produces, e.g. filename
extensions bmp, jpg, jpeg, pdf, png, tif, tiff?

What about all the other file extensions that we know for sure are binary?


VECTORIZATION:

For some value of the 'method' argument, the current implementation
will download the same file differently depending on other files
downloaded at the same time.  For example, here a PNG file is
downloaded in text mode and its content is translated:

> urls <- c("https://www.r-project.org/logo/Rlogo.png")
> download.file(urls, destfile = basename(urls), method = "libcurl")
trying URL 'https://www.r-project.org/logo/Rlogo.png'
Content length 48148 bytes (47 KB)
downloaded 47 KB
> file.size(basename(urls))
[1] 48281
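
The 133 extra bytes (48281 - 48148) are presumably the LF bytes in the PNG
data that each gained a CR when written in text mode. A small simulation of
that translation, using only the 8-byte PNG signature as input; the helper
lf_to_crlf() is hypothetical and is not how R writes the file internally:

```r
# Simulate a Windows text-mode write: every LF byte (0x0a) becomes
# CR LF (0x0d 0x0a). Illustration only, not R's internal code.
lf_to_crlf <- function(bytes) {
  out <- lapply(bytes, function(b) {
    if (b == as.raw(0x0a)) as.raw(c(0x0d, 0x0a)) else b
  })
  unlist(out)
}

# The first eight bytes of any PNG file; they contain two LF bytes.
png_sig <- as.raw(c(0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a))

length(png_sig)               # 8
length(lf_to_crlf(png_sig))   # 10
```

The file grows by one byte per LF, which is why a size or checksum mismatch
is the typical symptom.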

But if we throw in a "known" binary extension, the PNG file will be
downloaded as binary:

> urls <- c("https://www.r-project.org/logo/Rlogo.png",
+   "https://cran.r-project.org/bin/windows/contrib/3.6/future_1.8.1.zip")
> download.file(urls, destfile = basename(urls), method = "libcurl")
trying URL 'https://www.r-project.org/logo/Rlogo.png'
trying URL 'https://cran.r-project.org/bin/windows/contrib/3.6/future_1.8.1.zip'
> file.size(basename(urls))
[1]  48148 527069
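
A likely explanation, assuming the extension check quoted earlier in the
thread is applied to the whole 'url' vector at once: grep() returns the
indices of all matching elements, so a single match is enough to switch every
download to "wb". The helper pick_mode() below is hypothetical and only
mirrors that check:

```r
# Mirror of the extension check quoted in this thread, applied to a vector.
pattern <- "\\.(gz|bz2|xz|tgz|zip|rda|RData)$"
pick_mode <- function(url) if (length(grep(pattern, url))) "wb" else "w"

png <- "https://www.r-project.org/logo/Rlogo.png"
zip <- "https://cran.r-project.org/bin/windows/contrib/3.6/future_1.8.1.zip"

pick_mode(png)           # "w"
pick_mode(c(png, zip))   # "wb", for both files
```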

Best,

Henrik

On Fri, May 4, 2018 at 1:18 AM, Martin Maechler
 wrote:
>> Joris Meys 
>> on Fri, 4 May 2018 10:00:07 +0200 writes:
>
> > On Fri, May 4, 2018 at 8:34 AM, Tomas Kalibera
> >  wrote:
>
> >> The current heuristic/hack is in line with the
> >> compatibility approach: it detects files that are
> >> obviously binary, so it changes the default behavior only
> >> for cases when it would obviously cause damage.
> >>
> >> Tomas
>
>
> > Well, I was trying to download a .gz file and
> > download.file() didn't detect that. Reason for that is
> > obviously that the link doesn't contain .gz but %2Egz ,
> > using the ASCII code for the dot instead of the dot
> > itself. That's general practice in a lot of links.
>
> > Hence I propose to change the line in download.file() that
> > does this check to:
>
> > >   if (missing(mode) && length(grep("\\.(gz|bz2|xz|tgz|zip|rda|RData)$",
> > >   URLdecode(url))))
>
> > using URLdecode() ensures that .gz, .RData etc will be
> > detected correctly in an encoded URL.
>
> > Cheers Joris
>
> Makes sense to me and I plan to add it when also adding '.rds'
>
> { OTOH, after reading the thread about this: 

Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-04 Thread Martin Maechler
> Joris Meys 
> on Fri, 4 May 2018 10:00:07 +0200 writes:

> On Fri, May 4, 2018 at 8:34 AM, Tomas Kalibera
>  wrote:

>> The current heuristic/hack is in line with the
>> compatibility approach: it detects files that are
>> obviously binary, so it changes the default behavior only
>> for cases when it would obviously cause damage.
>> 
>> Tomas


> Well, I was trying to download a .gz file and
> download.file() didn't detect that. Reason for that is
> obviously that the link doesn't contain .gz but %2Egz ,
> using the ASCII code for the dot instead of the dot
> itself. That's general practice in a lot of links.

> Hence I propose to change the line in download.file() that
> does this check to:

>   if (missing(mode) && length(grep("\\.(gz|bz2|xz|tgz|zip|rda|RData)$",
>   URLdecode(url))))

> using URLdecode() ensures that .gz, .RData etc will be
> detected correctly in an encoded URL.

> Cheers Joris

Makes sense to me and I plan to add it when also adding '.rds'

{ OTOH, after reading the thread about this: Shouldn't you make
  your code more robust and use   mode = "wb" (or "ab") in any case?
  ;-)
}
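
For example, a sketch that sets the mode explicitly so the result is the same
on every platform. A local file:// URL stands in for a real download here so
that the example runs without network access; the temporary file names are
of course arbitrary.

```r
# Create a small binary file containing LF bytes that must not be altered.
src <- tempfile(fileext = ".bin")
writeBin(as.raw(c(0x89, 0x50, 0x0a, 0x00, 0x0a)), src)

# Build a file:// URL for it (stand-in for a remote URL).
url <- paste0("file:///", sub("^/", "", normalizePath(src, winslash = "/")))

# Always pass mode = "wb" so no line-ending translation can take place.
dest <- tempfile(fileext = ".bin")
download.file(url, destfile = dest, mode = "wb", quiet = TRUE)

identical(readBin(dest, "raw", n = 10L), readBin(src, "raw", n = 10L))
```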
 
Martin



Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-04 Thread Joris Meys
On Fri, May 4, 2018 at 8:34 AM, Tomas Kalibera 
wrote:

> The current heuristic/hack is in line with the compatibility approach: it
> detects files that are obviously binary, so it changes the default behavior
> only for cases when it would obviously cause damage.
>
> Tomas


Well, I was trying to download a .gz file and download.file() didn't detect
that. Reason for that is obviously that the link doesn't contain .gz but
%2Egz , using the ASCII code for the dot instead of the dot itself. That's
general practice in a lot of links.

Hence I propose to change the line in download.file() that does this check
to:

  if (missing(mode) && length(grep("\\.(gz|bz2|xz|tgz|zip|rda|RData)$",
                                   URLdecode(url))))

using URLdecode() ensures that .gz, .RData etc will be detected correctly
in an encoded URL.
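
A quick illustration with a made-up URL (example.org is a placeholder, not
the link I was actually using):

```r
# The extension pattern only matches once the percent-encoded dot is decoded.
pattern <- "\\.(gz|bz2|xz|tgz|zip|rda|RData)$"
url <- "https://example.org/download?file=data%2Egz"   # hypothetical URL

grepl(pattern, url)                     # FALSE
utils::URLdecode(url)                   # ends in "data.gz"
grepl(pattern, utils::URLdecode(url))   # TRUE
```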

Cheers
Joris

-- 
Joris Meys
Statistical consultant

Department of Data Analysis and Mathematical Modelling
Ghent University
Coupure Links 653, B-9000 Gent (Belgium)


---
Biowiskundedagen 2017-2018
http://www.biowiskundedagen.ugent.be/

---
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php



Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-04 Thread Martin Maechler
> Tomas Kalibera 
> on Fri, 4 May 2018 08:34:03 +0200 writes:

> On 05/03/2018 11:14 PM, Henrik Bengtsson wrote:
>> Also, as mentioned in my
>> https://stat.ethz.ch/pipermail/r-devel/2012-August/064739.html,
>> when not specifying the mode argument, the default on
>> Windows is mode = "w" *except* for certain,
>> case-sensitive, filename extensions:
>> 
>> if(missing(mode) && length(grep("\\.(gz|bz2|xz|tgz|zip|rda|RData)$", url)))
>>     mode <- "wb"
>> 
>> Just like the need for mode = "wb" on Windows, the above
>> special-file-extension-hack is only happening on Windows,
>> and is only documented in ?download.file if you're on
>> Windows; so someone who's on Linux/macOS trying to help
>> someone on Windows may not be aware of this. This adds to
>> even more confusions, e.g. "works for me".

> If we were designing the API today, it would probably make
> more sense not to convert any line endings by
> default. Today's editors _usually_ can cope with different
> line endings and it is probably easier to detect that a
> text file has incorrect line endings rather than detecting
> that a binary file has been corrupted by an attempt to
> convert line endings.  But whether to change existing,
> documented behavior is a different question. In order to
> help users and programmers who do not read the
> documentation carefully we would create problems for users
> and programmers who do. 

> The current heuristic/hack is in
> line with the compatibility approach: it detects files
> that are obviously binary, so it changes the default
> behavior only for cases when it would obviously cause
> damage.

> Tomas


Thank you, Tomas;  I was about to say something similar but
probably less convincingly. 

There's one thing on which I strongly agree with Henrik: the
Windows-specific behavior, currently documented only on Windows,
should be documented on all platforms.

I'll update the help page,

and will also add the .rds extension to the above list
[ --- yes, we all should use saveRDS() and readRDS() whenever
  sensible in favor of save() and load() ]
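
For readers less familiar with the distinction, here is a minimal sketch of
the two serialization styles; the object and file names are made up for
illustration. saveRDS() stores a single object without its name, so the
reader picks the name, while save()/load() restore objects under the names
they were saved with.

```r
fit <- list(coef = c(1.5, -0.3), n = 42L)   # hypothetical object

# saveRDS()/readRDS(): one object per file, name chosen when reading
rds_file <- tempfile(fileext = ".rds")
saveRDS(fit, rds_file)
fit_again <- readRDS(rds_file)

# save()/load(): objects come back under the names they were saved with
rda_file <- tempfile(fileext = ".rda")
save(fit, file = rda_file)
env <- new.env()
loaded_names <- load(rda_file, envir = env)  # returns the restored names
```

Both formats are binary by default, which is why their extensions matter for
the download mode on Windows.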

Martin


>> /Henrik
>> 
>> On Thu, May 3, 2018 at 7:27 AM, Joris Meys
>>  wrote:
>>> Thank you Henrik and Martin for explaining what was
>>> going on. Very insightful!
>>> 
>>> On Thu, May 3, 2018 at 4:21 PM, Jeroen Ooms
>>>  wrote:
>>>> On Thu, May 3, 2018 at 2:42 PM, Henrik Bengtsson wrote:
>>>> > Use mode="wb" when you download the file. See
>>>> > https://github.com/HenrikBengtsson/Wishlist-for-R/issues/30.
>>>> >
>>>> > R core, and others, is there a good argument for why
>>>> > we are not making this the default download mode? It
>>>> > seems like a such a simple fix to such a common
>>>> > "mistake".
>>>> I'd like to second this feature request. This default
>>>> behaviour is unexpected and often leads to r scripts
>>>> that were written on mac/linux, to produce corrupted
>>>> files on windows, checksum mismatches, etc.
>>>>
>>>> Even for text files, the default should be to download
>>>> the file as-is.  Trying to "fix" line-endings should be
>>>> opt-in, never the default.  Downloading a file via a
>>>> browser or ftp client on windows also doesn't change
>>>> the file, why should R?
>>> 
>>> I third the feature request.
>>> 
 
 
>>>> On Thu, May 3, 2018 at 3:02 PM, Duncan Murdoch wrote:
>>>> > Many downloads are text files (HTML, CSV, etc.), and
>>>> > if those are downloaded in binary, a Windows user
>>>> > might end up with a file that Notepad can't handle,
>>>> > because it would have Unix-style line endings.
>>>> True but I don't think this is relevant. The same holds
>>>> e.g. for the R files in source packages, which also
>>>> have unix line endings. Most Windows users will use an
>>>> actual editor that understands both types of line
>>>> endings, or can convert between the two.
>>>>
>>>> Downloading-file should do just that.
>>> 
>>> Again, I agree. In my (limited) experience the only
>>> program that fails to properly display \n as a line
>>> ending, is Notepad. But it can still open the file
>>> regardless. If line ending conflicts cause bugs, it's
>>> almost always a unix-like OS struggling with
>>> Windows-style endings. I have yet to meet the first one
>>> the other way around.
>>> 
>>> Cheers Joris
>>> 

Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-04 Thread Tomas Kalibera

On 05/03/2018 11:14 PM, Henrik Bengtsson wrote:

> Also, as mentioned in my
> https://stat.ethz.ch/pipermail/r-devel/2012-August/064739.html, when
> not specifying the mode argument, the default on Windows is mode = "w"
> *except* for certain, case-sensitive, filename extensions:
>
>     if(missing(mode) && length(grep("\\.(gz|bz2|xz|tgz|zip|rda|RData)$", url)))
>         mode <- "wb"
>
> Just like the need for mode = "wb" on Windows, the above
> special-file-extension-hack is only happening on Windows, and is only
> documented in ?download.file if you're on Windows; so someone who's on
> Linux/macOS trying to help someone on Windows may not be aware of
> this. This adds to even more confusions, e.g. "works for me".

If we were designing the API today, it would probably make more sense
not to convert any line endings by default. Today's editors _usually_
can cope with different line endings and it is probably easier to detect
that a text file has incorrect line endings rather than detecting that a
binary file has been corrupted by an attempt to convert line endings.
But whether to change existing, documented behavior is a different
question. In order to help users and programmers who do not read the
documentation carefully we would create problems for users and
programmers who do. The current heuristic/hack is in line with the
compatibility approach: it detects files that are obviously binary, so
it changes the default behavior only for cases when it would obviously
cause damage.
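
To illustrate why wrong line endings in a text file are easy to detect, here
is a small sketch that counts CR LF pairs and bare LF bytes in a raw vector.
The function name line_ending_summary() is made up for this example and is
not part of R.

```r
# Count line-ending styles in a raw vector: CR LF pairs versus bare LF bytes.
line_ending_summary <- function(bytes) {
  stopifnot(is.raw(bytes))
  n <- length(bytes)
  is_lf <- bytes == as.raw(0x0a)
  # An LF belongs to a CR LF pair when the byte before it is CR.
  prev_is_cr <- c(FALSE, bytes[-n] == as.raw(0x0d))
  c(crlf = sum(is_lf & prev_is_cr), lf_only = sum(is_lf & !prev_is_cr))
}

# One CR LF pair and two bare LFs:
mixed <- line_ending_summary(charToRaw("a\r\nb\nc\n"))
```

For a file on disk, the bytes could be obtained with
readBin(path, what = "raw", n = file.size(path)). A text file with only one
kind of ending gives a zero in the other slot; no comparably cheap check
tells you that a binary file has had bytes inserted.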


Tomas




/Henrik

On Thu, May 3, 2018 at 7:27 AM, Joris Meys  wrote:

Thank you Henrik and Martin for explaining what was going on. Very
insightful!

On Thu, May 3, 2018 at 4:21 PM, Jeroen Ooms  wrote:

On Thu, May 3, 2018 at 2:42 PM, Henrik Bengtsson
 wrote:

Use mode="wb" when you download the file. See
https://github.com/HenrikBengtsson/Wishlist-for-R/issues/30.

R core, and others, is there a good argument for why we are not making this
the default download mode? It seems like a such a simple fix to such a
common "mistake".

I'd like to second this feature request. This default behaviour is
unexpected and often leads to r scripts that were written on
mac/linux, to produce corrupted files on windows, checksum mismatches,
etc.

Even for text files, the default should be to download the file as-is.
Trying to "fix" line-endings should be opt-in, never the default.
Downloading a file via a browser or ftp client on windows also doesn't
change the file, why should R?


I third the feature request.




On Thu, May 3, 2018 at 3:02 PM, Duncan Murdoch wrote:

Many downloads are text files (HTML, CSV, etc.), and if those are downloaded
in binary, a Windows user might end up with a file that Notepad can't
handle, because it would have Unix-style line endings.

True but I don't think this is relevant. The same holds e.g. for the R
files in source packages, which also have unix line endings. Most
Windows users will use an actual editor that understands both types of
line endings, or can convert between the two.

Downloading-file should do just that.


Again, I agree. In my (limited) experience the only program that fails to
properly display \n as a line ending, is Notepad. But it can still open the
file regardless. If line ending conflicts cause bugs, it's almost always a
unix-like OS struggling with Windows-style endings. I have yet to meet the
first one the other way around.

Cheers
Joris



__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel




Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-03 Thread Henrik Bengtsson
Also, as mentioned in my
https://stat.ethz.ch/pipermail/r-devel/2012-August/064739.html, when
the mode argument is not specified, the default on Windows is mode = "w"
*except* for certain case-sensitive filename extensions:

    if(missing(mode) && length(grep("\\.(gz|bz2|xz|tgz|zip|rda|RData)$", url)))
        mode <- "wb"

Just like the need for mode = "wb" on Windows, the above
special-file-extension hack happens only on Windows, and is only
documented in ?download.file if you're on Windows; so someone on
Linux/macOS trying to help someone on Windows may not be aware of
this. This adds even more confusion, e.g. "works for me".
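
A minimal sketch of that heuristic, to make its edge cases concrete. The
helper default_mode() is hypothetical: it only mirrors the regular expression
quoted above and is not R's actual source. The GEO link is written with the
query-string separators I assume it had before the archive mangled it.

```r
# Hypothetical helper mirroring the quoted snippet; not R's actual source code.
default_mode <- function(url) {
  if (grepl("\\.(gz|bz2|xz|tgz|zip|rda|RData)$", url)) "wb" else "w"
}

m_gz    <- default_mode("https://example.org/data.gz")    # extension matches
m_upper <- default_mode("https://example.org/data.GZ")    # case-sensitive: no match
m_rds   <- default_mode("https://example.org/model.rds")  # .rds is not in the list

# The GEO link ends in "%2Egz": the dot is percent-encoded, so the pattern,
# which requires a literal ".", does not match.
geo_url <- paste0("https://www.ncbi.nlm.nih.gov/geo/download/",
                  "?acc=GSM907811&format=file&file=GSM907811%2ECEL%2Egz")
m_geo <- default_mode(geo_url)
```

So even though the downloaded file is a .gz file, a link of this shape falls
through to text mode on Windows unless mode = "wb" is given explicitly.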

/Henrik

On Thu, May 3, 2018 at 7:27 AM, Joris Meys  wrote:
> Thank you Henrik and Martin for explaining what was going on. Very
> insightful!
>
> On Thu, May 3, 2018 at 4:21 PM, Jeroen Ooms  wrote:
>>
>> On Thu, May 3, 2018 at 2:42 PM, Henrik Bengtsson
>>  wrote:
>> > Use mode="wb" when you download the file. See
>> > https://github.com/HenrikBengtsson/Wishlist-for-R/issues/30.
>> >
>> > R core, and others, is there a good argument for why we are not making
>> > this
>> > the default download mode? It seems like a such a simple fix to such a
>> > common "mistake".
>>
>> I'd like to second this feature request. This default behaviour is
>> unexpected and often leads to r scripts that were written on
>> mac/linux, to produce corrupted files on windows, checksum mismatches,
>> etc.
>>
>> Even for text files, the default should be to download the file as-is.
>> Trying to "fix" line-endings should be opt-in, never the default.
>> Downloading a file via a browser or ftp client on windows also doesn't
>> change the file, why should R?
>
>
> I third the feature request.
>
>>
>>
>>
>> On Thu, May 3, 2018 at 3:02 PM, Duncan Murdoch 
>> wrote:
>> > Many downloads are text files (HTML, CSV, etc.), and if those are
>> > downloaded
>> > in binary, a Windows user might end up with a file that Notepad can't
>> > handle, because it would have Unix-style line endings.
>>
>> True but I don't think this is relevant. The same holds e.g. for the R
>> files in source packages, which also have unix line endings. Most
>> Windows users will use an actual editor that understands both types of
>> line endings, or can convert between the two.
>>
>> Downloading-file should do just that.
>
>
> Again, I agree. In my (limited) experience the only program that fails to
> properly display \n as a line ending, is Notepad. But it can still open the
> file regardless. If line ending conflicts cause bugs, it's almost always a
> unix-like OS struggling with Windows-style endings. I have yet to meet the
> first one the other way around.
>
> Cheers
> Joris
>
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-03 Thread Joris Meys
Thank you Henrik and Martin for explaining what was going on. Very
insightful!

On Thu, May 3, 2018 at 4:21 PM, Jeroen Ooms  wrote:

> On Thu, May 3, 2018 at 2:42 PM, Henrik Bengtsson
>  wrote:
> > Use mode="wb" when you download the file. See
> > https://github.com/HenrikBengtsson/Wishlist-for-R/issues/30.
> >
> > R core, and others, is there a good argument for why we are not making
> this
> > the default download mode? It seems like a such a simple fix to such a
> > common "mistake".
>
> I'd like to second this feature request. This default behaviour is
> unexpected and often leads to r scripts that were written on
> mac/linux, to produce corrupted files on windows, checksum mismatches,
> etc.
>
> Even for text files, the default should be to download the file as-is.
> Trying to "fix" line-endings should be opt-in, never the default.
> Downloading a file via a browser or ftp client on windows also doesn't
> change the file, why should R?
>

I third the feature request.


>
>
> On Thu, May 3, 2018 at 3:02 PM, Duncan Murdoch 
> wrote:
> > Many downloads are text files (HTML, CSV, etc.), and if those are
> downloaded
> > in binary, a Windows user might end up with a file that Notepad can't
> > handle, because it would have Unix-style line endings.
>
> True but I don't think this is relevant. The same holds e.g. for the R
> files in source packages, which also have unix line endings. Most
> Windows users will use an actual editor that understands both types of
> line endings, or can convert between the two.
>
> Downloading-file should do just that.
>

Again, I agree. In my (limited) experience the only program that fails to
properly display \n as a line ending is Notepad, and even Notepad can still
open the file. If line-ending conflicts cause bugs, it's almost always a
Unix-like OS struggling with Windows-style endings. I have yet to meet a
case the other way around.

Cheers
Joris


-- 
Joris Meys
Statistical consultant

Department of Data Analysis and Mathematical Modelling
Ghent University
Coupure Links 653, B-9000 Gent (Belgium)


---
Biowiskundedagen 2017-2018
http://www.biowiskundedagen.ugent.be/

---
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-03 Thread Jeroen Ooms
On Thu, May 3, 2018 at 2:42 PM, Henrik Bengtsson
 wrote:
> Use mode="wb" when you download the file. See
> https://github.com/HenrikBengtsson/Wishlist-for-R/issues/30.
>
> R core, and others, is there a good argument for why we are not making this
> the default download mode? It seems like a such a simple fix to such a
> common "mistake".

I'd like to second this feature request. This default behaviour is
unexpected and often causes R scripts written on macOS/Linux to produce
corrupted files on Windows, checksum mismatches, etc.

Even for text files, the default should be to download the file as-is.
Trying to "fix" line endings should be opt-in, never the default.
Downloading a file via a browser or FTP client on Windows doesn't
change the file either, so why should R?
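
As a rough illustration of the checksum mismatch, the sketch below writes a
small made-up payload twice, once unchanged and once with every LF replaced
by CR LF, and compares MD5 sums using tools::md5sum(). The byte values and
variable names are invented for the example.

```r
# The same payload, unchanged and with LF -> CR LF applied, as a text-mode
# write on Windows would produce.
payload   <- as.raw(c(0x1f, 0x8b, 0x08, 0x0a, 0x00, 0x0a))
corrupted <- as.raw(c(0x1f, 0x8b, 0x08, 0x0d, 0x0a, 0x00, 0x0d, 0x0a))

good_file <- tempfile()
bad_file  <- tempfile()
writeBin(payload, good_file)
writeBin(corrupted, bad_file)

expected <- unname(tools::md5sum(good_file))  # stands in for a published checksum
ok_good  <- identical(unname(tools::md5sum(good_file)), expected)
ok_bad   <- identical(unname(tools::md5sum(bad_file)),  expected)
```

Here ok_good is TRUE and ok_bad is FALSE: a script that verifies downloads
against a published checksum passes on macOS/Linux and fails on Windows
without any change to the script itself.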


On Thu, May 3, 2018 at 3:02 PM, Duncan Murdoch  wrote:
> Many downloads are text files (HTML, CSV, etc.), and if those are downloaded
> in binary, a Windows user might end up with a file that Notepad can't
> handle, because it would have Unix-style line endings.

True, but I don't think this is relevant. The same holds e.g. for the R
files in source packages, which also have Unix line endings. Most
Windows users will use an actual editor that understands both types of
line endings, or can convert between the two.

download.file() should do just that: download the file.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-03 Thread Martin Morgan


On 05/03/2018 05:48 AM, Joris Meys wrote:

Dear all,

I've been diving a bit deeper into this at the request of Tomas Kalibera, and
found the following:

- The lock on the file only appears after trying to read it using oligo, so
that's not an R problem in itself. The download problem is independent of
external packages.

- using Windows' fc utility and cygwin's cmp utility I found out that every
so often the download.file() function inserts an extra byte. There's no
real obvious pattern in how these bytes are added, but the file downloaded
using download.file() is actually larger (in this case by about 8 kb). The
file xxx_inR.CEL.gz is read in using:


I believe the difference between mode = "w" and "wb", and the reason this is
restricted to Windows downloads, comes down to text-file line endings: with
mode = "w", download.file (like many other utilities outside R) writes each
"\n" as "\r\n", so "foo\n" ends up on disk as "foo\r\n". Obviously this
messes up binary files.


I guess the CEL.gz file contains about 8k "\n" bytes, one for each extra
byte in the file written by download.file().
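
A small sketch of that arithmetic, assuming the conversion is exactly
"insert a CR before every LF byte". The function simulate_text_mode() is
made up for illustration and works on a raw vector rather than on a real
download.

```r
# Simulate a text-mode write on Windows: every LF (0x0a) becomes CR LF (0x0d 0x0a).
simulate_text_mode <- function(bytes) {
  stopifnot(is.raw(bytes))
  is_lf <- bytes == as.raw(0x0a)
  out <- raw(length(bytes) + sum(is_lf))
  # Output position of each input byte, shifted by the CRs inserted so far.
  pos <- seq_along(bytes) + cumsum(is_lf)
  out[pos] <- bytes
  out[pos[is_lf] - 1L] <- as.raw(0x0d)
  out
}

payload   <- as.raw(c(0x1f, 0x8b, 0x0a, 0x00, 0x0a, 0xff))
converted <- simulate_text_mode(payload)
growth    <- length(converted) - length(payload)   # one byte per LF
```

The growth equals the number of 0x0a bytes in the input. Applied to the sizes
reported in this thread, 4,537,668 - 4,529,547 = 8121 extra bytes would
correspond to 8121 LF bytes in the original file.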

Henrik's suggestion (default = "wb") would introduce the complementary 
problem -- text files would have incorrect line endings.


Martin





setwd("E:/Temp/genexpr/Compare")
id <- "GSM907854"
flink <- paste0("https://www.ncbi.nlm.nih.gov/geo/download/?acc=GSM907854&format=file&file=GSM907854%2ECEL%2Egz")
fname <- paste0(id,"_inR.CEL.gz")
download.file(flink,
   destfile = fname)

The file xxx_direct.CEL.gz is downloaded from
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM907854 (download link
at the bottom of the page).

Output of dir in CMD:

05/03/2018  11:02 AM 4,529,547 GSM907854_direct.CEL.gz
05/03/2018  11:17 AM 4,537,668 GSM907854_inR.CEL.gz

or from R :


diff(file.size(dir())) # contains both CEL files.

[1] 8121

Strangely enough I get the following message from download.file() :

Content type 'application/octet-stream' length 4529547 bytes (4.3 MB)
downloaded 4.3 MB

So the reported length is exactly the same as if I would download the file
directly, but the file on disk itself is larger. So it seems
download.file() is adding bytes when saving the data on disk.  This
behaviour is independent of antivirus and/or firewalls turned on or off.

Also keep in mind that these are NOT standard gzipped files. These files
are a specific format for Affymetrix Human Gene 1.0 ST Arrays.

If I need to run other tests, please let me know.
Kind regards

Joris

On Wed, May 2, 2018 at 9:21 PM, Joris Meys  wrote:


Dear all,

I've noticed by trying to download gz files from here :
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM907811

At the bottom one can download GSM907811.CEL.gz . If I download this
manually and try

oligo::read.celfiles("GSM907811.CEL.gz")

everything works fine. (oligo is a bioConductor package)

However, if I download using

download.file("https://www.ncbi.nlm.nih.gov/geo/download/
?acc=GSM907811=file=GSM907811%2ECEL%2Egz",
   destfile = "GSM907811.CEL.gz")

The file is downloaded, but oligo::read.celfiles() returns the following
error:

Error in checkChipTypes(filenames, verbose, "affymetrix", TRUE) :
   End of gz file reached unexpectedly. Perhaps this file is truncated.

Moreover, if I try to delete it after using download.file(), I get a
warning that permission is denied. I can only remove it using Windows file
explorer after I closed the R session, indicating that the connection is
still open. Yet, showConnections() doesn't show any open connections either.

Session info below. Note that I started from a completely fresh R session.
oligo is needed due to the specific file format of these gz files. They're
not standard tarred files.

Cheers
Joris

Session Info

-

R version 3.5.0 (2018-04-23)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United
Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C

[5] LC_TIME=English_United Kingdom.1252

attached base packages:
[1] stats4parallel  stats graphics  grDevices utils datasets
methods
[9] base

other attached packages:
  [1] pd.hugene.1.0.st.v1_3.14.1 DBI_0.8
oligo_1.44.0
  [4] Biobase_2.39.2 oligoClasses_1.42.0
RSQLite_2.1.0
  [7] Biostrings_2.48.0  XVector_0.19.9
IRanges_2.13.28
[10] S4Vectors_0.17.42  BiocGenerics_0.25.3

loaded via a namespace (and not attached):
  [1] Rcpp_0.12.16compiler_3.5.0
  [3] BiocInstaller_1.30.0GenomeInfoDb_1.15.5
  [5] bitops_1.0-6iterators_1.0.9
  [7] tools_3.5.0 zlibbioc_1.25.0
  [9] digest_0.6.15   bit_1.1-12
[11] memoise_1.1.0   preprocessCore_1.41.0
[13] lattice_0.20-35 ff_2.2-13
[15] pkgconfig_2.0.1 Matrix_1.2-14
[17] foreach_1.4.4   DelayedArray_0.5.31

Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-03 Thread Duncan Murdoch

On 03/05/2018 8:42 AM, Henrik Bengtsson wrote:

Use mode="wb" when you download the file. See
https://github.com/HenrikBengtsson/Wishlist-for-R/issues/30.

R core, and others, is there a good argument for why we are not making this
the default download mode? It seems like a such a simple fix to such a
common "mistake".


Many downloads are text files (HTML, CSV, etc.), and if those are 
downloaded in binary, a Windows user might end up with a file that 
Notepad can't handle, because it would have Unix-style line endings.
(It's possible Notepad no longer requires CR LF endings; I haven't used 
it in years.  But there are probably other brain-dead Windows programs 
that do.)
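
For the case where a program really does need CR LF endings, one option is to
download the file unchanged and convert it explicitly afterwards. The sketch
below shows such an opt-in conversion; to_crlf() is a hypothetical helper,
not an existing R function, and it assumes the file is small enough to read
into memory.

```r
# Hypothetical helper: rewrite a text file in place with CR LF line endings.
to_crlf <- function(path) {
  lines <- readLines(path, warn = FALSE)  # accepts LF, CR LF, or mixed endings
  con <- file(path, open = "wb")          # binary mode: bytes are written as given
  on.exit(close(con))
  writeLines(lines, con, sep = "\r\n", useBytes = TRUE)
  invisible(path)
}

txt <- tempfile(fileext = ".csv")
writeBin(charToRaw("x,y\n1,2\n"), txt)    # a small file with Unix line endings
to_crlf(txt)
result <- readBin(txt, what = "raw", n = file.size(txt))
```

Because the output connection is opened in binary mode, the result is the
same on every platform, and binary downloads are never touched.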


Duncan Murdoch




Henrik

On Thu, May 3, 2018, 00:44 Joris Meys  wrote:


Dear all,

I've noticed by trying to download gz files from here :
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM907811

At the bottom one can download GSM907811.CEL.gz . If I download this
manually and try

oligo::read.celfiles("GSM907811.CEL.gz")

everything works fine. (oligo is a bioConductor package)

However, if I download using

download.file("

https://www.ncbi.nlm.nih.gov/geo/download/?acc=GSM907811=file=GSM907811%2ECEL%2Egz
",
   destfile = "GSM907811.CEL.gz")

The file is downloaded, but oligo::read.celfiles() returns the following
error:

Error in checkChipTypes(filenames, verbose, "affymetrix", TRUE) :
   End of gz file reached unexpectedly. Perhaps this file is truncated.

Moreover, if I try to delete it after using download.file(), I get a
warning that permission is denied. I can only remove it using Windows file
explorer after I closed the R session, indicating that the connection is
still open. Yet, showConnections() doesn't show any open connections
either.

Session info below. Note that I started from a completely fresh R session.
oligo is needed due to the specific file format of these gz files. They're
not standard tarred files.

Cheers
Joris

Session Info

-

R version 3.5.0 (2018-04-23)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods
[9] base

other attached packages:
 [1] pd.hugene.1.0.st.v1_3.14.1 DBI_0.8                    oligo_1.44.0
 [4] Biobase_2.39.2             oligoClasses_1.42.0        RSQLite_2.1.0
 [7] Biostrings_2.48.0          XVector_0.19.9             IRanges_2.13.28
[10] S4Vectors_0.17.42          BiocGenerics_0.25.3

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.16                compiler_3.5.0
 [3] BiocInstaller_1.30.0        GenomeInfoDb_1.15.5
 [5] bitops_1.0-6                iterators_1.0.9
 [7] tools_3.5.0                 zlibbioc_1.25.0
 [9] digest_0.6.15               bit_1.1-12
[11] memoise_1.1.0               preprocessCore_1.41.0
[13] lattice_0.20-35             ff_2.2-13
[15] pkgconfig_2.0.1             Matrix_1.2-14
[17] foreach_1.4.4               DelayedArray_0.5.31
[19] yaml_2.1.18                 GenomeInfoDbData_1.1.0
[21] affxparser_1.52.0           bit64_0.9-7
[23] grid_3.5.0                  BiocParallel_1.13.3
[25] blob_1.1.1                  codetools_0.2-15
[27] matrixStats_0.53.1          GenomicRanges_1.31.23
[29] splines_3.5.0               SummarizedExperiment_1.9.17
[31] RCurl_1.95-4.10             affyio_1.49.2



__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel





Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-03 Thread Joris Meys
Dear all,

I've been diving a bit deeper into this at the request of Tomas Kalibera, and
found the following:

- The lock on the file only appears after trying to read it using oligo, so
that's not an R problem in itself. The download problem is independent of
external packages.

- Using Windows' fc utility and Cygwin's cmp utility I found out that every
so often the download.file() function inserts an extra byte. There's no
obvious pattern in how these bytes are added, but the file downloaded
using download.file() is actually larger (in this case by about 8 kB). The
file xxx_inR.CEL.gz was downloaded using:

setwd("E:/Temp/genexpr/Compare")
id <- "GSM907854"
flink <- paste0("https://www.ncbi.nlm.nih.gov/geo/download/?acc=GSM907854&format=file&file=GSM907854%2ECEL%2Egz")
fname <- paste0(id,"_inR.CEL.gz")
download.file(flink,
  destfile = fname)

The file xxx_direct.CEL.gz is downloaded from
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM907854 (download link
at the bottom of the page).

Output of dir in CMD:

05/03/2018  11:02 AM 4,529,547 GSM907854_direct.CEL.gz
05/03/2018  11:17 AM 4,537,668 GSM907854_inR.CEL.gz

or from R :

> diff(file.size(dir())) # contains both CEL files.
[1] 8121

Strangely enough I get the following message from download.file() :

Content type 'application/octet-stream' length 4529547 bytes (4.3 MB)
downloaded 4.3 MB

So the reported length is exactly the same as when I download the file
directly, but the file on disk is larger. It seems download.file() is
adding bytes when saving the data to disk. This behaviour is independent
of whether antivirus and/or firewalls are turned on or off.
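
One way to test whether the extra bytes are line-ending conversions is to
strip every CR that directly precedes an LF from the larger file and compare
the result with the smaller one. The sketch below does that; the helper
names are invented for this example.

```r
# Read a whole file as raw bytes.
read_bytes <- function(path) readBin(path, what = "raw", n = file.size(path))

# Drop every CR that is immediately followed by LF, undoing an LF -> CR LF rewrite.
strip_cr_before_lf <- function(bytes) {
  n <- length(bytes)
  if (n < 2L) return(bytes)
  cr_then_lf <- c(bytes[-n] == as.raw(0x0d) & bytes[-1L] == as.raw(0x0a), FALSE)
  bytes[!cr_then_lf]
}

# TRUE when removing such CRs from `converted` reproduces `original` exactly.
differs_only_by_crlf <- function(original, converted) {
  identical(strip_cr_before_lf(read_bytes(converted)), read_bytes(original))
}
```

With the two files from this message, the call would be
differs_only_by_crlf("GSM907854_direct.CEL.gz", "GSM907854_inR.CEL.gz");
a TRUE result would point to text-mode conversion rather than truncation.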

Also keep in mind that these are NOT standard gzipped files. These files
are a specific format for Affymetrix Human Gene 1.0 ST Arrays.

If I need to run other tests, please let me know.
Kind regards

Joris

On Wed, May 2, 2018 at 9:21 PM, Joris Meys  wrote:

> Dear all,
>
> I've noticed by trying to download gz files from here :
> https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM907811
>
> At the bottom one can download GSM907811.CEL.gz . If I download this
> manually and try
>
> oligo::read.celfiles("GSM907811.CEL.gz")
>
> everything works fine. (oligo is a bioConductor package)
>
> However, if I download using
>
> download.file("https://www.ncbi.nlm.nih.gov/geo/download/
> ?acc=GSM907811=file=GSM907811%2ECEL%2Egz",
>   destfile = "GSM907811.CEL.gz")
>
> The file is downloaded, but oligo::read.celfiles() returns the following
> error:
>
> Error in checkChipTypes(filenames, verbose, "affymetrix", TRUE) :
>   End of gz file reached unexpectedly. Perhaps this file is truncated.
>
> Moreover, if I try to delete it after using download.file(), I get a
> warning that permission is denied. I can only remove it using Windows file
> explorer after I closed the R session, indicating that the connection is
> still open. Yet, showConnections() doesn't show any open connections either.
>
> Session info below. Note that I started from a completely fresh R session.
> oligo is needed due to the specific file format of these gz files. They're
> not standard tarred files.
>
> Cheers
> Joris
>
> Session Info
> 
> -
>
> R version 3.5.0 (2018-04-23)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
> Running under: Windows >= 8 x64 (build 9200)
>
> Matrix products: default
>
> locale:
> [1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United
> Kingdom.1252
> [3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
>
> [5] LC_TIME=English_United Kingdom.1252
>
> attached base packages:
> [1] stats4parallel  stats graphics  grDevices utils datasets
> methods
> [9] base
>
> other attached packages:
>  [1] pd.hugene.1.0.st.v1_3.14.1 DBI_0.8
> oligo_1.44.0
>  [4] Biobase_2.39.2 oligoClasses_1.42.0
> RSQLite_2.1.0
>  [7] Biostrings_2.48.0  XVector_0.19.9
> IRanges_2.13.28
> [10] S4Vectors_0.17.42  BiocGenerics_0.25.3
>
> loaded via a namespace (and not attached):
>  [1] Rcpp_0.12.16compiler_3.5.0
>  [3] BiocInstaller_1.30.0GenomeInfoDb_1.15.5
>  [5] bitops_1.0-6iterators_1.0.9
>  [7] tools_3.5.0 zlibbioc_1.25.0
>  [9] digest_0.6.15   bit_1.1-12
> [11] memoise_1.1.0   preprocessCore_1.41.0
> [13] lattice_0.20-35 ff_2.2-13
> [15] pkgconfig_2.0.1 Matrix_1.2-14
> [17] foreach_1.4.4   DelayedArray_0.5.31
> [19] yaml_2.1.18 GenomeInfoDbData_1.1.0
> [21] affxparser_1.52.0   bit64_0.9-7
> [23] grid_3.5.0  BiocParallel_1.13.3
> [25] blob_1.1.1  codetools_0.2-15
> [27] matrixStats_0.53.1  GenomicRanges_1.31.23
> [29] splines_3.5.0   SummarizedExperiment_1.9.17
> [31] RCurl_1.95-4.10 affyio_1.49.2
>
>
> --
> Joris Meys
> Statistical consultant
>
> 

Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-03 Thread Martin Morgan



On 05/02/2018 03:21 PM, Joris Meys wrote:

Dear all,

I've noticed by trying to download gz files from here :
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM907811

At the bottom one can download GSM907811.CEL.gz . If I download this
manually and try

oligo::read.celfiles("GSM907811.CEL.gz")

everything works fine. (oligo is a bioConductor package)

However, if I download using

download.file("https://www.ncbi.nlm.nih.gov/geo/download/?acc=GSM907811&format=file&file=GSM907811%2ECEL%2Egz",
              destfile = "GSM907811.CEL.gz")


On Windows, the 'mode' argument to download.file() needs to be "wb"
(write binary) for binary files.
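
A minimal sketch of what that looks like in practice. The wrapper
download_binary() is a hypothetical convenience function, not part of R; it
simply forwards to utils::download.file() with mode = "wb" so that the bytes
are written unchanged on every platform.

```r
# Hypothetical wrapper: always write the downloaded bytes unchanged.
download_binary <- function(url, destfile, ...) {
  utils::download.file(url, destfile = destfile, mode = "wb", ...)
}

# Usage for the file discussed here (cel_url stands for the GEO download link):
# download_binary(cel_url, destfile = "GSM907811.CEL.gz")
```

On Linux and macOS the mode argument makes no practical difference, so
passing mode = "wb" explicitly keeps a script portable at no cost.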


Martin



The file is downloaded, but oligo::read.celfiles() returns the following
error:

Error in checkChipTypes(filenames, verbose, "affymetrix", TRUE) :
   End of gz file reached unexpectedly. Perhaps this file is truncated.

Moreover, if I try to delete it after using download.file(), I get a
warning that permission is denied. I can only remove it using Windows file
explorer after I closed the R session, indicating that the connection is
still open. Yet, showConnections() doesn't show any open connections either.

Session info below. Note that I started from a completely fresh R session.
oligo is needed due to the specific file format of these gz files. They're
not standard tarred files.

Cheers
Joris

Session Info
-

R version 3.5.0 (2018-04-23)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods
[9] base

other attached packages:
 [1] pd.hugene.1.0.st.v1_3.14.1 DBI_0.8                    oligo_1.44.0
 [4] Biobase_2.39.2             oligoClasses_1.42.0        RSQLite_2.1.0
 [7] Biostrings_2.48.0          XVector_0.19.9             IRanges_2.13.28
[10] S4Vectors_0.17.42          BiocGenerics_0.25.3

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.16                compiler_3.5.0
 [3] BiocInstaller_1.30.0        GenomeInfoDb_1.15.5
 [5] bitops_1.0-6                iterators_1.0.9
 [7] tools_3.5.0                 zlibbioc_1.25.0
 [9] digest_0.6.15               bit_1.1-12
[11] memoise_1.1.0               preprocessCore_1.41.0
[13] lattice_0.20-35             ff_2.2-13
[15] pkgconfig_2.0.1             Matrix_1.2-14
[17] foreach_1.4.4               DelayedArray_0.5.31
[19] yaml_2.1.18                 GenomeInfoDbData_1.1.0
[21] affxparser_1.52.0           bit64_0.9-7
[23] grid_3.5.0                  BiocParallel_1.13.3
[25] blob_1.1.1                  codetools_0.2-15
[27] matrixStats_0.53.1          GenomicRanges_1.31.23
[29] splines_3.5.0               SummarizedExperiment_1.9.17
[31] RCurl_1.95-4.10             affyio_1.49.2






__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-03 Thread Joris Meys
Using the correct mode absolutely solves it. Apologies for not trying the
obvious.

Cheers
Joris

On Thu, May 3, 2018 at 2:10 PM, Martin Morgan  wrote:

>
>
> On 05/02/2018 03:21 PM, Joris Meys wrote:
>
>> Dear all,
>>
>> I've noticed by trying to download gz files from here :
>> https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM907811
>>
>> At the bottom one can download GSM907811.CEL.gz . If I download this
>> manually and try
>>
>> oligo::read.celfiles("GSM907811.CEL.gz")
>>
>> everything works fine. (oligo is a bioConductor package)
>>
>> However, if I download using
>>
>> download.file("
>> https://www.ncbi.nlm.nih.gov/geo/download/?acc=GSM907811
>> mat=file=GSM907811%2ECEL%2Egz
>> ",
>>destfile = "GSM907811.CEL.gz")
>>
>
> On windows, the 'mode' argument to download.file() needs to be "wb" (write
> binary) for binary files.
>
> Martin
>
>
>> The file is downloaded, but oligo::read.celfiles() returns the following
>> error:
>>
>> Error in checkChipTypes(filenames, verbose, "affymetrix", TRUE) :
>>End of gz file reached unexpectedly. Perhaps this file is truncated.
>>
>> Moreover, if I try to delete it after using download.file(), I get a
>> warning that permission is denied. I can only remove it using Windows file
>> explorer after I closed the R session, indicating that the connection is
>> still open. Yet, showConnections() doesn't show any open connections
>> either.
>>
>> Session info below. Note that I started from a completely fresh R session.
>> oligo is needed due to the specific file format of these gz files. They're
>> not standard tarred files.
>>
>> Cheers
>> Joris
>>


Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-03 Thread Henrik Bengtsson
Use mode="wb" when you download the file. See
https://github.com/HenrikBengtsson/Wishlist-for-R/issues/30.

R core and others: is there a good argument for not making this the
default download mode? It seems like such a simple fix to such a
common "mistake".
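
A minimal sketch of the suggestion above, assuming a small wrapper that always passes mode = "wb" to download.file(). The name download_binary is hypothetical and not part of base R. The demo copies a local file through a file:// URL so that it runs offline; for the case in this thread one would pass the GEO download URL instead.

```r
# Hypothetical helper, not part of base R: always download in binary mode.
download_binary <- function(url, destfile, ...) {
  # mode = "wb" writes the bytes unchanged; on Windows a text-mode write
  # would turn LF into CRLF and corrupt binary files
  utils::download.file(url, destfile = destfile, mode = "wb", ...)
}

# Offline demo: fetch a small gz file through a file:// URL.
src <- tempfile(fileext = ".gz")
con <- gzfile(src, "wb")
writeLines(c("line 1", "line 2"), con)
close(con)

prefix <- if (.Platform$OS.type == "windows") "file:///" else "file://"
dest <- tempfile(fileext = ".gz")
status <- download_binary(paste0(prefix, normalizePath(src, winslash = "/")),
                          dest, quiet = TRUE)

# Compare source and copy byte for byte.
src_bytes <- readBin(src, "raw", n = file.size(src))
dest_bytes <- readBin(dest, "raw", n = file.size(dest))
identical(src_bytes, dest_bytes)
```

On Unix-alikes text and binary writes behave the same, so the wrapper only changes the outcome on Windows; it is harmless elsewhere.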

Henrik

On Thu, May 3, 2018, 00:44 Joris Meys  wrote:

> Dear all,
>
> I ran into a problem when trying to download gz files from here:
> https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM907811
>
> At the bottom one can download GSM907811.CEL.gz. If I download this
> manually and try
>
> oligo::read.celfiles("GSM907811.CEL.gz")
>
> everything works fine. (oligo is a Bioconductor package.)
>
> However, if I download using
>
> download.file(
>   "https://www.ncbi.nlm.nih.gov/geo/download/?acc=GSM907811&format=file&file=GSM907811%2ECEL%2Egz",
>   destfile = "GSM907811.CEL.gz")
>
> The file is downloaded, but oligo::read.celfiles() returns the following
> error:
>
> Error in checkChipTypes(filenames, verbose, "affymetrix", TRUE) :
>   End of gz file reached unexpectedly. Perhaps this file is truncated.
>
> Moreover, if I try to delete it after using download.file(), I get a
> warning that permission is denied. I can only remove it using Windows file
> explorer after I closed the R session, indicating that the connection is
> still open. Yet, showConnections() doesn't show any open connections
> either.
>
> Session info below. Note that I started from a completely fresh R session.
> oligo is needed due to the specific file format of these gz files. They're
> not standard tarred files.
>
> Cheers
> Joris
>


[Rd] download.file does not process gz files correctly (truncates them?)

2018-05-03 Thread Joris Meys
Dear all,

I ran into a problem when trying to download gz files from here:
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM907811

At the bottom one can download GSM907811.CEL.gz. If I download this
manually and try

oligo::read.celfiles("GSM907811.CEL.gz")

everything works fine. (oligo is a Bioconductor package.)

However, if I download using

download.file(
  "https://www.ncbi.nlm.nih.gov/geo/download/?acc=GSM907811&format=file&file=GSM907811%2ECEL%2Egz",
  destfile = "GSM907811.CEL.gz")

The file is downloaded, but oligo::read.celfiles() returns the following
error:

Error in checkChipTypes(filenames, verbose, "affymetrix", TRUE) :
  End of gz file reached unexpectedly. Perhaps this file is truncated.

Moreover, if I try to delete it after using download.file(), I get a
warning that permission is denied. I can only remove it using Windows file
explorer after I closed the R session, indicating that the connection is
still open. Yet, showConnections() doesn't show any open connections either.
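
One way to narrow this down is to compare the manually downloaded copy with the one written by download.file(), byte for byte. The following is a minimal sketch, assuming a hypothetical helper same_bytes() that is not part of any package. The demo does not touch the network: it uses a few stand-in bytes (not a valid gz stream) and emulates the change a text-mode write on Windows makes, namely expanding each LF byte to CR LF.

```r
# Hypothetical helper: TRUE if two files have the same size and MD5 checksum.
same_bytes <- function(a, b) {
  file.size(a) == file.size(b) &&
    unname(tools::md5sum(a)) == unname(tools::md5sum(b))
}

# Stand-in binary content: gzip magic bytes followed by arbitrary bytes,
# including two LF bytes (0x0a). This is not a valid gz stream.
bytes <- as.raw(c(0x1f, 0x8b, 0x08, 0x0a, 0x42, 0x0a, 0x00))
good <- tempfile()
writeBin(bytes, good)

# Emulate a text-mode write: every LF becomes CR LF.
mangled <- unlist(lapply(bytes, function(b) {
  if (b == as.raw(0x0a)) as.raw(c(0x0d, 0x0a)) else b
}))
bad <- tempfile()
writeBin(mangled, bad)

same_bytes(good, good)            # TRUE
same_bytes(good, bad)             # FALSE
file.size(bad) - file.size(good)  # 2, one extra byte per LF
```

With the real files, same_bytes() would take the path of the manual download and the path passed as destfile; a size difference already shows that the two copies are not the same file.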

Session info below. Note that I started from a completely fresh R session.
oligo is needed due to the specific file format of these gz files. They're
not standard tarred files.

Cheers
Joris

Session Info
-

R version 3.5.0 (2018-04-23)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods
[9] base

other attached packages:
 [1] pd.hugene.1.0.st.v1_3.14.1 DBI_0.8                    oligo_1.44.0
 [4] Biobase_2.39.2             oligoClasses_1.42.0        RSQLite_2.1.0
 [7] Biostrings_2.48.0          XVector_0.19.9             IRanges_2.13.28
[10] S4Vectors_0.17.42          BiocGenerics_0.25.3

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.16                compiler_3.5.0
 [3] BiocInstaller_1.30.0        GenomeInfoDb_1.15.5
 [5] bitops_1.0-6                iterators_1.0.9
 [7] tools_3.5.0                 zlibbioc_1.25.0
 [9] digest_0.6.15               bit_1.1-12
[11] memoise_1.1.0               preprocessCore_1.41.0
[13] lattice_0.20-35             ff_2.2-13
[15] pkgconfig_2.0.1             Matrix_1.2-14
[17] foreach_1.4.4               DelayedArray_0.5.31
[19] yaml_2.1.18                 GenomeInfoDbData_1.1.0
[21] affxparser_1.52.0           bit64_0.9-7
[23] grid_3.5.0                  BiocParallel_1.13.3
[25] blob_1.1.1                  codetools_0.2-15
[27] matrixStats_0.53.1          GenomicRanges_1.31.23
[29] splines_3.5.0               SummarizedExperiment_1.9.17
[31] RCurl_1.95-4.10             affyio_1.49.2

