On 11-04-12 07:06 PM, Simon Urbanek wrote:

On Apr 12, 2011, at 8:53 PM, Hervé Pagès wrote:

Hi Uwe,

On 11-04-11 08:13 AM, Uwe Ligges wrote:


On 11.04.2011 02:47, Hervé Pagès wrote:
Hi,

More about the new --resave-data option

As mentioned previously here

https://stat.ethz.ch/pipermail/r-devel/2011-April/060511.html

'R CMD build' and 'R CMD INSTALL' handle this new option
inconsistently. The former does --resave-data="gzip" by default.
The latter doesn't seem to support the --resave-data= syntax:
the --resave-data flag must either be present or not. And by
default 'R CMD INSTALL' won't resave the data.

Also, because now 'R CMD build' is resaving the data, shouldn't it
reinstall the package in order to be able to do this correctly?

Here is why. There is this new warning in 'R CMD check' that complains
about files not of a type allowed in a 'data' directory:


http://bioconductor.org/checkResults/2.8/bioc-LATEST/Icens/lamb1-checksrc.html



The Icens package also has .R files under data/ with things like:

bet<- matrix(scan("CMVdata", quiet=TRUE),nc=5,byr=TRUE)

i.e. the R code needs to access some of the text files located
in the data/ folder. So in order to get rid of this warning I
tried to move those text files to inst/extdata/ and I modified
the code in the .R file so it does:

CMVdata_filepath<- system.file("extdata", "CMVdata", package="Icens")
bet<- matrix(scan(CMVdata_filepath, quiet=TRUE),nc=5,byr=TRUE)

But now 'R CMD build' fails to resave the data because the package
was not installed first and the CMVdata file could not be found.

Unfortunately, for a lot of people that means that the safe way to
build a source tarball now is with

R CMD build --keep-empty-dirs --no-resave-data


Hervé,

actually it makes some sense to have these defaults from a CRAN
maintainer's point of view:

--keep-empty-dirs:
we found many packages containing empty dirs unnecessarily and the idea
is to exclude them at the build stage rather than at the later
installation stage. Note that the package maintainer is supposed to run
build (and knows if the empty dirs are to be included, the user who runs
INSTALL does not).
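
[Editor's sketch] A maintainer who wants to see which directories would be affected before running build could list the empty ones first. This is only an illustration: list_empty_dirs is a hypothetical helper name, and it assumes a find that supports -empty (GNU and BSD find do; it is not required by POSIX). It does not claim to reproduce the exact rule 'R CMD build' applies.

```shell
#!/bin/sh
# Hypothetical helper: print every empty directory below a package source tree.
# Assumes a find implementation with -empty (GNU or BSD find).
list_empty_dirs() {
    # -type d restricts the match to directories, -empty to those with no entries.
    find "$1" -type d -empty | sort
}
```

Usage: list_empty_dirs path/to/pkg prints one path per line, and prints nothing when the tree has no empty directories.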

--no-resave-data:
We found many packages with insufficiently compressed data. This should
be fixed when building the package, not later when installing it, since
the reduced size is useful in the source tarball already.

So it does make some sense to have different defaults in build as
opposed to INSTALL from my point of view (although I could live with
different ones, though).

If you deliberately ignore the fact that 'R CMD INSTALL' is also used
by developers to install from the *package source tree* (as opposed
to end users who use it to install from a *source tarball*,

.. for a good reason, IMHO no serious developer would do that for obvious 
reasons -

This sounds like saying that no serious developer working on a big
project involving a lot of files to compile should use 'make'.
I mean, serious developers like you *always* do 'make clean' before
they do 'make' on the R tree when they need to test a change, even
a small one? And this only takes a "fraction of second" for them?
Hey, I'd love to be able to do that too! ;-)

H.

you'd be working on a dirty copy creating many unnecessary problems and 
polluting your sources. The first time you'll spend an hour chasing a 
non-existent problem due to stale binary objects in your tree you'll learn that 
lesson ;). The fraction of a second spent in R CMD build is well worth the 
hours saved. IMHO the only valid reason to run INSTALL on a (freshly unpacked 
tar ball) directory is to capture config.log.
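
[Editor's sketch] The workflow described here, building a tarball and installing that rather than the working tree, might look roughly like the following. build_then_install and the DRY_RUN variable are hypothetical names used only for this illustration. The sketch assumes that DESCRIPTION has plain "Package:" and "Version:" lines without trailing whitespace, and relies on 'R CMD build' naming its output <Package>_<Version>.tar.gz.

```shell
#!/bin/sh
# Hypothetical helper: build a source tarball, then install the tarball
# instead of the source directory. With DRY_RUN set, only print the commands.
build_then_install() {
    src=$1
    # Read the package name and version from DESCRIPTION to predict the
    # tarball name that R CMD build will write to the current directory.
    pkg=$(sed -n 's/^Package:[[:space:]]*//p' "$src/DESCRIPTION")
    ver=$(sed -n 's/^Version:[[:space:]]*//p' "$src/DESCRIPTION")
    tarball="${pkg}_${ver}.tar.gz"
    if [ -n "$DRY_RUN" ]; then
        echo "R CMD build $src"
        echo "R CMD INSTALL $tarball"
    else
        # Install only if the build step succeeded.
        R CMD build "$src" && R CMD INSTALL "$tarball"
    fi
}
```

Usage: build_then_install path/to/pkg. Because the install step reads the tarball, objects left over in the working tree from earlier compilations are not picked up unless they were packed into the tarball.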

Cheers,
Simon



even though
they don't use it directly), then you have a point. So maybe I should
have been more explicit about the problem that it can be for the
*developer* to have 'R CMD build' and 'R CMD INSTALL' behave
differently by default.

Of course I'm not suggesting that 'R CMD INSTALL' should behave
differently (by default) depending on whether it's used on a source
tarball (mode 1) or a package source tree (mode 2).

I'm suggesting that, by default, the 3 commands (R CMD build +
R CMD INSTALL in mode 1 and 2) behave consistently.

With the latest changes, and by default, 'R CMD INSTALL' is still doing
the right thing, but not 'R CMD build' anymore.

I perfectly understand the intention behind those new flags, which is
to try to "optimize" the resulting source tarball but what would you
think if 'gcc' had some optimization flags that can generate broken
executables (under some circumstances) and if these flags were enabled
by default?

Note that I would have no problem with 'R CMD build' trying to resave
the data by default if the current implementation of that feature
was working properly, but unfortunately it's broken (see my previous
email for the details).

Thanks,
H.


If you need further arguments for the discussion: I also tend to use
--no-vignettes nowadays if my code does not change considerably. ;-)

Best wishes,
Uwe



I hope the list of options/flags that we need to use to "fix" 'R CMD
build' (and make it consistent with R CMD INSTALL) is not going to
grow too much ;-)

Thanks,
H.




--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel




