On 05/09/2013 12:32 PM, Dr Gregory Jefferis wrote:
Dear Duncan,

This certainly looks useful. Might you consider adding the ability to
supply an alternative digest function? Details below.

Thanks, that's a good idea.

Duncan Murdoch

I often use a homemade "make"-type function, which starts by looking at
modification times, e.g. in a private package:

https://github.com/jefferis/nat.utils/blob/master/R/make.r

For some of my work, I use hash functions. However, because I typically
work with many large files, I often use a special digest process, e.g.
using the CRC checksum embedded in a gzip file directly, or hashing only
the part of a large file that is (almost) certain to change.
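
As a rough illustration of the gzip idea (a sketch only, not code from my
package): the function name gzipCRC is made up here, and it assumes
single-member gzip files, whose last 8 bytes hold the CRC-32 of the
uncompressed data followed by its size. It returns a named character vector,
which is the same shape that tools::md5sum returns.

```r
# Hypothetical digest function: read the CRC-32 stored in the gzip trailer
# instead of hashing the whole file. Assumes each file is a single-member
# gzip file (trailer = 4 bytes CRC-32 + 4 bytes size, both little-endian).
gzipCRC <- function(files) {
  vapply(files, function(f) {
    size <- file.info(f)$size
    con <- file(f, "rb")
    on.exit(close(con))
    # Jump to the start of the 8-byte trailer and read the 4 CRC bytes.
    seek(con, where = size - 8, origin = "start")
    crc <- readBin(con, what = "raw", n = 4)
    # Bytes are stored little-endian; reverse them for the usual hex form.
    paste(rev(as.character(crc)), collapse = "")
  }, character(1))
}

# Usage: two files with identical content share a CRC.
f1 <- tempfile(fileext = ".gz")
f2 <- tempfile(fileext = ".gz")
for (f in c(f1, f2)) {
  con <- gzfile(f, "w")
  writeLines("hello", con)
  close(con)
}
crcs <- gzipCRC(c(f1, f2))
```

Only 8 bytes per file are read, so the cost does not grow with file size.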

Perhaps (code unchecked) along the lines of:

changedFiles <- function(snapshot, timestamp = tempfile("timestamp"),
        file.info = NULL,
        digest = FALSE, digestfun = NULL, full.names = FALSE, ...)

if (digest) {
        if (is.null(digestfun)) digestfun <- tools::md5sum
        else digestfun <- match.fun(digestfun)
        info <- data.frame(info, digest = digestfun(fullnames))
}

etc

OR alternatively using only one argument:

changedFiles <- function(snapshot, timestamp = tempfile("timestamp"),
        file.info = NULL,
        digest = FALSE, full.names = FALSE, ...)

if (is.logical(digest)) {
        if (digest) digestfun <- tools::md5sum
} else {
        # Assume that digest specifies a function that we want to use
        digestfun <- match.fun(digest)
        digest <- TRUE
}

if (digest)
        info <- data.frame(info, digest = digestfun(fullnames))

etc
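
To make the one-argument variant concrete, here is a small self-contained
sketch of just the dispatch step. The names resolveDigest and sizeDigest are
invented for illustration and are not part of changedFiles; sizeDigest is
only a cheap stand-in for a real digest function.

```r
# Hypothetical helper: map the 'digest' argument to a function, or to NULL
# when no digest is wanted.
resolveDigest <- function(digest = FALSE) {
  if (is.logical(digest)) {
    if (isTRUE(digest)) tools::md5sum else NULL
  } else {
    # Anything non-logical is assumed to name or be a function.
    match.fun(digest)
  }
}

# Stand-in digest for illustration: file sizes as strings, named by file.
sizeDigest <- function(files) {
  structure(as.character(file.info(files)$size), names = files)
}

f <- tempfile()
writeLines("some text", f)

# Mimic the snapshot step: add a digest column only when one was requested.
info <- data.frame(row.names = basename(f))
digestfun <- resolveDigest(sizeDigest)
if (!is.null(digestfun))
  info <- data.frame(info, digest = digestfun(f))
```

With this arrangement digest = TRUE keeps the md5sum behaviour, and passing
a function swaps in a custom digest without a second argument.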

Many thanks,

Greg.

On 4 Sep 2013, at 18:53, Duncan Murdoch wrote:

> In a number of places internal to R, we need to know which files have
> changed (e.g. after building a vignette).  I've just written a general
> purpose function "changedFiles" that I'll probably commit to R-devel.
> Comments on the design (or bug reports) would be appreciated.
>
> The source for the function and the Rd page for it are inline below.
>
> ----- changedFiles.R:
> changedFiles <- function(snapshot, timestamp = tempfile("timestamp"),
>                          file.info = NULL,
>                          md5sum = FALSE, full.names = FALSE, ...) {
>   dosnapshot <- function(args) {
>     fullnames <- do.call(list.files, c(full.names = TRUE, args))
>     names <- do.call(list.files, c(full.names = full.names, args))
>     if (isTRUE(file.info) || (is.character(file.info) &&
>                               length(file.info))) {
>       info <- file.info(fullnames)
>       rownames(info) <- names
>       if (isTRUE(file.info))
>         file.info <- c("size", "isdir", "mode", "mtime")
>     } else
>       info <- data.frame(row.names = names)
>     if (md5sum)
>       info <- data.frame(info, md5sum = tools::md5sum(fullnames))
>     list(info = info, timestamp = timestamp, file.info = file.info,
>          md5sum = md5sum, full.names = full.names, args = args)


--
Gregory Jefferis, PhD                   Tel: 01223 267048
Division of Neurobiology
MRC Laboratory of Molecular Biology
Francis Crick Avenue
Cambridge Biomedical Campus
Cambridge, CB2 0QH, UK

http://www2.mrc-lmb.cam.ac.uk/group-leaders/h-to-m/g-jefferis
http://jefferislab.org
http://flybrain.stanford.edu

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
