Re: [Rd] tools::md5sum(directory) behavior different on Windows vs. Unix

2013-09-29 Thread Scott Kostyshak
On Mon, Sep 9, 2013 at 3:00 AM, Scott Kostyshak wrote:
> tools::md5sum gives a warning if it receives a directory as an
> argument on Unix but not on Windows.
>
> From what I understand, this happens because in Windows a directory is
> not treated as a file so fopen returns NULL. Then, NA is returned
> without a warning. On Unix, a directory is treated as a file so fopen
> does not return NULL so md5 is run and fails, leading to a warning.
>
> This is a good opportunity for me to understand further (in addition
> to [1] and the many places where OS special cases are mentioned) in
> which cases R tries to behave the same on Windows as on Unix and in
> which cases it allows for differences (in this case, a warning vs. no
> warning). For example, it would be straightforward to create a patch
> that would lead to the same behavior in this case. tools::md5sum could
> either issue a warning for each argument that is a directory or it
> could issue no warning (consistent with file.info). Would either patch
> be considered?

Attached is a patch that gives a warning if an element in the file
argument is not a regular file (e.g. is a directory or does not
exist). In my opinion the advantages of this patch are:

(1) the same warnings are generated on all platforms in the case where
one of the elements is a folder.
(2) a warning is also given if a file does not exist.
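
For anyone who wants this behavior without patching tools, the same check
can be sketched at user level. This is only an illustration, not part of
the attached patch: the wrapper name md5sum_checked is hypothetical, and
it assumes nothing beyond utils::file_test() and tools::md5sum(). Note
that file_test("-f", x) is TRUE for anything that exists and is not a
directory.

```r
# Hypothetical wrapper (not part of the patch): warn about every element
# that is not an existing non-directory, and return NA for it.
md5sum_checked <- function(files) {
    is_regular <- utils::file_test("-f", files)
    if (!all(is_regular))
        warning("The following are not regular files: ",
                paste(shQuote(files[!is_regular]), collapse = " "))
    sums <- rep(NA_character_, length(files))
    names(sums) <- files
    # Only the regular files are handed to tools::md5sum().
    sums[is_regular] <- tools::md5sum(files[is_regular])
    sums
}

f <- tempfile()
writeLines("some text", f)
# tempdir() is a directory, so this call warns and returns NA for it.
md5sum_checked(c(f, tempdir()))
```

Because directories never reach the C code, the warning no longer
depends on how the platform's fopen treats a directory.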

Comments?

Scott

>
> Or is this difference encouraged because the concept of a file is
> different on Unix than on Windows?
>
> Scott
>
> [1] http://cran.r-project.org/bin/windows/base/rw-FAQ.html#What-should-I-expect-to-behave-differently-from-the-Unix-version
>
>
> --
> Scott Kostyshak
> Economics PhD Candidate
> Princeton University
Index: trunk/src/library/tools/R/md5.R
===
--- trunk/src/library/tools/R/md5.R (revision 64011)
+++ trunk/src/library/tools/R/md5.R (working copy)
@@ -17,7 +17,18 @@
 #  http://www.r-project.org/Licenses/
 
 md5sum <- function(files)
-    structure(.Call(Rmd5, files), names=files)
+{
+    reg_ <- file_test("-f", files)
+    regFiles <- files[reg_]
+    notReg <- files[!reg_]
+    if(!all(reg_))
+        warning("The following are not regular files: ",
+                paste(shQuote(notReg), collapse = " "))
+    names(files) <- files
+    files[!reg_] <- NA
+    files[reg_] <- .Call(Rmd5, regFiles)
+    files
+}
 
 .installMD5sums <- function(pkgDir, outDir = pkgDir)
 {
Index: trunk/src/library/tools/man/md5sum.Rd
===
--- trunk/src/library/tools/man/md5sum.Rd   (revision 64011)
+++ trunk/src/library/tools/man/md5sum.Rd   (working copy)
@@ -18,7 +18,8 @@
 \value{
   A character vector of the same length as \code{files}, with names
   equal to \code{files}. The elements
-  will be \code{NA} for non-existent or unreadable files, otherwise
+  will be \code{NA} for non-existent or unreadable files (in which case
+  a warning will be generated), otherwise
   a 32-character string of hexadecimal digits.
 
   On Windows all files are read in binary mode (as the \code{md5sum}
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] how to interpose my own "[" function?

2013-09-29 Thread Andrew Piskorski
I want to create my own "[" function (for use on vectors, matrices,
arrays, etc.), which calls the stock R "[", does some additional work,
and then finally returns the modified result.

But, how do I properly call the stock R "[" function?  It takes a
varying number of positional arguments, and its R-level closure is
just:  .Primitive("[")  It's implemented by do_subset_dflt in
src/main/subset.c.

The only thing I've come up with so far is using eval after crudely
re-constructing the original incoming call (example below), which is
both very slow and gets some of the semantics wrong.

Is there some good way for my function to just say, "take ALL my
incoming arguments, whatever they might be, and pass them as-is to
.Primitive('['), then return control to me here"?  Or said another
way, what I want is to hook the end of the R-level "[" function and do
some extra work there before returning to the user.

Basically, where should I look to figure out how to do this?  Is it
even feasible at all when the call I want to intercept is implemented
in C and is Primitive like "[" is?  Is there some other technique I
should be using instead to accomplish this sort of thing?

Thanks for your help!  Example of awful eval-based code follows:


my.subset <- function(x ,i ,j ,... ,drop=TRUE) {
   brace.fcn <- get("[" ,pos="package:base")
   code <- 'brace.fcn(x,'
   if (!missing(i))    code <- paste(code ,'i' ,sep="")
   # This fails to distinguish between the mat[1:21] and mat[1:21,] cases:
   if (length(dim(x)) > 1 && (missing(i) || length(dim(i)) <= 1))
      code <- paste(code ,',' ,sep="")
   if (!missing(j))    code <- paste(code ,'j' ,sep="")
   if (!missing(...))  code <- paste(code ,',...' ,sep="")
   if (!missing(drop)) code <- paste(code ,',drop=drop' ,sep="")
   code <- paste(code ,')' ,sep="")
   result <- eval(parse(text=code))
   # FINALLY we have the stock result, now modify it some more...
   result
}
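
For comparison, one way to forward the incoming call without rebuilding
it from strings is to take the call as written, swap in the primitive,
and evaluate it in the caller's frame. This is only a sketch under stated
assumptions, not something proposed in the thread: the name
forward.subset is hypothetical, and the arguments are evaluated again in
the caller's frame.

```r
# Hypothetical helper: forward the call exactly as written to the stock "[".
forward.subset <- function(x, ...) {
    cl <- sys.call()               # the call as typed, empty arguments included
    cl[[1L]] <- .Primitive("[")    # replace the function being called
    result <- eval(cl, parent.frame())
    # The stock result is available here; modify it as needed.
    result
}

mat <- matrix(1:6, nrow = 2)
forward.subset(mat, 1:4)                 # same as mat[1:4]
forward.subset(mat, 1, )                 # same as mat[1, ]
forward.subset(mat, , 2, drop = FALSE)   # same as mat[, 2, drop = FALSE]
```

Because sys.call() keeps empty arguments, the mat[1:4] and mat[1, ] cases
stay distinct.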

-- 
Andrew Piskorski 



Re: [Rd] how to interpose my own "[" function?

2013-09-29 Thread Henrik Bengtsson
Typically you would use NextMethod(); otherwise you can either unclass()
your object first or use .subset().  I am not sure from ?.subset whether
the latter is OK to use or not.
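
A minimal sketch of the NextMethod() route, assuming the objects can
carry an S3 class: the class name "counted" is hypothetical and used only
for illustration. NextMethod() hands the caller's arguments, including
empty ones, to the internal default "[", and the method then does its
extra work on the result.

```r
# "counted" is a hypothetical S3 class used only for illustration.
`[.counted` <- function(x, ...) {
    result <- NextMethod()   # internal default "[" with the caller's arguments
    # Extra work after the stock subsetting. Here it just restores the
    # class, which the default "[" drops.
    class(result) <- oldClass(x)
    result
}

v <- structure(1:10, class = "counted")
m <- structure(matrix(1:6, nrow = 2), class = "counted")

v[2:3]
m[1, ]                 # matrix-style indexing, empty second index
m[1:4]                 # vector-style indexing of the same object
m[, 2, drop = FALSE]
```

The unclass() alternative would replace the NextMethod() line with
unclass(x)[...], at the cost of copying x.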

/Henrik

On Sun, Sep 29, 2013 at 8:26 PM, Andrew Piskorski wrote:
> I want to create my own "[" function (for use on vectors, matrices,
> arrays, etc.), which calls the stock R "[", does some additional work,
> and then finally returns the modified result.
>
> But, how do I properly call the stock R "[" function?  It takes a
> varying number of positional arguments, and its R-level closure is
> just:  .Primitive("[")  It's implemented by do_subset_dflt in
> src/main/subset.c.
>
> The only thing I've come up with so far is using eval after crudely
> re-constructing the original incoming call (example below), which is
> both very slow and gets some of the semantics wrong.
>
> Is there some good way for my function to just say, "take ALL my
> incoming arguments, whatever they might be, and pass them as-is to
> .Primitive('['), then return control to me here"?  Or said another
> way, what I want is to hook the end of the R-level "[" function and do
> some extra work there before returning to the user.
>
> Basically, where should I look to figure out how to do this?  Is it
> even feasible at all when the call I want to intercept is implemented
> in C and is Primitive like "[" is?  Is there some other technique I
> should be using instead to accomplish this sort of thing?
>
> Thanks for your help!  Example of awful eval-based code follows:
>
>
> my.subset <- function(x ,i ,j ,... ,drop=TRUE) {
>    brace.fcn <- get("[" ,pos="package:base")
>    code <- 'brace.fcn(x,'
>    if (!missing(i))    code <- paste(code ,'i' ,sep="")
>    # This fails to distinguish between the mat[1:21] and mat[1:21,] cases:
>    if (length(dim(x)) > 1 && (missing(i) || length(dim(i)) <= 1))
>       code <- paste(code ,',' ,sep="")
>    if (!missing(j))    code <- paste(code ,'j' ,sep="")
>    if (!missing(...))  code <- paste(code ,',...' ,sep="")
>    if (!missing(drop)) code <- paste(code ,',drop=drop' ,sep="")
>    code <- paste(code ,')' ,sep="")
>    result <- eval(parse(text=code))
>    # FINALLY we have the stock result, now modify it some more...
>    result
> }
>
> --
> Andrew Piskorski 
