Re: [Rd] unlink() on "~" removes the home directory

2020-02-26 Thread Hervé Pagès

On 2/26/20 14:47, Gábor Csárdi wrote:

!!! DON'T TRY THE CODE IN THIS EMAIL AT HOME !!!


Ok I'll try it at work on my boss's computer, sounds a lot safer.

H.



Well, unlink() does what it is supposed to do, so you could argue that
there is nothing wrong with it. Also, nobody would call unlink() on
"~", right?

The situation is not so simple, however. E.g. if you happen to have a
directory called "~", and you iterate over all files and directories
to selectively remove some of them, then your code might end up
calling unlink on the local "~" directory, and then your home is gone.

But you would not create a directory named "~", that is just asking
for trouble. Well, surely, _intentionally_ you would not do that.
Unintentionally, you might. E.g. something like this is enough:

# Create a subpath within a base directory
badfun <- function(base = ".", path) {
   dir.create(file.path(base, path), recursive = TRUE, showWarnings = FALSE)
}
badfun(path = "~/foo")

(If you did run this, be very careful how you remove the directory called "~"!)

A real example is `R CMD build` which deletes the home directory of
the current user if the root of the package contains a non-empty "~"
directory. Luckily this is now fixed in R-devel, so R 4.0.0 will do
better. (R 3.6.3 will not.) See
https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_wch_r-2Dsource_commit_1d4f7aa1dac427ea2213d1f7cd7b5c16e896af22=DwICAg=eRAMFD45gAfqt84VtBcfhQ=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA=C3VCGF44o7jATPOlC8aZhaT4YGU1JtcOixJKZgu6KyI=iWNt-0G2gZa99bnOqNBMOHph0NyVoJdsIwuA07GhJZQ=

I have seen several bug reports about various packages (that call R
CMD build) removing the home directory, so this indeed happens in
practice to a number of people. The commit above will fix `R CMD
build`, but it would be great to "fix" this in general.

It seems pretty hard to prevent users from creating of a "~"
directory. But preventing unlink() from deleting "~" does not actually
seem too hard. If unlink() could just refuse removing "~" (when expand
= TRUE), that would be great. It seems to me that the current behavior
is very-very rarely intended, and its consequences are potentially
disastrous.

If unlink("~", recursive = TRUE) errors, you can still remove a local
"~" file/dir with unlink("./~", ...). And you can still remove your
home directory if you really want to do that, with
unlink(path.expand("~"), ...). So no functionality is lost.

Also, if anyone is aware of packages/functions that tend to create "~"
directories or files, please let me know.

I would be happy to submit a patch for the new unlink("~") behavior.

Thanks,
Gabor

__
R-devel@r-project.org mailing list
https://urldefense.proofpoint.com/v2/url?u=https-3A__stat.ethz.ch_mailman_listinfo_r-2Ddevel=DwICAg=eRAMFD45gAfqt84VtBcfhQ=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA=C3VCGF44o7jATPOlC8aZhaT4YGU1JtcOixJKZgu6KyI=FeZWU9uN-HwDNkSBOmbYXiGqu8q8-U6DI-ddyUn7HHw=



--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] unlink() on "~" removes the home directory

2020-02-26 Thread Gábor Csárdi
!!! DON'T TRY THE CODE IN THIS EMAIL AT HOME !!!

Well, unlink() does what it is supposed to do, so you could argue that
there is nothing wrong with it. Also, nobody would call unlink() on
"~", right?

The situation is not so simple, however. E.g. if you happen to have a
directory called "~", and you iterate over all files and directories
to selectively remove some of them, then your code might end up
calling unlink on the local "~" directory, and then your home is gone.

But you would not create a directory named "~", that is just asking
for trouble. Well, surely, _intentionally_ you would not do that.
Unintentionally, you might. E.g. something like this is enough:

# Create a subpath within a base directory
badfun <- function(base = ".", path) {
  dir.create(file.path(base, path), recursive = TRUE, showWarnings = FALSE)
}
badfun(path = "~/foo")

(If you did run this, be very careful how you remove the directory called "~"!)

A real example is `R CMD build` which deletes the home directory of
the current user if the root of the package contains a non-empty "~"
directory. Luckily this is now fixed in R-devel, so R 4.0.0 will do
better. (R 3.6.3 will not.) See
https://github.com/wch/r-source/commit/1d4f7aa1dac427ea2213d1f7cd7b5c16e896af22

I have seen several bug reports about various packages (that call R
CMD build) removing the home directory, so this indeed happens in
practice to a number of people. The commit above will fix `R CMD
build`, but it would be great to "fix" this in general.

It seems pretty hard to prevent users from creating of a "~"
directory. But preventing unlink() from deleting "~" does not actually
seem too hard. If unlink() could just refuse removing "~" (when expand
= TRUE), that would be great. It seems to me that the current behavior
is very-very rarely intended, and its consequences are potentially
disastrous.

If unlink("~", recursive = TRUE) errors, you can still remove a local
"~" file/dir with unlink("./~", ...). And you can still remove your
home directory if you really want to do that, with
unlink(path.expand("~"), ...). So no functionality is lost.

Also, if anyone is aware of packages/functions that tend to create "~"
directories or files, please let me know.

I would be happy to submit a patch for the new unlink("~") behavior.

Thanks,
Gabor

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] RIOT 2020

2020-02-26 Thread Abby Spurdle
If people want to create a new interpreter (for R or any other
data-driven programming language), or do something closely related
(such as adapt an existing interpreter), I think a better strategy
would be to focus on real time computing.

I note that Oracle who appears to be sponsoring this event, also
acquired ChorusOS (via their acquisition of Sun). I think they should
release ChorusOS under an open source license, and consider investing
into that.


On Wed, Feb 26, 2020 at 5:41 AM Stepan  wrote:
>
> I hope you don’t mind us using this mailing list for a small
> advertisement, but we think it is most relevant for this group:
>
> We'd like to invite you to RIOT 2020 - the 5rd workshop on R
> Implementation, Optimization and Tooling [1]. It will take place
> co-located with, and during, useR! 2020 in St. Louis on July 8th. RIOT
> is an excellent venue for deep technical discussions about R
> implementations, tools, optimizations and R extension, and will be very
> interesting for anyone interested in what’s under the hood of R.
>
> Regards,
> Stepan Sindelar, Lukas Stadler (Oracle Labs), Jan Vitek (Northeastern),
> Alexander Bertram (BeDataDriven)
>
> [1] http://riotworkshop.github.io/
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Profiling: attributing costs to place of invocation (instead of place of evaluation)?

2020-02-26 Thread Lionel Henry
There's a patch under review:
https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17595

Best,
Lionel

On 2/26/20, Kirill Müller  wrote:
> Hi
>
>
> Consider the following example:
>
> f <- function(expr) g(expr)
> g <- function(expr) {
>    h(expr)
> }
> h <- function(expr) {
>    expr # evaluation happens here
>    i(expr)
> }
> i <- function(expr) {
>    expr # already evaluated, no costs here
>    invisible()
> }
>
> rprof <- tempfile()
> Rprof(rprof)
> f(replicate(1e2, sample.int(1e4)))
> Rprof(NULL)
> cat(readLines(rprof), sep = "\n")
> #> sample.interval=2
> #> "sample.int" "FUN" "lapply" "sapply" "replicate" "h" "g" "f"
> #> "sample.int" "FUN" "lapply" "sapply" "replicate" "h" "g" "f"
> #> "sample.int" "FUN" "lapply" "sapply" "replicate" "h" "g" "f"
>
> The evaluation of the slow replicate() call is deferred to the execution
> of h(), but there's no replicate() call in h's definition. This makes
> parsing the profile much more difficult than necessary.
>
> I have pasted an experimental patch below (off of 3.6.2) that produces
> the following output:
>
> cat(readLines(rprof), sep = "\n")
> #> sample.interval=2
> #> "sample.int" "FUN" "lapply" "sapply" "replicate" "f"
> #> "sample.int" "FUN" "lapply" "sapply" "replicate" "f"
> #> "sample.int" "FUN" "lapply" "sapply" "replicate" "f"
>
> This attributes the cost to the replicate() call to f(), where the call
> is actually defined. From my experience, this will give a much better
> understanding of the actual costs of each part of the code. The SIGPROF
> handler looks at sysparent and cloenv before deciding if an element of
> the call stack is to be included in the profile.
>
> Is there interest in integrating a variant of this patch, perhaps with
> an optional argument to Rprof()?
>
> Thanks!
>
>
> Best regards
>
> Kirill
>
>
> Index: src/main/eval.c
> ===
> --- src/main/eval.c    (revision 77857)
> +++ src/main/eval.c    (working copy)
> @@ -218,7 +218,10 @@
>   if (R_Line_Profiling)
>   lineprof(buf, R_getCurrentSrcref());
>
> +    SEXP sysparent = NULL;
> +
>   for (cptr = R_GlobalContext; cptr; cptr = cptr->nextcontext) {
> +    if (sysparent != NULL && cptr->cloenv != sysparent &&
> cptr->sysparent != sysparent) continue;
>   if ((cptr->callflag & (CTXT_FUNCTION | CTXT_BUILTIN))
>       && TYPEOF(cptr->call) == LANGSXP) {
>       SEXP fun = CAR(cptr->call);
> @@ -292,6 +295,8 @@
>           else
>           lineprof(buf, cptr->srcref);
>       }
> +
> +        sysparent = cptr->sysparent;
>       }
>   }
>   }
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Profiling: attributing costs to place of invocation (instead of place of evaluation)?

2020-02-26 Thread Kirill Müller

Hi


Consider the following example:

f <- function(expr) g(expr)
g <- function(expr) {
  h(expr)
}
h <- function(expr) {
  expr # evaluation happens here
  i(expr)
}
i <- function(expr) {
  expr # already evaluated, no costs here
  invisible()
}

rprof <- tempfile()
Rprof(rprof)
f(replicate(1e2, sample.int(1e4)))
Rprof(NULL)
cat(readLines(rprof), sep = "\n")
#> sample.interval=2
#> "sample.int" "FUN" "lapply" "sapply" "replicate" "h" "g" "f"
#> "sample.int" "FUN" "lapply" "sapply" "replicate" "h" "g" "f"
#> "sample.int" "FUN" "lapply" "sapply" "replicate" "h" "g" "f"

The evaluation of the slow replicate() call is deferred to the execution 
of h(), but there's no replicate() call in h's definition. This makes 
parsing the profile much more difficult than necessary.


I have pasted an experimental patch below (off of 3.6.2) that produces 
the following output:


cat(readLines(rprof), sep = "\n")
#> sample.interval=2
#> "sample.int" "FUN" "lapply" "sapply" "replicate" "f"
#> "sample.int" "FUN" "lapply" "sapply" "replicate" "f"
#> "sample.int" "FUN" "lapply" "sapply" "replicate" "f"

This attributes the cost to the replicate() call to f(), where the call 
is actually defined. From my experience, this will give a much better 
understanding of the actual costs of each part of the code. The SIGPROF 
handler looks at sysparent and cloenv before deciding if an element of 
the call stack is to be included in the profile.


Is there interest in integrating a variant of this patch, perhaps with 
an optional argument to Rprof()?


Thanks!


Best regards

Kirill


Index: src/main/eval.c
===
--- src/main/eval.c    (revision 77857)
+++ src/main/eval.c    (working copy)
@@ -218,7 +218,10 @@
 if (R_Line_Profiling)
 lineprof(buf, R_getCurrentSrcref());

+    SEXP sysparent = NULL;
+
 for (cptr = R_GlobalContext; cptr; cptr = cptr->nextcontext) {
+    if (sysparent != NULL && cptr->cloenv != sysparent && 
cptr->sysparent != sysparent) continue;

 if ((cptr->callflag & (CTXT_FUNCTION | CTXT_BUILTIN))
     && TYPEOF(cptr->call) == LANGSXP) {
     SEXP fun = CAR(cptr->call);
@@ -292,6 +295,8 @@
         else
         lineprof(buf, cptr->srcref);
     }
+
+        sysparent = cptr->sysparent;
     }
 }
 }

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel