[GitHub] spark pull request #19589: [SPARKR][SPARK-22344] Set java.io.tmpdir for Spar...

2017-10-29 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/19589


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19589: [SPARKR][SPARK-22344] Set java.io.tmpdir for Spar...

2017-10-27 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/19589#discussion_r147545176
  
--- Diff: R/pkg/tests/run-all.R ---
@@ -38,6 +38,10 @@ sparkRFilesBefore <- list.files(path = sparkRDir, 
all.files = TRUE)
 sparkRTestMaster <- "local[1]"
 if (identical(Sys.getenv("NOT_CRAN"), "true")) {
   sparkRTestMaster <- ""
+} else {
+  # Disable hsperfdata on CRAN
+  old_java_opt <- Sys.getenv("_JAVA_OPTIONS")
+  Sys.setenv("_JAVA_OPTIONS"=paste("-XX:-UsePerfData", old_java_opt, sep=" 
"))
--- End diff --

also `sep=" "` is the default


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19589: [SPARKR][SPARK-22344] Set java.io.tmpdir for Spar...

2017-10-27 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/19589#discussion_r147545114
  
--- Diff: R/pkg/inst/tests/testthat/test_basic.R ---
@@ -18,7 +18,11 @@
 context("basic tests for CRAN")
 
 test_that("create DataFrame from list or data.frame", {
-  sparkR.session(master = sparkRTestMaster, enableHiveSupport = FALSE)
+  r_tmp_dir <- tempdir()
+  tmp_arg <- paste("-Djava.io.tmpdir=", r_tmp_dir, sep = "")
--- End diff --

nit: paste0


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19589: [SPARKR][SPARK-22344] Set java.io.tmpdir for Spar...

2017-10-27 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/19589#discussion_r147545247
  
--- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd ---
@@ -36,6 +36,12 @@ opts_hooks$set(eval = function(options) {
   }
   options
 })
+r_tmp_dir <- tempdir()
+tmp_arg <- paste("-Djava.io.tmpdir=", r_tmp_dir, sep = "")
+sparkSessionConfig <- list(spark.driver.extraJavaOptions = tmp_arg,
+   spark.executor.extraJavaOptions = tmp_arg)
+old_java_opt <- Sys.getenv("_JAVA_OPTIONS")
+new_java_opt <- Sys.setenv("_JAVA_OPTIONS"=paste("-XX:-UsePerfData", 
old_java_opt, sep = " "))
--- End diff --

don't need `new_java_opt`?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19589: [SPARKR][SPARK-22344] Set java.io.tmpdir for Spar...

2017-10-27 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/19589#discussion_r147545213
  
--- Diff: R/pkg/tests/run-all.R ---
@@ -38,6 +38,10 @@ sparkRFilesBefore <- list.files(path = sparkRDir, 
all.files = TRUE)
 sparkRTestMaster <- "local[1]"
 if (identical(Sys.getenv("NOT_CRAN"), "true")) {
   sparkRTestMaster <- ""
+} else {
+  # Disable hsperfdata on CRAN
+  old_java_opt <- Sys.getenv("_JAVA_OPTIONS")
+  Sys.setenv("_JAVA_OPTIONS"=paste("-XX:-UsePerfData", old_java_opt, sep=" 
"))
--- End diff --

space between variable/value
`"_JAVA_OPTIONS" = paste("-XX:-UsePerfData", old_java_opt, sep = " "`
we *should* have a lintr rule that detects this at one point..


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19589: [SPARKR][SPARK-22344] Set java.io.tmpdir for Spar...

2017-10-27 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/19589#discussion_r147545189
  
--- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd ---
@@ -57,7 +63,7 @@ We use default settings in which it runs in local mode. 
It auto downloads Spark
 
 ```{r, include=FALSE}
 install.spark()
-sparkR.session(master = "local[1]")
+sparkR.session(master = "local[1]", sparkConfig = sparkSessionConfig, 
enableHiveSupport = FALSE)
--- End diff --

hmm, good catch on `enableHiveSupport` I think this should be fine


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19589: [SPARKR][SPARK-22344] Set java.io.tmpdir for Spar...

2017-10-27 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/19589#discussion_r147545171
  
--- Diff: R/pkg/tests/run-all.R ---
@@ -38,6 +38,10 @@ sparkRFilesBefore <- list.files(path = sparkRDir, 
all.files = TRUE)
 sparkRTestMaster <- "local[1]"
 if (identical(Sys.getenv("NOT_CRAN"), "true")) {
   sparkRTestMaster <- ""
+} else {
+  # Disable hsperfdata on CRAN
+  old_java_opt <- Sys.getenv("_JAVA_OPTIONS")
+  Sys.setenv("_JAVA_OPTIONS"=paste("-XX:-UsePerfData", old_java_opt, sep=" 
"))
--- End diff --

I think perhaps it'd better to set `spark.driver.extraJavaOptions` and 
`spark.executor.extraJavaOptions` (use a variable to store its value, as in 
`sparkRTestMaster`, or like in vignettes, `sparkSessionConfig`)

btw, I think the convention is camelCase variables :)



---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19589: [SPARKR][SPARK-22344] Set java.io.tmpdir for Spar...

2017-10-27 Thread shivaram
GitHub user shivaram opened a pull request:

https://github.com/apache/spark/pull/19589

[SPARKR][SPARK-22344] Set java.io.tmpdir for SparkR tests

This PR sets the java.io.tmpdir for CRAN checks and also disables the 
hsperfdata for the JVM when running CRAN checks. Together this prevents files 
from being left behind in `/tmp`

## How was this patch tested?
Tested manually on a clean EC2 machine

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shivaram/spark-1 sparkr-tmpdir-clean

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/19589.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #19589


commit 2b399deeb09af3ff8bb561e84f22948fd589e96d
Author: Shivaram Venkataraman 
Date:   2017-10-27T17:07:46Z

Set java.io.tmpdir for CRAN checks
Also disable the hsperfdata for the JVM for CRAN checks




---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org