[GitHub] spark pull request #16014: [SPARK-18590][SPARKR] build R source package when...

2016-12-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/16014





[GitHub] spark pull request #16014: [SPARK-18590][SPARKR] build R source package when...

2016-12-08 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/16014#discussion_r91583848
  
--- Diff: dev/create-release/release-build.sh ---
@@ -172,11 +172,30 @@ if [[ "$1" == "package" ]]; then
 MVN_HOME=`$MVN -version 2>&1 | grep 'Maven home' | awk '{print $NF}'`
 
 
-if [ -z "$BUILD_PIP_PACKAGE" ]; then
-  echo "Creating distribution without PIP package"
+if [ -z "$BUILD_PACKAGE" ]; then
+  echo "Creating distribution without PIP/R package"
  ./dev/make-distribution.sh --name $NAME --mvn $MVN_HOME/bin/mvn --tgz $FLAGS \
    -DzincPort=$ZINC_PORT 2>&1 >  ../binary-release-$NAME.log
   cd ..
+elif [[ "$BUILD_PACKAGE" == "withr" ]]; then
+  echo "Creating distribution with R package"
+  ./dev/make-distribution.sh --name $NAME --mvn $MVN_HOME/bin/mvn --tgz --r $FLAGS \
+    -DzincPort=$ZINC_PORT 2>&1 >  ../binary-release-$NAME.log
+  cd ..
+
+  echo "Copying and signing R source package"
+  R_DIST_NAME=SparkR_$SPARK_VERSION.tar.gz
--- End diff --

Just to clarify, this is the tgz that we will upload to CRAN, right?
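
For context, a hedged sketch of what the copy-and-sign step that the truncated hunk leads into presumably looks like, mirroring how this script handles the Python dist (the gpg/checksum details here are assumptions, not the script's actual lines):

```
# Sketch only: copy the built SparkR source package out of the dist tree,
# GPG-sign it, and write a SHA512 digest next to it (assumed steps).
R_DIST_NAME=SparkR_$SPARK_VERSION.tar.gz
cp spark-$SPARK_VERSION-bin-$NAME/R/$R_DIST_NAME .

echo $GPG_PASSPHRASE | gpg --passphrase-fd 0 --armor \
  --output $R_DIST_NAME.asc --detach-sign $R_DIST_NAME
echo $GPG_PASSPHRASE | gpg --passphrase-fd 0 --print-md \
  SHA512 $R_DIST_NAME > $R_DIST_NAME.sha
```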





[GitHub] spark pull request #16014: [SPARK-18590][SPARKR] build R source package when...

2016-12-08 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/16014#discussion_r91579011
  
--- Diff: dev/create-release/release-build.sh ---
@@ -221,14 +235,13 @@ if [[ "$1" == "package" ]]; then
 
  # We increment the Zinc port each time to avoid OOM's and other craziness if multiple builds
   # share the same Zinc server.
-  # Make R source package only once. (--r)
   FLAGS="-Psparkr -Phive -Phive-thriftserver -Pyarn -Pmesos"
   make_binary_release "hadoop2.3" "-Phadoop-2.3 $FLAGS" "3033" &
   make_binary_release "hadoop2.4" "-Phadoop-2.4 $FLAGS" "3034" &
   make_binary_release "hadoop2.6" "-Phadoop-2.6 $FLAGS" "3035" &
   make_binary_release "hadoop2.7" "-Phadoop-2.7 $FLAGS" "3036" "withpip" &
   make_binary_release "hadoop2.4-without-hive" "-Psparkr -Phadoop-2.4 
-Pyarn -Pmesos" "3037" &
-  make_binary_release "without-hadoop" "--r -Psparkr -Phadoop-provided 
-Pyarn -Pmesos" "3038" &
+  make_binary_release "without-hadoop" "-Psparkr -Phadoop-provided -Pyarn 
-Pmesos" "3038" "withr" &
--- End diff --

I think it sounds fine. I was waiting to see if @rxin (or @Joshrosen?) would take a look, because I have not reviewed changes to this file before. Let me take another, closer look and then we can merge it to branch-2.1 -- we'll see what happens to the RC process after that.
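
For context on the quoted hunk, a heavily simplified sketch of how the optional fourth argument is presumably dispatched inside `make_binary_release` (illustration only, not the script's actual function body):

```
# Illustration: the optional fourth positional argument selects which extra
# package (pip or R) a given binary build also produces.
make_binary_release() {
  local name="$1" flags="$2" zinc_port="$3" build_package="$4"
  if [ -z "$build_package" ]; then
    echo "Creating distribution $name without PIP/R package"
  elif [ "$build_package" = "withpip" ]; then
    echo "Creating distribution $name with PIP package"
  elif [ "$build_package" = "withr" ]; then
    echo "Creating distribution $name with R source package"
  fi
}

# Only the "without-hadoop" build carries "withr", so the SparkR source
# package is built exactly once across the parallel builds.
make_binary_release "without-hadoop" "-Psparkr -Phadoop-provided -Pyarn -Pmesos" "3038" "withr" &
wait
```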





[GitHub] spark pull request #16014: [SPARK-18590][SPARKR] build R source package when...

2016-12-07 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/16014#discussion_r91459472
  
--- Diff: dev/create-release/release-build.sh ---
@@ -221,14 +235,13 @@ if [[ "$1" == "package" ]]; then
 
  # We increment the Zinc port each time to avoid OOM's and other craziness if multiple builds
   # share the same Zinc server.
-  # Make R source package only once. (--r)
   FLAGS="-Psparkr -Phive -Phive-thriftserver -Pyarn -Pmesos"
   make_binary_release "hadoop2.3" "-Phadoop-2.3 $FLAGS" "3033" &
   make_binary_release "hadoop2.4" "-Phadoop-2.4 $FLAGS" "3034" &
   make_binary_release "hadoop2.6" "-Phadoop-2.6 $FLAGS" "3035" &
   make_binary_release "hadoop2.7" "-Phadoop-2.7 $FLAGS" "3036" "withpip" &
   make_binary_release "hadoop2.4-without-hive" "-Psparkr -Phadoop-2.4 
-Pyarn -Pmesos" "3037" &
-  make_binary_release "without-hadoop" "--r -Psparkr -Phadoop-provided 
-Pyarn -Pmesos" "3038" &
+  make_binary_release "without-hadoop" "-Psparkr -Phadoop-provided -Pyarn 
-Pmesos" "3038" "withr" &
--- End diff --

@shivaram 





[GitHub] spark pull request #16014: [SPARK-18590][SPARKR] build R source package when...

2016-12-06 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/16014#discussion_r91196396
  
--- Diff: dev/create-release/release-build.sh ---
@@ -221,14 +235,13 @@ if [[ "$1" == "package" ]]; then
 
  # We increment the Zinc port each time to avoid OOM's and other craziness if multiple builds
   # share the same Zinc server.
-  # Make R source package only once. (--r)
   FLAGS="-Psparkr -Phive -Phive-thriftserver -Pyarn -Pmesos"
   make_binary_release "hadoop2.3" "-Phadoop-2.3 $FLAGS" "3033" &
   make_binary_release "hadoop2.4" "-Phadoop-2.4 $FLAGS" "3034" &
   make_binary_release "hadoop2.6" "-Phadoop-2.6 $FLAGS" "3035" &
   make_binary_release "hadoop2.7" "-Phadoop-2.7 $FLAGS" "3036" "withpip" &
   make_binary_release "hadoop2.4-without-hive" "-Psparkr -Phadoop-2.4 
-Pyarn -Pmesos" "3037" &
-  make_binary_release "without-hadoop" "--r -Psparkr -Phadoop-provided 
-Pyarn -Pmesos" "3038" &
+  make_binary_release "without-hadoop" "-Psparkr -Phadoop-provided -Pyarn 
-Pmesos" "3038" "withr" &
--- End diff --

It was mostly to use a "separate profile" from "withpip".

Running this `R CMD build` would run some Spark code (mainly in the vignettes, since tests are not run in `R CMD check`), but nothing that depends on the file system, etc.

Also, the Spark jar, while loaded and called into during the process, will not be packaged into the resulting R source package, so I thought it didn't matter which build profile we run this under.
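
To make the above concrete, a hedged sketch of the vignette-building step being described (the path and environment here are assumptions; in the PR, R/check-cran.sh is what actually drives this):

```
# Sketch only: building the SparkR source package knits the vignettes, which
# start a local Spark session using the jars under SPARK_HOME, but those jars
# are not copied into the resulting SparkR_<version>.tar.gz.
export SPARK_HOME=/path/to/spark   # assumption: a locally built Spark checkout
cd "$SPARK_HOME/R"
R CMD build pkg                    # produces SparkR_<version>.tar.gz with vignettes
```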





[GitHub] spark pull request #16014: [SPARK-18590][SPARKR] build R source package when...

2016-12-06 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/16014#discussion_r91135457
  
--- Diff: dev/create-release/release-build.sh ---
@@ -221,14 +235,13 @@ if [[ "$1" == "package" ]]; then
 
  # We increment the Zinc port each time to avoid OOM's and other craziness if multiple builds
   # share the same Zinc server.
-  # Make R source package only once. (--r)
   FLAGS="-Psparkr -Phive -Phive-thriftserver -Pyarn -Pmesos"
   make_binary_release "hadoop2.3" "-Phadoop-2.3 $FLAGS" "3033" &
   make_binary_release "hadoop2.4" "-Phadoop-2.4 $FLAGS" "3034" &
   make_binary_release "hadoop2.6" "-Phadoop-2.6 $FLAGS" "3035" &
   make_binary_release "hadoop2.7" "-Phadoop-2.7 $FLAGS" "3036" "withpip" &
   make_binary_release "hadoop2.4-without-hive" "-Psparkr -Phadoop-2.4 
-Pyarn -Pmesos" "3037" &
-  make_binary_release "without-hadoop" "--r -Psparkr -Phadoop-provided 
-Pyarn -Pmesos" "3038" &
+  make_binary_release "without-hadoop" "-Psparkr -Phadoop-provided -Pyarn 
-Pmesos" "3038" "withr" &
--- End diff --

Any specific reason to use the `without-hadoop` build for the R package? Just wondering if this will affect users in any fashion.





[GitHub] spark pull request #16014: [SPARK-18590][SPARKR] build R source package when...

2016-12-04 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/16014#discussion_r90792118
  
--- Diff: R/check-cran.sh ---
@@ -82,4 +83,20 @@ else
   # This will run tests and/or build vignettes, and require SPARK_HOME
  SPARK_HOME="${SPARK_HOME}" "$R_SCRIPT_PATH/"R CMD check $CRAN_CHECK_OPTIONS SparkR_"$VERSION".tar.gz
 fi
+
+# Install source package to get it to generate vignettes rds files, etc.
+if [ -n "$CLEAN_INSTALL" ]
--- End diff --

So I did the diff. Here are the new files in the output of `make-distribution` on the master branch with this change vs. 2.0.0.

Files added:

```
- R/lib/SparkR/Meta/vignette.rds
- /R/lib/SparkR/doc/
- /R/lib/SparkR/doc/index.html
- /R/lib/SparkR/doc/sparkr-vignettes.R
- /R/lib/SparkR/doc/sparkr-vignettes.Rmd
- /R/lib/SparkR/doc/sparkr-vignettes.html
```

Files removed: A bunch of HTML files starting from
```
/R/lib/SparkR/html/AFTSurvivalRegressionModel-class.html
...
/R/lib/SparkR/html/year.html
```





[GitHub] spark pull request #16014: [SPARK-18590][SPARKR] build R source package when...

2016-12-04 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/16014#discussion_r90790181
  
--- Diff: dev/make-distribution.sh ---
@@ -208,11 +212,24 @@ cp -r "$SPARK_HOME/data" "$DISTDIR"
 # Make pip package
 if [ "$MAKE_PIP" == "true" ]; then
   echo "Building python distribution package"
-  cd $SPARK_HOME/python
+  pushd "$SPARK_HOME/python" > /dev/null
   python setup.py sdist
-  cd ..
+  popd > /dev/null
+else
+  echo "Skipping building python distribution package"
+fi
+
+# Make R package - this is used for both CRAN release and packing R layout into distribution
+if [ "$MAKE_R" == "true" ]; then
+  echo "Building R source package"
+  pushd "$SPARK_HOME/R" > /dev/null
+  # Build source package and run full checks
+  # Install source package to get it to generate vignettes, etc.
+  # Do not source the check-cran.sh - it should be run from where it is for it to set SPARK_HOME
+  NO_TESTS=1 CLEAN_INSTALL=1 "$SPARK_HOME/"R/check-cran.sh
--- End diff --

Yeah longer term that sounds like a good idea.





[GitHub] spark pull request #16014: [SPARK-18590][SPARKR] build R source package when...

2016-12-04 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/16014#discussion_r90792416
  
--- Diff: R/check-cran.sh ---
@@ -82,4 +83,20 @@ else
   # This will run tests and/or build vignettes, and require SPARK_HOME
  SPARK_HOME="${SPARK_HOME}" "$R_SCRIPT_PATH/"R CMD check $CRAN_CHECK_OPTIONS SparkR_"$VERSION".tar.gz
 fi
+
+# Install source package to get it to generate vignettes rds files, etc.
+if [ -n "$CLEAN_INSTALL" ]
--- End diff --

So it looks like we lost the knitted HTML files in the SparkR package with this change. FWIW this may not be bad, as the HTML files are not usually used locally, only for the website, and I think the docs-creation part of the build should pick that up. (Verifying that now.)





[GitHub] spark pull request #16014: [SPARK-18590][SPARKR] build R source package when...

2016-12-04 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/16014#discussion_r90792485
  
--- Diff: dev/make-distribution.sh ---
@@ -71,6 +72,9 @@ while (( "$#" )); do
 --pip)
   MAKE_PIP=true
   ;;
+--r)
+  MAKE_R=true
--- End diff --

FWIW, if you want this to get picked up by the official release-building procedure, we also need to edit release-build.sh [1]. Can you coordinate this with @rxin?

[1] 
https://github.com/apache/spark/blob/d9eb4c7215f26dd05527c0b9980af35087ab9d64/dev/create-release/release-build.sh#L220





[GitHub] spark pull request #16014: [SPARK-18590][SPARKR] build R source package when...

2016-11-28 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/16014#discussion_r89912555
  
--- Diff: dev/make-distribution.sh ---
@@ -208,11 +212,24 @@ cp -r "$SPARK_HOME/data" "$DISTDIR"
 # Make pip package
 if [ "$MAKE_PIP" == "true" ]; then
   echo "Building python distribution package"
-  cd $SPARK_HOME/python
+  pushd "$SPARK_HOME/python" > /dev/null
   python setup.py sdist
-  cd ..
+  popd > /dev/null
+else
+  echo "Skipping building python distribution package"
+fi
+
+# Make R package - this is used for both CRAN release and packing R layout into distribution
+if [ "$MAKE_R" == "true" ]; then
+  echo "Building R source package"
+  pushd "$SPARK_HOME/R" > /dev/null
+  # Build source package and run full checks
+  # Install source package to get it to generate vignettes, etc.
+  # Do not source the check-cran.sh - it should be run from where it is for it to set SPARK_HOME
+  NO_TESTS=1 CLEAN_INSTALL=1 "$SPARK_HOME/"R/check-cran.sh
--- End diff --

I agree. I think it is somewhat debatable whether we should run `R CMD check` in `make-distribution.sh` - but I feel there are enough gaps in what we check in Jenkins that it is worthwhile to repeat it here.

For everything else it's just convenient to call R from here. We could factor out the R environment setup and have a separate `install.sh` (possibly replacing `install-dev.sh`, since this does more with the source package? What do you think?)
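
To illustrate the kind of split being suggested (entirely hypothetical - no such script exists in this PR), an `R/install.sh` that only builds and installs the source package, leaving `check-cran.sh` to the CRAN checks, might look like:

```
#!/usr/bin/env bash
# Hypothetical R/install.sh: build the SparkR source package once and install
# it into R/lib so the help and vignette rds files get generated.
set -o errexit
FWDIR="$(cd "$(dirname "$0")"; pwd)"
VERSION="$(grep -e '^Version' "$FWDIR/pkg/DESCRIPTION" | awk '{print $NF}')"

mkdir -p "$FWDIR/lib"
cd "$FWDIR"
R CMD build pkg
R CMD INSTALL --library="$FWDIR/lib" "SparkR_$VERSION.tar.gz"
```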






[GitHub] spark pull request #16014: [SPARK-18590][SPARKR] build R source package when...

2016-11-28 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/16014#discussion_r89849402
  
--- Diff: R/check-cran.sh ---
@@ -82,4 +83,20 @@ else
   # This will run tests and/or build vignettes, and require SPARK_HOME
  SPARK_HOME="${SPARK_HOME}" "$R_SCRIPT_PATH/"R CMD check $CRAN_CHECK_OPTIONS SparkR_"$VERSION".tar.gz
 fi
+
+# Install source package to get it to generate vignettes rds files, etc.
+if [ -n "$CLEAN_INSTALL" ]
--- End diff --

Isn't this already done by `install-dev.sh`? I'm a bit confused as to why we need to call install again.
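
(The hunk above is truncated; a hedged sketch of what the `CLEAN_INSTALL` branch presumably does, using the SPARK_HOME and VERSION variables the script already defines - the point being that installing from the built tarball, unlike `install-dev.sh` installing from the source tree, is what generates the vignette/help rds files:)

```
# Sketch only (details assumed): reinstall SparkR from the built source
# package rather than from the source tree, so the vignette and help .rds
# files end up under R/lib/SparkR.
if [ -n "$CLEAN_INSTALL" ]; then
  echo "Installing SparkR from the built source package"
  rm -rf "$SPARK_HOME/R/lib/SparkR"
  R CMD INSTALL --library="$SPARK_HOME/R/lib" SparkR_"$VERSION".tar.gz
fi
```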





[GitHub] spark pull request #16014: [SPARK-18590][SPARKR] build R source package when...

2016-11-28 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/16014#discussion_r89850470
  
--- Diff: dev/make-distribution.sh ---
@@ -208,11 +212,24 @@ cp -r "$SPARK_HOME/data" "$DISTDIR"
 # Make pip package
 if [ "$MAKE_PIP" == "true" ]; then
   echo "Building python distribution package"
-  cd $SPARK_HOME/python
+  pushd "$SPARK_HOME/python" > /dev/null
   python setup.py sdist
-  cd ..
+  popd > /dev/null
+else
+  echo "Skipping building python distribution package"
+fi
+
+# Make R package - this is used for both CRAN release and packing R layout into distribution
+if [ "$MAKE_R" == "true" ]; then
+  echo "Building R source package"
+  pushd "$SPARK_HOME/R" > /dev/null
+  # Build source package and run full checks
+  # Install source package to get it to generate vignettes, etc.
+  # Do not source the check-cran.sh - it should be run from where it is for it to set SPARK_HOME
+  NO_TESTS=1 CLEAN_INSTALL=1 "$SPARK_HOME/"R/check-cran.sh
--- End diff --

It's a little awkward that we use `check-cran.sh` to build and install the package. I think it points to the fact that we could refactor these scripts further, but that can be done in a future PR.





[GitHub] spark pull request #16014: [SPARK-18590][SPARKR] build R source package when...

2016-11-26 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/16014#discussion_r89677001
  
--- Diff: dev/create-release/release-build.sh ---
@@ -189,6 +189,9 @@ if [[ "$1" == "package" ]]; then
   SHA512 $PYTHON_DIST_NAME > \
   $PYTHON_DIST_NAME.sha
 
+echo "Copying R source package"
+cp spark-$SPARK_VERSION-bin-$NAME/R/SparkR_$SPARK_VERSION.tar.gz .
--- End diff --

Thanks @srowen for asking. I've updated the PR description above.

"
This PR has two key changes. One, we build a source package (aka bundle package) for SparkR that could be released on CRAN. Two, the official Spark binary distributions should include SparkR installed from this source package instead, since that produces the help/vignettes rds files needed for those features to work when the SparkR package is loaded in R, whereas the earlier devtools-based approach does not.
"





[GitHub] spark pull request #16014: [SPARK-18590][SPARKR] build R source package when...

2016-11-26 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/16014#discussion_r89668616
  
--- Diff: dev/create-release/release-build.sh ---
@@ -189,6 +189,9 @@ if [[ "$1" == "package" ]]; then
   SHA512 $PYTHON_DIST_NAME > \
   $PYTHON_DIST_NAME.sha
 
+echo "Copying R source package"
+cp spark-$SPARK_VERSION-bin-$NAME/R/SparkR_$SPARK_VERSION.tar.gz .
--- End diff --

For clarity, this is the heart of the change? We were including the R source in releases before, right, at least in the source release? Does this add something different?





[GitHub] spark pull request #16014: [SPARK-18590][SPARKR] build R source package when...

2016-11-25 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/16014#discussion_r89661428
  
--- Diff: dev/create-release/release-build.sh ---
@@ -189,6 +189,9 @@ if [[ "$1" == "package" ]]; then
   SHA512 $PYTHON_DIST_NAME > \
   $PYTHON_DIST_NAME.sha
 
+echo "Copying R source package"
+cp spark-$SPARK_VERSION-bin-$NAME/R/SparkR_$SPARK_VERSION.tar.gz .
--- End diff --

This is the source package we should release to CRAN.





[GitHub] spark pull request #16014: [SPARK-18590][SPARKR] build R source package when...

2016-11-25 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/16014#discussion_r89661414
  
--- Diff: R/pkg/NAMESPACE ---
@@ -3,7 +3,7 @@
 importFrom("methods", "setGeneric", "setMethod", "setOldClass")
 importFrom("methods", "is", "new", "signature", "show")
 importFrom("stats", "gaussian", "setNames")
-importFrom("utils", "download.file", "object.size", "packageVersion", 
"untar")
+importFrom("utils", "download.file", "object.size", "packageVersion", 
"tail", "untar")
--- End diff --

This regressed in a recent commit. check-cran.sh actually flags this in an existing NOTE, but we only check the number of NOTEs (which is still 1), so this went in undetected.
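
As an illustration of why this slipped through (not the script's actual code), a NOTE-count gate like the following only trips when the total number of NOTEs changes, so a new undeclared import that folds into an existing NOTE goes unnoticed:

```
# Illustration only: count NOTE lines in the R CMD check log and fail only if
# the count exceeds the single NOTE we already expect.
CHECK_LOG="SparkR.Rcheck/00check.log"
NOTE_COUNT="$(grep -c ' NOTE$' "$CHECK_LOG" || true)"
if [ "$NOTE_COUNT" -gt 1 ]; then
  echo "R CMD check reported $NOTE_COUNT NOTEs; expected at most 1"
  exit 1
fi
```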






[GitHub] spark pull request #16014: [SPARK-18590][SPARKR] build R source package when...

2016-11-25 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/16014#discussion_r89661364
  
--- Diff: R/pkg/DESCRIPTION ---
@@ -1,28 +1,27 @@
 Package: SparkR
 Type: Package
-Title: R Frontend for Apache Spark
 Version: 2.1.0
-Date: 2016-11-06
--- End diff --

This is removed - I tried but haven't found a way to update it automatically (I guess this could be done in the [release-tag](https://github.com/apache/spark/blob/master/dev/create-release/release-tag.sh) script, though).
But more importantly, it seems like many (most?) packages do not have this field in their DESCRIPTION.

In any case, the release date is stamped when releasing to CRAN.
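
If the Date field were kept and stamped automatically by the release-tag script, a hypothetical one-liner (not part of this PR) could be:

```
# Hypothetical: stamp today's date into R/pkg/DESCRIPTION at release-tag time.
sed -i "s/^Date: .*/Date: $(date +%Y-%m-%d)/" R/pkg/DESCRIPTION
```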





[GitHub] spark pull request #16014: [SPARK-18590][SPARKR] build R source package when...

2016-11-25 Thread felixcheung
GitHub user felixcheung opened a pull request:

https://github.com/apache/spark/pull/16014

[SPARK-18590][SPARKR] build R source package when making distribution

## What changes were proposed in this pull request?

We should include the built SparkR source package in the Spark distribution. This will enable help and vignettes when the package is used. This source package is also what we would release to CRAN.

### more details

These are the additional steps in make-distribution.sh; please see [here](https://github.com/apache/spark/blob/master/R/CRAN_RELEASE.md) for what goes into a CRAN release, which is now run during make-distribution.sh.
1. The package needs to be installed first, because the first code block in the vignettes is `library(SparkR)` without a lib path.
2. `R CMD build` builds the vignettes.
3. `R CMD check` on the source package installs the package and builds the vignettes again (this time from the source package). Tests are skipped here, but they will need to pass for the CRAN release process to succeed - ideally, during release sign-off we should install from the R package and run the tests.
4. `R CMD INSTALL` on the source package (this is the only way to generate the doc/vignettes rds files correctly; step 1 does not). The output of this step is what we package into the Spark dist and sparkr.zip.

Alternatively, `R CMD build` should already be installing the package into a temp directory, though it might just be finding that location and setting it as the lib.loc parameter; another approach is perhaps to call `R CMD INSTALL --build pkg` instead.
In any case, despite installing the package multiple times, this is relatively fast; building the vignettes takes a while, though. (A condensed sketch of the sequence is shown below.)
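
A condensed, hedged sketch of the sequence described above (simplified; R/check-cran.sh is the real driver, and the flags and paths here are approximations):

```
# Sketch of the four steps, assuming a built Spark checkout.
SPARK_HOME="${SPARK_HOME:-/path/to/spark}"
VERSION="$(grep -e '^Version' "$SPARK_HOME/R/pkg/DESCRIPTION" | awk '{print $NF}')"

cd "$SPARK_HOME/R"
./install-dev.sh                                  # 1. install so the vignettes can library(SparkR)
R CMD build pkg                                   # 2. build the source package and knit vignettes
R CMD check --no-tests "SparkR_$VERSION.tar.gz"   # 3. check the source package, skipping tests
R CMD INSTALL --library="$SPARK_HOME/R/lib" "SparkR_$VERSION.tar.gz"  # 4. install from the tarball
```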



## How was this patch tested?

Manually, CI.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/felixcheung/spark rdist

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/16014.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #16014


commit 79771392f7a8c7fe4ed90b20aec05e5e65304975
Author: Felix Cheung 
Date:   2016-11-25T23:00:25Z

build source package in make-distribution, and take that as a part of the distribution



