Git Push Summary
Repository: spark
Updated Tags:  refs/tags/1.4.0-rc2-test [deleted] 8f50218f3
[2/2] spark git commit: Preparing development version 1.4.1-SNAPSHOT
Preparing development version 1.4.1-SNAPSHOT

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/f2f74b9b
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/f2f74b9b
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/f2f74b9b

Branch: refs/heads/branch-1.4
Commit: f2f74b9b1a4002e00988e33fc78ec23ab3ca899e
Parents: 0da7396
Author: Patrick Wendell <pwend...@gmail.com>
Authored: Sat May 23 14:59:37 2015 -0700
Committer: Patrick Wendell <pwend...@gmail.com>
Committed: Sat May 23 14:59:37 2015 -0700

----------------------------------------------------------
 assembly/pom.xml                  | 2 +-
 bagel/pom.xml                     | 2 +-
 core/pom.xml                      | 2 +-
 examples/pom.xml                  | 2 +-
 external/flume-sink/pom.xml       | 2 +-
 external/flume/pom.xml            | 2 +-
 external/kafka-assembly/pom.xml   | 2 +-
 external/kafka/pom.xml            | 2 +-
 external/mqtt/pom.xml             | 2 +-
 external/twitter/pom.xml          | 2 +-
 external/zeromq/pom.xml           | 2 +-
 extras/java8-tests/pom.xml        | 2 +-
 extras/kinesis-asl/pom.xml        | 2 +-
 extras/spark-ganglia-lgpl/pom.xml | 2 +-
 graphx/pom.xml                    | 2 +-
 launcher/pom.xml                  | 2 +-
 mllib/pom.xml                     | 2 +-
 network/common/pom.xml            | 2 +-
 network/shuffle/pom.xml           | 2 +-
 network/yarn/pom.xml              | 2 +-
 pom.xml                           | 2 +-
 repl/pom.xml                      | 2 +-
 sql/catalyst/pom.xml              | 2 +-
 sql/core/pom.xml                  | 2 +-
 sql/hive-thriftserver/pom.xml     | 2 +-
 sql/hive/pom.xml                  | 2 +-
 streaming/pom.xml                 | 2 +-
 tools/pom.xml                     | 2 +-
 unsafe/pom.xml                    | 2 +-
 yarn/pom.xml                      | 2 +-
 30 files changed, 30 insertions(+), 30 deletions(-)

http://git-wip-us.apache.org/repos/asf/spark/blob/f2f74b9b/assembly/pom.xml
diff --git a/assembly/pom.xml b/assembly/pom.xml
index b8a821d..b53d7c3 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.0</version>
+    <version>1.4.1-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/f2f74b9b/bagel/pom.xml
diff --git a/bagel/pom.xml b/bagel/pom.xml
index c1aa32b..d631ff5 100644
--- a/bagel/pom.xml
+++ b/bagel/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.0</version>
+    <version>1.4.1-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/f2f74b9b/core/pom.xml
diff --git a/core/pom.xml b/core/pom.xml
index 8acb923..adbb7c2 100644
--- a/core/pom.xml
+++ b/core/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.0</version>
+    <version>1.4.1-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/f2f74b9b/examples/pom.xml
diff --git a/examples/pom.xml b/examples/pom.xml
index 706a97d..bf804bb 100644
--- a/examples/pom.xml
+++ b/examples/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.0</version>
+    <version>1.4.1-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/f2f74b9b/external/flume-sink/pom.xml
diff --git a/external/flume-sink/pom.xml b/external/flume-sink/pom.xml
index e8784eb..076ddaa 100644
--- a/external/flume-sink/pom.xml
+++ b/external/flume-sink/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.0</version>
+    <version>1.4.1-SNAPSHOT</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/f2f74b9b/external/flume/pom.xml
diff --git a/external/flume/pom.xml b/external/flume/pom.xml
index 1794f3e..2491c97 100644
--- a/external/flume/pom.xml
+++ b/external/flume/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.0</version>
+    <version>1.4.1-SNAPSHOT</version>
[1/2] spark git commit: Preparing Spark release v1.4.0-rc2-test
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 8da8caab1 -> f2f74b9b1

Preparing Spark release v1.4.0-rc2-test

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/0da73969
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/0da73969
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/0da73969

Branch: refs/heads/branch-1.4
Commit: 0da739699087f28ddd6d59f61edc4d548c3016a9
Parents: 8da8caa
Author: Patrick Wendell <pwend...@gmail.com>
Authored: Sat May 23 14:59:31 2015 -0700
Committer: Patrick Wendell <pwend...@gmail.com>
Committed: Sat May 23 14:59:31 2015 -0700

----------------------------------------------------------
 assembly/pom.xml                  | 2 +-
 bagel/pom.xml                     | 2 +-
 core/pom.xml                      | 2 +-
 examples/pom.xml                  | 2 +-
 external/flume-sink/pom.xml       | 2 +-
 external/flume/pom.xml            | 2 +-
 external/kafka-assembly/pom.xml   | 2 +-
 external/kafka/pom.xml            | 2 +-
 external/mqtt/pom.xml             | 2 +-
 external/twitter/pom.xml          | 2 +-
 external/zeromq/pom.xml           | 2 +-
 extras/java8-tests/pom.xml        | 2 +-
 extras/kinesis-asl/pom.xml        | 2 +-
 extras/spark-ganglia-lgpl/pom.xml | 2 +-
 graphx/pom.xml                    | 2 +-
 launcher/pom.xml                  | 2 +-
 mllib/pom.xml                     | 2 +-
 network/common/pom.xml            | 2 +-
 network/shuffle/pom.xml           | 2 +-
 network/yarn/pom.xml              | 2 +-
 pom.xml                           | 2 +-
 repl/pom.xml                      | 2 +-
 sql/catalyst/pom.xml              | 2 +-
 sql/core/pom.xml                  | 2 +-
 sql/hive-thriftserver/pom.xml     | 2 +-
 sql/hive/pom.xml                  | 2 +-
 streaming/pom.xml                 | 2 +-
 tools/pom.xml                     | 2 +-
 unsafe/pom.xml                    | 2 +-
 yarn/pom.xml                      | 2 +-
 30 files changed, 30 insertions(+), 30 deletions(-)

http://git-wip-us.apache.org/repos/asf/spark/blob/0da73969/assembly/pom.xml
diff --git a/assembly/pom.xml b/assembly/pom.xml
index b53d7c3..b8a821d 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.1-SNAPSHOT</version>
+    <version>1.4.0</version>
     <relativePath>../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/0da73969/bagel/pom.xml
diff --git a/bagel/pom.xml b/bagel/pom.xml
index d631ff5..c1aa32b 100644
--- a/bagel/pom.xml
+++ b/bagel/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.1-SNAPSHOT</version>
+    <version>1.4.0</version>
     <relativePath>../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/0da73969/core/pom.xml
diff --git a/core/pom.xml b/core/pom.xml
index adbb7c2..8acb923 100644
--- a/core/pom.xml
+++ b/core/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.1-SNAPSHOT</version>
+    <version>1.4.0</version>
     <relativePath>../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/0da73969/examples/pom.xml
diff --git a/examples/pom.xml b/examples/pom.xml
index bf804bb..706a97d 100644
--- a/examples/pom.xml
+++ b/examples/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.1-SNAPSHOT</version>
+    <version>1.4.0</version>
     <relativePath>../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/0da73969/external/flume-sink/pom.xml
diff --git a/external/flume-sink/pom.xml b/external/flume-sink/pom.xml
index 076ddaa..e8784eb 100644
--- a/external/flume-sink/pom.xml
+++ b/external/flume-sink/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.1-SNAPSHOT</version>
+    <version>1.4.0</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/0da73969/external/flume/pom.xml
diff --git a/external/flume/pom.xml b/external/flume/pom.xml
index 2491c97..1794f3e 100644
--- a/external/flume/pom.xml
+++ b/external/flume/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
spark git commit: [HOTFIX] Copy SparkR lib if it exists in make-distribution
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 8d6d8a538 -> fbc4480d9

[HOTFIX] Copy SparkR lib if it exists in make-distribution

This is to fix an issue reported in #6373 where the `cp` would fail if `-Psparkr` was not used in the build

cc dragos pwendell

Author: Shivaram Venkataraman <shiva...@cs.berkeley.edu>

Closes #6379 from shivaram/make-distribution-hotfix and squashes the following commits:

08eb7e4 [Shivaram Venkataraman] Copy SparkR lib if it exists in make-distribution

(cherry picked from commit b231baa24857ea83c8062dd4e033db4e35bf457d)
Signed-off-by: Shivaram Venkataraman <shiva...@cs.berkeley.edu>

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/fbc4480d
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/fbc4480d
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/fbc4480d

Branch: refs/heads/branch-1.4
Commit: fbc4480d9359a10609b79d429a15a244eff5f65f
Parents: 8d6d8a5
Author: Shivaram Venkataraman <shiva...@cs.berkeley.edu>
Authored: Sat May 23 12:28:16 2015 -0700
Committer: Shivaram Venkataraman <shiva...@cs.berkeley.edu>
Committed: Sat May 23 12:28:24 2015 -0700

----------------------------------------------------------
 make-distribution.sh | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

http://git-wip-us.apache.org/repos/asf/spark/blob/fbc4480d/make-distribution.sh
diff --git a/make-distribution.sh b/make-distribution.sh
index 7882734..a2b0c43 100755
--- a/make-distribution.sh
+++ b/make-distribution.sh
@@ -229,10 +229,13 @@
 cp "$SPARK_HOME"/conf/*.template "$DISTDIR"/conf
 cp "$SPARK_HOME/README.md" "$DISTDIR"
 cp -r "$SPARK_HOME/bin" "$DISTDIR"
 cp -r "$SPARK_HOME/python" "$DISTDIR"
-mkdir -p "$DISTDIR"/R/lib
-cp -r "$SPARK_HOME/R/lib/SparkR" "$DISTDIR"/R/lib
 cp -r "$SPARK_HOME/sbin" "$DISTDIR"
 cp -r "$SPARK_HOME/ec2" "$DISTDIR"
+# Copy SparkR if it exists
+if [ -d "$SPARK_HOME"/R/lib/SparkR ]; then
+  mkdir -p "$DISTDIR"/R/lib
+  cp -r "$SPARK_HOME/R/lib/SparkR" "$DISTDIR"/R/lib
+fi

 # Download and copy in tachyon, if requested
 if [ "$SPARK_TACHYON" == "true" ]; then
spark git commit: [HOTFIX] Copy SparkR lib if it exists in make-distribution
Repository: spark
Updated Branches:
  refs/heads/master 2b7e63585 -> b231baa24

[HOTFIX] Copy SparkR lib if it exists in make-distribution

This is to fix an issue reported in #6373 where the `cp` would fail if `-Psparkr` was not used in the build

cc dragos pwendell

Author: Shivaram Venkataraman <shiva...@cs.berkeley.edu>

Closes #6379 from shivaram/make-distribution-hotfix and squashes the following commits:

08eb7e4 [Shivaram Venkataraman] Copy SparkR lib if it exists in make-distribution

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/b231baa2
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/b231baa2
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/b231baa2

Branch: refs/heads/master
Commit: b231baa24857ea83c8062dd4e033db4e35bf457d
Parents: 2b7e635
Author: Shivaram Venkataraman <shiva...@cs.berkeley.edu>
Authored: Sat May 23 12:28:16 2015 -0700
Committer: Shivaram Venkataraman <shiva...@cs.berkeley.edu>
Committed: Sat May 23 12:28:16 2015 -0700

----------------------------------------------------------
 make-distribution.sh | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

http://git-wip-us.apache.org/repos/asf/spark/blob/b231baa2/make-distribution.sh
diff --git a/make-distribution.sh b/make-distribution.sh
index 7882734..a2b0c43 100755
--- a/make-distribution.sh
+++ b/make-distribution.sh
@@ -229,10 +229,13 @@
 cp "$SPARK_HOME"/conf/*.template "$DISTDIR"/conf
 cp "$SPARK_HOME/README.md" "$DISTDIR"
 cp -r "$SPARK_HOME/bin" "$DISTDIR"
 cp -r "$SPARK_HOME/python" "$DISTDIR"
-mkdir -p "$DISTDIR"/R/lib
-cp -r "$SPARK_HOME/R/lib/SparkR" "$DISTDIR"/R/lib
 cp -r "$SPARK_HOME/sbin" "$DISTDIR"
 cp -r "$SPARK_HOME/ec2" "$DISTDIR"
+# Copy SparkR if it exists
+if [ -d "$SPARK_HOME"/R/lib/SparkR ]; then
+  mkdir -p "$DISTDIR"/R/lib
+  cp -r "$SPARK_HOME/R/lib/SparkR" "$DISTDIR"/R/lib
+fi

 # Download and copy in tachyon, if requested
 if [ "$SPARK_TACHYON" == "true" ]; then
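The whole hotfix is a guard: copy the SparkR lib only when the build actually produced it. A minimal Scala sketch of the same guarded-copy pattern (object name and paths are illustrative, not from the patch):

```
import java.io.File
import java.nio.file.{Files, StandardCopyOption}

object CopySparkRIfExists {
  // Recursively copy a directory tree file by file.
  def copyTree(src: File, dst: File): Unit = {
    if (src.isDirectory) {
      dst.mkdirs()
      src.listFiles().foreach(f => copyTree(f, new File(dst, f.getName)))
    } else {
      Files.copy(src.toPath, dst.toPath, StandardCopyOption.REPLACE_EXISTING)
    }
  }

  def main(args: Array[String]): Unit = {
    val sparkRLib = new File("R/lib/SparkR")      // hypothetical source path
    val distRLib  = new File("dist/R/lib/SparkR") // hypothetical target path
    // The guard is the entire fix: skip the copy when -Psparkr never built the lib.
    if (sparkRLib.isDirectory) copyTree(sparkRLib, distRLib)
    else println("SparkR lib not built; skipping copy")
  }
}
```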
[2/2] spark git commit: Preparing development version 1.4.1-SNAPSHOT
Preparing development version 1.4.1-SNAPSHOT

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/8da8caab
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/8da8caab
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/8da8caab

Branch: refs/heads/branch-1.4
Commit: 8da8caab17ee4fa17d150cea4ab01ff132c5ba7c
Parents: 8f50218
Author: Patrick Wendell <pwend...@gmail.com>
Authored: Sat May 23 14:46:27 2015 -0700
Committer: Patrick Wendell <pwend...@gmail.com>
Committed: Sat May 23 14:46:27 2015 -0700

----------------------------------------------------------
 assembly/pom.xml                  | 2 +-
 bagel/pom.xml                     | 2 +-
 core/pom.xml                      | 2 +-
 examples/pom.xml                  | 2 +-
 external/flume-sink/pom.xml       | 2 +-
 external/flume/pom.xml            | 2 +-
 external/kafka-assembly/pom.xml   | 2 +-
 external/kafka/pom.xml            | 2 +-
 external/mqtt/pom.xml             | 2 +-
 external/twitter/pom.xml          | 2 +-
 external/zeromq/pom.xml           | 2 +-
 extras/java8-tests/pom.xml        | 2 +-
 extras/kinesis-asl/pom.xml        | 2 +-
 extras/spark-ganglia-lgpl/pom.xml | 2 +-
 graphx/pom.xml                    | 2 +-
 launcher/pom.xml                  | 2 +-
 mllib/pom.xml                     | 2 +-
 network/common/pom.xml            | 2 +-
 network/shuffle/pom.xml           | 2 +-
 network/yarn/pom.xml              | 2 +-
 pom.xml                           | 2 +-
 repl/pom.xml                      | 2 +-
 sql/catalyst/pom.xml              | 2 +-
 sql/core/pom.xml                  | 2 +-
 sql/hive-thriftserver/pom.xml     | 2 +-
 sql/hive/pom.xml                  | 2 +-
 streaming/pom.xml                 | 2 +-
 tools/pom.xml                     | 2 +-
 unsafe/pom.xml                    | 2 +-
 yarn/pom.xml                      | 2 +-
 30 files changed, 30 insertions(+), 30 deletions(-)

http://git-wip-us.apache.org/repos/asf/spark/blob/8da8caab/assembly/pom.xml
diff --git a/assembly/pom.xml b/assembly/pom.xml
index b8a821d..b53d7c3 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.0</version>
+    <version>1.4.1-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/8da8caab/bagel/pom.xml
diff --git a/bagel/pom.xml b/bagel/pom.xml
index c1aa32b..d631ff5 100644
--- a/bagel/pom.xml
+++ b/bagel/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.0</version>
+    <version>1.4.1-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/8da8caab/core/pom.xml
diff --git a/core/pom.xml b/core/pom.xml
index 8acb923..adbb7c2 100644
--- a/core/pom.xml
+++ b/core/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.0</version>
+    <version>1.4.1-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/8da8caab/examples/pom.xml
diff --git a/examples/pom.xml b/examples/pom.xml
index 706a97d..bf804bb 100644
--- a/examples/pom.xml
+++ b/examples/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.0</version>
+    <version>1.4.1-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/8da8caab/external/flume-sink/pom.xml
diff --git a/external/flume-sink/pom.xml b/external/flume-sink/pom.xml
index e8784eb..076ddaa 100644
--- a/external/flume-sink/pom.xml
+++ b/external/flume-sink/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.0</version>
+    <version>1.4.1-SNAPSHOT</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/8da8caab/external/flume/pom.xml
diff --git a/external/flume/pom.xml b/external/flume/pom.xml
index 1794f3e..2491c97 100644
--- a/external/flume/pom.xml
+++ b/external/flume/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.0</version>
+    <version>1.4.1-SNAPSHOT</version>
Git Push Summary
Repository: spark
Updated Tags:  refs/tags/1.4.0-rc2-test [created] 8f50218f3
[1/2] spark git commit: Preparing Spark release 1.4.0-rc2-test
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 fbc4480d9 -> 8da8caab1

Preparing Spark release 1.4.0-rc2-test

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/8f50218f
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/8f50218f
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/8f50218f

Branch: refs/heads/branch-1.4
Commit: 8f50218f3842acbde64b7cb316c47702ea0f88af
Parents: fbc4480
Author: Patrick Wendell <pwend...@gmail.com>
Authored: Sat May 23 14:46:23 2015 -0700
Committer: Patrick Wendell <pwend...@gmail.com>
Committed: Sat May 23 14:46:23 2015 -0700

----------------------------------------------------------
 assembly/pom.xml                  | 2 +-
 bagel/pom.xml                     | 2 +-
 core/pom.xml                      | 2 +-
 examples/pom.xml                  | 2 +-
 external/flume-sink/pom.xml       | 2 +-
 external/flume/pom.xml            | 2 +-
 external/kafka-assembly/pom.xml   | 2 +-
 external/kafka/pom.xml            | 2 +-
 external/mqtt/pom.xml             | 2 +-
 external/twitter/pom.xml          | 2 +-
 external/zeromq/pom.xml           | 2 +-
 extras/java8-tests/pom.xml        | 2 +-
 extras/kinesis-asl/pom.xml        | 2 +-
 extras/spark-ganglia-lgpl/pom.xml | 2 +-
 graphx/pom.xml                    | 2 +-
 launcher/pom.xml                  | 2 +-
 mllib/pom.xml                     | 2 +-
 network/common/pom.xml            | 2 +-
 network/shuffle/pom.xml           | 2 +-
 network/yarn/pom.xml              | 2 +-
 pom.xml                           | 2 +-
 repl/pom.xml                      | 2 +-
 sql/catalyst/pom.xml              | 2 +-
 sql/core/pom.xml                  | 2 +-
 sql/hive-thriftserver/pom.xml     | 2 +-
 sql/hive/pom.xml                  | 2 +-
 streaming/pom.xml                 | 2 +-
 tools/pom.xml                     | 2 +-
 unsafe/pom.xml                    | 2 +-
 yarn/pom.xml                      | 2 +-
 30 files changed, 30 insertions(+), 30 deletions(-)

http://git-wip-us.apache.org/repos/asf/spark/blob/8f50218f/assembly/pom.xml
diff --git a/assembly/pom.xml b/assembly/pom.xml
index 626c857..b8a821d 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.0-SNAPSHOT</version>
+    <version>1.4.0</version>
     <relativePath>../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/8f50218f/bagel/pom.xml
diff --git a/bagel/pom.xml b/bagel/pom.xml
index 1f3dec9..c1aa32b 100644
--- a/bagel/pom.xml
+++ b/bagel/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.0-SNAPSHOT</version>
+    <version>1.4.0</version>
     <relativePath>../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/8f50218f/core/pom.xml
diff --git a/core/pom.xml b/core/pom.xml
index bfa49d0..8acb923 100644
--- a/core/pom.xml
+++ b/core/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.0-SNAPSHOT</version>
+    <version>1.4.0</version>
     <relativePath>../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/8f50218f/examples/pom.xml
diff --git a/examples/pom.xml b/examples/pom.xml
index 5b04b4f..706a97d 100644
--- a/examples/pom.xml
+++ b/examples/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.0-SNAPSHOT</version>
+    <version>1.4.0</version>
     <relativePath>../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/8f50218f/external/flume-sink/pom.xml
diff --git a/external/flume-sink/pom.xml b/external/flume-sink/pom.xml
index 1f3e619..e8784eb 100644
--- a/external/flume-sink/pom.xml
+++ b/external/flume-sink/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.0-SNAPSHOT</version>
+    <version>1.4.0</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/8f50218f/external/flume/pom.xml
diff --git a/external/flume/pom.xml b/external/flume/pom.xml
index 8df7edb..1794f3e 100644
--- a/external/flume/pom.xml
+++ b/external/flume/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
spark git commit: [SPARK-7287] [HOTFIX] Disable o.a.s.deploy.SparkSubmitSuite --packages
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 f2f74b9b1 -> 641edc99f

[SPARK-7287] [HOTFIX] Disable o.a.s.deploy.SparkSubmitSuite --packages

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/641edc99
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/641edc99
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/641edc99

Branch: refs/heads/branch-1.4
Commit: 641edc99fc66018409ca1c8edc373b0a790db1d9
Parents: f2f74b9
Author: Patrick Wendell <patr...@databricks.com>
Authored: Sat May 23 19:44:03 2015 -0700
Committer: Patrick Wendell <patr...@databricks.com>
Committed: Sat May 23 19:44:23 2015 -0700

----------------------------------------------------------
 .../src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

http://git-wip-us.apache.org/repos/asf/spark/blob/641edc99/core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala
diff --git a/core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala b/core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala
index 8f64ab5..ea9227a 100644
--- a/core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala
+++ b/core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala
@@ -335,7 +335,8 @@ class SparkSubmitSuite extends FunSuite with Matchers with ResetSystemProperties
     runSparkSubmit(args)
   }

-  test("includes jars passed in through --packages") {
+  // SPARK-7287
+  ignore("includes jars passed in through --packages") {
     val unusedJar = TestUtils.createJarWithClasses(Seq.empty)
     val main = MavenCoordinate("my.great.lib", "mylib", "0.1")
     val dep = MavenCoordinate("my.great.dep", "mylib", "0.1")
spark git commit: [SPARK-7287] [HOTFIX] Disable o.a.s.deploy.SparkSubmitSuite --packages
Repository: spark
Updated Branches:
  refs/heads/master b231baa24 -> 3c1a2d049

[SPARK-7287] [HOTFIX] Disable o.a.s.deploy.SparkSubmitSuite --packages

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/3c1a2d04
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/3c1a2d04
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/3c1a2d04

Branch: refs/heads/master
Commit: 3c1a2d049cd4bf35fd48a032f5008b7bab60833e
Parents: b231baa
Author: Patrick Wendell <patr...@databricks.com>
Authored: Sat May 23 19:44:03 2015 -0700
Committer: Patrick Wendell <patr...@databricks.com>
Committed: Sat May 23 19:44:03 2015 -0700

----------------------------------------------------------
 .../src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

http://git-wip-us.apache.org/repos/asf/spark/blob/3c1a2d04/core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala
diff --git a/core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala b/core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala
index 8f64ab5..ea9227a 100644
--- a/core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala
+++ b/core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala
@@ -335,7 +335,8 @@ class SparkSubmitSuite extends FunSuite with Matchers with ResetSystemProperties
     runSparkSubmit(args)
   }

-  test("includes jars passed in through --packages") {
+  // SPARK-7287
+  ignore("includes jars passed in through --packages") {
     val unusedJar = TestUtils.createJarWithClasses(Seq.empty)
     val main = MavenCoordinate("my.great.lib", "mylib", "0.1")
     val dep = MavenCoordinate("my.great.dep", "mylib", "0.1")
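ScalaTest makes this kind of hotfix a one-word edit: `ignore` has the same signature as `test`, so the body keeps compiling while the runner reports it as skipped. A minimal sketch (suite name and assertions are illustrative):

```
import org.scalatest.FunSuite

class PackagesSuiteSketch extends FunSuite {
  test("this test still runs") {
    assert(1 + 1 === 2)
  }

  // Changing `test` to `ignore` keeps the body compiling but the
  // framework skips it -- the one-word change this hotfix applies
  // to the --packages test.
  ignore("this test is skipped") {
    fail("never executed")
  }
}
```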
Git Push Summary
Repository: spark
Updated Tags:  refs/tags/v1.4.0-rc2-test [deleted] 0da739699
spark git commit: [SPARK-6811] Copy SparkR lib in make-distribution.sh
Repository: spark
Updated Branches:
  refs/heads/master 7af3818c6 -> a40bca011

[SPARK-6811] Copy SparkR lib in make-distribution.sh

This change also removes native libraries from SparkR to make sure our distribution works across platforms.

Tested by building on Mac, running on Amazon Linux (CentOS), Windows VM, and vice-versa (built on Linux, run on Mac). I will also test this with YARN soon and update this PR.

Author: Shivaram Venkataraman <shiva...@cs.berkeley.edu>

Closes #6373 from shivaram/sparkr-binary and squashes the following commits:

ae41b5c [Shivaram Venkataraman] Remove native libraries from SparkR Also include the built SparkR package in make-distribution.sh

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/a40bca01
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/a40bca01
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/a40bca01

Branch: refs/heads/master
Commit: a40bca0111de45763c3ef4270afb2185c16b8f95
Parents: 7af3818
Author: Shivaram Venkataraman <shiva...@cs.berkeley.edu>
Authored: Sat May 23 00:04:01 2015 -0700
Committer: Shivaram Venkataraman <shiva...@cs.berkeley.edu>
Committed: Sat May 23 00:04:01 2015 -0700

----------------------------------------------------------
 R/pkg/NAMESPACE                     |  5 +++-
 R/pkg/R/utils.R                     | 38 +++++++++++++++-
 R/pkg/src-native/Makefile           | 27 ++++++++++
 R/pkg/src-native/Makefile.win       | 27 ++++++++++
 R/pkg/src-native/string_hash_code.c | 49 ++++++++++++++++++
 R/pkg/src/Makefile                  | 27 ----------
 R/pkg/src/Makefile.win              | 27 ----------
 R/pkg/src/string_hash_code.c        | 49 ------------------
 make-distribution.sh                |  2 ++
 9 files changed, 146 insertions(+), 105 deletions(-)

http://git-wip-us.apache.org/repos/asf/spark/blob/a40bca01/R/pkg/NAMESPACE
diff --git a/R/pkg/NAMESPACE b/R/pkg/NAMESPACE
index 64ffdcf..411126a 100644
--- a/R/pkg/NAMESPACE
+++ b/R/pkg/NAMESPACE
@@ -1,6 +1,9 @@
 # Imports from base R
 importFrom(methods, setGeneric, setMethod, setOldClass)
-useDynLib(SparkR, stringHashCode)
+
+# Disable native libraries till we figure out how to package it
+# See SPARKR-7839
+#useDynLib(SparkR, stringHashCode)

 # S3 methods exported
 export(sparkR.init)

http://git-wip-us.apache.org/repos/asf/spark/blob/a40bca01/R/pkg/R/utils.R
diff --git a/R/pkg/R/utils.R b/R/pkg/R/utils.R
index 0e7b7bd..69b2700 100644
--- a/R/pkg/R/utils.R
+++ b/R/pkg/R/utils.R
@@ -122,13 +122,49 @@ hashCode <- function(key) {
     intBits <- packBits(rawToBits(rawVec), "integer")
     as.integer(bitwXor(intBits[2], intBits[1]))
   } else if (class(key) == "character") {
-    .Call("stringHashCode", key)
+    # TODO: SPARK-7839 means we might not have the native library available
+    if (is.loaded("stringHashCode")) {
+      .Call("stringHashCode", key)
+    } else {
+      n <- nchar(key)
+      if (n == 0) {
+        0L
+      } else {
+        asciiVals <- sapply(charToRaw(key), function(x) { strtoi(x, 16L) })
+        hashC <- 0
+        for (k in 1:length(asciiVals)) {
+          hashC <- mult31AndAdd(hashC, asciiVals[k])
+        }
+        as.integer(hashC)
+      }
+    }
   } else {
     warning(paste("Could not hash object, returning 0", sep = ""))
     as.integer(0)
   }
 }

+# Helper function used to wrap a 'numeric' value to integer bounds.
+# Useful for implementing C-like integer arithmetic
+wrapInt <- function(value) {
+  if (value > .Machine$integer.max) {
+    value <- value - 2 * .Machine$integer.max - 2
+  } else if (value < -1 * .Machine$integer.max) {
+    value <- 2 * .Machine$integer.max + value + 2
+  }
+  value
+}
+
+# Multiply `val` by 31 and add `addVal` to the result. Ensures that
+# integer-overflows are handled at every step.
+mult31AndAdd <- function(val, addVal) {
+  vec <- c(bitwShiftL(val, c(4, 3, 2, 1, 0)), addVal)
+  Reduce(function(a, b) {
+           wrapInt(as.numeric(a) + as.numeric(b))
+         },
+         vec)
+}
+
 # Create a new RDD with serializedMode == "byte".
 # Return itself if already in "byte" format.
 serializeToBytes <- function(rdd) {

http://git-wip-us.apache.org/repos/asf/spark/blob/a40bca01/R/pkg/src-native/Makefile
diff --git a/R/pkg/src-native/Makefile b/R/pkg/src-native/Makefile
new file mode 100644
index 0000000..a55a56f
--- /dev/null
+++ b/R/pkg/src-native/Makefile
@@ -0,0 +1,27 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses
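The pure-R fallback above (`wrapInt` plus `mult31AndAdd`, whose shift vector 16+8+4+2+1 sums to 31) reproduces Java's `String.hashCode`, which R cannot express directly because its integers do not silently wrap. A short Scala sketch of the arithmetic being emulated (object and method names are illustrative); on the JVM the 32-bit wrap-around comes for free:

```
object StringHashSketch {
  // Java's String.hashCode: h = 31*h + c for each character,
  // with 32-bit overflow -- exactly what the R fallback emulates.
  def javaStringHash(s: String): Int =
    s.foldLeft(0)((h, c) => 31 * h + c.toInt)

  def main(args: Array[String]): Unit = {
    val key = "sparkr"
    println(javaStringHash(key)) // same value as the line below
    println(key.hashCode)
  }
}
```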
spark git commit: [SPARK-6811] Copy SparkR lib in make-distribution.sh
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 c636b87dc -> c8eb76ba6

[SPARK-6811] Copy SparkR lib in make-distribution.sh

This change also removes native libraries from SparkR to make sure our distribution works across platforms.

Tested by building on Mac, running on Amazon Linux (CentOS), Windows VM, and vice-versa (built on Linux, run on Mac). I will also test this with YARN soon and update this PR.

Author: Shivaram Venkataraman <shiva...@cs.berkeley.edu>

Closes #6373 from shivaram/sparkr-binary and squashes the following commits:

ae41b5c [Shivaram Venkataraman] Remove native libraries from SparkR Also include the built SparkR package in make-distribution.sh

(cherry picked from commit a40bca0111de45763c3ef4270afb2185c16b8f95)
Signed-off-by: Shivaram Venkataraman <shiva...@cs.berkeley.edu>

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/c8eb76ba
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/c8eb76ba
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/c8eb76ba

Branch: refs/heads/branch-1.4
Commit: c8eb76ba673026f2fb2b22e8b3e8102a5940297c
Parents: c636b87
Author: Shivaram Venkataraman <shiva...@cs.berkeley.edu>
Authored: Sat May 23 00:04:01 2015 -0700
Committer: Shivaram Venkataraman <shiva...@cs.berkeley.edu>
Committed: Sat May 23 00:04:32 2015 -0700

----------------------------------------------------------
 R/pkg/NAMESPACE                     |  5 +++-
 R/pkg/R/utils.R                     | 38 +++++++++++++++-
 R/pkg/src-native/Makefile           | 27 ++++++++++
 R/pkg/src-native/Makefile.win       | 27 ++++++++++
 R/pkg/src-native/string_hash_code.c | 49 ++++++++++++++++++
 R/pkg/src/Makefile                  | 27 ----------
 R/pkg/src/Makefile.win              | 27 ----------
 R/pkg/src/string_hash_code.c        | 49 ------------------
 make-distribution.sh                |  2 ++
 9 files changed, 146 insertions(+), 105 deletions(-)

http://git-wip-us.apache.org/repos/asf/spark/blob/c8eb76ba/R/pkg/NAMESPACE
diff --git a/R/pkg/NAMESPACE b/R/pkg/NAMESPACE
index 64ffdcf..411126a 100644
--- a/R/pkg/NAMESPACE
+++ b/R/pkg/NAMESPACE
@@ -1,6 +1,9 @@
 # Imports from base R
 importFrom(methods, setGeneric, setMethod, setOldClass)
-useDynLib(SparkR, stringHashCode)
+
+# Disable native libraries till we figure out how to package it
+# See SPARKR-7839
+#useDynLib(SparkR, stringHashCode)

 # S3 methods exported
 export(sparkR.init)

http://git-wip-us.apache.org/repos/asf/spark/blob/c8eb76ba/R/pkg/R/utils.R
diff --git a/R/pkg/R/utils.R b/R/pkg/R/utils.R
index 0e7b7bd..69b2700 100644
--- a/R/pkg/R/utils.R
+++ b/R/pkg/R/utils.R
@@ -122,13 +122,49 @@ hashCode <- function(key) {
     intBits <- packBits(rawToBits(rawVec), "integer")
     as.integer(bitwXor(intBits[2], intBits[1]))
   } else if (class(key) == "character") {
-    .Call("stringHashCode", key)
+    # TODO: SPARK-7839 means we might not have the native library available
+    if (is.loaded("stringHashCode")) {
+      .Call("stringHashCode", key)
+    } else {
+      n <- nchar(key)
+      if (n == 0) {
+        0L
+      } else {
+        asciiVals <- sapply(charToRaw(key), function(x) { strtoi(x, 16L) })
+        hashC <- 0
+        for (k in 1:length(asciiVals)) {
+          hashC <- mult31AndAdd(hashC, asciiVals[k])
+        }
+        as.integer(hashC)
+      }
+    }
   } else {
     warning(paste("Could not hash object, returning 0", sep = ""))
     as.integer(0)
   }
 }

+# Helper function used to wrap a 'numeric' value to integer bounds.
+# Useful for implementing C-like integer arithmetic
+wrapInt <- function(value) {
+  if (value > .Machine$integer.max) {
+    value <- value - 2 * .Machine$integer.max - 2
+  } else if (value < -1 * .Machine$integer.max) {
+    value <- 2 * .Machine$integer.max + value + 2
+  }
+  value
+}
+
+# Multiply `val` by 31 and add `addVal` to the result. Ensures that
+# integer-overflows are handled at every step.
+mult31AndAdd <- function(val, addVal) {
+  vec <- c(bitwShiftL(val, c(4, 3, 2, 1, 0)), addVal)
+  Reduce(function(a, b) {
+           wrapInt(as.numeric(a) + as.numeric(b))
+         },
+         vec)
+}
+
 # Create a new RDD with serializedMode == "byte".
 # Return itself if already in "byte" format.
 serializeToBytes <- function(rdd) {

http://git-wip-us.apache.org/repos/asf/spark/blob/c8eb76ba/R/pkg/src-native/Makefile
diff --git a/R/pkg/src-native/Makefile b/R/pkg/src-native/Makefile
new file mode 100644
index 0000000..a55a56f
--- /dev/null
+++ b/R/pkg/src-native/Makefile
@@ -0,0 +1,27 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses
spark git commit: [SPARK-6806] [SPARKR] [DOCS] Fill in SparkR examples in programming guide
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 b928db4fe -> c636b87dc

[SPARK-6806] [SPARKR] [DOCS] Fill in SparkR examples in programming guide

sqlCtx -> sqlContext

You can check the docs by:

```
$ cd docs
$ SKIP_SCALADOC=1 jekyll serve
```

cc shivaram

Author: Davies Liu <dav...@databricks.com>

Closes #5442 from davies/r_docs and squashes the following commits:

7a12ec6 [Davies Liu] remove rdd in R docs
8496b26 [Davies Liu] remove the docs related to RDD
e23b9d6 [Davies Liu] delete R docs for RDD API
222e4ff [Davies Liu] Merge branch 'master' into r_docs
89684ce [Davies Liu] Merge branch 'r_docs' of github.com:davies/spark into r_docs
f0a10e1 [Davies Liu] address comments from @shivaram
f61de71 [Davies Liu] Update pairRDD.R
3ef7cf3 [Davies Liu] use + instead of function(a,b) a+b
2f10a77 [Davies Liu] address comments from @cafreeman
9c2a062 [Davies Liu] mention R api together with Python API
23f751a [Davies Liu] Fill in SparkR examples in programming guide

(cherry picked from commit 7af3818c6b2bf35bfa531ab7cc3a4a714385015e)
Signed-off-by: Shivaram Venkataraman <shiva...@cs.berkeley.edu>

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/c636b87d
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/c636b87d
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/c636b87d

Branch: refs/heads/branch-1.4
Commit: c636b87dc287ce99a887bc59cad31aaf48477a56
Parents: b928db4
Author: Davies Liu <dav...@databricks.com>
Authored: Sat May 23 00:00:30 2015 -0700
Committer: Shivaram Venkataraman <shiva...@cs.berkeley.edu>
Committed: Sat May 23 00:02:22 2015 -0700

----------------------------------------------------------
 R/README.md                      |   4 +-
 R/pkg/R/DataFrame.R              | 176 ++++++------
 R/pkg/R/RDD.R                    |   2 +-
 R/pkg/R/SQLContext.R             | 165 ++++++-----
 R/pkg/R/pairRDD.R                |   4 +-
 R/pkg/R/sparkR.R                 |  10 +-
 R/pkg/inst/profile/shell.R       |   6 +-
 R/pkg/inst/tests/test_sparkSQL.R | 156 ++++++-----
 docs/_plugins/copy_api_dirs.rb   |  68 +++---
 docs/api.md                      |   3 +-
 docs/index.md                    |  23 ++-
 docs/programming-guide.md        |  21 +-
 docs/quick-start.md              |  18 +-
 docs/sql-programming-guide.md    | 373 ++++++++++++++++-----
 14 files changed, 706 insertions(+), 323 deletions(-)

http://git-wip-us.apache.org/repos/asf/spark/blob/c636b87d/R/README.md
diff --git a/R/README.md b/R/README.md
index a6970e3..d7d65b4 100644
--- a/R/README.md
+++ b/R/README.md
@@ -52,7 +52,7 @@
 The SparkR documentation (Rd files and HTML files) are not a part of the source repository.

 SparkR comes with several sample programs in the `examples/src/main/r` directory. To run one of them, use `./bin/sparkR <filename> <args>`. For example:

-    ./bin/sparkR examples/src/main/r/pi.R local[2]
+    ./bin/sparkR examples/src/main/r/dataframe.R

 You can also run the unit-tests for SparkR by running (you need to install the [testthat](http://cran.r-project.org/web/packages/testthat/index.html) package first):
@@ -63,5 +63,5 @@
 The `./bin/spark-submit` and `./bin/sparkR` can also be used to submit jobs to YARN clusters. You will need to set YARN conf dir before doing so. For example on CDH you can run
 ```
 export YARN_CONF_DIR=/etc/hadoop/conf
-./bin/spark-submit --master yarn examples/src/main/r/pi.R 4
+./bin/spark-submit --master yarn examples/src/main/r/dataframe.R
 ```

http://git-wip-us.apache.org/repos/asf/spark/blob/c636b87d/R/pkg/R/DataFrame.R
diff --git a/R/pkg/R/DataFrame.R b/R/pkg/R/DataFrame.R
index a7fa32e..ed8093c 100644
--- a/R/pkg/R/DataFrame.R
+++ b/R/pkg/R/DataFrame.R
@@ -65,9 +65,9 @@ dataFrame <- function(sdf, isCached = FALSE) {
 #' @examples
 #'\dontrun{
 #' sc <- sparkR.init()
-#' sqlCtx <- sparkRSQL.init(sc)
+#' sqlContext <- sparkRSQL.init(sc)
 #' path <- "path/to/file.json"
-#' df <- jsonFile(sqlCtx, path)
+#' df <- jsonFile(sqlContext, path)
 #' printSchema(df)
 #'}
 setMethod("printSchema",
@@ -88,9 +88,9 @@ setMethod("printSchema",
 #' @examples
 #'\dontrun{
 #' sc <- sparkR.init()
-#' sqlCtx <- sparkRSQL.init(sc)
+#' sqlContext <- sparkRSQL.init(sc)
 #' path <- "path/to/file.json"
-#' df <- jsonFile(sqlCtx, path)
+#' df <- jsonFile(sqlContext, path)
 #' dfSchema <- schema(df)
 #'}
 setMethod("schema",
@@ -110,9 +110,9 @@ setMethod("schema",
 #' @examples
 #'\dontrun{
 #' sc <- sparkR.init()
-#' sqlCtx <- sparkRSQL.init(sc)
+#' sqlContext <- sparkRSQL.init(sc)
 #' path <- "path/to/file.json"
-#' df <- jsonFile(sqlCtx, path)
+#' df <- jsonFile(sqlContext, path)
 #' explain(df, TRUE)
 #'}
 setMethod("explain",
spark git commit: [SPARK-6806] [SPARKR] [DOCS] Fill in SparkR examples in programming guide
Repository: spark
Updated Branches:
  refs/heads/master 4583cf4be -> 7af3818c6

[SPARK-6806] [SPARKR] [DOCS] Fill in SparkR examples in programming guide

sqlCtx -> sqlContext

You can check the docs by:

```
$ cd docs
$ SKIP_SCALADOC=1 jekyll serve
```

cc shivaram

Author: Davies Liu <dav...@databricks.com>

Closes #5442 from davies/r_docs and squashes the following commits:

7a12ec6 [Davies Liu] remove rdd in R docs
8496b26 [Davies Liu] remove the docs related to RDD
e23b9d6 [Davies Liu] delete R docs for RDD API
222e4ff [Davies Liu] Merge branch 'master' into r_docs
89684ce [Davies Liu] Merge branch 'r_docs' of github.com:davies/spark into r_docs
f0a10e1 [Davies Liu] address comments from @shivaram
f61de71 [Davies Liu] Update pairRDD.R
3ef7cf3 [Davies Liu] use + instead of function(a,b) a+b
2f10a77 [Davies Liu] address comments from @cafreeman
9c2a062 [Davies Liu] mention R api together with Python API
23f751a [Davies Liu] Fill in SparkR examples in programming guide

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/7af3818c
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/7af3818c
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/7af3818c

Branch: refs/heads/master
Commit: 7af3818c6b2bf35bfa531ab7cc3a4a714385015e
Parents: 4583cf4
Author: Davies Liu <dav...@databricks.com>
Authored: Sat May 23 00:00:30 2015 -0700
Committer: Shivaram Venkataraman <shiva...@cs.berkeley.edu>
Committed: Sat May 23 00:01:40 2015 -0700

----------------------------------------------------------
 R/README.md                      |   4 +-
 R/pkg/R/DataFrame.R              | 176 ++++++------
 R/pkg/R/RDD.R                    |   2 +-
 R/pkg/R/SQLContext.R             | 165 ++++++-----
 R/pkg/R/pairRDD.R                |   4 +-
 R/pkg/R/sparkR.R                 |  10 +-
 R/pkg/inst/profile/shell.R       |   6 +-
 R/pkg/inst/tests/test_sparkSQL.R | 156 ++++++-----
 docs/_plugins/copy_api_dirs.rb   |  68 +++---
 docs/api.md                      |   3 +-
 docs/index.md                    |  23 ++-
 docs/programming-guide.md        |  21 +-
 docs/quick-start.md              |  18 +-
 docs/sql-programming-guide.md    | 373 ++++++++++++++++-----
 14 files changed, 706 insertions(+), 323 deletions(-)

http://git-wip-us.apache.org/repos/asf/spark/blob/7af3818c/R/README.md
diff --git a/R/README.md b/R/README.md
index a6970e3..d7d65b4 100644
--- a/R/README.md
+++ b/R/README.md
@@ -52,7 +52,7 @@
 The SparkR documentation (Rd files and HTML files) are not a part of the source repository.

 SparkR comes with several sample programs in the `examples/src/main/r` directory. To run one of them, use `./bin/sparkR <filename> <args>`. For example:

-    ./bin/sparkR examples/src/main/r/pi.R local[2]
+    ./bin/sparkR examples/src/main/r/dataframe.R

 You can also run the unit-tests for SparkR by running (you need to install the [testthat](http://cran.r-project.org/web/packages/testthat/index.html) package first):
@@ -63,5 +63,5 @@
 The `./bin/spark-submit` and `./bin/sparkR` can also be used to submit jobs to YARN clusters. You will need to set YARN conf dir before doing so. For example on CDH you can run
 ```
 export YARN_CONF_DIR=/etc/hadoop/conf
-./bin/spark-submit --master yarn examples/src/main/r/pi.R 4
+./bin/spark-submit --master yarn examples/src/main/r/dataframe.R
 ```

http://git-wip-us.apache.org/repos/asf/spark/blob/7af3818c/R/pkg/R/DataFrame.R
diff --git a/R/pkg/R/DataFrame.R b/R/pkg/R/DataFrame.R
index a7fa32e..ed8093c 100644
--- a/R/pkg/R/DataFrame.R
+++ b/R/pkg/R/DataFrame.R
@@ -65,9 +65,9 @@ dataFrame <- function(sdf, isCached = FALSE) {
 #' @examples
 #'\dontrun{
 #' sc <- sparkR.init()
-#' sqlCtx <- sparkRSQL.init(sc)
+#' sqlContext <- sparkRSQL.init(sc)
 #' path <- "path/to/file.json"
-#' df <- jsonFile(sqlCtx, path)
+#' df <- jsonFile(sqlContext, path)
 #' printSchema(df)
 #'}
 setMethod("printSchema",
@@ -88,9 +88,9 @@ setMethod("printSchema",
 #' @examples
 #'\dontrun{
 #' sc <- sparkR.init()
-#' sqlCtx <- sparkRSQL.init(sc)
+#' sqlContext <- sparkRSQL.init(sc)
 #' path <- "path/to/file.json"
-#' df <- jsonFile(sqlCtx, path)
+#' df <- jsonFile(sqlContext, path)
 #' dfSchema <- schema(df)
 #'}
 setMethod("schema",
@@ -110,9 +110,9 @@ setMethod("schema",
 #' @examples
 #'\dontrun{
 #' sc <- sparkR.init()
-#' sqlCtx <- sparkRSQL.init(sc)
+#' sqlContext <- sparkRSQL.init(sc)
 #' path <- "path/to/file.json"
-#' df <- jsonFile(sqlCtx, path)
+#' df <- jsonFile(sqlContext, path)
 #' explain(df, TRUE)
 #'}
 setMethod("explain",
@@ -139,9 +139,9 @@ setMethod("explain",
 #' @examples
 #'\dontrun{
 #' sc <- sparkR.init()
-#' sqlCtx <- sparkRSQL.init(sc)
+#' sqlContext <- sparkRSQL.init(sc)
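For comparison with the R examples being renamed above, the Scala-side entry point the guide pairs them with looks roughly like this in the 1.4-era API (a sketch; the object name and inline JSON record are illustrative):

```
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object SqlContextNaming {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("SqlContextNaming").setMaster("local[2]"))
    // The conventional name the docs settle on: `sqlContext`, not `sqlCtx`.
    val sqlContext = new SQLContext(sc)
    val json = sc.parallelize(Seq("""{"name":"Alice","age":30}"""))
    val df = sqlContext.read.json(json)
    df.printSchema()
    sc.stop()
  }
}
```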
spark git commit: [HOTFIX] Add tests for SparkListenerApplicationStart with Driver Logs.
Repository: spark
Updated Branches:
  refs/heads/master baa89838c -> 368b8c2b5

[HOTFIX] Add tests for SparkListenerApplicationStart with Driver Logs.

#6166 added the driver logs to `SparkListenerApplicationStart`. This adds tests in `JsonProtocolSuite` to ensure we don't regress.

Author: Hari Shreedharan <hshreedha...@apache.org>

Closes #6368 from harishreedharan/jsonprotocol-test and squashes the following commits:

dc9eafc [Hari Shreedharan] [HOTFIX] Add tests for SparkListenerApplicationStart with Driver Logs.

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/368b8c2b
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/368b8c2b
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/368b8c2b

Branch: refs/heads/master
Commit: 368b8c2b5ed8b06b00ac87059f75915b13ba3b8d
Parents: baa8983
Author: Hari Shreedharan <hshreedha...@apache.org>
Authored: Fri May 22 23:07:56 2015 -0700
Committer: Andrew Or <and...@databricks.com>
Committed: Fri May 22 23:07:56 2015 -0700

----------------------------------------------------------
 .../apache/spark/util/JsonProtocolSuite.scala | 25 ++++++++++++++++++---
 1 file changed, 23 insertions(+), 2 deletions(-)

http://git-wip-us.apache.org/repos/asf/spark/blob/368b8c2b/core/src/test/scala/org/apache/spark/util/JsonProtocolSuite.scala
diff --git a/core/src/test/scala/org/apache/spark/util/JsonProtocolSuite.scala b/core/src/test/scala/org/apache/spark/util/JsonProtocolSuite.scala
index 0c5221d..0d9126f 100644
--- a/core/src/test/scala/org/apache/spark/util/JsonProtocolSuite.scala
+++ b/core/src/test/scala/org/apache/spark/util/JsonProtocolSuite.scala
@@ -75,10 +75,12 @@ class JsonProtocolSuite extends FunSuite {
     val blockManagerRemoved = SparkListenerBlockManagerRemoved(2L,
       BlockManagerId("Scarce", "to be counted...", 100))
     val unpersistRdd = SparkListenerUnpersistRDD(12345)
+    val logUrlMap = Map("stderr" -> "mystderr", "stdout" -> "mystdout").toMap
     val applicationStart = SparkListenerApplicationStart("The winner of all", Some("appId"),
       42L, "Garfield", Some("appAttempt"))
+    val applicationStartWithLogs = SparkListenerApplicationStart("The winner of all", Some("appId"),
+      42L, "Garfield", Some("appAttempt"), Some(logUrlMap))
     val applicationEnd = SparkListenerApplicationEnd(42L)
-    val logUrlMap = Map("stderr" -> "mystderr", "stdout" -> "mystdout").toMap
     val executorAdded = SparkListenerExecutorAdded(executorAddedTime, "exec1",
       new ExecutorInfo("Hostee.awesome.com", 11, logUrlMap))
     val executorRemoved = SparkListenerExecutorRemoved(executorRemovedTime, "exec2", "test reason")
@@ -97,6 +99,7 @@
     testEvent(blockManagerRemoved, blockManagerRemovedJsonString)
     testEvent(unpersistRdd, unpersistRDDJsonString)
     testEvent(applicationStart, applicationStartJsonString)
+    testEvent(applicationStartWithLogs, applicationStartJsonWithLogUrlsString)
     testEvent(applicationEnd, applicationEndJsonString)
     testEvent(executorAdded, executorAddedJsonString)
     testEvent(executorRemoved, executorRemovedJsonString)
@@ -277,10 +280,12 @@
   test("SparkListenerApplicationStart backwards compatibility") {
     // SparkListenerApplicationStart in Spark 1.0.0 do not have an "appId" property.
     // SparkListenerApplicationStart pre-Spark 1.4 does not have "appAttemptId".
-    val applicationStart = SparkListenerApplicationStart("test", None, 1L, "user", None)
+    // SparkListenerApplicationStart pre-Spark 1.5 does not have "driverLogs"
+    val applicationStart = SparkListenerApplicationStart("test", None, 1L, "user", None, None)
     val oldEvent = JsonProtocol.applicationStartToJson(applicationStart)
       .removeField({ _._1 == "App ID" })
       .removeField({ _._1 == "App Attempt ID" })
+      .removeField({ _._1 == "Driver Logs"})
     assert(applicationStart === JsonProtocol.applicationStartFromJson(oldEvent))
   }

@@ -1544,6 +1549,22 @@
     """
   }

+  private val applicationStartJsonWithLogUrlsString =
+    """
+      |{
+      |  "Event": "SparkListenerApplicationStart",
+      |  "App Name": "The winner of all",
+      |  "App ID": "appId",
+      |  "Timestamp": 42,
+      |  "User": "Garfield",
+      |  "App Attempt ID": "appAttempt",
+      |  "Driver Logs" : {
+      |      "stderr" : "mystderr",
+      |      "stdout" : "mystdout"
+      |  }
+      |}
+    """
+
   private val applicationEndJsonString =
     """
       |{
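The backwards-compatibility test works by stripping the new fields from freshly generated JSON so it looks like output written by an older Spark. A minimal sketch of that trick with json4s, the JSON library the suite uses (field names here are illustrative):

```
import org.json4s._
import org.json4s.jackson.JsonMethods._

object BackwardCompatSketch {
  def main(args: Array[String]): Unit = {
    // Generate new-format JSON, then remove the fields the old
    // version never emitted -- the same idea as removeField above.
    val newFormat = parse(
      """{"Event": "AppStart", "App ID": "appId", "Driver Logs": {"stdout": "o"}}""")
    val oldFormat = newFormat
      .removeField { case (name, _) => name == "App ID" }
      .removeField { case (name, _) => name == "Driver Logs" }
    println(compact(render(oldFormat))) // {"Event":"AppStart"}
  }
}
```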
spark git commit: [SPARK-5090] [EXAMPLES] The improvement of python converter for hbase
Repository: spark
Updated Branches:
  refs/heads/master 368b8c2b5 -> 4583cf4be

[SPARK-5090] [EXAMPLES] The improvement of python converter for hbase

Hi, following the discussion in http://apache-spark-developers-list.1001551.n3.nabble.com/python-converter-in-HBaseConverter-scala-spark-examples-td10001.html, I made some modifications to three files in the examples package:

1. HBaseConverters.scala: the new converter converts all the records in an HBase Result into a single string
2. hbase_input.py: as the value string may contain several records, we can use the ast package to convert the string into a dict
3. HBaseTest.scala: as the examples package uses HBase 0.98.7, the original HTableDescriptor constructor is deprecated, so it is updated to the new constructor

Author: GenTang <gen.tan...@gmail.com>

Closes #3920 from GenTang/master and squashes the following commits:

d2153df [GenTang] import JSONObject precisely
4802481 [GenTang] dump the result into a singl String
62df7f0 [GenTang] remove the comment
21de653 [GenTang] return the string in json format
15b1fe3 [GenTang] the modification of comments
5cbbcfc [GenTang] the improvement of pythonconverter
ceb31c5 [GenTang] the modification for adapting updation of hbase
3253b61 [GenTang] the modification accompanying the improvement of pythonconverter

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/4583cf4b
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/4583cf4b
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/4583cf4b

Branch: refs/heads/master
Commit: 4583cf4be17155c68178155acf6866d7cc8f7df0
Parents: 368b8c2
Author: GenTang <gen.tan...@gmail.com>
Authored: Fri May 22 23:37:03 2015 -0700
Committer: Davies Liu <dav...@databricks.com>
Committed: Fri May 22 23:37:03 2015 -0700

----------------------------------------------------------
 examples/src/main/python/hbase_inputformat.py | 21 ++++++----
 .../pythonconverters/HBaseConverters.scala    | 20 +++++-----
 2 files changed, 30 insertions(+), 11 deletions(-)

http://git-wip-us.apache.org/repos/asf/spark/blob/4583cf4b/examples/src/main/python/hbase_inputformat.py
diff --git a/examples/src/main/python/hbase_inputformat.py b/examples/src/main/python/hbase_inputformat.py
index 5b82a14..c5ae5d0 100644
--- a/examples/src/main/python/hbase_inputformat.py
+++ b/examples/src/main/python/hbase_inputformat.py
@@ -18,6 +18,7 @@
 from __future__ import print_function

 import sys
+import json

 from pyspark import SparkContext

@@ -27,24 +28,24 @@
 """
 Create test data in HBase first:

 hbase(main):016:0> create 'test', 'f1'
 0 row(s) in 1.0430 seconds

-hbase(main):017:0> put 'test', 'row1', 'f1', 'value1'
+hbase(main):017:0> put 'test', 'row1', 'f1:a', 'value1'
 0 row(s) in 0.0130 seconds

-hbase(main):018:0> put 'test', 'row2', 'f1', 'value2'
+hbase(main):018:0> put 'test', 'row1', 'f1:b', 'value2'
 0 row(s) in 0.0030 seconds

-hbase(main):019:0> put 'test', 'row3', 'f1', 'value3'
+hbase(main):019:0> put 'test', 'row2', 'f1', 'value3'
 0 row(s) in 0.0050 seconds

-hbase(main):020:0> put 'test', 'row4', 'f1', 'value4'
+hbase(main):020:0> put 'test', 'row3', 'f1', 'value4'
 0 row(s) in 0.0110 seconds

 hbase(main):021:0> scan 'test'
 ROW                   COLUMN+CELL
- row1                 column=f1:, timestamp=1401883411986, value=value1
- row2                 column=f1:, timestamp=1401883415212, value=value2
- row3                 column=f1:, timestamp=1401883417858, value=value3
- row4                 column=f1:, timestamp=1401883420805, value=value4
+ row1                 column=f1:a, timestamp=1401883411986, value=value1
+ row1                 column=f1:b, timestamp=1401883415212, value=value2
+ row2                 column=f1:, timestamp=1401883417858, value=value3
+ row3                 column=f1:, timestamp=1401883420805, value=value4
 4 row(s) in 0.0240 seconds
 """

 if __name__ == "__main__":
     ...
     table = sys.argv[2]
     sc = SparkContext(appName="HBaseInputFormat")

+    # Other options for configuring scan behavior are available. More information available at
+    # https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TableInputFormat.java
     conf = {"hbase.zookeeper.quorum": host, "hbase.mapreduce.inputtable": table}
     if len(sys.argv) > 3:
         conf = {"hbase.zookeeper.quorum": host, "zookeeper.znode.parent": sys.argv[3],
@@ -78,6 +81,8 @@
         keyConverter=keyConv,
         valueConverter=valueConv,
         conf=conf)
+    hbase_rdd = hbase_rdd.flatMapValues(lambda v: v.split("\n")).mapValues(json.loads)
+
     output = hbase_rdd.collect()
     for (k, v) in
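The driver-side reshaping is the key step: the converter now packs several records into one newline-separated JSON string, and the driver splits and parses per record. A Scala sketch of the same split (the sample value string is invented for illustration):

```
import org.apache.spark.{SparkConf, SparkContext}

object SplitPackedValues {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("SplitPackedValues").setMaster("local[2]"))
    // Stand-in for a converter output: one value string holding
    // several JSON records separated by newlines.
    val packed = sc.parallelize(Seq(
      ("row1",
       """{"qualifier":"a","value":"value1"}""" + "\n" +
       """{"qualifier":"b","value":"value2"}""")))
    // Same reshaping the PR's Python driver does with
    // flatMapValues + json.loads: one pair per JSON record.
    val perRecord = packed.flatMapValues(_.split("\n"))
    perRecord.collect().foreach(println)
    sc.stop()
  }
}
```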
spark git commit: [SPARK-7838] [STREAMING] Set scope for kinesis stream
Repository: spark
Updated Branches:
  refs/heads/master 017b3404a -> baa89838c

[SPARK-7838] [STREAMING] Set scope for kinesis stream

Author: Tathagata Das <tathagata.das1...@gmail.com>

Closes #6369 from tdas/SPARK-7838 and squashes the following commits:

87d1c7f [Tathagata Das] Addressed comment
37775d8 [Tathagata Das] set scope for kinesis stream

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/baa89838
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/baa89838
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/baa89838

Branch: refs/heads/master
Commit: baa89838cca96fa091c9e5ce62be01e1a265d820
Parents: 017b340
Author: Tathagata Das <tathagata.das1...@gmail.com>
Authored: Fri May 22 23:05:54 2015 -0700
Committer: Andrew Or <and...@databricks.com>
Committed: Fri May 22 23:05:54 2015 -0700

----------------------------------------------------------
 .../org/apache/spark/streaming/kinesis/KinesisUtils.scala   | 9 ++++---
 .../scala/org/apache/spark/streaming/StreamingContext.scala | 2 +-
 2 files changed, 7 insertions(+), 4 deletions(-)

http://git-wip-us.apache.org/repos/asf/spark/blob/baa89838/extras/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisUtils.scala
diff --git a/extras/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisUtils.scala b/extras/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisUtils.scala
index b114bcf..2531aeb 100644
--- a/extras/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisUtils.scala
+++ b/extras/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisUtils.scala
@@ -63,9 +63,12 @@ object KinesisUtils {
       checkpointInterval: Duration,
       storageLevel: StorageLevel
     ): ReceiverInputDStream[Array[Byte]] = {
-    ssc.receiverStream(
-      new KinesisReceiver(kinesisAppName, streamName, endpointUrl, validateRegion(regionName),
-        initialPositionInStream, checkpointInterval, storageLevel, None))
+    // Setting scope to override receiver stream's scope of receiver stream
+    ssc.withNamedScope("kinesis stream") {
+      ssc.receiverStream(
+        new KinesisReceiver(kinesisAppName, streamName, endpointUrl, validateRegion(regionName),
+          initialPositionInStream, checkpointInterval, storageLevel, None))
+    }
   }

 /**

http://git-wip-us.apache.org/repos/asf/spark/blob/baa89838/streaming/src/main/scala/org/apache/spark/streaming/StreamingContext.scala
diff --git a/streaming/src/main/scala/org/apache/spark/streaming/StreamingContext.scala b/streaming/src/main/scala/org/apache/spark/streaming/StreamingContext.scala
index 7b77d44..5e58ed7 100644
--- a/streaming/src/main/scala/org/apache/spark/streaming/StreamingContext.scala
+++ b/streaming/src/main/scala/org/apache/spark/streaming/StreamingContext.scala
@@ -262,7 +262,7 @@ class StreamingContext private[streaming] (
    *
    * Note: Return statements are NOT allowed in the given body.
    */
-  private def withNamedScope[U](name: String)(body: => U): U = {
+  private[streaming] def withNamedScope[U](name: String)(body: => U): U = {
     RDDOperationScope.withScope(sc, name, allowNesting = false, ignoreParent = false)(body)
   }
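`withNamedScope` is a loan-pattern wrapper: it takes the stream-creation code as a by-name `body` and runs it inside a named RDD operation scope, and the patch merely widens its visibility so KinesisUtils can call it. A stripped-down sketch of the pattern (the println bracketing stands in for the real scope bookkeeping):

```
object NamedScopeSketch {
  // `body: => U` is evaluated lazily, inside whatever setup/teardown
  // the wrapper establishes -- here just entry/exit messages.
  def withNamedScope[U](name: String)(body: => U): U = {
    println(s"entering scope: $name")
    try body
    finally println(s"leaving scope: $name")
  }

  def main(args: Array[String]): Unit = {
    val n = withNamedScope("kinesis stream") { 1 + 1 }
    println(n)
  }
}
```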
spark git commit: [SPARK-7777][Streaming] Handle the case when there is no block in a batch
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 c8eb76ba6 -> ea9db50bc

[SPARK-7777] [STREAMING] Handle the case when there is no block in a batch

In the old implementation, if a batch has no block, `areWALRecordHandlesPresent` will be `true` and it will return `WriteAheadLogBackedBlockRDD`. This PR handles this case by returning `WriteAheadLogBackedBlockRDD` or `BlockRDD` according to the configuration.

Author: zsxwing <zsxw...@gmail.com>

Closes #6372 from zsxwing/SPARK-7777 and squashes the following commits:

788f895 [zsxwing] Handle the case when there is no block in a batch

(cherry picked from commit ad0badba1450295982738934da2cc121cde18213)
Signed-off-by: Tathagata Das <tathagata.das1...@gmail.com>

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/ea9db50b
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/ea9db50b
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/ea9db50b

Branch: refs/heads/branch-1.4
Commit: ea9db50bc3ade82fb9966df34961a17b255b86d7
Parents: c8eb76b
Author: zsxwing <zsxw...@gmail.com>
Authored: Sat May 23 02:11:17 2015 -0700
Committer: Tathagata Das <tathagata.das1...@gmail.com>
Committed: Sat May 23 02:11:28 2015 -0700

----------------------------------------------------------
 .../dstream/ReceiverInputDStream.scala        | 47 ++++++++++---------
 .../spark/streaming/InputStreamsSuite.scala   | 31 ++++++++++++
 2 files changed, 60 insertions(+), 18 deletions(-)

http://git-wip-us.apache.org/repos/asf/spark/blob/ea9db50b/streaming/src/main/scala/org/apache/spark/streaming/dstream/ReceiverInputDStream.scala
diff --git a/streaming/src/main/scala/org/apache/spark/streaming/dstream/ReceiverInputDStream.scala b/streaming/src/main/scala/org/apache/spark/streaming/dstream/ReceiverInputDStream.scala
index 5cfe43a..e4ff05e 100644
--- a/streaming/src/main/scala/org/apache/spark/streaming/dstream/ReceiverInputDStream.scala
+++ b/streaming/src/main/scala/org/apache/spark/streaming/dstream/ReceiverInputDStream.scala
@@ -73,27 +73,38 @@ abstract class ReceiverInputDStream[T: ClassTag](@transient ssc_ : StreamingContext)
     val inputInfo = InputInfo(id, blockInfos.map(_.numRecords).sum)
     ssc.scheduler.inputInfoTracker.reportInfo(validTime, inputInfo)

-    // Are WAL record handles present with all the blocks
-    val areWALRecordHandlesPresent = blockInfos.forall { _.walRecordHandleOption.nonEmpty }
+    if (blockInfos.nonEmpty) {
+      // Are WAL record handles present with all the blocks
+      val areWALRecordHandlesPresent = blockInfos.forall { _.walRecordHandleOption.nonEmpty }

-    if (areWALRecordHandlesPresent) {
-      // If all the blocks have WAL record handle, then create a WALBackedBlockRDD
-      val isBlockIdValid = blockInfos.map { _.isBlockIdValid() }.toArray
-      val walRecordHandles = blockInfos.map { _.walRecordHandleOption.get }.toArray
-      new WriteAheadLogBackedBlockRDD[T](
-        ssc.sparkContext, blockIds, walRecordHandles, isBlockIdValid)
-    } else {
-      // Else, create a BlockRDD. However, if there are some blocks with WAL info but not others
-      // then that is unexpected and log a warning accordingly.
-      if (blockInfos.find(_.walRecordHandleOption.nonEmpty).nonEmpty) {
-        if (WriteAheadLogUtils.enableReceiverLog(ssc.conf)) {
-          logError("Some blocks do not have Write Ahead Log information; " +
-            "this is unexpected and data may not be recoverable after driver failures")
-        } else {
-          logWarning("Some blocks have Write Ahead Log information; this is unexpected")
+      if (areWALRecordHandlesPresent) {
+        // If all the blocks have WAL record handle, then create a WALBackedBlockRDD
+        val isBlockIdValid = blockInfos.map { _.isBlockIdValid() }.toArray
+        val walRecordHandles = blockInfos.map { _.walRecordHandleOption.get }.toArray
+        new WriteAheadLogBackedBlockRDD[T](
+          ssc.sparkContext, blockIds, walRecordHandles, isBlockIdValid)
+      } else {
+        // Else, create a BlockRDD. However, if there are some blocks with WAL info but not
+        // others then that is unexpected and log a warning accordingly.
+        if (blockInfos.find(_.walRecordHandleOption.nonEmpty).nonEmpty) {
+          if (WriteAheadLogUtils.enableReceiverLog(ssc.conf)) {
+            logError("Some blocks do not have Write Ahead Log information; " +
+              "this is unexpected and data may not be recoverable after driver failures")
+          } else {
+            logWarning("Some blocks have Write Ahead Log information; this is unexpected")
+          }
         }
spark git commit: [SPARK-7777][Streaming] Handle the case when there is no block in a batch
Repository: spark
Updated Branches: refs/heads/master a40bca011 -> ad0badba1

[SPARK-7777][Streaming] Handle the case when there is no block in a batch

In the old implementation, if a batch has no block, `areWALRecordHandlesPresent` will be `true` and it will return a `WriteAheadLogBackedBlockRDD`. This PR handles this case by returning `WriteAheadLogBackedBlockRDD` or `BlockRDD` according to the configuration.

Author: zsxwing zsxw...@gmail.com

Closes #6372 from zsxwing/SPARK-7777 and squashes the following commits:

788f895 [zsxwing] Handle the case when there is no block in a batch

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/ad0badba
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/ad0badba
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/ad0badba

Branch: refs/heads/master
Commit: ad0badba1450295982738934da2cc121cde18213
Parents: a40bca0
Author: zsxwing zsxw...@gmail.com
Authored: Sat May 23 02:11:17 2015 -0700
Committer: Tathagata Das tathagata.das1...@gmail.com
Committed: Sat May 23 02:11:17 2015 -0700

--
 .../dstream/ReceiverInputDStream.scala      | 47 ++++++++++++----------
 .../spark/streaming/InputStreamsSuite.scala | 31 ++++++++++++++-
 2 files changed, 60 insertions(+), 18 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/ad0badba/streaming/src/main/scala/org/apache/spark/streaming/dstream/ReceiverInputDStream.scala
--
diff --git a/streaming/src/main/scala/org/apache/spark/streaming/dstream/ReceiverInputDStream.scala b/streaming/src/main/scala/org/apache/spark/streaming/dstream/ReceiverInputDStream.scala
index 5cfe43a..e4ff05e 100644
--- a/streaming/src/main/scala/org/apache/spark/streaming/dstream/ReceiverInputDStream.scala
+++ b/streaming/src/main/scala/org/apache/spark/streaming/dstream/ReceiverInputDStream.scala
@@ -73,27 +73,38 @@ abstract class ReceiverInputDStream[T: ClassTag](@transient ssc_ : StreamingCont
         val inputInfo = InputInfo(id, blockInfos.map(_.numRecords).sum)
         ssc.scheduler.inputInfoTracker.reportInfo(validTime, inputInfo)
 
-        // Are WAL record handles present with all the blocks
-        val areWALRecordHandlesPresent = blockInfos.forall { _.walRecordHandleOption.nonEmpty }
+        if (blockInfos.nonEmpty) {
+          // Are WAL record handles present with all the blocks
+          val areWALRecordHandlesPresent = blockInfos.forall { _.walRecordHandleOption.nonEmpty }
 
-        if (areWALRecordHandlesPresent) {
-          // If all the blocks have WAL record handle, then create a WALBackedBlockRDD
-          val isBlockIdValid = blockInfos.map { _.isBlockIdValid() }.toArray
-          val walRecordHandles = blockInfos.map { _.walRecordHandleOption.get }.toArray
-          new WriteAheadLogBackedBlockRDD[T](
-            ssc.sparkContext, blockIds, walRecordHandles, isBlockIdValid)
-        } else {
-          // Else, create a BlockRDD. However, if there are some blocks with WAL info but not others
-          // then that is unexpected and log a warning accordingly.
-          if (blockInfos.find(_.walRecordHandleOption.nonEmpty).nonEmpty) {
-            if (WriteAheadLogUtils.enableReceiverLog(ssc.conf)) {
-              logError("Some blocks do not have Write Ahead Log information; " +
-                "this is unexpected and data may not be recoverable after driver failures")
-            } else {
-              logWarning("Some blocks have Write Ahead Log information; this is unexpected")
+          if (areWALRecordHandlesPresent) {
+            // If all the blocks have WAL record handle, then create a WALBackedBlockRDD
+            val isBlockIdValid = blockInfos.map { _.isBlockIdValid() }.toArray
+            val walRecordHandles = blockInfos.map { _.walRecordHandleOption.get }.toArray
+            new WriteAheadLogBackedBlockRDD[T](
+              ssc.sparkContext, blockIds, walRecordHandles, isBlockIdValid)
+          } else {
+            // Else, create a BlockRDD. However, if there are some blocks with WAL info but not
+            // others then that is unexpected and log a warning accordingly.
+            if (blockInfos.find(_.walRecordHandleOption.nonEmpty).nonEmpty) {
+              if (WriteAheadLogUtils.enableReceiverLog(ssc.conf)) {
+                logError("Some blocks do not have Write Ahead Log information; " +
+                  "this is unexpected and data may not be recoverable after driver failures")
+              } else {
+                logWarning("Some blocks have Write Ahead Log information; this is unexpected")
+              }
+            }
+            new BlockRDD[T](ssc.sc, blockIds)
+          }
+        } else {
+          // If no block is ready now, creating WriteAheadLogBackedBlockRDD or BlockRDD
+          // according to the configuration
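For context, here is a minimal PySpark sketch of a driver that exercises this code path, assuming the standard receiver write-ahead-log flag and a socket receiver; the app name, host, port, and checkpoint path are illustrative:

    from pyspark import SparkConf, SparkContext
    from pyspark.streaming import StreamingContext

    # With the receiver WAL enabled, a batch that received no blocks now yields
    # an empty WriteAheadLogBackedBlockRDD; with it disabled, an empty BlockRDD.
    conf = (SparkConf()
            .setAppName("wal-empty-batch-demo")
            .set("spark.streaming.receiver.writeAheadLog.enable", "true"))
    sc = SparkContext(conf=conf)
    ssc = StreamingContext(sc, 1)                    # 1-second batches
    ssc.checkpoint("/tmp/wal-demo-checkpoint")       # the WAL requires a checkpoint directory
    lines = ssc.socketTextStream("localhost", 9999)  # a receiver-based input stream
    lines.count().pprint()                           # batches with no data simply print 0
    ssc.start()
    ssc.awaitTermination()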
spark git commit: [SPARK-7840] add insertInto() to Writer
Repository: spark
Updated Branches: refs/heads/master efe3bfdf4 -> be47af1bd

[SPARK-7840] add insertInto() to Writer

Add tests later.

Author: Davies Liu dav...@databricks.com

Closes #6375 from davies/insertInto and squashes the following commits:

826423e [Davies Liu] add insertInto() to Writer

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/be47af1b
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/be47af1b
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/be47af1b

Branch: refs/heads/master
Commit: be47af1bdba469f84775c2b5936f8cb956c7c02b
Parents: efe3bfd
Author: Davies Liu dav...@databricks.com
Authored: Sat May 23 09:07:14 2015 -0700
Committer: Davies Liu dav...@databricks.com
Committed: Sat May 23 09:07:14 2015 -0700

--
 python/pyspark/sql/dataframe.py  |  2 +-
 python/pyspark/sql/readwriter.py | 22 +++++++++++++---------
 2 files changed, 16 insertions(+), 8 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/be47af1b/python/pyspark/sql/dataframe.py
--
diff --git a/python/pyspark/sql/dataframe.py b/python/pyspark/sql/dataframe.py
index 55cad82..9364875 100644
--- a/python/pyspark/sql/dataframe.py
+++ b/python/pyspark/sql/dataframe.py
@@ -163,7 +163,7 @@ class DataFrame(object):
 
         Optionally overwriting any existing data.
         """
-        self._jdf.insertInto(tableName, overwrite)
+        self.write.insertInto(tableName, overwrite)
 
     @since(1.3)
     def saveAsTable(self, tableName, source=None, mode="error", **options):

http://git-wip-us.apache.org/repos/asf/spark/blob/be47af1b/python/pyspark/sql/readwriter.py
--
diff --git a/python/pyspark/sql/readwriter.py b/python/pyspark/sql/readwriter.py
index 02b3aab..b6fd413 100644
--- a/python/pyspark/sql/readwriter.py
+++ b/python/pyspark/sql/readwriter.py
@@ -226,17 +226,25 @@ class DataFrameWriter(object):
         else:
             jwrite.save(path)
 
+    def insertInto(self, tableName, overwrite=False):
+        """Inserts the content of the :class:`DataFrame` to the specified table.
+
+        It requires that the schema of the :class:`DataFrame` is the same as the
+        schema of the table.
+
+        Optionally overwriting any existing data.
+        """
+        self._jwrite.mode("overwrite" if overwrite else "append").insertInto(tableName)
+
     @since(1.4)
     def saveAsTable(self, name, format=None, mode="error", **options):
-        """Saves the contents of this :class:`DataFrame` to a data source as a table.
-
-        The data source is specified by the ``source`` and a set of ``options``.
-        If ``source`` is not specified, the default data source configured by
-        ``spark.sql.sources.default`` will be used.
+        """Saves the content of the :class:`DataFrame` as the specified table.
 
-        Additionally, mode is used to specify the behavior of the saveAsTable operation when
-        table already exists in the data source. There are four modes:
+        In the case the table already exists, behavior of this function depends on the
+        save mode, specified by the `mode` function (default to throwing an exception).
+        When `mode` is `Overwrite`, the schema of the [[DataFrame]] does not need to be
+        the same as that of the existing table.
 
         * `append`: Append contents of this :class:`DataFrame` to existing data.
         * `overwrite`: Overwrite existing data.
spark git commit: [SPARK-7840] add insertInto() to Writer
Repository: spark
Updated Branches: refs/heads/branch-1.4 d1515381c -> c6e574213

[SPARK-7840] add insertInto() to Writer

Add tests later.

Author: Davies Liu dav...@databricks.com

Closes #6375 from davies/insertInto and squashes the following commits:

826423e [Davies Liu] add insertInto() to Writer

(cherry picked from commit be47af1bdba469f84775c2b5936f8cb956c7c02b)
Signed-off-by: Davies Liu dav...@databricks.com

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/c6e57421
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/c6e57421
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/c6e57421

Branch: refs/heads/branch-1.4
Commit: c6e574213d47357aefc82347b73d925de47140b5
Parents: d151538
Author: Davies Liu dav...@databricks.com
Authored: Sat May 23 09:07:14 2015 -0700
Committer: Davies Liu dav...@databricks.com
Committed: Sat May 23 09:07:45 2015 -0700

--
 python/pyspark/sql/dataframe.py  |  2 +-
 python/pyspark/sql/readwriter.py | 22 +++++++++++++---------
 2 files changed, 16 insertions(+), 8 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/c6e57421/python/pyspark/sql/dataframe.py
--
diff --git a/python/pyspark/sql/dataframe.py b/python/pyspark/sql/dataframe.py
index 55cad82..9364875 100644
--- a/python/pyspark/sql/dataframe.py
+++ b/python/pyspark/sql/dataframe.py
@@ -163,7 +163,7 @@ class DataFrame(object):
 
         Optionally overwriting any existing data.
         """
-        self._jdf.insertInto(tableName, overwrite)
+        self.write.insertInto(tableName, overwrite)
 
     @since(1.3)
     def saveAsTable(self, tableName, source=None, mode="error", **options):

http://git-wip-us.apache.org/repos/asf/spark/blob/c6e57421/python/pyspark/sql/readwriter.py
--
diff --git a/python/pyspark/sql/readwriter.py b/python/pyspark/sql/readwriter.py
index 02b3aab..b6fd413 100644
--- a/python/pyspark/sql/readwriter.py
+++ b/python/pyspark/sql/readwriter.py
@@ -226,17 +226,25 @@ class DataFrameWriter(object):
         else:
             jwrite.save(path)
 
+    def insertInto(self, tableName, overwrite=False):
+        """Inserts the content of the :class:`DataFrame` to the specified table.
+
+        It requires that the schema of the :class:`DataFrame` is the same as the
+        schema of the table.
+
+        Optionally overwriting any existing data.
+        """
+        self._jwrite.mode("overwrite" if overwrite else "append").insertInto(tableName)
+
     @since(1.4)
     def saveAsTable(self, name, format=None, mode="error", **options):
-        """Saves the contents of this :class:`DataFrame` to a data source as a table.
-
-        The data source is specified by the ``source`` and a set of ``options``.
-        If ``source`` is not specified, the default data source configured by
-        ``spark.sql.sources.default`` will be used.
+        """Saves the content of the :class:`DataFrame` as the specified table.
 
-        Additionally, mode is used to specify the behavior of the saveAsTable operation when
-        table already exists in the data source. There are four modes:
+        In the case the table already exists, behavior of this function depends on the
+        save mode, specified by the `mode` function (default to throwing an exception).
+        When `mode` is `Overwrite`, the schema of the [[DataFrame]] does not need to be
+        the same as that of the existing table.
 
         * `append`: Append contents of this :class:`DataFrame` to existing data.
         * `overwrite`: Overwrite existing data.
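A short usage sketch of the new writer method; the table name is illustrative, and per the docstring the DataFrame's schema must match the target table's:

    # Append the DataFrame to an existing table (the default), or overwrite it.
    df.write.insertInto("events")
    df.write.insertInto("events", overwrite=True)

    # The old DataFrame method keeps working, but now simply delegates to the writer:
    df.insertInto("events")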
spark git commit: Fix install jira-python
Repository: spark
Updated Branches: refs/heads/master be47af1bd -> a4df0f2d8

Fix install jira-python

The jira-python package should be installed with `sudo pip install jira`.

cc pwendell

Author: Davies Liu dav...@databricks.com

Closes #6367 from davies/fix_jira_python2 and squashes the following commits:

fbb3c8e [Davies Liu] Fix install jira-python

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/a4df0f2d
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/a4df0f2d
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/a4df0f2d

Branch: refs/heads/master
Commit: a4df0f2d84ff24318b139db534521141d9d4d593
Parents: be47af1
Author: Davies Liu dav...@databricks.com
Authored: Sat May 23 09:14:07 2015 -0700
Committer: Davies Liu dav...@databricks.com
Committed: Sat May 23 09:14:07 2015 -0700

--
 dev/create-release/releaseutils.py | 2 +-
 dev/github_jira_sync.py            | 2 +-
 dev/merge_spark_pr.py              | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/a4df0f2d/dev/create-release/releaseutils.py
--
diff --git a/dev/create-release/releaseutils.py b/dev/create-release/releaseutils.py
index 26221b2..51ab25a 100755
--- a/dev/create-release/releaseutils.py
+++ b/dev/create-release/releaseutils.py
@@ -27,7 +27,7 @@ try:
     from jira.exceptions import JIRAError
 except ImportError:
     print "This tool requires the jira-python library"
-    print "Install using 'sudo pip install jira-python'"
+    print "Install using 'sudo pip install jira'"
     sys.exit(-1)
 
 try:

http://git-wip-us.apache.org/repos/asf/spark/blob/a4df0f2d/dev/github_jira_sync.py
--
diff --git a/dev/github_jira_sync.py b/dev/github_jira_sync.py
index ff1e396..287f0ca 100755
--- a/dev/github_jira_sync.py
+++ b/dev/github_jira_sync.py
@@ -28,7 +28,7 @@ try:
     import jira.client
 except ImportError:
     print "This tool requires the jira-python library"
-    print "Install using 'sudo pip install jira-python'"
+    print "Install using 'sudo pip install jira'"
     sys.exit(-1)
 
 # User facing configs

http://git-wip-us.apache.org/repos/asf/spark/blob/a4df0f2d/dev/merge_spark_pr.py
--
diff --git a/dev/merge_spark_pr.py b/dev/merge_spark_pr.py
index 1c126f5..787c5cc 100755
--- a/dev/merge_spark_pr.py
+++ b/dev/merge_spark_pr.py
@@ -426,7 +426,7 @@ def main():
             print "JIRA_USERNAME and JIRA_PASSWORD not set"
             print "Exiting without trying to close the associated JIRA."
     else:
-        print "Could not find jira-python library. Run 'sudo pip install jira-python' to install."
+        print "Could not find jira-python library. Run 'sudo pip install jira' to install."
         print "Exiting without trying to close the associated JIRA."
 
 if __name__ == "__main__":
spark git commit: [SPARK-7322, SPARK-7836, SPARK-7822][SQL] DataFrame window function related updates
Repository: spark
Updated Branches: refs/heads/master ad0badba1 -> efe3bfdf4

[SPARK-7322, SPARK-7836, SPARK-7822][SQL] DataFrame window function related updates

1. ntile should take an integer as parameter.
2. Added Python API (based on #6364)
3. Update documentation of various DataFrame Python functions.

Author: Davies Liu dav...@databricks.com
Author: Reynold Xin r...@databricks.com

Closes #6374 from rxin/window-final and squashes the following commits:

69004c7 [Reynold Xin] Style fix.
288cea9 [Reynold Xin] Update documentation.
7cb8985 [Reynold Xin] Merge pull request #6364 from davies/window
66092b4 [Davies Liu] update docs
ed73cb4 [Reynold Xin] [SPARK-7322][SQL] Improve DataFrame window function documentation.
ef55132 [Davies Liu] Merge branch 'master' of github.com:apache/spark into window4
8936ade [Davies Liu] fix maxint in python 3
2649358 [Davies Liu] update docs
778e2c0 [Davies Liu] SPARK-7836 and SPARK-7822: Python API of window functions

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/efe3bfdf
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/efe3bfdf
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/efe3bfdf

Branch: refs/heads/master
Commit: efe3bfdf496aa6206ace2697e31dd4c0c3c824fb
Parents: ad0badb
Author: Davies Liu dav...@databricks.com
Authored: Sat May 23 08:30:05 2015 -0700
Committer: Yin Huai yh...@databricks.com
Committed: Sat May 23 08:30:05 2015 -0700

--
 python/pyspark/sql/__init__.py                  |  25 +--
 python/pyspark/sql/column.py                    |  54 +++--
 python/pyspark/sql/context.py                   |   2 -
 python/pyspark/sql/dataframe.py                 |   2 +
 python/pyspark/sql/functions.py                 | 147 +++++++------
 python/pyspark/sql/group.py                     |   2 +
 python/pyspark/sql/tests.py                     |  31 ++-
 python/pyspark/sql/window.py                    | 158 +++++++++++++
 .../scala/org/apache/spark/sql/functions.scala  | 197 ++++++++++++---
 .../sql/hive/HiveDataFrameWindowSuite.scala     |  20 +-
 10 files changed, 464 insertions(+), 174 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/efe3bfdf/python/pyspark/sql/__init__.py
--
diff --git a/python/pyspark/sql/__init__.py b/python/pyspark/sql/__init__.py
index 66b0bff..8fee92a 100644
--- a/python/pyspark/sql/__init__.py
+++ b/python/pyspark/sql/__init__.py
@@ -18,26 +18,28 @@
 """
 Important classes of Spark SQL and DataFrames:
 
-    - L{SQLContext}
+    - :class:`pyspark.sql.SQLContext`
       Main entry point for :class:`DataFrame` and SQL functionality.
-    - L{DataFrame}
+    - :class:`pyspark.sql.DataFrame`
       A distributed collection of data grouped into named columns.
-    - L{Column}
+    - :class:`pyspark.sql.Column`
       A column expression in a :class:`DataFrame`.
-    - L{Row}
+    - :class:`pyspark.sql.Row`
      A row of data in a :class:`DataFrame`.
-    - L{HiveContext}
+    - :class:`pyspark.sql.HiveContext`
      Main entry point for accessing data stored in Apache Hive.
-    - L{GroupedData}
+    - :class:`pyspark.sql.GroupedData`
      Aggregation methods, returned by :func:`DataFrame.groupBy`.
-    - L{DataFrameNaFunctions}
+    - :class:`pyspark.sql.DataFrameNaFunctions`
      Methods for handling missing data (null values).
-    - L{DataFrameStatFunctions}
+    - :class:`pyspark.sql.DataFrameStatFunctions`
      Methods for statistics functionality.
-    - L{functions}
+    - :class:`pyspark.sql.functions`
      List of built-in functions available for :class:`DataFrame`.
-    - L{types}
+    - :class:`pyspark.sql.types`
      List of data types available.
+    - :class:`pyspark.sql.Window`
+      For working with window functions.
 """
 
 from __future__ import absolute_import

@@ -66,8 +68,9 @@ from pyspark.sql.column import Column
 from pyspark.sql.dataframe import DataFrame, SchemaRDD, DataFrameNaFunctions, DataFrameStatFunctions
 from pyspark.sql.group import GroupedData
 from pyspark.sql.readwriter import DataFrameReader, DataFrameWriter
+from pyspark.sql.window import Window, WindowSpec
 
 __all__ = [
     'SQLContext', 'HiveContext', 'DataFrame', 'GroupedData', 'Column', 'Row',
-    'DataFrameNaFunctions', 'DataFrameStatFunctions'
+    'DataFrameNaFunctions', 'DataFrameStatFunctions', 'Window', 'WindowSpec',
 ]

http://git-wip-us.apache.org/repos/asf/spark/blob/efe3bfdf/python/pyspark/sql/column.py
--
diff --git a/python/pyspark/sql/column.py b/python/pyspark/sql/column.py
index baf1ecb..8dc5039 100644
--- a/python/pyspark/sql/column.py
+++ b/python/pyspark/sql/column.py
@@ -116,6 +116,8 @@ class Column(object):
     df.colName + 1
spark git commit: [SPARK-7322, SPARK-7836, SPARK-7822][SQL] DataFrame window function related updates
Repository: spark
Updated Branches: refs/heads/branch-1.4 ea9db50bc -> d1515381c

[SPARK-7322, SPARK-7836, SPARK-7822][SQL] DataFrame window function related updates

1. ntile should take an integer as parameter.
2. Added Python API (based on #6364)
3. Update documentation of various DataFrame Python functions.

Author: Davies Liu dav...@databricks.com
Author: Reynold Xin r...@databricks.com

Closes #6374 from rxin/window-final and squashes the following commits:

69004c7 [Reynold Xin] Style fix.
288cea9 [Reynold Xin] Update documentation.
7cb8985 [Reynold Xin] Merge pull request #6364 from davies/window
66092b4 [Davies Liu] update docs
ed73cb4 [Reynold Xin] [SPARK-7322][SQL] Improve DataFrame window function documentation.
ef55132 [Davies Liu] Merge branch 'master' of github.com:apache/spark into window4
8936ade [Davies Liu] fix maxint in python 3
2649358 [Davies Liu] update docs
778e2c0 [Davies Liu] SPARK-7836 and SPARK-7822: Python API of window functions

(cherry picked from commit efe3bfdf496aa6206ace2697e31dd4c0c3c824fb)
Signed-off-by: Yin Huai yh...@databricks.com

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d1515381
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/d1515381
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/d1515381

Branch: refs/heads/branch-1.4
Commit: d1515381cb957f40daf026144ce3ac014660df23
Parents: ea9db50
Author: Davies Liu dav...@databricks.com
Authored: Sat May 23 08:30:05 2015 -0700
Committer: Yin Huai yh...@databricks.com
Committed: Sat May 23 08:30:18 2015 -0700

--
 python/pyspark/sql/__init__.py                  |  25 +--
 python/pyspark/sql/column.py                    |  54 +++--
 python/pyspark/sql/context.py                   |   2 -
 python/pyspark/sql/dataframe.py                 |   2 +
 python/pyspark/sql/functions.py                 | 147 +++++++------
 python/pyspark/sql/group.py                     |   2 +
 python/pyspark/sql/tests.py                     |  31 ++-
 python/pyspark/sql/window.py                    | 158 +++++++++++++
 .../scala/org/apache/spark/sql/functions.scala  | 197 ++++++++++++---
 .../sql/hive/HiveDataFrameWindowSuite.scala     |  20 +-
 10 files changed, 464 insertions(+), 174 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/d1515381/python/pyspark/sql/__init__.py
--
diff --git a/python/pyspark/sql/__init__.py b/python/pyspark/sql/__init__.py
index 66b0bff..8fee92a 100644
--- a/python/pyspark/sql/__init__.py
+++ b/python/pyspark/sql/__init__.py
@@ -18,26 +18,28 @@
 """
 Important classes of Spark SQL and DataFrames:
 
-    - L{SQLContext}
+    - :class:`pyspark.sql.SQLContext`
       Main entry point for :class:`DataFrame` and SQL functionality.
-    - L{DataFrame}
+    - :class:`pyspark.sql.DataFrame`
       A distributed collection of data grouped into named columns.
-    - L{Column}
+    - :class:`pyspark.sql.Column`
       A column expression in a :class:`DataFrame`.
-    - L{Row}
+    - :class:`pyspark.sql.Row`
      A row of data in a :class:`DataFrame`.
-    - L{HiveContext}
+    - :class:`pyspark.sql.HiveContext`
      Main entry point for accessing data stored in Apache Hive.
-    - L{GroupedData}
+    - :class:`pyspark.sql.GroupedData`
      Aggregation methods, returned by :func:`DataFrame.groupBy`.
-    - L{DataFrameNaFunctions}
+    - :class:`pyspark.sql.DataFrameNaFunctions`
      Methods for handling missing data (null values).
-    - L{DataFrameStatFunctions}
+    - :class:`pyspark.sql.DataFrameStatFunctions`
      Methods for statistics functionality.
-    - L{functions}
+    - :class:`pyspark.sql.functions`
      List of built-in functions available for :class:`DataFrame`.
-    - L{types}
+    - :class:`pyspark.sql.types`
      List of data types available.
+    - :class:`pyspark.sql.Window`
+      For working with window functions.
 """
 
 from __future__ import absolute_import

@@ -66,8 +68,9 @@ from pyspark.sql.column import Column
 from pyspark.sql.dataframe import DataFrame, SchemaRDD, DataFrameNaFunctions, DataFrameStatFunctions
 from pyspark.sql.group import GroupedData
 from pyspark.sql.readwriter import DataFrameReader, DataFrameWriter
+from pyspark.sql.window import Window, WindowSpec
 
 __all__ = [
     'SQLContext', 'HiveContext', 'DataFrame', 'GroupedData', 'Column', 'Row',
-    'DataFrameNaFunctions', 'DataFrameStatFunctions'
+    'DataFrameNaFunctions', 'DataFrameStatFunctions', 'Window', 'WindowSpec',
 ]

http://git-wip-us.apache.org/repos/asf/spark/blob/d1515381/python/pyspark/sql/column.py
--
diff --git a/python/pyspark/sql/column.py b/python/pyspark/sql/column.py
index baf1ecb..8dc5039 100644
--- a/python/pyspark/sql/column.py
+++ b/python/pyspark/sql/column.py
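A brief usage sketch of the new Python window-function API added here; the column names are illustrative, and note that ntile() now takes a plain integer:

    from pyspark.sql import functions as F
    from pyspark.sql.window import Window

    # Rank rows within each category by descending revenue.
    w = Window.partitionBy("category").orderBy(df["revenue"].desc())
    ranked = df.select(
        "category", "revenue",
        F.rowNumber().over(w).alias("rank"),    # 1, 2, 3, ... per partition
        F.ntile(4).over(w).alias("quartile"))   # buckets each partition into 4 tiles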
spark git commit: [SPARK-7654] [SQL] Move insertInto into reader/writer interface.
Repository: spark
Updated Branches: refs/heads/master a4df0f2d8 -> 2b7e63585

[SPARK-7654] [SQL] Move insertInto into reader/writer interface.

This one continues the work of https://github.com/apache/spark/pull/6216.

Author: Yin Huai yh...@databricks.com
Author: Reynold Xin r...@databricks.com

Closes #6366 from yhuai/insert and squashes the following commits:

3d717fb [Yin Huai] Use insertInto to handle the case when table exists and Append is used for saveAsTable.
56d2540 [Yin Huai] Add PreWriteCheck to HiveContext's analyzer.
c636e35 [Yin Huai] Remove unnecessary empty lines.
cf83837 [Yin Huai] Move insertInto to write. Also, remove the partition columns from InsertIntoHadoopFsRelation.
0841a54 [Reynold Xin] Removed experimental tag for deprecated methods.
33ed8ef [Reynold Xin] [SPARK-7654][SQL] Move insertInto into reader/writer interface.

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/2b7e6358
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/2b7e6358
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/2b7e6358

Branch: refs/heads/master
Commit: 2b7e63585d61be2dab78b70af3867cda3983d5b1
Parents: a4df0f2
Author: Yin Huai yh...@databricks.com
Authored: Sat May 23 09:48:20 2015 -0700
Committer: Yin Huai yh...@databricks.com
Committed: Sat May 23 09:48:20 2015 -0700

--
 .../scala/org/apache/spark/sql/DataFrame.scala   | 52 +++++-------
 .../org/apache/spark/sql/DataFrameReader.scala   | 18 +++-
 .../org/apache/spark/sql/DataFrameWriter.scala   | 66 ++++++++-----
 .../spark/sql/parquet/ParquetTableSupport.scala  |  2 +-
 .../spark/sql/sources/DataSourceStrategy.scala   |  5 +-
 .../org/apache/spark/sql/sources/commands.scala  |  2 +-
 .../org/apache/spark/sql/sources/ddl.scala       |  1 -
 .../org/apache/spark/sql/sources/rules.scala     | 19 +++-
 .../org/apache/spark/sql/hive/HiveContext.scala  |  4 ++
 .../sql/hive/InsertIntoHiveTableSuite.scala      |  6 +-
 .../sql/hive/MetastoreDataSourcesSuite.scala     |  8 +--
 .../sql/hive/execution/SQLQuerySuite.scala       |  8 +--
 .../apache/spark/sql/hive/parquetSuites.scala    |  4 +-
 .../sql/sources/hadoopFsRelationSuites.scala     | 10 ---
 14 files changed, 116 insertions(+), 89 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/2b7e6358/sql/core/src/main/scala/org/apache/spark/sql/DataFrame.scala
--
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/DataFrame.scala b/sql/core/src/main/scala/org/apache/spark/sql/DataFrame.scala
index 3ec1c4a..f968577 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/DataFrame.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/DataFrame.scala
@@ -1395,28 +1395,6 @@ class DataFrame private[sql](
   def write: DataFrameWriter = new DataFrameWriter(this)
 
   /**
-   * :: Experimental ::
-   * Adds the rows from this RDD to the specified table, optionally overwriting the existing data.
-   * @group output
-   * @since 1.3.0
-   */
-  @Experimental
-  def insertInto(tableName: String, overwrite: Boolean): Unit = {
-    sqlContext.executePlan(InsertIntoTable(UnresolvedRelation(Seq(tableName)),
-      Map.empty, logicalPlan, overwrite, ifNotExists = false)).toRdd
-  }
-
-  /**
-   * :: Experimental ::
-   * Adds the rows from this RDD to the specified table.
-   * Throws an exception if the table already exists.
-   * @group output
-   * @since 1.3.0
-   */
-  @Experimental
-  def insertInto(tableName: String): Unit = insertInto(tableName, overwrite = false)
-
-  /**
    * Returns the content of the [[DataFrame]] as a RDD of JSON strings.
    * @group rdd
    * @since 1.3.0
@@ -1551,13 +1529,7 @@ class DataFrame private[sql](
    */
   @deprecated("Use write.mode(mode).saveAsTable(tableName)", "1.4.0")
   def saveAsTable(tableName: String, mode: SaveMode): Unit = {
-    if (sqlContext.catalog.tableExists(Seq(tableName)) && mode == SaveMode.Append) {
-      // If table already exists and the save mode is Append,
-      // we will just call insertInto to append the contents of this DataFrame.
-      insertInto(tableName, overwrite = false)
-    } else {
-      write.mode(mode).saveAsTable(tableName)
-    }
+    write.mode(mode).saveAsTable(tableName)
   }
 
   /**
@@ -1713,9 +1685,29 @@ class DataFrame private[sql](
     write.format(source).mode(mode).options(options).save()
   }
 
+
+  /**
+   * Adds the rows from this RDD to the specified table, optionally overwriting the existing data.
+   * @group output
+   */
+  @deprecated("Use write.mode(SaveMode.Append|SaveMode.Overwrite).saveAsTable(tableName)", "1.4.0")
+  def insertInto(tableName: String, overwrite: Boolean): Unit = {
+    write.mode(if (overwrite) SaveMode.Overwrite else SaveMode.Append).insertInto(tableName)
+  }
+
+  /**
+   * Adds the rows from this RDD to the specified table.
spark git commit: [SPARK-7654] [SQL] Move insertInto into reader/writer interface.
Repository: spark
Updated Branches: refs/heads/branch-1.4 c6e574213 -> 8d6d8a538

[SPARK-7654] [SQL] Move insertInto into reader/writer interface.

This one continues the work of https://github.com/apache/spark/pull/6216.

Author: Yin Huai yh...@databricks.com
Author: Reynold Xin r...@databricks.com

Closes #6366 from yhuai/insert and squashes the following commits:

3d717fb [Yin Huai] Use insertInto to handle the case when table exists and Append is used for saveAsTable.
56d2540 [Yin Huai] Add PreWriteCheck to HiveContext's analyzer.
c636e35 [Yin Huai] Remove unnecessary empty lines.
cf83837 [Yin Huai] Move insertInto to write. Also, remove the partition columns from InsertIntoHadoopFsRelation.
0841a54 [Reynold Xin] Removed experimental tag for deprecated methods.
33ed8ef [Reynold Xin] [SPARK-7654][SQL] Move insertInto into reader/writer interface.

(cherry picked from commit 2b7e63585d61be2dab78b70af3867cda3983d5b1)
Signed-off-by: Yin Huai yh...@databricks.com

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/8d6d8a53
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/8d6d8a53
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/8d6d8a53

Branch: refs/heads/branch-1.4
Commit: 8d6d8a538c46d9b41db1b62ebe7b7c038fdb057c
Parents: c6e5742
Author: Yin Huai yh...@databricks.com
Authored: Sat May 23 09:48:20 2015 -0700
Committer: Yin Huai yh...@databricks.com
Committed: Sat May 23 09:48:30 2015 -0700

--
 .../scala/org/apache/spark/sql/DataFrame.scala   | 52 +++++-------
 .../org/apache/spark/sql/DataFrameReader.scala   | 18 +++-
 .../org/apache/spark/sql/DataFrameWriter.scala   | 66 ++++++++-----
 .../spark/sql/parquet/ParquetTableSupport.scala  |  2 +-
 .../spark/sql/sources/DataSourceStrategy.scala   |  5 +-
 .../org/apache/spark/sql/sources/commands.scala  |  2 +-
 .../org/apache/spark/sql/sources/ddl.scala       |  1 -
 .../org/apache/spark/sql/sources/rules.scala     | 19 +++-
 .../org/apache/spark/sql/hive/HiveContext.scala  |  4 ++
 .../sql/hive/InsertIntoHiveTableSuite.scala      |  6 +-
 .../sql/hive/MetastoreDataSourcesSuite.scala     |  8 +--
 .../sql/hive/execution/SQLQuerySuite.scala       |  8 +--
 .../apache/spark/sql/hive/parquetSuites.scala    |  4 +-
 .../sql/sources/hadoopFsRelationSuites.scala     | 10 ---
 14 files changed, 116 insertions(+), 89 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/8d6d8a53/sql/core/src/main/scala/org/apache/spark/sql/DataFrame.scala
--
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/DataFrame.scala b/sql/core/src/main/scala/org/apache/spark/sql/DataFrame.scala
index 3ec1c4a..f968577 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/DataFrame.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/DataFrame.scala
@@ -1395,28 +1395,6 @@ class DataFrame private[sql](
   def write: DataFrameWriter = new DataFrameWriter(this)
 
   /**
-   * :: Experimental ::
-   * Adds the rows from this RDD to the specified table, optionally overwriting the existing data.
-   * @group output
-   * @since 1.3.0
-   */
-  @Experimental
-  def insertInto(tableName: String, overwrite: Boolean): Unit = {
-    sqlContext.executePlan(InsertIntoTable(UnresolvedRelation(Seq(tableName)),
-      Map.empty, logicalPlan, overwrite, ifNotExists = false)).toRdd
-  }
-
-  /**
-   * :: Experimental ::
-   * Adds the rows from this RDD to the specified table.
-   * Throws an exception if the table already exists.
-   * @group output
-   * @since 1.3.0
-   */
-  @Experimental
-  def insertInto(tableName: String): Unit = insertInto(tableName, overwrite = false)
-
-  /**
    * Returns the content of the [[DataFrame]] as a RDD of JSON strings.
    * @group rdd
    * @since 1.3.0
@@ -1551,13 +1529,7 @@ class DataFrame private[sql](
    */
   @deprecated("Use write.mode(mode).saveAsTable(tableName)", "1.4.0")
   def saveAsTable(tableName: String, mode: SaveMode): Unit = {
-    if (sqlContext.catalog.tableExists(Seq(tableName)) && mode == SaveMode.Append) {
-      // If table already exists and the save mode is Append,
-      // we will just call insertInto to append the contents of this DataFrame.
-      insertInto(tableName, overwrite = false)
-    } else {
-      write.mode(mode).saveAsTable(tableName)
-    }
+    write.mode(mode).saveAsTable(tableName)
   }
 
   /**
@@ -1713,9 +1685,29 @@ class DataFrame private[sql](
     write.format(source).mode(mode).options(options).save()
   }
 
+
+  /**
+   * Adds the rows from this RDD to the specified table, optionally overwriting the existing data.
+   * @group output
+   */
+  @deprecated("Use write.mode(SaveMode.Append|SaveMode.Overwrite).saveAsTable(tableName)", "1.4.0")
+  def insertInto(tableName: String, overwrite: Boolean): Unit = {
+    write.mode(if (overwrite) SaveMode.Overwrite else SaveMode.Append).insertInto(tableName)
+  }
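A sketch of the writer-first style this change converges on, shown with the Python API and an illustrative table name; per the squash log, saveAsTable with Append now routes through insertInto when the table already exists:

    # Deprecated style: df.insertInto("events"); df.saveAsTable("events", mode="append")
    # Writer style:
    df.write.mode("append").saveAsTable("events")     # appends when "events" already exists
    df.write.mode("overwrite").saveAsTable("events")  # replaces the data; schema may differ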
[1/2] spark git commit: Preparing Spark release v1.4.0-rc2
Repository: spark
Updated Branches: refs/heads/branch-1.4 641edc99f -> 947d700ec

Preparing Spark release v1.4.0-rc2

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/03fb26a3
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/03fb26a3
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/03fb26a3

Branch: refs/heads/branch-1.4
Commit: 03fb26a3e50e00739cc815ba4e2e82d71d003168
Parents: 641edc9
Author: Patrick Wendell pwend...@gmail.com
Authored: Sat May 23 20:13:00 2015 -0700
Committer: Patrick Wendell pwend...@gmail.com
Committed: Sat May 23 20:13:00 2015 -0700

--
 assembly/pom.xml                  | 2 +-
 bagel/pom.xml                     | 2 +-
 core/pom.xml                      | 2 +-
 examples/pom.xml                  | 2 +-
 external/flume-sink/pom.xml       | 2 +-
 external/flume/pom.xml            | 2 +-
 external/kafka-assembly/pom.xml   | 2 +-
 external/kafka/pom.xml            | 2 +-
 external/mqtt/pom.xml             | 2 +-
 external/twitter/pom.xml          | 2 +-
 external/zeromq/pom.xml           | 2 +-
 extras/java8-tests/pom.xml        | 2 +-
 extras/kinesis-asl/pom.xml        | 2 +-
 extras/spark-ganglia-lgpl/pom.xml | 2 +-
 graphx/pom.xml                    | 2 +-
 launcher/pom.xml                  | 2 +-
 mllib/pom.xml                     | 2 +-
 network/common/pom.xml            | 2 +-
 network/shuffle/pom.xml           | 2 +-
 network/yarn/pom.xml              | 2 +-
 pom.xml                           | 2 +-
 repl/pom.xml                      | 2 +-
 sql/catalyst/pom.xml              | 2 +-
 sql/core/pom.xml                  | 2 +-
 sql/hive-thriftserver/pom.xml     | 2 +-
 sql/hive/pom.xml                  | 2 +-
 streaming/pom.xml                 | 2 +-
 tools/pom.xml                     | 2 +-
 unsafe/pom.xml                    | 2 +-
 yarn/pom.xml                      | 2 +-
 30 files changed, 30 insertions(+), 30 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/03fb26a3/assembly/pom.xml
--
diff --git a/assembly/pom.xml b/assembly/pom.xml
index b53d7c3..b8a821d 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.1-SNAPSHOT</version>
+    <version>1.4.0</version>
     <relativePath>../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/03fb26a3/bagel/pom.xml
--
diff --git a/bagel/pom.xml b/bagel/pom.xml
index d631ff5..c1aa32b 100644
--- a/bagel/pom.xml
+++ b/bagel/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.1-SNAPSHOT</version>
+    <version>1.4.0</version>
     <relativePath>../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/03fb26a3/core/pom.xml
--
diff --git a/core/pom.xml b/core/pom.xml
index adbb7c2..8acb923 100644
--- a/core/pom.xml
+++ b/core/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
    <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.1-SNAPSHOT</version>
+    <version>1.4.0</version>
     <relativePath>../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/03fb26a3/examples/pom.xml
--
diff --git a/examples/pom.xml b/examples/pom.xml
index bf804bb..706a97d 100644
--- a/examples/pom.xml
+++ b/examples/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.1-SNAPSHOT</version>
+    <version>1.4.0</version>
     <relativePath>../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/03fb26a3/external/flume-sink/pom.xml
--
diff --git a/external/flume-sink/pom.xml b/external/flume-sink/pom.xml
index 076ddaa..e8784eb 100644
--- a/external/flume-sink/pom.xml
+++ b/external/flume-sink/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.1-SNAPSHOT</version>
+    <version>1.4.0</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/03fb26a3/external/flume/pom.xml
--
diff --git a/external/flume/pom.xml b/external/flume/pom.xml
index 2491c97..1794f3e 100644
--- a/external/flume/pom.xml
+++ b/external/flume/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
[2/2] spark git commit: Preparing development version 1.4.0-SNAPSHOT
Preparing development version 1.4.0-SNAPSHOT

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/947d700e
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/947d700e
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/947d700e

Branch: refs/heads/branch-1.4
Commit: 947d700ec867a5a2166c45fd0c108ef1d6d6dd99
Parents: 03fb26a
Author: Patrick Wendell pwend...@gmail.com
Authored: Sat May 23 20:13:05 2015 -0700
Committer: Patrick Wendell pwend...@gmail.com
Committed: Sat May 23 20:13:05 2015 -0700

--
 assembly/pom.xml                  | 2 +-
 bagel/pom.xml                     | 2 +-
 core/pom.xml                      | 2 +-
 examples/pom.xml                  | 2 +-
 external/flume-sink/pom.xml       | 2 +-
 external/flume/pom.xml            | 2 +-
 external/kafka-assembly/pom.xml   | 2 +-
 external/kafka/pom.xml            | 2 +-
 external/mqtt/pom.xml             | 2 +-
 external/twitter/pom.xml          | 2 +-
 external/zeromq/pom.xml           | 2 +-
 extras/java8-tests/pom.xml        | 2 +-
 extras/kinesis-asl/pom.xml        | 2 +-
 extras/spark-ganglia-lgpl/pom.xml | 2 +-
 graphx/pom.xml                    | 2 +-
 launcher/pom.xml                  | 2 +-
 mllib/pom.xml                     | 2 +-
 network/common/pom.xml            | 2 +-
 network/shuffle/pom.xml           | 2 +-
 network/yarn/pom.xml              | 2 +-
 pom.xml                           | 2 +-
 repl/pom.xml                      | 2 +-
 sql/catalyst/pom.xml              | 2 +-
 sql/core/pom.xml                  | 2 +-
 sql/hive-thriftserver/pom.xml     | 2 +-
 sql/hive/pom.xml                  | 2 +-
 streaming/pom.xml                 | 2 +-
 tools/pom.xml                     | 2 +-
 unsafe/pom.xml                    | 2 +-
 yarn/pom.xml                      | 2 +-
 30 files changed, 30 insertions(+), 30 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/947d700e/assembly/pom.xml
--
diff --git a/assembly/pom.xml b/assembly/pom.xml
index b8a821d..626c857 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.0</version>
+    <version>1.4.0-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/947d700e/bagel/pom.xml
--
diff --git a/bagel/pom.xml b/bagel/pom.xml
index c1aa32b..1f3dec9 100644
--- a/bagel/pom.xml
+++ b/bagel/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.0</version>
+    <version>1.4.0-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/947d700e/core/pom.xml
--
diff --git a/core/pom.xml b/core/pom.xml
index 8acb923..bfa49d0 100644
--- a/core/pom.xml
+++ b/core/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.0</version>
+    <version>1.4.0-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/947d700e/examples/pom.xml
--
diff --git a/examples/pom.xml b/examples/pom.xml
index 706a97d..5b04b4f 100644
--- a/examples/pom.xml
+++ b/examples/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.0</version>
+    <version>1.4.0-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/947d700e/external/flume-sink/pom.xml
--
diff --git a/external/flume-sink/pom.xml b/external/flume-sink/pom.xml
index e8784eb..1f3e619 100644
--- a/external/flume-sink/pom.xml
+++ b/external/flume-sink/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.0</version>
+    <version>1.4.0-SNAPSHOT</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/947d700e/external/flume/pom.xml
--
diff --git a/external/flume/pom.xml b/external/flume/pom.xml
index 1794f3e..8df7edb 100644
--- a/external/flume/pom.xml
+++ b/external/flume/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.0</version>
+    <version>1.4.0-SNAPSHOT</version>
Git Push Summary
Repository: spark
Updated Tags: refs/tags/v1.4.0-rc2 [created] 03fb26a3e