spark git commit: [SPARK-21315][SQL] Skip some spill files when generateIterator(startIndex) in ExternalAppendOnlyUnsafeRowArray.

2017-07-10 Thread wenchen
Repository: spark Updated Branches: refs/heads/master 833eab2c9 -> 97a1aa2c7 [SPARK-21315][SQL] Skip some spill files when generateIterator(startIndex) in ExternalAppendOnlyUnsafeRowArray. ## What changes were proposed in this pull request? In current code, it is expensive to use
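
The commit title suggests that an iterator starting at `startIndex` can skip spill files whose rows all come before that index. As a rough illustration only, here is a minimal pure-Python sketch of that bookkeeping. It does not use Spark; `locate_start` and the `rows_per_spill` list are hypothetical names, and the per-file row counts are assumed to be known, which the truncated message above does not confirm.

```python
from typing import List, Tuple


def locate_start(rows_per_spill: List[int], start_index: int) -> Tuple[int, int]:
    """Return (spill file number, row offset inside that file) for start_index.

    Assumption: rows_per_spill[i] is the number of rows written to spill file i.
    """
    remaining = start_index
    for file_no, count in enumerate(rows_per_spill):
        if remaining < count:
            # start_index falls inside this file; only `remaining` rows are skipped here.
            return file_no, remaining
        # Every row of this file precedes start_index, so the file need not be opened.
        remaining -= count
    # start_index lies beyond all spilled rows (for example, in an in-memory tail).
    return len(rows_per_spill), remaining


print(locate_start([100, 100, 50], 230))
```

With three spill files of 100, 100 and 50 rows, index 230 resolves to the third file at offset 30, so the first two files are never read.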

spark git commit: [SPARK-21369][CORE] Don't use Scala Tuple2 in common/network-*

2017-07-10 Thread wenchen
Repository: spark Updated Branches: refs/heads/branch-2.2 a05edf454 -> edcd9fbc9 [SPARK-21369][CORE] Don't use Scala Tuple2 in common/network-* ## What changes were proposed in this pull request? Remove all usages of Scala Tuple2 from common/network-* projects. Otherwise, Yarn users cannot

spark git commit: [SPARK-21369][CORE] Don't use Scala Tuple2 in common/network-*

2017-07-10 Thread wenchen
Repository: spark Updated Branches: refs/heads/master 1471ee7af -> 833eab2c9 [SPARK-21369][CORE] Don't use Scala Tuple2 in common/network-* ## What changes were proposed in this pull request? Remove all usages of Scala Tuple2 from common/network-* projects. Otherwise, Yarn users cannot use

spark git commit: [SPARK-21350][SQL] Fix the error message when the number of arguments is wrong when invoking a UDF

2017-07-10 Thread wenchen
Repository: spark Updated Branches: refs/heads/master a2bec6c92 -> 1471ee7af [SPARK-21350][SQL] Fix the error message when the number of arguments is wrong when invoking a UDF ### What changes were proposed in this pull request? Users get a very confusing error when they specify a wrong

spark git commit: [SPARK-21043][SQL] Add unionByName in Dataset

2017-07-10 Thread lixiao
Repository: spark Updated Branches: refs/heads/master c3713fde8 -> a2bec6c92 [SPARK-21043][SQL] Add unionByName in Dataset ## What changes were proposed in this pull request? This PR adds `unionByName` to `Dataset`. Here is how to use it: ``` val df1 = Seq((1, 2, 3)).toDF("col0", "col1",
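
A union by name aligns columns by their names rather than by their positions. The following is a minimal pure-Python sketch of that idea, not Spark's implementation. The `union_by_name` helper, the `(columns, rows)` table representation, and the `col2` column and second table are all hypothetical, chosen only for illustration because the preview above is truncated.

```python
from typing import List, Tuple

# A toy table: (column names, rows as tuples in that column order).
Table = Tuple[List[str], List[tuple]]


def union_by_name(left: Table, right: Table) -> Table:
    """Append right's rows to left's rows, matching columns by name."""
    left_cols, left_rows = left
    right_cols, right_rows = right
    if sorted(left_cols) != sorted(right_cols):
        raise ValueError(f"column sets differ: {left_cols} vs {right_cols}")
    # For each left column, find its position in the right table.
    positions = [right_cols.index(name) for name in left_cols]
    # Reorder every right row into the left table's column order.
    reordered = [tuple(row[i] for i in positions) for row in right_rows]
    return left_cols, list(left_rows) + reordered


left = (["col0", "col1", "col2"], [(1, 2, 3)])
right = (["col1", "col2", "col0"], [(4, 5, 6)])
cols, rows = union_by_name(left, right)
print(cols, rows)
```

A positional union would have appended `(4, 5, 6)` unchanged; matching by name places the value 6 under `col0` instead.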

spark git commit: [SPARK-21358][EXAMPLES] Argument of repartitionandsortwithinpartitions at pyspark

2017-07-10 Thread rxin
Repository: spark Updated Branches: refs/heads/master d03aebbe6 -> c3713fde8 [SPARK-21358][EXAMPLES] Argument of repartitionandsortwithinpartitions at pyspark ## What changes were proposed in this pull request? In the example for repartitionAndSortWithinPartitions in rdd.py, the third argument

spark git commit: [SPARK-13534][PYSPARK] Using Apache Arrow to increase performance of DataFrame.toPandas

2017-07-10 Thread holden
Repository: spark Updated Branches: refs/heads/master 2bfd5accd -> d03aebbe6 [SPARK-13534][PYSPARK] Using Apache Arrow to increase performance of DataFrame.toPandas ## What changes were proposed in this pull request? Integrate Apache Arrow with Spark to increase performance of

[spark] Git Push Summary

2017-07-10 Thread marmbrus
Repository: spark Updated Tags: refs/tags/v2.1.0-rc4 [deleted] ec3172658 - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org

[spark] Git Push Summary

2017-07-10 Thread marmbrus
Repository: spark Updated Tags: refs/tags/v2.1.0-rc5 [deleted] cd0a08361

[spark] Git Push Summary

2017-07-10 Thread marmbrus
Repository: spark Updated Tags: refs/tags/v2.1.0-rc3 [deleted] ef2ccf942

[spark] Git Push Summary

2017-07-10 Thread marmbrus
Repository: spark Updated Tags: refs/tags/v2.1.0-rc1 [deleted] 80aabc0bd

[spark] Git Push Summary

2017-07-10 Thread marmbrus
Repository: spark Updated Tags: refs/tags/v2.1.0-rc2 [deleted] 080717497

[spark] Git Push Summary

2017-07-10 Thread marmbrus
Repository: spark Updated Tags: refs/tags/v2.2.0-rc4 [deleted] 377cfa8ac

[spark] Git Push Summary

2017-07-10 Thread marmbrus
Repository: spark Updated Tags: refs/tags/v2.2.0-rc1 [deleted] 8ccb4a57c

[spark] Git Push Summary

2017-07-10 Thread marmbrus
Repository: spark Updated Tags: refs/tags/v2.2.0-rc2 [deleted] 1d4017b44

[spark] Git Push Summary

2017-07-10 Thread marmbrus
Repository: spark Updated Tags: refs/tags/v2.2.0-rc6 [deleted] a2c7b2133

[spark] Git Push Summary

2017-07-10 Thread marmbrus
Repository: spark Updated Tags: refs/tags/v2.2.0-rc5 [deleted] 62e442e73

[spark] Git Push Summary

2017-07-10 Thread marmbrus
Repository: spark Updated Tags: refs/tags/v2.2.0-rc3 [deleted] cc5dbd55b

[spark] Git Push Summary

2017-07-10 Thread marmbrus
Repository: spark Updated Tags: refs/tags/v2.2.0 [created] a2c7b2133

svn commit: r20396 - /dev/spark/spark-2.2.0-rc6/ /release/spark/spark-2.2.0/

2017-07-10 Thread marmbrus
Author: marmbrus Date: Mon Jul 10 22:11:42 2017 New Revision: 20396 Log: Release Spark 2.2.0 Added: release/spark/spark-2.2.0/ - copied from r20395, dev/spark/spark-2.2.0-rc6/ Removed: dev/spark/spark-2.2.0-rc6/

svn commit: r20394 - /dev/spark/spark-2.2.0-rc6/

2017-07-10 Thread marmbrus
Author: marmbrus Date: Mon Jul 10 19:25:36 2017 New Revision: 20394 Log: Add spark-2.2.0-rc6 Added: dev/spark/spark-2.2.0-rc6/ dev/spark/spark-2.2.0-rc6/SparkR_2.2.0.tar.gz (with props) dev/spark/spark-2.2.0-rc6/SparkR_2.2.0.tar.gz.asc

spark git commit: [SPARK-21266][R][PYTHON] Support schema a DDL-formatted string in dapply/gapply/from_json

2017-07-10 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master 18b3b00ec -> 2bfd5accd [SPARK-21266][R][PYTHON] Support schema a DDL-formatted string in dapply/gapply/from_json ## What changes were proposed in this pull request? This PR supports schema in a DDL formatted string for `from_json` in

spark git commit: [SPARK-21272] SortMergeJoin LeftAnti does not update numOutputRows

2017-07-10 Thread lixiao
Repository: spark Updated Branches: refs/heads/branch-2.2 40fd0ce7f -> a05edf454 [SPARK-21272] SortMergeJoin LeftAnti does not update numOutputRows ## What changes were proposed in this pull request? Updating numOutputRows metric was missing from one return path of LeftAnti SortMergeJoin.

spark git commit: [SPARK-21272] SortMergeJoin LeftAnti does not update numOutputRows

2017-07-10 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 6a06c4b03 -> 18b3b00ec [SPARK-21272] SortMergeJoin LeftAnti does not update numOutputRows ## What changes were proposed in this pull request? Updating numOutputRows metric was missing from one return path of LeftAnti SortMergeJoin. ##

spark git commit: [SPARK-21342] Fix DownloadCallback to work well with RetryingBlockFetcher.

2017-07-10 Thread wenchen
Repository: spark Updated Branches: refs/heads/branch-2.2 3bfad9d42 -> 40fd0ce7f [SPARK-21342] Fix DownloadCallback to work well with RetryingBlockFetcher. When `RetryingBlockFetcher` retries fetching blocks, there could be two `DownloadCallback`s downloading the same content to the same target

spark git commit: [SPARK-21342] Fix DownloadCallback to work well with RetryingBlockFetcher.

2017-07-10 Thread wenchen
Repository: spark Updated Branches: refs/heads/master 647963a26 -> 6a06c4b03 [SPARK-21342] Fix DownloadCallback to work well with RetryingBlockFetcher. ## What changes were proposed in this pull request? When `RetryingBlockFetcher` retries fetching blocks, there could be two

spark git commit: [SPARK-20460][SQL] Make it more consistent to handle column name duplication

2017-07-10 Thread wenchen
Repository: spark Updated Branches: refs/heads/master c444d1086 -> 647963a26 [SPARK-20460][SQL] Make it more consistent to handle column name duplication ## What changes were proposed in this pull request? This PR makes the handling of column name duplication more consistent. In the current

spark git commit: [MINOR][DOC] Remove obsolete `ec2-scripts.md`

2017-07-10 Thread srowen
Repository: spark Updated Branches: refs/heads/master 96d58f285 -> c444d1086 [MINOR][DOC] Remove obsolete `ec2-scripts.md` ## What changes were proposed in this pull request? Since this document became obsolete, we had better remove this for Apache Spark 2.3.0. The original document is

[spark-website] Git Push Summary

2017-07-10 Thread srowen
Repository: spark-website Updated Branches: refs/heads/remove_ec2 [deleted] 04d5ce051

spark git commit: [SPARK-21219][CORE] Task retry occurs on same executor due to race condition with blacklisting

2017-07-10 Thread wenchen
Repository: spark Updated Branches: refs/heads/master 0e80ecae3 -> 96d58f285 [SPARK-21219][CORE] Task retry occurs on same executor due to race condition with blacklisting ## What changes were proposed in this pull request? There's a race condition in the current TaskSetManager where a

[1/2] spark-website git commit: Recover ec2-scripts.html and remove ec2-scripts.md.

2017-07-10 Thread srowen
Repository: spark-website Updated Branches: refs/heads/remove_ec2 [created] 04d5ce051 Recover ec2-scripts.html and remove ec2-scripts.md. Project: http://git-wip-us.apache.org/repos/asf/spark-website/repo Commit: http://git-wip-us.apache.org/repos/asf/spark-website/commit/74622a5c Tree:

spark-website git commit: Use AMPLab direct link in FAQ

2017-07-10 Thread srowen
Repository: spark-website Updated Branches: refs/heads/asf-site 74622a5cd -> 04d5ce051 Use AMPLab direct link in FAQ Project: http://git-wip-us.apache.org/repos/asf/spark-website/repo Commit: http://git-wip-us.apache.org/repos/asf/spark-website/commit/04d5ce05 Tree:

[2/2] spark-website git commit: Use AMPLab direct link in FAQ

2017-07-10 Thread srowen
Use AMPLab direct link in FAQ Project: http://git-wip-us.apache.org/repos/asf/spark-website/repo Commit: http://git-wip-us.apache.org/repos/asf/spark-website/commit/04d5ce05 Tree: http://git-wip-us.apache.org/repos/asf/spark-website/tree/04d5ce05 Diff:

spark-website git commit: Recover ec2-scripts.html and remove ec2-scripts.md.

2017-07-10 Thread srowen
Repository: spark-website Updated Branches: refs/heads/asf-site 878dcfd84 -> 74622a5cd Recover ec2-scripts.html and remove ec2-scripts.md. Project: http://git-wip-us.apache.org/repos/asf/spark-website/repo Commit: http://git-wip-us.apache.org/repos/asf/spark-website/commit/74622a5c Tree: