svn commit: r29181 - in /dev/spark/2.3.3-SNAPSHOT-2018_09_06_22_02-84922e5-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s

2018-09-06 Thread pwendell
Author: pwendell Date: Fri Sep 7 05:15:35 2018 New Revision: 29181 Log: Apache Spark 2.3.3-SNAPSHOT-2018_09_06_22_02-84922e5 docs [This commit notification would consist of 1443 parts, which exceeds the limit of 50, so it was shortened to this summary.]

spark git commit: [SPARK-25237][SQL] Remove updateBytesReadWithFileSize in FileScanRDD

2018-09-06 Thread srowen
Repository: spark Updated Branches: refs/heads/master 4e3365b57 -> ed249db9c [SPARK-25237][SQL] Remove updateBytesReadWithFileSize in FileScanRDD ## What changes were proposed in this pull request? This PR removed the method `updateBytesReadWithFileSize` in `FileScanRDD` because it computes
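The message is truncated above, but the general point of removing a size-based metric helper can be illustrated: deriving a bytes-read metric from the whole file size overcounts whenever a file is only partially consumed, while incremental accounting counts what was actually read. A plain-Python sketch with invented names, not Spark's actual code:

```python
# Illustrative sketch (names invented): incremental byte accounting vs
# accounting by whole-file size, which overcounts partially read files.
class BytesReadMetric:
    def __init__(self):
        self.bytes_read = 0

    def record(self, n):
        # Incremental: count only what was actually read.
        self.bytes_read += n

def read_prefix(metric, file_bytes, limit):
    """Read only the first `limit` bytes of an in-memory 'file'."""
    chunk = file_bytes[:limit]
    metric.record(len(chunk))
    return chunk

file_bytes = b"0123456789"          # a 10-byte "file"
metric = BytesReadMetric()
read_prefix(metric, file_bytes, 4)

print(metric.bytes_read)            # 4: bytes actually read
print(len(file_bytes))              # 10: what size-based accounting would report
```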

spark git commit: [SPARK-25237][SQL] Remove updateBytesReadWithFileSize in FileScanRDD

2018-09-06 Thread srowen
Repository: spark Updated Branches: refs/heads/branch-2.4 3644c84f5 -> f9b476c6a [SPARK-25237][SQL] Remove updateBytesReadWithFileSize in FileScanRDD ## What changes were proposed in this pull request? This pr removed the method `updateBytesReadWithFileSize` in `FileScanRDD` because it

spark git commit: [SPARK-22357][CORE][FOLLOWUP] SparkContext.binaryFiles ignore minPartitions parameter

2018-09-06 Thread srowen
Repository: spark Updated Branches: refs/heads/master b0ada7dce -> 4e3365b57 [SPARK-22357][CORE][FOLLOWUP] SparkContext.binaryFiles ignore minPartitions parameter ## What changes were proposed in this pull request? This adds a test following https://github.com/apache/spark/pull/21638 ##
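The follow-up adds a test that `minPartitions` is actually honored. The underlying intent can be sketched with invented helper names: cap each split's size so the input yields at least `minPartitions` splits.

```python
import math

def max_split_size(total_bytes, min_partitions):
    # Invented helper: cap the split size so at least min_partitions result.
    return max(1, math.ceil(total_bytes / max(min_partitions, 1)))

def num_splits(total_bytes, min_partitions):
    return math.ceil(total_bytes / max_split_size(total_bytes, min_partitions))

print(num_splits(1000, 4))   # 4
print(num_splits(1000, 1))   # 1
```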

spark git commit: [SPARK-22357][CORE][FOLLOWUP] SparkContext.binaryFiles ignore minPartitions parameter

2018-09-06 Thread srowen
Repository: spark Updated Branches: refs/heads/branch-2.4 24a32612b -> 3644c84f5 [SPARK-22357][CORE][FOLLOWUP] SparkContext.binaryFiles ignore minPartitions parameter ## What changes were proposed in this pull request? This adds a test following https://github.com/apache/spark/pull/21638

spark git commit: [SPARK-25330][BUILD][BRANCH-2.3] Revert Hadoop 2.7 to 2.7.3

2018-09-06 Thread srowen
Repository: spark Updated Branches: refs/heads/branch-2.3 d22379ec2 -> 84922e506 [SPARK-25330][BUILD][BRANCH-2.3] Revert Hadoop 2.7 to 2.7.3 ## What changes were proposed in this pull request? How to reproduce permission issue: ```sh # build spark ./dev/make-distribution.sh --name SPARK-25330

spark git commit: [SPARK-25330][BUILD][BRANCH-2.3] Revert Hadoop 2.7 to 2.7.3

2018-09-06 Thread srowen
Repository: spark Updated Branches: refs/heads/master 1b1711e05 -> b0ada7dce [SPARK-25330][BUILD][BRANCH-2.3] Revert Hadoop 2.7 to 2.7.3 ## What changes were proposed in this pull request? How to reproduce permission issue: ```sh # build spark ./dev/make-distribution.sh --name SPARK-25330

spark git commit: [SPARK-25330][BUILD][BRANCH-2.3] Revert Hadoop 2.7 to 2.7.3

2018-09-06 Thread srowen
Repository: spark Updated Branches: refs/heads/branch-2.4 ff832beee -> 24a32612b [SPARK-25330][BUILD][BRANCH-2.3] Revert Hadoop 2.7 to 2.7.3 ## What changes were proposed in this pull request? How to reproduce permission issue: ```sh # build spark ./dev/make-distribution.sh --name SPARK-25330

svn commit: r29180 - in /dev/spark/2.4.0-SNAPSHOT-2018_09_06_20_02-1b1711e-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s

2018-09-06 Thread pwendell
Author: pwendell Date: Fri Sep 7 03:16:32 2018 New Revision: 29180 Log: Apache Spark 2.4.0-SNAPSHOT-2018_09_06_20_02-1b1711e docs [This commit notification would consist of 1477 parts, which exceeds the limit of 50, so it was shortened to this summary.]

spark git commit: [SPARK-23243][CORE][2.3] Fix RDD.repartition() data correctness issue

2018-09-06 Thread wenchen
Repository: spark Updated Branches: refs/heads/branch-2.3 31dab7140 -> d22379ec2 [SPARK-23243][CORE][2.3] Fix RDD.repartition() data correctness issue backport https://github.com/apache/spark/pull/22112 to 2.3 --- An alternative fix for https://github.com/apache/spark/pull/21698 When
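The correctness issue comes from round-robin partitioning being sensitive to input order: if a retried task iterates its rows in a different order than the first attempt, the same rows land in different output partitions, so downstream consumers can see duplicates and losses. A small sketch with invented data (the actual fix in the PR makes the row order deterministic before assignment):

```python
# Sketch of why round-robin repartition is order-sensitive.
def round_robin(rows, num_partitions):
    parts = [[] for _ in range(num_partitions)]
    for i, row in enumerate(rows):
        parts[i % num_partitions].append(row)
    return parts

rows = ["a", "b", "c", "d"]
first_attempt = round_robin(rows, 2)                  # [['a','c'], ['b','d']]
retry_attempt = round_robin(list(reversed(rows)), 2)  # same rows, other order

print(first_attempt[0])   # ['a', 'c']
print(retry_attempt[0])   # ['d', 'b'] -- partition 0 got different rows

# Making the input order deterministic first removes the sensitivity:
assert round_robin(sorted(rows), 2) == round_robin(sorted(reversed(rows)), 2)
```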

spark git commit: [SPARK-25208][SQL][FOLLOW-UP] Reduce code size.

2018-09-06 Thread wenchen
Repository: spark Updated Branches: refs/heads/branch-2.4 a7cfe5158 -> ff832beee [SPARK-25208][SQL][FOLLOW-UP] Reduce code size. ## What changes were proposed in this pull request? This is a follow-up pr of #22200. When casting to decimal type, if `Cast.canNullSafeCastToDecimal()`, overflow

spark git commit: [SPARK-25208][SQL][FOLLOW-UP] Reduce code size.

2018-09-06 Thread wenchen
Repository: spark Updated Branches: refs/heads/master da6fa3828 -> 1b1711e05 [SPARK-25208][SQL][FOLLOW-UP] Reduce code size. ## What changes were proposed in this pull request? This is a follow-up pr of #22200. When casting to decimal type, if `Cast.canNullSafeCastToDecimal()`, overflow
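The idea behind `Cast.canNullSafeCastToDecimal()` is that a widening decimal cast can never overflow, so the generated overflow-handling branch can be omitted, shrinking the code. A plain-Python illustration of that intent (invented helpers, not Spark's code):

```python
from decimal import Decimal

def fits(value, precision, scale):
    """Invented check: does `value` fit DECIMAL(precision, scale)?"""
    quantized = round(Decimal(value), scale)
    integral_digits = len(str(abs(int(quantized))))
    return integral_digits <= precision - scale

def can_null_safe_cast(src_prec, src_scale, dst_prec, dst_scale):
    # Widening in both integral digits and scale can never overflow,
    # so no overflow branch needs to be generated.
    return (dst_prec - dst_scale >= src_prec - src_scale
            and dst_scale >= src_scale)

print(fits("123.45", 5, 2))              # True
print(fits("1234.5", 5, 2))              # False: 4 integral digits > 3
print(can_null_safe_cast(3, 1, 5, 2))    # True: DECIMAL(3,1) -> DECIMAL(5,2)
```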

spark git commit: [SPARK-25262][K8S] Allow SPARK_LOCAL_DIRS to be tmpfs backed on K8S

2018-09-06 Thread mcheah
Repository: spark Updated Branches: refs/heads/master 27d3b0a51 -> da6fa3828 [SPARK-25262][K8S] Allow SPARK_LOCAL_DIRS to be tmpfs backed on K8S ## What changes were proposed in this pull request? The default behaviour of Spark on K8S currently is to create `emptyDir` volumes to back
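The feature is toggled by a boolean Spark configuration; per this PR the property is `spark.kubernetes.local.dirs.tmpfs`, though you should verify the name against the Kubernetes configuration docs for your Spark version. A config sketch:

```properties
# Back SPARK_LOCAL_DIRS with a memory-backed (tmpfs) emptyDir on Kubernetes.
# Property name as introduced by SPARK-25262; check your version's docs.
spark.kubernetes.local.dirs.tmpfs=true
```

Note that tmpfs-backed local directories consume pod memory, so executor memory overhead should be sized accordingly.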

spark git commit: [SPARK-25222][K8S] Improve container status logging

2018-09-06 Thread mcheah
Repository: spark Updated Branches: refs/heads/master c84bc40d7 -> 27d3b0a51 [SPARK-25222][K8S] Improve container status logging ## What changes were proposed in this pull request? Currently when running Spark on Kubernetes a logger is run by the client that watches the K8S API for events

svn commit: r29173 - in /dev/spark/2.3.3-SNAPSHOT-2018_09_06_14_02-31dab71-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s

2018-09-06 Thread pwendell
Author: pwendell Date: Thu Sep 6 21:15:56 2018 New Revision: 29173 Log: Apache Spark 2.3.3-SNAPSHOT-2018_09_06_14_02-31dab71 docs [This commit notification would consist of 1443 parts, which exceeds the limit of 50, so it was shortened to this summary.]

svn commit: r29169 - in /dev/spark/2.4.0-SNAPSHOT-2018_09_06_12_02-c84bc40-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s

2018-09-06 Thread pwendell
Author: pwendell Date: Thu Sep 6 19:16:48 2018 New Revision: 29169 Log: Apache Spark 2.4.0-SNAPSHOT-2018_09_06_12_02-c84bc40 docs [This commit notification would consist of 1477 parts, which exceeds the limit of 50, so it was shortened to this summary.]

spark git commit: [SPARK-25108][SQL] Fix the show method to display the wide character alignment problem

2018-09-06 Thread srowen
Repository: spark Updated Branches: refs/heads/branch-2.4 3682d29f4 -> a7cfe5158 [SPARK-25108][SQL] Fix the show method to display the wide character alignment problem This is not a perfect solution; it is designed to solve the problem while keeping complexity to a minimum. It is effective
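The alignment problem arises because padding by string length ignores that East Asian wide characters occupy two terminal cells. A stdlib sketch of display-width-aware padding (invented helpers illustrating the idea, not Spark's implementation):

```python
import unicodedata

def display_width(s):
    # Wide ("W") and fullwidth ("F") characters occupy two terminal cells.
    return sum(2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
               for ch in s)

def pad_cell(s, width):
    # Pad by display width rather than by len(s).
    return s + " " * (width - display_width(s))

print(len("数据"), display_width("数据"))   # 2 4
print(repr(pad_cell("数据", 6)))            # '数据  '
```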

spark git commit: [SPARK-25072][PYSPARK] Forbid extra value for custom Row

2018-09-06 Thread cutlerb
Repository: spark Updated Branches: refs/heads/branch-2.3 9db81fd86 -> 31dab7140 [SPARK-25072][PYSPARK] Forbid extra value for custom Row ## What changes were proposed in this pull request? Add value length check in `_create_row`, forbid extra value for custom Row in PySpark. ## How was
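The gist of the fix is a length check at row construction time: supplying more values than there are fields should raise rather than silently succeed. A plain-Python sketch of that kind of check (not PySpark's actual `_create_row` code or error message):

```python
# Sketch of a value-count check like the one added to PySpark's Row creation.
def make_row(fields, values):
    if len(values) > len(fields):
        raise ValueError("expected %d values for fields %s, got %d"
                         % (len(fields), fields, len(values)))
    return dict(zip(fields, values))

row = make_row(["name", "age"], ["alice", 11])

try:
    make_row(["name", "age"], ["alice", 11, "extra"])
    rejected = False
except ValueError:
    rejected = True

print(row["age"], rejected)   # 11 True
```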

spark git commit: [SPARK-25072][PYSPARK] Forbid extra value for custom Row

2018-09-06 Thread cutlerb
Repository: spark Updated Branches: refs/heads/branch-2.4 f2d502223 -> 3682d29f4 [SPARK-25072][PYSPARK] Forbid extra value for custom Row ## What changes were proposed in this pull request? Add value length check in `_create_row`, forbid extra value for custom Row in PySpark. ## How was

spark git commit: [SPARK-25072][PYSPARK] Forbid extra value for custom Row

2018-09-06 Thread cutlerb
Repository: spark Updated Branches: refs/heads/master 3b6591b0b -> c84bc40d7 [SPARK-25072][PYSPARK] Forbid extra value for custom Row ## What changes were proposed in this pull request? Add value length check in `_create_row`, forbid extra value for custom Row in PySpark. ## How was this

svn commit: r29168 - in /dev/spark/2.3.3-SNAPSHOT-2018_09_06_10_02-9db81fd-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s

2018-09-06 Thread pwendell
Author: pwendell Date: Thu Sep 6 17:16:02 2018 New Revision: 29168 Log: Apache Spark 2.3.3-SNAPSHOT-2018_09_06_10_02-9db81fd docs [This commit notification would consist of 1443 parts, which exceeds the limit of 50, so it was shortened to this summary.]

spark git commit: [SPARK-25328][PYTHON] Add an example for having two columns as the grouping key in group aggregate pandas UDF

2018-09-06 Thread cutlerb
Repository: spark Updated Branches: refs/heads/branch-2.4 085f731ad -> f2d502223 [SPARK-25328][PYTHON] Add an example for having two columns as the grouping key in group aggregate pandas UDF ## What changes were proposed in this pull request? This PR proposes to add another example for
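The PR's example uses `df.groupby(...)` with two columns before applying a group aggregate pandas UDF. Since running PySpark here isn't practical, a stdlib sketch of the same concept, a composite two-column grouping key, with invented data:

```python
# Two-column grouping key: group on the (a, b) pair, then aggregate per group.
from collections import defaultdict

rows = [("a", 1, 10.0), ("a", 1, 20.0), ("a", 2, 30.0), ("b", 1, 40.0)]

groups = defaultdict(list)
for key_a, key_b, v in rows:
    groups[(key_a, key_b)].append(v)   # the grouping key is the pair

means = {k: sum(vs) / len(vs) for k, vs in groups.items()}
print(means[("a", 1)])   # 15.0
```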

spark git commit: [SPARK-25268][GRAPHX] run Parallel Personalized PageRank throws serialization Exception

2018-09-06 Thread jkbradley
Repository: spark Updated Branches: refs/heads/branch-2.4 b632e775c -> 085f731ad [SPARK-25268][GRAPHX] run Parallel Personalized PageRank throws serialization Exception ## What changes were proposed in this pull request? `mapValues` in Scala is currently not serializable. To avoid the
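The Scala issue is that `mapValues` returns a lazy view that does not serialize; the usual remedy is materializing it (e.g. with an extra `map(identity)`). Python has an analogous pitfall, sketched here: lazy generators cannot be pickled, while materialized collections can.

```python
# Analogy in Python (the GraphX fix itself is Scala): lazy views often are
# not serializable, while materialized collections are.
import pickle

data = {"a": 1, "b": 2}
lazy_view = (v * 2 for v in data.values())      # lazy, like a mapValues view

try:
    pickle.dumps(lazy_view)
    lazy_ok = True
except TypeError:
    lazy_ok = False                              # generators can't be pickled

materialized = [v * 2 for v in data.values()]    # forcing evaluation fixes it
roundtrip = pickle.loads(pickle.dumps(materialized))

print(lazy_ok, roundtrip)   # False [2, 4]
```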

spark git commit: [SPARK-25268][GRAPHX] run Parallel Personalized PageRank throws serialization Exception

2018-09-06 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master 7ef6d1daf -> 3b6591b0b [SPARK-25268][GRAPHX] run Parallel Personalized PageRank throws serialization Exception ## What changes were proposed in this pull request? `mapValues` in Scala is currently not serializable. To avoid the

spark git commit: [SPARK-25328][PYTHON] Add an example for having two columns as the grouping key in group aggregate pandas UDF

2018-09-06 Thread cutlerb
Repository: spark Updated Branches: refs/heads/master f5817d8bb -> 7ef6d1daf [SPARK-25328][PYTHON] Add an example for having two columns as the grouping key in group aggregate pandas UDF ## What changes were proposed in this pull request? This PR proposes to add another example for multiple

svn commit: r29166 - in /dev/spark/2.4.0-SNAPSHOT-2018_09_06_08_02-f5817d8-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s

2018-09-06 Thread pwendell
Author: pwendell Date: Thu Sep 6 15:16:52 2018 New Revision: 29166 Log: Apache Spark 2.4.0-SNAPSHOT-2018_09_06_08_02-f5817d8 docs [This commit notification would consist of 1477 parts, which exceeds the limit of 50, so it was shortened to this summary.]

spark git commit: [SPARK-25313][BRANCH-2.3][SQL] Fix regression in FileFormatWriter output names

2018-09-06 Thread wenchen
Repository: spark Updated Branches: refs/heads/branch-2.3 31e46ec60 -> 9db81fd86 [SPARK-25313][BRANCH-2.3][SQL] Fix regression in FileFormatWriter output names Port https://github.com/apache/spark/pull/22320 to branch-2.3 ## What changes were proposed in this pull request? Let's see the

spark git commit: [SPARK-25108][SQL] Fix the show method to display the wide character alignment problem

2018-09-06 Thread srowen
Repository: spark Updated Branches: refs/heads/master 64c314e22 -> f5817d8bb [SPARK-25108][SQL] Fix the show method to display the wide character alignment problem This is not a perfect solution; it is designed to solve the problem while keeping complexity to a minimum. It is effective for

svn commit: r29165 - in /dev/spark/2.4.0-SNAPSHOT-2018_09_06_04_02-64c314e-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s

2018-09-06 Thread pwendell
Author: pwendell Date: Thu Sep 6 11:16:27 2018 New Revision: 29165 Log: Apache Spark 2.4.0-SNAPSHOT-2018_09_06_04_02-64c314e docs [This commit notification would consist of 1477 parts, which exceeds the limit of 50, so it was shortened to this summary.]

spark git commit: [SPARK-25317][CORE] Avoid perf regression in Murmur3 Hash on UTF8String

2018-09-06 Thread wenchen
Repository: spark Updated Branches: refs/heads/branch-2.4 d749d034a -> b632e775c [SPARK-25317][CORE] Avoid perf regression in Murmur3 Hash on UTF8String ## What changes were proposed in this pull request? SPARK-10399 introduced a performance regression on the hash computation for
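Spark's Murmur3 hash of a `UTF8String` operates on the underlying bytes in a single pass. For reference, a minimal Python sketch of the standard Murmur3 x86 32-bit algorithm over a byte buffer (the textbook algorithm, not Spark's exact implementation):

```python
# Murmur3 x86 32-bit over raw bytes (standard algorithm, illustrative only).
def murmur3_32(data, seed=0):
    c1, c2 = 0xCC9E2D51, 0x1B873593
    h = seed & 0xFFFFFFFF
    n = len(data)
    for i in range(0, n - n % 4, 4):                 # 4-byte body blocks
        k = int.from_bytes(data[i:i + 4], "little")
        k = (k * c1) & 0xFFFFFFFF
        k = ((k << 15) | (k >> 17)) & 0xFFFFFFFF     # rotl32(k, 15)
        k = (k * c2) & 0xFFFFFFFF
        h ^= k
        h = ((h << 13) | (h >> 19)) & 0xFFFFFFFF     # rotl32(h, 13)
        h = (h * 5 + 0xE6546B64) & 0xFFFFFFFF
    k = 0
    for b in reversed(data[n - n % 4:]):             # trailing 1-3 bytes
        k = (k << 8) | b
    if n % 4:
        k = (k * c1) & 0xFFFFFFFF
        k = ((k << 15) | (k >> 17)) & 0xFFFFFFFF
        k = (k * c2) & 0xFFFFFFFF
        h ^= k
    h ^= n                                           # finalization mix
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & 0xFFFFFFFF
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & 0xFFFFFFFF
    h ^= h >> 16
    return h

# Hashing a string means hashing its UTF-8 bytes in one pass:
print(murmur3_32("Spark".encode("utf-8")))
```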

spark git commit: [SPARK-25317][CORE] Avoid perf regression in Murmur3 Hash on UTF8String

2018-09-06 Thread wenchen
Repository: spark Updated Branches: refs/heads/master d749d034a -> 64c314e22 [SPARK-25317][CORE] Avoid perf regression in Murmur3 Hash on UTF8String ## What changes were proposed in this pull request? SPARK-10399 introduced a performance regression on the hash computation for UTF8String.

[spark] Git Push Summary

2018-09-06 Thread wenchen
Repository: spark Updated Branches: refs/heads/branch-2.4 [created] d749d034a

svn commit: r29164 - in /dev/spark/2.4.0-SNAPSHOT-2018_09_06_00_02-d749d03-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s

2018-09-06 Thread pwendell
Author: pwendell Date: Thu Sep 6 07:16:47 2018 New Revision: 29164 Log: Apache Spark 2.4.0-SNAPSHOT-2018_09_06_00_02-d749d03 docs [This commit notification would consist of 1477 parts, which exceeds the limit of 50, so it was shortened to this summary.]