spark git commit: [SPARK-24002][SQL] Task not serializable caused by org.apache.parquet.io.api.Binary$ByteBufferBackedBinary.getBytes

2018-04-17 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 310a8cd06 -> cce469435 [SPARK-24002][SQL] Task not serializable caused by org.apache.parquet.io.api.Binary$ByteBufferBackedBinary.getBytes ## What changes were proposed in this pull request? ``` Py4JJavaError: An error occurred while
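The truncated error above stems from a byte-buffer-backed value being captured in a task closure. As a language-neutral sketch of the general failure mode (not Spark's actual fix), holding a plain bytes copy instead of a buffer view keeps the enclosing object serializable; the `PushedFilter` name here is purely illustrative:

```python
import pickle

class PushedFilter:
    """Hypothetical filter object shipped to worker tasks."""
    def __init__(self, buf):
        # Materialize a plain bytes copy: memoryview objects cannot be
        # pickled, which is the Python analog of "Task not serializable".
        self.value = bytes(buf)

buf = memoryview(b"parquet-binary")
f = PushedFilter(buf)
data = pickle.dumps(f)  # succeeds: only plain bytes are serialized
# pickle.dumps(buf) would raise TypeError: cannot pickle 'memoryview' object
```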

spark git commit: [SPARK-23341][SQL] define some standard options for data source v2

2018-04-17 Thread wenchen
Repository: spark Updated Branches: refs/heads/master 1e3b8762a -> 310a8cd06 [SPARK-23341][SQL] define some standard options for data source v2 ## What changes were proposed in this pull request? Each data source implementation can define its own options and teach its users how to set them.
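One property standardized options tend to share is case-insensitive lookup, so users are not tripped up by key capitalization. A minimal sketch of that idea, assuming nothing about Spark's actual class (the name and keys below are illustrative):

```python
class DataSourceOptions:
    """Sketch of a case-insensitive option map for a data source."""
    def __init__(self, options):
        # Normalize keys once at construction time.
        self._opts = {k.lower(): v for k, v in options.items()}

    def get(self, key, default=None):
        return self._opts.get(key.lower(), default)

opts = DataSourceOptions({"Path": "/data/events", "timeZone": "UTC"})
opts.get("path")      # "/data/events"
opts.get("TIMEZONE")  # "UTC"
```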

svn commit: r26386 - in /dev/spark/2.4.0-SNAPSHOT-2018_04_17_20_01-1e3b876-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s

2018-04-17 Thread pwendell
Author: pwendell Date: Wed Apr 18 03:16:32 2018 New Revision: 26386 Log: Apache Spark 2.4.0-SNAPSHOT-2018_04_17_20_01-1e3b876 docs [This commit notification would consist of 1457 parts, which exceeds the limit of 50 parts, so it was shortened to a summary.]

spark git commit: [SPARK-21479][SQL] Outer join filter pushdown in null supplying table when condition is on one of the joined columns

2018-04-17 Thread wenchen
Repository: spark Updated Branches: refs/heads/master 5fccdae18 -> 1e3b8762a [SPARK-21479][SQL] Outer join filter pushdown in null supplying table when condition is on one of the joined columns ## What changes were proposed in this pull request? Added `TransitPredicateInOuterJoin`
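The idea behind such a rule: in an equi-join, a filter on the preserved side's join key implies the same filter on the matching key of the null-supplying side, so it can be pushed below the join. A toy sketch of that inference (function and column names are illustrative, not Spark's optimizer API):

```python
def transit_predicates(equi_join_keys, preserved_side_filters):
    """For each (preserved_key, null_supplying_key) pair, copy any filter on
    the preserved key over to the null-supplying side's key."""
    pushed = {}
    for preserved_key, null_supplying_key in equi_join_keys:
        if preserved_key in preserved_side_filters:
            pushed[null_supplying_key] = preserved_side_filters[preserved_key]
    return pushed

# For "a LEFT JOIN b ON a.k = b.k WHERE a.k = 5", the filter transits to b:
transit_predicates([("a.k", "b.k")], {"a.k": 5})  # {'b.k': 5}
```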

spark git commit: [SPARK-22968][DSTREAM] Throw an exception on partition revoking issue

2018-04-17 Thread koeninger
Repository: spark Updated Branches: refs/heads/master 1ca3c50fe -> 5fccdae18 [SPARK-22968][DSTREAM] Throw an exception on partition revoking issue ## What changes were proposed in this pull request? Kafka partitions can be revoked when new consumers join the consumer group to rebalance


svn commit: r26383 - in /dev/spark/2.3.1-SNAPSHOT-2018_04_17_14_01-6b99d5b-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s

2018-04-17 Thread pwendell
Author: pwendell Date: Tue Apr 17 21:16:26 2018 New Revision: 26383 Log: Apache Spark 2.3.1-SNAPSHOT-2018_04_17_14_01-6b99d5b docs [This commit notification would consist of 1443 parts, which exceeds the limit of 50 parts, so it was shortened to a summary.]

spark git commit: [SPARK-23948] Trigger mapstage's job listener in submitMissingTasks

2018-04-17 Thread irashid
Repository: spark Updated Branches: refs/heads/branch-2.3 564019b92 -> 6b99d5bc3 [SPARK-23948] Trigger mapstage's job listener in submitMissingTasks ## What changes were proposed in this pull request? SparkContext submitted a map stage from `submitMapStage` to `DAGScheduler`,

svn commit: r26380 - in /dev/spark/2.4.0-SNAPSHOT-2018_04_17_12_04-1ca3c50-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s

2018-04-17 Thread pwendell
Author: pwendell Date: Tue Apr 17 19:18:37 2018 New Revision: 26380 Log: Apache Spark 2.4.0-SNAPSHOT-2018_04_17_12_04-1ca3c50 docs [This commit notification would consist of 1457 parts, which exceeds the limit of 50 parts, so it was shortened to a summary.]

spark-website git commit: Sync ASF git repo and Github

2018-04-17 Thread sameerag
Repository: spark-website Updated Branches: refs/heads/asf-site f050f7e3d -> 0f049fd2e Sync ASF git repo and Github Project: http://git-wip-us.apache.org/repos/asf/spark-website/repo Commit: http://git-wip-us.apache.org/repos/asf/spark-website/commit/0f049fd2 Tree:

svn commit: r26378 - in /dev/spark/2.3.1-SNAPSHOT-2018_04_17_10_01-564019b-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s

2018-04-17 Thread pwendell
Author: pwendell Date: Tue Apr 17 17:17:05 2018 New Revision: 26378 Log: Apache Spark 2.3.1-SNAPSHOT-2018_04_17_10_01-564019b docs [This commit notification would consist of 1443 parts, which exceeds the limit of 50 parts, so it was shortened to a summary.]

spark git commit: [SPARK-21741][ML][PYSPARK] Python API for DataFrame-based multivariate summarizer

2018-04-17 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master f39e82ce1 -> 1ca3c50fe [SPARK-21741][ML][PYSPARK] Python API for DataFrame-based multivariate summarizer ## What changes were proposed in this pull request? Python API for DataFrame-based multivariate summarizer. ## How was this patch
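The core of a multivariate summarizer is computing per-column statistics over rows of vectors. A minimal pure-Python sketch of that computation (illustrative only; not the actual PySpark ML API):

```python
def summarize(rows):
    """Per-column count, mean, min, and max over a list of numeric rows."""
    cols = list(zip(*rows))  # transpose rows into columns
    return [
        {"count": len(c), "mean": sum(c) / len(c), "min": min(c), "max": max(c)}
        for c in cols
    ]

summarize([[1.0, 10.0], [3.0, 30.0]])
# column 0 has mean 2.0; column 1 has mean 20.0
```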

spark git commit: [SPARK-23986][SQL] freshName can generate non-unique names

2018-04-17 Thread wenchen
Repository: spark Updated Branches: refs/heads/branch-2.3 9857e249c -> 564019b92 [SPARK-23986][SQL] freshName can generate non-unique names ## What changes were proposed in this pull request? We are using `CodegenContext.freshName` to get a unique name for any new variable we are adding.

spark git commit: [SPARK-23986][SQL] freshName can generate non-unique names

2018-04-17 Thread wenchen
Repository: spark Updated Branches: refs/heads/master 3990daaf3 -> f39e82ce1 [SPARK-23986][SQL] freshName can generate non-unique names ## What changes were proposed in this pull request? We are using `CodegenContext.freshName` to get a unique name for any new variable we are adding.
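The underlying idea: a fresh-name generator must always disambiguate, because a naive scheme that only suffixes on the second request can collide when a requested prefix happens to match an earlier generated name. A sketch of a per-prefix counter that unconditionally appends an id (names are illustrative, not Spark's internals):

```python
from collections import defaultdict

class CodegenContext:
    """Sketch of a fresh-name generator keyed by prefix."""
    def __init__(self):
        self._ids = defaultdict(int)

    def fresh_name(self, prefix):
        # Always append the counter, so two calls with the same prefix
        # (or a prefix that equals an earlier result) can never collide.
        n = self._ids[prefix]
        self._ids[prefix] += 1
        return f"{prefix}_{n}"

ctx = CodegenContext()
ctx.fresh_name("value")  # "value_0"
ctx.fresh_name("value")  # "value_1"
```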

svn commit: r26377 - in /dev/spark/2.4.0-SNAPSHOT-2018_04_17_08_01-3990daa-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s

2018-04-17 Thread pwendell
Author: pwendell Date: Tue Apr 17 15:19:23 2018 New Revision: 26377 Log: Apache Spark 2.4.0-SNAPSHOT-2018_04_17_08_01-3990daa docs [This commit notification would consist of 1457 parts, which exceeds the limit of 50 parts, so it was shortened to a summary.]

spark git commit: [SPARK-23948] Trigger mapstage's job listener in submitMissingTasks

2018-04-17 Thread irashid
Repository: spark Updated Branches: refs/heads/master ed4101d29 -> 3990daaf3 [SPARK-23948] Trigger mapstage's job listener in submitMissingTasks ## What changes were proposed in this pull request? SparkContext submitted a map stage from `submitMapStage` to `DAGScheduler`,

spark git commit: [SPARK-22676] Avoid iterating all partition paths when spark.sql.hive.verifyPartitionPath=true

2018-04-17 Thread wenchen
Repository: spark Updated Branches: refs/heads/master 0a9172a05 -> ed4101d29 [SPARK-22676] Avoid iterating all partition paths when spark.sql.hive.verifyPartitionPath=true ## What changes were proposed in this pull request? In the current code, it scans all partition paths when

spark git commit: [SPARK-23835][SQL] Add not-null check to Tuples' arguments deserialization

2018-04-17 Thread wenchen
Repository: spark Updated Branches: refs/heads/master 30ffb53ca -> 0a9172a05 [SPARK-23835][SQL] Add not-null check to Tuples' arguments deserialization ## What changes were proposed in this pull request? There was no check on nullability for arguments of `Tuple`s. This could lead to
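The shape of such a check is simple: fail fast with a descriptive error when a null appears where a non-nullable tuple argument is expected, instead of letting the null propagate. An illustrative sketch (not Spark's actual encoder code):

```python
def deserialize_tuple(values):
    """Reject nulls in tuple arguments during deserialization."""
    for i, v in enumerate(values):
        if v is None:
            raise ValueError(f"Null value appeared in non-nullable tuple field {i}")
    return tuple(values)

deserialize_tuple([1, "a"])  # (1, 'a')
# deserialize_tuple([1, None]) raises ValueError
```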

spark git commit: [SPARK-23875][SQL] Add IndexedSeq wrapper for ArrayData

2018-04-17 Thread hvanhovell
Repository: spark Updated Branches: refs/heads/master 05ae74778 -> 30ffb53ca [SPARK-23875][SQL] Add IndexedSeq wrapper for ArrayData ## What changes were proposed in this pull request? We don't have a good way to sequentially access `UnsafeArrayData` with a common interface such as `Seq`.
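The pattern at work, sketched in Python's closest analog: implementing just indexed access and length on top of a raw backing store yields the whole sequential interface (iteration, membership, `index()`) for free, much as an `IndexedSeq` wrapper exposes `ArrayData` through `Seq`. The class name below is illustrative:

```python
from collections.abc import Sequence

class ArrayDataSeq(Sequence):
    """Indexed, Seq-like view over raw array storage."""
    def __init__(self, backing):
        self._backing = backing

    def __len__(self):
        return len(self._backing)

    def __getitem__(self, i):
        return self._backing[i]

seq = ArrayDataSeq([3, 1, 2])
list(seq)  # [3, 1, 2]
2 in seq   # True
```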

svn commit: r26374 - in /dev/spark/2.4.0-SNAPSHOT-2018_04_17_04_04-1cc66a0-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s

2018-04-17 Thread pwendell
Author: pwendell Date: Tue Apr 17 11:22:26 2018 New Revision: 26374 Log: Apache Spark 2.4.0-SNAPSHOT-2018_04_17_04_04-1cc66a0 docs [This commit notification would consist of 1457 parts, which exceeds the limit of 50 parts, so it was shortened to a summary.]

spark git commit: [SPARK-23747][STRUCTURED STREAMING] Add EpochCoordinator unit tests

2018-04-17 Thread tdas
Repository: spark Updated Branches: refs/heads/master 1cc66a072 -> 05ae74778 [SPARK-23747][STRUCTURED STREAMING] Add EpochCoordinator unit tests ## What changes were proposed in this pull request? Unit tests for EpochCoordinator that test correct sequencing of committed epochs. Several

spark git commit: [SPARK-23687][SS] Add a memory source for continuous processing.

2018-04-17 Thread tdas
Repository: spark Updated Branches: refs/heads/master 14844a62c -> 1cc66a072 [SPARK-23687][SS] Add a memory source for continuous processing. ## What changes were proposed in this pull request? Add a memory source for continuous processing. Note that only one of the ContinuousSuite tests is

spark git commit: [SPARK-23918][SQL] Add array_min function

2018-04-17 Thread ueshin
Repository: spark Updated Branches: refs/heads/master fd990a908 -> 14844a62c [SPARK-23918][SQL] Add array_min function ## What changes were proposed in this pull request? The PR adds the SQL function `array_min`. It takes an array as argument and returns the minimum value in it. ## How was
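Its semantics can be sketched in plain Python: return the smallest element of the array, with `None` for a null input. This sketch assumes null elements are skipped; check Spark's function reference for the exact null and NaN handling:

```python
def array_min(arr):
    """Illustrative array_min: minimum of the non-null elements."""
    if arr is None:
        return None
    present = [x for x in arr if x is not None]
    return min(present) if present else None

array_min([5, 1, 3])  # 1
array_min([None, 2])  # 2
array_min([])         # None
```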