spark git commit: [SPARK-13503][SQL] Support to specify the (writing) option for compression codec for TEXT

2016-02-25 Thread rxin
Repository: spark Updated Branches: refs/heads/master 26ac60806 -> 9812a24aa [SPARK-13503][SQL] Support to specify the (writing) option for compression codec for TEXT ## What changes were proposed in this pull request? https://issues.apache.org/jira/browse/SPARK-13503 This PR makes the TEXT
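A rough sketch of the new write option (Spark 2.x spark-shell; the codec name and output path are placeholder values, not taken from the patch):

```scala
// Minimal sketch: choose the compression codec for the text data source per write
// via option("compression", ...). "gzip" and the path are illustrative.
val df = spark.range(5).selectExpr("cast(id as string) as value")
df.write
  .option("compression", "gzip")
  .text("/tmp/text-gzip-example")
```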

spark git commit: [SPARK-13487][SQL] User-facing RuntimeConfig interface

2016-02-25 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 8afe49141 -> 26ac60806 [SPARK-13487][SQL] User-facing RuntimeConfig interface ## What changes were proposed in this pull request? This patch creates the public API for runtime configuration and an implementation for it. The public runtime
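In released Spark 2.x this surface is reachable as `SparkSession.conf`; a minimal sketch of the get/set style of a runtime configuration interface (the key and value here are chosen purely for illustration):

```scala
// Sketch (spark-shell): runtime configuration is read and written through a small
// get/set interface instead of mutating the internal SQLConf directly.
spark.conf.set("spark.sql.shuffle.partitions", "8")            // illustrative key/value
val partitions = spark.conf.get("spark.sql.shuffle.partitions")
println(partitions)                                            // prints "8"
```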

spark git commit: [SPARK-12941][SQL][MASTER] Spark-SQL JDBC Oracle dialect fails to map string datatypes to Oracle VARCHAR datatype

2016-02-25 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 50e60e36f -> 8afe49141 [SPARK-12941][SQL][MASTER] Spark-SQL JDBC Oracle dialect fails to map string datatypes to Oracle VARCHAR datatype ## What changes were proposed in this pull request? This Pull request is used for the fix
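The general shape of such a fix in the JdbcDialect API is sketched below; this is an illustrative dialect, not the exact code in the patch, and the VARCHAR2 length is a placeholder:

```scala
import java.sql.Types
import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcType}
import org.apache.spark.sql.types.{DataType, StringType}

// Illustrative only: map Catalyst StringType to an Oracle VARCHAR2 column type
// so that string columns are written with a type Oracle accepts.
object OracleStringDialectSketch extends JdbcDialect {
  override def canHandle(url: String): Boolean = url.startsWith("jdbc:oracle")
  override def getJDBCType(dt: DataType): Option[JdbcType] = dt match {
    case StringType => Some(JdbcType("VARCHAR2(255)", Types.VARCHAR))
    case _          => None
  }
}
```

A dialect like this can be registered with `JdbcDialects.registerDialect`.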

spark git commit: [SPARK-13504] [SPARKR] Add approxQuantile for SparkR

2016-02-25 Thread meng
Repository: spark Updated Branches: refs/heads/master f3be369ef -> 50e60e36f [SPARK-13504] [SPARKR] Add approxQuantile for SparkR ## What changes were proposed in this pull request? Add ```approxQuantile``` for SparkR. ## How was this patch tested? unit tests Author: Yanbo Liang
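The SparkR function wraps the JVM-side DataFrame stat function of the same name; a sketch of that underlying call in Scala (column name, probability, and error bound are illustrative):

```scala
// Sketch (spark-shell): approximate median of a numeric column with a 1%
// relative error bound; "value" is an illustrative column name.
val df = spark.range(1000).selectExpr("cast(id as double) as value")
val Array(median) = df.stat.approxQuantile("value", Array(0.5), 0.01)
println(median)
```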

spark git commit: [SPARK-12363] [MLLIB] [BACKPORT-1.3] Remove setRun and fix PowerIterationClustering failed test

2016-02-25 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.3 6ddde8eda -> 65cc451c8 [SPARK-12363] [MLLIB] [BACKPORT-1.3] Remove setRun and fix PowerIterationClustering failed test ## What changes were proposed in this pull request? Backport JIRA-SPARK-12363 to branch-1.3. ## How was this

spark git commit: [SPARK-13028] [ML] Add MaxAbsScaler to ML.feature as a transformer

2016-02-25 Thread meng
Repository: spark Updated Branches: refs/heads/master 1b39fafa7 -> 90d07154c [SPARK-13028] [ML] Add MaxAbsScaler to ML.feature as a transformer jira: https://issues.apache.org/jira/browse/SPARK-13028 MaxAbsScaler works in a very similar way to MinMaxScaler, but scales in a way that the
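Typical usage as an ML transformer looks roughly like the sketch below (Spark 2.x spark-shell; the toy input data is illustrative):

```scala
import org.apache.spark.ml.feature.MaxAbsScaler
import org.apache.spark.ml.linalg.Vectors

// Sketch: rescale each feature to [-1, 1] by dividing by its maximum absolute value.
val data = spark.createDataFrame(Seq(
  (0, Vectors.dense(1.0, -8.0)),
  (1, Vectors.dense(2.0,  4.0))
)).toDF("id", "features")

val scaler = new MaxAbsScaler()
  .setInputCol("features")
  .setOutputCol("scaledFeatures")

scaler.fit(data).transform(data).show(false)
```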

spark git commit: [SPARK-13361][SQL] Add benchmark codes for Encoder#compress() in CompressionSchemeBenchmark

2016-02-25 Thread rxin
Repository: spark Updated Branches: refs/heads/master 633d63a48 -> 1b39fafa7 [SPARK-13361][SQL] Add benchmark codes for Encoder#compress() in CompressionSchemeBenchmark This PR added benchmark code for Encoder#compress(). Also, it replaced the benchmark results with new ones because the

[2/2] spark git commit: [SPARK-12757] Add block-level read/write locks to BlockManager

2016-02-25 Thread andrewor14
[SPARK-12757] Add block-level read/write locks to BlockManager ## Motivation As a pre-requisite to off-heap caching of blocks, we need a mechanism to prevent pages / blocks from being evicted while they are being read. With on-heap objects, evicting a block while it is being read merely leads
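The core idea — readers pin a block so that eviction (an exclusive writer) must wait until no reader holds it — is sketched below as a generic reference-counted lock in plain Scala; this is illustrative only, not the actual implementation added by this patch:

```scala
// Illustrative only: a per-block lock where any number of readers may pin the
// block and the exclusive writer (e.g. eviction) waits until all pins are released.
class BlockLockSketch {
  private var readers = 0
  private var writerActive = false

  def acquireRead(): Unit = synchronized {
    while (writerActive) wait()
    readers += 1
  }

  def releaseRead(): Unit = synchronized {
    readers -= 1
    if (readers == 0) notifyAll()
  }

  def acquireWrite(): Unit = synchronized {
    while (writerActive || readers > 0) wait()
    writerActive = true
  }

  def releaseWrite(): Unit = synchronized {
    writerActive = false
    notifyAll()
  }
}
```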

[1/2] spark git commit: [SPARK-12757] Add block-level read/write locks to BlockManager

2016-02-25 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 712995757 -> 633d63a48 http://git-wip-us.apache.org/repos/asf/spark/blob/633d63a4/core/src/test/scala/org/apache/spark/storage/BlockManagerSuite.scala -- diff --git

spark git commit: [SPARK-13387][MESOS] Add support for SPARK_DAEMON_JAVA_OPTS with MesosClusterDispatcher.

2016-02-25 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master f2cfafdfe -> 712995757 [SPARK-13387][MESOS] Add support for SPARK_DAEMON_JAVA_OPTS with MesosClusterDispatcher. ## What changes were proposed in this pull request? Add support for SPARK_DAEMON_JAVA_OPTS with MesosClusterDispatcher. ##

spark git commit: [SPARK-13501] Remove use of Guava Stopwatch

2016-02-25 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 7a6ee8a8f -> f2cfafdfe [SPARK-13501] Remove use of Guava Stopwatch Our nightly doc snapshot builds are failing due to some issue involving the Guava Stopwatch constructor: ``` [error]
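The error block in the preview is cut off. One common way to drop the Guava Stopwatch dependency is to time with System.nanoTime directly; the sketch below is illustrative and not necessarily the exact replacement made in this patch:

```scala
// Illustrative only: measure elapsed time without Guava's Stopwatch.
def timed[T](body: => T): (T, Long) = {
  val startNs = System.nanoTime()
  val result  = body
  (result, (System.nanoTime() - startNs) / 1000000L)   // elapsed milliseconds
}

val (_, elapsedMs) = timed { Thread.sleep(10) }
println(s"took $elapsedMs ms")
```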

spark git commit: [SPARK-12009][YARN] Avoid to re-allocating yarn container while driver want to stop all Executors

2016-02-25 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master dc6c5ea4c -> 7a6ee8a8f [SPARK-12009][YARN] Avoid to re-allocating yarn container while driver want to stop all Executors Author: hushan Closes #9992 from suyanNone/tricky. Project:

spark git commit: [SPARK-13468][WEB UI] Fix a corner case where the Stage UI page should show DAG but it doesn't show

2016-02-25 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 35316cb0b -> dc6c5ea4c [SPARK-13468][WEB UI] Fix a corner case where the Stage UI page should show DAG but it doesn't show When users click more than once on any stage in the DAG graph on the *Job* web UI page, many new *Stage* web

spark git commit: [SPARK-13292] [ML] [PYTHON] QuantileDiscretizer should take random seed in PySpark

2016-02-25 Thread meng
Repository: spark Updated Branches: refs/heads/master 14e2700de -> 35316cb0b [SPARK-13292] [ML] [PYTHON] QuantileDiscretizer should take random seed in PySpark ## What changes were proposed in this pull request? QuantileDiscretizer in Python should also specify a random seed. ## How was

spark git commit: [SPARK-12874][ML] ML StringIndexer does not protect itself from column name duplication

2016-02-25 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.6 d59a08f7c -> abe8f991a [SPARK-12874][ML] ML StringIndexer does not protect itself from column name duplication ## What changes were proposed in this pull request? ML StringIndexer does not protect itself from column name duplication.

spark git commit: [SPARK-12874][ML] ML StringIndexer does not protect itself from column name duplication

2016-02-25 Thread meng
Repository: spark Updated Branches: refs/heads/master fb8bb0476 -> 14e2700de [SPARK-12874][ML] ML StringIndexer does not protect itself from column name duplication ## What changes were proposed in this pull request? ML StringIndexer does not protect itself from column name duplication. We

spark git commit: [SPARK-13069][STREAMING] Add "ask" style store() to ActorReceiver

2016-02-25 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 751724b13 -> fb8bb0476 [SPARK-13069][STREAMING] Add "ask" style store() to ActorReceiver Introduces an "ask"-style ```store``` in ```ActorReceiver``` as a way to allow the actor receiver to be blocked by back pressure or maxRate. Author: Lin Zhao

spark git commit: [SPARK-13464][STREAMING][PYSPARK] Fix failed streaming in pyspark in branch 1.3

2016-02-25 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.3 387d81891 -> 6ddde8eda [SPARK-13464][STREAMING][PYSPARK] Fix failed streaming in pyspark in branch 1.3 JIRA: https://issues.apache.org/jira/browse/SPARK-13464 ## What changes were proposed in this pull request? During backport a

spark git commit: Revert "[SPARK-13444][MLLIB] QuantileDiscretizer chooses bad splits on large DataFrames"

2016-02-25 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.6 5f7440b25 -> d59a08f7c Revert "[SPARK-13444][MLLIB] QuantileDiscretizer chooses bad splits on large DataFrames" This reverts commit cb869a143d338985c3d99ef388dd78b1e3d90a73. Project: http://git-wip-us.apache.org/repos/asf/spark/repo

spark git commit: Revert "[SPARK-13457][SQL] Removes DataFrame RDD operations"

2016-02-25 Thread davies
Repository: spark Updated Branches: refs/heads/master 46f6e7931 -> 751724b13 Revert "[SPARK-13457][SQL] Removes DataFrame RDD operations" This reverts commit 157fe64f3ecbd13b7286560286e50235eecfe30e. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: Revert "[SPARK-13117][WEB UI] WebUI should use the local ip not 0.0.0.0"

2016-02-25 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 5fcf4c2bf -> 46f6e7931 Revert "[SPARK-13117][WEB UI] WebUI should use the local ip not 0.0.0.0" This reverts commit 2e44031fafdb8cf486573b98e4faa6b31ffb90a4. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: [SPARK-12316] Wait a minutes to avoid cycle calling.

2016-02-25 Thread tgraves
Repository: spark Updated Branches: refs/heads/branch-1.6 e3802a752 -> 5f7440b25 [SPARK-12316] Wait a minutes to avoid cycle calling. When the application ends, the AM will clean the staging dir. But if the driver triggers a delegation token update, it can't find the right token file and

spark git commit: [SPARK-12316] Wait a minutes to avoid cycle calling.

2016-02-25 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 157fe64f3 -> 5fcf4c2bf [SPARK-12316] Wait a minutes to avoid cycle calling. When the application ends, the AM will clean the staging dir. But if the driver triggers a delegation token update, it can't find the right token file and then it

spark git commit: [SPARK-13457][SQL] Removes DataFrame RDD operations

2016-02-25 Thread lian
Repository: spark Updated Branches: refs/heads/master 4460113d4 -> 157fe64f3 [SPARK-13457][SQL] Removes DataFrame RDD operations ## What changes were proposed in this pull request? This PR removes DataFrame RDD operations. Original calls are now replaced by calls to methods of

spark git commit: [SPARK-13490][ML] ML LinearRegression should cache standardization param value

2016-02-25 Thread srowen
Repository: spark Updated Branches: refs/heads/master c98a93ded -> 4460113d4 [SPARK-13490][ML] ML LinearRegression should cache standardization param value ## What changes were proposed in this pull request? Like #11027 for ```LogisticRegression```, ```LinearRegression``` with L1

spark git commit: [SPARK-13439][MESOS] Document that spark.mesos.uris is comma-separated

2016-02-25 Thread srowen
Repository: spark Updated Branches: refs/heads/branch-1.6 1f031635f -> e3802a752 [SPARK-13439][MESOS] Document that spark.mesos.uris is comma-separated Author: Michael Gummelt Closes #11311 from mgummelt/document_csv. (cherry picked from commit

spark git commit: [SPARK-13439][MESOS] Document that spark.mesos.uris is comma-separated

2016-02-25 Thread srowen
Repository: spark Updated Branches: refs/heads/master fae88af18 -> c98a93ded [SPARK-13439][MESOS] Document that spark.mesos.uris is comma-separated Author: Michael Gummelt Closes #11311 from mgummelt/document_csv. Project:
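For example, several URIs go into one comma-separated value (a sketch with placeholder URIs):

```scala
import org.apache.spark.SparkConf

// Illustrative only: spark.mesos.uris takes a single comma-separated list of URIs
// to be downloaded into the sandbox; the URIs below are placeholders.
val conf = new SparkConf()
  .set("spark.mesos.uris", "http://example.com/app.conf,http://example.com/extra.jar")
```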

spark git commit: [SPARK-13441][YARN] Fix NPE in yarn Client.createConfArchive method

2016-02-25 Thread srowen
Repository: spark Updated Branches: refs/heads/branch-1.6 cb869a143 -> 1f031635f [SPARK-13441][YARN] Fix NPE in yarn Client.createConfArchive method ## What changes were proposed in this pull request? Instead of using result of File.listFiles() directly, which may throw NPE, check for null

spark git commit: [SPARK-13441][YARN] Fix NPE in yarn Client.createConfArchive method

2016-02-25 Thread srowen
Repository: spark Updated Branches: refs/heads/master 6f8e835c6 -> fae88af18 [SPARK-13441][YARN] Fix NPE in yarn Client.createConfArchive method ## What changes were proposed in this pull request? Instead of using result of File.listFiles() directly, which may throw NPE, check for null
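The general defensive pattern is to treat a null result from File.listFiles() as an empty listing; a small illustrative sketch (the directory path is a placeholder):

```scala
import java.io.File

// Illustrative only: File.listFiles() returns null when the directory does not
// exist or cannot be read, so guard before iterating instead of risking an NPE.
val confDir = new File("/etc/hadoop/conf")   // placeholder path
val files   = Option(confDir.listFiles()).getOrElse(Array.empty[File])
files.filter(_.isFile).foreach(f => println(f.getName))
```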

spark git commit: [SPARK-13444][MLLIB] QuantileDiscretizer chooses bad splits on large DataFrames

2016-02-25 Thread srowen
Repository: spark Updated Branches: refs/heads/branch-1.6 3cc938ac8 -> cb869a143 [SPARK-13444][MLLIB] QuantileDiscretizer chooses bad splits on large DataFrames Change line 113 of QuantileDiscretizer.scala to `val requiredSamples = math.max(numBins * numBins, 1.0)` so that

spark git commit: [SPARK-13444][MLLIB] QuantileDiscretizer chooses bad splits on large DataFrames

2016-02-25 Thread srowen
Repository: spark Updated Branches: refs/heads/master 3fa6491be -> 6f8e835c6 [SPARK-13444][MLLIB] QuantileDiscretizer chooses bad splits on large DataFrames ## What changes were proposed in this pull request? Change line 113 of QuantileDiscretizer.scala to `val requiredSamples =

spark git commit: [SPARK-13473][SQL] Don't push predicate through project with nondeterministic field(s)

2016-02-25 Thread wenchen
Repository: spark Updated Branches: refs/heads/branch-1.5 d2c1c67cf -> 0e920411f [SPARK-13473][SQL] Don't push predicate through project with nondeterministic field(s) ## What changes were proposed in this pull request? Predicates shouldn't be pushed through project with nondeterministic

spark git commit: [SPARK-13473][SQL] Don't push predicate through project with nondeterministic field(s)

2016-02-25 Thread wenchen
Repository: spark Updated Branches: refs/heads/branch-1.6 897599601 -> 3cc938ac8 [SPARK-13473][SQL] Don't push predicate through project with nondeterministic field(s) ## What changes were proposed in this pull request? Predicates shouldn't be pushed through project with nondeterministic

spark git commit: [SPARK-13473][SQL] Don't push predicate through project with nondeterministic field(s)

2016-02-25 Thread wenchen
Repository: spark Updated Branches: refs/heads/master 2e44031fa -> 3fa6491be [SPARK-13473][SQL] Don't push predicate through project with nondeterministic field(s) ## What changes were proposed in this pull request? Predicates shouldn't be pushed through project with nondeterministic
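A short illustration of why (Spark 2.x DataFrame API, hypothetical query): if the filter below were pushed under the project, rand() would be evaluated again for the filter condition and a different set of rows would be kept.

```scala
import org.apache.spark.sql.functions._

// Illustrative only: "r" is nondeterministic, so the filter has to stay above the
// project; pushing it through would substitute rand(7) into the condition and
// evaluate it a second time per row.
val projected = spark.range(100).select(col("id"), rand(7).as("r"))
val filtered  = projected.filter(col("r") > 0.5)
filtered.explain()
```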

spark git commit: [SPARK-13117][WEB UI] WebUI should use the local ip not 0.0.0.0

2016-02-25 Thread srowen
Repository: spark Updated Branches: refs/heads/master 2b2c8c332 -> 2e44031fa [SPARK-13117][WEB UI] WebUI should use the local ip not 0.0.0.0 Fixed the HTTP server host name/IP issue, i.e. the HTTP server now takes the configured host name/IP instead of always using '0.0.0.0'. Author: Devaraj K

[3/3] spark git commit: [SPARK-13486][SQL] Move SQLConf into an internal package

2016-02-25 Thread lian
[SPARK-13486][SQL] Move SQLConf into an internal package ## What changes were proposed in this pull request? This patch moves SQLConf into org.apache.spark.sql.internal package to make it very explicit that it is internal. Soon I will also submit more API work that creates implementations of

[1/3] spark git commit: [SPARK-13486][SQL] Move SQLConf into an internal package

2016-02-25 Thread lian
Repository: spark Updated Branches: refs/heads/master 07f92ef1f -> 2b2c8c332 http://git-wip-us.apache.org/repos/asf/spark/blob/2b2c8c33/sql/core/src/test/scala/org/apache/spark/sql/test/TestSQLContext.scala -- diff --git

spark git commit: [SPARK-13376] [SPARK-13476] [SQL] improve column pruning

2016-02-25 Thread davies
Repository: spark Updated Branches: refs/heads/master 264533b55 -> 07f92ef1f [SPARK-13376] [SPARK-13476] [SQL] improve column pruning ## What changes were proposed in this pull request? This PR mostly rewrites the ColumnPruning rule to support most of the SQL logical plans (except those for
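As a rough illustration of what column pruning buys (Spark 2.x API, placeholder data path): only the columns a query actually needs should be carried up from the scan, which the optimized plan makes visible.

```scala
// Illustrative only: with column pruning, the optimized plan projects just "name"
// even if the source table has many more columns.
val people = spark.read.parquet("/data/people")   // placeholder path
people.groupBy("name").count().explain(true)
```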