spark git commit: [SPARK-8074] Parquet should throw AnalysisException during setup for data type/name related failures.

2015-06-03 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.4 f67a27d02 - 1f90a06bd [SPARK-8074] Parquet should throw AnalysisException during setup for data type/name related failures. Author: Reynold Xin r...@databricks.com Closes #6608 from rxin/parquet-analysis and squashes the following

spark git commit: [SPARK-8074] Parquet should throw AnalysisException during setup for data type/name related failures.

2015-06-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master 708c63bbb - 939e4f3d8 [SPARK-8074] Parquet should throw AnalysisException during setup for data type/name related failures. Author: Reynold Xin r...@databricks.com Closes #6608 from rxin/parquet-analysis and squashes the following

spark git commit: [SPARK-8063] [SPARKR] Spark master URL conflict between MASTER env variable and --master command line option.

2015-06-03 Thread shivaram
Repository: spark Updated Branches: refs/heads/master d2a86eb8f - 708c63bbb [SPARK-8063] [SPARKR] Spark master URL conflict between MASTER env variable and --master command line option. Author: Sun Rui rui@intel.com Closes #6605 from sun-rui/SPARK-8063 and squashes the following

spark git commit: [SPARK-8063] [SPARKR] Spark master URL conflict between MASTER env variable and --master command line option.

2015-06-03 Thread shivaram
Repository: spark Updated Branches: refs/heads/branch-1.4 0a1dad6cd - f67a27d02 [SPARK-8063] [SPARKR] Spark master URL conflict between MASTER env variable and --master command line option. Author: Sun Rui rui@intel.com Closes #6605 from sun-rui/SPARK-8063 and squashes the following

spark git commit: [MINOR] [UI] Improve confusing message on log page

2015-06-03 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-1.4 bfab61f39 - 31e0ae9e1 [MINOR] [UI] Improve confusing message on log page It's good practice to check if the input path is in the directory we expect to avoid potentially confusing error messages. Project:
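The check this commit message describes (is the requested path inside the expected directory?) can be sketched as follows. This is a minimal illustration, not the code from the patch; `is_within_directory` is a hypothetical helper name.

```python
import os

def is_within_directory(base_dir, candidate):
    """Return True if `candidate` resolves to a location inside `base_dir`.

    Hypothetical helper for illustration; not taken from the Spark patch.
    """
    base = os.path.realpath(base_dir)
    # Joining with an absolute candidate discards `base`, so absolute
    # paths outside the directory are rejected as well.
    target = os.path.realpath(os.path.join(base, candidate))
    try:
        return os.path.commonpath([base, target]) == base
    except ValueError:
        # Raised, for example, when the paths are on different drives.
        return False
```

A caller could run this check first and return a clear "path not allowed" message, rather than letting a later file operation fail with a confusing error.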

spark git commit: [MINOR] [UI] Improve confusing message on log page

2015-06-03 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-1.3 e5747ee3a - 744599609 [MINOR] [UI] Improve confusing message on log page It's good practice to check if the input path is in the directory we expect to avoid potentially confusing error messages. Project:

spark git commit: [SPARK-8001] [CORE] Make AsynchronousListenerBus.waitUntilEmpty throw TimeoutException if timeout

2015-06-03 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master aa40c4420 - 1d8669f15 [SPARK-8001] [CORE] Make AsynchronousListenerBus.waitUntilEmpty throw TimeoutException if timeout Some places forget to call `assert` to check the return value of `AsynchronousListenerBus.waitUntilEmpty`. Instead of
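The idea in this commit message, raising on timeout so that callers cannot forget to check a boolean result, can be sketched in Python as follows. The function name and the `queue_is_empty` callback are assumptions made for illustration; this is not the Scala implementation.

```python
import time

def wait_until_empty(queue_is_empty, timeout_seconds, poll_interval=0.01):
    """Block until `queue_is_empty()` returns True, or raise TimeoutError.

    Hypothetical sketch: raising makes a timeout impossible to ignore,
    unlike a boolean return value that the caller must remember to assert on.
    """
    deadline = time.monotonic() + timeout_seconds
    while not queue_is_empty():
        if time.monotonic() > deadline:
            raise TimeoutError(
                "queue not empty after %.3f seconds" % timeout_seconds)
        time.sleep(poll_interval)
```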

spark git commit: [HOTFIX] [TYPO] Fix typo in #6546

2015-06-03 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master d8662cd90 - bfbdab12d [HOTFIX] [TYPO] Fix typo in #6546 Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/bfbdab12 Tree:

spark git commit: [SPARK-8088] don't attempt to lower number of executors by 0

2015-06-03 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 566cb5947 - 51898b515 [SPARK-8088] don't attempt to lower number of executors by 0 Author: Ryan Williams ryan.blake.willi...@gmail.com Closes #6624 from ryan-williams/execs and squashes the following commits: b6f71d4 [Ryan Williams]

spark git commit: [SPARK-8054] [MLLIB] Added several Java-friendly APIs + unit tests

2015-06-03 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master 2c5a06caf - 20a26b595 [SPARK-8054] [MLLIB] Added several Java-friendly APIs + unit tests Java-friendly APIs added: * GaussianMixture.run() * GaussianMixtureModel.predict() * DistributedLDAModel.javaTopicDistributions() * StreamingKMeans:

spark git commit: [SPARK-8059] [YARN] Wake up allocation thread when new requests arrive.

2015-06-03 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master bfbf12b34 - aa40c4420 [SPARK-8059] [YARN] Wake up allocation thread when new requests arrive. This should help reduce latency for new executor allocations. Author: Marcelo Vanzin van...@cloudera.com Closes #6600 from vanzin/SPARK-8059

spark git commit: [HOTFIX] Fix Hadoop-1 build caused by #5792.

2015-06-03 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master f27134782 - a8f1f1543 [HOTFIX] Fix Hadoop-1 build caused by #5792. Replaced `fs.listFiles` with Hadoop-1 friendly `fs.listStatus` method. Author: Hari Shreedharan hshreedha...@apache.org Closes #6619 from

spark git commit: [MINOR] [UI] Improve confusing message on log page

2015-06-03 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 20a26b595 - c6a6dd0d0 [MINOR] [UI] Improve confusing message on log page It's good practice to check if the input path is in the directory we expect to avoid potentially confusing error messages. Project:

spark git commit: [SPARK-8083] [MESOS] Use the correct base path in mesos driver page.

2015-06-03 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master c6a6dd0d0 - bfbf12b34 [SPARK-8083] [MESOS] Use the correct base path in mesos driver page. Author: Timothy Chen tnac...@gmail.com Closes #6615 from tnachen/mesos_driver_path and squashes the following commits: 4f47b7c [Timothy Chen] Use

spark git commit: [HOTFIX] Unbreak build from backporting #6546

2015-06-03 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-1.4 b2a22a651 - d0be9508f [HOTFIX] Unbreak build from backporting #6546 This is caused by 7e46ea0228f142f6b384331d62cec8f86e61c9d1. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: [BUILD] Increase Jenkins test timeout

2015-06-03 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 0576c3c4f - e35cd36e0 [BUILD] Increase Jenkins test timeout Currently hive tests alone take 40m. The right thing to do is to reduce the test time. However, that is a bigger project and we currently have PRs blocking on tests not timing

spark git commit: [BUILD] Increase Jenkins test timeout

2015-06-03 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-1.4 c2c129073 - 96f71b105 [BUILD] Increase Jenkins test timeout Currently hive tests alone take 40m. The right thing to do is to reduce the test time. However, that is a bigger project and we currently have PRs blocking on tests not timing

spark git commit: [SPARK-8054] [MLLIB] Added several Java-friendly APIs + unit tests

2015-06-03 Thread jkbradley
Repository: spark Updated Branches: refs/heads/branch-1.4 1f90a06bd - bfab61f39 [SPARK-8054] [MLLIB] Added several Java-friendly APIs + unit tests Java-friendly APIs added: * GaussianMixture.run() * GaussianMixtureModel.predict() * DistributedLDAModel.javaTopicDistributions() *

spark git commit: [SPARK-8084] [SPARKR] Make SparkR scripts fail on error

2015-06-03 Thread shivaram
Repository: spark Updated Branches: refs/heads/branch-1.4 16748694b - c2c129073 [SPARK-8084] [SPARKR] Make SparkR scripts fail on error cc shaneknapp pwendell JoshRosen Author: Shivaram Venkataraman shiva...@cs.berkeley.edu Closes #6623 from shivaram/SPARK-8084 and squashes the following

spark git commit: [SPARK-8084] [SPARKR] Make SparkR scripts fail on error

2015-06-03 Thread shivaram
Repository: spark Updated Branches: refs/heads/master 51898b515 - 0576c3c4f [SPARK-8084] [SPARKR] Make SparkR scripts fail on error cc shaneknapp pwendell JoshRosen Author: Shivaram Venkataraman shiva...@cs.berkeley.edu Closes #6623 from shivaram/SPARK-8084 and squashes the following

spark git commit: [SPARK-8051] [MLLIB] make StringIndexerModel silent if input column does not exist

2015-06-03 Thread jkbradley
Repository: spark Updated Branches: refs/heads/branch-1.4 ca21fff7d - b2a22a651 [SPARK-8051] [MLLIB] make StringIndexerModel silent if input column does not exist This is just a workaround to a bigger problem. Some pipeline stages may not be effective during prediction, and they should not

spark git commit: [SPARK-8051] [MLLIB] make StringIndexerModel silent if input column does not exist

2015-06-03 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master d3e026f87 - 26c9d7a0f [SPARK-8051] [MLLIB] make StringIndexerModel silent if input column does not exist This is just a workaround to a bigger problem. Some pipeline stages may not be effective during prediction, and they should not

spark git commit: [SPARK-7989] [CORE] [TESTS] Fix flaky tests in ExternalShuffleServiceSuite and SparkListenerWithClusterSuite

2015-06-03 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 1d8669f15 - f27134782 [SPARK-7989] [CORE] [TESTS] Fix flaky tests in ExternalShuffleServiceSuite and SparkListenerWithClusterSuite The flaky tests in ExternalShuffleServiceSuite and SparkListenerWithClusterSuite will fail if there are

spark git commit: [SPARK-3674] [EC2] Clear SPARK_WORKER_INSTANCES when using YARN

2015-06-03 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-1.4 7e46ea022 - ca21fff7d [SPARK-3674] [EC2] Clear SPARK_WORKER_INSTANCES when using YARN cc andrewor14 Author: Shivaram Venkataraman shiva...@cs.berkeley.edu Closes #6424 from shivaram/spark-worker-instances-yarn-ec2 and squashes the

spark git commit: [SPARK-8088] don't attempt to lower number of executors by 0

2015-06-03 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-1.4 0bc9a3ec4 - 16748694b [SPARK-8088] don't attempt to lower number of executors by 0 Author: Ryan Williams ryan.blake.willi...@gmail.com Closes #6624 from ryan-williams/execs and squashes the following commits: b6f71d4 [Ryan Williams]

spark git commit: [HOTFIX] History Server API docs error fix.

2015-06-03 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master bfbdab12d - 566cb5947 [HOTFIX] History Server API docs error fix. Minor error in the monitoring docs. Also made indentation changes in `ApiRootResource` Author: Hari Shreedharan hshreedha...@apache.org Closes #6628 from

spark git commit: [BUILD] Use right branch when checking against Hive

2015-06-03 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master e35cd36e0 - 9cf740f35 [BUILD] Use right branch when checking against Hive Right now we always run hive tests in branch-1.4 PRs because we compare whether the diff against master involves hive changes. Really we should be comparing

spark git commit: [SPARK-7983] [MLLIB] Add require for one-based indices in loadLibSVMFile

2015-06-03 Thread srowen
Repository: spark Updated Branches: refs/heads/master d38cf217e - 28dbde387 [SPARK-7983] [MLLIB] Add require for one-based indices in loadLibSVMFile jira: https://issues.apache.org/jira/browse/SPARK-7983 Customers frequently use zero-based indices in their LIBSVM files. No warnings or
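A one-based index check of the kind named in the commit title might look like the sketch below. `parse_libsvm_line` is a hypothetical helper, and the error message is an assumption; neither comes from the MLlib code.

```python
def parse_libsvm_line(line):
    """Parse 'label index:value ...' into (label, zero-based indices, values).

    Hypothetical sketch: rejects indices below 1 instead of silently
    producing a negative position after the one-based to zero-based shift.
    """
    tokens = line.split()
    label = float(tokens[0])
    indices, values = [], []
    for token in tokens[1:]:
        index_text, value_text = token.split(":")
        index = int(index_text)
        if index < 1:
            raise ValueError("indices should be one-based, found %d" % index)
        indices.append(index - 1)  # convert to zero-based
        values.append(float(value_text))
    return label, indices, values
```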

spark git commit: [SPARK-8032] [PYSPARK] Make version checking for NumPy in MLlib more robust

2015-06-03 Thread meng
Repository: spark Updated Branches: refs/heads/master 43adbd561 - 452eb82dd [SPARK-8032] [PYSPARK] Make version checking for NumPy in MLlib more robust The current check tests whether version `1.x` is less than `1.4`; this fails if x has more than one digit, since x > 4 but `1.x` < `1.4`
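The pitfall behind this fix is that a two-digit minor version such as `1.10` sorts before `1.4` when compared as a string. A minimal sketch of a numeric comparison follows; `parse_version` and `version_at_least` are hypothetical names, not the functions used in the actual patch.

```python
import re

def parse_version(version):
    """Return the leading numeric components of a version string as ints."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component, e.g. "dev"
        parts.append(int(match.group()))
    return tuple(parts)

def version_at_least(version, minimum):
    """Compare versions numerically, component by component."""
    return parse_version(version) >= parse_version(minimum)

# String comparison gets two-digit minor versions wrong:
print("1.10" < "1.4")                    # True, although 1.10 is newer
print(version_at_least("1.10", "1.4"))   # True
print(version_at_least("1.3.0", "1.4"))  # False
```

The sketch compares tuples of integers, so `(1, 10)` correctly ranks above `(1, 4)`.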

spark git commit: [SPARK-8032] [PYSPARK] Make version checking for NumPy in MLlib more robust

2015-06-03 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.3 476b87d31 - bbd377228 [SPARK-8032] [PYSPARK] Make version checking for NumPy in MLlib more robust The current check tests whether version `1.x` is less than `1.4`; this fails if x has more than one digit, since x > 4 but `1.x`

spark git commit: [SPARK-8032] [PYSPARK] Make version checking for NumPy in MLlib more robust

2015-06-03 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.4 33edb2b79 - bd57af387 [SPARK-8032] [PYSPARK] Make version checking for NumPy in MLlib more robust The current check tests whether version `1.x` is less than `1.4`; this fails if x has more than one digit, since x > 4 but `1.x`

spark git commit: [SPARK-7562][SPARK-6444][SQL] Improve error reporting for expression data type mismatch

2015-06-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master ce320cb2d - d38cf217e [SPARK-7562][SPARK-6444][SQL] Improve error reporting for expression data type mismatch It seems hard to find a common pattern of checking types in `Expression`. Sometimes we know what input types we need(like

spark git commit: [SPARK-8060] Improve DataFrame Python test coverage and documentation.

2015-06-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master 452eb82dd - ce320cb2d [SPARK-8060] Improve DataFrame Python test coverage and documentation. Author: Reynold Xin r...@databricks.com Closes #6601 from rxin/python-read-write-test-and-doc and squashes the following commits: baa8ad5

spark git commit: [SPARK-8060] Improve DataFrame Python test coverage and documentation.

2015-06-03 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.4 bd57af387 - ee7f365bd [SPARK-8060] Improve DataFrame Python test coverage and documentation. Author: Reynold Xin r...@databricks.com Closes #6601 from rxin/python-read-write-test-and-doc and squashes the following commits: baa8ad5

spark git commit: [SPARK-8032] [PYSPARK] Make version checking for NumPy in MLlib more robust

2015-06-03 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.1 672f3228c - 36eed2f9e [SPARK-8032] [PYSPARK] Make version checking for NumPy in MLlib more robust The current check tests whether version `1.x` is less than `1.4`; this fails if x has more than one digit, since x > 4 but `1.x`

spark git commit: [SPARK-8032] [PYSPARK] Make version checking for NumPy in MLlib more robust

2015-06-03 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.2 aefb113c8 - 23bf3071f [SPARK-8032] [PYSPARK] Make version checking for NumPy in MLlib more robust The current check tests whether version `1.x` is less than `1.4`; this fails if x has more than one digit, since x > 4 but `1.x`

spark git commit: [SPARK-8032] [PYSPARK] Make version checking for NumPy in MLlib more robust

2015-06-03 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.0 86ad12d44 - fed98b934 [SPARK-8032] [PYSPARK] Make version checking for NumPy in MLlib more robust The current check tests whether version `1.x` is less than `1.4`; this fails if x has more than one digit, since x > 4 but `1.x`

spark git commit: [SPARK-8043] [MLLIB] [DOC] update NaiveBayes and SVM examples in doc

2015-06-03 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.4 88399c34b - 33edb2b79 [SPARK-8043] [MLLIB] [DOC] update NaiveBayes and SVM examples in doc jira: https://issues.apache.org/jira/browse/SPARK-8043 I found some issues during testing the save/load examples in markdown Documents, as a

svn commit: r1683391 - in /spark: examples.md site/examples.html

2015-06-03 Thread srowen
Author: srowen Date: Wed Jun 3 17:14:40 2015 New Revision: 1683391 URL: http://svn.apache.org/r1683391 Log: Fix two Java example typos Modified: spark/examples.md spark/site/examples.html Modified: spark/examples.md URL:

spark git commit: [SPARK-7801] [BUILD] Updating versions to SPARK 1.5.0

2015-06-03 Thread pwendell
Repository: spark Updated Branches: refs/heads/master f1646e102 - 2c4d550ed [SPARK-7801] [BUILD] Updating versions to SPARK 1.5.0 Author: Patrick Wendell patr...@databricks.com Closes #6328 from pwendell/spark-1.5-update and squashes the following commits: 2f42d02 [Patrick Wendell] A few

spark git commit: [SPARK-7973] [SQL] Increase the timeout of two CliSuite tests.

2015-06-03 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.4 ee7f365bd - 54a4ea407 [SPARK-7973] [SQL] Increase the timeout of two CliSuite tests. https://issues.apache.org/jira/browse/SPARK-7973 Author: Yin Huai yh...@databricks.com Closes #6525 from yhuai/SPARK-7973 and squashes the following

spark git commit: [SPARK-7973] [SQL] Increase the timeout of two CliSuite tests.

2015-06-03 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 28dbde387 - f1646e102 [SPARK-7973] [SQL] Increase the timeout of two CliSuite tests. https://issues.apache.org/jira/browse/SPARK-7973 Author: Yin Huai yh...@databricks.com Closes #6525 from yhuai/SPARK-7973 and squashes the following

spark git commit: [BUILD] Use right branch when checking against Hive (1.4)

2015-06-03 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-1.4 96f71b105 - 584a2ba21 [BUILD] Use right branch when checking against Hive (1.4) For branch-1.4. This is identical to #6629 and is strictly not necessary. I'm opening this as a PR since it changes Jenkins test behavior and I want to

[4/6] spark git commit: [SPARK-7558] Demarcate tests in unit-tests.log (1.4)

2015-06-03 Thread andrewor14
http://git-wip-us.apache.org/repos/asf/spark/blob/bfe74b34/core/src/test/scala/org/apache/spark/util/collection/AppendOnlyMapSuite.scala -- diff --git

[6/6] spark git commit: [SPARK-7558] Demarcate tests in unit-tests.log (1.4)

2015-06-03 Thread andrewor14
[SPARK-7558] Demarcate tests in unit-tests.log (1.4) This includes the following commits: original: 9eb222c hotfix1: 8c99793 hotfix2: a4f2412 scalastyle check: 609c492 --- Original patch #6441 Branch-1.3 patch #6602 Author: Andrew Or and...@databricks.com Closes #6598 from

[5/6] spark git commit: [SPARK-7558] Demarcate tests in unit-tests.log (1.4)

2015-06-03 Thread andrewor14
http://git-wip-us.apache.org/repos/asf/spark/blob/bfe74b34/core/src/test/scala/org/apache/spark/network/netty/NettyBlockTransferSecuritySuite.scala -- diff --git

spark git commit: [BUILD] Fix Maven build for Kinesis

2015-06-03 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-1.4 bfe74b34a - 84da65319 [BUILD] Fix Maven build for Kinesis A necessary dependency that is transitively referenced is not provided, causing compilation failures in builds that provide the kinesis-asl profile. Project:

spark git commit: [BUILD] Fix Maven build for Kinesis

2015-06-03 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 9cf740f35 - 984ad6014 [BUILD] Fix Maven build for Kinesis A necessary dependency that is transitively referenced is not provided, causing compilation failures in builds that provide the kinesis-asl profile. Project:

[1/6] spark git commit: [SPARK-7558] Demarcate tests in unit-tests.log (1.4)

2015-06-03 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-1.4 584a2ba21 - bfe74b34a http://git-wip-us.apache.org/repos/asf/spark/blob/bfe74b34/yarn/src/test/scala/org/apache/spark/deploy/yarn/ClientDistributedCacheManagerSuite.scala

[3/6] spark git commit: [SPARK-7558] Demarcate tests in unit-tests.log (1.4)

2015-06-03 Thread andrewor14
http://git-wip-us.apache.org/repos/asf/spark/blob/bfe74b34/mllib/src/test/scala/org/apache/spark/ml/regression/LinearRegressionSuite.scala -- diff --git

spark git commit: [SPARK-7980] [SQL] Support SQLContext.range(end)

2015-06-03 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.4 54a4ea407 - 0a1dad6cd [SPARK-7980] [SQL] Support SQLContext.range(end) 1. range() overloaded in SQLContext.scala 2. range() modified in python sql context.py 3. Tests added accordingly in DataFrameSuite.scala and python sql tests.py

spark git commit: [SPARK-7980] [SQL] Support SQLContext.range(end)

2015-06-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master 2c4d550ed - d053a31be [SPARK-7980] [SQL] Support SQLContext.range(end) 1. range() overloaded in SQLContext.scala 2. range() modified in python sql context.py 3. Tests added accordingly in DataFrameSuite.scala and python sql tests.py

spark git commit: [SPARK-7161] [HISTORY SERVER] Provide REST api to download event logs fro...

2015-06-03 Thread irashid
Repository: spark Updated Branches: refs/heads/master d053a31be - d2a86eb8f [SPARK-7161] [HISTORY SERVER] Provide REST api to download event logs fro... ...m History Server This PR adds a new API that allows the user to download event logs for an application as a zip file. APIs have been

spark git commit: [MINOR] make the launcher project name consistent with others

2015-06-03 Thread pwendell
Repository: spark Updated Branches: refs/heads/master 07c16cb5b - ccaa82329 [MINOR] make the launcher project name consistent with others I found this by chance while building spark and think it is better to keep its name consistent with other sub-projects (Spark Project *). I am not gonna

[1/5] spark git commit: [SPARK-7558] Demarcate tests in unit-tests.log (1.3)

2015-06-03 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-1.3 bbd377228 - e5747ee3a http://git-wip-us.apache.org/repos/asf/spark/blob/e5747ee3/streaming/src/test/scala/org/apache/spark/streaming/UISeleniumSuite.scala -- diff --git

[3/5] spark git commit: [SPARK-7558] Demarcate tests in unit-tests.log (1.3)

2015-06-03 Thread andrewor14
http://git-wip-us.apache.org/repos/asf/spark/blob/e5747ee3/external/flume/src/test/scala/org/apache/spark/streaming/flume/FlumeStreamSuite.scala -- diff --git

[4/5] spark git commit: [SPARK-7558] Demarcate tests in unit-tests.log (1.3)

2015-06-03 Thread andrewor14
http://git-wip-us.apache.org/repos/asf/spark/blob/e5747ee3/core/src/test/scala/org/apache/spark/rdd/DoubleRDDSuite.scala -- diff --git a/core/src/test/scala/org/apache/spark/rdd/DoubleRDDSuite.scala

[2/5] spark git commit: [SPARK-7558] Demarcate tests in unit-tests.log (1.3)

2015-06-03 Thread andrewor14
http://git-wip-us.apache.org/repos/asf/spark/blob/e5747ee3/mllib/src/test/scala/org/apache/spark/mllib/regression/LassoSuite.scala -- diff --git a/mllib/src/test/scala/org/apache/spark/mllib/regression/LassoSuite.scala