svn commit: r13782 - in /release/spark/spark-2.0.0-preview: spark-2.0.0-preview-bin-hadoop2.4.tgz.asc spark-2.0.0-preview-bin-hadoop2.4.tgz.asc.txt

2016-05-24 Thread rxin
Author: rxin Date: Wed May 25 05:55:59 2016 New Revision: 13782 Log: rename asc Added: release/spark/spark-2.0.0-preview/spark-2.0.0-preview-bin-hadoop2.4.tgz.asc - copied unchanged from r13781, release/spark/spark-2.0.0-preview/spark-2.0.0-preview-bin-hadoop2.4.tgz.asc.txt Removed:

svn commit: r13786 - in /release/spark/spark-2.0.0-preview: spark-2.0.0-preview.tgz.asc spark-2.0.0-preview.tgz.asc.txt

2016-05-24 Thread rxin
Author: rxin Date: Wed May 25 05:56:45 2016 New Revision: 13786 Log: rename asc Added: release/spark/spark-2.0.0-preview/spark-2.0.0-preview.tgz.asc - copied unchanged from r13785, release/spark/spark-2.0.0-preview/spark-2.0.0-preview.tgz.asc.txt Removed:

svn commit: r13783 - in /release/spark/spark-2.0.0-preview: spark-2.0.0-preview-bin-hadoop2.6.tgz.asc spark-2.0.0-preview-bin-hadoop2.6.tgz.asc.txt

2016-05-24 Thread rxin
Author: rxin Date: Wed May 25 05:56:13 2016 New Revision: 13783 Log: rename asc Added: release/spark/spark-2.0.0-preview/spark-2.0.0-preview-bin-hadoop2.6.tgz.asc - copied unchanged from r13782, release/spark/spark-2.0.0-preview/spark-2.0.0-preview-bin-hadoop2.6.tgz.asc.txt Removed:

svn commit: r13784 - in /release/spark/spark-2.0.0-preview: spark-2.0.0-preview-bin-hadoop2.7.tgz.asc spark-2.0.0-preview-bin-hadoop2.7.tgz.asc.txt

2016-05-24 Thread rxin
Author: rxin Date: Wed May 25 05:56:23 2016 New Revision: 13784 Log: rename asc Added: release/spark/spark-2.0.0-preview/spark-2.0.0-preview-bin-hadoop2.7.tgz.asc - copied unchanged from r13783, release/spark/spark-2.0.0-preview/spark-2.0.0-preview-bin-hadoop2.7.tgz.asc.txt Removed:

svn commit: r13785 - in /release/spark/spark-2.0.0-preview: spark-2.0.0-preview-bin-without-hadoop.tgz.asc spark-2.0.0-preview-bin-without-hadoop.tgz.asc.txt

2016-05-24 Thread rxin
Author: rxin Date: Wed May 25 05:56:33 2016 New Revision: 13785 Log: rename asc Added: release/spark/spark-2.0.0-preview/spark-2.0.0-preview-bin-without-hadoop.tgz.asc - copied unchanged from r13784, release/spark/spark-2.0.0-preview/spark-2.0.0-preview-bin-without-hadoop.tgz.asc.txt

svn commit: r13780 - in /release/spark/spark-2.0.0-preview: spark-2.0.0-preview-bin-hadoop2.3.tgz.asc spark-2.0.0-preview-bin-hadoop2.3.tgz.asc.txt

2016-05-24 Thread rxin
Author: rxin Date: Wed May 25 05:55:29 2016 New Revision: 13780 Log: (empty) Added: release/spark/spark-2.0.0-preview/spark-2.0.0-preview-bin-hadoop2.3.tgz.asc - copied unchanged from r13779, release/spark/spark-2.0.0-preview/spark-2.0.0-preview-bin-hadoop2.3.tgz.asc.txt Removed:

svn commit: r13781 - in /release/spark/spark-2.0.0-preview: spark-2.0.0-preview-bin-hadoop2.4-without-hive.tgz.asc spark-2.0.0-preview-bin-hadoop2.4-without-hive.tgz.asc.txt

2016-05-24 Thread rxin
Author: rxin Date: Wed May 25 05:55:43 2016 New Revision: 13781 Log: rename asc Added: release/spark/spark-2.0.0-preview/spark-2.0.0-preview-bin-hadoop2.4-without-hive.tgz.asc - copied unchanged from r13780,

spark git commit: [SPARK-12071][DOC] Document the behaviour of NA in R

2016-05-24 Thread shivaram
Repository: spark Updated Branches: refs/heads/branch-2.0 402995e5d -> 1dad1a891 [SPARK-12071][DOC] Document the behaviour of NA in R ## What changes were proposed in this pull request? Under the Upgrading From SparkR 1.5.x to 1.6.x section, added the information that SparkSQL converts `NA` in R to

spark git commit: [SPARK-12071][DOC] Document the behaviour of NA in R

2016-05-24 Thread shivaram
Repository: spark Updated Branches: refs/heads/master cd9f16906 -> 9082b7968 [SPARK-12071][DOC] Document the behaviour of NA in R ## What changes were proposed in this pull request? Under the Upgrading From SparkR 1.5.x to 1.6.x section, added the information that SparkSQL converts `NA` in R to

spark git commit: [SPARK-15412][PYSPARK][SPARKR][DOCS] Improve linear isotonic regression pydoc & doc build instructions

2016-05-24 Thread shivaram
Repository: spark Updated Branches: refs/heads/master c9c1c0e54 -> cd9f16906 [SPARK-15412][PYSPARK][SPARKR][DOCS] Improve linear isotonic regression pydoc & doc build instructions ## What changes were proposed in this pull request? PySpark: Add links to the predictors from the models in

spark git commit: [SPARK-15412][PYSPARK][SPARKR][DOCS] Improve linear isotonic regression pydoc & doc build instructions

2016-05-24 Thread shivaram
Repository: spark Updated Branches: refs/heads/branch-2.0 6f22ba3e1 -> 402995e5d [SPARK-15412][PYSPARK][SPARKR][DOCS] Improve linear isotonic regression pydoc & doc build instructions ## What changes were proposed in this pull request? PySpark: Add links to the predictors from the models in

svn commit: r13779 - /dev/spark/spark-2.0.0-preview/ /release/spark/spark-2.0.0-preview/

2016-05-24 Thread rxin
Author: rxin Date: Wed May 25 05:04:13 2016 New Revision: 13779 Log: spark-2.0.0-preview Added: release/spark/spark-2.0.0-preview/ - copied from r13778, dev/spark/spark-2.0.0-preview/ Removed: dev/spark/spark-2.0.0-preview/

spark git commit: [SPARK-15508][STREAMING][TESTS] Fix flaky test: JavaKafkaStreamSuite.testKafkaStream

2016-05-24 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 fb7b90f61 -> 6f22ba3e1 [SPARK-15508][STREAMING][TESTS] Fix flaky test: JavaKafkaStreamSuite.testKafkaStream ## What changes were proposed in this pull request? `JavaKafkaStreamSuite.testKafkaStream` assumes when `sent.size ==

spark git commit: [SPARK-15508][STREAMING][TESTS] Fix flaky test: JavaKafkaStreamSuite.testKafkaStream

2016-05-24 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 50b660d72 -> c9c1c0e54 [SPARK-15508][STREAMING][TESTS] Fix flaky test: JavaKafkaStreamSuite.testKafkaStream ## What changes were proposed in this pull request? `JavaKafkaStreamSuite.testKafkaStream` assumes when `sent.size ==

svn commit: r13778 - /dev/spark/spark-2.0.0-preview/

2016-05-24 Thread rxin
Author: rxin Date: Wed May 25 04:54:42 2016 New Revision: 13778 Log: Add spark-2.0.0-preview Added: dev/spark/spark-2.0.0-preview/ dev/spark/spark-2.0.0-preview/spark-2.0.0-preview-bin-hadoop2.3.tgz.asc.txt dev/spark/spark-2.0.0-preview/spark-2.0.0-preview-bin-hadoop2.3.tgz.md5

spark git commit: [SPARK-15498][TESTS] fix slow tests

2016-05-24 Thread lian
Repository: spark Updated Branches: refs/heads/branch-2.0 e13cfd6d2 -> fb7b90f61 [SPARK-15498][TESTS] fix slow tests ## What changes were proposed in this pull request? This PR fixes 3 slow tests: 1. `ParquetQuerySuite.read/write wide table`: This is not a good unit test as it runs more

spark git commit: [SPARK-15498][TESTS] fix slow tests

2016-05-24 Thread lian
Repository: spark Updated Branches: refs/heads/master 4acababca -> 50b660d72 [SPARK-15498][TESTS] fix slow tests ## What changes were proposed in this pull request? This PR fixes 3 slow tests: 1. `ParquetQuerySuite.read/write wide table`: This is not a good unit test as it runs more than 5

spark git commit: [SPARK-15365][SQL] When table size statistics are not available from metastore, we should fallback to HDFS

2016-05-24 Thread rxin
Repository: spark Updated Branches: refs/heads/master 14494da87 -> 4acababca [SPARK-15365][SQL] When table size statistics are not available from metastore, we should fallback to HDFS ## What changes were proposed in this pull request? Currently if a table is used in join operation we rely

spark git commit: [SPARK-15365][SQL] When table size statistics are not available from metastore, we should fallback to HDFS

2016-05-24 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 5504f60e8 -> e13cfd6d2 [SPARK-15365][SQL] When table size statistics are not available from metastore, we should fallback to HDFS ## What changes were proposed in this pull request? Currently if a table is used in join operation we
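
The fallback described in SPARK-15365 can be sketched roughly as follows. This is an illustrative pure-Python sketch, not Spark's actual internals; the function names and the default-size constant (standing in for `spark.sql.defaultSizeInBytes`) are assumptions for the example:

```python
DEFAULT_SIZE_IN_BYTES = 2**63 - 1  # illustrative stand-in for spark.sql.defaultSizeInBytes

def estimate_table_size(metastore_stats, hdfs_size_fn, fallback_to_hdfs=True):
    """Prefer metastore statistics; otherwise optionally measure the table on HDFS."""
    if metastore_stats is not None:
        # Metastore already knows the size; use it directly.
        return metastore_stats
    if fallback_to_hdfs:
        # Walk the table's files on the filesystem and sum their lengths.
        return hdfs_size_fn()
    # No statistics available: fall back to a conservative default,
    # which effectively disables size-based optimizations like broadcast joins.
    return DEFAULT_SIZE_IN_BYTES
```

With accurate sizes available, the planner can choose a broadcast join for small tables instead of a shuffle join.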

[2/3] spark git commit: [SPARK-15518] Rename various scheduler backend for consistency

2016-05-24 Thread rxin
http://git-wip-us.apache.org/repos/asf/spark/blob/5504f60e/core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala -- diff --git

[1/3] spark git commit: [SPARK-15518] Rename various scheduler backend for consistency

2016-05-24 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 1de3446d9 -> 5504f60e8 http://git-wip-us.apache.org/repos/asf/spark/blob/5504f60e/core/src/test/scala/org/apache/spark/scheduler/cluster/mesos/CoarseMesosSchedulerBackendSuite.scala

[2/3] spark git commit: [SPARK-15518] Rename various scheduler backend for consistency

2016-05-24 Thread rxin
http://git-wip-us.apache.org/repos/asf/spark/blob/14494da8/core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala -- diff --git

[3/3] spark git commit: [SPARK-15518] Rename various scheduler backend for consistency

2016-05-24 Thread rxin
[SPARK-15518] Rename various scheduler backend for consistency ## What changes were proposed in this pull request? This patch renames various scheduler backends to make them consistent: - LocalScheduler -> LocalSchedulerBackend - AppClient -> StandaloneAppClient - AppClientListener ->

[1/3] spark git commit: [SPARK-15518] Rename various scheduler backend for consistency

2016-05-24 Thread rxin
Repository: spark Updated Branches: refs/heads/master f08bf587b -> 14494da87 http://git-wip-us.apache.org/repos/asf/spark/blob/14494da8/core/src/test/scala/org/apache/spark/scheduler/cluster/mesos/CoarseMesosSchedulerBackendSuite.scala

[3/3] spark git commit: [SPARK-15518] Rename various scheduler backend for consistency

2016-05-24 Thread rxin
[SPARK-15518] Rename various scheduler backend for consistency ## What changes were proposed in this pull request? This patch renames various scheduler backends to make them consistent: - LocalScheduler -> LocalSchedulerBackend - AppClient -> StandaloneAppClient - AppClientListener ->

spark git commit: [SPARK-15512][CORE] repartition(0) should raise IllegalArgumentException

2016-05-24 Thread rxin
Repository: spark Updated Branches: refs/heads/master e631b819f -> f08bf587b [SPARK-15512][CORE] repartition(0) should raise IllegalArgumentException ## What changes were proposed in this pull request? Previously, SPARK-8893 added the constraints on positive number of partitions for

spark git commit: [SPARK-15512][CORE] repartition(0) should raise IllegalArgumentException

2016-05-24 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 1fb7b3a0a -> 1de3446d9 [SPARK-15512][CORE] repartition(0) should raise IllegalArgumentException ## What changes were proposed in this pull request? Previously, SPARK-8893 added the constraints on positive number of partitions for
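
The constraint added by SPARK-15512 amounts to a positivity check on the requested partition count. A minimal sketch in Python (Spark itself raises an `IllegalArgumentException` on the JVM side; `ValueError` is the Python analogue used here):

```python
def validated_num_partitions(n):
    """repartition/coalesce-style argument check: the partition count must be positive."""
    if n <= 0:
        raise ValueError("Number of partitions (%d) must be positive." % n)
    return n
```

Previously `repartition(0)` could slip through and produce an empty, useless partitioning; failing fast surfaces the caller's bug immediately.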

spark git commit: [SPARK-15458][SQL][STREAMING] Disable schema inference for streaming datasets on file streams

2016-05-24 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-2.0 31fb5fa40 -> 1fb7b3a0a [SPARK-15458][SQL][STREAMING] Disable schema inference for streaming datasets on file streams ## What changes were proposed in this pull request? If the user relies on the schema to be inferred in file streams

spark git commit: [SPARK-15458][SQL][STREAMING] Disable schema inference for streaming datasets on file streams

2016-05-24 Thread tdas
Repository: spark Updated Branches: refs/heads/master 20900e5fe -> e631b819f [SPARK-15458][SQL][STREAMING] Disable schema inference for streaming datasets on file streams ## What changes were proposed in this pull request? If the user relies on the schema to be inferred in file streams can
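
The behaviour change in SPARK-15458 can be sketched as a gate on schema resolution: a file-based streaming source demands an explicit schema unless the user opts back in to inference. This is an illustrative sketch, not Spark's actual code path; the function and flag names are assumptions:

```python
def resolve_source_schema(user_schema, infer_schema_fn, schema_inference_enabled=False):
    """Require an explicit schema for file streams unless inference is explicitly enabled."""
    if user_schema is not None:
        # An explicit schema always wins.
        return user_schema
    if schema_inference_enabled:
        # User opted in: infer from the files currently present.
        return infer_schema_fn()
    raise ValueError(
        "Schema must be specified when creating a streaming source; "
        "enable the schema-inference option to opt in to inference.")
```

The motivation is that a schema inferred from the files present at start time can silently diverge from later data, so inference is unsafe as a default for long-running streams.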

spark git commit: [SPARK-15502][DOC][ML][PYSPARK] add guide note that ALS only supports integer ids

2016-05-24 Thread jkbradley
Repository: spark Updated Branches: refs/heads/branch-2.0 2574abea0 -> 31fb5fa40 [SPARK-15502][DOC][ML][PYSPARK] add guide note that ALS only supports integer ids This PR adds a note to clarify that the ML API for ALS only supports integers for user/item ids, and that other types for these

spark git commit: [SPARK-15502][DOC][ML][PYSPARK] add guide note that ALS only supports integer ids

2016-05-24 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master be99a99fe -> 20900e5fe [SPARK-15502][DOC][ML][PYSPARK] add guide note that ALS only supports integer ids This PR adds a note to clarify that the ML API for ALS only supports integers for user/item ids, and that other types for these
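
The restriction documented by SPARK-15502 — ALS user/item ids must be integral and fit in a 32-bit integer — can be illustrated with a small validation sketch (this helper is hypothetical, written here only to make the constraint concrete):

```python
INT_MIN, INT_MAX = -(2**31), 2**31 - 1

def als_id(value):
    """Accept a numeric id only if it is integral and within 32-bit integer range."""
    as_int = int(value)
    if as_int != value or not (INT_MIN <= as_int <= INT_MAX):
        raise ValueError("ALS only supports ids within the integer range: %r" % (value,))
    return as_int
```

So a float id like `3.0` is acceptable (it is integral), while `3.5` or anything beyond 2^31 - 1 is rejected rather than silently truncated.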

svn commit: r1745389 - in /spark: research.md site/research.html

2016-05-24 Thread meng
Author: meng Date: Tue May 24 18:30:06 2016 New Revision: 1745389 URL: http://svn.apache.org/viewvc?rev=1745389&view=rev Log: fix typo Modified: spark/research.md spark/site/research.html Modified: spark/research.md URL:

spark git commit: [MINOR][CORE][TEST] Update obsolete `takeSample` test case.

2016-05-24 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-2.0 1bb0aa4b0 -> 2574abea0 [MINOR][CORE][TEST] Update obsolete `takeSample` test case. ## What changes were proposed in this pull request? This PR fixes some obsolete comments and assertion in `takeSample` testcase of `RDDSuite.scala`.

spark git commit: [MINOR][CORE][TEST] Update obsolete `takeSample` test case.

2016-05-24 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 784cc07d1 -> be99a99fe [MINOR][CORE][TEST] Update obsolete `takeSample` test case. ## What changes were proposed in this pull request? This PR fixes some obsolete comments and assertion in `takeSample` testcase of `RDDSuite.scala`. ##

spark git commit: [SPARK-15388][SQL] Fix spark sql CREATE FUNCTION with hive 1.2.1

2016-05-24 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-2.0 988d4dbf4 -> 1bb0aa4b0 [SPARK-15388][SQL] Fix spark sql CREATE FUNCTION with hive 1.2.1 ## What changes were proposed in this pull request? spark.sql("CREATE FUNCTION myfunc AS 'com.haizhi.bdp.udf.UDFGetGeoCode'") throws

spark git commit: [SPARK-15388][SQL] Fix spark sql CREATE FUNCTION with hive 1.2.1

2016-05-24 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master a313a5ae7 -> 784cc07d1 [SPARK-15388][SQL] Fix spark sql CREATE FUNCTION with hive 1.2.1 ## What changes were proposed in this pull request? spark.sql("CREATE FUNCTION myfunc AS 'com.haizhi.bdp.udf.UDFGetGeoCode'") throws

svn commit: r1745381 - in /spark: documentation.md site/documentation.html

2016-05-24 Thread meng
Author: meng Date: Tue May 24 17:41:08 2016 New Revision: 1745381 URL: http://svn.apache.org/viewvc?rev=1745381&view=rev Log: list papers only on the research page Modified: spark/documentation.md spark/site/documentation.html Modified: spark/documentation.md URL:

svn commit: r1745380 - in /spark: research.md site/research.html

2016-05-24 Thread meng
Author: meng Date: Tue May 24 17:33:48 2016 New Revision: 1745380 URL: http://svn.apache.org/viewvc?rev=1745380&view=rev Log: add MLlib and SparkR papers Modified: spark/research.md spark/site/research.html Modified: spark/research.md URL:

spark git commit: [SPARK-15405][YARN] Remove unnecessary upload of config archive.

2016-05-24 Thread vanzin
Repository: spark Updated Branches: refs/heads/master 695d9a0fd -> a313a5ae7 [SPARK-15405][YARN] Remove unnecessary upload of config archive. We only need one copy of it. The client code that was uploading the second copy just needs to be modified to update the metadata in the cache, so that

spark git commit: [SPARK-15405][YARN] Remove unnecessary upload of config archive.

2016-05-24 Thread vanzin
Repository: spark Updated Branches: refs/heads/branch-2.0 4e2a53ba4 -> 988d4dbf4 [SPARK-15405][YARN] Remove unnecessary upload of config archive. We only need one copy of it. The client code that was uploading the second copy just needs to be modified to update the metadata in the cache, so

spark git commit: [SPARK-15433] [PYSPARK] PySpark core test should not use SerDe from PythonMLLibAPI

2016-05-24 Thread davies
Repository: spark Updated Branches: refs/heads/master f8763b80e -> 695d9a0fd [SPARK-15433] [PYSPARK] PySpark core test should not use SerDe from PythonMLLibAPI ## What changes were proposed in this pull request? Currently PySpark core test uses the `SerDe` from `PythonMLLibAPI` which

spark git commit: [SPARK-13135] [SQL] Don't print expressions recursively in generated code

2016-05-24 Thread davies
Repository: spark Updated Branches: refs/heads/branch-2.0 9f55951a1 -> 4e2a53ba4 [SPARK-13135] [SQL] Don't print expressions recursively in generated code ## What changes were proposed in this pull request? This PR is an up-to-date and a little bit improved version of #11019 of rxin for -

spark git commit: [SPARK-13135] [SQL] Don't print expressions recursively in generated code

2016-05-24 Thread davies
Repository: spark Updated Branches: refs/heads/master c24b6b679 -> f8763b80e [SPARK-13135] [SQL] Don't print expressions recursively in generated code ## What changes were proposed in this pull request? This PR is an up-to-date and a little bit improved version of #11019 of rxin for - (1)

spark git commit: [SPARK-11753][SQL][TEST-HADOOP2.2] Make allowNonNumericNumbers option work

2016-05-24 Thread wenchen
Repository: spark Updated Branches: refs/heads/branch-2.0 6ee1583c3 -> 9f55951a1 [SPARK-11753][SQL][TEST-HADOOP2.2] Make allowNonNumericNumbers option work ## What changes were proposed in this pull request? Jackson supports `allowNonNumericNumbers` option to parse non-standard non-numeric

spark git commit: [SPARK-15442][ML][PYSPARK] Add 'relativeError' param to PySpark QuantileDiscretizer

2016-05-24 Thread mlnick
Repository: spark Updated Branches: refs/heads/master d642b2735 -> 6075f5b4d [SPARK-15442][ML][PYSPARK] Add 'relativeError' param to PySpark QuantileDiscretizer This PR adds the `relativeError` param to PySpark's `QuantileDiscretizer` to match Scala. Also cleaned up a duplication of

spark git commit: [SPARK-15397][SQL] fix string udf locate as hive

2016-05-24 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-2.0 1890f5fdf -> 6adbc0613 [SPARK-15397][SQL] fix string udf locate as hive ## What changes were proposed in this pull request? in hive, `locate("aa", "aaa", 0)` would yield 0, `locate("aa", "aaa", 1)` would yield 1 and `locate("aa",

spark git commit: [SPARK-15397][SQL] fix string udf locate as hive

2016-05-24 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master de726b0d5 -> d642b2735 [SPARK-15397][SQL] fix string udf locate as hive ## What changes were proposed in this pull request? in hive, `locate("aa", "aaa", 0)` would yield 0, `locate("aa", "aaa", 1)` would yield 1 and `locate("aa", "aaa",
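
The Hive semantics quoted above (positions are 1-based, and a non-positive start position yields 0) can be sketched in plain Python. This is an illustrative model of the behaviour the fix targets, not Spark's implementation:

```python
def hive_locate(substr, s, pos=1):
    """Hive-style locate: 1-based position of substr in s at or after pos; 0 if absent."""
    if pos <= 0:
        return 0  # Hive yields 0 for a non-positive start position
    idx = s.find(substr, pos - 1)  # translate the 1-based start to Python's 0-based find
    return idx + 1 if idx >= 0 else 0
```

This reproduces the examples from the commit message: `locate("aa", "aaa", 0)` is 0, `locate("aa", "aaa", 1)` is 1, and `locate("aa", "aaa", 2)` is 2.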