spark git commit: [SPARK-7050] [BUILD] Fix Python Kafka test assembly jar not found issue under Maven build

2015-07-08 Thread srowen
Repository: spark Updated Branches: refs/heads/master 351a36d0c - 8a9d9cc15 [SPARK-7050] [BUILD] Fix Python Kafka test assembly jar not found issue under Maven build To fix Spark Streaming unit test with maven build. Previously the name and path of maven generated jar is different from

spark git commit: [SPARK-8872] [MLLIB] added verification results from R for FPGrowthSuite

2015-07-08 Thread meng
Repository: spark Updated Branches: refs/heads/master 8a9d9cc15 - 3bb217750 [SPARK-8872] [MLLIB] added verification results from R for FPGrowthSuite Author: Kashif Rasul kashif.ra...@gmail.com Closes #7269 from kashif/SPARK-8872 and squashes the following commits: 2d5457f [Kashif Rasul]

spark git commit: [SPARK-8657] [YARN] [HOTFIX] Fail to upload resource to viewfs

2015-07-08 Thread srowen
Repository: spark Updated Branches: refs/heads/branch-1.4 e91d87e66 - e4313db38 [SPARK-8657] [YARN] [HOTFIX] Fail to upload resource to viewfs Fail to upload resource to viewfs in spark-1.4 JIRA Link: https://issues.apache.org/jira/browse/SPARK-8657 Author: Tao Li li...@sogou-inc.com Closes

spark git commit: [SPARK-8785] [SQL] Improve Parquet schema merging

2015-07-08 Thread lian
Repository: spark Updated Branches: refs/heads/master bf02e3771 - 6722aca80 [SPARK-8785] [SQL] Improve Parquet schema merging JIRA: https://issues.apache.org/jira/browse/SPARK-8785 Currently, the parquet schema merging (`ParquetRelation2.readSchema`) may spend much time to merge duplicate

spark git commit: [SPARK-6912] [SQL] Throw an AnalysisException when unsupported Java Map<K, V> types used in Hive UDF

2015-07-08 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 6722aca80 - 3e831a269 [SPARK-6912] [SQL] Throw an AnalysisException when unsupported Java Map<K,V> types used in Hive UDF To make UDF developers understood, throw an exception when unsupported Map<K,V> types used in Hive UDF. This fix is the

spark git commit: [SPARK-8894] [SPARKR] [DOC] Example code errors in SparkR documentation.

2015-07-08 Thread shivaram
Repository: spark Updated Branches: refs/heads/master 3bb217750 - bf02e3771 [SPARK-8894] [SPARKR] [DOC] Example code errors in SparkR documentation. Author: Sun Rui rui@intel.com Closes #7287 from sun-rui/SPARK-8894 and squashes the following commits: da63898 [Sun Rui]

spark git commit: [SPARK-8894] [SPARKR] [DOC] Example code errors in SparkR documentation.

2015-07-08 Thread shivaram
Repository: spark Updated Branches: refs/heads/branch-1.4 d3d5f2ab2 - de49916ab [SPARK-8894] [SPARKR] [DOC] Example code errors in SparkR documentation. Author: Sun Rui rui@intel.com Closes #7287 from sun-rui/SPARK-8894 and squashes the following commits: da63898 [Sun Rui]

spark git commit: [SPARK-8753][SQL] Create an IntervalType data type

2015-07-08 Thread rxin
Repository: spark Updated Branches: refs/heads/master 74335b310 - 0ba98c04c [SPARK-8753][SQL] Create an IntervalType data type We need a new data type to represent time intervals. Because we can't determine how many days in a month, so we need 2 values for interval: a int `months`, a long
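
The reason a single number cannot represent an interval can be sketched in plain Python. This is an illustration only, not Spark's IntervalType: the field name `microseconds`, the helper `add_interval`, and the clamp-to-end-of-month rule are all assumptions made for this example (the snippet above is cut off before it names the second value).

```python
import calendar
import datetime
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    months: int        # calendar part: length in days depends on the start date
    microseconds: int  # assumed name for the fixed-length part


def add_interval(ts, interval):
    """Add the month part by calendar arithmetic, then the fixed part."""
    total_months = ts.year * 12 + (ts.month - 1) + interval.months
    year, month_index = divmod(total_months, 12)
    month = month_index + 1
    # Assumption of this sketch: clamp the day to the last day of the target month.
    day = min(ts.day, calendar.monthrange(year, month)[1])
    shifted = ts.replace(year=year, month=month, day=day)
    return shifted + datetime.timedelta(microseconds=interval.microseconds)


one_month = Interval(months=1, microseconds=0)
from_january = add_interval(datetime.datetime(2015, 1, 31), one_month)
from_march = add_interval(datetime.datetime(2015, 3, 31), one_month)
print(from_january, from_march)
```

The same "one month" covers 28 days from 31 January 2015 and 30 days from 31 March 2015, so it cannot be folded into a fixed count of smaller units.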

spark git commit: [SPARK-5707] [SQL] fix serialization of generated projection

2015-07-08 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 3e831a269 - 74335b310 [SPARK-5707] [SQL] fix serialization of generated projection Author: Davies Liu dav...@databricks.com Closes #7272 from davies/fix_projection and squashes the following commits: 075ef76 [Davies Liu] fix codegen with

spark git commit: [SPARK-8888][SQL] Use java.util.HashMap in DynamicPartitionWriterContainer.

2015-07-08 Thread rxin
Repository: spark Updated Branches: refs/heads/master 0ba98c04c - f61c989b4 [SPARK-8888][SQL] Use java.util.HashMap in DynamicPartitionWriterContainer. Just a baby step towards making it more efficient. Author: Reynold Xin r...@databricks.com Closes #7282 from rxin/SPARK-8888 and squashes

spark git commit: [SPARK-8657] [YARN] Fail to upload resource to viewfs

2015-07-08 Thread srowen
Repository: spark Updated Branches: refs/heads/master f61c989b4 - 26d9b6b8c [SPARK-8657] [YARN] Fail to upload resource to viewfs Fail to upload resource to viewfs in spark-1.4 JIRA Link: https://issues.apache.org/jira/browse/SPARK-8657 Author: Tao Li li...@sogou-inc.com Closes #7125 from

spark git commit: [HOTFIX] Fix style error introduced in e4313db38e81f6288f1704c22e17d0c6e81b4d75

2015-07-08 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.4 e4313db38 - 898b0739e [HOTFIX] Fix style error introduced in e4313db38e81f6288f1704c22e17d0c6e81b4d75 Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/898b0739

spark git commit: [SPARK-8700][ML] Disable feature scaling in Logistic Regression

2015-07-08 Thread dbtsai
Repository: spark Updated Branches: refs/heads/master 00b265f12 - 57221934e [SPARK-8700][ML] Disable feature scaling in Logistic Regression All compressed sensing applications, and some of the regression use-cases will have better result by turning the feature scaling off. However, if we

spark git commit: [SPARK-7785] [MLLIB] [PYSPARK] Add __str__ and __repr__ to Matrices

2015-07-08 Thread meng
Repository: spark Updated Branches: refs/heads/master 374c8a8a4 - 2b40365d7 [SPARK-7785] [MLLIB] [PYSPARK] Add __str__ and __repr__ to Matrices Adding __str__ and __repr__ to DenseMatrix and SparseMatrix Author: MechCoder manojkumarsivaraj...@gmail.com Closes #6342 from
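
What the two methods contribute can be sketched with a toy class. `TinyDenseMatrix` is a hypothetical stand-in written for this example, not pyspark's DenseMatrix; the column-major value layout and the output formats are assumptions of the sketch.

```python
class TinyDenseMatrix:
    """Hypothetical stand-in for a dense matrix with column-major values."""

    def __init__(self, num_rows, num_cols, values):
        if len(values) != num_rows * num_cols:
            raise ValueError("values must contain num_rows * num_cols entries")
        self.num_rows = num_rows
        self.num_cols = num_cols
        self.values = list(values)

    def __repr__(self):
        # Unambiguous form: shows how to rebuild the object.
        return "TinyDenseMatrix({0}, {1}, {2})".format(
            self.num_rows, self.num_cols, self.values)

    def __str__(self):
        # Readable form: one text line per matrix row.
        lines = []
        for i in range(self.num_rows):
            row = [self.values[i + j * self.num_rows]
                   for j in range(self.num_cols)]
            lines.append(" ".join(str(v) for v in row))
        return "\n".join(lines)


m = TinyDenseMatrix(2, 2, [1.0, 2.0, 3.0, 4.0])
print(repr(m))
print(m)
```

`print(m)` uses `__str__`, while the interactive prompt and containers such as lists use `__repr__`.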

spark git commit: [SPARK-8457] [ML] NGram Documentation

2015-07-08 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master f03154378 - c5532e2fe [SPARK-8457] [ML] NGram Documentation Add documentation for NGram feature transformer. Author: Feynman Liang fli...@databricks.com Closes #7244 from feynmanliang/SPARK-8457 and squashes the following commits:

spark git commit: [SPARK-8908] [SQL] Add () to distinct definition in dataframe

2015-07-08 Thread rxin
Repository: spark Updated Branches: refs/heads/master 8f3cd9327 - 00b265f12 [SPARK-8908] [SQL] Add () to distinct definition in dataframe Adding `()` to the definition of `distinct` in DataFrame allows distinct to be called with parentheses, which is consistent with `dropDuplicates`.

spark git commit: [SPARK-8900] [SPARKR] Fix sparkPackages in init documentation

2015-07-08 Thread shivaram
Repository: spark Updated Branches: refs/heads/branch-1.4 898b0739e - 512786350 [SPARK-8900] [SPARKR] Fix sparkPackages in init documentation cc pwendell Author: Shivaram Venkataraman shiva...@cs.berkeley.edu Closes #7293 from shivaram/sparkr-packages-doc and squashes the following

spark git commit: [SPARK-8909][Documentation] Change the scala example in sql-programmi…

2015-07-08 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.4 512786350 - 4df0f1b1b [SPARK-8909][Documentation] Change the scala example in sql-programming-guide#Manually Specifying Options to be in sync with java,python, R version Author: Alok Singh “sing...@us.ibm.com” Closes

spark git commit: [SPARK-8900] [SPARKR] Fix sparkPackages in init documentation

2015-07-08 Thread shivaram
Repository: spark Updated Branches: refs/heads/master 26d9b6b8c - 374c8a8a4 [SPARK-8900] [SPARKR] Fix sparkPackages in init documentation cc pwendell Author: Shivaram Venkataraman shiva...@cs.berkeley.edu Closes #7293 from shivaram/sparkr-packages-doc and squashes the following commits:

spark git commit: [SPARK-8883][SQL]Remove the OverrideFunctionRegistry

2015-07-08 Thread rxin
Repository: spark Updated Branches: refs/heads/master 08192a1b8 - 351a36d0c [SPARK-8883][SQL]Remove the OverrideFunctionRegistry Remove the `OverrideFunctionRegistry` from the Spark SQL, as the subclasses of `FunctionRegistry` have their own way to the delegate to the right underlying

spark git commit: [SPARK-8783] [SQL] CTAS with WITH clause does not work

2015-07-08 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 2b40365d7 - f03154378 [SPARK-8783] [SQL] CTAS with WITH clause does not work Currently, CTESubstitution only handles the case that WITH is on the top of the plan. I think it SHOULD handle the case that WITH is child of CTAS. This patch

spark git commit: [SPARK-8932] Support copy() for UnsafeRows that do not use ObjectPools

2015-07-08 Thread rxin
Repository: spark Updated Branches: refs/heads/master a29081487 - b55499a44 [SPARK-8932] Support copy() for UnsafeRows that do not use ObjectPools We call Row.copy() in many places throughout SQL but UnsafeRow currently throws UnsupportedOperationException when copy() is called. Supporting

spark git commit: [SPARK-8937] [TEST] A setting `spark.unsafe.exceptionOnMemoryLeak ` is missing in ScalaTest config.

2015-07-08 Thread sarutak
Repository: spark Updated Branches: refs/heads/branch-1.4 12c1c36d9 - c04f0a5cf [SPARK-8937] [TEST] A setting `spark.unsafe.exceptionOnMemoryLeak ` is missing in ScalaTest config. `spark.unsafe.exceptionOnMemoryLeak` is present in the config of surefire. ``` <!-- Surefire runs all

spark git commit: [SPARK-8937] [TEST] A setting `spark.unsafe.exceptionOnMemoryLeak ` is missing in ScalaTest config.

2015-07-08 Thread sarutak
Repository: spark Updated Branches: refs/heads/master 47ef423f8 - aba5784da [SPARK-8937] [TEST] A setting `spark.unsafe.exceptionOnMemoryLeak ` is missing in ScalaTest config. `spark.unsafe.exceptionOnMemoryLeak` is present in the config of surefire. ``` <!-- Surefire runs all Java

spark git commit: [SPARK-8928] [SQL] Makes CatalystSchemaConverter sticking to 1.4.x- when handling Parquet LISTs in compatible mode

2015-07-08 Thread lian
Repository: spark Updated Branches: refs/heads/master c056484c0 - 851e247ca [SPARK-8928] [SQL] Makes CatalystSchemaConverter sticking to 1.4.x- when handling Parquet LISTs in compatible mode This PR is based on #7209 authored by Sephiroth-Lin. Author: Weizhong Lin linweizh...@huawei.com

spark git commit: [SPARK-8866][SQL] use 1us precision for timestamp type

2015-07-08 Thread rxin
Repository: spark Updated Branches: refs/heads/master 28fa01e2b - a29081487 [SPARK-8866][SQL] use 1us precision for timestamp type JIRA: https://issues.apache.org/jira/browse/SPARK-8866 Author: Yijie Shen henry.yijies...@gmail.com Closes #7283 from yijieshen/micro_timestamp and squashes the

spark git commit: [SPARK-8928] [SQL] Makes CatalystSchemaConverter sticking to 1.4.x- when handling Parquet LISTs in compatible mode

2015-07-08 Thread lian
Repository: spark Updated Branches: refs/heads/master a240bf3b4 - 3dab0da42 [SPARK-8928] [SQL] Makes CatalystSchemaConverter sticking to 1.4.x- when handling Parquet LISTs in compatible mode This PR is based on #7209 authored by Sephiroth-Lin. Author: Weizhong Lin linweizh...@huawei.com

spark git commit: [SPARK-8927] [DOCS] Format wrong for some config descriptions

2015-07-08 Thread srowen
Repository: spark Updated Branches: refs/heads/branch-1.4 5bc19a1a9 - 2fb2ef0ee [SPARK-8927] [DOCS] Format wrong for some config descriptions A couple descriptions were not inside `<td></td>` and were being displayed immediately under the section title instead of in their row. Author: Jonathan

spark git commit: [SPARK-8927] [DOCS] Format wrong for some config descriptions

2015-07-08 Thread srowen
Repository: spark Updated Branches: refs/heads/master 74d8d3d92 - 28fa01e2b [SPARK-8927] [DOCS] Format wrong for some config descriptions A couple descriptions were not inside `<td></td>` and were being displayed immediately under the section title instead of in their row. Author: Jonathan

spark git commit: [SPARK-8910] Fix MiMa flaky due to port contention issue

2015-07-08 Thread rxin
Repository: spark Updated Branches: refs/heads/master b55499a44 - 47ef423f8 [SPARK-8910] Fix MiMa flaky due to port contention issue Due to the way MiMa works, we currently start a `SQLContext` pretty early on. This causes us to start a `SparkUI` that attempts to bind to port 4040. Because
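
Port contention of the kind described here can be reproduced with two plain sockets. This sketch does not involve Spark or MiMa; the helper `try_bind` is hypothetical, and an OS-assigned port is used in place of 4040 so the example does not depend on that port being free.

```python
import socket


def try_bind(port):
    """Return a listening socket bound to the port, or None if it is taken."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind(("127.0.0.1", port))
        sock.listen(1)
        return sock
    except OSError:
        sock.close()
        return None


first = try_bind(0)               # port 0 asks the OS for any free port
port = first.getsockname()[1]     # the port the first socket actually got
second = try_bind(port)           # a second listener on the same port
print("second bind failed:", second is None)
first.close()
```

Two processes that each try to listen on the same fixed port behave like `first` and `second` here: whichever starts later fails to bind.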

spark git commit: [SPARK-8910] Fix MiMa flaky due to port contention issue

2015-07-08 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.4 2fb2ef0ee - 12c1c36d9 [SPARK-8910] Fix MiMa flaky due to port contention issue Due to the way MiMa works, we currently start a `SQLContext` pretty early on. This causes us to start a `SparkUI` that attempts to bind to port 4040.

spark git commit: [SPARK-8926][SQL] Good errors for ExpectsInputType expressions

2015-07-08 Thread rxin
Repository: spark Updated Branches: refs/heads/master aba5784da - 768907eb7 [SPARK-8926][SQL] Good errors for ExpectsInputType expressions For example: `cannot resolve 'testfunction(null)' due to data type mismatch: argument 1 is expected to be of type int, however, null is of type

spark git commit: Revert [SPARK-8928] [SQL] Makes CatalystSchemaConverter sticking to 1.4.x- when handling Parquet LISTs in compatible mode

2015-07-08 Thread lian
Repository: spark Updated Branches: refs/heads/master 3dab0da42 - c056484c0 Revert [SPARK-8928] [SQL] Makes CatalystSchemaConverter sticking to 1.4.x- when handling Parquet LISTs in compatible mode This reverts commit 3dab0da42940a46f0c4aa4853bdb5c64c4cb2613. Project:

Git Push Summary

2015-07-08 Thread pwendell
Repository: spark Updated Tags: refs/tags/v1.4.1-rc4 [created] dbaa5c294

[2/2] spark git commit: Preparing development version 1.4.2-SNAPSHOT

2015-07-08 Thread pwendell
Preparing development version 1.4.2-SNAPSHOT Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/5bc19a1a Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/5bc19a1a Diff:

spark git commit: [SPARK-8914][SQL] Remove RDDApi

2015-07-08 Thread rxin
Repository: spark Updated Branches: refs/heads/master f472b8cdc - 2a4f88b6c [SPARK-8914][SQL] Remove RDDApi As rxin suggested in #7298 , we should consider to remove `RDDApi`. Author: Kousuke Saruta saru...@oss.nttdata.co.jp Closes #7302 from sarutak/remove-rddapi and squashes the following

spark git commit: [SPARK-8902] Correctly print hostname in error

2015-07-08 Thread sarutak
Repository: spark Updated Branches: refs/heads/branch-1.4 3f6e6e0e2 - df763495f [SPARK-8902] Correctly print hostname in error With + the strings are separate expressions, and format() is called on the last string before concatenation. (So substitution does not happen.) Without + the string
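
The precedence issue can be reproduced in a few lines. This is a minimal sketch, assuming the affected code called `str.format()` on a message split across Python string literals; the message text and the `host` value are invented for illustration and are not taken from the patch.

```python
host = "example-host"  # hypothetical value, for illustration only

# With '+', format() binds to the last literal only, so the
# placeholder in the first literal is never substituted.
broken = "Could not resolve {0}. " + "Check the configuration.".format(host)

# Without '+', adjacent literals are merged into one string first,
# and format() is then applied to the whole message.
fixed = ("Could not resolve {0}. "
         "Check the configuration.".format(host))

print(broken)
print(fixed)
```

A method call binds more tightly than `+`, which is why the first form leaves `{0}` in the output.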

spark git commit: [SPARK-5016] [MLLIB] Distribute GMM mixture components to executors

2015-07-08 Thread meng
Repository: spark Updated Branches: refs/heads/master 8c32b2e87 - f472b8cdc [SPARK-5016] [MLLIB] Distribute GMM mixture components to executors Distribute expensive portions of computation for Gaussian mixture components (in particular, pre-computation of `MultivariateGaussian.rootSigmaInv`,

spark git commit: [SPARK-8450] [SQL] [PYSARK] cleanup type converter for Python DataFrame

2015-07-08 Thread davies
Repository: spark Updated Branches: refs/heads/master 2a4f88b6c - 74d8d3d92 [SPARK-8450] [SQL] [PYSARK] cleanup type converter for Python DataFrame This PR fixes the converter for Python DataFrame, especially for DecimalType Closes #7106 Author: Davies Liu dav...@databricks.com Closes

spark git commit: [SPARK-8903] Fix bug in cherry-pick of SPARK-8803

2015-07-08 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.4 4df0f1b1b - 3f6e6e0e2 [SPARK-8903] Fix bug in cherry-pick of SPARK-8803 This fixes a bug introduced in the cherry-pick of #7201 which led to a NullPointerException when cross-tabulating a data set that contains null values. Author:

[2/4] spark git commit: [SPARK-6123] [SPARK-6775] [SPARK-6776] [SQL] Refactors Parquet read path for interoperability and backwards-compatibility

2015-07-08 Thread lian
http://git-wip-us.apache.org/repos/asf/spark/blob/4ffc27ca/sql/core/src/test/gen-java/org/apache/spark/sql/parquet/test/thrift/ParquetThriftCompat.java -- diff --git

[3/4] spark git commit: [SPARK-6123] [SPARK-6775] [SPARK-6776] [SQL] Refactors Parquet read path for interoperability and backwards-compatibility

2015-07-08 Thread lian
http://git-wip-us.apache.org/repos/asf/spark/blob/4ffc27ca/sql/core/src/test/gen-java/org/apache/spark/sql/parquet/test/avro/CompatibilityTest.java -- diff --git

[1/4] spark git commit: [SPARK-6123] [SPARK-6775] [SPARK-6776] [SQL] Refactors Parquet read path for interoperability and backwards-compatibility

2015-07-08 Thread lian
Repository: spark Updated Branches: refs/heads/master 5687f7655 - 4ffc27caa http://git-wip-us.apache.org/repos/asf/spark/blob/4ffc27ca/sql/core/src/test/gen-java/org/apache/spark/sql/parquet/test/thrift/Suit.java -- diff --git

spark git commit: [SPARK-8877] [MLLIB] Public API for association rule generation

2015-07-08 Thread meng
Repository: spark Updated Branches: refs/heads/master 381cb161b - 8c32b2e87 [SPARK-8877] [MLLIB] Public API for association rule generation Adds FPGrowth.generateAssociationRules to public API for generating association rules after mining frequent itemsets. Author: Feynman Liang
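
How rules follow from mined itemsets can be sketched in plain Python. This is not the MLlib implementation or its API: the function name, the dictionary input format, and the restriction to single-item consequents are assumptions of the sketch. The confidence of a rule X => Y is taken as freq(X ∪ Y) / freq(X).

```python
def generate_association_rules(freq_itemsets, min_confidence):
    """Return (antecedent, consequent, confidence) tuples.

    freq_itemsets maps frozenset -> occurrence count. Only rules with a
    single-item consequent are produced (an assumption of this sketch).
    """
    rules = []
    for itemset, freq in freq_itemsets.items():
        if len(itemset) < 2:
            continue
        for item in itemset:
            antecedent = itemset - {item}
            antecedent_freq = freq_itemsets.get(antecedent)
            if antecedent_freq is None:
                continue  # antecedent count missing from the input; skip
            confidence = freq / antecedent_freq
            if confidence >= min_confidence:
                rules.append((antecedent, frozenset([item]), confidence))
    return rules


freq_itemsets = {
    frozenset(["a"]): 4,
    frozenset(["b"]): 3,
    frozenset(["a", "b"]): 3,
}
rules = generate_association_rules(freq_itemsets, min_confidence=0.8)
print(rules)
```

With these counts, b => a has confidence 3/3 and is kept, while a => b has confidence 3/4 and falls below the 0.8 threshold.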

spark git commit: [SPARK-8068] [MLLIB] Add confusionMatrix method at class MulticlassMetrics in pyspark/mllib

2015-07-08 Thread meng
Repository: spark Updated Branches: refs/heads/master 4ffc27caa - 381cb161b [SPARK-8068] [MLLIB] Add confusionMatrix method at class MulticlassMetrics in pyspark/mllib Add confusionMatrix method at class MulticlassMetrics in pyspark/mllib Author: Yanbo Liang yblia...@gmail.com Closes #7286
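
What a confusion matrix holds can be sketched without Spark. The function below is a plain-Python illustration, not the pyspark method: its name, the (prediction, label) input pairs, and the layout (rows are actual labels, columns are predicted labels, both in ascending class order) are assumptions of the sketch.

```python
def confusion_matrix(prediction_and_labels):
    """Count (prediction, label) pairs into a nested-list matrix."""
    classes = sorted({c for pair in prediction_and_labels for c in pair})
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for prediction, label in prediction_and_labels:
        # Row: actual label. Column: predicted label.
        matrix[index[label]][index[prediction]] += 1
    return classes, matrix


pairs = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (1.0, 1.0), (0.0, 1.0)]
classes, matrix = confusion_matrix(pairs)
for row in matrix:
    print(row)
```

Diagonal entries count correct predictions; off-diagonal entries show which classes are mistaken for which.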