Repository: spark
Updated Branches:
refs/heads/branch-1.3 f8f9a64eb -> 2bd33ce62
http://git-wip-us.apache.org/repos/asf/spark/blob/2bd33ce6/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveUdfSuite.scala
Repository: spark
Updated Branches:
refs/heads/master 3912d3324 -> 61ab08549
http://git-wip-us.apache.org/repos/asf/spark/blob/61ab0854/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveUdfSuite.scala
[Minor] [SQL] Cleans up DataFrame variable names and toDF() calls
Although we've migrated to the DataFrame API, lots of code still uses `rdd` or
`srdd` as local variable names. This PR tries to address these naming
inconsistencies and some other minor DataFrame-related style issues.
[https://r
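For context, the convention this cleanup moves toward looks roughly like the following (a hypothetical before/after sketch for spark-shell, where `sc` and `sqlContext` already exist):

    import sqlContext.implicits._

    case class Person(name: String, age: Int)

    // Before: a DataFrame held in a variable named like an RDD.
    // val rdd = sc.parallelize(Seq(Person("Alice", 30))).toDF()

    // After: an explicit toDF() call and a DataFrame-style name.
    val df = sc.parallelize(Seq(Person("Alice", 30))).toDF()
    df.select("name").show()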
Repository: spark
Updated Branches:
refs/heads/branch-1.3 6e82c46bf -> f8f9a64eb
[SPARK-5731][Streaming][Test] Fix incorrect test in DirectKafkaStreamSuite
The test was incorrect. Instead of counting the number of records, it counted
the number of partitions of the RDD generated by the DStream, which
Repository: spark
Updated Branches:
refs/heads/master e50934f11 -> 3912d3324
[SPARK-5731][Streaming][Test] Fix incorrect test in DirectKafkaStreamSuite
The test was incorrect. Instead of counting the number of records, it counted
the number of partitions of the RDD generated by the DStream, which is
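The distinction at the heart of the fix, as a minimal sketch (assuming the spark-shell's `sc`):

    val rdd = sc.parallelize(1 to 100, numSlices = 4)

    val numRecords    = rdd.count()             // 100 -- what the test should check
    val numPartitions = rdd.partitions.length   // 4   -- what it was checking by mistake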
Repository: spark
Updated Branches:
refs/heads/master d5f12bfe8 -> e50934f11
[SPARK-5723][SQL] Change the default file format to Parquet for CTAS statements.
JIRA: https://issues.apache.org/jira/browse/SPARK-5723
Author: Yin Huai
This patch had conflicts when merged, resolved by
Committer: M
Repository: spark
Updated Branches:
refs/heads/branch-1.3 2ab0ba04f -> 6e82c46bf
[SPARK-5723][SQL] Change the default file format to Parquet for CTAS statements.
JIRA: https://issues.apache.org/jira/browse/SPARK-5723
Author: Yin Huai
This patch had conflicts when merged, resolved by
Committe
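In effect (a hypothetical sketch; `hiveContext` is an assumed HiveContext and `src` an existing table):

    // No explicit format: with this change the new table defaults to Parquet.
    hiveContext.sql("CREATE TABLE copy_default AS SELECT * FROM src")

    // An explicitly requested format is still honored.
    hiveContext.sql("CREATE TABLE copy_text STORED AS TEXTFILE AS SELECT * FROM src")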
Repository: spark
Updated Tags: refs/tags/v1.3.0-rc1 [created] f97b0d4a6
Preparing Spark release v1.3.0-rc1
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/f97b0d4a
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/f97b0d4a
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/f97b0d4a
Bra
Repository: spark
Updated Branches:
refs/heads/branch-1.3 e8284b29d -> 2ab0ba04f
Preparing development version 1.3.1-SNAPSHOT
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/2ab0ba04
Tree: http://git-wip-us.apache.org/repo
Repository: spark
Updated Branches:
refs/heads/branch-1.3 7320605ad -> e8284b29d
[SPARK-5875][SQL] logical.Project should not be resolved if it contains
aggregates or generators
https://issues.apache.org/jira/browse/SPARK-5875 has a case to reproduce the
bug and explain the root cause.
Autho
Repository: spark
Updated Branches:
refs/heads/master a51fc7ef9 -> d5f12bfe8
[SPARK-5875][SQL] logical.Project should not be resolved if it contains
aggregates or generators
https://issues.apache.org/jira/browse/SPARK-5875 has a case to reproduce the
bug and explain the root cause.
Author: Y
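Concretely, queries like the following must be resolved into Aggregate or Generate plan nodes rather than left as a plain Project (hypothetical reproductions; `sqlContext` and a table `t` are assumed):

    sqlContext.sql("SELECT sum(value) FROM t")       // contains an aggregate
    sqlContext.sql("SELECT explode(values) FROM t")  // contains a generator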
Repository: spark
Updated Branches:
refs/heads/branch-1.3 7e5e4d82b -> 7320605ad
Revert "Preparing development version 1.3.1-SNAPSHOT"
This reverts commit e57c81b8c1a6581c2588973eaf30d3c7ae90ed0c.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org
Revert "Preparing Spark release v1.3.0-snapshot1"
This reverts commit d97bfc6f28ec4b7acfb36410c7c167d8d3c145ec.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/7320605a
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/
Repository: spark
Updated Branches:
refs/heads/branch-1.3 07a401a7b -> 7e5e4d82b
[SPARK-4454] Revert getOrElse() cleanup in DAGScheduler.getCacheLocs()
This method is performance-sensitive and this change wasn't necessary.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: ht
Repository: spark
Updated Branches:
refs/heads/master d46d6246d -> a51fc7ef9
[SPARK-4454] Revert getOrElse() cleanup in DAGScheduler.getCacheLocs()
This method is performance-sensitive and this change wasn't necessary.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http:/
Repository: spark
Updated Branches:
refs/heads/branch-1.3 cb905841b -> 07a401a7b
[SPARK-4454] Properly synchronize accesses to DAGScheduler cacheLocs map
This patch addresses a race condition in DAGScheduler by properly synchronizing
accesses to its `cacheLocs` map.
This map is accessed by t
Repository: spark
Updated Branches:
refs/heads/master ae6cfb3ac -> d46d6246d
[SPARK-4454] Properly synchronize accesses to DAGScheduler cacheLocs map
This patch addresses a race condition in DAGScheduler by properly synchronizing
accesses to its `cacheLocs` map.
This map is accessed by the `
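A minimal sketch of the synchronization pattern, not the actual DAGScheduler code: route every read and write of the shared map through a single lock so no thread can observe it mid-update.

    import scala.collection.mutable

    object CacheLocsSketch {
      // Shared mutable state touched by the event loop and by user threads.
      private val cacheLocs = mutable.HashMap.empty[Int, Seq[String]]

      def getCacheLocs(rddId: Int): Option[Seq[String]] =
        cacheLocs.synchronized { cacheLocs.get(rddId) }

      def updateCacheLocs(rddId: Int, locs: Seq[String]): Unit =
        cacheLocs.synchronized { cacheLocs(rddId) = locs }

      def clearCacheLocs(): Unit =
        cacheLocs.synchronized { cacheLocs.clear() }
    }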
Repository: spark
Updated Branches:
refs/heads/master c3d2b90bd -> ae6cfb3ac
[SPARK-5811] Added documentation for maven coordinates and added Spark Packages
support
Documentation for Maven coordinates + Spark Packages support. Added PySpark
tests for `--packages`
Author: Burak Yavuz
Author:
Repository: spark
Updated Branches:
refs/heads/branch-1.3 81202350a -> cb905841b
[SPARK-5811] Added documentation for maven coordinates and added Spark Packages
support
Documentation for Maven coordinates + Spark Packages support. Added PySpark
tests for `--packages`
Author: Burak Yavuz
Aut
Author: matei
Date: Wed Feb 18 01:08:19 2015
New Revision: 1660555
URL: http://svn.apache.org/r1660555
Log:
Berlin meetup and cleaning up meetup page
Modified:
spark/community.md
spark/site/community.html
Modified: spark/community.md
URL:
http://svn.apache.org/viewvc/spark/community.md?
Author: pwendell
Date: Wed Feb 18 01:07:50 2015
New Revision: 1660554
URL: http://svn.apache.org/r1660554
Log:
Adding missing news file for Spark 1.2.1 release
Added:
spark/news/_posts/2015-02-09-spark-1-2-1-released.md
Added: spark/news/_posts/2015-02-09-spark-1-2-1-released.md
URL:
http:/
Repository: spark
Updated Branches:
refs/heads/branch-1.3 07d8ef9e7 -> 81202350a
[SPARK-5785] [PySpark] narrow dependency for cogroup/join in PySpark
Currently, PySpark does not support narrow dependencies during cogroup/join even
when the two RDDs have the same partitioner, so another unnecessary shuffle s
Repository: spark
Updated Branches:
refs/heads/master 117121a4e -> c3d2b90bd
[SPARK-5785] [PySpark] narrow dependency for cogroup/join in PySpark
Currently, PySpark does not support narrow dependencies during cogroup/join even
when the two RDDs have the same partitioner, so another unnecessary shuffle stage
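The behavior being brought to PySpark has long existed on the Scala side; a sketch of the co-partitioned case there (assuming the spark-shell's `sc`):

    import org.apache.spark.HashPartitioner

    val part = new HashPartitioner(4)
    val a = sc.parallelize(Seq(1 -> "a", 2 -> "b")).partitionBy(part)
    val b = sc.parallelize(Seq(1 -> "x", 2 -> "y")).partitionBy(part)

    // Both sides share the same partitioner, so the join can use a narrow
    // dependency: no extra shuffle stage is scheduled.
    val joined = a.join(b)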
Repository: spark
Updated Branches:
refs/heads/branch-1.3 0dba382ee -> 07d8ef9e7
[SPARK-5852][SQL] Fail to convert a newly created empty metastore parquet table
to a data source parquet table.
The problem is that after we create an empty Hive metastore Parquet table (e.g.
`CREATE TABLE test (
Repository: spark
Updated Branches:
refs/heads/master 4d4cc760f -> 117121a4e
[SPARK-5852][SQL] Fail to convert a newly created empty metastore parquet table
to a data source parquet table.
The problem is that after we create an empty Hive metastore Parquet table (e.g.
`CREATE TABLE test (a in
Repository: spark
Updated Branches:
refs/heads/branch-1.3 cb061603c -> 0dba382ee
[SPARK-5872] [SQL] create a sqlCtx in pyspark shell
The sqlCtx will be a HiveContext if Hive is built into the assembly jar, or a
SQLContext if not.
It also skips the Hive tests in pyspark.sql.tests if Hive is not available
Repository: spark
Updated Branches:
refs/heads/master 3df85dccb -> 4d4cc760f
[SPARK-5872] [SQL] create a sqlCtx in pyspark shell
The sqlCtx will be a HiveContext if Hive is built into the assembly jar, or a
SQLContext if not.
It also skips the Hive tests in pyspark.sql.tests if Hive is not available.
A
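The fallback itself lives in the Python shell startup; the same idea expressed as a Scala sketch (assuming `sc` exists and the Hive classes may be missing from the classpath):

    import org.apache.spark.SparkContext
    import org.apache.spark.sql.SQLContext

    // Try the Hive-enabled context first, fall back to a plain SQLContext.
    // HiveContext is loaded reflectively so the sketch still links without Hive.
    val sqlCtx = try {
      Class.forName("org.apache.spark.sql.hive.HiveContext")
        .getConstructor(classOf[SparkContext])
        .newInstance(sc)
        .asInstanceOf[SQLContext]
    } catch {
      case _: Exception => new SQLContext(sc)
    }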
Repository: spark
Updated Branches:
refs/heads/branch-1.3 35e23ff14 -> cb061603c
[SPARK-5871] output explain in Python
Author: Davies Liu
Closes #4658 from davies/explain and squashes the following commits:
db87ea2 [Davies Liu] output explain in Python
(cherry picked from commit 3df85dccbc
Repository: spark
Updated Branches:
refs/heads/master 445a755b8 -> 3df85dccb
[SPARK-5871] output explain in Python
Author: Davies Liu
Closes #4658 from davies/explain and squashes the following commits:
db87ea2 [Davies Liu] output explain in Python
Project: http://git-wip-us.apache.org/re
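For reference, the DataFrame method being surfaced, shown with the Scala API (a usage sketch; `sqlContext` is assumed and the JSON path is the sample file in the Spark source tree):

    val df = sqlContext.jsonFile("examples/src/main/resources/people.json")
    df.explain()      // physical plan only
    df.explain(true)  // extended: logical and physical plans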
Repository: spark
Updated Branches:
refs/heads/master de4836f8f -> 445a755b8
[SPARK-4172] [PySpark] Progress API in Python
This patch brings the pull-based progress API into Python, along with an
example in Python.
Author: Davies Liu
Closes #3027 from davies/progress_api and squashes the following
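The JVM-side pull API this exposes to Python, sketched with the Scala `statusTracker` (assuming the spark-shell's `sc`):

    val tracker = sc.statusTracker
    for (stageId <- tracker.getActiveStageIds; info <- tracker.getStageInfo(stageId)) {
      println(s"stage $stageId: ${info.numCompletedTasks}/${info.numTasks} tasks done")
    }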
Repository: spark
Updated Branches:
refs/heads/branch-1.3 e65dc1fd5 -> 35e23ff14
[SPARK-4172] [PySpark] Progress API in Python
This patch brings the pull-based progress API into Python, along with an
example in Python.
Author: Davies Liu
Closes #3027 from davies/progress_api and squashes the follo
Repository: spark
Updated Branches:
refs/heads/master 9d281fa56 -> de4836f8f
[SPARK-5868][SQL] Fix python UDFs in HiveContext and checks in SQLContext
Author: Michael Armbrust
Closes #4657 from marmbrus/pythonUdfs and squashes the following commits:
a7823a8 [Michael Armbrust] [SPARK-5868][S
Repository: spark
Updated Branches:
refs/heads/branch-1.3 01356514e -> e65dc1fd5
[SPARK-5868][SQL] Fix python UDFs in HiveContext and checks in SQLContext
Author: Michael Armbrust
Closes #4657 from marmbrus/pythonUdfs and squashes the following commits:
a7823a8 [Michael Armbrust] [SPARK-586
Repository: spark
Updated Branches:
refs/heads/master 4611de1ce -> ac506b7c2
[Minor][SQL] Use same function to check path parameter in JSONRelation
Author: Liang-Chi Hsieh
Closes #4649 from viirya/use_checkpath and squashes the following commits:
0f9a1a1 [Liang-Chi Hsieh] Use same function
Repository: spark
Updated Branches:
refs/heads/master ac506b7c2 -> 9d281fa56
[SQL] [Minor] Update the HiveContext Unittest
In the unit test, the table src(key INT, value STRING) is not the same as Hive's
src(key STRING, value STRING)
https://github.com/apache/hive/blob/branch-0.13/data/scripts/q_te
Repository: spark
Updated Branches:
refs/heads/branch-1.3 d74d5e86a -> 01356514e
[SQL] [Minor] Update the HiveContext Unittest
In the unit test, the table src(key INT, value STRING) is not the same as Hive's
src(key STRING, value STRING)
https://github.com/apache/hive/blob/branch-0.13/data/scripts/
Repository: spark
Updated Branches:
refs/heads/branch-1.3 62063b7a3 -> d74d5e86a
[Minor][SQL] Use same function to check path parameter in JSONRelation
Author: Liang-Chi Hsieh
Closes #4649 from viirya/use_checkpath and squashes the following commits:
0f9a1a1 [Liang-Chi Hsieh] Use same funct
Repository: spark
Updated Branches:
refs/heads/branch-1.3 5636c4a58 -> 62063b7a3
[SPARK-5862][SQL] Only transformUp the given plan once in HiveMetastoreCatalog
Currently, `ParquetConversions` in `HiveMetastoreCatalog` will transformUp the
given plan multiple times if there are many Metastore Par
Repository: spark
Updated Branches:
refs/heads/master 31efb39c1 -> 4611de1ce
[SPARK-5862][SQL] Only transformUp the given plan once in HiveMetastoreCatalog
Currently, `ParquetConversions` in `HiveMetastoreCatalog` will transformUp the
given plan multiple times if there are many Metastore Parquet
Repository: spark
Updated Branches:
refs/heads/master fc4eb9505 -> 31efb39c1
[Minor] fix typo in SQL document
Author: CodingCat
Closes #4656 from CodingCat/fix_typo and squashes the following commits:
b41d15c [CodingCat] recover
689fe46 [CodingCat] fix typo
Project: http://git-wip-us.apac
Repository: spark
Updated Branches:
refs/heads/branch-1.3 71cf6e295 -> 5636c4a58
[Minor] fix typo in SQL document
Author: CodingCat
Closes #4656 from CodingCat/fix_typo and squashes the following commits:
b41d15c [CodingCat] recover
689fe46 [CodingCat] fix typo
(cherry picked from commit 3
Repository: spark
Updated Branches:
refs/heads/branch-1.3 e64afcd84 -> 71cf6e295
[SPARK-5864] [PySpark] support .jar as python package
A jar file containing Python sources can be used as a Python package, just
like a zip file.
spark-submit already puts the jar file on the PYTHONPATH; this
Repository: spark
Updated Branches:
refs/heads/master 49c19fdba -> fc4eb9505
[SPARK-5864] [PySpark] support .jar as python package
A jar file containing Python sources can be used as a Python package, just
like a zip file.
spark-submit already puts the jar file on the PYTHONPATH; this pat
Repository: spark
Updated Branches:
refs/heads/master 24f358b9d -> 49c19fdba
SPARK-5841 [CORE] [HOTFIX] Memory leak in DiskBlockManager
Avoid the call to remove a shutdown hook being made from within a shutdown hook
CC pwendell JoshRosen MattWhelan
Author: Sean Owen
Closes #4648 from srowen/SPARK-5841.
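The idea behind the fix, as a sketch rather than the actual DiskBlockManager code: the JVM throws IllegalStateException if removeShutdownHook is called while shutdown is already in progress, so record that the hook is running and skip the removal in that case.

    object ShutdownHookSketch {
      @volatile private var shuttingDown = false

      private val hook = new Thread(new Runnable {
        override def run(): Unit = {
          shuttingDown = true
          cleanup()
        }
      })
      Runtime.getRuntime.addShutdownHook(hook)

      def stop(): Unit = {
        cleanup()
        // Removing the hook from within the hook itself would throw
        // IllegalStateException("Shutdown in progress").
        if (!shuttingDown) Runtime.getRuntime.removeShutdownHook(hook)
      }

      private def cleanup(): Unit = {
        // delete temporary block files, release resources, ...
      }
    }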
Repository: spark
Updated Branches:
refs/heads/branch-1.3 420bc9b3a -> e64afcd84
SPARK-5841 [CORE] [HOTFIX] Memory leak in DiskBlockManager
Avoid the call to remove a shutdown hook being made from within a shutdown hook
CC pwendell JoshRosen MattWhelan
Author: Sean Owen
Closes #4648 from srowen/SPARK-5
Repository: spark
Updated Branches:
refs/heads/master 9b746f380 -> 24f358b9d
MAINTENANCE: Automated closing of pull requests.
This commit exists to close the following pull requests on Github:
Closes #3297 (close requested by 'andrewor14')
Closes #3345 (close requested by 'pwendell')
Closes #
Repository: spark
Updated Branches:
refs/heads/branch-1.3 2bf2b56ef -> 420bc9b3a
[SPARK-5661] function hasShutdownDeleteTachyonDir should use
shutdownDeleteTachyonPaths to determine whether it contains the file
hasShutdownDeleteTachyonDir(file: TachyonFile) should use
shutdownDeleteTachyonPaths(not
Repository: spark
Updated Branches:
refs/heads/master b271c265b -> 9b746f380
[SPARK-3381] [MLlib] Eliminate bins for unordered features in DecisionTrees
For unordered features, it is sufficient to use splits since the threshold of
the split corresponds to the threshold of the HighSplit of the bi
Repository: spark
Updated Branches:
refs/heads/branch-1.3 4a581aa3f -> 2bf2b56ef
[SPARK-5778] throw if nonexistent metrics config file provided
The previous behavior was to log an error; this is fine in the general
case where no `spark.metrics.conf` parameter was specified, in which
case a default
Repository: spark
Updated Branches:
refs/heads/master d8f69cf78 -> b271c265b
[SPARK-5661] function hasShutdownDeleteTachyonDir should use
shutdownDeleteTachyonPaths to determine whether it contains the file
hasShutdownDeleteTachyonDir(file: TachyonFile) should use
shutdownDeleteTachyonPaths(not shut
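The shape of the bug, as a simplified sketch (plain `String` paths here; the real method takes a `TachyonFile`): the Tachyon variant consulted the set tracking local paths instead of its own.

    import scala.collection.mutable

    object ShutdownDeleteSketch {
      private val shutdownDeletePaths        = mutable.HashSet.empty[String]
      private val shutdownDeleteTachyonPaths = mutable.HashSet.empty[String]

      def hasShutdownDeleteTachyonDir(path: String): Boolean =
        shutdownDeleteTachyonPaths.synchronized {
          // Fixed: consult the Tachyon set, not shutdownDeletePaths.
          shutdownDeleteTachyonPaths.contains(path)
        }
    }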
Repository: spark
Updated Branches:
refs/heads/master d8adefefc -> d8f69cf78
[SPARK-5778] throw if nonexistent metrics config file provided
The previous behavior was to log an error; this is fine in the general
case where no `spark.metrics.conf` parameter was specified, in which
case a default `me
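A sketch of the new contract with a hypothetical helper (not Spark's actual MetricsConfig): throw only when a path was explicitly configured; a missing default file stays silent.

    import java.io.{File, FileInputStream, FileNotFoundException}
    import java.util.Properties

    def loadMetricsConfig(configuredPath: Option[String]): Properties = {
      val props = new Properties()
      configuredPath match {
        case Some(path) =>
          val file = new File(path)
          if (!file.isFile)
            throw new FileNotFoundException(s"Metrics config file $path does not exist")
          val in = new FileInputStream(file)
          try props.load(in) finally in.close()
        case None =>
          // No spark.metrics.conf set: silently keep built-in defaults.
      }
      props
    }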
Repository: spark
Updated Branches:
refs/heads/branch-1.3 cd3d41587 -> 4a581aa3f
[SPARK-5859] [PySpark] [SQL] fix DataFrame Python API
1. add explain()
2. add isLocal()
3. do not call show() in __repr__
4. add foreach() and foreachPartition()
5. add distinct()
6. fix functions.col()/column()
Repository: spark
Updated Branches:
refs/heads/branch-1.3 97cb568a2 -> cd3d41587
http://git-wip-us.apache.org/repos/asf/spark/blob/cd3d4158/sql/core/src/test/scala/org/apache/spark/sql/jdbc/MySQLIntegration.scala
Repository: spark
Updated Branches:
refs/heads/master c76da36c2 -> c74b07fa9
http://git-wip-us.apache.org/repos/asf/spark/blob/c74b07fa/sql/core/src/test/scala/org/apache/spark/sql/jdbc/MySQLIntegration.scala
[SPARK-5166][SPARK-5247][SPARK-5258][SQL] API Cleanup / Documentation
Author: Michael Armbrust
Closes #4642 from marmbrus/docs and squashes the following commits:
d291c34 [Michael Armbrust] python tests
9be66e3 [Michael Armbrust] comments
d56afc2 [Michael Armbrust] fix style
f004747 [Michael Ar
Repository: spark
Updated Branches:
refs/heads/master c74b07fa9 -> d8adefefc
[SPARK-5859] [PySpark] [SQL] fix DataFrame Python API
1. add explain()
2. add isLocal()
3. do not call show() in __repr__
4. add foreach() and foreachPartition()
5. add distinct()
6. fix functions.col()/column()/lit
Repository: spark
Updated Branches:
refs/heads/branch-1.3 824062912 -> 97cb568a2
[SPARK-5858][MLLIB] Remove unnecessary first() call in GLM
`numFeatures` is only used by multinomial logistic regression. Calling
`.first()` for every GLM causes a performance regression, especially in Python.
Aut
Repository: spark
Updated Branches:
refs/heads/master 3ce46e94f -> c76da36c2
[SPARK-5858][MLLIB] Remove unnecessary first() call in GLM
`numFeatures` is only used by multinomial logistic regression. Calling
`.first()` for every GLM causes a performance regression, especially in Python.
Author:
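The gist, as a hedged sketch with a hypothetical helper: pay the `first()` cost only when the feature count is actually unknown, as it is for multinomial logistic regression.

    import org.apache.spark.mllib.regression.LabeledPoint
    import org.apache.spark.rdd.RDD

    def featureCount(input: RDD[LabeledPoint], known: Int = -1): Int =
      if (known > 0) known                  // caller already supplied it
      else input.first().features.size      // touch the data only when needed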
Repository: spark
Updated Branches:
refs/heads/branch-1.3 aeb85cdee -> 824062912
SPARK-5856: In Maven build script, launch Zinc with more memory
I've seen out-of-memory exceptions when trying
to run many parallel builds against the same Zinc
server during packaging. We should use the same
incr
Repository: spark
Updated Branches:
refs/heads/master ee6e3eff0 -> 3ce46e94f
SPARK-5856: In Maven build script, launch Zinc with more memory
I've seen out-of-memory exceptions when trying
to run many parallel builds against the same Zinc
server during packaging. We should use the same
increase
Repository: spark
Updated Branches:
refs/heads/branch-1.3 b8da5c390 -> aeb85cdee
Revert "[SPARK-5363] [PySpark] check ending mark in non-block way"
This reverts commits ac6fe67e1d8bf01ee565f9cc09ad48d88a275829 and
c06e42f2c1e5fcf123b466efd27ee4cb53bbed3f.
Project: http://git-wip-us.apache.o
Repository: spark
Updated Branches:
refs/heads/branch-1.2 432ceca2a -> 6be36d5a8
Revert "[SPARK-5363] [PySpark] check ending mark in non-block way"
This reverts commits ac6fe67e1d8bf01ee565f9cc09ad48d88a275829 and
c06e42f2c1e5fcf123b466efd27ee4cb53bbed3f.
Project: http://git-wip-us.apache.o
Repository: spark
Updated Branches:
refs/heads/master a65766bf0 -> ee6e3eff0
Revert "[SPARK-5363] [PySpark] check ending mark in non-block way"
This reverts commits ac6fe67e1d8bf01ee565f9cc09ad48d88a275829 and
c06e42f2c1e5fcf123b466efd27ee4cb53bbed3f.
Project: http://git-wip-us.apache.org/r
Repository: spark
Updated Branches:
refs/heads/branch-1.3 e9241fa70 -> b8da5c390
[SPARK-5826][Streaming] Fix Configuration not serializable problem
Author: jerryshao
Closes #4612 from jerryshao/SPARK-5826 and squashes the following commits:
7ec71db [jerryshao] Remove transient for conf stat
Repository: spark
Updated Branches:
refs/heads/master c06e42f2c -> a65766bf0
[SPARK-5826][Streaming] Fix Configuration not serializable problem
Author: jerryshao
Closes #4612 from jerryshao/SPARK-5826 and squashes the following commits:
7ec71db [jerryshao] Remove transient for conf statemen