[2/3] spark git commit: [SPARK-4233] [SPARK-4367] [SPARK-3947] [SPARK-3056] [SQL] Aggregation Improvement

2015-07-21 Thread rxin
http://git-wip-us.apache.org/repos/asf/spark/blob/c03299a1/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/sortBasedIterators.scala -- diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/

[3/3] spark git commit: [SPARK-4233] [SPARK-4367] [SPARK-3947] [SPARK-3056] [SQL] Aggregation Improvement

2015-07-21 Thread rxin
[SPARK-4233] [SPARK-4367] [SPARK-3947] [SPARK-3056] [SQL] Aggregation Improvement This is the first PR for the aggregation improvement, which is tracked by https://issues.apache.org/jira/browse/SPARK-4366 (umbrella JIRA). This PR contains work for its subtasks, SPARK-3056, SPARK-3947, SPARK-423

[1/3] spark git commit: [SPARK-4233] [SPARK-4367] [SPARK-3947] [SPARK-3056] [SQL] Aggregation Improvement

2015-07-21 Thread rxin
Repository: spark Updated Branches: refs/heads/master f4785f5b8 -> c03299a18 http://git-wip-us.apache.org/repos/asf/spark/blob/c03299a1/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/AggregationQuerySuite.scala --

spark git commit: [SPARK-9232] [SQL] Duplicate code in JSONRelation

2015-07-21 Thread rxin
Repository: spark Updated Branches: refs/heads/master 63f4bcc73 -> f4785f5b8 [SPARK-9232] [SQL] Duplicate code in JSONRelation Author: Andrew Or Closes #7576 from andrewor14/clean-up-json-relation and squashes the following commits: ea80803 [Andrew Or] Clean up duplicate code Project: ht

spark git commit: [SPARK-9121] [SPARKR] Get rid of the warnings about `no visible global function definition` in SparkR

2015-07-21 Thread shivaram
Repository: spark Updated Branches: refs/heads/master a4c83cb1e -> 63f4bcc73 [SPARK-9121] [SPARKR] Get rid of the warnings about `no visible global function definition` in SparkR [[SPARK-9121] Get rid of the warnings about `no visible global function definition` in SparkR - ASF JIRA](https:

spark git commit: [SPARK-9154][SQL] Rename formatString to format_string.

2015-07-21 Thread rxin
Repository: spark Updated Branches: refs/heads/master d4c7a7a36 -> a4c83cb1e [SPARK-9154][SQL] Rename formatString to format_string. Also make format_string the canonical form, rather than printf. Author: Reynold Xin Closes #7579 from rxin/format_strings and squashes the following commits:

spark git commit: [SPARK-9154] [SQL] codegen StringFormat

2015-07-21 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master c07838b5a -> d4c7a7a36 [SPARK-9154] [SQL] codegen StringFormat Jira: https://issues.apache.org/jira/browse/SPARK-9154 Fixes a bug in #7546. marmbrus I can't reopen the other PR, because I didn't close it. Can you trigger Jenkins? Author:

spark git commit: [SPARK-9206] [SQL] Fix HiveContext classloading for GCS connector.

2015-07-21 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 60c0ce134 -> c07838b5a [SPARK-9206] [SQL] Fix HiveContext classloading for GCS connector. IsolatedClientLoader.isSharedClass includes all of com.google.*, presumably for Guava, protobuf, and/or other shared Google libraries, but needs to c

[1/3] spark git commit: [SPARK-8906][SQL] Move all internal data source classes into execution.datasources.

2015-07-21 Thread rxin
Repository: spark Updated Branches: refs/heads/master 9ba7c64de -> 60c0ce134 http://git-wip-us.apache.org/repos/asf/spark/blob/60c0ce13/sql/core/src/main/scala/org/apache/spark/sql/sources/commands.scala -- diff --git a/sql/co

[3/3] spark git commit: [SPARK-8906][SQL] Move all internal data source classes into execution.datasources.

2015-07-21 Thread rxin
[SPARK-8906][SQL] Move all internal data source classes into execution.datasources. This way, the sources package contains only public facing interfaces. Author: Reynold Xin Closes #7565 from rxin/move-ds and squashes the following commits: 7661aff [Reynold Xin] Mima 9d5196a [Reynold Xin] Rea

[2/3] spark git commit: [SPARK-8906][SQL] Move all internal data source classes into execution.datasources.

2015-07-21 Thread rxin
http://git-wip-us.apache.org/repos/asf/spark/blob/60c0ce13/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/ddl.scala -- diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/ddl.scala

spark git commit: [SPARK-8357] Fix unsafe memory leak on empty inputs in GeneratedAggregate

2015-07-21 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 87d890cc1 -> 9ba7c64de [SPARK-8357] Fix unsafe memory leak on empty inputs in GeneratedAggregate This patch fixes a managed memory leak in GeneratedAggregate. The leak occurs when the unsafe aggregation path is used to perform grouped agg

spark git commit: Revert "[SPARK-9154] [SQL] codegen StringFormat"

2015-07-21 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 89db3c0b6 -> 87d890cc1 Revert "[SPARK-9154] [SQL] codegen StringFormat" This reverts commit 7f072c3d5ec50c65d76bd9f28fac124fce96a89e. Revert #7546 Author: Michael Armbrust Closes #7570 from marmbrus/revert9154 and squashes the following

spark git commit: [SPARK-8481] [MLLIB] GaussianMixtureModel.predict, GaussianMixtureModel.predictSoft variants for a single vector

2015-07-21 Thread jkbradley
Repository: spark Updated Branches: refs/heads/branch-1.4 a292c492a -> 07f778978 [SPARK-8481] [MLLIB] GaussianMixtureModel.predict, GaussianMixtureModel.predictSoft variants for a single vector This PR adds GaussianMixtureModel.predict & GaussianMixtureModel.predictSoft variants for a single

spark git commit: [SPARK-5989] [MLLIB] Model save/load for LDA

2015-07-21 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master 7f072c3d5 -> 89db3c0b6 [SPARK-5989] [MLLIB] Model save/load for LDA Add support for saving and loading both the local and distributed versions of LDA. Author: MechCoder Closes #6948 from MechCoder/lda_save_load and squashes the following co

spark git commit: [SPARK-5423] [CORE] Register a TaskCompletionListener to make sure release all resources

2015-07-21 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 4f7f1ee37 -> d45355ee2 [SPARK-5423] [CORE] Register a TaskCompletionListener to make sure release all resources Make `DiskMapIterator.cleanup` idempotent and register a TaskCompletionListener to make sure `cleanup` is called. Author: zsxwing

spark git commit: [SPARK-9154] [SQL] codegen StringFormat

2015-07-21 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master d45355ee2 -> 7f072c3d5 [SPARK-9154] [SQL] codegen StringFormat Jira: https://issues.apache.org/jira/browse/SPARK-9154 Author: Tarek Auel Closes #7546 from tarekauel/SPARK-9154 and squashes the following commits: a943d3e [Tarek Auel] [SP

spark git commit: [SPARK-4598] [WEBUI] Task table pagination for the Stage page

2015-07-21 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 31954910d -> 4f7f1ee37 [SPARK-4598] [WEBUI] Task table pagination for the Stage page This PR adds pagination for the task table to solve the scalability issue of the stage page. Here is the initial screenshot: https://cloud.githubuserconte

spark git commit: [SPARK-7171] Added a method to retrieve metrics sources in TaskContext

2015-07-21 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 9a4fd875b -> 31954910d [SPARK-7171] Added a method to retrieve metrics sources in TaskContext Author: Jacek Lewandowski Closes #5805 from jacek-lewandowski/SPARK-7171 and squashes the following commits: ed20bda [Jacek Lewandowski] SPARK

spark git commit: [SPARK-9128] [CORE] Get outerclasses and objects with only one method calling in ClosureCleaner

2015-07-21 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master f67da43c3 -> 9a4fd875b [SPARK-9128] [CORE] Get outerclasses and objects with only one method calling in ClosureCleaner JIRA: https://issues.apache.org/jira/browse/SPARK-9128 Currently, in `ClosureCleaner`, the outerclasses and objects are

spark git commit: [SPARK-9036] [CORE] SparkListenerExecutorMetricsUpdate messages not included in JsonProtocol

2015-07-21 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 6592a6058 -> f67da43c3 [SPARK-9036] [CORE] SparkListenerExecutorMetricsUpdate messages not included in JsonProtocol This PR implements a JSON serializer and deserializer in the JSONProtocol to handle the (de)serialization of SparkListener

spark git commit: [SPARK-9193] Avoid assigning tasks to "lost" executor(s)

2015-07-21 Thread irashid
Repository: spark Updated Branches: refs/heads/branch-1.4 1782c0ef9 -> a292c492a [SPARK-9193] Avoid assigning tasks to "lost" executor(s) Currently, when executors are killed by dynamic allocation, tasks are sometimes mis-assigned to the lost executors. Such mis-assignment caus

spark git commit: [SPARK-9193] Avoid assigning tasks to "lost" executor(s)

2015-07-21 Thread irashid
Repository: spark Updated Branches: refs/heads/master df4ddb312 -> 6592a6058 [SPARK-9193] Avoid assigning tasks to "lost" executor(s) Currently, when executors are killed by dynamic allocation, tasks are sometimes mis-assigned to the lost executors. Such mis-assignment causes t

spark git commit: [SPARK-8915] [DOCUMENTATION, MLLIB] Added @since tags to mllib.classification

2015-07-21 Thread meng
Repository: spark Updated Branches: refs/heads/master be5c5d374 -> df4ddb312 [SPARK-8915] [DOCUMENTATION, MLLIB] Added @since tags to mllib.classification Created since tags for methods in mllib.classification Author: petz2000 Closes #7371 from petz2000/add_since_mllib.classification and sq

spark git commit: [SPARK-9081] [SPARK-9168] [SQL] nanvl & dropna/fillna supporting nan as well

2015-07-21 Thread davies
Repository: spark Updated Branches: refs/heads/master f5b6dc5e3 -> be5c5d374 [SPARK-9081] [SPARK-9168] [SQL] nanvl & dropna/fillna supporting nan as well JIRA: https://issues.apache.org/jira/browse/SPARK-9081 https://issues.apache.org/jira/browse/SPARK-9168 This PR targets two modifications

spark git commit: [SPARK-8401] [BUILD] Scala version switching build enhancements

2015-07-21 Thread srowen
Repository: spark Updated Branches: refs/heads/master 6364735bc -> f5b6dc5e3 [SPARK-8401] [BUILD] Scala version switching build enhancements These commits address a few minor issues in the Scala cross-version support in the build: 1. Correct two missing `${scala.binary.version}` pom file s

spark git commit: [SPARK-8875] Remove BlockStoreShuffleFetcher class

2015-07-21 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master ae230596b -> 6364735bc [SPARK-8875] Remove BlockStoreShuffleFetcher class The shuffle code has gotten increasingly difficult to read as it has evolved, and many classes have evolved significantly since they were originally created. The Bl

spark git commit: [SPARK-9173][SQL] UnionPushDown should also support Intersect and Except

2015-07-21 Thread rxin
Repository: spark Updated Branches: refs/heads/master 560c658a7 -> ae230596b [SPARK-9173][SQL] UnionPushDown should also support Intersect and Except JIRA: https://issues.apache.org/jira/browse/SPARK-9173 Author: Yijie Shen Closes #7540 from yjshen/union_pushdown and squashes the following c

spark git commit: [SPARK-8230][SQL] Add array/map size method

2015-07-21 Thread rxin
Repository: spark Updated Branches: refs/heads/master 8c8f0ef59 -> 560c658a7 [SPARK-8230][SQL] Add array/map size method Pull Request for: https://issues.apache.org/jira/browse/SPARK-8230 Primary issue resolved is to implement array/map size for Spark SQL. Code is ready for review by a commi

spark git commit: [SPARK-8255] [SPARK-8256] [SQL] Add regex_extract/regex_replace

2015-07-21 Thread davies
Repository: spark Updated Branches: refs/heads/master d38c5029a -> 8c8f0ef59 [SPARK-8255] [SPARK-8256] [SQL] Add regex_extract/regex_replace Add expressions `regex_extract` & `regex_replace` Author: Cheng Hao Closes #7468 from chenghao-intel/regexp and squashes the following commits: e5ea4

spark git commit: [SPARK-9100] [SQL] Adds DataFrame reader/writer shortcut methods for ORC

2015-07-21 Thread lian
Repository: spark Updated Branches: refs/heads/master 1ddd0f2f1 -> d38c5029a [SPARK-9100] [SQL] Adds DataFrame reader/writer shortcut methods for ORC This PR adds DataFrame reader/writer shortcut methods for ORC in both Scala and Python. Author: Cheng Lian Closes #7444 from liancheng/spark