Repository: spark
Updated Branches:
refs/heads/master 701fb5052 -> 1b6a5d4af
[SPARK-11493] remove bitset from BytesToBytesMap
Since each page begins with a 4-byte record count, a record address can never
be zero, so we do not need the bitset.
For performance concerns, the
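The idea can be illustrated with a hypothetical simplification (not the actual BytesToBytesMap code): because no valid record address is ever zero, the address array itself can mark empty slots, and the separate bitset becomes redundant.

```java
// Hypothetical sketch of the SPARK-11493 idea: since every page starts with a
// 4-byte record count, no record address can be 0, so address == 0 can itself
// mean "empty slot" -- no separate bitset is needed.
public class AddressTable {
    private final long[] addresses; // 0 means "empty"

    public AddressTable(int capacity) {
        this.addresses = new long[capacity]; // Java zero-initializes arrays
    }

    public boolean isDefined(int slot) {
        return addresses[slot] != 0; // replaces bitset.isSet(slot)
    }

    public void put(int slot, long address) {
        if (address == 0) {
            throw new IllegalArgumentException("address 0 is reserved for empty");
        }
        addresses[slot] = address;
    }

    public static void main(String[] args) {
        AddressTable t = new AddressTable(8);
        t.put(3, 4L); // the first record starts after the 4-byte count, so >= 4
        System.out.println(t.isDefined(3)); // true
        System.out.println(t.isDefined(0)); // false
    }
}
```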
Repository: spark
Updated Branches:
refs/heads/master abf5e4285 -> d19f4fda6
Repository: spark
Updated Branches:
refs/heads/master de289bf27 -> abf5e4285
[SPARK-11504][SQL] API audit for distributeBy and localSort
1. Renamed localSort -> sortWithinPartitions to avoid the ambiguity of "local"
2. Renamed distributeBy -> repartition to match the existing repartition.
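What "sort within partitions" means can be shown with a hypothetical, Spark-free illustration: each partition's rows are ordered locally, but no ordering is imposed across partitions (unlike a global sort).

```java
import java.util.*;

// Hypothetical illustration (not Spark code) of the semantics behind the
// sortWithinPartitions name: sort each partition independently; rows are
// NOT ordered across partition boundaries.
public class SortWithinPartitionsDemo {
    public static List<List<Integer>> sortWithinPartitions(List<List<Integer>> partitions) {
        List<List<Integer>> out = new ArrayList<>();
        for (List<Integer> p : partitions) {
            List<Integer> sorted = new ArrayList<>(p);
            Collections.sort(sorted); // local sort only
            out.add(sorted);
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<Integer>> parts = Arrays.asList(
            Arrays.asList(5, 1, 9),
            Arrays.asList(4, 2));
        // prints [[1, 5, 9], [2, 4]] -- sorted per partition, not globally
        System.out.println(sortWithinPartitions(parts));
    }
}
```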
Author:
Repository: spark
Updated Branches:
refs/heads/master d19f4fda6 -> 701fb5052
[SPARK-10949] Update Snappy version to 1.1.2
This is an updated version of #8995 by a-roberts. Original description follows:
Snappy now supports concatenation of serialized streams; this patch contains a
version
Repository: spark
Updated Branches:
refs/heads/master 1b6a5d4af -> 411ff6afb
[SPARK-10028][MLLIB][PYTHON] Add Python API for PrefixSpan
Author: Yu ISHIKAWA
Closes #9469 from yu-iskw/SPARK-10028.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit:
Repository: spark
Updated Branches:
refs/heads/master a752ddad7 -> d0b563396
[SPARK-11307] Reduce memory consumption of OutputCommitCoordinator
OutputCommitCoordinator uses a map in a place where an array would suffice,
increasing its memory consumption for result stages with millions of
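The memory argument can be sketched (hypothetically, not the actual OutputCommitCoordinator code): when keys are dense partition ids 0..numPartitions-1, a plain int array replaces a map, avoiding per-entry boxing and hash-table node overhead for stages with millions of partitions.

```java
import java.util.Arrays;

// Hypothetical sketch of the SPARK-11307 fix: partition ids are dense
// integers, so an array indexed by partition id can replace a
// Map<Integer, AttemptId>, with far smaller per-partition overhead.
public class CommitAuthorizations {
    private static final int NO_AUTHORIZED = -1;
    private final int[] authorizedAttempt; // index = partition id

    public CommitAuthorizations(int numPartitions) {
        authorizedAttempt = new int[numPartitions];
        Arrays.fill(authorizedAttempt, NO_AUTHORIZED);
    }

    public boolean canCommit(int partition, int attempt) {
        if (authorizedAttempt[partition] == NO_AUTHORIZED) {
            authorizedAttempt[partition] = attempt; // first attempt wins
            return true;
        }
        return authorizedAttempt[partition] == attempt;
    }

    public static void main(String[] args) {
        CommitAuthorizations c = new CommitAuthorizations(4);
        System.out.println(c.canCommit(0, 1)); // true: first attempt authorized
        System.out.println(c.canCommit(0, 2)); // false: partition already claimed
    }
}
```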
Repository: spark
Updated Branches:
refs/heads/master b6e0a5ae6 -> ce5e6a284
[SPARK-11491] Update build to use Scala 2.10.5
Spark should build against Scala 2.10.5, since that release includes a Scaladoc
fix that will unblock doc snapshot publishing:
https://issues.scala-lang.org/browse/SI-8479
Repository: spark
Updated Branches:
refs/heads/master ce5e6a284 -> a752ddad7
[SPARK-11398] [SQL] unnecessary def dialectClassName in HiveContext, and
misleading dialect conf at the start of spark-sql
1. def dialectClassName in HiveContext is unnecessary.
In HiveContext, if conf.dialect ==
Repository: spark
Updated Branches:
refs/heads/master 411ff6afb -> b6e0a5ae6
[SPARK-11510][SQL] Remove SQL aggregation tests for higher order statistics
We have some aggregate function tests in both DataFrameAggregateSuite and
SQLQuerySuite. The two have almost the same coverage and we
Repository: spark
Updated Branches:
refs/heads/master d0b563396 -> 81498dd5c
[SPARK-11425] [SPARK-11486] Improve hybrid aggregation
After aggregation, the dataset could be smaller than the inputs, so it's better to
do hash-based aggregation for all inputs, then use sort-based aggregation to
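The hybrid strategy can be sketched hypothetically (all names here are illustrative, not Spark's): aggregate into a hash map until it hits a size limit, then spill a key-sorted run; a final sort-merge pass over the runs (elided) produces the result.

```java
import java.util.*;

// Hypothetical sketch of hybrid aggregation: hash-aggregate until a memory
// limit, spill sorted partial results, and leave a final sort-merge of the
// spilled runs to a later pass (not shown).
public class HybridAggSketch {
    // Drain the hash map into a key-sorted run, ready for sort-merge later.
    public static List<Map.Entry<String, Integer>> spillSorted(Map<String, Integer> hash) {
        List<Map.Entry<String, Integer>> run = new ArrayList<>(hash.entrySet());
        run.sort(Map.Entry.comparingByKey());
        hash.clear();
        return run;
    }

    public static void main(String[] args) {
        Map<String, Integer> hash = new HashMap<>();
        int limit = 2; // stand-in for a memory threshold
        List<List<Map.Entry<String, Integer>>> runs = new ArrayList<>();
        for (String key : new String[]{"b", "a", "b", "c", "a"}) {
            hash.merge(key, 1, Integer::sum); // hash-based partial aggregation
            if (hash.size() >= limit) runs.add(spillSorted(hash));
        }
        if (!hash.isEmpty()) runs.add(spillSorted(hash));
        System.out.println(runs); // sorted runs of partial counts
    }
}
```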
Repository: spark
Updated Branches:
refs/heads/master e328b69c3 -> 820064e61
[SPARK-11380][DOCS] Replace example code in mllib-frequent-pattern-mining.md
using include_example
Author: Pravin Gadakh
Closes #9340 from
Repository: spark
Updated Branches:
refs/heads/master c09e51398 -> e328b69c3
[SPARK-9492][ML][R] LogisticRegression in R should provide model statistics
Like ml ```LinearRegression```, ```LogisticRegression``` should provide a
training summary including feature names and their coefficients.
Repository: spark
Updated Branches:
refs/heads/master e0fc9c7e5 -> 3bd6f5d2a
[SPARK-11490][SQL] variance should alias var_samp instead of var_pop.
stddev is an alias for stddev_samp. variance should be consistent with stddev.
Also took the chance to remove internal Stddev and Variance, and
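The distinction being fixed is the divisor: var_pop divides the squared deviations by n (population variance), while var_samp divides by n - 1 (sample variance). A small self-contained check:

```java
// Illustration of the SPARK-11490 distinction: `variance` should alias
// var_samp (divide by n - 1), mirroring how `stddev` aliases stddev_samp,
// rather than var_pop (divide by n).
public class VarianceDemo {
    static double mean(double[] xs) {
        double s = 0;
        for (double x : xs) s += x;
        return s / xs.length;
    }

    static double sumSqDev(double[] xs) {
        double m = mean(xs), s = 0;
        for (double x : xs) s += (x - m) * (x - m);
        return s;
    }

    static double varPop(double[] xs)  { return sumSqDev(xs) / xs.length; }
    static double varSamp(double[] xs) { return sumSqDev(xs) / (xs.length - 1); }

    public static void main(String[] args) {
        double[] xs = {1, 2, 3, 4}; // mean 2.5, sum of squared deviations 5.0
        System.out.println(varPop(xs));  // 1.25
        System.out.println(varSamp(xs)); // 5/3 ≈ 1.667
    }
}
```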
Repository: spark
Updated Branches:
refs/heads/master 8790ee6d6 -> 27feafccb
[SPARK-11235][NETWORK] Add ability to stream data using network lib.
The current interface used to fetch shuffle data is not very efficient for
large buffers; it requires the receiver to buffer the entirety of the
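The motivation can be illustrated with a generic (non-Spark) sketch: instead of materializing an entire payload in memory before handing it over, data is pumped through a fixed-size chunk buffer, so memory use is bounded regardless of payload size.

```java
import java.io.*;

// Hypothetical illustration of the streaming motivation in SPARK-11235:
// copy arbitrarily large payloads through a small fixed-size buffer
// instead of buffering the entire payload at once.
public class ChunkedCopy {
    public static long copy(InputStream in, OutputStream out, int chunkSize)
            throws IOException {
        byte[] chunk = new byte[chunkSize]; // bounded memory, any payload size
        long total = 0;
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = new byte[1 << 20]; // a 1 MiB "large buffer"
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(payload), sink, 8192);
        System.out.println(copied); // 1048576
    }
}
```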
Repository: spark
Updated Branches:
refs/heads/master 3bd6f5d2a -> 987df4bfc
Closes #9464
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/987df4bf
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/987df4bf
Diff:
Repository: spark
Updated Branches:
refs/heads/master 987df4bfc -> de289bf27
[SPARK-10304][SQL] Following up checking valid dir structure for partition
discovery
This patch follows up #8840.
Author: Liang-Chi Hsieh
Closes #9459 from
Repository: spark
Updated Branches:
refs/heads/master 2692bdb7d -> 8aff36e91
[SPARK-2960][DEPLOY] Support executing Spark from symlinks (reopen)
This PR is based on the work of roji to support running Spark scripts from
symlinks. Thanks for the great work, roji. Would you mind taking a look
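The underlying problem can be sketched (hypothetically, the actual fix lives in the shell launcher scripts): when a launcher is invoked through a symlink, relative lookups such as conf/ or jars/ must be taken from the real installation directory, so the invoked path has to be resolved first.

```java
import java.io.IOException;
import java.nio.file.*;

// Hypothetical sketch of the SPARK-2960 problem: resolve a possibly
// symlinked launcher path to the real installation directory before
// doing any relative lookups.
public class ResolveLauncher {
    public static Path realHome(Path invoked) throws IOException {
        // toRealPath() follows symlinks to the actual script file
        return invoked.toRealPath().getParent();
    }

    public static void main(String[] args) throws IOException {
        Path home = Files.createTempDirectory("spark-home");
        Path script = Files.createFile(home.resolve("spark-submit"));
        Path link = Files.createSymbolicLink(
            Files.createTempDirectory("bin").resolve("spark-submit"), script);
        // resolving the symlink recovers the real home directory
        System.out.println(realHome(link).equals(home.toRealPath()));
    }
}
```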
Repository: spark
Updated Branches:
refs/heads/master 8aff36e91 -> c09e51398
[SPARK-11442] Reduce numSlices for local metrics test of SparkListenerSuite
In the thread,
http://search-hadoop.com/m/q3RTtcQiFSlTxeP/test+failed+due+to+OOME=test+failed+due+to+OOME,
it was discussed that memory