spark git commit: [SPARK-8968] [SQL] [HOT-FIX] Fix scala 2.11 build.

2016-01-20 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 015c8efb3 -> d60f8d74a [SPARK-8968] [SQL] [HOT-FIX] Fix scala 2.11 build. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d60f8d74 Tree:

spark git commit: [SPARK-12204][SPARKR] Implement drop method for DataFrame in SparkR.

2016-01-20 Thread shivaram
Repository: spark Updated Branches: refs/heads/master d7415991a -> 1b2a918e5 [SPARK-12204][SPARKR] Implement drop method for DataFrame in SparkR. Author: Sun Rui Closes #10201 from sun-rui/SPARK-12204. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:
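
For reference, the new SparkR method mirrors the `drop` that the Scala DataFrame API already exposes; a minimal sketch of the analogous Scala usage, assuming an existing `sqlContext` (data and column names are hypothetical):
```scala
// Dropping a column returns a new DataFrame without it; the original is untouched.
val df = sqlContext.createDataFrame(Seq((1, "a", true), (2, "b", false)))
  .toDF("id", "name", "flag")
val trimmed = df.drop("flag")
trimmed.printSchema() // id, name
```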

spark git commit: [SPARK-8968][SQL] external sort by the partition columns when dynamically partitioning to optimize the memory overhead

2016-01-20 Thread rxin
Repository: spark Updated Branches: refs/heads/master b362239df -> 015c8efb3 [SPARK-8968][SQL] external sort by the partition columns when dynamically partitioning to optimize the memory overhead Currently the hash-based writer for dynamic partitioning shows poor performance on big data and causes many
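
A sketch of the write path this change targets, assuming an existing DataFrame `df` with `year`/`month` columns (path and column names hypothetical). Each distinct partition-column value becomes an output directory, so sorting rows by those columns first lets the writer keep one file open at a time instead of one per partition:
```scala
// Dynamic-partition write: one directory per (year, month) combination.
df.write
  .partitionBy("year", "month")
  .mode("append")
  .parquet("/tmp/events")
```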

spark git commit: [SPARK-12910] Fixes: R version for installing sparkR

2016-01-20 Thread shivaram
Repository: spark Updated Branches: refs/heads/master d60f8d74a -> d7415991a [SPARK-12910] Fixes: R version for installing sparkR Testing code: ``` $ ./install-dev.sh USING R_HOME = /usr/bin ERROR: this R is version 2.15.1, package 'SparkR' requires R >= 3.0 ``` Using the new argument: ```

spark git commit: [SPARK-12881] [SQL] subexpression elimination in mutable projection

2016-01-20 Thread davies
Repository: spark Updated Branches: refs/heads/master 753b19451 -> 8e4f894e9 [SPARK-12881] [SQL] subexpression elimination in mutable projection Author: Davies Liu Closes #10814 from davies/mutable_subexpr. Project: http://git-wip-us.apache.org/repos/asf/spark/repo
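
A hypothetical query shape that benefits: both output columns share the subexpression `(a + b)`, which subexpression elimination lets the mutable projection evaluate once per row (assuming a DataFrame `df` with numeric columns `a` and `b`):
```scala
// Without elimination, (a + b) is computed twice per row; with it, once.
val out = df.selectExpr("(a + b) * 2 AS doubled", "(a + b) + 1 AS bumped")
```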

spark git commit: [SPARK-6519][ML] Add spark.ml API for bisecting k-means

2016-01-20 Thread meng
Repository: spark Updated Branches: refs/heads/master 8e4f894e9 -> 9376ae723 [SPARK-6519][ML] Add spark.ml API for bisecting k-means Author: Yu ISHIKAWA Closes #9604 from yu-iskw/SPARK-6519. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:
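
A minimal sketch of the new spark.ml API, assuming `dataset` is a DataFrame with the conventional `features` vector column (parameter values are illustrative):
```scala
import org.apache.spark.ml.clustering.BisectingKMeans

val bkm = new BisectingKMeans()
  .setK(4)        // number of leaf clusters
  .setMaxIter(20)
  .setSeed(1L)
val model = bkm.fit(dataset)
model.clusterCenters.foreach(println)
```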

spark git commit: [SPARK-12230][ML] WeightedLeastSquares.fit() should handle division by zero properly if standard deviation of target variable is zero.

2016-01-20 Thread meng
Repository: spark Updated Branches: refs/heads/master 9bb35c5b5 -> 9753835cf [SPARK-12230][ML] WeightedLeastSquares.fit() should handle division by zero properly if standard deviation of target variable is zero. This fixes the behavior of WeightedLeastSquares.fit() when the standard
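
The degenerate case the fix guards against, sketched via the public estimator that delegates to WeightedLeastSquares (the constant-label `dataset` is hypothetical): when every label is identical, the label's standard deviation is zero, so naively standardizing the target divides by zero.
```scala
import org.apache.spark.ml.regression.LinearRegression

// "normal" selects the normal-equation solver backed by WeightedLeastSquares.
val lr = new LinearRegression().setSolver("normal")
val model = lr.fit(dataset) // assume every row has label = 5.0
```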

spark git commit: [SPARK-12847][CORE][STREAMING] Remove StreamingListenerBus and post all Streaming events to the same thread as Spark events

2016-01-20 Thread tdas
Repository: spark Updated Branches: refs/heads/master e3727c409 -> 944fdadf7 [SPARK-12847][CORE][STREAMING] Remove StreamingListenerBus and post all Streaming events to the same thread as Spark events Including the following changes: 1. Add StreamingListenerForwardingBus to
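
For context, the events affected are those delivered to `StreamingListener`s; a small sketch of registering one, assuming an existing StreamingContext `ssc`:
```scala
import org.apache.spark.streaming.scheduler.{StreamingListener, StreamingListenerBatchCompleted}

// After this change, these callbacks are posted on the same listener bus
// as core Spark events rather than a separate StreamingListenerBus thread.
ssc.addStreamingListener(new StreamingListener {
  override def onBatchCompleted(batch: StreamingListenerBatchCompleted): Unit = {
    println(s"batch delay: ${batch.batchInfo.totalDelay.getOrElse(-1L)} ms")
  }
})
```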

spark git commit: [SPARK-12898] Consider having dummyCallSite for HiveTableScan

2016-01-20 Thread rxin
Repository: spark Updated Branches: refs/heads/master e75e340a4 -> ab4a6bfd1 [SPARK-12898] Consider having dummyCallSite for HiveTableScan Currently, HiveTableScan runs with getCallSite, which is really expensive and shows up when scanning through large tables with partitions (e.g. TPC-DS)
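
The idea, sketched outside Spark's private internals (class and method names below are illustrative, not Spark's actual API): computing a real call site walks the stack on every scan, so a precomputed dummy can be installed in a thread-local for hot paths like HiveTableScan.
```scala
case class CallSite(shortForm: String, longForm: String)

object CallSiteHolder {
  private val current = new ThreadLocal[CallSite]

  // The expensive alternative would walk the stack trace here.
  def get: CallSite = Option(current.get).getOrElse(CallSite("unknown", "unknown"))

  // Install a cheap, precomputed call site for the duration of `body`.
  def withDummy[T](dummy: CallSite)(body: => T): T = {
    current.set(dummy)
    try body finally current.remove()
  }
}
```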

spark git commit: [SPARK-11295][PYSPARK] Add packages to JUnit output for Python tests

2016-01-20 Thread meng
Repository: spark Updated Branches: refs/heads/master 9376ae723 -> 9bb35c5b5 [SPARK-11295][PYSPARK] Add packages to JUnit output for Python tests This is #9263 from gliptak (improving grouping/display of test case results) with a small fix of bisecting k-means unit test. Author: Gábor

spark git commit: [SPARK-12616][SQL] Making Logical Operator `Union` Support Arbitrary Number of Children

2016-01-20 Thread rxin
Repository: spark Updated Branches: refs/heads/master b7d74a602 -> 8f90c1518 [SPARK-12616][SQL] Making Logical Operator `Union` Support Arbitrary Number of Children The existing `Union` logical operator only supports two children. Thus, this adds a new logical operator `Unions` which can have
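
The user-facing API is unchanged; what changes is the plan shape. A chained union like the hypothetical one below previously produced a left-deep tree of binary `Union` nodes, whereas an n-ary operator can hold all children in one node:
```scala
// Logical plan before: Union(Union(df1, df2), df3)
// Logical plan after:  a single node with children Seq(df1, df2, df3)
val combined = df1.unionAll(df2).unionAll(df3)
```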

spark git commit: [SPARK-12888][SQL] benchmark the new hash expression

2016-01-20 Thread rxin
Repository: spark Updated Branches: refs/heads/master 8f90c1518 -> f3934a8d6 [SPARK-12888][SQL] benchmark the new hash expression Benchmark it on 4 different schemas, the result: ``` Intel(R) Core(TM) i7-4960HQ CPU @ 2.60GHz Hash For simple: Avg Time(ms) Avg Rate(M/s)
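
This is not Spark's benchmark harness, but a minimal sketch of the same measurement shape: time a Murmur3-style hash over many values and derive an average rate (sizes and data are arbitrary):
```scala
import scala.util.hashing.MurmurHash3

val rows = Array.tabulate(1 << 20)(i => s"row-$i")
val start = System.nanoTime()
var checksum = 0
var i = 0
while (i < rows.length) { checksum ^= MurmurHash3.stringHash(rows(i)); i += 1 }
val ms = (System.nanoTime() - start) / 1e6
println(f"${rows.length} hashes in $ms%.1f ms (${rows.length / ms / 1000}%.1f M/s), checksum $checksum")
```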

spark git commit: [SPARK-12848][SQL] Change parsed decimal literal datatype from Double to Decimal

2016-01-20 Thread rxin
Repository: spark Updated Branches: refs/heads/master f3934a8d6 -> 101732793 [SPARK-12848][SQL] Change parsed decimal literal datatype from Double to Decimal The current parser turns a decimal literal, for example ```12.1```, into a Double. The problem with this approach is that we convert
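
A two-line illustration of why the Double representation is lossy while Decimal is exact:
```scala
println(0.1 + 0.2)                             // 0.30000000000000004 as Double
println(BigDecimal("0.1") + BigDecimal("0.2")) // 0.3 as exact decimal
```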

spark git commit: [SPARK-12797] [SQL] Generated TungstenAggregate (without grouping keys)

2016-01-20 Thread davies
Repository: spark Updated Branches: refs/heads/master 101732793 -> b362239df [SPARK-12797] [SQL] Generated TungstenAggregate (without grouping keys) As discussed in #10786, the generated TungstenAggregate does not support imperative functions. For a query ``` sqlContext.range(10).filter("id
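
The query shape in question is an aggregate with no grouping keys, where the whole input collapses to a single row; a minimal sketch in the same style as the commit's example, assuming an existing `sqlContext`:
```scala
// No GROUP BY: generated code can keep the running sum/count in local variables.
val agg = sqlContext.range(1000000).selectExpr("sum(id)", "count(1)")
agg.show()
```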

spark git commit: [SPARK-7799][SPARK-12786][STREAMING] Add "streaming-akka" project

2016-01-20 Thread tdas
Repository: spark Updated Branches: refs/heads/master 944fdadf7 -> b7d74a602 [SPARK-7799][SPARK-12786][STREAMING] Add "streaming-akka" project Include the following changes: 1. Add "streaming-akka" project and org.apache.spark.streaming.akka.AkkaUtils for creating an actorStream 2. Remove
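
A hedged sketch of what using the extracted module might look like (the actor class, stream name, and exact `createStream` signature are assumptions based on this summary; `ssc` is an existing StreamingContext):
```scala
import akka.actor.Props
import org.apache.spark.streaming.akka.{ActorReceiver, AkkaUtils}

// Hypothetical receiver actor: anything it stores becomes stream data.
class EchoActor extends ActorReceiver {
  def receive = { case s: String => store(s) }
}

val lines = AkkaUtils.createStream[String](ssc, Props[EchoActor], "echo-actor")
```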

svn commit: r1725819 - in /spark: ./ site/ site/news/ site/releases/

2016-01-20 Thread matei
Author: matei Date: Wed Jan 20 21:40:26 2016 New Revision: 1725819 URL: http://svn.apache.org/viewvc?rev=1725819&view=rev Log: Add TM symbol to download page Modified: spark/downloads.md spark/site/documentation.html spark/site/downloads.html spark/site/news/index.html

spark git commit: [SPARK-12921] Use SparkHadoopUtil reflection in SpecificParquetRecordReaderBase

2016-01-20 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.6 962e618ec -> 40fa21856 [SPARK-12921] Use SparkHadoopUtil reflection in SpecificParquetRecordReaderBase It looks like there's one place left in the codebase, SpecificParquetRecordReaderBase, where we didn't use SparkHadoopUtil's
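
The general shape of the reflection pattern being applied, sketched with illustrative names (not the actual SparkHadoopUtil calls): resolve a method at runtime so the code still loads against Hadoop versions where that method may be absent.
```scala
// Returns None instead of failing to link when the method does not exist.
def invokeIfPresent(target: AnyRef, methodName: String): Option[AnyRef] =
  try Some(target.getClass.getMethod(methodName).invoke(target))
  catch { case _: NoSuchMethodException => None }
```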