spark git commit: [minor] update streaming linear algorithms

2015-02-03 Thread meng
Repository: spark Updated Branches: refs/heads/master 980764f3c -> 659329f9e [minor] update streaming linear algorithms Author: Xiangrui Meng Closes #4329 from mengxr/streaming-lr and squashes the following commits: 78731e1 [Xiangrui Meng] update streaming linear algorithms Project: http:

spark git commit: [SPARK-5551][SQL] Create type alias for SchemaRDD for source backward compatibility

2015-02-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master 37df33013 -> 523a93523 [SPARK-5551][SQL] Create type alias for SchemaRDD for source backward compatibility Author: Reynold Xin Closes #4327 from rxin/schemarddTypeAlias and squashes the following commits: e5a8ff3 [Reynold Xin] [SPARK-55
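
A minimal, self-contained sketch of the mechanism with placeholder names (Spark's actual alias lives in the `org.apache.spark.sql` package object, roughly as `type SchemaRDD = DataFrame`):

```scala
object Compat {
  class DataFrame // stand-in for org.apache.spark.sql.DataFrame
  // Old sources that name SchemaRDD keep compiling; the deprecated
  // alias simply points at the new class.
  @deprecated("use DataFrame", "1.3.0")
  type SchemaRDD = DataFrame
}
```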

spark git commit: [SQL][DataFrame] Remove DataFrameApi, ExpressionApi, and GroupedDataFrameApi

2015-02-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master 659329f9e -> 37df33013 [SQL][DataFrame] Remove DataFrameApi, ExpressionApi, and GroupedDataFrameApi They were there mostly for code review and easier check of the API. I don't think they need to be there anymore. Author: Reynold Xin Clo

spark git commit: [SPARK-5549] Define TaskContext interface in Scala.

2015-02-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master 523a93523 -> bebf4c42b [SPARK-5549] Define TaskContext interface in Scala. So the interface documentation shows up in ScalaDoc. Author: Reynold Xin Closes #4324 from rxin/TaskContext-scala and squashes the following commits: 2480a17 [Re
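
The interface is best seen from the task side; a usage sketch via the long-standing `TaskContext.get()` accessor (the RDD is a parameter here, names illustrative):

```scala
import org.apache.spark.TaskContext
import org.apache.spark.rdd.RDD

// Inside a running task, TaskContext exposes stage and partition metadata;
// defining the interface in Scala makes this surface show up in ScalaDoc.
def tagWithPartition[T](rdd: RDD[T]): RDD[(Int, T)] =
  rdd.mapPartitions { iter =>
    val pid = TaskContext.get().partitionId()
    iter.map(x => (pid, x))
  }
```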

spark git commit: Minor: Fix TaskContext deprecated annotations.

2015-02-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master bebf4c42b -> f7948f3f5 Minor: Fix TaskContext deprecated annotations. Made a mistake in https://github.com/apache/spark/pull/4324 Author: Reynold Xin Closes #4333 from rxin/taskcontext-deprecate and squashes the following commits: 61c44

spark git commit: [SQL] DataFrame API update

2015-02-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master f7948f3f5 -> 4204a1271 [SQL] DataFrame API update 1. Added Java-friendly versions of the expression operators (e.g. gt, geq) 2. Added JavaDoc for most operators 3. Simplified expression operators by having only one version of the function (
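
A hedged sketch of the pairing (the DataFrame `df` and its `age` column are assumptions): each symbolic Scala operator gains a named, Java-callable twin.

```scala
import org.apache.spark.sql.DataFrame

// The two filters are equivalent; gt/geq are the Java-friendly aliases
// of the symbolic > and >= Column operators.
def adults(df: DataFrame): DataFrame      = df.filter(df("age").gt(21))
def adultsScala(df: DataFrame): DataFrame = df.filter(df("age") > 21)
```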

Git Push Summary

2015-02-03 Thread pwendell
Repository: spark Updated Branches: refs/heads/branch-1.3 [created] 4204a1271

spark git commit: [SPARK-4987] [SQL] parquet timestamp type support

2015-02-03 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 4204a1271 -> 0c20ce69f [SPARK-4987] [SQL] parquet timestamp type support Author: Daoyuan Wang Closes #3820 from adrian-wang/parquettimestamp and squashes the following commits: b1e2a0d [Daoyuan Wang] fix for nanos 4dadef1 [Daoyuan Wang]
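
A round-trip sketch against the 1.3-era API (case class, path, and helper names are illustrative, not from the commit): with this change, a `TimestampType` column survives a write/read cycle through Parquet.

```scala
import java.sql.Timestamp
import org.apache.spark.sql.SQLContext

case class Event(id: Int, ts: Timestamp)

// Write a DataFrame with a timestamp column to Parquet and read it back;
// before this change the Parquet layer did not support TimestampType.
def roundTrip(sqlContext: SQLContext, events: Seq[Event], path: String): Unit = {
  import sqlContext.implicits._
  val df = sqlContext.sparkContext.parallelize(events).toDF()
  df.saveAsParquetFile(path)
  sqlContext.parquetFile(path).printSchema() // ts keeps TimestampType
}
```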

spark git commit: [SPARK-4987] [SQL] parquet timestamp type support

2015-02-03 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.3 4204a1271 -> 67d52207b [SPARK-4987] [SQL] parquet timestamp type support Author: Daoyuan Wang Closes #3820 from adrian-wang/parquettimestamp and squashes the following commits: b1e2a0d [Daoyuan Wang] fix for nanos 4dadef1 [Daoyuan W

spark git commit: [SPARK-5550] [SQL] Support the case insensitive for UDF

2015-02-03 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 0c20ce69f -> ca7a6cdff [SPARK-5550] [SQL] Support the case insensitive for UDF SQL in HiveContext should be case insensitive; however, the following query will fail. ```scala udf.register("random0", () => { Math.random()}) assert(sql("S
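
The snippet above is cut off mid-assertion; a plausible completion, assuming the standard Hive test table `src`, reads:

```scala
// Register with a lowercase name, call with an uppercase one; with this
// fix the lookup succeeds because UDF resolution ignores case.
udf.register("random0", () => { Math.random() })
assert(sql("SELECT RANDOM0() FROM src LIMIT 1").head().getDouble(0) >= 0.0)
```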

spark git commit: [SPARK-5550] [SQL] Support the case insensitive for UDF

2015-02-03 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.3 67d52207b -> 654c992a1 [SPARK-5550] [SQL] Support the case insensitive for UDF SQL in HiveContext should be case insensitive; however, the following query will fail. ```scala udf.register("random0", () => { Math.random()}) assert(sq

spark git commit: [SPARK-5383][SQL] Support alias for udtfs

2015-02-03 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.3 654c992a1 -> 5dbeb2104 [SPARK-5383][SQL] Support alias for udtfs Add support for alias of udtfs, such as ``` select stack(2, key, value, key, value) as (a, b) from src limit 5; select a, b from (select stack(2, key, value, key, value)

spark git commit: [SPARK-5383][SQL] Support alias for udtfs

2015-02-03 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master ca7a6cdff -> 5adbb3948 [SPARK-5383][SQL] Support alias for udtfs Add support for alias of udtfs, such as ``` select stack(2, key, value, key, value) as (a, b) from src limit 5; select a, b from (select stack(2, key, value, key, value) as (
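
Both example queries are truncated above; plausible completed forms (the Hive test table `src` is an assumption), issued through `sql(...)`:

```scala
// A UDTF call can now carry an alias list for its output columns, both
// at the top level and inside a subquery.
sql("SELECT stack(2, key, value, key, value) AS (a, b) FROM src LIMIT 5")
sql("SELECT a, b FROM (SELECT stack(2, key, value, key, value) AS (a, b) FROM src) t LIMIT 5")
```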

spark git commit: [SPARK-4508] [SQL] build native date type to conform behavior to Hive

2015-02-03 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 5adbb3948 -> db821ed2e [SPARK-4508] [SQL] build native date type to conform behavior to Hive The previous #3732 was reverted due to a test failure, which has now been fixed. Author: Daoyuan Wang Closes #4325 from adrian-wang/datenative and squa
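
A small probe of the intended behavior (a sketch, not the commit's own test; `sql` is the context's query method): a `CAST` to `DATE` comes back as `java.sql.Date`, matching Hive.

```scala
// Externally the value is still a java.sql.Date, even though the native
// type is represented more compactly inside the engine.
val d = sql("SELECT CAST('2015-02-03' AS DATE) AS d").head().get(0)
assert(d.isInstanceOf[java.sql.Date])
```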

spark git commit: [SPARK-4508] [SQL] build native date type to conform behavior to Hive

2015-02-03 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.3 5dbeb2104 -> 6e244cf4e [SPARK-4508] [SQL] build native date type to conform behavior to Hive The previous #3732 was reverted due to a test failure, which has now been fixed. Author: Daoyuan Wang Closes #4325 from adrian-wang/datenative and

spark git commit: [SPARK-5153][Streaming][Test] Increased timeout to deal with flaky KafkaStreamSuite

2015-02-03 Thread tdas
Repository: spark Updated Branches: refs/heads/master db821ed2e -> 681f9df47 [SPARK-5153][Streaming][Test] Increased timeout to deal with flaky KafkaStreamSuite Timeout increased to allow overloaded Jenkins to cope with delay in topic creation. Author: Tathagata Das Closes #4342 from tdas

spark git commit: [SPARK-5153][Streaming][Test] Increased timeout to deal with flaky KafkaStreamSuite

2015-02-03 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-1.2 591cd8393 -> 36c299430 [SPARK-5153][Streaming][Test] Increased timeout to deal with flaky KafkaStreamSuite Timeout increased to allow overloaded Jenkins to cope with delay in topic creation. Author: Tathagata Das Closes #4342 from

spark git commit: [SPARK-5153][Streaming][Test] Increased timeout to deal with flaky KafkaStreamSuite

2015-02-03 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-1.3 6e244cf4e -> d644bd96a [SPARK-5153][Streaming][Test] Increased timeout to deal with flaky KafkaStreamSuite Timeout increased to allow overloaded Jenkins to cope with delay in topic creation. Author: Tathagata Das Closes #4342 from

spark git commit: [STREAMING] SPARK-4986 Wait for receivers to deregister and receiver job to terminate

2015-02-03 Thread tdas
Repository: spark Updated Branches: refs/heads/master 681f9df47 -> 1e8b5394b [STREAMING] SPARK-4986 Wait for receivers to deregister and receiver job to terminate A slow receiver might not have enough time to shut down cleanly even when graceful shutdown is used. This PR extends graceful wait
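
The graceful path is driven from the user side by a single call; a usage sketch, assuming `ssc` is an existing StreamingContext:

```scala
// With this change, a graceful stop also waits for all receivers to
// deregister and for the receiver job itself to terminate.
ssc.stop(stopSparkContext = true, stopGracefully = true)
```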

spark git commit: [STREAMING] SPARK-4986 Wait for receivers to deregister and receiver job to terminate

2015-02-03 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-1.3 d644bd96a -> 092d4ba57 [STREAMING] SPARK-4986 Wait for receivers to deregister and receiver job to terminate A slow receiver might not have enough time to shut down cleanly even when graceful shutdown is used. This PR extends graceful

spark git commit: [STREAMING] SPARK-4986 Wait for receivers to deregister and receiver job to terminate

2015-02-03 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-1.2 36c299430 -> 62c758753 [STREAMING] SPARK-4986 Wait for receivers to deregister and receiver job to terminate A slow receiver might not have enough time to shut down cleanly even when graceful shutdown is used. This PR extends graceful

spark git commit: [SPARK-5554] [SQL] [PySpark] add more tests for DataFrame Python API

2015-02-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master 1e8b5394b -> 068c0e2ee [SPARK-5554] [SQL] [PySpark] add more tests for DataFrame Python API Add more tests and docs for DataFrame Python API, improve test coverage, fix bugs. Author: Davies Liu Closes #4331 from davies/fix_df and squash

spark git commit: [SPARK-5554] [SQL] [PySpark] add more tests for DataFrame Python API

2015-02-03 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.3 092d4ba57 -> 4640623bc [SPARK-5554] [SQL] [PySpark] add more tests for DataFrame Python API Add more tests and docs for DataFrame Python API, improve test coverage, fix bugs. Author: Davies Liu Closes #4331 from davies/fix_df and sq

spark git commit: [SPARK-5520][MLlib] Make FP-Growth implementation take generic item types (WIP)

2015-02-03 Thread meng
Repository: spark Updated Branches: refs/heads/master 068c0e2ee -> e380d2d46 [SPARK-5520][MLlib] Make FP-Growth implementation take generic item types (WIP) Make FPGrowth.run API take generic item types: `def run[Item: ClassTag, Basket <: Iterable[Item]](data: RDD[Basket]): FPGrowthModel[Item
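
A usage sketch matching the WIP signature quoted above (the API that eventually shipped in 1.3 takes `RDD[Array[Item]]` instead); items are plain `String`s here, but any type with a `ClassTag` now works:

```scala
import org.apache.spark.mllib.fpm.FPGrowth
import org.apache.spark.rdd.RDD

// Mine frequent itemsets from baskets of string items; the Item type
// parameter is inferred from the element type of the input RDD.
def frequentItemsets(transactions: RDD[Seq[String]]) = {
  val model = new FPGrowth().setMinSupport(0.3).run(transactions)
  model.freqItemsets
}
```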

spark git commit: [SPARK-5520][MLlib] Make FP-Growth implementation take generic item types (WIP)

2015-02-03 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.3 4640623bc -> 298ef5ba4 [SPARK-5520][MLlib] Make FP-Growth implementation take generic item types (WIP) Make FPGrowth.run API take generic item types: `def run[Item: ClassTag, Basket <: Iterable[Item]](data: RDD[Basket]): FPGrowthModel[

[1/2] spark git commit: [SPARK-5578][SQL][DataFrame] Provide a convenient way for Scala users to use UDFs

2015-02-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master e380d2d46 -> 1077f2e1d http://git-wip-us.apache.org/repos/asf/spark/blob/1077f2e1/sql/core/src/main/scala/org/apache/spark/sql/UdfRegistration.scala -- diff --git a/sql/core

[2/2] spark git commit: [SPARK-5578][SQL][DataFrame] Provide a convenient way for Scala users to use UDFs

2015-02-03 Thread rxin
[SPARK-5578][SQL][DataFrame] Provide a convenient way for Scala users to use UDFs A more convenient way to define user-defined functions. Author: Reynold Xin Closes #4345 from rxin/defineUDF and squashes the following commits: 639c0f8 [Reynold Xin] udf tests. 0a0b339 [Reynold Xin] defineUDF -
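
A hedged sketch of the convenience (the `udf` helper lived in the `Dsl` object at commit time and in `functions` by the 1.3.0 release; `df` and its `name` column are assumptions):

```scala
import org.apache.spark.sql.functions.udf

// Wrap an ordinary Scala closure as a UDF and use it directly in a
// DataFrame expression; no SQL-side registration step is needed.
val strLen = udf((s: String) => s.length)
df.select(strLen(df("name")))
```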

[2/2] spark git commit: [SPARK-5578][SQL][DataFrame] Provide a convenient way for Scala users to use UDFs

2015-02-03 Thread rxin
[SPARK-5578][SQL][DataFrame] Provide a convenient way for Scala users to use UDFs A more convenient way to define user-defined functions. Author: Reynold Xin Closes #4345 from rxin/defineUDF and squashes the following commits: 639c0f8 [Reynold Xin] udf tests. 0a0b339 [Reynold Xin] defineUDF -

[1/2] spark git commit: [SPARK-5578][SQL][DataFrame] Provide a convenient way for Scala users to use UDFs

2015-02-03 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.3 298ef5ba4 -> b22d5b5f8 http://git-wip-us.apache.org/repos/asf/spark/blob/b22d5b5f/sql/core/src/main/scala/org/apache/spark/sql/UdfRegistration.scala -- diff --git a/sql/

spark git commit: [SPARK-4795][Core] Redesign the "primitive type => Writable" implicit APIs to make them be activated automatically

2015-02-03 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.3 b22d5b5f8 -> 5c63e0567 [SPARK-4795][Core] Redesign the "primitive type => Writable" implicit APIs to make them be activated automatically Try to redesign the "primitive type => Writable" implicit APIs to make them be activated automat

spark git commit: [SPARK-4795][Core] Redesign the "primitive type => Writable" implicit APIs to make them be activated automatically

2015-02-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master 1077f2e1d -> d37978d8a [SPARK-4795][Core] Redesign the "primitive type => Writable" implicit APIs to make them be activated automatically Try to redesign the "primitive type => Writable" implicit APIs to make them be activated automatical
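
The practical effect, in a sketch (path and key/value types are illustrative): reading a SequenceFile of primitives no longer requires pulling the converters in by hand.

```scala
import org.apache.spark.SparkContext

// The Writable converters now resolve automatically from SparkContext's
// companion object, so this compiles without `import SparkContext._`.
def readCounts(sc: SparkContext, path: String) =
  sc.sequenceFile[String, Int](path)   // (Text, IntWritable) -> (String, Int)
```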

spark git commit: [FIX][MLLIB] fix seed handling in Python GMM

2015-02-03 Thread meng
Repository: spark Updated Branches: refs/heads/master d37978d8a -> eb1563185 [FIX][MLLIB] fix seed handling in Python GMM If `seed` is `None` on the Python side, it is passed in as `null`, so we should use `java.lang.Long` instead of `Long` to accept it. Author: Xiangrui Meng Closes #4349
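
A sketch of the pattern the fix relies on (method and parameter names are illustrative of the Py4J entry point, not the exact source): a boxed `java.lang.Long` can carry Python's `None` as `null`, where a primitive `Long` cannot.

```scala
import org.apache.spark.mllib.clustering.GaussianMixture
import org.apache.spark.mllib.linalg.Vector
import org.apache.spark.rdd.RDD

// seed arrives boxed so that Python's None maps to null; it is only
// unboxed (and applied) when actually provided.
def trainGaussianMixture(data: RDD[Vector], k: Int, seed: java.lang.Long) = {
  val gmm = new GaussianMixture().setK(k)
  if (seed != null) gmm.setSeed(seed)
  gmm.run(data)
}
```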

spark git commit: [FIX][MLLIB] fix seed handling in Python GMM

2015-02-03 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.3 5c63e0567 -> 679228b7f [FIX][MLLIB] fix seed handling in Python GMM If `seed` is `None` on the Python side, it is passed in as `null`, so we should use `java.lang.Long` instead of `Long` to accept it. Author: Xiangrui Meng Closes #4

spark git commit: [SPARK-5579][SQL][DataFrame] Support for project/filter using SQL expressions

2015-02-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master eb1563185 -> 40c4cb2fe [SPARK-5579][SQL][DataFrame] Support for project/filter using SQL expressions ```scala df.selectExpr("abs(colA)", "colB") df.filter("age > 21") ``` Author: Reynold Xin Closes #4348 from rxin/SPARK-5579 and squashes
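
A slightly fuller sketch (`df` and its columns are assumptions): both methods parse full SQL expression strings, so compound predicates and built-in functions work without the Column DSL.

```scala
// Project with SQL expressions (aliases included) and filter with a
// SQL predicate string instead of a Column.
df.selectExpr("abs(colA) AS absA", "colB")
df.filter("age > 21 AND name IS NOT NULL")
```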

spark git commit: [SPARK-5579][SQL][DataFrame] Support for project/filter using SQL expressions

2015-02-03 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.3 679228b7f -> cb7f783df [SPARK-5579][SQL][DataFrame] Support for project/filter using SQL expressions ```scala df.selectExpr("abs(colA)", "colB") df.filter("age > 21") ``` Author: Reynold Xin Closes #4348 from rxin/SPARK-5579 and squa

spark git commit: [SPARK-4969][STREAMING][PYTHON] Add binaryRecords to streaming

2015-02-03 Thread tdas
Repository: spark Updated Branches: refs/heads/master 40c4cb2fe -> 242b4f02d [SPARK-4969][STREAMING][PYTHON] Add binaryRecords to streaming In Spark 1.2 we added a `binaryRecords` input method for loading flat binary data. This format is useful for numerical array data, e.g. in scientific co
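
The Scala counterpart added alongside, in a usage sketch (directory and record length are illustrative; `ssc` is an existing StreamingContext):

```scala
// Each emitted record is a fixed-length Array[Byte], which suits dense
// numerical data such as sensor or simulation output.
val records = ssc.binaryRecordsStream("hdfs:///incoming", recordLength = 100)
records.map(_.length).print()
```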

spark git commit: [SPARK-4969][STREAMING][PYTHON] Add binaryRecords to streaming

2015-02-03 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-1.3 cb7f783df -> 9a33f8962 [SPARK-4969][STREAMING][PYTHON] Add binaryRecords to streaming In Spark 1.2 we added a `binaryRecords` input method for loading flat binary data. This format is useful for numerical array data, e.g. in scientific

spark git commit: [SPARK-4939] revive offers periodically in LocalBackend

2015-02-03 Thread kayousterhout
Repository: spark Updated Branches: refs/heads/master 242b4f02d -> 83de71c45 [SPARK-4939] revive offers periodically in LocalBackend The locality timeout assumes that the SchedulerBackend revives offers periodically, but currently LocalBackend does not do that, so some jobs with mixed locality

spark git commit: [SPARK-4939] revive offers periodically in LocalBackend

2015-02-03 Thread pwendell
Repository: spark Updated Branches: refs/heads/branch-1.3 9a33f8962 -> e196da840 [SPARK-4939] revive offers periodically in LocalBackend The locality timeout assumes that the SchedulerBackend revives offers periodically, but currently LocalBackend does not do that, so some jobs with mixed loca

spark git commit: [SPARK-4939] revive offers periodically in LocalBackend

2015-02-03 Thread pwendell
Repository: spark Updated Branches: refs/heads/branch-1.2 62c758753 -> 379976320 [SPARK-4939] revive offers periodically in LocalBackend The locality timeout assumes that the SchedulerBackend revives offers periodically, but currently LocalBackend does not do that, so some jobs with mixed loca

spark git commit: [SPARK-5341] Use maven coordinates as dependencies in spark-shell and spark-submit

2015-02-03 Thread pwendell
Repository: spark Updated Branches: refs/heads/master 83de71c45 -> 6aed719e5 [SPARK-5341] Use maven coordinates as dependencies in spark-shell and spark-submit This PR adds support for using maven coordinates as dependencies to spark-shell. Coordinates can be provided as a comma-delimited str
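
Invocation sketch (the coordinates are placeholders, not from the commit): each dependency is a `groupId:artifactId:version` triple, and multiple triples are comma-delimited.

```
bin/spark-shell --packages com.example:my-lib_2.10:1.0.0
bin/spark-submit --packages com.example:my-lib_2.10:1.0.0,org.example:other-lib:2.1 app.jar
```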

spark git commit: [SPARK-5341] Use maven coordinates as dependencies in spark-shell and spark-submit

2015-02-03 Thread pwendell
Repository: spark Updated Branches: refs/heads/branch-1.3 e196da840 -> 3b7acd22a [SPARK-5341] Use maven coordinates as dependencies in spark-shell and spark-submit This PR adds support for using maven coordinates as dependencies to spark-shell. Coordinates can be provided as a comma-delimited