spark git commit: [SPARK-7869][SQL] Adding Postgres JSON and JSONB data types support

2015-10-08 Thread rxin
Repository: spark Updated Branches: refs/heads/master 3aff0866a -> b8f849b54 [SPARK-7869][SQL] Adding Postgres JSON and JSONB data types support This PR addresses [SPARK-7869](https://issues.apache.org/jira/browse/SPARK-7869). Before the patch, an attempt to load a table from Postgres with…
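The gist of the fix is a JDBC dialect override: Postgres reports `json`/`jsonb` columns as a vendor-specific type, and the dialect maps them to a string type instead of failing. A minimal sketch in plain Python (the function name and return values are illustrative, not Spark's actual API):

```python
def map_postgres_type(type_name):
    """Hypothetical dialect hook: map a Postgres column type name to a
    Spark SQL type name, or None to defer to the default JDBC mapping."""
    overrides = {"json": "StringType", "jsonb": "StringType"}
    return overrides.get(type_name.lower())

# json/jsonb columns now resolve to a string type instead of erroring out
json_mapping = map_postgres_type("jsonb")
```
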

spark git commit: Akka framesize units should be specified

2015-10-08 Thread rxin
Repository: spark Updated Branches: refs/heads/master b8f849b54 -> cd28139c9 Akka framesize units should be specified 1.4 docs noted that the units were MB - I have assumed this is still the case Author: admackin Closes #9025 from admackin/master.

spark git commit: [SPARK-10883] Add a note about how to build Spark sub-modules (reactor)

2015-10-08 Thread srowen
Repository: spark Updated Branches: refs/heads/master cd28139c9 -> 60150cf00 [SPARK-10883] Add a note about how to build Spark sub-modules (reactor) Author: Jean-Baptiste Onofré Closes #8993 from jbonofre/SPARK-10883-2.

spark git commit: [SPARK-10987] [YARN] Workaround for missing netty rpc disconnection event.

2015-10-08 Thread vanzin
Repository: spark Updated Branches: refs/heads/master 2df882ef1 -> 56a9692fc [SPARK-10987] [YARN] Workaround for missing netty rpc disconnection event. In YARN client mode, when the AM connects to the driver, it may be the case that the driver never needs to send a message back to the AM…

spark git commit: [SPARK-10836] [SPARKR] Added sort(x, decreasing, col, ... ) method to DataFrame

2015-10-08 Thread shivaram
Repository: spark Updated Branches: refs/heads/master 56a9692fc -> e8f90d9dd [SPARK-10836] [SPARKR] Added sort(x, decreasing, col, ... ) method to DataFrame The sort function can be used as an alternative to arrange(...). As arguments it accepts x - dataframe, decreasing - TRUE/FALSE, a list…
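The semantics are easy to picture outside of SparkR: sort the rows by the chosen column, with a decreasing flag controlling direction. A plain-Python sketch (the rows and function name are made up for illustration, not SparkR code):

```python
# Illustrative only: mimics sort(x, decreasing, col) over a list of rows.
rows = [{"name": "a", "age": 30}, {"name": "b", "age": 25}, {"name": "c", "age": 35}]

def sort_rows(rows, col, decreasing=False):
    """Sort rows by one column; decreasing=True mirrors SparkR's flag."""
    return sorted(rows, key=lambda r: r[col], reverse=decreasing)

oldest_first = sort_rows(rows, "age", decreasing=True)
```
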

spark git commit: [SPARK-10999] [SQL] Coalesce should be able to handle UnsafeRow

2015-10-08 Thread lian
Repository: spark Updated Branches: refs/heads/master 60150cf00 -> 59b0606f3 [SPARK-10999] [SQL] Coalesce should be able to handle UnsafeRow Author: Cheng Lian Closes #9024 from liancheng/spark-10999.coalesce-unsafe-row-handling.

spark git commit: [SPARK-5775] [SPARK-5508] [SQL] Re-enable Hive Parquet array reading tests

2015-10-08 Thread lian
Repository: spark Updated Branches: refs/heads/master 59b0606f3 -> 2df882ef1 [SPARK-5775] [SPARK-5508] [SQL] Re-enable Hive Parquet array reading tests Since SPARK-5508 has already been fixed. Author: Cheng Lian Closes #8999 from…

spark git commit: [SPARK-8654] [SQL] Fix Analysis exception when using NULL IN (...)

2015-10-08 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 5c9fdf74e -> dcbd58a92 [SPARK-8654] [SQL] Fix Analysis exception when using NULL IN (...) In the analysis phase, while processing the rules for the IN predicate, we compare the in-list types to the left-hand-side expression type and generate cast…
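The subtlety behind this bug is SQL's three-valued logic for IN: `NULL IN (...)` should evaluate to NULL (unknown), not raise an analysis error. A small sketch of those semantics in plain Python, using `None` for SQL NULL:

```python
def sql_in(value, in_list):
    """Three-valued SQL IN: returns True, False, or None (SQL NULL)."""
    if value is None:
        return None            # NULL IN (...) is NULL, never an error
    if value in in_list:
        return True            # a matching element makes it TRUE
    if None in in_list:
        return None            # value might equal the unknown NULL element
    return False

result = sql_in(None, [1, 2])  # the case the original analysis rejected
```
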

spark git commit: [SPARK-10998] [SQL] Show non-children in default Expression.toString

2015-10-08 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master e8f90d9dd -> 5c9fdf74e [SPARK-10998] [SQL] Show non-children in default Expression.toString It's pretty hard to debug problems with expressions when you can't see all the arguments. Before: `invoke()` After: `invoke(inputObject#1, …
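The idea is simply that a tree node's default string form should include its non-child constructor arguments, not just its name. A toy Python sketch of the before/after behavior (class and field names are invented for illustration):

```python
class Invoke:
    """Toy expression node with non-child arguments."""
    def __init__(self, target, method):
        self.target = target
        self.method = method

    def __repr__(self):
        # Before the patch the equivalent was just "invoke()"; afterwards
        # all constructor arguments appear in the string form.
        return f"invoke({self.target}, {self.method})"

shown = repr(Invoke("inputObject#1", "toString"))
```
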

spark git commit: [SPARK-10993] [SQL] Initial code generated encoder for product types

2015-10-08 Thread yhuai
Repository: spark Updated Branches: refs/heads/master a8226a9f1 -> 9e66a53c9 [SPARK-10993] [SQL] Initial code generated encoder for product types This PR is a first cut at code generating an encoder that takes a Scala `Product` type and converts it directly into the tungsten binary format.

spark git commit: [SPARK-10988] [SQL] Reduce duplication in Aggregate2's expression rewriting logic

2015-10-08 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 9e66a53c9 -> 2816c89b6 [SPARK-10988] [SQL] Reduce duplication in Aggregate2's expression rewriting logic In `aggregate/utils.scala`, there is a substantial amount of duplication in the expression-rewriting logic. As a prerequisite to…

spark git commit: Revert [SPARK-8654] [SQL] Fix Analysis exception when using NULL IN

2015-10-08 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master af2a55448 -> a8226a9f1 Revert [SPARK-8654] [SQL] Fix Analysis exception when using NULL IN This reverts commit dcbd58a929be0058b1cfa59b14898c4c428a7680 from #8983 Author: Michael Armbrust Closes #9034 from…

spark git commit: [SPARK-8848] [SQL] Refactors Parquet write path to follow parquet-format

2015-10-08 Thread lian
Repository: spark Updated Branches: refs/heads/master 2816c89b6 -> 02149ff08 [SPARK-8848] [SQL] Refactors Parquet write path to follow parquet-format This PR refactors the Parquet write path to follow the parquet-format spec. It's a successor of PR #7679, but with fewer non-essential changes. Major…

spark git commit: [SPARK-10955] [STREAMING] Add a warning if dynamic allocation is enabled for Streaming applications

2015-10-08 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-1.5 ba601b1ac -> 3df750029 [SPARK-10955] [STREAMING] Add a warning if dynamic allocation is enabled for Streaming applications Dynamic allocation can be painful for streaming apps and can lose data. Log a warning for streaming applications if…

spark git commit: [SPARK-10955] [STREAMING] Add a warning if dynamic allocation is enabled for Streaming applications

2015-10-08 Thread tdas
Repository: spark Updated Branches: refs/heads/master fa3e4d8f5 -> 098412900 [SPARK-10955] [STREAMING] Add a warning if dynamic allocation is enabled for Streaming applications Dynamic allocation can be painful for streaming apps and can lose data. Log a warning for streaming applications if dynamic…

spark git commit: [SPARK-10810] [SPARK-10902] [SQL] Improve session management in SQL

2015-10-08 Thread davies
Repository: spark Updated Branches: refs/heads/master 84ea28717 -> 3390b400d [SPARK-10810] [SPARK-10902] [SQL] Improve session management in SQL This PR improves session management by replacing the thread-local-based approach with one SQLContext per session, introducing separated temporary…

spark git commit: [SPARK-10973] [ML] [PYTHON] __getitem__ method throws IndexError exception when we…

2015-10-08 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master 3390b400d -> 8e67882b9 [SPARK-10973] [ML] [PYTHON] __getitem__ method throws IndexError exception when we… __getitem__ method throws IndexError exception when we try to access index after the last non-zero entry from…

spark git commit: [SPARK-10914] UnsafeRow serialization breaks when two machines have different Oops size.

2015-10-08 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.5 57978ae09 -> ba601b1ac [SPARK-10914] UnsafeRow serialization breaks when two machines have different Oops size. UnsafeRow contains 3 pieces of information when pointing to some data in memory (an object, a base offset, and length).

spark git commit: [SPARK-10914] UnsafeRow serialization breaks when two machines have different Oops size.

2015-10-08 Thread rxin
Repository: spark Updated Branches: refs/heads/master 02149ff08 -> 84ea28717 [SPARK-10914] UnsafeRow serialization breaks when two machines have different Oops size. UnsafeRow contains 3 pieces of information when pointing to some data in memory (an object, a base offset, and length). When…
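The portability trap here is that the base offset is machine-specific: JVM object-header sizes differ depending on ordinary-object-pointer (OOP) compression, so an absolute offset shipped over the wire points at the wrong place on the receiver. A hedged plain-Python sketch of the safe pattern (names and the header constant are illustrative, not Spark's actual implementation):

```python
LOCAL_HEADER = 16  # hypothetical byte-array header size on this machine

def serialize(row_bytes):
    """Portable form: ship only the payload bytes, never an absolute offset."""
    return bytes(row_bytes)

def deserialize(payload, header=LOCAL_HEADER):
    """Each machine recomputes the base offset from its own header size."""
    return {"base_offset": header, "length": len(payload), "bytes": payload}

row = deserialize(serialize(b"abc"), header=12)  # receiver with a 12-byte header
```
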

spark git commit: [SPARK-10959] [PYSPARK] StreamingLogisticRegressionWithSGD does not train with given regParam and convergenceTol parameters

2015-10-08 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.5 3df750029 -> f95129c17 [SPARK-10959] [PYSPARK] StreamingLogisticRegressionWithSGD does not train with given regParam and convergenceTol parameters These params were being passed into the StreamingLogisticRegressionWithSGD…

spark git commit: [SPARK-10875] [MLLIB] Computed covariance matrix should be symmetric

2015-10-08 Thread meng
Repository: spark Updated Branches: refs/heads/master 5410747a8 -> 5994cfe81 [SPARK-10875] [MLLIB] Computed covariance matrix should be symmetric Compute upper triangular values of the covariance matrix, then copy to lower triangular values. Author: Nick Pritchard
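The symmetrization trick described in the summary is straightforward: compute only the upper triangle (avoiding redundant work and floating-point asymmetry), then mirror it into the lower triangle so `M[i][j] == M[j][i]` holds exactly. A minimal sketch:

```python
def symmetrize_upper(m):
    """Copy the upper-triangular entries of a square matrix (list of lists)
    onto the lower triangle, producing an exactly symmetric result."""
    n = len(m)
    out = [row[:] for row in m]
    for i in range(n):
        for j in range(i + 1, n):
            out[j][i] = out[i][j]   # mirror upper triangle to lower
    return out

sym = symmetrize_upper([[1.0, 2.0], [9.9, 4.0]])  # stale lower value replaced
```
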

spark git commit: [SPARK-10959] [PYSPARK] StreamingLogisticRegressionWithSGD does not train with given regParam and convergenceTol parameters

2015-10-08 Thread meng
Repository: spark Updated Branches: refs/heads/master 67fbecbf3 -> 5410747a8 [SPARK-10959] [PYSPARK] StreamingLogisticRegressionWithSGD does not train with given regParam and convergenceTol parameters These params were being passed into the StreamingLogisticRegressionWithSGD constructor, …

spark git commit: [SPARK-10956] Common MemoryManager interface for storage and execution

2015-10-08 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 098412900 -> 67fbecbf3 [SPARK-10956] Common MemoryManager interface for storage and execution This patch introduces a `MemoryManager` that is the central arbiter of how much memory to grant to storage and execution. This patch is…
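The core idea of a central arbiter is a single shared budget from which both storage and execution must acquire memory, so neither side can over-commit. A toy Python sketch under that assumption (the class and method names are illustrative, not Spark's actual `MemoryManager` API):

```python
class MemoryManager:
    """Toy arbiter: storage and execution draw from one shared budget."""
    def __init__(self, total):
        self.total = total
        self.storage_used = 0
        self.execution_used = 0

    def _free(self):
        return self.total - self.storage_used - self.execution_used

    def acquire_storage(self, n):
        if n <= self._free():
            self.storage_used += n
            return True
        return False            # grant denied: budget exhausted

    def acquire_execution(self, n):
        if n <= self._free():
            self.execution_used += n
            return True
        return False

mm = MemoryManager(total=100)
granted = mm.acquire_storage(60)
```
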

spark git commit: [SPARK-11006] Rename NullColumnAccess as NullColumnAccessor

2015-10-08 Thread davies
Repository: spark Updated Branches: refs/heads/master 226835600 -> 2a6f614cd [SPARK-11006] Rename NullColumnAccess as NullColumnAccessor davies, I think `NullColumnAccessor` follows the same convention as the other accessors Author: tedyu Closes #9028 from tedyu/master.

spark git commit: [SPARK-10887] [SQL] Build HashedRelation outside of HashJoinNode.

2015-10-08 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 2a6f614cd -> 82d275f27 [SPARK-10887] [SQL] Build HashedRelation outside of HashJoinNode. This PR refactors `HashJoinNode` to take an existing `HashedRelation`. So, we can reuse this node for both `ShuffledHashJoin` and `BroadcastHashJoin`.
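The refactoring pattern is dependency injection: build the hash table once, outside the join node, then pass it in so different join strategies can share it. A simplified plain-Python sketch of that shape (names and row layout are invented for illustration):

```python
def build_hashed_relation(rows, key):
    """Build the hash table once, outside the join node."""
    table = {}
    for row in rows:
        table.setdefault(row[key], []).append(row)
    return table

class HashJoinNode:
    def __init__(self, hashed):          # takes an existing relation
        self.hashed = hashed

    def join(self, probe_rows, key):
        out = []
        for row in probe_rows:
            for match in self.hashed.get(row[key], []):
                out.append({**match, **row})
        return out

relation = build_hashed_relation([{"id": 1, "a": "x"}], "id")
joined = HashJoinNode(relation).join([{"id": 1, "b": "y"}], "id")
```

Because the relation is built externally, the same node can be fed a locally built table (shuffled join) or a broadcast one without changing the join logic itself.
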

spark git commit: [SPARK-9718] [ML] linear regression training summary all columns

2015-10-08 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master dcbd58a92 -> 0903c6489 [SPARK-9718] [ML] linear regression training summary all columns LinearRegression training summary: The transformed dataset should hold all columns, not just selected ones like prediction and label. There is no real…

spark git commit: [SPARK-7770] [ML] GBT validationTol change to compare with relative or absolute error

2015-10-08 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master 0903c6489 -> 226835600 [SPARK-7770] [ML] GBT validationTol change to compare with relative or absolute error GBT compares validation error against the tolerance, switching between relative and absolute modes, where the former is relative to the…
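One plausible reading of the switch (hedged, since the summary is truncated): treat the improvement as relative when the current error is nonzero, and fall back to an absolute comparison when it is zero, where division would be undefined. A sketch under that assumption:

```python
def should_stop(old_err, new_err, tol):
    """Hypothetical early-stopping check: relative improvement when possible,
    absolute improvement when the current error is zero."""
    improvement = old_err - new_err
    if old_err == 0:
        return improvement < tol          # absolute comparison
    return improvement / old_err < tol    # relative comparison

stop = should_stop(1.0, 0.95, 0.1)  # 5% relative improvement < 10% tol
```
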

spark git commit: [SPARK-7527] [CORE] Fix createNullValue to return the correct null values and REPL mode detection

2015-10-08 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-1.4 e7c4346d0 -> e2ff49198 [SPARK-7527] [CORE] Fix createNullValue to return the correct null values and REPL mode detection The root cause of SPARK-7527 is that `createNullValue` returns an incompatible value `Byte(0)` for `char` and…

spark git commit: [SPARK-10337] [SQL] fix hive views on non-hive-compatible tables.

2015-10-08 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 82d275f27 -> af2a55448 [SPARK-10337] [SQL] fix hive views on non-hive-compatible tables. add a new config to deal with this special case. Author: Wenchen Fan Closes #8990 from cloud-fan/view-master.