spark git commit: [SPARK-9725] [SQL] fix serialization of UTF8String across different JVM

2015-08-14 Thread davies
Repository: spark Updated Branches: refs/heads/branch-1.5 33015009f -> d97af68af [SPARK-9725] [SQL] fix serialization of UTF8String across different JVM The BYTE_ARRAY_OFFSET can differ between JVMs with different configurations (for example, different heap sizes: 24 if heap > 32G, otherwise

spark git commit: [SPARK-9725] [SQL] fix serialization of UTF8String across different JVM

2015-08-14 Thread davies
Repository: spark Updated Branches: refs/heads/master 71a3af8a9 -> 7c1e56825 [SPARK-9725] [SQL] fix serialization of UTF8String across different JVM The BYTE_ARRAY_OFFSET can differ between JVMs with different configurations (for example, different heap sizes: 24 if heap > 32G, otherwise 16)

spark git commit: [SPARK-9960] [GRAPHX] sendMessage type fix in LabelPropagation.scala

2015-08-14 Thread rxin
Repository: spark Updated Branches: refs/heads/master 609ce3c07 -> 71a3af8a9 [SPARK-9960] [GRAPHX] sendMessage type fix in LabelPropagation.scala Author: zc he Closes #8188 from farseer90718/farseer-patch-1. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-

spark git commit: [SPARK-9960] [GRAPHX] sendMessage type fix in LabelPropagation.scala

2015-08-14 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.5 83cbf60a2 -> 33015009f [SPARK-9960] [GRAPHX] sendMessage type fix in LabelPropagation.scala Author: zc he Closes #8188 from farseer90718/farseer-patch-1. (cherry picked from commit 71a3af8a94f900a26ac7094f22ec1216cab62e15) Signed-off

spark git commit: [SPARK-9984] [SQL] Create local physical operator interface.

2015-08-14 Thread rxin
Repository: spark Updated Branches: refs/heads/master 6c4fdbec3 -> 609ce3c07 [SPARK-9984] [SQL] Create local physical operator interface. This pull request creates a new operator interface that is more similar to traditional database query iterators (with open/close/next/get). These local op
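
The open/next/get/close iterator style mentioned above can be sketched roughly as follows. This is only an illustration of the general pattern in Python, not the interface added by the pull request; the class names `LocalScan` and `LocalFilter` and the helper `collect` are hypothetical.

```python
from typing import Callable, Iterable, List


class LocalScan:
    """Hypothetical leaf operator that walks an in-memory list of rows."""

    def __init__(self, rows: Iterable[tuple]):
        self._rows: List[tuple] = list(rows)
        self._pos = -1

    def open(self) -> None:
        # Reset the cursor so the operator can be (re)started.
        self._pos = -1

    def next(self) -> bool:
        # Advance the cursor; report whether a current row exists.
        self._pos += 1
        return self._pos < len(self._rows)

    def get(self) -> tuple:
        # Return the current row; only valid after next() returned True.
        return self._rows[self._pos]

    def close(self) -> None:
        # Nothing to release here; a real operator might free buffers.
        self._pos = len(self._rows)


class LocalFilter:
    """Hypothetical operator that passes through rows matching a predicate."""

    def __init__(self, child, predicate: Callable[[tuple], bool]):
        self._child = child
        self._predicate = predicate

    def open(self) -> None:
        self._child.open()

    def next(self) -> bool:
        # Pull from the child until a matching row is found or input ends.
        while self._child.next():
            if self._predicate(self._child.get()):
                return True
        return False

    def get(self) -> tuple:
        return self._child.get()

    def close(self) -> None:
        self._child.close()


def collect(op) -> List[tuple]:
    """Drive an operator tree to completion and gather its rows."""
    op.open()
    try:
        out = []
        while op.next():
            out.append(op.get())
        return out
    finally:
        op.close()
```

Each operator pulls rows from its child one at a time, so a tree of such operators can be evaluated on a single node without materializing intermediate results.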

spark git commit: [SPARK-8887] [SQL] Explicit define which data types can be used as dynamic partition columns

2015-08-14 Thread rxin
Repository: spark Updated Branches: refs/heads/master ec29f2034 -> 6c4fdbec3 [SPARK-8887] [SQL] Explicit define which data types can be used as dynamic partition columns This PR enforces dynamic partition column data type requirements by adding analysis rules. JIRA: https://issues.apache.org

spark git commit: [SPARK-9634] [SPARK-9323] [SQL] cleanup unnecessary Aliases in LogicalPlan at the end of analysis

2015-08-14 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.5 3cdeeaf5e -> 83cbf60a2 [SPARK-9634] [SPARK-9323] [SQL] cleanup unnecessary Aliases in LogicalPlan at the end of analysis Also alias the ExtractValue instead of wrapping it with UnresolvedAlias when resolving attributes in LogicalPlan, as

spark git commit: [SPARK-9634] [SPARK-9323] [SQL] cleanup unnecessary Aliases in LogicalPlan at the end of analysis

2015-08-14 Thread rxin
Repository: spark Updated Branches: refs/heads/master 37586e544 -> ec29f2034 [SPARK-9634] [SPARK-9323] [SQL] cleanup unnecessary Aliases in LogicalPlan at the end of analysis Also alias the ExtractValue instead of wrapping it with UnresolvedAlias when resolving attributes in LogicalPlan, as thi

spark git commit: [HOTFIX] fix duplicated braces

2015-08-14 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.5 d84291713 -> 3cdeeaf5e [HOTFIX] fix duplicated braces Author: Davies Liu Closes #8219 from davies/fix_typo. (cherry picked from commit 37586e5449ff8f892d41f0b6b8fa1de83dd3849e) Signed-off-by: Reynold Xin Project: http://git-wip-us

spark git commit: [HOTFIX] fix duplicated braces

2015-08-14 Thread rxin
Repository: spark Updated Branches: refs/heads/master e5fd60415 -> 37586e544 [HOTFIX] fix duplicated braces Author: Davies Liu Closes #8219 from davies/fix_typo. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/37586e54

spark git commit: [SPARK-9934] Deprecate NIO ConnectionManager.

2015-08-14 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.5 6be945cef -> d84291713 [SPARK-9934] Deprecate NIO ConnectionManager. Deprecate NIO ConnectionManager in Spark 1.5.0, before removing it in Spark 1.6.0. Author: Reynold Xin Closes #8162 from rxin/SPARK-9934. (cherry picked from comm

spark git commit: [SPARK-9934] Deprecate NIO ConnectionManager.

2015-08-14 Thread rxin
Repository: spark Updated Branches: refs/heads/master 932b24fd1 -> e5fd60415 [SPARK-9934] Deprecate NIO ConnectionManager. Deprecate NIO ConnectionManager in Spark 1.5.0, before removing it in Spark 1.6.0. Author: Reynold Xin Closes #8162 from rxin/SPARK-9934. Project: http://git-wip-us.

spark git commit: [SPARK-9949] [SQL] Fix TakeOrderedAndProject's output.

2015-08-14 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.5 8d2624790 -> 6be945cef [SPARK-9949] [SQL] Fix TakeOrderedAndProject's output. https://issues.apache.org/jira/browse/SPARK-9949 Author: Yin Huai Closes #8179 from yhuai/SPARK-9949. (cherry picked from commit 932b24fd144232fb08184f0bd

spark git commit: [SPARK-9949] [SQL] Fix TakeOrderedAndProject's output.

2015-08-14 Thread rxin
Repository: spark Updated Branches: refs/heads/master 18a761ef7 -> 932b24fd1 [SPARK-9949] [SQL] Fix TakeOrderedAndProject's output. https://issues.apache.org/jira/browse/SPARK-9949 Author: Yin Huai Closes #8179 from yhuai/SPARK-9949. Project: http://git-wip-us.apache.org/repos/asf/spark/r

spark git commit: [SPARK-9968] [STREAMING] Reduced time spent within synchronized block to prevent lock starvation

2015-08-14 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-1.5 612b4609b -> 8d2624790 [SPARK-9968] [STREAMING] Reduced time spent within synchronized block to prevent lock starvation When the rate limiter is actually limiting the rate at which data is inserted into the buffer, the synchronized bl

spark git commit: [SPARK-9968] [STREAMING] Reduced time spent within synchronized block to prevent lock starvation

2015-08-14 Thread tdas
Repository: spark Updated Branches: refs/heads/master f3bfb711c -> 18a761ef7 [SPARK-9968] [STREAMING] Reduced time spent within synchronized block to prevent lock starvation When the rate limiter is actually limiting the rate at which data is inserted into the buffer, the synchronized block
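
One common way to shorten a critical section like the one described above is to perform the potentially slow rate-limiting wait before taking the lock, so the lock is held only for the buffer update. The sketch below illustrates that idea in Python; it is not the code from the commit, and `RateLimitedBuffer`, `_wait_to_push`, and `min_interval_s` are hypothetical names.

```python
import threading
import time
from typing import Any, List


class RateLimitedBuffer:
    """Hypothetical buffer whose writers are throttled by a simple delay."""

    def __init__(self, min_interval_s: float):
        self._lock = threading.Lock()
        self._items: List[Any] = []
        self._min_interval_s = min_interval_s

    def _wait_to_push(self) -> None:
        # Stand-in for a rate limiter: may block for a noticeable time.
        time.sleep(self._min_interval_s)

    def add(self, item: Any) -> None:
        # Wait outside the lock so other threads are not starved
        # while this writer is being throttled.
        self._wait_to_push()
        with self._lock:
            self._items.append(item)

    def swap(self) -> List[Any]:
        # Take the current contents and start a fresh buffer; the lock
        # is held only for the pointer swap.
        with self._lock:
            items, self._items = self._items, []
        return items
```

If the wait happened inside the `with self._lock:` block instead, a thread calling `swap` could be blocked for as long as the writer is throttled.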

spark git commit: [SPARK-9966] [STREAMING] Handle couple of corner cases in PIDRateEstimator

2015-08-14 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-1.5 5bbb2d327 -> 612b4609b [SPARK-9966] [STREAMING] Handle couple of corner cases in PIDRateEstimator 1. The rate estimator should not estimate any rate when there are no records in the batch, as there is no data to estimate the rate. In t

spark git commit: [SPARK-9966] [STREAMING] Handle couple of corner cases in PIDRateEstimator

2015-08-14 Thread tdas
Repository: spark Updated Branches: refs/heads/master 1150a19b1 -> f3bfb711c [SPARK-9966] [STREAMING] Handle couple of corner cases in PIDRateEstimator 1. The rate estimator should not estimate any rate when there are no records in the batch, as there is no data to estimate the rate. In the c
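
The first corner case above (no estimate when a batch has no records) can be illustrated with a small sketch. It deliberately uses a plain throughput calculation rather than the PID formula, and the function name `estimate_rate` and its parameters are hypothetical. The guard against a non-positive processing delay is an added assumption, not something stated in the commit message.

```python
from typing import Optional


def estimate_rate(num_elements: int, processing_delay_ms: int) -> Optional[float]:
    """Return an elements-per-second estimate, or None when none can be made."""
    # No records in the batch: there is no data to estimate a rate from.
    if num_elements <= 0:
        return None
    # Assumption for this sketch: a non-positive delay is also unusable,
    # since it would mean dividing by zero or a negative duration.
    if processing_delay_ms <= 0:
        return None
    return num_elements * 1000.0 / processing_delay_ms
```

Returning `None` lets the caller keep its previous rate instead of acting on a meaningless value.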

spark git commit: [SPARK-8670] [SQL] Nested columns can't be referenced in pyspark

2015-08-14 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.5 0f4ccdc4c -> 5bbb2d327 [SPARK-8670] [SQL] Nested columns can't be referenced in pyspark This bug is caused by a wrong column-exist-check in `__getitem__` of pyspark dataframe. `DataFrame.apply` accepts not only top level column names,

spark git commit: [SPARK-9981] [ML] Made labels public for StringIndexerModel

2015-08-14 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.5 59cdcc079 -> 0f4ccdc4c [SPARK-9981] [ML] Made labels public for StringIndexerModel Also added unit test for integration between StringIndexerModel and IndexToString CC: holdenk We realized we should have left in your unit test (to cat

spark git commit: [SPARK-8670] [SQL] Nested columns can't be referenced in pyspark

2015-08-14 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 2a6590e51 -> 1150a19b1 [SPARK-8670] [SQL] Nested columns can't be referenced in pyspark This bug is caused by a wrong column-exist-check in `__getitem__` of pyspark dataframe. `DataFrame.apply` accepts not only top level column names, but

spark git commit: [SPARK-9981] [ML] Made labels public for StringIndexerModel

2015-08-14 Thread meng
Repository: spark Updated Branches: refs/heads/master 11ed2b180 -> 2a6590e51 [SPARK-9981] [ML] Made labels public for StringIndexerModel Also added unit test for integration between StringIndexerModel and IndexToString CC: holdenk We realized we should have left in your unit test (to catch t

spark git commit: [SPARK-9978] [PYSPARK] [SQL] fix Window.orderBy and doc of ntile()

2015-08-14 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.4 969e8b31b -> 4fc3b8cd2 [SPARK-9978] [PYSPARK] [SQL] fix Window.orderBy and doc of ntile() Author: Davies Liu Closes #8213 from davies/fix_window. (cherry picked from commit 11ed2b180ec86523a94679a8b8132fadb911ccd5) Signed-off-by: Rey

spark git commit: [SPARK-9978] [PYSPARK] [SQL] fix Window.orderBy and doc of ntile()

2015-08-14 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.5 130e06ef1 -> 59cdcc079 [SPARK-9978] [PYSPARK] [SQL] fix Window.orderBy and doc of ntile() Author: Davies Liu Closes #8213 from davies/fix_window. (cherry picked from commit 11ed2b180ec86523a94679a8b8132fadb911ccd5) Signed-off-by: Rey

spark git commit: [SPARK-9978] [PYSPARK] [SQL] fix Window.orderBy and doc of ntile()

2015-08-14 Thread rxin
Repository: spark Updated Branches: refs/heads/master 9407baa2a -> 11ed2b180 [SPARK-9978] [PYSPARK] [SQL] fix Window.orderBy and doc of ntile() Author: Davies Liu Closes #8213 from davies/fix_window. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apach

spark git commit: [SPARK-9877] [CORE] Fix StandaloneRestServer NPE when submitting application

2015-08-14 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-1.5 1ce0b01f4 -> 130e06ef1 [SPARK-9877] [CORE] Fix StandaloneRestServer NPE when submitting application The detailed exception log can be seen in [SPARK-9877](https://issues.apache.org/jira/browse/SPARK-9877). The problem is when creating `St

spark git commit: [SPARK-9877] [CORE] Fix StandaloneRestServer NPE when submitting application

2015-08-14 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 6518ef630 -> 9407baa2a [SPARK-9877] [CORE] Fix StandaloneRestServer NPE when submitting application The detailed exception log can be seen in [SPARK-9877](https://issues.apache.org/jira/browse/SPARK-9877). The problem is when creating `Standa

spark git commit: [SPARK-9948] Fix flaky AccumulatorSuite - internal accumulators

2015-08-14 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 33bae585d -> 6518ef630 [SPARK-9948] Fix flaky AccumulatorSuite - internal accumulators In these tests, we use a custom listener and we assert on fields in the stage / task completion events. However, these events are posted in a separate t

spark git commit: [SPARK-9948] Fix flaky AccumulatorSuite - internal accumulators

2015-08-14 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-1.5 ff3e9561d -> 1ce0b01f4 [SPARK-9948] Fix flaky AccumulatorSuite - internal accumulators In these tests, we use a custom listener and we assert on fields in the stage / task completion events. However, these events are posted in a separa

spark git commit: [SPARK-9809] Task crashes because the internal accumulators are not properly initialized

2015-08-14 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-1.5 d92568ae5 -> ff3e9561d [SPARK-9809] Task crashes because the internal accumulators are not properly initialized When a stage failed and another stage was resubmitted with only part of the partitions to compute, all the tasks failed with e

spark git commit: [SPARK-9809] Task crashes because the internal accumulators are not properly initialized

2015-08-14 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master ffa05c84f -> 33bae585d [SPARK-9809] Task crashes because the internal accumulators are not properly initialized When a stage failed and another stage was resubmitted with only part of the partitions to compute, all the tasks failed with error

spark git commit: [SPARK-9828] [PYSPARK] Mutable values should not be default arguments

2015-08-14 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.4 db71ea482 -> 969e8b31b [SPARK-9828] [PYSPARK] Mutable values should not be default arguments Author: MechCoder Closes #8110 from MechCoder/spark-9828. (cherry picked from commit ffa05c84fe75663fc33f3d954d1cb1e084ab3280) Signed-off-by

spark git commit: [SPARK-9828] [PYSPARK] Mutable values should not be default arguments

2015-08-14 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.5 b2842138c -> d92568ae5 [SPARK-9828] [PYSPARK] Mutable values should not be default arguments Author: MechCoder Closes #8110 from MechCoder/spark-9828. (cherry picked from commit ffa05c84fe75663fc33f3d954d1cb1e084ab3280) Signed-off-by

spark git commit: [SPARK-9828] [PYSPARK] Mutable values should not be default arguments

2015-08-14 Thread meng
Repository: spark Updated Branches: refs/heads/master ece00566e -> ffa05c84f [SPARK-9828] [PYSPARK] Mutable values should not be default arguments Author: MechCoder Closes #8110 from MechCoder/spark-9828. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.
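
The pitfall named in the commit title is a general Python behavior: a default argument value is evaluated once, when the function is defined, so a mutable default is shared across calls. A minimal sketch follows; the function names `append_shared` and `append_fresh` are made up for illustration and do not come from PySpark.

```python
from typing import List, Optional


def append_shared(item: int, bucket: List[int] = []) -> List[int]:
    # Problematic: the same list object is reused on every call
    # that omits `bucket`, so items accumulate across calls.
    bucket.append(item)
    return bucket


def append_fresh(item: int, bucket: Optional[List[int]] = None) -> List[int]:
    # Usual remedy: default to None and create a new list per call.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket
```

Calling `append_shared(1)` and then `append_shared(2)` returns the same list object both times, which ends up holding both items, whereas `append_fresh` returns an independent single-item list on each call.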

spark git commit: [SPARK-9561] Re-enable BroadcastJoinSuite

2015-08-14 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.5 e2a288cc3 -> b2842138c [SPARK-9561] Re-enable BroadcastJoinSuite We can do this now that SPARK-9580 is resolved. Author: Andrew Or Closes #8208 from andrewor14/reenable-sql-tests. (cherry picked from commit ece00566e4d5f38585f2810be

spark git commit: [SPARK-9561] Re-enable BroadcastJoinSuite

2015-08-14 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 3bc552872 -> ece00566e [SPARK-9561] Re-enable BroadcastJoinSuite We can do this now that SPARK-9580 is resolved. Author: Andrew Or Closes #8208 from andrewor14/reenable-sql-tests. Project: http://git-wip-us.apache.org/repos/asf/spark/r

spark git commit: [SPARK-9946] [SPARK-9589] [SQL] fix NPE and thread-safety in TaskMemoryManager

2015-08-14 Thread davies
Repository: spark Updated Branches: refs/heads/branch-1.5 e4ea2390a -> e2a288cc3 [SPARK-9946] [SPARK-9589] [SQL] fix NPE and thread-safety in TaskMemoryManager Currently, we access `page.pageNumber` after the page is freed; it could be modified by another thread, causing an NPE. The same TaskMemoryMa

spark git commit: [SPARK-9946] [SPARK-9589] [SQL] fix NPE and thread-safety in TaskMemoryManager

2015-08-14 Thread davies
Repository: spark Updated Branches: refs/heads/master 57c2d0880 -> 3bc552872 [SPARK-9946] [SPARK-9589] [SQL] fix NPE and thread-safety in TaskMemoryManager Currently, we access `page.pageNumber` after the page is freed; it could be modified by another thread, causing an NPE. The same TaskMemoryManage

spark git commit: [SPARK-9923] [CORE] ShuffleMapStage.numAvailableOutputs should be an Int instead of Long

2015-08-14 Thread srowen
Repository: spark Updated Branches: refs/heads/master 34d610be8 -> 57c2d0880 [SPARK-9923] [CORE] ShuffleMapStage.numAvailableOutputs should be an Int instead of Long Modified type of ShuffleMapStage.numAvailableOutputs from Long to Int Author: Neelesh Srinivas Salian Closes #8183 from nssa

spark git commit: [SPARK-9929] [SQL] support metadata in withColumn

2015-08-14 Thread rxin
Repository: spark Updated Branches: refs/heads/master a7317ccdc -> 34d610be8 [SPARK-9929] [SQL] support metadata in withColumn In MLlib we sometimes need to set metadata for the new column, so we alias the new column with metadata before calling `withColumn`, and in `withColumn` we alias

spark git commit: [SPARK-8744] [ML] Add a public constructor to StringIndexer

2015-08-14 Thread jkbradley
Repository: spark Updated Branches: refs/heads/branch-1.5 f5298da16 -> e4ea2390a [SPARK-8744] [ML] Add a public constructor to StringIndexer It would be helpful to allow users to pass a pre-computed index to create an indexer, rather than always going through StringIndexer to create the model

spark git commit: [SPARK-8744] [ML] Add a public constructor to StringIndexer

2015-08-14 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master 7ecf0c469 -> a7317ccdc [SPARK-8744] [ML] Add a public constructor to StringIndexer It would be helpful to allow users to pass a pre-computed index to create an indexer, rather than always going through StringIndexer to create the model. A

spark git commit: [SPARK-9956] [ML] Make trees work with one-category features

2015-08-14 Thread jkbradley
Repository: spark Updated Branches: refs/heads/branch-1.5 4aa9238b9 -> f5298da16 [SPARK-9956] [ML] Make trees work with one-category features This modifies DecisionTreeMetadata construction to treat 1-category features as continuous, so that trees do not fail with such features. It is import

spark git commit: [SPARK-9956] [ML] Make trees work with one-category features

2015-08-14 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master a0e1abbd0 -> 7ecf0c469 [SPARK-9956] [ML] Make trees work with one-category features This modifies DecisionTreeMetadata construction to treat 1-category features as continuous, so that trees do not fail with such features. It is important
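
The rule described above (a feature declared categorical with only one category is handled as continuous) can be sketched as a small classification step. This is an illustration only, not the DecisionTreeMetadata code; `classify_features` and its parameters are hypothetical.

```python
from typing import Dict, Set, Tuple


def classify_features(
    num_features: int, categorical_arity: Dict[int, int]
) -> Tuple[Set[int], Set[int]]:
    """Split feature indices into (categorical, continuous) sets.

    `categorical_arity` maps a feature index to its declared number of
    categories. A feature with a single category cannot be split on,
    so this sketch treats it as continuous instead of failing.
    """
    categorical = {i for i, arity in categorical_arity.items() if arity > 1}
    continuous = set(range(num_features)) - categorical
    return categorical, continuous
```

For example, with three features where feature 0 is declared with one category and feature 1 with four, only feature 1 stays categorical.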

spark git commit: [SPARK-9661] [MLLIB] minor clean-up of SPARK-9661

2015-08-14 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master c8677d736 -> a0e1abbd0 [SPARK-9661] [MLLIB] minor clean-up of SPARK-9661 Some minor clean-ups after SPARK-9661. See my inline comments. MechCoder jkbradley Author: Xiangrui Meng Closes #8190 from mengxr/SPARK-9661-fix. Project: http:/

spark git commit: [SPARK-9661] [MLLIB] minor clean-up of SPARK-9661

2015-08-14 Thread jkbradley
Repository: spark Updated Branches: refs/heads/branch-1.5 a0d52eb30 -> 4aa9238b9 [SPARK-9661] [MLLIB] minor clean-up of SPARK-9661 Some minor clean-ups after SPARK-9661. See my inline comments. MechCoder jkbradley Author: Xiangrui Meng Closes #8190 from mengxr/SPARK-9661-fix. (cherry pick