spark git commit: [SPARK-10981] [SPARKR] SparkR Join improvements

2015-10-13 Thread shivaram
Repository: spark Updated Branches: refs/heads/master ce3f9a806 -> 8b3288570 [SPARK-10981] [SPARKR] SparkR Join improvements I was having issues with collect() and orderBy() in Spark 1.5.0 so I used the DataFrame.R file and test_sparkSQL.R file from the Spark 1.5.1 download. I only modified

spark git commit: [SPARK-10996] [SPARKR] Implement sampleBy() in DataFrameStatFunctions.

2015-10-13 Thread shivaram
Repository: spark Updated Branches: refs/heads/master 8b3288570 -> 390b22fad [SPARK-10996] [SPARKR] Implement sampleBy() in DataFrameStatFunctions. Author: Sun Rui Closes #9023 from sun-rui/SPARK-10996. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: [SPARK-11026] [YARN] spark.yarn.user.classpath.first does work for 'spark-submit --jars hdfs://user/foo.jar'

2015-10-13 Thread tgraves
Repository: spark Updated Branches: refs/heads/branch-1.5 2217f4f8b -> 47bc6c0fa [SPARK-11026] [YARN] spark.yarn.user.classpath.first does work for 'spark-submit --jars hdfs://user/foo.jar' when spark.yarn.user.classpath.first=true and using 'spark-submit --jars hdfs://user/foo.jar', it can

spark git commit: [SPARK-11026] [YARN] spark.yarn.user.classpath.first does work for 'spark-submit --jars hdfs://user/foo.jar'

2015-10-13 Thread tgraves
Repository: spark Updated Branches: refs/heads/master c4da5345a -> 626aab79c [SPARK-11026] [YARN] spark.yarn.user.classpath.first does work for 'spark-submit --jars hdfs://user/foo.jar' when spark.yarn.user.classpath.first=true and using 'spark-submit --jars hdfs://user/foo.jar', it can not

spark git commit: [SPARK-11079] Post-hoc review Netty-based RPC - round 1

2015-10-13 Thread rxin
Repository: spark Updated Branches: refs/heads/master 6987c0679 -> 1797055db [SPARK-11079] Post-hoc review Netty-based RPC - round 1 I'm going through the implementation right now for post-hoc review. Adding more comments and renaming things as I go through them. I also want to write higher

spark git commit: [SPARK-10051] [SPARKR] Support collecting data of StructType in DataFrame

2015-10-13 Thread shivaram
Repository: spark Updated Branches: refs/heads/master d0cc79ccd -> 5e3868ba1 [SPARK-10051] [SPARKR] Support collecting data of StructType in DataFrame Two points in this PR: 1. The original thought was that a named R list is assumed to be a struct in SerDe. But this is problematic because

spark git commit: [SPARK-11009] [SQL] fix wrong result of Window function in cluster mode

2015-10-13 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.5 47bc6c0fa -> edc509586 [SPARK-11009] [SQL] fix wrong result of Window function in cluster mode Currently, all window functions can sometimes generate wrong results in cluster mode. The root cause is that AttributeReference is called in

spark git commit: [SPARK-11030] [SQL] share the SQLTab across sessions

2015-10-13 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 1797055db -> d0cc79ccd [SPARK-11030] [SQL] share the SQLTab across sessions The SQLTab will be shared by multiple sessions. If we create multiple independent SQLContexts (not using newSession()), we will still see multiple SQLTabs in the

spark git commit: [SPARK-10913] [SPARKR] attach() function support

2015-10-13 Thread shivaram
Repository: spark Updated Branches: refs/heads/master 1e0aba90b -> f7f28ee7a [SPARK-10913] [SPARKR] attach() function support Bring the change code up to date. Author: Adrian Zhuang Author: adrian555 Closes #9031 from

spark git commit: [SPARK-7402] [ML] JSON SerDe for standard param types

2015-10-13 Thread meng
Repository: spark Updated Branches: refs/heads/master c75f058b7 -> 2b574f52d [SPARK-7402] [ML] JSON SerDe for standard param types This PR implements the JSON SerDe for the following param types: `Boolean`, `Int`, `Long`, `Float`, `Double`, `String`, `Array[Int]`, `Array[Double]`, and

svn commit: r1708495 - in /spark: js/downloads.js site/js/downloads.js

2015-10-13 Thread srowen
Author: srowen Date: Tue Oct 13 19:42:05 2015 New Revision: 1708495 URL: http://svn.apache.org/viewvc?rev=1708495&view=rev Log: SPARK-11070 Point to archive.apache.org for older Spark releases Modified: spark/js/downloads.js spark/site/js/downloads.js Modified: spark/js/downloads.js URL:

spark git commit: [SPARK-11009] [SQL] fix wrong result of Window function in cluster mode

2015-10-13 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 626aab79c -> 6987c0679 [SPARK-11009] [SQL] fix wrong result of Window function in cluster mode Currently, all window functions can sometimes generate wrong results in cluster mode. The root cause is that AttributeReference is called in

spark git commit: [PYTHON] [MINOR] List modules in PySpark tests when given bad name

2015-10-13 Thread davies
Repository: spark Updated Branches: refs/heads/master f7f28ee7a -> c75f058b7 [PYTHON] [MINOR] List modules in PySpark tests when given bad name Output list of supported modules for python tests in error message when given bad module name. CC: davies Author: Joseph K. Bradley

spark git commit: [SPARK-11052] Spaces in the build dir causes failures in the build/mv…

2015-10-13 Thread srowen
Repository: spark Updated Branches: refs/heads/master b3ffac517 -> 0d1b73b78 [SPARK-11052] Spaces in the build dir causes failures in the build/mv… …n script Author: trystanleftwich Closes #9065 from trystanleftwich/SPARK-11052. Project:

spark git commit: [SPARK-11080] [SQL] Incorporate per-JVM id into ExprId to prevent unsafe cross-JVM comparisons

2015-10-13 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 0d1b73b78 -> ef72673b2 [SPARK-11080] [SQL] Incorporate per-JVM id into ExprId to prevent unsafe cross-JVM comparisons In the current implementation of named expressions' `ExprIds`, we rely on a per-JVM AtomicLong to ensure that

[2/2] spark git commit: [SPARK-10983] Unified memory manager

2015-10-13 Thread joshrosen
[SPARK-10983] Unified memory manager This patch unifies the memory management of the storage and execution regions such that either side can borrow memory from each other. When memory pressure arises, storage will be evicted in favor of execution. To avoid regressions in cases where storage is
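The borrowing and eviction behavior described in this message can be illustrated with a toy model. The following is only a minimal sketch under stated assumptions: the class `UnifiedPool` and its methods are hypothetical names invented for illustration, not Spark's actual memory manager, and the sketch deliberately omits whatever safeguard the truncated sentence above goes on to describe.

```python
class UnifiedPool:
    """Toy model (hypothetical): one memory pool shared by storage and execution."""

    def __init__(self, total):
        self.total = total
        self.storage_used = 0
        self.execution_used = 0

    def free(self):
        # Memory not currently held by either side.
        return self.total - self.storage_used - self.execution_used

    def acquire_storage(self, n):
        # Storage may borrow any free memory, but it never evicts execution.
        if n <= self.free():
            self.storage_used += n
            return True
        return False

    def acquire_execution(self, n):
        # Execution uses free memory first; under pressure it evicts storage.
        shortfall = n - self.free()
        if shortfall > 0:
            evicted = min(shortfall, self.storage_used)
            self.storage_used -= evicted
        granted = min(n, self.free())
        self.execution_used += granted
        return granted


pool = UnifiedPool(total=100)
pool.acquire_storage(80)              # storage borrows most of the pool
granted = pool.acquire_execution(50)  # execution evicts part of storage
print(granted, pool.storage_used, pool.execution_used)
```

In this sketch the eviction is one-directional: execution can reclaim memory from storage, while a storage request that does not fit is simply refused.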

[1/2] spark git commit: [SPARK-10983] Unified memory manager

2015-10-13 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 2b574f52d -> b3ffac517 http://git-wip-us.apache.org/repos/asf/spark/blob/b3ffac51/core/src/test/scala/org/apache/spark/util/collection/ExternalSorterSuite.scala -- diff

spark git commit: [SPARK-10932] [PROJECT INFRA] Port two minor changes to release-build.sh from scripts' old repo

2015-10-13 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master ef72673b2 -> d0482f6af [SPARK-10932] [PROJECT INFRA] Port two minor changes to release-build.sh from scripts' old repo Spark's release packaging scripts used to live in a separate repository. Although these scripts are now part of the

spark git commit: [SPARK-11059] [ML] Change range of quantile probabilities in AFTSurvivalRegression

2015-10-13 Thread meng
Repository: spark Updated Branches: refs/heads/master d0482f6af -> 3889b1c7a [SPARK-11059] [ML] Change range of quantile probabilities in AFTSurvivalRegression Values of the quantile probabilities array should be in the range (0, 1) instead of [0, 1] in `AFTSurvivalRegression.scala`
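The difference between the open range (0, 1) and the closed range [0, 1] is only whether the endpoints 0 and 1 are accepted. A minimal sketch of such a check follows; the function name `validate_quantile_probabilities` is hypothetical and this is not the validation code from `AFTSurvivalRegression.scala`.

```python
def validate_quantile_probabilities(probs):
    """Return probs as a list if every value lies strictly between 0 and 1.

    Hypothetical helper for illustration; raises ValueError otherwise.
    """
    bad = [p for p in probs if not (0.0 < p < 1.0)]
    if bad:
        raise ValueError(f"quantile probabilities must be in (0, 1), got {bad}")
    return list(probs)


# Interior values pass; the endpoints 0.0 and 1.0 would be rejected.
print(validate_quantile_probabilities([0.25, 0.5, 0.75]))
```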

spark git commit: [SPARK-10389] [SQL] [1.5] support order by non-attribute grouping expression on Aggregate

2015-10-13 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.5 15d2736af -> 94e6d8f72 [SPARK-10389] [SQL] [1.5] support order by non-attribute grouping expression on Aggregate backport https://github.com/apache/spark/pull/8548 to 1.5 Author: Wenchen Fan Closes #9102 from

spark git commit: [SPARK-11090] [SQL] Constructor for Product types from InternalRow

2015-10-13 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 3889b1c7a -> 328d1b3e4 [SPARK-11090] [SQL] Constructor for Product types from InternalRow This is a first draft of the ability to construct expressions that will take a catalyst internal row and construct a Product (case class or tuple)

spark git commit: [SPARK-10932] [PROJECT INFRA] Port two minor changes to release-build.sh from scripts' old repo

2015-10-13 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.5 edc509586 -> 77eeaad98 [SPARK-10932] [PROJECT INFRA] Port two minor changes to release-build.sh from scripts' old repo Spark's release packaging scripts used to live in a separate repository. Although these scripts are now part of

spark git commit: [SPARK-10959] [PYSPARK] StreamingLogisticRegressionWithSGD does not t…

2015-10-13 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.5 77eeaad98 -> 15d2736af [SPARK-10959] [PYSPARK] StreamingLogisticRegressionWithSGD does not t… …rain with given regParam and StreamingLinearRegressionWithSGD intercept param is not in correct position. regParam was being passed

spark git commit: [SPARK-11032] [SQL] correctly handle having

2015-10-13 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 328d1b3e4 -> e170c2216 [SPARK-11032] [SQL] correctly handle having We should not stop resolving having when the having condition is resolved, or something like `count(1)` will crash. Author: Wenchen Fan Closes #9105

spark git commit: [SPARK-11068] [SQL] add callback to query execution

2015-10-13 Thread davies
Repository: spark Updated Branches: refs/heads/master e170c2216 -> 15ff85b31 [SPARK-11068] [SQL] add callback to query execution With this feature, we can track the query plan, time cost, and exceptions during query execution for Spark users. Author: Wenchen Fan Closes #9078

spark git commit: [SPARK-11091] [SQL] Change spark.sql.canonicalizeView to spark.sql.nativeView.

2015-10-13 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 15ff85b31 -> ce3f9a806 [SPARK-11091] [SQL] Change spark.sql.canonicalizeView to spark.sql.nativeView. https://issues.apache.org/jira/browse/SPARK-11091 Author: Yin Huai Closes #9103 from yhuai/SPARK-11091.