spark git commit: [SPARK-15696][SQL] Improve `crosstab` to have a consistent column order

2016-06-09 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 ebbbf2136 -> 1371d5ece [SPARK-15696][SQL] Improve `crosstab` to have a consistent column order ## What changes were proposed in this pull request? Currently, `crosstab` returns a DataFrame having **random-order** columns obtained by
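Below is a small, hedged illustration of the issue and a caller-side workaround, not the patch itself: `df.stat.crosstab` produces one output column per distinct value, and the column order can be made deterministic by sorting the column names. The object and column names are made up for the example.

```scala
// Illustrative sketch only (not the actual SPARK-15696 change): make the
// crosstab output columns deterministic by sorting them on the caller side.
import org.apache.spark.sql.SparkSession

object CrosstabOrderExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("crosstab-order").getOrCreate()
    import spark.implicits._

    val df = Seq((1, "a"), (1, "b"), (2, "a"), (3, "c")).toDF("key", "value")

    // crosstab pivots one column per distinct `value`; before the fix the
    // order of those columns was not guaranteed.
    val ct = df.stat.crosstab("key", "value")

    // Keep the first (row-label) column and sort the remaining columns.
    val ordered = ct.select((ct.columns.head +: ct.columns.tail.sorted).map(ct.col): _*)
    ordered.show()

    spark.stop()
  }
}
```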

spark git commit: [SPARK-15791] Fix NPE in ScalarSubquery

2016-06-09 Thread rxin
Repository: spark Updated Branches: refs/heads/master 16df133d7 -> 6c5fd977f [SPARK-15791] Fix NPE in ScalarSubquery ## What changes were proposed in this pull request? The fix is pretty simple: just don't make the executedPlan transient in `ScalarSubquery`, since it is referenced at
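The following is a generic Scala sketch, not Spark's `ScalarSubquery` code, showing why a `@transient` field can surface as an NPE: the field is skipped during Java serialization and comes back as `null` after deserialization. The `Plan` and `SubqueryLike` names are purely illustrative.

```scala
// Generic demonstration of the failure mode: a @transient field is lost on
// the serialization round trip, so any later dereference hits null.
import java.io._

case class Plan(name: String)

class SubqueryLike(@transient val executedPlan: Plan) extends Serializable {
  def describe(): String = executedPlan.name
}

object TransientNpeDemo extends App {
  val original = new SubqueryLike(Plan("scan"))

  val bos = new ByteArrayOutputStream()
  val oos = new ObjectOutputStream(bos)
  oos.writeObject(original)
  oos.close()

  val copy = new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray))
    .readObject().asInstanceOf[SubqueryLike]

  println(original.describe())       // "scan"
  println(copy.executedPlan == null) // true: the transient field did not survive
  // copy.describe() would throw a NullPointerException at this point.
}
```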

spark git commit: [SPARK-15791] Fix NPE in ScalarSubquery

2016-06-09 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 d45aa50fc -> ebbbf2136 [SPARK-15791] Fix NPE in ScalarSubquery ## What changes were proposed in this pull request? The fix is pretty simple: just don't make the executedPlan transient in `ScalarSubquery`, since it is referenced at

spark git commit: [SPARK-15850][SQL] Remove function grouping in SparkSession

2016-06-09 Thread hvanhovell
Repository: spark Updated Branches: refs/heads/master 4d9d9cc58 -> 16df133d7 [SPARK-15850][SQL] Remove function grouping in SparkSession ## What changes were proposed in this pull request? SparkSession does not have that many functions due to better namespacing, and as a result we probably

spark git commit: [SPARK-15853][SQL] HDFSMetadataLog.get should close the input stream

2016-06-09 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-2.0 00bbf7873 -> ca0801120 [SPARK-15853][SQL] HDFSMetadataLog.get should close the input stream ## What changes were proposed in this pull request? This PR closes the input stream created in `HDFSMetadataLog.get` ## How was this patch
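A minimal sketch of the pattern this fix applies: wrap the read in `try`/`finally` so the stream is closed even when reading fails. The Hadoop `FileSystem`/`Path` usage below is illustrative and is not the actual `HDFSMetadataLog` code.

```scala
// Sketch of "always close the input stream", using plain Hadoop FS APIs.
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object ReadAndClose {
  def readFully(path: Path, conf: Configuration): Array[Byte] = {
    val fs = path.getFileSystem(conf)
    val in = fs.open(path)
    try {
      val out = new java.io.ByteArrayOutputStream()
      val buf = new Array[Byte](8192)
      var n = in.read(buf)
      while (n != -1) {
        out.write(buf, 0, n)
        n = in.read(buf)
      }
      out.toByteArray
    } finally {
      in.close() // close the stream even if reading/deserialization fails
    }
  }
}
```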

spark git commit: [SPARK-15853][SQL] HDFSMetadataLog.get should close the input stream

2016-06-09 Thread tdas
Repository: spark Updated Branches: refs/heads/master b914e1930 -> 4d9d9cc58 [SPARK-15853][SQL] HDFSMetadataLog.get should close the input stream ## What changes were proposed in this pull request? This PR closes the input stream created in `HDFSMetadataLog.get` ## How was this patch

spark git commit: [SPARK-15794] Should truncate toString() of very wide plans

2016-06-09 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 83070cd1d -> b914e1930 [SPARK-15794] Should truncate toString() of very wide plans ## What changes were proposed in this pull request? With very wide tables, e.g. thousands of fields, the plan output is unreadable and often causes OOMs
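A rough sketch of the idea, not Spark's implementation: when rendering a very long field list, show only the first N elements and summarize the rest, so `toString()` output stays bounded. The helper name and the default limit here are assumptions.

```scala
// Sketch: bound the rendered length of a very wide field list.
object TruncatedString {
  /** Render at most `maxFields` elements, then summarize the remainder. */
  def apply(items: Seq[String], maxFields: Int = 25): String = {
    if (items.length <= maxFields) {
      items.mkString("[", ", ", "]")
    } else {
      val shown = items.take(maxFields).mkString("[", ", ", "")
      s"$shown, ... ${items.length - maxFields} more fields]"
    }
  }
}

object TruncatedStringDemo extends App {
  // A "table" with thousands of columns renders as a short summary.
  val fields = (1 to 3000).map(i => s"col$i")
  println(TruncatedString(fields))
}
```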

spark git commit: [SPARK-15794] Should truncate toString() of very wide plans

2016-06-09 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-2.0 3119d8eef -> 00bbf7873 [SPARK-15794] Should truncate toString() of very wide plans ## What changes were proposed in this pull request? With very wide tables, e.g. thousands of fields, the plan output is unreadable and often causes

spark git commit: [SPARK-15841][Tests] REPLSuite has incorrect env set for a couple of tests.

2016-06-09 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 b2d076c35 -> 3119d8eef [SPARK-15841][Tests] REPLSuite has incorrect env set for a couple of tests. Description from JIRA: in ReplSuite, a test that can be tested well in just local mode should not have to start a local-cluster.

spark git commit: [SPARK-15841][Tests] REPLSuite has incorrect env set for a couple of tests.

2016-06-09 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master aa0364510 -> 83070cd1d [SPARK-15841][Tests] REPLSuite has incorrect env set for a couple of tests. Description from JIRA: in ReplSuite, a test that can be tested well in just local mode should not have to start a local-cluster. And

spark git commit: [SPARK-12447][YARN] Only update the states when executor is successfully launched

2016-06-09 Thread vanzin
Repository: spark Updated Branches: refs/heads/branch-2.0 b42e3d886 -> b2d076c35 [SPARK-12447][YARN] Only update the states when executor is successfully launched The details are described in https://issues.apache.org/jira/browse/SPARK-12447. vanzin Please help to review, thanks a lot.
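As a loose illustration of the principle only (not the actual `YarnAllocator` code), the sketch below updates its bookkeeping after the launch call has succeeded, so a failed launch leaves the counters untouched. All names here are hypothetical.

```scala
// Generic sketch: mutate tracking state only on the successful launch path.
import scala.collection.mutable
import scala.util.{Failure, Success, Try}

class ExecutorTracker(launch: String => Unit) {
  private val running = mutable.Set.empty[String]
  private var numRunning = 0

  def launchExecutor(executorId: String): Boolean = {
    Try(launch(executorId)) match {
      case Success(_) =>
        running += executorId   // state is updated only after success
        numRunning += 1
        true
      case Failure(e) =>
        println(s"Launch of $executorId failed: ${e.getMessage}")
        false                   // counters remain consistent on failure
    }
  }

  def runningCount: Int = numRunning
}
```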

spark git commit: [SPARK-12447][YARN] Only update the states when executor is successfully launched

2016-06-09 Thread vanzin
Repository: spark Updated Branches: refs/heads/master b0768538e -> aa0364510 [SPARK-12447][YARN] Only update the states when executor is successfully launched The details are described in https://issues.apache.org/jira/browse/SPARK-12447. vanzin Please help to review, thanks a lot. Author:

spark git commit: [SPARK-14321][SQL] Reduce date format cost and string-to-date cost in date functions

2016-06-09 Thread rxin
Repository: spark Updated Branches: refs/heads/master 6cb71f473 -> b0768538e [SPARK-14321][SQL] Reduce date format cost and string-to-date cost in date functions ## What changes were proposed in this pull request? The current implementations of `UnixTime` and `FromUnixTime` do not cache
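A hedged sketch of the optimization's idea, not the actual `UnixTime`/`FromUnixTime` implementation: construct the date formatter once per expression instance and reuse it across rows, instead of rebuilding it for every row. The class name and the UTC default are assumptions made for the example.

```scala
// Sketch: cache the formatter instead of re-creating it per row.
import java.text.SimpleDateFormat
import java.util.{Date, TimeZone}

class CachedUnixTime(pattern: String) {
  // Built once and reused; SimpleDateFormat is not thread-safe, so this
  // assumes one instance per task/partition.
  private lazy val formatter: SimpleDateFormat = {
    val f = new SimpleDateFormat(pattern)
    f.setTimeZone(TimeZone.getTimeZone("UTC"))
    f
  }

  /** String timestamp -> seconds since the epoch. */
  def toUnixTime(s: String): Long = formatter.parse(s).getTime / 1000L

  /** Seconds since the epoch -> formatted string. */
  def fromUnixTime(seconds: Long): String = formatter.format(new Date(seconds * 1000L))
}
```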

spark git commit: [SPARK-14321][SQL] Reduce date format cost and string-to-date cost in date functions

2016-06-09 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 0408793aa -> b42e3d886 [SPARK-14321][SQL] Reduce date format cost and string-to-date cost in date functions ## What changes were proposed in this pull request? The current implementations of `UnixTime` and `FromUnixTime` do not cache

spark git commit: [SPARK-15839] Fix Maven doc-jar generation when JAVA_7_HOME is set

2016-06-09 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 07a914c09 -> 0408793aa [SPARK-15839] Fix Maven doc-jar generation when JAVA_7_HOME is set ## What changes were proposed in this pull request? It looks like the nightly Maven snapshots broke after we set `JAVA_7_HOME` in the build:

spark git commit: [SPARK-15839] Fix Maven doc-jar generation when JAVA_7_HOME is set

2016-06-09 Thread yhuai
Repository: spark Updated Branches: refs/heads/master f74b77713 -> 6cb71f473 [SPARK-15839] Fix Maven doc-jar generation when JAVA_7_HOME is set ## What changes were proposed in this pull request? It looks like the nightly Maven snapshots broke after we set `JAVA_7_HOME` in the build:

spark git commit: [SPARK-15827][BUILD] Publish Spark's forked sbt-pom-reader to Maven Central

2016-06-09 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.6 bb917fc65 -> 739d992f0 [SPARK-15827][BUILD] Publish Spark's forked sbt-pom-reader to Maven Central Spark's SBT build currently uses a fork of the sbt-pom-reader plugin but depends on that fork via an SBT subproject which is cloned from

spark git commit: [SPARK-15827][BUILD] Publish Spark's forked sbt-pom-reader to Maven Central

2016-06-09 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-2.0 10f759947 -> 07a914c09 [SPARK-15827][BUILD] Publish Spark's forked sbt-pom-reader to Maven Central Spark's SBT build currently uses a fork of the sbt-pom-reader plugin but depends on that fork via an SBT subproject which is cloned from

spark git commit: [SPARK-15827][BUILD] Publish Spark's forked sbt-pom-reader to Maven Central

2016-06-09 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master e594b4928 -> f74b77713 [SPARK-15827][BUILD] Publish Spark's forked sbt-pom-reader to Maven Central Spark's SBT build currently uses a fork of the sbt-pom-reader plugin but depends on that fork via an SBT subproject which is cloned from

spark git commit: [SPARK-15788][PYSPARK][ML] PySpark IDFModel missing "idf" property

2016-06-09 Thread mlnick
Repository: spark Updated Branches: refs/heads/branch-2.0 eb9e8fc09 -> 10f759947 [SPARK-15788][PYSPARK][ML] PySpark IDFModel missing "idf" property ## What changes were proposed in this pull request? Add method idf to IDF in PySpark ## How was this patch tested? Add unit test Author: Jeff

spark git commit: [SPARK-15788][PYSPARK][ML] PySpark IDFModel missing "idf" property

2016-06-09 Thread mlnick
Repository: spark Updated Branches: refs/heads/master 99386fe39 -> e594b4928 [SPARK-15788][PYSPARK][ML] PySpark IDFModel missing "idf" property ## What changes were proposed in this pull request? Add method idf to IDF in PySpark ## How was this patch tested? Add unit test Author: Jeff

spark git commit: [SPARK-15804][SQL] Include metadata in the toStructType

2016-06-09 Thread wenchen
Repository: spark Updated Branches: refs/heads/master 147c02082 -> 99386fe39 [SPARK-15804][SQL] Include metadata in the toStructType ## What changes were proposed in this pull request? The helper function 'toStructType' in the AttributeSeq class doesn't include the metadata when it builds the
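An illustrative sketch of the underlying issue, not the actual `AttributeSeq.toStructType` code: when a `StructType` is assembled from attributes, each field's `Metadata` has to be passed through to the `StructField` rather than left at the default empty value. The `Attr` stand-in type is made up for the example.

```scala
// Sketch: carry per-field metadata into the resulting StructType.
import org.apache.spark.sql.types._

object MetadataPreservingStructType {
  // Simplified stand-in for an attribute: name, type, nullability, metadata.
  final case class Attr(name: String, dataType: DataType, nullable: Boolean, metadata: Metadata)

  def toStructType(attrs: Seq[Attr]): StructType =
    StructType(attrs.map { a =>
      // Passing a.metadata (instead of the default Metadata.empty) is the point.
      StructField(a.name, a.dataType, a.nullable, a.metadata)
    })
}

object MetadataDemo extends App {
  import MetadataPreservingStructType._

  val md = new MetadataBuilder().putString("comment", "user id").build()
  val schema = toStructType(Seq(Attr("id", LongType, nullable = false, md)))
  println(schema("id").metadata) // metadata is preserved in the schema
}
```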

spark git commit: [SPARK-15804][SQL] Include metadata in the toStructType

2016-06-09 Thread wenchen
Repository: spark Updated Branches: refs/heads/branch-2.0 77c08d224 -> eb9e8fc09 [SPARK-15804][SQL] Include metadata in the toStructType ## What changes were proposed in this pull request? The helper function 'toStructType' in the AttributeSeq class doesn't include the metadata when it builds

spark git commit: [SPARK-15818][BUILD] Upgrade to Hadoop 2.7.2

2016-06-09 Thread srowen
Repository: spark Updated Branches: refs/heads/branch-2.0 8ee93eed9 -> 77c08d224 [SPARK-15818][BUILD] Upgrade to Hadoop 2.7.2 ## What changes were proposed in this pull request? Updates the Hadoop version from 2.7.0 to 2.7.2 when the Hadoop-2.7 build profile is used ## How was this patch

spark git commit: [SPARK-15818][BUILD] Upgrade to Hadoop 2.7.2

2016-06-09 Thread srowen
Repository: spark Updated Branches: refs/heads/master 921fa40b1 -> 147c02082 [SPARK-15818][BUILD] Upgrade to Hadoop 2.7.2 ## What changes were proposed in this pull request? Updates the Hadoop version from 2.7.0 to 2.7.2 when the Hadoop-2.7 build profile is used ## How was this patch tested?

spark git commit: [SPARK-12712] Fix failure in ./dev/test-dependencies when run against empty .m2 cache

2016-06-09 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.6 5830828ef -> bb917fc65 [SPARK-12712] Fix failure in ./dev/test-dependencies when run against empty .m2 cache This patch fixes a bug in `./dev/test-dependencies.sh` which caused spurious failures when the script was run on a machine

spark git commit: [SPARK-12712] Fix failure in ./dev/test-dependencies when run against empty .m2 cache

2016-06-09 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master d5807def1 -> 921fa40b1 [SPARK-12712] Fix failure in ./dev/test-dependencies when run against empty .m2 cache This patch fixes a bug in `./dev/test-dependencies.sh` which caused spurious failures when the script was run on a machine with

spark git commit: [SPARK-12712] Fix failure in ./dev/test-dependencies when run against empty .m2 cache

2016-06-09 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-2.0 96c011d5b -> 8ee93eed9 [SPARK-12712] Fix failure in ./dev/test-dependencies when run against empty .m2 cache This patch fixes a bug in `./dev/test-dependencies.sh` which caused spurious failures when the script was run on a machine

spark git commit: [MINOR][DOC] In Dataset docs, remove self link to Dataset and add link to Column

2016-06-09 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 48239b5f1 -> 96c011d5b [MINOR][DOC] In Dataset docs, remove self link to Dataset and add link to Column ## What changes were proposed in this pull request? Documentation Fix ## How was this patch tested? Author: Sandeep Singh