spark git commit: [SQL][SPARK-6742]: Don't push down predicates which reference partition column(s)

2015-04-13 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 85ee0cabe -> 3a205bbd9 [SQL][SPARK-6742]: Don't push down predicates which reference partition column(s) cc liancheng Author: Yash Datta yash.da...@guavus.com Closes #5390 from saucam/fpush and squashes the following commits: 3f026d6
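The idea behind this fix can be sketched as follows (a hypothetical simplification, not Spark's actual planner code): predicates that reference a partition column are resolved via partition pruning against metadata, so they must be kept out of the set of filters pushed down to the Parquet scan.

```python
# Hypothetical sketch: split predicates so those referencing partition
# columns are NOT pushed down to the data-source scan.
def split_predicates(predicates, partition_columns):
    """Each predicate is (expr_string, set_of_referenced_columns)."""
    partition_preds, pushable = [], []
    for pred in predicates:
        _, refs = pred
        # Any overlap with the partition columns keeps the predicate out
        # of the pushed-down set; it is handled by partition pruning.
        if refs & set(partition_columns):
            partition_preds.append(pred)
        else:
            pushable.append(pred)
    return partition_preds, pushable

preds = [("date = '2015-04-13'", {"date"}), ("value > 10", {"value"})]
partition_preds, pushable = split_predicates(preds, ["date"])
```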

spark git commit: [Spark-4848] Allow different Worker configurations in standalone cluster

2015-04-13 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 4898dfa46 -> 435b8779d [Spark-4848] Allow different Worker configurations in standalone cluster This refixes #3699 with the latest code, and fixes SPARK-4848. I've changed the standalone cluster scripts to allow different workers to have

spark git commit: [SPARK-5941] [SQL] Unit Test loads the table `src` twice for leftsemijoin.q

2015-04-13 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master e63a86abe -> c5602bdc3 [SPARK-5941] [SQL] Unit Test loads the table `src` twice for leftsemijoin.q In `leftsemijoin.q`, there is a data loading command for table `sales` already, but in `TestHive`, it also created the table `sales`, which

spark git commit: [SPARK-5931][CORE] Use consistent naming for time properties

2015-04-13 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master c5602bdc3 -> c4ab255e9 [SPARK-5931][CORE] Use consistent naming for time properties I've added new utility methods to convert times specified as e.g. 120s, 240ms, 360us to a consistent internal representation.
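A utility like the one described might look like the following sketch (a hypothetical simplification; the function name, suffix table, and choice of milliseconds as the canonical unit are assumptions, not Spark's actual implementation):

```python
import re

# Hypothetical sketch: parse "120s", "240ms", "360us" into a single
# canonical unit (milliseconds here).
_SUFFIX_MS = {"us": 0.001, "ms": 1, "s": 1000,
              "m": 60_000, "min": 60_000, "h": 3_600_000, "d": 86_400_000}

def time_string_as_ms(s):
    m = re.fullmatch(r"(\d+)\s*([a-z]+)?", s.strip().lower())
    if not m:
        raise ValueError(f"invalid time string: {s!r}")
    # A bare number defaults to milliseconds.
    value, suffix = int(m.group(1)), m.group(2) or "ms"
    if suffix not in _SUFFIX_MS:
        raise ValueError(f"unknown time suffix: {suffix!r}")
    return int(value * _SUFFIX_MS[suffix])
```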

spark git commit: [SPARK-6303][SQL] Remove unnecessary Average in GeneratedAggregate

2015-04-13 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master d7f2c1986 -> 5b8b324f3 [SPARK-6303][SQL] Remove unnecessary Average in GeneratedAggregate Because `Average` is a `PartialAggregate`, we never get an `Average` node when reaching `HashAggregation` to prepare `GeneratedAggregate`. That is

spark git commit: [SPARK-6877][SQL] Add code generation support for Min

2015-04-13 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 5b8b324f3 -> 4898dfa46 [SPARK-6877][SQL] Add code generation support for Min Currently `min` is not supported in code generation. This PR adds support for it. Author: Liang-Chi Hsieh vii...@gmail.com Closes #5487 from
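What the generated update expression for MIN amounts to can be sketched as below (a hypothetical illustration of the aggregation logic, not the Scala code Spark actually generates): seed the accumulator with the first non-null input, then keep the smaller of the running value and each new input.

```python
# Hypothetical sketch of the MIN update step.
def update_min(current, value):
    if value is None:          # SQL MIN ignores NULLs
        return current
    if current is None:        # first non-null input seeds the accumulator
        return value
    return value if value < current else current

def sql_min(values):
    acc = None
    for v in values:
        acc = update_min(acc, v)
    return acc
```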

spark git commit: [SPARK-5794] [SQL] fix add jar

2015-04-13 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 3782e1f2b -> b45059d0d [SPARK-5794] [SQL] fix add jar Author: Daoyuan Wang daoyuan.w...@intel.com Closes #4586 from adrian-wang/addjar and squashes the following commits: efdd602 [Daoyuan Wang] move jar to another place 6c707e8 [Daoyuan

spark git commit: [WIP][HOTFIX][SPARK-4123]: Fix bug in PR dependency (all deps. removed issue)

2015-04-13 Thread pwendell
Repository: spark Updated Branches: refs/heads/master 971b95b0c -> 77eeb10fd [WIP][HOTFIX][SPARK-4123]: Fix bug in PR dependency (all deps. removed issue) We're seeing a bug sporadically in the new PR dependency comparison test whereby it notes that *all* dependencies are removed. This

spark git commit: [Minor][SparkR] Minor refactor and removes redundancy related to cleanClosure.

2015-04-13 Thread shivaram
Repository: spark Updated Branches: refs/heads/master b45059d0d -> 0ba3fdd59 [Minor][SparkR] Minor refactor and removes redundancy related to cleanClosure. 1. Only use `cleanClosure` in creation of RRDDs. Normally, users and developers do not need to call `cleanClosure` in their function

[1/2] spark git commit: [SPARK-5957][ML] better handling of parameters

2015-04-13 Thread meng
Repository: spark Updated Branches: refs/heads/master 0ba3fdd59 -> 971b95b0c http://git-wip-us.apache.org/repos/asf/spark/blob/971b95b0/mllib/src/main/scala/org/apache/spark/ml/util/SchemaUtils.scala -- diff --git

[2/2] spark git commit: [SPARK-5957][ML] better handling of parameters

2015-04-13 Thread meng
[SPARK-5957][ML] better handling of parameters The design doc was posted on the JIRA page. Python changes will be in a follow-up PR. jkbradley
1. Use codegen for shared params.
2. Move shared params to package `ml.param.shared`.
3. Set default values in `Params` instead of in `Param`.
4. Add a

spark git commit: [SPARK-6868][YARN] Fix broken container log link on executor page when HTTPS_ONLY.

2015-04-13 Thread srowen
Repository: spark Updated Branches: refs/heads/master 68d1faa3c -> 950645d59 [SPARK-6868][YARN] Fix broken container log link on executor page when HTTPS_ONLY. Correct the HTTP scheme in the YARN container log link in the Spark UI when YARN is configured to be HTTPS_ONLY. Uses the

spark git commit: [SPARK-6860][Streaming][WebUI] Fix the possible inconsistency of StreamingPage

2015-04-13 Thread srowen
Repository: spark Updated Branches: refs/heads/master cadd7d72c -> 14ce3ea2c [SPARK-6860][Streaming][WebUI] Fix the possible inconsistency of StreamingPage Because `StreamingPage.render` doesn't hold the `listener` lock when generating the content, the different parts of content may have some
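The essence of this kind of fix can be sketched as follows (a hypothetical, minimal model of the race, not the actual StreamingPage code): read every field the page needs under one lock, so the rendered content reflects a single consistent state rather than a mix of updates.

```python
import threading

# Hypothetical sketch: a listener whose state is mutated by one thread
# and rendered by another.
class ListenerSketch:
    def __init__(self):
        self._lock = threading.Lock()
        self._running = 0
        self._completed = 0

    def on_batch_started(self):
        with self._lock:
            self._running += 1

    def on_batch_completed(self):
        with self._lock:
            self._running -= 1
            self._completed += 1

    def snapshot(self):
        # Without holding the lock here, a renderer could observe
        # `running` already decremented but `completed` not yet
        # incremented, producing an inconsistent page.
        with self._lock:
            return (self._running, self._completed)
```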

spark git commit: [SPARK-6868][YARN] Fix broken container log link on executor page when HTTPS_ONLY.

2015-04-13 Thread srowen
Repository: spark Updated Branches: refs/heads/branch-1.3 8d4176132 -> 8e5caa227 [SPARK-6868][YARN] Fix broken container log link on executor page when HTTPS_ONLY. Correct the HTTP scheme in the YARN container log link in the Spark UI when YARN is configured to be HTTPS_ONLY. Uses

spark git commit: [SPARK-6870][Yarn] Catch InterruptedException when the YARN application state monitor thread has been interrupted

2015-04-13 Thread srowen
Repository: spark Updated Branches: refs/heads/master 240ea03fa -> 202ebf06e [SPARK-6870][Yarn] Catch InterruptedException when the YARN application state monitor thread has been interrupted In PR #5305 we interrupt the monitor thread but forget to catch the InterruptedException, then in the log

spark git commit: [SPARK-6440][CORE]Handle IPv6 addresses properly when constructing URI

2015-04-13 Thread srowen
Repository: spark Updated Branches: refs/heads/master 14ce3ea2c -> 9d117cee0 [SPARK-6440][CORE] Handle IPv6 addresses properly when constructing URI Author: nyaapa nya...@gmail.com Closes #5424 from nyaapa/master and squashes the following commits: 6b717aa [nyaapa] [SPARK-6440][CORE] Remove

spark git commit: [SPARK-6671] Add status command for spark daemons

2015-04-13 Thread srowen
Repository: spark Updated Branches: refs/heads/master 9d117cee0 -> 240ea03fa [SPARK-6671] Add status command for spark daemons SPARK-6671 Currently, using the spark-daemon.sh script, we can start and stop the Spark daemons, but we cannot get the status of the daemons. It would be nice to include

spark git commit: [SPARK-6207] [YARN] [SQL] Adds delegation tokens for metastore to conf.

2015-04-13 Thread tgraves
Repository: spark Updated Branches: refs/heads/master b29663eee -> 77620be76 [SPARK-6207] [YARN] [SQL] Adds delegation tokens for metastore to conf. Adds hive2-metastore delegation token to conf when running in secure mode. Without this change, running on YARN in cluster mode fails with a GSS

spark git commit: [SPARK-6352] [SQL] Add DirectParquetOutputCommitter

2015-04-13 Thread lian
Repository: spark Updated Branches: refs/heads/master 202ebf06e -> b29663eee [SPARK-6352] [SQL] Add DirectParquetOutputCommitter Add a DirectParquetOutputCommitter class that skips the _temporary directory when saving to S3. Add new config value spark.sql.parquet.useDirectParquetOutputCommitter
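The trade-off behind a "direct" committer can be sketched as below (a hypothetical toy model, not Hadoop's OutputCommitter API): the default committer has tasks write under `_temporary/` and renames on commit, which is slow on S3 where rename is a copy; a direct committer returns the final path immediately and makes commit a no-op, at the cost of visible partial output if a task fails.

```python
# Hypothetical sketch of the two committer strategies.
class RenamingCommitter:
    def task_output_path(self, final_path, task_id):
        # Tasks write to a staging directory first.
        return f"{final_path}/_temporary/task_{task_id}"
    def commit_task(self, task_id):
        pass  # would rename _temporary/task_N into final_path

class DirectCommitter:
    def task_output_path(self, final_path, task_id):
        # Tasks write straight to the final location; no rename needed,
        # but a failed task can leave partial output visible.
        return final_path
    def commit_task(self, task_id):
        pass  # nothing to move
```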