spark git commit: [SPARK-11205][PYSPARK] Delegate to scala DataFrame API rather than print in python

2015-10-20 Thread rxin
Repository: spark Updated Branches: refs/heads/master 1107bd958 -> 5cdea7d1e [SPARK-11205][PYSPARK] Delegate to scala DataFrame API rather than print in python No test needed. Verify it manually in the pyspark shell. Author: Jeff Zhang Closes #9177 from zjffdu/SPARK-11205. Project: htt
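
For context, a minimal sketch of the delegation pattern this change follows in PySpark: format the output on the JVM side and only print it from Python. The method shown and its internals are illustrative, not necessarily the exact code touched by #9177.

```python
# Hypothetical sketch: `self._jdf` is the py4j handle to the underlying Scala DataFrame.
class DataFrame(object):
    def __init__(self, jdf, sql_ctx):
        self._jdf = jdf
        self.sql_ctx = sql_ctx

    def explain(self, extended=False):
        # Delegate plan formatting to the Scala side and just print the result,
        # instead of rebuilding the string in Python.
        if extended:
            print(self._jdf.queryExecution().toString())
        else:
            print(self._jdf.queryExecution().executedPlan().toString())
```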

spark git commit: [SPARK-11221][SPARKR] fix R doc for lit and add examples

2015-10-20 Thread shivaram
Repository: spark Updated Branches: refs/heads/master 135ade905 -> 1107bd958 [SPARK-11221][SPARKR] fix R doc for lit and add examples Currently the documentation for `lit` is inconsistent with the doc format, references "Scala symbol", and has no example. Fixing that. shivaram Author: felixcheung
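
The fix itself is R documentation only; as a rough illustration of what `lit` does, here is the analogous PySpark call (the data and column names are made up, and an existing `sqlContext` from the shell is assumed):

```python
from pyspark.sql import Row
from pyspark.sql.functions import lit

df = sqlContext.createDataFrame([Row(x=1), Row(x=2)])
# lit(10) creates a literal Column, here added as a constant "ten" column.
df.select(df.x, lit(10).alias("ten")).show()
```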

spark git commit: [MINOR][ML] fix doc warnings

2015-10-20 Thread meng
Repository: spark Updated Branches: refs/heads/master 45861693b -> 135ade905 [MINOR][ML] fix doc warnings Without an empty line, Sphinx will treat the doctest as part of the docstring. holdenk ~~~ /Users/meng/src/spark/python/pyspark/ml/feature.py:docstring of pyspark.ml.feature.CountVectorizer:3: ERROR: Un
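
A minimal illustration of the docstring layout being fixed, using a stub class; the blank line before the doctest is what keeps Sphinx from reporting the "Unexpected indentation" style errors quoted above.

```python
class CountVectorizer(object):
    """
    Extracts a vocabulary from document collections.

    >>> # The blank line above this doctest is the fix: without it, Sphinx
    >>> # runs the indented ">>>" lines together with the description text.
    >>> cv = CountVectorizer()
    """
    pass
```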

spark git commit: [SPARK-10082][MLLIB] minor style updates for matrix indexing after #8271

2015-10-20 Thread meng
Repository: spark Updated Branches: refs/heads/master 89e6db615 -> 45861693b [SPARK-10082][MLLIB] minor style updates for matrix indexing after #8271 * `>=0` => `>= 0` * print `i`, `j` in the log message MechCoder Author: Xiangrui Meng Closes #9189 from mengxr/SPARK-10082. Project: http:

spark git commit: [SPARK-11153][SQL] Disables Parquet filter push-down for string and binary columns

2015-10-20 Thread lian
Repository: spark Updated Branches: refs/heads/master aea7142c9 -> 89e6db615 [SPARK-11153][SQL] Disables Parquet filter push-down for string and binary columns Due to PARQUET-251, `BINARY` columns in existing Parquet files may be written with corrupted statistics information. This informatio
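
The change itself special-cases BINARY/string columns inside the Parquet data source. At the user level, Parquet filter push-down as a whole is governed by an existing SQL option; a hedged sketch of how the behavior can be observed (the path and column name are made up):

```python
# Push-down is controlled globally by spark.sql.parquet.filterPushdown (on by default);
# after this change, predicates on string/binary columns are no longer pushed down
# even when the option is enabled, to avoid relying on corrupted statistics.
sqlContext.setConf("spark.sql.parquet.filterPushdown", "true")
df = sqlContext.read.parquet("/path/to/table")
df.filter(df.name == "alice").explain()  # inspect whether the filter reaches Parquet
```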

spark git commit: [SPARK-11153][SQL] Disables Parquet filter push-down for string and binary columns

2015-10-20 Thread lian
Repository: spark Updated Branches: refs/heads/branch-1.5 a3ab67146 -> 0887e5e87 [SPARK-11153][SQL] Disables Parquet filter push-down for string and binary columns Due to PARQUET-251, `BINARY` columns in existing Parquet files may be written with corrupted statistics information. This inform

spark git commit: [SPARK-10767][PYSPARK] Make pyspark shared params codegen more consistent

2015-10-20 Thread meng
Repository: spark Updated Branches: refs/heads/master da46b77af -> aea7142c9 [SPARK-10767][PYSPARK] Make pyspark shared params codegen more consistent Namely, "." shows up in some places in the template when using the param docstring and not in others. Author: Holden Karau Closes #9017 from

spark git commit: [SPARK-10082][MLLIB] Validate i, j in apply DenseMatrices and SparseMatrices

2015-10-20 Thread meng
Repository: spark Updated Branches: refs/heads/master 04521ea06 -> da46b77af [SPARK-10082][MLLIB] Validate i, j in apply DenseMatrices and SparseMatrices Given row_ind should be less than the number of rows; given col_ind should be less than the number of cols. The current code in master gives
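
The actual validation lives in the Scala DenseMatrix/SparseMatrix `apply` methods; a rough Python sketch of the bounds check being described (names are illustrative):

```python
def check_index(i, j, num_rows, num_cols):
    # Both indices must be non-negative and strictly less than their dimension;
    # the offending i or j is included in the error message (see the follow-up
    # style commit above, which also prints i and j).
    if not (0 <= i < num_rows):
        raise IndexError("row index %d is out of range [0, %d)" % (i, num_rows))
    if not (0 <= j < num_cols):
        raise IndexError("column index %d is out of range [0, %d)" % (j, num_cols))
```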

spark git commit: [SPARK-10269][PYSPARK][MLLIB] Add @since annotation to pyspark.mllib.classification

2015-10-20 Thread meng
Repository: spark Updated Branches: refs/heads/master 9f49895fe -> 04521ea06 [SPARK-10269][PYSPARK][MLLIB] Add @since annotation to pyspark.mllib.classification Duplicated the since decorator from pyspark.sql into pyspark (also tweaked to handle functions without docstrings). Added since to
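
A sketch of such a `since` decorator, based on the description above: it appends a versionadded note to the docstring and tolerates functions that have no docstring at all. Details are illustrative rather than the exact code merged.

```python
import re

def since(version):
    indent_p = re.compile(r'\n( +)')

    def deco(f):
        original = f.__doc__ or ""  # tolerate functions without a docstring
        indents = indent_p.findall(original)
        indent = ' ' * (min(len(m) for m in indents) if indents else 0)
        f.__doc__ = original.rstrip() + "\n\n%s.. versionadded:: %s" % (indent, version)
        return f
    return deco
```

A method would then be annotated with e.g. `@since("1.4.0")` directly above its definition.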

spark git commit: [SPARK-10261][DOCUMENTATION, ML] Fixed @Since annotation to ml.evaluation

2015-10-20 Thread meng
Repository: spark Updated Branches: refs/heads/master 82e9d9c81 -> 9f49895fe [SPARK-10261][DOCUMENTATION, ML] Fixed @Since annotation to ml.evaluation Author: Tijo Thomas Author: tijo Closes #8554 from tijoparacka/SPARK-10261-2. Project: http://git-wip-us.apache.org/repos/asf/spark/repo C

spark git commit: [SPARK-10272][PYSPARK][MLLIB] Added @since tags to pyspark.mllib.evaluation

2015-10-20 Thread meng
Repository: spark Updated Branches: refs/heads/master 06e6b765d -> 82e9d9c81 [SPARK-10272][PYSPARK][MLLIB] Added @since tags to pyspark.mllib.evaluation Duplicated the since decorator from pyspark.sql into pyspark (also tweaked to handle functions without docstrings). Added since to public m

spark git commit: [SPARK-11149] [SQL] Improve cache performance for primitive types

2015-10-20 Thread davies
Repository: spark Updated Branches: refs/heads/master 67d468f8d -> 06e6b765d [SPARK-11149] [SQL] Improve cache performance for primitive types This PR improves the performance by: 1) Generating an Iterator that takes Iterator[CachedBatch] as input, and calling accessors (unrolling the loop for columns

spark git commit: [SPARK-11111] [SQL] fast null-safe join

2015-10-20 Thread davies
Repository: spark Updated Branches: refs/heads/master 478c7ce86 -> 67d468f8d [SPARK-11111] [SQL] fast null-safe join Currently, we use CartesianProduct for join with a null-safe-equal condition. ``` scala> sqlContext.sql("select * from t a join t b on (a.i <=> b.i)").explain == Physical Plan ==
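
A quick way to reproduce the plan from the pyspark shell (assuming an existing `sqlContext`; the toy data is made up):

```python
# Null-safe equality (<=>) treats NULL as equal to NULL; previously this join
# was planned as a CartesianProduct, after the change it plans as an equi-join.
df = sqlContext.createDataFrame([(1,), (2,), (None,)], ["i"])
df.registerTempTable("t")
sqlContext.sql("select * from t a join t b on (a.i <=> b.i)").explain()
```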

spark git commit: [SPARK-6740] [SQL] correctly parse NOT operator with comparison operations

2015-10-20 Thread davies
Repository: spark Updated Branches: refs/heads/master 2f6dd634c -> 478c7ce86 [SPARK-6740] [SQL] correctly parse NOT operator with comparison operations We can't parse the `NOT` operator with comparison operations like `SELECT NOT TRUE > TRUE`; this PR fixes it. Takes over https://github.com/apac
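
A minimal check from the pyspark shell (assuming `sqlContext`); the point is that the comparison binds tighter than NOT, so the statement parses as NOT (TRUE > TRUE):

```python
# TRUE > TRUE is false, so NOT (TRUE > TRUE) should return a single true row
# instead of failing to parse.
sqlContext.sql("SELECT NOT TRUE > TRUE").collect()
```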

spark git commit: [SPARK-11105] [YARN] Distribute log4j.properties to executors

2015-10-20 Thread vanzin
Repository: spark Updated Branches: refs/heads/master e18b571c3 -> 2f6dd634c [SPARK-11105] [YARN] Distribute log4j.properties to executors Currently the log4j.properties file is not uploaded to executors, which leads them to use the default values. This fix will make sure that file is always

spark git commit: [SPARK-10447][SPARK-3842][PYSPARK] upgrade pyspark to py4j0.9

2015-10-20 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 94139557c -> e18b571c3 [SPARK-10447][SPARK-3842][PYSPARK] upgrade pyspark to py4j0.9 Upgrade to Py4j0.9 Author: Holden Karau Author: Holden Karau Closes #8615 from holdenk/SPARK-10447-upgrade-pyspark-to-py4j0.9. Project: http://git-wi

spark git commit: [SPARK-10463] [SQL] remove PromotePrecision during optimization

2015-10-20 Thread davies
Repository: spark Updated Branches: refs/heads/master 60851bc7b -> 94139557c [SPARK-10463] [SQL] remove PromotePrecision during optimization PromotePrecision is not necessary after HiveTypeCoercion is done. Jira: https://issues.apache.org/jira/browse/SPARK-10463 Author: Daoyuan Wang Closes #8

spark git commit: [SPARK-11110][BUILD] Remove transient annotation for parameters.

2015-10-20 Thread srowen
Repository: spark Updated Branches: refs/heads/master 8f74aa639 -> 60851bc7b [SPARK-11110][BUILD] Remove transient annotation for parameters. `transient` annotations on class parameters (not case class parameters or vals) cause compilation errors with Scala 2.11. I underst

spark git commit: [SPARK-10876] Display total uptime for completed applications

2015-10-20 Thread srowen
Repository: spark Updated Branches: refs/heads/master 8b877cc4e -> 8f74aa639 [SPARK-10876] Display total uptime for completed applications Author: Jean-Baptiste Onofré Closes #9059 from jbonofre/SPARK-10876. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip