spark git commit: [SPARK-10684] [SQL] StructType.interpretedOrdering need not to be serialized

2015-09-18 Thread rxin
Repository: spark Updated Branches: refs/heads/master 74d8f7dda -> e3b5d6cb2 [SPARK-10684] [SQL] StructType.interpretedOrdering need not to be serialized Kryo fails with buffer overflow even with max value (2G). {noformat} org.apache.spark.SparkException: Kryo serialization failed: Buffer
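The fix marks `StructType`'s derived `interpretedOrdering` as `@transient lazy val` in Scala, so it is skipped during serialization and recomputed on demand. A minimal Python analogue of that idea (not Spark's code; `Schema` and its fields are hypothetical) excludes the derived field from the pickled form and rebuilds it lazily:

```python
import pickle

class Schema:
    """Sketch: a derived ordering is dropped from the serialized form
    (like Scala's @transient) and rebuilt lazily after deserialization."""
    def __init__(self, fields):
        self.fields = fields
        self._ordering = None  # derived, cheap to recompute

    @property
    def ordering(self):
        if self._ordering is None:
            # stand-in for building a real row comparator
            self._ordering = tuple(sorted(self.fields))
        return self._ordering

    def __getstate__(self):
        state = self.__dict__.copy()
        state["_ordering"] = None  # never serialize the derived field
        return state

s = Schema(["b", "a"])
_ = s.ordering                       # force computation before serializing
s2 = pickle.loads(pickle.dumps(s))   # round-trip carries fields only
```

After the round-trip, `s2._ordering` starts out empty and is recomputed on first access, which is exactly why the field never needs to fit in the serialization buffer.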

spark git commit: [SPARK-9808] Remove hash shuffle file consolidation.

2015-09-18 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 3a22b1004 -> 348d7c9a9 [SPARK-9808] Remove hash shuffle file consolidation. Author: Reynold Xin Closes #8812 from rxin/SPARK-9808-1. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: [SPARK-10539] [SQL] Project should not be pushed down through Intersect or Except #8742

2015-09-18 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 00a2911c5 -> c6f8135ee [SPARK-10539] [SQL] Project should not be pushed down through Intersect or Except #8742 Intersect and Except are both set operators and they use all the columns to compare equality between rows. When pushing
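Because Intersect compares entire rows, projecting columns away before the set operation can manufacture matches that do not exist in the full data. A small sketch with plain Python sets (hypothetical data, not from the patch):

```python
# Rows are (id, label) tuples; the query conceptually projects out just the id.
left = {(1, "a"), (2, "b")}
right = {(1, "x"), (3, "b")}

# Correct plan: intersect full rows first, then project.
correct = {row[0] for row in left & right}

# Buggy plan: Project pushed below the Intersect, comparing ids only.
pushed = {row[0] for row in left} & {row[0] for row in right}
```

No full row appears on both sides, so `correct` is empty, but the pushed-down plan reports `{1}` because the ids happen to collide. The same reasoning applies to Except.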

spark git commit: [SPARK-10540] Fixes flaky all-data-type test

2015-09-18 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 35e8ab939 -> 00a2911c5 [SPARK-10540] Fixes flaky all-data-type test This PR breaks the original test case into multiple ones (one test case for each data type). In this way, test failure output can be much more readable. Within each test

spark git commit: [SPARK-10539] [SQL] Project should not be pushed down through Intersect or Except #8742

2015-09-18 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.5 e1e781f04 -> 3df52ccfa [SPARK-10539] [SQL] Project should not be pushed down through Intersect or Except #8742 Intersect and Except are both set operators and they use all the columns to compare equality between rows. When

spark git commit: [SPARK-10540] Fixes flaky all-data-type test

2015-09-18 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.5 2c6a51e14 -> e1e781f04 [SPARK-10540] Fixes flaky all-data-type test This PR breaks the original test case into multiple ones (one test case for each data type). In this way, test failure output can be much more readable. Within each

spark git commit: [SPARK-10449] [SQL] Don't merge decimal types with incompatable precision or scales

2015-09-18 Thread lian
Repository: spark Updated Branches: refs/heads/master c6f8135ee -> 3a22b1004 [SPARK-10449] [SQL] Don't merge decimal types with incompatable precision or scales From JIRA: Schema merging should only handle struct fields. But currently we also reconcile decimal precision and scale

spark git commit: [SPARK-10449] [SQL] Don't merge decimal types with incompatable precision or scales

2015-09-18 Thread lian
Repository: spark Updated Branches: refs/heads/branch-1.5 3df52ccfa -> 4051fffaa [SPARK-10449] [SQL] Don't merge decimal types with incompatable precision or scales From JIRA: Schema merging should only handle struct fields. But currently we also reconcile decimal precision and scale

spark git commit: [SPARK-10623] [SQL] Fixes ORC predicate push-down

2015-09-18 Thread yhuai
Repository: spark Updated Branches: refs/heads/master c8149ef2c -> 22be2ae14 [SPARK-10623] [SQL] Fixes ORC predicate push-down When pushing down a leaf predicate, ORC `SearchArgument` builder requires an extra "parent" predicate (any one among `AND`/`OR`/`NOT`) to wrap the leaf predicate.

spark git commit: [SPARK-10623] [SQL] Fixes ORC predicate push-down

2015-09-18 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.5 a6c315358 -> b3f1e6533 [SPARK-10623] [SQL] Fixes ORC predicate push-down When pushing down a leaf predicate, ORC `SearchArgument` builder requires an extra "parent" predicate (any one among `AND`/`OR`/`NOT`) to wrap the leaf

spark git commit: [SPARK-10611] Clone Configuration for each task for NewHadoopRDD

2015-09-18 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 348d7c9a9 -> 8074208fa [SPARK-10611] Clone Configuration for each task for NewHadoopRDD This patch attempts to fix the Hadoop Configuration thread safety issue for NewHadoopRDD in the same way SPARK-2546 fixed the issue for HadoopRDD.

spark git commit: [SPARK-10611] Clone Configuration for each task for NewHadoopRDD

2015-09-18 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.5 4051fffaa -> a6c315358 [SPARK-10611] Clone Configuration for each task for NewHadoopRDD This patch attempts to fix the Hadoop Configuration thread safety issue for NewHadoopRDD in the same way SPARK-2546 fixed the issue for HadoopRDD.

spark git commit: [MINOR] [ML] override toString of AttributeGroup

2015-09-18 Thread meng
Repository: spark Updated Branches: refs/heads/master 8074208fa -> c8149ef2c [MINOR] [ML] override toString of AttributeGroup This makes equality test failures much more readable. mengxr Author: Eric Liang Closes #8826 from

spark git commit: [SPARK-10615] [PYSPARK] change assertEquals to assertEqual

2015-09-18 Thread meng
Repository: spark Updated Branches: refs/heads/master 20fd35dfd -> 35e8ab939 [SPARK-10615] [PYSPARK] change assertEquals to assertEqual As ```assertEquals``` is deprecated, we need to change ```assertEquals``` to ```assertEqual``` in the existing Python unit tests. Author: Yanbo Liang
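`assertEquals` is a deprecated alias of `unittest.TestCase.assertEqual` (and was later removed entirely in Python 3.12), so the patch is a mechanical rename. A minimal illustration of the canonical spelling:

```python
import unittest

class ExampleTest(unittest.TestCase):
    def test_sum(self):
        # old, deprecated spelling: self.assertEquals(2 + 2, 4)
        self.assertEqual(2 + 2, 4)

result = unittest.TextTestRunner(verbosity=0).run(
    unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTest))
```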

svn commit: r1703900 [4/8] - /spark/site/docs/1.5.0/api/R/

2015-09-18 Thread shivaram
Added: spark/site/docs/1.5.0/api/R/greatest.html URL: http://svn.apache.org/viewvc/spark/site/docs/1.5.0/api/R/greatest.html?rev=1703900&view=auto --- spark/site/docs/1.5.0/api/R/greatest.html (added) +++

svn commit: r1703900 [1/8] - /spark/site/docs/1.5.0/api/R/

2015-09-18 Thread shivaram
Author: shivaram Date: Fri Sep 18 16:25:35 2015 New Revision: 1703900 URL: http://svn.apache.org/viewvc?rev=1703900&view=rev Log: Add 1.5.0 R API docs back Added: spark/site/docs/1.5.0/api/R/ spark/site/docs/1.5.0/api/R/00Index.html spark/site/docs/1.5.0/api/R/00frame_toc.html

svn commit: r1703900 [5/8] - /spark/site/docs/1.5.0/api/R/

2015-09-18 Thread shivaram
Added: spark/site/docs/1.5.0/api/R/log2.html URL: http://svn.apache.org/viewvc/spark/site/docs/1.5.0/api/R/log2.html?rev=1703900&view=auto --- spark/site/docs/1.5.0/api/R/log2.html (added) +++

svn commit: r1703900 [3/8] - /spark/site/docs/1.5.0/api/R/

2015-09-18 Thread shivaram
Added: spark/site/docs/1.5.0/api/R/createExternalTable.html URL: http://svn.apache.org/viewvc/spark/site/docs/1.5.0/api/R/createExternalTable.html?rev=1703900&view=auto ---

svn commit: r1703900 [6/8] - /spark/site/docs/1.5.0/api/R/

2015-09-18 Thread shivaram
Added: spark/site/docs/1.5.0/api/R/randn.html URL: http://svn.apache.org/viewvc/spark/site/docs/1.5.0/api/R/randn.html?rev=1703900&view=auto --- spark/site/docs/1.5.0/api/R/randn.html (added) +++

spark git commit: [SPARK-10451] [SQL] Prevent unnecessary serializations in InMemoryColumnarTableScan

2015-09-18 Thread yhuai
Repository: spark Updated Branches: refs/heads/master e3b5d6cb2 -> 20fd35dfd [SPARK-10451] [SQL] Prevent unnecessary serializations in InMemoryColumnarTableScan Many of the fields in InMemoryColumnar scan and InMemoryRelation can be made transient. This reduces my 1000ms job to about 700 ms
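The speedup comes from shrinking what each task must ship: marking large, recomputable fields transient keeps them out of the serialized operator. A rough size illustration in Python (a simplified stand-in, not `InMemoryColumnarTableScan` itself; field names are hypothetical):

```python
import pickle

class Scan:
    """Stand-in operator: small metadata plus a large cached buffer."""
    def __init__(self):
        self.partition_filters = ["p > 0"]      # needed on executors
        self.cached_buffers = bytes(1_000_000)  # large and recomputable

    def __getstate__(self):
        state = self.__dict__.copy()
        del state["cached_buffers"]             # "transient": never shipped
        return state

# Serialized size with the large field included vs. excluded.
full = len(pickle.dumps({"partition_filters": ["p > 0"],
                         "cached_buffers": bytes(1_000_000)}))
slim = len(pickle.dumps(Scan()))
```

`slim` is a few dozen bytes while `full` is over a megabyte, which is the kind of per-task serialization cost the patch avoids.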