spark git commit: [SPARK-2930] clarify docs on using webhdfs with spark.yarn.access.namenodes

2016-01-15 Thread srowen
Repository: spark Updated Branches: refs/heads/master d0a5c32bd -> 96fb894d4 [SPARK-2930] clarify docs on using webhdfs with spark.yarn.access.namenodes Author: Tom Graves Closes #10699 from tgravescs/SPARK-2930. Project:

spark git commit: [SPARK-12655][GRAPHX] GraphX does not unpersist RDDs

2016-01-15 Thread srowen
Repository: spark Updated Branches: refs/heads/master fe7246fea -> d0a5c32bd [SPARK-12655][GRAPHX] GraphX does not unpersist RDDs Some VertexRDD and EdgeRDD are created during the intermediate step of g.connectedComponents() but unnecessarily left cached after the method is done. The fix is

[1/2] spark git commit: [MINOR] [SQL] GeneratedExpressionCode -> ExprCode

2016-01-15 Thread davies
Repository: spark Updated Branches: refs/heads/master ba4a64190 -> c5e7076da http://git-wip-us.apache.org/repos/asf/spark/blob/c5e7076d/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects.scala

[2/2] spark git commit: [MINOR] [SQL] GeneratedExpressionCode -> ExprCode

2016-01-15 Thread davies
[MINOR] [SQL] GeneratedExpressionCode -> ExprCode GeneratedExpressionCode is too long Author: Davies Liu Closes #10767 from davies/renaming. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/c5e7076d

spark git commit: [SPARK-11031][SPARKR] Method str() on a DataFrame

2016-01-15 Thread shivaram
Repository: spark Updated Branches: refs/heads/master 96fb894d4 -> ba4a64190 [SPARK-11031][SPARKR] Method str() on a DataFrame Author: Oscar D. Lara Yejas

spark git commit: [SPARK-11031][SPARKR] Method str() on a DataFrame

2016-01-15 Thread shivaram
Repository: spark Updated Branches: refs/heads/branch-1.6 d23e57d02 -> 5a0052839 [SPARK-11031][SPARKR] Method str() on a DataFrame Author: Oscar D. Lara Yejas

spark git commit: [SPARK-12833][SQL] Initial import of spark-csv

2016-01-15 Thread rxin
Repository: spark Updated Branches: refs/heads/master c5e7076da -> 5f83c6991 [SPARK-12833][SQL] Initial import of spark-csv CSV is the most common data format in the "small data" world. It is often the first format people want to try when they see Spark on a single node. Having to rely on a

[2/2] spark git commit: [SPARK-12667] Remove block manager's internal "external block store" API

2016-01-15 Thread joshrosen
[SPARK-12667] Remove block manager's internal "external block store" API This pull request removes the external block store API. This is rarely used, and the file system interface is actually a better, more standard way to interact with external storage systems. There are some other things to

[1/2] spark git commit: [SPARK-12667] Remove block manager's internal "external block store" API

2016-01-15 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 5f83c6991 -> ad1503f92 http://git-wip-us.apache.org/repos/asf/spark/blob/ad1503f9/core/src/test/scala/org/apache/spark/ui/storage/StoragePageSuite.scala

spark git commit: [SPARK-12833][HOT-FIX] Fix scala 2.11 compilation.

2016-01-15 Thread yhuai
Repository: spark Updated Branches: refs/heads/master ad1503f92 -> 513266c04 [SPARK-12833][HOT-FIX] Fix scala 2.11 compilation. Seems https://github.com/apache/spark/commit/5f83c6991c95616ecbc2878f8860c69b2826f56c breaks scala 2.11 compilation. Author: Yin Huai

spark git commit: [SPARK-12701][CORE] FileAppender should use join to ensure writing thread completion

2016-01-15 Thread srowen
Repository: spark Updated Branches: refs/heads/branch-1.6 5a0052839 -> 773366818 [SPARK-12701][CORE] FileAppender should use join to ensure writing thread completion Changed Logging FileAppender to use join in `awaitTermination` to ensure that the thread has properly finished before returning.
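
The idea behind this change can be illustrated with a minimal sketch. It is not Spark's FileAppender code: the class `FileAppenderSketch`, its `sink` argument, and the simulated delay are all hypothetical names chosen for illustration. It only shows why joining the writer thread guarantees that all output has been written before the caller continues.

```python
import threading
import time


class FileAppenderSketch:
    """Hypothetical stand-in for an appender that writes on a background thread."""

    def __init__(self, lines, sink):
        self._lines = lines
        self._sink = sink  # any list-like object that collects written lines
        self._thread = threading.Thread(target=self._append, daemon=True)

    def start(self):
        self._thread.start()

    def _append(self):
        for line in self._lines:
            time.sleep(0.01)  # simulate slow I/O
            self._sink.append(line)

    def await_termination(self):
        # join() blocks until the writer thread has exited, so every line
        # has been written by the time this method returns.
        self._thread.join()


sink = []
appender = FileAppenderSketch(["a", "b", "c"], sink)
appender.start()
appender.await_termination()
print(sink)
```

Without the `join()` call, `await_termination` could return while the writer thread is still running, and a caller that reads the output immediately afterwards might see only part of it.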

spark git commit: Fix typo

2016-01-15 Thread rxin
Repository: spark Updated Branches: refs/heads/master 513266c04 -> 0bb73554a Fix typo disvoered => discovered Author: Julien Baley Closes #10773 from julienbaley/patch-1. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: [SPARK-12716][WEB UI] Add a TOTALS row to the Executors Web UI

2016-01-15 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 0bb73554a -> 61c45876f [SPARK-12716][WEB UI] Add a TOTALS row to the Executors Web UI Added a Totals table to the top of the page to display the totals of each applicable column in the executors table. Old Description: ~~Created a TOTALS

spark git commit: [SQL][MINOR] BoundReference do not need to be NamedExpression

2016-01-15 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 61c45876f -> 3f1c58d60 [SQL][MINOR] BoundReference do not need to be NamedExpression We made it a `NamedExpression` to work around some hacky cases a long time ago, and it now seems safe to remove it. Author: Wenchen Fan

[2/2] spark git commit: [SPARK-12575][SQL] Grammar parity with existing SQL parser

2016-01-15 Thread rxin
[SPARK-12575][SQL] Grammar parity with existing SQL parser In this PR the new CatalystQl parser stack reaches grammar parity with the old Parser-Combinator based SQL Parser. This PR also replaces all uses of the old Parser, and removes it from the code base. Although the existing Hive and SQL

[1/2] spark git commit: [SPARK-12575][SQL] Grammar parity with existing SQL parser

2016-01-15 Thread rxin
Repository: spark Updated Branches: refs/heads/master 3f1c58d60 -> 7cd7f2202 http://git-wip-us.apache.org/repos/asf/spark/blob/7cd7f220/sql/hive/src/main/scala/org/apache/spark/sql/hive/ExtendedHiveQlParser.scala

spark git commit: [SPARK-12840] [SQL] Support passing arbitrary objects (not just expressions) into code generated classes

2016-01-15 Thread davies
Repository: spark Updated Branches: refs/heads/master 9039333c0 -> 242efb754 [SPARK-12840] [SQL] Support passing arbitrary objects (not just expressions) into code generated classes This is a refactor to support codegen for aggregation and broadcast join. Author: Davies Liu

spark git commit: [SPARK-12833][HOT-FIX] Reset the locale after we set it.

2016-01-15 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 5f843781e -> f6ddbb360 [SPARK-12833][HOT-FIX] Reset the locale after we set it. Author: Yin Huai Closes #10778 from yhuai/resetLocale. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: [SPARK-12649][SQL] support reading bucketed table

2016-01-15 Thread rxin
Repository: spark Updated Branches: refs/heads/master 8dbbf3e75 -> 3b5ccb12b [SPARK-12649][SQL] support reading bucketed table This PR adds the support to read bucketed tables, and correctly populate `outputPartitioning`, so that we can avoid shuffle for some cases. TODO(follow-up PRs): *
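
As a rough illustration of why known bucketing can avoid a shuffle, the sketch below joins two datasets that were bucketed by the same key and the same bucket count. It is plain Python, not Spark code: `bucket_id`, `bucketize`, `bucketed_join`, the bucket count, and the use of CRC32 as the hash are all assumptions made for the example.

```python
from collections import defaultdict
import zlib

NUM_BUCKETS = 4  # assumed bucket count for illustration


def bucket_id(key, num_buckets=NUM_BUCKETS):
    # A deterministic hash, so equal keys always map to the same bucket.
    return zlib.crc32(str(key).encode("utf-8")) % num_buckets


def bucketize(rows, num_buckets=NUM_BUCKETS):
    """Group (key, value) rows by bucket id, as a bucketed table would store them."""
    buckets = defaultdict(list)
    for key, value in rows:
        buckets[bucket_id(key, num_buckets)].append((key, value))
    return buckets


def bucketed_join(left, right):
    """Join two bucketed datasets bucket by bucket, with no global regrouping."""
    joined = []
    for b, left_rows in left.items():
        # Matching keys can only live in the bucket with the same id.
        right_by_key = defaultdict(list)
        for key, value in right.get(b, []):
            right_by_key[key].append(value)
        for key, lvalue in left_rows:
            for rvalue in right_by_key.get(key, []):
                joined.append((key, lvalue, rvalue))
    return sorted(joined)


users = bucketize([(1, "ann"), (2, "bob"), (3, "cy")])
orders = bucketize([(1, "book"), (3, "pen"), (3, "ink"), (4, "cup")])
result = bucketed_join(users, orders)
print(result)
```

Because both sides already agree on where each key lives, each bucket pair can be joined independently. Data that is not bucketed this way would first have to be redistributed by key.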

spark git commit: [SPARK-11925][ML][PYSPARK] Add PySpark missing methods for ml.feature during Spark 1.6 QA

2016-01-15 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master 7cd7f2202 -> 5f843781e [SPARK-11925][ML][PYSPARK] Add PySpark missing methods for ml.feature during Spark 1.6 QA Add PySpark missing methods and params for ml.feature: * `RegexTokenizer` should support setting `toLowercase`. *

spark git commit: [SPARK-12842][TEST-HADOOP2.7] Add Hadoop 2.7 build profile

2016-01-15 Thread rxin
Repository: spark Updated Branches: refs/heads/master f6ddbb360 -> 8dbbf3e75 [SPARK-12842][TEST-HADOOP2.7] Add Hadoop 2.7 build profile This patch adds a Hadoop 2.7 build profile in order to let us automate tests against that version. /cc rxin srowen Author: Josh Rosen

spark git commit: [SPARK-12644][SQL] Update parquet reader to be vectorized.

2016-01-15 Thread rxin
Repository: spark Updated Branches: refs/heads/master 3b5ccb12b -> 9039333c0 [SPARK-12644][SQL] Update parquet reader to be vectorized. This inlines a few of the Parquet decoders and adds vectorized APIs to support decoding in batch. There are a few particulars in the Parquet encodings that
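
To make the difference between value-at-a-time and batch decoding concrete, here is a toy sketch. It is not the Parquet reader from this commit and does not follow Parquet's actual encodings: the `RunLengthDecoder` class, its `(run_length, value)` input format, and both method names are hypothetical.

```python
class RunLengthDecoder:
    """Toy decoder over (run_length, value) pairs; not Parquet's real encoding."""

    def __init__(self, runs):
        self._runs = list(runs)
        self._run_index = 0
        self._left_in_run = self._runs[0][0] if self._runs else 0

    def _advance(self):
        # Move past exhausted runs until one with remaining values is found.
        while self._left_in_run == 0 and self._run_index + 1 < len(self._runs):
            self._run_index += 1
            self._left_in_run = self._runs[self._run_index][0]

    def read_value(self):
        """Value-at-a-time API: one call, and one bounds check, per value."""
        self._advance()
        if self._left_in_run == 0:
            raise EOFError("no more values")
        self._left_in_run -= 1
        return self._runs[self._run_index][1]

    def read_batch(self, out, count):
        """Batch API: fill out[0:count], copying a whole run per step."""
        written = 0
        while written < count:
            self._advance()
            if self._left_in_run == 0:
                break  # input exhausted before the batch was full
            n = min(self._left_in_run, count - written)
            out[written:written + n] = [self._runs[self._run_index][1]] * n
            self._left_in_run -= n
            written += n
        return written


decoder = RunLengthDecoder([(3, 7), (2, 9)])
batch = [0] * 4
written = decoder.read_batch(batch, 4)
print(written, batch)
```

The batch method does its bookkeeping once per run rather than once per value, which is the general motivation for a batch-oriented decoding API.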