spark git commit: [SPARK-15759] [SQL] Fallback to non-codegen when fail to compile generated code

2016-06-10 Thread davies
Repository: spark Updated Branches: refs/heads/branch-2.0 91dffcabd -> f0fa0a894 [SPARK-15759] [SQL] Fallback to non-codegen when fail to compile generated code ## What changes were proposed in this pull request? In case of any bugs in whole-stage codegen, the generated code can't be

spark git commit: [SPARK-15759] [SQL] Fallback to non-codegen when fail to compile generated code

2016-06-10 Thread davies
Repository: spark Updated Branches: refs/heads/master 468da03e2 -> 7504bc73f [SPARK-15759] [SQL] Fallback to non-codegen when fail to compile generated code ## What changes were proposed in this pull request? In case of any bugs in whole-stage codegen, the generated code can't be compiled,

spark git commit: Revert "[SPARK-15639][SQL] Try to push down filter at RowGroups level for parquet reader"

2016-06-10 Thread lian
Repository: spark Updated Branches: refs/heads/branch-2.0 a08715c7a -> 91dffcabd Revert "[SPARK-15639][SQL] Try to push down filter at RowGroups level for parquet reader" This reverts commit 7d6bd1196410563bd1fccc10e7bff6e75b5c9f22. Project:

spark git commit: [SPARK-15678] Add support to REFRESH data source paths

2016-06-10 Thread davies
Repository: spark Updated Branches: refs/heads/master 8e7b56f3d -> 468da03e2 [SPARK-15678] Add support to REFRESH data source paths ## What changes were proposed in this pull request? Spark currently incorrectly continues to use cached data even if the underlying data is overwritten.

spark git commit: [SPARK-15678] Add support to REFRESH data source paths

2016-06-10 Thread davies
Repository: spark Updated Branches: refs/heads/branch-2.0 798825c09 -> a08715c7a [SPARK-15678] Add support to REFRESH data source paths ## What changes were proposed in this pull request? Spark currently incorrectly continues to use cached data even if the underlying data is overwritten.

spark git commit: Revert "[SPARK-15639][SQL] Try to push down filter at RowGroups level for parquet reader"

2016-06-10 Thread lian
Repository: spark Updated Branches: refs/heads/master 99f3c8277 -> 8e7b56f3d Revert "[SPARK-15639][SQL] Try to push down filter at RowGroups level for parquet reader" This reverts commit bba5d7999f7b3ae9d816ea552ba9378fea1615a6. Project: http://git-wip-us.apache.org/repos/asf/spark/repo

spark git commit: [SPARK-15884][SPARKR][SQL] Overriding stringArgs in MapPartitionsInR

2016-06-10 Thread lian
Repository: spark Updated Branches: refs/heads/branch-2.0 f41f433b1 -> 0a450cfff [SPARK-15884][SPARKR][SQL] Overriding stringArgs in MapPartitionsInR ## What changes were proposed in this pull request? As discussed in https://github.com/apache/spark/pull/12836 we need to override stringArgs

spark git commit: [SPARK-15884][SPARKR][SQL] Overriding stringArgs in MapPartitionsInR

2016-06-10 Thread lian
Repository: spark Updated Branches: refs/heads/master 2022afe57 -> 54f758b5f [SPARK-15884][SPARKR][SQL] Overriding stringArgs in MapPartitionsInR ## What changes were proposed in this pull request? As discussed in https://github.com/apache/spark/pull/12836 we need to override stringArgs

spark git commit: [SPARK-15773][CORE][EXAMPLE] Avoid creating local variable `sc` in examples if possible

2016-06-10 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 e6ebb547b -> f41f433b1 [SPARK-15773][CORE][EXAMPLE] Avoid creating local variable `sc` in examples if possible ## What changes were proposed in this pull request? Instead of using local variable `sc` like the following example, this

spark git commit: [SPARK-15773][CORE][EXAMPLE] Avoid creating local variable `sc` in examples if possible

2016-06-10 Thread rxin
Repository: spark Updated Branches: refs/heads/master 127a6678d -> 2022afe57 [SPARK-15773][CORE][EXAMPLE] Avoid creating local variable `sc` in examples if possible ## What changes were proposed in this pull request? Instead of using local variable `sc` like the following example, this PR

spark git commit: [SPARK-15489][SQL] Dataset kryo encoder won't load custom user settings

2016-06-10 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master aec502d91 -> 127a6678d [SPARK-15489][SQL] Dataset kryo encoder won't load custom user settings ## What changes were proposed in this pull request? Serializer instantiation will consider existing SparkConf ## How was this patch tested?

spark git commit: [SPARK-15489][SQL] Dataset kryo encoder won't load custom user settings

2016-06-10 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-2.0 bc53422ad -> e6ebb547b [SPARK-15489][SQL] Dataset kryo encoder won't load custom user settings ## What changes were proposed in this pull request? Serializer instantiation will consider existing SparkConf ## How was this patch

spark git commit: [SPARK-15654] [SQL] fix non-splitable files for text based file formats

2016-06-10 Thread davies
Repository: spark Updated Branches: refs/heads/branch-2.0 f2e5d6d0f -> bc53422ad [SPARK-15654] [SQL] fix non-splitable files for text based file formats ## What changes were proposed in this pull request? Currently, we always split the files when it's bigger than maxSplitBytes, but Hadoop

spark git commit: [SPARK-15654] [SQL] fix non-splitable files for text based file formats

2016-06-10 Thread davies
Repository: spark Updated Branches: refs/heads/master e05a2feeb -> aec502d91 [SPARK-15654] [SQL] fix non-splitable files for text based file formats ## What changes were proposed in this pull request? Currently, we always split the files when it's bigger than maxSplitBytes, but Hadoop

spark git commit: [SPARK-15825] [SQL] Fix SMJ invalid results

2016-06-10 Thread davies
Repository: spark Updated Branches: refs/heads/master 026eb9064 -> e05a2feeb [SPARK-15825] [SQL] Fix SMJ invalid results ## What changes were proposed in this pull request? Code generated `SortMergeJoin` failed with wrong results when using structs as keys. This could (eventually) be traced

spark git commit: [SPARK-15825] [SQL] Fix SMJ invalid results

2016-06-10 Thread davies
Repository: spark Updated Branches: refs/heads/branch-2.0 80b8711b3 -> f2e5d6d0f [SPARK-15825] [SQL] Fix SMJ invalid results ## What changes were proposed in this pull request? Code generated `SortMergeJoin` failed with wrong results when using structs as keys. This could (eventually) be

spark git commit: [SPARK-15738][PYSPARK][ML] Adding Pyspark ml RFormula __str__ method similar to Scala API

2016-06-10 Thread yliang
Repository: spark Updated Branches: refs/heads/branch-2.0 8b6742a37 -> 80b8711b3 [SPARK-15738][PYSPARK][ML] Adding Pyspark ml RFormula __str__ method similar to Scala API ## What changes were proposed in this pull request? Adding __str__ to RFormula and model that will show the set formula

spark git commit: [SPARK-15875] Try to use Seq.isEmpty and Seq.nonEmpty instead of Seq.length == 0 and Seq.length > 0

2016-06-10 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 96bb1476c -> 8b6742a37 [SPARK-15875] Try to use Seq.isEmpty and Seq.nonEmpty instead of Seq.length == 0 and Seq.length > 0 ## What changes were proposed in this pull request? In scala, immutable.List.length is an expensive operation

spark git commit: [SPARK-15875] Try to use Seq.isEmpty and Seq.nonEmpty instead of Seq.length == 0 and Seq.length > 0

2016-06-10 Thread rxin
Repository: spark Updated Branches: refs/heads/master 865ec32dd -> 026eb9064 [SPARK-15875] Try to use Seq.isEmpty and Seq.nonEmpty instead of Seq.length == 0 and Seq.length > 0 ## What changes were proposed in this pull request? In scala, immutable.List.length is an expensive operation so

spark git commit: [SPARK-6320][SQL] Move planLater method into GenericStrategy.

2016-06-10 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master fb219029d -> 667d4ea7b [SPARK-6320][SQL] Move planLater method into GenericStrategy. ## What changes were proposed in this pull request? This PR moves `QueryPlanner.planLater()` method into `GenericStrategy` for extra strategies to be

spark git commit: [MINOR][X][X] Replace all occurrences of None: Option with Option.empty

2016-06-10 Thread rxin
Repository: spark Updated Branches: refs/heads/master 667d4ea7b -> 865ec32dd [MINOR][X][X] Replace all occurrences of None: Option with Option.empty ## What changes were proposed in this pull request? Replace all occurrences of `None: Option[X]` with `Option.empty[X]` ## How was this patch

spark git commit: [MINOR][X][X] Replace all occurrences of None: Option with Option.empty

2016-06-10 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 f15d641e2 -> 96bb1476c [MINOR][X][X] Replace all occurrences of None: Option with Option.empty ## What changes were proposed in this pull request? Replace all occurrences of `None: Option[X]` with `Option.empty[X]` ## How was this

spark git commit: [SPARK-15871][SQL] Add `assertNotPartitioned` check in `DataFrameWriter`

2016-06-10 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 c1390ccbb -> f15d641e2 [SPARK-15871][SQL] Add `assertNotPartitioned` check in `DataFrameWriter` ## What changes were proposed in this pull request? It doesn't make sense to specify partitioning parameters, when we write data out from

spark git commit: [SPARK-15871][SQL] Add `assertNotPartitioned` check in `DataFrameWriter`

2016-06-10 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 5c16ad0d5 -> fb219029d [SPARK-15871][SQL] Add `assertNotPartitioned` check in `DataFrameWriter` ## What changes were proposed in this pull request? It doesn't make sense to specify partitioning parameters, when we write data out from

spark git commit: Revert [SPARK-14485][CORE] ignore task finished for executor lost

2016-06-10 Thread kayousterhout
Repository: spark Updated Branches: refs/heads/branch-2.0 f895d6d85 -> c1390ccbb Revert [SPARK-14485][CORE] ignore task finished for executor lost This reverts commit 695dbc816a6d70289abeb145cb62ff4e62b3f49b. This change is being reverted because it hurts performance of some jobs, and only

spark git commit: Revert [SPARK-14485][CORE] ignore task finished for executor lost

2016-06-10 Thread kayousterhout
Repository: spark Updated Branches: refs/heads/master 2c8f40cea -> 5c16ad0d5 Revert [SPARK-14485][CORE] ignore task finished for executor lost This reverts commit 695dbc816a6d70289abeb145cb62ff4e62b3f49b. This change is being reverted because it hurts performance of some jobs, and only helps

spark git commit: [SPARK-15766][SPARKR] R should export is.nan

2016-06-10 Thread shivaram
Repository: spark Updated Branches: refs/heads/master 2413fce9d -> 2c8f40cea [SPARK-15766][SPARKR] R should export is.nan ## What changes were proposed in this pull request? When reviewing SPARK-15545, we found that is.nan is not exported, which should be exported. Add it to the NAMESPACE.

svn commit: r1747764 - in /spark: css/ site/ site/css/ site/images/

2016-06-10 Thread matei
Author: matei Date: Fri Jun 10 18:15:52 2016 New Revision: 1747764 URL: http://svn.apache.org/viewvc?rev=1747764&view=rev Log: CSS tweaks Modified: spark/css/custom.css spark/site/css/custom.css spark/site/documentation.html spark/site/images/spark-logo-reverse.eps

svn commit: r1747763 - in /spark/images: spark-logo-reverse.eps spark-logo-trademark.png spark-logo.eps spark-logo.png spark-runs-everywhere.png

2016-06-10 Thread matei
Author: matei Date: Fri Jun 10 18:15:35 2016 New Revision: 1747763 URL: http://svn.apache.org/viewvc?rev=1747763&view=rev Log: Version of logo with Apache Modified: spark/images/spark-logo-reverse.eps spark/images/spark-logo-trademark.png spark/images/spark-logo.eps

spark git commit: [SPARK-15753][SQL] Move Analyzer stuff to Analyzer from DataFrameWriter

2016-06-10 Thread lian
Repository: spark Updated Branches: refs/heads/branch-2.0 47c2a265f -> 55a837246 [SPARK-15753][SQL] Move Analyzer stuff to Analyzer from DataFrameWriter ## What changes were proposed in this pull request? This patch moves some codes in `DataFrameWriter.insertInto` that belongs to

spark git commit: [SPARK-15866] Rename listAccumulator collectionAccumulator

2016-06-10 Thread rxin
Repository: spark Updated Branches: refs/heads/master 0ec279ffd -> 254bc8c34 [SPARK-15866] Rename listAccumulator collectionAccumulator ## What changes were proposed in this pull request? SparkContext.listAccumulator, by Spark's convention, makes it sound like "list" is a verb and the method

spark git commit: [SPARK-15866] Rename listAccumulator collectionAccumulator

2016-06-10 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 55a837246 -> 935b6e0e4 [SPARK-15866] Rename listAccumulator collectionAccumulator ## What changes were proposed in this pull request? SparkContext.listAccumulator, by Spark's convention, makes it sound like "list" is a verb and the

spark git commit: [SPARK-15753][SQL] Move Analyzer stuff to Analyzer from DataFrameWriter

2016-06-10 Thread lian
Repository: spark Updated Branches: refs/heads/master abdb5d42c -> 0ec279ffd [SPARK-15753][SQL] Move Analyzer stuff to Analyzer from DataFrameWriter ## What changes were proposed in this pull request? This patch moves some codes in `DataFrameWriter.insertInto` that belongs to `Analyzer`.

spark git commit: [SPARK-15812][SQ][STREAMING] Added support for sorting after streaming aggregation with complete mode

2016-06-10 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-2.0 54b4763d2 -> 47c2a265f [SPARK-15812][SQ][STREAMING] Added support for sorting after streaming aggregation with complete mode ## What changes were proposed in this pull request? When the output mode is complete, then the output of a

spark git commit: [SPARK-15812][SQ][STREAMING] Added support for sorting after streaming aggregation with complete mode

2016-06-10 Thread tdas
Repository: spark Updated Branches: refs/heads/master cdd7f5a57 -> abdb5d42c [SPARK-15812][SQ][STREAMING] Added support for sorting after streaming aggregation with complete mode ## What changes were proposed in this pull request? When the output mode is complete, then the output of a

spark git commit: [SPARK-15837][ML][PYSPARK] Word2vec python add maxsentence parameter

2016-06-10 Thread srowen
Repository: spark Updated Branches: refs/heads/branch-2.0 6709ce1ae -> 54b4763d2 [SPARK-15837][ML][PYSPARK] Word2vec python add maxsentence parameter ## What changes were proposed in this pull request? Word2vec python add maxsentence parameter. ## How was this patch tested? Existing test.

spark git commit: [SPARK-15837][ML][PYSPARK] Word2vec python add maxsentence parameter

2016-06-10 Thread srowen
Repository: spark Updated Branches: refs/heads/master 16ca32eac -> cdd7f5a57 [SPARK-15837][ML][PYSPARK] Word2vec python add maxsentence parameter ## What changes were proposed in this pull request? Word2vec python add maxsentence parameter. ## How was this patch tested? Existing test.

spark git commit: [SPARK-15823][PYSPARK][ML] Add @property for 'accuracy' in MulticlassMetrics

2016-06-10 Thread srowen
Repository: spark Updated Branches: refs/heads/branch-2.0 84a8421e5 -> 6709ce1ae [SPARK-15823][PYSPARK][ML] Add @property for 'accuracy' in MulticlassMetrics ## What changes were proposed in this pull request? `accuracy` should be decorated with `property` to keep step with other methods in

spark git commit: [SPARK-15823][PYSPARK][ML] Add @property for 'accuracy' in MulticlassMetrics

2016-06-10 Thread srowen
Repository: spark Updated Branches: refs/heads/master 675a73715 -> 16ca32eac [SPARK-15823][PYSPARK][ML] Add @property for 'accuracy' in MulticlassMetrics ## What changes were proposed in this pull request? `accuracy` should be decorated with `property` to keep step with other methods in

spark git commit: [DOCUMENTATION] fixed groupby aggregation example for pyspark

2016-06-10 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 02ed7b536 -> 84a8421e5 [DOCUMENTATION] fixed groupby aggregation example for pyspark ## What changes were proposed in this pull request? fixing documentation for the groupby/agg example in python ## How was this patch tested? the

spark git commit: [DOCUMENTATION] fixed groupby aggregation example for pyspark

2016-06-10 Thread rxin
Repository: spark Updated Branches: refs/heads/master 00c310133 -> 675a73715 [DOCUMENTATION] fixed groupby aggregation example for pyspark ## What changes were proposed in this pull request? fixing documentation for the groupby/agg example in python ## How was this patch tested? the

spark git commit: [DOCUMENTATION] fixed groupby aggregation example for pyspark

2016-06-10 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.6 739d992f0 -> 393f4ba15 [DOCUMENTATION] fixed groupby aggregation example for pyspark ## What changes were proposed in this pull request? fixing documentation for the groupby/agg example in python ## How was this patch tested? the

spark git commit: [SPARK-15593][SQL] Add DataFrameWriter.foreach to allow the user consuming data in ContinuousQuery

2016-06-10 Thread tdas
Repository: spark Updated Branches: refs/heads/master 5a3533e77 -> 00c310133 [SPARK-15593][SQL] Add DataFrameWriter.foreach to allow the user consuming data in ContinuousQuery ## What changes were proposed in this pull request? * Add DataFrameWriter.foreach to allow the user consuming data