[GitHub] spark issue #19429: [SPARK-20055] [Docs] Added documentation for loading csv...

2017-10-09 Thread jomach
Github user jomach commented on the issue: https://github.com/apache/spark/pull/19429 @gatorsmile I addressed your comments. I still cannot use the jekyll build... `SKIP_API=1 jekyll build --incremental Configuration file: /Users/jorge/Downloads/spark/docs/_config.yml

[GitHub] spark pull request #19459: [SPARK-20791][PYSPARK] Use Arrow to create Spark ...

2017-10-09 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19459#discussion_r143633890 --- Diff: python/pyspark/sql/session.py --- @@ -510,9 +511,43 @@ def createDataFrame(self, data, schema=None, samplingRatio=None, verifySchema=Tr

[GitHub] spark pull request #19462: [SPARK-22159][SQL][FOLLOW-UP] Make config names c...

2017-10-09 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19462 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #19462: [SPARK-22159][SQL][FOLLOW-UP] Make config names consiste...

2017-10-09 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19462 Thanks! Merged to master.

[GitHub] spark issue #19462: [SPARK-22159][SQL][FOLLOW-UP] Make config names consiste...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19462 Merged build finished. Test PASSed.

[GitHub] spark issue #19462: [SPARK-22159][SQL][FOLLOW-UP] Make config names consiste...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19462 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82575/ Test PASSed.

[GitHub] spark issue #19462: [SPARK-22159][SQL][FOLLOW-UP] Make config names consiste...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19462 **[Test build #82575 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82575/testReport)** for PR 19462 at commit

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-09 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r143630635 --- Diff: python/pyspark/sql/tests.py --- @@ -3376,6 +3376,151 @@ def test_vectorized_udf_empty_partition(self): res =

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-09 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r143630813 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/object.scala --- @@ -519,3 +519,4 @@ case class CoGroup(

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-09 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r143630505 --- Diff: python/pyspark/sql/tests.py --- @@ -3376,6 +3376,151 @@ def test_vectorized_udf_empty_partition(self): res =

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-09 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r143629848 --- Diff: python/pyspark/sql/functions.py --- @@ -2181,30 +2187,66 @@ def udf(f=None, returnType=StringType()): @since(2.3) def

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-09 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r143630469 --- Diff: python/pyspark/sql/tests.py --- @@ -3376,6 +3376,151 @@ def test_vectorized_udf_empty_partition(self): res =

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-09 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r143630939 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/pythonLogicalOperators.scala --- @@ -0,0 +1,43 @@ +/* + *

[GitHub] spark issue #19463: Cleanup comment in RDDSuite test

2017-10-09 Thread sohum2002
Github user sohum2002 commented on the issue: https://github.com/apache/spark/pull/19463 I just added "Removed one comment from RDDSuite." to the PR description. Will this suffice?

[GitHub] spark issue #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should not coll...

2017-10-09 Thread akopich
Github user akopich commented on the issue: https://github.com/apache/spark/pull/18924 @WeichenXu123, could you please notify @jkbradley once again?

[GitHub] spark pull request #18966: [SPARK-21751][SQL] CodeGeneraor.splitExpressions ...

2017-10-09 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18966#discussion_r143629417 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -769,16 +769,21 @@ class

[GitHub] spark pull request #18966: [SPARK-21751][SQL] CodeGeneraor.splitExpressions ...

2017-10-09 Thread kiszk
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/18966#discussion_r143628760 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -769,16 +769,21 @@ class CodegenContext {

[GitHub] spark pull request #19460: [SPARK-22222][core] Fix the ARRAY_MAX in BufferHo...

2017-10-09 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19460

[GitHub] spark issue #19460: [SPARK-22222][core] Fix the ARRAY_MAX in BufferHolder an...

2017-10-09 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19460 Thanks! Merged to master.

[GitHub] spark issue #19363: [SPARK-22224][Minor]Override toString of KeyValue/Relati...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19363 Merged build finished. Test PASSed.

[GitHub] spark issue #19363: [SPARK-22224][Minor]Override toString of KeyValue/Relati...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19363 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82574/ Test PASSed.

[GitHub] spark issue #19363: [SPARK-22224][Minor]Override toString of KeyValue/Relati...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19363 **[Test build #82574 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82574/testReport)** for PR 19363 at commit

[GitHub] spark issue #19460: [SPARK-22222][core] Fix the ARRAY_MAX in BufferHolder an...

2017-10-09 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/19460 LGTM

[GitHub] spark issue #19463: Cleanup comment in RDDSuite test

2017-10-09 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/19463 Could you please update the description to explain why you want to apply this change?

[GitHub] spark issue #19460: [SPARK-22222][core] Fix the ARRAY_MAX in BufferHolder an...

2017-10-09 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19460 LGTM

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-09 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18664 @BryanCutler, BTW, do you think it is possible to de-duplicate timezone handling on the Python side if we go for 1.?

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-09 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18664 I think I prefer 1. Do you maybe have a preference @ueshin? I believe you are more insightful in this.

[GitHub] spark issue #19399: [SPARK-22175][WEB-UI] Add status column to history page

2017-10-09 Thread ajbozarth
Github user ajbozarth commented on the issue: https://github.com/apache/spark/pull/19399 With @jerryshao's comments, I'm going to get off the fence firmly against this; we already have too many things slowing down the SHS as it is

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-09 Thread maropu
Github user maropu commented on the issue: https://github.com/apache/spark/pull/19082 Aha, fair enough. Based on that insight, one solution would be to make whole-stage codegen consider the number of calls to generated functions, though that approach does not seem simple. So, splitting

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18732 **[Test build #82577 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82577/testReport)** for PR 18732 at commit

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-09 Thread icexelloss
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/18732 Merged some last minute changes from @BryanCutler to make the wrapping a bit cleaner. Thanks @BryanCutler!

[GitHub] spark pull request #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compres...

2017-10-09 Thread fjh100456
Github user fjh100456 commented on a diff in the pull request: https://github.com/apache/spark/pull/19218#discussion_r143624224 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/SaveAsHiveFile.scala --- @@ -68,6 +68,26 @@ private[hive] trait SaveAsHiveFile

[GitHub] spark pull request #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compres...

2017-10-09 Thread fjh100456
Github user fjh100456 commented on a diff in the pull request: https://github.com/apache/spark/pull/19218#discussion_r143624210 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/SaveAsHiveFile.scala --- @@ -68,6 +68,26 @@ private[hive] trait SaveAsHiveFile

[GitHub] spark pull request #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compres...

2017-10-09 Thread fjh100456
Github user fjh100456 commented on a diff in the pull request: https://github.com/apache/spark/pull/19218#discussion_r143624196 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/SaveAsHiveFile.scala --- @@ -68,6 +68,26 @@ private[hive] trait SaveAsHiveFile

[GitHub] spark pull request #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compres...

2017-10-09 Thread fjh100456
Github user fjh100456 commented on a diff in the pull request: https://github.com/apache/spark/pull/19218#discussion_r143624181 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/SaveAsHiveFile.scala --- @@ -68,6 +68,26 @@ private[hive] trait SaveAsHiveFile

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18732 **[Test build #82576 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82576/testReport)** for PR 18732 at commit

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-09 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r143622623 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala --- @@ -435,6 +435,35 @@ class RelationalGroupedDataset

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-09 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r143622617 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/FlatMapGroupsInPandasExec.scala --- @@ -0,0 +1,103 @@ +/* + * Licensed

[GitHub] spark issue #19463: Cleanup comment in RDDSuite test

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19463 Can one of the admins verify this patch?

[GitHub] spark pull request #19463: Cleanup comment in RDDSuite test

2017-10-09 Thread sohum2002
GitHub user sohum2002 opened a pull request: https://github.com/apache/spark/pull/19463 Cleanup comment in RDDSuite test ## What changes were proposed in this pull request? There were no changes proposed in this pull request. ## How was this patch tested?

[GitHub] spark issue #19459: [SPARK-20791][PYSPARK] Use Arrow to create Spark DataFra...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19459 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82573/ Test PASSed.

[GitHub] spark issue #19459: [SPARK-20791][PYSPARK] Use Arrow to create Spark DataFra...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19459 Merged build finished. Test PASSed.

[GitHub] spark issue #19459: [SPARK-20791][PYSPARK] Use Arrow to create Spark DataFra...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19459 **[Test build #82573 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82573/testReport)** for PR 19459 at commit

[GitHub] spark pull request #19454: [SPARK-22152][SPARK-18855][SQL] Added flatten fun...

2017-10-09 Thread sohum2002
Github user sohum2002 closed the pull request at: https://github.com/apache/spark/pull/19454

[GitHub] spark issue #19454: [SPARK-22152][SPARK-18855][SQL] Added flatten functions ...

2017-10-09 Thread sohum2002
Github user sohum2002 commented on the issue: https://github.com/apache/spark/pull/19454 Thank you all for your comments. I hope to improve in my future PRs. Cheers!

[GitHub] spark issue #19454: [SPARK-22152][SPARK-18855][SQL] Added flatten functions ...

2017-10-09 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/19454 Honestly I don't think it is worth doing this.

[GitHub] spark issue #19462: [SPARK-22159][SQL][FOLLOW-UP] Make config names consiste...

2017-10-09 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/19462 cc @rxin @gatorsmile

[GitHub] spark issue #19462: [SPARK-22159][SQL][FOLLOW-UP] Make config names consiste...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19462 **[Test build #82575 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82575/testReport)** for PR 19462 at commit

[GitHub] spark pull request #19462: [SPARK-22159][SQL][FOLLOW-UP] Make config names c...

2017-10-09 Thread ueshin
GitHub user ueshin opened a pull request: https://github.com/apache/spark/pull/19462 [SPARK-22159][SQL][FOLLOW-UP] Make config names consistently end with "enabled". ## What changes were proposed in this pull request? This is a follow-up of #19384. In the previous

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-09 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19082 The above reasoning also explains the motivation and the effect of #18931. The generated code of query operators is extracted to individual smaller functions. It is beneficial to step

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-09 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19082 @maropu The aggregation code is actually wrapped in a function `doAggregateWithKeys`/`doAggregateWithoutKey`. This is also the part of the generated code that this PR improves by extracting

[GitHub] spark issue #19460: [SPARK-22222][core] Fix the ARRAY_MAX in BufferHolder an...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19460 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82572/ Test PASSed.

[GitHub] spark issue #19460: [SPARK-22222][core] Fix the ARRAY_MAX in BufferHolder an...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19460 Merged build finished. Test PASSed.

[GitHub] spark issue #19460: [SPARK-22222][core] Fix the ARRAY_MAX in BufferHolder an...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19460 **[Test build #82572 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82572/testReport)** for PR 19460 at commit

[GitHub] spark issue #19363: [SPARK-22224][Minor]Override toString of KeyValue/Relati...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19363 **[Test build #82574 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82574/testReport)** for PR 19363 at commit

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-09 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r143614190 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala --- @@ -435,6 +435,35 @@ class RelationalGroupedDataset

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-09 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r143614283 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/FlatMapGroupsInPandasExec.scala --- @@ -0,0 +1,103 @@ +/* + * Licensed to

[GitHub] spark issue #19399: [SPARK-22175][WEB-UI] Add status column to history page

2017-10-09 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19399 I agree with @squito that the criteria for defining an application's success should be carefully considered. In your current code, only if all the jobs are successful is the application marked as

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-09 Thread maropu
Github user maropu commented on the issue: https://github.com/apache/spark/pull/19082 Either way, I think we first need to know why the regression on `q66` happens when turning off wholestage codegen. We first thought turning off too-long functions had better performance, but it is

[GitHub] spark pull request #19454: [SPARK-22152][SPARK-18855][SQL] Added flatten fun...

2017-10-09 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19454#discussion_r143612478 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -2543,6 +2543,14 @@ class Dataset[T] private[sql](

[GitHub] spark pull request #19459: [SPARK-20791][PYSPARK] Use Arrow to create Spark ...

2017-10-09 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19459#discussion_r143610100 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowConverters.scala --- @@ -203,4 +205,16 @@ private[sql] object

[GitHub] spark pull request #19454: [SPARK-22152][SPARK-18855][SQL] Added flatten fun...

2017-10-09 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19454#discussion_r143608933 --- Diff: core/src/test/scala/org/apache/spark/rdd/RDDSuite.scala --- @@ -63,6 +63,7 @@ class RDDSuite extends SparkFunSuite with SharedSparkContext {

[GitHub] spark pull request #19454: [SPARK-22152][SPARK-18855][SQL] Added flatten fun...

2017-10-09 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19454#discussion_r143607680 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -382,6 +382,13 @@ abstract class RDD[T: ClassTag]( } /** +*

[GitHub] spark issue #19459: [SPARK-20791][PYSPARK] Use Arrow to create Spark DataFra...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19459 **[Test build #82573 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82573/testReport)** for PR 19459 at commit

[GitHub] spark pull request #19459: [SPARK-20791][PYSPARK] Use Arrow to create Spark ...

2017-10-09 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/19459#discussion_r143607522 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowConverters.scala --- @@ -203,4 +205,16 @@ private[sql] object

[GitHub] spark pull request #19459: [SPARK-20791][PYSPARK] Use Arrow to create Spark ...

2017-10-09 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/19459#discussion_r143606693 --- Diff: python/pyspark/sql/tests.py --- @@ -3147,6 +3150,14 @@ def test_filtered_frame(self): self.assertEqual(pdf.columns[0], "i")

[GitHub] spark pull request #19454: [SPARK-22152][SPARK-18855][SQL] Added flatten fun...

2017-10-09 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19454#discussion_r143606572 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -2543,6 +2543,14 @@ class Dataset[T] private[sql](

[GitHub] spark issue #19460: [SPARK-22222][core] Fix the ARRAY_MAX in BufferHolder an...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19460 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82567/ Test PASSed.

[GitHub] spark issue #19460: [SPARK-22222][core] Fix the ARRAY_MAX in BufferHolder an...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19460 Merged build finished. Test PASSed.

[GitHub] spark pull request #19459: [SPARK-20791][PYSPARK] Use Arrow to create Spark ...

2017-10-09 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/19459#discussion_r143605840 --- Diff: python/pyspark/sql/tests.py --- @@ -3147,6 +3150,14 @@ def test_filtered_frame(self): self.assertEqual(pdf.columns[0], "i")

[GitHub] spark issue #19460: [SPARK-22222][core] Fix the ARRAY_MAX in BufferHolder an...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19460 **[Test build #82567 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82567/testReport)** for PR 19460 at commit

[GitHub] spark issue #19442: [SPARK-8515][ML][WIP] Improve ML Attribute API

2017-10-09 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19442 @VDuda Thanks for asking. This is a big change. I hope this PR can resolve SPARK-8515. Most APIs are ready. I'm working on the compatibility with current attribute APIs. When it is ready,

[GitHub] spark issue #19460: [SPARK-22222][core] Fix the ARRAY_MAX in BufferHolder an...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19460 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82564/ Test PASSed.

[GitHub] spark issue #19460: [SPARK-22222][core] Fix the ARRAY_MAX in BufferHolder an...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19460 Merged build finished. Test PASSed.

[GitHub] spark issue #19460: [SPARK-22222][core] Fix the ARRAY_MAX in BufferHolder an...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19460 **[Test build #82564 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82564/testReport)** for PR 19460 at commit

[GitHub] spark issue #19461: [SPARK-22230] Swap per-row order in state store restore.

2017-10-09 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/19461 Discussed offline. We don't need to backport to branch-2.2.

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-09 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/18664 Ok sounds good. Could I get some opinions on the best way to convert internal Spark timestamps since they are stored as UTC time? I think we have the following options: 1. Write

[GitHub] spark issue #19269: [SPARK-22026][SQL][WIP] data source v2 write path

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19269 Merged build finished. Test FAILed.

[GitHub] spark issue #19269: [SPARK-22026][SQL][WIP] data source v2 write path

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19269 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82571/ Test FAILed.

[GitHub] spark issue #19269: [SPARK-22026][SQL][WIP] data source v2 write path

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19269 **[Test build #82571 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82571/testReport)** for PR 19269 at commit

[GitHub] spark issue #19270: [SPARK-21809] : Change Stage Page to use datatables to s...

2017-10-09 Thread ajbozarth
Github user ajbozarth commented on the issue: https://github.com/apache/spark/pull/19270 So I think I know why the appId was handled the way it was: the live app UI no longer works because the appId var is "undefined" in all the API calls

[GitHub] spark issue #19270: [SPARK-21809] : Change Stage Page to use datatables to s...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19270 Merged build finished. Test PASSed.

[GitHub] spark issue #19270: [SPARK-21809] : Change Stage Page to use datatables to s...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19270 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82563/ Test PASSed.

[GitHub] spark issue #19270: [SPARK-21809] : Change Stage Page to use datatables to s...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19270 **[Test build #82563 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82563/testReport)** for PR 19270 at commit

[GitHub] spark issue #19461: [SPARK-22230] Swap per-row order in state store restore.

2017-10-09 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/19461 Oh, there are some conflicts with 2.2. @joseph-torres could you submit a backport PR, please?

[GitHub] spark pull request #19461: [SPARK-22230] Swap per-row order in state store r...

2017-10-09 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19461

[GitHub] spark issue #19461: [SPARK-22230] Swap per-row order in state store restore.

2017-10-09 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/19461 Thanks! Merging to master and 2.2.

[GitHub] spark issue #19461: [SPARK-22230] Swap per-row order in state store restore.

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19461 Merged build finished. Test PASSed.

[GitHub] spark issue #19461: [SPARK-22230] Swap per-row order in state store restore.

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19461 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82566/ Test PASSed.

[GitHub] spark issue #19461: [SPARK-22230] Swap per-row order in state store restore.

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19461 **[Test build #82566 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82566/testReport)** for PR 19461 at commit

[GitHub] spark pull request #19459: [SPARK-20791][PYSPARK] Use Arrow to create Spark ...

2017-10-09 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19459#discussion_r143600411 --- Diff: python/pyspark/sql/tests.py --- @@ -3147,6 +3150,14 @@ def test_filtered_frame(self): self.assertEqual(pdf.columns[0], "i")

[GitHub] spark issue #19433: [SPARK-3162] [MLlib] Add local tree training for decisio...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19433 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82570/ Test FAILed.

[GitHub] spark issue #19433: [SPARK-3162] [MLlib] Add local tree training for decisio...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19433 Merged build finished. Test FAILed.

[GitHub] spark issue #19433: [SPARK-3162] [MLlib] Add local tree training for decisio...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19433 **[Test build #82570 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82570/testReport)** for PR 19433 at commit

[GitHub] spark issue #19460: [SPARK-22222][core] Fix the ARRAY_MAX in BufferHolder an...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19460 Merged build finished. Test PASSed.

[GitHub] spark issue #19460: [SPARK-22222][core] Fix the ARRAY_MAX in BufferHolder an...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19460 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82562/ Test PASSed.

[GitHub] spark issue #19460: [SPARK-22222][core] Fix the ARRAY_MAX in BufferHolder an...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19460 **[Test build #82562 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82562/testReport)** for PR 19460 at commit

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-09 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18664 Yup, I think we already don't have timezone in `udf` either? I think we are fine as long as it keeps the existing behaviour. Let's not forget to handle all those cases when we deal with timezone

[GitHub] spark issue #19250: [SPARK-12297] Table timezone correction for Timestamps

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19250 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82561/ Test PASSed.

[GitHub] spark issue #19250: [SPARK-12297] Table timezone correction for Timestamps

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19250 Merged build finished. Test PASSed.
