[GitHub] spark issue #20404: [SPARK-23228][PYSPARK] Add Python Created jsparkSession ...

2018-01-28 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/20404 Thanks @HyukjinKwon for your help. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands,

[GitHub] spark issue #20404: [SPARK-23228][PYSPARK] Add Python Created jsparkSession ...

2018-01-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20404 **[Test build #86756 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86756/testReport)** for PR 20404 at commit

[GitHub] spark issue #20404: [SPARK-23228][PYSPARK] Add Python Created jsparkSession ...

2018-01-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20404 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/325/

[GitHub] spark issue #20404: [SPARK-23228][PYSPARK] Add Python Created jsparkSession ...

2018-01-28 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20404 I think I made a duplicated effort .. thanks for taking this in.

[GitHub] spark issue #20404: [SPARK-23228][PYSPARK] Add Python Created jsparkSession ...

2018-01-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20404 Merged build finished. Test PASSed.

[GitHub] spark issue #19965: [SPARK-22769][CORE] When driver stopping, there is erro...

2018-01-28 Thread Mark110
Github user Mark110 commented on the issue: https://github.com/apache/spark/pull/19965 I am hitting `org.apache.spark.SparkException: Exception thrown in awaitResult`. How can I handle this problem?

[GitHub] spark issue #20414: [SPARK-23243][SQL] Shuffle+Repartition on an RDD could l...

2018-01-28 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/20414 @cloud-fan Yea, you provided a clearer statement here, and I totally agree!

[GitHub] spark issue #20414: [SPARK-23243][SQL] Shuffle+Repartition on an RDD could l...

2018-01-28 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/20414 > Not quite - coalesce will not combine partitions across executors (aka shuffle) so you could still end up having many many files. I'm not sure if I follow here. For `coalesce(1)` Spark

[GitHub] spark issue #20402: [SPARK-23223][SQL] Make stacking dataset transforms more...

2018-01-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20402 Merged build finished. Test FAILed.

[GitHub] spark issue #20414: [SPARK-23243][SQL] Shuffle+Repartition on an RDD could l...

2018-01-28 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/20414 @felixcheung You are right, I didn't make it clear that there should still be many shuffle blocks, and if the read task gets retried it should be slower than using `repartition(1)` directly.

[GitHub] spark issue #20402: [SPARK-23223][SQL] Make stacking dataset transforms more...

2018-01-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20402 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86750/ Test FAILed.

[GitHub] spark issue #20402: [SPARK-23223][SQL] Make stacking dataset transforms more...

2018-01-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20402 **[Test build #86750 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86750/testReport)** for PR 20402 at commit

[GitHub] spark issue #19575: [SPARK-22221][DOCS] Adding User Documentation for Arrow

2018-01-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19575 Merged build finished. Test PASSed.

[GitHub] spark issue #19575: [SPARK-22221][DOCS] Adding User Documentation for Arrow

2018-01-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19575 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86755/ Test PASSed.

[GitHub] spark issue #19575: [SPARK-22221][DOCS] Adding User Documentation for Arrow

2018-01-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19575 **[Test build #86755 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86755/testReport)** for PR 19575 at commit

[GitHub] spark pull request #20404: [SPARK-23228][PYSPARK] Add Python Created jsparkS...

2018-01-28 Thread jerryshao
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/20404#discussion_r164350646 --- Diff: python/pyspark/sql/session.py --- @@ -760,6 +764,7 @@ def stop(self): """Stop the underlying :class:`SparkContext`.

[GitHub] spark issue #19575: [SPARK-22221][DOCS] Adding User Documentation for Arrow

2018-01-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19575 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/324/

[GitHub] spark issue #19575: [SPARK-22221][DOCS] Adding User Documentation for Arrow

2018-01-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19575 Merged build finished. Test PASSed.

[GitHub] spark pull request #20404: [SPARK-23228][PYSPARK] Add Python Created jsparkS...

2018-01-28 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/20404#discussion_r164350415 --- Diff: python/pyspark/sql/session.py --- @@ -760,6 +764,7 @@ def stop(self): """Stop the underlying :class:`SparkContext`.

[GitHub] spark issue #20397: [SPARK-23219][SQL]Rename ReadTask to DataReaderFactory i...

2018-01-28 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/20397 LGTM except a few comments. also cc @rdblue

[GitHub] spark issue #19575: [SPARK-22221][DOCS] Adding User Documentation for Arrow

2018-01-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19575 **[Test build #86755 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86755/testReport)** for PR 19575 at commit

[GitHub] spark pull request #20404: [SPARK-23228][PYSPARK] Add Python Created jsparkS...

2018-01-28 Thread jerryshao
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/20404#discussion_r164350171 --- Diff: python/pyspark/sql/session.py --- @@ -760,6 +764,7 @@ def stop(self): """Stop the underlying :class:`SparkContext`.

[GitHub] spark pull request #20404: [SPARK-23228][PYSPARK] Add Python Created jsparkS...

2018-01-28 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/20404#discussion_r164349606 --- Diff: python/pyspark/sql/session.py --- @@ -760,6 +764,7 @@ def stop(self): """Stop the underlying :class:`SparkContext`.

[GitHub] spark issue #20403: [SPARK-23238][SQL] Externalize SQLConf configurations ex...

2018-01-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20403 Merged build finished. Test PASSed.

[GitHub] spark issue #20403: [SPARK-23238][SQL] Externalize SQLConf configurations ex...

2018-01-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20403 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86749/ Test PASSed.

[GitHub] spark pull request #20397: [SPARK-23219][SQL]Rename ReadTask to DataReaderFa...

2018-01-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20397#discussion_r164349078 --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/Distribution.java --- @@ -21,9 +21,9 @@ /** * An interface to

[GitHub] spark issue #20414: [SPARK-23243][SQL] Shuffle+Repartition on an RDD could l...

2018-01-28 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/20414 > Actually for the first case, you shall use coalesce() instead of repartition() to get a similar effect, without need of another shuffle! Not quite - coalesce will not combine partitions

[GitHub] spark pull request #20397: [SPARK-23219][SQL]Rename ReadTask to DataReaderFa...

2018-01-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20397#discussion_r164349108 --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/Distribution.java --- @@ -21,9 +21,9 @@ /** * An interface to

[GitHub] spark issue #20403: [SPARK-23238][SQL] Externalize SQLConf configurations ex...

2018-01-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20403 **[Test build #86749 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86749/testReport)** for PR 20403 at commit

[GitHub] spark pull request #20397: [SPARK-23219][SQL]Rename ReadTask to DataReaderFa...

2018-01-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20397#discussion_r164348810 --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/DataSourceV2Reader.java --- @@ -63,7 +63,7 @@ StructType readSchema();

[GitHub] spark pull request #20397: [SPARK-23219][SQL]Rename ReadTask to DataReaderFa...

2018-01-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20397#discussion_r164348145 --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/DataSourceV2Reader.java --- @@ -63,7 +63,7 @@ StructType readSchema();

[GitHub] spark pull request #20397: [SPARK-23219][SQL]Rename ReadTask to DataReaderFa...

2018-01-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20397#discussion_r164348008 --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/DataReaderFactory.java --- @@ -50,7 +50,7 @@ } /** - *

[GitHub] spark pull request #20397: [SPARK-23219][SQL]Rename ReadTask to DataReaderFa...

2018-01-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20397#discussion_r164347780 --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/DataReaderFactory.java --- @@ -22,19 +22,19 @@ import

[GitHub] spark pull request #20397: [SPARK-23219][SQL]Rename ReadTask to DataReaderFa...

2018-01-28 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/20397#discussion_r164347650 --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/SupportsScanColumnarBatch.java --- @@ -30,21 +30,21 @@

[GitHub] spark pull request #20397: [SPARK-23219][SQL]Rename ReadTask to DataReaderFa...

2018-01-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20397#discussion_r164347542 --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/ClusteredDistribution.java --- @@ -22,7 +22,7 @@ /** * A concrete

[GitHub] spark pull request #19575: [SPARK-22221][DOCS] Adding User Documentation for...

2018-01-28 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/19575#discussion_r164347106 --- Diff: docs/sql-programming-guide.md --- @@ -1640,6 +1640,133 @@ Configuration of Hive is done by placing your `hive-site.xml`, `core-site.xml` a

[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-01-28 Thread rednaxelafx
Github user rednaxelafx commented on the issue: https://github.com/apache/spark/pull/20419 LGTM, and +1 on @viirya 's idea. I like it better for the comment to be on top of the class declaration instead of inside it; but I'm okay either way if others have strong opinions otherwise. As

[GitHub] spark issue #20250: [SPARK-23059][SQL][TEST] Correct some improper with view...

2018-01-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20250 Merged build finished. Test PASSed.

[GitHub] spark issue #20250: [SPARK-23059][SQL][TEST] Correct some improper with view...

2018-01-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20250 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86748/ Test PASSed.

[GitHub] spark pull request #19575: [SPARK-22221][DOCS] Adding User Documentation for...

2018-01-28 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/19575#discussion_r164346796 --- Diff: docs/sql-programming-guide.md --- @@ -1640,6 +1640,133 @@ Configuration of Hive is done by placing your `hive-site.xml`, `core-site.xml` a

[GitHub] spark issue #20250: [SPARK-23059][SQL][TEST] Correct some improper with view...

2018-01-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20250 **[Test build #86748 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86748/testReport)** for PR 20250 at commit

[GitHub] spark issue #20414: [SPARK-23243][SQL] Shuffle+Repartition on an RDD could l...

2018-01-28 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/20414 Talked to @yanboliang offline; he claimed that the major use cases of RDD/DataFrame.repartition() in ML workloads he has observed are: 1. When saving models, you may need `repartition()` to

[GitHub] spark pull request #19575: [SPARK-22221][DOCS] Adding User Documentation for...

2018-01-28 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/19575#discussion_r164346090 --- Diff: docs/sql-programming-guide.md --- @@ -1640,6 +1640,133 @@ Configuration of Hive is done by placing your `hive-site.xml`, `core-site.xml` a

[GitHub] spark pull request #19575: [SPARK-22221][DOCS] Adding User Documentation for...

2018-01-28 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/19575#discussion_r164345403 --- Diff: docs/sql-programming-guide.md --- @@ -1640,6 +1640,133 @@ Configuration of Hive is done by placing your `hive-site.xml`, `core-site.xml` a

[GitHub] spark pull request #19575: [SPARK-22221][DOCS] Adding User Documentation for...

2018-01-28 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/19575#discussion_r164345006 --- Diff: docs/sql-programming-guide.md --- @@ -1640,6 +1640,129 @@ Configuration of Hive is done by placing your `hive-site.xml`, `core-site.xml` a

[GitHub] spark pull request #20397: [SPARK-23219][SQL]Rename ReadTask to DataReaderFa...

2018-01-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20397#discussion_r164345010 --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/SupportsScanColumnarBatch.java --- @@ -30,21 +30,21 @@

[GitHub] spark issue #20397: [SPARK-23219][SQL]Rename ReadTask to DataReaderFactory i...

2018-01-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20397 **[Test build #86754 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86754/testReport)** for PR 20397 at commit

[GitHub] spark issue #20397: [SPARK-23219][SQL]Rename ReadTask to DataReaderFactory i...

2018-01-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20397 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/323/

[GitHub] spark issue #20397: [SPARK-23219][SQL]Rename ReadTask to DataReaderFactory i...

2018-01-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20397 Merged build finished. Test PASSed.

[GitHub] spark pull request #20373: [SPARK-23159][PYTHON] Update cloudpickle to match...

2018-01-28 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/20373#discussion_r164343686 --- Diff: python/pyspark/cloudpickle.py --- @@ -1019,18 +948,40 @@ def __reduce__(cls): return cls.__name__ -def

[GitHub] spark pull request #20397: [SPARK-23219][SQL]Rename ReadTask to DataReaderFa...

2018-01-28 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/20397#discussion_r164342995 --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/SupportsScanColumnarBatch.java --- @@ -30,21 +30,21 @@

[GitHub] spark issue #20404: [SPARK-23228][PYSPARK] Add Python Created jsparkSession ...

2018-01-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20404 Merged build finished. Test FAILed.

[GitHub] spark issue #20404: [SPARK-23228][PYSPARK] Add Python Created jsparkSession ...

2018-01-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20404 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86752/ Test FAILed.

[GitHub] spark issue #20404: [SPARK-23228][PYSPARK] Add Python Created jsparkSession ...

2018-01-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20404 **[Test build #86752 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86752/testReport)** for PR 20404 at commit

[GitHub] spark issue #20420: [SPARK-22916][SQL][FOLLOW-UP] Update the Description of ...

2018-01-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20420 **[Test build #86753 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86753/testReport)** for PR 20420 at commit

[GitHub] spark issue #20373: [SPARK-23159][PYTHON] Update cloudpickle to match 0.4.2

2018-01-28 Thread rgbkrk
Github user rgbkrk commented on the issue: https://github.com/apache/spark/pull/20373 Cool, I can take a gander at this tomorrow (beyond my limited skim just now).

[GitHub] spark issue #20335: [SPARK-23088][CORE] History server not showing incomplet...

2018-01-28 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/20335 CC @ajbozarth.

[GitHub] spark issue #20420: [SPARK-22916][SQL][FOLLOW-UP] Update the Description of ...

2018-01-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20420 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/322/

[GitHub] spark issue #20420: [SPARK-22916][SQL][FOLLOW-UP] Update the Description of ...

2018-01-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20420 Merged build finished. Test PASSed.

[GitHub] spark pull request #20373: [SPARK-23159][PYTHON] Update cloudpickle to match...

2018-01-28 Thread rgbkrk
Github user rgbkrk commented on a diff in the pull request: https://github.com/apache/spark/pull/20373#discussion_r164341489 --- Diff: python/pyspark/cloudpickle.py --- @@ -1019,18 +948,40 @@ def __reduce__(cls): return cls.__name__ -def

[GitHub] spark pull request #18349: [SPARK-20927][SS] Change some operators in Datase...

2018-01-28 Thread ZiyueHuang
Github user ZiyueHuang closed the pull request at: https://github.com/apache/spark/pull/18349

[GitHub] spark issue #20420: [SPARK-22916][SQL][FOLLOW-UP] Update the Description of ...

2018-01-28 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20420 retest this please

[GitHub] spark issue #20420: [SPARK-22916][SQL][FOLLOW-UP] Update the Description of ...

2018-01-28 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20420 `org.apache.spark.sql.execution.datasources.orc.OrcQuerySuite.(It is not a test it is a sbt.testing.SuiteSelector)` is another flaky test

[GitHub] spark pull request #20405: [SPARK-23229][SQL] Dataset.hint should use planWi...

2018-01-28 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/20405#discussion_r164340914 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -1216,7 +1216,7 @@ class Dataset[T] private[sql]( */

[GitHub] spark issue #20404: [SPARK-23228][PYSPARK] Add Python Created jsparkSession ...

2018-01-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20404 **[Test build #86752 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86752/testReport)** for PR 20404 at commit

[GitHub] spark pull request #20405: [SPARK-23229][SQL] Dataset.hint should use planWi...

2018-01-28 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/20405#discussion_r164340278 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -1216,7 +1216,7 @@ class Dataset[T] private[sql]( */

[GitHub] spark issue #20404: [SPARK-23228][PYSPARK] Add Python Created jsparkSession ...

2018-01-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20404 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/321/

[GitHub] spark issue #20404: [SPARK-23228][PYSPARK] Add Python Created jsparkSession ...

2018-01-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20404 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #20402: [SPARK-23223][SQL] Make stacking dataset transforms more...

2018-01-28 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/20402 LGTM too.

[GitHub] spark pull request #19575: [SPARK-22221][DOCS] Adding User Documentation for...

2018-01-28 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19575#discussion_r164337710 --- Diff: docs/sql-programming-guide.md --- @@ -1640,6 +1640,133 @@ Configuration of Hive is done by placing your `hive-site.xml`, `core-site.xml` a

[GitHub] spark pull request #19575: [SPARK-22221][DOCS] Adding User Documentation for...

2018-01-28 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19575#discussion_r164337734 --- Diff: docs/sql-programming-guide.md --- @@ -1640,6 +1640,133 @@ Configuration of Hive is done by placing your `hive-site.xml`, `core-site.xml` a

[GitHub] spark issue #20402: [SPARK-23223][SQL] Make stacking dataset transforms more...

2018-01-28 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/20402 LGTM

[GitHub] spark pull request #20402: [SPARK-23223][SQL] Make stacking dataset transfor...

2018-01-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20402#discussion_r164338886 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -62,7 +62,11 @@ import org.apache.spark.util.Utils private[sql]

[GitHub] spark pull request #20405: [SPARK-23229][SQL] Dataset.hint should use planWi...

2018-01-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20405#discussion_r164338606 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -1216,7 +1216,7 @@ class Dataset[T] private[sql]( */

[GitHub] spark issue #20372: [SPARK-23249] Improved block merging logic for partition...

2018-01-28 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/20372 please update your title to `[SPARK-23249][SQL] ...`

[GitHub] spark issue #20372: [SPARK-23249] Improved block merging logic for partition...

2018-01-28 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/20372 LGTM, also cc @hvanhovell @marmbrus

[GitHub] spark pull request #20402: [SPARK-23223][SQL] Make stacking dataset transfor...

2018-01-28 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20402#discussion_r164338267 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -62,7 +62,11 @@ import org.apache.spark.util.Utils private[sql]

[GitHub] spark pull request #20369: [SPARK-23196] Unify continuous and microbatch V2 ...

2018-01-28 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/20369 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #20369: [SPARK-23196] Unify continuous and microbatch V2 sinks

2018-01-28 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/20369 thanks, merging to master/2.3!

[GitHub] spark pull request #20402: [SPARK-23223][SQL] Make stacking dataset transfor...

2018-01-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20402#discussion_r164337457 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -62,7 +62,11 @@ import org.apache.spark.util.Utils private[sql]

[GitHub] spark issue #20402: [SPARK-23223][SQL] Make stacking dataset transforms more...

2018-01-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20402 **[Test build #86751 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86751/testReport)** for PR 20402 at commit

[GitHub] spark issue #20402: [SPARK-23223][SQL] Make stacking dataset transforms more...

2018-01-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20402 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/320/

[GitHub] spark issue #20402: [SPARK-23223][SQL] Make stacking dataset transforms more...

2018-01-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20402 Merged build finished. Test PASSed.

[GitHub] spark pull request #20402: [SPARK-23223][SQL] Make stacking dataset transfor...

2018-01-28 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/20402#discussion_r164336721 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/QueryExecution.scala --- @@ -23,7 +23,7 @@ import java.sql.{Date, Timestamp} import

[GitHub] spark issue #20402: [SPARK-23223][SQL] Make stacking dataset transforms more...

2018-01-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20402 **[Test build #86750 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86750/testReport)** for PR 20402 at commit

[GitHub] spark issue #20402: [SPARK-23223][SQL] Make stacking dataset transforms more...

2018-01-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20402 Merged build finished. Test PASSed.

[GitHub] spark issue #20402: [SPARK-23223][SQL] Make stacking dataset transforms more...

2018-01-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20402 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/319/

[GitHub] spark pull request #20397: [SPARK-23219][SQL]Rename ReadTask to DataReaderFa...

2018-01-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20397#discussion_r164335994 --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/SupportsScanColumnarBatch.java --- @@ -30,21 +30,21 @@

[GitHub] spark issue #20373: [SPARK-23159][PYTHON] Update cloudpickle to match 0.4.2

2018-01-28 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20373 LGTM otherwise. @ueshin, I assume you were following this too. Did you have a chance to take a look? Also, @rgbkrk, I think it would be great to have you take a look too. It's basically a

[GitHub] spark issue #20420: [SPARK-22916][SQL][FOLLOW-UP] Update the Description of ...

2018-01-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20420 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86747/ Test FAILed. ---

[GitHub] spark issue #20420: [SPARK-22916][SQL][FOLLOW-UP] Update the Description of ...

2018-01-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20420 Merged build finished. Test FAILed.

[GitHub] spark issue #20420: [SPARK-22916][SQL][FOLLOW-UP] Update the Description of ...

2018-01-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20420 **[Test build #86747 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86747/testReport)** for PR 20420 at commit

[GitHub] spark issue #20369: [SPARK-23196] Unify continuous and microbatch V2 sinks

2018-01-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20369 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86746/ Test PASSed. ---

[GitHub] spark issue #20369: [SPARK-23196] Unify continuous and microbatch V2 sinks

2018-01-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20369 Merged build finished. Test PASSed.

[GitHub] spark issue #20369: [SPARK-23196] Unify continuous and microbatch V2 sinks

2018-01-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20369 **[Test build #86746 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86746/testReport)** for PR 20369 at commit

[GitHub] spark issue #20403: [SPARK-23238][SQL] Externalize SQLConf configurations ex...

2018-01-28 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20403 LGTM

[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-01-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20419 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86745/ Test PASSed. ---

[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-01-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20419 Merged build finished. Test PASSed.

[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-01-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20419 **[Test build #86745 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86745/testReport)** for PR 20419 at commit

[GitHub] spark issue #20373: [SPARK-23159][PYTHON] Update cloudpickle to match 0.4.2

2018-01-28 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20373 I took a quick look at the commits, and it seems we should backport https://github.com/cloudpipe/cloudpickle/pull/145 too, as it looks like it was introduced from
