[GitHub] spark issue #23111: [SPARK-26148][PYTHON][TESTS] Increases default paralleli...

2018-11-23 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23111 Hey all, I will merge this in a few days if there are no more comments. It's going to speed up the tests by roughly 12 ~ 15 mins

[GitHub] spark issue #23119: [SPARK-25954][SS][FOLLOWUP][test-maven] Add Zookeeper 3....

2018-11-23 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23119 LGTM too --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h

[GitHub] spark pull request #23080: [SPARK-26108][SQL] Support custom lineSep in CSV ...

2018-11-22 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/23080#discussion_r235830894 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/streaming/DataStreamReader.scala --- @@ -377,6 +377,8 @@ final class DataStreamReader private

[GitHub] spark issue #23111: [SPARK-26148][PYTHON][TESTS] Increases default paralleli...

2018-11-22 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23111 Yea, the improvement looks persistent: `Tests passed in 1027 seconds`

[GitHub] spark issue #23117: [WIP][SPARK-7721][INFRA] Run and generate test coverage ...

2018-11-22 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23117 It's not urgent :) so it's okay. Actually I'm on vacation for a week as well. Thanks for taking a look @shaneknapp

[GitHub] spark pull request #23098: [WIP][SPARK-26132][BUILD][CORE] Remove support fo...

2018-11-22 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/23098#discussion_r235830558 --- Diff: bin/load-spark-env.cmd --- @@ -21,37 +21,42 @@ rem This script loads spark-env.cmd if it exists, and ensures it is only loaded rem spark

[GitHub] spark issue #23109: [SPARK-26069][TESTS][FOLLOWUP]Add another possible error...

2018-11-22 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23109 retest this please

[GitHub] spark issue #23111: [SPARK-26148][PYTHON][TESTS] Increases default paralleli...

2018-11-22 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23111 cc @rxin, @BryanCutler, @squito. This decreases elapsed time (even faster than before splitting the tests

[GitHub] spark issue #23111: [DO-NOT-MERGE] Increases default parallelism in PySpark ...

2018-11-22 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23111 Oh? it drastically decreases from, for instance, ``` Tests passed in 1770 seconds ``` to ``` Tests passed in 1171 seconds

[GitHub] spark issue #23111: [DO-NOT-MERGE] Increases default parallelism in PySpark ...

2018-11-22 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23111 retest this please

[GitHub] spark issue #23117: [WIP][SPARK-7721][INFRA] Run and generate test coverage ...

2018-11-22 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23117 retest this please

[GitHub] spark pull request #23117: [WIP][SPARK-7721][INFRA] Run and generate test co...

2018-11-22 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/23117#discussion_r235660674 --- Diff: dev/run-tests.py --- @@ -594,7 +651,18 @@ def main(): modules_with_python_tests = [m for m in test_modules

[GitHub] spark issue #23118: [SPARK-26144][BUILD] `build/mvn` should detect `scala.ve...

2018-11-22 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23118 Haha today's not a holiday here :D.

[GitHub] spark issue #23109: [SPARK-26069][TESTS][FOLLOWUP]Add another possible error...

2018-11-22 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23109 retest this please

[GitHub] spark pull request #23117: [WIP][SPARK-7721][INFRA] Run and generate test co...

2018-11-22 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/23117#discussion_r235644712 --- Diff: dev/run-tests.py --- @@ -594,7 +651,18 @@ def main(): modules_with_python_tests = [m for m in test_modules

[GitHub] spark issue #23111: [DO-NOT-MERGE] Increases default parallelism in PySpark ...

2018-11-22 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23111 retest this please

[GitHub] spark issue #23117: [WIP][SPARK-7721][INFRA] Run and generate test coverage ...

2018-11-22 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23117 cc @rxin, @JoshRosen, @shaneknapp, @gatorsmile, @BryanCutler, @holdenk, @felixcheung, @viirya, @ueshin, @icexelloss

[GitHub] spark pull request #23117: [WIP][SPARK-7721][INFRA] Run and generate test co...

2018-11-22 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/23117 [WIP][SPARK-7721][INFRA] Run and generate test coverage report from Python via Jenkins ## What changes were proposed in this pull request? ### Background For the current

[GitHub] spark issue #23111: [DO-NOT-MERGE] Increases default parallelism in PySpark ...

2018-11-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23111 retest this please

[GitHub] spark issue #23109: [SPARK-26069][TESTS][FOLLOWUP]Add another possible error...

2018-11-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23109 retest this please

[GitHub] spark issue #23112: [GraphX] Remove unused variables left over by previous r...

2018-11-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23112 ok to test

[GitHub] spark issue #23112: [GraphX] Remove unused variables left over by previous r...

2018-11-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23112 Looks okay to go.

[GitHub] spark issue #23109: [SPARK-26069][TESTS][FOLLOWUP]Add another possible error...

2018-11-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23109 retest this please (the test failures don't look persistent)

[GitHub] spark issue #23070: [SPARK-26099][SQL] Verification of the corrupt column in...

2018-11-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23070 Merged to master.

[GitHub] spark issue #23080: [SPARK-26108][SQL] Support custom lineSep in CSV datasou...

2018-11-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23080 LGTM except https://github.com/apache/spark/pull/23080/files#r235589426

[GitHub] spark pull request #23080: [SPARK-26108][SQL] Support custom lineSep in CSV ...

2018-11-21 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/23080#discussion_r235589426 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVOptions.scala --- @@ -216,8 +232,13 @@ class CSVOptions

[GitHub] spark pull request #23080: [SPARK-26108][SQL] Support custom lineSep in CSV ...

2018-11-21 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/23080#discussion_r235589448 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVOptions.scala --- @@ -227,7 +248,10 @@ class CSVOptions

[GitHub] spark pull request #22979: [SPARK-25977][SQL] Parsing decimals from CSV usin...

2018-11-21 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22979#discussion_r235588064 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVExprUtils.scala --- @@ -79,4 +83,22 @@ object CSVExprUtils

[GitHub] spark issue #23027: [SPARK-26049][SQL][TEST] FilterPushdownBenchmark add InM...

2018-11-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23027 Looks fine. @dongjoon-hyun WDYT?

[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...

2018-11-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23052 There are two more things to deal with: https://github.com/apache/spark/pull/23052#issuecomment-440687200 comment will still be valid - at least it should be double checked because

[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...

2018-11-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23052 cc @cloud-fan as well

[GitHub] spark issue #23111: [DO-NOT-MERGE] Increases default parallelism in PySpark ...

2018-11-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23111 retest this please

[GitHub] spark pull request #22938: [SPARK-25935][SQL] Prevent null rows from JSON pa...

2018-11-21 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22938#discussion_r235583851 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -240,16 +240,6 @@ class SQLQuerySuite extends QueryTest

[GitHub] spark issue #22938: [SPARK-25935][SQL] Prevent null rows from JSON parser

2018-11-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22938 Sorry for the late response. The change looks good to me in general but I had one question.

[GitHub] spark pull request #22938: [SPARK-25935][SQL] Prevent null rows from JSON pa...

2018-11-21 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22938#discussion_r235583559 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala --- @@ -1892,7 +1898,7 @@ class JsonSuite extends

[GitHub] spark pull request #22938: [SPARK-25935][SQL] Prevent null rows from JSON pa...

2018-11-21 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22938#discussion_r235583349 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala --- @@ -1892,7 +1898,7 @@ class JsonSuite extends

[GitHub] spark pull request #22938: [SPARK-25935][SQL] Prevent null rows from JSON pa...

2018-11-21 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22938#discussion_r235583315 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala --- @@ -1905,7 +1911,7 @@ class JsonSuite extends

[GitHub] spark issue #23111: [DO-NOT-MERGE] Increases default parallelism in PySpark ...

2018-11-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23111 retest this please

[GitHub] spark issue #23113: [SPARK-26019][PYTHON] Fix race condition in accumulators...

2018-11-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23113 > race condition explained in https://issues.apache.org/jira/browse/SPARK-26019 How does the race condition happen? Can you clarify it in the PR description? I think @viirya's analysis is matc

[GitHub] spark pull request #23098: [WIP][SPARK-26132][BUILD][CORE] Remove support fo...

2018-11-21 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/23098#discussion_r235572895 --- Diff: bin/load-spark-env.cmd --- @@ -21,37 +21,42 @@ rem This script loads spark-env.cmd if it exists, and ensures it is only loaded rem spark

[GitHub] spark pull request #23098: [WIP][SPARK-26132][BUILD][CORE] Remove support fo...

2018-11-21 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/23098#discussion_r235572737 --- Diff: bin/load-spark-env.cmd --- @@ -21,37 +21,42 @@ rem This script loads spark-env.cmd if it exists, and ensures it is only loaded rem spark

[GitHub] spark pull request #23098: [WIP][SPARK-26132][BUILD][CORE] Remove support fo...

2018-11-21 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/23098#discussion_r235572442 --- Diff: bin/load-spark-env.cmd --- @@ -21,37 +21,42 @@ rem This script loads spark-env.cmd if it exists, and ensures it is only loaded rem spark

[GitHub] spark pull request #23111: [DO-NOT-MERGE] Increases default parallelism in P...

2018-11-21 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/23111 [DO-NOT-MERGE] Increases default parallelism in PySpark tests ## What changes were proposed in this pull request? I'm trying to see if increasing parallelism decreases elapsed time

[GitHub] spark issue #23078: [SPARK-26106][PYTHON] Prioritizes ML unittests over the ...

2018-11-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23078 Thanks. Merged to master.

[GitHub] spark pull request #23102: [SPARK-26137][CORE] Use Java system property "fil...

2018-11-21 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/23102#discussion_r235570156 --- Diff: core/src/main/scala/org/apache/spark/deploy/DependencyUtils.scala --- @@ -65,7 +65,7 @@ private[deploy] object DependencyUtils extends

[GitHub] spark issue #23102: [SPARK-26137][CORE] Use Java system property "file.separ...

2018-11-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23102 @markpavey, can you write a test? I can run the test on Windows via AppVeyor.

[GitHub] spark issue #23055: [SPARK-26080][PYTHON] Disable 'spark.executor.pyspark.me...

2018-11-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23055 Sorry, may I ask you to take another look please? I thought it's quite a straightforward fix by a consistent way of fixing

[GitHub] spark issue #23078: [SPARK-26106][PYTHON] Prioritizes ML unittests over the ...

2018-11-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23078 @JoshRosen, can you take a look please when you're available? It's quite an obvious fix.

[GitHub] spark issue #23109: [SPARK-26069][TESTS][FOLLOWUP]Add another possible error...

2018-11-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23109 retest this please

[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...

2018-11-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23052 Also, Parquet does not always write empty files. It does not write empty files when data frames are created from emptyRDD (the one pointed out in the PR link I gave). We should match

[GitHub] spark issue #23101: [SPARK-26134][CORE] Upgrading Hadoop to 2.7.4 to fix jav...

2018-11-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23101 LGTM2

[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...

2018-11-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23052 @MaxGekk I didn't mean to block this PR. Since we're going ahead for 3.0, it should be good to match and fix the behaviours across data sources. For instance, CSV should still be able to read

[GitHub] zeppelin issue #3206: [ZEPPELIN-3810] Support Spark 2.4

2018-11-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/zeppelin/pull/3206 This fix is not released yet. This PR exactly fixes the problem you faced. This fix will be available in the next release of Zeppelin.

[GitHub] spark pull request #23098: [WIP][SPARK-26132][BUILD][CORE] Remove support fo...

2018-11-21 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/23098#discussion_r235322452 --- Diff: bin/load-spark-env.cmd --- @@ -21,37 +21,42 @@ rem This script loads spark-env.cmd if it exists, and ensures it is only loaded rem spark

[GitHub] spark issue #23071: [SPARK-26102][SQL][TEST] Extracting common CSV/JSON func...

2018-11-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23071 Thank you @MaxGekk.

[GitHub] spark issue #22939: [SPARK-25446][R] Add schema_of_json() and schema_of_csv(...

2018-11-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22939 Sure!

[GitHub] spark issue #23027: [SPARK-26049][SQL][TEST] FilterPushdownBenchmark add InM...

2018-11-20 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23027 @wangyum, why did you close this?

[GitHub] spark issue #22939: [SPARK-25446][R] Add schema_of_json() and schema_of_csv(...

2018-11-20 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22939 gentle ping, @felixcheung.

[GitHub] spark issue #23070: [SPARK-26099][SQL] Verification of the corrupt column in...

2018-11-20 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23070 retest this please

[GitHub] spark issue #23071: [SPARK-26102][SQL][TEST] Extracting common CSV/JSON func...

2018-11-20 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23071 I think we don't need this for now. Let's do this when more `from/to_...` functions are added later. The amount of code actually increases

[GitHub] spark issue #23080: [SPARK-26108][SQL] Support custom lineSep in CSV datasou...

2018-11-20 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23080 @MaxGekk, let's rebase this one accordingly with encoding support.

[GitHub] spark pull request #23080: [SPARK-26108][SQL] Support custom lineSep in CSV ...

2018-11-20 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/23080#discussion_r235244407 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVOptions.scala --- @@ -192,6 +192,20 @@ class CSVOptions

[GitHub] spark issue #23085: [Docs] Added csv, orc, and text output format options to...

2018-11-20 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23085 @mrandrewandrade, let's close this for now.

[GitHub] spark issue #23078: [SPARK-26106][PYTHON] Prioritizes ML unittests over the ...

2018-11-20 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23078 @zsxwing can you take a look please when you're available

[GitHub] spark issue #23099: [WIP][SPARK-25954][SS] Upgrade to Kafka 2.1.0

2018-11-20 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23099 retest this please

[GitHub] spark issue #23089: [SPARK-26120][TESTS][SS][SPARKR]Fix a streaming query le...

2018-11-20 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23089 Merged to master and branch-2.4.

[GitHub] spark issue #23091: [SPARK-26122][SQL] Support encoding for multiLine in CSV...

2018-11-20 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23091 Merged to master.

[GitHub] spark pull request #23055: [SPARK-26080][PYTHON] Disable 'spark.executor.pys...

2018-11-20 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/23055#discussion_r235226292 --- Diff: core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala --- @@ -74,8 +74,13 @@ private[spark] abstract class BasePythonRunner

[GitHub] spark issue #23085: [Docs] Added csv, orc, and text output format options to...

2018-11-20 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23085 Because it's already documented. Also it brings maintenance overhead.

[GitHub] spark issue #23087: [SPARK-26124][BUILD] Update plugins to latest versions

2018-11-20 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23087 retest this please

[GitHub] spark issue #23004: [SPARK-26004][SQL] InMemoryTable support StartsWith pred...

2018-11-19 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23004 Looks fine to me

[GitHub] spark issue #23085: [Docs] Added csv, orc, and text output format options to...

2018-11-19 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23085 It's already documented on the official site anyway.

[GitHub] spark issue #23085: [Docs] Added csv, orc, and text output format options to...

2018-11-19 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23085 I think it doesn't need to change. It's likely to be changed and we wouldn't want to update this doc every time we add a new datasource

[GitHub] spark pull request #23055: [SPARK-26080][PYTHON] Disable 'spark.executor.pys...

2018-11-19 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/23055#discussion_r234838984 --- Diff: core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala --- @@ -74,8 +74,13 @@ private[spark] abstract class BasePythonRunner

[GitHub] spark issue #23091: [SPARK-26122][SQL] Support encoding for multiLine in CSV...

2018-11-19 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23091 FYI hey @priancho IIRC, you proposed a similar change before in the mailing list. I wasn't positive about that because I was thinking we should deprecate `encoding` option at that time. It has

[GitHub] spark issue #23087: [MINOR][BUILD] Update plugins to latest versions

2018-11-19 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23087 Looks fine to me.

[GitHub] spark pull request #23087: [MINOR][BUILD] Update plugins to latest versions

2018-11-19 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/23087#discussion_r234831263 --- Diff: pom.xml --- @@ -2522,7 +2530,7 @@ com.puppycrawl.tools checkstyle -8.2

[GitHub] spark issue #22635: [SPARK-25591][PySpark][SQL] Avoid overwriting deserializ...

2018-11-19 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22635 This is fixed in 2.4.0 and your issue is when 2.3.1 -> 2.3.2. It's not related.

[GitHub] spark issue #22635: [SPARK-25591][PySpark][SQL] Avoid overwriting deserializ...

2018-11-19 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22635 How is it related to the JIRA? It doesn't look quite related from a cursory look. Please leave some analysis next time, or at least test it before/after the specific commit. Let me take a look

[GitHub] spark issue #23080: [SPARK-26108][SQL] Support custom lineSep in CSV datasou...

2018-11-18 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23080 Ah, also, `CsvParser.beginParsing` takes an additional argument `Charset`. It should be fairly easy to support encoding in `multiLine`. @MaxGekk, would you be able to find some time

[GitHub] spark pull request #23080: [SPARK-26108][SQL] Support custom lineSep in CSV ...

2018-11-18 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/23080#discussion_r234476318 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVOptions.scala --- @@ -192,6 +192,20 @@ class CSVOptions

[GitHub] spark pull request #23080: [SPARK-26108][SQL] Support custom lineSep in CSV ...

2018-11-18 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/23080#discussion_r234475595 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVOptions.scala --- @@ -192,6 +192,20 @@ class CSVOptions

[GitHub] spark pull request #23080: [SPARK-26108][SQL] Support custom lineSep in CSV ...

2018-11-18 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/23080#discussion_r234475228 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVOptions.scala --- @@ -192,6 +192,20 @@ class CSVOptions

[GitHub] spark issue #23077: [SPARK-26105][PYTHON] Clean unittest2 imports up that we...

2018-11-18 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23077 Merged to master. Thanks for reviewing this, @BryanCutler and @srowen.

[GitHub] spark issue #23055: [SPARK-26080][PYTHON] Disable 'spark.executor.pyspark.me...

2018-11-18 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23055 adding @BryanCutler and @ueshin as well FYI.

[GitHub] spark issue #23078: [SPARK-26106][PYTHON] Prioritizes ML unittests over the ...

2018-11-18 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23078 cc @BryanCutler.

[GitHub] spark pull request #23078: [SPARK-26106][PYTHON] Prioritizes ML unittests ov...

2018-11-18 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/23078 [SPARK-26106][PYTHON] Prioritizes ML unittests over the doctests in PySpark ## What changes were proposed in this pull request? Arguably, unittests usually take longer than doctests

[GitHub] spark issue #23077: [SPARK-25344][PYTHON] Clean unittest2 imports up that we...

2018-11-18 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23077 cc @BryanCutler. BTW, Bryan, do you have some time to work on the `has_numpy` stuff that we talked about before? I was thinking we should specify the package version and produce

[GitHub] spark pull request #23077: [SPARK-25344][PYTHON] Clean unittest2 imports up ...

2018-11-18 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/23077 [SPARK-25344][PYTHON] Clean unittest2 imports up that were added for Python 2.6 before ## What changes were proposed in this pull request? Currently, some of the PySpark tests still assume

[GitHub] spark issue #23063: [SPARK-26033][PYTHON][TESTS] Break large ml/tests.py fil...

2018-11-18 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23063 Merged to master.

[GitHub] spark issue #23070: [SPARK-26099][SQL] Verification of the corrupt column in...

2018-11-17 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23070 Looks good to me. I or someone else should take a closer look tho.

[GitHub] spark issue #23070: [SPARK-26099][SQL] Verification of the corrupt column in...

2018-11-17 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23070 ok to test

[GitHub] spark issue #23070: [SPARK-26099][SQL] Verification of the corrupt column in...

2018-11-17 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23070 add to whitelist

[GitHub] spark issue #23064: [MINOR][SQL] Fix typo in CTAS plan database string

2018-11-17 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23064 Merged to master, branch-2.4 and branch-2.3.

[GitHub] spark issue #23055: [SPARK-26080][PYTHON] Disable 'spark.executor.pyspark.me...

2018-11-17 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23055 retest this please

[GitHub] spark issue #23063: [SPARK-26033][PYTHON][TESTS] Break large ml/tests.py fil...

2018-11-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23063 Will merge this one tomorrow if this is not merged till then.

[GitHub] spark issue #23050: [SPARK-26079][sql] Ensure listener event delivery in Str...

2018-11-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23050 Merged to master and branch-2.4.

[GitHub] spark pull request #23048: transform DenseVector x DenseVector sqdist from i...

2018-11-16 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/23048#discussion_r234399421 --- Diff: mllib-local/src/main/scala/org/apache/spark/ml/linalg/Vectors.scala --- @@ -370,14 +370,19 @@ object Vectors { case (v1

[GitHub] spark pull request #23055: [SPARK-26080][PYTHON] Disable 'spark.executor.pys...

2018-11-16 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/23055#discussion_r234398745 --- Diff: core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala --- @@ -74,8 +74,13 @@ private[spark] abstract class BasePythonRunner

[GitHub] spark issue #23059: [SPARK-26091][SQL] Upgrade to 2.3.4 for Hive Metastore C...

2018-11-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23059 Looks good. Adding @gatorsmile and @wangyum
