Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23111
Hey all, I will merge this in a few days if there are no more comments. It's
going to speed up the tests by roughly 12 ~ 15 mins
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23119
LGTM too
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/23080#discussion_r235830894
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/streaming/DataStreamReader.scala
---
@@ -377,6 +377,8 @@ final class DataStreamReader private
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23111
Yea, the improvement looks persistent:
`Tests passed in 1027 seconds`
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23117
It's not urgent :) so it's okay. Actually I'm on vacation for a week as
well. Thanks for taking a look @shaneknapp
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/23098#discussion_r235830558
--- Diff: bin/load-spark-env.cmd ---
@@ -21,37 +21,42 @@ rem This script loads spark-env.cmd if it exists, and
ensures it is only loaded
rem spark
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23109
retest this please
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23111
cc @rxin, @BryanCutler, @squito. This decreases elapsed time (even faster
than before splitting the tests
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23111
Oh? It drastically decreases from, for instance,
```
Tests passed in 1770 seconds
```
to
```
Tests passed in 1171 seconds
```
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23111
retest this please
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23117
retest this please
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/23117#discussion_r235660674
--- Diff: dev/run-tests.py ---
@@ -594,7 +651,18 @@ def main():
modules_with_python_tests = [m for m in test_modules
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23118
Haha, today's not a holiday here :D.
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23109
retest this please
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/23117#discussion_r235644712
--- Diff: dev/run-tests.py ---
@@ -594,7 +651,18 @@ def main():
modules_with_python_tests = [m for m in test_modules
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23111
retest this please
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23117
cc @rxin, @JoshRosen, @shaneknapp, @gatorsmile, @BryanCutler, @holdenk,
@felixcheung, @viirya, @ueshin, @icexelloss
GitHub user HyukjinKwon opened a pull request:
https://github.com/apache/spark/pull/23117
[WIP][SPARK-7721][INFRA] Run and generate test coverage report from Python
via Jenkins
## What changes were proposed in this pull request?
### Background
For the current
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23111
retest this please
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23109
retest this please
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23112
ok to test
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23112
Looks okay to go.
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23109
retest this please
(the test failures don't look persistent)
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23070
Merged to master.
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23080
LGTM except https://github.com/apache/spark/pull/23080/files#r235589426
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/23080#discussion_r235589426
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVOptions.scala
---
@@ -216,8 +232,13 @@ class CSVOptions
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/23080#discussion_r235589448
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVOptions.scala
---
@@ -227,7 +248,10 @@ class CSVOptions
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/22979#discussion_r235588064
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVExprUtils.scala
---
@@ -79,4 +83,22 @@ object CSVExprUtils
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23027
Looks fine. @dongjoon-hyun WDYT?
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23052
There are two more things to deal with:
https://github.com/apache/spark/pull/23052#issuecomment-440687200 comment
will still be valid - at least it should be double checked because
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23052
cc @cloud-fan as well
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23111
retest this please
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/22938#discussion_r235583851
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
---
@@ -240,16 +240,6 @@ class SQLQuerySuite extends QueryTest
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/22938
Sorry for the late response. The change looks good to me in general but I
had one question.
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/22938#discussion_r235583559
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala
---
@@ -1892,7 +1898,7 @@ class JsonSuite extends
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/22938#discussion_r235583349
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala
---
@@ -1892,7 +1898,7 @@ class JsonSuite extends
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/22938#discussion_r235583315
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala
---
@@ -1905,7 +1911,7 @@ class JsonSuite extends
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23111
retest this please
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23113
> race condition explained in
https://issues.apache.org/jira/browse/SPARK-26019
How does the race condition happen? Can you clarify it in the PR description? I think
@viirya's analysis is matc
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/23098#discussion_r235572895
--- Diff: bin/load-spark-env.cmd ---
@@ -21,37 +21,42 @@ rem This script loads spark-env.cmd if it exists, and
ensures it is only loaded
rem spark
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/23098#discussion_r235572737
--- Diff: bin/load-spark-env.cmd ---
@@ -21,37 +21,42 @@ rem This script loads spark-env.cmd if it exists, and
ensures it is only loaded
rem spark
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/23098#discussion_r235572442
--- Diff: bin/load-spark-env.cmd ---
@@ -21,37 +21,42 @@ rem This script loads spark-env.cmd if it exists, and
ensures it is only loaded
rem spark
GitHub user HyukjinKwon opened a pull request:
https://github.com/apache/spark/pull/23111
[DO-NOT-MERGE] Increases default parallelism in PySpark tests
## What changes were proposed in this pull request?
I'm trying to see if increasing parallelism decreases elapsed time
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23078
Thanks. Merged to master.
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/23102#discussion_r235570156
--- Diff: core/src/main/scala/org/apache/spark/deploy/DependencyUtils.scala
---
@@ -65,7 +65,7 @@ private[deploy] object DependencyUtils extends
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23102
@markpavey, can you write a test? I can run the test on Windows via
AppVeyor.
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23055
Sorry, may I ask you to take another look please? I thought it's quite a
straightforward fix by a consistent way of fixing
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23078
@JoshRosen, can you take a look please when you're available? It's quite
an obvious fix.
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23109
retest this please
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23052
Also, Parquet does not always write empty files. It does not write
empty files when data frames are created from emptyRDD (the one pointed out in
the PR link I gave). We should match
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23101
LGTM2
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23052
@MaxGekk I didn't mean to block this PR. Since we're going ahead for 3.0,
it should be good to match and fix the behaviours across data sources. For
instance, CSV should still be able to read
Github user HyukjinKwon commented on the issue:
https://github.com/apache/zeppelin/pull/3206
This fix is not released yet. This PR exactly fixes the problem you faced.
This fix will be available in the next release of Zeppelin.
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/23098#discussion_r235322452
--- Diff: bin/load-spark-env.cmd ---
@@ -21,37 +21,42 @@ rem This script loads spark-env.cmd if it exists, and
ensures it is only loaded
rem spark
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23071
Thank you @MaxGekk.
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/22939
Sure!
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23027
@wangyum, why did you close this?
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/22939
gentle ping, @felixcheung.
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23070
retest this please
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23071
I think we don't need this for now. Let's do this when more `from/to_...`
functions are added later. The amount of code actually increases
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23080
@MaxGekk, let's rebase this one accordingly with encoding support.
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/23080#discussion_r235244407
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVOptions.scala
---
@@ -192,6 +192,20 @@ class CSVOptions
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23085
@mrandrewandrade, let's close this for now.
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23078
@zsxwing can you take a look please when you're available
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23099
retest this please
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23089
Merged to master and branch-2.4.
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23091
Merged to master.
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/23055#discussion_r235226292
--- Diff:
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala ---
@@ -74,8 +74,13 @@ private[spark] abstract class BasePythonRunner
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23085
Because it's already documented. Also, it brings maintenance overhead.
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23087
retest this please
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23004
Looks fine to me
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23085
It's already documented on the official site anyway.
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23085
I think it doesn't need to change. It's likely to be changed and we
wouldn't want to update this doc every time we add a new datasource
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/23055#discussion_r234838984
--- Diff:
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala ---
@@ -74,8 +74,13 @@ private[spark] abstract class BasePythonRunner
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23091
FYI hey @priancho IIRC, you proposed a similar change before in the mailing
list. I wasn't positive about that because I was thinking we should deprecate
`encoding` option at that time. It has
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23087
Looks fine to me.
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/23087#discussion_r234831263
--- Diff: pom.xml ---
@@ -2522,7 +2530,7 @@
com.puppycrawl.tools
checkstyle
-8.2
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/22635
This is fixed in 2.4.0 and your issue happens with 2.3.1 -> 2.3.2. It's not
related.
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/22635
How is it related to the JIRA? It doesn't look quite related from a cursory
look. Please leave some analysis next time, or at least test it before/after
the specific commit. Let me take a look
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23080
Ah, also, `CsvParser.beginParsing` takes an additional argument `Charset`.
It should rather be easily able to support encoding in `multiLine`. @MaxGekk,
would you be able to find some time
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/23080#discussion_r234476318
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVOptions.scala
---
@@ -192,6 +192,20 @@ class CSVOptions
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/23080#discussion_r234475595
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVOptions.scala
---
@@ -192,6 +192,20 @@ class CSVOptions
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/23080#discussion_r234475228
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVOptions.scala
---
@@ -192,6 +192,20 @@ class CSVOptions
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23077
Merged to master.
Thanks for reviewing this, @BryanCutler and @srowen.
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23055
adding @BryanCutler and @ueshin as well FYI.
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23078
cc @BryanCutler.
GitHub user HyukjinKwon opened a pull request:
https://github.com/apache/spark/pull/23078
[SPARK-26106][PYTHON] Prioritizes ML unittests over the doctests in PySpark
## What changes were proposed in this pull request?
Arguably, unittests usually take longer than doctests
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23077
cc @BryanCutler.
BTW, Bryan, do you have some time to work on the `has_numpy` stuff that we
talked about before? I was thinking we should specify the package version and
produce
GitHub user HyukjinKwon opened a pull request:
https://github.com/apache/spark/pull/23077
[SPARK-25344][PYTHON] Clean unittest2 imports up that were added for Python
2.6 before
## What changes were proposed in this pull request?
Currently, some of the PySpark tests still assume
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23063
Merged to master.
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23070
Looks good to me. I or someone else should take a closer look tho.
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23070
ok to test
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23070
add to whitelist
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23064
Merged to master, branch-2.4 and branch-2.3.
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23055
retest this please
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23063
Will merge this one tomorrow if this is not merged till then.
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23050
Merged to master and branch-2.4.
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/23048#discussion_r234399421
--- Diff:
mllib-local/src/main/scala/org/apache/spark/ml/linalg/Vectors.scala ---
@@ -370,14 +370,19 @@ object Vectors {
case (v1
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/23055#discussion_r234398745
--- Diff:
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala ---
@@ -74,8 +74,13 @@ private[spark] abstract class BasePythonRunner
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23059
Looks good. Adding @gatorsmile and @wangyum