[GitHub] spark issue #20909: [SPARK-23776][python][test] Check for needed components/...
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/20909 @HyukjinKwon This PR is mostly obsolete. I will close it and re-open something smaller... maybe a one-line documentation change to handle the missing UDF case for those who build with sbt. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20909 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91612/ Test FAILed.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20909 Build finished. Test FAILed.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20909 **[Test build #91612 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91612/testReport)** for PR 20909 at commit [`db14acb`](https://github.com/apache/spark/commit/db14acbb3a90c9da184fc9c909640e07100c38fa).
* This patch **fails from timeout after a configured wait of `300m`**.
* This patch **does not merge cleanly**.
* This patch adds no public classes.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20909 **[Test build #91612 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91612/testReport)** for PR 20909 at commit [`db14acb`](https://github.com/apache/spark/commit/db14acbb3a90c9da184fc9c909640e07100c38fa).
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20909 ok to test
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20909 Can one of the admins verify this patch?
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20909 cc @viirya too.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20909 Merged build finished. Test PASSed.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20909 **[Test build #88863 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88863/testReport)** for PR 20909 at commit [`db14acb`](https://github.com/apache/spark/commit/db14acbb3a90c9da184fc9c909640e07100c38fa).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20909 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88863/ Test PASSed.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20909 **[Test build #88863 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88863/testReport)** for PR 20909 at commit [`db14acb`](https://github.com/apache/spark/commit/db14acbb3a90c9da184fc9c909640e07100c38fa).
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/20909
> can you investigate and try to explicitly skip some doctests conditionally?

@HyukjinKwon I will take a look to see how that can be done.
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20909 Yeah, I know about the hidden console output, and I believe that's a known issue. I made a similar change before in https://github.com/apache/spark/pull/20487; also see the discussion in https://github.com/apache/spark/pull/20465. The problem is that it needs duplicated changes to print out the warnings, which is why I have been hesitant to fix the related code paths. Actually, I was thinking we should do what `streaming.py` does to skip the doctests, although I haven't looked closely at whether we can control this at the function level yet. I know we use `# doctest: +SKIP` here and there, in particular with Pandas / Arrow. I think we should remove those and test them the same way whenever possible. I am sure about this too (and have told a few committers before that this is what I am thinking). Let me cc @cloud-fan, @ueshin and @BryanCutler FYI. Ideally, can you investigate and try to explicitly skip some doctests conditionally? The console output from our test script can be handled separately, I think (but please leave a comment as a TODO or file a JIRA).
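For illustration, conditional function-level skipping could look roughly like the sketch below. It uses only the standard `doctest` module; the helper names (`module_available`, `run_doctests`), the two example functions, and the module name `hypothetical_optional_dep` are assumptions made up for this example and are not taken from `streaming.py` or any other PySpark file.

```python
import doctest
import importlib.util


def module_available(name):
    """Return True if a top-level module with this name can be imported."""
    return importlib.util.find_spec(name) is not None


def plain_example(x):
    """
    >>> plain_example(1)
    2
    """
    return x + 1


def optional_example(x):
    """
    >>> import hypothetical_optional_dep
    >>> optional_example(2)
    4
    """
    return x * 2


def run_doctests(functions, requirements):
    """Run each function's doctests unless its required module is missing.

    `requirements` maps a function name to the module its doctests need.
    Returns (failed, attempted, skipped_names).
    """
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    skipped = []
    for func in functions:
        required = requirements.get(func.__name__)
        if required is not None and not module_available(required):
            # Skip explicitly and remember the name so it can be reported.
            skipped.append(func.__name__)
            continue
        globs = {func.__name__: func}
        # module=False: do not try to locate the defining module.
        for test in finder.find(func, name=func.__name__,
                                module=False, globs=globs):
            runner.run(test)
    return runner.failures, runner.tries, skipped


failed, attempted, skipped = run_doctests(
    [plain_example, optional_example],
    {"optional_example": "hypothetical_optional_dep"},
)
print("failed=%d attempted=%d skipped=%s" % (failed, attempted, skipped))
```

Unlike a hard-coded `# doctest: +SKIP`, the doctest here still runs whenever its prerequisite is installed, and the list of skipped names can be reported to the user.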
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/20909 @HyukjinKwon That makes sense. Note that when the tests are run using python/run-tests, run-tests.py steals stdout and stderr. I would need to make a small change to run-tests.py to detect when a test is skipped (maybe through the return code) and print the message (from the test's stdout or stderr). One other thing: I checked readwriter.py more closely, and there is only a single docstring test that requires Hive: `>>> spark.read.table('tmpTable').dtypes`. I added `# doctest: +SKIP` to that one line and all the tests passed. Rather than sometimes skipping all readwriter tests, maybe we should just always skip that single test. udf.py, on the other hand, has lots of docstring tests that require the test UDF files.
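A minimal sketch of the return-code idea, assuming a made-up convention in which a child test process exits with a dedicated code when it skips its tests. The value `77`, the helper `run_child`, and the child script are all hypothetical; none of this is taken from run-tests.py.

```python
import subprocess
import sys

# Hypothetical convention: the child exits with this code to signal
# "prerequisites missing, tests skipped". Not a value used by run-tests.py.
SKIPPED_RETCODE = 77

# Stand-in for a test module that finds its prerequisite missing.
CHILD = """
import sys
print("Skipping tests: Hive assembly not found", file=sys.stderr)
sys.exit(%d)
""" % SKIPPED_RETCODE


def run_child(code):
    """Run a child Python process with captured output and classify it."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    if proc.returncode == SKIPPED_RETCODE:
        status = "skipped"
    elif proc.returncode == 0:
        status = "passed"
    else:
        status = "failed"
    # The parent owns the child's output, so it must surface it itself.
    return status, (proc.stdout + proc.stderr).strip()


status, message = run_child(CHILD)
if status == "skipped":
    print("SKIPPED:", message)
```

Because the parent captures both streams, the skip message only reaches the console if the parent prints it, which is the small change described above.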
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20909 @holdenk, I just saw https://issues.apache.org/jira/browse/SPARK-23853. I think this PR could fix it together if I understood correctly :-).
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20909 Yes, I feel sure that's more consistent and correct.
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/20909
> actually have been thinking about skipping and proceeding the tests in pyspark/streaming/tests.py with an explicit message as well. Can we skip and continue the tests?

Hi @HyukjinKwon, I just want to verify your comment: if the Hive assembly is missing, readwriter.py should not fail, but should instead skip running its doctests. Also, in that case, there should be a message indicating that the tests were skipped.
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20909
> I modeled this after pyspark/streaming/tests.py, which checks for prereqs and raises exceptions with a useful message so one can get past the error (although pyspark/streaming/tests.py only checks for its own prereqs, not those required by streaming docstring tests).

I actually have been thinking about skipping and proceeding the tests in `pyspark/streaming/tests.py` with an explicit message as well. Can we skip and continue the tests? I think we basically should just skip the tests.
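As a rough illustration of "skip and continue" with an explicit message, the standard `unittest` module lets a test case skip itself in `setUp` while the rest of the suite keeps running. The class names, the `prereq_available` flag, and the message text below are placeholders invented for this sketch, not code from `pyspark/streaming/tests.py`.

```python
import unittest


class PrereqTestCase(unittest.TestCase):
    # Stand-in for a real prerequisite check (for example, looking for a jar).
    prereq_available = False
    prereq_message = "Skipping: required assembly jar was not found"

    def setUp(self):
        if not self.prereq_available:
            # Reports the test as skipped (with a reason) instead of raising.
            self.skipTest(self.prereq_message)

    def test_needs_prereq(self):
        self.assertTrue(self.prereq_available)


class IndependentTestCase(unittest.TestCase):
    def test_always_runs(self):
        self.assertEqual(1 + 1, 2)


def run_suite():
    """Run both test cases and return the collected TestResult."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite([
        loader.loadTestsFromTestCase(PrereqTestCase),
        loader.loadTestsFromTestCase(IndependentTestCase),
    ])
    result = unittest.TestResult()
    suite.run(result)
    return result


result = run_suite()
for test, reason in result.skipped:
    print("skipped %s: %s" % (test.id(), reason))
```

With this shape a missing prerequisite is recorded as a skip with its reason, and the remaining tests still run.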
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20909 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88736/ Test PASSed.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20909 Merged build finished. Test PASSed.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20909 **[Test build #88736 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88736/testReport)** for PR 20909 at commit [`0f830e2`](https://github.com/apache/spark/commit/0f830e2144da50c8b2a5239a61fa08fae40384e0).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20909 **[Test build #88736 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88736/testReport)** for PR 20909 at commit [`0f830e2`](https://github.com/apache/spark/commit/0f830e2144da50c8b2a5239a61fa08fae40384e0).
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/20909 @HyukjinKwon The HiveSparkSubmitTests error message is [here](https://gist.github.com/bersprockets/746ab6f2353bd5b54404d5cbb1df7403). I propose the following:
- Fix HiveSparkSubmitTests according to @felixcheung's suggestion. After that fix, tests.py won't need the checks.
- Move the Hive assembly check to pyspark.sql.readwriter's _test() function.
- Move the test UDF check to pyspark.sql.udf's _test() function.
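The shape of a prerequisite check inside a module's `_test()` function might look like the following sketch. The jar pattern, the directory argument, and the helper `prerequisite_jars` are hypothetical names chosen for this example; the real readwriter.py and udf.py `_test()` functions are not reproduced here.

```python
import doctest
import glob
import os
import sys
import tempfile
import types

# Hypothetical jar name pattern, chosen only for this sketch.
JAR_PATTERN = "hypothetical-hive-assembly*.jar"


def prerequisite_jars(jar_dir, pattern=JAR_PATTERN):
    """Return the sorted list of jars in jar_dir that match the pattern."""
    return sorted(glob.glob(os.path.join(jar_dir, pattern)))


def _test(module, jar_dir):
    """Run the module's doctests only when the prerequisite jar is present.

    Returns doctest's TestResults, or None when the doctests were skipped.
    """
    if not prerequisite_jars(jar_dir):
        print("Skipping doctests in %s: no %s found under %s"
              % (module.__name__, JAR_PATTERN, jar_dir), file=sys.stderr)
        return None
    return doctest.testmod(module)


def demo():
    """Call _test() once without the jar and once with an empty stand-in."""
    module = types.ModuleType("demo_module")
    module.__doc__ = ">>> 1 + 1\n2\n"
    with tempfile.TemporaryDirectory() as jar_dir:
        without_jar = _test(module, jar_dir)
        jar_path = os.path.join(jar_dir, "hypothetical-hive-assembly-1.0.jar")
        open(jar_path, "w").close()
        with_jar = _test(module, jar_dir)
    return without_jar, with_jar
```

Keeping the check next to the doctests it protects means each module only verifies what its own examples need, rather than relying on another test module to run first.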
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/20909 maybe the approach in hiveContextSQLTests is better for HiveSparkSubmitTests? https://github.com/bersprockets/spark/blob/8a965a51be6190f0db864ca7b1ba37269b3a55bc/python/pyspark/sql/tests.py#L3112
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20909 @bersprockets, do you have the error messages? I could (will) check it by myself in the following week but want to take a quick look if you already have them.
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/20909 Thanks @felixcheung. It turns out HiveSparkSubmitTests will fail if Spark is not built with the hive profile (AssertionError: 0 != 1). In addition, at least one pyspark.sql.readwriter docstring test fails. This PR uses pyspark.sql.tests as the "leader of the pack" (run-tests.py gives it priority 0 amongst the pyspark.sql tests) to check for prerequisites for its own tests as well as the sql docstring tests. The docstring tests can't make these checks. I modeled this after pyspark/streaming/tests.py, which checks for prereqs and raises exceptions with a useful message so one can get past the error (although pyspark/streaming/tests.py only checks for its own prereqs, not those required by streaming docstring tests).
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20909 **[Test build #88606 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88606/testReport)** for PR 20909 at commit [`8a965a5`](https://github.com/apache/spark/commit/8a965a51be6190f0db864ca7b1ba37269b3a55bc).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20909 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88606/ Test PASSed.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20909 Merged build finished. Test PASSed.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20909 **[Test build #88606 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88606/testReport)** for PR 20909 at commit [`8a965a5`](https://github.com/apache/spark/commit/8a965a51be6190f0db864ca7b1ba37269b3a55bc).
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20909 Can one of the admins verify this patch?