[GitHub] spark issue #14861: [SPARK-17287] [PYSPARK] Add recursive kwarg to Python Sp...
Github user jpiper commented on the issue: https://github.com/apache/spark/pull/14861 @holdenk sorry I've been on vacation! I'll fix the typo and merge in the latest master for you later today or tomorrow. Cheers! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15398: [SPARK-17647][SQL] Fix backslash escaping in 'LIKE' patt...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15398 Merged build finished. Test PASSed.
[GitHub] spark issue #15398: [SPARK-17647][SQL] Fix backslash escaping in 'LIKE' patt...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15398 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66561/
[GitHub] spark issue #15398: [SPARK-17647][SQL] Fix backslash escaping in 'LIKE' patt...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15398 **[Test build #66561 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66561/consoleFull)** for PR 15398 at commit [`64df4cf`](https://github.com/apache/spark/commit/64df4cfc730dbc8c8085a414e620036dcbc92f3e).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #15365: [SPARK-17157][SPARKR]: Add multiclass logistic regressio...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15365 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66563/
[GitHub] spark issue #15365: [SPARK-17157][SPARKR]: Add multiclass logistic regressio...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15365 Merged build finished. Test PASSed.
[GitHub] spark issue #15365: [SPARK-17157][SPARKR]: Add multiclass logistic regressio...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15365 **[Test build #66563 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66563/consoleFull)** for PR 15365 at commit [`1921221`](https://github.com/apache/spark/commit/1921221964131b9cf8ba500b2dcf2f2219d3b20c).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #11211: [SPARK-13330][PYSPARK] PYTHONHASHSEED is not propgated t...
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/11211 For what it's worth, it seems like this has maybe caused issues in the past too - https://issues.apache.org/jira/browse/SPARK-12100
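For context on why this matters: Python 3 randomizes string hashing per interpreter unless `PYTHONHASHSEED` is set, so if the setting isn't propagated from the driver to the workers, hash-partitioned operations on string keys can route the same key to different partitions on different executors. A stand-alone sketch of the effect (no Spark needed; it just compares `hash('spark')` across fresh interpreters):

```python
import os
import subprocess
import sys

def string_hash(seed_env):
    """Run a fresh interpreter and return hash('spark') under the given
    PYTHONHASHSEED setting (None means 'unset', i.e. randomized)."""
    env = dict(os.environ)
    env.pop("PYTHONHASHSEED", None)
    if seed_env is not None:
        env["PYTHONHASHSEED"] = seed_env
    out = subprocess.run(
        [sys.executable, "-c", "print(hash('spark'))"],
        env=env, capture_output=True, text=True, check=True,
    )
    return int(out.stdout)

# With the seed pinned, every "worker" interpreter computes the same hash...
pinned = {string_hash("0") for _ in range(3)}
assert len(pinned) == 1

# ...while unseeded interpreters generally disagree, so hash-partitioned
# records would land in different partitions on different workers.
random_hashes = [string_hash(None) for _ in range(5)]
print(len(pinned), len(set(random_hashes)))
```

This is why the environment variable has to reach the worker Python processes, not just the driver.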
[GitHub] spark issue #15375: [SPARK-17790] Support for parallelizing R data.frame lar...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15375 **[Test build #66565 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66565/consoleFull)** for PR 15375 at commit [`766d903`](https://github.com/apache/spark/commit/766d9031f9b4629acb56bde73ab604f14eb8c6e0).
[GitHub] spark issue #15354: [SPARK-17764][SQL] Add `to_json` supporting to convert n...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15354 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66558/
[GitHub] spark issue #15354: [SPARK-17764][SQL] Add `to_json` supporting to convert n...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15354 Merged build finished. Test PASSed.
[GitHub] spark issue #15354: [SPARK-17764][SQL] Add `to_json` supporting to convert n...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15354 **[Test build #66558 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66558/consoleFull)** for PR 15354 at commit [`ecdac76`](https://github.com/apache/spark/commit/ecdac7640194e82ddde222572275ad7987e2bc65).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #15297: [WIP][SPARK-9862]Handling data skew
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15297 **[Test build #66564 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66564/consoleFull)** for PR 15297 at commit [`5786e22`](https://github.com/apache/spark/commit/5786e22132c9e80f03ce45e928d1f2de962cdd16).
* This patch **fails Scala style tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #15297: [WIP][SPARK-9862]Handling data skew
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15297 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66564/
[GitHub] spark issue #15297: [WIP][SPARK-9862]Handling data skew
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15297 Merged build finished. Test FAILed.
[GitHub] spark issue #15297: [WIP][SPARK-9862]Handling data skew
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15297 **[Test build #66564 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66564/consoleFull)** for PR 15297 at commit [`5786e22`](https://github.com/apache/spark/commit/5786e22132c9e80f03ce45e928d1f2de962cdd16).
[GitHub] spark issue #15365: [SPARK-17157][SPARKR]: Add multiclass logistic regressio...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15365 **[Test build #66563 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66563/consoleFull)** for PR 15365 at commit [`1921221`](https://github.com/apache/spark/commit/1921221964131b9cf8ba500b2dcf2f2219d3b20c).
[GitHub] spark issue #15297: [WIP][SPARK-9862]Handling data skew
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/15297 ok to test
[GitHub] spark issue #15400: [SPARK-11272] [Web UI] Add support for downloading event...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15400 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66554/
[GitHub] spark issue #15400: [SPARK-11272] [Web UI] Add support for downloading event...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15400 Merged build finished. Test PASSed.
[GitHub] spark issue #15400: [SPARK-11272] [Web UI] Add support for downloading event...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15400 **[Test build #66554 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66554/consoleFull)** for PR 15400 at commit [`b02e25d`](https://github.com/apache/spark/commit/b02e25d8c16e7f6cf4ce41ee9d4c526d6dc5f902).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #15314: [SPARK-17747][ML] WeightCol support non-double datatypes
Github user zhengruifeng commented on the issue: https://github.com/apache/spark/pull/15314 @sethah @srowen I think we can first work on this PR to support numeric types for weightCol and, along the way, fix the bug in LabelCol. Then we can open another JIRA to discuss whether or not to cast weightCol and labelCol in `Predictor`. I agree with sethah's inclination that casting should be a common method, so each algo only needs to deal with `DoubleType`. What's your opinion?
[GitHub] spark issue #15400: [SPARK-11272] [Web UI] Add support for downloading event...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15400 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66553/
[GitHub] spark issue #15400: [SPARK-11272] [Web UI] Add support for downloading event...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15400 Merged build finished. Test PASSed.
[GitHub] spark issue #15400: [SPARK-11272] [Web UI] Add support for downloading event...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15400 **[Test build #66553 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66553/consoleFull)** for PR 15400 at commit [`ef1f82b`](https://github.com/apache/spark/commit/ef1f82b7609a241e57e670d505e5a7b982b55960).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #8318: [SPARK-1267][PYSPARK] Adds pip installer for pyspark
Github user rgbkrk commented on the issue: https://github.com/apache/spark/pull/8318

> How will it work if users want to run a different version of PySpark from a different version of Spark (maybe something they installed locally)? How can they easily swap that out? We don't want this making it harder to use Spark against a real cluster because the version you got from pip is wrong.

They have to deal with normal Python packaging semantics. Right now, _not_ making it pip installable and importable actually makes it harder for us. We then rely on [findspark](https://github.com/minrk/findspark) to resolve the package (plus some amount of ritual to start the JVM...) In case you're wondering, yes I use Spark against a real live large cluster and so do users I support.

> Can we make an account that's shared by all the committers somehow?

You can. However, it's easier to give access rights to each individual on PyPI.

> Can we sign releases?

Yes, you can GPG sign them.

> In particular, does anyone have examples of other ASF projects that publish to PyPI?

[libcloud](https://libcloud.apache.org/)
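For readers unfamiliar with the "ritual" mentioned above: without a pip-installable pyspark, users locate an existing Spark install and graft its bundled Python sources onto `sys.path` by hand, which is roughly what findspark automates. A hedged sketch of that manual wiring (the `/opt/spark` path is hypothetical, and the py4j zip name varies by release):

```python
import glob
import os
import sys

def add_pyspark_to_path(spark_home):
    """Roughly what findspark does: put Spark's bundled Python sources and
    the matching py4j zip on sys.path so `import pyspark` can resolve."""
    python_dir = os.path.join(spark_home, "python")
    # The py4j version is baked into the zip filename and changes per release.
    py4j_zips = glob.glob(os.path.join(python_dir, "lib", "py4j-*-src.zip"))
    paths = [python_dir] + py4j_zips
    for p in paths:
        if p not in sys.path:
            sys.path.insert(0, p)
    os.environ.setdefault("SPARK_HOME", spark_home)
    return paths

# Example (path is an assumption; substitute your actual install):
paths = add_pyspark_to_path("/opt/spark")
print(paths[0])
```

A pip-installed pyspark makes all of this unnecessary, since the package lands directly on the interpreter's import path.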
[GitHub] spark issue #15401: [SPARK-17782][STREAMING][KAFKA] alternative eliminate ra...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15401 Merged build finished. Test PASSed.
[GitHub] spark issue #15401: [SPARK-17782][STREAMING][KAFKA] alternative eliminate ra...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15401 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66562/
[GitHub] spark issue #15401: [SPARK-17782][STREAMING][KAFKA] alternative eliminate ra...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15401 **[Test build #66562 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66562/consoleFull)** for PR 15401 at commit [`eae5ba1`](https://github.com/apache/spark/commit/eae5ba14b0ddf68dd77dd5f5ab3eaff73643f9fc).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #15377: [SPARK-17802] Improved caller context logging.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15377 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66556/
[GitHub] spark issue #15377: [SPARK-17802] Improved caller context logging.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15377 Merged build finished. Test FAILed.
[GitHub] spark issue #15377: [SPARK-17802] Improved caller context logging.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15377 **[Test build #66556 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66556/consoleFull)** for PR 15377 at commit [`dc6951d`](https://github.com/apache/spark/commit/dc6951d925305419412c9f769aa423006028dc2b).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #15399: [SPARK-17819][SQL] Support default database in connectio...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15399 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66560/
[GitHub] spark issue #15399: [SPARK-17819][SQL] Support default database in connectio...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15399 **[Test build #66560 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66560/consoleFull)** for PR 15399 at commit [`a832760`](https://github.com/apache/spark/commit/a8327603a21dd0d3a49f2d689ea9178ce892e7a8).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #15399: [SPARK-17819][SQL] Support default database in connectio...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15399 Merged build finished. Test PASSed.
[GitHub] spark issue #15401: [SPARK-17782][STREAMING][KAFKA] alternative eliminate ra...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15401 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66559/
[GitHub] spark issue #15401: [SPARK-17782][STREAMING][KAFKA] alternative eliminate ra...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15401 Merged build finished. Test PASSed.
[GitHub] spark issue #15401: [SPARK-17782][STREAMING][KAFKA] alternative eliminate ra...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15401 **[Test build #66559 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66559/consoleFull)** for PR 15401 at commit [`143bf12`](https://github.com/apache/spark/commit/143bf12ba2f3826c982a5f6e83a8986ddaea0b93).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #15346: [SPARK-17741][SQL] Grammar to parse top level and nested...
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/15346 @hvanhovell Could you please look at this? Thank you!
[GitHub] spark issue #15401: [SPARK-17782][STREAMING][KAFKA] alternative eliminate ra...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15401 **[Test build #66562 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66562/consoleFull)** for PR 15401 at commit [`eae5ba1`](https://github.com/apache/spark/commit/eae5ba14b0ddf68dd77dd5f5ab3eaff73643f9fc).
[GitHub] spark issue #15398: [SPARK-17647][SQL] Fix backslash escaping in 'LIKE' patt...
Github user jodersky commented on the issue: https://github.com/apache/spark/pull/15398 Every non-java-pattern character is now quoted; the StringUtils test suite has been updated accordingly.
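The approach described in the comment above — escaping every character that is not a SQL wildcard before handing the LIKE pattern to the regex engine — can be sketched in standalone Python, with `re.escape` playing the role of Java's `Pattern.quote`. This is an illustrative toy, not Spark's actual StringUtils code; the function names are made up for this sketch:

```python
import re

def like_pattern_to_regex(pattern: str) -> str:
    """Translate a SQL LIKE pattern into a regex string.

    '\\' escapes the next character so it matches literally (e.g. '\\%'),
    '%' matches any sequence of characters, '_' matches exactly one
    character, and every other character is quoted so that regex
    metacharacters in the input (like '.') stay literal.
    """
    out = []
    chars = iter(pattern)
    for c in chars:
        if c == "\\":
            nxt = next(chars, None)
            if nxt is None:
                raise ValueError("LIKE pattern ends with a dangling escape")
            out.append(re.escape(nxt))  # escaped wildcard becomes a literal
        elif c == "%":
            out.append(".*")
        elif c == "_":
            out.append(".")
        else:
            out.append(re.escape(c))    # quote all non-wildcard characters
    return "".join(out)

def sql_like(value: str, pattern: str) -> bool:
    """Evaluate `value LIKE pattern` using the translated regex."""
    return re.fullmatch(like_pattern_to_regex(pattern), value, flags=re.DOTALL) is not None
```

Quoting every non-wildcard character is exactly what prevents the class of bug this PR targets: a `.` or `\` in the LIKE pattern must match itself, not act as a regex metacharacter.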
[GitHub] spark issue #15401: [SPARK-17782][STREAMING][KAFKA] alternative eliminate ra...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15401 **[Test build #66559 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66559/consoleFull)** for PR 15401 at commit [`143bf12`](https://github.com/apache/spark/commit/143bf12ba2f3826c982a5f6e83a8986ddaea0b93).
[GitHub] spark issue #15387: [SPARK-17782][STREAMING][KAFKA] eliminate race condition...
Github user koeninger commented on the issue: https://github.com/apache/spark/pull/15387 Let me know if you guys like that alternative PR better
[GitHub] spark issue #15399: [SPARK-17819][SQL] Support default database in connectio...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15399 **[Test build #66560 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66560/consoleFull)** for PR 15399 at commit [`a832760`](https://github.com/apache/spark/commit/a8327603a21dd0d3a49f2d689ea9178ce892e7a8).
[GitHub] spark issue #15398: [SPARK-17647][SQL] Fix backslash escaping in 'LIKE' patt...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15398 **[Test build #66561 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66561/consoleFull)** for PR 15398 at commit [`64df4cf`](https://github.com/apache/spark/commit/64df4cfc730dbc8c8085a414e620036dcbc92f3e).
[GitHub] spark pull request #15401: [SPARK-17782][STREAMING][KAFKA] alternative elimi...
GitHub user koeninger opened a pull request: https://github.com/apache/spark/pull/15401 [SPARK-17782][STREAMING][KAFKA] alternative eliminate race condition of poll twice ## What changes were proposed in this pull request? Alternative approach to https://github.com/apache/spark/pull/15387 You can merge this pull request into a Git repository by running: $ git pull https://github.com/koeninger/spark-1 SPARK-17782-alt Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/15401.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #15401
commit 1fc5863db88cac9dfd0be09318c4ca8779a51682 Author: cody koeninger Date: 2016-10-07T01:08:01Z [SPARK-17782][STREAMING][KAFKA] eliminate race condition of poll being called twice and moving position
commit aca55de0624f5634acb04f91636dce79af875fab Author: cody koeninger Date: 2016-10-07T01:20:43Z [SPARK-17782][STREAMING][KAFKA] whitespace fix
commit 143bf12ba2f3826c982a5f6e83a8986ddaea0b93 Author: cody koeninger Date: 2016-10-08T02:55:44Z [SPARK-17782][STREAMING][KAFKA] alternative to fixing poll(0) returning messages
[GitHub] spark issue #15399: [SPARK-17819][SQL] Support default database in connectio...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15399 Merged build finished. Test FAILed.
[GitHub] spark issue #15399: [SPARK-17819][SQL] Support default database in connectio...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15399 **[Test build #66557 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66557/consoleFull)** for PR 15399 at commit [`26f4b25`](https://github.com/apache/spark/commit/26f4b2519b8cd3cc70abbb255605b0149ddcd428). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15399: [SPARK-17819][SQL] Support default database in connectio...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15399 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66557/ Test FAILed.
[GitHub] spark issue #15398: [SPARK-17647][SQL] Fix backslash escaping in 'LIKE' patt...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15398 Merged build finished. Test FAILed.
[GitHub] spark issue #15398: [SPARK-17647][SQL] Fix backslash escaping in 'LIKE' patt...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15398 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66549/ Test FAILed.
[GitHub] spark issue #15398: [SPARK-17647][SQL] Fix backslash escaping in 'LIKE' patt...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15398 **[Test build #66549 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66549/consoleFull)** for PR 15398 at commit [`c76fad3`](https://github.com/apache/spark/commit/c76fad30a5f41b8da8a6a90d568bf81d3fff84f8). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15398: [SPARK-17647][SQL] Fix backslash escaping in 'LIKE' patt...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15398 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66552/ Test FAILed.
[GitHub] spark issue #15398: [SPARK-17647][SQL] Fix backslash escaping in 'LIKE' patt...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15398 Merged build finished. Test FAILed.
[GitHub] spark issue #15398: [SPARK-17647][SQL] Fix backslash escaping in 'LIKE' patt...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15398 **[Test build #66552 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66552/consoleFull)** for PR 15398 at commit [`0610dc6`](https://github.com/apache/spark/commit/0610dc66060f0338ce4ee1ff2f000423b1365cda). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15389: [SPARK-17817][PySpark] PySpark RDD Repartitioning Result...
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/15389 This looks good to me. One alternative is that we could try to fix it by doing better shuffling of the batched chunks, but that wouldn't work well for increasing the number of partitions. Maybe @davies can take a look and see if there is anything else that needs to be done?
[GitHub] spark pull request #15389: [SPARK-17817][PySpark] PySpark RDD Repartitioning...
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/15389#discussion_r82493650 --- Diff: python/pyspark/rdd.py --- @@ -2029,7 +2030,11 @@ def coalesce(self, numPartitions, shuffle=False): >>> sc.parallelize([1, 2, 3, 4, 5], 3).coalesce(1).glom().collect() [[1, 2, 3, 4, 5]] """
-        jrdd = self._jrdd.coalesce(numPartitions, shuffle)
+        if shuffle:
+            data_java_rdd = self._to_java_object_rdd().coalesce(numPartitions, shuffle)
+            jrdd = self.ctx._jvm.SerDeUtil.javaToPython(data_java_rdd)
--- End diff -- Yah that seems close enough we don't need to worry (and for the big cases presumably the impact of having better balanced partitions is well worth the slight overhead).
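Background for the diff above: PySpark pickles RDD elements in batches, so a shuffle that treats each serialized batch as one unit moves whole chunks around and can leave partitions badly unbalanced; routing through a JVM object RDD (as the patch does) shuffles individual elements instead. A standalone Python sketch of the difference — the batch size, element count, and partition count here are made-up illustration values, not anything from PySpark:

```python
import random

def shuffle_sizes(units, num_partitions, rng):
    """Assign each unit (a single element, or a whole batch of elements)
    to a random partition; return the element count per partition."""
    sizes = [0] * num_partitions
    for unit in units:
        weight = len(unit) if isinstance(unit, list) else 1
        sizes[rng.randrange(num_partitions)] += weight
    return sizes

rng = random.Random(0)
elements = list(range(1000))
# 10 "pickled batches" of 100 elements each (illustrative batch size).
batches = [elements[i:i + 100] for i in range(0, 1000, 100)]

by_element = shuffle_sizes(elements, 8, rng)  # element-level shuffle
by_batch = shuffle_sizes(batches, 8, rng)     # batch-level shuffle

# Batch-level shuffling moves 100-element chunks, so each partition's
# size is a multiple of 100 and the imbalance is much larger.
print("per-element spread:", max(by_element) - min(by_element))
print("per-batch spread:", max(by_batch) - min(by_batch))
```

With only 10 batches spread over 8 partitions, the batch-level shuffle is guaranteed a spread of at least 100 elements, which is the imbalance the PR's element-level approach avoids.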
[GitHub] spark pull request #15365: [SPARK-17157][SPARKR]: Add multiclass logistic re...
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/15365#discussion_r82493602 --- Diff: R/pkg/R/mllib.R --- @@ -647,6 +654,195 @@ setMethod("predict", signature(object = "KMeansModel"), predict_internal(object, newData) }) +#' Logistic Regression Model +#' +#' Fits an logistic regression model against a Spark DataFrame. It supports "binomial": Binary logistic regression +#' with pivoting; "multinomial": Multinomial logistic (softmax) regression without pivoting, similar to glmnet. +#' Users can print, make predictions on the produced model and save the model to the input path. +#' +#' @param data SparkDataFrame for training +#' @param formula A symbolic description of the model to be fitted. Currently only a few formula +#'operators are supported, including '~', '.', ':', '+', and '-'. +#' @param regParam the regularization parameter. Default is 0.0. +#' @param elasticNetParam the ElasticNet mixing parameter. For alpha = 0, the penalty is an L2 penalty. +#'For alpha = 1, it is an L1 penalty. For 0 < alpha < 1, the penalty is a combination +#'of L1 and L2. Default is 0.0 which is an L2 penalty. +#' @param maxIter maximum iteration number. +#' @param tol convergence tolerance of iterations. +#' @param fitIntercept whether to fit an intercept term. Default is TRUE. +#' @param family the name of family which is a description of the label distribution to be used in the model. +#' Supported options: +#' - "auto": Automatically select the family based on the number of classes: +#' If numClasses == 1 || numClasses == 2, set to "binomial". +#' Else, set to "multinomial". +#' - "binomial": Binary logistic regression with pivoting. +#' - "multinomial": Multinomial logistic (softmax) regression without pivoting. +#' Default is "auto". +#' @param standardization whether to standardize the training features before fitting the model. 
The coefficients +#'of models will be always returned on the original scale, so it will be transparent for +#'users. Note that with/without standardization, the models should be always converged +#'to the same solution when no regularization is applied. Default is TRUE, same as glmnet. +#' @param threshold in binary classification, in range [0, 1]. If the estimated probability of class label 1 +#' is > threshold, then predict 1, else 0. A high threshold encourages the model to predict 0 +#' more often; a low threshold encourages the model to predict 1 more often. Note: Setting this with +#' threshold p is equivalent to setting thresholds (Array(1-p, p)). When threshold is set, any user-set +#' value for thresholds will be cleared. If both threshold and thresholds are set, then they must be +#' equivalent. Default is 0.5. +#' @param thresholds in multiclass (or binary) classification to adjust the probability of predicting each class. +#' Array must have length equal to the number of classes, with values > 0, excepting that at most one +#' value may be 0. The class with largest value p/t is predicted, where p is the original probability +#' of that class and t is the class's threshold. Note: When thresholds is set, any user-set +#' value for threshold will be cleared. If both threshold and thresholds are set, then they must be +#' equivalent. Default is NULL. +#' @param weightCol The weight column name. +#' @param aggregationDepth depth for treeAggregate (>= 2). If the dimensions of features or the number of partitions +#' are large, this param could be adjusted to a larger size. Default is 2. +#' @param ... additional arguments passed to the method. 
+#' @return \code{spark.logit} returns a fitted logistic regression model +#' @rdname spark.logit +#' @aliases spark.logit,SparkDataFrame,formula-method +#' @name spark.logit +#' @export +#' @examples +#' \dontrun{ +#' sparkR.session() +#' # binary logistic regression +#' label <- c(1.0, 1.0, 1.0, 0.0, 0.0) +#' feature <- c(1.1419053, 0.9194079, -0.9498666, -1.1069903, 0.2809776) +#' binary_data <- as.data.frame(cbind(label, feature)) +#' binary_df <- suppressWarnings(createDataFrame(binary_data)) --- End diff -- I see. Thanks for your explanation!
[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14897#discussion_r82492217 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala --- @@ -453,7 +534,11 @@ class SessionCatalog( val db = formatDatabaseName(name.database.getOrElse(currentDb)) val table = formatTableName(name.table) val relationAlias = alias.getOrElse(table) - if (name.database.isDefined || !tempTables.contains(table)) { + if (db == globalTempViewManager.database) { +globalTempViewManager.get(table).map { viewDef => + SubqueryAlias(relationAlias, viewDef, Some(name)) +}.getOrElse(throw new NoSuchTableException(db, table)) --- End diff -- In the `else` branch, we are just doing `SubqueryAlias(relationAlias, tempTables(table), Option(name))`. Later we should also make this branch and that branch consistent (by throwing NoSuchTableException).
[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14897#discussion_r82493489 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SharedState.scala --- @@ -37,39 +37,14 @@ import org.apache.spark.util.{MutableURLClassLoader, Utils} */ private[sql] class SharedState(val sparkContext: SparkContext) extends Logging { - /** - * Class for caching query results reused in future executions. - */ - val cacheManager: CacheManager = new CacheManager - - /** - * A listener for SQL-specific [[org.apache.spark.scheduler.SparkListenerEvent]]s. - */ - val listener: SQLListener = createListenerAndUI(sparkContext) - + // Load hive-site.xml into hadoopConf and determine the warehouse path we want to use, based on + // the config from both hive and Spark SQL. Finally set the warehouse config value to sparkConf. { val configFile = Utils.getContextOrSparkClassLoader.getResource("hive-site.xml") if (configFile != null) { sparkContext.hadoopConfiguration.addResource(configFile) } - } - - /** - * A catalog that interacts with external systems. - */ - lazy val externalCatalog: ExternalCatalog = -SharedState.reflect[ExternalCatalog, SparkConf, Configuration]( - SharedState.externalCatalogClassName(sparkContext.conf), - sparkContext.conf, - sparkContext.hadoopConfiguration) - - /** - * A classloader used to load all user-added jar. - */ - val jarClassLoader = new NonClosableMutableURLClassLoader( -org.apache.spark.util.Utils.getContextOrSparkClassLoader) - { // Set the Hive metastore warehouse path to the one we use val tempConf = new SQLConf --- End diff -- If this block does not use any `val` defined in this class, it is fine to move it.
[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14897#discussion_r82493266 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -2433,31 +2433,65 @@ class Dataset[T] private[sql]( } /** - * Creates a temporary view using the given name. The lifetime of this + * Creates a local temporary view using the given name. The lifetime of this * temporary view is tied to the [[SparkSession]] that was used to create this Dataset. * + * Local temporary view is session-scoped. Its lifetime is the lifetime of the session that + * created it, i.e. it will be automatically dropped when the session terminates. It's not + * tied to any databases, i.e. we can't use `db1.view1` to reference a local temporary view. + * * @throws AnalysisException if the view name already exists * * @group basic * @since 2.0.0 */ @throws[AnalysisException] def createTempView(viewName: String): Unit = withPlan { -createViewCommand(viewName, replace = false) +createTempViewCommand(viewName, replace = false, global = false) } + + /** - * Creates a temporary view using the given name. The lifetime of this + * Creates a local temporary view using the given name. The lifetime of this * temporary view is tied to the [[SparkSession]] that was used to create this Dataset. * * @group basic * @since 2.0.0 */ def createOrReplaceTempView(viewName: String): Unit = withPlan { -createViewCommand(viewName, replace = true) +createTempViewCommand(viewName, replace = true, global = false) } - private def createViewCommand(viewName: String, replace: Boolean): CreateViewCommand = { + /** + * Creates a global temporary view using the given name. The lifetime of this + * temporary view is tied to this Spark application. + * + * Global temporary view is cross-session. Its lifetime is the lifetime of the Spark application, + * i.e. it will be automatically dropped when the application terminates. 
It's tied to a system + preserved database `_global_temp`, and we must use the qualified name to refer a global temp + view, e.g. `SELECT * FROM _global_temp.view1`. + * + * @throws TempTableAlreadyExistsException if the view name already exists + * + * @group basic + * @since 2.1.0 + */ + @throws[AnalysisException] + def createGlobalTempView(viewName: String): Unit = withPlan { +createTempViewCommand(viewName, replace = false, global = true) --- End diff -- What is the behavior of `createTempView("a.b")`?
[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14897#discussion_r82493366 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -183,17 +183,19 @@ case class DropTableCommand( override def run(sparkSession: SparkSession): Seq[Row] = { val catalog = sparkSession.sessionState.catalog -// If the command DROP VIEW is to drop a table or DROP TABLE is to drop a view -// issue an exception. -catalog.getTableMetadataOption(tableName).map(_.tableType match { - case CatalogTableType.VIEW if !isView => -throw new AnalysisException( - "Cannot drop a view with DROP TABLE. Please use DROP VIEW instead") - case o if o != CatalogTableType.VIEW && isView => -throw new AnalysisException( - s"Cannot drop a table with DROP VIEW. Please use DROP TABLE instead") - case _ => -}) +if (tableName.database.forall(catalog.databaseExists) && catalog.tableExists(tableName)) { --- End diff -- What will happen if the db is global_temp?
[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14897#discussion_r82279784 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/GlobalTempViewManager.scala --- @@ -0,0 +1,121 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.catalyst.catalog + +import javax.annotation.concurrent.GuardedBy + +import scala.collection.mutable + +import org.apache.spark.sql.AnalysisException +import org.apache.spark.sql.catalyst.analysis.TempTableAlreadyExistsException +import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan +import org.apache.spark.sql.catalyst.util.StringUtils + + +/** + * A thread-safe manager for global temporary views, providing atomic operations to manage them, + * e.g. create, update, remove, etc. + * + * Note that, the view name is always case-sensitive here, callers are responsible to format the + * view name w.r.t. case-sensitive config. + * + * @param database The system preserved virtual database that keeps all the global temporary views. + */ +class GlobalTempViewManager(val database: String) { + + /** List of view definitions, mapping from view name to logical plan. 
*/ + @GuardedBy("this") + private val viewDefinitions = new mutable.HashMap[String, LogicalPlan] + + /** + * Returns the global view definition which matches the given name, or None if not found. + */ + def get(name: String): Option[LogicalPlan] = synchronized { +viewDefinitions.get(name) + } + + /** + * Creates a global temp view, or issue an exception if the view already exists and + * `overrideIfExists` is false. + */ + def create( + name: String, + viewDefinition: LogicalPlan, + overrideIfExists: Boolean): Unit = synchronized { +if (!overrideIfExists && viewDefinitions.contains(name)) { + throw new TempTableAlreadyExistsException(name) +} +viewDefinitions.put(name, viewDefinition) + } + + /** + * Updates the global temp view if it exists, returns true if updated, false otherwise. + */ + def update( + name: String, + viewDefinition: LogicalPlan): Boolean = synchronized { +if (viewDefinitions.contains(name)) { + viewDefinitions.put(name, viewDefinition) + true +} else { + false +} + } + + /** + * Removes the global temp view if it exists, returns true if removed, false otherwise. + */ + def remove(name: String): Boolean = synchronized { +viewDefinitions.remove(name).isDefined + } + + /** + * Renames the global temp view if the source view exists and the destination view not exists, or + * issue an exception if the source view exists but the destination view already exists. Returns + * true if renamed, false otherwise. 
+ */ + def rename(oldName: String, newName: String): Boolean = synchronized { +if (viewDefinitions.contains(oldName)) { + if (viewDefinitions.contains(newName)) { +throw new AnalysisException( + s"rename temporary view from '$oldName' to '$newName': destination view already exists") + } + + val viewDefinition = viewDefinitions(oldName) + viewDefinitions.remove(oldName) + viewDefinitions.put(newName, viewDefinition) + true +} else { + false +} --- End diff -- What is the reason that failing to rename has two behaviors (when the source does not exist, we return false; but when the destination already exists, we throw an error)?
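The rename semantics questioned above — a missing source is reported via the return value, while an existing destination raises, since overwriting it would silently destroy a view — can be sketched as a minimal thread-safe registry. This is a toy Python analogue of the Scala GlobalTempViewManager in the diff, not Spark code:

```python
import threading

class TempViewRegistry:
    """Toy analogue of Spark's GlobalTempViewManager: a thread-safe
    map from view name to view definition with atomic operations."""

    def __init__(self, database="_global_temp"):
        self.database = database
        self._lock = threading.Lock()
        self._views = {}

    def create(self, name, definition, override_if_exists=False):
        with self._lock:
            if not override_if_exists and name in self._views:
                raise ValueError(f"Temporary view '{name}' already exists")
            self._views[name] = definition

    def get(self, name):
        with self._lock:
            return self._views.get(name)  # None if not found

    def remove(self, name):
        with self._lock:
            return self._views.pop(name, None) is not None

    def rename(self, old, new):
        with self._lock:
            # Missing source: benign, signalled by the return value.
            if old not in self._views:
                return False
            # Existing destination: would clobber another view, so raise.
            if new in self._views:
                raise ValueError(
                    f"rename from '{old}' to '{new}': destination view already exists")
            self._views[new] = self._views.pop(old)
            return True
```

Holding one lock across the check-then-act in `create` and `rename` is what makes each operation atomic, mirroring the `synchronized` blocks in the Scala version.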
[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r82299965

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala ---
@@ -188,6 +196,11 @@ class SessionCatalog(
   def setCurrentDatabase(db: String): Unit = {
     val dbName = formatDatabaseName(db)
+    if (dbName == globalTempViewManager.database) {
+      throw new AnalysisException(
+        s"${globalTempViewManager.database} is a system preserved database, " +
+          "you cannot use it as current database.")
--- End diff --

Seems it will be useful to let users know how to access temp views under this namespace.
[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r82493491

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SharedState.scala ---
@@ -37,39 +37,14 @@ import org.apache.spark.util.{MutableURLClassLoader, Utils}
  */
 private[sql] class SharedState(val sparkContext: SparkContext) extends Logging {

-  /**
-   * Class for caching query results reused in future executions.
-   */
-  val cacheManager: CacheManager = new CacheManager
-
-  /**
-   * A listener for SQL-specific [[org.apache.spark.scheduler.SparkListenerEvent]]s.
-   */
-  val listener: SQLListener = createListenerAndUI(sparkContext)
-
+  // Load hive-site.xml into hadoopConf and determine the warehouse path we want to use, based on
+  // the config from both hive and Spark SQL. Finally set the warehouse config value to sparkConf.
   {
     val configFile = Utils.getContextOrSparkClassLoader.getResource("hive-site.xml")
     if (configFile != null) {
       sparkContext.hadoopConfiguration.addResource(configFile)
     }
-  }
-
-  /**
-   * A catalog that interacts with external systems.
-   */
-  lazy val externalCatalog: ExternalCatalog =
-    SharedState.reflect[ExternalCatalog, SparkConf, Configuration](
-      SharedState.externalCatalogClassName(sparkContext.conf),
-      sparkContext.conf,
-      sparkContext.hadoopConfiguration)
-
-  /**
-   * A classloader used to load all user-added jar.
-   */
-  val jarClassLoader = new NonClosableMutableURLClassLoader(
-    org.apache.spark.util.Utils.getContextOrSparkClassLoader)
-
   { // Set the Hive metastore warehouse path to the one we use
     val tempConf = new SQLConf
--- End diff --

Can you double check?
[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r82493318

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala ---
@@ -380,6 +380,7 @@ class SparkSqlAstBuilder(conf: SQLConf) extends AstBuilder {
       tableIdent = visitTableIdentifier(ctx.tableIdentifier()),
       userSpecifiedSchema = Option(ctx.colTypeList()).map(createStructType),
       replace = ctx.REPLACE != null,
+      global = ctx.GLOBAL != null,
--- End diff --

Can we add a test for this?
[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r82493591

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SharedState.scala ---
@@ -94,6 +69,47 @@ private[sql] class SharedState(val sparkContext: SparkContext) extends Logging {
   }

   /**
+   * Class for caching query results reused in future executions.
+   */
+  val cacheManager: CacheManager = new CacheManager
+
+  /**
+   * A listener for SQL-specific [[org.apache.spark.scheduler.SparkListenerEvent]]s.
+   */
+  val listener: SQLListener = createListenerAndUI(sparkContext)
+
+  /**
+   * A catalog that interacts with external systems.
+   */
+  val externalCatalog: ExternalCatalog =
+    SharedState.reflect[ExternalCatalog, SparkConf, Configuration](
+      SharedState.externalCatalogClassName(sparkContext.conf),
+      sparkContext.conf,
+      sparkContext.hadoopConfiguration)
+
+  /**
+   * A manager for global temporary views.
+   */
+  val globalTempViewManager = {
+    // System preserved database should not exists in metastore. However it's hard to guarantee it
+    // for every session, because case-sensitivity differs. Here we always lowercase it to make our
+    // life easier.
+    val globalTempDB = sparkContext.conf.get(GLOBAL_TEMP_DATABASE).toLowerCase
+    if (externalCatalog.databaseExists(globalTempDB)) {
+      throw new SparkException(
+        s"$globalTempDB is a system preserved database, please rename your existing database " +
+          "to resolve the name conflict and launch your Spark application again.")
--- End diff --

Should we also mention this conf?
[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r82493274

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/QueryExecution.scala ---
@@ -125,6 +124,9 @@ class QueryExecution(val sparkSession: SparkSession, val logical: LogicalPlan) {
             .mkString("\t")
         }
       }
+    // SHOW TABLES in Hive only output table names, while ours outputs database, table name, isTemp.
+    case command: ExecutedCommandExec if command.cmd.isInstanceOf[ShowTablesCommand] =>
+      command.executeCollect().map(_.getString(1))
--- End diff --

Why do we only hit this in this PR?
[GitHub] spark issue #15365: [SPARK-17157][SPARKR]: Add multiclass logistic regressio...
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/15365 I just pick the name for simplicity. Hope to receive feedback from the community and I can make changes accordingly. Thanks!
[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r82493549

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/GlobalTempViewSuite.scala ---
@@ -0,0 +1,168 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution
+
+import org.apache.spark.sql.{AnalysisException, QueryTest, Row}
+import org.apache.spark.sql.catalog.Table
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.catalyst.analysis.NoSuchTableException
+import org.apache.spark.sql.test.SharedSQLContext
+import org.apache.spark.sql.types.StructType
+
+class GlobalTempViewSuite extends QueryTest with SharedSQLContext {
+  import testImplicits._
+
+  override protected def beforeAll(): Unit = {
+    super.beforeAll()
+    globalTempDB = spark.sharedState.globalTempViewManager.database
+  }
+
+  private var globalTempDB: String = _
+
+  test("basic semantic") {
+    sql("CREATE GLOBAL TEMP VIEW src AS SELECT 1, 'a'")
+
+    // If there is no database in table name, we should try local temp view first, if not found,
+    // try table/view in current database, which is "default" in this case. So we expect
+    // NoSuchTableException here.
+    intercept[NoSuchTableException](spark.table("src"))
+
+    // Use qualified name to refer to the global temp view explicitly.
+    checkAnswer(spark.table(s"$globalTempDB.src"), Row(1, "a"))
+
+    // Table name without database will never refer to a global temp view.
+    intercept[NoSuchTableException](sql("DROP VIEW src"))
+
+    sql(s"DROP VIEW $globalTempDB.src")
+    // The global temp view should be dropped successfully.
+    intercept[NoSuchTableException](spark.table(s"$globalTempDB.src"))
+
+    // We can also use Dataset API to create global temp view
+    Seq(1 -> "a").toDF("i", "j").createGlobalTempView("src")
+    checkAnswer(spark.table(s"$globalTempDB.src"), Row(1, "a"))
+
+    // Use qualified name to rename a global temp view.
+    sql(s"ALTER VIEW $globalTempDB.src RENAME TO src2")
--- End diff --

Should we do `sql(s"ALTER VIEW $globalTempDB.src RENAME TO $globalTempDB.src2")` since we always require operating on global temp views using qualified identifiers?
[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r82272178

--- Diff: python/pyspark/sql/catalog.py ---
@@ -181,6 +181,22 @@ def dropTempView(self, viewName):
         """
         self._jcatalog.dropTempView(viewName)

+    @since(2.1)
+    def dropGlobalTempView(self, viewName):
+        """Drops the global temporary view with the given view name in the catalog.
+        If the view has been cached before, then it will also be uncached.
+
+        >>> spark.createDataFrame([(1, 1)]).createGlobalTempView("my_table")
+        >>> spark.table("global_temp.my_table").collect()
+        [Row(_1=1, _2=1)]
+        >>> spark.catalog.dropGlobalTempView("my_table")
+        >>> spark.table("global_temp.my_table") # doctest: +IGNORE_EXCEPTION_DETAIL
+        Traceback (most recent call last):
+            ...
+        AnalysisException: ...
+        """
--- End diff --

I think this bad case will end up in the python doc. Can we move this test to the `tests.py` file (it is fine to do it in a follow-up pr)?
[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r82493562

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/GlobalTempViewSuite.scala ---
@@ -0,0 +1,107 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution
+
+import org.apache.spark.sql.{AnalysisException, QueryTest, Row}
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.catalyst.analysis.NoSuchTableException
+import org.apache.spark.sql.test.SharedSQLContext
+import org.apache.spark.sql.types.StructType
+
+class GlobalTempViewSuite extends QueryTest with SharedSQLContext {
+  import testImplicits._
+
+  override protected def beforeAll(): Unit = {
+    super.beforeAll()
+    globalTempDB = spark.sharedState.globalTempDB
+  }
+
+  private var globalTempDB: String = _
+
+  test("basic semantic") {
+    sql("CREATE GLOBAL TEMP VIEW src AS SELECT 1, 'a'")
+
+    // If there is no database in table name, we should try local temp view first, if not found,
+    // try table/view in current database, which is "default" in this case. So we expect
+    // NoSuchTableException here.
+    intercept[NoSuchTableException](spark.table("src"))
+
+    // Use qualified name to refer to the global temp view explicitly.
+    checkAnswer(spark.table(s"$globalTempDB.src"), Row(1, "a"))
+
+    // Table name without database will never refer to a global temp view.
+    intercept[NoSuchTableException](sql("DROP VIEW src"))
+
+    sql(s"DROP VIEW $globalTempDB.src")
+    // The global temp view should be dropped successfully.
+    intercept[NoSuchTableException](spark.table(s"$globalTempDB.src"))
+
+    // We can also use Dataset API to create global temp view
+    Seq(1 -> "a").toDF("i", "j").createGlobalTempView("src")
+    checkAnswer(spark.table(s"$globalTempDB.src"), Row(1, "a"))
+
+    // Use qualified name to rename a global temp view.
+    sql(s"ALTER VIEW $globalTempDB.src RENAME TO src2")
+    intercept[NoSuchTableException](spark.table(s"$globalTempDB.src"))
+    checkAnswer(spark.table(s"$globalTempDB.src2"), Row(1, "a"))
+
+    // Use qualified name to alter a global temp view.
+    sql(s"ALTER VIEW $globalTempDB.src2 AS SELECT 2, 'b'")
+    checkAnswer(spark.table(s"$globalTempDB.src2"), Row(2, "b"))
+
+    // We can also use Catalog API to drop global temp view
+    spark.catalog.dropGlobalTempView("src2")
+    intercept[NoSuchTableException](spark.table(s"$globalTempDB.src2"))
+  }
+
+  test("global temp view database should be preserved") {
+    val e = intercept[AnalysisException](sql(s"CREATE DATABASE $globalTempDB"))
+    assert(e.message.contains("system preserved database"))
+
+    val e2 = intercept[AnalysisException](sql(s"USE $globalTempDB"))
+    assert(e2.message.contains("system preserved database"))
+  }
+
+  test("CREATE TABLE LIKE should work for global temp view") {
+    try {
+      sql("CREATE GLOBAL TEMP VIEW src AS SELECT 1 AS a, '2' AS b")
+      sql(s"CREATE TABLE cloned LIKE ${globalTempDB}.src")
+      val tableMeta = spark.sessionState.catalog.getTableMetadata(TableIdentifier("cloned"))
+      assert(tableMeta.schema == new StructType().add("a", "int", false).add("b", "string", false))
+    } finally {
+      spark.catalog.dropGlobalTempView("src")
+      sql("DROP TABLE default.cloned")
+    }
+  }
+
+  test("list global temp views") {
+    try {
+      sql("CREATE GLOBAL TEMP VIEW v1 AS SELECT 3, 4")
+      sql("CREATE TEMP VIEW v2 AS SELECT 1, 2")
+
+      checkAnswer(sql(s"SHOW TABLES IN $globalTempDB"),
+        Row(globalTempDB, "v1", true) ::
+        Row("", "v2", true) :: Nil)

+      assert(spark.catalog.listTables(globalTempDB).collect().toSeq.map(_.name) == Seq("v1", "v2"))
+    } finally {
+
[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r82493357

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala ---
@@ -183,17 +183,19 @@ case class DropTableCommand(
   override def run(sparkSession: SparkSession): Seq[Row] = {
     val catalog = sparkSession.sessionState.catalog
-    // If the command DROP VIEW is to drop a table or DROP TABLE is to drop a view
-    // issue an exception.
-    catalog.getTableMetadataOption(tableName).map(_.tableType match {
-      case CatalogTableType.VIEW if !isView =>
-        throw new AnalysisException(
-          "Cannot drop a view with DROP TABLE. Please use DROP VIEW instead")
-      case o if o != CatalogTableType.VIEW && isView =>
-        throw new AnalysisException(
-          s"Cannot drop a table with DROP VIEW. Please use DROP TABLE instead")
-      case _ =>
-    })
+    if (tableName.database.forall(catalog.databaseExists) && catalog.tableExists(tableName)) {
--- End diff --

I feel `forall` is not easy to understand when it is used with an Option.
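[Editor's note] The readability concern is about Scala's `Option.forall`, which is vacuously true for an empty option, so `tableName.database.forall(catalog.databaseExists)` holds both for an unqualified name and for a qualified name whose database exists. A rough Python equivalent (the `forall` helper and database set are invented for illustration) makes the vacuous-truth behavior explicit:

```python
def forall(optional, predicate):
    """Mimic Scala's Option.forall: vacuously True when the option is empty."""
    return True if optional is None else predicate(optional)

existing_dbs = {"default", "global_temp"}

# Unqualified table name (no database part): forall is vacuously true.
assert forall(None, lambda db: db in existing_dbs) is True
# Qualified name whose database exists: the predicate decides.
assert forall("default", lambda db: db in existing_dbs) is True
# Qualified name with a missing database: false, so the drop becomes a no-op.
assert forall("missing", lambda db: db in existing_dbs) is False
```

This vacuous-truth reading is exactly what the reviewer finds non-obvious at a glance; an explicit `match` on the option would spell out both cases.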
[GitHub] spark issue #15354: [SPARK-17764][SQL] Add `to_json` supporting to convert n...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15354 **[Test build #66558 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66558/consoleFull)** for PR 15354 at commit [`ecdac76`](https://github.com/apache/spark/commit/ecdac7640194e82ddde222572275ad7987e2bc65).
[GitHub] spark issue #15354: [SPARK-17764][SQL] Add `to_json` supporting to convert n...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/15354 retest this please
[GitHub] spark issue #15398: [SPARK-17647][SQL] Fix backslash escaping in 'LIKE' patt...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15398 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66547/ Test FAILed.
[GitHub] spark issue #15399: [SPARK-17819][SQL] Support default database in connectio...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15399 **[Test build #66557 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66557/consoleFull)** for PR 15399 at commit [`26f4b25`](https://github.com/apache/spark/commit/26f4b2519b8cd3cc70abbb255605b0149ddcd428).
[GitHub] spark issue #15371: [SPARK-17816] [Core] Fix ConcurrentModificationException...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15371 Merged build finished. Test PASSed.
[GitHub] spark issue #15371: [SPARK-17816] [Core] Fix ConcurrentModificationException...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15371 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66545/ Test PASSed.
[GitHub] spark issue #15399: [SPARK-17819][SQL] Support default database in connectio...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/15399 Retest this please.
[GitHub] spark issue #15371: [SPARK-17816] [Core] Fix ConcurrentModificationException...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15371 **[Test build #66545 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66545/consoleFull)** for PR 15371 at commit [`1526ea6`](https://github.com/apache/spark/commit/1526ea6ffe4ec2a9ef2b0f6d7ec9261c6be06b8e). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15398: [SPARK-17647][SQL] Fix backslash escaping in 'LIKE' patt...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15398 Merged build finished. Test FAILed.
[GitHub] spark issue #15398: [SPARK-17647][SQL] Fix backslash escaping in 'LIKE' patt...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15398 **[Test build #66547 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66547/consoleFull)** for PR 15398 at commit [`5f190eb`](https://github.com/apache/spark/commit/5f190eb49e2a70131637a1c439a73066cf612069). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #14690: [SPARK-16980][SQL] Load only catalog table partition met...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14690 Merged build finished. Test FAILed.
[GitHub] spark issue #14690: [SPARK-16980][SQL] Load only catalog table partition met...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14690 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66548/ Test FAILed.
[GitHub] spark issue #14690: [SPARK-16980][SQL] Load only catalog table partition met...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14690 **[Test build #66548 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66548/consoleFull)** for PR 14690 at commit [`7b788d1`](https://github.com/apache/spark/commit/7b788d1a8c5a3abe926d11864f042f1fe31d4fa3). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #14426: [SPARK-16475][SQL] Broadcast Hint for SQL Queries
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14426 Merged build finished. Test PASSed.
[GitHub] spark issue #14426: [SPARK-16475][SQL] Broadcast Hint for SQL Queries
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14426 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66546/ Test PASSed.
[GitHub] spark issue #14426: [SPARK-16475][SQL] Broadcast Hint for SQL Queries
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14426 **[Test build #66546 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66546/consoleFull)** for PR 14426 at commit [`778cede`](https://github.com/apache/spark/commit/778cede9b4c1fe780386fcafc6e7930df906bef3). * This patch passes all tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `case class Hint(name: String, parameters: Seq[String], child: LogicalPlan) extends UnaryNode `
[GitHub] spark issue #15399: [SPARK-17819][SQL] Support default database in connectio...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15399 Merged build finished. Test FAILed.
[GitHub] spark issue #15399: [SPARK-17819][SQL] Support default database in connectio...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15399 **[Test build #66555 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66555/consoleFull)** for PR 15399 at commit [`26f4b25`](https://github.com/apache/spark/commit/26f4b2519b8cd3cc70abbb255605b0149ddcd428).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #15399: [SPARK-17819][SQL] Support default database in connectio...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15399 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66555/
[GitHub] spark issue #15297: [WIP][SPARK-9862]Handling data skew
Github user scwf commented on the issue: https://github.com/apache/spark/pull/15297 retest this please
[GitHub] spark issue #15389: [SPARK-17817][PySpark] PySpark RDD Repartitioning Result...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15389 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66550/
[GitHub] spark issue #15389: [SPARK-17817][PySpark] PySpark RDD Repartitioning Result...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15389 Merged build finished. Test PASSed.
[GitHub] spark issue #15389: [SPARK-17817][PySpark] PySpark RDD Repartitioning Result...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15389 **[Test build #66550 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66550/consoleFull)** for PR 15389 at commit [`27d7c84`](https://github.com/apache/spark/commit/27d7c84012174037f2e0f98992a534dda8084589).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #15377: [SPARK-17802] Improved caller context logging.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15377 **[Test build #66556 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66556/consoleFull)** for PR 15377 at commit [`dc6951d`](https://github.com/apache/spark/commit/dc6951d925305419412c9f769aa423006028dc2b).
[GitHub] spark pull request #15377: [SPARK-17802] Improved caller context logging.
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/15377#discussion_r82492566

--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -2474,25 +2474,36 @@ private[spark] class CallerContext(
   val context = "SPARK_" + from + appIdStr + appAttemptIdStr + jobIdStr + stageIdStr +
     stageAttemptIdStr + taskIdStr + taskAttemptNumberStr

+  private var callerContextSupported: Boolean = true
+
   /**
    * Set up the caller context [[context]] by invoking Hadoop CallerContext API of
    * [[org.apache.hadoop.ipc.CallerContext]], which was added in hadoop 2.8.
    */
   def setCurrentContext(): Boolean = {
-    var succeed = false
-    try {
-      // scalastyle:off classforname
-      val callerContext = Class.forName("org.apache.hadoop.ipc.CallerContext")
-      val Builder = Class.forName("org.apache.hadoop.ipc.CallerContext$Builder")
-      // scalastyle:on classforname
-      val builderInst = Builder.getConstructor(classOf[String]).newInstance(context)
-      val hdfsContext = Builder.getMethod("build").invoke(builderInst)
-      callerContext.getMethod("setCurrent", callerContext).invoke(null, hdfsContext)
-      succeed = true
-    } catch {
-      case NonFatal(e) => logInfo("Fail to set Spark caller context", e)
+    if (!callerContextSupported) {
+      false
+    } else {
+      try {
+        // scalastyle:off classforname
+        val callerContext = Class.forName("org.apache.hadoop.ipc.CallerContext")
+        val builder = Class.forName("org.apache.hadoop.ipc.CallerContext$Builder")
+        // scalastyle:on classforname
+        val builderInst = builder.getConstructor(classOf[String]).newInstance(context)
+        val hdfsContext = builder.getMethod("build").invoke(builderInst)
+        callerContext.getMethod("setCurrent", callerContext).invoke(null, hdfsContext)
+        true
+      } catch {
+        case e: ClassNotFoundException =>
+          logInfo(
+            s"Fail to set Spark caller context: requires Hadoop 2.8 or later: ${e.getMessage}")
+          callerContextSupported = false
+          false
+        case NonFatal(e) =>
+          logWarning("Fail to set Spark caller context", e)
--- End diff --

Thanks for the tip about `hadoop.caller.context.enabled` and the suggestion, I'll do that.
[GitHub] spark pull request #15377: [SPARK-17802] Improved caller context logging.
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/15377#discussion_r82492524

--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -2432,6 +2432,10 @@ private[spark] object Utils extends Logging {
   }
 }

+private[spark] object CallerContext {
+  var callerContextSupported: Boolean = true
--- End diff --

Makes sense, done.
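The pattern under review in the two diffs above — probing for an optional API via reflection and caching a negative result so the `ClassNotFoundException` is paid at most once — can be sketched in isolation. The following is a simplified illustration, not the PR's final code: the class name is a constructor parameter here (in the PR it is hardcoded to `org.apache.hadoop.ipc.CallerContext`), and Spark's logging and scalastyle plumbing are omitted.

```scala
import scala.util.control.NonFatal

// Sketch: invoke an API that may be absent from the classpath (Hadoop 2.8's
// CallerContext in the PR) via reflection. A failed probe is remembered so
// later calls skip the reflection entirely instead of re-throwing
// ClassNotFoundException on every task.
class ReflectiveContextSetter(className: String) {
  @volatile private var supported: Boolean = true

  def setCurrentContext(context: String): Boolean = {
    if (!supported) {
      false // an earlier probe already failed; don't retry
    } else {
      try {
        val callerContext = Class.forName(className)
        val builder = Class.forName(className + "$Builder")
        // Reflective equivalent of: new CallerContext.Builder(context).build()
        val builderInst = builder.getConstructor(classOf[String]).newInstance(context)
        val built = builder.getMethod("build").invoke(builderInst)
        // Reflective equivalent of: CallerContext.setCurrent(built)
        callerContext.getMethod("setCurrent", callerContext).invoke(null, built)
        true
      } catch {
        case _: ClassNotFoundException =>
          supported = false // class is simply absent; never probe again
          false
        case NonFatal(_) =>
          false // class present but the call failed; leave `supported` alone
      }
    }
  }
}
```

With a class name that does not exist, the first call trips the `ClassNotFoundException` branch and every later call returns `false` without touching reflection — which is the behavior the reviewer asked for.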
[GitHub] spark issue #15399: [SPARK-17819][SQL] Support default database in connectio...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15399 **[Test build #66555 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66555/consoleFull)** for PR 15399 at commit [`26f4b25`](https://github.com/apache/spark/commit/26f4b2519b8cd3cc70abbb255605b0149ddcd428).