[GitHub] spark issue #14235: [SPARK-16590][SQL] Improve LogicalPlanToSQLSuite to chec...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14235 Thank you, @rxin. By the way, the following test has failed twice in a row. ``` HiveSparkSubmitSuite.SPARK-8020: set sql conf in spark conf *** FAILED *** (5 minutes, 0 seconds) ``` I looked into the failures but still have no idea what causes them. I hope Jenkins passes this time. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14054: [SPARK-16226] [SQL] Weaken JDBC isolation level to avoid...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14054 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62457/ Test PASSed.
[GitHub] spark issue #14054: [SPARK-16226] [SQL] Weaken JDBC isolation level to avoid...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14054 Merged build finished. Test PASSed.
[GitHub] spark issue #14054: [SPARK-16226] [SQL] Weaken JDBC isolation level to avoid...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14054 **[Test build #62457 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62457/consoleFull)** for PR 14054 at commit [`12a3803`](https://github.com/apache/spark/commit/12a38032e84a673bec6591003b28fafa3c68daaf). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #14235: [SPARK-16590][SQL] Improve LogicalPlanToSQLSuite to chec...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14235 **[Test build #62469 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62469/consoleFull)** for PR 14235 at commit [`efaa4d0`](https://github.com/apache/spark/commit/efaa4d0d55373280e19ed38b7e192545e4a3a6af).
[GitHub] spark issue #14086: [SPARK-16463][SQL] Support `truncate` option in Overwrit...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14086 I think it's valuable to explicitly compare the `truncate` option with the existing `SaveMode.Overwrite` behavior in popular DBMSs. After some testing, we can provide this with clearer notes on its limitations, if any exist.
[GitHub] spark issue #14136: [SPARK-16282][SQL] Implement percentile SQL function.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14136 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62466/ Test FAILed.
[GitHub] spark issue #14136: [SPARK-16282][SQL] Implement percentile SQL function.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14136 **[Test build #62466 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62466/consoleFull)** for PR 14136 at commit [`6314611`](https://github.com/apache/spark/commit/63146115ad270109a75fcbfa4e608de0e7b31046). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #14136: [SPARK-16282][SQL] Implement percentile SQL function.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14136 Merged build finished. Test FAILed.
[GitHub] spark issue #14065: [SPARK-14743][YARN] Add a configurable token manager for...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14065 **[Test build #62468 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62468/consoleFull)** for PR 14065 at commit [`9863ca4`](https://github.com/apache/spark/commit/9863ca42427479bf79731f204f606d1ccae20491).
[GitHub] spark issue #14065: [SPARK-14743][YARN] Add a configurable token manager for...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14065 **[Test build #62467 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62467/consoleFull)** for PR 14065 at commit [`a66aed0`](https://github.com/apache/spark/commit/a66aed0d1f299b87f68f34f96f0ccc539507dfde).
[GitHub] spark issue #14136: [SPARK-16282][SQL] Implement percentile SQL function.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14136 **[Test build #62466 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62466/consoleFull)** for PR 14136 at commit [`6314611`](https://github.com/apache/spark/commit/63146115ad270109a75fcbfa4e608de0e7b31046).
[GitHub] spark issue #14246: [SPARK-16600][MLLib] fix some latex formula syntax error
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14246 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62460/ Test PASSed.
[GitHub] spark issue #14246: [SPARK-16600][MLLib] fix some latex formula syntax error
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14246 Merged build finished. Test PASSed.
[GitHub] spark issue #14246: [SPARK-16600][MLLib] fix some latex formula syntax error
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14246 **[Test build #62460 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62460/consoleFull)** for PR 14246 at commit [`127a428`](https://github.com/apache/spark/commit/127a428e6a3ce242a826597c841b00651a7f5dd5). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #14137: SPARK-16478 graphX (added graph caching in strongly conn...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14137 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62464/ Test PASSed.
[GitHub] spark issue #14137: SPARK-16478 graphX (added graph caching in strongly conn...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14137 Merged build finished. Test PASSed.
[GitHub] spark issue #14137: SPARK-16478 graphX (added graph caching in strongly conn...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14137 **[Test build #62464 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62464/consoleFull)** for PR 14137 at commit [`84a8363`](https://github.com/apache/spark/commit/84a8363adfac60557a65d91216a5f790df5961da). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #14245: [SPARK-16303][DOCS][EXAMPLES] Minor Scala/Java example u...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14245 Merged build finished. Test PASSed.
[GitHub] spark issue #14245: [SPARK-16303][DOCS][EXAMPLES] Minor Scala/Java example u...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14245 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62461/ Test PASSed.
[GitHub] spark issue #14245: [SPARK-16303][DOCS][EXAMPLES] Minor Scala/Java example u...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14245 **[Test build #62461 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62461/consoleFull)** for PR 14245 at commit [`03491fd`](https://github.com/apache/spark/commit/03491fdb25b953194b723058ef29818e6d74402e). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #14247: [MINOR] Remove unused arg in als.py
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14247 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62463/ Test PASSed.
[GitHub] spark issue #14247: [MINOR] Remove unused arg in als.py
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14247 Merged build finished. Test PASSed.
[GitHub] spark issue #14247: [MINOR] Remove unused arg in als.py
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14247 **[Test build #62463 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62463/consoleFull)** for PR 14247 at commit [`70afb78`](https://github.com/apache/spark/commit/70afb784bc71f1a669285842833b58c1007b32ca). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #12983: [SPARK-15213][PySpark] Unify 'range' usages
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12983 Merged build finished. Test FAILed.
[GitHub] spark issue #14245: [SPARK-16303][DOCS][EXAMPLES] Minor Scala/Java example u...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14245 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62462/ Test PASSed.
[GitHub] spark issue #12983: [SPARK-15213][PySpark] Unify 'range' usages
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12983 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62458/ Test FAILed.
[GitHub] spark issue #14245: [SPARK-16303][DOCS][EXAMPLES] Minor Scala/Java example u...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14245 Merged build finished. Test PASSed.
[GitHub] spark issue #14245: [SPARK-16303][DOCS][EXAMPLES] Minor Scala/Java example u...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14245 **[Test build #62462 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62462/consoleFull)** for PR 14245 at commit [`927c46a`](https://github.com/apache/spark/commit/927c46a6f7f641bf9959c2d82e7424d7fa2d2d0a). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #12983: [SPARK-15213][PySpark] Unify 'range' usages
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12983 **[Test build #62458 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62458/consoleFull)** for PR 12983 at commit [`ca7546a`](https://github.com/apache/spark/commit/ca7546a5bc27ccde721bbe80865f3e2a67c57c16). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #14098: [SPARK-16380][SQL][Example]:Update SQL examples and prog...
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/14098 Yeah, especially on case-insensitive OSes like Mac and Windows, the docs actually build successfully even when the case of the example file names doesn't match. I guess that's probably why we missed SPARK-16553.
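The case-mismatch problem described above can be caught on any OS by comparing each path component against the actual directory listing instead of relying on the filesystem's own lookup. The following is only a minimal sketch of that idea, not part of the Spark doc build; the function name `exists_with_exact_case` and the file name used in the usage lines are hypothetical.

```python
import os
import tempfile


def exists_with_exact_case(root, relative_path):
    """Return True only if every component of relative_path matches a
    directory entry under root with exactly the same letter case.

    Hypothetical helper for illustration; it is not taken from Spark.
    """
    current = root
    for part in relative_path.replace("\\", "/").split("/"):
        if part in ("", "."):
            continue
        try:
            # os.listdir reports names as stored on disk, so the comparison
            # below stays case-sensitive even on case-insensitive filesystems.
            entries = os.listdir(current)
        except OSError:
            # current is missing or is not a directory.
            return False
        if part not in entries:
            return False
        current = os.path.join(current, part)
    return True


# Usage: create one file, then look it up with matching and mismatched case.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "SqlExample.scala"), "w").close()
    matching = exists_with_exact_case(tmp, "SqlExample.scala")
    mismatched = exists_with_exact_case(tmp, "sqlexample.scala")
    print(matching, mismatched)
```

A check along these lines would report the mismatched name as missing on Mac and Windows too, which is the behavior a case-sensitive Linux build already has.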
[GitHub] spark issue #13922: [SPARK-11938][PySpark] Expose numFeatures in all ML Pred...
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/13922 @vectorijk this is covered in #12889 by @holdenk
[GitHub] spark issue #12983: [SPARK-15213][PySpark] Unify 'range' usages
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12983 **[Test build #62458 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62458/consoleFull)** for PR 12983 at commit [`ca7546a`](https://github.com/apache/spark/commit/ca7546a5bc27ccde721bbe80865f3e2a67c57c16).
[GitHub] spark issue #14245: [SPARK-16303][DOCS][EXAMPLES] Minor Scala/Java example u...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14245 **[Test build #62461 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62461/consoleFull)** for PR 14245 at commit [`03491fd`](https://github.com/apache/spark/commit/03491fdb25b953194b723058ef29818e6d74402e).
[GitHub] spark issue #14220: [SPARK-16568][SQL][Documentation] update sql programming...
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/14220 cc @liancheng Thanks!
[GitHub] spark issue #14054: [SPARK-16226] [SQL] Weaken JDBC isolation level to avoid...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14054 **[Test build #62457 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62457/consoleFull)** for PR 14054 at commit [`12a3803`](https://github.com/apache/spark/commit/12a38032e84a673bec6591003b28fafa3c68daaf).
[GitHub] spark issue #14247: [MINOR] Remove unused arg in als.py
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14247 **[Test build #62463 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62463/consoleFull)** for PR 14247 at commit [`70afb78`](https://github.com/apache/spark/commit/70afb784bc71f1a669285842833b58c1007b32ca).
[GitHub] spark issue #14150: [SPARK-16494] [ML] Upgrade breeze version to 0.12
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14150 **[Test build #62459 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62459/consoleFull)** for PR 14150 at commit [`dbfcbff`](https://github.com/apache/spark/commit/dbfcbff6945dfd3e02180d95c4bccf7f5500e199).
[GitHub] spark issue #14246: [SPARK-16600][MLLib] fix some latex formula syntax error
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14246 **[Test build #62460 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62460/consoleFull)** for PR 14246 at commit [`127a428`](https://github.com/apache/spark/commit/127a428e6a3ce242a826597c841b00651a7f5dd5).
[GitHub] spark issue #14245: [SPARK-16303][DOCS][EXAMPLES] Minor Scala/Java example u...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14245 **[Test build #62462 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62462/consoleFull)** for PR 14245 at commit [`927c46a`](https://github.com/apache/spark/commit/927c46a6f7f641bf9959c2d82e7424d7fa2d2d0a).
[GitHub] spark issue #14137: SPARK-16478 graphX (added graph caching in strongly conn...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14137 **[Test build #62464 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62464/consoleFull)** for PR 14137 at commit [`84a8363`](https://github.com/apache/spark/commit/84a8363adfac60557a65d91216a5f790df5961da).
[GitHub] spark issue #13051: [SPARK-15271] [MESOS] Allow force pulling executor docke...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13051 **[Test build #62465 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62465/consoleFull)** for PR 13051 at commit [`36b3258`](https://github.com/apache/spark/commit/36b32584bcd8197ecb8e6b8d92954270c735a62f).
[GitHub] spark issue #13894: [SPARK-15254][DOC] Improve ML pipeline Cross Validation ...
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/13894 @krishnakalyan3 I think the merge conflicts still need to be resolved, and also the Python style issue. Subject to those, this LGTM now.
[GitHub] spark issue #14234: [MINOR][SQL][STREAMING][DOCS] Fix minor typos, punctuati...
Github user ahmed-mahran commented on the issue: https://github.com/apache/spark/pull/14234 Fine, ignoring this
[GitHub] spark pull request #13912: [SPARK-16216][SQL] CSV data source supports custo...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13912#discussion_r71131209 --- Diff: python/pyspark/sql/readwriter.py --- @@ -328,6 +328,10 @@ def csv(self, path, schema=None, sep=None, encoding=None, quote=None, escape=Non applies to both date type and timestamp type. By default, it is None which means trying to parse times and date by ``java.sql.Timestamp.valueOf()`` and ``java.sql.Date.valueOf()``. +:param timezone: defines the timezone to be used for both date type and timestamp type. + If a timezone is specified in the data, this will load them after --- End diff -- It seems to be the default behaviour of `SimpleDateFormat`. I will look into this more deeply and will fix it or add some more comments tomorrow. Thanks again!
[GitHub] spark issue #14234: [MINOR][SQL][STREAMING][DOCS] Fix minor typos, punctuati...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14234 Don't bother if you don't have Office and it's any trouble
[GitHub] spark issue #14234: [MINOR][SQL][STREAMING][DOCS] Fix minor typos, punctuati...
Github user ahmed-mahran commented on the issue: https://github.com/apache/spark/pull/14234 The slides render badly in the LibreOffice version I have; I'll try something else and see.
[GitHub] spark pull request #13912: [SPARK-16216][SQL] CSV data source supports custo...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13912#discussion_r71130049 --- Diff: python/pyspark/sql/readwriter.py --- @@ -328,6 +328,10 @@ def csv(self, path, schema=None, sep=None, encoding=None, quote=None, escape=Non applies to both date type and timestamp type. By default, it is None which means trying to parse times and date by ``java.sql.Timestamp.valueOf()`` and ``java.sql.Date.valueOf()``. +:param timezone: defines the timezone to be used for both date type and timestamp type. + If a timezone is specified in the data, this will load them after --- End diff -- I see. Thank you for the detailed explanation. I should correct both the comments and the behaviour.
[GitHub] spark issue #14234: [MINOR][SQL][STREAMING][DOCS] Fix minor typos, punctuati...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14234 Oh nice. It might be a matter of exporting the image at a higher resolution, but I still wouldn't worry if it's just a trivial typo and takes any non-trivial time to figure out. (You can fix and save the PPTX and add that here.)
[GitHub] spark issue #14234: [MINOR][SQL][STREAMING][DOCS] Fix minor typos, punctuati...
Github user ahmed-mahran commented on the issue: https://github.com/apache/spark/pull/14234 I can find a pptx at "docs/img/structured-streaming.pptx" where there is a corresponding slide for each image.
[GitHub] spark pull request #13051: [SPARK-15271] [MESOS] Allow force pulling executo...
Github user philipphoffmann commented on a diff in the pull request: https://github.com/apache/spark/pull/13051#discussion_r71129439 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackendUtil.scala --- @@ -105,11 +105,14 @@ private[mesos] object MesosSchedulerBackendUtil extends Logging { def addDockerInfo( container: ContainerInfo.Builder, image: String, + forcePullImage: Boolean = false, volumes: Option[List[Volume]] = None, network: Option[ContainerInfo.DockerInfo.Network] = None, --- End diff -- While working on this, I noticed that the `network` parameter is not used anywhere. I just wanted to point this out; I am not sure whether there is WIP for adding this feature.
[GitHub] spark pull request #13912: [SPARK-16216][SQL] CSV data source supports custo...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/13912#discussion_r71129335 --- Diff: python/pyspark/sql/readwriter.py --- @@ -328,6 +328,10 @@ def csv(self, path, schema=None, sep=None, encoding=None, quote=None, escape=Non applies to both date type and timestamp type. By default, it is None which means trying to parse times and date by ``java.sql.Timestamp.valueOf()`` and ``java.sql.Date.valueOf()``. +:param timezone: defines the timezone to be used for both date type and timestamp type. + If a timezone is specified in the data, this will load them after --- End diff -- I mean specifically that any "timezone" specified as a parameter when reading the input "27/08/2015 00:00 PDT" should not matter, which is why I wonder why this is in the example. The parsed timestamp is unambiguously 1440658800; it has no timezone because that's a concept only related to formatted dates/times. When outputting formatted strings, I understand why a timezone parameter matters. For example, with format "dd/MM/yyyy HH:mm:ss z" the output depends on the desired timezone. With "GMT", yes, I agree with your new example: the output is "27/08/2015 07:00:00 GMT". Back to the original question: what is the comment referring to then? I don't see a need to manually adjust for a timezone.
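The distinction drawn in the comment above, between a parsed instant and its formatted rendering, can be illustrated with a minimal sketch. It uses Python's `datetime` rather than the `SimpleDateFormat` under discussion, models PDT as a fixed UTC-7 offset (an assumption made only for this illustration), and the `render` helper is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: PDT is modelled as a fixed UTC-7 offset for this illustration.
PDT = timezone(timedelta(hours=-7), "PDT")
GMT = timezone(timedelta(0), "GMT")

# Parse the wall-clock fields, then attach the zone that the input itself names.
parsed = datetime.strptime("27/08/2015 00:00", "%d/%m/%Y %H:%M").replace(tzinfo=PDT)

# The instant is a plain count of seconds since the epoch; it carries no zone.
epoch_seconds = int(parsed.timestamp())


def render(seconds, tz):
    """Format an epoch instant in the requested zone (hypothetical helper)."""
    return datetime.fromtimestamp(seconds, tz).strftime("%d/%m/%Y %H:%M:%S %Z")


# The same instant yields different strings depending only on the output zone.
gmt_text = render(epoch_seconds, GMT)
pdt_text = render(epoch_seconds, PDT)
print(gmt_text)
print(pdt_text)
```

Only the formatting step takes a zone argument here; the parsing step needs none beyond what the input string already states, which is the behaviour the reviewer is asking about.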
[GitHub] spark pull request #13051: [SPARK-15271] [MESOS] Allow force pulling executo...
Github user philipphoffmann commented on a diff in the pull request: https://github.com/apache/spark/pull/13051#discussion_r71129270 --- Diff: core/src/test/scala/org/apache/spark/scheduler/cluster/mesos/MesosFineGrainedSchedulerBackendSuite.scala --- @@ -150,6 +150,7 @@ class MesosFineGrainedSchedulerBackendSuite val conf = new SparkConf() .set("spark.mesos.executor.docker.image", "spark/mock") + .set("spark.mesos.executor.docker.forcePullImage", "true") --- End diff -- Did so :)
[GitHub] spark pull request #13051: [SPARK-15271] [MESOS] Allow force pulling executo...
Github user philipphoffmann commented on a diff in the pull request: https://github.com/apache/spark/pull/13051#discussion_r71129217 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackendUtil.scala --- @@ -119,21 +122,25 @@ private[mesos] object MesosSchedulerBackendUtil extends Logging { } /** - * Setup a docker containerizer + * Setup a docker containerizer from MesosDriverDescription scheduler properties */ def setupContainerBuilderDockerInfo( imageName: String, -conf: SparkConf, +conf: Map[String, String], builder: ContainerInfo.Builder): Unit = { +val forcePullImage = conf --- End diff -- Yeah, so as I already mentioned above, `MesosClusterScheduler` uses this method and passes a `Map[String, String]`, so if we want to keep reusing this method (which makes sense), we will have to go with the Strings.
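Since the configuration arrives as a plain string-to-string map, the boolean flag has to be parsed from its string form. The following is a minimal sketch of that idea in Python, not the Scala code from the PR; `get_boolean` is a hypothetical helper, and the strict true/false parsing and the `False` default are assumptions.

```python
def get_boolean(conf, key, default=False):
    """Read a boolean flag from a string-to-string map (hypothetical helper)."""
    raw = conf.get(key)
    if raw is None:
        # Key absent: fall back to the default, mirroring forcePullImage = false.
        return default
    value = raw.strip().lower()
    if value in ("true", "false"):
        return value == "true"
    raise ValueError(f"{key} should be boolean, but was {raw!r}")


# Keys taken from the test in this PR; the map itself is illustrative.
driver_conf = {
    "spark.mesos.executor.docker.image": "spark/mock",
    "spark.mesos.executor.docker.forcePullImage": "true",
}
force_pull = get_boolean(driver_conf, "spark.mesos.executor.docker.forcePullImage")
print(force_pull)
```

Rejecting anything other than "true" or "false" is one possible design choice; silently treating unknown strings as false would hide typos in the configuration.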
[GitHub] spark pull request #13051: [SPARK-15271] [MESOS] Allow force pulling executo...
Github user philipphoffmann commented on a diff in the pull request: https://github.com/apache/spark/pull/13051#discussion_r71128900 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala --- @@ -408,8 +408,11 @@ private[spark] class MesosCoarseGrainedSchedulerBackend( .addAllResources(memResourcesToUse.asJava) sc.conf.getOption("spark.mesos.executor.docker.image").foreach { image => -MesosSchedulerBackendUtil - .setupContainerBuilderDockerInfo(image, sc.conf, taskBuilder.getContainerBuilder) +MesosSchedulerBackendUtil.setupContainerBuilderDockerInfo( + image, + sc.conf.getAll.toMap, --- End diff -- I agree, I provided a better approach. Note that this method is also called from `MesosClusterScheduler` which maintains the driver settings as a raw `Map[String, String]` so passing the `SparkConf` here is not an option imho.
[GitHub] spark pull request #13912: [SPARK-16216][SQL] CSV data source supports custo...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13912#discussion_r71127773 --- Diff: python/pyspark/sql/readwriter.py --- @@ -328,6 +328,10 @@ def csv(self, path, schema=None, sep=None, encoding=None, quote=None, escape=Non applies to both date type and timestamp type. By default, it is None which means trying to parse times and date by ``java.sql.Timestamp.valueOf()`` and ``java.sql.Date.valueOf()``. +:param timezone: defines the timezone to be used for both date type and timestamp type. + If a timezone is specified in the data, this will load them after --- End diff -- I thought it loses the timezone information after being loaded into Spark. I mean, `Timestamp` and `Date` instances don't have timezone information in them. The timezone specified in the input is being used in the example. I am sorry, I think I didn't understand clearly. Do you mind if I ask what you expect before the data is read, after it is read (in the dataframe), and after it is written?
[GitHub] spark issue #14222: [SPARK-16391][SQL] KeyValueGroupedDataset.reduceGroups s...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14222 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62456/ Test PASSed.
[GitHub] spark issue #14222: [SPARK-16391][SQL] KeyValueGroupedDataset.reduceGroups s...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14222 Merged build finished. Test PASSed.
[GitHub] spark issue #14222: [SPARK-16391][SQL] KeyValueGroupedDataset.reduceGroups s...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14222 **[Test build #62456 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62456/consoleFull)** for PR 14222 at commit [`7e8d8c1`](https://github.com/apache/spark/commit/7e8d8c116552642573cc89bd11fc2e82f2a0f82a). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #14150: [SPARK-16494] [ML] Upgrade breeze version to 0.12
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/14150 Jenkins, test this please.
[GitHub] spark issue #14222: [SPARK-16391][SQL] KeyValueGroupedDataset.reduceGroups s...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14222 Merged build finished. Test PASSed.
[GitHub] spark issue #14222: [SPARK-16391][SQL] KeyValueGroupedDataset.reduceGroups s...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14222 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62454/ Test PASSed.
[GitHub] spark issue #14222: [SPARK-16391][SQL] KeyValueGroupedDataset.reduceGroups s...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14222 **[Test build #62454 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62454/consoleFull)** for PR 14222 at commit [`4ba124c`](https://github.com/apache/spark/commit/4ba124cf8c6b441e37fe1943c5b8164eeb2470d1). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #13912: [SPARK-16216][SQL] CSV data source supports custo...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/13912#discussion_r71126016 --- Diff: python/pyspark/sql/readwriter.py --- @@ -328,6 +328,10 @@ def csv(self, path, schema=None, sep=None, encoding=None, quote=None, escape=Non applies to both date type and timestamp type. By default, it is None which means trying to parse times and date by ``java.sql.Timestamp.valueOf()`` and ``java.sql.Date.valueOf()``. +:param timezone: defines the timezone to be used for both date type and timestamp type. + If a timezone is specified in the data, this will load them after --- End diff -- In the example above, the input specifies a timezone and that must be used to interpret it. You say "it becomes in the dataframe" but what it becomes is a timestamp, which is unambiguous and has no timezone. Timezone matters when converting back to a string for display, but, your example only shows the parameter used on reading, and shows no timezone in the output. I am not sure that this is the intended behavior?
[GitHub] spark issue #14247: [MINOR] Remove unused arg in als.py
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14247 Seems reasonable.
[GitHub] spark issue #14137: SPARK-16478 graphX (added graph caching in strongly conn...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14137 LGTM, will leave open for a bit for comments
[GitHub] spark issue #14137: SPARK-16478 graphX (added graph caching in strongly conn...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14137 Jenkins retest this please
[GitHub] spark pull request #13912: [SPARK-16216][SQL] CSV data source supports custo...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13912#discussion_r71124928 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVRelation.scala --- @@ -195,18 +202,40 @@ private[sql] class CsvOutputWriter( private var records: Long = 0L private val csvWriter = new LineCsvWriter(params, dataSchema.fieldNames.toSeq) - private def rowToString(row: Seq[Any]): Seq[String] = row.map { field => -if (field != null) { - field.toString -} else { - params.nullValue + private def rowToString(row: InternalRow): Seq[String] = { +var i = 0 +val values = new Array[String](row.numFields) +while (i < row.numFields) { + if (!row.isNullAt(i)) { +values(i) = fieldsConverters(i).apply(row, i) + } else { +values(i) = params.nullValue + } + i += 1 } +values + } + + private def makeConverter(dataType: DataType): ValueConverter = dataType match { +case DateType if params.dateFormat != null => + (row: InternalRow, ordinal: Int) => + params.dateFormat.format(DateTimeUtils.toJavaDate(row.getInt(ordinal))) + +case TimestampType if params.dateFormat != null => + (row: InternalRow, ordinal: Int) => + params.dateFormat.format(DateTimeUtils.toJavaTimestamp(row.getLong(ordinal))) --- End diff -- This was merged before [here](https://github.com/apache/spark/pull/11550), but `timezone` was not considered there. So this PR adds support for: - `timezone` for both reading and writing - `dateFormat` for writing.
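The writer-side pattern in the diff above (choose one converter per column up front, then apply it per row with a null fallback) can be sketched as follows. This is a Python illustration under stated assumptions, not the Scala code from the PR: `make_converter` and `row_to_strings` are hypothetical stand-ins, type names are plain strings, timestamps are assumed to be stored as microseconds since the epoch, and output is rendered in UTC.

```python
from datetime import datetime, timezone


def make_converter(data_type, date_format):
    """Pick a string converter once per column (hypothetical stand-in)."""
    if data_type == "timestamp" and date_format is not None:
        # Assumption: timestamps are microseconds since the epoch, rendered in UTC.
        return lambda v: datetime.fromtimestamp(
            v / 1_000_000, timezone.utc
        ).strftime(date_format)
    # Every other type falls back to its default string form.
    return str


def row_to_strings(row, converters, null_value):
    """Convert one row, substituting null_value for missing fields."""
    return [null_value if v is None else conv(v) for v, conv in zip(row, converters)]


schema = ["string", "timestamp", "int"]
converters = [make_converter(t, "%d/%m/%Y %H:%M:%S") for t in schema]
line = row_to_strings(["a", 1440658800 * 1_000_000, None], converters, "NULL")
print(line)
```

Building the converters once from the schema keeps the per-row work to a lookup and a call, which is the motivation for replacing a per-field `toString` with a precomputed converter array.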
[GitHub] spark pull request #13912: [SPARK-16216][SQL] CSV data source supports custo...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13912#discussion_r71124645 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVRelation.scala --- @@ -195,18 +202,40 @@ private[sql] class CsvOutputWriter( private var records: Long = 0L private val csvWriter = new LineCsvWriter(params, dataSchema.fieldNames.toSeq) - private def rowToString(row: Seq[Any]): Seq[String] = row.map { field => -if (field != null) { - field.toString -} else { - params.nullValue + private def rowToString(row: InternalRow): Seq[String] = { +var i = 0 +val values = new Array[String](row.numFields) +while (i < row.numFields) { + if (!row.isNullAt(i)) { +values(i) = fieldsConverters(i).apply(row, i) + } else { +values(i) = params.nullValue + } + i += 1 } +values + } + + private def makeConverter(dataType: DataType): ValueConverter = dataType match { +case DateType if params.dateFormat != null => + (row: InternalRow, ordinal: Int) => + params.dateFormat.format(DateTimeUtils.toJavaDate(row.getInt(ordinal))) + +case TimestampType if params.dateFormat != null => + (row: InternalRow, ordinal: Int) => + params.dateFormat.format(DateTimeUtils.toJavaTimestamp(row.getLong(ordinal))) --- End diff -- Oh, sorry. This is not part of this PR. This is https://github.com/apache/spark/blob/e1dc853737fc1739fbb5377ffe31fb2d89935b1f/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVInferSchema.scala#L280-L291
[GitHub] spark pull request #14247: [MINOR] Remove unused arg in als.py
GitHub user zhengruifeng opened a pull request: https://github.com/apache/spark/pull/14247 [MINOR] Remove unused arg in als.py ## What changes were proposed in this pull request? The second arg in method `update()` is never used, so I deleted it. ## How was this patch tested? Ran locally with `./bin/spark-submit examples/src/main/python/als.py` You can merge this pull request into a Git repository by running: $ git pull https://github.com/zhengruifeng/spark als_refine Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/14247.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #14247 commit 70afb784bc71f1a669285842833b58c1007b32ca Author: Zheng RuiFeng Date: 2016-07-18T09:59:53Z del unused arg
[GitHub] spark issue #14176: [SPARK-16525][SQL] Enable Row Based HashMap in HashAggre...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14176 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62452/ Test PASSed.
[GitHub] spark issue #14174: [SPARK-16524][SQL] Add RowBatch and RowBasedHashMapGener...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14174 Merged build finished. Test PASSed.
[GitHub] spark issue #14176: [SPARK-16525][SQL] Enable Row Based HashMap in HashAggre...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14176 Merged build finished. Test PASSed.
[GitHub] spark issue #14174: [SPARK-16524][SQL] Add RowBatch and RowBasedHashMapGener...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14174 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62453/ Test PASSed.
[GitHub] spark issue #14176: [SPARK-16525][SQL] Enable Row Based HashMap in HashAggre...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14176 **[Test build #62452 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62452/consoleFull)** for PR 14176 at commit [`ce72d90`](https://github.com/apache/spark/commit/ce72d94bfa720460126a3573642a8a97bc53). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #14174: [SPARK-16524][SQL] Add RowBatch and RowBasedHashMapGener...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14174 **[Test build #62453 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62453/consoleFull)** for PR 14174 at commit [`2c1973a`](https://github.com/apache/spark/commit/2c1973a872e5b8d99a55234724ec24acbc5f70ff). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #13912: [SPARK-16216][SQL] CSV data source supports custo...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13912#discussion_r71124186 --- Diff: python/pyspark/sql/readwriter.py --- @@ -328,6 +328,10 @@ def csv(self, path, schema=None, sep=None, encoding=None, quote=None, escape=Non applies to both date type and timestamp type. By default, it is None which means trying to parse times and date by ``java.sql.Timestamp.valueOf()`` and ``java.sql.Date.valueOf()``. +:param timezone: defines the timezone to be used for both date type and timestamp type. + If a timezone is specified in the data, this will load them after --- End diff -- Yes, it will use the timezone specified in the input. Since `Date` and `Timestamp` do not keep timezone information, it calculates the difference between the timezone specified in the input and the one specified in the option, and the one in the option is the reference. For example, if `timezone` is set to `GMT`, all the `Date` and `Timestamp` values that are read will be in the `GMT` timezone after applying that difference. So, if the CSV data is as below: ``` 27/08/2015 00:00 PDT 27/08/2015 01:00 PDT 27/08/2015 02:00 PDT ``` If this is read as below: ```scala spark.read .format("csv") .option("timezone", "GMT") .option("dateFormat", "dd/MM/yyyy HH:mm z") .load("path") ``` it will become as below in the dataframe (the difference between `GMT` and `PDT` is 7 hours). ``` 27/08/2015 07:00 27/08/2015 08:00 27/08/2015 09:00 ```
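For readers who want to check the arithmetic in the comment above, here is a minimal sketch in plain Python. It is not the Spark implementation: the `ABBREVIATIONS` table and the `to_target_zone` helper are hypothetical names introduced only for illustration, and the table covers just the two abbreviations that appear in the example.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical lookup table (an assumption for this sketch): only the
# abbreviations used in the example, mapped to fixed UTC offsets.
ABBREVIATIONS = {
    "PDT": timezone(timedelta(hours=-7)),
    "GMT": timezone.utc,
}


def to_target_zone(text, target=timezone.utc):
    """Parse 'dd/MM/yyyy HH:mm ZZZ' and re-express it in the target zone."""
    stamp, abbrev = text.rsplit(" ", 1)
    # Attach the zone found in the data to the parsed wall-clock time.
    local = datetime.strptime(stamp, "%d/%m/%Y %H:%M").replace(
        tzinfo=ABBREVIATIONS[abbrev]
    )
    # Shift to the zone requested by the option and drop the zone marker.
    return local.astimezone(target).strftime("%d/%m/%Y %H:%M")


rows = ["27/08/2015 00:00 PDT", "27/08/2015 01:00 PDT", "27/08/2015 02:00 PDT"]
converted = [to_target_zone(row) for row in rows]
print(converted)
```

The zone in the data decides how each value is interpreted, and the zone in the option decides how it is finally expressed, which is the behaviour described for the proposed `timezone` option.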
[GitHub] spark pull request #13912: [SPARK-16216][SQL] CSV data source supports custo...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13912#discussion_r71124234 --- Diff: python/pyspark/sql/readwriter.py --- @@ -328,6 +328,10 @@ def csv(self, path, schema=None, sep=None, encoding=None, quote=None, escape=Non applies to both date type and timestamp type. By default, it is None which means trying to parse times and date by ``java.sql.Timestamp.valueOf()`` and ``java.sql.Date.valueOf()``. +:param timezone: defines the timezone to be used for both date type and timestamp type. + If a timezone is specified in the data, this will load them after --- End diff -- If this behaviour looks fine, I will just change the documentation to be clearer.
[GitHub] spark issue #14137: SPARK-16478 graphX (added graph caching in strongly conn...
Github user wesolowskim commented on the issue: https://github.com/apache/spark/pull/14137 Is it sufficient right now or should I do something more?
[GitHub] spark pull request #13912: [SPARK-16216][SQL] CSV data source supports custo...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/13912#discussion_r71123633 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVRelation.scala --- @@ -195,18 +202,40 @@ private[sql] class CsvOutputWriter( private var records: Long = 0L private val csvWriter = new LineCsvWriter(params, dataSchema.fieldNames.toSeq) - private def rowToString(row: Seq[Any]): Seq[String] = row.map { field => -if (field != null) { - field.toString -} else { - params.nullValue + private def rowToString(row: InternalRow): Seq[String] = { +var i = 0 +val values = new Array[String](row.numFields) +while (i < row.numFields) { + if (!row.isNullAt(i)) { +values(i) = fieldsConverters(i).apply(row, i) + } else { +values(i) = params.nullValue + } + i += 1 } +values + } + + private def makeConverter(dataType: DataType): ValueConverter = dataType match { +case DateType if params.dateFormat != null => + (row: InternalRow, ordinal: Int) => + params.dateFormat.format(DateTimeUtils.toJavaDate(row.getInt(ordinal))) + +case TimestampType if params.dateFormat != null => + (row: InternalRow, ordinal: Int) => + params.dateFormat.format(DateTimeUtils.toJavaTimestamp(row.getLong(ordinal))) --- End diff -- This is formatting to a string, rather than parsing from a string, right?
[GitHub] spark issue #14136: [SPARK-16282][SQL] Implement percentile SQL function.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14136 Merged build finished. Test FAILed.
[GitHub] spark issue #14136: [SPARK-16282][SQL] Implement percentile SQL function.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14136 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62455/ Test FAILed.
[GitHub] spark issue #14136: [SPARK-16282][SQL] Implement percentile SQL function.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14136 **[Test build #62455 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62455/consoleFull)** for PR 14136 at commit [`d541b46`](https://github.com/apache/spark/commit/d541b46e245dbaa200040cc9220a2095717d40ba). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #14243: [SPARK-10683][SPARK-16510][SPARKR] Move SparkR include j...
Github user sun-rui commented on the issue: https://github.com/apache/spark/pull/14243 Will this test always be run, regardless of whether the "sparkr" profile is specified? In other words, does R need to be installed for all Spark tests to pass?
[GitHub] spark issue #14246: [SPARK-16600][MLLib] fix some latex formula syntax error
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14246 OK, especially if you've had a search for other similar latex issues. It probably doesn't even need a JIRA but OK.
[GitHub] spark issue #14245: [SPARK-16303][DOCS][EXAMPLES] Minor Scala/Java example u...
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/14245 Reused JIRA number SPARK-16303 and renamed Scala/Java example file names. Python examples are not being updated to use the `include_example` tag yet. The PR (#14098) is still in WIP status.
[GitHub] spark issue #14169: [SPARK-16515][SQL]set default record reader and writer f...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14169 Merged build finished. Test PASSed.
[GitHub] spark issue #14169: [SPARK-16515][SQL]set default record reader and writer f...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14169 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62451/ Test PASSed.
[GitHub] spark issue #14169: [SPARK-16515][SQL]set default record reader and writer f...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14169 **[Test build #62451 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62451/consoleFull)** for PR 14169 at commit [`0edfed4`](https://github.com/apache/spark/commit/0edfed48a1e6a438a18d488404b900a51475a2d5). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #14065: [SPARK-14743][YARN] Add a configurable token mana...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/14065#discussion_r71121047 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/security/ConfigurableCredentialManager.scala --- @@ -0,0 +1,158 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.deploy.yarn.security + +import java.util.ServiceLoader + +import scala.collection.JavaConverters._ +import scala.collection.mutable + +import org.apache.hadoop.conf.Configuration +import org.apache.hadoop.security.Credentials + +import org.apache.spark.SparkConf +import org.apache.spark.internal.Logging +import org.apache.spark.util.Utils + +/** + * A ConfigurableCredentialManager to manage all the registered credential providers and offer + * APIs for other modules to obtain credentials as well as renewal time. By default + * [[HDFSCredentialProvider]], [[HiveCredentialProvider]] and [[HBaseCredentialProvider]] will + * be loaded in, any plugged-in credential provider wants to be managed by + * ConfigurableCredentialManager needs to implement [[ServiceCredentialProvider]] interface and put + * into resources to be loaded by ServiceLoader. 
+ * + * Also the specific credential provider is controlled by + * spark.yarn.security.credentials.{service}.enabled, it will not be loaded in if set to false. + */ +final class ConfigurableCredentialManager private[yarn] (sparkConf: SparkConf) extends Logging { + private val deprecatedProviderEnabledConfig = "spark.yarn.security.tokens.%s.enabled" + private val providerEnabledConfig = "spark.yarn.security.credentials.%s.enabled" + + // Maintain all the registered credential providers + private val credentialProviders = mutable.HashMap[String, ServiceCredentialProvider]() + + // Default crendetial providers that will be loaded automatically, unless specifically disabled. + private val defaultCredentialProviders = Map( +"hdfs" -> "org.apache.spark.deploy.yarn.security.HDFSCredentialProvider", +"hive" -> "org.apache.spark.deploy.yarn.security.HiveCredentialProvider", +"hbase" -> "org.apache.spark.deploy.yarn.security.HBaseCredentialProvider" + ) + + // AMDelegationTokenRenewer, this will only be create and started in the AM + private var _delegationTokenRenewer: AMDelegationTokenRenewer = null + + // ExecutorDelegationTokenUpdater, this will only be created and started in the driver and + // executor side. + private var _delegationTokenUpdater: ExecutorDelegationTokenUpdater = null + + def initialize(): Unit = { +val providers = ServiceLoader.load(classOf[ServiceCredentialProvider], + Utils.getContextOrSparkClassLoader).asScala + +// Filter out credentials in which spark.yarn.security.credentials.{service}.enabled is false. 
+providers.filter { p => + sparkConf.getOption(providerEnabledConfig.format(p.serviceName)) +.orElse { + sparkConf.getOption(deprecatedProviderEnabledConfig.format(p.serviceName)).map { c => + logWarning(s"${deprecatedProviderEnabledConfig.format(p.serviceName)} is deprecated, " + + s"using ${providerEnabledConfig.format(p.serviceName)} instead") +c + } +} +.getOrElse(defaultCredentialProviders.keySet.find(_ == p.serviceName).isDefined.toString) --- End diff -- This means that by default hdfs, hive and hbase credential providers will be loaded unless explicitly disabled by configuration.
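The lookup order discussed in this comment (new key, then deprecated key with a warning, then a default that depends on whether the service is one of the built-in providers) can be sketched in a few lines of Python. This is only an illustration under assumptions, not the code in the PR: `is_provider_enabled` is a hypothetical helper, the configuration is modelled as a plain dict, and any value other than "true" is treated as false. Only the two key patterns and the three default service names come from the diff.

```python
import logging

# Key patterns and default services as quoted in the diff above.
DEPRECATED_KEY = "spark.yarn.security.tokens.%s.enabled"
CURRENT_KEY = "spark.yarn.security.credentials.%s.enabled"
DEFAULT_PROVIDERS = {"hdfs", "hive", "hbase"}


def is_provider_enabled(conf, service):
    """Hypothetical helper: resolve whether a provider should be loaded."""
    value = conf.get(CURRENT_KEY % service)
    if value is None:
        # Fall back to the deprecated key, warning when it is actually used.
        value = conf.get(DEPRECATED_KEY % service)
        if value is not None:
            logging.warning(
                "%s is deprecated, using %s instead",
                DEPRECATED_KEY % service,
                CURRENT_KEY % service,
            )
    if value is None:
        # Nothing configured: built-in providers default to enabled.
        return service in DEFAULT_PROVIDERS
    # Assumption of this sketch: anything other than "true" means disabled.
    return value.strip().lower() == "true"


print(is_provider_enabled({}, "hdfs"))
print(is_provider_enabled({CURRENT_KEY % "hive": "false"}, "hive"))
```

With an empty configuration the three built-in services resolve to enabled and any other service to disabled, which matches the explanation given in the comment.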
[GitHub] spark pull request #13912: [SPARK-16216][SQL] CSV data source supports custo...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13912#discussion_r71120468 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVRelation.scala --- @@ -195,18 +202,40 @@ private[sql] class CsvOutputWriter( private var records: Long = 0L private val csvWriter = new LineCsvWriter(params, dataSchema.fieldNames.toSeq) - private def rowToString(row: Seq[Any]): Seq[String] = row.map { field => -if (field != null) { - field.toString -} else { - params.nullValue + private def rowToString(row: InternalRow): Seq[String] = { +var i = 0 +val values = new Array[String](row.numFields) +while (i < row.numFields) { + if (!row.isNullAt(i)) { +values(i) = fieldsConverters(i).apply(row, i) + } else { +values(i) = params.nullValue + } + i += 1 } +values + } + + private def makeConverter(dataType: DataType): ValueConverter = dataType match { +case DateType if params.dateFormat != null => + (row: InternalRow, ordinal: Int) => + params.dateFormat.format(DateTimeUtils.toJavaDate(row.getInt(ordinal))) + +case TimestampType if params.dateFormat != null => + (row: InternalRow, ordinal: Int) => + params.dateFormat.format(DateTimeUtils.toJavaTimestamp(row.getLong(ordinal))) --- End diff -- @srowen Maybe you are looking for this case. I had to do it like this to avoid per-record type dispatch.
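The idea of avoiding per-record type dispatch, as mentioned in the comment above, is to pick one converter per column when the writer is created and then only index into that list for each row. A rough Python sketch of the pattern follows. The names `make_converter` and `make_row_writer`, the string type tags, and the `strftime` format are all assumptions made for this illustration and do not mirror the Spark classes.

```python
from datetime import date, datetime


def make_converter(data_type, date_format=None):
    """Choose the formatting function once, based on the column type."""
    if data_type in ("date", "timestamp") and date_format is not None:
        return lambda value: value.strftime(date_format)
    # Every other type falls back to its plain string form.
    return str


def make_row_writer(schema, date_format=None, null_value=""):
    # Type dispatch happens here, once per column, not once per record.
    converters = [make_converter(t, date_format) for t in schema]

    def row_to_strings(row):
        return [
            null_value if value is None else convert(value)
            for value, convert in zip(row, converters)
        ]

    return row_to_strings


write = make_row_writer(
    ["int", "date", "timestamp"], date_format="%d/%m/%Y", null_value="NULL"
)
out = write([1, date(2015, 8, 27), None])
print(out)
```

Each call to `write` does no type inspection beyond the null check, so the cost of choosing a formatter is paid once per schema rather than once per value.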
[GitHub] spark pull request #14246: [SPARK-16600][MLLib] fix some latex formula synta...
GitHub user WeichenXu123 opened a pull request: https://github.com/apache/spark/pull/14246 [SPARK-16600][MLLib] fix some latex formula syntax error ## What changes were proposed in this pull request? `\partial\x` ==> `\partial x` `har{x_i}` ==> `hat{x_i}` ## How was this patch tested? N/A You can merge this pull request into a Git repository by running: $ git pull https://github.com/WeichenXu123/spark fix_formular_err Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/14246.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #14246 commit 127a428e6a3ce242a826597c841b00651a7f5dd5 Author: WeichenXu Date: 2016-07-13T16:58:19Z fix_formular_err
[GitHub] spark issue #14222: [SPARK-16391][SQL] KeyValueGroupedDataset.reduceGroups s...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14222 **[Test build #62456 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62456/consoleFull)** for PR 14222 at commit [`7e8d8c1`](https://github.com/apache/spark/commit/7e8d8c116552642573cc89bd11fc2e82f2a0f82a).
[GitHub] spark issue #14222: [SPARK-16391][SQL] KeyValueGroupedDataset.reduceGroups s...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14222 **[Test build #62454 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62454/consoleFull)** for PR 14222 at commit [`4ba124c`](https://github.com/apache/spark/commit/4ba124cf8c6b441e37fe1943c5b8164eeb2470d1).
[GitHub] spark issue #14136: [SPARK-16282][SQL] Implement percentile SQL function.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14136 **[Test build #62455 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62455/consoleFull)** for PR 14136 at commit [`d541b46`](https://github.com/apache/spark/commit/d541b46e245dbaa200040cc9220a2095717d40ba).
[GitHub] spark pull request #14150: [SPARK-16494] [ML] Upgrade breeze version to 0.12
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/14150#discussion_r71114168 --- Diff: mllib/src/test/java/org/apache/spark/ml/feature/JavaPCASuite.java --- @@ -107,7 +107,11 @@ public VectorPair call(Tuple2 pair) { .fit(df); List result = pca.transform(df).select("pca_features", "expected").toJavaRDD().collect(); for (Row r : result) { - Assert.assertEquals(r.get(1), r.get(0)); + Vector calculatedVector = (Vector)r.get(0); --- End diff -- This needs a rebase and you might run dev/lint-java manually; I think this cast will still fail the style checker?
[GitHub] spark issue #12983: [SPARK-15213][PySpark] Unify 'range' usages
Github user zhengruifeng commented on the issue: https://github.com/apache/spark/pull/12983 @MechCoder @srowen There is no performance difference. There is only one small difference: Py2 has both 'xrange' and 'range', while Py3 only has 'range'. So unifying all cases to 'range' may be more acceptable for Py3 users.
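As a small illustration of the point about Python 3 above: in Python 3, `range` returns a lazy sequence object rather than a list, which is the role `xrange` played in Python 2, so code written with `range` alone reads naturally there. The snippet below is a minimal sketch meant to be run under Python 3; the variable names are chosen for this example only.

```python
# Under Python 3, range() builds a lazy sequence object, not a list.
numbers = range(10 ** 6)
is_lazy = not isinstance(numbers, list)

# It still supports len() and indexing without materializing the values.
length = len(numbers)
tenth = numbers[10]

# Typical loop-style usage is unchanged.
total = sum(range(5))
print(is_lazy, length, tenth, total)
```

Under Python 2 the same `range` call would build a full list, which is why that version also offers `xrange`.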
[GitHub] spark issue #14234: [MINOR][SQL][STREAMING][DOCS] Fix minor typos, punctuati...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14234 Ah OK I now see the nature of the problem in the original code blocks. Great, that's an important fix. The rest look good. I wouldn't worry about the image just for that; I don't know where the source is. LGTM
[GitHub] spark pull request #14238: [MINOR][TYPO] fix fininsh typo
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14238
[GitHub] spark issue #14238: [MINOR][TYPO] fix fininsh typo
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14238 OK. Merged to master/2.0 to match the previous changes