[GitHub] spark issue #12004: [SPARK-7481][build] [WIP] Add Hadoop 2.7+ spark-cloud mo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12004 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14861: [SPARK-17287] [PYSPARK] Add recursive kwarg to Python Sp...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14861 Can one of the admins verify this patch?
[GitHub] spark issue #14857: [SPARK-17261][PYSPARK] Using HiveContext after re-creati...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14857 **[Test build #64558 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64558/consoleFull)** for PR 14857 at commit [`986a24f`](https://github.com/apache/spark/commit/986a24fab27e258f263590a2e55cb88c0f8a662a).
[GitHub] spark pull request #14746: [SPARK-17180] [SQL] Fix View Resolution Order in ...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/14746#discussion_r76639805
--- Diff: sql/core/src/main/java/org/apache/spark/sql/ViewType.java ---
@@ -0,0 +1,39 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql;
+
+/**
+ * ViewType is used to specify the type of views.
+ */
+public enum ViewType {
--- End diff --
I thought you wanted me to use a public enum. Let me change it now. Thanks!
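For context on the visibility point being discussed, here is a hypothetical sketch (not the actual Spark change) contrasting the `public enum` in the diff with a package-private alternative, which keeps the type out of the public API surface. The constants `LOCAL` and `GLOBAL` are invented for illustration and do not appear in the PR diff.

```java
// Hypothetical sketch: omitting `public` gives the enum package-private
// (default) visibility, so it is usable within org.apache.spark.sql but
// does not become part of the public API. Constants are illustrative only.
enum ViewType {
    LOCAL,   // e.g., a session-scoped view
    GLOBAL   // e.g., a catalog-visible view
}

public class ViewTypeDemo {
    public static void main(String[] args) {
        // The enum is fully usable within its own package without `public`.
        for (ViewType t : ViewType.values()) {
            System.out.println(t);
        }
    }
}
```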
[GitHub] spark issue #14204: [SPARK-16520] [WEBUI] Link executors to corresponding wo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14204 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64575/ Test FAILed.
[GitHub] spark issue #14830: [SPARK-16992][PYSPARK] PEP8 on documentation examples
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14830 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64563/ Test FAILed.
[GitHub] spark issue #14830: [SPARK-16992][PYSPARK] PEP8 on documentation examples
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14830 **[Test build #64563 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64563/consoleFull)** for PR 14830 at commit [`14b2260`](https://github.com/apache/spark/commit/14b2260e9bb45737ed7a7580f8b7aa45caae7694). * This patch **fails Python style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #14553: [SPARK-16963] Changes to Source trait and related implem...
Github user frreiss commented on the issue: https://github.com/apache/spark/pull/14553 @rxin and @marmbrus, would it be possible to get this PR reviewed soon? I can split it into smaller chunks if that would make things easier; I just need to know.
[GitHub] spark issue #13065: [SPARK-15214][SQL] Code-generation for Generate
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13065 **[Test build #64579 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64579/consoleFull)** for PR 13065 at commit [`c41e308`](https://github.com/apache/spark/commit/c41e308a261fa0303d45a63306732af5909c373e).
[GitHub] spark issue #14204: [SPARK-16520] [WEBUI] Link executors to corresponding wo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14204 **[Test build #64575 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64575/consoleFull)** for PR 14204 at commit [`987caa3`](https://github.com/apache/spark/commit/987caa36f49889746f544e6d6d0ad1c94e643d81).
[GitHub] spark issue #13584: [SPARK-15509][ML][SparkR] R MLlib algorithms should supp...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13584 **[Test build #64578 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64578/consoleFull)** for PR 13584 at commit [`1bc150f`](https://github.com/apache/spark/commit/1bc150f8af93f0e5d35e40fd39e33176c974d8cf).
[GitHub] spark issue #14118: [SPARK-16462][SPARK-16460][SPARK-15144][SQL] Make CSV ca...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14118 **[Test build #64576 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64576/consoleFull)** for PR 14118 at commit [`d5357f9`](https://github.com/apache/spark/commit/d5357f9d784cc277d58fd896738a87a7aff7ba70).
[GitHub] spark issue #14638: [SPARK-11374][SQL] Support `skip.header.line.count` opti...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14638 **[Test build #64569 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64569/consoleFull)** for PR 14638 at commit [`1e22b68`](https://github.com/apache/spark/commit/1e22b68c370475714079a4c65250d5941c1f5998).
[GitHub] spark issue #14623: [SPARK-17044][SQL] Make test files for window functions ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14623 **[Test build #64570 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64570/consoleFull)** for PR 14623 at commit [`1229fdc`](https://github.com/apache/spark/commit/1229fdcde3a8e0b3b79a472319f1042f39d8ba6d).
[GitHub] spark issue #14426: [SPARK-16475][SQL] Broadcast Hint for SQL Queries
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14426 **[Test build #64574 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64574/consoleFull)** for PR 14426 at commit [`377b625`](https://github.com/apache/spark/commit/377b6251eec50a728aea17d306695dbec865269e).
[GitHub] spark issue #14116: [SPARK-16452][SQL] Support basic INFORMATION_SCHEMA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14116 **[Test build #64577 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64577/consoleFull)** for PR 14116 at commit [`7543069`](https://github.com/apache/spark/commit/754306970c4635624207c8082b175b4e195928b0).
[GitHub] spark issue #12004: [SPARK-7481][build] [WIP] Add Hadoop 2.7+ spark-cloud mo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12004 **[Test build #64580 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64580/consoleFull)** for PR 12004 at commit [`f39018e`](https://github.com/apache/spark/commit/f39018eee40ef463ebfdfb0f6a7ba6384b46c459).
[GitHub] spark issue #14784: [SPARK-17210][SPARKR] sparkr.zip is not distributed to e...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14784 **[Test build #64566 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64566/consoleFull)** for PR 14784 at commit [`986cddc`](https://github.com/apache/spark/commit/986cddc360d0008fcca395e53254d59b0e4b6988).
[GitHub] spark issue #14452: [SPARK-16849][SQL] Improve subquery execution by dedupli...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14452 **[Test build #64572 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64572/consoleFull)** for PR 14452 at commit [`c6d987f`](https://github.com/apache/spark/commit/c6d987f584859224a05b17a157be30389066dccb).
[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14712 **[Test build #64567 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64567/consoleFull)** for PR 14712 at commit [`3407c7f`](https://github.com/apache/spark/commit/3407c7f7aa62503e62c7c4847ad6a2568a676c38).
[GitHub] spark issue #14691: [SPARK-16407][STREAMING] Allow users to supply custom st...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14691 **[Test build #64568 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64568/consoleFull)** for PR 14691 at commit [`c7bbffc`](https://github.com/apache/spark/commit/c7bbffcdbdc1723e165eec0bb481b1f385e18ac9).
[GitHub] spark issue #14435: [SPARK-16756][SQL][WIP] Add `sql` function to LogicalPla...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14435 **[Test build #64573 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64573/consoleFull)** for PR 14435 at commit [`3a3f8ac`](https://github.com/apache/spark/commit/3a3f8acb5d27bce8d9cdce4538e62745b3bc8757).
[GitHub] spark issue #14527: [SPARK-16938][SQL] `drop/dropDuplicate` should handle th...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14527 **[Test build #64571 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64571/consoleFull)** for PR 14527 at commit [`9359601`](https://github.com/apache/spark/commit/93596018d9d6e06085fcb5df065f30946954c93b).
[GitHub] spark issue #14803: [SPARK-17153][SQL] Should read partition data when readi...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14803 **[Test build #64565 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64565/consoleFull)** for PR 14803 at commit [`0d841e2`](https://github.com/apache/spark/commit/0d841e27e647d4187be56ddd88ac4f06eb560ed7).
[GitHub] spark issue #14858: [SPARK-17219][ML] Add NaN value handling in Bucketizer
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14858 **[Test build #64557 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64557/consoleFull)** for PR 14858 at commit [`bfb5b33`](https://github.com/apache/spark/commit/bfb5b333d0a4e4a9d05a25cc0d47a5cdbd496965).
[GitHub] spark issue #14805: [MINOR][DOCS] Fix minor typos in python example code
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14805 **[Test build #64564 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64564/consoleFull)** for PR 14805 at commit [`36c99f8`](https://github.com/apache/spark/commit/36c99f83711fe912fbaa66cbeac07ae4a1bb1d2e).
[GitHub] spark issue #14859: [SPARK-17200][PROJECT INFRA][BUILD][SparkR] Automate bui...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14859 **[Test build #64556 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64556/consoleFull)** for PR 14859 at commit [`1f23b05`](https://github.com/apache/spark/commit/1f23b0596b98cf333cab303f4b5ab53940bafbca).
[GitHub] spark issue #14855: [SPARK-17284] [SQL] Remove Statistics-related Table Prop...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14855 **[Test build #64560 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64560/consoleFull)** for PR 14855 at commit [`ce8e8b8`](https://github.com/apache/spark/commit/ce8e8b89a5b61648daaa59578e2b6a99ec2f6d74).
[GitHub] spark issue #14850: [SPARK-17279][SQL] better error message for NPE during S...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14850 **[Test build #64562 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64562/consoleFull)** for PR 14850 at commit [`1bd8382`](https://github.com/apache/spark/commit/1bd83826ff2bd2c99a47fe0029cd82910d355a7e).
[GitHub] spark issue #14830: [SPARK-16992][PYSPARK] PEP8 on documentation examples
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14830 **[Test build #64563 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64563/consoleFull)** for PR 14830 at commit [`14b2260`](https://github.com/apache/spark/commit/14b2260e9bb45737ed7a7580f8b7aa45caae7694).
[GitHub] spark issue #14860: [SPARK-17264] [SQL] DataStreamWriter should document tha...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14860 **[Test build #64555 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64555/consoleFull)** for PR 14860 at commit [`b73074b`](https://github.com/apache/spark/commit/b73074bbe67ddd69caf0b65fafe36016bf805422).
[GitHub] spark issue #14862: [SPARK-17295][SQL] Create TestHiveSessionState use refle...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14862 **[Test build #64554 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64554/consoleFull)** for PR 14862 at commit [`0867d2a`](https://github.com/apache/spark/commit/0867d2ac853b7634de53de0f07665636598ca454).
[GitHub] spark issue #14864: [SPARK-15453] [SQL] FileSourceScanExec to extract `outpu...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14864 **[Test build #64552 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64552/consoleFull)** for PR 14864 at commit [`07196a8`](https://github.com/apache/spark/commit/07196a8acbf6f0a68f29f96d1eeea74f53bbeb8a).
[GitHub] spark issue #14854: [SPARK-17283][WIP][Core] Cancel job in RDD.take() as soo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14854 **[Test build #64561 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64561/consoleFull)** for PR 14854 at commit [`e9c7dfb`](https://github.com/apache/spark/commit/e9c7dfb46360a2f3fa689ca448a26b114dbc02b5).
[GitHub] spark issue #14863: [SPARK-16992][PYSPARK] use map comprehension in doc
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14863 **[Test build #64553 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64553/consoleFull)** for PR 14863 at commit [`7a2621e`](https://github.com/apache/spark/commit/7a2621ec2e1f588ae2252327ced61c41c28a9243).
[GitHub] spark issue #14856: [SPARK-17241][SparkR][MLlib] SparkR spark.glm should hav...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14856 **[Test build #64559 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64559/consoleFull)** for PR 14856 at commit [`6417049`](https://github.com/apache/spark/commit/6417049e9185434bc23c651217d73a88abe4f606).
[GitHub] spark issue #13231: [SPARK-15453] [SQL] Sort Merge Join to use bucketing met...
Github user tejasapatil commented on the issue: https://github.com/apache/spark/pull/13231 Continuing this work in a new PR: https://github.com/apache/spark/pull/14864
[GitHub] spark issue #14862: [SPARK-17295][SQL] Create TestHiveSessionState use refle...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/14862 We are trying to get rid of `HiveSessionState`, so I am not sure this change is in the direction we want. cc @cloud-fan @yhuai
[GitHub] spark issue #14691: [SPARK-16407][STREAMING] Allow users to supply custom st...
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/14691 jenkins, test this please
[GitHub] spark issue #14239: [SPARK-16593] [CORE] [WIP] Provide a pre-fetch mechanism...
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/14239 Thanks for the explanation; this makes much more sense now. I'm still a bit concerned about the memory usage of this, though, especially with the external shuffle service on the NodeManager. Were you using the external shuffle service to test this, or just the shuffle built into the executors? How much memory did you give whatever was shuffling, and how big were the blocks being fetched? Does this look at all at the size it's trying to cache vs. the size available to the shuffle handler?
[GitHub] spark issue #14864: [SPARK-15453] [SQL] FileSourceScanExec to extract `outpu...
Github user tejasapatil commented on the issue: https://github.com/apache/spark/pull/14864 ok to test
[GitHub] spark pull request #14864: [SPARK-15453] [SQL] FileSourceScanExec to extract...
GitHub user tejasapatil opened a pull request: https://github.com/apache/spark/pull/14864 [SPARK-15453] [SQL] FileSourceScanExec to extract `outputOrdering` information ## What changes were proposed in this pull request? Extract sort ordering information in `FileSourceScanExec` so that the planner can make use of it. My motivation for this change was to bring Sort Merge Join on par with Hive's Sort-Merge-Bucket join when the source tables are bucketed + sorted. Query: ``` val df = (0 until 16).map(i => (i % 8, i * 2, i.toString)).toDF("i", "j", "k").coalesce(1) df.write.bucketBy(8, "j", "k").sortBy("j", "k").saveAsTable("table8") df.write.bucketBy(8, "j", "k").sortBy("j", "k").saveAsTable("table9") context.sql("SELECT * FROM table8 a JOIN table9 b ON a.j=b.j AND a.k=b.k").explain(true) ``` Before: ``` == Physical Plan == *SortMergeJoin [j#120, k#121], [j#123, k#124], Inner :- *Sort [j#120 ASC, k#121 ASC], false, 0 : +- *Project [i#119, j#120, k#121] : +- *Filter (isnotnull(k#121) && isnotnull(j#120)) :+- *FileScan orc default.table8[i#119,j#120,k#121] Batched: false, Format: ORC, InputPaths: file:/Users/tejasp/Desktop/dev/tp-spark/spark-warehouse/table8, PartitionFilters: [], PushedFilters: [IsNotNull(k), IsNotNull(j)], ReadSchema: struct +- *Sort [j#123 ASC, k#124 ASC], false, 0 +- *Project [i#122, j#123, k#124] +- *Filter (isnotnull(k#124) && isnotnull(j#123)) +- *FileScan orc default.table9[i#122,j#123,k#124] Batched: false, Format: ORC, InputPaths: file:/Users/tejasp/Desktop/dev/tp-spark/spark-warehouse/table9, PartitionFilters: [], PushedFilters: [IsNotNull(k), IsNotNull(j)], ReadSchema: struct ``` After: (note that the `Sort` step is no longer there) ``` == Physical Plan == *SortMergeJoin [j#49, k#50], [j#52, k#53], Inner :- *Project [i#48, j#49, k#50] : +- *Filter (isnotnull(k#50) && isnotnull(j#49)) : +- *FileScan orc default.table8[i#48,j#49,k#50] Batched: false, Format: ORC, InputPaths: file:/Users/tejasp/Desktop/dev/tp-spark/spark-warehouse/table8, 
PartitionFilters: [], PushedFilters: [IsNotNull(k), IsNotNull(j)], ReadSchema: struct +- *Project [i#51, j#52, k#53] +- *Filter (isnotnull(j#52) && isnotnull(k#53)) +- *FileScan orc default.table9[i#51,j#52,k#53] Batched: false, Format: ORC, InputPaths: file:/Users/tejasp/Desktop/dev/tp-spark/spark-warehouse/table9, PartitionFilters: [], PushedFilters: [IsNotNull(j), IsNotNull(k)], ReadSchema: struct ``` ## How was this patch tested? Added a test case in `JoinSuite`. Ran all other tests in `JoinSuite` You can merge this pull request into a Git repository by running: $ git pull https://github.com/tejasapatil/spark SPARK-15453_smb_optimization Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/14864.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #14864 commit 07196a8acbf6f0a68f29f96d1eeea74f53bbeb8a Author: Tejas Patil Date: 2016-08-26T07:00:35Z [SPARK-15453] [SQL] Sort Merge Join to use bucketing metadata to optimize query plan BEFORE ``` val df = (0 until 16).map(i => (i % 8, i * 2, i.toString)).toDF("i", "j", "k").coalesce(1) hc.sql("DROP TABLE table8").collect df.write.format("org.apache.spark.sql.hive.orc.OrcFileFormat").bucketBy(8, "j", "k").sortBy("j", "k").saveAsTable("table8") hc.sql("DROP TABLE table9").collect df.write.format("org.apache.spark.sql.hive.orc.OrcFileFormat").bucketBy(8, "j", "k").sortBy("j", "k").saveAsTable("table9") hc.sql("SELECT * FROM table8 a JOIN table9 b ON a.j=b.j AND a.k=b.k").explain(true) == Parsed Logical Plan == 'Project [*] +- 'Join Inner, (('a.j = 'b.j) && ('a.k = 'b.k)) :- 'UnresolvedRelation table8, a +- 'UnresolvedRelation table9, b == Analyzed Logical Plan == i: int, j: int, k: string, i: int, j: int, k: string Project [i#119, j#120, k#121, i#122, j#123, k#124] +- Join Inner, ((j#120 = j#123) && (k#121 = k#124)) :- SubqueryAlias a : +- SubqueryAlias table8 : +- 
Relation[i#119,j#120,k#121] orc +- SubqueryAlias b +- SubqueryAlias table9 +- Relation[i#122,j#123,k#124] orc == Optimized Logical Plan == Join Inner, ((j#120 = j#123) && (k#121 = k#124)) :- Filter (isnotnull(k#121) && isnotnull(j#120)) : +- Relation[i#119,j#120,k#121] orc +- Filter (isnotnull(k#124) && isnotnull(j#123)) +- Relation[i#122,j#123,k#124] orc == Physical Plan == *SortMergeJoin [j#120, k#121], [j#123, k#124], Inner :- *Sort [j#120 ASC, k#121 ASC], false, 0 : +- *Project [i#119, j#120, k#121] : +- *Filter (isnotnull(k#121) && isno
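The optimization shown in the plans above rests on a basic property: a sort-merge join over inputs that are already sorted on the join keys can merge directly, with no extra `Sort` step. A minimal pure-Python sketch of that merge step (illustrative only, not Spark's implementation):

```python
def merge_join(left, right, key=lambda r: r[0]):
    """Merge-join two lists assumed pre-sorted on the join key."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        kl, kr = key(left[i]), key(right[j])
        if kl < kr:
            i += 1
        elif kl > kr:
            j += 1
        else:
            # emit all right-side rows sharing this key, then advance left
            j0 = j
            while j0 < len(right) and key(right[j0]) == kl:
                out.append((left[i], right[j0]))
                j0 += 1
            i += 1
    return out

a = [(1, "a"), (2, "b"), (4, "d")]
b = [(2, "x"), (2, "y"), (3, "z")]
print(merge_join(a, b))  # [((2, 'b'), (2, 'x')), ((2, 'b'), (2, 'y'))]
```

If either input were unsorted, a sort pass would be needed first; that is exactly the step the planner can skip once `FileSourceScanExec` reports the ordering of bucketed, sorted tables.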
[GitHub] spark pull request #14863: [SPARK-16992][PYSPARK] use map comprehension in d...
GitHub user Stibbons opened a pull request: https://github.com/apache/spark/pull/14863 [SPARK-16992][PYSPARK] use map comprehension in doc The code is equivalent, but a list comprehension is usually faster than `map`. You can merge this pull request into a Git repository by running: $ git pull https://github.com/Stibbons/spark map_comprehension Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/14863.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #14863 commit 7a2621ec2e1f588ae2252327ced61c41c28a9243 Author: Gaetan Semet Date: 2016-08-29T14:30:18Z use map comprehension Signed-off-by: Gaetan Semet
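For context, the equivalence the PR description claims is easy to verify (the Python term is "list comprehension"; the speed difference mainly shows up when `map` is given a lambda rather than a builtin):

```python
xs = range(5)

# the two forms the PR swaps between
via_map = list(map(lambda x: x * 2, xs))
via_comp = [x * 2 for x in xs]  # list comprehension

assert via_map == via_comp == [0, 2, 4, 6, 8]
```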
[GitHub] spark issue #13065: [SPARK-15214][SQL] Code-generation for Generate
Github user maropu commented on the issue: https://github.com/apache/spark/pull/13065 @hvanhovell yea, thx for letting me know. I'll do that.
[GitHub] spark issue #13065: [SPARK-15214][SQL] Code-generation for Generate
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/13065 @maropu I have updated the PR. Want to take a look?
[GitHub] spark pull request #14862: [SPARK-17295][SQL] Create TestHiveSessionState us...
GitHub user jiangxb1987 opened a pull request: https://github.com/apache/spark/pull/14862 [SPARK-17295][SQL] Create TestHiveSessionState use reflect logic based on the setting of CATALOG_IMPLEMENTATION ## What changes were proposed in this pull request? Currently we create a new `TestHiveSessionState` in `TestHive`, but in `SparkSession` we create `SessionState`/`HiveSessionState` using reflection based on the setting of CATALOG_IMPLEMENTATION. We should make the two consistent, so that we can test the reflection logic of `SparkSession` in `TestHive`. To achieve this, we add `test-hive` to the value set of CATALOG_IMPLEMENTATION and update the relevant references. ## How was this patch tested? Existing tests. You can merge this pull request into a Git repository by running: $ git pull https://github.com/jiangxb1987/spark testhive Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/14862.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #14862 commit 0867d2ac853b7634de53de0f07665636598ca454 Author: jiangxingbo Date: 2016-08-29T09:45:16Z create TestHiveSessionState use reflect logic based on the setting of CATALOG_IMPLEMENTATION.
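The general pattern the PR describes, instantiating a session-state class chosen by a configuration value via reflection, can be sketched in plain Python with `importlib`. The mapping and class names below are hypothetical stand-ins, not Spark's actual class paths:

```python
import importlib

# hypothetical config-value -> class-path mapping (stdlib classes used as stand-ins)
SESSION_STATE_CLASSES = {
    "in-memory": "collections.OrderedDict",
    "hive": "collections.Counter",
}

def make_session_state(catalog_impl):
    """Resolve the configured class path and instantiate it reflectively."""
    module_name, _, class_name = SESSION_STATE_CLASSES[catalog_impl].rpartition(".")
    cls = getattr(importlib.import_module(module_name), class_name)
    return cls()
```

Adding a new catalog implementation (like the PR's `test-hive`) then only means registering another entry in the mapping, instead of special-casing the construction site.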
[GitHub] spark pull request #14597: [SPARK-17017][MLLIB][ML] add a chiSquare Selector...
Github user mpjlu commented on a diff in the pull request: https://github.com/apache/spark/pull/14597#discussion_r76624379 --- Diff: python/pyspark/mllib/feature.py --- @@ -276,24 +276,64 @@ class ChiSqSelector(object): """ Creates a ChiSquared feature selector. -:param numTopFeatures: number of features that selector will select. - >>> data = [ ... LabeledPoint(0.0, SparseVector(3, {0: 8.0, 1: 7.0})), ... LabeledPoint(1.0, SparseVector(3, {1: 9.0, 2: 6.0})), ... LabeledPoint(1.0, [0.0, 9.0, 8.0]), ... LabeledPoint(2.0, [8.0, 9.0, 5.0]) ... ] ->>> model = ChiSqSelector(1).fit(sc.parallelize(data)) +>>> model = ChiSqSelector().setNumTopFeatures(1).fit(sc.parallelize(data)) +>>> model.transform(SparseVector(3, {1: 9.0, 2: 6.0})) +SparseVector(1, {0: 6.0}) +>>> model.transform(DenseVector([8.0, 9.0, 5.0])) +DenseVector([5.0]) +>>> model = ChiSqSelector().setPercentile(0.34).fit(sc.parallelize(data)) >>> model.transform(SparseVector(3, {1: 9.0, 2: 6.0})) SparseVector(1, {0: 6.0}) >>> model.transform(DenseVector([8.0, 9.0, 5.0])) DenseVector([5.0]) +>>> data = [ +... LabeledPoint(0.0, SparseVector(4, {0: 8.0, 1: 7.0})), +... LabeledPoint(1.0, SparseVector(4, {1: 9.0, 2: 6.0, 3: 4.0})), +... LabeledPoint(1.0, [0.0, 9.0, 8.0, 4.0]), +... LabeledPoint(2.0, [8.0, 9.0, 5.0, 9.0]) +... ] +>>> model = ChiSqSelector().setAlpha(0.1).fit(sc.parallelize(data)) +>>> model.transform(DenseVector([1.0,2.0,3.0,4.0])) +DenseVector([4.0]) .. versionadded:: 1.4.0 """ -def __init__(self, numTopFeatures): -self.numTopFeatures = int(numTopFeatures) +def __init__(self): +self.param = 50 --- End diff -- Using three separate variables is clearer. It also needs a selectionType variable; in the fit function, call different functions according to the selection type. If you prefer that, I will change the code. I am fine with either approach. Thanks.
[GitHub] spark pull request #14597: [SPARK-17017][MLLIB][ML] add a chiSquare Selector...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/14597#discussion_r76622682 --- Diff: python/pyspark/mllib/feature.py --- @@ -276,24 +276,64 @@ class ChiSqSelector(object): """ Creates a ChiSquared feature selector. -:param numTopFeatures: number of features that selector will select. - >>> data = [ ... LabeledPoint(0.0, SparseVector(3, {0: 8.0, 1: 7.0})), ... LabeledPoint(1.0, SparseVector(3, {1: 9.0, 2: 6.0})), ... LabeledPoint(1.0, [0.0, 9.0, 8.0]), ... LabeledPoint(2.0, [8.0, 9.0, 5.0]) ... ] ->>> model = ChiSqSelector(1).fit(sc.parallelize(data)) +>>> model = ChiSqSelector().setNumTopFeatures(1).fit(sc.parallelize(data)) +>>> model.transform(SparseVector(3, {1: 9.0, 2: 6.0})) +SparseVector(1, {0: 6.0}) +>>> model.transform(DenseVector([8.0, 9.0, 5.0])) +DenseVector([5.0]) +>>> model = ChiSqSelector().setPercentile(0.34).fit(sc.parallelize(data)) >>> model.transform(SparseVector(3, {1: 9.0, 2: 6.0})) SparseVector(1, {0: 6.0}) >>> model.transform(DenseVector([8.0, 9.0, 5.0])) DenseVector([5.0]) +>>> data = [ +... LabeledPoint(0.0, SparseVector(4, {0: 8.0, 1: 7.0})), +... LabeledPoint(1.0, SparseVector(4, {1: 9.0, 2: 6.0, 3: 4.0})), +... LabeledPoint(1.0, [0.0, 9.0, 8.0, 4.0]), +... LabeledPoint(2.0, [8.0, 9.0, 5.0, 9.0]) +... ] +>>> model = ChiSqSelector().setAlpha(0.1).fit(sc.parallelize(data)) +>>> model.transform(DenseVector([1.0,2.0,3.0,4.0])) +DenseVector([4.0]) .. versionadded:: 1.4.0 """ -def __init__(self, numTopFeatures): -self.numTopFeatures = int(numTopFeatures) +def __init__(self): +self.param = 50 --- End diff -- It seems like param is used to mean many different things. Why not different fields like in the Scala version for clarity?
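The design srowen suggests, one field per selection mode instead of a single overloaded `param`, might look like this sketch (hypothetical names, not the actual PySpark API; defaults follow the values discussed in the thread):

```python
class ChiSqSelectorSketch:
    """Illustrative selector config with a distinct field per selection mode."""

    def __init__(self):
        self.num_top_features = 50            # default number of features
        self.percentile = 0.1                 # default fraction of features
        self.alpha = 0.05                     # default p-value threshold
        self.selector_type = "numTopFeatures"  # which mode fit() should use

    def set_num_top_features(self, n):
        self.num_top_features = int(n)
        self.selector_type = "numTopFeatures"
        return self

    def set_percentile(self, p):
        self.percentile = float(p)
        self.selector_type = "percentile"
        return self

    def set_alpha(self, a):
        self.alpha = float(a)
        self.selector_type = "alpha"
        return self
```

Each setter records both the value and the mode, so a `fit` method can dispatch on `selector_type` without guessing what a shared `param` field means.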
[GitHub] spark issue #14597: [SPARK-17017][MLLIB][ML] add a chiSquare Selector based ...
Github user mpjlu commented on the issue: https://github.com/apache/spark/pull/14597 Hi @srowen, I have added the Python API and test cases for ChiSqSelector. Could you kindly review it again? Thanks.
[GitHub] spark issue #11956: [SPARK-14098][SQL] Generate Java code that gets a float/...
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/11956 @davies Could you please share your opinions on these design questions with the community? We know you are busy.
[GitHub] spark pull request #14830: [SPARK-16992][PYSPARK] PEP8 on documentation exam...
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14830#discussion_r76608499 --- Diff: examples/src/main/python/als.py --- @@ -62,10 +62,10 @@ def update(i, mat, ratings): example. Please use pyspark.ml.recommendation.ALS for more conventional use.""", file=sys.stderr) -spark = SparkSession\ -.builder\ -.appName("PythonALS")\ -.getOrCreate() +spark = (SparkSession --- End diff -- I have not changed these initialization lines, since they mostly do not appear in the documentation.
[GitHub] spark issue #14830: [SPARK-16992][PYSPARK] autopep8 on documentation example...
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14830 Cool, I wasn't sure about that. No problem, I can even split it into several PRs.
[GitHub] spark issue #14830: [SPARK-16992][PYSPARK] autopep8 on documentation example...
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/14830 For what it's worth, pep8 says: > The preferred way of wrapping long lines is by using Python's implied line continuation inside parentheses, brackets and braces. Long lines can be broken over multiple lines by wrapping expressions in parentheses. These should be used in preference to using a backslash for line continuation. So this sounds in keeping with the general pep8ification of the code - but I am a little concerned about just how many files this touches now that it isn't just an autogenerated change. I'll try to set aside some time this week to review it (I'm currently ~13 hours off my regular timezone, so my review times may be a little erratic).
[GitHub] spark issue #14830: [SPARK-16992][PYSPARK] autopep8 on documentation example...
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14830 Here is a new proposal. I've taken your remarks into account - hope all the $example on$/$example off$ markers are OK - and added some minor rework of the multiline syntax (I find using \ weird and inelegant; using parentheses "()" makes it more readable, IMHO). Tell me what you think.
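The two continuation styles being debated, side by side (a generic expression stands in for the SparkSession builder chain in the examples):

```python
# Backslash continuation: works, but fragile - a trailing space after the
# backslash is a syntax error, and the backslash is easy to miss in review.
total = 1 + 2 + \
    3 + 4

# Implied continuation inside parentheses: PEP 8's preferred form.
total = (1 + 2 +
         3 + 4)
```

Both assign the same value; the parenthesized form is what the PR converts the documentation examples to.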
[GitHub] spark pull request #14830: [SPARK-16992][PYSPARK] autopep8 on documentation ...
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/14830#discussion_r76599413 --- Diff: examples/src/main/python/ml/aft_survival_regression.py --- @@ -17,9 +17,9 @@ from __future__ import print_function +from pyspark.ml.linalg import Vectors # $example on$ from pyspark.ml.regression import AFTSurvivalRegression -from pyspark.ml.linalg import Vectors --- End diff -- In that case, move the `# $example on$` comment up above the `from pyspark.ml.linalg import Vectors`
[GitHub] spark pull request #14830: [SPARK-16992][PYSPARK] autopep8 on documentation ...
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14830#discussion_r76598828 --- Diff: examples/src/main/python/ml/aft_survival_regression.py --- @@ -17,9 +17,9 @@ from __future__ import print_function +from pyspark.ml.linalg import Vectors # $example on$ from pyspark.ml.regression import AFTSurvivalRegression -from pyspark.ml.linalg import Vectors --- End diff -- I actually prefer this line to be in the doc
[GitHub] spark pull request #14746: [SPARK-17180] [SQL] Fix View Resolution Order in ...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/14746#discussion_r76597119 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala --- @@ -105,7 +96,14 @@ case class CreateViewCommand( } val sessionState = sparkSession.sessionState -if (isTemporary) { +// 1) CREATE VIEW: create a temp view when users explicitly specify the keyword TEMPORARY; +// otherwise, create a permanent view no matter whether the temporary view +// with the same name exists or not. +// 2) ALTER VIEW: alter the temporary view if the temp view exists; otherwise, try to alter +//the permanent view. Here, it follows the same resolution like DROP VIEW, +//since users are unable to specify the keyword TEMPORARY. +if (viewType == ViewType.Temporary || +(viewType != ViewType.Permanent && sessionState.catalog.isTemporaryTable(name))) { --- End diff -- This is not so readable, how about we use 3 branches for temp view and permanent view and any view individually? Also, we don't need to mention CREATE VIEW or ALTER VIEW here, the semantic is clearly defined by the SaveMode and ViewType, we just need to document how these codes match the semantic.
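The three-branch structure cloud-fan suggests can be sketched in pure Python (the real code is Scala; names here are illustrative only):

```python
def resolve_view_target(view_type, temp_view_exists):
    """Decide whether a view command targets a temp or permanent view.

    view_type: "temporary" (TEMPORARY keyword given), "permanent"
    (explicitly permanent), or "any" (ALTER VIEW, where TEMPORARY
    cannot be specified).
    """
    if view_type == "temporary":
        return "temp"
    if view_type == "permanent":
        return "permanent"
    # "any": prefer the temp view if one with this name exists,
    # following the same resolution order as DROP VIEW
    return "temp" if temp_view_exists else "permanent"
```

Each semantic case gets its own branch, so the condition no longer has to be decoded from a compound boolean expression.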
[GitHub] spark issue #14851: [SPARK-17281][ML][MLLib] Add treeAggregateDepth paramete...
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/14851 cc @jkbradley thanks!
[GitHub] spark pull request #14746: [SPARK-17180] [SQL] Fix View Resolution Order in ...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/14746#discussion_r76596767 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala --- @@ -69,23 +66,17 @@ case class CreateViewCommand( override def output: Seq[Attribute] = Seq.empty[Attribute] - if (!isTemporary) { --- End diff -- why remove this check?
[GitHub] spark pull request #14746: [SPARK-17180] [SQL] Fix View Resolution Order in ...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/14746#discussion_r76596601 --- Diff: sql/core/src/main/java/org/apache/spark/sql/ViewType.java --- @@ -0,0 +1,39 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.spark.sql; + +/** + * ViewType is used to specify the type of views. + */ +public enum ViewType { --- End diff -- This doesn't need to be public to end users, we can put it in `view.scala` and use `sealed trait` to implement it.
[GitHub] spark pull request #14833: fixed a typo
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14833
[GitHub] spark issue #14833: fixed a typo
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14833 Merged to master
[GitHub] spark issue #14833: fixed a typo
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14833 **[Test build #3235 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3235/consoleFull)** for PR 14833 at commit [`a93ce34`](https://github.com/apache/spark/commit/a93ce34873b4fe55675feaee26f32251468dceeb). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #14830: [SPARK-16992][PYSPARK] autopep8 on documentation ...
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14830#discussion_r76595360 --- Diff: examples/src/main/python/ml/binarizer_example.py --- @@ -17,9 +17,10 @@ from __future__ import print_function -from pyspark.sql import SparkSession # $example on$ from pyspark.ml.feature import Binarizer +from pyspark.sql import SparkSession --- End diff -- yes I see, makes perfect sense!
[GitHub] spark pull request #14731: [SPARK-17159] [streaming]: optimise check for new...
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/14731#discussion_r76593850 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala --- @@ -244,6 +244,31 @@ class SparkHadoopUtil extends Logging { } /** + * List directories/files matching the path and return the `FileStatus` results. + * If the pattern is not a regexp then a simple `getFileStatus(pattern)` + * is called to get the status of that path. + * If the path/pattern does not match anything in the filesystem, + * an empty sequence is returned. + * @param pattern pattern + * @return a possibly empty array of FileStatus entries + */ + def globToFileStatus(pattern: Path): Array[FileStatus] = { --- End diff -- Essentially, if anything which might be a wildcard is hit, it gets handed off to the globber for the full interpretation. The same goes for ^ and ], which are only part of a pattern within the context of an opening [. It's only those strings which can be verified to be regexp-free in a simple context-free string scan that say "absolutely no patterns here". Regarding the bigger change: most of it is isolation of the sensitive code *and the tests to verify behaviour*.
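To make the scan concrete, here is a small illustrative Python sketch (not Spark's actual Scala implementation; the function name and the exact character set are assumptions for illustration) of the conservative check described above: hand the path off to the globber as soon as any character that might belong to a glob pattern appears, and take the fast `getFileStatus()` path only for strings provably free of pattern characters.

```python
# Illustrative sketch of the conservative "context-free string scan"
# described above. Any character that *might* start or belong to a glob
# pattern causes a hand-off to the globber; only strings containing none
# of these characters take the fast single-file lookup path.
GLOB_CHARS = set("*?[]^{}\\")

def might_be_glob(path: str) -> bool:
    """Return True if `path` could contain a glob pattern."""
    return any(c in GLOB_CHARS for c in path)
```

Note that `^` and `]` are treated as potential pattern characters even though they only matter inside an opening `[`; the scan deliberately errs on the side of handing off to the globber.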
[GitHub] spark pull request #14731: [SPARK-17159] [streaming]: optimise check for new...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/14731#discussion_r76594233 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala --- @@ -244,6 +244,31 @@ class SparkHadoopUtil extends Logging { } /** + * List directories/files matching the path and return the `FileStatus` results. + * If the pattern is not a regexp then a simple `getFileStatus(pattern)` + * is called to get the status of that path. + * If the path/pattern does not match anything in the filesystem, + * an empty sequence is returned. + * @param pattern pattern + * @return a possibly empty array of FileStatus entries + */ + def globToFileStatus(pattern: Path): Array[FileStatus] = { --- End diff -- Yeah, but then that's wrong if, for example, my path actually has a ? or ^ or ] in it. It doesn't seem essential, and seems even problematic, to add this behavior change to an otherwise clear fix.
[GitHub] spark issue #14833: fixed a typo
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14833 **[Test build #3235 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3235/consoleFull)** for PR 14833 at commit [`a93ce34`](https://github.com/apache/spark/commit/a93ce34873b4fe55675feaee26f32251468dceeb).
[GitHub] spark issue #14118: [SPARK-16462][SPARK-16460][SPARK-15144][SQL] Make CSV ca...
Github user lw-lin commented on the issue: https://github.com/apache/spark/pull/14118 Jenkins retest this please
[GitHub] spark issue #14118: [SPARK-16462][SPARK-16460][SPARK-15144][SQL] Make CSV ca...
Github user lw-lin commented on the issue: https://github.com/apache/spark/pull/14118 > What if I am writing explicitly an empty string out? Does it become just 1,,2? Yes. It becomes `1,,2` in 2.0, and the same `1,,2` with this patch -- no behavior changes. > Can you also clarify whether this is behavior changing, or something else? This patch behaves differently from 2.0 when reading `1,,2` back: given `nullValue` its default value (the empty string ""), `1,,2` would be read back as `1,,2` in 2.0, but would be read back as `1,[null],2` with this patch. @rxin ~
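As a pure-Python illustration of the read-back semantics discussed above (a hypothetical sketch, not Spark's actual CSV parser): with `nullValue` left at its default (the empty string), an empty field decodes to null under the patch, instead of staying an empty string as in 2.0.

```python
import csv
import io

def read_row(line: str, null_value: str = ""):
    """Sketch of the patched read-back semantics: any field equal to
    nullValue (default: empty string) is decoded as None (null)."""
    fields = next(csv.reader(io.StringIO(line)))
    return [None if f == null_value else f for f in fields]

# "1,,2" is read back as ["1", None, "2"] under these semantics
```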
[GitHub] spark pull request #14861: [SPARK-17287] [PySpark] Add `recursive` kwarg to ...
GitHub user jpiper opened a pull request: https://github.com/apache/spark/pull/14861 [SPARK-17287] [PySpark] Add `recursive` kwarg to Java Python `SparkContext.addFile` ## What changes were proposed in this pull request? Add the ability to add entire directories using the PySpark interface `SparkContext.addFile(dir, recursive=True)` ## How was this patch tested? I've added a test file in nested folders in `python/test_support`. I use `addFile` to distribute this folder, and then read the file back using the directory structure. You can merge this pull request into a Git repository by running: $ git pull https://github.com/jpiper/spark jpiper/pyspark_addfiles Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/14861.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #14861 commit cabcca30252a189def2b0357b3951fac0870a3db Author: Jason Piper Date: 2016-08-29T09:53:51Z Add `recursive` kwarg to Java Python `SparkContext.addFile`
[GitHub] spark pull request #14830: [SPARK-16992][PYSPARK] autopep8 on documentation ...
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/14830#discussion_r76582404 --- Diff: examples/src/main/python/ml/binarizer_example.py --- @@ -17,9 +17,10 @@ from __future__ import print_function -from pyspark.sql import SparkSession # $example on$ from pyspark.ml.feature import Binarizer +from pyspark.sql import SparkSession --- End diff -- Some of the example files are used in generating the website documentation, and the "example on" and "example off" tags are used to determine which parts get pulled into the website (in this case this is done since we don't want to repeat the same boilerplate imports for each example -- rather, we show only the ones specific to that example). You can take a look at `./docs/ml-features.md`, which includes this file, to see how it's used in markdown, and the generated website documentation at http://spark.apache.org/docs/latest/ml-features.html#binarizer . The instructions for building the docs locally are located at `./docs/README.md` -- let me know if you need any help with that; the documentation build is sometimes a bit overlooked, since many of the developers don't build it manually often.
[GitHub] spark pull request #14567: [SPARK-16992][PYSPARK] Python Pep8 formatting and...
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14567#discussion_r76582188 --- Diff: python/pep8rc --- @@ -0,0 +1,21 @@ +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +[pep8] --- End diff -- I don't know if they can be merged. Will try it
[GitHub] spark pull request #14860: [SPARK-17264] [SQL] DataStreamWriter should docum...
GitHub user srowen opened a pull request: https://github.com/apache/spark/pull/14860 [SPARK-17264] [SQL] DataStreamWriter should document that it only supports Parquet for now ## What changes were proposed in this pull request? Clarify that only parquet files are supported by DataStreamWriter now ## How was this patch tested? (Doc build -- no functional changes to test) You can merge this pull request into a Git repository by running: $ git pull https://github.com/srowen/spark SPARK-17264 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/14860.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #14860 commit b73074bbe67ddd69caf0b65fafe36016bf805422 Author: Sean Owen Date: 2016-08-29T09:51:30Z Clarify that only parquet files are supported by DataStreamWriter now
[GitHub] spark pull request #14536: Merge pull request #1 from apache/master
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14536
[GitHub] spark pull request #14449: [SPARK-16843][MLLIB] add the percentage ChiSquare...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14449
[GitHub] spark pull request #10572: SPARK-12619 Combine small files in a hadoop direc...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/10572
[GitHub] spark pull request #10995: [SPARK-13120] [test-maven] Shade protobuf-java
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/10995
[GitHub] spark pull request #12695: [SPARK-14914] Normalize Paths/URIs for windows.
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/12695
[GitHub] spark pull request #13658: [SPARK-15937] [yarn] Improving the logic to wait ...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/13658
[GitHub] spark pull request #14505: Branch 2.0
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14505
[GitHub] spark pull request #14810: Branch 1.6
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14810
[GitHub] spark pull request #12694: [SPARK-14914] Fix Command too long for windows. E...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/12694
[GitHub] spark pull request #12753: [SPARK-3767] [CORE] Support wildcard in Spark pro...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/12753
[GitHub] spark issue #14788: [SPARK-17174][SQL] Add the support for TimestampType for...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/14788 For `date_add`, `date_sub`, `add_month`, I think we should support both DateType and TimestampType, and the return type should depend on the input type. For `last_day`, `first_day`, we should support both DateType and TimestampType, but the return type should always be DateType. For `date_trunc`, we should support both DateType and TimestampType, but the return type should always be TimestampType. cc @rxin
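The proposed typing rules can be sketched with Python's `datetime` module (hypothetical helper names for illustration; this is not Spark's SQL API, and in Python `date` stands in for DateType and `datetime` for TimestampType):

```python
from datetime import date, datetime, timedelta

def date_add(d, days):
    # Return type follows the input type: date in -> date out,
    # datetime in -> datetime out.
    return d + timedelta(days=days)

def last_day(d):
    # Always returns a date, regardless of date/datetime input.
    first_of_next = (d.replace(day=28) + timedelta(days=4)).replace(day=1)
    result = first_of_next - timedelta(days=1)
    return result.date() if isinstance(result, datetime) else result

def date_trunc(d):
    # Always returns a datetime (truncated to the day), regardless of
    # date/datetime input.
    return datetime(d.year, d.month, d.day)
```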
[GitHub] spark pull request #14849: [BUILD] Closes some stale PRs.
Github user srowen closed the pull request at: https://github.com/apache/spark/pull/14849
[GitHub] spark pull request #14567: [SPARK-16992][PYSPARK] Python Pep8 formatting and...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/14567#discussion_r76580634 --- Diff: python/pep8rc --- @@ -0,0 +1,21 @@ +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +[pep8] --- End diff -- Maybe I'm missing something, but should they be in the same file, or are they separate?
[GitHub] spark issue #14859: [SPARK-17200][PROJECT INFRA][BUILD][SparkR] Automate bui...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14859 Good point, I suppose there is a weak promise there that it runs on Windows. Could anyone else who knows Windows weigh in? I assume @dongjoon-hyun is on board.
[GitHub] spark pull request #14698: [SPARK-17061][SPARK-17093][SQL] `MapObjects` shou...
Github user lw-lin commented on a diff in the pull request: https://github.com/apache/spark/pull/14698#discussion_r76579917 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala --- @@ -136,7 +136,7 @@ trait ExpressionEvalHelper extends GeneratorDrivenPropertyChecks { // some expression is reusing variable names across different instances. // This behavior is tested in ExpressionEvalHelperSuite. val plan = generateProject( - GenerateUnsafeProjection.generate( + UnsafeProjection.create( --- End diff -- @viirya maybe test against the following? - + this patch's changes to ObjectExpressionsSuite.scala - + this patch's changes to ExpressionEvalHelper.scala (this is also critical) - - this patch's changes to objects.scala
[GitHub] spark issue #14859: [SPARK-17200][PROJECT INFRA][BUILD][SparkR] Automate bui...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14859 Ah, I thought Windows was already officially supported, judging from this documentation: https://github.com/apache/spark/blob/master/docs/index.md#downloading. BTW, I do understand your concerns, but I believe this will make it easy to review Windows-specific tests. I mean, at least, we can identify Windows-specific problems easily, and as you already know, it is currently hard to review PRs for Windows-specific problems. I wouldn't mind if this should be closed because, at least, I proposed an automated build on Windows here, so reviewers can use it after merging this manually anyway. My personal opinion, though, is to try to use this, as it does not affect the code-base or other builds.
[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/14712 Looks like Jenkins hasn't been working for a while.
[GitHub] spark pull request #14567: [SPARK-16992][PYSPARK] Python Pep8 formatting and...
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14567#discussion_r76577358 --- Diff: python/pep8rc --- @@ -0,0 +1,21 @@ +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +[pep8] --- End diff -- Actually, tox.ini looks more similar to this pep8rc than isort.cfg does, but github doesn't show it that way.
[GitHub] spark pull request #14567: [SPARK-16992][PYSPARK] Python Pep8 formatting and...
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14567#discussion_r76577137 --- Diff: dev/isort.cfg --- @@ -1,9 +1,9 @@ # Licensed to the Apache Software Foundation (ASF) under one or more -# contributor license agreements. See the NOTICE file distributed with +# contributor license agreements. See the NOTICE file distributed with --- End diff -- No, this is not pep8-related! I deleted dev/tox.ini and created a new file, dev/isort.cfg, from scratch. But the fact is, git "finds" file renames based on content similarities, so it sees this as a file rename. Actually, I deleted tox.ini in a different commit than the one that creates isort.cfg (see https://github.com/apache/spark/pull/14567/commits), so I guess this "visualisation" of a file rename comes from github.
[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/14712 add to whitelist
[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/14712 retest this please
[GitHub] spark issue #14859: [SPARK-17200][PROJECT INFRA][BUILD][SparkR] Automate bui...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14859 Hm, we also had Travis config that isn't used now, to try to add Java style checking. I can see the value in adding Windows testing, but here we have a third CI tool involved. I'm concerned that I for example wouldn't know how to maintain it. I suppose we also need to decide if Windows is even supported? I don't think it is supported for development, certainly. For deployment -- best effort is my understanding, but may not work. If this relies on a bunch of setup to run (including needing a sorta unofficial copy of Hadoop's winutils), then does testing it this way say much about how it works on Windows?
[GitHub] spark pull request #14698: [SPARK-17061][SPARK-17093][SQL] `MapObjects` shou...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/14698#discussion_r76576622 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala --- @@ -136,7 +136,7 @@ trait ExpressionEvalHelper extends GeneratorDrivenPropertyChecks { // some expression is reusing variable names across different instances. // This behavior is tested in ExpressionEvalHelperSuite. val plan = generateProject( - GenerateUnsafeProjection.generate( + UnsafeProjection.create( --- End diff -- But it looks like this change isn't reflected in the test? Without this change, the added test passes too.
[GitHub] spark issue #14859: [SPARK-17200][PROJECT INFRA][BUILD][SparkR] Automate bui...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14859 cc @rxin @srowen (for build) @JoshRosen (for project infra) @dongjoon-hyun (who suggested AppVeyor CI) @steveloughran (who is the author of winutils) @felixcheung and @shivaram (for SparkR)
[GitHub] spark issue #14833: fixed a typo
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14833 Jenkins test this please
[GitHub] spark pull request #14859: [SPARK-17200][PROJECT INFRA][BUILD][SparkR] Autom...
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/14859

[SPARK-17200][PROJECT INFRA][BUILD][SparkR] Automate building and testing on Windows (currently SparkR only)

## What changes were proposed in this pull request?

This PR adds build automation on Windows with the [AppVeyor](https://www.appveyor.com/) CI tool. Currently, it only runs the tests for SparkR, as we have been having some issues with testing Windows-specific PRs (e.g. https://github.com/apache/spark/pull/14743 and https://github.com/apache/spark/pull/13165) and a hard time verifying them. One concern is that this build depends on [steveloughran/winutils](https://github.com/steveloughran/winutils) (whose author is a Hadoop PMC member) for a pre-built Hadoop bin package.

## How was this patch tested?

Manually: https://ci.appveyor.com/project/HyukjinKwon/spark/build/8-SPARK-17200-build

Some tests are already failing, as found in https://github.com/apache/spark/pull/14743#issuecomment-241405287. They are currently as below:

```
Skipped ------------------------------------------------------------------------
1. create DataFrame from RDD (@test_sparkSQL.R#200) - Hive is not build with SparkSQL, skipped
2. test HiveContext (@test_sparkSQL.R#1041) - Hive is not build with SparkSQL, skipped
3. read/write ORC files (@test_sparkSQL.R#1748) - Hive is not build with SparkSQL, skipped
4. enableHiveSupport on SparkSession (@test_sparkSQL.R#2480) - Hive is not build with SparkSQL, skipped

Warnings -----------------------------------------------------------------------
1. infer types and check types (@test_sparkSQL.R#109) - unable to identify current timezone 'C':
please set environment variable 'TZ'

Failed -------------------------------------------------------------------------
1. Error: union on two RDDs (@test_binary_function.R#38) -----------------------
1: textFile(sc, fileName) at C:/projects/spark/R/lib/SparkR/tests/testthat/test_binary_function.R:38
2: callJMethod(sc, "textFile", path, getMinPartitions(sc, minPartitions))
3: invokeJava(isStatic = FALSE, objId$id, methodName, ...)
4: stop(readString(conn))

2. Error: zipPartitions() on RDDs (@test_binary_function.R#84) -----------------
1: textFile(sc, fileName, 1) at C:/projects/spark/R/lib/SparkR/tests/testthat/test_binary_function.R:84
2: callJMethod(sc, "textFile", path, getMinPartitions(sc, minPartitions))
3: invokeJava(isStatic = FALSE, objId$id, methodName, ...)
4: stop(readString(conn))

3. Error: saveAsObjectFile()/objectFile() following textFile() works (@test_binaryFile.R#31)
1: textFile(sc, fileName1, 1) at C:/projects/spark/R/lib/SparkR/tests/testthat/test_binaryFile.R:31
2: callJMethod(sc, "textFile", path, getMinPartitions(sc, minPartitions))
3: invokeJava(isStatic = FALSE, objId$id, methodName, ...)
4: stop(readString(conn))

4. Error: saveAsObjectFile()/objectFile() works on a parallelized list (@test_binaryFile.R#46)
1: objectFile(sc, fileName) at C:/projects/spark/R/lib/SparkR/tests/testthat/test_binaryFile.R:46
2: callJMethod(sc, "objectFile", path, getMinPartitions(sc, minPartitions))
3: invokeJava(isStatic = FALSE, objId$id, methodName, ...)
4: stop(readString(conn))

5. Error: saveAsObjectFile()/objectFile() following RDD transformations works (@test_binaryFile.R#57)
1: textFile(sc, fileName1) at C:/projects/spark/R/lib/SparkR/tests/testthat/test_binaryFile.R:57
2: callJMethod(sc, "textFile", path, getMinPartitions(sc, minPartitions))
3: invokeJava(isStatic = FALSE, objId$id, methodName, ...)
4: stop(readString(conn))

6. Error: saveAsObjectFile()/objectFile() works with multiple paths (@test_binaryFile.R#85)
1: objectFile(sc, c(fileName1, fileName2)) at C:/projects/spark/R/lib/SparkR/tests/testthat/test_binaryFile.R:85
2: callJMethod(sc, "objectFile", path, getMinPartitions(sc, minPartitions))
3: invokeJava(isStatic = FALSE, objId$id, methodName, ...)
4: stop(readString(conn))

7. Error: spark.glm save/load (@test_mllib.R#162) ------------------------------
1: read.ml(modelPath) at C:/projects/spark/R/lib/SparkR/tests/testthat/test_mllib.R:162
2: callJStatic("org.apache.spark.ml.r.RWrappers", "load", path)
3: invokeJava(isStatic = TRUE, className, methodName, ...)
4: stop(readString(conn))

8. Error: glm save/load (@test_mllib.R#292) ------------------------------------
1: read.ml(modelPath) at C:/projects/spark/R/lib/SparkR/tests/testthat/test_mllib.R:292
2: callJStatic("org.apache.spark.ml.r.RWrappers", "load", path)
3: invokeJava(isStatic = TRUE, className, methodName, ...)
4: stop(readString(conn))

9. Error: spark.kmeans (@test_mllib.R#340) -------------------------------------
1: read.ml(modelPath) at C:/projects/spark/R/lib/S
```
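To make the proposal above concrete, here is a minimal sketch of what an AppVeyor configuration for this kind of build might look like. All field values are hypothetical and illustrative; the actual `appveyor.yml` in the PR may differ in image, commands, and layout.

```yaml
# Hypothetical appveyor.yml sketch (not the PR's actual config).
version: "{build}-{branch}"

environment:
  # Assumed location of a winutils-based Hadoop layout; bin\winutils.exe
  # must exist under this directory for Spark to run on Windows.
  HADOOP_HOME: C:\hadoop

build_script:
  # Build Spark with the SparkR profile, skipping JVM tests.
  - cmd: mvn -DskipTests -Psparkr package

test_script:
  # Run the SparkR test suite against the freshly built distribution.
  - cmd: .\R\run-tests.bat
```

The key design point is that the CI image only needs Java, Maven, R, and a winutils-equipped `HADOOP_HOME`; everything else is built from the checked-out source on each run.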
[GitHub] spark issue #14836: [MINOR][MLlib][SQL] Clean up unused variables and unused...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14836 I think the other changes are trivial but not wrong. I'd generally not bother with these bitty changes. It's not that they're wrong but that it takes me some time to go think through whether they're valid, and it's probably not worth our time collectively.