[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18334 thanks, merging to master! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18334 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78972/ Test PASSed.
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18334 Merged build finished. Test PASSed.
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18334 **[Test build #78972 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78972/testReport)** for PR 18334 at commit [`db8a640`](https://github.com/apache/spark/commit/db8a640884b23507f14cecee4fb3661b44874cef). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18334 **[Test build #78972 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78972/testReport)** for PR 18334 at commit [`db8a640`](https://github.com/apache/spark/commit/db8a640884b23507f14cecee4fb3661b44874cef).
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user wzhfy commented on the issue: https://github.com/apache/spark/pull/18334 @rxin Currently we are re-calculating the stats. If we want to support incremental stats update, we may need to maintain some specific data structures. e.g. for ndv in column stats, store data structure for `HyperLogLogPlusPlus`. These structures are stored per column and table, which could cause considerable memory cost if stored in memory (and nonpersistent). If we want to persist them, there's also a problem given that currently metastore doesn't provide such APIs.
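Editorial illustration of the point above about needing a per-column sketch for incremental ndv: a final distinct count cannot be combined with new data, but the registers of a HyperLogLog-style sketch can. The class below is a simplified plain HyperLogLog written for this note; it is not Spark's `HyperLogLogPlusPlus`, and the name `SimpleHLL`, the precision `p=12`, and the SHA-1 hashing are all assumptions made for the sketch.

```python
import hashlib
import math


class SimpleHLL:
    """Simplified HyperLogLog (hypothetical helper, not Spark's HLL++)."""

    def __init__(self, p=12):
        self.p = p                      # number of index bits (assumed value)
        self.m = 1 << p                 # number of registers
        self.registers = [0] * self.m   # this array is what would have to be stored

    def add(self, value):
        # Take a 64-bit hash of the value.
        digest = hashlib.sha1(str(value).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - self.p)                    # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)       # remaining 64 - p bits
        # Rank: 1-based position of the leftmost 1 bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def merge(self, other):
        # Incremental update: register-wise max of the old and new sketches.
        merged = SimpleHLL(self.p)
        merged.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]
        return merged

    def estimate(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction (linear counting).
            return self.m * math.log(self.m / zeros)
        return raw


# Sketch of the existing table data, and a sketch of newly inserted rows.
existing = SimpleHLL()
for i in range(6000):
    existing.add(i)

inserted = SimpleHLL()
for i in range(4000, 10000):
    inserted.add(i)

# Merging the two register arrays gives the sketch of the combined data
# without rescanning the existing rows.
updated = existing.merge(inserted)
print(round(updated.estimate()))
```

The merge only works if the register array for each column is kept somewhere between writes, which is the memory and persistence cost the comment describes.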
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/18334 Can the stats be updated incrementally?
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18334 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78918/ Test PASSed.
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18334 Merged build finished. Test PASSed.
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18334 **[Test build #78918 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78918/testReport)** for PR 18334 at commit [`9142834`](https://github.com/apache/spark/commit/9142834b5ceb71798aa8b6919f86f619e78deb01). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18334 **[Test build #78918 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78918/testReport)** for PR 18334 at commit [`9142834`](https://github.com/apache/spark/commit/9142834b5ceb71798aa8b6919f86f619e78deb01).
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18334 Merged build finished. Test FAILed.
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18334 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78903/ Test FAILed.
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18334 **[Test build #78903 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78903/testReport)** for PR 18334 at commit [`e53ab08`](https://github.com/apache/spark/commit/e53ab08d3a9e80263396301ea6359cc1b49dc94f). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18334 **[Test build #78903 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78903/testReport)** for PR 18334 at commit [`e53ab08`](https://github.com/apache/spark/commit/e53ab08d3a9e80263396301ea6359cc1b49dc94f).
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user wzhfy commented on the issue: https://github.com/apache/spark/pull/18334 PR for invalidating stats is submitted: [#18449](https://github.com/apache/spark/pull/18449)
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user wzhfy commented on the issue: https://github.com/apache/spark/pull/18334 @cloud-fan OK. I'll create another ticket for invalidating stats.
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18334 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78739/ Test FAILed.
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18334 Merged build finished. Test FAILed.
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18334 **[Test build #78739 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78739/testReport)** for PR 18334 at commit [`3392663`](https://github.com/apache/spark/commit/339266332ffbd937be2a4a58cffdbf3857b0db76). * This patch **fails due to an unknown error code, -10**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18334 @wzhfy Let's create a new JIRA ticket and link it in this PR, as this PR does 2 things: 1. invalidate stats after data changing 2. auto update stats if the config is on
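Editorial illustration of the two behaviours listed in the comment above, as a minimal sketch rather than the PR's actual code. `TableEntry`, `on_data_change`, the `auto_update` flag, and the `compute_size` callback are hypothetical names invented for this note; the real change lives in Spark's catalog and command code.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TableEntry:
    """Hypothetical stand-in for a catalog table with optional size stats."""
    name: str
    size_in_bytes: Optional[int] = None


def on_data_change(table: TableEntry,
                   auto_update: bool,
                   compute_size: Callable[[str], int]) -> TableEntry:
    """Apply one of the two policies after the table's data has changed."""
    if auto_update:
        # 2. Config is on: recompute the stats right away.
        table.size_in_bytes = compute_size(table.name)
    else:
        # 1. Config is off: drop the now-stale stats so they are not used.
        table.size_in_bytes = None
    return table


table = TableEntry("events", size_in_bytes=100)

# With the flag off, the old size is invalidated.
on_data_change(table, auto_update=False, compute_size=lambda name: 250)
print(table.size_in_bytes)  # None

# With the flag on, the size is recomputed from the (assumed) callback.
on_data_change(table, auto_update=True, compute_size=lambda name: 250)
print(table.size_in_bytes)  # 250
```

Separating the two branches mirrors the request to track invalidation and auto-update under separate tickets.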
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18334 **[Test build #78739 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78739/testReport)** for PR 18334 at commit [`3392663`](https://github.com/apache/spark/commit/339266332ffbd937be2a4a58cffdbf3857b0db76).
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18334 Will do a review tonight. Sorry for the delay
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18334 LGTM except one minor comment
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18334 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78629/ Test PASSed.
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18334 Merged build finished. Test PASSed.
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18334 **[Test build #78629 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78629/testReport)** for PR 18334 at commit [`dd29281`](https://github.com/apache/spark/commit/dd29281e7adb7a185d1e642d014d6f14f3537dd0). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18334 **[Test build #78629 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78629/testReport)** for PR 18334 at commit [`dd29281`](https://github.com/apache/spark/commit/dd29281e7adb7a185d1e642d014d6f14f3537dd0).
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18334 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78602/ Test FAILed.
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18334 Merged build finished. Test FAILed.
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18334 **[Test build #78602 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78602/testReport)** for PR 18334 at commit [`5a43594`](https://github.com/apache/spark/commit/5a43594fb8a2fb2885c4d268140f28827a65ff5a). * This patch **fails due to an unknown error code, -10**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18334 **[Test build #78602 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78602/testReport)** for PR 18334 at commit [`5a43594`](https://github.com/apache/spark/commit/5a43594fb8a2fb2885c4d268140f28827a65ff5a).
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user wzhfy commented on the issue: https://github.com/apache/spark/pull/18334 retest this please
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18334 Merged build finished. Test FAILed.
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18334 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78592/ Test FAILed.
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18334 **[Test build #78592 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78592/testReport)** for PR 18334 at commit [`5a43594`](https://github.com/apache/spark/commit/5a43594fb8a2fb2885c4d268140f28827a65ff5a). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18334 **[Test build #78592 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78592/testReport)** for PR 18334 at commit [`5a43594`](https://github.com/apache/spark/commit/5a43594fb8a2fb2885c4d268140f28827a65ff5a).
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18334 LGTM, let's resolve the conflict
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18334 Merged build finished. Test PASSed.
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18334 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78389/ Test PASSed.
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18334 **[Test build #78389 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78389/testReport)** for PR 18334 at commit [`625603e`](https://github.com/apache/spark/commit/625603e610ec38d9c48cfaf8691ffaeb9491a393). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18334 **[Test build #78389 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78389/testReport)** for PR 18334 at commit [`625603e`](https://github.com/apache/spark/commit/625603e610ec38d9c48cfaf8691ffaeb9491a393).
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user wzhfy commented on the issue: https://github.com/apache/spark/pull/18334 retest this please
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18334 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78370/ Test FAILed.
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18334 Merged build finished. Test FAILed.
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18334 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78369/ Test FAILed.
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18334 Merged build finished. Test FAILed.
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18334 **[Test build #78370 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78370/testReport)** for PR 18334 at commit [`625603e`](https://github.com/apache/spark/commit/625603e610ec38d9c48cfaf8691ffaeb9491a393).
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18334 **[Test build #78369 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78369/testReport)** for PR 18334 at commit [`285be5b`](https://github.com/apache/spark/commit/285be5b5d5df895ae2c49d85458276f23a6c1bed).
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user wzhfy commented on the issue: https://github.com/apache/spark/pull/18334 @cloud-fan @gatorsmile I made the following changes: - add a config to trigger stats update and set it false by default. - update stats after add partition command, by adding the total size of these partitions.
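The two changes described in the comment above can be sketched roughly as follows. This is a minimal, self-contained illustration in Python, not the code from the PR: `TableStats`, `stats_after_add_partitions`, and the `auto_update_enabled` parameter are hypothetical names, and dropping the stats entirely when the flag is off is an assumption, since the thread does not spell out that branch.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class TableStats:
    """Hypothetical stand-in for a table's catalog statistics."""
    size_in_bytes: int
    row_count: Optional[int] = None


def stats_after_add_partitions(
    current: Optional[TableStats],
    added_partition_sizes: Iterable[int],
    auto_update_enabled: bool = False,  # off by default, as proposed in the thread
) -> Optional[TableStats]:
    """Return the stats to store after partitions are added to a table."""
    if current is None:
        # The table had no stats, so there is nothing to update.
        return None
    if not auto_update_enabled:
        # Assumption: with the flag off, stale stats are discarded
        # instead of being recomputed.
        return None
    # Flag on: grow the total size by the size of the added partitions.
    new_size = current.size_in_bytes + sum(added_partition_sizes)
    # The row count is no longer reliable after the change, so it is dropped.
    return TableStats(size_in_bytes=new_size, row_count=None)
```

With the flag on, `stats_after_add_partitions(TableStats(1000, 10), [200, 300], True)` yields a size of 1500 bytes and no row count; with the default, the same call returns `None`.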
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18334 Listing the files of a partitioned table in the cloud is expensive when the number of files is large.
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user wzhfy commented on the issue: https://github.com/apache/spark/pull/18334 OK, since we'll remove the row count and column stats in these commands, maybe it's better to set it to false.
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user wzhfy commented on the issue: https://github.com/apache/spark/pull/18334 Yeah, I'll add a config for this. But how about setting it to true by default? I think the overhead of getting the file sizes is usually negligible.
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18334 +1 to providing a flag to automatically trigger the stats updates. We can set it to false by default so as not to surprise users.
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18334 These commands will automatically trigger the stats updates, which could be expensive. Another way is to simply set it to zero or mark it unreliable? Can we provide a SQLConf for this?
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18334 How about adding a partition?
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18334 Merged build finished. Test PASSed.
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18334 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78214/ Test PASSed.
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18334 **[Test build #78214 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78214/testReport)** for PR 18334 at commit [`9d4d97a`](https://github.com/apache/spark/commit/9d4d97a272a74d99f79026648b0af72b4f5249ab). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18334 **[Test build #78214 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78214/testReport)** for PR 18334 at commit [`9d4d97a`](https://github.com/apache/spark/commit/9d4d97a272a74d99f79026648b0af72b4f5249ab).
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user wzhfy commented on the issue: https://github.com/apache/spark/pull/18334 retest this please
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18334 Build finished. Test FAILed.
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18334 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78202/ Test FAILed.
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18334 **[Test build #78202 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78202/testReport)** for PR 18334 at commit [`9d4d97a`](https://github.com/apache/spark/commit/9d4d97a272a74d99f79026648b0af72b4f5249ab). * This patch **fails Spark unit tests**. * This patch **does not merge cleanly**. * This patch adds no public classes.
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18334 **[Test build #78202 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78202/testReport)** for PR 18334 at commit [`9d4d97a`](https://github.com/apache/spark/commit/9d4d97a272a74d99f79026648b0af72b4f5249ab).
[GitHub] spark issue #18334: [SPARK-21127] [SQL] Update statistics after data changin...
Github user wzhfy commented on the issue: https://github.com/apache/spark/pull/18334 cc @cloud-fan @gatorsmile