[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17391879#comment-17391879 ]

ASF GitHub Bot commented on HUDI-2164:
--------------------------------------

zhangyue19921010 commented on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-891463544

Thanks a lot for your review and merge @lw309637554 @codope

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

> Build cluster plan and execute this plan at once for HoodieClusteringJob
> ------------------------------------------------------------------------
>
> Key: HUDI-2164
> URL: https://issues.apache.org/jira/browse/HUDI-2164
> Project: Apache Hudi
> Issue Type: Task
> Reporter: Yue Zhang
> Priority: Major
> Labels: pull-request-available
>
> Currently, Hudi lets users submit a HoodieClusteringJob either to build a
> clustering plan or to execute an existing clustering plan, selected through
> the --schedule or --instant-time config.
> To trigger a clustering job, a user has to
> # Submit a HoodieClusteringJob to build a clustering plan through the
> --schedule config.
> # Copy the instant time of the created clustering plan from the log output.
> # Submit the HoodieClusteringJob again to execute the created clustering
> plan through the --instant-time config.
> The pain point is that triggering a clustering takes too many steps, and the
> instant time has to be copied from the log file by hand, so the process
> cannot be automated.
>
> I have raised a PR that adds a new config named --mode (-m for short):
> ||--mode||remarks||
> |execute|Execute a clustering plan at the given instant; --instant-time is
> required. This is the default value.|
> |schedule|Make a clustering plan.|
> |*scheduleAndExecute*|Make a clustering plan first and execute that plan
> immediately.|
> Users can now pass --mode scheduleAndExecute to build a clustering plan and
> execute it in a single HoodieClusteringJob run.
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
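
The three --mode values in the issue description above can be illustrated with a small dispatch sketch. This is not the code from pull request #3259 and not Hudi's API: `ClusteringArgs`, `run_clustering`, and the `schedule`/`execute` callables are hypothetical names chosen for illustration. The only behavior taken from the description is that execute is the default and needs --instant-time, schedule only builds a plan, and scheduleAndExecute does both in one run.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Mode names as listed in the issue's --mode table.
EXECUTE = "execute"
SCHEDULE = "schedule"
SCHEDULE_AND_EXECUTE = "scheduleAndExecute"


@dataclass
class ClusteringArgs:
    """Hypothetical stand-in for the job's command-line arguments."""
    mode: str = EXECUTE                 # the issue names execute as the default
    instant_time: Optional[str] = None  # corresponds to --instant-time


def run_clustering(
    args: ClusteringArgs,
    schedule: Callable[[], str],        # assumed: builds a plan, returns its instant time
    execute: Callable[[str], None],     # assumed: executes the plan at that instant
) -> str:
    """Dispatch on the mode and return the instant time that was used."""
    if args.mode == EXECUTE:
        # Executing an existing plan requires knowing which instant to run.
        if args.instant_time is None:
            raise ValueError("--instant-time is required in execute mode")
        execute(args.instant_time)
        return args.instant_time
    if args.mode == SCHEDULE:
        # Only build the plan; nothing is executed.
        return schedule()
    if args.mode == SCHEDULE_AND_EXECUTE:
        # Build the plan and hand its instant time straight to execution,
        # so nobody has to copy it out of a log file.
        instant = schedule()
        execute(instant)
        return instant
    raise ValueError(f"unknown mode: {args.mode!r}")
```

The point of the combined mode is visible in the last branch: the instant time flows from scheduling to execution inside one process, which removes the manual copy step described in the issue.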
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17391254#comment-17391254 ]

ASF GitHub Bot commented on HUDI-2164:
--------------------------------------

lw309637554 merged pull request #3259:
URL: https://github.com/apache/hudi/pull/3259
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17390501#comment-17390501 ]

ASF GitHub Bot commented on HUDI-2164:
--------------------------------------

hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249

## CI report:

* 76edd41bb74de677ab6367841db2ebe9796ab0f5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1262) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1266) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1270)

Bot commands
@hudi-bot supports the following commands:

- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17390474#comment-17390474 ]

ASF GitHub Bot commented on HUDI-2164:
--------------------------------------

hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249

## CI report:

* 76edd41bb74de677ab6367841db2ebe9796ab0f5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1262) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1266) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1270)
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17390471#comment-17390471 ]

ASF GitHub Bot commented on HUDI-2164:
--------------------------------------

zhangyue19921010 commented on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-889792948

@hudi-bot run azure
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17390472#comment-17390472 ]

ASF GitHub Bot commented on HUDI-2164:
--------------------------------------

zhangyue19921010 removed a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-889726781

@hudi-bot run azure
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17390445#comment-17390445 ]

ASF GitHub Bot commented on HUDI-2164:
--------------------------------------

hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249

## CI report:

* 76edd41bb74de677ab6367841db2ebe9796ab0f5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1262) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1266)
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17390416#comment-17390416 ]

ASF GitHub Bot commented on HUDI-2164:
--------------------------------------

hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249

## CI report:

* 76edd41bb74de677ab6367841db2ebe9796ab0f5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1262) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1266)
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17390415#comment-17390415 ]

ASF GitHub Bot commented on HUDI-2164:
--------------------------------------

zhangyue19921010 commented on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-889726781

@hudi-bot run azure
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17390407#comment-17390407 ]

ASF GitHub Bot commented on HUDI-2164:
--------------------------------------

hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249

## CI report:

* 76edd41bb74de677ab6367841db2ebe9796ab0f5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1262)
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17390337#comment-17390337 ]

ASF GitHub Bot commented on HUDI-2164:
--------------------------------------

hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249

## CI report:

* 0bb6768327f3a54bb25d4504043acfb94ecfa311 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1256)
* 76edd41bb74de677ab6367841db2ebe9796ab0f5 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1262)
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17390306#comment-17390306 ]

ASF GitHub Bot commented on HUDI-2164:
--------------------------------------

hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249

## CI report:

* 0bb6768327f3a54bb25d4504043acfb94ecfa311 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1256)
* 76edd41bb74de677ab6367841db2ebe9796ab0f5 UNKNOWN
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17390304#comment-17390304 ]

ASF GitHub Bot commented on HUDI-2164:
--------------------------------------

hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249

## CI report:

* abefb17f2c42c06e9c81ec26c6561172fedf4add Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1252) Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1254)
* 0bb6768327f3a54bb25d4504043acfb94ecfa311 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1256)
* 76edd41bb74de677ab6367841db2ebe9796ab0f5 UNKNOWN
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17390298#comment-17390298 ]

ASF GitHub Bot commented on HUDI-2164:
--
hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249

## CI report:
* abefb17f2c42c06e9c81ec26c6561172fedf4add
  Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1252)
  Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1254)
* 0bb6768327f3a54bb25d4504043acfb94ecfa311
  Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1256)
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17390297#comment-17390297 ]

ASF GitHub Bot commented on HUDI-2164:
--
hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249

## CI report:
* abefb17f2c42c06e9c81ec26c6561172fedf4add
  Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1252)
  Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1254)
* 0bb6768327f3a54bb25d4504043acfb94ecfa311 UNKNOWN
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17390296#comment-17390296 ]

ASF GitHub Bot commented on HUDI-2164:
--
hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249

## CI report:
* abefb17f2c42c06e9c81ec26c6561172fedf4add
  Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1252)
  Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1254)
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17390285#comment-17390285 ]

ASF GitHub Bot commented on HUDI-2164:
--
hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249

## CI report:
* abefb17f2c42c06e9c81ec26c6561172fedf4add
  Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1252)
  Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1254)
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17390284#comment-17390284 ]

ASF GitHub Bot commented on HUDI-2164:
--
zhangyue19921010 commented on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-889626943

@hudi-bot run azure
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17390268#comment-17390268 ]

ASF GitHub Bot commented on HUDI-2164:
--
hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249

## CI report:
* abefb17f2c42c06e9c81ec26c6561172fedf4add
  Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1252)
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17390257#comment-17390257 ]

ASF GitHub Bot commented on HUDI-2164:
--
zhangyue19921010 commented on a change in pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#discussion_r679615918

## File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/functional/TestHoodieDeltaStreamer.java
## @@ -1115,6 +1125,47 @@ public void testAsyncClusteringServiceWithCompaction() throws Exception {
     });
   }
+  @Test

Review comment: nice idea, changed

## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java
## @@ -118,20 +134,41 @@ public static void main(String[] args) {
     jsc.stop();
   }
+  private static void validateRunningMode(Config cfg) {
+    // --mode has a higher priority than --schedule
+    // If we remove --schedule option in the future we need to change runningMode default value to EXECUTE
+    if (StringUtils.isNullOrEmpty(cfg.runningMode)) {

Review comment: done
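The comments in the `validateRunningMode` hunk say that `--mode` takes priority over `--schedule`, but the quoted diff stops before the body of the `if`. The following is a minimal sketch of one way that fallback could work, not the code from the pull request: the class `ModeFallbackSketch`, the method `resolveMode`, and the choice to fall back to `schedule` or `execute` based on the legacy flag are all assumptions made for illustration.

```java
/** Hypothetical sketch of the --mode / --schedule precedence; not the PR's code. */
final class ModeFallbackSketch {
    static final String EXECUTE = "execute";
    static final String SCHEDULE = "schedule";

    /**
     * Returns the effective running mode.
     * An explicit --mode value wins. When it is absent, the legacy
     * --schedule flag decides (assumption): schedule if set, else execute.
     */
    static String resolveMode(String runningMode, boolean runSchedule) {
        if (runningMode == null || runningMode.isEmpty()) {
            return runSchedule ? SCHEDULE : EXECUTE;
        }
        return runningMode;
    }

    public static void main(String[] args) {
        // No --mode given, legacy --schedule flag set.
        System.out.println(resolveMode(null, true));
        // Explicit --mode overrides the legacy flag.
        System.out.println(resolveMode("scheduleAndExecute", true));
    }
}
```

Keeping the resolution in one pure function makes the precedence rule easy to test separately from the Spark job itself.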
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17390255#comment-17390255 ]

ASF GitHub Bot commented on HUDI-2164:
--
zhangyue19921010 commented on a change in pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#discussion_r679615761

## File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/functional/TestHoodieDeltaStreamer.java
## @@ -449,6 +451,14 @@ static void assertAtleastNDeltaCommits(int minExpected, String tablePath, FileSy
     assertTrue(minExpected <= numDeltaCommits, "Got=" + numDeltaCommits + ", exp >=" + minExpected);
   }
+  static void assertAtLeastNCompletedReplaceCommits(int minExpected, String tablePath, DistributedFileSystem fs) {

Review comment: Sure, changed
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17390256#comment-17390256 ]

ASF GitHub Bot commented on HUDI-2164:
--
zhangyue19921010 commented on a change in pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#discussion_r679615842

## File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/functional/TestHoodieDeltaStreamer.java
## @@ -1115,6 +1125,47 @@ public void testAsyncClusteringServiceWithCompaction() throws Exception {
     });
   }
+  @Test
+  public void testHoodieAsyncClusteringJobWithScheduleAndExecute() throws Exception {

Review comment: nice idea, changed
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17390247#comment-17390247 ]

ASF GitHub Bot commented on HUDI-2164:
--
hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249

## CI report:
* ec01ad1f162813a5fafb7d14da7b65eea64d06ea
  Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1124)
  Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1132)
* abefb17f2c42c06e9c81ec26c6561172fedf4add
  Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1252)
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17390246#comment-17390246 ]

ASF GitHub Bot commented on HUDI-2164:
--
hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249

## CI report:
* ec01ad1f162813a5fafb7d14da7b65eea64d06ea
  Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1124)
  Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1132)
* abefb17f2c42c06e9c81ec26c6561172fedf4add UNKNOWN
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17388044#comment-17388044 ]

ASF GitHub Bot commented on HUDI-2164:
--
codope commented on a change in pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#discussion_r677432961

## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java
## @@ -118,20 +134,41 @@ public static void main(String[] args) {
     jsc.stop();
   }
+  private static void validateRunningMode(Config cfg) {
+    // --mode has a higher priority than --schedule
+    // If we remove --schedule option in the future we need to change runningMode default value to EXECUTE
+    if (StringUtils.isNullOrEmpty(cfg.runningMode)) {

Review comment: Can we also add a validation here that if the mode is execute, then instant-time should not be empty and it should be a valid instant time?
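The validation requested in this review comment could look roughly like the sketch below. It is an illustration only, under stated assumptions: the class `ExecuteModeValidationSketch` and its `validate` method are hypothetical, and "valid instant time" is approximated as a non-empty, digits-only string, since the thread does not define the exact format.

```java
/** Hypothetical sketch of the requested check; names and rules are assumptions. */
final class ExecuteModeValidationSketch {

    /** Throws IllegalArgumentException when mode is execute and instantTime is unusable. */
    static void validate(String mode, String instantTime) {
        if (!"execute".equalsIgnoreCase(mode)) {
            // schedule and scheduleAndExecute do not take an instant time as input.
            return;
        }
        if (instantTime == null || instantTime.isEmpty()) {
            throw new IllegalArgumentException("--instant-time is required when --mode is execute");
        }
        // Assumption: an instant time is a timestamp-like string made of digits only.
        if (!instantTime.chars().allMatch(Character::isDigit)) {
            throw new IllegalArgumentException("--instant-time is not a valid instant: " + instantTime);
        }
    }

    public static void main(String[] args) {
        validate("schedule", null);             // accepted: no instant needed
        validate("execute", "20210730123456");  // accepted: digits only
        try {
            validate("execute", "");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Failing fast at argument-validation time avoids starting a Spark context only to discover that there is no plan to execute.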
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17388042#comment-17388042 ]

ASF GitHub Bot commented on HUDI-2164:
--
codope commented on a change in pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#discussion_r677427994

## File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/functional/TestHoodieDeltaStreamer.java
## @@ -1115,6 +1125,47 @@ public void testAsyncClusteringServiceWithCompaction() throws Exception {
     });
   }
+  @Test

Review comment: Should we make this a `@ParameterizedTest` and run it for different execution modes? We can validate replace commits based on the execution mode, e.g. if the mode is schedule then there should be no replace commits.
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17388041#comment-17388041 ]

ASF GitHub Bot commented on HUDI-2164:

codope commented on a change in pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#discussion_r677425971

## File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/functional/TestHoodieDeltaStreamer.java

@@ -1115,6 +1125,47 @@ public void testAsyncClusteringServiceWithCompaction() throws Exception {
     });
   }
+  @Test
+  public void testHoodieAsyncClusteringJobWithScheduleAndExecute() throws Exception {

Review comment:
I see a lot of common code between this test and `testHoodieAsyncClusteringJob`. Shall we extract it into a common private method? Only the try-catch part differs, and that can remain in the test method.
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17388039#comment-17388039 ]

ASF GitHub Bot commented on HUDI-2164:

codope commented on a change in pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#discussion_r677422752

## File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/functional/TestHoodieDeltaStreamer.java

@@ -449,6 +451,14 @@ static void assertAtleastNDeltaCommits(int minExpected, String tablePath, FileSy
     assertTrue(minExpected <= numDeltaCommits, "Got=" + numDeltaCommits + ", exp >=" + minExpected);
   }
+  static void assertAtLeastNCompletedReplaceCommits(int minExpected, String tablePath, DistributedFileSystem fs) {

Review comment:
There is already an `assertAtLeastNReplaceCommits` method. I think we can reuse it, as it also fetches the completedReplaceTimeline.
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387938#comment-17387938 ]

ASF GitHub Bot commented on HUDI-2164:

lw309637554 commented on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-887368347

LGTM, @satishkotha @codope can you also review?
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387934#comment-17387934 ]

ASF GitHub Bot commented on HUDI-2164:

lw309637554 commented on a change in pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#discussion_r677287722

## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java

@@ -49,6 +51,9 @@
   private transient FileSystem fs;
   private TypedProperties props;
   private final JavaSparkContext jsc;
+  private static final String EXECUTE = "execute";
+  private static final String SCHEDULE = "schedule";
+  private static final String SCHEDULE_AND_EXECUTE = "scheduleandexecute";

Review comment:
Would it be better to remove toLowerCase? That could reduce user confusion.
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387011#comment-17387011 ]

ASF GitHub Bot commented on HUDI-2164:

zhangyue19921010 commented on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-886331783

> @zhangyue19921010 hi, some minor comments

Hi @lw309637554, thanks a lot for your review! I have responded to all comments. PTAL :)
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387010#comment-17387010 ]

ASF GitHub Bot commented on HUDI-2164:

zhangyue19921010 commented on a change in pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#discussion_r676258363

## File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/functional/TestHoodieDeltaStreamer.java

@@ -449,6 +451,14 @@ static void assertAtleastNDeltaCommits(int minExpected, String tablePath, FileSy
     assertTrue(minExpected <= numDeltaCommits, "Got=" + numDeltaCommits + ", exp >=" + minExpected);
   }
+  static void assertAtLeastNCompletedReplaceCommits(int minExpected, String tablePath, DistributedFileSystem fs) {
+    HoodieTableMetaClient meta = HoodieTableMetaClient.builder().setConf(fs.getConf()).setLoadActiveTimelineOnLoad(true).setBasePath(tablePath).build();
+    HoodieTimeline timeline = meta.getActiveTimeline().getCompletedReplaceTimeline();
+    LOG.info("Timeline Instants=" + meta.getActiveTimeline().getInstants().collect(Collectors.toList()));

Review comment:
Done.
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387009#comment-17387009 ]

ASF GitHub Bot commented on HUDI-2164:

zhangyue19921010 commented on a change in pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#discussion_r676258331

## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java

@@ -164,11 +196,37 @@ private int doCluster(JavaSparkContext jsc) throws Exception {
   private Option doSchedule(JavaSparkContext jsc) throws Exception {
     String schemaStr = getSchemaFromLatestInstant();
     try (SparkRDDWriteClient client = UtilHelpers.createHoodieClient(jsc, cfg.basePath, schemaStr, cfg.parallelism, Option.empty(), props)) {
-      if (cfg.clusteringInstantTime != null) {
-        client.scheduleClusteringAtInstant(cfg.clusteringInstantTime, Option.empty());
-        return Option.of(cfg.clusteringInstantTime);
+      return doSchedule(client);
+    }
+  }
+
+  private Option doSchedule(SparkRDDWriteClient client) {
+    if (cfg.clusteringInstantTime != null) {
+      client.scheduleClusteringAtInstant(cfg.clusteringInstantTime, Option.empty());
+      return Option.of(cfg.clusteringInstantTime);
+    }
+    return client.scheduleClustering(Option.empty());
+  }
+
+  public int doScheduleAndCluster(JavaSparkContext jsc) throws Exception {
+    LOG.info("Step 1: Do schedule");
+    String schemaStr = getSchemaFromLatestInstant();
+    try (SparkRDDWriteClient client = UtilHelpers.createHoodieClient(jsc, cfg.basePath, schemaStr, cfg.parallelism, Option.empty(), props)) {
+
+      Option instantTime = doSchedule(client);
+      int result = instantTime.isPresent() ? 0 : -1;
+
+      if (result == -1) {
+        LOG.info("Couldn't Generate Cluster Plan");

Review comment:
Done.
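The control flow of `doScheduleAndCluster` in the hunk above (schedule first, stop with -1 if no plan was produced, otherwise execute at the returned instant) can be sketched without any Hudi dependency. This is only an illustration under stated assumptions: `ScheduleAndExecuteSketch`, `run`, and the two callbacks are hypothetical names, `java.util.Optional` stands in for Hudi's `Option`, and the instant string is made up.

```java
import java.util.Optional;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical sketch; it does not call any Hudi API.
final class ScheduleAndExecuteSketch {

  /**
   * Runs the scheduling step, then the execution step.
   * Returns -1 when scheduling yields no instant time, otherwise the
   * result of the execution callback.
   */
  static int run(Supplier<Optional<String>> schedule, Function<String, Integer> execute) {
    System.out.println("Step 1: schedule");
    Optional<String> instantTime = schedule.get();
    if (!instantTime.isPresent()) {
      // Mirrors the "Couldn't Generate Cluster Plan" branch in the diff.
      System.out.println("Couldn't generate cluster plan");
      return -1;
    }
    System.out.println("Step 2: execute plan at instant " + instantTime.get());
    return execute.apply(instantTime.get());
  }

  public static void main(String[] args) {
    // "20210801120000" is an illustrative instant time, not taken from the thread.
    int withPlan = run(() -> Optional.of("20210801120000"), instant -> 0);
    int withoutPlan = run(Optional::empty, instant -> 0);
    System.out.println(withPlan + " " + withoutPlan);
  }
}
```

Passing the two steps in as callbacks keeps the sketch testable; in the pull request both steps share one `SparkRDDWriteClient` inside a single try-with-resources block.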
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387008#comment-17387008 ]

ASF GitHub Bot commented on HUDI-2164:

zhangyue19921010 commented on a change in pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#discussion_r676257147

## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java

@@ -121,17 +141,29 @@ public static void main(String[] args) {
   public int cluster(int retry) {
     this.fs = FSUtils.getFs(cfg.basePath, jsc.hadoopConfiguration());
     int ret = UtilHelpers.retry(retry, () -> {
-      if (cfg.runSchedule) {
-        LOG.info("Do schedule");
-        Option instantTime = doSchedule(jsc);
-        int result = instantTime.isPresent() ? 0 : -1;
-        if (result == 0) {
-          LOG.info("The schedule instant time is " + instantTime.get());
+      String runningMode = cfg.runningMode == null ? "" : cfg.runningMode.toLowerCase();

Review comment:
> When developers call `public int cluster(int retry)` internally like
>
> https://github.com/apache/hudi/blob/5804ad8e32ae05758ebc5e47f5d4fb4db371ab52/hudi-utilities/src/test/java/org/apache/hudi/utilities/functional/TestHoodieDeltaStreamer.java#L1069
>
> They may not set the running mode config, so we need to check this value to avoid an NPE.

Really nice catch here. Having thought about it carefully, setting "" as the default value here is not good enough; maybe we need to validate RunningMode based on cfg.runSchedule when users leave RunningMode null.
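One way to read the suggestion above, falling back on `cfg.runSchedule` when no running mode was supplied, is sketched below. This is an assumption about what such a validation could look like, not the code that was merged: `SketchConfig`, `RunningModes`, and `resolve` are hypothetical names, and only the `runningMode` and `runSchedule` fields come from the thread.

```java
import java.util.Locale;

// Hypothetical stand-in for the job's config object.
final class SketchConfig {
  String runningMode;   // value of --mode; null when the caller never set it
  boolean runSchedule;  // the older --schedule flag
}

// Hypothetical helper; not part of Hudi.
final class RunningModes {

  /** Returns a lower-cased mode, deriving it from runSchedule when --mode is absent. */
  static String resolve(SketchConfig cfg) {
    if (cfg.runningMode == null || cfg.runningMode.isEmpty()) {
      // Assumed fallback: keep the pre-existing --schedule behaviour.
      return cfg.runSchedule ? "schedule" : "execute";
    }
    return cfg.runningMode.toLowerCase(Locale.ROOT);
  }

  public static void main(String[] args) {
    SketchConfig legacy = new SketchConfig();
    legacy.runSchedule = true;
    System.out.println(resolve(legacy));

    SketchConfig explicit = new SketchConfig();
    explicit.runningMode = "scheduleAndExecute";
    System.out.println(resolve(explicit));
  }
}
```

Compared with defaulting to an empty string, this keeps internal callers that never set a mode on a well-defined path and avoids the null dereference discussed in the comment.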
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387007#comment-17387007 ]

ASF GitHub Bot commented on HUDI-2164:

zhangyue19921010 commented on a change in pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#discussion_r676258087

## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java

@@ -49,6 +51,9 @@
   private transient FileSystem fs;
   private TypedProperties props;
   private final JavaSparkContext jsc;
+  private static final String EXECUTE = "execute";
+  private static final String SCHEDULE = "schedule";
+  private static final String SCHEDULE_AND_EXECUTE = "scheduleandexecute";

Review comment:
The reason for using all-lowercase constants is that we switch on `cfg.runningMode.toLowerCase()`, i.e. `switch (cfg.runningMode.toLowerCase()) {xxx}`, so that users can pass --mode scheduleAndExecute, scheduleANDEXECUTE, SCHEDULEandEXECUTE, etc.
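The case-insensitive dispatch described in the comment above can be illustrated with a small standalone sketch. The three constants are copied from the diff; `ModeParser`, `describe`, and the returned descriptions are hypothetical and only show how lower-casing the input lets differently cased `--mode` values reach the same branch.

```java
import java.util.Locale;

// Hypothetical sketch; not the switch from HoodieClusteringJob.
final class ModeParser {
  static final String EXECUTE = "execute";
  static final String SCHEDULE = "schedule";
  static final String SCHEDULE_AND_EXECUTE = "scheduleandexecute";

  /** Maps a user-supplied --mode value to a short description, ignoring case. */
  static String describe(String mode) {
    // Lower-case once, then compare against the all-lowercase constants.
    switch (mode.toLowerCase(Locale.ROOT)) {
      case EXECUTE:
        return "execute an existing plan";
      case SCHEDULE:
        return "build a plan";
      case SCHEDULE_AND_EXECUTE:
        return "build a plan and execute it";
      default:
        throw new IllegalArgumentException("Unsupported mode: " + mode);
    }
  }

  public static void main(String[] args) {
    System.out.println(describe("scheduleAndExecute"));
    System.out.println(describe("SCHEDULEandEXECUTE"));
  }
}
```

Rejecting unknown values in the `default` branch is a choice made for this sketch; the thread does not say how the job treats an unrecognised mode.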
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387006#comment-17387006 ]

ASF GitHub Bot commented on HUDI-2164:

zhangyue19921010 commented on a change in pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#discussion_r676257147

## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java

@@ -121,17 +141,29 @@ public static void main(String[] args) {
   public int cluster(int retry) {
     this.fs = FSUtils.getFs(cfg.basePath, jsc.hadoopConfiguration());
     int ret = UtilHelpers.retry(retry, () -> {
-      if (cfg.runSchedule) {
-        LOG.info("Do schedule");
-        Option instantTime = doSchedule(jsc);
-        int result = instantTime.isPresent() ? 0 : -1;
-        if (result == 0) {
-          LOG.info("The schedule instant time is " + instantTime.get());
+      String runningMode = cfg.runningMode == null ? "" : cfg.runningMode.toLowerCase();

Review comment:
> When developers call `public int cluster(int retry)` internally like
>
> https://github.com/apache/hudi/blob/5804ad8e32ae05758ebc5e47f5d4fb4db371ab52/hudi-utilities/src/test/java/org/apache/hudi/utilities/functional/TestHoodieDeltaStreamer.java#L1069
>
> They may not set the running mode config, so we need to check this value to avoid an NPE.

Having thought about it carefully, setting "" as the default value here is not good enough; maybe we need to validate RunningMode based on cfg.runSchedule when users leave RunningMode null.
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17386205#comment-17386205 ]

ASF GitHub Bot commented on HUDI-2164:

hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249

## CI report:

* ec01ad1f162813a5fafb7d14da7b65eea64d06ea Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1124) Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1132)

Bot commands
@hudi-bot supports the following commands:
- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17386177#comment-17386177 ]

ASF GitHub Bot commented on HUDI-2164:

hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249

## CI report:

* ec01ad1f162813a5fafb7d14da7b65eea64d06ea Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1124) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1132)
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17386175#comment-17386175 ]

ASF GitHub Bot commented on HUDI-2164:

zhangyue19921010 commented on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-885574686

@hudi-bot run azure
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17386103#comment-17386103 ]

ASF GitHub Bot commented on HUDI-2164:

hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249

## CI report:

* ec01ad1f162813a5fafb7d14da7b65eea64d06ea Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1124)
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17386072#comment-17386072 ]

ASF GitHub Bot commented on HUDI-2164:

hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249

## CI report:

* bb55c4ce75a59c179395748793c113cce5bcf714 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1039)
* ec01ad1f162813a5fafb7d14da7b65eea64d06ea Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1124)
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17386065#comment-17386065 ]

ASF GitHub Bot commented on HUDI-2164:

hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249

## CI report:

* bb55c4ce75a59c179395748793c113cce5bcf714 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1039)
* ec01ad1f162813a5fafb7d14da7b65eea64d06ea UNKNOWN
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17385871#comment-17385871 ]

ASF GitHub Bot commented on HUDI-2164:

lw309637554 commented on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-885356337

@zhangyue19921010 hi, some minor comments
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17385870#comment-17385870 ]

ASF GitHub Bot commented on HUDI-2164:

lw309637554 commented on a change in pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#discussion_r675270701

## File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/functional/TestHoodieDeltaStreamer.java ##

@@ -449,6 +451,14 @@ static void assertAtleastNDeltaCommits(int minExpected, String tablePath, FileSy
       assertTrue(minExpected <= numDeltaCommits, "Got=" + numDeltaCommits + ", exp >=" + minExpected);
     }
 
+    static void assertAtLeastNCompletedReplaceCommits(int minExpected, String tablePath, DistributedFileSystem fs) {
+      HoodieTableMetaClient meta = HoodieTableMetaClient.builder().setConf(fs.getConf()).setLoadActiveTimelineOnLoad(true).setBasePath(tablePath).build();
+      HoodieTimeline timeline = meta.getActiveTimeline().getCompletedReplaceTimeline();
+      LOG.info("Timeline Instants=" + meta.getActiveTimeline().getInstants().collect(Collectors.toList()));

Review comment:
Timeline Instants= -> Timeline instants =
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17385869#comment-17385869 ]

ASF GitHub Bot commented on HUDI-2164:

lw309637554 commented on a change in pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#discussion_r675270601

## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java ##

@@ -164,11 +196,37 @@ private int doCluster(JavaSparkContext jsc) throws Exception {
   private Option doSchedule(JavaSparkContext jsc) throws Exception {
     String schemaStr = getSchemaFromLatestInstant();
     try (SparkRDDWriteClient client = UtilHelpers.createHoodieClient(jsc, cfg.basePath, schemaStr, cfg.parallelism, Option.empty(), props)) {
-      if (cfg.clusteringInstantTime != null) {
-        client.scheduleClusteringAtInstant(cfg.clusteringInstantTime, Option.empty());
-        return Option.of(cfg.clusteringInstantTime);
+      return doSchedule(client);
+    }
+  }
+
+  private Option doSchedule(SparkRDDWriteClient client) {
+    if (cfg.clusteringInstantTime != null) {
+      client.scheduleClusteringAtInstant(cfg.clusteringInstantTime, Option.empty());
+      return Option.of(cfg.clusteringInstantTime);
+    }
+    return client.scheduleClustering(Option.empty());
+  }
+
+  public int doScheduleAndCluster(JavaSparkContext jsc) throws Exception {
+    LOG.info("Step 1: Do schedule");
+    String schemaStr = getSchemaFromLatestInstant();
+    try (SparkRDDWriteClient client = UtilHelpers.createHoodieClient(jsc, cfg.basePath, schemaStr, cfg.parallelism, Option.empty(), props)) {
+
+      Option instantTime = doSchedule(client);
+      int result = instantTime.isPresent() ? 0 : -1;
+
+      if (result == -1) {
+        LOG.info("Couldn't Generate Cluster Plan");

Review comment:
Couldn't Generate Cluster Plan -> Couldn't generate cluster plan
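The schedule-then-execute flow under review in `doScheduleAndCluster` can be sketched in isolation as follows. This is only an illustration, not the PR's code: `ScheduleAndExecuteSketch`, its `scheduleAndExecute` method, and the two functional parameters are hypothetical stand-ins for the Hudi write client, `java.util.Optional` replaces Hudi's own `Option`, and returning -1 when no plan is produced is an assumption based on the `result == -1` branch quoted in the diff.

```java
import java.util.Optional;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical sketch of "make a clustering plan, then execute it at once".
final class ScheduleAndExecuteSketch {

    // schedule: returns the instant time of the new plan, or empty if no plan was generated.
    // execute:  runs the plan at the given instant time and returns an exit code.
    static int scheduleAndExecute(Supplier<Optional<String>> schedule,
                                  Function<String, Integer> execute) {
        // Step 1: schedule.
        Optional<String> instantTime = schedule.get();
        if (!instantTime.isPresent()) {
            // Assumed behaviour: report failure when no plan could be generated.
            return -1;
        }
        // Step 2: execute the plan that was just scheduled, with no manual copy of the instant time.
        return execute.apply(instantTime.get());
    }

    public static void main(String[] args) {
        int code = scheduleAndExecute(
                () -> Optional.of("20210722000000"),   // made-up instant time
                instant -> {
                    System.out.println("Executing clustering plan at " + instant);
                    return 0;
                });
        System.out.println("exit code = " + code);
    }
}
```

Passing the instant time directly from the scheduling step to the execution step is what removes the copy-and-paste step described in the issue.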
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17385863#comment-17385863 ]

ASF GitHub Bot commented on HUDI-2164:

lw309637554 commented on a change in pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#discussion_r675268621

## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java ##

@@ -121,17 +141,29 @@ public static void main(String[] args) {
   public int cluster(int retry) {
     this.fs = FSUtils.getFs(cfg.basePath, jsc.hadoopConfiguration());
     int ret = UtilHelpers.retry(retry, () -> {
-      if (cfg.runSchedule) {
-        LOG.info("Do schedule");
-        Option instantTime = doSchedule(jsc);
-        int result = instantTime.isPresent() ? 0 : -1;
-        if (result == 0) {
-          LOG.info("The schedule instant time is " + instantTime.get());
+      String runningMode = cfg.runningMode == null ? "" : cfg.runningMode.toLowerCase();

Review comment:
"" -> can we use Optional for the null check? The "" is confusing.
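One way to read the reviewer's suggestion is to wrap the nullable mode string in an `Optional` instead of substituting `""`. The sketch below uses `java.util.Optional` from the JDK; the class `RunningModeSketch` and method `resolveMode` are hypothetical names, and falling back to `execute` when no mode is given is an assumption taken from the "default value" remark in the issue's `--mode` table, not from the PR code.

```java
import java.util.Locale;
import java.util.Optional;

// Hypothetical sketch: normalize a possibly-null --mode value without an empty-string sentinel.
final class RunningModeSketch {
    static final String EXECUTE = "execute";

    // Returns the lower-cased mode, or EXECUTE when the mode is null or blank (assumed default).
    static String resolveMode(String rawMode) {
        return Optional.ofNullable(rawMode)
                .map(m -> m.trim().toLowerCase(Locale.ROOT))
                .filter(m -> !m.isEmpty())
                .orElse(EXECUTE);
    }

    public static void main(String[] args) {
        System.out.println(resolveMode(null));
        System.out.println(resolveMode("ScheduleAndExecute"));
    }
}
```

Using `Locale.ROOT` keeps the lower-casing independent of the JVM's default locale, which matters when the result is compared against fixed constants.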
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17385860#comment-17385860 ]

ASF GitHub Bot commented on HUDI-2164:

lw309637554 commented on a change in pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#discussion_r675267730

## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java ##

@@ -49,6 +51,9 @@
   private transient FileSystem fs;
   private TypedProperties props;
   private final JavaSparkContext jsc;
+  private static final String EXECUTE = "execute";
+  private static final String SCHEDULE = "schedule";
+  private static final String SCHEDULE_AND_EXECUTE = "scheduleandexecute";

Review comment:
Would scheduleAndExecute be better than scheduleandexecute?
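If the constant were kept in the camelCase form that the issue description documents (`scheduleAndExecute`), matching could still be case-insensitive by comparing with `String.equalsIgnoreCase` rather than lower-casing the input first. The sketch below only illustrates that option; `ModeConstantsSketch` and `isScheduleAndExecute` are hypothetical names, and this is not necessarily what the PR ended up doing.

```java
// Hypothetical sketch: camelCase mode constants with case-insensitive matching.
final class ModeConstantsSketch {
    static final String EXECUTE = "execute";
    static final String SCHEDULE = "schedule";
    static final String SCHEDULE_AND_EXECUTE = "scheduleAndExecute";

    // equalsIgnoreCase returns false for a null argument, so no separate null check is needed.
    static boolean isScheduleAndExecute(String mode) {
        return SCHEDULE_AND_EXECUTE.equalsIgnoreCase(mode);
    }

    public static void main(String[] args) {
        System.out.println(isScheduleAndExecute("scheduleandexecute"));
        System.out.println(isScheduleAndExecute("schedule"));
    }
}
```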
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384564#comment-17384564 ]

ASF GitHub Bot commented on HUDI-2164:

zhangyue19921010 commented on a change in pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#discussion_r673605286

## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java ##

@@ -171,4 +200,38 @@ private int doCluster(JavaSparkContext jsc) throws Exception {
       return client.scheduleClustering(Option.empty());
     }
   }
+
+  @TestOnly
+  public int doScheduleAndCluster() throws Exception {
+    return this.doScheduleAndCluster(jsc);
+  }
+
+  public int doScheduleAndCluster(JavaSparkContext jsc) throws Exception {
+    LOG.info("Step 1: Do schedule");
+    String schemaStr = getSchemaFromLatestInstant();
+    try (SparkRDDWriteClient client = UtilHelpers.createHoodieClient(jsc, cfg.basePath, schemaStr, cfg.parallelism, Option.empty(), props)) {
+
+      Option instantTime;
+      if (cfg.clusteringInstantTime != null) {
+        client.scheduleClusteringAtInstant(cfg.clusteringInstantTime, Option.empty());
+        instantTime = Option.of(cfg.clusteringInstantTime);
+      } else {
+        instantTime = client.scheduleClustering(Option.empty());
+      }
+
+      int result = instantTime.isPresent() ? 0 : -1;

Review comment:
Nice idea. Changed. PTAL :)
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384205#comment-17384205 ]

ASF GitHub Bot commented on HUDI-2164:

codecov-commenter edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878091821
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384175#comment-17384175 ] ASF GitHub Bot commented on HUDI-2164: -- lw309637554 commented on a change in pull request #3259: URL: https://github.com/apache/hudi/pull/3259#discussion_r672888552

## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java ##

@@ -121,17 +141,26 @@ public static void main(String[] args) {
   public int cluster(int retry) {
     this.fs = FSUtils.getFs(cfg.basePath, jsc.hadoopConfiguration());
     int ret = UtilHelpers.retry(retry, () -> {
-      if (cfg.runSchedule) {
-        LOG.info("Do schedule");
-        Option instantTime = doSchedule(jsc);
-        int result = instantTime.isPresent() ? 0 : -1;
-        if (result == 0) {
-          LOG.info("The schedule instant time is " + instantTime.get());
+      String runningMode = cfg.runningMode == null ? "" : cfg.runningMode.toLowerCase();

Review comment:
make sense

## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java ##

@@ -171,4 +200,38 @@ private int doCluster(JavaSparkContext jsc) throws Exception {
       return client.scheduleClustering(Option.empty());
     }
   }
+
+  @TestOnly
+  public int doScheduleAndCluster() throws Exception {
+    return this.doScheduleAndCluster(jsc);
+  }
+
+  public int doScheduleAndCluster(JavaSparkContext jsc) throws Exception {
+    LOG.info("Step 1: Do schedule");
+    String schemaStr = getSchemaFromLatestInstant();
+    try (SparkRDDWriteClient client = UtilHelpers.createHoodieClient(jsc, cfg.basePath, schemaStr, cfg.parallelism, Option.empty(), props)) {
+
+      Option instantTime;
+      if (cfg.clusteringInstantTime != null) {
+        client.scheduleClusteringAtInstant(cfg.clusteringInstantTime, Option.empty());
+        instantTime = Option.of(cfg.clusteringInstantTime);
+      } else {
+        instantTime = client.scheduleClustering(Option.empty());
+      }
+
+      int result = instantTime.isPresent() ? 0 : -1;

Review comment:
Can we reuse the Spark client here, i.e. change doCluster(sc) to doCluster(sc, client)?
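The mode handling touched by the first hunk above can be illustrated with a small, self-contained sketch. It is not the code from PR #3259: the class name `ClusteringModeSketch`, the `dispatch` method, and its return strings are hypothetical, and only the null-safe lower-casing mirrors the quoted line. The three mode names come from the issue description (`execute`, `schedule`, `scheduleAndExecute`). How the merged job treats a missing mode is not shown in this thread, so the sketch simply reports it as unsupported.

```java
import java.util.Locale;

/** Hypothetical illustration of a --mode dispatch; not the HoodieClusteringJob code. */
final class ClusteringModeSketch {
  // Mode names from the issue description, stored lower-cased for comparison.
  static final String EXECUTE = "execute";
  static final String SCHEDULE = "schedule";
  static final String SCHEDULE_AND_EXECUTE = "scheduleandexecute";

  /** Null-safe, case-insensitive normalization, as in the quoted diff line. */
  static String normalize(String runningMode) {
    // Locale.ROOT is an addition here so the result does not depend on the JVM locale.
    return runningMode == null ? "" : runningMode.toLowerCase(Locale.ROOT);
  }

  /** Returns a label for the steps a given mode would run (labels are made up). */
  static String dispatch(String runningMode) {
    switch (normalize(runningMode)) {
      case SCHEDULE:
        return "schedule";
      case SCHEDULE_AND_EXECUTE:
        return "schedule+execute";
      case EXECUTE:
        return "execute";
      default:
        // Assumption: unknown or missing modes are rejected rather than defaulted.
        return "unsupported";
    }
  }

  public static void main(String[] args) {
    System.out.println(dispatch("scheduleAndExecute"));
    System.out.println(dispatch("SCHEDULE"));
    System.out.println(dispatch(null));
  }
}
```

Normalizing once before the comparison keeps the accepted spellings (`scheduleAndExecute`, `scheduleandexecute`, and so on) in one place instead of scattering `equalsIgnoreCase` calls.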
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384159#comment-17384159 ] ASF GitHub Bot commented on HUDI-2164: -- hudi-bot edited a comment on pull request #3259: URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384087#comment-17384087 ] ASF GitHub Bot commented on HUDI-2164: -- codecov-commenter edited a comment on pull request #3259: URL: https://github.com/apache/hudi/pull/3259#issuecomment-878091821

# [Codecov](https://codecov.io/gh/apache/hudi/pull/3259) Report
> Merging [#3259](https://codecov.io/gh/apache/hudi/pull/3259) (bb55c4c) into [master](https://codecov.io/gh/apache/hudi/commit/d024439764ceeca6366cb33689b729a1c69a6272) (d024439) will **decrease** coverage by `16.44%`.
> The diff coverage is `42.10%`.

```diff
@@              Coverage Diff              @@
##             master    #3259       +/-   ##
=============================================
- Coverage     44.10%   27.66%   -16.45%
+ Complexity     5157     1326     -3831
=============================================
  Files           936      390      -546
  Lines         41629    15609    -26020
  Branches       4189     1385     -2804
=============================================
- Hits          18362     4318    -14044
+ Misses        21638    10964    -10674
+ Partials       1629      327     -1302
```

| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `21.19% <ø> (-13.27%)` | :arrow_down: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `4.88% <ø> (-50.86%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `59.64% <42.10%> (+50.52%)` | :arrow_up: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.

| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3259) | Coverage Δ | |
|---|---|---|
| [...org/apache/hudi/utilities/HoodieClusteringJob.java](https://codecov.io/gh/apache/hudi/pull/3259/diff#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hvb2RpZUNsdXN0ZXJpbmdKb2IuamF2YQ==) | `53.26% <42.10%> (+53.26%)` | :arrow_up: |
| [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/3259/diff#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../org/apache/hudi/hive/NonPartitionedExtractor.java](https://codecov.io/gh/apache/hudi/pull/3259/diff#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTm9uUGFydGl0aW9uZWRFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../java/org/apache/hudi/metrics/MetricsReporter.java](https://codecov.io/gh/apache/hudi/pull/3259/diff#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...a/org/apache/hudi/metrics/MetricsReporterType.java](https://codecov.io/gh/apache/hudi/pull/3259/diff#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyVHlwZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384077#comment-17384077 ] ASF GitHub Bot commented on HUDI-2164: -- codecov-commenter edited a comment on pull request #3259: URL: https://github.com/apache/hudi/pull/3259#issuecomment-878091821

# [Codecov](https://codecov.io/gh/apache/hudi/pull/3259) Report
> Merging [#3259](https://codecov.io/gh/apache/hudi/pull/3259) (bb55c4c) into [master](https://codecov.io/gh/apache/hudi/commit/d024439764ceeca6366cb33689b729a1c69a6272) (d024439) will **decrease** coverage by `27.99%`.
> The diff coverage is `42.10%`.

```diff
@@              Coverage Diff              @@
##             master    #3259       +/-   ##
=============================================
- Coverage     44.10%   16.11%   -28.00%
+ Complexity     5157      506     -4651
=============================================
  Files           936      284      -652
  Lines         41629    11903    -29726
  Branches       4189      990     -3199
=============================================
- Hits          18362     1918    -16444
+ Misses        21638     9818    -11820
+ Partials       1629      167     -1462
```

| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `0.00% <ø> (-34.47%)` | :arrow_down: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `4.88% <ø> (-50.86%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `59.64% <42.10%> (+50.52%)` | :arrow_up: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.

| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3259) | Coverage Δ | |
|---|---|---|
| [...org/apache/hudi/utilities/HoodieClusteringJob.java](https://codecov.io/gh/apache/hudi/pull/3259/diff#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hvb2RpZUNsdXN0ZXJpbmdKb2IuamF2YQ==) | `53.26% <42.10%> (+53.26%)` | :arrow_up: |
| [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/3259/diff#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../org/apache/hudi/hive/NonPartitionedExtractor.java](https://codecov.io/gh/apache/hudi/pull/3259/diff#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTm9uUGFydGl0aW9uZWRFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../java/org/apache/hudi/metrics/MetricsReporter.java](https://codecov.io/gh/apache/hudi/pull/3259/diff#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...a/org/apache/hudi/metrics/MetricsReporterType.java](https://codecov.io/gh/apache/hudi/pull/3259/diff#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyVHlwZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384057#comment-17384057 ] ASF GitHub Bot commented on HUDI-2164: -- codecov-commenter edited a comment on pull request #3259: URL: https://github.com/apache/hudi/pull/3259#issuecomment-878091821

# [Codecov](https://codecov.io/gh/apache/hudi/pull/3259) Report
> Merging [#3259](https://codecov.io/gh/apache/hudi/pull/3259) (bb55c4c) into [master](https://codecov.io/gh/apache/hudi/commit/d024439764ceeca6366cb33689b729a1c69a6272) (d024439) will **decrease** coverage by `41.29%`.
> The diff coverage is `0.00%`.

```diff
@@              Coverage Diff              @@
##             master    #3259       +/-   ##
=============================================
- Coverage     44.10%    2.81%   -41.30%
+ Complexity     5157       85     -5072
=============================================
  Files           936      284      -652
  Lines         41629    11903    -29726
  Branches       4189      990     -3199
=============================================
- Hits          18362      335    -18027
+ Misses        21638    11542    -10096
+ Partials       1629       26     -1603
```

| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `0.00% <ø> (-34.47%)` | :arrow_down: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `4.88% <ø> (-50.86%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `8.91% <0.00%> (-0.21%)` | :arrow_down: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.

| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3259) | Coverage Δ | |
|---|---|---|
| [...org/apache/hudi/utilities/HoodieClusteringJob.java](https://codecov.io/gh/apache/hudi/pull/3259/diff#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hvb2RpZUNsdXN0ZXJpbmdKb2IuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/3259/diff#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../org/apache/hudi/hive/NonPartitionedExtractor.java](https://codecov.io/gh/apache/hudi/pull/3259/diff#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTm9uUGFydGl0aW9uZWRFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../java/org/apache/hudi/metrics/MetricsReporter.java](https://codecov.io/gh/apache/hudi/pull/3259/diff#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...a/org/apache/hudi/metrics/MetricsReporterType.java](https://codecov.io/gh/apache/hudi/pull/3259/diff#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyVHlwZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384036#comment-17384036 ] ASF GitHub Bot commented on HUDI-2164: -- hudi-bot edited a comment on pull request #3259: URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384017#comment-17384017 ] ASF GitHub Bot commented on HUDI-2164: -- lw309637554 commented on a change in pull request #3259: URL: https://github.com/apache/hudi/pull/3259#discussion_r672888552

## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java ##

@@ -121,17 +141,26 @@ public static void main(String[] args) {
   public int cluster(int retry) {
     this.fs = FSUtils.getFs(cfg.basePath, jsc.hadoopConfiguration());
     int ret = UtilHelpers.retry(retry, () -> {
-      if (cfg.runSchedule) {
-        LOG.info("Do schedule");
-        Option instantTime = doSchedule(jsc);
-        int result = instantTime.isPresent() ? 0 : -1;
-        if (result == 0) {
-          LOG.info("The schedule instant time is " + instantTime.get());
+      String runningMode = cfg.runningMode == null ? "" : cfg.runningMode.toLowerCase();

Review comment:
make sense

## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java ##

@@ -171,4 +200,38 @@ private int doCluster(JavaSparkContext jsc) throws Exception {
       return client.scheduleClustering(Option.empty());
     }
   }
+
+  @TestOnly
+  public int doScheduleAndCluster() throws Exception {
+    return this.doScheduleAndCluster(jsc);
+  }
+
+  public int doScheduleAndCluster(JavaSparkContext jsc) throws Exception {
+    LOG.info("Step 1: Do schedule");
+    String schemaStr = getSchemaFromLatestInstant();
+    try (SparkRDDWriteClient client = UtilHelpers.createHoodieClient(jsc, cfg.basePath, schemaStr, cfg.parallelism, Option.empty(), props)) {
+
+      Option instantTime;
+      if (cfg.clusteringInstantTime != null) {
+        client.scheduleClusteringAtInstant(cfg.clusteringInstantTime, Option.empty());
+        instantTime = Option.of(cfg.clusteringInstantTime);
+      } else {
+        instantTime = client.scheduleClustering(Option.empty());
+      }
+
+      int result = instantTime.isPresent() ? 0 : -1;

Review comment:
Can we reuse the Spark client here, i.e. change doCluster(sc) to doCluster(sc, client)?
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383930#comment-17383930 ] ASF GitHub Bot commented on HUDI-2164: -- hudi-bot edited a comment on pull request #3259: URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249

## CI report:

* bb55c4ce75a59c179395748793c113cce5bcf714 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1039)

Bot commands
@hudi-bot supports the following commands:
- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383877#comment-17383877 ] ASF GitHub Bot commented on HUDI-2164: -- hudi-bot edited a comment on pull request #3259: URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249

## CI report:

* b4aa7869d8343a16b225a81844e907fbee63b576 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=919) Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=922)
* bb55c4ce75a59c179395748793c113cce5bcf714 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1039)
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383876#comment-17383876 ] ASF GitHub Bot commented on HUDI-2164: -- hudi-bot edited a comment on pull request #3259: URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249

## CI report:

* b4aa7869d8343a16b225a81844e907fbee63b576 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=919) Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=922)
* bb55c4ce75a59c179395748793c113cce5bcf714 UNKNOWN
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383862#comment-17383862 ]

ASF GitHub Bot commented on HUDI-2164:

lw309637554 commented on a change in pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#discussion_r672894516

## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java

@@ -171,4 +200,38 @@ private int doCluster(JavaSparkContext jsc) throws Exception {
       return client.scheduleClustering(Option.empty());
     }
   }
+
+  @TestOnly
+  public int doScheduleAndCluster() throws Exception {
+    return this.doScheduleAndCluster(jsc);
+  }
+
+  public int doScheduleAndCluster(JavaSparkContext jsc) throws Exception {
+    LOG.info("Step 1: Do schedule");
+    String schemaStr = getSchemaFromLatestInstant();
+    try (SparkRDDWriteClient client = UtilHelpers.createHoodieClient(jsc, cfg.basePath, schemaStr, cfg.parallelism, Option.empty(), props)) {
+
+      Option instantTime;
+      if (cfg.clusteringInstantTime != null) {
+        client.scheduleClusteringAtInstant(cfg.clusteringInstantTime, Option.empty());
+        instantTime = Option.of(cfg.clusteringInstantTime);
+      } else {
+        instantTime = client.scheduleClustering(Option.empty());
+      }
+
+      int result = instantTime.isPresent() ? 0 : -1;

Review comment:
can we reuse the sparkclient modify doCluster(sc) to doCluster(sc, client)
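The review comment above asks for the already-open client to be passed into doCluster rather than a second one being created. A minimal sketch of that idea follows. `FakeClient`, its methods, and the hard-coded instant time are hypothetical stand-ins invented for this example; none of it is the Hudi `SparkRDDWriteClient` API, and the change actually made in the PR may look different.

```java
import java.util.Optional;

final class ReuseClientSketch {

  /** Hypothetical stand-in for a write client; NOT the Hudi SparkRDDWriteClient. */
  static final class FakeClient implements AutoCloseable {
    static int created = 0;
    static int closed = 0;

    FakeClient() {
      created++;
    }

    /** Pretends to schedule a clustering plan; the instant time is a made-up placeholder. */
    Optional<String> scheduleClustering() {
      return Optional.of("20210719000000");
    }

    /** Pretends to execute the plan at the given instant. */
    boolean cluster(String instantTime) {
      return instantTime != null && !instantTime.isEmpty();
    }

    @Override
    public void close() {
      closed++;
    }
  }

  /** Executes with a client supplied by the caller instead of creating its own. */
  static int doCluster(FakeClient client, String instantTime) {
    return client.cluster(instantTime) ? 0 : -1;
  }

  /** Schedules and executes inside a single try-with-resources block. */
  static int doScheduleAndCluster() {
    try (FakeClient client = new FakeClient()) {
      Optional<String> instantTime = client.scheduleClustering();
      if (!instantTime.isPresent()) {
        return -1;
      }
      // Same client instance is reused for the execute step.
      return doCluster(client, instantTime.get());
    }
  }

  public static void main(String[] args) {
    int result = doScheduleAndCluster();
    System.out.println("result=" + result
        + ", clients created=" + FakeClient.created
        + ", clients closed=" + FakeClient.closed);
  }
}
```

With one try-with-resources block owning the client, the schedule and execute steps share one lifecycle and the client is closed once, including when the execute step throws.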
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383856#comment-17383856 ]

ASF GitHub Bot commented on HUDI-2164:

lw309637554 commented on a change in pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#discussion_r672888552

## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java

@@ -121,17 +141,26 @@ public static void main(String[] args) {
   public int cluster(int retry) {
     this.fs = FSUtils.getFs(cfg.basePath, jsc.hadoopConfiguration());
     int ret = UtilHelpers.retry(retry, () -> {
-      if (cfg.runSchedule) {
-        LOG.info("Do schedule");
-        Option instantTime = doSchedule(jsc);
-        int result = instantTime.isPresent() ? 0 : -1;
-        if (result == 0) {
-          LOG.info("The schedule instant time is " + instantTime.get());
+      String runningMode = cfg.runningMode == null ? "" : cfg.runningMode.toLowerCase();

Review comment:
make sense
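As a rough illustration of how the normalized `runningMode` string from the diff above might drive the three modes named in the issue (execute, schedule, scheduleAndExecute), here is a small sketch. The class name, the `plan` method, and the choice to treat a missing mode as `execute` (the issue only says execute is the default value) are assumptions made for this example. It calls no Hudi code and is not the PR's implementation.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper for illustration only; it is not part of Hudi. */
final class ClusteringModeSketch {

  /** Lower-cases the mode and maps null to "", mirroring the line in the diff. */
  static String normalize(String runningMode) {
    return runningMode == null ? "" : runningMode.toLowerCase(Locale.ROOT);
  }

  /** Returns the ordered steps a mode would run, or throws if the mode is unusable. */
  static List<String> plan(String runningMode, String instantTime) {
    String mode = normalize(runningMode);
    if (mode.isEmpty()) {
      mode = "execute"; // assumption: a missing mode falls back to the documented default
    }
    switch (mode) {
      case "schedule":
        return Collections.singletonList("schedule");
      case "scheduleandexecute":
        return Arrays.asList("schedule", "execute");
      case "execute":
        if (instantTime == null) {
          throw new IllegalArgumentException("--instant-time is required in execute mode");
        }
        return Collections.singletonList("execute");
      default:
        throw new IllegalArgumentException("Unsupported mode: " + runningMode);
    }
  }

  public static void main(String[] args) {
    System.out.println(plan("scheduleAndExecute", null));
    System.out.println(plan("SCHEDULE", null));
  }
}
```

Normalizing once up front keeps the comparison case-insensitive, so `--mode scheduleAndExecute` and `--mode scheduleandexecute` select the same branch.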
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17382944#comment-17382944 ]

ASF GitHub Bot commented on HUDI-2164:

zhangyue19921010 commented on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-882142496

Hi @lw309637554, Travis and Azure have both passed :) PTAL. Could we move on? :) Thanks a lot.
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17381074#comment-17381074 ]

ASF GitHub Bot commented on HUDI-2164:

hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249

## CI report:

* b4aa7869d8343a16b225a81844e907fbee63b576 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=919) Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=922)

Bot commands
@hudi-bot supports the following commands:

- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17381044#comment-17381044 ]

ASF GitHub Bot commented on HUDI-2164:

hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249

## CI report:

* b4aa7869d8343a16b225a81844e907fbee63b576 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=919) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=922)

Bot commands
@hudi-bot supports the following commands:

- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17381043#comment-17381043 ]

ASF GitHub Bot commented on HUDI-2164:

zhangyue19921010 commented on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-880409359

@hudi-bot run azure
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17381027#comment-17381027 ]

ASF GitHub Bot commented on HUDI-2164:

codecov-commenter edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878091821

# [Codecov](https://codecov.io/gh/apache/hudi/pull/3259?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) Report
> Merging [#3259](https://codecov.io/gh/apache/hudi/pull/3259?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (b4aa786) into [master](https://codecov.io/gh/apache/hudi/commit/d024439764ceeca6366cb33689b729a1c69a6272?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (d024439) will **increase** coverage by `3.65%`.
> The diff coverage is `38.88%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3259/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3259?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)

```diff
@@             Coverage Diff              @@
##             master    #3259      +/-   ##
+ Coverage     44.10%   47.76%   +3.65%
- Complexity     5157     5566     +409
  Files           936      936
  Lines         41629    41653      +24
  Branches       4189     4195       +6
+ Hits          18362    19897    +1535
+ Misses        21638    19987    -1651
- Partials       1629     1769     +140
```

| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `39.97% <ø> (ø)` | |
| hudiclient | `34.47% <ø> (+<0.01%)` | :arrow_up: |
| hudicommon | `48.69% <ø> (ø)` | |
| hudiflink | `59.68% <ø> (ø)` | |
| hudihadoopmr | `52.02% <ø> (ø)` | |
| hudisparkdatasource | `67.21% <ø> (ø)` | |
| hudisync | `55.73% <ø> (ø)` | |
| huditimelineservice | `64.07% <ø> (ø)` | |
| hudiutilities | `58.96% <38.88%> (+49.84%)` | :arrow_up: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.

| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3259?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...org/apache/hudi/utilities/HoodieClusteringJob.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hvb2RpZUNsdXN0ZXJpbmdKb2IuamF2YQ==) | `51.06% <38.88%> (+51.06%)` | :arrow_up: |
| [...e/hudi/client/heartbeat/HoodieHeartbeatClient.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9oZWFydGJlYXQvSG9vZGllSGVhcnRiZWF0Q2xpZW50LmphdmE=) | `69.15% <0.00%> (+0.93%)` | :arrow_up: |
| [.../apache/hudi/utilities/HoodieSnapshotExporter.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hvb2RpZVNuYXBzaG90RXhwb3J0ZXIuamF2YQ==) | `88.79% <0.00%> (+5.17%)` | :arrow_up: |
| [...e/hudi/utilities/transform/ChainedTransformer.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3RyYW5zZm9ybS9DaGFpbmVkVHJhbnNmb3JtZXIuamF2YQ==) | `100.00% <0.00%> (+11.11%)` | :arrow_up: |
| [...g/apache/hudi/utilities/schema/SchemaProvider.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFQcm92aWRlci5qYXZh) | `71.42% <0.00%>
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17381025#comment-17381025 ]

ASF GitHub Bot commented on HUDI-2164:

codecov-commenter edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878091821
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17381022#comment-17381022 ]

ASF GitHub Bot commented on HUDI-2164:

hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249

## CI report:

* b4aa7869d8343a16b225a81844e907fbee63b576 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=919)

Bot commands
@hudi-bot supports the following commands:

- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17381014#comment-17381014 ]

ASF GitHub Bot commented on HUDI-2164:

codecov-commenter edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878091821

# [Codecov](https://codecov.io/gh/apache/hudi/pull/3259?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) Report
> Merging [#3259](https://codecov.io/gh/apache/hudi/pull/3259?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (b4aa786) into [master](https://codecov.io/gh/apache/hudi/commit/d024439764ceeca6366cb33689b729a1c69a6272?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (d024439) will **decrease** coverage by `16.80%`.
> The diff coverage is `38.88%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3259/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3259?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)

```diff
@@              Coverage Diff              @@
##             master    #3259       +/-   ##
=============================================
- Coverage     44.10%   27.29%   -16.81%
+ Complexity     5157     1292     -3865
=============================================
  Files           936      386      -550
  Lines         41629    15367    -26262
  Branches       4189     1345     -2844
=============================================
- Hits          18362     4195    -14167
+ Misses        21638    10864    -10774
+ Partials       1629      308     -1321
```

| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `20.91% <ø> (-13.56%)` | :arrow_down: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `4.85% <ø> (-50.88%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `58.96% <38.88%> (+49.84%)` | :arrow_up: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.

| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3259?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...org/apache/hudi/utilities/HoodieClusteringJob.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hvb2RpZUNsdXN0ZXJpbmdKb2IuamF2YQ==) | `51.06% <38.88%> (+51.06%)` | :arrow_up: |
| [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../org/apache/hudi/hive/NonPartitionedExtractor.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTm9uUGFydGl0aW9uZWRFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../java/org/apache/hudi/metrics/MetricsReporter.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...a/org/apache/hudi/metrics/MetricsReporterType.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyVHlwZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17381007#comment-17381007 ]

ASF GitHub Bot commented on HUDI-2164:
--------------------------------------

codecov-commenter edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878091821

# [Codecov](https://codecov.io/gh/apache/hudi/pull/3259?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) Report
> Merging [#3259](https://codecov.io/gh/apache/hudi/pull/3259?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (b4aa786) into [master](https://codecov.io/gh/apache/hudi/commit/d024439764ceeca6366cb33689b729a1c69a6272?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (d024439) will **decrease** coverage by `28.34%`.
> The diff coverage is `38.88%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3259/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3259?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)

```diff
@@              Coverage Diff              @@
##             master    #3259       +/-   ##
=============================================
- Coverage     44.10%   15.76%   -28.35%
+ Complexity     5157      493     -4664
=============================================
  Files           936      284      -652
  Lines         41629    11859    -29770
  Branches       4189      988     -3201
=============================================
- Hits          18362     1869    -16493
+ Misses        21638     9823    -11815
+ Partials       1629      167     -1462
```

| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `0.00% <ø> (-34.47%)` | :arrow_down: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `4.85% <ø> (-50.88%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `58.96% <38.88%> (+49.84%)` | :arrow_up: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.

| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3259?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...org/apache/hudi/utilities/HoodieClusteringJob.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hvb2RpZUNsdXN0ZXJpbmdKb2IuamF2YQ==) | `51.06% <38.88%> (+51.06%)` | :arrow_up: |
| [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../org/apache/hudi/hive/NonPartitionedExtractor.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTm9uUGFydGl0aW9uZWRFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../java/org/apache/hudi/metrics/MetricsReporter.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...a/org/apache/hudi/metrics/MetricsReporterType.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyVHlwZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17381001#comment-17381001 ]

ASF GitHub Bot commented on HUDI-2164:
--------------------------------------

zhangyue19921010 edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-880370120

> @zhangyue19921010 hello, Which company are you from? Can we add wechat? My wechat is lw19900302

It's my pleasure. I'm from freewheel :)
Also, all the changes are done. PTAL, and thanks a lot for your review.
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17381000#comment-17381000 ]

ASF GitHub Bot commented on HUDI-2164:
--------------------------------------

zhangyue19921010 commented on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-880370120

> @zhangyue19921010 hello, Which company are you from? Can we add wechat? My wechat is lw19900302

It's my pleasure. I'm from freewheel :)
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380999#comment-17380999 ]

ASF GitHub Bot commented on HUDI-2164:
--------------------------------------

zhangyue19921010 commented on a change in pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#discussion_r670111508

## File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/functional/TestHoodieDeltaStreamer.java ##
@@ -1059,6 +1059,50 @@ public void testHoodieAsyncClusteringJob() throws Exception {
     assertEquals(1, metaClient.getActiveTimeline().getCompletedReplaceTimeline().getInstants().toArray().length);
   }
 
+  @Test
+  public void testHoodieAsyncClusteringJobWithScheduleAndExecute() throws Exception {
+    String tableBasePath = dfsBasePath + "/asyncClustering2";
+    // Keep it higher than batch-size to test continuous mode
+    int totalRecords = 3000;
+
+    // Initial bulk insert
+    HoodieDeltaStreamer.Config cfg = TestHelpers.makeConfig(tableBasePath, WriteOperationType.INSERT);
+    cfg.continuousMode = true;
+    cfg.tableType = HoodieTableType.COPY_ON_WRITE.name();
+    cfg.configs.add(String.format("%s=%d", SourceConfigs.MAX_UNIQUE_RECORDS_PROP, totalRecords));
+    cfg.configs.add(String.format("%s=false", HoodieCompactionConfig.AUTO_CLEAN_PROP.key()));
+    cfg.configs.add(String.format("%s=true", HoodieClusteringConfig.ASYNC_CLUSTERING_ENABLE_OPT_KEY.key()));
+    HoodieDeltaStreamer ds = new HoodieDeltaStreamer(cfg, jsc);
+    deltaStreamerTestRunner(ds, cfg, (r) -> {
+      TestHelpers.assertAtLeastNCommits(2, tableBasePath, dfs);
+      HoodieClusteringJob.Config scheduleClusteringConfig = buildHoodieClusteringUtilConfig(tableBasePath,
+          null, true);
+      scheduleClusteringConfig.runningMode = "scheduleAndExecute";
+      HoodieClusteringJob scheduleClusteringJob = new HoodieClusteringJob(jsc, scheduleClusteringConfig);
+
+      try {
+        int result = scheduleClusteringJob.doScheduleAndCluster();
+        if (result == 0) {
+          LOG.info("Cluster success");
+        } else {
+          LOG.warn("Import failed");
+          return false;
+        }
+      } catch (Exception e) {
+        LOG.warn("ScheduleAndExecute clustering failed", e);
+        return false;
+      }
+
+      HoodieTableMetaClient metaClient = HoodieTableMetaClient.builder().setConf(this.dfs.getConf()).setBasePath(tableBasePath).setLoadActiveTimelineOnLoad(true).build();
+      int pendingReplaceSize = metaClient.getActiveTimeline().filterPendingReplaceTimeline().getInstants().toArray().length;

Review comment:
Nice catch. Changed.
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380998#comment-17380998 ]

ASF GitHub Bot commented on HUDI-2164:
--------------------------------------

zhangyue19921010 commented on a change in pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#discussion_r670111402

## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java ##
@@ -171,4 +200,38 @@ private int doCluster(JavaSparkContext jsc) throws Exception {
       return client.scheduleClustering(Option.empty());
     }
   }
+
+  @TestOnly
+  public int doScheduleAndCluster() throws Exception {
+    return this.doScheduleAndCluster(jsc);
+  }
+
+  public int doScheduleAndCluster(JavaSparkContext jsc) throws Exception {
+    LOG.info("Step 1: Do schedule");
+    String schemaStr = getSchemaFromLatestInstant();
+    try (SparkRDDWriteClient client = UtilHelpers.createHoodieClient(jsc, cfg.basePath, schemaStr, cfg.parallelism, Option.empty(), props)) {
+
+      Option instantTime;
+      if (cfg.clusteringInstantTime != null) {
+        client.scheduleClusteringAtInstant(cfg.clusteringInstantTime, Option.empty());
+        instantTime = Option.of(cfg.clusteringInstantTime);
+      } else {
+        instantTime = client.scheduleClustering(Option.empty());
+      }
+
+      int result = instantTime.isPresent() ? 0 : -1;

Review comment:
Actually, doSchedule() and doCluster() already exist. But if doScheduleAndCluster() called doSchedule() and doCluster() directly, it would start and stop SparkRDDWriteClient twice, which is expensive and unnecessary. It is probably better to let the schedule action and the cluster action share one SparkRDDWriteClient. Otherwise, for example, the Timeline service is started and stopped twice:
```
21/07/15 11:05:11 INFO EmbeddedTimelineService: Starting Timeline service !!
21/07/15 11:05:11 INFO EmbeddedTimelineService: Overriding hostIp to (localhost) found in spark-conf. It was null
21/07/15 11:05:11 INFO FileSystemViewManager: Creating View Manager with storage type :MEMORY
21/07/15 11:05:11 INFO FileSystemViewManager: Creating in-memory based Table View
21/07/15 11:05:11 INFO log: Logging initialized @4500ms to org.apache.hudi.org.eclipse.jetty.util.log.Slf4jLog
21/07/15 11:05:11 INFO Javalin:
           __                      __ _
          / /____ _ _   __ ____ _ / /(_)____
     __  / // __ `/| | / // __ `// // // __ \
    / /_/ // /_/ / | |/ // /_/ // // // / / /
    \____/ \__,_/ |___/ \__,_//_//_//_/ /_/

        https://javalin.io/documentation

21/07/15 11:05:11 INFO Javalin: Starting Javalin ...
```
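The point made in the review reply above, that scheduling and executing should share one write client instead of opening two, can be illustrated with a small stand-alone sketch. `FakeWriteClient`, its counters, and the hard-coded instant time are hypothetical stand-ins invented for this illustration; this is not Hudi's `SparkRDDWriteClient` and it does not reproduce the PR's code.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;

final class SharedClientSketch {

  /** Hypothetical stand-in for an expensive, closeable client. */
  static final class FakeWriteClient implements AutoCloseable {
    static final AtomicInteger STARTS = new AtomicInteger();
    static final AtomicInteger STOPS = new AtomicInteger();

    FakeWriteClient() {
      // Stands in for costly startup work such as launching an embedded service.
      STARTS.incrementAndGet();
    }

    /** Pretends to build a plan and returns a made-up instant time. */
    Optional<String> schedule() {
      return Optional.of("20210715110511");
    }

    /** Pretends to execute the plan at the given instant. */
    boolean cluster(String instantTime) {
      return instantTime != null && !instantTime.isEmpty();
    }

    @Override
    public void close() {
      STOPS.incrementAndGet();
    }
  }

  /** One client for both steps: a single start and a single stop. */
  static int scheduleAndCluster() {
    try (FakeWriteClient client = new FakeWriteClient()) {
      Optional<String> instant = client.schedule();
      if (!instant.isPresent()) {
        return -1;
      }
      return client.cluster(instant.get()) ? 0 : -1;
    }
  }

  /** Naive composition for comparison: each step opens its own client. */
  static int scheduleThenCluster() {
    Optional<String> instant;
    try (FakeWriteClient client = new FakeWriteClient()) {
      instant = client.schedule();
    }
    if (!instant.isPresent()) {
      return -1;
    }
    try (FakeWriteClient client = new FakeWriteClient()) {
      return client.cluster(instant.get()) ? 0 : -1;
    }
  }
}
```

In this sketch the shared variant pays the startup cost once, while the naive composition pays it twice for the same result, which is the overhead the log excerpt points at.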
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380997#comment-17380997 ]

ASF GitHub Bot commented on HUDI-2164:
--------------------------------------

codecov-commenter edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878091821

# [Codecov](https://codecov.io/gh/apache/hudi/pull/3259?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) Report
> Merging [#3259](https://codecov.io/gh/apache/hudi/pull/3259?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (b4aa786) into [master](https://codecov.io/gh/apache/hudi/commit/d024439764ceeca6366cb33689b729a1c69a6272?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (d024439) will **decrease** coverage by `0.00%`.
> The diff coverage is `0.00%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3259/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3259?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)

```diff
@@            Coverage Diff            @@
##           master   #3259      +/-   ##
=========================================
- Coverage    2.83%   2.82%   -0.01%
  Complexity     85      85
=========================================
  Files         284     284
  Lines       11835   11859      +24
  Branches      982     988       +6
=========================================
  Hits          335     335
- Misses      11474   11498      +24
  Partials       26      26
```

| Flag | Coverage Δ | |
|---|---|---|
| hudiclient | `0.00% <ø> (ø)` | |
| hudisync | `4.85% <ø> (ø)` | |
| hudiutilities | `9.04% <0.00%> (-0.08%)` | :arrow_down: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.

| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3259?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...org/apache/hudi/utilities/HoodieClusteringJob.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hvb2RpZUNsdXN0ZXJpbmdKb2IuamF2YQ==) | `0.00% <0.00%> (ø)` | |

------

[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3259?src=pr=continue_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3259?src=pr=footer_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation). Last update [d024439...b4aa786](https://codecov.io/gh/apache/hudi/pull/3259?src=pr=lastupdated_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation).
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380996#comment-17380996 ]

ASF GitHub Bot commented on HUDI-2164:
--------------------------------------

zhangyue19921010 commented on a change in pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#discussion_r670110077

## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java ##
@@ -171,4 +200,38 @@ private int doCluster(JavaSparkContext jsc) throws Exception {
       return client.scheduleClustering(Option.empty());
     }
   }
+
+  @TestOnly
+  public int doScheduleAndCluster() throws Exception {

Review comment:
Sure thing. Changed.
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380995#comment-17380995 ]

ASF GitHub Bot commented on HUDI-2164:
--------------------------------------

zhangyue19921010 commented on a change in pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#discussion_r670109986

## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java ##
@@ -121,17 +141,26 @@ public static void main(String[] args) {
   public int cluster(int retry) {
     this.fs = FSUtils.getFs(cfg.basePath, jsc.hadoopConfiguration());
     int ret = UtilHelpers.retry(retry, () -> {
-      if (cfg.runSchedule) {
-        LOG.info("Do schedule");
-        Option instantTime = doSchedule(jsc);
-        int result = instantTime.isPresent() ? 0 : -1;
-        if (result == 0) {
-          LOG.info("The schedule instant time is " + instantTime.get());
+      String runningMode = cfg.runningMode == null ? "" : cfg.runningMode.toLowerCase();
+      switch (runningMode) {
+        case SCHEDULE: {
+          LOG.info("Running Mode: [" + SCHEDULE + "]; Do schedule");
+          Option instantTime = doSchedule(jsc);
+          int result = instantTime.isPresent() ? 0 : -1;
+          if (result == 0) {
+            LOG.info("The schedule instant time is " + instantTime.get());
+          }
+          return result;
+        }
+        case SCHEDULE_AND_EXECUTE: {
+          LOG.info("Running Mode: [" + SCHEDULE_AND_EXECUTE + "]");
+          return doScheduleAndCluster(jsc);
+        }
+        case EXECUTE:
+        default: {
+          LOG.info("Running Mode: [" + EXECUTE + "]; Do cluster");

Review comment:
Nice catch. I changed the default branch to `LOG.info("Unsupported running mode [" + runningMode + "], quit the job directly");` in case users pass an invalid value for --mode, such as `--mode abcd`.
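The change described in the reply above, where an unrecognized `--mode` value is rejected in the `default` branch instead of silently running `execute`, might look roughly like the following stand-alone sketch. The constant values, the stubbed `doSchedule`/`doCluster`/`doScheduleAndCluster` helpers, and the `-1` return code are assumptions made for illustration; they are not copied from the merged Hudi code. Treating a missing mode as unsupported is also an assumption here, since the issue description lists `execute` as the default value.

```java
import java.util.Locale;

final class RunningModeSketch {
  // Assumed lower-case mode names; the real constants live in HoodieClusteringJob.
  static final String EXECUTE = "execute";
  static final String SCHEDULE = "schedule";
  static final String SCHEDULE_AND_EXECUTE = "scheduleandexecute";

  /** Dispatches on the mode string; returns 0 on success and -1 for an unsupported mode. */
  static int run(String mode) {
    // Guard against null before lower-casing so the switch never sees a null selector.
    String runningMode = mode == null ? "" : mode.toLowerCase(Locale.ROOT);
    switch (runningMode) {
      case SCHEDULE:
        return doSchedule();
      case SCHEDULE_AND_EXECUTE:
        return doScheduleAndCluster();
      case EXECUTE:
        return doCluster();
      default:
        // Unknown values such as "abcd" end the job instead of falling through to execute.
        System.err.println("Unsupported running mode [" + runningMode + "], quit the job directly");
        return -1;
    }
  }

  // Stubs standing in for the real scheduling and clustering work.
  private static int doSchedule() {
    return 0;
  }

  private static int doCluster() {
    return 0;
  }

  private static int doScheduleAndCluster() {
    return 0;
  }
}
```

Keeping `EXECUTE` as its own case rather than sharing the `default` label is what lets a typo in the mode fail fast.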
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380994#comment-17380994 ]

ASF GitHub Bot commented on HUDI-2164:
--------------------------------------

zhangyue19921010 commented on a change in pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#discussion_r670109400

## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java ##
@@ -121,17 +141,26 @@ public static void main(String[] args) {
   public int cluster(int retry) {
     this.fs = FSUtils.getFs(cfg.basePath, jsc.hadoopConfiguration());
     int ret = UtilHelpers.retry(retry, () -> {
-      if (cfg.runSchedule) {
-        LOG.info("Do schedule");
-        Option instantTime = doSchedule(jsc);
-        int result = instantTime.isPresent() ? 0 : -1;
-        if (result == 0) {
-          LOG.info("The schedule instant time is " + instantTime.get());
+      String runningMode = cfg.runningMode == null ? "" : cfg.runningMode.toLowerCase();

Review comment:
When developers call `public int cluster(int retry)` internally, as in
https://github.com/apache/hudi/blob/5804ad8e32ae05758ebc5e47f5d4fb4db371ab52/hudi-utilities/src/test/java/org/apache/hudi/utilities/functional/TestHoodieDeltaStreamer.java#L1069
they may not set the running mode config, so we need to check this value to avoid an NPE.
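The null check discussed in the reply above matters because calling `toLowerCase()` on a null reference throws a `NullPointerException`. A minimal sketch of a null-safe normalization step follows; the `normalize` helper is hypothetical, and the use of `Locale.ROOT` is an addition for locale-independent lower-casing that the PR's diff does not include.

```java
import java.util.Locale;

final class ModeNormalizerSketch {

  /** Returns a lower-cased mode, or an empty string when the caller never set one. */
  static String normalize(String configuredMode) {
    if (configuredMode == null) {
      // Internal callers that build the config by hand may leave the mode unset.
      return "";
    }
    return configuredMode.toLowerCase(Locale.ROOT);
  }
}
```

For example, `normalize(null)` yields an empty string rather than throwing, so a later comparison or `switch` can decide how to treat the unset case.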
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380993#comment-17380993 ]

ASF GitHub Bot commented on HUDI-2164:
--------------------------------------

hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249

## CI report:

* 7ae050ed4b5ff0ce124a0ec580d51b3dfbb7f51a Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=875)
* b4aa7869d8343a16b225a81844e907fbee63b576 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=919)

Bot commands

@hudi-bot supports the following commands:

- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380992#comment-17380992 ] ASF GitHub Bot commented on HUDI-2164:

hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249

## CI report:

* 7ae050ed4b5ff0ce124a0ec580d51b3dfbb7f51a Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=875)
* b4aa7869d8343a16b225a81844e907fbee63b576 UNKNOWN
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380534#comment-17380534 ] ASF GitHub Bot commented on HUDI-2164:

lw309637554 commented on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-879829631

@zhangyue19921010 Hello, which company are you from? Can we connect on WeChat? My WeChat ID is lw19900302.
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380530#comment-17380530 ] ASF GitHub Bot commented on HUDI-2164:

lw309637554 commented on a change in pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#discussion_r669542580

## File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/functional/TestHoodieDeltaStreamer.java ##
@@ -1059,6 +1059,50 @@ public void testHoodieAsyncClusteringJob() throws Exception {
     assertEquals(1, metaClient.getActiveTimeline().getCompletedReplaceTimeline().getInstants().toArray().length);
   }

+  @Test
+  public void testHoodieAsyncClusteringJobWithScheduleAndExecute() throws Exception {
+    String tableBasePath = dfsBasePath + "/asyncClustering2";
+    // Keep it higher than batch-size to test continuous mode
+    int totalRecords = 3000;
+
+    // Initial bulk insert
+    HoodieDeltaStreamer.Config cfg = TestHelpers.makeConfig(tableBasePath, WriteOperationType.INSERT);
+    cfg.continuousMode = true;
+    cfg.tableType = HoodieTableType.COPY_ON_WRITE.name();
+    cfg.configs.add(String.format("%s=%d", SourceConfigs.MAX_UNIQUE_RECORDS_PROP, totalRecords));
+    cfg.configs.add(String.format("%s=false", HoodieCompactionConfig.AUTO_CLEAN_PROP.key()));
+    cfg.configs.add(String.format("%s=true", HoodieClusteringConfig.ASYNC_CLUSTERING_ENABLE_OPT_KEY.key()));
+    HoodieDeltaStreamer ds = new HoodieDeltaStreamer(cfg, jsc);
+    deltaStreamerTestRunner(ds, cfg, (r) -> {
+      TestHelpers.assertAtLeastNCommits(2, tableBasePath, dfs);
+      HoodieClusteringJob.Config scheduleClusteringConfig = buildHoodieClusteringUtilConfig(tableBasePath,
+          null, true);
+      scheduleClusteringConfig.runningMode = "scheduleAndExecute";
+      HoodieClusteringJob scheduleClusteringJob = new HoodieClusteringJob(jsc, scheduleClusteringConfig);
+
+      try {
+        int result = scheduleClusteringJob.doScheduleAndCluster();
+        if (result == 0) {
+          LOG.info("Cluster success");
+        } else {
+          LOG.warn("Import failed");
+          return false;
+        }
+      } catch (Exception e) {
+        LOG.warn("ScheduleAndExecute clustering failed", e);
+        return false;
+      }
+
+      HoodieTableMetaClient metaClient = HoodieTableMetaClient.builder().setConf(this.dfs.getConf()).setBasePath(tableBasePath).setLoadActiveTimelineOnLoad(true).build();
+      int pendingReplaceSize = metaClient.getActiveTimeline().filterPendingReplaceTimeline().getInstants().toArray().length;

Review comment:
Can we use something like `TestHelpers.assertAtLeastNReplaceCommits(2, tableBasePath, dfs);` here?
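A helper of the kind suggested in this review comment could look like the sketch below. It is an illustration only and does not reproduce Hudi's `TestHelpers`: the timeline is modeled as a hypothetical list of `action:STATE` strings, and the class and method names (`ReplaceCommitAssertions`, `countCompletedReplaceCommits`) are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.List;

// Toy model (assumption): a timeline is a list of "action:STATE" strings,
// e.g. "replacecommit:COMPLETED". Real Hudi timelines are richer objects.
class ReplaceCommitAssertions {

    // Counts the entries that represent a completed replace commit.
    static long countCompletedReplaceCommits(List<String> timeline) {
        return timeline.stream()
            .filter("replacecommit:COMPLETED"::equals)
            .count();
    }

    // Fails with a descriptive message when fewer than minExpected
    // completed replace commits are present.
    static void assertAtLeastNReplaceCommits(int minExpected, List<String> timeline) {
        long actual = countCompletedReplaceCommits(timeline);
        if (actual < minExpected) {
            throw new AssertionError("Expected at least " + minExpected
                + " completed replace commits, found " + actual);
        }
    }

    public static void main(String[] args) {
        List<String> timeline = Arrays.asList(
            "commit:COMPLETED",
            "replacecommit:REQUESTED",
            "replacecommit:COMPLETED");
        assertAtLeastNReplaceCommits(1, timeline);
        System.out.println("completed replace commits: "
            + countCompletedReplaceCommits(timeline));
    }
}
```

Wrapping the count and the comparison in one named assertion keeps the test body short and puts the expected and actual counts into the failure message.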
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380525#comment-17380525 ] ASF GitHub Bot commented on HUDI-2164:

lw309637554 commented on a change in pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#discussion_r669540279

## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java ##
@@ -171,4 +200,38 @@ private int doCluster(JavaSparkContext jsc) throws Exception {
       return client.scheduleClustering(Option.empty());
     }
   }
+
+  @TestOnly
+  public int doScheduleAndCluster() throws Exception {
+    return this.doScheduleAndCluster(jsc);
+  }
+
+  public int doScheduleAndCluster(JavaSparkContext jsc) throws Exception {
+    LOG.info("Step 1: Do schedule");
+    String schemaStr = getSchemaFromLatestInstant();
+    try (SparkRDDWriteClient client = UtilHelpers.createHoodieClient(jsc, cfg.basePath, schemaStr, cfg.parallelism, Option.empty(), props)) {
+
+      Option instantTime;
+      if (cfg.clusteringInstantTime != null) {
+        client.scheduleClusteringAtInstant(cfg.clusteringInstantTime, Option.empty());
+        instantTime = Option.of(cfg.clusteringInstantTime);
+      } else {
+        instantTime = client.scheduleClustering(Option.empty());
+      }
+
+      int result = instantTime.isPresent() ? 0 : -1;

Review comment:
1. Can we implement three functions: doSchedule, doCluster, and doScheduleAndCluster?
2. doScheduleAndCluster would then consist of doSchedule and doCluster:
doScheduleAndCluster {
  doSchedule()
  doCluster()
}
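The composition proposed in this review comment can be sketched as follows. This is a minimal, self-contained illustration, not the PR's code: the class `ClusteringSteps`, the `hasPendingWork` parameter, and the `instant-N` strings are hypothetical stand-ins for the Spark write client and real instant times. Only the 0 / -1 return convention is taken from the diff quoted in the comment.

```java
import java.util.Optional;

// Hypothetical stand-in for the clustering job; none of these names are Hudi APIs.
class ClusteringSteps {
    private int scheduledCount = 0;
    private String lastClusteredInstant = null;

    // Step 1: build a plan and return its instant time,
    // or empty when there is nothing to schedule.
    Optional<String> doSchedule(boolean hasPendingWork) {
        if (!hasPendingWork) {
            return Optional.empty();
        }
        scheduledCount++;
        return Optional.of("instant-" + scheduledCount);
    }

    // Step 2: execute the plan for the given instant.
    // Returns 0 on success and -1 on failure.
    int doCluster(String instantTime) {
        if (instantTime == null || instantTime.isEmpty()) {
            return -1;
        }
        lastClusteredInstant = instantTime;
        return 0;
    }

    // Combined mode: reuse the two steps instead of duplicating their bodies.
    int doScheduleAndCluster(boolean hasPendingWork) {
        Optional<String> instantTime = doSchedule(hasPendingWork);
        if (!instantTime.isPresent()) {
            return -1; // scheduling produced no plan, so there is nothing to execute
        }
        return doCluster(instantTime.get());
    }

    String lastClusteredInstant() {
        return lastClusteredInstant;
    }

    public static void main(String[] args) {
        ClusteringSteps job = new ClusteringSteps();
        System.out.println(job.doScheduleAndCluster(true));
        System.out.println(job.doScheduleAndCluster(false));
    }
}
```

With this split, the `schedule` and `execute` modes call one step each and `scheduleAndExecute` chains them, so the instant time flows from the first step to the second without being copied by hand.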
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380521#comment-17380521 ] ASF GitHub Bot commented on HUDI-2164:

lw309637554 commented on a change in pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#discussion_r669537910

## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java ##
@@ -171,4 +200,38 @@ private int doCluster(JavaSparkContext jsc) throws Exception {
       return client.scheduleClustering(Option.empty());
     }
   }
+
+  @TestOnly
+  public int doScheduleAndCluster() throws Exception {

Review comment:
Can we remove this?
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380518#comment-17380518 ] ASF GitHub Bot commented on HUDI-2164:

lw309637554 commented on a change in pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#discussion_r669536894

## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java ##
@@ -121,17 +141,26 @@ public static void main(String[] args) {
   public int cluster(int retry) {
     this.fs = FSUtils.getFs(cfg.basePath, jsc.hadoopConfiguration());
     int ret = UtilHelpers.retry(retry, () -> {
-      if (cfg.runSchedule) {
-        LOG.info("Do schedule");
-        Option instantTime = doSchedule(jsc);
-        int result = instantTime.isPresent() ? 0 : -1;
-        if (result == 0) {
-          LOG.info("The schedule instant time is " + instantTime.get());
+      String runningMode = cfg.runningMode == null ? "" : cfg.runningMode.toLowerCase();

Review comment:
When will runningMode be null?
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380520#comment-17380520 ] ASF GitHub Bot commented on HUDI-2164:

lw309637554 commented on a change in pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#discussion_r669537132

## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java ##
@@ -121,17 +141,26 @@ public static void main(String[] args) {
   public int cluster(int retry) {
     this.fs = FSUtils.getFs(cfg.basePath, jsc.hadoopConfiguration());
     int ret = UtilHelpers.retry(retry, () -> {
-      if (cfg.runSchedule) {
-        LOG.info("Do schedule");
-        Option instantTime = doSchedule(jsc);
-        int result = instantTime.isPresent() ? 0 : -1;
-        if (result == 0) {
-          LOG.info("The schedule instant time is " + instantTime.get());
+      String runningMode = cfg.runningMode == null ? "" : cfg.runningMode.toLowerCase();
+      switch (runningMode) {
+        case SCHEDULE: {
+          LOG.info("Running Mode: [" + SCHEDULE + "]; Do schedule");
+          Option instantTime = doSchedule(jsc);
+          int result = instantTime.isPresent() ? 0 : -1;
+          if (result == 0) {
+            LOG.info("The schedule instant time is " + instantTime.get());
+          }
+          return result;
+        }
+        case SCHEDULE_AND_EXECUTE: {
+          LOG.info("Running Mode: [" + SCHEDULE_AND_EXECUTE + "]");
+          return doScheduleAndCluster(jsc);
+        }
+        case EXECUTE:
+        default: {
+          LOG.info("Running Mode: [" + EXECUTE + "]; Do cluster");

Review comment:
If we have already checked the parameter, this case will not happen, will it?
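One way to address this review comment is to validate the mode once, before the `switch`, so that an unknown value fails fast instead of silently falling into the `default` branch. The sketch below is an assumption-laden illustration, not the PR's code: `RunningModeParser`, `Mode`, and `parse` are hypothetical names. Treating a missing value as `execute` follows the issue description, which lists `execute` as the default for `--mode`.

```java
import java.util.Locale;

// Hypothetical helper; the names are illustrative and not part of Hudi.
class RunningModeParser {

    enum Mode { SCHEDULE, EXECUTE, SCHEDULE_AND_EXECUTE }

    // Normalizes the raw --mode value. A missing or blank value falls back
    // to EXECUTE; anything unrecognized is rejected with an exception.
    static Mode parse(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            return Mode.EXECUTE;
        }
        switch (raw.trim().toLowerCase(Locale.ROOT)) {
            case "schedule":
                return Mode.SCHEDULE;
            case "execute":
                return Mode.EXECUTE;
            case "scheduleandexecute":
                return Mode.SCHEDULE_AND_EXECUTE;
            default:
                throw new IllegalArgumentException("Unsupported running mode: " + raw);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse(null));
        System.out.println(parse("scheduleAndExecute"));
        System.out.println(parse("Schedule"));
    }
}
```

After parsing, the dispatching `switch` can cover exactly the three enum constants, and a typo such as `--mode shedule` surfaces as an error rather than an unintended execute run.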
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380271#comment-17380271 ] ASF GitHub Bot commented on HUDI-2164:

zhangyue19921010 commented on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-879554232

Hi @satishkotha and @lw309637554, sorry to bother you. Could you please take a look at this patch at your convenience? I believe it is worth a little attention, because with this PR users can build an async clustering pipeline that schedules and clusters automatically and easily :) Looking forward to your reply. Thanks a lot.
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379755#comment-17379755 ] ASF GitHub Bot commented on HUDI-2164:

codecov-commenter edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878091821

# [Codecov](https://codecov.io/gh/apache/hudi/pull/3259?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) Report

> Merging [#3259](https://codecov.io/gh/apache/hudi/pull/3259?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (7ae050e) into [master](https://codecov.io/gh/apache/hudi/commit/16e90d30eaa14e5c1c4632ad0a90497df601c637?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (16e90d3) will **increase** coverage by `0.09%`.
> The diff coverage is `42.85%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3259/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3259?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)

```diff
@@            Coverage Diff            @@
##             master    #3259     +/-  ##
+ Coverage     47.62%   47.71%   +0.09%
- Complexity     5502     5529     +27
  Files           930      934      +4
  Lines         41268    41480    +212
  Branches       4137     4173     +36
+ Hits          19655    19794    +139
- Misses        19865    19924     +59
- Partials       1748     1762     +14
```

| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `39.97% <ø> (ø)` | |
| hudiclient | `34.46% <ø> (-0.13%)` | :arrow_down: |
| hudicommon | `48.55% <ø> (-0.04%)` | :arrow_down: |
| hudiflink | `60.03% <ø> (+0.44%)` | :arrow_up: |
| hudihadoopmr | `51.55% <ø> (+0.26%)` | :arrow_up: |
| hudisparkdatasource | `67.37% <ø> (+0.05%)` | :arrow_up: |
| hudisync | `54.51% <ø> (+0.03%)` | :arrow_up: |
| huditimelineservice | `64.07% <ø> (ø)` | |
| hudiutilities | `59.17% <42.85%> (+0.60%)` | :arrow_up: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.

| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3259?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...org/apache/hudi/utilities/HoodieClusteringJob.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hvb2RpZUNsdXN0ZXJpbmdKb2IuamF2YQ==) | `58.06% <42.85%> (-3.37%)` | :arrow_down: |
| [...ion/cluster/SparkClusteringPlanActionExecutor.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1zcGFyay1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdGFibGUvYWN0aW9uL2NsdXN0ZXIvU3BhcmtDbHVzdGVyaW5nUGxhbkFjdGlvbkV4ZWN1dG9yLmphdmE=) | `60.00% <0.00%> (-15.00%)` | :arrow_down: |
| [...ache/hudi/common/fs/inline/InMemoryFileSystem.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL2ZzL2lubGluZS9Jbk1lbW9yeUZpbGVTeXN0ZW0uamF2YQ==) | `79.31% <0.00%> (-10.35%)` | :arrow_down: |
| [...java/org/apache/hudi/table/HoodieTableFactory.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS90YWJsZS9Ib29kaWVUYWJsZUZhY3RvcnkuamF2YQ==) | `84.61% <0.00%> (-7.06%)` | :arrow_down: |
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379754#comment-17379754 ] ASF GitHub Bot commented on HUDI-2164:

codecov-commenter edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878091821

# [Codecov](https://codecov.io/gh/apache/hudi/pull/3259?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) Report

> Merging [#3259](https://codecov.io/gh/apache/hudi/pull/3259?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (7ae050e) into [master](https://codecov.io/gh/apache/hudi/commit/16e90d30eaa14e5c1c4632ad0a90497df601c637?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (16e90d3) will **decrease** coverage by `3.90%`.
> The diff coverage is `42.85%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3259/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3259?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)

```diff
@@            Coverage Diff            @@
##             master    #3259     +/-  ##
- Coverage     47.62%   43.71%   -3.91%
+ Complexity     5502     4598    -904
  Files           930      818    -112
  Lines         41268    35216   -6052
  Branches       4137     3233    -904
- Hits          19655    15396   -4259
+ Misses        19865    18664   -1201
+ Partials       1748     1156    -592
```

| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `39.97% <ø> (ø)` | |
| hudiclient | `34.46% <ø> (-0.13%)` | :arrow_down: |
| hudicommon | `48.55% <ø> (-0.04%)` | :arrow_down: |
| hudiflink | `60.03% <ø> (+0.44%)` | :arrow_up: |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `5.37% <ø> (-49.11%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `59.17% <42.85%> (+0.60%)` | :arrow_up: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.

| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3259?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...org/apache/hudi/utilities/HoodieClusteringJob.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hvb2RpZUNsdXN0ZXJpbmdKb2IuamF2YQ==) | `58.06% <42.85%> (-3.37%)` | :arrow_down: |
| [.../org/apache/hudi/hive/NonPartitionedExtractor.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTm9uUGFydGl0aW9uZWRFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...he/hudi/hive/HiveStylePartitionValueExtractor.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvSGl2ZVN0eWxlUGFydGl0aW9uVmFsdWVFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...main/java/org/apache/hudi/hive/HiveSyncConfig.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvSGl2ZVN5bmNDb25maWcuamF2YQ==) | `0.00% <0.00%> (-98.08%)` | :arrow_down: |
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379751#comment-17379751 ]

ASF GitHub Bot commented on HUDI-2164:
--------------------------------------

codecov-commenter edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878091821

# [Codecov](https://codecov.io/gh/apache/hudi/pull/3259?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) Report
> Merging [#3259](https://codecov.io/gh/apache/hudi/pull/3259?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (7ae050e) into [master](https://codecov.io/gh/apache/hudi/commit/16e90d30eaa14e5c1c4632ad0a90497df601c637?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (16e90d3) will **decrease** coverage by `20.06%`.
> The diff coverage is `42.85%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3259/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3259?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)

```diff
@@              Coverage Diff              @@
##             master    #3259       +/-   ##
=============================================
- Coverage     47.62%   27.56%   -20.07%
+ Complexity     5502     1293     -4209
=============================================
  Files           930      385      -545
  Lines         41268    15238    -26030
  Branches       4137     1322     -2815
=============================================
- Hits          19655     4200    -15455
+ Misses        19865    10731     -9134
+ Partials       1748      307     -1441
```

| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `20.93% <ø> (-13.65%)` | :arrow_down: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `5.37% <ø> (-49.11%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `59.17% <42.85%> (+0.60%)` | :arrow_up: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.

| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3259?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...org/apache/hudi/utilities/HoodieClusteringJob.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hvb2RpZUNsdXN0ZXJpbmdKb2IuamF2YQ==) | `58.06% <42.85%> (-3.37%)` | :arrow_down: |
| [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../org/apache/hudi/hive/NonPartitionedExtractor.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTm9uUGFydGl0aW9uZWRFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../java/org/apache/hudi/metrics/MetricsReporter.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...a/org/apache/hudi/metrics/MetricsReporterType.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyVHlwZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379741#comment-17379741 ]

ASF GitHub Bot commented on HUDI-2164:
--------------------------------------

codecov-commenter edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878091821

# [Codecov](https://codecov.io/gh/apache/hudi/pull/3259?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) Report
> Merging [#3259](https://codecov.io/gh/apache/hudi/pull/3259?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (7ae050e) into [master](https://codecov.io/gh/apache/hudi/commit/16e90d30eaa14e5c1c4632ad0a90497df601c637?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (16e90d3) will **decrease** coverage by `31.64%`.
> The diff coverage is `42.85%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3259/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3259?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)

```diff
@@              Coverage Diff              @@
##             master    #3259       +/-   ##
=============================================
- Coverage     47.62%   15.97%   -31.65%
+ Complexity     5502      495     -5007
=============================================
  Files           930      283      -647
  Lines         41268    11734    -29534
  Branches       4137      967     -3170
=============================================
- Hits          19655     1875    -17780
+ Misses        19865     9692    -10173
+ Partials       1748      167     -1581
```

| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `0.00% <ø> (-34.59%)` | :arrow_down: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `5.37% <ø> (-49.11%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `59.17% <42.85%> (+0.60%)` | :arrow_up: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.

| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3259?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...org/apache/hudi/utilities/HoodieClusteringJob.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hvb2RpZUNsdXN0ZXJpbmdKb2IuamF2YQ==) | `58.06% <42.85%> (-3.37%)` | :arrow_down: |
| [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../org/apache/hudi/hive/NonPartitionedExtractor.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTm9uUGFydGl0aW9uZWRFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../java/org/apache/hudi/metrics/MetricsReporter.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...a/org/apache/hudi/metrics/MetricsReporterType.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyVHlwZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379722#comment-17379722 ]

ASF GitHub Bot commented on HUDI-2164:
--------------------------------------

codecov-commenter edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878091821

# [Codecov](https://codecov.io/gh/apache/hudi/pull/3259?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) Report
> Merging [#3259](https://codecov.io/gh/apache/hudi/pull/3259?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (7ae050e) into [master](https://codecov.io/gh/apache/hudi/commit/16e90d30eaa14e5c1c4632ad0a90497df601c637?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (16e90d3) will **decrease** coverage by `44.77%`.
> The diff coverage is `0.00%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3259/graphs/tree.svg?width=650=150=pr=VTTXabwbs2_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3259?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)

```diff
@@              Coverage Diff              @@
##             master    #3259       +/-   ##
=============================================
- Coverage     47.62%    2.85%   -44.78%
+ Complexity     5502       85     -5417
=============================================
  Files           930      283      -647
  Lines         41268    11734    -29534
  Branches       4137      967     -3170
=============================================
- Hits          19655      335    -19320
+ Misses        19865    11373     -8492
+ Partials       1748       26     -1722
```

| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `0.00% <ø> (-34.59%)` | :arrow_down: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `5.37% <ø> (-49.11%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `9.04% <0.00%> (-49.53%)` | :arrow_down: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.

| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3259?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...org/apache/hudi/utilities/HoodieClusteringJob.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hvb2RpZUNsdXN0ZXJpbmdKb2IuamF2YQ==) | `0.00% <0.00%> (-61.43%)` | :arrow_down: |
| [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379608#comment-17379608 ]

ASF GitHub Bot commented on HUDI-2164:
--------------------------------------

hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249

## CI report:

* 7ae050ed4b5ff0ce124a0ec580d51b3dfbb7f51a Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=875)

Bot commands
@hudi-bot supports the following commands:

- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Build cluster plan and execute this plan at once for HoodieClusteringJob
> ------------------------------------------------------------------------
>
>                 Key: HUDI-2164
>                 URL: https://issues.apache.org/jira/browse/HUDI-2164
>             Project: Apache Hudi
>          Issue Type: Task
>            Reporter: Yue Zhang
>            Priority: Major
>              Labels: pull-request-available
>
> Currently, Hudi lets users submit a HoodieClusteringJob either to build a clustering plan or to execute a clustering plan, through the --schedule or --instant-time config.
> To trigger a clustering job, a user has to:
> # Submit a HoodieClusteringJob to build a clustering plan through the --schedule config.
> # Copy the instant time of the created clustering plan from the log output.
> # Submit the HoodieClusteringJob again to execute the created clustering plan through the --instant-time config.
> The pain point is that triggering a clustering takes too many steps, and the instant time has to be copied from the log file manually, so the process cannot be automated.
>
> I have raised a PR that offers a new config named --mode (-m for short):
> ||--mode||remarks||
> |execute|Execute a clustering plan at the given instant, which means --instant-time is required. This is the default value.|
> |schedule|Make a clustering plan.|
> |*scheduleAndExecute*|Make a clustering plan first and execute that plan immediately.|
> Now users can use --mode scheduleAndExecute to build a clustering plan and execute it at once using HoodieClusteringJob.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
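The three `--mode` values described in the issue above can be sketched as a small dispatch function. This is only an illustration in Python, not Hudi's code (HoodieClusteringJob is a Java class). The names `run_clustering`, `schedule_plan`, and `execute_plan` are hypothetical, and only the mode strings and the requirement that `execute` needs an instant time come from the issue text.

```python
from typing import Callable, Optional

# Mode names are taken from the issue text; everything else is hypothetical.
EXECUTE = "execute"
SCHEDULE = "schedule"
SCHEDULE_AND_EXECUTE = "scheduleAndExecute"


def run_clustering(
    mode: str,
    schedule_plan: Callable[[], str],
    execute_plan: Callable[[str], None],
    instant_time: Optional[str] = None,
) -> Optional[str]:
    """Dispatch on mode; return the instant time that was scheduled or executed."""
    if mode == SCHEDULE:
        # Only build a plan; the caller receives its instant time.
        return schedule_plan()
    if mode == EXECUTE:
        # Executing an existing plan requires its instant time up front.
        if instant_time is None:
            raise ValueError("--instant-time is required in execute mode")
        execute_plan(instant_time)
        return instant_time
    if mode == SCHEDULE_AND_EXECUTE:
        # Build a plan and run it immediately, with no manual copy step.
        new_instant = schedule_plan()
        execute_plan(new_instant)
        return new_instant
    raise ValueError(f"unknown mode: {mode}")
```

In this sketch the `scheduleAndExecute` branch passes the instant time from the scheduling step straight to the execution step, which is the manual copy-and-paste step the issue wants to remove.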
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379586#comment-17379586 ]

ASF GitHub Bot commented on HUDI-2164:
--------------------------------------

hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249

## CI report:

* d369ea7aedc892c995c4cd0132e15b2bb29cfb65 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=862) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=870)
* 7ae050ed4b5ff0ce124a0ec580d51b3dfbb7f51a Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=875)
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379584#comment-17379584 ]

ASF GitHub Bot commented on HUDI-2164:
--------------------------------------

hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249

## CI report:

* d369ea7aedc892c995c4cd0132e15b2bb29cfb65 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=862) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=870)
* 7ae050ed4b5ff0ce124a0ec580d51b3dfbb7f51a UNKNOWN
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379546#comment-17379546 ]

ASF GitHub Bot commented on HUDI-2164:
--------------------------------------

hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249

## CI report:

* d369ea7aedc892c995c4cd0132e15b2bb29cfb65 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=862) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=870)
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379522#comment-17379522 ]

ASF GitHub Bot commented on HUDI-2164:
--------------------------------------

hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249

## CI report:

* d369ea7aedc892c995c4cd0132e15b2bb29cfb65 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=862) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=870)
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379520#comment-17379520 ]

ASF GitHub Bot commented on HUDI-2164:
--------------------------------------

zhangyue19921010 commented on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878723946

@hudi-bot run azure
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379385#comment-17379385 ]

ASF GitHub Bot commented on HUDI-2164:
--------------------------------------

codecov-commenter edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878091821
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379359#comment-17379359 ]

ASF GitHub Bot commented on HUDI-2164:
--------------------------------------

hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249