[ https://issues.apache.org/jira/browse/HUDI-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vinoth Chandar updated HUDI-2475:
---------------------------------
    Summary: Rolling Upgrade downgrade story for 0.10 & enabling metadata  (was: Upgrade downgrade infra for enabling metadata)

> Rolling Upgrade downgrade story for 0.10 & enabling metadata
> ------------------------------------------------------------
>
>                 Key: HUDI-2475
>                 URL: https://issues.apache.org/jira/browse/HUDI-2475
>             Project: Apache Hudi
>          Issue Type: Sub-task
>            Reporter: sivabalan narayanan
>            Assignee: Vinoth Chandar
>            Priority: Blocker
>             Fix For: 0.10.0
>
>
> Upgrade/downgrade infra for enabling metadata.
>
> Consider a user who has a writer process with clustering/compaction running
> asynchronously.
>
> - The new synchronous metadata design has a constraint: once the metadata
> table is bootstrapped, all commits happen synchronously. In other words,
> there is no catch-up step with respect to the data table.
> So it may not be feasible to do a rolling upgrade (i.e. upgrade the writer
> first while async compaction is running, and then upgrade async compaction).
> Bootstrap has to be done by stopping all processes; we can then restart all
> other processes one by one (using the upgraded Hudi library) with metadata
> enabled.
> This is the only viable option I can think of.
> 1. Stop all processes. Upgrade Hudi to a version with synchronous metadata.
> Bring up one writer process with the metadata config enabled. This will
> bootstrap the metadata table, and from there on, any new commits by the
> writer will do synchronous updates to metadata.
> Note: users can choose to upgrade via hudi-cli if need be, but it is easier
> to just start the writer. Expect some delay for the first commit, since
> bootstrap will be happening.
> 2. Once the first commit in that writer process completes successfully, we
> can restart all other processes. Upgrade the async table service (to a Hudi
> version with metadata enabled) and restart it.
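The ordering in the two steps above can be sketched as a small pre-flight check. This is only an illustration under stated assumptions, not Hudi code: `Process`, `missing_metadata` and `upgrade_plan` are hypothetical names, the process names are made up, and treating a missing `hoodie.metadata.enable` key as "disabled" is an assumption of the sketch rather than a statement about Hudi's defaults.

```python
from dataclasses import dataclass, field

# Hudi config key that turns on the metadata table.
METADATA_KEY = "hoodie.metadata.enable"


@dataclass
class Process:
    """Hypothetical stand-in for a Hudi writer or async table service."""
    name: str
    config: dict = field(default_factory=dict)


def missing_metadata(processes):
    """Names of processes whose config does not enable the metadata table.

    Assumption of this sketch: a missing key counts as disabled.
    """
    return [
        p.name
        for p in processes
        if str(p.config.get(METADATA_KEY, "false")).lower() != "true"
    ]


def upgrade_plan(writer, others):
    """Return the ordered steps of the stop-everything upgrade.

    Refuses to produce a plan if any process would come back up
    without the metadata table enabled.
    """
    everyone = [writer, *others]
    bad = missing_metadata(everyone)
    if bad:
        raise ValueError(f"metadata table not enabled on: {', '.join(bad)}")

    steps = [f"stop {p.name}" for p in everyone]
    steps.append("upgrade hudi library on all processes")
    steps.append(f"start {writer.name} (bootstraps metadata table)")
    steps.append(f"wait for first commit of {writer.name}")
    steps += [f"start {p.name}" for p in others]
    return steps


if __name__ == "__main__":
    writer = Process("writer", {METADATA_KEY: "true"})
    compactor = Process("async-compaction", {METADATA_KEY: "true"})
    for step in upgrade_plan(writer, [compactor]):
        print(step)
```

The check runs before any step is emitted, so a process that was missed fails the plan up front instead of after the metadata table has already been bootstrapped.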
> *Ensure the metadata table is enabled across all processes.* Missing it on
> even one process could result in data loss.
>
> With this, once the metadata table is bootstrapped, any new commits from all
> processes will be synced to metadata.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)