[ 
https://issues.apache.org/jira/browse/HUDI-8076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinoth Chandar updated HUDI-8076:
---------------------------------
    Description: 
*(!) Work in Progress.* 
h3. Basic Idea: 

Introduce support for Hudi writer code to produce the storage format for the last 
2-3 table versions (0.14/6, 1.0/7,8). This enables older readers to continue 
reading the table even when writers are upgraded, as long as the writer produces 
storage compatible with the latest table version the reader can read. The 
readers can then be rolling-upgraded easily, at different cadences, without the 
need for any tight coordination. Additionally, the reader should have the ability 
to "dynamically" deduce the table version based on table properties, such that 
when the writer is switched to the latest table version, subsequent reads just 
adapt and read it as the latest table version. 
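
The "dynamic" deduction could look roughly like the following minimal sketch, 
which assumes the version is read from the {{hoodie.table.version}} entry in 
hoodie.properties (the class name and the default floor are illustrative, not a 
committed API):

```java
// Hypothetical sketch: a reader deduces the table version from table
// properties, falling back to a default for tables that predate the key.
// Class name and default value are assumptions for illustration only.
import java.util.Properties;

public class VersionAwareTableConfig {
    static final String TABLE_VERSION_KEY = "hoodie.table.version";
    static final int DEFAULT_TABLE_VERSION = 6; // assumed floor for this sketch

    private final Properties props;

    public VersionAwareTableConfig(Properties props) {
        this.props = props;
    }

    /** Dynamically deduce the table version, so reads adapt when it changes. */
    public int getTableVersion() {
        String v = props.getProperty(TABLE_VERSION_KEY);
        return v == null ? DEFAULT_TABLE_VERSION : Integer.parseInt(v);
    }
}
```

Because the version is re-read from table properties rather than baked into the 
reader, a writer-side switch is picked up by subsequent reads automatically.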

Operators still need to ensure all readers have the latest binary that supports 
a given table version, before switching the writer to that version. Special 
consideration goes to table services which, as reader/writer processes, should 
be able to manage the tables as well. Queries should gracefully fail during 
table version switches and eventually start succeeding once the writer 
completes switching. Writers/table services should fail if working with an 
unsupported table version, without which one cannot start switching writers to 
the new version (this may still need a minor release on the last 2-3 table 
versions?)
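
The fail-fast behavior for writers/table services could be sketched as follows 
(hypothetical class, not an actual Hudi API; the supported-version set is an 
assumption matching the 2-3 versions discussed above):

```java
// Hypothetical sketch: a writer-side guard that fails fast when this binary
// does not support the table's version. Names and the version set are
// illustrative assumptions, not actual Hudi APIs.
import java.util.Set;

public class TableVersionGuard {
    // Suppose this binary can write table versions 6, 7 and 8.
    static final Set<Integer> SUPPORTED_WRITER_VERSIONS = Set.of(6, 7, 8);

    public static void checkWritable(int tableVersion) {
        if (!SUPPORTED_WRITER_VERSIONS.contains(tableVersion)) {
            throw new IllegalStateException(
                "Writer binary does not support table version " + tableVersion
                    + "; supported: " + SUPPORTED_WRITER_VERSIONS);
        }
    }
}
```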
h3. High level approach: 

We need to introduce table version aware reading/writing inside the core layers 
of Hudi, as well as query engines like Spark/Flink. 

To this effect, we need a HoodieStorageFormat abstraction that can cover the 
following layers. 
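
One possible shape for that abstraction: one object per table version that 
hands out the per-layer strategies. All method names below are illustrative 
assumptions, not a committed API:

```java
// Hypothetical shape of the HoodieStorageFormat abstraction: a per-table-
// version object exposing per-layer behaviors. Method names are assumptions.
public interface HoodieStorageFormat {
    int tableVersion();
    String timelineLayoutStyle();   // e.g. "legacy-archived" vs "lsm"
    String logFormatVersion();
}

// Example binding for table version 6 (values are illustrative).
class TableVersionSixFormat implements HoodieStorageFormat {
    @Override public int tableVersion() { return 6; }
    @Override public String timelineLayoutStyle() { return "legacy-archived"; }
    @Override public String logFormatVersion() { return "v1"; }
}
```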

h4. *Table Properties*

All table properties need to be table version aware. 



h4. *Timeline*

The timeline already has a timeline layout version, which can be extended to 
write both the older and the newer timeline. The ArchivedTimeline can be old 
style or LSM style, depending on the table version. Typically we have only 
written the timeline with the latest schema; we may have to version the .avsc 
files themselves and write using table-version-specific Avro classes? We also 
need a flexible mapping between the action names used in each table version, to 
handle things like replacecommit vs cluster. 

*TBD*: whether or not completion-time-based changes can be retained, assuming 
instant file creation timestamp. 
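
The action-name mapping mentioned above could be a small per-version lookup; 
the direction (canonical name to legacy name) and the concrete entries below 
are assumptions for illustration:

```java
// Hypothetical sketch of a per-table-version action-name mapping, e.g. the
// canonical "cluster" action being written as "replacecommit" in older table
// versions. The mapping contents are illustrative assumptions.
import java.util.Map;

public class TimelineActionMapper {
    // canonical (latest) action -> name used in older table versions
    private static final Map<Integer, Map<String, String>> LEGACY_NAMES = Map.of(
        6, Map.of("cluster", "replacecommit"),
        7, Map.of("cluster", "replacecommit"));

    /** Translate a canonical action to the name written for a table version. */
    public static String toVersionedAction(String canonical, int tableVersion) {
        return LEGACY_NAMES
            .getOrDefault(tableVersion, Map.of())
            .getOrDefault(canonical, canonical);
    }
}
```

Readers would apply the inverse mapping when parsing an older timeline, so both 
directions stay in one place.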
h4. *FileSystemView*

Need to support file slice grouping based on old/new timeline + file naming 
combination. Also need abstractions for how files are named in each table 
version, to handle things like log file name changes.
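
A naming abstraction for this might look like the following; the interface and 
the example pattern are illustrative assumptions, not Hudi's actual naming 
rules:

```java
// Hypothetical sketch of a per-version file-naming strategy, so the
// FileSystemView can group file slices without hard-coding one naming scheme.
// The interface and the pattern below are illustrative assumptions.
public interface FileNamingStrategy {
    /** Build a log file name for a member of a file group. */
    String logFileName(String fileId, String instantTime, int logVersion, String writeToken);
}

// One concrete style, assumed for older table versions in this sketch.
class V6FileNaming implements FileNamingStrategy {
    @Override
    public String logFileName(String fileId, String instantTime, int logVersion, String writeToken) {
        return "." + fileId + "_" + instantTime + ".log." + logVersion + "_" + writeToken;
    }
}
```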
h4. *WriteHandle*

This layer may or may not need changes, as base files don't really change, and 
the log format is already versioned (see next point). However, it's prudent to 
ensure we have a mechanism for this, since there could be different ways of 
encoding records or footers etc. (e.g. HFile k-v pairs, ...) 
h4. Metadata table 

Encoding of k-v pairs, their schemas, etc., and which partitions are supported 
in which table versions. *TBD* to see if any of the code around the recent 
simplification needs to be undone. 
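
The partition-support question could reduce to a per-version lookup along these 
lines; the partition names below exist in the metadata table, but the mapping 
of partitions to versions is an assumption for illustration:

```java
// Hypothetical sketch: which metadata table partitions are available per
// table version. The version-to-partition mapping is an illustrative
// assumption, not the actual support matrix.
import java.util.Map;
import java.util.Set;

public class MetadataPartitions {
    private static final Map<Integer, Set<String>> SUPPORTED = Map.of(
        6, Set.of("files", "column_stats", "bloom_filters", "record_index"),
        8, Set.of("files", "column_stats", "bloom_filters", "record_index",
                  "secondary_index", "expr_index"));

    public static boolean isSupported(String partition, int tableVersion) {
        return SUPPORTED.getOrDefault(tableVersion, Set.of()).contains(partition);
    }
}
```

A writer in backwards-compatible mode would then skip building partitions the 
target table version does not support.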
h4. LogFormat Reader/Writer

The log blocks and the format itself are versioned, but this is not yet tied to 
the overall table version. We need these links so that the reader can, for 
example, decide how to read/assemble log file scans. FileGroupReader 
abstractions may need to change as well. 
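
Tying the two together could be a simple dispatch from table version to 
scanning strategy; the factory, the strategy names, and the version cutoff are 
illustrative assumptions:

```java
// Hypothetical sketch: choosing a log-scanning strategy from the table
// version, so the FileGroupReader assembles records the way that version's
// log format expects. Names and the cutoff are illustrative assumptions.
public class LogScannerFactory {
    interface LogScanner {
        String describe();
    }

    public static LogScanner forTableVersion(int tableVersion) {
        if (tableVersion >= 8) {
            return () -> "position-based merging (v8+ log format)";
        }
        return () -> "key-based merging (v6/v7 log format)";
    }
}
```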
h4. Table Service

This should largely be independent and built on the layers above, but some 
cases exist, depending on whether the code relies on format specifics for 
generating plans or execution. This may be a mix of code changes, acceptable 
behavior changes, and format-specific specializations. 

TBD: to see if behavior like writing rollback block needs to be 


> RFC for backwards compatible writer mode in Hudi 1.0
> ----------------------------------------------------
>
>                 Key: HUDI-8076
>                 URL: https://issues.apache.org/jira/browse/HUDI-8076
>             Project: Apache Hudi
>          Issue Type: New Feature
>            Reporter: Ethan Guo
>            Assignee: Vinoth Chandar
>            Priority: Major
>             Fix For: 1.0.0
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)