Hi, I have created a PR here: https://github.com/apache/hudi/pull/2854/files
In the PR I made these changes:
1. Added a metadata column, "_hoodie_cdc_operation". I did not add a config
option because I could not find a way to keep the code clean; a metadata
column is very primitive and a config opt
Ack.
But the "rerun tests" bot should be working. I can actually see the GitHub
Actions running, so I'm not sure what is wrong.
https://github.com/apache/hudi/actions
Maybe we need a JIRA to investigate :)
On Fri, Apr 16, 2021 at 6:44 AM Roc Marshal wrote:
>
>
>
> Susudong.
> Thanks for your help.
> Now,
Hi Danny,
Read up on the Flink docs as well.
If we don't actually publish data to the meta column, I think the overhead
is pretty low w.r.t. Avro/Parquet. Both are very good at encoding nulls.
But I feel it's worth adding a HoodieWriteConfig to control this and since
addition of meta columns mostl
> Is it providing the ability to author continuous queries on
> Hudi source tables end-to-end,
> given Flink can use the flags to generate retract/upsert streams
Yes, that's the key point: with these flags plus Flink stateful operators,
we can have a real-time incremental ETL pipeline.
For example, a glo
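
To illustrate why per-record change flags are enough for a stateful operator to maintain a result incrementally, here is a minimal sketch in plain Python (it does not use the Flink or Hudi APIs). The flag strings follow Flink's RowKind short names (+I, -U, +U, -D); how the proposed meta column would actually encode them is an assumption, and apply_changes and the sample rows are hypothetical.

```python
from collections import defaultdict

# Flink's RowKind short strings. Whether the proposed Hudi meta column
# stores exactly these values is an assumption of this sketch.
INSERT, UPDATE_BEFORE, UPDATE_AFTER, DELETE = "+I", "-U", "+U", "-D"


def apply_changes(totals, changes):
    """Fold (flag, key, amount) change rows into running per-key totals."""
    for flag, key, amount in changes:
        if flag in (INSERT, UPDATE_AFTER):
            # A new row, or the new image of an updated row: add it.
            totals[key] += amount
        elif flag in (UPDATE_BEFORE, DELETE):
            # The old image of an updated row, or a deleted row: retract it.
            totals[key] -= amount
        else:
            raise ValueError(f"unknown change flag: {flag!r}")
    return totals


# Hypothetical change stream: two inserts, one update (as a -U/+U pair),
# and one delete.
changes = [
    ("+I", "us", 10),
    ("+I", "eu", 5),
    ("-U", "us", 10),
    ("+U", "us", 7),
    ("-D", "eu", 5),
]
totals = apply_changes(defaultdict(int), changes)
# totals["us"] is now 7 and totals["eu"] is 0.
```

Without the flags, the consumer would only see the latest row images and could not tell an update from an insert, so it would have to rescan the table to recompute the sum.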
Keeping compatibility is a must, i.e., users should be able to upgrade to the
new release with the _hoodie_cdc_flag meta column,
and be able to query new data (with this new meta col) alongside old data
(without this new meta col).
In fact, they should be able to downgrade back to previous versions (