Re: [DISCUSS] Hudi Reverse Streamer

Vinoth Chandar Thu, 30 Mar 2023 20:12:36 -0700

Essentially.

Old architecture: (operational database) ==> some tool ==> (data
warehouse raw data) ==> SQL ETL ==> (data warehouse derived data)


New architecture: (operational database) ==> Hudi DeltaStreamer ==> (Hudi
raw data) ==> Spark/Flink Hudi ETL ==> (Hudi derived data) ==> Hudi Reverse
Streamer ==> (data warehouse/Kafka/operational database)

On Thu, Mar 30, 2023 at 8:09 PM Vinoth Chandar <vin...@apache.org> wrote:

> Hi all,
>
> Any interest in building a reverse streaming tool that does the reverse
> of what the DeltaStreamer tool does? It will read a Hudi table incrementally
> (Hudi being the only source) and write out the data to a variety of sinks:
> Kafka, JDBC databases, DFS.
>
> This has come up many times with data warehouse users. Often, they
> want to use Hudi to speed up or reduce the cost of their data ingestion and
> ETL (using Spark/Flink), but want to move the derived data back into a data
> warehouse or an operational database for serving.
>
> What do you all think?
>
> Thanks
> Vinoth
>
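
For readers who want something concrete, the proposal above (read a Hudi table incrementally, write to Kafka or a JDBC database) can be sketched in PySpark. This is only an illustration under stated assumptions, not the design of any actual Hudi tool: the function names (hudi_incremental_read_options, reverse_stream_once, kafka_sink, jdbc_sink), the JSON encoding of rows for Kafka, and the checkpoint handling are all hypothetical. The hoodie.datasource.* read options and the Spark "kafka" and "jdbc" writers are existing Hudi and Spark features; they need the Hudi Spark bundle, the spark-sql-kafka package and a JDBC driver on the classpath.

```python
from typing import Callable, Dict, Optional


def hudi_incremental_read_options(begin_instant: str,
                                  end_instant: Optional[str] = None) -> Dict[str, str]:
    """Build Hudi datasource options for an incremental query.

    Only records committed after begin_instant are returned.
    """
    opts = {
        "hoodie.datasource.query.type": "incremental",
        "hoodie.datasource.read.begin.instanttime": begin_instant,
    }
    if end_instant is not None:
        opts["hoodie.datasource.read.end.instanttime"] = end_instant
    return opts


def kafka_sink(bootstrap_servers: str, topic: str) -> Callable:
    """Hypothetical sink: publish each row as a JSON-encoded Kafka value."""
    def write(df) -> None:
        (df.selectExpr("to_json(struct(*)) AS value")
           .write.format("kafka")
           .option("kafka.bootstrap.servers", bootstrap_servers)
           .option("topic", topic)
           .save())
    return write


def jdbc_sink(url: str, table: str) -> Callable:
    """Hypothetical sink: append rows to a JDBC table."""
    def write(df) -> None:
        (df.write.format("jdbc")
           .option("url", url)
           .option("dbtable", table)
           .mode("append")
           .save())
    return write


def reverse_stream_once(spark, table_path: str, begin_instant: str,
                        write_sink: Callable) -> str:
    """Run one round: read changes since begin_instant, hand them to the sink,
    and return the checkpoint (latest commit time seen) for the next round.

    Assumption: the DataFrame is evaluated twice (sink write, then max), so a
    real tool would pin an end instant from the table timeline instead.
    """
    df = (spark.read.format("hudi")
          .options(**hudi_incremental_read_options(begin_instant))
          .load(table_path))
    write_sink(df)
    newest = df.agg({"_hoodie_commit_time": "max"}).first()[0]
    return newest if newest is not None else begin_instant
```

A driver would call reverse_stream_once in a loop, persisting the returned checkpoint between rounds, for example reverse_stream_once(spark, table_path, last_checkpoint, kafka_sink(servers, topic)), where the path, servers and topic are placeholders supplied by the user.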
