Essentially:

Old architecture: (operational database) ==> some tool ==> (data warehouse raw data) ==> SQL ETL ==> (data warehouse derived data)
New architecture: (operational database) ==> Hudi DeltaStreamer ==> (Hudi raw data) ==> Spark/Flink Hudi ETL ==> (Hudi derived data) ==> Hudi Reverse Streamer ==> (data warehouse/Kafka/operational database)

On Thu, Mar 30, 2023 at 8:09 PM Vinoth Chandar <vin...@apache.org> wrote:

> Hi all,
>
> Any interest in building a reverse streaming tool, that does the reverse
> of what the DeltaStreamer tool does? It will read Hudi table incrementally
> (only source) and write out the data to a variety of sinks - Kafka, JDBC
> Databases, DFS.
>
> This has come up many times with data warehouse users. Often times, they
> want to use Hudi to speed up or reduce costs on their data ingestion and
> ETL (using Spark/Flink), but want to move the derived data back into a data
> warehouse or an operational database for serving.
>
> What do you all think?
>
> Thanks
> Vinoth
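
To make the reverse streamer idea a bit more concrete, here is a rough sketch in plain Python of the loop such a tool might run: read only the commits made after a saved checkpoint, hand their records to a pluggable sink, then advance the checkpoint. Everything in it is hypothetical (`Commit`, `ToyTable`, `reverse_stream_once`, the instant strings); it is an in-memory stand-in for illustration and does not use any actual Hudi API.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Commit:
    """Hypothetical stand-in for one commit on a table's timeline."""
    instant: str          # sortable commit time, e.g. "20230330200900"
    records: List[dict]   # rows written by this commit


@dataclass
class ToyTable:
    """In-memory stand-in for a Hudi table (assumption, not a Hudi class)."""
    commits: List[Commit] = field(default_factory=list)

    def read_incremental(self, after_instant: str) -> List[Commit]:
        # Return only commits strictly newer than the checkpoint, oldest first.
        newer = [c for c in self.commits if c.instant > after_instant]
        return sorted(newer, key=lambda c: c.instant)


# A sink is anything that accepts a batch of records (Kafka, JDBC, DFS, ...).
Sink = Callable[[List[dict]], None]


def reverse_stream_once(table: ToyTable, sink: Sink, checkpoint: str) -> str:
    """Push all commits after `checkpoint` to `sink`; return the new checkpoint."""
    for commit in table.read_incremental(checkpoint):
        sink(commit.records)
        checkpoint = commit.instant  # advance only after the batch is delivered
    return checkpoint


# Usage: a list stands in for the downstream warehouse or operational database.
table = ToyTable([
    Commit("001", [{"id": 1}]),
    Commit("002", [{"id": 2}, {"id": 3}]),
])
delivered: List[dict] = []

checkpoint = reverse_stream_once(table, delivered.extend, "000")

# A later ETL run adds a commit; the next round picks up only that commit.
table.commits.append(Commit("003", [{"id": 4}]))
checkpoint = reverse_stream_once(table, delivered.extend, checkpoint)
```

Keeping the sink as a plain callable is one way to cover the Kafka, JDBC, and DFS targets with the same loop; a real tool would also need to persist the checkpoint and decide how to handle partial failures, which this sketch leaves out.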