[PR] [HUDI-7416] Add interface for StreamProfile to be used in StreamSync for reading and writing data. [hudi]

via GitHub Fri, 16 Feb 2024 05:32:20 -0800


vinishjail97 opened a new pull request, #10687:
URL: https://github.com/apache/hudi/pull/10687


   ### Change Logs
   
   This PR introduces a new interface, `StreamProfile`, which contains details 
about how the next sync round in StreamSync should be consumed and written. For 
example:
   
   - `KafkaStreamProfile` contains the number of events to consume in this sync 
round.
   - `S3StreamProfile` contains the list of files to consume in this sync round.
   - `HudiIncrementalStreamProfile` contains the beginInstant and endInstant 
commit times to consume in this sync round.
   
   In the future, we can add methods for choosing the writeOperationType and 
indexType as well. For now, `streamProfile.getSourceSpecificContext()` will be 
used to consume the data from the source.
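
To make the description above concrete, here is a minimal sketch of how such a profile might look. Only the names `StreamProfile`, `KafkaStreamProfile`, `S3StreamProfile`, `HudiIncrementalStreamProfile` and `getSourceSpecificContext()` come from this description. The generic parameter, the constructors, the `InstantRange` helper and the sample values are assumptions made for illustration and may differ from the code in the PR.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Assumed shape: one accessor returning source-specific details for the next sync round.
interface StreamProfile<T> {
  T getSourceSpecificContext();
}

// Kafka: the context is the number of events to consume in this sync round.
final class KafkaStreamProfile implements StreamProfile<Long> {
  private final long numEvents;

  KafkaStreamProfile(long numEvents) {
    this.numEvents = numEvents;
  }

  @Override
  public Long getSourceSpecificContext() {
    return numEvents;
  }
}

// S3: the context is the list of files to consume in this sync round.
final class S3StreamProfile implements StreamProfile<List<String>> {
  private final List<String> files;

  S3StreamProfile(List<String> files) {
    // Defensive copy so later changes to the caller's list do not leak in.
    this.files = Collections.unmodifiableList(new ArrayList<>(files));
  }

  @Override
  public List<String> getSourceSpecificContext() {
    return files;
  }
}

// Hypothetical helper holding the begin and end instants of an incremental read.
final class InstantRange {
  final String beginInstant;
  final String endInstant;

  InstantRange(String beginInstant, String endInstant) {
    this.beginInstant = beginInstant;
    this.endInstant = endInstant;
  }
}

// Hudi incremental: the context is the commit-time range to consume.
final class HudiIncrementalStreamProfile implements StreamProfile<InstantRange> {
  private final InstantRange range;

  HudiIncrementalStreamProfile(String beginInstant, String endInstant) {
    this.range = new InstantRange(beginInstant, endInstant);
  }

  @Override
  public InstantRange getSourceSpecificContext() {
    return range;
  }
}

final class StreamProfileSketch {
  public static void main(String[] args) {
    StreamProfile<Long> kafka = new KafkaStreamProfile(1000L);
    StreamProfile<List<String>> s3 =
        new S3StreamProfile(Arrays.asList("s3://bucket/a.parquet", "s3://bucket/b.parquet"));
    StreamProfile<InstantRange> hudi =
        new HudiIncrementalStreamProfile("20240216000000", "20240216010000");

    System.out.println("Kafka events: " + kafka.getSourceSpecificContext());
    System.out.println("S3 files: " + s3.getSourceSpecificContext());
    System.out.println("Hudi range: " + hudi.getSourceSpecificContext().beginInstant
        + " to " + hudi.getSourceSpecificContext().endInstant);
  }
}
```

In this sketch each source gets its own context type through the generic parameter, so a source can read its context without casting.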
   
   ### Impact
   
   No change to public APIs. `Option` is used to define the new field in the 
constructors, and the previous constructors remain backwards compatible.
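
The compatibility claim above can be illustrated with a small sketch of the constructor pattern. This is not the code from the PR: `SyncRunner` and `ProfileSupplier` are hypothetical stand-ins, and `java.util.Optional` is used in place of Hudi's own `Option` type so that the example is self-contained.

```java
import java.util.Optional;

// Hypothetical stand-in for the new optional dependency.
interface ProfileSupplier {
  long eventsForNextRound();
}

// Hypothetical stand-in for a class that gains the new optional field.
final class SyncRunner {
  private final String tableName;
  private final Optional<ProfileSupplier> profileSupplier;

  // Pre-existing constructor: its signature is unchanged, so existing callers still compile.
  SyncRunner(String tableName) {
    this(tableName, Optional.empty());
  }

  // New constructor: the extra field is optional.
  SyncRunner(String tableName, Optional<ProfileSupplier> profileSupplier) {
    this.tableName = tableName;
    this.profileSupplier = profileSupplier;
  }

  // Uses the profile when one was supplied, otherwise falls back to the caller's default.
  long eventsForNextRound(long defaultEvents) {
    return profileSupplier.map(ProfileSupplier::eventsForNextRound).orElse(defaultEvents);
  }

  String tableName() {
    return tableName;
  }
}

final class SyncRunnerSketch {
  public static void main(String[] args) {
    SyncRunner legacy = new SyncRunner("trips");
    ProfileSupplier fixed = () -> 42L;
    SyncRunner profiled = new SyncRunner("trips", Optional.of(fixed));

    System.out.println("legacy: " + legacy.eventsForNextRound(100L));
    System.out.println("profiled: " + profiled.eventsForNextRound(100L));
  }
}
```

Because the old constructor delegates with an empty value, callers that never pass a profile keep their previous behaviour.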
   
   ### Risk level (write none, low, medium or high below)
   
   Low
   
   ### Documentation Update
   
   None. This only adds an optional interface that can be used to consume 
and write data in the StreamSync utility.
   
   ### Contributor's checklist
   
   - [x] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [x] Change Logs and Impact were stated clearly
   - [x] Adequate tests were added if applicable
   - [ ] CI passed
   

