[ 
https://issues.apache.org/jira/browse/HUDI-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17454035#comment-17454035
 ] 

sivabalan narayanan commented on HUDI-1214:
-------------------------------------------

[~rmahindra] [~Trevorzhang] [~vbalaji]: I don't quite get the requirement 
here. Can someone please clarify?

This is my understanding:

Bootstrapping is done via the Spark datasource (DS), and later we start 
deltastreamer. While starting deltastreamer, we can provide a checkpoint via 
configs, right? So what exactly are we looking for here?
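
To make the question concrete, here is a rough sketch of the checkpoint lookup 
as I understand it. This is not Hudi code. It assumes that a commit file is 
JSON with an "extraMetadata" map and that the checkpoint is stored under the 
key "deltastreamer.checkpoint.key"; the function name resolve_checkpoint and 
the simple "explicit config wins" precedence are hypothetical simplifications.

```python
import json

# Assumed key under which the checkpoint is stored in commit metadata.
CHECKPOINT_KEY = "deltastreamer.checkpoint.key"


def resolve_checkpoint(commit_metadata_json, configured_checkpoint=None):
    """Return the checkpoint to resume from, or None if none is available.

    Hypothetical simplification: an explicitly configured checkpoint always
    wins; otherwise fall back to the one recorded in the latest commit.
    """
    if configured_checkpoint is not None:
        return configured_checkpoint
    metadata = json.loads(commit_metadata_json)
    extra = metadata.get("extraMetadata") or {}
    return extra.get(CHECKPOINT_KEY)


# A commit written by a plain Spark datasource write carries no checkpoint.
spark_ds_commit = json.dumps({"extraMetadata": {}})
print(resolve_checkpoint(spark_ds_commit))                # None
print(resolve_checkpoint(spark_ds_commit, "topic,0:42"))  # topic,0:42
```

In this sketch, the commit left behind by the datasource bootstrap yields no 
checkpoint, so the first deltastreamer run has to be given one explicitly.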

 

> Need ability to set deltastreamer checkpoints when doing Spark datasource 
> writes
> --------------------------------------------------------------------------------
>
>                 Key: HUDI-1214
>                 URL: https://issues.apache.org/jira/browse/HUDI-1214
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: Spark Integration
>            Reporter: Balaji Varadarajan
>            Assignee: Trevorzhang
>            Priority: Major
>              Labels: sev:high, user-support-issues
>             Fix For: 0.11.0
>
>
> Such support is needed for bootstrapping cases where users do the initial 
> bootstrap with a Spark write and then subsequently use deltastreamer.
> DeltaStreamer manages checkpoints inside hoodie commit files and expects 
> checkpoints in previously committed metadata. Users are expected to pass a 
> checkpoint or an initial checkpoint provider when performing bootstrap 
> through deltastreamer. Such support is not present when doing bootstrap 
> using the Spark Datasource.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
