bithw1 opened a new issue, #8348:
URL: https://github.com/apache/hudi/issues/8348

   Hi,
   
   I am reading 
https://hudi.apache.org/docs/flink-quick-start-guide#streaming-query
   
   The example query is as follows:
   
   ```
   CREATE TABLE t1(
     uuid VARCHAR(20) PRIMARY KEY NOT ENFORCED,
     name VARCHAR(10),
     age INT,
     ts TIMESTAMP(3),
     `partition` VARCHAR(20)
   )
   PARTITIONED BY (`partition`)
   WITH (
     'connector' = 'hudi',
     'path' = '${path}',
     'table.type' = 'MERGE_ON_READ',
     'read.streaming.enabled' = 'true',  -- this option enables the streaming read
     'read.start-commit' = '20210316134557', -- specifies the start commit instant time
     'read.streaming.check-interval' = '4' -- specifies the check interval for finding new source commits, default 60s.
   );
   
   -- Then query the table in stream mode
   select * from t1;
   
   ```
   I have a question about the option `read.start-commit`.
   
   When I run the query for the first time, `read.start-commit` specifies 
where the query starts. Suppose the query then runs for a while (e.g., one day) 
and stops. Many new Hudi commits have been made during this period.
   
   When I restart the query, how should I deal with the commit time? Should I 
manually specify a newer start commit? (It is very likely that I don't know 
which commits the Flink query has already processed.)
   Is there a checkpoint mechanism for `read.start-commit`?
   
   
   
   

