jack1234smith opened a new issue, #11023:
URL: https://github.com/apache/hudi/issues/11023

   **Describe the problem you faced**
   
   Error exception:
   java.util.NoSuchElementException: No value present in Option
        at org.apache.hudi.common.util.Option.get(Option.java:89)
        at 
org.apache.hudi.table.format.mor.MergeOnReadInputFormat.initIterator(MergeOnReadInputFormat.java:204)
        at 
org.apache.hudi.table.format.mor.MergeOnReadInputFormat.open(MergeOnReadInputFormat.java:189)
        at 
org.apache.hudi.source.StreamReadOperator.processSplits(StreamReadOperator.java:169)
        at 
org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:50)
        at 
org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:90)
        at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMail(MailboxProcessor.java:398)
        at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMailsWhenDefaultActionUnavailable(MailboxProcessor.java:367)
        at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:352)
        at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:229)
        at 
org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:839)
        at 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:788)
        at 
org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:952)
        at 
org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:931)
        at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:745)
        at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562)
        at java.lang.Thread.run(Thread.java:745)
   
   data error:
   
![17131698404766](https://github.com/apache/hudi/assets/50668893/9fb26f15-6228-486d-a9c5-2b70c746f784)
   
   
   **To Reproduce**
   
   Steps to reproduce the behavior:
   
   1. kill yarn session
   2. Restart job from checkpoint
   
   
   **Environment Description**
   
   * Hudi version : 0.14.1
   
   * Hive version : 3.1.3
   
   * Hadoop version : 3.3.6
   
   * Storage (HDFS/S3/GCS..) : HDFS
   
   * Running on Yarn Session? (yes/no) :  yes
   
   
   **Additional context**
   
   My table are:
   CREATE TABLE if not exists ods_table(
        id int,
        count_num double,
        write_time timestamp(0),
        _part string,
        proc_time  timestamp(3),
        WATERMARK FOR write_time AS write_time
   ) 
   PARTITIONED BY (_part)
   WITH (
       'connector'='hudi',
       'path' ='hdfs://masters/test/ods_table',
       'table.type'='MERGE_ON_READ',
       'hoodie.datasource.write.recordkey.field' = 'id',
       'hoodie.datasource.write.precombine.field' = 'write_time', 
       'write.bucket_assign.tasks'='1',
       'write.tasks' = '1', 
       'compaction.tasks' = '1',
       'compaction.async.enabled' = 'true',
       'compaction.schedule.enabled' = 'true',
       'compaction.trigger.strategy' = 'time_elapsed',
       'compaction.delta_seconds' = '600',
       'compaction.delta_commits' = '1',
       'read.streaming.enabled' = 'true',
       'read.streaming.skip_compaction' = 'true',
       'read.start-commit' = 'earliest',
       'changelog.enabled' = 'true',    
       'hive_sync.enable'='true',
       'hive_sync.mode' = 'hms',
       'hive_sync.metastore.uris' = 'thrift://h35:9083',
       'hive_sync.db'='test',
       'hive_sync.table'='hive_ods_table'
   );
   
   CREATE TABLE if not exists ads_table(
        sta_date string,
        num double,
        proc_time as proctime()
   )
   WITH (
       'connector'='hudi',
       'path' ='hdfs://masters/test/ads_table',
       'table.type'='COPY_ON_WRITE',
       'hoodie.datasource.write.recordkey.field' = 'sta_date',
       'write.bucket_assign.tasks'='1',
       'write.tasks' = '1', 
       'compaction.tasks' = '1',
       'compaction.async.enabled' = 'true',
       'compaction.schedule.enabled' = 'true',
       'compaction.trigger.strategy' = 'time_elapsed',
       'compaction.delta_seconds' = '600',
        'compaction.delta_commits' = '1',
       'read.streaming.enabled' = 'true',
       'read.streaming.skip_compaction' = 'true',
        'read.start-commit' = 'earliest',
       'changelog.enabled' = 'true',    
       'hive_sync.enable'='true',
       'hive_sync.mode' = 'hms',
       'hive_sync.metastore.uris' = 'thrift://h35:9083',
       'hive_sync.db'='test',
       'hive_sync.table'='hive_ads_table'
   );
   
   My job is:
   insert into test.ads_table
   select _part, sum(count_num)
   from test.ods_table
   group by _part;
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to