aizain opened a new issue, #7375:
URL: https://github.com/apache/hudi/issues/7375

   **Describe the problem you faced**
   
   When I enable async clustering, Hudi writes the xxx.replacecommit.requested file in Avro format (it starts with avro.schema). However, the canSkipBatch function reads this file with a JSON reader, which throws Unrecognized token 'Obj^A^B^Vavro'.
   
   How can I fix this? I deleted the file, but the same error happened again on the next replacecommit.
   
   **To Reproduce**
   
   Steps to reproduce the behavior:
   
   1. Run a Spark Structured Streaming job that writes to a Hudi sink:
   
   spark.
         sql(conf.getSql).
         na.fill("").
         writeStream.
         format("hudi").
         options( conf.getHudiConf).
         option("checkpointLocation", conf.getCheckpointPath).
         trigger(conf.getTrigger).
         outputMode(OutputMode.Append()).
         start(conf.getOutputPath(conf.getHudiTableName))
   
   2. When clustering runs => 20221204152715580__replacecommit__REQUESTED
    => 20221204150150328.replacecommit.requested
   is an Avro file.
   <img width="856" alt="image" src="https://user-images.githubusercontent.com/17040353/205480214-149835a4-d256-4b0e-a373-4a846426d4a1.png">
   
   3. Keep the stream running.
   4. On the next batch, HoodieStreamingSink.addBatch calls canSkipBatch, which calls CommitUtils.getLatestCommitMetadataWithValidCheckpointInfo.
   5. The following error is thrown:
   
   Caused by: org.apache.hudi.exception.HoodieIOException: Failed to parse HoodieCommitMetadata for [==>20221204152715580__replacecommit__REQUESTED]
   Caused by: com.fasterxml.jackson.core.JsonParseException: Unrecognized token 'Obj^A^B^Vavro': was expecting ('true', 'false' or 'null')
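
The token in the error above matches the start of an Avro object container file: the magic bytes `Obj` followed by 0x01 (`^A`), and then the header metadata, whose first key here appears to be `avro.schema`. The following is only a minimal sketch of how the two formats could be told apart before parsing. It is not Hudi code; the class name `InstantFileSniffer` and both method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Hudi: guesses whether the raw bytes of a
// timeline file are an Avro object container file or a JSON document.
final class InstantFileSniffer {
    // Avro object container files begin with the 4 magic bytes 'O', 'b', 'j', 0x01.
    private static final byte[] AVRO_MAGIC = {'O', 'b', 'j', 1};

    // True if the content starts with the Avro container magic bytes.
    public static boolean looksLikeAvro(byte[] content) {
        if (content.length < AVRO_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < AVRO_MAGIC.length; i++) {
            if (content[i] != AVRO_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    // True if the first non-whitespace byte opens a JSON object.
    public static boolean looksLikeJson(byte[] content) {
        for (byte b : content) {
            if (b == ' ' || b == '\t' || b == '\n' || b == '\r') {
                continue;
            }
            return b == '{';
        }
        return false;
    }

    public static void main(String[] args) {
        // First bytes of the file as they appear in the error: Obj ^A ^B ^V avro...
        byte[] avroLike = {'O', 'b', 'j', 1, 2, 22, 'a', 'v', 'r', 'o'};
        byte[] jsonLike = "{\"key\":\"value\"}".getBytes(StandardCharsets.UTF_8);

        System.out.println("avroLike is Avro: " + looksLikeAvro(avroLike));
        System.out.println("avroLike is JSON: " + looksLikeJson(avroLike));
        System.out.println("jsonLike is Avro: " + looksLikeAvro(jsonLike));
        System.out.println("jsonLike is JSON: " + looksLikeJson(jsonLike));
    }
}
```

A check like this would only identify the format. Whether a requested replacecommit should be skipped or read with an Avro reader in canSkipBatch is a separate question for the maintainers.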
   
   **Expected behavior**
   
   
   **Environment Description**
   
   * Hudi version :
   0.12.1
   
   * Spark version :
   2.4.3.2
   
   * Hive version :
   no
   
   * Hadoop version :
   
   * Storage (HDFS/S3/GCS..) :
   use HDFS
   
   * Running on Docker? (yes/no) :
   no
   
   **Additional context**
   
   
   **Stacktrace**
   
   
   
   <img width="1432" alt="image" src="https://user-images.githubusercontent.com/17040353/205480108-03422fb7-fbe7-40dd-93e7-75b85eecbe21.png">
   
   
   <img width="548" alt="image" src="https://user-images.githubusercontent.com/17040353/205480341-98de0cd8-6062-4f6f-8801-f380464d4e85.png">
   
   <img width="1289" alt="image" src="https://user-images.githubusercontent.com/17040353/205480351-131bab02-8710-414c-aa6c-c0c065562d56.png">
   
   
   

