hudi-bot opened a new issue, #14777:
URL: https://github.com/apache/hudi/issues/14777

   Currently, when Hudi bootstraps a Parquet file, or upserts into a Parquet file 
that contains a timestamp column, the operation fails because of the following issues:
   
   1) During bootstrap, if the original Parquet file was written by a Spark 
application, Spark saves timestamps as INT96 by default (see 
spark.sql.parquet.int96AsTimestamp), and bootstrap fails because Hudi cannot 
currently read the INT96 type. (This can be solved by upgrading Parquet to 
1.12.0 and setting parquet.avro.readInt96AsFixed=true; see 
https://github.com/apache/parquet-mr/pull/831/files)
   
   2) After bootstrap, an upsert fails because we use the Hoodie schema to 
read the original Parquet file. The schemas do not match: the Hoodie schema 
treats the timestamp as long, while in the original file it is INT96.
   
   3) After bootstrap, a partial update of a Parquet file fails because we 
copy the old record and save it with the Hoodie schema (we are missing a 
convertFixedToLong operation like the one Spark performs).
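
   The fixed-to-long conversion mentioned in 3) can be sketched as follows. 
This is only an illustration, not Hudi's or Spark's actual code: the function 
name `int96_to_micros` and the constants are hypothetical. It assumes the usual 
INT96 timestamp layout (12 bytes, little-endian: 8 bytes of nanoseconds within 
the day, then 4 bytes of Julian day number) and that the target long holds 
microseconds since the Unix epoch.

```python
import struct

# Assumed constants for this sketch (names are hypothetical).
JULIAN_DAY_OF_EPOCH = 2440588          # Julian day number of 1970-01-01
MICROS_PER_DAY = 86_400 * 1_000_000    # microseconds in one day


def int96_to_micros(fixed: bytes) -> int:
    """Convert a 12-byte INT96 timestamp to microseconds since the Unix epoch."""
    if len(fixed) != 12:
        raise ValueError(f"INT96 value must be 12 bytes, got {len(fixed)}")
    # Little-endian: signed 64-bit nanoseconds of day, then signed 32-bit Julian day.
    nanos_of_day, julian_day = struct.unpack("<qi", fixed)
    days_since_epoch = julian_day - JULIAN_DAY_OF_EPOCH
    # Sub-microsecond precision is truncated, since the target type is microseconds.
    return days_since_epoch * MICROS_PER_DAY + nanos_of_day // 1_000
```

   A value read as a 12-byte fixed field (for example with 
parquet.avro.readInt96AsFixed=true) could be passed through such a function 
before the record is written back with a schema that declares the column as long.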
   
   ## JIRA info
   
   - Link: https://issues.apache.org/jira/browse/HUDI-1779
   - Type: Bug
   - Epic: https://issues.apache.org/jira/browse/HUDI-1265
   - Fix version(s):
     - 1.1.0
   - Attachment(s):
     - 08/Apr/21 13:32;lrz;unsupportInt96.png;https://issues.apache.org/jira/secure/attachment/13023562/unsupportInt96.png
     - 08/Apr/21 13:32;lrz;upsertFail.png;https://issues.apache.org/jira/secure/attachment/13023563/upsertFail.png
     - 08/Apr/21 13:32;lrz;upsertFail2.png;https://issues.apache.org/jira/secure/attachment/13023564/upsertFail2.png
   
   
   ---
   
   
   ## Comments
   
   08/Aug/21 20:17;githubbot;hudi-bot edited a comment on pull request #2790:
   URL: https://github.com/apache/hudi/pull/2790#issuecomment-869766484
   
   
      <!--
      Meta data
      {
        "version" : 1,
        "metaDataEntries" : [ {
          "hash" : "41aec7191e0345c1d8a4efb805ea04e1510bd480",
          "status" : "FAILURE",
          "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=502",
          "triggerID" : "41aec7191e0345c1d8a4efb805ea04e1510bd480",
          "triggerType" : "PUSH"
        } ]
      }-->
      ## CI report:
      
      * 41aec7191e0345c1d8a4efb805ea04e1510bd480 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=502)
 
      
      <details>
      <summary>Bot commands</summary>
        The @flinkbot bot supports the following commands:
      
       - `@flinkbot run travis` re-run the last Travis build
       - `@flinkbot run azure` re-run the last Azure build
      </details>
   
   
   -- 
   This is an automated message from the Apache Git Service.
   To respond to the message, please log on to GitHub and use the
   URL above to go to the specific comment.
   
   To unsubscribe, e-mail: [email protected]
   
   For queries about this service, please contact Infrastructure at:
   [email protected]
   ;;;
   
   ---
   
   09/Aug/21 04:22;githubbot;hudi-bot edited a comment on pull request #2790:
   URL: https://github.com/apache/hudi/pull/2790#issuecomment-869766484
   
   
      <!--
      Meta data
      {
        "version" : 1,
        "metaDataEntries" : [ {
          "hash" : "41aec7191e0345c1d8a4efb805ea04e1510bd480",
          "status" : "FAILURE",
          "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=502",
          "triggerID" : "41aec7191e0345c1d8a4efb805ea04e1510bd480",
          "triggerType" : "PUSH"
        } ]
      }-->
      ## CI report:
      
      * 41aec7191e0345c1d8a4efb805ea04e1510bd480 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=502)
 
      
      <details>
      <summary>Bot commands</summary>
        @hudi-bot supports the following commands:
      
       - `@hudi-bot run travis` re-run the last Travis build
       - `@hudi-bot run azure` re-run the last Azure build
      </details>
   
   
   ;;;
   
   ---
   
   13/Dec/21 14:26;shivnarayan;[~alexey.kudinkin]: Related to parquet upgrade.;;;

