ashvina opened a new issue, #595:
URL: https://github.com/apache/incubator-xtable/issues/595

   ### Feature Request / Improvement
   
   There are two issues with how XTable parses the commit log of source Delta 
tables that have the deletion vectors property set.
   
   1. Missing `tightBounds` Property: For Delta tables with deletion vectors, 
the file stats include an additional property called `tightBounds`. This 
property is missing in XTable's representation of the Delta stats. As a result, 
parsing commit logs fails.
   
   > `Caused by: 
com.fasterxml.jackson.databind.exc.UnrecognizedPropertyException: Unrecognized 
field "tightBounds" (class 
org.apache.xtable.delta.DeltaStatsExtractor$DeltaStats), not marked as 
ignorable (4 known properties: "nullCount", "numRecords", "maxValues", 
"minValues"])`
   
   2. Incorrect Handling of Deletion Vectors: When a deletion vector is added 
for a data file in Delta Lake, the commit log contains both a `remove` and an 
`add` entry for the same data file. This links the deletion vector file to the 
data file. However, XTable incorrectly adds the data file path to both the new 
and removed file sets in `FileDiff`. XTable should ignore such pairs, since no 
new data file is generated. Instead, once a representation of deletion vectors 
is added, it should report the addition of a deletion vector.
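
A minimal sketch of the behavior proposed in item 2, under stated assumptions: 
the class `DeletionVectorDiffSketch`, its nested `CommitDiff` type, and the 
`diff` method are hypothetical names for illustration and are not XTable's 
actual `FileDiff` API. It also assumes that a path appearing in both the `add` 
and `remove` entries of one commit signals only a deletion vector change; a 
real implementation would likely also inspect the `deletionVector` field of 
the `add` action.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class DeletionVectorDiffSketch {

    /** Hypothetical stand-in for a per-commit file diff; not XTable's FileDiff. */
    static final class CommitDiff {
        final Set<String> added;
        final Set<String> removed;
        final Set<String> deletionVectorUpdated;

        CommitDiff(Set<String> added, Set<String> removed, Set<String> deletionVectorUpdated) {
            this.added = added;
            this.removed = removed;
            this.deletionVectorUpdated = deletionVectorUpdated;
        }
    }

    /**
     * Splits the paths of one commit into truly added files, truly removed
     * files, and files that appear in both lists. The last group is assumed
     * to be data files that only gained or changed a deletion vector.
     */
    static CommitDiff diff(List<String> addPaths, List<String> removePaths) {
        Set<String> added = new LinkedHashSet<>(addPaths);
        Set<String> removed = new LinkedHashSet<>(removePaths);

        // Paths present in both the add and remove entries of the same commit.
        Set<String> inBoth = new LinkedHashSet<>(added);
        inBoth.retainAll(removed);

        // No new data file was written for these, so drop them from both sets.
        added.removeAll(inBoth);
        removed.removeAll(inBoth);
        return new CommitDiff(added, removed, inBoth);
    }

    public static void main(String[] args) {
        CommitDiff d = diff(
            List.of("part-0001.parquet", "part-0003.parquet"),
            List.of("part-0001.parquet", "part-0002.parquet"));
        System.out.println("added=" + d.added);
        System.out.println("removed=" + d.removed);
        System.out.println("deletionVectorUpdated=" + d.deletionVectorUpdated);
    }
}
```

Keeping the overlapping paths in a separate set, rather than discarding them, 
leaves room to report them as deletion vector additions later.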
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
