szehon-ho opened a new pull request, #5376:
URL: https://github.com/apache/iceberg/pull/5376

   This adds the following columns to all files tables:
   
   - column_sizes_metrics
   - value_counts_metrics
   - null_value_counts_metrics
   - nan_value_counts_metrics
   - lower_bounds_metrics
   - upper_bounds_metrics
   
   The new columns are added alongside the existing metrics columns to keep 
backward compatibility, as the existing metrics columns cannot be changed.
   
   The first four return Map<String, Long>.  The key is the human-readable 
column name (dot-separated for nested columns).
   The last two return Map<String, String>.  The key is as above, and the 
value is the human-readable upper/lower bound.
   
   Example: upper_bounds_metrics = Map ("mystruct.timestamp" => 
"1970-01-01T00:00:00.000002")
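
A minimal sketch of how such a readable bounds map could be derived, not the code in this PR. It assumes bounds are keyed by field id, that a timestamp bound is stored as an 8-byte little-endian long of microseconds since the epoch, and that an id-to-name lookup is available. The class `ReadableBounds`, both helper methods, and the plain `Map<Integer, String>` standing in for the table schema are hypothetical names introduced for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.util.LinkedHashMap;
import java.util.Map;

public class ReadableBounds {

  // Assumption: the bound is an 8-byte little-endian long holding
  // microseconds since 1970-01-01T00:00:00 (timestamp without zone).
  static String timestampBoundToString(ByteBuffer bound) {
    long micros = bound.duplicate().order(ByteOrder.LITTLE_ENDIAN).getLong();
    long seconds = Math.floorDiv(micros, 1_000_000L);
    int nanos = (int) (Math.floorMod(micros, 1_000_000L) * 1_000L);
    return LocalDateTime.ofEpochSecond(seconds, nanos, ZoneOffset.UTC).toString();
  }

  // Hypothetical helper: re-key bounds from field id to dot-separated column
  // name and render each value as text. Only timestamp bounds are handled here.
  static Map<String, String> readableBounds(
      Map<Integer, ByteBuffer> boundsById, Map<Integer, String> idToName) {
    Map<String, String> result = new LinkedHashMap<>();
    for (Map.Entry<Integer, ByteBuffer> entry : boundsById.entrySet()) {
      String name = idToName.get(entry.getKey());
      if (name != null) { // skip ids that cannot be resolved to a column name
        result.put(name, timestampBoundToString(entry.getValue()));
      }
    }
    return result;
  }

  public static void main(String[] args) {
    ByteBuffer twoMicros =
        ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN).putLong(0, 2L);
    Map<Integer, ByteBuffer> upperBounds = Map.of(3, twoMicros);
    Map<Integer, String> idToName = Map.of(3, "mystruct.timestamp");
    // prints {mystruct.timestamp=1970-01-01T00:00:00.000002}
    System.out.println(readableBounds(upperBounds, idToName));
  }
}
```

The four count and size columns would follow the same re-keying step, with the Long value passed through unchanged.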
   
   This brings Iceberg metadata tables a bit closer to Trino, where the last 
two columns are <Long, String> (column id to human-readable bound).  It goes 
a step further and resolves the column id to a column name to make it readable.
   
   Implementation detail:  Now that we add new columns to the files table, it 
is no longer a 1-to-1 mapping with the "DataFile" Java object, so we have to 
add code to handle column mapping in the projection case.
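
One way to picture the projection problem, as a rough sketch under stated assumptions rather than the approach taken in this PR: a projected row resolves each position either to a field of the underlying file object or to a derived metrics column. The class `ProjectedFilesRow`, the `Map<String, Object>` standing in for the "DataFile" object, and the `Supplier`-based derived columns are all hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

public class ProjectedFilesRow {
  private final List<String> projectedColumns;         // column names in projection order
  private final Map<String, Object> fileFields;        // stand-in for the DataFile object
  private final Map<String, Supplier<Object>> derived; // stand-in for the new *_metrics columns

  ProjectedFilesRow(
      List<String> projectedColumns,
      Map<String, Object> fileFields,
      Map<String, Supplier<Object>> derived) {
    this.projectedColumns = projectedColumns;
    this.fileFields = fileFields;
    this.derived = derived;
  }

  // Resolve a projected position: derived columns are computed on demand,
  // everything else is read from the underlying file fields.
  Object get(int pos) {
    String name = projectedColumns.get(pos);
    Supplier<Object> supplier = derived.get(name);
    return supplier != null ? supplier.get() : fileFields.get(name);
  }

  public static void main(String[] args) {
    Map<String, Object> fileFields = Map.of("file_path", "s3://bucket/a.parquet");
    Map<String, Supplier<Object>> derived =
        Map.of("value_counts_metrics", () -> Map.of("id", 10L));
    ProjectedFilesRow row =
        new ProjectedFilesRow(
            List.of("value_counts_metrics", "file_path"), fileFields, derived);
    System.out.println(row.get(0)); // the derived map
    System.out.println(row.get(1)); // the plain file field
  }
}
```

Looking columns up by name keeps the sketch short. An implementation that cares about per-row cost would more likely precompute a position-to-source mapping once per projection.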
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

