alamb commented on issue #4328:
URL: https://github.com/apache/arrow-rs/issues/4328#issuecomment-2100558684

   > arrow_datatype, this will be as arrow data type => Int64 or the likes?
   
   I was thinking of the arrow `DataType` enum: https://docs.rs/arrow/latest/arrow/datatypes/enum.DataType.html
   
   
   > impl IntoIterator<Item = Option<&Statistics>>, will this be Parquet Statistics of all columns in 'current' row group
   
   I think it would be [Statistics](https://docs.rs/parquet/latest/parquet/file/statistics/enum.Statistics.html), where each [Statistics](https://docs.rs/parquet/latest/parquet/file/statistics/enum.Statistics.html) represents the values for a single row group.
   
   > if that's the case why it's impl IntoIterator and not just Option<&Statistics>?
   
   The idea is to be able to create statistics for multiple row groups at a time, efficiently. Each arrow Array has significant overhead, so arrays only make sense when they store multiple values.
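
   To make the shape of that idea concrete, here is a minimal sketch that uses only the standard library. It is not the proposed API: `Statistics` is a simplified stand-in for the parquet enum, `Int64Column` is a stand-in for an arrow `Int64Array`, and `min_values_int64` is a hypothetical name. It only shows how an iterator of per-row-group statistics could be folded into one multi-value column.

```rust
/// Simplified stand-in for `parquet::file::statistics::Statistics`
/// (assumption: the real enum has more variants and different fields).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Statistics {
    Int64 { min: Option<i64>, max: Option<i64> },
    Double { min: Option<f64>, max: Option<f64> },
}

/// Stand-in for an arrow `Int64Array`: one nullable slot per row group.
type Int64Column = Vec<Option<i64>>;

/// Hypothetical helper: collect the minimum value of each row group into a
/// single column. The input yields one `Option<&Statistics>` per row group;
/// a missing statistic or a mismatched physical type becomes a null slot.
fn min_values_int64<'a>(
    stats: impl IntoIterator<Item = Option<&'a Statistics>>,
) -> Int64Column {
    stats
        .into_iter()
        .map(|s| match s {
            Some(Statistics::Int64 { min, .. }) => *min,
            _ => None,
        })
        .collect()
}
```

   Calling `min_values_int64` with the statistics of three row groups, where the second row group has none, would yield a three-slot column with a null in the middle. Taking a single `Option<&Statistics>` instead would force one tiny array per row group, which is the overhead described above.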
   
   > Sorry, just trying to get an understanding of all the moving parts.
   
   Yeah, I agree this is a complex issue....
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

