steveloughran commented on issue #15628:
URL: https://github.com/apache/iceberg/issues/15628#issuecomment-4100227084

   @RussellSpitzer here are the results of the Spark reader benchmark I've 
added to the PR.
   The longer lines are all the shredded ones, even when filtering or selecting 
on a shredded column.
   
   
[human-readable-output.txt](https://github.com/user-attachments/files/26148519/human-readable-output.txt)
   
   I can think of some more tests to add there:
   * cost of adding rows
   * extending the persisted structure (I'd parameterize for a simple vs 
complex variant)
   * optionally, partitioning by the category:int field, both directly and via 
the variant.
   
   Suggestions welcome



