aihuaxu commented on PR #14297:
URL: https://github.com/apache/iceberg/pull/14297#issuecomment-3730387702

   > This PR is super exciting! Does this rely on variant shredding support in 
Spark? Is it supported in Spark 4.1 already, or planned for future releases?
   > 
   > Regarding the heuristics - I'd like to propose adding table properties as 
hints for variant shredding. Similarly to properties used for bloom filters, it 
could be good to introduce something like 
`write.parquet.variant-shredding-enabled.column.col1`, which will hint to the 
writer that this column is important for shredding. Many variants have 
important fields for which shredding should be enforced, and other fields which 
are less central and can be managed with simpler heuristics. Would love to hear 
your thoughts!
   
   Yeah, I've been thinking about that too and will address it separately. 
Basically, the user would be able to specify the shredding schema based on the 
read pattern.
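
   To make the proposal concrete, here is a minimal sketch of how a writer 
might read such per-column hints. It assumes the property prefix from the 
suggestion above, which is only a proposal and not an existing Iceberg table 
property; the helper name `shredding_hint_columns` is hypothetical as well. The 
pattern mirrors the per-column bloom filter properties 
(`write.parquet.bloom-filter-enabled.column.<name>`).

```python
# Hypothetical prefix taken from the property name proposed in this thread;
# it is NOT an existing Iceberg table property.
SHRED_PREFIX = "write.parquet.variant-shredding-enabled.column."


def shredding_hint_columns(table_properties):
    """Return the sorted column names whose shredding hint is set to true."""
    columns = []
    for key, value in table_properties.items():
        # Only keys under the hint prefix with a truthy value count as hints.
        if key.startswith(SHRED_PREFIX) and value.strip().lower() == "true":
            columns.append(key[len(SHRED_PREFIX):])
    return sorted(columns)


# Example table properties: one column hinted, one explicitly disabled,
# and one unrelated property that should be ignored.
props = {
    "write.parquet.variant-shredding-enabled.column.col1": "true",
    "write.parquet.variant-shredding-enabled.column.col2": "false",
    "write.format.default": "parquet",
}
hinted = shredding_hint_columns(props)
```

   Columns without a hint would still fall back to whatever heuristics the 
writer applies by default.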


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

