nssalian commented on PR #14297: URL: https://github.com/apache/iceberg/pull/14297#issuecomment-4057443884
Thanks for the context @pvary, that makes sense. I'll rework the `writeProperties` overload and the `collectedProperties` changes. The analyzer and heuristics work is unaffected.

To make sure I implement this correctly, here is my understanding. The `BufferedWriter` currently needs properties for two things: whether `spark.sql.iceberg.shred-variants` is enabled, to decide if shredding applies, and `spark.sql.iceberg.variant.inference.buffer-size`, to know how many rows to buffer before inferring the schema.

If `WriterFunction` stays schema-only, should the `BufferedWriter` be created at a higher level (e.g., in the `WriteBuilder` before `createWriterFunc` is called) and then delegate to the standard `WriterFunction` once the schema is inferred? Or is there another pattern you'd recommend?
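
To make the first option concrete, here is a rough, self-contained sketch of the buffer-then-delegate shape. Every name in it (`BufferedDelegatingWriter`, `RowWriter`, `inferSchema`, `writerFunc`) is hypothetical and stands in for the real types; it is not the Iceberg API and not the code in this PR. It assumes the builder has already read the buffer size from the properties, so the writer function itself only ever sees a schema.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical sketch: buffer rows, infer a schema, then delegate to a schema-only writer function. */
final class BufferedDelegatingWriter<R, S> {
  /** Stand-in for the real row writer; not an Iceberg interface. */
  public interface RowWriter<R> {
    void write(R row);
  }

  private final int bufferSize;                     // resolved by the builder from the properties
  private final Function<List<R>, S> inferSchema;   // stand-in for the analyzer/heuristics
  private final Function<S, RowWriter<R>> writerFunc; // schema-only, knows nothing about properties
  private final List<R> buffer = new ArrayList<>();
  private RowWriter<R> delegate;                    // null until the schema has been inferred

  BufferedDelegatingWriter(
      int bufferSize,
      Function<List<R>, S> inferSchema,
      Function<S, RowWriter<R>> writerFunc) {
    this.bufferSize = bufferSize;
    this.inferSchema = inferSchema;
    this.writerFunc = writerFunc;
  }

  void write(R row) {
    if (delegate != null) {
      delegate.write(row); // schema already inferred: pass straight through
      return;
    }
    buffer.add(row);
    if (buffer.size() >= bufferSize) {
      initDelegate();
    }
  }

  /** Flushes rows still buffered when fewer than bufferSize rows were written. */
  void close() {
    if (delegate == null && !buffer.isEmpty()) {
      initDelegate();
    }
  }

  private void initDelegate() {
    S schema = inferSchema.apply(buffer);  // infer once, from the buffered rows only
    delegate = writerFunc.apply(schema);   // create the standard writer for that schema
    buffer.forEach(delegate::write);       // replay the buffered rows in order
    buffer.clear();
  }
}
```

In this shape the builder would only wrap the writer when shredding is enabled, and would otherwise call the writer function directly with the table schema, so neither property has to reach the writer function.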
