CTTY commented on issue #1650: URL: https://github.com/apache/iceberg-rust/issues/1650#issuecomment-3259415564
Hi @liurenjie1024, thanks for the input! Your idea sounds good to me, and I agree that we should take smaller steps where possible. I'll try to put together a draft based on it next.

One thing I'm not sure about in the `PartitioningWriter` interface is that the incoming `batch` may still contain rows from different partitions (e.g. when the user has a partitioned table and wants to use round-robin partitioning to avoid partition skew):

```rust
pub trait PartitioningWriter {
    // if `batch` here contains data from multiple partitions,
    // then the entire batch would still be written to the partition of `partition_key`
    fn write(&self, partition_key: PartitionKey, batch: RecordBatch);
}
```

I'm thinking of something like this:

```rust
pub trait PartitioningWriter {
    // use record batch splitter to split the incoming batch first
    fn write(&self, batch: RecordBatch);

    // the `batch` here should already be split by partition
    // technically this shouldn't be publicly accessible
    fn do_write(&self, partition_key: PartitionKey, splitted_batch: RecordBatch);
}
```

Please let me know your thoughts!
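
For reference, here is a rough, self-contained sketch of the split-then-dispatch flow I have in mind. Everything in it is a stand-in for illustration: `Row`, `Batch`, `split_by_partition`, and `InMemoryWriter` are hypothetical, `PartitionKey` is reduced to a `String`, and I use `&mut self` instead of `&self` just to keep the toy writer simple. The real version would operate on arrow's `RecordBatch` and the existing record batch splitter.

```rust
use std::collections::BTreeMap;

// Hypothetical stand-ins: the real code would use arrow's `RecordBatch`
// and iceberg-rust's `PartitionKey` instead of these toy types.
type PartitionKey = String;

#[derive(Debug, Clone, PartialEq)]
struct Row {
    partition: String, // value of the partition column for this row
    value: i64,
}

type Batch = Vec<Row>;

/// Groups rows by partition value, preserving row order within each group.
/// This plays the role of the record batch splitter.
fn split_by_partition(batch: Batch) -> BTreeMap<PartitionKey, Batch> {
    let mut groups: BTreeMap<PartitionKey, Batch> = BTreeMap::new();
    for row in batch {
        groups.entry(row.partition.clone()).or_default().push(row);
    }
    groups
}

trait PartitioningWriter {
    /// Writes a batch whose rows all belong to `partition_key`.
    /// Ideally this would not be part of the public surface.
    fn do_write(&mut self, partition_key: PartitionKey, splitted_batch: Batch);

    /// Public entry point: split the incoming batch first, then dispatch
    /// each per-partition piece to `do_write`.
    fn write(&mut self, batch: Batch) {
        for (key, part) in split_by_partition(batch) {
            self.do_write(key, part);
        }
    }
}

/// Toy writer that only records what it was asked to write.
#[derive(Default)]
struct InMemoryWriter {
    written: BTreeMap<PartitionKey, Batch>,
}

impl PartitioningWriter for InMemoryWriter {
    fn do_write(&mut self, partition_key: PartitionKey, splitted_batch: Batch) {
        self.written
            .entry(partition_key)
            .or_default()
            .extend(splitted_batch);
    }
}
```

With this shape, callers only ever see `write(batch)`, so a batch that mixes partitions can't accidentally end up in a single partition's file. Whether `do_write` lives on the trait or on a separate internal trait is still open.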