marvinlanhenke commented on issue #329:
URL: https://github.com/apache/iceberg-rust/issues/329#issuecomment-2041548391

   > I'm not sure whether my understanding is correct: `table.append()` is 
meant to insert a batch of data into the table. It seems like a high-level 
API that will use two lower-level APIs:
   > 
   > 1. [writer API](https://github.com/apache/iceberg-rust/issues/34) to 
convert RecordBatch to DataFile
   > 2. [transaction 
API](https://github.com/apache/iceberg-rust/blob/ca9de89ac9d95683c8fe9191f72ab922dc4c7672/crates/iceberg/src/transaction.rs#L30)
  to commit the DataFile (update the table metadata)
   > 
   > To keep these two interfaces separate, I think we don't need to delegate 
the conversion between `RecordBatch` and `DataFile` to the transaction.
   
   I think your understanding is correct, and I agree: if the writer API 
already does the conversion from RecordBatch to DataFile, the Transaction 
shouldn't be concerned with it, since the Transaction is a higher-level API. 
It still seems reasonable, however, for the Transaction to call the writer 
that writes the actual DataFile. 
   
   So the Transaction `append` (if I understand the py impl correctly) does all 
of the following:
   - calls the writer to write the DataFile
   - creates an instance of MergingSnapshotProducer, which is responsible for 
writing the manifest, manifest_list, and snapshot_update
   - commits by calling update_table() on the Catalog with TableUpdate & 
TableRequirements
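
As a rough sketch of that flow in Rust: every type and function below is a hypothetical stand-in made up for illustration (`CountingWriter`, `produce_snapshot`, the fields of `TableUpdate` and `TableRequirement`, and so on), not the actual iceberg-rust or PyIceberg API. It only tries to show how the three steps could hang together.

```rust
// Hypothetical stand-ins for illustration only; none of these are the
// real iceberg-rust types or signatures.

#[derive(Debug, Clone, PartialEq)]
struct RecordBatch {
    rows: usize,
}

#[derive(Debug, Clone, PartialEq)]
struct DataFile {
    path: String,
    record_count: usize,
}

/// Writer API (assumed shape): converts a RecordBatch into a DataFile.
trait DataFileWriter {
    fn write(&mut self, batch: &RecordBatch) -> DataFile;
}

/// Toy writer that only makes up file names instead of writing Parquet.
struct CountingWriter {
    next_id: usize,
}

impl DataFileWriter for CountingWriter {
    fn write(&mut self, batch: &RecordBatch) -> DataFile {
        let file = DataFile {
            path: format!("data/{:05}.parquet", self.next_id),
            record_count: batch.rows,
        };
        self.next_id += 1;
        file
    }
}

#[derive(Debug, PartialEq)]
struct TableUpdate {
    snapshot_id: u64,
    manifest_list: String,
    added_files: Vec<DataFile>,
}

#[derive(Debug, PartialEq)]
struct TableRequirement {
    expected_snapshot: Option<u64>,
}

#[derive(Default)]
struct Catalog {
    current_snapshot: Option<u64>,
    committed: Vec<TableUpdate>,
}

impl Catalog {
    /// Applies the update only if the table is still at the expected snapshot.
    fn update_table(&mut self, req: TableRequirement, update: TableUpdate) -> Result<(), String> {
        if req.expected_snapshot != self.current_snapshot {
            return Err("table changed since the transaction started".to_string());
        }
        self.current_snapshot = Some(update.snapshot_id);
        self.committed.push(update);
        Ok(())
    }
}

/// Stand-in for the snapshot producer: a real one would write the manifest
/// and manifest list; this one only invents a path for the manifest list.
fn produce_snapshot(parent: Option<u64>, files: Vec<DataFile>) -> TableUpdate {
    let snapshot_id = parent.map_or(1, |id| id + 1);
    TableUpdate {
        snapshot_id,
        manifest_list: format!("metadata/snap-{}.avro", snapshot_id),
        added_files: files,
    }
}

struct Transaction<'a, W: DataFileWriter> {
    catalog: &'a mut Catalog,
    writer: W,
    base_snapshot: Option<u64>,
    pending: Vec<DataFile>,
}

impl<'a, W: DataFileWriter> Transaction<'a, W> {
    fn new(catalog: &'a mut Catalog, writer: W) -> Self {
        let base_snapshot = catalog.current_snapshot;
        Transaction { catalog, writer, base_snapshot, pending: Vec::new() }
    }

    /// Step 1: call the writer to turn the batch into a DataFile.
    fn append(&mut self, batch: &RecordBatch) {
        let file = self.writer.write(batch);
        self.pending.push(file);
    }

    /// Steps 2 and 3: produce the snapshot update, then commit via the catalog.
    fn commit(self) -> Result<u64, String> {
        let Transaction { catalog, base_snapshot, pending, .. } = self;
        let update = produce_snapshot(base_snapshot, pending);
        let snapshot_id = update.snapshot_id;
        let requirement = TableRequirement { expected_snapshot: base_snapshot };
        catalog.update_table(requirement, update)?;
        Ok(snapshot_id)
    }
}
```

In this sketch the writer is handed to the Transaction when it is created, which is only one possible placement; the Transaction itself never touches a RecordBatch beyond passing it to the writer.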
   
   @ZENOTME 
   Where would the writer API (which I only know from the design spec in #34) 
fit best here? Should each new Transaction create its own writer? Or should 
the Table itself hold a ref to a writer?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

