kbendick commented on issue #4977:
URL: https://github.com/apache/iceberg/issues/4977#issuecomment-1154099316

   Thanks for opening this @kondamudikarthik!
   
   @youngxinler: first, thanks for your interest in this discussion. It’s not 
entirely true that writing to Iceberg tables needs to be achieved via a 
computing engine. It can absolutely be achieved with just the plain Java API 
(or the new Python refactor). While I generally recommend that a compute engine 
be used, as it’s an easier experience, my understanding is that Kafka Connect 
is a framework for map-side only / per-record transforms for writing data from 
Kafka to some other location (in this case Iceberg tables).
   
   Near-real-time data is very often written as-is or with only a handful of 
per-record transforms. For example, upstream producers will stage the data as 
they want it written, and then the folks who maintain the data lake might add 
some metadata to each record (ingestion timestamp, etc.) but otherwise write the 
data out as it was received, while ensuring that the data has a consistent 
schema.
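   
   As a rough illustration of that kind of per-record transform, here is a minimal sketch in plain Java. It uses no Kafka Connect or Iceberg APIs; the class name `IngestionStamp`, the field names `id`, `payload`, and `ingested_at`, and the fixed expected field set are all hypothetical, chosen only for the example.

```java
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class IngestionStamp {
    // Hypothetical schema: the field names every incoming record must carry.
    static final Set<String> EXPECTED_FIELDS = Set.of("id", "payload");

    // Per-record transform: reject records whose fields do not match the
    // expected set, then return a copy with an ingestion timestamp added.
    static Map<String, Object> stamp(Map<String, Object> record, Instant ingestedAt) {
        if (!record.keySet().equals(EXPECTED_FIELDS)) {
            throw new IllegalArgumentException("Unexpected fields: " + record.keySet());
        }
        Map<String, Object> out = new LinkedHashMap<>(record);
        out.put("ingested_at", ingestedAt.toString());
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> record = new LinkedHashMap<>();
        record.put("id", 42L);
        record.put("payload", "hello");
        // The original fields are passed through unchanged; only metadata is added.
        System.out.println(stamp(record, Instant.now()));
    }
}
```

   Each record is handled on its own, with no shuffle or aggregation, which is why this style of work does not strictly need a full compute engine.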
   
   There are many scenarios in which Kafka Connect would be beneficial to end 
users, and if there’s interest from the community, then it’s something we should 
definitely consider supporting.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

