hudi-bot opened a new issue, #14649:
URL: https://github.com/apache/hudi/issues/14649

   ## JIRA info
   
   - Link: https://issues.apache.org/jira/browse/HUDI-1207
   - Type: Task
   - Affects version(s):
     - 0.9.0
   
   
   ---
   
   
   ## Comments
   
   21/Aug/20 01:35;wangxianghu#1;Hi [~vinoth], as discussed before, putting the Kafka 
callback in the hudi-client module would be more uniform and elegant, so I filed 
this ticket.
   
   Core modules (like {{hudi-client}} and {{hudi-spark}}) taking a direct 
dependency on application libraries (like Kafka) would make the core package 
more and more bloated, which is indeed a problem. 
   
   In this case, maybe we can set the scope of the Kafka-related dependencies to 
"provided"; users who really want to use the Kafka callback can add the Kafka 
dependency themselves (via --jars). 
   
   WDYT?;;;
   
   ---
   
   01/Sep/20 14:45;vinoth;[~wangxianghu] thanks for opening this issue. 
Thinking out loud, it's better to have the actual implementations of the 
callback outside of the core packages. If we have this in hudi-utilities, then the 
utilities bundle can include it, and so it can be used with the delta streamer tool. We 
can also put this in hudi-spark per se, and it will be in both bundles. 
   
   Can we revisit this along with the new package structure for multi-engine 
support as well? Things could change a bit there. ;;;
   
   ---
   
   02/Sep/20 01:15;wangxianghu#1;[~vinoth], thanks for the reply.
   
   Your view makes sense. Let's keep it out of the core module and add the Kafka 
implementation to `hudi-spark` independently.
   
   Changed the title to "Add kafka implementation of write commit callback to 
Spark datasources";;;


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
