hudi-bot opened a new issue, #14649: URL: https://github.com/apache/hudi/issues/14649
## JIRA info

- Link: https://issues.apache.org/jira/browse/HUDI-1207
- Type: Task
- Affects version(s): 0.9.0

---

## Comments

**21/Aug/20 01:35 — wangxianghu#1:**

Hi [~vinoth], as discussed before, putting the Kafka callback in the hudi-client module would be more uniform and elegant, so I filed this ticket. Core modules (like `hudi-client` and `hudi-spark`) taking a direct dependency on application dependencies (like Kafka) will make the core package more and more bloated, which is indeed a problem. For this case, maybe we can set the Kafka-related dependencies to "provided" scope; if users really want to use the Kafka callback, they can add the Kafka dependency themselves (via `--jars`). WDYT?

---

**01/Sep/20 14:45 — vinoth:**

[~wangxianghu] thanks for opening this issue. Thinking out loud, it's better to have the actual implementations of the callback outside of the core packages. If we put this in hudi-utilities, then the utilities bundle can include it, so it can be used with the DeltaStreamer tool. We could also put it in hudi-spark, and it would then be in both bundles. Can we revisit this once we have the new package structure for multi-engine support as well? Things could change a bit there.

---

**02/Sep/20 01:15 — wangxianghu#1:**

[~vinoth], thanks for the reply. Your view makes sense; let's keep it out of the core module and add the Kafka implementation to `hudi-spark` independently. Changed the title to "Add Kafka implementation of write commit callback to Spark datasources".
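To make the design being discussed concrete, below is a minimal sketch of a write-commit callback abstraction and a Kafka-style implementation. All names here (`WriteCommitCallback`, `KafkaWriteCommitCallback`, the JSON payload shape) are hypothetical illustrations, not Hudi's actual API. A real implementation would wrap an `org.apache.kafka.clients.producer.KafkaProducer` and send a record per commit; the producer is replaced by an in-memory list so the sketch is runnable without a broker or the kafka-clients jar.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical callback contract; Hudi's actual interface differs.
interface WriteCommitCallback {
    void call(String commitTime, String tableName);
}

// Sketch of a Kafka-backed implementation. Real code would hold a
// KafkaProducer<String, String>; here "sent" messages are collected
// in memory so the example is self-contained.
class KafkaWriteCommitCallback implements WriteCommitCallback {
    private final String topic;
    final List<String> sentMessages = new ArrayList<>();

    KafkaWriteCommitCallback(String topic) {
        this.topic = topic;
    }

    @Override
    public void call(String commitTime, String tableName) {
        // Real code: producer.send(new ProducerRecord<>(topic, key, payload));
        String payload = "{\"commitTime\":\"" + commitTime
                + "\",\"tableName\":\"" + tableName + "\"}";
        sentMessages.add(topic + ":" + payload);
    }
}

public class CallbackSketch {
    public static void main(String[] args) {
        KafkaWriteCommitCallback cb = new KafkaWriteCommitCallback("hudi-commits");
        cb.call("20200902011500", "trips");
        System.out.println(cb.sentMessages.get(0));
    }
}
```

If such an implementation lived outside the core modules with its Kafka dependencies in `provided` scope, as suggested above, users who enable the callback would supply the Kafka client jar themselves at runtime (for example via `spark-submit --jars`).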
