A bit of background… The Kafka connector is two classes instead of a single KafkaStreams connector (with publish()/subscribe()) because, at least a while ago (I don’t know if this is still the case), Kafka had two completely separate classes for a “consumer” and a “producer”, each with very different config setup params. By comparison, MQTT has a single MqttClient class (with publish()/subscribe()).
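
To make the direction of each of the two classes concrete, here is a minimal sketch. It is not the Edgent or Kafka API: the class ConnectorDirectionSketch and its source() and sink() helpers are hypothetical, the functional interfaces are the JDK's own, and the "topic" is just an in-memory list. It only shows that, seen from the topology, the “consumer” side brings tuples in and the “producer” side sends them out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Supplier;

/**
 * Hypothetical stand-in, NOT the Edgent or Kafka API. It only illustrates
 * which direction each connector faces when seen from the topology.
 */
public class ConnectorDirectionSketch {

    /**
     * Plays the role of a "consumer" connector: it consumes from the
     * external system, so for the topology it is a source of tuples.
     */
    static <T> List<T> source(Supplier<Iterable<T>> external) {
        List<T> tuples = new ArrayList<>();
        for (T t : external.get()) {
            tuples.add(t);
        }
        return tuples;
    }

    /**
     * Plays the role of a "producer" connector: it produces to the
     * external system, so for the topology it is a sink for tuples.
     */
    static <T> void sink(List<T> tuples, Consumer<T> external) {
        for (T t : tuples) {
            external.accept(t);
        }
    }

    public static void main(String[] args) {
        // In-memory stand-ins for an inbound and an outbound topic (assumption).
        List<String> inbound = Arrays.asList("a", "b", "c");
        List<String> outbound = new ArrayList<>();

        List<String> tuples = source(() -> inbound); // "consumer" side
        sink(tuples, outbound::add);                 // "producer" side

        System.out.println(outbound); // prints [a, b, c]
    }
}
```

The point of the sketch is only the naming: the same class can reasonably be called a consumer (from Kafka's side) or a source (from the topology's side), depending on where you stand.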

At the time, the decision was to name the Edgent Kafka classes similarly to the underlying Kafka API classes. Hence KafkaConsumer (~wrapping Kafka’s ConsumerConnector) and KafkaProducer (~wrapping Kafka’s KafkaProducer). While not exposed today, it’s conceivable that some day one could create an Edgent Kafka connector instance by providing a Kafka API class directly instead of just a config map, e.g., supplying a Kafka KafkaProducer as an arg to the Edgent KafkaProducer connector’s constructor. So having the names align seems like goodness.

I don’t think the Edgent connectors should be trying to mask the underlying system’s API, or to make it unnecessary for a user to understand it… just make it usable, and easily usable for simple/common cases, in an Edgent topology context (worrying about when to actually make an external connection, recovering from broken connections / reconnecting, handling common tuple types).

As for the specific suggestions, I think simply switching the names of Edgent’s KafkaConsumer and KafkaProducer is a bad idea :-) Offering KafkaSource and KafkaSink is OK I guess (though probably retaining the current names for a release or three). Though I’ll note the Edgent API uses “source” and “sink” as verbs, which take a Supplier and a Consumer fn as args respectively. Note that Consumer is used there in the context of sink.

Alternatively there’s KafkaSubscriber and KafkaPublisher. While clearer than Consumer/Producer, I don’t know if they’re any better than Source/Sink.

In the end I guess I don’t feel strongly about it all… though I wonder if it’s really worth the effort of changing. At least the Edgent connector’s javadoc is pretty good / clear for the classes and their use... I think :-)

-- Dale

> On Mar 20, 2018, at 9:59 PM, vino yang <yanghua1...@gmail.com> wrote:
>
> Hi Chris,
>
> All data processing framework could think it as a *pipeline*. The Edgent's
> point of view, there could be two endpoints:
>
> - source: means data injection;
> - sink: means data export;
>
> There are many frameworks use this conventional naming rule, such as Apache
> Flume, Apache Flink, Apache Spark (structured streaming).
>
> I think "KafkaConsumer" could be replaced with "KafkaSource" and
> "KafkaProducer" could be named "KafkaSink".
>
> And middle of the pipeline is the transformation of the data, there are
> many operators to transform data, such as map, flatmap, filter, reduce...
> and so on.
>
> Vino yang.
> Thanks.
>
> 2018-03-20 20:51 GMT+08:00 Christofer Dutz <christofer.d...@c-ware.de>:
>
>> Hi,
>>
>> have been using the Kafka integration quite often in the past and one
>> thing I always have to explain when demonstrating code and which seems to
>> confuse everyone seeing the code:
>>
>> I would expect a KafkaConsumer to consume Edgent messages and publish them
>> to Kafka and would expect a KafkaProducer to produce Edgent events.
>>
>> Unfortunately it seems to be the other way around. This seems a little
>> unintuitive. Judging from the continued confusion when demonstrating code
>> eventually it’s worth considering to rename these (swap their names).
>> Eventually even rename them to “KafkaSource” (Edgent Source that consumes
>> Kafka messages and produces Edgent events) and “KafkaConsumer” (Consumes
>> Edgent Events and produces Kafka messages). After all the Classes are in
>> the Edgent namespace and come from the Edgent libs, so the fixed point when
>> inspecting these should be clear. Also I bet no one would be confused if we
>> called something that produces Kafka messages a consumer as there should
>> never be code that handles this from a Kafka point of view AND uses Edgent
>> at the same time.
>>
>> Chris