Re: Improving the Kafka client ecosystem

Jay Kreps Fri, 18 Jul 2014 15:58:26 -0700

Basically my thought with getting a separate mailing list was to have
a place specifically to discuss issues around clients. I don't see a
lot of discussion about them on the main list. I thought perhaps this
was because people don't like to ask questions which are about
adjacent projects/code bases. But basically I'm fine with whatever will
lead to robust discussion, bug tracking, etc. on clients.


-Jay

On Fri, Jul 18, 2014 at 3:49 PM, Jun Rao <[email protected]> wrote:
> Another important part of the ecosystem could be the adaptors for
> getting data from other systems into Kafka and vice versa. So, for the
> ingestion part, this can include things like getting data from mysql,
> syslog, apache server log, etc. For the egress part, this can include
> putting Kafka data into HDFS, S3, etc.
>
> Will a separate mailing list be convenient? Could we just use the Kafka
> mailing list?
>
> Thanks,
>
> Jun
>
>
> On Fri, Jul 18, 2014 at 2:34 PM, Jay Kreps <[email protected]> wrote:
>
>> A question was asked in another thread about what was an effective way
>> to contribute to the Kafka project for people who weren't very
>> enthusiastic about writing Java/Scala code.
>>
>> I wanted to kind of advocate for an area I think is really important
>> and not as good as it could be--the client ecosystem. I think our goal
>> is to make Kafka effective as a general purpose, centralized, data
>> subscription system. This vision only really works if all your
>> applications are able to integrate easily, whatever language they are
>> in.
>>
>> We have a number of pretty good non-Java producers. We have been
>> lacking the features on the server side to make writing non-Java
>> consumers easy. We are fixing that as part of the consumer work
>> going on right now (which moves a lot of the functionality in the
>> Java consumer to the server side).
>>
>> But apart from this I think there may be a lot more we can do to make
>> the client ecosystem better.
>>
>> Here are some concrete ideas. If anyone has additional ideas please
>> reply to this thread and share them. If you are interested in picking
>> any of these up, please do.
>>
>> 1. The most obvious way to improve the ecosystem is to help work on
>> clients. This doesn't necessarily mean writing new clients, since in
>> many cases we already have a client in a given language. I think we
>> should do anything we can to incentivize fewer, better clients rather
>> than many half-working clients. However we are working now on the
>> server-side consumer co-ordination so it should now be possible to
>> write much simpler consumers.
>>
>> 2. It would be great if someone put together a mailing list just for
>> client developers to share tips, tricks, problems, and so on. We can
>> make sure all the main contributors are on this too. I think this
>> could be a forum for kind of directing improvements in this area.
>>
>> 3. Help improve the documentation on how to implement a client. We
>> have tried to make the protocol spec not just a dry document but also
>> have it share best practices, rationale, and intentions. I think this
>> could potentially be even better as there is really a range of options
>> from a very simple quick implementation to a more complex highly
>> optimized version. It would be good to really document some of the
>> options and tradeoffs.
>>
>> 4. Come up with a standard way of documenting the features of clients.
>> In an ideal world it would be possible to get the same information
>> (author, language, feature set, download link, source code, etc) for
>> all clients. It would be great to standardize the documentation for
>> the client as well. For example having one or two basic examples that
>> are repeated for every client in a standardized way. This would let
>> someone come to the Kafka site who is not a Java developer, and click
>> on the link for their language and view examples of interacting with
>> Kafka in the language they know using the client they would eventually
>> use.
>>
>> 5. Build a Kafka Client Compatibility Kit (KCCK) :-) The idea is this:
>> anyone who wants to implement a client would implement a simple
>> command line program with a set of standardized options. The
>> compatibility kit would be a standard set of scripts that run their
>> client using this command line driver and validate its behavior. E.g.
>> for a producer it would test that it can correctly send messages, that
>> the ordering is retained, and that the client correctly handles
>> reconnection, metadata refresh, and compression. The output would
>> be a list of features that passed and are certified, and perhaps basic
>> performance information. This would be an easy way to help client
>> developers write correct clients, as well as having a standardized
>> comparison for the clients that says that they work correctly.
>>
>> -Jay
>>
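
As a rough illustration of idea 5 above, here is a minimal sketch of how
a compatibility kit might drive a client through a command line program
and check delivery and ordering. Everything in it is an assumption made
for illustration: the names run_driver, check_producer, and ECHO_DRIVER
are hypothetical, and the "driver" is a stand-in that echoes stdin to
stdout rather than a real Kafka client. A real kit would have the driver
produce to a broker and would read the messages back from Kafka.

```python
import subprocess
import sys


def run_driver(cmd, messages):
    """Run a client driver command, feed it one message per line on
    stdin, and return the lines it wrote to stdout."""
    result = subprocess.run(
        cmd,
        input="".join(m + "\n" for m in messages),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


def check_producer(cmd, count=100):
    """Return a feature -> passed mapping for a (hypothetical) driver."""
    sent = [f"msg-{i}" for i in range(count)]
    received = run_driver(cmd, sent)
    return {
        # every message arrived exactly once, in any order
        "delivery": sorted(received) == sorted(sent),
        # messages arrived in the order they were sent
        "ordering": received == sent,
    }


# Stand-in driver (assumption): echoes stdin to stdout unchanged.
ECHO_DRIVER = [
    sys.executable, "-c",
    "import sys; sys.stdout.write(sys.stdin.read())",
]

# Deliberately broken stand-in: emits the messages in reverse order.
REVERSING_DRIVER = [
    sys.executable, "-c",
    "import sys; sys.stdout.write(''.join(reversed(sys.stdin.readlines())))",
]

if __name__ == "__main__":
    for feature, ok in check_producer(ECHO_DRIVER).items():
        print(f"{feature}: {'PASS' if ok else 'FAIL'}")
```

The point of the design is that the kit only needs a command to run, so
a client in any language could plug in by supplying its own driver with
the same options.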
