Hey Philip,

Yeah, I think we have actually done pretty well at getting reasonably
solid clients in a bunch of languages. I just think it is an important
area.

The architecture design patterns idea is fantastic. That would be a
great thing to do.

-Jay



On Fri, Jul 18, 2014 at 11:46 PM, Philip O'Toole
<philip_o_to...@yahoo.com.invalid> wrote:
> Thanks Jay -- some good ideas there.
>
> I agree strongly that fewer, more solid, non-Java clients are better than 
> many shallow ones. Interesting that you feel we could do some more work in 
> this area, as I thought it was well served (even if they have proliferated).
>
> One area I would like to see documented better -- and I am considering it myself 
> -- is a collection of Kafka "Architectural Design Patterns", all in one 
> place. For example, how to use Kafka to build a staging and test environment 
> (tapping the production flow in a non-destructive manner), how to build 
> robust pipelines that read from and write to, say, Apache Storm, how to deploy a 
> cluster in EC2 (the interaction with Availability Zones), topic vs. partition 
> demuxing, etc, etc. I've yet to see a nice consolidation of this information 
> -- it would not really be about coding, but system design. Ideally it would 
> be reviewed by you committers, but someone else would do the work.
>
> Philip
>
>
> ---------------------------
> www.philipotoole.com
>
>
>
> On Friday, July 18, 2014 3:58 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
>
>
>
> Basically my thought with getting a separate mailing list was to have
> a place specifically to discuss issues around clients. I don't see a
> lot of discussion about them on the main list. I thought perhaps this
> was because people don't like to ask questions which are about
> adjacent projects/code bases. But basically I am for whatever will lead to
> robust discussion, bug tracking, etc. on clients.
>
> -Jay
>
>
> On Fri, Jul 18, 2014 at 3:49 PM, Jun Rao <jun...@gmail.com> wrote:
>> Another important part of the ecosystem could be adaptors for
>> getting data from other systems into Kafka and vice versa. So, for the
>> ingestion part, this can include things like getting data from MySQL,
>> syslog, Apache server logs, etc. For the egress part, this can include
>> putting Kafka data into HDFS, S3, etc.
>>
>> Will a separate mailing list be convenient? Could we just use the Kafka
>> mailing list?
>>
>> Thanks,
>>
>> Jun
>>
>>
>> On Fri, Jul 18, 2014 at 2:34 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
>>
>>> A question was asked in another thread about what was an effective way
>>> to contribute to the Kafka project for people who weren't very
>>> enthusiastic about writing Java/Scala code.
>>>
>>> I wanted to kind of advocate for an area I think is really important
>>> and not as good as it could be--the client ecosystem. I think our goal
>>> is to make Kafka effective as a general purpose, centralized, data
>>> subscription system. This vision only really works if all your
>>> applications are able to integrate easily, whatever language they are
>>> in.
>>>
>>> We have a number of pretty good non-Java producers. We have been
>>> lacking the features on the server side to make writing non-Java
>>> consumers easy. We are fixing that as part of the consumer work
>>> going on right now (which moves a lot of the functionality in the
>>> Java consumer to the server side).
>>>
>>> But apart from this I think there may be a lot more we can do to make
>>> the client ecosystem better.
>>>
>>> Here are some concrete ideas. If anyone has additional ideas please
>>> reply to this thread and share them. If you are interested in picking
>>> any of these up, please do.
>>>
>>> 1. The most obvious way to improve the ecosystem is to help work on
>>> clients. This doesn't necessarily mean writing new clients, since in
>>> many cases we already have a client in a given language. I think we
>>> should do anything we can to incentivize fewer, better clients rather
>>> than many half-working clients. However, we are working now on the
>>> server-side consumer co-ordination, so it should now be possible to
>>> write much simpler consumers.
>>>
>>> 2. It would be great if someone put together a mailing list just for
>>> client developers to share tips, tricks, problems, and so on. We can
>>> make sure all the main contributors are on this too. I think this could be
>>> a forum for kind of directing improvements in this area.
>>>
>>> 3. Help improve the documentation on how to implement a client. We
>>> have tried to make the protocol spec not just a dry document but also
>>> have it share best practices, rationale, and intentions. I think this
>>> could potentially be even better as there is really a range of options
>>> from a very simple quick implementation to a more complex highly
>>> optimized version. It would be good to really document some of the
>>> options and tradeoffs.
>>>
>>> 4. Come up with a standard way of documenting the features of clients.
>>> In an ideal world it would be possible to get the same information
>>> (author, language, feature set, download link, source code, etc) for
>>> all clients. It would be great to standardize the documentation for
>>> the client as well. For example having one or two basic examples that
>>> are repeated for every client in a standardized way. This would let
>>> someone come to the Kafka site who is not a Java developer, and click
>>> on the link for their language and view examples of interacting with
>>> Kafka in the language they know using the client they would eventually
>>> use.
>>>
>>> 5. Build a Kafka Client Compatibility Kit (KCCK) :-) The idea is this:
>>> anyone who wants to implement a client would implement a simple
>>> command line program with a set of standardized options. The
>>> compatibility kit would be a standard set of scripts that run their
>>> client using this command line driver and validate its behavior. E.g.
>>> for a producer it would test that it can correctly send messages, that
>>> the ordering is retained, that the client correctly handles
>>> reconnection and metadata refresh, and compression. The output would
>>> be a list of features that passed and are certified, and perhaps basic
>>> performance information. This would be an easy way to help client
>>> developers write correct clients, as well as having a standardized
>>> comparison for the clients that says that they work correctly.
>>>
>>> -Jay
>>>
