Re: Improving the Kafka client ecosystem

2014-08-19 Thread Joe Stein
Great idea Gwen! I think it would go a long way to making this work.  Issue
created :)

On Tue, Aug 19, 2014 at 4:33 PM, Gwen Shapira  wrote:

> Does it make sense to merge the Camus mailing list? (i.e. ask the
> Camus community to merge?) It's a fairly large and popular client.

Re: Improving the Kafka client ecosystem

2014-08-19 Thread Gwen Shapira
Does it make sense to merge the Camus mailing list? (i.e. ask the
Camus community to merge?) It's a fairly large and popular client.

On Tue, Aug 19, 2014 at 1:27 PM, Joe Stein  wrote:
> I also opened issues on 3 of the clients on GitHub that I frequently
> use or am involved in often enough; it would be great to get this on
> their READMEs.
>
> Thanks to the community for driving things along!

Re: Improving the Kafka client ecosystem

2014-08-19 Thread Joe Stein
I also opened issues on 3 of the clients on GitHub that I frequently
use or am involved in often enough; it would be great to get this on
their READMEs.

Thanks to the community for driving things along!

/***
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC
 http://www.stealth.ly
 Twitter: @allthingshadoop 
/


On Tue, Aug 19, 2014 at 4:22 PM, Joe Stein  wrote:

> I just joined too, and tweeted.

Re: Improving the Kafka client ecosystem

2014-08-19 Thread Joe Stein
I just joined too, and tweeted.

/***
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC
 http://www.stealth.ly
 Twitter: @allthingshadoop 
/


On Tue, Aug 19, 2014 at 4:08 PM, Jay Kreps  wrote:

> Cool. I just joined. I'll add it to the website so others can find it.
> If someone was willing to ping some of the other client developers and
> get them to join as well that would probably give us critical mass.
>
> -Jay

Re: Improving the Kafka client ecosystem

2014-08-19 Thread Jay Kreps
Cool. I just joined. I'll add it to the website so others can find it.
If someone was willing to ping some of the other client developers and
get them to join as well that would probably give us critical mass.

-Jay

On Tue, Aug 19, 2014 at 9:08 AM, Dana Powers  wrote:
> I created kafka-clie...@groups.google.com
>
> https://groups.google.com/forum/m/#!forum/kafka-clients
>
> No members and no guidelines yet, but it's a start.  Would love to get this
> going.
>
> Dana


Re: Improving the Kafka client ecosystem

2014-08-19 Thread Dana Powers
I created kafka-clie...@groups.google.com

https://groups.google.com/forum/m/#!forum/kafka-clients

No members and no guidelines yet, but it's a start.  Would love to get this
going.

Dana
 On Aug 19, 2014 9:03 AM, "Mark Roberts"  wrote:

> Did this mailing list ever get created? Was there consensus on whether it
> needed to be created?
>
> -Mark


Re: Improving the Kafka client ecosystem

2014-08-19 Thread Mark Roberts
Did this mailing list ever get created? Was there consensus on whether it
needed to be created?

-Mark

> On Jul 18, 2014, at 14:34, Jay Kreps  wrote:
> 
> A question was asked in another thread about what was an effective way
> to contribute to the Kafka project for people who weren't very
> enthusiastic about writing Java/Scala code.
> 
> I wanted to kind of advocate for an area I think is really important
> and not as good as it could be--the client ecosystem. I think our goal
> is to make Kafka effective as a general-purpose, centralized data
> subscription system. This vision only really works if all your
> applications are able to integrate easily, whatever language they are
> in.
> 
> We have a number of pretty good non-Java producers. We have been
> lacking the features on the server side to make writing non-Java
> consumers easy. We are fixing that as part of the consumer work going
> on right now (which moves a lot of the functionality in the Java
> consumer to the server side).
> 
> But apart from this I think there may be a lot more we can do to make
> the client ecosystem better.
> 
> Here are some concrete ideas. If anyone has additional ideas please
> reply to this thread and share them. If you are interested in picking
> any of these up, please do.
> 
> 1. The most obvious way to improve the ecosystem is to help work on
> clients. This doesn't necessarily mean writing new clients, since in
> many cases we already have a client in a given language. I think we
> should do whatever we can to incentivize fewer, better clients rather
> than many half-working ones. However, we are now working on
> server-side consumer coordination, so it should become possible to
> write much simpler consumers.
> 
> 2. It would be great if someone put together a mailing list just for
> client developers to share tips, tricks, problems, and so on. We can
> make sure all the main contributors are on this too. I think this could be
> a forum for directing improvements in this area.
> 
> 3. Help improve the documentation on how to implement a client. We
> have tried to make the protocol spec not just a dry document but also
> have it share best practices, rationale, and intentions. I think this
> could potentially be even better as there is really a range of options
> from a very simple quick implementation to a more complex highly
> optimized version. It would be good to really document some of the
> options and tradeoffs.
> 
> 4. Come up with a standard way of documenting the features of clients.
> In an ideal world it would be possible to get the same information
> (author, language, feature set, download link, source code, etc.) for
> all clients. It would be great to standardize the documentation for
> the client as well. For example, having one or two basic examples that
> are repeated for every client in a standardized way. This would let
> someone come to the Kafka site who is not a Java developer, and click
> on the link for their language and view examples of interacting with
> Kafka in the language they know using the client they would eventually
> use.
> 
> 5. Build a Kafka Client Compatibility Kit (KCCK) :-) The idea is this:
> anyone who wants to implement a client would implement a simple
> command line program with a set of standardized options. The
> compatibility kit would be a standard set of scripts that run their
> client using this command line driver and validate its behavior. E.g.
> for a producer it would test that it can correctly send messages, that
> the ordering is retained, and that the client correctly handles
> reconnection, metadata refresh, and compression. The output would
> be a list of the features that passed and are certified, and perhaps basic
> performance information. This would be an easy way to help client
> developers write correct clients, as well as having a standardized
> comparison for the clients that says that they work correctly.
> 
> -Jay
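
To make item 5 a bit more concrete, here is a rough sketch of what a
standardized command line driver and two of the kit's checks might look like.
Everything named here (the option names, `ordering_retained`, `certify`) is
made up for illustration and is not an existing Kafka tool or API; the kit
scripts would call each client's driver with the same options and then
validate what came back.

```python
import argparse


def build_parser():
    """Hypothetical standardized options every client driver would accept."""
    parser = argparse.ArgumentParser(prog="kcck-driver")
    parser.add_argument("--mode", choices=["produce", "consume"], required=True)
    parser.add_argument("--broker-list", required=True)
    parser.add_argument("--topic", required=True)
    parser.add_argument("--num-messages", type=int, default=100)
    parser.add_argument("--compression", choices=["none", "gzip", "snappy"],
                        default="none")
    return parser


def ordering_retained(sent, received):
    """Return True if every sent message shows up in `received` in send order.

    Extra messages in `received` (for example duplicates caused by retries)
    are tolerated; only the relative order of the sent messages is checked.
    """
    remaining = iter(received)
    return all(any(msg == got for got in remaining) for msg in sent)


def certify(results):
    """Turn a {feature name: passed?} mapping into the certified feature list."""
    return sorted(name for name, passed in results.items() if passed)
```

A kit script could then report something like
`certify({"send": True, "ordering": ordering_retained(sent, received)})`
for each client, which gives the standardized comparison described above.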


Re: Improving the Kafka client ecosystem

2014-07-20 Thread Philip O'Toole


On Saturday, July 19, 2014 4:41 PM, Jay Kreps  wrote:

>> The architecture design patterns idea is fantastic. That would be a
>> great thing to do.

To that end, how important do you think the 0.7 release remains? I still have a
bit more practical experience with that version (though that is changing), and
its simplicity is very appealing vis-à-vis the 0.8 series.

Philip




On Fri, Jul 18, 2014 at 11:46 PM, Philip O'Toole
 wrote:
> Thanks Jay -- some good ideas there.
>
> I agree strongly that fewer, more solid, non-Java clients are better than 
> many shallow ones. Interesting that you feel we could do some more work in 
> this area, as I thought it was well served (even if they have proliferated).
>
> One area I would like to see documented better -- and I am considering it myself
> -- is a collection of Kafka "Architectural Design Patterns", all in one
> place. For example, how to use Kafka to build a staging and test environment
> (tapping the production flow in a non-destructive manner), how to build
> robust pipelines that read from and write to, say, Apache Storm, how to deploy a
> cluster in EC2 (the interaction with Availability Zones), topic vs. partition 
> demuxing, etc, etc. I've yet to see a nice consolidation of this information 
> -- it would not really be about coding, but system design. Ideally it would 
> be reviewed by you committers, but someone else would do the work.
>
> Philip
>
>
> ---
> www.philipotoole.com
>
>
>
> On Friday, July 18, 2014 3:58 PM, Jay Kreps  wrote:
>
>
>
> Basically my thought with getting a separate mailing list was to have
> a place specifically to discuss issues around clients. I don't see a
> lot of discussion about them on the main list. I thought perhaps this
> was because people don't like to ask questions which are about
> adjacent projects/code bases. But basically whatever will lead to a
> robust discussion, bug tracking, etc on clients.
>
> -Jay
>
>
> On Fri, Jul 18, 2014 at 3:49 PM, Jun Rao  wrote:
>> Another important part of eco-system could be around the adaptors of
>> getting data from other systems into Kafka and vice versa. So, for the
>> ingestion part, this can include things like getting data from mysql,
>> syslog, apache server log, etc. For the egress part, this can include
>> putting Kafka data into HDFS, S3, etc.
>>
>> Will a separate mailing list be convenient? Could we just use the Kafka
>> mailing list?
>>
>> Thanks,
>>
>> Jun
>>
>>
>> On Fri, Jul 18, 2014 at 2:34 PM, Jay Kreps  wrote:
>>
>>> A question was asked in another thread about what was an effective way
>>> to contribute to the Kafka project for people who weren't very
>>> enthusiastic about writing Java/Scala code.
>>>
>>> I wanted to kind of advocate for an area I think is really important
>>> and not as good as it could be--the client ecosystem. I think our goal
>>> is to make Kafka effective as a general purpose, centralized, data
>>> subscription system. This vision only really works if all your
>>> applications are able to integrate easily, whatever language they are
>>> in.
>>>
>>> We have a number of pretty good non-java producers. We have been
>>> lacking the features on the server-side to make writing non-java
>>> consumers easy. We are fixing that right now as part of the consumer
>>> work going on right now (which moves a lot of the functionality in the
>>> java consumer to the server side).
>>>
>>> But apart from this I think there may be a lot more we can do to make
>>> the client ecosystem better.
>>>
>>> Here are some concrete ideas. If anyone has additional ideas please
>>> reply to this thread and share them. If you are interested in picking
>>> any of these up, please do.
>>>
>>> 1. The most obvious way to improve the ecosystem is to help work on
>>> clients. This doesn't necessarily mean writing new clients, since in
>>> many cases we already have a client in a given language. I think any
>>> way we can incentivize fewer, better clients rather than many
>>> half-working clients we should do. However we are working now on the
>>> server-side consumer co-ordination so it should now be possible to
>>> write much simpler consumers.
>>>
>>> 2. It would be great if someone put together a mailing list just for
>>> client developers to share tips, tricks, problems, and so on. We can
>>> make sure all the main contributors are on this too. I think this could be
>>> a forum for kind of directing improvements in this area.
>>>
>>> 3. Help improve the documentation on how to implement a client. We
>>> have tried to make the protocol spec not just a dry document but also
>>> have it share best practices, rationale, and intentions. I think this
>>> could potentially be even better as there is really a range of options
>>> from a very simple quick implementation to a more complex highly
>>> optimized version. It would be good to really document some of the
>>> options and tradeoffs.
>>>
>>> 4. Come up with a standard way of documenting the features o

Re: Improving the Kafka client ecosystem

2014-07-19 Thread Mark Roberts
Hi all,

As a client engineer on the Python client, I would really appreciate a
separate mailing list for client implementation discussion and a language
agnostic test suite.  What might also be really useful is an enumerated
list of error conditions and the expected behavior to come out of them.
For instance, what do you do if you have a multi-partition producer that
tries to produce to a non-existent topic?  The metadata request is going to
return nothing, which means you don't know where to send the request at
all.  You could just arbitrarily send it to a broker I guess?
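
To illustrate the kind of entry such an enumerated list could contain, here is
a minimal sketch of one possible expected behavior for the non-existent topic
case: refresh metadata a bounded number of times, then fail loudly instead of
guessing a broker. This is an assumption for discussion, not agreed client
behavior; `fetch_metadata`, `partitions_for`, and `UnknownTopicError` are
hypothetical names, and a real client would also back off between refreshes.

```python
class UnknownTopicError(Exception):
    """Raised when metadata never lists any partitions for the topic."""


def partitions_for(topic, fetch_metadata, max_refreshes=3):
    """Look up the partitions of `topic`, refreshing metadata a few times.

    `fetch_metadata` is a hypothetical callable that returns a dict mapping
    topic name to a list of partition ids, standing in for a metadata request.
    """
    for _ in range(max_refreshes):
        partitions = fetch_metadata().get(topic, [])
        if partitions:
            return partitions
    # No partitions are known, so there is no broker to route to. Surface the
    # error to the caller rather than sending to an arbitrary broker.
    raise UnknownTopicError(
        "no partitions for topic %r after %d metadata refreshes"
        % (topic, max_refreshes)
    )
```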

At any rate, I have lots of questions about a formalized "certified client"
process.  I'm not against the idea (in fact quite the opposite), but I'm
concerned that non-Java clients will be constrained purely to the currently
existing Java API in the name of client uniformity and standardization.

-Mark



On Sat, Jul 19, 2014 at 12:30 AM, Timothy Chen  wrote:

> The certified client test suite really will benefit all client
> developers, as writing a Kafka client is often not just about speaking
> the protocol but also about correctly handling all the cases, errors, and
> situations, as well as performance.
>
> From my experience writing a C# client, I definitely feel that a lot of
> test scenarios could be generalized and used for all clients.
>
> I was reviewing another client implementation and there were errors and
> cases it didn't handle. Having a suite that exposes them would keep users
> from running into those problems and then having to determine whether it's
> a client or a server bug, which is sometimes hard to figure out.
>
> Tim
>
> > On Jul 18, 2014, at 3:57 PM, Jay Kreps  wrote:
> >
> > Basically my thought with getting a separate mailing list was to have
> > a place specifically to discuss issues around clients. I don't see a
> > lot of discussion about them on the main list. I thought perhaps this
> > was because people don't like to ask questions which are about
> > adjacent projects/code bases. But basically whatever will lead to a
> > robust discussion, bug tracking, etc on clients.
> >
> > -Jay
> >
> >> On Fri, Jul 18, 2014 at 3:49 PM, Jun Rao  wrote:
> >> Another important part of eco-system could be around the adaptors of
> >> getting data from other systems into Kafka and vice versa. So, for the
> >> ingestion part, this can include things like getting data from mysql,
> >> syslog, apache server log, etc. For the egress part, this can include
> >> putting Kafka data into HDFS, S3, etc.
> >>
> >> Will a separate mailing list be convenient? Could we just use the Kafka
> >> mailing list?
> >>
> >> Thanks,
> >>
> >> Jun
> >>
> >>
> >>> On Fri, Jul 18, 2014 at 2:34 PM, Jay Kreps wrote:
> >>>
> >>> A question was asked in another thread about what was an effective way
> >>> to contribute to the Kafka project for people who weren't very
> >>> enthusiastic about writing Java/Scala code.
> >>>
> >>> I wanted to kind of advocate for an area I think is really important
> >>> and not as good as it could be--the client ecosystem. I think our goal
> >>> is to make Kafka effective as a general purpose, centralized, data
> >>> subscription system. This vision only really works if all your
> >>> applications are able to integrate easily, whatever language they are
> >>> in.
> >>>
> >>> We have a number of pretty good non-java producers. We have been
> >>> lacking the features on the server-side to make writing non-java
> >>> consumers easy. We are fixing that right now as part of the consumer
> >>> work going on right now (which moves a lot of the functionality in the
> >>> java consumer to the server side).
> >>>
> >>> But apart from this I think there may be a lot more we can do to make
> >>> the client ecosystem better.
> >>>
> >>> Here are some concrete ideas. If anyone has additional ideas please
> >>> reply to this thread and share them. If you are interested in picking
> >>> any of these up, please do.
> >>>
> >>> 1. The most obvious way to improve the ecosystem is to help work on
> >>> clients. This doesn't necessarily mean writing new clients, since in
> >>> many cases we already have a client in a given language. I think any
> >>> way we can incentivize fewer, better clients rather than many
> >>> half-working clients we should do. However we are working now on the
> >>> server-side consumer co-ordination so it should now be possible to
> >>> write much simpler consumers.
> >>>
> >>> 2. It would be great if someone put together a mailing list just for
> >>> client developers to share tips, tricks, problems, and so on. We can
> >>> make sure all the main contributors are on this too. I think this could be
> >>> a forum for kind of directing improvements in this area.
> >>>
> >>> 3. Help improve the documentation on how to implement a client. We
> >>> have tried to make the protocol spec not just a dry document but also
> >>> have it share best practices, rationale, and intentions. I think this
> >>> could potentially be even better as there is rea

Re: Improving the Kafka client ecosystem

2014-07-19 Thread Jay Kreps
Hey Philip,

Yeah, I think we have actually done pretty well at getting reasonably
solid clients in a bunch of languages. I just think it is an important
area.

The architecture design patterns idea is fantastic. That would be a
great thing to do.

-Jay



On Fri, Jul 18, 2014 at 11:46 PM, Philip O'Toole
 wrote:
> Thanks Jay -- some good ideas there.
>
> I agree strongly that fewer, more solid, non-Java clients are better than 
> many shallow ones. Interesting that you feel we could do some more work in 
> this area, as I thought it was well served (even if they have proliferated).
>
> One area I would like to see documented better -- and I am considering it myself
> -- is a collection of Kafka "Architectural Design Patterns", all in one
> place. For example, how to use Kafka to build a staging and test environment
> (tapping the production flow in a non-destructive manner), how to build
> robust pipelines that read from and write to, say, Apache Storm, how to deploy a
> cluster in EC2 (the interaction with Availability Zones), topic vs. partition 
> demuxing, etc, etc. I've yet to see a nice consolidation of this information 
> -- it would not really be about coding, but system design. Ideally it would 
> be reviewed by you committers, but someone else would do the work.
>
> Philip
>
>
> ---
> www.philipotoole.com
>
>
>
> On Friday, July 18, 2014 3:58 PM, Jay Kreps  wrote:
>
>
>
> Basically my thought with getting a separate mailing list was to have
> a place specifically to discuss issues around clients. I don't see a
> lot of discussion about them on the main list. I thought perhaps this
> was because people don't like to ask questions which are about
> adjacent projects/code bases. But basically whatever will lead to a
> robust discussion, bug tracking, etc on clients.
>
> -Jay
>
>
> On Fri, Jul 18, 2014 at 3:49 PM, Jun Rao  wrote:
>> Another important part of eco-system could be around the adaptors of
>> getting data from other systems into Kafka and vice versa. So, for the
>> ingestion part, this can include things like getting data from mysql,
>> syslog, apache server log, etc. For the egress part, this can include
>> putting Kafka data into HDFS, S3, etc.
>>
>> Will a separate mailing list be convenient? Could we just use the Kafka
>> mailing list?
>>
>> Thanks,
>>
>> Jun
>>
>>
>> On Fri, Jul 18, 2014 at 2:34 PM, Jay Kreps  wrote:
>>
>>> A question was asked in another thread about what was an effective way
>>> to contribute to the Kafka project for people who weren't very
>>> enthusiastic about writing Java/Scala code.
>>>
>>> I wanted to kind of advocate for an area I think is really important
>>> and not as good as it could be--the client ecosystem. I think our goal
>>> is to make Kafka effective as a general purpose, centralized, data
>>> subscription system. This vision only really works if all your
>>> applications are able to integrate easily, whatever language they are
>>> in.
>>>
>>> We have a number of pretty good non-java producers. We have been
>>> lacking the features on the server-side to make writing non-java
>>> consumers easy. We are fixing that right now as part of the consumer
>>> work going on right now (which moves a lot of the functionality in the
>>> java consumer to the server side).
>>>
>>> But apart from this I think there may be a lot more we can do to make
>>> the client ecosystem better.
>>>
>>> Here are some concrete ideas. If anyone has additional ideas please
>>> reply to this thread and share them. If you are interested in picking
>>> any of these up, please do.
>>>
>>> 1. The most obvious way to improve the ecosystem is to help work on
>>> clients. This doesn't necessarily mean writing new clients, since in
>>> many cases we already have a client in a given language. I think any
>>> way we can incentivize fewer, better clients rather than many
>>> half-working clients we should do. However we are working now on the
>>> server-side consumer co-ordination so it should now be possible to
>>> write much simpler consumers.
>>>
>>> 2. It would be great if someone put together a mailing list just for
>>> client developers to share tips, tricks, problems, and so on. We can
>>> make sure all the main contributors are on this too. I think this could be
>>> a forum for kind of directing improvements in this area.
>>>
>>> 3. Help improve the documentation on how to implement a client. We
>>> have tried to make the protocol spec not just a dry document but also
>>> have it share best practices, rationale, and intentions. I think this
>>> could potentially be even better as there is really a range of options
>>> from a very simple quick implementation to a more complex highly
>>> optimized version. It would be good to really document some of the
>>> options and tradeoffs.
>>>
>>> 4. Come up with a standard way of documenting the features of clients.
>>> In an ideal world it would be possible to get the same information
>>> (author, language, feature set, downl

Re: Improving the Kafka client ecosystem

2014-07-19 Thread Timothy Chen
The certified client test suite really will benefit all client developers,
as writing a Kafka client is often not just about speaking the protocol but
also about correctly handling all the cases, errors, and situations, as well
as performance.

From my experience writing a C# client, I definitely feel that a lot of test
scenarios could be generalized and used for all clients.

I was reviewing another client implementation and there were errors and cases
it didn't handle. Having a suite that exposes them would keep users from
running into those problems and then having to determine whether it's a client
or a server bug, which is sometimes hard to figure out.

Tim

> On Jul 18, 2014, at 3:57 PM, Jay Kreps  wrote:
> 
> Basically my thought with getting a separate mailing list was to have
> a place specifically to discuss issues around clients. I don't see a
> lot of discussion about them on the main list. I thought perhaps this
> was because people don't like to ask questions which are about
> adjacent projects/code bases. But basically whatever will lead to a
> robust discussion, bug tracking, etc on clients.
> 
> -Jay
> 
>> On Fri, Jul 18, 2014 at 3:49 PM, Jun Rao  wrote:
>> Another important part of eco-system could be around the adaptors of
>> getting data from other systems into Kafka and vice versa. So, for the
>> ingestion part, this can include things like getting data from mysql,
>> syslog, apache server log, etc. For the egress part, this can include
>> putting Kafka data into HDFS, S3, etc.
>> 
>> Will a separate mailing list be convenient? Could we just use the Kafka
>> mailing list?
>> 
>> Thanks,
>> 
>> Jun
>> 
>> 
>>> On Fri, Jul 18, 2014 at 2:34 PM, Jay Kreps  wrote:
>>> 
>>> A question was asked in another thread about what was an effective way
>>> to contribute to the Kafka project for people who weren't very
>>> enthusiastic about writing Java/Scala code.
>>> 
>>> I wanted to kind of advocate for an area I think is really important
>>> and not as good as it could be--the client ecosystem. I think our goal
>>> is to make Kafka effective as a general purpose, centralized, data
>>> subscription system. This vision only really works if all your
>>> applications are able to integrate easily, whatever language they are
>>> in.
>>> 
>>> We have a number of pretty good non-java producers. We have been
>>> lacking the features on the server-side to make writing non-java
>>> consumers easy. We are fixing that right now as part of the consumer
>>> work going on right now (which moves a lot of the functionality in the
>>> java consumer to the server side).
>>> 
>>> But apart from this I think there may be a lot more we can do to make
>>> the client ecosystem better.
>>> 
>>> Here are some concrete ideas. If anyone has additional ideas please
>>> reply to this thread and share them. If you are interested in picking
>>> any of these up, please do.
>>> 
>>> 1. The most obvious way to improve the ecosystem is to help work on
>>> clients. This doesn't necessarily mean writing new clients, since in
>>> many cases we already have a client in a given language. I think any
>>> way we can incentivize fewer, better clients rather than many
>>> half-working clients we should do. However we are working now on the
>>> server-side consumer co-ordination so it should now be possible to
>>> write much simpler consumers.
>>> 
>>> 2. It would be great if someone put together a mailing list just for
>>> client developers to share tips, tricks, problems, and so on. We can
>>> make sure all the main contributors are on this too. I think this could be
>>> a forum for kind of directing improvements in this area.
>>> 
>>> 3. Help improve the documentation on how to implement a client. We
>>> have tried to make the protocol spec not just a dry document but also
>>> have it share best practices, rationale, and intentions. I think this
>>> could potentially be even better as there is really a range of options
>>> from a very simple quick implementation to a more complex highly
>>> optimized version. It would be good to really document some of the
>>> options and tradeoffs.
>>> 
>>> 4. Come up with a standard way of documenting the features of clients.
>>> In an ideal world it would be possible to get the same information
>>> (author, language, feature set, download link, source code, etc) for
>>> all clients. It would be great to standardize the documentation for
>>> the client as well. For example having one or two basic examples that
>>> are repeated for every client in a standardized way. This would let
>>> someone come to the Kafka site who is not a java developer, and click
>>> on the link for their language and view examples of interacting with
>>> Kafka in the language they know using the client they would eventually
>>> use.
>>> 
>>> 5. Build a Kafka Client Compatibility Kit (KCCK) :-) The idea is this:
>>> anyone who wants to implement a client would implement a simple
>>> command line program with a set of standardized options. The
>>> compatibil

Re: Improving the Kafka client ecosystem

2014-07-18 Thread Philip O'Toole
Thanks Jay -- some good ideas there.

I agree strongly that fewer, more solid, non-Java clients are better than many 
shallow ones. Interesting that you feel we could do some more work in this 
area, as I thought it was well served (even if they have proliferated).

One area I would like to see documented better -- and I am considering it myself
-- is a collection of Kafka "Architectural Design Patterns", all in one
place. For example, how to use Kafka to build a staging and test environment
(tapping the production flow in a non-destructive manner), how to build robust
pipelines that read from and write to, say, Apache Storm, how to deploy a cluster in
EC2 (the interaction with Availability Zones), topic vs. partition demuxing, 
etc, etc. I've yet to see a nice consolidation of this information -- it would 
not really be about coding, but system design. Ideally it would be reviewed by 
you committers, but someone else would do the work.

Philip

 
--- 
www.philipotoole.com 



On Friday, July 18, 2014 3:58 PM, Jay Kreps  wrote:
 


Basically my thought with getting a separate mailing list was to have
a place specifically to discuss issues around clients. I don't see a
lot of discussion about them on the main list. I thought perhaps this
was because people don't like to ask questions which are about
adjacent projects/code bases. But basically whatever will lead to a
robust discussion, bug tracking, etc on clients.

-Jay


On Fri, Jul 18, 2014 at 3:49 PM, Jun Rao  wrote:
> Another important part of eco-system could be around the adaptors of
> getting data from other systems into Kafka and vice versa. So, for the
> ingestion part, this can include things like getting data from mysql,
> syslog, apache server log, etc. For the egress part, this can include
> putting Kafka data into HDFS, S3, etc.
>
> Will a separate mailing list be convenient? Could we just use the Kafka
> mailing list?
>
> Thanks,
>
> Jun
>
>
> On Fri, Jul 18, 2014 at 2:34 PM, Jay Kreps  wrote:
>
>> A question was asked in another thread about what was an effective way
>> to contribute to the Kafka project for people who weren't very
>> enthusiastic about writing Java/Scala code.
>>
>> I wanted to kind of advocate for an area I think is really important
>> and not as good as it could be--the client ecosystem. I think our goal
>> is to make Kafka effective as a general purpose, centralized, data
>> subscription system. This vision only really works if all your
>> applications are able to integrate easily, whatever language they are
>> in.
>>
>> We have a number of pretty good non-java producers. We have been
>> lacking the features on the server-side to make writing non-java
>> consumers easy. We are fixing that right now as part of the consumer
>> work going on right now (which moves a lot of the functionality in the
>> java consumer to the server side).
>>
>> But apart from this I think there may be a lot more we can do to make
>> the client ecosystem better.
>>
>> Here are some concrete ideas. If anyone has additional ideas please
>> reply to this thread and share them. If you are interested in picking
>> any of these up, please do.
>>
>> 1. The most obvious way to improve the ecosystem is to help work on
>> clients. This doesn't necessarily mean writing new clients, since in
>> many cases we already have a client in a given language. I think any
>> way we can incentivize fewer, better clients rather than many
>> half-working clients we should do. However we are working now on the
>> server-side consumer co-ordination so it should now be possible to
>> write much simpler consumers.
>>
>> 2. It would be great if someone put together a mailing list just for
>> client developers to share tips, tricks, problems, and so on. We can
>> make sure all the main contributors are on this too. I think this could be
>> a forum for kind of directing improvements in this area.
>>
>> 3. Help improve the documentation on how to implement a client. We
>> have tried to make the protocol spec not just a dry document but also
>> have it share best practices, rationale, and intentions. I think this
>> could potentially be even better as there is really a range of options
>> from a very simple quick implementation to a more complex highly
>> optimized version. It would be good to really document some of the
>> options and tradeoffs.
>>
>> 4. Come up with a standard way of documenting the features of clients.
>> In an ideal world it would be possible to get the same information
>> (author, language, feature set, download link, source code, etc) for
>> all clients. It would be great to standardize the documentation for
>> the client as well. For example having one or two basic examples that
>> are repeated for every client in a standardized way. This would let
>> someone come to the Kafka site who is not a java developer, and click
>> on the link for their language and view examples of interacting with
>> Kafka in the language they know using the 

Re: Improving the Kafka client ecosystem

2014-07-18 Thread Jay Kreps
Basically my thought with getting a separate mailing list was to have
a place specifically to discuss issues around clients. I don't see a
lot of discussion about them on the main list. I thought perhaps this
was because people don't like to ask questions which are about
adjacent projects/code bases. But basically whatever will lead to a
robust discussion, bug tracking, etc on clients.

-Jay

On Fri, Jul 18, 2014 at 3:49 PM, Jun Rao  wrote:
> Another important part of eco-system could be around the adaptors of
> getting data from other systems into Kafka and vice versa. So, for the
> ingestion part, this can include things like getting data from mysql,
> syslog, apache server log, etc. For the egress part, this can include
> putting Kafka data into HDFS, S3, etc.
>
> Will a separate mailing list be convenient? Could we just use the Kafka
> mailing list?
>
> Thanks,
>
> Jun
>
>
> On Fri, Jul 18, 2014 at 2:34 PM, Jay Kreps  wrote:
>
>> A question was asked in another thread about what was an effective way
>> to contribute to the Kafka project for people who weren't very
>> enthusiastic about writing Java/Scala code.
>>
>> I wanted to kind of advocate for an area I think is really important
>> and not as good as it could be--the client ecosystem. I think our goal
>> is to make Kafka effective as a general purpose, centralized, data
>> subscription system. This vision only really works if all your
>> applications are able to integrate easily, whatever language they are
>> in.
>>
>> We have a number of pretty good non-java producers. We have been
>> lacking the features on the server-side to make writing non-java
>> consumers easy. We are fixing that right now as part of the consumer
>> work going on right now (which moves a lot of the functionality in the
>> java consumer to the server side).
>>
>> But apart from this I think there may be a lot more we can do to make
>> the client ecosystem better.
>>
>> Here are some concrete ideas. If anyone has additional ideas please
>> reply to this thread and share them. If you are interested in picking
>> any of these up, please do.
>>
>> 1. The most obvious way to improve the ecosystem is to help work on
>> clients. This doesn't necessarily mean writing new clients, since in
>> many cases we already have a client in a given language. I think any
>> way we can incentivize fewer, better clients rather than many
>> half-working clients we should do. However we are working now on the
>> server-side consumer co-ordination so it should now be possible to
>> write much simpler consumers.
>>
>> 2. It would be great if someone put together a mailing list just for
>> client developers to share tips, tricks, problems, and so on. We can
>> make sure all the main contributors are on this too. I think this could be
>> a forum for kind of directing improvements in this area.
>>
>> 3. Help improve the documentation on how to implement a client. We
>> have tried to make the protocol spec not just a dry document but also
>> have it share best practices, rationale, and intentions. I think this
>> could potentially be even better as there is really a range of options
>> from a very simple quick implementation to a more complex highly
>> optimized version. It would be good to really document some of the
>> options and tradeoffs.
>>
>> 4. Come up with a standard way of documenting the features of clients.
>> In an ideal world it would be possible to get the same information
>> (author, language, feature set, download link, source code, etc) for
>> all clients. It would be great to standardize the documentation for
>> the client as well. For example having one or two basic examples that
>> are repeated for every client in a standardized way. This would let
>> someone come to the Kafka site who is not a java developer, and click
>> on the link for their language and view examples of interacting with
>> Kafka in the language they know using the client they would eventually
>> use.
>>
>> 5. Build a Kafka Client Compatibility Kit (KCCK) :-) The idea is this:
>> anyone who wants to implement a client would implement a simple
>> command line program with a set of standardized options. The
>> compatibility kit would be a standard set of scripts that run their
>> client using this command line driver and validate its behavior. E.g.
>> for a producer it would test that it can correctly send messages, that
>> the ordering is retained, and that the client correctly handles
>> reconnection, metadata refresh, and compression. The output would
>> be a list of the features that passed and are certified, and perhaps basic
>> performance information. This would be an easy way to help client
>> developers write correct clients, as well as having a standardized
>> comparison for the clients that says that they work correctly.
>>
>> -Jay
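
For item 4 in the list above, the "same information for all clients" could be
captured as a small record that every client fills in. This is only a sketch
under assumptions: the field names mirror the ones listed in item 4, while
`ClientListing` and `missing_fields` are hypothetical names and not part of
any existing Kafka site tooling.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class ClientListing:
    """One entry in a hypothetical standardized client directory."""
    name: str
    language: str
    author: str
    download_link: str
    source_code: str
    features: list = field(default_factory=list)


def missing_fields(listing):
    """Return the names of fields left empty, in declaration order."""
    return [key for key, value in asdict(listing).items() if not value]
```

A site build step could then refuse to publish, or visibly flag, any listing
for which `missing_fields` returns a non-empty list, so every client page
carries the same information.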
>>


Re: Improving the Kafka client ecosystem

2014-07-18 Thread Jun Rao
Another important part of eco-system could be around the adaptors of
getting data from other systems into Kafka and vice versa. So, for the
ingestion part, this can include things like getting data from mysql,
syslog, apache server log, etc. For the egress part, this can include
putting Kafka data into HDFS, S3, etc.

Will a separate mailing list be convenient? Could we just use the Kafka
mailing list?

Thanks,

Jun


On Fri, Jul 18, 2014 at 2:34 PM, Jay Kreps  wrote:

> A question was asked in another thread about what was an effective way
> to contribute to the Kafka project for people who weren't very
> enthusiastic about writing Java/Scala code.
>
> I wanted to kind of advocate for an area I think is really important
> and not as good as it could be--the client ecosystem. I think our goal
> is to make Kafka effective as a general purpose, centralized, data
> subscription system. This vision only really works if all your
> applications are able to integrate easily, whatever language they are
> in.
>
> We have a number of pretty good non-java producers. We have been
> lacking the features on the server-side to make writing non-java
> consumers easy. We are fixing that right now as part of the consumer
> work going on right now (which moves a lot of the functionality in the
> java consumer to the server side).
>
> But apart from this I think there may be a lot more we can do to make
> the client ecosystem better.
>
> Here are some concrete ideas. If anyone has additional ideas please
> reply to this thread and share them. If you are interested in picking
> any of these up, please do.
>
> 1. The most obvious way to improve the ecosystem is to help work on
> clients. This doesn't necessarily mean writing new clients, since in
> many cases we already have a client in a given language. I think any
> way we can incentivize fewer, better clients rather than many
> half-working clients we should do. However we are working now on the
> server-side consumer co-ordination so it should now be possible to
> write much simpler consumers.
>
> 2. It would be great if someone put together a mailing list just for
> client developers to share tips, tricks, problems, and so on. We can
> make sure all the main contributors on this too. I think this could be
> a forum for kind of directing improvements in this area.
>
> 3. Help improve the documentation on how to implement a client. We
> have tried to make the protocol spec not just a dry document but also
> have it share best practices, rationale, and intentions. I think this
> could potentially be even better as there is really a range of options
> from a very simple quick implementation to a more complex highly
> optimized version. It would be good to really document some of the
> options and tradeoffs.
>
> 4. Come up with a standard way of documenting the features of clients.
> In an ideal world it would be possible to get the same information
> (author, language, feature set, download link, source code, etc) for
> all clients. It would be great to standardize the documentation for
> the client as well. For example having one or two basic examples that
> are repeated for every client in a standardized way. This would let
> someone come to the Kafka site who is not a java developer, and click
> on the link for their language and view examples of interacting with
> Kafka in the language they know using the client they would eventually
> use.
>
> 5. Build a Kafka Client Compatibility Kit (KCCK) :-) The idea is this:
> anyone who wants to implement a client would implement a simple
> command line program with a set of standardized options. The
> compatibility kit would be a standard set of scripts that ran their
> client using this command line driver and validate its behavior. E.g.
> for a producer it would test that it correctly can send messages, that
> the ordering is retained, that the client correctly handles
> reconnection and metadata refresh, and compression. The output would
> be a list of features that passed are certified, and perhaps basic
> performance information. This would be an easy way to help client
> developers write correct clients, as well as having a standardized
> comparison for the clients that says that they work correctly.
>
> -Jay
>


Improving the Kafka client ecosystem

2014-07-18 Thread Jay Kreps
A question was asked in another thread about what was an effective way
to contribute to the Kafka project for people who weren't very
enthusiastic about writing Java/Scala code.

I wanted to advocate for an area I think is really important and not
as good as it could be: the client ecosystem. I think our goal is to
make Kafka effective as a general-purpose, centralized data
subscription system. This vision only really works if all your
applications are able to integrate easily, whatever language they are
in.

We have a number of pretty good non-Java producers. We have been
lacking the features on the server side to make writing non-Java
consumers easy. We are fixing that as part of the consumer work going
on right now (which moves a lot of the functionality in the Java
consumer to the server side).

But apart from this I think there may be a lot more we can do to make
the client ecosystem better.

Here are some concrete ideas. If anyone has additional ideas please
reply to this thread and share them. If you are interested in picking
any of these up, please do.

1. The most obvious way to improve the ecosystem is to help work on
clients. This doesn't necessarily mean writing new clients, since in
many cases we already have a client in a given language. I think we
should do whatever we can to incentivize fewer, better clients rather
than many half-working clients. That said, we are now working on
server-side consumer co-ordination, so it should become possible to
write much simpler consumers.

2. It would be great if someone put together a mailing list just for
client developers to share tips, tricks, problems, and so on. We can
make sure all the main contributors are on it too. I think this could
be a forum for directing improvements in this area.

3. Help improve the documentation on how to implement a client. We
have tried to make the protocol spec not just a dry document but also
a place to share best practices, rationale, and intentions. I think
this could be even better, as there is a range of options from a very
simple, quick implementation to a more complex, highly optimized
version. It would be good to document some of the options and
tradeoffs.

4. Come up with a standard way of documenting the features of clients.
In an ideal world it would be possible to get the same information
(author, language, feature set, download link, source code, etc.) for
all clients. It would be great to standardize the documentation for
the clients as well, for example by having one or two basic examples
that are repeated for every client in a standardized way. This would
let someone who is not a Java developer come to the Kafka site, click
on the link for their language, and view examples of interacting with
Kafka in the language they know, using the client they would
eventually use.

5. Build a Kafka Client Compatibility Kit (KCCK) :-) The idea is this:
anyone who wants to implement a client would implement a simple
command line program with a set of standardized options. The
compatibility kit would be a standard set of scripts that run their
client using this command line driver and validate its behavior. E.g.
for a producer it would test that it can correctly send messages, that
ordering is retained, and that the client correctly handles
reconnection, metadata refresh, and compression. The output would be a
list of the features that passed and are therefore certified, and
perhaps basic performance information. This would be an easy way to
help client developers write correct clients, as well as giving us a
standardized comparison showing which clients work correctly.
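
To make idea 5 a little more concrete, here is a rough sketch of what
one compatibility-kit script might look like. Nothing here is an
agreed design: the driver convention (read messages on stdin, print
whatever was consumed back on stdout, one message per line) and the
names run_driver, certify, and CHECKS are all assumptions made up for
illustration. It only covers delivery and ordering; reconnection,
metadata refresh, and compression would need a real broker to test.

```python
import subprocess
from typing import Callable, Dict, List

# Hypothetical driver convention (an assumption, not a Kafka standard):
# the client's command-line driver reads one message per line on stdin,
# produces them, consumes them back, and prints them one per line.


def run_driver(command: List[str], messages: List[str],
               timeout: float = 30.0) -> List[str]:
    """Run a client's driver, feed it messages, return its output lines."""
    result = subprocess.run(
        command,
        input="".join(m + "\n" for m in messages),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # a non-zero exit status raises CalledProcessError
    )
    return result.stdout.splitlines()


def check_delivery(sent: List[str], received: List[str]) -> bool:
    """Every message arrived exactly once, in any order."""
    return sorted(sent) == sorted(received)


def check_ordering(sent: List[str], received: List[str]) -> bool:
    """Messages arrived in exactly the order they were sent."""
    return sent == received


# Feature name -> check function; a real kit would have many more.
CHECKS: Dict[str, Callable[[List[str], List[str]], bool]] = {
    "delivery": check_delivery,
    "ordering": check_ordering,
}


def certify(sent: List[str], received: List[str]) -> Dict[str, bool]:
    """Return a pass/fail report for each feature."""
    return {name: check(sent, received) for name, check in CHECKS.items()}
```

A kit script would then call something like
certify(msgs, run_driver(["./my-client-driver"], msgs)) and publish the
resulting report, where "./my-client-driver" stands in for whatever
program a client author provides.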

-Jay