Re: Kafka 0.9 -> consumer.poll() occasionally returns 0 elements

2016-01-26 Thread Krzysztof Ciesielski
Jason,

My OS/JVM is OS X 10.11.3, JDK 1.8.0_40

— 
Krzysztof
On 26 January 2016 at 19:04:58, Jason Gustafson (ja...@confluent.io) wrote:

Hey Krzysztof,  

So far I haven't had any luck figuring out the cause of the 5 second pause,  
but I've reproduced it with the old consumer on 0.8.2, so that rules out  
anything specific to the new consumer. Can you tell me which OS/JVM you're  
seeing it with? Also, can you try changing the "receive.buffer.bytes"  
consumer configuration to "65536"? This was the default for the old  
consumer and I haven't been able to reproduce the problem when it is set.  

-Jason  
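The workaround Jason suggests maps onto the new consumer's configuration like this (a minimal sketch; the broker address, group id, and deserializers are placeholders, not values from the thread):

```java
import java.util.Properties;

public class ConsumerConfigSketch {
    public static Properties consumerProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker
        props.put("group.id", "test-group");              // placeholder group
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        // The old consumer's default socket receive buffer; setting it
        // explicitly is the workaround reported to avoid the 5-second pause.
        props.put("receive.buffer.bytes", "65536");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(consumerProps().getProperty("receive.buffer.bytes"));
    }
}
```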


Re: facade libs for kafka?

2016-01-26 Thread Krzysztof Ciesielski
Hi Andrew,

I’m the main maintainer of Reactive-Kafka which wraps Kafka as a sink/source of 
a Reactive Stream. Maybe it will suit your needs:
https://github.com/softwaremill/reactive-kafka

A Java API is also available.

— 
Bests,
Krzysiek
SoftwareMill
On 26 January 2016 at 22:10:13, Andrew Pennebaker (andrew.penneba...@gmail.com) 
wrote:

The Apache Kafka Java library requires an inordinate amount of code to  
send/receive messages. Has anyone thought of writing a wrapper library to  
make Kafka easier to use?  

* Assume more default values  
* Easier access to message iterators / listening-reacting loops  
* Consumer thread pools  
* Easy topic-group offset resets, for clearing old messages before tests run  

--  
Cheers,  
Andrew  


Re: Kafka 0.9 -> consumer.poll() occasionally returns 0 elements

2016-01-24 Thread Krzysztof Ciesielski
Yes, it's exactly 5 seconds on every machine. Sure, I'll open a JIRA.
On Jan 22, 2016 7:24 AM, "Jason Gustafson" <ja...@confluent.io> wrote:

> Hi Krzysztof,
>
> This is definitely weird. I see the data in the broker's send queue, but
> there's a delay of 5 seconds before it's sent to the client. Can you create
> a JIRA?
>
> Thanks,
> Jason
>
>
>
> On Thu, Jan 21, 2016 at 11:30 AM, Samya Maiti <samya.maiti2...@gmail.com>
> wrote:
>
> > +1, facing same issue.
> > -Sam
> > > On 22-Jan-2016, at 12:16 am, Krzysztof Ciesielski <
> > krzysztof.ciesiel...@softwaremill.com> wrote:
> > >
> > > Hello, I'm running into an issue with the new consumer in Kafka
> 0.9.0.0.
> > > Here's a runnable gist illustrating the problem:
> > > https://gist.github.com/kciesielski/054bb4359a318aa17561 (requires
> > Kafka on
> > > localhost:9092)
> > >
> > > Scenario description:
> > > First, a producer writes 50 elements into a topic
> > > Then, a consumer starts to read, polling in a loop.
> > > When "max.partition.fetch.bytes" is set to a relatively small value,
> each
> > > "consumer.poll()" returns a batch of messages.
> > > If this value is left as default, the output tends to look like this:
> > >
> > > Poll returned 13793 elements
> > > Poll returned 13793 elements
> > > Poll returned 13793 elements
> > > Poll returned 13793 elements
> > > Poll returned 0 elements
> > > Poll returned 0 elements
> > > Poll returned 0 elements
> > > Poll returned 0 elements
> > > Poll returned 13793 elements
> > > Poll returned 13793 elements
> > >
> > > As we can see, there are weird "gaps" when poll returns 0 elements for
> > some
> > > time. What is the reason for that? Maybe there are some good practices
> > > about setting "max.partition.fetch.bytes" which I don't follow?
> > >
> > > Bests,
> > > Chris
> >
> >
>


Kafka 0.9 -> consumer.poll() occasionally returns 0 elements

2016-01-21 Thread Krzysztof Ciesielski
Hello, I'm running into an issue with the new consumer in Kafka 0.9.0.0.
Here's a runnable gist illustrating the problem:
https://gist.github.com/kciesielski/054bb4359a318aa17561 (requires Kafka on
localhost:9092)

Scenario description:
First, a producer writes 50 elements into a topic.
Then, a consumer starts reading, polling in a loop.
When "max.partition.fetch.bytes" is set to a relatively small value, each
"consumer.poll()" returns a batch of messages.
If this value is left at its default, the output tends to look like this:

Poll returned 13793 elements
Poll returned 13793 elements
Poll returned 13793 elements
Poll returned 13793 elements
Poll returned 0 elements
Poll returned 0 elements
Poll returned 0 elements
Poll returned 0 elements
Poll returned 13793 elements
Poll returned 13793 elements

As we can see, there are weird "gaps" where poll returns 0 elements for some
time. What is the reason for that? Are there some good practices for setting
"max.partition.fetch.bytes" that I'm not following?

Bests,
Chris
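For concreteness, the loop in the gist corresponds roughly to the following sketch against the 0.9 Java client (the topic name, group id, and the 16384-byte fetch size are illustrative placeholders, not the gist's actual values):

```java
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class PollLoopSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "poll-test");              // placeholder group
        props.put("auto.offset.reset", "earliest");
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        // Lowering this yields smaller, steadier batches per poll();
        // the 0.9 default is 1048576 (1 MB) per partition.
        props.put("max.partition.fetch.bytes", "16384");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Collections.singletonList("test-topic")); // placeholder topic
        try {
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(1000);
                System.out.println("Poll returned " + records.count() + " elements");
            }
        } finally {
            consumer.close();
        }
    }
}
```

Running this requires a broker on localhost:9092, so it is not self-verifying here.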


Reactive Kafka now supports Kafka 0.9

2015-12-30 Thread Krzysztof Ciesielski
Hi there,

I'd like to announce that our open source library, Reactive Kafka (which
wraps Akka Streams for Java/Scala around Kafka consumers/producers), now
supports Kafka 0.9. More details:
https://softwaremill.com/reactive-kafka-09/


New Consumer API + Reactive Kafka

2015-12-02 Thread Krzysztof Ciesielski
Hello,

I’m the main maintainer of Reactive Kafka - a wrapper library that provides 
Kafka API as Reactive Streams (https://github.com/softwaremill/reactive-kafka).
I’m a bit concerned about switching to Kafka 0.9 because the new Consumer 
API doesn’t seem to fit this paradigm as well as the old one did. My main 
concerns are:

1. Our current code uses the KafkaIterator and reads messages sequentially, 
then sends them further upstream. In the new API, you cannot control how many 
messages are returned with poll(), so we would need to introduce some kind of 
in-memory buffering.
2. You cannot specify which offsets to commit. Our current native committer 
(https://github.com/softwaremill/reactive-kafka/blob/4055e88c09b8e08aefe8dbbd4748605df5779b07/core/src/main/scala/com/softwaremill/react/kafka/commit/native/NativeCommitter.scala)
 uses the OffsetCommitRequest/Response API and 
kafka.api.ConsumerMetadataRequest/Response for resolving brokers. Switching to 
Kafka 0.9 brings some compilation errors that raise questions.
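The in-memory buffering mentioned in point 1 could be sketched as follows (a hypothetical, simplified adapter using plain strings instead of Kafka's ConsumerRecord, with poll() stubbed as a Supplier — not Reactive Kafka's actual code):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.function.Supplier;

// A simplified one-message-at-a-time view over a batch-returning poll().
// A wrapper that emits single elements downstream would need roughly this
// kind of buffer, since poll() may return zero or many records at once.
public class BufferedPoller {
    private final Deque<String> buffer = new ArrayDeque<>();
    private final Supplier<List<String>> poll;

    public BufferedPoller(Supplier<List<String>> poll) {
        this.poll = poll;
    }

    // Returns the next message, polling for a new batch whenever the buffer
    // runs dry. Spins if poll keeps returning empty batches; a real
    // implementation would apply backpressure instead.
    public String next() {
        while (buffer.isEmpty()) {
            buffer.addAll(poll.get());
        }
        return buffer.poll();
    }
}
```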

My questions are:

1. Do I understand the capabilities and limitations of the new API correctly? :)
2. Can we stay with the old iterator-based client, or is it going to be 
abandoned in future Kafka versions, or discouraged for some reason?
3. Can we still use the OffsetCommitRequest/Response API to commit messages 
manually? If yes, could someone update this example: 
https://cwiki.apache.org/confluence/display/KAFKA/Committing+and+fetching+consumer+offsets+in+Kafka
 or give me a few hints on how to do this with 0.9?

By the way, we’d like our library to appear on the Ecosystem Wiki; I’m not sure 
how to request that officially :)

— 
Bests,
Chris
SoftwareMill

Re: New Consumer API + Reactive Kafka

2015-12-02 Thread Krzysztof Ciesielski
I see, that’s actually a very important point, thanks Jay.
I think we are now very optimistic about updating Reactive Kafka after 
getting all these details :)
I have one more question: in the new client we only have to call 
commitSync(offsets). This is a ‘void’ method, so I assume it commits 
atomically?
In our current native committer, we have quite a lot of additional code for 
retries, reconnecting, or finding a new channel coordinator. I suspect the 
new API handles all of this internally, and if commitSync() fails, the only 
thing we can do is kill the consumer and try to create a new one?

— 
Bests,
Chris
SoftwareMill
On 2 December 2015 at 17:42:24, Jay Kreps (j...@confluent.io) wrote:

It's worth noting that both the old and new consumer are identical in the  
number of records fetched at once and this is bounded by the fetch size and  
the number of partitions you subscribe to. The old consumer held these in  
memory internally and waited for you to ask for them, the new consumer  
immediately gives you what it has. Overall, though, the new consumer gives  
much better control over what is being fetched since it only uses memory  
when you call poll(); the old consumer had a background thread doing this  
which would only stop when it filled up a queue of unprocessed  
chunks...this is a lot harder to predict.  

-Jay  

On Wed, Dec 2, 2015 at 7:13 AM, Gwen Shapira <g...@confluent.io> wrote:  

> On Wed, Dec 2, 2015 at 10:44 PM, Krzysztof Ciesielski <  
> krzysztof.ciesiel...@softwaremill.pl> wrote:  
>  
> > Hello,  
> >  
> > I’m the main maintainer of Reactive Kafka - a wrapper library that  
> > provides Kafka API as Reactive Streams (  
> > https://github.com/softwaremill/reactive-kafka).  
> > I’m a bit concerned about switching to Kafka 0.9 because of the new  
> > Consumer API which doesn’t seem to fit well into this paradigm, comparing  
> > to the old one. My main concerns are:  
> >  
> > 1. Our current code uses the KafkaIterator and reads messages  
> > sequentially, then sends them further upstream. In the new API, you  
> cannot  
> > control how many messages are returned with poll(), so we would need to  
> > introduce some kind of in-memory buffering.  
> > 2. You cannot specify which offsets to commit. Our current native  
> > committer (  
> >  
> https://github.com/softwaremill/reactive-kafka/blob/4055e88c09b8e08aefe8dbbd4748605df5779b07/core/src/main/scala/com/softwaremill/react/kafka/commit/native/NativeCommitter.scala
>   
> )  
> > uses the OffsetCommitRequest/Response API and  
> > kafka.api.ConsumerMetadataRequest/Response for resolving brokers.  
> Switching  
> > to Kafka 0.9 brings some compilation errors that raise questions.  
> >  
> > My questions are:  
> >  
> > 1. Do I understand the capabilities and limitations of new API correctly?  
> > :)  
> >  
>  
> The first limitation is correct - poll() may return any number of records  
> and you need to handle this.  
> The second is not correct - commitSync() can take a map of TopicPartition  
> and Offsets, so you would only commit specific offsets of specific  
> partitions.  
>  
>  
>  
> > 2. Can we stay with the old iterator-based client, or is it going to get  
> > abandoned in future Kafka versions, or discouraged for some reasons?  
> >  
>  
> It is already a bit behind - only the new client includes support for  
> secured clusters (authentication and encryption). It will get deprecated in  
> the future.  
>  
>  
> > 3. Can we still use the OffsetCommitRequest/Response API to commit  
> > messages manually? If yes, could someone update this example:  
> >  
> https://cwiki.apache.org/confluence/display/KAFKA/Committing+and+fetching+consumer+offsets+in+Kafka
>   
> or  
> > give me a few hints on how to do this with 0.9?  
> >  
>  
> AFAIK, the wire protocol and the API are not going anywhere. Hopefully you  
> can use the new objects we provide in the clients jar  
> (org.apache.kafka.common.requests).  
>  
>  
> >  
> > By the way, we’d like our library to appear on the Ecosystem Wiki, I’m  
> not  
> > sure how to request that officially :)  
> >  
>  
> Let us know what to write there and where to link :)  
>  
>  
> >  
> > —  
> > Bests,  
> > Chris  
> > SoftwareMill  
>
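The per-partition commit Gwen describes can be sketched as follows against the 0.9 Java client (the topic, partition, and offset values are illustrative placeholders):

```java
import java.util.Collections;
import java.util.Map;

import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class ManualCommitSketch {
    // Commits an offset for partition 0 of "test-topic" only, leaving the
    // committed offsets of all other partitions untouched. By Kafka's
    // convention, the committed offset is that of the NEXT record to read.
    static void commitSpecificOffset(KafkaConsumer<?, ?> consumer) {
        Map<TopicPartition, OffsetAndMetadata> offsets = Collections.singletonMap(
                new TopicPartition("test-topic", 0), // illustrative topic/partition
                new OffsetAndMetadata(42L));         // illustrative offset
        consumer.commitSync(offsets);
    }
}
```

This needs a live consumer (and broker) to run, so it is shown here only as a shape for the API call.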