Re: Kafka Streams 0.10.0.1 - multiple consumers not receiving messages

2017-04-28 Thread Matthias J. Sax
Henry, you might want to check out the docs, which give an overview of the architecture: http://docs.confluent.io/current/streams/architecture.html#example Also, I am wondering why your application did not crash: I would expect an exception like java.lang.IllegalArgumentException: Assigned

Re: Debugging Kafka Streams Windowing

2017-04-28 Thread Matthias J. Sax
Ok. That makes sense. Question: why do you use .aggregate() instead of .count()? Also, can you share the code of your AggregatorFunction()? Did you change any default settings of StreamsConfig? I still have no idea what could be going wrong. Maybe you can run with log level TRACE? Maybe we can get
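For reference, in the 0.10.2.x Streams DSL a windowed count() is just a pre-built aggregate(). A minimal sketch of the two equivalent forms, assuming String keys/values and hypothetical topic and store names:

    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.kstream.*;

    KStreamBuilder builder = new KStreamBuilder();
    KStream<String, String> input = builder.stream("input-topic");

    // Windowed count -- the simpler, pre-built form:
    KTable<Windowed<String>, Long> counts = input
        .groupByKey()
        .count(TimeWindows.of(60_000L), "counts-store");

    // The same result written as an explicit aggregate:
    KTable<Windowed<String>, Long> aggregated = input
        .groupByKey()
        .aggregate(
            () -> 0L,                        // initializer
            (key, value, agg) -> agg + 1L,   // aggregator: one increment per record
            TimeWindows.of(60_000L),
            Serdes.Long(),                   // serde for the aggregate value
            "agg-store");

Both variants produce the same windowed counts; count() simply saves writing the initializer and aggregator by hand, which is why the choice of .aggregate() here is worth a second look.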

Re: Consumer with another group.id conflicts with streams()

2017-04-28 Thread Matthias J. Sax
Andreas, I agree that it might be a bug, but it's unclear what is happening -- it would be helpful to understand the scenario better in order to file a descriptive JIRA that helps to fix the issue. Btw: Kafka 0.10.2.1 got released yesterday -- maybe you can try the new release and report if it works there

Re: topics stuck in "Leader: -1" after crash while migrating topics

2017-04-28 Thread Ismael Juma
There are indeed some known issues in the Controller that require care to avoid. Onur has recently contributed a PR that simplifies the concurrency model of the Controller: https://github.com/apache/kafka/commit/bb663d04febcadd4f120e0ff5c5919ca8bf7e971 This is a good first step and will be part

Re: Kafka Streams 0.10.0.1 - multiple consumers not receiving messages

2017-04-28 Thread Henry Thacker
Thanks Michael and Eno for your help - I always thought the unit of parallelism was a combination of topic & partition rather than just partition. Out of interest though, had I subscribed to both topics in one subscriber, I would have expected records for both topics interleaved, why when

Re: topics stuck in "Leader: -1" after crash while migrating topics

2017-04-28 Thread Michal Borowiecki
Hi James, This "Cached zkVersion [x] not equal to that in zookeeper" issue bit us once in production and I found these tickets to be relevant: KAFKA-2729 KAFKA-3042 KAFKA-3083

Re: topics stuck in "Leader: -1" after crash while migrating topics

2017-04-28 Thread James Brown
For what it's worth, shutting down the entire cluster and then restarting it did address this issue. I'd love anyone's thoughts on what the "correct" fix would be here. On Fri, Apr 28, 2017 at 10:58 AM, James Brown wrote: > The following is also appearing in the logs a

session window bug not fixed in 0.10.2.1?

2017-04-28 Thread Ara Ebrahimi
Hi, I upgraded to 0.10.2.1 yesterday, enabled caching for session windows and tested again. It doesn’t seem to be fixed? Ara. > On Mar 27, 2017, at 2:10 PM, Damian Guy wrote: > > Hi Ara, > > There is a performance issue in the 0.10.2 release of session windows. It > is
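For context, caching for session windows is controlled by cache.max.bytes.buffering in the Streams configuration. A minimal sketch of the kind of topology being tested, with hypothetical topic and store names and a 5-minute inactivity gap:

    import java.util.Properties;
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.KStreamBuilder;
    import org.apache.kafka.streams.kstream.SessionWindows;

    Properties props = new Properties();
    props.put(StreamsConfig.APPLICATION_ID_CONFIG, "session-window-test");
    props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
    props.put(StreamsConfig.KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
    props.put(StreamsConfig.VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
    // Record caching: 10 MB buffer per thread; setting this to 0 disables caching entirely.
    props.put(StreamsConfig.CACHE_MAX_BYTES_BUFFERING_CONFIG, 10 * 1024 * 1024L);

    KStreamBuilder builder = new KStreamBuilder();
    builder.<String, String>stream("events")
           .groupByKey()
           .count(SessionWindows.with(5 * 60 * 1000L), "session-counts"); // 5-minute inactivity gap

    new KafkaStreams(builder, new StreamsConfig(props)).start();

Setting cache.max.bytes.buffering back to 0 turns the cache off again, which is a quick way to compare the two behaviours when checking whether the fix took effect.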

Re: topics stuck in "Leader: -1" after crash while migrating topics

2017-04-28 Thread James Brown
The following is also appearing in the logs a lot, if anyone has any ideas: INFO Partition [easypost.syslog,7] on broker 1: Cached zkVersion [647] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition) On Fri, Apr 28, 2017 at 10:43 AM, James Brown

topics stuck in "Leader: -1" after crash while migrating topics

2017-04-28 Thread James Brown
We're running 0.10.1.0 on a five-node cluster. I was in the process of migrating some topics from having two replicas to having three replicas when two of the five machines in this cluster crashed (brokers 2 and 3). After restarting them, all of the topics that were previously assigned to them are

Re: Kafka Streams 0.10.0.1 - multiple consumers not receiving messages

2017-04-28 Thread Michael Noll
To add to what Eno said: You can of course use the Kafka Streams API to build an application that consumes from multiple Kafka topics. But, going back to your original question, the scalability of Kafka and the Kafka Streams API is based on partitions, not on topics. -Michael On Fri, Apr

Re: Kafka Streams 0.10.0.1 - multiple consumers not receiving messages

2017-04-28 Thread Eno Thereska
Hi Henry, Kafka Streams scales differently and does not support having the same application ID subscribe to different topics for scale-out. The way we support scaling out if you want to use the same application id is through partitions, i.e., Kafka Streams automatically assigns partitions to
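In other words, scale-out with a single application ID looks like running the same topology on several machines against one multi-partition topic. A rough sketch with placeholder names:

    import java.util.Properties;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.KStreamBuilder;

    // Every instance runs this identical code with the identical application.id;
    // Kafka Streams splits the input topic's partitions across the running instances.
    Properties props = new Properties();
    props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-app");            // same on every instance
    props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

    KStreamBuilder builder = new KStreamBuilder();
    builder.stream("input-topic")   // one topic with N partitions
           .to("output-topic");     // each instance processes only its assigned partitions

    new KafkaStreams(builder, new StreamsConfig(props)).start();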

Kafka Stream stops polling new messages

2017-04-28 Thread João Peixoto
My stream goes stale after a while and simply does not receive any new messages, i.e., it does not poll. I'm using Kafka Streams 0.10.2.1 (the same happens with 0.10.2.0) and the brokers are running 0.10.1.1. The stream state is RUNNING and there are no exceptions in the logs. Looking at the JMX

Kafka Backup and Restore Solutions

2017-04-28 Thread Ian Duffy
Hi All, Is there any community-preferred tooling for doing point-in-time backups of Kafka (ideally without downtime)? We've looked at https://github.com/pinterest/secor but refeeding ~500 GB+ of data doesn't seem too neat. Thanks, Ian.

Re: Kafka Streams 0.10.0.1 - multiple consumers not receiving messages

2017-04-28 Thread Henry Thacker
Should also add - there are definitely live incoming messages on both input topics when my streams are running. The auto offset reset config is set to "earliest" and, because the input data streams are quite large (several million records each), I set a relatively small max poll records (200) so
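Those two settings can be passed straight through the Streams configuration (the embedded consumer picks them up). A sketch with the values mentioned above; the application ID is a placeholder:

    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.streams.StreamsConfig;

    Properties props = new Properties();
    props.put(StreamsConfig.APPLICATION_ID_CONFIG, "verifier");
    props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
    props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest"); // replay each input topic from the beginning
    props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, 200);         // cap the records returned by each poll()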

Re: Kafka Streams 0.10.0.1 - multiple consumers not receiving messages

2017-04-28 Thread Henry Thacker
Hi Eno, Thanks for your reply - the code that builds the topology is something like this (I don't have email and code access on the same machine, unfortunately, so this might not be 100% accurate or terribly well formatted!). The stream application is a simple verifier which stores a tiny bit of state
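Since the exact code isn't available, here is a rough guess at the shape of such a single-topic verifier -- String keys/values, a leading sequence number in the value, and all topic names are assumptions, not the original code:

    import java.util.HashMap;
    import java.util.Map;
    import java.util.Properties;
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.KStreamBuilder;

    Properties props = new Properties();
    props.put(StreamsConfig.APPLICATION_ID_CONFIG, "verifier");
    props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
    props.put(StreamsConfig.KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
    props.put(StreamsConfig.VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());

    KStreamBuilder builder = new KStreamBuilder();
    Map<String, Long> lastSeen = new HashMap<>();   // tiny bit of local, non-fault-tolerant state

    builder.<String, String>stream("input-topic")
           .filter((key, value) -> {
               long seq = Long.parseLong(value.split(",")[0]);  // assumed message layout
               Long prev = lastSeen.put(key, seq);
               return prev != null && seq != prev + 1L;         // keep only gap / out-of-order records
           })
           .to("gaps-topic");

    new KafkaStreams(builder, new StreamsConfig(props)).start();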

Re: Kafka Streams 0.10.0.1 - multiple consumers not receiving messages

2017-04-28 Thread Eno Thereska
Hi Henry, Could you share the code that builds your topology so we see how the topics are passed in? Also, this would depend on what the streaming logic is doing with the topics, e.g., if you're joining them then both partitions need to be consumed by the same instance. Eno > On 28 Apr 2017,

Kafka Streams 0.10.0.1 - multiple consumers not receiving messages

2017-04-28 Thread Henry Thacker
Hi, I'm using Kafka 0.10.0.1 and Kafka Streams. I have two different processes, Consumer 1 and Consumer 2. They both share the same application ID but subscribe to different single-partition topics. Only one stream consumer receives messages; the non-working stream consumer just sits there
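A minimal sketch of the setup being described; the topic names and application ID are placeholders:

    import java.util.Properties;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.KStreamBuilder;

    // Process 1 runs this with "topic-a", process 2 with "topic-b";
    // both topics have a single partition and both processes use the same application.id.
    Properties props = new Properties();
    props.put(StreamsConfig.APPLICATION_ID_CONFIG, "shared-app-id");
    props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

    KStreamBuilder builder = new KStreamBuilder();
    builder.stream("topic-a")   // "topic-b" in the second process
           .to("output-a");     // only one of the two processes ever sees records

    new KafkaStreams(builder, new StreamsConfig(props)).start();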

Re: Debugging Kafka Streams Windowing

2017-04-28 Thread Mahendra Kariya
Oh good point! The reason there is only one row corresponding to each time window is that it only contains the latest value for that time window. So what we did was dump the data present in the sink topic to a DB using an upsert query. The primary key of the table was the time window.
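As a sketch of the upsert described above, assuming PostgreSQL and a hypothetical table window_counts(window_start BIGINT PRIMARY KEY, cnt BIGINT); jdbcUrl, windowStartMs and count stand in for values read from the sink topic:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;

    String upsert =
        "INSERT INTO window_counts (window_start, cnt) VALUES (?, ?) " +
        "ON CONFLICT (window_start) DO UPDATE SET cnt = EXCLUDED.cnt";
    try (Connection conn = DriverManager.getConnection(jdbcUrl);
         PreparedStatement stmt = conn.prepareStatement(upsert)) {
        stmt.setLong(1, windowStartMs);  // primary key: the time window start
        stmt.setLong(2, count);          // the latest value for that window wins
        stmt.executeUpdate();
    }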