Re: Consumer Rebalancing Question

2017-01-04 Thread Ewen Cheslack-Postava
The coordinator will immediately move the group into a rebalance if it needs it. The reason LeaveGroupRequest was added was to avoid having to wait for the session timeout before completing a rebalance. So aside from the latency of cleanup/committing offests/rejoining after a heartbeat, rolling

Re: Kafka Streams window retention period question

2017-01-04 Thread Matthias J. Sax
Hi Alexander, first, both mailing list should be fine :) About internal time tracking: Kafka Streams tracks an internal "stream time" that is determined as the minimum "partition time" over all its input partitions. The "partition time" is tracked for each input partition individually and is

Kafka Streams window retention period question

2017-01-04 Thread Alexander Demidko
Hi folks, I'm experimenting with Kafka Streams windowed aggregation and came across window retention period behavior I don't fully understand. I'm using custom timestamp extractor which gets the timestamp from the payload. Values are aggregated using tumbling time windows and summed by the key. I

Consumer Rebalancing Question

2017-01-04 Thread Pradeep Gollakota
Hi Kafka folks! When a consumer is closed, it will issue a LeaveGroupRequest. Does anyone know how long the coordinator waits before reassigning the partitions that were assigned to the leaving consumer to a new consumer? I ask because I'm trying to understand the behavior of consumers if you're

Re: 0.10.2.0-SNAPSHOT - "log end offset should not change while restoring"

2017-01-04 Thread Guozhang Wang
Jon, I looked through the code and found one possible explanation of your observed exception: 1. Assume 2 running threads A and B, and one task t1 jut for simplicity. 2. First rebalance is triggered, task t1 is assigned to A (B has no assigned task). 3. During the first rebalance callback, task

Re: [VOTE] Vote for KIP-101 - Leader Epochs

2017-01-04 Thread Jun Rao
Hi, Ben, Thanks for the proposal. Looks good overall. A few comments below. 1. For LeaderEpochRequest, we need to include topic right? We probably want to follow other requests by nesting partition inside topic? For LeaderEpochResponse, do we need to return leader_epoch? I was thinking that we

Re: 0.10.2.0-SNAPSHOT - rocksdb exception(s)

2017-01-04 Thread Guozhang Wang
Jon, It is hard to determine what could be the root cause of this scenario just from the stack trace without checking the logs. We have seen a similar issue before and it has been fixed in the latest trunk head: https://issues.apache.org/jira/browse/KAFKA-4509 Are you using the latest trunk head

Re: EOF exceptions - 0.10.2.0-SNAPSHOT

2017-01-04 Thread Guozhang Wang
This is not an error message, and it is not specific to Streams. It simply indicates that the server has (probably temporarily) dropped the connection which may be due to a bunch of reasons: server node slow, network unstable, etc, and that is why it is only logged as a debug entry. The consumer

Re: Is this a bug or just unintuitive behavior?

2017-01-04 Thread hans
This sounds exactly as I would expect things to behave. If you consume from the beginning I would think you would get all the messages but not if you consume from the latest offset. You can separately tune the metadata refresh interval if you want to miss fewer messages but that still won't get

Re: Aggregated windowed counts

2017-01-04 Thread Matthias J. Sax
There is no such thing as a final window aggregate and you might see intermediate results -- thus the count do not add up. Please have a look here: http://stackoverflow.com/questions/38935904/how-to-send-final-kafka-streams-aggregation-result-of-a-time-windowed-ktable/38945277#38945277 and

Is this a bug or just unintuitive behavior?

2017-01-04 Thread Jeff Widman
I'm seeing consumers miss messages when they subscribe before the topic is actually created. Scenario: 1) kafka 0.10.1.1 cluster with allow-topic no topics, but supports topic auto-creation as soon as a message is published to the topic 2) consumer subscribes using topic string or a regex

Re: Kafka Connect gets into a rebalance loop

2017-01-04 Thread Ewen Cheslack-Postava
Aside from the logs you already have, the best suggestion I have is to enable trace level logging and try to reproduce -- there are some trace level logs in the KafkaBasedLog class that this uses which might reveal something. But it could be an issue in the consumer as well -- it sounds like it is

Re: Aggregated windowed counts

2017-01-04 Thread Benjamin Black
I'm hoping the DSL will do what I want :) Currently the example is continuously adding instead of bucketing, so if I modify it by adding a window to the count function: .groupBy((key, word) -> word) .count(TimeWindows.of(5000L), "Counts") .toStream((k, v) -> k.key()); Then I do see bucketing

Re: Aggregated windowed counts

2017-01-04 Thread Matthias J. Sax
Do you know about Kafka Streams? It's DSL gives you exactly what you want to do. Check out the documentation and WordCount example: http://docs.confluent.io/current/streams/index.html

Aggregated windowed counts

2017-01-04 Thread Benjamin Black
Hello, I'm looking for guidance on how to approach a counting problem. We want to consume a stream of data that consists of IDs and generate an output of the aggregated count with a window size of X seconds using processing time and a hopping time window. For example, using a window size of 1

Re: Trademark review

2017-01-04 Thread Aaron Niskode-Dossett
Hi Morgan, I believe "Apache Kafka" is preferred to simply "Kafka." pr...@apache.org could probably give you definitive guidance. Thanks! -Aaron On Wed, Jan 4, 2017 at 12:16 PM Stritzinger, Morgan < mstri...@textronsystems.com> wrote: > Hello, > > > > I work for Textron Systems and we would

Trademark review

2017-01-04 Thread Stritzinger, Morgan
Hello, I work for Textron Systems and we would like to mention Kafka in one of our press releases. We are inquiring to gain permission to use the trademarked name, Kafka, in our release. The context is below: "iCommand 2.5 operates in a disconnected, intermittent and limited (DIL) environment

Re: Kafka Connect Consumer reading messages from Kafka recursively

2017-01-04 Thread Srikrishna Alla
Hi Ewen, My assumption that this issue is happening when a huge number of events are getting published was wrong. That is when we discovered it. I looked closely at the log file and I seem to be having this issue for everything. So, my consumer is reading all the events from Kafka topic and

Re: Kafka Connect gets into a rebalance loop

2017-01-04 Thread Willy Hoang
I'm also running into this issue whenever I try to scale up from 1 worker to multiple. I found that I can sometimes hack around this by (1) waiting for the second worker to come up and start spewing out these log messages and then (2) sending a request to the REST API to update one of my

Re: [VOTE] Vote for KIP-101 - Leader Epochs

2017-01-04 Thread Tom Crayford
+1 On Wed, Jan 4, 2017 at 5:28 PM, Gwen Shapira wrote: > +1 - thanks for tackling those old and painful bugs! > > On Wed, Jan 4, 2017 at 9:24 AM, Ben Stopford wrote: > > Hi All > > > > We’re having some problems with this thread being subsumed by the >

Re: [VOTE] Vote for KIP-101 - Leader Epochs

2017-01-04 Thread Gwen Shapira
+1 - thanks for tackling those old and painful bugs! On Wed, Jan 4, 2017 at 9:24 AM, Ben Stopford wrote: > Hi All > > We’re having some problems with this thread being subsumed by the [Discuss] > thread. Hopefully this one will appear distinct. If you see more than one, >

Re: [VOTE] Vote for KIP-101 - Leader Epochs

2017-01-04 Thread Apurva Mehta
Looks good to me! +1 (non-binding) On Wed, Jan 4, 2017 at 9:24 AM, Ben Stopford wrote: > Hi All > > We’re having some problems with this thread being subsumed by the > [Discuss] thread. Hopefully this one will appear distinct. If you see more > than one, please use this one.

[VOTE] Vote for KIP-101 - Leader Epochs

2017-01-04 Thread Ben Stopford
Hi All We’re having some problems with this thread being subsumed by the [Discuss] thread. Hopefully this one will appear distinct. If you see more than one, please use this one. KIP-101 should now be ready for a vote. As a reminder the KIP proposes a change to the replication protocol to

Reg: Getting configuration properties for Kafka broker

2017-01-04 Thread Sumit Maheshwari
Hi, To start the Kafka Broker from the CLI we run the following command: "*bin/kafka-server-start.sh config/server.properties"* How can I get hold of zookeeper configuration or all the configurations mentioned in server.properties in runtime. By runtime meaning code need to access those

Re: how to ingest a database with a Kafka Connect cluster in parallel?

2017-01-04 Thread Yuanzhe Yang
Hi Ewen, OK. Thanks a lot for your feedback! Best regards, Yang 2017-01-03 22:42 GMT+01:00 Ewen Cheslack-Postava : > It's an implementation detail of the JDBC connector. You could potentially > write a connector that parallelizes at that level, but you lose other >

Lost message with Kafka configuration

2017-01-04 Thread Hoang Bao Thien
Hi all, I have a problem with losing messages from Kafka. The situation is as follows: I put a csv file with 286701 rows (size = 110MB) into Kafka producer with command: $ cat test.csv | kafka-console-producer.sh --broker-list localhost:9092 --topic MyTopic > /dev/null and then count the number