Re: Kafka Streams - Expiring Records By Process Time

2018-06-21 Thread John Roesler
Hi Sicheng, I'm also curious about the details. Let's say you are doing a simple count aggregation with 24-hour windows. You got three events with key "A" on 2017-06-21, one year ago, so the windowed key (A,2017-06-21) has a value of 3. Fast-forward a year. We get one late event, also for
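The scenario above can be sketched in plain Java (no Kafka dependency; class and field names are hypothetical, not from the thread): a 24-hour tumbling window counted by event-time, with a retention period measured against "stream time" (the highest event timestamp seen so far). A record whose window has already fallen out of retention is dropped instead of updating the count.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of event-time windowed counting with retention.
class WindowedCountSketch {
    static final long DAY_MS = 24L * 60 * 60 * 1000;

    final long retentionMs;                           // how long closed windows are kept
    long streamTimeMs = Long.MIN_VALUE;               // highest event-time seen so far
    final Map<String, Long> counts = new HashMap<>(); // key = "<key>@<windowStart>"

    WindowedCountSketch(long retentionMs) {
        this.retentionMs = retentionMs;
    }

    /** Returns true if the record was counted, false if dropped as too late. */
    boolean process(String key, long eventTimeMs) {
        streamTimeMs = Math.max(streamTimeMs, eventTimeMs);
        long windowStart = (eventTimeMs / DAY_MS) * DAY_MS;
        long windowEnd = windowStart + DAY_MS;
        if (windowEnd + retentionMs < streamTimeMs) {
            return false; // window expired under event-time retention: record dropped
        }
        counts.merge(key + "@" + windowStart, 1L, Long::sum);
        return true;
    }
}
```

With a 7-day retention, three events for key "A" in an old window are counted, but a record for that same window arriving a year later (after stream time has advanced) is rejected.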

Re: Kafka Streams - Expiring Records By Process Time

2018-06-21 Thread Matthias J. Sax
Can't you increase retention time accordingly to make sure that "old" metrics are not dropped? -Matthias On 6/21/18 2:07 PM, Sicheng Liu wrote: > Because we might get very "old" metrics (the timestamp on the metric is > very old, even though the metric is just delivered, for example, >

Configuring Kerberos behind an ELB

2018-06-21 Thread Tyler Monahan
Hello, I have set up Kafka using Kerberos successfully; however, if I try to reach Kafka through an ELB, the Kerberos authentication fails. The Kafka brokers each use their unique hostname for Kerberos, and when going through an ELB the consumer/producer only sees the ELB's DNS record, which
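One common way to address this mismatch (a sketch; hostnames, ports, and the realm are placeholders): GSSAPI clients build the service principal from the hostname they connect to, so a client connecting via the ELB will request a ticket for kafka/elb.example.com, not kafka/broker1.internal. Each broker can therefore advertise a second listener under the ELB's DNS name, with that principal added to each broker's keytab alongside its own host principal.

```properties
# Sketch only: broker server.properties fragment.
listeners=SASL_PLAINTEXT://0.0.0.0:9092,ELB://0.0.0.0:9093
advertised.listeners=SASL_PLAINTEXT://broker1.internal:9092,ELB://elb.example.com:9093
listener.security.protocol.map=SASL_PLAINTEXT:SASL_PLAINTEXT,ELB:SASL_PLAINTEXT
sasl.kerberos.service.name=kafka
# Keytab must contain both:
#   kafka/broker1.internal@EXAMPLE.COM
#   kafka/elb.example.com@EXAMPLE.COM
```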

Fwd: coordinator load + OffsetFetchRequest error = consumption failure

2018-06-21 Thread Emmett Butler
Hi Kafka users, *tl;dr questions:* *1. Is it normal or expected for the coordinator load state to last for 6 hours? Is this load time affected by log retention settings, message production rate, or other parameters?* *2. Do non-pykafka clients handle COORDINATOR_LOAD_IN_PROGRESS by consuming only

Re: Some Total and Rate metrics are not consistent

2018-06-21 Thread Sam Lendle
Yep that’s the change I saw. It does look like the issue should be fixed, since the implementation of update for Count always adds 1 regardless of the value. Thanks for the follow-up. On 6/21/18, 1:15 PM, "John Roesler" wrote: Hi Sam, This sounds like a condition I fixed in

Re: Kafka Streams - Expiring Records By Process Time

2018-06-21 Thread Sicheng Liu
Because we might get very "old" metrics (the timestamp on the metric is very old, even though the metric is just delivered, for example, backfill). If you use event-time for retention, these old metrics could be dropped and won't be aggregated. If we use process-time, at least it will stay in

Re: Kafka Streams - Expiring Records By Process Time

2018-06-21 Thread Matthias J. Sax
I don't understand why event-time retention time cannot be used. Can you elaborate? -Matthias On 6/21/18 10:59 AM, Sicheng Liu wrote: > Hi All, > > We have a use case where we aggregate some metrics by their event-time > (timestamp on the metric itself) using the simplest tumbling window. The >

Re: Some Total and Rate metrics are not consistent

2018-06-21 Thread John Roesler
Hi Sam, This sounds like a condition I fixed in https://github.com/apache/kafka/commit/ed51b2cdf5bdac210a6904bead1a2ca6e8411406#diff-8b364ed2d0abd8e8ae21f5d322db6564R221 . I realized that the prior code creates a new Meter, which uses a Total metric instead of a Count. But that would total all
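The distinction behind that fix can be shown with a minimal plain-Java sketch (class and field names are hypothetical): a Count-style metric increments by 1 per recorded sample regardless of the sample's value, while a Total-style metric sums the values themselves. Wiring a Total where a Count is intended makes the "total" and "rate" metrics disagree.

```java
// Hypothetical sketch of the two accumulation behaviors discussed above.
class MetricSketch {
    double countMetric = 0; // Count semantics: number of samples
    double totalMetric = 0; // Total semantics: sum of sample values

    void record(double value) {
        countMetric += 1;     // Count: always +1 per sample
        totalMetric += value; // Total: accumulate the sample's value
    }
}
```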

Kafka Streams - Expiring Records By Process Time

2018-06-21 Thread Sicheng Liu
Hi All, We have a use case where we aggregate some metrics by their event-time (the timestamp on the metric itself) using the simplest tumbling window. The window itself can be given a retention, but since we are aggregating by event-time, the retention has to be based on event-time too. However, in our
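For context, in the Streams DSL of that era (1.1.x) the event-time retention being discussed is set on the window definition itself via `until()`. An API-level sketch, not a compiled example (topic name, durations, and serdes are placeholders):

```java
// Sketch only: 1-day tumbling windows on event-time, retained for 30 days.
// Records whose window has fallen out of the 30-day retention are dropped.
KTable<Windowed<String>, Long> counts = builder
    .<String, String>stream("metrics")
    .groupByKey()
    .windowedBy(TimeWindows.of(TimeUnit.DAYS.toMillis(1))
                           .until(TimeUnit.DAYS.toMillis(30)))
    .count();
```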

Re: Kafka Streams Thread Exception

2018-06-21 Thread Guozhang Wang
Hello Aravind, We removed StreamsKafkaClient in 1.1.0 ( https://issues.apache.org/jira/browse/KAFKA-4857); could you consider upgrading to the newer version and see if this issue goes away? Note that new clients can talk to old versioned brokers since 0.10.1. Guozhang On

Kafka Streams Thread Exception

2018-06-21 Thread D S Aravind Vedantam
Hi All, We have been using Kafka Streams in our application but have encountered an issue. We are using the Processor API and it works fine for some time, but after a while we hit this exception in all Streams threads and processing doesn't happen any further after this

Re: Frequent "offset out of range" messages, partitions deserted by consumer

2018-06-21 Thread Shantanu Deshmukh
The consumer is always consuming. There's a trickle of messages which always keeps flowing. However, during 1am to 5am there are almost no messages. On Wed, Jun 20, 2018 at 11:31 AM Liam Clarke wrote: > How often is the consumer actually consuming? I know there's an issue > where old committed
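If log segments are deleted during the nightly quiet period while the consumer's committed offset still points into them, the next fetch comes back out of range. Two knobs worth checking (a sketch; the values are placeholders, not recommendations from the thread):

```properties
# Topic-level: keep the log long enough to span the quiet hours.
retention.ms=172800000

# Consumer-side: recover from the oldest retained offset rather than
# jumping to latest when the committed position is out of range.
auto.offset.reset=earliest
```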

Re: [VOTE] 2.0.0 RC0

2018-06-21 Thread Rajini Sivaram
Sorry, the documentation does go live with the RC (thanks to Ismael for pointing this out), so here are the links: * Documentation: http://kafka.apache.org/20/documentation.html * Protocol: http://kafka.apache.org/20/protocol.html Regards, Rajini On Wed, Jun 20, 2018 at 9:08 PM, Rajini

Is there a way to either get or understanding source connector generated topics names and schemas ids?

2018-06-21 Thread Andrea Spina
Hi guys, We're working with Kafka Connect and we want to tackle the following problem: we have several source connectors deployed through our system, and we use the Kafka Connect REST API to manage them. Now, since we have different types of source connectors available, we'd like to get topic and schema

Re: [VOTE] 1.1.1 RC0

2018-06-21 Thread Andras Beni
+1 (non-binding) Built .tar.gz, created a cluster from it and ran a basic end-to-end test: performed a rolling restart while console-producer and console-consumer ran at around 20K messages/sec. No errors or data loss. Ran unit and integration tests successfully 3 out of 5 times. Encountered