Re: [DISCUSS] KIP-120: Cleanup Kafka Streams builder API

2017-03-28 Thread Matthias J. Sax
With regard to KIP-130: From the KIP-130 thread: > About subtopologies and tasks. We do have the concept of subtopologies > already in KIP-120. It's only missing an ID that allows linking a subtopology > to a task. > > IMHO, adding a simple variable to `Subtopology` that provides the ID should be

Re: [DISCUSS] KIP 130: Expose states of active tasks to KafkaStreams public API

2017-03-28 Thread Matthias J. Sax
Thanks for updating the KIP! I think it's good as is -- I would not add anything more to TaskMetadata. About subtopologies and tasks. We do have the concept of subtopologies already in KIP-120. It's only missing an ID that allows linking a subtopology to a task. IMHO, adding a simple variable

Re: Achieve message ordering through Async Producer

2017-03-28 Thread Henry Cai
Based on the Kafka docs, this parameter should maintain message ordering: max.in.flight.requests.per.connection -- the maximum number of unacknowledged requests the client will send on a single connection before blocking. Note that if this setting is set to greater than 1 and there are failed
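A minimal producer configuration along these lines might look as follows. This is a sketch, not a definitive recipe; the broker address is a placeholder, and the key point is that with `max.in.flight.requests.per.connection=1` a failed-and-retried batch cannot be overtaken by a later batch on the same partition:

```properties
# Sketch: producer settings aimed at preserving per-partition ordering
# even when sends fail and are retried. bootstrap.servers is a placeholder.
bootstrap.servers=localhost:9092
acks=all
retries=10
# With more than 1 in-flight request, a retried batch can be reordered
# behind a newer batch; pinning this to 1 avoids that at some throughput cost.
max.in.flight.requests.per.connection=1
```

The trade-off is throughput: limiting the producer to one in-flight request per connection serializes sends to each broker.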

Achieve message ordering through Async Producer

2017-03-28 Thread Henry Cai
If I use Kafka's AsyncProducer, would I still be able to achieve message ordering within the same partition? When the first message fails to send to the broker, will the second message (within the same Kafka partition) be sent out ahead of the first message? Based on this email thread, it seems

offsets commitment from the another client

2017-03-28 Thread Vova Shelgunov
Hi, I have a number of application instances which consume from a single Kafka topic using the same consumer group. Upon receiving a message I need to run its processing in a separate process (in my case it is a Docker container). My issue is that I want to commit the offset to Kafka only
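A common pattern for this is to disable auto-commit and call `commitSync()` only after the external processing has finished. The sketch below makes several assumptions: the topic name `work`, the group id `workers`, and the helper `launchDockerContainerAndWait` are all hypothetical placeholders, and processing must complete within the consumer's poll-interval limits or the group will rebalance:

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class AtLeastOnceWorker {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder
        props.put("group.id", "workers");                 // placeholder
        props.put("enable.auto.commit", "false");         // we commit manually
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("work"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(1000);
                for (ConsumerRecord<String, String> record : records) {
                    // Hypothetical helper: blocks until the container exits.
                    launchDockerContainerAndWait(record.value());
                }
                // Commit only after every record in the batch was processed,
                // giving at-least-once semantics across restarts.
                consumer.commitSync();
            }
        }
    }

    static void launchDockerContainerAndWait(String payload) {
        // stand-in for the real container launch
    }
}
```

Note this gives at-least-once delivery: if the process crashes after the container finishes but before `commitSync()`, the message is reprocessed.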

Re: [DISCUSS] KIP 130: Expose states of active tasks to KafkaStreams public API

2017-03-28 Thread Florian Hussonnois
Hi all, I've updated the KIP and the PR to reflect your suggestions. https://cwiki.apache.org/confluence/display/KAFKA/KIP+130%3A+Expose+states+of+active+tasks+to+KafkaStreams+public+API https://github.com/apache/kafka/pull/2612 Also, I've exposed property StreamThread#state as a string through

Understanding ReadOnlyWindowStore.fetch

2017-03-28 Thread Jon Yeargers
I'm probing around trying to find a way to solve my aggregation -> db issue. Looking at the '.fetch()' function, I'm wondering about the 'timeFrom' and 'timeTo' params, as not a lot is mentioned about 'proper' usage. The test in
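For reference, `fetch(key, timeFrom, timeTo)` returns all windows for the given key whose window start timestamp falls in `[timeFrom, timeTo]`. A minimal sketch, assuming a running `KafkaStreams` instance and a store name `"counts-store"` (both placeholders):

```java
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.state.QueryableStoreTypes;
import org.apache.kafka.streams.state.ReadOnlyWindowStore;
import org.apache.kafka.streams.state.WindowStoreIterator;

// "streams" is an already-started KafkaStreams instance (assumption).
ReadOnlyWindowStore<String, Long> store =
        streams.store("counts-store", QueryableStoreTypes.windowStore());

long now = System.currentTimeMillis();
// Selects windows whose *start* timestamp lies within the last hour.
WindowStoreIterator<Long> iter = store.fetch("user-42", now - 3_600_000L, now);
while (iter.hasNext()) {
    KeyValue<Long, Long> entry = iter.next(); // key = window start timestamp
    System.out.println("window@" + entry.key + " -> " + entry.value);
}
iter.close(); // always close iterators to release the underlying store resources
```

Note the timestamps are compared against window start times (derived from record timestamps), not wall-clock arrival times.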

Re: more uniform task assignment across kafka stream nodes

2017-03-28 Thread Ara Ebrahimi
Awesome! Thanks. Can’t wait to try it out. Ara. On Mar 28, 2017, at 2:13 PM, Matthias J. Sax wrote:

Re: more uniform task assignment across kafka stream nodes

2017-03-28 Thread Matthias J. Sax
Created a JIRA: https://issues.apache.org/jira/browse/KAFKA-4969 -Matthias On 3/27/17 4:33 PM, Ara Ebrahimi wrote: > Well, even with 4-5x better performance thanks to the session window fix, I > expect to get ~10x better performance if I throw 10x more nodes at the > problem. That won’t be

Re: WindowStore and retention

2017-03-28 Thread Matthias J. Sax
Note, it's not based on system/wall-clock time, but on "stream time", i.e., the internal time progress of your app, which depends on the timestamps returned by the TimestampExtractor. -Matthias On 3/28/17 10:55 AM, Matthias J. Sax wrote: > Yes. :) > > On 3/28/17 10:40 AM, Jon Yeargers wrote: >>

Re: WindowStore and retention

2017-03-28 Thread Matthias J. Sax
Yes. :) On 3/28/17 10:40 AM, Jon Yeargers wrote: > How long does a given value persist in a WindowStore? Does it obey the > '.until()' param of a windowed aggregation/ reduction? > > Please say yes.

WindowStore and retention

2017-03-28 Thread Jon Yeargers
How long does a given value persist in a WindowStore? Does it obey the '.until()' param of a windowed aggregation/ reduction? Please say yes.
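As the replies above confirm, `.until()` controls retention. A minimal sketch of a windowed count with an explicit retention, using the 0.10.x DSL (stream, store name, and window sizes are illustrative assumptions):

```java
import java.util.concurrent.TimeUnit;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.TimeWindows;
import org.apache.kafka.streams.kstream.Windowed;

// "stream" is an existing KStream<String, String> (assumption).
// 5-minute windows, retained for at least 1 hour of *stream time*;
// retention advances with record timestamps, not wall-clock time.
KTable<Windowed<String>, Long> counts = stream
        .groupByKey()
        .count(TimeWindows.of(TimeUnit.MINUTES.toMillis(5))
                          .until(TimeUnit.HOURS.toMillis(1)),
               "counts-store");
```

Windows older than the `until()` horizon become eligible for deletion from the store, so queries for them may return nothing.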

Managing topic configuration w/ auto.create.topics.enable

2017-03-28 Thread Mathieu Fenniak
Hey Kafka Users, When using a Kafka broker w/ auto.create.topics.enable set to true, how do Kafka users generally manage configuration of those topics? In particular, cleanup.policy={compact/delete} can be a crucial configuration value to get correct. In my application, I have a couple Kafka
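One approach (a sketch, not the only option) is to let auto-creation use broker defaults and then pin per-topic overrides with the `kafka-configs.sh` tool; the topic name and ZooKeeper address below are placeholders:

```shell
# Sketch: set cleanup.policy explicitly on an auto-created topic rather than
# relying on the broker-wide default (log.cleanup.policy).
bin/kafka-configs.sh --zookeeper localhost:2181 \
  --entity-type topics --entity-name my-changelog-topic \
  --alter --add-config cleanup.policy=compact
```

The alternative is to pre-create topics explicitly (e.g. with `kafka-topics.sh --create --config cleanup.policy=compact`) so auto-creation never decides the configuration.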

Re: 3 machines 12 threads all fail within 2 hours of starting the streams application

2017-03-28 Thread Matthias J. Sax
Thanks for this update! Really appreciate it! This allows us to improve Kafka further! We hope to do a bug-fix release including these findings soon! Also happy that your application is running now! Keep us posted if possible! -Matthias On 3/27/17 9:44 PM, Sachin Mittal wrote: > - single

Re: Custom stream processor not triggering #punctuate()

2017-03-28 Thread Elliot Crosby-McCullough
Hi Michael, My confusion was that the events are being created, transferred, and received several seconds apart (longer than the punctuate schedule) with no stalling, because I'm triggering them by hand, so regardless of what mechanism is being used for timing it should still be called. That

Re: Custom stream processor not triggering #punctuate()

2017-03-28 Thread Michael Noll
Elliot, in the current API, `punctuate()` is called based on the current stream-time (which defaults to event-time), not based on the current wall-clock time / processing-time. See http://docs.confluent.io/current/streams/faq.html#why-is-punctuate-not-called. The stream-time is advanced only

Custom stream processor not triggering #punctuate()

2017-03-28 Thread Elliot Crosby-McCullough
Hi there, I've written a simple processor which expects to have #process called on it for each message and configures regular punctuate calls via `context.schedule`. Regardless of what configuration I try for timestamp extraction I cannot get #punctuate to be called, despite #process being
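For context, a processor along these lines might look like the sketch below (pre-KIP-138 API, where `schedule()` takes only a stream-time interval). As the reply above explains, `punctuate()` fires only when incoming record timestamps advance stream time past the interval, which is why hand-fed test data often never triggers it:

```java
import org.apache.kafka.streams.processor.Processor;
import org.apache.kafka.streams.processor.ProcessorContext;

public class MyProcessor implements Processor<String, String> {
    private ProcessorContext context;

    @Override
    public void init(ProcessorContext context) {
        this.context = context;
        // Interval is measured in *stream time* (record timestamps),
        // not wall-clock time, in this version of the API.
        context.schedule(10_000L);
    }

    @Override
    public void process(String key, String value) {
        // Each record's timestamp advances stream time; punctuate() is only
        // invoked once stream time has moved >= 10s across all input partitions.
    }

    @Override
    public void punctuate(long timestamp) {
        // Periodic work goes here; "timestamp" is the stream time it fired at.
    }

    @Override
    public void close() {}
}
```

With a custom `TimestampExtractor` the same rule applies: whatever timestamps it returns are what drive the schedule forward.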

Re: org.apache.kafka.common.errors.TimeoutException

2017-03-28 Thread R Krishna
Are you able to publish any messages at all? If it is a one-off, then it is possible that the broker and the client were so busy that they could not publish that batch of messages to partition 0 within 1732 ms, in which case you should increase the message timeouts and retries. Search the timeout
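A sketch of the relevant producer knobs to loosen in that case; the values below are illustrative, not recommendations:

```properties
# Sketch: give busy brokers more time before a batch expires, and retry
# transient failures instead of surfacing them immediately.
request.timeout.ms=60000
retries=5
retry.backoff.ms=500
```

If the timeouts occur constantly rather than as a one-off, look at broker-side load and network health before tuning client timeouts further.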