Re: Verify time semantics through topology

2017-05-04 Thread Matthias J. Sax
Hi, I am not sure if I understand correctly: If you use the default TimestampExtractor, the whole pipeline will be event-time based. However, as you want to compute the AVG, I would recommend a different pattern anyway: FEED -> groupByKey() -> window() -> aggregate() -> mapValues() = avgKTable In
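
A minimal sketch of that sum-and-count pattern against the 0.10.2-era Streams DSL. The Long-valued input stream "feed", the 1-minute window, and the LongArraySerde for the {sum, count} accumulator are placeholder assumptions, not names from the thread:

    // Fold each key's values into a {sum, count} pair per window,
    // then derive the average in mapValues.
    KTable<Windowed<String>, long[]> sumCount = feed
        .groupByKey()
        .aggregate(
            () -> new long[]{0L, 0L},          // initializer: {sum, count}
            (key, value, agg) -> {             // aggregator: fold one record in
                agg[0] += value;
                agg[1] += 1L;
                return agg;
            },
            TimeWindows.of(60_000L),           // assumed 1-minute windows
            new LongArraySerde(),              // hypothetical serde for long[]
            "avg-store");                      // placeholder store name

    KTable<Windowed<String>, Double> avg =
        sumCount.mapValues(a -> (double) a[0] / a[1]);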

Re: Shouldn't the initializer of a stream aggregate accept the key?

2017-05-04 Thread Matthias J. Sax
Yes. The key can be of any type and we cannot enforce immutable types at the API level; thus, it could get modified as a "side effect". The problem is that if the key were modified, it would corrupt data partitioning and thus lead to wrong results. It's not possible to modify the key via re

Verify time semantics through topology

2017-05-04 Thread Garrett Barton
I think I have an understanding of how Kafka Streams handles time behind the scenes and would like someone to verify it for me. The actual reason is that I am running into behavior where I can only join two streams for a little while, then it stops working. Assuming a topology like this: FEED -> groupB

Re: Trouble fetching messages with error "Skipping fetch for partition"

2017-05-04 Thread Hu Xi
This is a trace-level log, which means the consumer has already created a fetch request to the given node from which you are reading data, so no more requests can be created. Did you get any other warn-level or error-level logs when failing to fetch messages?

Re: Kafka Streams Failed to rebalance error

2017-05-04 Thread Sameer Kumar
Yes, I have upgraded both my cluster and client to version 0.10.2.1 and am currently monitoring the situation. I will report back in case I find any errors. Thanks for the help though. -Sameer. On Fri, May 5, 2017 at 3:37 AM, Matthias J. Sax wrote: > Did you see Eno's reply? > > Please try out Streams

Trouble fetching messages with error "Skipping fetch for partition"

2017-05-04 Thread Sachin Nikumbh
Hello, I am using Kafka 0.10.1.0 and failing to fetch messages, with the following error message in the log: Skipping fetch for partition MYPARTITION because there is an in-flight request to MYMACHINE:9092 (id: 0 rack: null) (org.apache.kafka.clients.consumer.

Re: Shouldn't the initializer of a stream aggregate accept the key?

2017-05-04 Thread João Peixoto
So the reasoning is that the key may be a mutable object, which in turn could potentially cause disaster? Just clarifying, because I think the initializer should only return a value (as it does right now). On Thu, May 4, 2017 at 3:02 PM Matthias J. Sax wrote: > Currently, you don't get any hold on

Re: What is the impact of setting the consumer config max.poll.interval.ms to high value

2017-05-04 Thread Matthias J. Sax
If your consumer fails (i.e., the whole process dies), setting the value high is not a problem (because the heartbeat thread dies too, and the failure will be detected quickly). It's only a problem if your "main processing thread" dies (and everything else is still up and running), or if your main process
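
A sketch of that trade-off in consumer-configuration terms (the broker address, group id, and values are placeholders): heartbeats catch a dead process within the session timeout regardless of max.poll.interval.ms, while a stalled processing loop is only noticed once the poll interval elapses.

    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;

    Properties props = new Properties();
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");
    props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-group");
    // A dead process stops heartbeating and is evicted within the session timeout.
    props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "10000");
    // A hung processing thread is only detected once this interval expires,
    // so 15 minutes here means up to 15 minutes before its partitions move.
    props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, "900000");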

Re: Deduplicating KStream-KStream join

2017-05-04 Thread Matthias J. Sax
Hi, we don't believe in triggers ;) https://www.confluent.io/blog/watermarks-tables-event-time-dataflow-model/ -> Thus, it's a BigQuery flaw to not support updates... (IMHO) (We are also considering improving KStream-KStream join though, but that's of course no short term solution for you: https

Re: Debugging Kafka Streams Windowing

2017-05-04 Thread Matthias J. Sax
About > 07:44:08.493 [StreamThread-10] INFO o.a.k.c.c.i.AbstractCoordinator - > Discovered coordinator broker-05:6667 for group group-2. Please upgrade to Streams 0.10.2.1 -- we fixed a couple of bugs and I would assume this issue is fixed, too. If not, please report back. > Another question that I

Re: Record timestamp

2017-05-04 Thread Matthias J. Sax
The timestamp is always stored -- it's a metadata field that was added to the message format in 0.10.0. By default, the Producer uses System.currentTimeMillis() to set the timestamp before it sends the record to the broker. Or you can set the timestamp explicitly yourself. The default TimestampExtr
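
A small sketch of both options with the Java producer (topic, key, and value names are placeholders); the five-argument ProducerRecord constructor taking a timestamp exists as of 0.10.0:

    import org.apache.kafka.clients.producer.ProducerRecord;

    // Default: no timestamp given, so the producer stamps the record
    // with System.currentTimeMillis() before sending.
    ProducerRecord<String, String> implicit =
        new ProducerRecord<>("my-topic", "key", "value");

    // Explicit: supply your own event time (partition is left null so
    // the default partitioner still decides placement).
    long eventTimeMs = 1493856000000L; // placeholder event time
    ProducerRecord<String, String> explicit =
        new ProducerRecord<>("my-topic", null, eventTimeMs, "key", "value");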

Setting API version while upgrading

2017-05-04 Thread Milind Vaidya
> The documentation says "Upgrading from 0.8.x or 0.9.x to 0.10.0.0" > I am upgrading from kafka_2.9.2-0.8.1.1, so which one is correct? > A. 0.8.1.1 > inter.broker.protocol.version=0.8.1.1 > log.message.format.version=0.8.1.1 > B. 0.8.1 > inter.broker.protocol.version=

Re: Kafka Streams Failed to rebalance error

2017-05-04 Thread Matthias J. Sax
Did you see Eno's reply? Please try out Streams 0.10.2.1 -- this should be fixed there. If not, please report back. I would also recommend subscribing to the list. It's self-service: http://kafka.apache.org/contact -Matthias On 5/3/17 10:49 PM, Sameer Kumar wrote: > My brokers are on version 1

Re: Shouldn't the initializer of a stream aggregate accept the key?

2017-05-04 Thread Matthias J. Sax
Currently, you don't get any hold on the key, because the key must be protected from modification. Thus, the initial value must be the same for all keys atm. We are aware that this limits some use cases and there is already a proposal to add keys to some API. https://cwiki.apache.org/confluence/d
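
To illustrate the current shape of the API (a sketch assuming a KGroupedStream<String, Long> named groupedStream; the store name is a placeholder): the Initializer takes no arguments, so the seed value cannot depend on the key, whereas the Aggregator does receive it.

    KTable<String, Long> counts = groupedStream.aggregate(
        () -> 0L,                        // Initializer: no key in scope
        (key, value, agg) -> agg + 1L,   // Aggregator: key is available here
        Serdes.Long(),
        "counts-store");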

Is rollback supported while upgrade ?

2017-05-04 Thread Milind Vaidya
Upgrading from kafka_2.9.2-0.8.1.1 to kafka_2.11-0.10.0.0. I am assuming the new Kafka version will look in the same location for log files as the older one. As per the documentation, the following properties will be set on the new broker: inter.broker.protocol.version=0.8.1.1 log.message.format.versi
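
A sketch of the documented two-phase rolling upgrade, which is what keeps rollback possible (mirroring the property values quoted above; whether "0.8.1.1" or "0.8.1" is the accepted spelling is exactly the question raised in the "Setting API version" thread):

    # Phase 1: run 0.10.0.0 binaries but keep speaking the old protocol and
    # writing the old message format, so rolling back to 0.8.1.1 binaries
    # remains possible.
    inter.broker.protocol.version=0.8.1.1
    log.message.format.version=0.8.1.1

    # Phase 2 (only after the whole cluster is verified on the new binaries):
    # bump the protocol, then the message format, each with a rolling restart.
    # Once log.message.format.version is raised, new-format messages are on
    # disk and rollback is no longer safe.
    # inter.broker.protocol.version=0.10.0
    # log.message.format.version=0.10.0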

Record timestamp

2017-05-04 Thread Murad Mamedov
Hi, in ProducerRecord we have a timestamp. - Where is it actually stored, if stored? - When producing records with KStream.to(), which timestamp is used, if any? In order to guarantee a proper timestamp, do we always have to implement a timestamp extractor? Thanks in advance.

Re: Debugging Kafka Streams Windowing

2017-05-04 Thread Mahendra Kariya
Hi Matthias, please find the answers below. > I would recommend to double check the following: > - can you confirm that the filter does not remove all data for those > time periods? The filter does not remove all data. There is a lot of data coming in even after the filter stage. > - I would a

Kafka topic 's timeindex size is zero.

2017-05-04 Thread Tony Liu
Hi guys, I found that one topic's time index size is zero. We are using Kafka version 0.10.1.1. Does anyone have any idea why a zero-size time index would happen? Thanks. The following is the data snippet: -rw-r--r-- 1 root root 0 May 4 05:36 /kafka/data/thl_raw-1/000114296521.t

Deduplicating KStream-KStream join

2017-05-04 Thread Ofir Sharony
Hi guys, I want to perform a join between two KStreams. An event may appear on only one of the streams (either one of them), so I can't use an inner join (which emits only on a match) or a left join (which emits only when the left input arrives). This leaves me with an outer join. The problem with outer j
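
For reference, a minimal outer-join sketch (0.10.x DSL; String-valued streams named left and right and a 1-minute window are assumptions). It shows where the duplicates come from: the joiner fires with a null on one side when only one stream has arrived, then fires again if the other side shows up inside the window.

    KStream<String, String> joined = left.outerJoin(
        right,
        // Called once per arriving record: the missing side is null until
        // (and unless) a matching record lands inside the join window.
        (l, r) -> (l == null ? "" : l) + "|" + (r == null ? "" : r),
        JoinWindows.of(60_000L));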

What is the impact of setting the consumer config max.poll.interval.ms to high value

2017-05-04 Thread chaitanya puchakayala
What will be the impact of setting the consumer config max.poll.interval.ms to 10 or 15 minutes? How will it impact nodes joining, nodes leaving, and nodes crashing? -- Chaitanya

Re: Kafka-streams process stopped processing messages

2017-05-04 Thread Shimi Kiviti
Thanks Eno. We still see problems on our side. When we run kafka-streams 0.10.1.1 the problem eventually goes away, but with 0.10.2.1 it does not. We see a lot of the rebalancing messages I wrote about before, and on at least one kafka-streams node we see disconnection messages like the following. These messages

Re: Resetting offsets

2017-05-04 Thread Dana Powers
kafka-python, yes. On May 4, 2017 2:28 AM, "Paul van der Linden" wrote: Thanks everyone. @Dana is that using the kafka-python library? On Thu, May 4, 2017 at 4:52 AM, Dana Powers wrote: > Requires stopping your existing consumers, but otherwise should work: > > from kafka import KafkaConsumer

Fwd: Error: Executing consumer group command failed due to Request GROUP_COORDINATOR failed on brokers List(localhost:9092 (id: -1 rack: null))

2017-05-04 Thread Abhimanyu Nagrath
-- Forwarded message -- From: Abhimanyu Nagrath Date: Thu, May 4, 2017 at 4:21 PM Subject: Error: Executing consumer group command failed due to Request GROUP_COORDINATOR failed on brokers List(localhost:9092 (id: -1 rack: null)) To: d...@kafka.apache.org Hi, I am using single no

Re: Resetting offsets

2017-05-04 Thread Paul van der Linden
Thanks everyone. @Dana is that using the kafka-python library? On Thu, May 4, 2017 at 4:52 AM, Dana Powers wrote: > Requires stopping your existing consumers, but otherwise should work: > > from kafka import KafkaConsumer, TopicPartition, OffsetAndMetadata > > def reset_offsets(group_id, topic,
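
For the Java client, an equivalent of the same approach (stop the real consumers, then commit the desired offsets under the same group.id); the bootstrap address, group, topic, and target offset are placeholders:

    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.*;
    import org.apache.kafka.common.TopicPartition;

    public class ResetOffsets {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-group");
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");
            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                TopicPartition tp = new TopicPartition("my-topic", 0);
                consumer.assign(Collections.singletonList(tp));
                // Rewind partition 0 of my-topic to offset 0 for this group.
                consumer.commitSync(Collections.singletonMap(tp, new OffsetAndMetadata(0L)));
            }
        }
    }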

Re: session window bug not fixed in 0.10.2.1?

2017-05-04 Thread Damian Guy
It is odd, as the person who originally reported the problem has verified that it is fixed. On Thu, 4 May 2017 at 08:31 Guozhang Wang wrote: > Ara, > > That is a bit weird, I double checked and agreed with Eno that this commit > is in both trunk and 0.10.2, so I suspect the same issue still pers

Re: session window bug not fixed in 0.10.2.1?

2017-05-04 Thread Guozhang Wang
Ara, that is a bit weird. I double checked and agreed with Eno that this commit is in both trunk and 0.10.2, so I suspect the same issue still persists in trunk; hence there might be another issue that is not fixed in 2645. Could you help verify if that is the case? In which case we can re-open https:/