Re: OOM errors

2016-11-29 Thread Guozhang Wang
Where does the "20 minutes" come from? I thought the "aggregate" operator in your stream->aggregate->filter->foreach topology is not a windowed aggregation, so the aggregate results will keep accumulating. Guozhang On Tue, Nov 29, 2016 at 8:40 PM, Jon Yeargers wrote: > "keep increasing" - w

Re: OOM errors

2016-11-29 Thread Jon Yeargers
"keep increasing" - why? It seems (to me) that the aggregates should be 20 minutes long. After that the memory should be released. Not true? On Tue, Nov 29, 2016 at 3:53 PM, Guozhang Wang wrote: > Jon, > > Note that in your "aggregate" function, if it is now windowed aggregate > then the aggreg

Behavior of Mismatched Configs in Single Cluster

2016-11-29 Thread Alan Braithwaite
Hey all, I'm curious what the behavior of mismatched configs across brokers is supposed to be throughout the kafka cluster. For example, if we don't have the setting for auto re-election of leaders enabled but if we were to enable it, what would be the behavior of the setting between the time we

Additional tutorial for Kafka

2016-11-29 Thread Ervin Varga
Hi Guys, I would like to share with you an information about my book "Creating Maintainable APIs" (visit http://www.apress.com/9781484221952). I had devoted Part III of the book to messaging APIs, where I talk about Apache Kafka (chapters 12 and 13), and how it nicely interplays with Apache Avr

Re: KStream window - end value out of bounds

2016-11-29 Thread Guozhang Wang
I agree that we should fix the "end timestamp" in windows after calling WindowedDeserializer, created https://issues.apache.org/jira/browse/KAFKA-4468 for it. As for Jon's observed issue that some records seem aggregated into incorrect windows, we are interested in the observed behavior that was u

Re: kafka 0.10.1 ProcessorTopologyTestDriver issue with .map and .groupByKey.count

2016-11-29 Thread Guozhang Wang
Hamid, Could you paste your code using KStreamDriver that does not have this issue into the JIRA as well? I suspect KStreamDriver should have the same issue and wondering why it did not. Guozhang On Tue, Nov 29, 2016 at 10:38 AM, Matthias J. Sax wrote: > Thanks! > > On 11/29/16 7:18 AM, Hamidr

Re: OOM errors

2016-11-29 Thread Guozhang Wang
Jon, Note that in your "aggregate" function, if it is now windowed aggregate then the aggregation results will keep increasing in your local state stores unless you're pretty sure that the aggregate key space is bounded. This is not only related to disk space but also memory since the current defa

Re: Disadvantages of Upgrading Kafka server without upgrading client libraries?

2016-11-29 Thread Ismael Juma
That's right, there should be no performance penalty if the broker is configured to use the older message format. The downside is that timestamps introduced in message format version 2 won't be supported in that case. Ismael On Tue, Nov 29, 2016 at 11:31 PM, Hans Jespersen wrote: > The performa

Re: Kafka Streaming

2016-11-29 Thread Guozhang Wang
Hello Mohit, I'm copy-pasting Mathieu's previous email on making Streams to work on Windows, note it was for 0.10.0.1 but I think the process should be very similar. To any who follow in my footsteps, here is my trail: 1. Upgrade to at least Kafka Streams 0.10.0.1 (currently only in

Re: Disadvantages of Upgrading Kafka server without upgrading client libraries?

2016-11-29 Thread Hans Jespersen
The performance impact of upgrading and some settings you can use to mitigate this impact when the majority of your clients are still 0.8.x are documented on the Apache Kafka website https://kafka.apache.org/documentation#upgrade_10_performance_impact -hans /** * Hans Jespersen, Principal System

Re: Disadvantages of Upgrading Kafka server without upgrading client libraries?

2016-11-29 Thread Apurva Mehta
I may be wrong, but since there have been message format changes between 0.8.2 and 0.10.1, there will be a performance penalty if the clients are not also upgraded. This is because you lose the zero-copy semantics on the server side as the messages have to be converted to the old format before bein

Re: log.dirs balance?

2016-11-29 Thread Karolis Pocius
It's difficult enough to balance kafka brokers with a single log directory, not to mention attempting to juggle multiple ones. While JBOD is great in terms of capacity, it's a pain in terms of management. After 6 months of constant manual reassignments I ended up going with RAID1+0 which is wha

Re: OOM errors

2016-11-29 Thread Jon Yeargers
App eventually got OOM-killed. Consumed 53G of swap space. Does it require a different GC? Some extra settings for the java cmd line? On Tue, Nov 29, 2016 at 12:05 PM, Jon Yeargers wrote: > I cloned/built 10.2.0-SNAPSHOT > > App hasn't been OOM-killed yet but it's up to 66% mem. > > App takes

Re: OOM errors

2016-11-29 Thread Jon Yeargers
I cloned/built 10.2.0-SNAPSHOT App hasn't been OOM-killed yet but it's up to 66% mem. App takes > 10 min to start now. Needless to say this is problematic. The 'kafka-streams' scratch space has consumed 37G and still climbing. On Tue, Nov 29, 2016 at 10:48 AM, Jon Yeargers wrote: > Does eve

Re: Two possible issues with 2.11 0.10.1.0

2016-11-29 Thread Matthias J. Sax
Your issues seems to be similar to this one: https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/confluent-platform/i5cwYhpUtx4/i79mHiD6CgAJ Can you confirm? Maybe you can try out the patch... -Matthias On 11/28/16 6:42 PM, Jon Yeargers wrote: > Ran into these two internal

log.dirs balance?

2016-11-29 Thread Tim Visher
Hello, My kafka deploy has 5 servers with 3 log disks each. Over the weekend I noticed that on 2 of the 5 servers the partitions appear to be imbalanced amongst the log.dirs. ``` kafka3 /var/lib/kafka/disk1 3 /var/lib/kafka/disk2 3 /var/lib/kafka/disk3 3 kafka5 /var/lib/kafka/disk1 3 /var/lib/kaf

Re: KStream window - end value out of bounds

2016-11-29 Thread Jon Yeargers
Seems straightforward enough: I have a 'foreach' after my windowed aggregation and I see values like these come out: (window) start: 148044420 end: 148044540 (record) epoch='1480433282000' If I have a 20 minute window with a 1 minute 'step' I will see my record come out of the aggrega

Re: KStream window - end value out of bounds

2016-11-29 Thread Eno Thereska
Let us know if we can help with that, what problems are you seeing with records in wrong windows? Eno > On 29 Nov 2016, at 19:02, Jon Yeargers wrote: > > I've been having problems with records appearing in windows that they > clearly don't belong to. Was curious whether this was related but it

Re: KStream window - end value out of bounds

2016-11-29 Thread Jon Yeargers
I've been having problems with records appearing in windows that they clearly don't belong to. Was curious whether this was related but it seems not. Bummer. On Tue, Nov 29, 2016 at 8:52 AM, Eno Thereska wrote: > Hi Jon, > > There is an optimization in org.apache.kafka.streams.kstream.internals.

Re: OOM errors

2016-11-29 Thread Jon Yeargers
Does every broker need to be updated or just my client app(s)? On Tue, Nov 29, 2016 at 10:46 AM, Matthias J. Sax wrote: > What version do you use? > > There is a memory leak in the latest version 0.10.1.0. The bug got > already fixed in trunk and 0.10.1 branch. > > There is already a discussion

Re: OOM errors

2016-11-29 Thread Matthias J. Sax
What version do you use? There is a memory leak in the latest version 0.10.1.0. The bug got already fixed in trunk and 0.10.1 branch. There is already a discussion about a 0.10.1.1 bug fix release. For now, you could build the Kafka Streams from the sources by yourself. -Matthias On 11/29/16 1

Re: kafka 0.10.1 ProcessorTopologyTestDriver issue with .map and .groupByKey.count

2016-11-29 Thread Matthias J. Sax
Thanks! On 11/29/16 7:18 AM, Hamidreza Afzali wrote: > I have created a JIRA issue: > > https://issues.apache.org/jira/browse/KAFKA-4461 > > > Hamid > signature.asc Description: OpenPGP digital signature

Re: Question about state store

2016-11-29 Thread Matthias J. Sax
A KStream-KTable join might also work. > stream.join(table, ...); -Matthias On 11/29/16 1:21 AM, Eno Thereska wrote: > Hi Simon, > > See if this helps: > > - you can create the KTable with the state store "storeAdminItem" as you > mentioned > - for each element you want to check, you want to

OOM errors

2016-11-29 Thread Jon Yeargers
My KStreams app seems to be having some memory issues. 1. I start it `java -Xmx8G -jar .jar` 2. Wait 5-10 minutes - see lots of 'org.apache.zookeeper.ClientCnxn - Got ping response for sessionid: 0xc58abee3e13 after 0ms' messages 3. When it _finally_ starts reading values it typically goes

One Kafka Broker Went Rogue

2016-11-29 Thread Thomas DeVoe
Hi, I encountered a strange issue in our kafka cluster, where randomly a single broker entered a state where it seemed to think it was the only broker in the cluster (it shrank all of its ISRs to just existing on itself). Some details about the kafka cluster: - running in an EC2 VPC on AWS - 3 no

Re: Disadvantages of Upgrading Kafka server without upgrading client libraries?

2016-11-29 Thread Thomas Becker
The only obvious downside I'm aware of is not being able to benefit from the bugfixes in the client. We are essentially doing the same thing; we upgraded the broker side to 0.10.0.0 but have yet to upgrade our clients from 0.8.1.x. On Tue, 2016-11-29 at 09:30 -0500, Tim Visher wrote: > Hi Everyone

Re: Disadvantages of Upgrading Kafka server without upgrading client libraries?

2016-11-29 Thread Gwen Shapira
Most people upgrade clients to enjoy new client features, fix bugs or improve performance. If none of these apply, no need to upgrade. Since you are upgrading to 0.10.1.0, read the upgrade docs closely - there are specific server settings regarding the message format that you need to configure a c

Re: KStream window - end value out of bounds

2016-11-29 Thread Eno Thereska
Hi Jon, There is an optimization in org.apache.kafka.streams.kstream.internals.WindowedSerializer/Deserializer where we don't encode and decode the end of the window since the user can always calculate it. So instead we return a default of Long.MAX_VALUE, which is the big number you see. In o

Relation between segment name and message offset, if any?

2016-11-29 Thread Harald Kirsch
I thought the name of a segment contains the offset of the first message in that segment, but I just saw offsets being processed that would map to a segment file that was cleaned and is listed as empty by dir. This is 0.10.* on Windows. Is there something strange going on with data still mappe

Re: Messages intermittently get lost

2016-11-29 Thread Martin Gainty
Hi Zach we dont know whats causing this intermittent problem..so lets Divide and Conquer each part of this problem individually starting at the source of the data feeds Let us eliminate any potential problem with feeds from external sources Once you verify the zookeeper feeds are 100% relia

KStream window - end value out of bounds

2016-11-29 Thread Jon Yeargers
Using the following topology: KStream kStream = kStreamBuilder.stream(stringSerde,transSerde,TOPIC); KTable, SumRecordCollector> ktAgg = kStream.groupByKey().aggregate( SumRecordCollector::new, new Aggregate(), TimeWindows.of(20 * 60 * 1000)

Re: Messages intermittently get lost

2016-11-29 Thread Zac Harvey
Does anybody have any idea why ZK might be to blame if messages sent by a Kafka producer fail to be received by a Kafka consumer? From: Zac Harvey Sent: Monday, November 28, 2016 9:07:41 AM To: users@kafka.apache.org Subject: Re: Messages intermittently get lost

Re: kafka 0.10.1 ProcessorTopologyTestDriver issue with .map and .groupByKey.count

2016-11-29 Thread Hamidreza Afzali
I have created a JIRA issue: https://issues.apache.org/jira/browse/KAFKA-4461 Hamid

question about property advertised.listeners

2016-11-29 Thread Aki Yoshida
I have a question regarding how to use advertised.listeners to expose a different port to the outside of the container where the broker is running. When using advertised.listeners to register a different port as the actual broker port, do we need a port forwarding from this configured port to the

Disadvantages of Upgrading Kafka server without upgrading client libraries?

2016-11-29 Thread Tim Visher
Hi Everyone, I have an install of Kafka 0.8.2.1 which I'm upgrading to 0.10.1.0. I see that Kafka 0.10.1.0 should be backwards compatible with client libraries written for older versions but that newer client libraries are only compatible with their version and up. My question is what disadvantag

Zookeeper stack trace

2016-11-29 Thread Jon Yeargers
2016-11-29 11:56:24,069 [main-SendThread(:2181)] DEBUG org.apache.zookeeper.ClientCnxn - Got ping response for sessionid: 0xb58abf2daaf0011 after 0ms Exception in thread "StreamThread-1" org.apache.kafka.streams.errors.TaskAssignmentException: unable to decode subscription data: version=2 at org.

Re: Question about state store

2016-11-29 Thread Eno Thereska
Hi Simon, See if this helps: - you can create the KTable with the state store "storeAdminItem" as you mentioned - for each element you want to check, you want to query that state store directly to check if that element is in the state store. You can do the querying via the Interactive Query AP

Re: Kafka 0.10.1.0 consumer group rebalance broken?

2016-11-29 Thread Radek Gruchalski
You’re most likely correct that it’s not that particular change. That commit was introduced only 6 days ago, well after releasing 0.10.1. An mvp would be helpful. Unless someone else on this list knows the issue immediately. – Best regards, Radek Gruchalski ra...@gruchalski.com On November 29, 2

Re: Kafka 0.10.1.0 consumer group rebalance broken?

2016-11-29 Thread Bart Vercammen
Well, well, mr. Gruchalski, always nice to talk to you ;-) Not sure whether it is indeed related to: https://github.com/apache/kafka/commit/4b003d8bcfffded55a00b8ecc9eed8 eb373fb6c7#diff-d2461f78377b588c4f7e2fe8223d5679R633 But I'll have a look and will try to create a test scenario for this that