Re: Kafka Streams topology does not replay correctly

2018-01-16 Thread Matthias J. Sax
>>> The KStream has incoming events, and #transform() will >>> let me mount the store and use it how I please. Within an application >>> instance, any other KStream#transform()s using the same store will see the >>> same data in real time. That sounds basically correct. But you don't know the orde

[HELP] Guidelines/tools to test kafka performance with application layer involved

2018-01-16 Thread Pritam Kadam
Hello, We are using kafka for pub sub and want test performance of entire system. Is there any tool readily available in kafka world which can simulate multiple publishers and subscribers tp measure latency and throughput considering custom application layer? Any guidelines around this would be

Re: Kafka Streams topology does not replay correctly

2018-01-16 Thread Dmitry Minkovsky
I meant “Thanks, yes I will try replacing...” вт, 16 янв. 2018 г. в 22:12, Dmitry Minkovsky : > Thanks, yes try replacing the KStream-KTable joins with > KStream#transform()s and a store. Not sure why you mean I’d need to buffer > multiple records. The KStream has incoming events, and #transform(

Re: Kafka Streams topology does not replay correctly

2018-01-16 Thread Dmitry Minkovsky
Thanks, yes try replacing the KStream-KTable joins with KStream#transform()s and a store. Not sure why you mean I’d need to buffer multiple records. The KStream has incoming events, and #transform() will let me mount the store and use it how I please. Within an application instance, any other KStre

Re: Kafka Streams topology does not replay correctly

2018-01-16 Thread Matthias J. Sax
You have more flexibility of course and thus can get better results. But your code must be able to buffer multiple records from the KTable and KStream input and also store the corresponding timestamps to perform the join correctly. It's not trivial but also also not rocket-science. If we need stro

Re: [DISCUSS] KIP-247: Add public test utils for Kafka Streams

2018-01-16 Thread Matthias J. Sax
Colin, the TopologyTestDriver does not connect to any broker and simulates processing of single-partitioned input topics purely in-memory (the driver is basically a mock for a StreamThread). This is sufficient to test basic business logic. For more complex topologies that are actually divided into

Re: Kafka Streams topology does not replay correctly

2018-01-16 Thread Dmitry Minkovsky
Right now I am thinking of re-writing anything that has these problematic KStream/KTable joins as KStream#transform() wherein the state store is manually used. Does that makes sense as an option for me? -Dmitry On Tue, Jan 16, 2018 at 6:08 PM, Dmitry Minkovsky wrote: > Earlier today I posted th

I would like to increase the bandwidth by binding each broker to each network card in one machine. Is this feasible?

2018-01-16 Thread ??????
I would like to increase the bandwidth by binding each broker to each network card in one machine. Is this feasible? As it said in title,What should I do to achieve this effect? Does anyone know how to achieve it?

Fwd: Re: [DISCUSS] KIP-247: Add public test utils for Kafka Streams

2018-01-16 Thread Matthias J. Sax
Forgot dev-list... Forwarded Message Subject: Re: [DISCUSS] KIP-247: Add public test utils for Kafka Streams Date: Tue, 16 Jan 2018 13:56:38 -0800 From: Matthias J. Sax Organization: Confluent Inc To: users@kafka.apache.org Thanks a lot for the comments. @Guozhang: I updated

Kafka Streams topology does not replay correctly

2018-01-16 Thread Dmitry Minkovsky
Earlier today I posted this question to SO : > I have a topology that looks like this: KTable users = topology.table(USERS, Consumed.with(byteStringSerde, userSerde), Materialized.as(USERS));

Re: __consumer_offsets too big

2018-01-16 Thread Shravan R
BTW, I see log segments as old as last year and offsets.retention.minutes is set to 4 days. Any reason why it may have not been deleted? -rw-r--r-- 1 kafka kafka 104857532 Apr 5 2017 .log -rw-r--r-- 1 kafka kafka 104857564 Apr 6 2017 01219197.log -rw-r--r-- 1 ka

Re: [DISCUSS] KIP-247: Add public test utils for Kafka Streams

2018-01-16 Thread Matthias J. Sax
Thanks a lot for the comments. @Guozhang: I updated the KIP accordingly. With regard to potential client test-utils, I agree, but not sure how to resolve it. I guess, we just need to move the class for this case later on. (One reason to annotate all classes with @Evolving) @Bill: The new artifact

Re: Intermittent NoLeaderForPartition exceptions

2018-01-16 Thread R Krishna
For us, it was always network blips between Kafka and ZK. On Tue, Jan 16, 2018 at 11:00 AM, Atul Mohan wrote: > Hello, > We have 5 Kafka brokers and have a service that continuously send events to > partitions across these 5 brokers. The configuration works fine but every > 90 minutes ~ 120 minu

Intermittent NoLeaderForPartition exceptions

2018-01-16 Thread Atul Mohan
Hello, We have 5 Kafka brokers and have a service that continuously send events to partitions across these 5 brokers. The configuration works fine but every 90 minutes ~ 120 minutes, we lose several events due to the following exception: org.apache.kafka.common.errors.NotLeaderForPartitionExceptio

Re: [DISCUSS] KIP-247: Add public test utils for Kafka Streams

2018-01-16 Thread Jeff Klukas
>From what I can tell, global state stores are managed separately from other state stores and are accessed via different methods. Do the proposed methods on TopologyTestDriver (such as getStateStore) cover global stores? If not, can we add an interface for accessing and testing global stores in th

Re: [DISCUSS] KIP-247: Add public test utils for Kafka Streams

2018-01-16 Thread Bill Bejeck
Thanks for the KIP! One meta question: Will users that are currently using the existing testing code with the "classifier:test" approach: 1) have access to the new testing utilities without updating the gradle.build file 2) can they continue to use the current testing code with the cl

Re: __consumer_offsets too big

2018-01-16 Thread Shravan R
I looked into it. I played with log.cleaner.dedupe.buffer.size between 256MB to 2GB while keeping log.cleaner.threads=1 but that did not help me. I helped me to recover from __consumer_offsets-33 but got into a similar exception on another partition. There no lags on our system and that is not a co

Re: How can I repartition/rebalance topics processed by a Kafka Streams topology?

2018-01-16 Thread Dmitry Minkovsky
> Thus, only left/outer KStream-KStream and KStream-KTable join have some runtime dependencies. For more details about join, check out this blog post: https://www.confluent.io/blog/crossing-streams-joins-apache-kafka/ So I am trying to reprocess and topology and seem to have encountered this. I po

Re: [DISCUSS] KIP-247: Add public test utils for Kafka Streams

2018-01-16 Thread Colin McCabe
Thanks, Matthias, this looks great. It seems like these APIs could either be used against mock objects, or against real brokers running in the same process. Is there a way for the user to select which they want when using the API? Sorry if it's in the KIP and I missed it. cheers, Colin On

Re: [DISCUSS] KIP-247: Add public test utils for Kafka Streams

2018-01-16 Thread Guozhang Wang
Thanks Matthias, I made a pass over the wiki and left some comments; I see you have addressed most of them. Here are a few more: 1. "TopologyTestDriver#process()": how about rename it to "pipeInput" or "sendInput"? 2. For "ConsumerRecordFactory" constructor where "startTimestampMs" is not specifi

Re: __consumer_offsets too big

2018-01-16 Thread naresh Goud
Can you check if jira KAFKA-3894 helps? Thank you, Naresh On Tue, Jan 16, 2018 at 10:28 AM Shravan R wrote: > We are running Kafka-0.9 and I am seeing large __consumer_offsets on some > of the partitions of the order of 100GB or more. I see some of the log and > index files are more than a yea

__consumer_offsets too big

2018-01-16 Thread Shravan R
We are running Kafka-0.9 and I am seeing large __consumer_offsets on some of the partitions of the order of 100GB or more. I see some of the log and index files are more than a year old. I see the following properties that are of interest. offsets.retention.minutes=5769 (4 Days) log.cleaner.dedup

RE: what are common ways to convert info on a web site into a log entry?

2018-01-16 Thread Tauzell, Dave
I would have a cron that runs every day but somehow tracks if it has pulled data for the month. If it has it just does nothing. This way if you have some sort of failure one day (website is down, etc ...) it would pull data the next day. You could possibly use Kaka itself to store the last mo

Re: one machine that have four network.....

2018-01-16 Thread Jakub Scholz
To be honest, I'm not familiar enough with the network configuration etc. But the advice from Svante looks like it might give some idea how to fix it. Regards Jakub On Tue, Jan 16, 2018 at 2:08 PM, 猪少爷 wrote: > Jakhub: I would like to increase the bandwidth by binding each broke to > each netw

Re: Upgrading Kafka from Version 0.10.2 to 1.0.0

2018-01-16 Thread Tim Visher
On Tue, Jan 9, 2018 at 4:50 PM, ZigSphere Tech wrote: > Is it easy to upgrade from Kafka version 0.10.2 to 1.0.0 or do I need to > upgrade to version 0.11.0 first? Anything to expect? > We just did (almost) exactly this upgrade. 2.11-0.10.1.0 to 2.11-1.0.0. The main issue we faced was the broke

??????one machine that have four network.....

2018-01-16 Thread ??????
Jakhub: I would like to increase the bandwidth by binding each broke to each network card in one machine. Is this feasible? -- -- ??: "??";; : 2018??1??16??(??) 10:14 ??: "users"; : one machine that have four network...

consumer.seekToBeginning() will disable "enable.auto.commit"

2018-01-16 Thread 杨光
Hi All, I'm using kafka Manual Partition Assignment api to read kafka topic. I found that if i use the "seekToBeginning" method ,the consumer will not auto commit offset to kafka even if the "enable.auto.commit" is "true". My code like next: Properties props = new Properties(); props.put("bo

Re: one machine that have four network.....

2018-01-16 Thread Svante Karlsson
Even if you bind your socket to an ip of a specific card, when the packet is about to leave your host it hits the routing table and gets routed through the interface with least cost (arbitrary but static since all interfaces have same cost since they are on the same subnet) thus you will not reach

Re: one machine that have four network.....

2018-01-16 Thread Jakub Scholz
Maybe a stupid question ... but if you just want to create a setup with 3 zookeepers and 3 brokers on a single machine you just need to use different port numbers. You do not need separate network interfaces. What are you trying to achieve with the different network interfaces? Regards Jakub On T