Re: Running cluster of stream processing application

2016-12-08 Thread Sachin Mittal
Hi, I followed the document and I have a few questions. Say I have a single-partition input key topic, and say I run 2 streams applications from machine1 and machine2. Both applications have the same application id and identical code. Say topic1 has messages like (k1, v11) (k1, v12) (k1, v13) (k2,
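
For context, a minimal sketch of what such a deployment might look like with the 0.10.1 Streams API (the application id, broker address, and placeholder processing are illustrative, not from the thread). With the same application.id, the instances on machine1 and machine2 join the same group; since topic1 has only one partition there is only one stream task, so only one of the two instances will actually process it.

    import java.util.Properties;
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.KStream;
    import org.apache.kafka.streams.kstream.KStreamBuilder;

    public class StreamsInstance {
        public static void main(String[] args) {
            Properties props = new Properties();
            // Identical on machine1 and machine2: same application.id, same code.
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");
            props.put(StreamsConfig.KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
            props.put(StreamsConfig.VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());

            KStreamBuilder builder = new KStreamBuilder();
            KStream<String, String> source = builder.stream("topic1");
            source.to("topic1-out");   // placeholder processing

            // Tasks (one per input partition) are distributed across all running instances.
            new KafkaStreams(builder, props).start();
        }
    }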

Re: log.retention.hours not working?

2016-12-08 Thread Rodrigo Sandoval
This is what Todd said: "Retention is going to be based on a combination of both the retention and segment size settings (as a side note, it's recommended to use log.retention.ms and log.segment.ms, not the hours config. That's there for legacy reasons, but the ms configs are more consistent). As

Re: log.retention.hours not working?

2016-12-08 Thread Rodrigo Sandoval
Your understanding of segment.bytes and retention.ms is correct. But Todd Palino said it is only after the segment size has been reached, that is, when the segment is "closed", PLUS all messages within the segment that was closed are older than the retention policy defined (in this case retention.ms)
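
Put differently, a segment only becomes a deletion candidate once it has been rolled (closed) and even its newest message has aged past the retention setting. A rough sketch of that rule in Java (illustrative only, not Kafka's actual cleanup code):

    // Names and structure are illustrative; this just restates the rule described above.
    static boolean eligibleForDeletion(boolean segmentClosed,
                                       long newestMessageTimestampMs,
                                       long nowMs,
                                       long retentionMs) {
        // Closed segment AND all of its messages older than retention.ms
        return segmentClosed && (nowMs - newestMessageTimestampMs) > retentionMs;
    }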

Re: Kafka windowed table not aggregating correctly

2016-12-08 Thread Sachin Mittal
Hi, Right now, in order to circumvent this problem, I am using a timestamp whose values increase by a few ms as and when I get new records. So let's say I have records in order A -> lower limit TS + 1 sec B -> lower limit TS + 3 sec C -> lower limit TS + 5 sec .. Z -> upper limit TS - 1 sec Now say I

Re: log.retention.hours not working?

2016-12-08 Thread Sachin Mittal
I think segment.bytes defines the size of a single log file before creating a new one. retention.ms defines the number of ms to wait on a log file before deleting it. So it is working as defined in the docs. On Fri, Dec 9, 2016 at 2:42 AM, Rodrigo Sandoval wrote: > How is

Re: Running cluster of stream processing application

2016-12-08 Thread Mathieu Fenniak
Hi Sachin, Some quick answers, and a link to some documentation to read more: - If you restart the application, it will start from the point it crashed (possibly reprocessing a small window of records). - You can run more than one instance of the application. They'll coordinate by virtue of

Running cluster of stream processing application

2016-12-08 Thread Sachin Mittal
Hi All, We were able to run a stream processing application against a fairly decent load of messages in a production environment. To make the system robust, say the stream processing application crashes: is there a way to make it auto-start from the point where it crashed? Also, is there any concept

Running mirror maker between two different version of kafka

2016-12-08 Thread Vijayanand Rengarajan
Team, I am trying to mirror a few topics from cluster A (version 0.8.1) to cluster B (version 0.10.1.0), but due to version incompatibility I am getting the below error. If any of you have had similar issues, please share the workaround/solution to this issue. I am running the kafka mirroring in

Re: log.retention.hours not working?

2016-12-08 Thread Rodrigo Sandoval Tejerina
How is it that when the segment size is reached, plus every single message inside the segment is older than the retention time, then the segment will be deleted? I have been playing with Kafka and I have the following: bin/kafka-topics.sh --zookeeper localhost:2181 --alter --topic topic1

Kafka supported on AIX OS?

2016-12-08 Thread Jayanna, Gautham
Hi, We are trying to determine if we can run Kafka on AIX OS; however, I could not find definitive information on the wiki page or by searching on the internet. I would greatly appreciate it if you could let us know whether we can run Kafka on AIX, or if there are plans to support AIX in a future release.

Re: [VOTE] 0.10.1.1 RC0

2016-12-08 Thread Bernard Leach
The Scala 2.12 artifacts aren't showing up; any chance of publishing them? > On 9 Dec 2016, at 07:57, Vahid S Hashemian wrote: > > +1 > > Build and quickstart worked fine on Ubuntu, Mac, Windows 32 and 64 bit. > > Thanks for running the release. > > Regards, >

The connection between kafka and zookeeper is often closed by zookeeper, leading to NotLeaderForPartitionException: This server is not the leader for that topic-partition.

2016-12-08 Thread Jiecxy
Hi guys, Situation: 3 nodes, each with 32G memory, 24 CPU cores, 1T hd. 3 brokers on the 3 nodes, and 3 zookeepers on these 3 nodes too; all the properties are default. Start the zookeeper cluster and kafka cluster. Create a topic (replication factor 3, 6 partitions), like below: bin/kafka-topics.sh

kafka streams passes org.apache.kafka.streams.kstream.internals.Change to my app!!

2016-12-08 Thread Ara Ebrahimi
Hi, Once in a while and quite randomly this happens, but it does happen every few hundred thousand messages: 2016-12-03 11:48:05 ERROR StreamThread:249 - stream-thread [StreamThread-4] Streams application error during processing: java.lang.ClassCastException:

Re: Creating a connector with Kafka Connect Distributed returning 500 error

2016-12-08 Thread Phillip Mann
Hello Konstantine and community, I was able to fix this problem by using the latest version of Confluent Platform. I was running CP 3.0.1 but upgraded to 3.1.1 and my worker and connector behaved as expected. Thanks! Phillip From: Konstantine Karantasis Date:

controlling memory growth when aggregating

2016-12-08 Thread Jon Yeargers
I'm working with JSON data that has an array member. I'm aggregating values into this using minute-long windows. I ran the app for ~10 minutes and watched it consume 40% of the memory on a box with 32G. It was still growing when I stopped it. At this point it had created ~800 values each of which
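
One knob that is often relevant to aggregation memory in 0.10.1 is the Streams record cache; a hedged sketch, with an illustrative application id and cache size:

    Properties props = new Properties();
    props.put(StreamsConfig.APPLICATION_ID_CONFIG, "minute-agg");          // illustrative
    props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
    // Total bytes of record cache across all threads of this instance;
    // 10 MB is only an example value, not a recommendation.
    props.put(StreamsConfig.CACHE_MAX_BYTES_BUFFERING_CONFIG, 10 * 1024 * 1024L);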

Re: [VOTE] 0.10.1.1 RC0

2016-12-08 Thread Vahid S Hashemian
+1 Build and quickstart worked fine on Ubuntu, Mac, Windows 32 and 64 bit. Thanks for running the release. Regards, --Vahid From: Guozhang Wang To: "users@kafka.apache.org" , "d...@kafka.apache.org" ,

Re: Kafka windowed table not aggregating correctly

2016-12-08 Thread Guozhang Wang
Hello Sachin, I am with you that ideally the windowing segmentation implementation should be totally abstracted away from users, but today it is a bit confusing to understand. I filed a JIRA some time ago to improve on this front: https://issues.apache.org/jira/browse/KAFKA-3596 So to your example,
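
For reference, a hedged sketch of how a windowed aggregation can make its retention explicit in the 0.10.1 API (the store name, durations, and the assumed KStream<String, String> named input are illustrative); until() bounds how long old windows, and therefore their underlying segments, are kept around:

    // 1-hour windows, retained for one day before they may be dropped.
    KTable<Windowed<String>, Long> hourlyCounts = input
        .groupByKey()
        .count(TimeWindows.of(60 * 60 * 1000L)
                          .until(24 * 60 * 60 * 1000L),
               "hourly-counts");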

Re: KafkaStreams metadata - enum keys?

2016-12-08 Thread Damian Guy
Yes, it could be an issue when you initially start up. If it is the first time you run the app and there are internal topics created by Kafka Streams, there will be rebalances. However, it depends on your topology. How are you trying to access the state store? Thanks, Damian On Thu, 8 Dec 2016 at

Re: KafkaStreams metadata - enum keys?

2016-12-08 Thread Jon Yeargers
I'm only running one consumer instance, so would rebalancing / wrong host be an issue? On Thu, Dec 8, 2016 at 7:31 AM, Damian Guy wrote: > Hi Jon, > > How are you trying to access the store? > > That exception is thrown in a few circumstances: > 1. KafkaStreams hasn't

Re: Q about doc of consumer

2016-12-08 Thread Vahid S Hashemian
Ryan, The correct consumer command in the latest doc (http://kafka.apache.org/quickstart#quickstart_consume) is bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning You used the "--zookeeper" parameter, which implies using the old consumer, in which

kafka commands taking a long time

2016-12-08 Thread Stephen Cresswell
I followed the quickstart instructions at https://kafka.apache.org/quickstart and everything seems to be working ok, except that commands take a long time to run, e.g. $ time bin/kafka-topics.sh --list --zookeeper localhost:2181 real 0m11.751s user 0m1.540s sys 0m0.273s The zookeeper logging

Configuration for low latency and low cpu utilization? java/librdkafka

2016-12-08 Thread Niklas Ström
Use case scenario: We want to have fairly low latency, say below 20 ms, and we want to be able to run a few hundred processes (on one machine) both producing and consuming a handful of topics. The throughput is not high, let's say on average 10 messages per second for each process. Most messages
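
For the Java clients, a hedged sketch of the settings usually involved in this latency/CPU trade-off; the values are illustrative starting points, not recommendations:

    Properties producerProps = new Properties();
    producerProps.put("bootstrap.servers", "localhost:9092");
    producerProps.put("linger.ms", "0");    // send immediately: lower latency, more (smaller) requests
    producerProps.put("acks", "1");         // fewer round trips than acks=all
    producerProps.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
    producerProps.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

    Properties consumerProps = new Properties();
    consumerProps.put("bootstrap.servers", "localhost:9092");
    consumerProps.put("group.id", "low-latency-app");     // illustrative
    consumerProps.put("fetch.min.bytes", "1");            // don't wait to batch fetches
    consumerProps.put("fetch.max.wait.ms", "20");         // cap broker-side wait; lower means more polling
    consumerProps.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    consumerProps.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");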

Re: KafkaStreams metadata - enum keys?

2016-12-08 Thread Damian Guy
Hi Jon, How are you trying to access the store? That exception is thrown in a few circumstances: 1. KafkaStreams hasn't initialized or is re-initializing due to a rebalance. This can occur for a number of reasons, e.g., new topics/partitions being added to the broker (including streams internal
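
When the store is not queryable yet for one of those reasons, a common pattern is simply to retry the store() call until the rebalance has finished; a minimal sketch (store name and value types are illustrative):

    // Uses org.apache.kafka.streams.state.* and org.apache.kafka.streams.errors.InvalidStateStoreException.
    static ReadOnlyKeyValueStore<String, Long> waitForStore(KafkaStreams streams, String storeName)
            throws InterruptedException {
        while (true) {
            try {
                // Throws InvalidStateStoreException while the instance is (re)initializing.
                return streams.store(storeName, QueryableStoreTypes.<String, Long>keyValueStore());
            } catch (InvalidStateStoreException notReadyYet) {
                Thread.sleep(100);   // e.g. rebalance in progress; try again shortly
            }
        }
    }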

Re: kafka cluster network bandwidth is too high

2016-12-08 Thread Stephen Powis
Yeah, we have a 7-node cluster with ~200 topics and see sustained 100Mbps going between the nodes. Very bandwidth-hungry :p On Thu, Dec 8, 2016 at 1:51 AM, Matthias J. Sax wrote: > You cannot send images over the mailing list. They get automatically > removed. > > On

Upgrading from 0.10.0.1 to 0.10.1.0

2016-12-08 Thread Hagen Rother
Hi, I am testing an upgrade and I am stuck on the mirror maker. - The new consumer doesn't like the old brokers. - The old consumer comes up, but does nothing and throws a java.net.SocketTimeoutException after a while. What's the correct upgrade strategy when mirroring is used? Thanks! Hagen

Re: Kafka and zookeeper stores and mesos env

2016-12-08 Thread Mike Marzo
Understood, and I am looking at that bit, but I would still like to know the answer. On Thu, Dec 8, 2016 at 8:22 AM, Asaf Mesika wrote: > Off-question a bit - Using the Kafka Mesos framework should save you from > handling those questions: https://github.com/mesos/kafka >

Re: Kafka and zookeeper stores and mesos env

2016-12-08 Thread Asaf Mesika
Off-question a bit - Using the Kafka Mesos framework should save you from handling those questions: https://github.com/mesos/kafka On Thu, Dec 8, 2016 at 2:33 PM Mike Marzo wrote: If I'm running a 5-node zk cluster and a 3-node kafka cluster in docker on a

Kafka and zookeeper stores and mesos env

2016-12-08 Thread Mike Marzo
If I'm running a 5-node zk cluster and a 3-node kafka cluster in docker on a mesos/marathon environment, where my zk and broker nodes are all leveraging local disk on the hosts they are running on, is there any value to the local data being preserved across restarts? In other words, when a

Re: KafkaStreams metadata - enum keys?

2016-12-08 Thread Jon Yeargers
Tried calling that - got this exception (FWIW - there isn't any other instance). The state store value comes from groupByKey().aggregate(LogLine::new, new aggregate(), TimeWindows.of(60 * 60 * 1000L), collectorSerde, "minute_agg_stream"); 2016-12-08 11:33:50,924 [qtp1318180415-18] DEBUG

Re: KafkaStreams metadata - enum keys?

2016-12-08 Thread Jon Yeargers
Maybe the 'rangeForKeyValueStore' function from the sample? On Thu, Dec 8, 2016 at 2:55 AM, Jon Yeargers wrote: > I see functions that require knowing a key name, but in the interests of > partitioning we're using fairly complex key structures (i.e., non-obvious to > an

KafkaStreams metadata - enum keys?

2016-12-08 Thread Jon Yeargers
I see functions that require knowing a key name, but in the interests of partitioning we're using fairly complex key structures (i.e., non-obvious to an external function). Is there a method / process for enumerating keys?
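
For a plain key-value store, the interactive-query API added in 0.10.1 can enumerate entries without knowing the keys up front; a hedged sketch (store name and types are illustrative). Note this only covers the partitions hosted by the local instance, and a windowed store still needs a key for fetch(key, from, to).

    ReadOnlyKeyValueStore<String, Long> store =
            streams.store("my-store", QueryableStoreTypes.<String, Long>keyValueStore());
    try (KeyValueIterator<String, Long> it = store.all()) {
        while (it.hasNext()) {
            KeyValue<String, Long> entry = it.next();
            System.out.println(entry.key + " = " + entry.value);
        }
    }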

Re: Consumer poll - no results

2016-12-08 Thread Harald Kirsch
auto.offset.reset is honoured if the consumer group has not committed offsets yet, or if the offsets have expired (I think this is controlled by offsets.retention.*). Otherwise, the last committed offsets should be read for that group. Harald. On 07.12.2016 18:48, Mohit Anchlia wrote: Is auto.offset.reset
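
A small sketch of where that setting lives on the new consumer (group and topic names are illustrative); as described above, it only takes effect when the group has no valid committed offset:

    // Uses org.apache.kafka.clients.consumer.KafkaConsumer and java.util.Collections.
    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092");
    props.put("group.id", "my-group");   // illustrative
    // Used only when this group has no committed offset for a partition
    // (first run, or committed offsets expired): earliest | latest | none
    props.put("auto.offset.reset", "earliest");
    props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

    KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
    consumer.subscribe(Collections.singletonList("test"));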

Re: Problem with multiple joins in one topology

2016-12-08 Thread Matthias J. Sax
Hi Brian, Sorry for your headache. We are aware that the current join semantics in Streams are not straightforward. We already reworked those in trunk, and this change will be included in the next release, 0.10.2. Please build from trunk and let us know if this resolves your issue. For details, see