Re: Adding topics to KafkaStreams after ingestion has been started?

2016-12-02 Thread Ali Akhtar
Thank you very much. Last question: is it safe to do this from within a callback processing that topic, once it reaches the last message? (It keeps a count of how many messages have been processed vs. how many remain.) On 3 Dec 2016 11:36 a.m., "Matthias J. Sax" wrote: > You can use

Re: Adding topics to KafkaStreams after ingestion has been started?

2016-12-02 Thread Matthias J. Sax
You can use TopicCommand to delete a topic from within Java: final TopicCommand.TopicCommandOptions commandOptions = new TopicCommand.TopicCommandOptions(new String[]{ "--zookeeper", "zookeeperHost:2181", "--delete", "--topic", "TOPIC-TO-BE-DELETED"});
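A minimal, compilable sketch of this approach, assuming a Kafka 0.10.x classpath where kafka.admin.TopicCommand and kafka.utils.ZkUtils are usable from Java; the ZooKeeper address and topic name are placeholders, and the broker must have delete.topic.enable=true for the deletion to take effect:

    import kafka.admin.TopicCommand;
    import kafka.utils.ZkUtils;

    public class DeleteTopicExample {
        public static void main(String[] args) {
            // Placeholder ZooKeeper connection string and topic name.
            TopicCommand.TopicCommandOptions commandOptions =
                    new TopicCommand.TopicCommandOptions(new String[]{
                            "--zookeeper", "zookeeperHost:2181",
                            "--delete",
                            "--topic", "TOPIC-TO-BE-DELETED"});
            ZkUtils zkUtils = ZkUtils.apply("zookeeperHost:2181", 30000, 30000, false);
            try {
                // Marks the topic for deletion; the brokers perform the actual delete.
                TopicCommand.deleteTopic(zkUtils, commandOptions);
            } finally {
                zkUtils.close();
            }
        }
    }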

Re: Kafka windowed table not aggregating correctly

2016-12-02 Thread Sachin Mittal
Hi, now it all makes sense. The field I was using for the timestamp extractor contains timestamps that span more than a day, and it worked with wall-clock time because, over a short run, the timestamps stayed within a day's range. I wanted to understand one thing: say I have a timestamp

Re: Adding topics to KafkaStreams after ingestion has been started?

2016-12-02 Thread Ali Akhtar
Is there a way to delete the processed topics via Streams or the Java driver, or only through the bash script? On 3 Dec 2016 5:27 a.m., "Matthias J. Sax" wrote: > If you keep old topics that are completely processed, there would be > increasing overhead, because Streams

Re: Building kafka from source

2016-12-02 Thread Sachin Mittal
Hi, I figured out the first step: I am able to get the wrapper by simply running C:\Users\Sachin\.gradle\wrapper\dists\gradle-2.9-bin\ebaspjjvvkuki3ldbldx7hexd\gradle-2.9\bin\gradle. But when I build the jar using D:\github\kafka>gradlew.bat jar I get this error: FAILURE: Build failed

Building kafka from source

2016-12-02 Thread Sachin Mittal
I think I may have asked this question before, but for a quick response I am posting here first, so my apologies. I am following the guide on git and I don't understand this first step: "First bootstrap and download the wrapper: cd kafka_source_dir; gradle". I suppose kafka_source_dir is the root dir of

Re: Adding topics to KafkaStreams after ingestion has been started?

2016-12-02 Thread Matthias J. Sax
If you keep old topics that are completely processed, there would be increasing overhead, because Streams would try to read from those topics as long as they exist. Thus, more fetch requests will be sent to more topics over time, while most fetch requests will return without any new data (as

Re: Tracking when a batch of messages has arrived?

2016-12-02 Thread Ali Akhtar
Hey Apurva, I am including the batch_id inside the messages. Could you give me an example of what you mean by custom control messages with a control topic please? On Sat, Dec 3, 2016 at 12:35 AM, Apurva Mehta wrote: > That should work, though it sounds like you may be

Re: Adding topics to KafkaStreams after ingestion has been started?

2016-12-02 Thread Ali Akhtar
Hey Matthias, I have a scenario where I need to batch a group of messages together. I'm considering creating a new topic for each batch that arrives, i.e., batch_. Each batch_ topic will have a finite number of messages, and then it will remain empty. Essentially these will be throwaway

Re: KTables + aggregation - how to make lots of ppl happy

2016-12-02 Thread Guozhang Wang
Thanks Jon for bringing this up. We have seen the community discussing explicit triggers for Kafka Streams, and some of them have been covered in KIP-63: https://cwiki.apache.org/confluence/display/KAFKA/KIP-63%3A+Unify+store+and+downstream+caching+in+streams Guozhang On Thu, Dec 1,

Re: Kafka windowed table not aggregating correctly

2016-12-02 Thread Guozhang Wang
Sachin, One thing to note is that retention of the windowed stores works by keeping multiple segments of the store, where each segment covers a time range that can potentially span multiple windows; if a new window needs to be created that is further from the oldest segment's time range +
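A hedged sketch of how the window retention that bounds those segments is typically configured, assuming the 0.10.1.0 Streams DSL; the topic, store name, window size, and retention below are illustrative only:

    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.kstream.KStream;
    import org.apache.kafka.streams.kstream.KStreamBuilder;
    import org.apache.kafka.streams.kstream.KTable;
    import org.apache.kafka.streams.kstream.TimeWindows;
    import org.apache.kafka.streams.kstream.Windowed;

    public class WindowedSumTopology {

        public static KStreamBuilder build() {
            KStreamBuilder builder = new KStreamBuilder();
            KStream<String, Long> values =
                    builder.stream(Serdes.String(), Serdes.Long(), "input-topic"); // placeholder topic

            // 1-hour windows retained for 7 days: the windowed store keeps only the
            // segments covering roughly the last 7 days, so records whose extracted
            // timestamps fall outside that range are dropped rather than aggregated.
            KTable<Windowed<String>, Long> sums = values
                    .groupByKey(Serdes.String(), Serdes.Long())
                    .reduce((a, b) -> a + b,
                            TimeWindows.of(60 * 60 * 1000L).until(7L * 24 * 60 * 60 * 1000L),
                            "windowed-sum-store");

            return builder;
        }
    }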

Re: HOW TO GET KAFKA CURRENT TOPIC GROUP OFFSET

2016-12-02 Thread David Garcia
https://kafka.apache.org/documentation#operations It’s in there somewhere… ;-) On 11/28/16, 7:34 PM, "西风瘦" wrote: HI! WAIT YOUR ANSWER

Re: Initializing StateStores takes *really* long for large datasets

2016-12-02 Thread Guozhang Wang
Before we have a single-knob memory management feature, I'd like to propose reducing Streams' default config values for RocksDB caching and memory block size. For example, I remember Henry has done some fine tuning of the RocksDB config for his use case:
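A hedged sketch of such fine tuning via a custom RocksDBConfigSetter, assuming Kafka Streams 0.10.1+ where this hook exists; the sizes below are illustrative, not recommended defaults:

    import java.util.Map;
    import org.apache.kafka.streams.state.RocksDBConfigSetter;
    import org.rocksdb.BlockBasedTableConfig;
    import org.rocksdb.Options;

    public class SmallCacheRocksDBConfig implements RocksDBConfigSetter {
        @Override
        public void setConfig(String storeName, Options options, Map<String, Object> configs) {
            BlockBasedTableConfig tableConfig = new BlockBasedTableConfig();
            tableConfig.setBlockCacheSize(16 * 1024 * 1024L); // illustrative: 16 MB block cache
            tableConfig.setBlockSize(16 * 1024L);             // illustrative: 16 KB blocks
            options.setTableFormatConfig(tableConfig);
            options.setWriteBufferSize(8 * 1024 * 1024L);     // illustrative: 8 MB memtable
        }
    }

It would be registered on the application's properties via StreamsConfig.ROCKSDB_CONFIG_SETTER_CLASS_CONFIG.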

Priority queue using Kafka Streams' KStream

2016-12-02 Thread Ivan Ilichev
Hi guys, Got a quick question on how one would go about implementing a priority queue using Kafka Streams DSL API. This is probably closely related to windowed sort but I haven't found an example of how that can be accomplished. Regards, -Ivan

Re: Kafka windowed table not aggregating correctly

2016-12-02 Thread Matthias J. Sax
The extractor is used in org.apache.kafka.streams.processor.internals.RecordQueue#addRawRecords(). Let us know if you can resolve the problem or need more help. -Matthias On 12/2/16 11:46 AM, Sachin Mittal wrote: > https://github.com/SOHU-Co/kafka-node/ this is the node js client i am >

Re: rebalancing - confusing choices

2016-12-02 Thread Jon Yeargers
This deteriorated somewhat rapidly. First it was using only 3 brokers (of the five), then 2, and finally only one broker/thread was doing all the work. Then the topic/group stopped responding at all. Now if I call 'kafka-consumer-groups --new-consumer --describe' all it says is 'Consumer group

Re: Suggestions

2016-12-02 Thread Apurva Mehta
> then, the strange thing is that the consumer on the second topic stays in poll forever, *without receiving any message*. How long is 'forever'? Did you wait more than 5 minutes? On Fri, Dec 2, 2016 at 2:55 AM, Vincenzo D'Amore wrote: > Hi Kafka Gurus :) > > I'm

Re: Kafka windowed table not aggregating correctly

2016-12-02 Thread Sachin Mittal
https://github.com/SOHU-Co/kafka-node/ is the Node.js client I am using. The version is 0.5.x. Can you please tell me what code in Streams calls the timestamp extractor? I can look there to see if there is any issue. Again, the issue happens only when producing the messages using the producer that is

Re: Tracking when a batch of messages has arrived?

2016-12-02 Thread Apurva Mehta
That should work, though it sounds like you may be interested in: https://cwiki.apache.org/confluence/display/KAFKA/KIP-98+-+Exactly+Once+Delivery+and+Transactional+Messaging If you can include the 'batch_id' inside your messages, and define custom control messages with a control topic, then you
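A hedged sketch of the batch_id-plus-control-topic pattern described above, using hypothetical topic names ("events", "batch-control") and a String-encoded control message; this is one possible interpretation of the suggestion, not an API provided by Kafka itself:

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class BatchProducerSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092"); // placeholder
            props.put("key.serializer", StringSerializer.class.getName());
            props.put("value.serializer", StringSerializer.class.getName());

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                String batchId = "batch-42"; // hypothetical id embedded in every message
                int count = 100;
                for (int i = 0; i < count; i++) {
                    // Each data message carries the batch id so the consumer can group by it.
                    producer.send(new ProducerRecord<>("events", batchId, "payload-" + i));
                }
                producer.flush();
                // The control message announces that the batch is complete and how many
                // records it contains; the consumer counts arrivals per batch id.
                producer.send(new ProducerRecord<>("batch-control", batchId, "COMPLETE:" + count));
            }
        }
    }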

Re: Kafka windowed table not aggregating correctly

2016-12-02 Thread Matthias J. Sax
I am not sure what is happening. That's why it would be good to have a toy example to reproduce the issue. What do you mean by "Kafka node version 0.5"? -Matthias On 12/2/16 11:30 AM, Sachin Mittal wrote: > I can provide with the data but data does not seem to be the issue. > If I submit the

Re: Kafka windowed table not aggregating correctly

2016-12-02 Thread Sachin Mittal
I can provide the data, but the data does not seem to be the issue. If I submit the same data and use the same timestamp extractor with the Java client (Kafka 0.10.0.1), aggregation works fine. I see the issue only when submitting the data with kafka-node version 0.5. It looks like the

ConnectStandalone with no starting connector properties

2016-12-02 Thread Micah Whitacre
I'm curious whether there was an intentional reason that Kafka Connect standalone requires a connector properties file on startup?[1] ConnectDistributed only requires the worker properties. ConnectStandalone, however, requires the worker properties and at least one connector properties file. Since connectors

Re: Kafka windowed table not aggregating correctly

2016-12-02 Thread Matthias J. Sax
Can you provide example input data (including timestamps) and the result? What is the expected result (i.e., what aggregation do you apply)? -Matthias On 12/2/16 7:43 AM, Sachin Mittal wrote: > Hi, > After much debugging I found an issue with timestamp extractor. > > If I use a custom timestamp

Re: Adding topics to KafkaStreams after ingestion has been started?

2016-12-02 Thread Matthias J. Sax
1) There will be one consumer per thread. The number of threads is defined by the number of instances you start and how many threads you configure for each instance via the StreamsConfig parameter NUM_STREAM_THREADS_CONFIG. Thus, you control this completely yourself. Depending on the number to
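A hedged sketch of setting the thread count, using the standard StreamsConfig keys; the application id and bootstrap servers are placeholders:

    import java.util.Properties;
    import org.apache.kafka.streams.StreamsConfig;

    public class StreamsThreadConfig {
        public static Properties props() {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");    // placeholder
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
            // Four stream threads in this instance; each thread runs its own consumer,
            // so total consumers = number of instances x NUM_STREAM_THREADS_CONFIG.
            props.put(StreamsConfig.NUM_STREAM_THREADS_CONFIG, 4);
            return props;
        }
    }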

Re: Writing a consumer offset checker

2016-12-02 Thread Vahid S Hashemian
There is a JIRA open that should address this: https://issues.apache.org/jira/browse/KAFKA-3853 Since it requires a change in the protocol, it's awaiting a KIP vote that's happening next week ( https://cwiki.apache.org/pages/viewpage.action?pageId=66849788). Once the vote is passed the code

rebalancing - confusing choices

2016-12-02 Thread Jon Yeargers
I have a topic with 10 partitions. I have 10 consumers (5 processes x 2 threads) on 5 separate machines. Seems like a decent match? So why does Kafka rebalance and commonly assign two (or more) partitions to a single thread? This leaves threads idling while those partitions start lagging. Is there

Re: Kafka windowed table not aggregating correctly

2016-12-02 Thread Sachin Mittal
Hi, After much debugging I found an issue with the timestamp extractor. If I use a custom timestamp extractor with the following code: public static class MessageTimestampExtractor implements TimestampExtractor { public long extract(ConsumerRecord record) { if
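A hedged sketch of a complete extractor along these lines, assuming the 0.10.x TimestampExtractor interface and a hypothetical Message value type; it falls back to the record's own timestamp or wall clock when the embedded event time is missing:

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.streams.processor.TimestampExtractor;

    public class MessageTimestampExtractor implements TimestampExtractor {

        // Hypothetical value type carrying its own event time; stands in for
        // whatever deserialized object the application actually uses.
        public static class Message {
            public long ts;
        }

        @Override
        public long extract(ConsumerRecord<Object, Object> record) {
            Object value = record.value();
            if (value instanceof Message) {
                long ts = ((Message) value).ts;
                // Negative or missing event times break windowing, so only use
                // the embedded timestamp when it looks sane.
                if (ts > 0) {
                    return ts;
                }
            }
            // Fall back to the record's broker/producer timestamp, then wall clock.
            return record.timestamp() >= 0 ? record.timestamp() : System.currentTimeMillis();
        }
    }

It would be registered via the StreamsConfig.TIMESTAMP_EXTRACTOR_CLASS_CONFIG property.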

Re: Adding topics to KafkaStreams after ingestion has been started?

2016-12-02 Thread Ali Akhtar
That's pretty useful to know - thanks. 1) If I listened to foo-.*, and 5 foo topics were created after Kafka Streams was running (foo1, foo2, foo3, foo4, foo5), will this create 5 consumers / threads / instances, or will it be just 1 instance that receives the messages for all of those

Re: Adding topics to KafkaStreams after ingestion has been started?

2016-12-02 Thread Damian Guy
Hi Ali, The only way KafkaStreams will process new topics after start is if the original stream was defined with a regular expression, i.e., kafka.stream(Pattern.compile("foo-.*")); If any new topics are added after start that match the pattern, then they will also be consumed. Thanks, Damian On
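A hedged, self-contained sketch of that pattern subscription, assuming the 0.10.1 Streams API (KStreamBuilder); the application id, bootstrap servers, and output topic are placeholders:

    import java.util.Properties;
    import java.util.regex.Pattern;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.KStream;
    import org.apache.kafka.streams.kstream.KStreamBuilder;

    public class PatternSubscriptionExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "pattern-subscriber"); // placeholder
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");  // placeholder

            KStreamBuilder builder = new KStreamBuilder();
            // Every topic matching foo-.* is consumed, including topics created
            // after the application has started (picked up on a metadata refresh).
            KStream<byte[], byte[]> input = builder.stream(Pattern.compile("foo-.*"));
            input.to("foo-merged"); // hypothetical output topic, using the default serdes

            new KafkaStreams(builder, props).start();
        }
    }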

Adding topics to KafkaStreams after ingestion has been started?

2016-12-02 Thread Ali Akhtar
Heya, Normally, you add your topics and their callbacks to a StreamBuilder, and then call KafkaStreams.start() to start ingesting those topics. Is it possible to add a new topic to the StreamBuilder, and start ingesting that as well, after KafkaStreams.start() has been called? Thanks.

Tracking when a batch of messages has arrived?

2016-12-02 Thread Ali Akhtar
Heya, I need to send a group of messages which are all related, and then process those messages only when all of them have arrived. Here is how I'm planning to do this. Is this the right way, and can any improvements be made to it? 1) Send a message to a topic called batch_start, with a

Re: Suggestions

2016-12-02 Thread Tauzell, Dave
Can you use the console consumer to see the messages on the other topics? > On Dec 2, 2016, at 04:56, Vincenzo D'Amore wrote: > > Hi Kafka Gurus :) > > I'm creating a process between a few applications. > > The first application creates a producer and then writes a message into a main

Re: Writing a consumer offset checker

2016-12-02 Thread Tobias Adamson
Hi Jon, We have written an offset checker in Python for the new consumer groups. I've filed a bug against pykafka with some admin command support here: https://github.com/Parsely/pykafka/issues/620#issuecomment-264258141 that you could use right now. If not, I should be able to release an offset

Writing a consumer offset checker

2016-12-02 Thread Jon Yeargers
I want to write my own offset monitor so I can integrate it with our alerting system. I've tried Java and Java + Scala but have run into the same problem both times. (details here: http://stackoverflow.com/questions/40808678/kafka-api-offsetrequest-unable-to-retrieve-results ) If anyone has a
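One hedged alternative to the old OffsetRequest API is to compute lag with the new consumer's committed() and endOffsets() methods (the latter requires 0.10.1+ clients); the group id, topic, and bootstrap servers below are placeholders:

    import java.util.List;
    import java.util.Map;
    import java.util.Properties;
    import java.util.stream.Collectors;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.clients.consumer.OffsetAndMetadata;
    import org.apache.kafka.common.TopicPartition;
    import org.apache.kafka.common.serialization.ByteArrayDeserializer;

    public class LagChecker {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092"); // placeholder
            props.put("group.id", "my-group");                // the group whose offsets we read
            props.put("enable.auto.commit", "false");
            props.put("key.deserializer", ByteArrayDeserializer.class.getName());
            props.put("value.deserializer", ByteArrayDeserializer.class.getName());

            try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
                String topic = "my-topic"; // placeholder
                List<TopicPartition> partitions = consumer.partitionsFor(topic).stream()
                        .map(p -> new TopicPartition(p.topic(), p.partition()))
                        .collect(Collectors.toList());

                // Lag per partition = log-end offset minus last committed offset.
                Map<TopicPartition, Long> endOffsets = consumer.endOffsets(partitions);
                for (TopicPartition tp : partitions) {
                    OffsetAndMetadata committed = consumer.committed(tp);
                    long committedOffset = committed == null ? 0L : committed.offset();
                    System.out.printf("%s lag=%d%n", tp, endOffsets.get(tp) - committedOffset);
                }
            }
        }
    }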

kafka-log-cleaner-thread exits

2016-12-02 Thread Yuanjia
Hi all, I found KAFKA-1641, which is marked as fixed, but I encounter this bug in kafka-0.10.0.0. The log is as follows: [2016-12-02 11:33:52,744] ERROR [kafka-log-cleaner-thread-0], Error due to (kafka.log.LogCleaner) java.lang.IllegalArgumentException: requirement failed: Last clean offset is 282330655505 but

Suggestions

2016-12-02 Thread Vincenzo D'Amore
Hi Kafka Gurus :) I'm creating a process between a few applications. The first application creates a producer and then writes a message into a main topic (A); within the message is the name of a second topic (B). It then promptly creates a second producer and writes a few messages into the new topic (B). I

Re: Kafka Logo as HighRes or Vectorgraphics

2016-12-02 Thread Michael Noll
Jan, Here are vector files for the logo. One of our teammates went ahead and helped resize it to fit nicely onto a 2x4m board with 15cm of margin all around. Note: I was told to kindly remind you (and other readers of this) to follow the Apache branding guidelines for the logo, and please not

RE: Detecting when all the retries are expired for a message

2016-12-02 Thread Mevada, Vatsal
I executed the same producer code for a single-record file with the following config: properties.put("bootstrap.servers", bootstrapServer); properties.put("key.serializer", StringSerializer.class.getCanonicalName()); properties.put("value.serializer",

Re: Kafka Logo as HighRes or Vectorgraphics

2016-12-02 Thread Jan Filipiak
Hi, I was just pointed to this: https://www.vectorlogo.zone/logos/apache_kafka/ in case someone else is looking for the same thing! Thanks a lot. Best, Jan On 01.12.2016 13:05, Jan Filipiak wrote: Hi Everyone, we want to print some big banners of the Kafka logo to decorate our offices. Can anyone

RE: Detecting when all the retries are expired for a message

2016-12-02 Thread Ismael Juma
The callback is called after the retries have been exhausted. Ismael On 2 Dec 2016 3:34 am, "Mevada, Vatsal" wrote: > @Ismael: > > I can handle TimeoutException in the callback. However, as per the > documentation of Callback (link: https://kafka.apache.org/0100/ >
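A hedged sketch of a producer callback that treats any exception as final, following the point above that the callback fires only after the producer's internal retries are exhausted; the bootstrap server, topic, and retry settings are placeholders:

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.errors.RetriableException;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class CallbackExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092"); // placeholder
            props.put("key.serializer", StringSerializer.class.getName());
            props.put("value.serializer", StringSerializer.class.getName());
            props.put("retries", 3);          // internal retries before the send fails
            props.put("retry.backoff.ms", 1000);

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                producer.send(new ProducerRecord<>("my-topic", "key", "value"), (metadata, exception) -> {
                    if (exception == null) {
                        System.out.println("sent to offset " + metadata.offset());
                    } else {
                        // By the time the callback sees an exception, the producer's own
                        // retries are exhausted, so this failure is final for the record.
                        System.err.println("send failed permanently: " + exception
                                + " (retriable type: " + (exception instanceof RetriableException) + ")");
                    }
                });
            }
        }
    }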