Re: Oversized Message 40k

2016-11-22 Thread Ignacio Solis
At LinkedIn we have a number of use cases for large messages. We stick to the 1MB message limit at the high end though. Nacho

Re: Delete "kafka-logs"

2016-11-22 Thread Eranga Heshan
The reason I set the retention time to 10 seconds is that I run Kafka for about 4 minutes each time. What I am running is a microbenchmark, and I take throughput and latency values over 4 minutes of runtime. However, when I run it I get these messages printed on the terminal: [2016-11-23 12:21:54,517] INFO Deleting

Re: KafkaStreams KTable#through not creating changelog topic

2016-11-22 Thread Mikael Högqvist
Hi Eno, 1) Great :) 2) Yes, we are using the Interactive Queries to access the state stores. In addition, we access the changelogs to subscribe to updates. For this reason we need to know the changelog topic name. Thanks, Mikael

Re: Delete "kafka-logs"

2016-11-22 Thread Sachin Mittal
Try bin/kafka-server-start.sh ./config/server.properties. Also check how many segments have been created in the logs dir and the size of each. Maybe set the retention to 1 hour or 15 minutes; 10 seconds seems too short.

Re: Delete "kafka-logs"

2016-11-22 Thread Eranga Heshan
As you suggested, I made these changes to the server.properties file on all nodes: log.retention.check.interval.ms=1 log.retention.ms=1 log.segment.bytes=1049 I start each server with /home/ubuntu/eranga/software/kafka_2.11-0.10.0.1/bin/kafka-server-start.sh -daemon

Re: Kafka consumers are not equally distributed

2016-11-22 Thread Sharninder
Could it be because of the partition key?

Re: Delete "kafka-logs"

2016-11-22 Thread Sachin Mittal
Check these in server.properties:
# Log Retention Policy #
# The following configurations control the disposal of log segments. The policy can
# be set to delete segments after a period of time, or after a given size has accumulated.

Re: Delete "kafka-logs"

2016-11-22 Thread Eranga Heshan
This is the exact parameter I added to the server.config file on all the nodes: log.retention.ms=1000 Thanks, Regards, Eranga Heshan *Undergraduate* Computer Science & Engineering University of Moratuwa Mobile: +94 71 138 2686 Email: era...@wso2.com

Re: Delete "kafka-logs"

2016-11-22 Thread Eranga Heshan
I have a 4-node cluster, and node1 has ZooKeeper running. I added those parameters to each of the server.properties files on the Kafka nodes. Eranga Heshan *Undergraduate* Computer Science & Engineering University of Moratuwa Mobile: +94 71 138 2686 Email: era...@wso2.com

Re: Delete "kafka-logs"

2016-11-22 Thread Sachin Mittal
Kafka logs are not kept based on data consumption. Segments are deleted after they reach a particular size or retention time. Even if they are consumed multiple times in that duration, they are still kept. Where have you added those params, and how are you starting your ZooKeeper and server?

Re: Investigating apparent data loss during preferred replica election

2016-11-22 Thread Mark Smith
Jun, I see what's going on -- the leader doesn't update its HW as soon as the follower has requested the messages; it updates when the follower requests the _next_ messages. I.e., it infers from the follower requesting offset 38 that everything <= 37 is durable. This makes sense and

Re: Delete "kafka-logs"

2016-11-22 Thread Eranga Heshan
Sorry for repeatedly asking the same question, but although I tried all those parameters, my log file is still large, at about 2.5GB. Is it because the data inside my topic is flushed to disk? I mean that there might be data inside the topic which is not consumed yet or has not been deleted even after

Re: Oversized Message 40k

2016-11-22 Thread Gwen Shapira
This has been our experience as well. I think the largest we've seen in production is 50MB. If you have performance numbers you can share for the large messages, I think we'll all appreciate it :)

RE: Oversized Message 40k

2016-11-22 Thread Tauzell, Dave
I ran tests with a mix of messages, some as large as 20MB. These large messages do slow down processing, but it still works. -Dave

Re: kafka 0.10.1 ProcessorTopologyTestDriver issue with .map and .groupByKey.count

2016-11-22 Thread Matthias J. Sax
In Kafka 0.10.1 a deduplication cache was introduced for aggregates, which reduces the downstream load for a KTable changelog stream. If you want to disable the cache for testing, you can set StreamsConfig.CACHE_MAX_BYTES_BUFFERING_CONFIG to zero.
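A minimal sketch of disabling that cache in a test configuration (application id and bootstrap address are illustrative):

  import java.util.Properties;
  import org.apache.kafka.streams.StreamsConfig;

  Properties props = new Properties();
  props.put(StreamsConfig.APPLICATION_ID_CONFIG, "test-app");          // illustrative
  props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // illustrative
  // 0 bytes of buffering: every update is forwarded downstream immediately
  props.put(StreamsConfig.CACHE_MAX_BYTES_BUFFERING_CONFIG, 0);
  StreamsConfig config = new StreamsConfig(props);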

kafka 0.10.1 ProcessorTopologyTestDriver issue with .map and .groupByKey.count

2016-11-22 Thread Hamidreza Afzali
Hi, When using ProcessorTopologyTestDriver in the latest Kafka 0.10.1, the combination of .map(...) and .groupByKey(...).count(...) does not produce any result. The topology looks like this:
builder.stream(Serdes.String, Serdes.Integer, inputTopic)
  .map((k, v) => new KeyValue(fn(k), v))
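For reference, a rough Java equivalent of that topology on 0.10.1 (fn is the poster's key-mapping function; topic and store names are placeholders):

  import org.apache.kafka.common.serialization.Serdes;
  import org.apache.kafka.streams.KeyValue;
  import org.apache.kafka.streams.kstream.KStreamBuilder;

  KStreamBuilder builder = new KStreamBuilder();
  builder.stream(Serdes.String(), Serdes.Integer(), "input-topic")
         .map((k, v) -> new KeyValue<>(fn(k), v))        // fn: the poster's key-mapping function
         .groupByKey(Serdes.String(), Serdes.Integer())  // serdes assume fn returns a String key
         .count("counts-store");                         // store name is a placeholder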

Re: KafkaStreams KTable#through not creating changelog topic

2016-11-22 Thread Eno Thereska
Hi Mikael, 1) The JavaDoc looks incorrect, thanks for reporting it. Matthias is looking into fixing it. I agree that it can be confusing to have topic names that are not what one would expect. 2) If your goal is to query/read from the state stores, you can use Interactive Queries to do that (you

Re: Oversized Message 40k

2016-11-22 Thread hans
The default config handles messages up to 1MB, so you should be fine. -hans
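For messages beyond the 1MB default, the knobs to raise (values here are illustrative, not recommendations) are the broker's message.max.bytes and replica.fetch.max.bytes, the producer's max.request.size, and the consumer's max.partition.fetch.bytes, e.g. for a ~2MB cap:

  # broker (server.properties)
  message.max.bytes=2097152
  replica.fetch.max.bytes=2097152
  # producer config
  max.request.size=2097152
  # consumer config
  max.partition.fetch.bytes=2097152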

Re: Restrict consumers from connecting to Kafka cluster

2016-11-22 Thread Ofir Manor
Check the Security section of the documentation, especially authorization (which also requires authentication): http://kafka.apache.org/documentation.html Ofir Manor Co-Founder & CTO | Equalum Mobile: +972-54-7801286 | Email: ofir.ma...@equalum.io
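Assuming an authorizer is enabled on the brokers (e.g. kafka.security.auth.SimpleAclAuthorizer), consume access can then be granted per principal, roughly like this (principal, topic and group names are illustrative):

  bin/kafka-acls.sh --authorizer-properties zookeeper.connect=localhost:2181 \
    --add --allow-principal User:app1 \
    --consumer --topic my-topic --group my-group

With an authorizer in place, consumers without a matching ACL are denied.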

Re: KafkaStreams KTable#through not creating changelog topic

2016-11-22 Thread Mikael Högqvist
Sorry for being unclear, I'll try again :) 1) The JavaDoc for through is not correct; it states that a changelog topic will be created for the state store. That is, if I called it with through("topic", "a-store"), I would expect a Kafka topic "my-app-id-a-store-changelog" to be created. 2)

Restrict consumers from connecting to Kafka cluster

2016-11-22 Thread ravi singh
Is it possible to restrict Kafka consumers from consuming from a given Kafka cluster? -- *Regards,* *Ravi*

Kafka consumers are not equally distributed

2016-11-22 Thread Ghosh, Achintya (Contractor)
Hi there, We are doing a load test in Kafka at 25 TPS. For the first 9 hours it went fine; almost 80K messages/hr were processed. After that we saw a lot of lag and stopped the incoming load. Currently we see 15K messages/hr being processed. We have 40 consumer instances with concurrency 4 and

Re: Consumer group keep bouncing between generation id 0 and 1

2016-11-22 Thread Guozhang Wang
1. If you are not doing heavy work in a callback, then I'd suggest checking the frequency of consumer.poll() calls, because that is when the heartbeat is sent. For example, between [2016-11-17 06:17:42,399] new group formed with the single consumer member [2016-11-17 06:18:12,404] that single member
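A minimal sketch of a poll loop that keeps per-iteration work short so heartbeats (sent from poll()) keep flowing; the topic, group, and handle() are placeholders:

  import java.util.Collections;
  import java.util.Properties;
  import org.apache.kafka.clients.consumer.ConsumerRecord;
  import org.apache.kafka.clients.consumer.ConsumerRecords;
  import org.apache.kafka.clients.consumer.KafkaConsumer;

  Properties props = new Properties();
  props.put("bootstrap.servers", "localhost:9092");
  props.put("group.id", "my-group");
  props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
  props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
  KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
  consumer.subscribe(Collections.singletonList("my-topic"));
  while (true) {
      ConsumerRecords<String, String> records = consumer.poll(100);
      for (ConsumerRecord<String, String> record : records) {
          handle(record); // keep total time per iteration well under session.timeout.ms
      }
  }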

Re: Kafka windowed table not aggregating correctly

2016-11-22 Thread Guozhang Wang
Hello Sachin, In the implementation with SortedSet, if the object implements the Comparable interface, that compareTo function is applied in "aggregate.add(value);", and hence if it returns 0, the element will not be added, since it is a Set. Guozhang
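To illustrate the Set semantics Guozhang describes, a small standalone example (class and field names invented for the sketch):

  import java.util.SortedSet;
  import java.util.TreeSet;

  class Event implements Comparable<Event> {
      final long ts;
      final String payload;
      Event(long ts, String payload) { this.ts = ts; this.payload = payload; }
      public int compareTo(Event other) {
          return Long.compare(this.ts, other.ts); // returns 0 when timestamps are equal
      }
  }

  SortedSet<Event> aggregate = new TreeSet<>();
  aggregate.add(new Event(42L, "a"));
  aggregate.add(new Event(42L, "b")); // compareTo returns 0, so the Set rejects it as a duplicate
  System.out.println(aggregate.size()); // prints 1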

Re: KafkaStreams KTable#through not creating changelog topic

2016-11-22 Thread Matthias J. Sax
I cannot completely follow what you want to achieve. However, the JavaDoc for through() seems incorrect to me. Using through() will not create an extra internal changelog topic with the described naming scheme, because the topic specified in through() can be used for this (there is no point

Re: Streams - merging multiple topics

2016-11-22 Thread Brian Krahmer
Thanks Damian! Based on your response, I finally got it working. I did end up using left joins and added a final step that goes from table -> stream and then filters out nulls. thanks, brian

Re: Kafka producer dropping records

2016-11-22 Thread Ismael Juma
Another option, which is probably easier, is to pass a callback to `send` and log errors. Ismael
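A minimal sketch of the callback approach, assuming an existing KafkaProducer<String, String> producer (topic name and error handling are illustrative):

  import org.apache.kafka.clients.producer.Callback;
  import org.apache.kafka.clients.producer.ProducerRecord;
  import org.apache.kafka.clients.producer.RecordMetadata;

  producer.send(new ProducerRecord<>("my-topic", line), new Callback() {
      @Override
      public void onCompletion(RecordMetadata metadata, Exception exception) {
          if (exception != null) {
              // the record was not acknowledged; log it instead of silently dropping it
              System.err.println("send failed: " + exception);
          }
      }
  });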

Re: KafkaStreams KTable#through not creating changelog topic

2016-11-22 Thread Eno Thereska
Hi Mikael, If you perform an aggregate and thus create another KTable, that KTable will have a changelog topic (and a state store that you can query with Interactive Queries - but this is tangential). It is true that source KTables don't need to create another topic, since they already have

Custom Partitioner explanation?

2016-11-22 Thread Marina
Hi, I'm trying to upgrade my 0.8 producer to the 0.9 (0.10) APIs, and noticed that the way to implement a custom Partitioner has changed. In 0.8, I had implemented this interface: kafka.producer.Partitioner, with this implementation of the partition() method, where the goal is to equally distribute
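For comparison, a sketch against the 0.9+ producer-side interface (class name illustrative; the hashing mirrors the default murmur2-based behavior):

  import java.util.Map;
  import org.apache.kafka.clients.producer.Partitioner;
  import org.apache.kafka.common.Cluster;
  import org.apache.kafka.common.utils.Utils;

  public class MyPartitioner implements Partitioner {
      @Override
      public void configure(Map<String, ?> configs) {}

      @Override
      public int partition(String topic, Object key, byte[] keyBytes,
                           Object value, byte[] valueBytes, Cluster cluster) {
          int numPartitions = cluster.partitionsForTopic(topic).size();
          if (keyBytes == null) return 0; // simplistic null-key handling for the sketch
          return Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
      }

      @Override
      public void close() {}
  }

It is then registered through the producer config, e.g. props.put("partitioner.class", "com.example.MyPartitioner");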

Re: Oversized Message 40k

2016-11-22 Thread Dominik Safaric
Big is a relative term, and the question you ask is quite difficult to answer because no other information is available - including the configuration of the Kafka cluster, hardware specification, etcetera. I suggest the following: (1) read a couple of benchmarks such as [1], (2) investigate

Oversized Message 40k

2016-11-22 Thread Felipe Santos
I read in the documentation that Kafka is not optimized for big messages; what is considered a big message? For us the messages will be on average from 20k ~ 40k. Is this a real problem? Thanks -- Felipe Santos

Re: Kafka producer dropping records

2016-11-22 Thread Ismael Juma
You can collect the Futures and call `get` in batches. That would give you access to the errors without blocking on each request. Ismael
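A minimal sketch of that batching approach, assuming an existing producer and an input collection named lines (topic name and batch size are illustrative):

  import java.util.ArrayList;
  import java.util.List;
  import java.util.concurrent.Future;
  import org.apache.kafka.clients.producer.ProducerRecord;
  import org.apache.kafka.clients.producer.RecordMetadata;

  List<Future<RecordMetadata>> batch = new ArrayList<>();
  for (String line : lines) {
      batch.add(producer.send(new ProducerRecord<>("my-topic", line)));
      if (batch.size() == 10000) { // block only once per 10k sends, not per record
          for (Future<RecordMetadata> f : batch) {
              try {
                  f.get(); // surfaces any send error for this record
              } catch (Exception e) {
                  System.err.println("send failed: " + e);
              }
          }
          batch.clear();
      }
  }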

Re: Kafka producer dropping records

2016-11-22 Thread Jaikiran Pai
That tells you that the acknowledgements (which, in your case, you have set to come from all brokers in the ISR) aren't happening, and that can essentially mean the records aren't making it to the topics. How many brokers do you have? What's the replication factor on the topic and

KafkaStreams KTable#through not creating changelog topic

2016-11-22 Thread Mikael Högqvist
Hi, in the documentation for KTable#through, it is stated that a new changelog topic will be created for the table. It also states that calling through is equivalent to calling #to followed by KStreamBuilder#table.
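For concreteness, a sketch of that stated equivalence on 0.10.1, using the names from later in this thread (serdes, topic and store names are illustrative):

  import org.apache.kafka.common.serialization.Serdes;
  import org.apache.kafka.streams.kstream.KStreamBuilder;
  import org.apache.kafka.streams.kstream.KTable;

  // table.through("topic", "a-store") is documented as equivalent to:
  table.to(Serdes.String(), Serdes.Long(), "topic");
  KTable<String, Long> copy = builder.table(Serdes.String(), Serdes.Long(), "topic", "a-store");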

Consumer failover issue when coordinator dies (e.g. broker restart)

2016-11-22 Thread Jan Omar
Hey guys, We're running Kafka 0.9.0.1 with Java 7 on FreeBSD. We are experiencing unrecoverable issues in our consumers, e.g. when restarting brokers. The consumers start reporting that the coordinator died (which in general is correct, because the coordinator was restarted). However, the

org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.

2016-11-22 Thread Jiecxy
Hi all, My Kafka version is 2.11_0.10.1.0, and the cluster has three nodes. I created a topic (3 replicas, 6 partitions) and tested the ProducerPerformance class using a command like this: bin/kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance --topic test

RE: Kafka producer dropping records

2016-11-22 Thread Phadnis, Varun
Hello, We had tried that... If future.get() is added in the while loop, it takes too long for the loop to execute. The last time we tried it, it ran on that file for over 2 hours and still had not finished. Regards, Varun

Re: Kafka producer dropping records

2016-11-22 Thread Jaikiran Pai
The KafkaProducer.send returns a Future. What happens when you add a future.get() on the returned Future, in that while loop, for each sent record? -Jaikiran On Tuesday 22 November 2016 12:45 PM, Phadnis, Varun wrote: Hello, We have the following piece of code where we read lines from a

Re: Delete "kafka-logs"

2016-11-22 Thread Sachin Mittal
Set some value for log.retention.ms, which is the time in ms to retain the log, and also log.segment.bytes, which is the size of a single log file. delete.topic.enable is needed if you want to delete the topic itself. Thanks Sachin
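A sketch of the relevant server.properties lines (the values are illustrative, not recommendations):

  # delete log segments older than 15 minutes
  log.retention.ms=900000
  # roll a new ~10MB segment so old data becomes eligible for deletion sooner
  log.segment.bytes=10485760
  # how often to check for deletable segments
  log.retention.check.interval.ms=60000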

Re: Delete "kafka-logs"

2016-11-22 Thread Eranga Heshan
I tried log.retention.bytes=1049000, which is about 1MB. But it did not work. Is it because I had not given any value to delete.topic.enable? Eranga Heshan *Undergraduate* Computer Science & Engineering University of Moratuwa Mobile: +94 71 138 2686 Email:

Re: Delete "kafka-logs"

2016-11-22 Thread Sachin Mittal
Check http://kafka.apache.org/documentation.html#brokerconfigs for log.retention.bytes, log.retention.ms, log.segment.bytes, and delete.topic.enable.

Delete "kafka-logs"

2016-11-22 Thread Eranga Heshan
Hi all, I want to keep the size of kafka-logs reduced as much as possible. To do that, I need Kafka to delete its unwanted logs frequently. Is there a way to do this? Thanks, Regards, Eranga Heshan *Undergraduate* Computer Science & Engineering University of Moratuwa Mobile: +94 71 138 2686

Delete kafka-logs

2016-11-22 Thread Eranga Heshan
Eranga Heshan *Undergraduate* Computer Science & Engineering University of Moratuwa Mobile: +94 71 138 2686 Email: era...@wso2.com