Re: Kafka performance test: "--request-num-acks -1" kills throughput

2014-01-31 Thread Jun Rao
The request log shows you the breakdown of the request time. Where is most of the time spent? Also, did you change replica.fetch.wait.max.ms in the broker? Thanks, Jun On Fri, Jan 31, 2014 at 10:14 AM, Michael Popov wrote: > Hi Jun, > > The usage output of bin/kafka-producer-perf-test.sh shows

Re: [Keeping equal no of messages in partitions ]

2014-01-31 Thread Jun Rao
You can use a customized partitioner and provide a partitioning key in each message. Thanks, Jun On Fri, Jan 31, 2014 at 7:57 AM, Abhishek Bhattacharjee < abhishek.bhattacharje...@gmail.com> wrote: > I want to store an equal number of messages in each partition. For example, if I have 100 > messages and 2 par
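A minimal sketch of the keyed-partitioner idea, assuming the producer attaches a sequential integer key to each message and the partitioner takes that key modulo the partition count. The names round_robin_partition and assign_partitions are hypothetical helpers for illustration, not part of any Kafka client API.

```python
from collections import Counter
from itertools import count


def round_robin_partition(key: int, num_partitions: int) -> int:
    """Map an integer key to a partition index (hypothetical helper, not a Kafka API)."""
    return key % num_partitions


def assign_partitions(messages, num_partitions):
    """Pair each message with the partition chosen for its sequential key."""
    keys = count()  # 0, 1, 2, ... one key per message
    return [(round_robin_partition(next(keys), num_partitions), m) for m in messages]


messages = [f"msg-{i}" for i in range(100)]
assignments = assign_partitions(messages, num_partitions=2)
per_partition = Counter(partition for partition, _ in assignments)
print(dict(per_partition))  # → {0: 50, 1: 50}
```

With sequential keys the split is exact whenever the message count is a multiple of the partition count. Hashing arbitrary keys instead would only balance the partitions approximately.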

Re: C++ Producer => Broker => Java Consumer?

2014-01-31 Thread Otis Gospodnetic
Beautiful then! I thought this would cause problems with the Java consumer not knowing how to deserialize, but it sounds like I don't have to worry. Excellent, thanks! Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr & Elasticsearch Support * http://sematext.com/ On Fri, Jan 31, 2014

Re: New Producer Public API

2014-01-31 Thread Jay Kreps
Hey Tom, Agreed, there is definitely nothing that prevents our including partitioner implementations, but it does get a little less seamless. -Jay On Fri, Jan 31, 2014 at 2:35 PM, Tom Brown wrote: > Regarding partitioning APIs, I don't think there is a common subset of > information that

Re: C++ Producer => Broker => Java Consumer?

2014-01-31 Thread Philip O'Toole
Exactly. Our C++ producers simply stream bytes to Kafka 0.7.2, following Kafka's byte-level message spec. Our Java-based Consumers just read bytes and use the standard IO libraries to deserialize the data. Philip On Fri, Jan 31, 2014 at 2:38 PM, Tom Brown wrote: > The C++ program writes bytes
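To illustrate why the producer's language does not matter, here is a small sketch in which both sides agree only on a byte layout. The layout (8-byte big-endian timestamp, 4-byte big-endian length, UTF-8 text) is an assumption made up for this example; it is not Kafka's wire format, and encode/decode are hypothetical helpers.

```python
import struct

# Hypothetical payload layout (an assumption for illustration only):
# 8-byte big-endian signed timestamp, 4-byte big-endian unsigned length, UTF-8 text.
HEADER = struct.Struct(">qI")


def encode(timestamp_ms: int, text: str) -> bytes:
    """Producer side: turn a record into plain bytes following the agreed layout."""
    body = text.encode("utf-8")
    return HEADER.pack(timestamp_ms, len(body)) + body


def decode(payload: bytes):
    """Consumer side: recover the record using only knowledge of the layout."""
    timestamp_ms, length = HEADER.unpack_from(payload, 0)
    body = payload[HEADER.size:HEADER.size + length]
    return timestamp_ms, body.decode("utf-8")


payload = encode(1391200000000, "hello")
print(decode(payload))
```

Any language that can write and read the same fixed layout (C++ on one end, Java on the other) interoperates in the same way, since the broker only stores and forwards the bytes.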

Re: C++ Producer => Broker => Java Consumer?

2014-01-31 Thread Tom Brown
The C++ program writes bytes to Kafka, and Java reads bytes from Kafka. Is there something special about the way the messages are being serialized in C++? --Tom On Fri, Jan 31, 2014 at 2:36 PM, Philip O'Toole wrote: > Is this a Kafka C++ lib you wrote yourself, or some open-source library? >

Re: New Producer Public API

2014-01-31 Thread Tom Brown
Regarding partitioning APIs, I don't think there is a common subset of information that is required for all strategies. Instead of modifying the core API to easily support all of the various partitioning strategies, offer the most common ones as libraries they can build into their own data pipe

Re: There is insufficient memory for the Java Runtime Environment to continue.

2014-01-31 Thread Steve Morin
Do you have anything like Graphite or Ganglia monitoring the box to see exactly what's going on? On Fri, Jan 31, 2014 at 1:45 PM, David Montgomery wrote: > Well... I did get Kafka to run on a DigitalOcean box with 4 gigs of RAM. All > great but now I am paying 40 USD a month for dev servers when

Re: There is insufficient memory for the Java Runtime Environment to continue.

2014-01-31 Thread David Montgomery
Well... I did get Kafka to run on a DigitalOcean box with 4 gigs of RAM. All great, but now I am paying 40 USD a month for dev servers when I was paying 5. I have 5 dev servers around the world. Would be great to get back to 5 USD for boxes that just need to start up rather than doing anything subs

Re: C++ Producer => Broker => Java Consumer?

2014-01-31 Thread Philip O'Toole
Is this a Kafka C++ lib you wrote yourself, or some open-source library? What version of Kafka? Philip On Fri, Jan 31, 2014 at 1:30 PM, Otis Gospodnetic < otis.gospodne...@gmail.com> wrote: > Hi, > > If Kafka Producer is using a C++ Kafka lib to produce messages, how can > Kafka Consumers writt

C++ Producer => Broker => Java Consumer?

2014-01-31 Thread Otis Gospodnetic
Hi, If Kafka Producer is using a C++ Kafka lib to produce messages, how can Kafka Consumers written in Java deserialize them? Thanks, Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr & Elasticsearch Support * http://sematext.com/

Re: There is insufficient memory for the Java Runtime Environment to continue.

2014-01-31 Thread Benjamin Black
Sorry, was looking at pre-release 0.8 code. No idea now why they are not being set as expected. On Fri, Jan 31, 2014 at 1:20 PM, Benjamin Black wrote: > kafka-run-class.sh in 0.8 does not define KAFKA_HEAP_OPTS. I think you > want KAFKA_OPTS. > > > On Fri, Jan 31, 2014 at 1:14 PM, David Montgom

Re: There is insufficient memory for the Java Runtime Environment to continue.

2014-01-31 Thread Benjamin Black
kafka-run-class.sh in 0.8 does not define KAFKA_HEAP_OPTS. I think you want KAFKA_OPTS. On Fri, Jan 31, 2014 at 1:14 PM, David Montgomery wrote: > What do you mean? > > This is appended to kafka-run-class.sh > > KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote=true > -Dcom.sun.management.jmxremote

Re: There is insufficient memory for the Java Runtime Environment to continue.

2014-01-31 Thread Jay Kreps
Can you echo out the command that is finally issued by that script to see the final options being used? Then try to run another Java class with those options? -Jay On Fri, Jan 31, 2014 at 1:14 PM, David Montgomery wrote: > What do you mean? > > This is appended to kafka-run-class.sh > > KAFKA_J

Re: There is insufficient memory for the Java Runtime Environment to continue.

2014-01-31 Thread David Montgomery
What do you mean? This is appended to kafka-run-class.sh KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false" prior to running the below in the shell I run => export KAFKA_HEAP_OPTS="-Xmx1G -Xms1G" I start

Re: New Producer Public API

2014-01-31 Thread Joel Koshy
> The trouble with callbacks, IMHO, is determining the thread in which they > will be executed. Since the IO thread is usually the thread that knows when > the operation is complete, it's easiest to execute that callback within the > IO thread. This can lead the IO thread to spend all its time on c

Re: New Producer Public API

2014-01-31 Thread Jay Kreps
Oliver, Yeah that was my original plan--allow the registration of multiple callbacks on the future. But there is some additional implementation complexity because then you need more synchronization variables to ensure the callback gets executed even if the request has completed at the time the cal
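A rough sketch of the synchronization concern described here: a future that accepts multiple callbacks must run a callback immediately if the result is already available when the callback is registered. CallbackFuture and its methods are hypothetical names for illustration; this is not the proposed producer API.

```python
import threading


class CallbackFuture:
    """Hypothetical future that allows several callbacks to be registered."""

    def __init__(self):
        self._lock = threading.Lock()
        self._done = False
        self._result = None
        self._callbacks = []

    def add_callback(self, fn):
        """Register fn; run it right away if the result already exists."""
        with self._lock:
            if not self._done:
                self._callbacks.append(fn)
                return
            result = self._result
        fn(result)  # already complete: invoke outside the lock

    def complete(self, result):
        """Store the result and fire every callback registered so far."""
        with self._lock:
            if self._done:
                raise RuntimeError("future already completed")
            self._done = True
            self._result = result
            callbacks, self._callbacks = self._callbacks, []
        for fn in callbacks:
            fn(result)


seen = []
future = CallbackFuture()
future.add_callback(lambda r: seen.append(("early", r)))
future.complete(42)
future.add_callback(lambda r: seen.append(("late", r)))
print(seen)  # → [('early', 42), ('late', 42)]
```

The lock guards both the done flag and the callback list, so a callback registered concurrently with completion is either queued before the flag flips or run directly afterwards, never lost.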

RE: Kafka performance test: "--request-num-acks -1" kills throughput

2014-01-31 Thread Michael Popov
Hi Jun, The usage output of bin/kafka-producer-perf-test.sh shows --request-timeout-ms The produce request timeout in ms (default: 3000) In my test runs I used the following command-line arguments: bin/kaf

Re: New Producer Public API

2014-01-31 Thread Oliver Dain
Hmmm.. I should read the docs more carefully before I open my big mouth: I just noticed the KafkaProducer#send overload that takes a callback. That definitely helps address my concern though I think the API would be cleaner if there was only one variant that returned a future and you could register

Re: New Producer Public API

2014-01-31 Thread Tom Brown
The trouble with callbacks, IMHO, is determining the thread in which they will be executed. Since the IO thread is usually the thread that knows when the operation is complete, it's easiest to execute that callback within the IO thread. This can lead the IO thread to spend all its time on callbacks
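One common way to keep callbacks from monopolizing the IO thread is to hand them off to a separate pool. The sketch below simulates that with Python's standard library; the "io-thread" here is a stand-in for a real network thread, and run_demo is a hypothetical helper. This is one mitigation in general use, not necessarily the approach this thread settled on.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_demo(num_requests: int = 3):
    """Return the names of the threads that executed the callbacks."""
    callback_threads = []
    callback_pool = ThreadPoolExecutor(max_workers=2, thread_name_prefix="callback")

    def on_complete(request_id):
        # User callback: record which thread ran it.
        callback_threads.append(threading.current_thread().name)

    def io_loop():
        # Stand-in for the network thread: it only notices completion and
        # submits the callback, so slow callbacks cannot stall I/O.
        for request_id in range(num_requests):
            callback_pool.submit(on_complete, request_id)

    io_thread = threading.Thread(target=io_loop, name="io-thread")
    io_thread.start()
    io_thread.join()
    callback_pool.shutdown(wait=True)  # wait for all callbacks to finish
    return callback_threads


names = run_demo()
print(names)
```

The trade-off is that callbacks no longer run in a predictable order relative to the IO thread, so any state they share needs its own synchronization.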

Re: New Producer Public API

2014-01-31 Thread Oliver Dain
I wanted to suggest an alternative to the serialization issue. As I understand it, the concern is that if the user is responsible for serialization it becomes difficult for them to compute the partition as the plugin that computes the partition would be called with byte[] forcing the user to de-ser

Re: New Producer Public API

2014-01-31 Thread Oliver Dain
Hey all, I'm excited about having a new Producer API, and I really like the idea of removing the distinction between a synchronous and asynchronous producer. The one comment I have about the current API is that it's hard to write truly asynchronous code with the type of future returned by the send

Re: New Producer Public API

2014-01-31 Thread Jun Rao
For RangePartitioner, it seems that we will need the key object. Range-partitioning on the serialized key bytes is probably confusing. Thanks, Jun On Thu, Jan 30, 2014 at 4:14 PM, Jay Kreps wrote: > One downside to the 1A proposal is that without a Partitioner interface we > can't really pack

Re: 0.8.1 ETA?

2014-01-31 Thread Neha Narkhede
The delete topic functionality is in progress (KAFKA-330). We were hoping to release 0.8.1 with that. So it's probably 1-2 weeks away. As for the rest of the issues, we probably need to clean those up since we expect we can do 0.8.2 soon after as well. Thanks, Neha On Fri, Jan 31, 2014 at 8:34 A

Re: Producer garbage collection problem

2014-01-31 Thread Jun Rao
You can take a look at the GC setting described in https://cwiki.apache.org/confluence/display/KAFKA/Operations Thanks, Jun On Fri, Jan 31, 2014 at 2:26 AM, Florian Ollech wrote: > Hi, > > I am currently trying to set up Kafka 0.8 and ran into a problem with the > producer. Every time the serve

0.8.1 ETA?

2014-01-31 Thread Otis Gospodnetic
Hi, I hate asking the "when" question, but I was under the impression 0.8.1 was around the corner, so I looked at JIRA and found 90 open issues labeled as 0.8.1. https://issues.apache.org/jira/issues/?jql=project%20%3D%20KAFKA%20AND%20fixVersion%20%3D%20%220.8.1%22%20AND%20resolution%20%3D%20U

Re: Kafka performance test: "--request-num-acks -1" kills throughput

2014-01-31 Thread Jun Rao
Michael, Those SocketTimeoutExceptions mean that the producer didn't receive the response from the broker in time. Could you check the request log in the broker and see what the request completion time is and how it compares with the request socket timeout? I did some testing a while back. Latenc

[Keeping equal no of messages in partitions ]

2014-01-31 Thread Abhishek Bhattacharjee
I want to store an equal number of messages in each partition. For example, if I have 100 messages and 2 partitions for a topic, then each partition should have 50 messages. Can someone help me with this? I am new to Kafka, so any help will be appreciated. Thanks, -- *Abhishek Bhattacharjee* *Pune Institute of Comp

Re: Metadata error always correlates with LeaderAndIsr request

2014-01-31 Thread Jun Rao
It's correlated in the following way. Both the replica followers and the clients (producer and consumer) need to know the leader replica of a partition. If the leader of a partition fails, then (1) a new leader needs to be elected, which triggers LeaderAndIsr requests to be sent to the followers an

Re: New Producer Public API

2014-01-31 Thread David Arthur
On 1/24/14 7:41 PM, Jay Kreps wrote: Yeah I'll fix that name. Hmm, yeah, I agree that often you want to be able to delay network connectivity until you have started everything up. But at the same time I kind of loathe special init() methods because you always forget to call them and get one round o

Producer garbage collection problem

2014-01-31 Thread Florian Ollech
Hi, I am currently trying to set up Kafka 0.8 and ran into a problem with the producer. Every time the server triggers a garbage collection, because permgen space is full, the server descends into a very unusual memory state where a lot of garbage collection is suddenly happening (~5-10% gc cpu tim