Re: Question about Kafka

2017-09-20 Thread Steve Tian
Producer configuration?

On Wed, Sep 20, 2017, 2:50 PM MAHA ALSAYASNEH <
maha.alsayas...@univ-grenoble-alpes.fr> wrote:

> Hello,
>
> Any suggestion regarding this msg:
> " org.apache.kafka.common.errors.TimeoutException: Expiring 61 record(s)
> for  due to 30001 ms has passed since batch creation plus linger time "
>
> Thanks in advance
> Maha
>
>
> From: "MAHA ALSAYASNEH" 
> To: "users" 
> Sent: Tuesday, September 19, 2017 6:18:25 PM
> Subject: Re: Question about Kafka
>
> Well, I kept the default:
> log.retention.hours=168
>
>
> Here are my broker configurations:
>
> # Server Basics #
>
> # The id of the broker. This must be set to a unique integer for each
> broker.
> broker.id=3
> host.name=
>
> port=9092
> zookeeper.connect=xxx:2181,:2181,:2181
>
> #The maximum size of message that the server can receive
> message.max.bytes=224
>
>
> replica.fetch.max.bytes=224
> request.timeout.ms=30
> log.flush.interval.ms=1
> log.flush.interval.messages=2
>
> request.timeout.ms=30
>
> #replica.socket.timeout.ms=6
> #linger.ms=3
>
> # Switch to enable topic deletion or not, default value is false
> delete.topic.enable=true
>
> # Socket Server Settings
> #
>
> # The address the socket server listens on. It will get the value returned
> from
> # java.net.InetAddress.getCanonicalHostName() if not configured.
> # FORMAT:
> # listeners = security_protocol://host_name:port
> # EXAMPLE:
> # listeners = PLAINTEXT://your.host.name:9092
> listeners=PLAINTEXT://x.x.x.X:9092
>
>
> # Hostname and port the broker will advertise to producers and consumers.
> If not set,
> # it uses the value for "listeners" if configured. Otherwise, it will use
> the value
> # returned from java.net.InetAddress.getCanonicalHostName().
> #advertised.listeners=PLAINTEXT://your.host.name:9092
>
> # The number of threads handling network requests
> num.network.threads=4
>
> # The number of threads doing disk I/O
> num.io.threads=8
>
> # The send buffer (SO_SNDBUF) used by the socket server
> socket.send.buffer.bytes=102400
>
> # The receive buffer (SO_RCVBUF) used by the socket server
> socket.receive.buffer.bytes=102400
>
> # The maximum size of a request that the socket server will accept
> (protection against OOM)
> socket.request.max.bytes=104857600
>
>
> # Log Basics #
>
> # A comma separated list of directories under which to store log files
> log.dirs=/tmp/kafka-logs
>
>
> # The default number of log partitions per topic. More partitions allow
> greater
> # parallelism for consumption, but this will also result in more files
> across
> # the brokers.
> num.partitions=8
>
> # The number of threads per data directory to be used for log recovery at
> startup and flushing at shutdown.
> # This value is recommended to be increased for installations with data
> dirs located in RAID array.
> num.recovery.threads.per.data.dir=1
>
> # Log Flush Policy
> #
>
> # Messages are immediately written to the filesystem but by default we
> only fsync() to sync
> # the OS cache lazily. The following configurations control the flush of
> data to disk.
> # There are a few important trade-offs here:
> # 1. Durability: Unflushed data may be lost if you are not using
> replication.
> # 2. Latency: Very large flush intervals may lead to latency spikes when
> the flush does occur as there will be a lot of data to flush.
> # 3. Throughput: The flush is generally the most expensive operation, and
> # a small flush interval may lead to excessive seeks.
> # The settings below allow one to configure the flush policy to flush data
> after a period of time or
> # every N messages (or both). This can be done globally and overridden on
> a per-topic basis.
>
> # The number of messages to accept before forcing a flush of data to disk
> #log.flush.interval.messages=1
>
> # The maximum amount of time a message can sit in a log before we force a
> flush
> #log.flush.interval.ms=1000
>
> # Log Retention Policy
> #
>
> # The following configurations control the disposal of log segments. The
> policy can
> # be set to delete segments after a period of time, or after a given size
> has accumulated.
> # A segment will be deleted whenever *either* of these criteria are met.
> Deletion always happens
> # from the end of the log.
>
> #log.retention.ms=60
>
> # The minimum age of a log file to be eligible for deletion
> log.retention.hours=168
>
> # A size-based retention policy for logs. Segments are pruned from the log
> as long as the remaining
> # segments don't drop below log.retention.bytes.
> #log.retention.bytes=1073741824
>
> # The maximum size of a log segment file. When this size is reached a new
> log segment will be created.
> log.segment.bytes=536870
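
For context: the expiry in the error message above is governed by producer-side
settings, not by the broker properties listed here. In pre-2.1 Java clients, a
batch that has sat in the producer's accumulator for roughly request.timeout.ms
plus linger.ms without being sent is expired with exactly this TimeoutException,
which usually points at an unreachable or slow partition leader, or a producer
that cannot keep up. A minimal, hypothetical producer sketch showing the
relevant knobs (broker addresses, topic, and values are placeholders, not taken
from this thread):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class BatchExpirySketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Placeholder broker list; not the cluster from this thread.
        props.put("bootstrap.servers", "broker1:9092,broker2:9092,broker3:9092");
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        // A queued batch is expired once it has waited about
        // request.timeout.ms + linger.ms without being sent.
        props.put("request.timeout.ms", "60000"); // default is 30000 ms
        props.put("linger.ms", "5");
        props.put("batch.size", "16384");

        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("some-topic", "key", "value"));
        }
    }
}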

Question regarding to reconnect.backoff.ms

2015-08-31 Thread Steve Tian
Hi everyone,

Are there any concerns with using a long reconnect.backoff.ms for the new Java
Kafka producer (0.8.2.0/0.8.2.1)?

Assume we have bootstrap.servers=host1:port1,host2:port2,host3:port3 and
host1 is *down* from the very beginning. If a newly created Kafka producer
decides to choose host1 as the first node to contact for a metadata update,
then that producer will keep trying host1 *only*, since the default TCP timeout
is surely longer than the default value of reconnect.backoff.ms, which is 10 ms.

I am thinking of setting reconnect.backoff.ms longer than N * T, where N is the
number of nodes in bootstrap.servers and T is the default TCP timeout.  Are
there any concerns with a reconnect.backoff.ms that long?  Any better
solutions?

Cheers, Steve
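
For concreteness, a sketch of what that proposal could look like in producer
properties. The host names, the assumed OS connect timeout of roughly 21
seconds, and the resulting backoff value are illustrative assumptions, not
measurements:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;

public class ReconnectBackoffSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Three bootstrap nodes (placeholders); assume host1 is down.
        props.put("bootstrap.servers", "host1:9092,host2:9092,host3:9092");
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        // N = 3 bootstrap nodes, T = assumed OS connect timeout of ~21 s,
        // so pick a backoff somewhat above N * T instead of the 10 ms default
        // to give the client a chance to move on to another node.
        props.put("reconnect.backoff.ms", "70000");

        Producer<String, String> producer = new KafkaProducer<>(props);
        producer.close();
    }
}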


Re: Question regarding to reconnect.backoff.ms

2015-09-01 Thread Steve Tian
Thanks, Rahul!  In my environment I need to have reconnect.backoff.ms
longer than the OS default TCP timeout so that NetworkClient can give the
second node a try.

I believe this is related to
https://issues.apache.org/jira/browse/KAFKA-2459 .

Cheers, Steve

On Tue, Sep 1, 2015, 5:24 PM Rahul Jain  wrote:

> We did notice something similar. When a broker node (out of 3) went down,
> metadata calls continued to go to the failed node and producer kept
> failing. We were able to make it work by increasing the
> reconnect.backoff.ms
> to 1 second.
>
> Something similar was discussed earlier -
>
> http://qnalist.com/questions/6002514/new-producer-metadata-update-problem-on-2-node-cluster
>
>
>
> On Mon, Aug 31, 2015 at 11:00 PM, Steve Tian 
> wrote:
>
> > Hi everyone,
> >
> > Is there any concerns to have a long reconnect.backoff.ms for new java
> > Kafka producer (0.8.2.0/0.8.2.1)?
> >
> > Assuming we have bootstrap.servers=host1:port1,host2:port2,host3:port3
> and
> > host1 is *down* in the very beginning. If a newly created Kafka producer
> > decide to choose host1 as first node to connect for metadata update, then
> > that producer will keep trying on host1 *only* as default tcp timeout is
> > surely longer than default value of reconnect.backoff.ms, which is 10
> ms.
> >
> > I am thinking to have reconnect.backoff.ms longer than N * T where N is
> > the
> > number of nodes in bootstrap.servers and T is the default tcp timeout.
> Is
> > there any concerns to have a long reconnect.backoff.ms like that?  Any
> > better solutions?
> >
> > Cheers, Steve
> >
>


Re: Question regarding to reconnect.backoff.ms

2015-09-02 Thread Steve Tian
Would any Kafka dev kindly give us some advice on this?

Cheers, Steve

On Tue, Sep 1, 2015, 11:20 PM Steve Tian  wrote:

> Thanks, Rahul!  In my environment I need to have reconnect.backoff.ms
> longer than OS default tcp timeout so that NetworkClient can give second
> node a try.
>
> I believe this is related to
> https://issues.apache.org/jira/browse/KAFKA-2459 .
>
> Cheers, Steve
>
> On Tue, Sep 1, 2015, 5:24 PM Rahul Jain  wrote:
>
>> We did notice something similar. When a broker node (out of 3) went down,
>> metadata calls continued to go to the failed node and producer kept
>> failing. We were able to make it work by increasing the
>> reconnect.backoff.ms
>> to 1 second.
>>
>> Something similar was discussed earlier -
>>
>> http://qnalist.com/questions/6002514/new-producer-metadata-update-problem-on-2-node-cluster
>>
>>
>>
>> On Mon, Aug 31, 2015 at 11:00 PM, Steve Tian 
>> wrote:
>>
>> > Hi everyone,
>> >
>> > Is there any concerns to have a long reconnect.backoff.ms for new java
>> > Kafka producer (0.8.2.0/0.8.2.1)?
>> >
>> > Assuming we have bootstrap.servers=host1:port1,host2:port2,host3:port3
>> and
>> > host1 is *down* in the very beginning. If a newly created Kafka producer
>> > decide to choose host1 as first node to connect for metadata update,
>> then
>> > that producer will keep trying on host1 *only* as default tcp timeout is
>> > surely longer than default value of reconnect.backoff.ms, which is 10
>> ms.
>> >
>> > I am thinking to have reconnect.backoff.ms longer than N * T where N is
>> > the
>> > number of nodes in bootstrap.servers and T is the default tcp timeout.
>> Is
>> > there any concerns to have a long reconnect.backoff.ms like that?  Any
>> > better solutions?
>> >
>> > Cheers, Steve
>> >
>>
>


Re: Question regarding to reconnect.backoff.ms

2015-09-02 Thread Steve Tian
Got it. Thanks a lot Ewen!

Cheers, Steve

On Thu, Sep 3, 2015, 10:06 AM Ewen Cheslack-Postava 
wrote:

> Steve,
>
> I don't think there is a better solution at the moment. This is an easy
> issue to miss in unit testing because generally connections to localhost
> will be rejected immediately if there isn't anything listening on the port.
> If you're running in an environment where this happens normally, then for
> now you'll need to wait for the long timeout.
>
> https://issues.apache.org/jira/browse/KAFKA-2120 may also alleviate the
> problem by at least reducing the amount of time for the request to fail.
> Depending on how adventurous you are, you could try using a version with
> that patch and maybe adjust the setting lower than its default.
>
> -Ewen
>
> On Wed, Sep 2, 2015 at 10:46 AM, Steve Tian 
> wrote:
>
> > Would kafka dev kindly give us some advice on this?
> >
> > Cheers, Steve
> >
> > On Tue, Sep 1, 2015, 11:20 PM Steve Tian 
> wrote:
> >
> > > Thanks, Rahul!  In my environment I need to have reconnect.backoff.ms
> > > longer than OS default tcp timeout so that NetworkClient can give
> second
> > > node a try.
> > >
> > > I believe this is related to
> > > https://issues.apache.org/jira/browse/KAFKA-2459 .
> > >
> > > Cheers, Steve
> > >
> > > On Tue, Sep 1, 2015, 5:24 PM Rahul Jain  wrote:
> > >
> > >> We did notice something similar. When a broker node (out of 3) went
> > down,
> > >> metadata calls continued to go to the failed node and producer kept
> > >> failing. We were able to make it work by increasing the
> > >> reconnect.backoff.ms
> > >> to 1 second.
> > >>
> > >> Something similar was discussed earlier -
> > >>
> > >>
> >
> http://qnalist.com/questions/6002514/new-producer-metadata-update-problem-on-2-node-cluster
> > >>
> > >>
> > >>
> > >> On Mon, Aug 31, 2015 at 11:00 PM, Steve Tian  >
> > >> wrote:
> > >>
> > >> > Hi everyone,
> > >> >
> > >> > Is there any concerns to have a long reconnect.backoff.ms for new
> > java
> > >> > Kafka producer (0.8.2.0/0.8.2.1)?
> > >> >
> > >> > Assuming we have
> bootstrap.servers=host1:port1,host2:port2,host3:port3
> > >> and
> > >> > host1 is *down* in the very beginning. If a newly created Kafka
> > producer
> > >> > decide to choose host1 as first node to connect for metadata update,
> > >> then
> > >> > that producer will keep trying on host1 *only* as default tcp
> timeout
> > is
> > >> > surely longer than default value of reconnect.backoff.ms, which is
> 10
> > >> ms.
> > >> >
> > >> > I am thinking to have reconnect.backoff.ms longer than N * T where
> N
> > is
> > >> > the
> > >> > number of nodes in bootstrap.servers and T is the default tcp
> timeout.
> > >> Is
> > >> > there any concerns to have a long reconnect.backoff.ms like that?
> > Any
> > >> > better solutions?
> > >> >
> > >> > Cheers, Steve
> > >> >
> > >>
> > >
> >
>
>
>
> --
> Thanks,
> Ewen
>


Re: Producer Config Error - Kafka 0.9.0.0

2016-01-18 Thread Steve Tian
Hi Joe,

bootstrap.servers is the configuration for the new *Java* producer.

Cheers, Steve
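
To make the distinction concrete, here is a hypothetical sketch (not Joe's
code; broker address and topic are placeholders): the new producer under
org.apache.kafka.clients.producer only needs bootstrap.servers, while the old
Scala kafka.producer.ProducerConfig seen in the stack trace below still
requires metadata.broker.list.

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class NewProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");

        // org.apache.kafka.clients.producer.KafkaProducer: bootstrap.servers is enough.
        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("test", "key", "value"));
        }
        // Constructing kafka.producer.ProducerConfig (the class in the stack
        // trace) with these same properties would fail, because the old Scala
        // producer expects metadata.broker.list instead.
    }
}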

On Mon, Jan 18, 2016, 5:41 PM Joe San  wrote:

> Dear Kafka Users,
>
> Is there a bug with the producer config?
>
> I have asked a question on Stackoverflow:
>
>
> http://stackoverflow.com/questions/34851412/apache-kafka-producer-config-error
>
> As per the documentation, I need to only provide bootstrap.servers
>
> but, when I run my producer client, I get a message that says:
>
> Exception in thread "main" java.lang.IllegalArgumentException:
> requirement failed: Missing required property 'metadata.broker.list'
> at scala.Predef$.require(Predef.scala:219)
> at
> kafka.utils.VerifiableProperties.getString(VerifiableProperties.scala:177)
> at kafka.producer.ProducerConfig.(ProducerConfig.scala:66)
> at kafka.producer.ProducerConfig.(ProducerConfig.scala:56)
> at com.eon.vpp.MetricsProducer$.main(MetricsProducer.scala:45)
> at com.eon.vpp.MetricsProducer.main(MetricsProducer.scala)
>
>
> Regards,
> Joe
>


Re: Kafka 0.9 producer doesn't work

2016-01-20 Thread Steve Tian
Did you start your consumer before sending messages?  Which broker version?

Cheers, Steve

On Wed, Jan 20, 2016, 3:57 PM BYEONG-GI KIM  wrote:

> Hello.
>
> I set up the Kafka testbed environment on my VirtualBox, which simply has a
> Kafka broker.
>
> I tested the simple consumer & producer scripts, aka
> kafka-console-consumer.sh and bin/kafka-console-producer.sh respectively,
> and both of them worked fine. I could see the output from the consumer side
> whenever typing any words on the producer.
>
> After that, I moved to test a simple java kafka producer/consumer. I copied
> and pasted the example source code for producer from
>
> http://kafka.apache.org/090/javadoc/index.html?org/apache/kafka/clients/producer/KafkaProducer.html
> ,
> and yeah, unfortunately, it seems not working well; no output was printed
> by the above consumer script. There was even no error log on Eclipse.
>
> I really don't know what the problem is... I think that the properties for
> both zookeeper and kafka seems fine, since the example scripts worked well,
> at least.
>
> I attached my tested source code:
> ==
>  import java.util.Properties;
>
> import org.apache.kafka.clients.producer.KafkaProducer;
> import org.apache.kafka.clients.producer.Producer;
> import org.apache.kafka.clients.producer.ProducerRecord;
> import org.apache.kafka.common.KafkaException;
> import org.apache.kafka.common.errors.TimeoutException;
>
> public class ProducerExample {
> public static void main(String[] args) throws Exception, TimeoutException,
> KafkaException {
> Properties props = new Properties();
> props.put("bootstrap.servers", "10.10.0.40:9092");
> props.put("acks", "all");
> props.put("retries", 0);
> props.put("batch.size", 16384);
> // props.put("linger.ms", 1);
> props.put("buffer.memory", 33554432);
> props.put("key.serializer",
> "org.apache.kafka.common.serialization.StringSerializer");
> props.put("value.serializer",
> "org.apache.kafka.common.serialization.StringSerializer");
>
> Producer<String, String> producer = new KafkaProducer<String, String>(props);
>
> try {
> for (int i = 0; i < 10; i++) {
> producer.send(new ProducerRecord<String, String>("test", 0,
> Integer.toString(i), Integer.toString(i)));
> }
> } catch (TimeoutException te) {
> System.out.println(te.getStackTrace());
> te.getStackTrace();
> } catch (Exception ke) {
> System.out.println(ke.getStackTrace());
> ke.getStackTrace();
> }
>
> producer.close();
> }
> }
> ==
>
> Any advice would really be helpful. Thanks in advance.
>
> Best regards
>
> Kim
>


Re: Kafka 0.9 producer doesn't work

2016-01-20 Thread Steve Tian
Your code works in my environment.  Are you able to run your producer code
inside your VM?  You can also debug by changing the log level to
DEBUG/TRACE.

Cheers, Steve
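
For instance, a log4j.properties fragment along these lines (a sketch; the
appender name and pattern are arbitrary) makes the client's connection
attempts and metadata updates visible:

log4j.rootLogger=INFO, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=[%d] %p %m (%c)%n
# Turn up only the Kafka client packages to DEBUG (or TRACE).
log4j.logger.org.apache.kafka.clients=DEBUG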

On Wed, Jan 20, 2016, 4:30 PM BYEONG-GI KIM  wrote:

> Sure, I started consumer before starting and sending messages from
> producer, and my broker version, if you mean the kafka version, is 0.9.0.
>
> Best regards
>
> Kim
>
> 2016-01-20 17:28 GMT+09:00 Steve Tian :
>
>> Did you start your consumer before sending message?  Broker version?
>>
>> Cheers, Steve
>>
>> On Wed, Jan 20, 2016, 3:57 PM BYEONG-GI KIM  wrote:
>>
>> > Hello.
>> >
>> > I set up the Kafka testbed environment on my VirtualBox, which simply
>> has a
>> > Kafka broker.
>> >
>> > I tested the simple consumer & producer scripts, aka
>> > kafka-console-consumer.sh and bin/kafka-console-producer.sh
>> respectively,
>> > and both of them worked fine. I could see the output from the consumer
>> side
>> > whenever typing any words on the producer.
>> >
>> > After that, I moved to test a simple java kafka producer/consumer. I
>> copied
>> > and pasted the example source code for producer from
>> >
>> >
>> http://kafka.apache.org/090/javadoc/index.html?org/apache/kafka/clients/producer/KafkaProducer.html
>> > ,
>> > and yeah, unfortunately, it seems not working well; no output was
>> printed
>> > by the above consumer script. There was even no error log on Eclipse.
>> >
>> > I really don't know what the problem is... I think that the properties
>> for
>> > both zookeeper and kafka seems fine, since the example scripts worked
>> well,
>> > at least.
>> >
>> > I attached my tested source code:
>> > ==
>> >  import java.util.Properties;
>> >
>> > import org.apache.kafka.clients.producer.KafkaProducer;
>> > import org.apache.kafka.clients.producer.Producer;
>> > import org.apache.kafka.clients.producer.ProducerRecord;
>> > import org.apache.kafka.common.KafkaException;
>> > import org.apache.kafka.common.errors.TimeoutException;
>> >
>> > public class ProducerExample {
>> > public static void main(String[] args) throws Exception,
>> TimeoutException,
>> > KafkaException {
>> > Properties props = new Properties();
>> > props.put("bootstrap.servers", "10.10.0.40:9092");
>> > props.put("acks", "all");
>> > props.put("retries", 0);
>> > props.put("batch.size", 16384);
>> > // props.put("linger.ms", 1);
>> > props.put("buffer.memory", 33554432);
>> > props.put("key.serializer",
>> > "org.apache.kafka.common.serialization.StringSerializer");
>> > props.put("value.serializer",
>> > "org.apache.kafka.common.serialization.StringSerializer");
>> >
>> > Producer producer = new KafkaProducer> > String>(props);
>> >
>> > try {
>> > for (int i = 0; i < 10; i++) {
>> > producer.send(new ProducerRecord("test", 0,
>> > Integer.toString(i), Integer.toString(i)));
>> > }
>> > } catch (TimeoutException te) {
>> > System.out.println(te.getStackTrace());
>> > te.getStackTrace();
>> > } catch (Exception ke) {
>> > System.out.println(ke.getStackTrace());
>> > ke.getStackTrace();
>> > }
>> >
>> > producer.close();
>> > }
>> > }
>> > ==
>> >
>> > Any advice would really be helpful. Thanks in advance.
>> >
>> > Best regards
>> >
>> > Kim
>> >
>>
>
>
>
> --
> (주)비디 Cloud Business Division, Wise HQ, Cloud Technology Team, Senior Engineer
>


Re: Kafka 0.9 producer doesn't work

2016-01-20 Thread Steve Tian
Yes, that's the version I was using.

If all you need is the Java client, then you can try:

<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka-clients</artifactId>
    <version>0.9.0.0</version>
</dependency>


Cheers, Steve

On Thu, Jan 21, 2016, 9:04 AM BYEONG-GI KIM  wrote:

> Dear Steve
>
> Could you tell me what kafka version you are using for the source code's
> package?
>
> I included the kafka library from maven repository (
> http://mvnrepository.com/artifact/org.apache.kafka), and the artifactId
> is kafka_2.11 and version is 0.9.0.0. The link is as below:
>
> http://mvnrepository.com/artifact/org.apache.kafka/kafka_2.11/0.9.0.0
>
> The maven dependency is as below:
>
> <dependency>
>     <groupId>org.apache.kafka</groupId>
>     <artifactId>kafka_2.11</artifactId>
>     <version>0.9.0.0</version>
> </dependency>
>
> Are you using this version?
>
> Best regards
>
> Kim
>
> 2016-01-20 18:14 GMT+09:00 Steve Tian :
>
>> Your code works in my environment.  Are you able to run your producer
>> code inside your vm?  You can also debug via changing the log level to
>> DEGUG/TRACE.
>>
>> Cheers, Steve
>>
>>
>> On Wed, Jan 20, 2016, 4:30 PM BYEONG-GI KIM  wrote:
>>
>>> Sure, I started consumer before starting and sending messages from
>>> producer, and my broker version, if you mean the kafka version, is 0.9.0.
>>>
>>> Best regards
>>>
>>> Kim
>>>
>>> 2016-01-20 17:28 GMT+09:00 Steve Tian :
>>>
>>>> Did you start your consumer before sending message?  Broker version?
>>>>
>>>> Cheers, Steve
>>>>
>>>> On Wed, Jan 20, 2016, 3:57 PM BYEONG-GI KIM  wrote:
>>>>
>>>> > Hello.
>>>> >
>>>> > I set up the Kafka testbed environment on my VirtualBox, which simply
>>>> has a
>>>> > Kafka broker.
>>>> >
>>>> > I tested the simple consumer & producer scripts, aka
>>>> > kafka-console-consumer.sh and bin/kafka-console-producer.sh
>>>> respectively,
>>>> > and both of them worked fine. I could see the output from the
>>>> consumer side
>>>> > whenever typing any words on the producer.
>>>> >
>>>> > After that, I moved to test a simple java kafka producer/consumer. I
>>>> copied
>>>> > and pasted the example source code for producer from
>>>> >
>>>> >
>>>> http://kafka.apache.org/090/javadoc/index.html?org/apache/kafka/clients/producer/KafkaProducer.html
>>>> > ,
>>>> > and yeah, unfortunately, it seems not working well; no output was
>>>> printed
>>>> > by the above consumer script. There was even no error log on Eclipse.
>>>> >
>>>> > I really don't know what the problem is... I think that the
>>>> properties for
>>>> > both zookeeper and kafka seems fine, since the example scripts worked
>>>> well,
>>>> > at least.
>>>> >
>>>> > I attached my tested source code:
>>>> > ==
>>>> >  import java.util.Properties;
>>>> >
>>>> > import org.apache.kafka.clients.producer.KafkaProducer;
>>>> > import org.apache.kafka.clients.producer.Producer;
>>>> > import org.apache.kafka.clients.producer.ProducerRecord;
>>>> > import org.apache.kafka.common.KafkaException;
>>>> > import org.apache.kafka.common.errors.TimeoutException;
>>>> >
>>>> > public class ProducerExample {
>>>> > public static void main(String[] args) throws Exception,
>>>> TimeoutException,
>>>> > KafkaException {
>>>> > Properties props = new Properties();
>>>> > props.put("bootstrap.servers", "10.10.0.40:9092");
>>>> > props.put("acks", "all");
>>>> > props.put("retries", 0);
>>>> > props.put("batch.size", 16384);
>>>> > // props.put("linger.ms", 1);
>>>> > props.put("buffer.memory", 33554432);
>>>> > props.put("key.serializer",
>>>> > "org.apache.kafka.common.serialization.StringSerializer");
>>>> > props.put("value.serializer",
>>>> > "org.apache.kafka.common.serialization.StringSerializer");
>>>> >
>>>> > Producer producer = new KafkaProducer>>> > String>(props);
>>>> >
>>>> > try {
>>>> > for (int i = 0; i < 10; i++) {
>>>> > producer.send(new ProducerRecord("test", 0,
>>>> > Integer.toString(i), Integer.toString(i)));
>>>> > }
>>>> > } catch (TimeoutException te) {
>>>> > System.out.println(te.getStackTrace());
>>>> > te.getStackTrace();
>>>> > } catch (Exception ke) {
>>>> > System.out.println(ke.getStackTrace());
>>>> > ke.getStackTrace();
>>>> > }
>>>> >
>>>> > producer.close();
>>>> > }
>>>> > }
>>>> > ==
>>>> >
>>>> > Any advice would really be helpful. Thanks in advance.
>>>> >
>>>> > Best regards
>>>> >
>>>> > Kim
>>>> >
>>>>
>>>
>>>
>>>
>>> --
>>> (주)비디 Cloud Business Division, Wise HQ, Cloud Technology Team, Senior Engineer
>>>
>>
>
>
> --
> (주)비디 Cloud Business Division, Wise HQ, Cloud Technology Team, Senior Engineer
>


Re: Kafka 0.9 producer doesn't work

2016-01-20 Thread Steve Tian
Have you checked the firewall settings on the VM/host?

On Thu, Jan 21, 2016, 10:29 AM BYEONG-GI KIM  wrote:

> Hello.
>
> I packaged it to an executable jar file and executed it on the VM, and
> yes, it was successfully worked.
>
> I'm really confuse why it didn't work on my Windows10 environment where is
> on the host environment and worked well on the VM environment... It is
> weird indeed.
>
> Best regards
>
> Kim
>
> 2016-01-20 18:14 GMT+09:00 Steve Tian :
>
>> Your code works in my environment.  Are you able to run your producer
>> code inside your vm?  You can also debug via changing the log level to
>> DEGUG/TRACE.
>>
>> Cheers, Steve
>>
>>
>> On Wed, Jan 20, 2016, 4:30 PM BYEONG-GI KIM  wrote:
>>
>>> Sure, I started consumer before starting and sending messages from
>>> producer, and my broker version, if you mean the kafka version, is 0.9.0.
>>>
>>> Best regards
>>>
>>> Kim
>>>
>>> 2016-01-20 17:28 GMT+09:00 Steve Tian :
>>>
>>>> Did you start your consumer before sending message?  Broker version?
>>>>
>>>> Cheers, Steve
>>>>
>>>> On Wed, Jan 20, 2016, 3:57 PM BYEONG-GI KIM  wrote:
>>>>
>>>> > Hello.
>>>> >
>>>> > I set up the Kafka testbed environment on my VirtualBox, which simply
>>>> has a
>>>> > Kafka broker.
>>>> >
>>>> > I tested the simple consumer & producer scripts, aka
>>>> > kafka-console-consumer.sh and bin/kafka-console-producer.sh
>>>> respectively,
>>>> > and both of them worked fine. I could see the output from the
>>>> consumer side
>>>> > whenever typing any words on the producer.
>>>> >
>>>> > After that, I moved to test a simple java kafka producer/consumer. I
>>>> copied
>>>> > and pasted the example source code for producer from
>>>> >
>>>> >
>>>> http://kafka.apache.org/090/javadoc/index.html?org/apache/kafka/clients/producer/KafkaProducer.html
>>>> > ,
>>>> > and yeah, unfortunately, it seems not working well; no output was
>>>> printed
>>>> > by the above consumer script. There was even no error log on Eclipse.
>>>> >
>>>> > I really don't know what the problem is... I think that the
>>>> properties for
>>>> > both zookeeper and kafka seems fine, since the example scripts worked
>>>> well,
>>>> > at least.
>>>> >
>>>> > I attached my tested source code:
>>>> > ==
>>>> >  import java.util.Properties;
>>>> >
>>>> > import org.apache.kafka.clients.producer.KafkaProducer;
>>>> > import org.apache.kafka.clients.producer.Producer;
>>>> > import org.apache.kafka.clients.producer.ProducerRecord;
>>>> > import org.apache.kafka.common.KafkaException;
>>>> > import org.apache.kafka.common.errors.TimeoutException;
>>>> >
>>>> > public class ProducerExample {
>>>> > public static void main(String[] args) throws Exception,
>>>> TimeoutException,
>>>> > KafkaException {
>>>> > Properties props = new Properties();
>>>> > props.put("bootstrap.servers", "10.10.0.40:9092");
>>>> > props.put("acks", "all");
>>>> > props.put("retries", 0);
>>>> > props.put("batch.size", 16384);
>>>> > // props.put("linger.ms", 1);
>>>> > props.put("buffer.memory", 33554432);
>>>> > props.put("key.serializer",
>>>> > "org.apache.kafka.common.serialization.StringSerializer");
>>>> > props.put("value.serializer",
>>>> > "org.apache.kafka.common.serialization.StringSerializer");
>>>> >
>>>> > Producer producer = new KafkaProducer>>> > String>(props);
>>>> >
>>>> > try {
>>>> > for (int i = 0; i < 10; i++) {
>>>> > producer.send(new ProducerRecord("test", 0,
>>>> > Integer.toString(i), Integer.toString(i)));
>>>> > }
>>>> > } catch (TimeoutException te) {
>>>> > System.out.println(te.getStackTrace());
>>>> > te.getStackTrace();
>>>> > } catch (Exception ke) {
>>>> > System.out.println(ke.getStackTrace());
>>>> > ke.getStackTrace();
>>>> > }
>>>> >
>>>> > producer.close();
>>>> > }
>>>> > }
>>>> > ==
>>>> >
>>>> > Any advice would really be helpful. Thanks in advance.
>>>> >
>>>> > Best regards
>>>> >
>>>> > Kim
>>>> >
>>>>
>>>
>>>
>>>
>>> --
>>> (주)비디 Cloud Business Division, Wise HQ, Cloud Technology Team, Senior Engineer
>>>
>>
>
>
> --
> (주)비디 Cloud Business Division, Wise HQ, Cloud Technology Team, Senior Engineer
>


Re: Kafka take too long to update the client with metadata when a broker is gone

2016-06-02 Thread Steve Tian
Client version?

On Fri, Jun 3, 2016, 4:44 AM safique ahemad  wrote:

> Hi All,
>
> We are using a Kafka broker cluster in our data center.
> Recently, we realized that when a Kafka broker goes down, the client tries
> to refresh the metadata but gets stale metadata for up to about 30 seconds.
>
> Only after about 30-35 seconds does the client obtain the updated metadata.
> This is a really long time, during which the client continuously gets send
> failures.
>
> Kindly reply if any configuration, or anything else, may help here.
>
>
> --
>
> Regards,
> Safique Ahemad
>


Re: Kafka take too long to update the client with metadata when a broker is gone

2016-06-02 Thread Steve Tian
So you are coming from https://github.com/Shopify/sarama/issues/661 ,
right?   I'm not sure if anything on the broker side can help, but it looks
like you already found that DialTimeout on the client side can help?

Cheers, Steve

On Fri, Jun 3, 2016, 8:33 AM safique ahemad  wrote:

> kafka version:0.9.0.0
> go sarama client version: 1.8
>
> On Thu, Jun 2, 2016 at 5:14 PM, Steve Tian 
> wrote:
>
> > Client version?
> >
> > On Fri, Jun 3, 2016, 4:44 AM safique ahemad 
> wrote:
> >
> > > Hi All,
> > >
> > > We are using Kafka broker cluster in our data center.
> > > Recently, It is realized that when a Kafka broker goes down then client
> > try
> > > to refresh the metadata but it get stale metadata upto near 30 seconds.
> > >
> > > After near 30-35 seconds, updated metadata is obtained by client. This
> is
> > > really a large time for the client continuously gets send failure for
> so
> > > long.
> > >
> > > Kindly, reply if any configuration may help here or something else or
> > > required.
> > >
> > >
> > > --
> > >
> > > Regards,
> > > Safique Ahemad
> > >
> >
>
>
>
> --
>
> Regards,
> Safique Ahemad
> GlobalLogic | Leaders in software R&D services
> P :+91 120 4342000-2990 | M:+91 9953533367
> www.globallogic.com
>


Re: Kafka take too long to update the client with metadata when a broker is gone

2016-06-02 Thread Steve Tian
I see.  I'm not sure if this is a known issue.  Do you mind sharing the
broker/topic setup and the steps to reproduce this issue?

Cheers, Steve

On Fri, Jun 3, 2016, 9:45 AM safique ahemad  wrote:

> You got it right...
>
> But DialTimeout is not the concern here. The client tries fetching metadata
> from the Kafka brokers, but Kafka gives it stale metadata for about 30-40 sec.
> It tries to fetch 3-4 times in between until it gets the updated metadata.
> This is a completely different problem from
> https://github.com/Shopify/sarama/issues/661
>
>
>
> On Thu, Jun 2, 2016 at 6:05 PM, Steve Tian 
> wrote:
>
> > So you are coming from https://github.com/Shopify/sarama/issues/661 ,
> > right?   I'm not sure if anything from broker side can help but looks
> like
> > you already found DialTimeout on client side can help?
> >
> > Cheers, Steve
> >
> > On Fri, Jun 3, 2016, 8:33 AM safique ahemad 
> wrote:
> >
> > > kafka version:0.9.0.0
> > > go sarama client version: 1.8
> > >
> > > On Thu, Jun 2, 2016 at 5:14 PM, Steve Tian 
> > > wrote:
> > >
> > > > Client version?
> > > >
> > > > On Fri, Jun 3, 2016, 4:44 AM safique ahemad 
> > > wrote:
> > > >
> > > > > Hi All,
> > > > >
> > > > > We are using Kafka broker cluster in our data center.
> > > > > Recently, It is realized that when a Kafka broker goes down then
> > client
> > > > try
> > > > > to refresh the metadata but it get stale metadata upto near 30
> > seconds.
> > > > >
> > > > > After near 30-35 seconds, updated metadata is obtained by client.
> > This
> > > is
> > > > > really a large time for the client continuously gets send failure
> for
> > > so
> > > > > long.
> > > > >
> > > > > Kindly, reply if any configuration may help here or something else
> or
> > > > > required.
> > > > >
> > > > >
> > > > > --
> > > > >
> > > > > Regards,
> > > > > Safique Ahemad
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > >
> > > Regards,
> > > Safique Ahemad
> > > GlobalLogic | Leaders in software R&D services
> > > P :+91 120 4342000-2990 | M:+91 9953533367
> > > www.globallogic.com
> > >
> >
>
>
>
> --
>
> Regards,
> Safique Ahemad
> GlobalLogic | Leaders in software R&D services
> P :+91 120 4342000-2990 | M:+91 9953533367
> www.globallogic.com
>


Re: Max poll interval and timeouts

2020-03-25 Thread Steve Tian
Hi Ryan,

Have you tried Consumer's pause/resume methods?

Steve
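
A rough sketch of that approach (not Ryan's code; the topic, group id, and
threshold are made up): pause the assigned partitions once the local buffer is
full, keep calling poll() while the slow batch work runs so the consumer stays
inside max.poll.interval.ms, then resume.

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class PauseResumeSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // placeholder
        props.put("group.id", "batching-consumer");         // placeholder
        props.put("enable.auto.commit", "false");
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("events"));  // placeholder topic
            int buffered = 0;
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    buffered++;                              // stand-in for real buffering
                }
                if (buffered >= 1000) {                      // arbitrary flush threshold
                    consumer.pause(consumer.assignment());   // stop fetching new records
                    for (int slice = 0; slice < 10; slice++) {
                        processBufferSlice();                // stand-in for slow batch work
                        consumer.poll(Duration.ZERO);        // returns nothing while paused,
                                                             // but keeps the consumer "live"
                    }
                    consumer.commitSync();
                    consumer.resume(consumer.paused());
                    buffered = 0;
                }
            }
        }
    }

    private static void processBufferSlice() { /* flush part of the buffer downstream */ }
}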

On Wed, Mar 25, 2020, 17:13 Kamal Chandraprakash <
kamal.chandraprak...@gmail.com> wrote:

> With group coordination protocol, you only have to increase the `
> max.poll.interval.ms` / `max.poll.records`.
> Ignore the above messages. Consumer heartbeats are processed in a separate
> thread.
>
> On Wed, Mar 25, 2020 at 2:35 PM Kamal Chandraprakash <
> kamal.chandraprak...@gmail.com> wrote:
>
> > Yes, with `assign` you'll lose the group coordination. You can still use
> > the `subscribe` mode, update the above mentioned configs.
> > You're ask is kind of Delay Queue. Kafka Consumer doesn't support that
> > feature. You've to manually `sleep` in between the poll calls.
> >
> > On Tue, Mar 24, 2020 at 11:56 PM Ryan Schachte <
> coderyanschac...@gmail.com>
> > wrote:
> >
> >> Don't I lose consumer group coordination with assign?
> >>
> >> On Mon, Mar 23, 2020 at 11:49 PM Kamal Chandraprakash <
> >> kamal.chandraprak...@gmail.com> wrote:
> >>
> >> > Hi Ryan,
> >> >
> >> > The maxPollInterval waits for at-most the given time duration and
> >> returns
> >> > ASAP even if a single record is available.
> >> > If you want to collect data once 30-45 minutes,  better to use the
> >> Consumer
> >> > with `assign` mode and poll for records
> >> > once in 30 minutes.
> >> >
> >> > If you're using the consumer with `subscribe` mode, then you have to
> >> update
> >> > the following configs:
> >> > 1. session.timeout.ms
> >> > 2. heartbeat.interval.ms and
> >> > 3. group.max.session.timeout.ms in the broker configs.
> >> >
> >> > Increasing the session timeout will lead to delay in detecting the
> >> consumer
> >> > failures, I would suggest to go with `assign` mode.
> >> >
> >> >
> >> > On Tue, Mar 24, 2020 at 4:45 AM Ryan Schachte <
> >> coderyanschac...@gmail.com>
> >> > wrote:
> >> >
> >> > > Hey guys, I'm getting a bit overwhelmed by the different variables
> >> used
> >> > to
> >> > > help enable batching for me.
> >> > >
> >> > > I have some custom batching logic that processes when either N
> records
> >> > have
> >> > > been buffered or my max timeout has been hit. It was working
> decently
> >> > well,
> >> > > but I hit this error:
> >> > >
> >> > > *This means that the time between subsequent calls to poll() was
> >> longer
> >> > > than the configured max.poll.interval.ms <
> http://max.poll.interval.ms
> >> >,
> >> > > which typically implies that the poll loop is spending too much time
> >> > > message processing.*
> >> > >
> >> > > I ultimately want to wait for the buffer to fill up or sit and
> collect
> >> > data
> >> > > continuously for 30-45 mins at a time. Do I need to do anything with
> >> > > heartbeat or session timeout as well?
> >> > >
> >> > > So now my question is.. Can I just bump my maxPollInterval to
> >> something
> >> > > like:
> >> > >
> >> > > maxPollInterval: '270',
> >> > >
> >> >
> >>
> >
>


Re: Kafka Consumer Hung on Certain Partitions - Single Kafka Consumer

2018-07-08 Thread Steve Tian
Are you sure your consumer was still the owner of your partitions?

On Mon, Jul 9, 2018, 12:54 PM dev loper  wrote:

> Hi Kafka Users,
>
> I am desperate to find an answer to this issue. I would like to know
> whether my issue is due to using a single Kafka consumer. Where should I look
> for answers to this issue? Is it something to do with the Kafka broker? I
> am using Kafka version 0.11.01 for both the Kafka broker and client.
>
> Thanks
>
> Dev Loper
>
> On Fri, Jul 6, 2018 at 11:01 PM, dev loper  wrote:
>
> > Hi Kafka Streams Users,
> >
> > I have posted the same question on stackoverflow and if anybody could
> > point some directions  it would be of great help.
> >
> > https://stackoverflow.com/questions/51214506/kafka-
> > consumer-hung-on-certain-partitions-single-kafka-consumer
> >
> >
> > On Fri, Jul 6, 2018 at 10:25 PM, dev loper  wrote:
> >
> >> Hi Kafka Streams Users,
> >>
> >>  My test environment, I have three Kafka brokers and my topic is having
> >> 16 partitions, there are 16 IOT Devices which are posting the messages
> to
> >> Kafka. I have a single system with one Kafka Consumer which subscribes
> to
> >> this topic. Each IOT devices are posting  the message to Kafka every
> second
> >> and its distributed uniformly. I am
> >> printing the offset and partition to which the data posted using Kafka
> >> Producer Call back method on these each IOT device . My consumer stops
> >> consuming the messages from certain partitions randomly and at the same
> >> time its processing records from other partitions. I actually verified
> the
> >> IOT device logs and I could see that the data is actually getting
> posted to
> >> the particular partitions where the consumer has stopped consuming and
> I am
> >> able to see the offsets getting incremented for those partitions. There
> are
> >> no exceptions or any error of any kind in the consumer except that I
> don't
> >> see any processing logs for the partitions which stopped processing.
> >>
> >>
> >> Below I have given my pseudo code for my consumer  which almost
> resembles
> >> the code which I am using in my application .
> >>
> >> public class MyKafkaConumer extends Thread {
> >>
> >>private static final AtomicBoolean running= new AtomicBoolean(true);
> >>private static final KafkaConsumer consumer;
> >>public static final MyKafkaConumer INSTANCE = new MyKafkaConumer();
> >>
> >>static {
> >>
> >>   Properties props = new Properties();
> >>   props.put("bootstrap.servers", "kafkaServer101:9092,kafkaServ
> >> er102:9092,kafkaServer103:9092");
> >>   props.put("group.id", "mykafa-group");
> >>   props.put("enable.auto.commit", "true");
> >>   props.put("auto.commit.interval.ms", "1000");
> >>   props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
> >>   StringDeserializer.class.getName());
> >>   props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
> >>   IOTDeserializer.class.getName());
> >>   consumer = new KafkaConsumer(props);
> >>   consumer.subscribe(Arrays.asList("mytopic"));
> >>
> >>}
> >>
> >>  private MyKafkaConumer() {
> >>   super("MyKafkaConumer");
> >> }
> >>
> >>public void run() {
> >>
> >>
> >>try {
> >> while (running.get()) {
> >> ConsumerRecords records = consumer.poll(2000L);
> >> records.forEach(record -> {
> >> System.out.printf("Consumer Record:(%d, %s, %d, %d)\n",
> >> record.key(), record.value(),
> >> record.partition(), record.offset());
> >> });
> >>
> >> }
> >> } finally {
> >>   consumer.close();
> >> }
> >>
> >>}
> >>
> >>public static void main(String[] args) throws InterruptedException {
> >>MyKafkaConumer.INSTANCE.start();
> >>MyKafkaConumer.INSTANCE.join();
> >> }
> >> }
> >>
> >> I only have a single Consumer with a single thread running . What could
> >> be the reason for the Kafka Consumer to stop processing from certain
> >> partitions while the processing is happening for other partitions even
> >> though the producer is sending message to the partitions where it was
> >> stuck ? Any help here is very much appreciated.
> >>
> >> Thanks
> >> Dev Loper
> >>
> >>
> >>
> >>
> >>
> >
>


Re: Kafka Consumer Hung on Certain Partitions - Single Kafka Consumer

2018-07-08 Thread Steve Tian
Also, have you tried enabling DEBUG-level logging for KafkaConsumer? How
long does it take to process the records in a single batch?

On Mon, Jul 9, 2018, 2:51 PM Sabarish Sasidharan 
wrote:

> Can you try executing kafka-consumer-groups to see what it says? Are your
> log offsets increasing and the lag increasing when processing for some of
> your partitions are stuck?
>
> May be the problem is in the producer side.
>
> Regards
> Sab
>
>
> On Mon, 9 Jul 2018, 11:43 am Steve Tian,  wrote:
>
> > Are you sure your consumer was still the owner of your partitions?
> >
> > On Mon, Jul 9, 2018, 12:54 PM dev loper  wrote:
> >
> > > Hi Kafka Users,
> > >
> > > I am desperate to find an answer to this issue. I would like to know
> > > whether my issue is due to a Single Kafka Consumer ?  Where should I
> look
> > > for answers for this issue?  Is it something do with the Kafka Broker
> ? I
> > > am using Kafka version 0.11.01 version for both Kafka broker an client
> .
> > >
> > > Thanks
> > >
> > > Dev Loper
> > >
> > > On Fri, Jul 6, 2018 at 11:01 PM, dev loper  wrote:
> > >
> > > > Hi Kafka Streams Users,
> > > >
> > > > I have posted the same question on stackoverflow and if anybody could
> > > > point some directions  it would be of great help.
> > > >
> > > > https://stackoverflow.com/questions/51214506/kafka-
> > > > consumer-hung-on-certain-partitions-single-kafka-consumer
> > > >
> > > >
> > > > On Fri, Jul 6, 2018 at 10:25 PM, dev loper 
> wrote:
> > > >
> > > >> Hi Kafka Streams Users,
> > > >>
> > > >>  My test environment, I have three Kafka brokers and my topic is
> > having
> > > >> 16 partitions, there are 16 IOT Devices which are posting the
> messages
> > > to
> > > >> Kafka. I have a single system with one Kafka Consumer which
> subscribes
> > > to
> > > >> this topic. Each IOT devices are posting  the message to Kafka every
> > > second
> > > >> and its distributed uniformly. I am
> > > >> printing the offset and partition to which the data posted using
> Kafka
> > > >> Producer Call back method on these each IOT device . My consumer
> stops
> > > >> consuming the messages from certain partitions randomly and at the
> > same
> > > >> time its processing records from other partitions. I actually
> verified
> > > the
> > > >> IOT device logs and I could see that the data is actually getting
> > > posted to
> > > >> the particular partitions where the consumer has stopped consuming
> and
> > > I am
> > > >> able to see the offsets getting incremented for those partitions.
> > There
> > > are
> > > >> no exceptions or any error of any kind in the consumer except that I
> > > don't
> > > >> see any processing logs for the partitions which stopped processing.
> > > >>
> > > >>
> > > >> Below I have given my pseudo code for my consumer  which almost
> > > resembles
> > > >> the code which I am using in my application .
> > > >>
> > > >> public class MyKafkaConumer extends Thread {
> > > >>
> > > >>private static final AtomicBoolean running= new
> > AtomicBoolean(true);
> > > >>private static final KafkaConsumer consumer;
> > > >>public static final MyKafkaConumer INSTANCE = new
> MyKafkaConumer();
> > > >>
> > > >>static {
> > > >>
> > > >>   Properties props = new Properties();
> > > >>   props.put("bootstrap.servers",
> > "kafkaServer101:9092,kafkaServ
> > > >> er102:9092,kafkaServer103:9092");
> > > >>   props.put("group.id", "mykafa-group");
> > > >>   props.put("enable.auto.commit", "true");
> > > >>   props.put("auto.commit.interval.ms", "1000");
> > > >>   props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
> > > >>   StringDeserializer.class.getName());
> > > >>   props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
> > > >>   IOTDeserializer.class.getName());
> > > >&

Re: Very long consumer rebalances

2018-07-09 Thread Steve Tian
Please re-read the KafkaConsumer javadoc and make sure you know how to
wake up/close the consumer properly while shutting down your application.  Try
to understand the motivation of KIP-62 and adjust the related timeouts.
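
For reference, a minimal version of the shutdown pattern described in the
KafkaConsumer javadoc (broker address, group id, and topic are placeholders):
a shutdown hook calls wakeup(), the poll loop catches WakeupException and
closes the consumer, so the group learns about the departure immediately
instead of waiting for a session/poll timeout.

import java.util.Collections;
import java.util.Properties;
import java.util.concurrent.CountDownLatch;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.errors.WakeupException;

public class GracefulShutdownSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "example-consumer-group");
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        CountDownLatch closed = new CountDownLatch(1);

        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            consumer.wakeup();   // the only KafkaConsumer call safe from another thread
            try {
                closed.await();  // wait until the poll loop has closed the consumer
            } catch (InterruptedException ignored) {
                Thread.currentThread().interrupt();
            }
        }));

        try {
            consumer.subscribe(Collections.singletonList("example-topic"));
            while (true) {
                consumer.poll(1000L).forEach(record -> System.out.println(record.value()));
            }
        } catch (WakeupException expectedOnShutdown) {
            // ignore: wakeup() was called by the shutdown hook
        } finally {
            consumer.close();    // leaves the group cleanly, triggering an immediate rebalance
            closed.countDown();
        }
    }
}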

On Mon, Jul 9, 2018, 8:05 PM harish lohar  wrote:

> Try reducing below timer
> metadata.max.age.ms = 30
>
>
> On Fri, Jul 6, 2018 at 5:55 AM Shantanu Deshmukh 
> wrote:
>
> > Hello everyone,
> >
> > We are running a 3 broker Kafka 0.10.0.1 cluster. We have a java app
> which
> > spawns many consumer threads consuming from different topics. For every
> > topic we have specified different consumer-group. A lot of times I see
> that
> > whenever this application is restarted a CG on one or two topics takes
> more
> > than 5 minutes to receive partition assignment. Till that time consumers
> > for that topic don't consumer anything. If I go to Kafka broker and run
> > consumer-groups.sh and describe that particular CG I see that it is
> > rebalancing. There is time critical data stored in that topic and we
> cannot
> > tolerate such long delays. What can be the reason for such long
> rebalances.
> >
> > Here's our consumer config
> >
> >
> > auto.commit.interval.ms = 3000
> > auto.offset.reset = latest
> > bootstrap.servers = [x.x.x.x:9092, x.x.x.x:9092, x.x.x.x:9092]
> > check.crcs = true
> > client.id =
> > connections.max.idle.ms = 54
> > enable.auto.commit = true
> > exclude.internal.topics = true
> > fetch.max.bytes = 52428800
> > fetch.max.wait.ms = 500
> > fetch.min.bytes = 1
> > group.id = otp-notifications-consumer
> > heartbeat.interval.ms = 3000
> > interceptor.classes = null
> > key.deserializer = class
> > org.apache.kafka.common.serialization.StringDeserializer
> > max.partition.fetch.bytes = 1048576
> > max.poll.interval.ms = 30
> > max.poll.records = 50
> > metadata.max.age.ms = 30
> > metric.reporters = []
> > metrics.num.samples = 2
> > metrics.sample.window.ms = 3
> > partition.assignment.strategy = [class
> > org.apache.kafka.clients.consumer.RangeAssignor]
> > receive.buffer.bytes = 65536
> > reconnect.backoff.ms = 50
> > request.timeout.ms = 305000
> > retry.backoff.ms = 100
> > sasl.kerberos.kinit.cmd = /usr/bin/kinit
> > sasl.kerberos.min.time.before.relogin = 6
> > sasl.kerberos.service.name = null
> > sasl.kerberos.ticket.renew.jitter = 0.05
> > sasl.kerberos.ticket.renew.window.factor = 0.8
> > sasl.mechanism = GSSAPI
> > security.protocol = SSL
> > send.buffer.bytes = 131072
> > session.timeout.ms = 30
> > ssl.cipher.suites = null
> > ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
> > ssl.endpoint.identification.algorithm = null
> > ssl.key.password = null
> > ssl.keymanager.algorithm = SunX509
> > ssl.keystore.location = null
> > ssl.keystore.password = null
> > ssl.keystore.type = JKS
> > ssl.protocol = TLS
> > ssl.provider = null
> > ssl.secure.random.implementation = null
> > ssl.trustmanager.algorithm = PKIX
> > ssl.truststore.location = /x/x/client.truststore.jks
> > ssl.truststore.password = [hidden]
> > ssl.truststore.type = JKS
> > value.deserializer = class
> > org.apache.kafka.common.serialization.StringDeserializer
> >
> > Please help.
> >
> > *Thanks & Regards,*
> > *Shantanu Deshmukh*
> >
>


Re: Very long consumer rebalances

2018-07-12 Thread Steve Tian
It's a very good and important doc so I think you should read it all.  You
should get some idea from sections like *Detecting Consumer Failures* and
*Multi-threaded Processing* for your case.

On Thu, Jul 12, 2018, 3:17 PM Shantanu Deshmukh 
wrote:

> Hi Steve,
>
> Could you please shed more light on this? What section should I revisit? I
> am using high-level consumer. So I am simply calling consumer.close() when
> I am shutting down the process. Is there any other method to be called
> before calling close()?
>
> On Mon, Jul 9, 2018 at 5:58 PM Steve Tian  wrote:
>
> > Please re-read the javadoc of KafkaConsumer, make sure you know how to
> > wakeup/close consumer properly while shutting down your application.  Try
> > to understand the motivation of KIP-62 and adjust related timeout.
> >
> > On Mon, Jul 9, 2018, 8:05 PM harish lohar  wrote:
> >
> > > Try reducing below timer
> > > metadata.max.age.ms = 30
> > >
> > >
> > > On Fri, Jul 6, 2018 at 5:55 AM Shantanu Deshmukh <
> shantanu...@gmail.com>
> > > wrote:
> > >
> > > > Hello everyone,
> > > >
> > > > We are running a 3 broker Kafka 0.10.0.1 cluster. We have a java app
> > > which
> > > > spawns many consumer threads consuming from different topics. For
> every
> > > > topic we have specified different consumer-group. A lot of times I
> see
> > > that
> > > > whenever this application is restarted a CG on one or two topics
> takes
> > > more
> > > > than 5 minutes to receive partition assignment. Till that time
> > consumers
> > > > for that topic don't consumer anything. If I go to Kafka broker and
> run
> > > > consumer-groups.sh and describe that particular CG I see that it is
> > > > rebalancing. There is time critical data stored in that topic and we
> > > cannot
> > > > tolerate such long delays. What can be the reason for such long
> > > rebalances.
> > > >
> > > > Here's our consumer config
> > > >
> > > >
> > > > auto.commit.interval.ms = 3000
> > > > auto.offset.reset = latest
> > > > bootstrap.servers = [x.x.x.x:9092, x.x.x.x:9092, x.x.x.x:9092]
> > > > check.crcs = true
> > > > client.id =
> > > > connections.max.idle.ms = 54
> > > > enable.auto.commit = true
> > > > exclude.internal.topics = true
> > > > fetch.max.bytes = 52428800
> > > > fetch.max.wait.ms = 500
> > > > fetch.min.bytes = 1
> > > > group.id = otp-notifications-consumer
> > > > heartbeat.interval.ms = 3000
> > > > interceptor.classes = null
> > > > key.deserializer = class
> > > > org.apache.kafka.common.serialization.StringDeserializer
> > > > max.partition.fetch.bytes = 1048576
> > > > max.poll.interval.ms = 30
> > > > max.poll.records = 50
> > > > metadata.max.age.ms = 30
> > > > metric.reporters = []
> > > > metrics.num.samples = 2
> > > > metrics.sample.window.ms = 3
> > > > partition.assignment.strategy = [class
> > > > org.apache.kafka.clients.consumer.RangeAssignor]
> > > > receive.buffer.bytes = 65536
> > > > reconnect.backoff.ms = 50
> > > > request.timeout.ms = 305000
> > > > retry.backoff.ms = 100
> > > > sasl.kerberos.kinit.cmd = /usr/bin/kinit
> > > > sasl.kerberos.min.time.before.relogin = 6
> > > > sasl.kerberos.service.name = null
> > > > sasl.kerberos.ticket.renew.jitter = 0.05
> > > > sasl.kerberos.ticket.renew.window.factor = 0.8
> > > > sasl.mechanism = GSSAPI
> > > > security.protocol = SSL
> > > > send.buffer.bytes = 131072
> > > > session.timeout.ms = 30
> > > > ssl.cipher.suites = null
> > > > ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
> > > > ssl.endpoint.identification.algorithm = null
> > > > ssl.key.password = null
> > > > ssl.keymanager.algorithm = SunX509
> > > > ssl.keystore.location = null
> > > > ssl.keystore.password = null
> > > > ssl.keystore.type = JKS
> > > > ssl.protocol = TLS
> > > > ssl.provider = null
> > > > ssl.secure.random.implementation = null
> > > > ssl.trustmanager.algorithm = PKIX
> > > > ssl.truststore.location = /x/x/client.truststore.jks
> > > > ssl.truststore.password = [hidden]
> > > > ssl.truststore.type = JKS
> > > > value.deserializer = class
> > > > org.apache.kafka.common.serialization.StringDeserializer
> > > >
> > > > Please help.
> > > >
> > > > *Thanks & Regards,*
> > > > *Shantanu Deshmukh*
> > > >
> > >
> >
>


Re: KafkaConsumer pause method not working as expected

2018-07-31 Thread Steve Tian
You should use a ConsumerRebalanceListener to check whether a rebalance
happened.  Seeing the same assigned partitions after each poll doesn't mean
there was no rebalance.

Steve
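
A small illustration of that check (a sketch; the listener only logs):
registering a ConsumerRebalanceListener with subscribe() makes every
revocation/assignment visible, and it is also the natural place to re-apply
pause(), since partitions handed back after a rebalance come back unpaused.

import java.util.Collection;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.common.TopicPartition;

public class LoggingRebalanceListener implements ConsumerRebalanceListener {
    @Override
    public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
        System.out.println("Rebalance: revoked " + partitions);
    }

    @Override
    public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
        // Reassigned partitions are not paused even if they were paused before
        // the rebalance; re-apply consumer.pause(...) here if flow control
        // should survive a rebalance.
        System.out.println("Rebalance: assigned " + partitions);
    }
}

// Usage: consumer.subscribe(Collections.singletonList("mytopic"), new LoggingRebalanceListener());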

On Wed, Aug 1, 2018, 8:57 AM Manoj Khangaonkar 
wrote:

> Hi,
>
> I am implementing flow control by
>
> (1) using the pause(partitions) method on the consumer to stop
> consumer.poll from returning messages.
>
> (2) using the resume(partitions) method on the consumer to let
> consumer.poll return messages
>
> This works well for a while. Several sets of  pause-resume work as
> expected.
>
> But after a while I am seeing that the paused consumer starts returning
> messages WITHOUT my code calling resume.
>
> There are no changes in the partitions assigned to the consumer. I know this
> because I log the assigned partitions after every consumer.poll(...). That
> should rule out a rebalance.
>
> Is there something that I am missing here ?
>
> Is there something more required to pause a consumer from retrieving
> message ?
>
> regards
>
>
> --
> http://khangaonkar.blogspot.com/
>


Re: I have issue in Kafka 2.0

2018-08-13 Thread Steve Tian
Have you checked the javadoc of KafkaConsumer?

On Mon, Aug 13, 2018, 11:10 PM Kailas Biradar  wrote:

> I keep running into this ConcurrentModificationException because
> KafkaConsumer is not safe for multi-threaded access
>
> --
> kailas
>


Re: How to reduce kafka's rebalance time ?

2018-08-16 Thread Steve Tian
My $0.02:
1. Read the documentation.
2. Help people to understand your problem better:  Try to describe your
problem in a gist and see if you can provide some details like: which
version of Kafka you're using on client/server side, your
consumer/producer/server code/configuration that can reproduce your
problem.
3. Help yourself to understand your problem better:  Is the problem still
reproducible with a different version/cluster/configuration/processing logic?
For example, if you found a 10-minute delay, which configuration relates to 10
minutes?  Does the delay change when you change your configuration?  How long
does it take to process the records returned from a single poll?
4. Learn from your logging/metrics:  If you can reproduce the problem, look
for anything unusual in the logs/metrics.  Try enabling/adding logging in your
consumer, consumer rebalance listener, and interceptor.  Try to close your
consumer gracefully/abruptly, try to kill your Java consumer process, or even
try to pause your Java consumer process to simulate a GC pause.  You should
find something in the logs and JMX metrics; try to reason about it.  You will
need this for production troubleshooting/monitoring.

Cheers,
Steve


On Thu, Aug 16, 2018, 7:48 PM Shantanu Deshmukh 
wrote:

> Hi Manna,
>
> I meant no offense. I simply meant to say that I haven't found a solution to
> my problem here.
> Apologies if my sentence was out of line.
>
> On Thu, Aug 16, 2018 at 4:05 PM M. Manna  wrote:
>
> > You have been recommended to upgrade to a newer version of Kafka, or tune
> > timeout params. Adhering to a older version is more of the users’
> decision.
> > Perhaps, we should simply put older versions as “End of Life”.
> >
> > As part of open source initiative, you are always welcome to debug and
> > demonstrate how your use case is different, and raise a KIP.
> >
> > Not sure what you mean by “*no have I received any help from here.” *
> >
> > We are always actively trying to contribute as much as we can, and
> > sometimes the answers may not be according to your expectations or
> > timeline. Hence, the open source initiative.
> >
> > Hope this makes sense.
> >
> > Regards,
> >
> > Regards,
> > On Thu, 16 Aug 2018 at 06:55, Shantanu Deshmukh 
> > wrote:
> >
> > > I am also facing the same issue. Whenever I am restarting my consumers
> it
> > > is taking upto 10 minutes to start consumption. Also some of the
> > consumers
> > > randomly rebalance and it again takes same amount of time to complete
> > > rebalance.
> > > I haven't been able to figure out any solution for this issue, nor
> have I
> > > received any help from here.
> > >
> > > On Thu, Aug 16, 2018 at 9:56 AM 堅強de泡沫  wrote:
> > >
> > > > hello:
> > > > How to reduce kafka's rebalance time ?
> > > > It takes a lot of time to rebalance each time. Why?
> > >
> >
>


Re: How to reduce kafka's rebalance time ?

2018-08-16 Thread Steve Tian
Interesting.  Thanks for letting us know.

On Fri, 17 Aug 2018 at 11:22 堅強de泡沫  wrote:

> I modified some consumer configuration items, mainly the parameters for fetch
> frequency, heartbeat, and session timeout. The long-rebalance problem has not
> reappeared in the test environment for a long time.
> The relevant configuration is as follows:
> fetch-max-wait: 1s
> heartbeat-interval: 1s
> session.timeout.ms: 1
> metadata.max.age.ms: 6000
> max.poll.records: 100
> max.poll.interval.ms: 500
>
>
>
>
> -- Original message --
> From: "Shantanu Deshmukh";
> Sent: Thursday, August 16, 2018, 2:25 PM
> To: "users";
>
> Subject: Re: How to reduce kafka's rebalance time ?
>
>
>
> I am also facing the same issue. Whenever I am restarting my consumers it
> is taking upto 10 minutes to start consumption. Also some of the consumers
> randomly rebalance and it again takes same amount of time to complete
> rebalance.
> I haven't been able to figure out any solution for this issue, nor have I
> received any help from here.
>
> On Thu, Aug 16, 2018 at 9:56 AM 堅強de泡沫  wrote:
>
> > hello:
> > How to reduce kafka's rebalance time ?
> > It takes a lot of time to rebalance each time. Why?


Re: Frequent appearance of "Marking the coordinator dead" message in consumer log

2018-08-22 Thread Steve Tian
How long did it take to process 50 `ConsumerRecord`s?
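
One way to answer that inside the application (a sketch; handleRecord stands
in for the real mail-sending logic and the warning threshold is arbitrary):
time each poll-and-process cycle and compare it with the configured poll
interval / session limits, since a single slow batch of max.poll.records (50
in the config below) is enough to trigger exactly the rebalance warning quoted
below.

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class PollTimer {
    // Times one poll-and-process cycle so slow batches show up in the logs.
    static void pollOnce(KafkaConsumer<String, String> consumer, long warnAfterMs) {
        long start = System.currentTimeMillis();
        ConsumerRecords<String, String> records = consumer.poll(2000L);
        for (ConsumerRecord<String, String> record : records) {
            handleRecord(record);   // stand-in for the actual email-sending code
        }
        long elapsed = System.currentTimeMillis() - start;
        if (elapsed > warnAfterMs) {
            System.out.println("Slow batch: " + records.count()
                    + " record(s) processed in " + elapsed + " ms");
        }
    }

    private static void handleRecord(ConsumerRecord<String, String> record) {
        // send the email here
    }
}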

On Wed, Aug 22, 2018, 5:16 PM Shantanu Deshmukh 
wrote:

> Hello,
>
> We have Kafka 0.10.0.1 running on a 3-broker cluster. We have an
> application which consumes from a topic having 10 partitions. 10 consumers
> are spawned from this process; they belong to one consumer group.
>
> What we have observed is that very frequently we see messages like the
> following in the consumer logs
>
> [2018-08-21 11:12:46] :: WARN  :: ConsumerCoordinator:554 - Auto offset
> commit failed for group otp-email-consumer: Commit cannot be completed
> since the group has already rebalanced and assigned the partitions to
> another member. This means that the time between subsequent calls to poll()
> was longer than the configured max.poll.interval.ms, which typically
> implies that the poll loop is spending too much time message processing.
> You can address this either by increasing the session timeout or by
> reducing the maximum size of batches returned in poll() with
> max.poll.records.
> [2018-08-21 11:12:46] :: INFO  :: ConsumerCoordinator:333 - Revoking
> previously assigned partitions [otp-email-1, otp-email-0, otp-email-3,
> otp-email-2] for group otp-email-consumer
> [2018-08-21 11:12:46] :: INFO  :: AbstractCoordinator:381 - (Re-)joining
> group otp-email-consumer
> [2018-08-21 11:12:46] :: INFO  :: AbstractCoordinator:600 - *Marking the
> coordinator x.x.x.x:9092 (id: 2147483646 rack: null) dead for group
> otp-email-consumer*
> [2018-08-21 11:12:46] :: INFO  :: AbstractCoordinator:600 - *Marking the
> coordinator x.x.x.x:9092 (id: 2147483646 rack: null) dead for group
> otp-email-consumer*
> [2018-08-21 11:12:46] :: INFO  ::
> AbstractCoordinator$GroupCoordinatorResponseHandler:555 - Discovered
> coordinator 10.189.179.117:9092 (id: 2147483646 rack: null) for group
> otp-email-consumer.
> [2018-08-21 11:12:46] :: INFO  :: AbstractCoordinator:381 - (Re-)joining
> group otp-email-consumer
>
> After this, the group enters the rebalancing phase and it takes about 5-10
> minutes to start consuming messages again.
> What does this message mean? The actual broker doesn't go down as per our
> monitoring tools. So how come it is declared dead? Please help, I have been
> stuck on this issue for 2 months now.
>
> Here's our consumer configuration
> auto.commit.interval.ms = 3000
> auto.offset.reset = latest
> bootstrap.servers = [x.x.x.x:9092, x.x.x.x:9092, x.x.x.x:9092]
> check.crcs = true
> client.id =
> connections.max.idle.ms = 54
> enable.auto.commit = true
> exclude.internal.topics = true
> fetch.max.bytes = 52428800
> fetch.max.wait.ms = 500
> fetch.min.bytes = 1
> group.id = otp-notifications-consumer
> heartbeat.interval.ms = 3000
> interceptor.classes = null
> key.deserializer = class org.apache.kafka.common.serialization.
> StringDeserializer
> max.partition.fetch.bytes = 1048576
> max.poll.interval.ms = 30
> max.poll.records = 50
> metadata.max.age.ms = 30
> metric.reporters = []
> metrics.num.samples = 2
> metrics.sample.window.ms = 3
> partition.assignment.strategy = [class org.apache.kafka.clients.
> consumer.RangeAssignor]
> receive.buffer.bytes = 65536
> reconnect.backoff.ms = 50
> request.timeout.ms = 305000
> retry.backoff.ms = 100
> sasl.kerberos.kinit.cmd = /usr/bin/kinit
> sasl.kerberos.min.time.before.relogin = 6
> sasl.kerberos.service.name = null
> sasl.kerberos.ticket.renew.jitter = 0.05
> sasl.kerberos.ticket.renew.window.factor = 0.8
> sasl.mechanism = GSSAPI
> security.protocol = SSL
> send.buffer.bytes = 131072
> session.timeout.ms = 30
> ssl.cipher.suites = null
> ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
> ssl.endpoint.identification.algorithm = null
> ssl.key.password = null
> ssl.keymanager.algorithm = SunX509
> ssl.keystore.location = null
> ssl.keystore.password = null
> ssl.keystore.type = JKS
> ssl.protocol = TLS
> ssl.provider = null
> ssl.secure.random.implementation = null
> ssl.trustmanager.algorithm = PKIX
> ssl.truststore.location = /x/x/client.truststore.jks
> ssl.truststore.password = [hidden]
> ssl.truststore.type = JKS
> value.deserializer = class org.apache.kafka.common.serialization.
> StringDeserializer
>
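
The warning above boils down to the consumer not calling poll() again within max.poll.interval.ms. A rough sketch of the loop the client expects, assuming the consumer was built from the settings quoted above; the class name and the sendMail helper are placeholders for illustration:

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class MailConsumerLoop {
        private final KafkaConsumer<String, String> consumer;

        public MailConsumerLoop(KafkaConsumer<String, String> consumer) {
            this.consumer = consumer; // assumed to be configured as quoted above
        }

        public void run() {
            while (true) {
                // Everything between two poll() calls must finish within max.poll.interval.ms,
                // otherwise the coordinator assumes this member has died and starts a rebalance.
                ConsumerRecords<String, String> records = consumer.poll(500); // long-millis overload of the 0.10.x client
                for (ConsumerRecord<String, String> record : records) {
                    sendMail(record.value());
                }
                // With enable.auto.commit=true, offsets are committed from a later poll(),
                // so a batch of at most max.poll.records records has to be processed in time;
                // lowering max.poll.records shrinks that worst case.
            }
        }

        private void sendMail(String payload) {
            // placeholder for the actual per-record work (here, sending one email)
        }
    }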


Re: Frequent appearance of "Marking the coordinator dead" message in consumer log

2018-08-22 Thread Steve Tian
Did you observe any GC pausing?

On Wed, Aug 22, 2018, 6:38 PM Shantanu Deshmukh 
wrote:

> Hi Steve,
>
> The application is just sending mails. Every record is just an email request
> with very basic business logic. Generally it doesn't take more than 200 ms
> to process a single mail. Currently it is averaging out at 70-80 ms.
>
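
(For scale: at roughly 70-80 ms per record and max.poll.records = 50, one poll cycle should involve only about 3.5-4 seconds of processing, well below the 5-minute default for max.poll.interval.ms, so sequential processing time alone would not obviously explain the missed polls.)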
> On Wed, Aug 22, 2018 at 3:06 PM Steve Tian 
> wrote:
>
> > How long did it take to process 50 `ConsumerRecord`s?

Re: Frequent appearance of "Marking the coordinator dead" message in consumer log

2018-08-22 Thread Steve Tian
NVM.  What's your client version?  I'm asking because max.poll.interval.ms
was only introduced in 0.10.1.0, which is not the version you mentioned
in the email thread.

On Wed, Aug 22, 2018, 6:51 PM Shantanu Deshmukh 
wrote:

> How do I check for GC pausing?
>
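
One way to get a first impression from inside the consumer JVM is to sample the garbage-collector MXBeans; the sketch below prints how much time was spent collecting in each interval (the 10-second sampling period is arbitrary). For precise pause times, enabling GC logging on the consumer JVM (for example with -verbose:gc plus the detailed GC-log options of the JVM in use) and correlating long pauses with the "Marking the coordinator dead" timestamps is more reliable:

    import java.lang.management.GarbageCollectorMXBean;
    import java.lang.management.ManagementFactory;

    public class GcPauseProbe implements Runnable {
        // Intended to run on a daemon thread inside the consumer process, e.g.:
        //   new Thread(new GcPauseProbe(), "gc-probe").start();
        @Override
        public void run() {
            long lastTimeMs = 0;
            long lastCount = 0;
            while (true) {
                long timeMs = 0;
                long count = 0;
                for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
                    timeMs += gc.getCollectionTime();  // cumulative ms spent in this collector
                    count += gc.getCollectionCount();  // cumulative number of collections
                }
                System.out.printf("GC in the last interval: %d collections, %d ms%n",
                        count - lastCount, timeMs - lastTimeMs);
                lastTimeMs = timeMs;
                lastCount = count;
                try {
                    Thread.sleep(10_000);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }
    }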
> On Wed, Aug 22, 2018 at 4:12 PM Steve Tian 
> wrote:
>
> > Did you observe any GC pausing?
> >
> > On Wed, Aug 22, 2018, 6:38 PM Shantanu Deshmukh 
> > wrote:
> >
> > > Hi Steve,
> > >
> > > Application is just sending mails. Every record is just a email request
> > > with very basic business logic. Generally it doesn't take more than
> 200ms
> > > to process a single mail. Currently it is averaging out at 70-80 ms.
> > >
> > > On Wed, Aug 22, 2018 at 3:06 PM Steve Tian 
> > > wrote:
> > >
> > > > How long did it take to process 50 `ConsumerRecord`s?

Re: Frequent appearance of "Marking the coordinator dead" message in consumer log

2018-08-22 Thread Steve Tian
Have you measured the duration between two `poll` invocations and the size
of the returned `ConsumerRecords`?
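
A minimal way to capture exactly those two numbers inside the poll loop, as a sketch (the processRecord helper is a placeholder for the actual mail-sending logic):

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class InstrumentedPollLoop {
        public void run(KafkaConsumer<String, String> consumer) {
            long previousPollAt = System.nanoTime();
            while (true) {
                long now = System.nanoTime();
                long sincePreviousPollMs = (now - previousPollAt) / 1_000_000; // gap between poll() invocations
                previousPollAt = now;
                ConsumerRecords<String, String> records = consumer.poll(500); // long-millis overload of the 0.10.x client
                // The two numbers asked about: gap between poll() calls and batch size.
                System.out.printf("time since previous poll(): %d ms, records in this batch: %d%n",
                        sincePreviousPollMs, records.count());
                for (ConsumerRecord<String, String> record : records) {
                    processRecord(record);
                }
            }
        }

        private void processRecord(ConsumerRecord<String, String> record) {
            // placeholder for the actual per-record work
        }
    }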

On Wed, Aug 22, 2018, 7:00 PM Shantanu Deshmukh 
wrote:

> Oh sorry, my bad. The Kafka version is indeed 0.10.1.0, and so is the client.
>
> On Wed, Aug 22, 2018 at 4:26 PM Steve Tian 
> wrote:
>
> > NVM.  What's your client version?  I'm asking as max.poll.interval.ms
> > should be introduced since 0.10.1.0, which is not the version you
> mentioned
> > in the email thread.
> >
> > On Wed, Aug 22, 2018, 6:51 PM Shantanu Deshmukh 
> > wrote:
> >
> > > How do I check for GC pausing?
> > >
> > > On Wed, Aug 22, 2018 at 4:12 PM Steve Tian 
> > > wrote:
> > >
> > > > Did you observe any GC pausing?
> > > >
> > > > On Wed, Aug 22, 2018, 6:38 PM Shantanu Deshmukh <
> shantanu...@gmail.com
> > >
> > > > wrote:
> > > >
> > > > > Hi Steve,
> > > > >
> > > > > Application is just sending mails. Every record is just a email
> > request
> > > > > with very basic business logic. Generally it doesn't take more than
> > > 200ms
> > > > > to process a single mail. Currently it is averaging out at 70-80
> ms.
> > > > >
> > > > > On Wed, Aug 22, 2018 at 3:06 PM Steve Tian <
> steve.cs.t...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > How long did it take to process 50 `ConsumerRecord`s?