Data loss prevention configuration no longer works with Kafka 0.9.x

2016-03-30 Thread Atul Soman
Hi List,
Our requirement is to have the producers wait (indefinitely) until the Kafka 
broker comes back up (in case of maintenance or network interruptions). We have a 
disk cache in front of the producer that fills while the producer waits and 
drains when the producer eventually reconnects. This functionality was working 
fine in Kafka 0.8.2.1. The producer settings I used are:

block.on.buffer.full=true
retries=MAXINT
acks=all
reconnect.backoff.ms=1
max.in.flight.requests.per.connection=1
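
For concreteness, a minimal sketch of how I apply these settings through the 
Java producer API; the broker address and topic name below are placeholders:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class BlockingProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");  // placeholder broker
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // Settings intended to block (rather than drop data) while the broker is down:
        props.put("block.on.buffer.full", "true");
        props.put("retries", Integer.toString(Integer.MAX_VALUE));
        props.put("acks", "all");
        props.put("reconnect.backoff.ms", "1");
        props.put("max.in.flight.requests.per.connection", "1");

        KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        producer.send(new ProducerRecord<>("test-topic", "key", "value"));  // placeholder topic
        producer.close();
    }
}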

(I would like to acknowledge Gwen Shapira's slideshare 
http://www.slideshare.net/gwenshap/kafka-reliability-when-it-absolutely-positively-has-to-be-there
 from which these settings for 0.8.2.1 were derived.)

Issue:

I am currently evaluating an upgrade to the latest version, and I observe 
that these settings no longer work. I debugged the code a bit and it appears 
that the changes added for 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-19+-+Add+a+request+timeout+to+NetworkClient
 could be causing this. To give more details:

Thread 1:
  BufferPool.allocate()
    moreMemory.await(maxTimeToBlock, TimeUnit.MILLISECONDS)  => waiting

Thread 2:
  Sender.run()
    accumulator.abortExpiredBatches(this.requestTimeout, ...)  => requestTimeout defaults to 30 seconds
      RecordAccumulator.deallocate(batch)
        BufferPool.deallocate()
          moreMemory.signal()  => this wakes up the first thread
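
The wake-up itself is just the standard java.util.concurrent Condition 
await/signal pattern. A standalone sketch (names are mine, not Kafka's) of how 
a signal from one thread cuts a second thread's timed wait short, well before 
its own timeout:

import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class AwaitSignalSketch {
    private static final ReentrantLock lock = new ReentrantLock();
    private static final Condition moreMemory = lock.newCondition();

    public static void main(String[] args) throws InterruptedException {
        // Thread 1: blocks for up to 60 s, standing in for max.block.ms.
        Thread waiter = new Thread(() -> {
            lock.lock();
            try {
                // await() returns early (true) as soon as signal() fires.
                boolean signalled = moreMemory.await(60_000, TimeUnit.MILLISECONDS);
                System.out.println("woke up, signalled = " + signalled);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.unlock();
            }
        });
        waiter.start();
        Thread.sleep(1000);  // let the waiter block first

        // Thread 2: stands in for the Sender expiring a batch after
        // request.timeout.ms and then signalling the buffer pool.
        lock.lock();
        try {
            moreMemory.signal();  // waiter wakes ~59 s before its own deadline
        } finally {
            lock.unlock();
        }
        waiter.join();
    }
}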

So it looks like the blocking time set by max.block.ms is effectively 
overridden by the value of request.timeout.ms. I am not fully sure whether this 
behavior is intended; if it is, it would be good to call it out in the 
documentation.

Also, I would be interested to know the recommended producer configuration for 
preventing data loss on 0.9.x.
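
For reference, my current (unverified) guess at the 0.9.x analogue of the 
settings above, assuming max.block.ms replaces block.on.buffer.full and that 
raising request.timeout.ms keeps batches from being expired early (per the 
trace above); corrections welcome:

max.block.ms=9223372036854775807              (Long.MAX_VALUE, to block "indefinitely")
request.timeout.ms=2147483647                 (my assumption: avoid early batch expiry)
retries=2147483647
acks=all
max.in.flight.requests.per.connection=1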

Many Thanks,
Atul Soman.


Note: I am resending this mail after subscribing to 
dev@kafka.apache.org since I guess my first mail 
didn't go through.


