Re: Undecipherable error in zookeeper on initial connection

2015-08-14 Thread Jaikiran Pai
Such errors are very typical in zookeeper logs - it's very noisy. I 
typically ignore them and try to debug the Kafka issue via the Kafka 
logs, Kafka thread dumps, and/or the zookeeper shell.
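
For instance, a quick way to do that check from Java (a sketch, not from the
original thread; the connect string and timeouts are placeholders) is to list
/brokers/ids, where live brokers register ephemeral znodes - the same thing
"ls /brokers/ids" does in the zookeeper shell:

import java.util.List;
import java.util.concurrent.CountDownLatch;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

public class BrokerCheck {
    public static void main(String[] args) throws Exception {
        CountDownLatch connected = new CountDownLatch(1);
        // Connect to the same ensemble Kafka uses (placeholder address).
        ZooKeeper zk = new ZooKeeper("localhost:2181", 6000, event -> {
            if (event.getState() == Watcher.Event.KeeperState.SyncConnected) {
                connected.countDown();
            }
        });
        connected.await();
        // Live brokers hold ephemeral znodes under /brokers/ids; an empty
        // list means no broker is currently registered.
        List<String> brokerIds = zk.getChildren("/brokers/ids", false);
        System.out.println("Registered broker ids: " + brokerIds);
        zk.close();
    }
}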


Anyway, how are you adding the topics (script, code?) and what exactly 
are you noticing? Are you running into exceptions or timing out?


-Jaikiran
On Thursday 13 August 2015 04:06 AM, Jason Kania wrote:

Hello,
I am wondering if someone can point me in the right direction. I am getting 
this error when kafka connects to zookeeper:
zookeeper2_1  | 2015-08-12 22:18:21,493 [myid:] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /100.100.100.1:38178
zookeeper2_1  | 2015-08-12 22:18:21,498 [myid:] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@868] - Client attempting to establish new session at /172.17.0.239:38178
zookeeper2_1  | 2015-08-12 22:18:21,499 [myid:] - INFO  [SyncThread:0:FileTxnLog@199] - Creating new log file: log.6a
zookeeper_1   | 2015-08-12 22:18:21,505 [myid:] - INFO  [SyncThread:0:ZooKeeperServer@617] - Established session 0x14f23fe141a with negotiated timeout 6000 for client /100.100.100.1:38178
zookeeper_1   | 2015-08-12 22:18:21,755 [myid:] - INFO  [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when processing sessionid:0x14f23fe141a type:delete cxid:0x1b zxid:0x6d txntype:-1 reqpath:n/a Error Path:/admin/preferred_replica_election Error:KeeperErrorCode = NoNode for /admin/preferred_replica_election
At this point Kafka remains running and I see nothing in the Kafka logs to 
indicate an error, but attempts to add topics report that no brokers are 
running.  I have tried to look for a solution, but zookeeper seems to be a 
really poor application.
Any suggestions would be appreciated.
Thanks,
Jason





Re: Help with SocketTimeoutException while reading from Kafka cluster

2015-08-14 Thread Jaikiran Pai

On Wednesday 12 August 2015 04:59 AM, venkatesh kavuluri wrote:

83799 [c3-onboard_-2-9571-1439334326956-cfa8b46a-leader-finder-thread]
INFO  kafka.consumer.SimpleConsumer  - Reconnect due to socket error:
java.net.SocketTimeoutException

163931 [c3-onboard_-2-9571-1439334326956-cfa8b46a-leader-finder-thread]
INFO  kafka.consumer.SimpleConsumer  - Reconnect due to socket error:
java.net.SocketTimeoutException


There's a patch in the JIRA here which logs the exact reason why the 
exception was thrown: https://issues.apache.org/jira/browse/KAFKA-2221. 
It hasn't been merged since SimpleConsumer is considered deprecated, 
but you might want to apply it and see if that helps to narrow down 
the issue.
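
In the meantime, one low-effort experiment (a sketch with placeholder values,
not from the original mail) is to raise the consumer's socket.timeout.ms above
its 30 second default and see whether the reconnects disappear or merely slow
down, which separates "broker is slow" from "broker is unreachable":

import java.util.Properties;
import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.javaapi.consumer.ConsumerConnector;

public class SlowSocketConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("zookeeper.connect", "localhost:2181"); // placeholder
        props.put("group.id", "c3-onboard");              // placeholder
        // Default is 30000 ms; a larger value rules out merely slow fetches.
        props.put("socket.timeout.ms", "120000");
        ConsumerConnector connector =
                Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
        // ... create streams and consume as usual, then:
        connector.shutdown();
    }
}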


-Jaikiran


Re: use page cache as much as possible

2015-08-14 Thread Yuheng Du
So if I understand correctly, even if I delay flushing, the consumer will
get the messages as soon as the broker receives them and puts them into the
page cache (assuming the producer doesn't wait for acks from brokers)?

And will decreasing the log.flush interval help reduce the latency between
producer and consumer?

Thanks.
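
For reference, the flush knobs in question are broker settings in
server.properties (a sketch, not from the thread; the values below are
illustrative, not recommendations):

# server.properties (illustrative values)
# Flush a partition's log after this many messages have accumulated...
log.flush.interval.messages=10000
# ...or after this many milliseconds, whichever comes first.
log.flush.interval.ms=1000
# If neither is set, Kafka leaves flushing to the OS page cache writeback.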


On Fri, Aug 14, 2015 at 11:57 AM, Kishore Senji kse...@gmail.com wrote:

 Thank you Gwen for correcting me. This document (
 https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Replication), in the
 Writes section, says the same thing you mentioned. One thing is not clear to
 me: what happens when the replicas add the message to memory but the leader
 fails before acking to the producer? If one of those replicas is later chosen
 as the leader for the partition, it will advance the HW to its LEO (which has
 the message). The producer can then resend the same message thinking it
 failed, and there will be a duplicate message. Is my understanding correct
 here?

 On Thu, Aug 13, 2015 at 10:50 PM, Gwen Shapira g...@confluent.io wrote:

  On Thu, Aug 13, 2015 at 4:10 PM, Kishore Senji kse...@gmail.com wrote:
 
   Consumers can only fetch data up to the committed offset, and the reason is
   reliability and durability on a broker crash (some consumers might get the
   new data and some may not, as the data is not yet committed and would be
   lost). Data will be committed when it is flushed. So if you delay the
   flushing, consumers won't get those messages until that time.
 
  As far as I know, this is not accurate.
 
  A message is considered committed when all ISR replicas have received it
  (this much is documented). This doesn't need to include writing to disk,
  which will happen asynchronously.
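 
  To make this concrete, a minimal new-producer sketch (illustrative values,
  not from the original mail): with acks=all, the broker acknowledges a send
  only once the message is committed in exactly this sense, i.e. received by
  all ISR replicas, with no disk flush implied.
 
  import java.util.Properties;
  import org.apache.kafka.clients.producer.KafkaProducer;
  import org.apache.kafka.clients.producer.ProducerRecord;
 
  public class CommittedAckExample {
      public static void main(String[] args) {
          Properties props = new Properties();
          props.put("bootstrap.servers", "broker1:9092"); // placeholder
          props.put("key.serializer",
              "org.apache.kafka.common.serialization.StringSerializer");
          props.put("value.serializer",
              "org.apache.kafka.common.serialization.StringSerializer");
          // acks=all: ack only after all ISR replicas have the message.
          props.put("acks", "all");
          KafkaProducer<String, String> producer = new KafkaProducer<>(props);
          producer.send(new ProducerRecord<>("test", "hello"));
          producer.close();
      }
  }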
 
 
  
   Even though you flush periodically based on log.flush.interval.messages and
   log.flush.interval.ms, if the segment file is in the pagecache, the
   consumers will still benefit from that pagecache and the OS won't read it
   again from disk.
 
   On Thu, Aug 13, 2015 at 2:54 PM Yuheng Du yuheng.du.h...@gmail.com wrote:
 
    Hi,
 
    As I understand it, Kafka brokers will store incoming messages in the
    pagecache as much as possible and then flush them to disk, right?
 
    But in my experiment where 90 producers are publishing data into 6
    brokers, I see that the log directory on disk where the broker stores
    the data is constantly growing (every second). Why is this happening?
    Does this have to do with the default log.flush.interval setting?
 
    I want the broker to write to disk less often when serving on-line
    consumers, to reduce latency. I tested that the disk write speed on my
    broker is around 110MB/s.
 
    Thanks for any replies.



Re: Kafka java consumer

2015-08-14 Thread Abhijith Prabhakar
Thanks Ewen.  Any idea when we can expect 0.8.3?


 On Aug 14, 2015, at 5:36 PM, Ewen Cheslack-Postava e...@confluent.io wrote:
 
 Hi Abhijith,
 
 You should be using KafkaProducer, but KafkaConsumer is not ready yet. The
 APIs are included in 0.8.2.1, but the implementation is not ready. Until
 0.8.3 is released, you cannot rely only on kafka-clients if you want to
 write a consumer. You'll need to depend on the main kafka jar and use
 kafka.consumer.Consumer, as described on that wiki page. It has not been
 deprecated yet since the new consumer implementation is not ready yet.
 
 -Ewen
 
 On Fri, Aug 14, 2015 at 2:17 PM, Abhijith Prabhakar abhi.preda...@gmail.com
 wrote:
 
 Hi All,
 
 I am a newbie to Kafka and was looking to use the Java client implementation
 org.apache.kafka:kafka-clients:0.8.2.1.  I was trying to write a consumer
 group using the example given here:
 https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Group+Example
 
 I see a couple of issues here.
 
 1. The above confluence page uses kafka.consumer.Consumer, which seems to be
 deprecated and taken out in 0.8.2.1.
 2. I realized that the documentation mentions that 0.8.2 only has a
 Producer implementation inside the Java client. But I also see
 org/apache/kafka/clients/consumer/KafkaConsumer in this 0.8.2.1 version.
 Not sure if this is ready to be used.  Also, the javadoc on this class is
 different from 0.8.3:
 http://kafka.apache.org/083/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html
 
 
 Can someone please let me know if using KafkaConsumer is a good idea?  If
 yes, then please point me to an example.
 
 Thanks
 Abhi
 
 
 
 
 -- 
 Thanks,
 Ewen



Re: Kafka java consumer

2015-08-14 Thread Ewen Cheslack-Postava
There's no precise date for the release; roughly 1.5 to 2 months from now.

On Fri, Aug 14, 2015 at 3:45 PM, Abhijith Prabhakar abhi.preda...@gmail.com
 wrote:

 Thanks Ewen.  Any idea when we can expect 0.8.3?


  On Aug 14, 2015, at 5:36 PM, Ewen Cheslack-Postava e...@confluent.io
 wrote:
 
  Hi Abhijith,
 
  You should be using KafkaProducer, but KafkaConsumer is not ready yet. The
  APIs are included in 0.8.2.1, but the implementation is not ready. Until
  0.8.3 is released, you cannot rely only on kafka-clients if you want to
  write a consumer. You'll need to depend on the main kafka jar and use
  kafka.consumer.Consumer, as described on that wiki page. It has not been
  deprecated yet since the new consumer implementation is not ready yet.
 
  -Ewen
 
  On Fri, Aug 14, 2015 at 2:17 PM, Abhijith Prabhakar 
 abhi.preda...@gmail.com
  wrote:
 
  Hi All,
 
  I am a newbie to Kafka and was looking to use the Java client implementation
  org.apache.kafka:kafka-clients:0.8.2.1.  I was trying to write a consumer
  group using the example given here:
  https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Group+Example
 
  I see a couple of issues here.
 
  1. The above confluence page uses kafka.consumer.Consumer, which seems to be
  deprecated and taken out in 0.8.2.1.
  2. I realized that the documentation mentions that 0.8.2 only has a
  Producer implementation inside the Java client. But I also see
  org/apache/kafka/clients/consumer/KafkaConsumer in this 0.8.2.1 version.
  Not sure if this is ready to be used.  Also, the javadoc on this class is
  different from 0.8.3:
  http://kafka.apache.org/083/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html
 
 
  Can someone please let me know if using KafkaConsumer is a good idea?  If
  yes, then please point me to an example.
 
  Thanks
  Abhi
 
 
 
 
  --
  Thanks,
  Ewen




-- 
Thanks,
Ewen


Kafka java consumer

2015-08-14 Thread Abhijith Prabhakar
Hi All,

I am a newbie to Kafka and was looking to use the Java client implementation 
org.apache.kafka:kafka-clients:0.8.2.1.  I was trying to write a consumer group 
using the example given here:
https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Group+Example

I see a couple of issues here.

1. The above confluence page uses kafka.consumer.Consumer, which seems to be 
deprecated and taken out in 0.8.2.1.
2. I realized that the documentation mentions that 0.8.2 only has a Producer 
implementation inside the Java client. But I also see 
org/apache/kafka/clients/consumer/KafkaConsumer in this 0.8.2.1 version.  Not 
sure if this is ready to be used.  Also, the javadoc on this class is different 
from 0.8.3:
http://kafka.apache.org/083/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html
 

Can someone please let me know if using KafkaConsumer is a good idea?  If yes, 
then please point me to an example.

Thanks
Abhi

Re: Kafka java consumer

2015-08-14 Thread Ewen Cheslack-Postava
Hi Abhijith,

You should be using KafkaProducer, but KafkaConsumer is not ready yet. The
APIs are included in 0.8.2.1, but the implementation is not ready. Until
0.8.3 is released, you cannot rely only on kafka-clients if you want to
write a consumer. You'll need to depend on the main kafka jar and use
kafka.consumer.Consumer, as described on that wiki page. It has not been
deprecated yet since the new consumer implementation is not ready yet.
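
For reference, a minimal sketch of that old high-level consumer (topic, group,
and ZooKeeper address are placeholders; error handling omitted):

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;

public class OldConsumerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("zookeeper.connect", "localhost:2181"); // placeholder
        props.put("group.id", "my-group");                // placeholder
        ConsumerConnector connector =
                Consumer.createJavaConsumerConnector(new ConsumerConfig(props));

        // Ask for a single stream for the topic.
        Map<String, Integer> topicCountMap = new HashMap<>();
        topicCountMap.put("my-topic", 1);
        Map<String, List<KafkaStream<byte[], byte[]>>> streams =
                connector.createMessageStreams(topicCountMap);

        // Block on the stream and print each message payload.
        ConsumerIterator<byte[], byte[]> it =
                streams.get("my-topic").get(0).iterator();
        while (it.hasNext()) {
            System.out.println(new String(it.next().message()));
        }
    }
}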

-Ewen

On Fri, Aug 14, 2015 at 2:17 PM, Abhijith Prabhakar abhi.preda...@gmail.com
 wrote:

 Hi All,

 I am a newbie to Kafka and was looking to use the Java client implementation
 org.apache.kafka:kafka-clients:0.8.2.1.  I was trying to write a consumer
 group using the example given here:
 https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Group+Example

 I see a couple of issues here.

 1. The above confluence page uses kafka.consumer.Consumer, which seems to be
 deprecated and taken out in 0.8.2.1.
 2. I realized that the documentation mentions that 0.8.2 only has a
 Producer implementation inside the Java client. But I also see
 org/apache/kafka/clients/consumer/KafkaConsumer in this 0.8.2.1 version.
 Not sure if this is ready to be used.  Also, the javadoc on this class is
 different from 0.8.3:
 http://kafka.apache.org/083/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html
 

 Can someone please let me know if using KafkaConsumer is a good idea?  If
 yes, then please point me to an example.

 Thanks
 Abhi




-- 
Thanks,
Ewen


Re: [DISCUSSION] Kafka 0.8.2.2 release?

2015-08-14 Thread Abhijith
Thanks.  Any idea when we can expect 0.8.3?

Sent from my phone

 On Aug 14, 2015, at 5:39 PM, Guozhang Wang wangg...@gmail.com wrote:
 
 +1 for both KAFKA-2189 and 2308.
 
  On Fri, Aug 14, 2015 at 7:03 AM, Gwen Shapira g...@confluent.io wrote:
 
   Will be nice to include KAFKA-2308 and fix two critical snappy issues in
   the maintenance release.
 
   Gwen
   On Aug 14, 2015 6:16 AM, Grant Henke ghe...@cloudera.com wrote:
 
    Just to clarify. Will KAFKA-2189 be the only patch in the release?
 
    On Fri, Aug 14, 2015 at 7:35 AM, Manikumar Reddy ku...@nmsworks.co.in
    wrote:
 
     +1  for 0.8.2.2 release
 
     On Fri, Aug 14, 2015 at 5:49 PM, Ismael Juma ism...@juma.me.uk wrote:
 
      I think this is a good idea as the change is minimal on our side and it
      has been tested in production for some time by the reporter.
 
      Best,
      Ismael
 
      On Fri, Aug 14, 2015 at 1:15 PM, Jun Rao j...@confluent.io wrote:
 
       Hi, Everyone,
 
       Since the release of Kafka 0.8.2.1, a number of people have reported an
       issue with snappy compression
       (https://issues.apache.org/jira/browse/KAFKA-2189). Basically, if they
       use snappy in 0.8.2.1, they will experience a 2-3X space increase. The
       issue has since been fixed in trunk (just a snappy jar upgrade). Since
       0.8.3 is still a few months away, it may make sense to do an 0.8.2.2
       release just to fix this issue. Any objections?
 
       Thanks,
 
       Jun
 
 
 
   --
   Grant Henke
   Software Engineer | Cloudera
   gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
 
 
 
 -- 
 -- Guozhang


Re: Message corruption with new Java client + snappy + broker restart

2015-08-14 Thread Lance Laursen
I am also seeing this issue when using the new producer and snappy
compression, running mirrormaker (trunk, Aug 10 or so). I'm using snappy
1.1.1.7.

[2015-08-14 14:15:27,876] WARN Got error produce response with correlation
id 5151552 on topic-partition mytopic-56, retrying (2147480801 attempts
left). Error: CORRUPT_MESSAGE
(org.apache.kafka.clients.producer.internals.Sender)

I also get the occasional

java: target/snappy-1.1.1/snappy.cc:384: char*
snappy::internal::CompressFragment(const char*, size_t, char*,
snappy::uint16*, int): Assertion `hash == Hash(ip, shift)' failed.

and associated

java.io.IOException: FAILED_TO_UNCOMPRESS(5) at
org.xerial.snappy.SnappyNative.throw_error(SnappyNative.java:84)

in the Kafka logs when using snappy compression with the new producer and
restarting brokers. This looks to be very closely related to KAFKA-2308
(s/hash/memcp). Not sure how closely related to the message corruption loop
this is, though.


I understand lz4 is now a compression option, which has both higher
throughput and better compression than snappy, but I do not see it listed
anywhere in the documentation.
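
While debugging, it may also help to bound the new producer's retries so a
corrupt batch fails fast instead of looping for ~2^31 attempts as above (a
sketch, not from the thread; broker address and serializers are placeholders):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;

public class BoundedRetryProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092"); // placeholder
        props.put("compression.type", "snappy");
        props.put("key.serializer",
            "org.apache.kafka.common.serialization.ByteArraySerializer");
        props.put("value.serializer",
            "org.apache.kafka.common.serialization.ByteArraySerializer");
        // Bound retries and space them out instead of retrying forever.
        props.put("retries", "3");
        props.put("retry.backoff.ms", "500");
        KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props);
        // ... produce as usual, then:
        producer.close();
    }
}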

On Tue, May 12, 2015 at 9:45 PM, Roger Hoover roger.hoo...@gmail.com
wrote:

 Oops.  I originally sent this to the dev list but meant to send it here.

 Hi,
 
  When using Samza 0.9.0, which uses the new Java producer client, with snappy
  enabled, I see messages getting corrupted on the client side.  It never
  happens with the old producer and it never happens with lz4, gzip, or no
  compression.  It only happens when a broker gets restarted (or maybe just
  shut down).
 
  The error is not always the same.  I've noticed at least three types of
  errors on the Kafka brokers.
 
  1) java.io.IOException: failed to read chunk
  at
 
 org.xerial.snappy.SnappyInputStream.hasNextChunk(SnappyInputStream.java:356)
  http://pastebin.com/NZrrEHxU
  2) java.lang.OutOfMemoryError: Java heap space
 at
 
 org.xerial.snappy.SnappyInputStream.hasNextChunk(SnappyInputStream.java:346)
  http://pastebin.com/yuxk1BjY
  3) java.io.IOException: PARSING_ERROR(2)
at org.xerial.snappy.SnappyNative.throw_error(SnappyNative.java:84)
  http://pastebin.com/yq98Hx49
 
  I've noticed a couple of different behaviors from the Samza producer/job.
  A) It goes into a long retry loop where this message is logged.  I saw
  this with error #1 above.
 
  2015-04-29 18:17:31 Sender [WARN] task[Partition 7]
  ssp[kafka,svc.call.w_deploy.c7tH4YaiTQyBEwAAhQzRXw,7] offset[253] Got
  error produce response with correlation id 4878 on topic-partition
  svc.call.w_deploy.T2UDe2PWRYWcVAAAhMOAwA-1, retrying (2147483646 attempts
  left). Error: CORRUPT_MESSAGE
 
  B) The job exits with
  org.apache.kafka.common.errors.UnknownServerException (at least when run
  as ThreadJob).  I saw this with error #3 above.
 
  org.apache.samza.SamzaException: Unable to send message from
  TaskName-Partition 6 to system kafka.
  org.apache.kafka.common.errors.UnknownServerException: The server
  experienced an unexpected error when processing the request
 
  There seem to be two issues here:
 
  1) When leadership for a topic is transferred to another broker, the Java
  client (I think) has to move the data it was buffering for the original
  leader broker to the buffer for the new leader.  My guess is that the
  corruption is happening at this point.
 
  2) When a producer has a corrupt message, it retries 2.1 billion times in
  a hot loop even though it's not a retriable error.  It probably shouldn't
  retry on such errors.  For retriable errors, it would be much safer to
  have a backoff scheme for retries.
 
  Thanks,
 
  Roger
 



0.8.2 producer and single message requests

2015-08-14 Thread Neelesh
We are fronting all our Kafka requests with a simple web service (we do
some additional massaging and writing to other stores as well). The new
KafkaProducer in 0.8.2 seems very geared towards producer batching. Most of
our payloads are single messages.

Producer batching basically sets us up for lost messages if our web service
goes down with unflushed messages in the producer.

Another issue is when we have a batch of records. It looks like I have to
call producer.send for each record and deal with the individual futures
returned.

Are there any patterns for primarily single-message requests, without
losing data? I understand the throughput will be low.

Thanks!
-Neelesh


Re: [DISCUSSION] Kafka 0.8.2.2 release?

2015-08-14 Thread Guozhang Wang
+1 for both KAFKA-2189 and 2308.

On Fri, Aug 14, 2015 at 7:03 AM, Gwen Shapira g...@confluent.io wrote:

 Will be nice to include KAFKA-2308 and fix two critical snappy issues in
 the maintenance release.

 Gwen
 On Aug 14, 2015 6:16 AM, Grant Henke ghe...@cloudera.com wrote:

  Just to clarify. Will KAFKA-2189 be the only patch in the release?
 
  On Fri, Aug 14, 2015 at 7:35 AM, Manikumar Reddy ku...@nmsworks.co.in
  wrote:
 
   +1  for 0.8.2.2 release
  
   On Fri, Aug 14, 2015 at 5:49 PM, Ismael Juma ism...@juma.me.uk
 wrote:
  
     I think this is a good idea as the change is minimal on our side and it
     has been tested in production for some time by the reporter.
 
     Best,
     Ismael
 
     On Fri, Aug 14, 2015 at 1:15 PM, Jun Rao j...@confluent.io wrote:
 
      Hi, Everyone,
 
      Since the release of Kafka 0.8.2.1, a number of people have reported an
      issue with snappy compression
      (https://issues.apache.org/jira/browse/KAFKA-2189). Basically, if they
      use snappy in 0.8.2.1, they will experience a 2-3X space increase. The
      issue has since been fixed in trunk (just a snappy jar upgrade). Since
      0.8.3 is still a few months away, it may make sense to do an 0.8.2.2
      release just to fix this issue. Any objections?
 
      Thanks,
 
      Jun

   
  
 
 
 
  --
  Grant Henke
  Software Engineer | Cloudera
  gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
 




-- 
-- Guozhang


Re: 0.8.2 producer and single message requests

2015-08-14 Thread Gwen Shapira
Hi Neelesh :)

The new producer has configuration for controlling the batch sizes.
By default, it will batch as much as possible without delay (controlled by
linger.ms) and without using too much memory (controlled by batch.size).

As mentioned in the docs, you can set batch.size to 0 to disable batching
completely if you want.

It is worthwhile to consider using the producer callback to avoid losing
messages when the webservice crashes (for example, have the webservice only
consider a message as sent once the callback is triggered for a successful
send).
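
A minimal sketch of that pattern (topic, broker address, and class names are
illustrative, not from the docs):

import java.util.Properties;
import org.apache.kafka.clients.producer.Callback;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class WebFrontedProducer {
    private final KafkaProducer<String, String> producer;

    public WebFrontedProducer() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092"); // placeholder
        props.put("key.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");
        props.put("batch.size", "0"); // disable batching, as described above
        producer = new KafkaProducer<>(props);
    }

    public void handleRequest(String payload) {
        producer.send(new ProducerRecord<>("events", payload), new Callback() {
            @Override
            public void onCompletion(RecordMetadata metadata, Exception e) {
                if (e != null) {
                    // Send failed: report an error to the web client.
                } else {
                    // Send succeeded: safe to acknowledge the request.
                }
            }
        });
    }
}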

You can read more information on batching here:
http://ingest.tips/2015/07/19/tips-for-improving-performance-of-kafka-producer/

And some examples on how to produce data to Kafka with the new producer -
both with futures and callbacks here:
https://github.com/gwenshap/kafka-examples/blob/master/SimpleCounter/src/main/java/com/shapira/examples/producer/simplecounter/DemoProducerNewJava.java

Gwen



On Fri, Aug 14, 2015 at 5:07 PM, Neelesh neele...@gmail.com wrote:

 We are fronting all our Kafka requests with a simple web service (we do
 some additional massaging and writing to other stores as well). The new
 KafkaProducer in 0.8.2 seems very geared towards producer batching. Most of
 our payloads are single messages.

 Producer batching basically sets us up for lost messages if our web service
 goes down with unflushed messages in the producer.

 Another issue is when we have a batch of records. It looks like I have to
 call producer.send for each record and deal with the individual futures
 returned.

 Are there any patterns for primarily single-message requests, without
 losing data? I understand the throughput will be low.

 Thanks!
 -Neelesh



Re: 0.8.2.1 upgrade causes much more IO

2015-08-14 Thread Jun Rao
Hi, Andrew,

Yes, I agree that this is a serious issue. Let me start a discussion thread
on this to see if there is any objection to doing an 0.8.2.2 release just
for this.

Thanks,

Jun

On Thu, Aug 13, 2015 at 1:10 PM, Andrew Otto o...@wikimedia.org wrote:

 Hey all,

 Just wanted to confirm, this was totally our issue.  Thanks so much Todd and
 Matt, our cluster is much more stable now.

 Apache Kafka folks:  I know 0.8.3 is slated to come out soon, but this is a
 pretty serious bug.  I would think it would merit a minor release just to
 get it out there, so that others don't run into this problem.  0.8.2.1
 basically does not work at scale with snappy compression.  I will add a
 comment to https://issues.apache.org/jira/browse/KAFKA-2189 noting this
 too.
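 
 As a side note, a quick way to confirm which snappy-java build a broker or
 client classpath actually resolves to (a sketch, not from the original mail):
 
 import org.xerial.snappy.Snappy;
 
 public class SnappyVersionCheck {
     public static void main(String[] args) {
         // Prints the version of the snappy-java library on the classpath,
         // e.g. to verify that an upgrade to 1.1.1.7 actually took effect.
         System.out.println(Snappy.getNativeLibraryVersion());
     }
 }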

 Thanks so much!
 -Andrew

 On Tue, Aug 11, 2015 at 3:43 PM, Matthew Bruce mbr...@blackberry.com
 wrote:

  Hi Andrew,
 
 
 
  I work with Todd and did our 0.8.2.1 testing with him.  I believe that the
  Kafka 0.8.x brokers recompress the messages once they receive them, in
  order to assign the offsets to the messages (see the ‘Compression in Kafka’
  section of:
  http://nehanarkhede.com/2013/03/28/compression-in-kafka-gzip-or-snappy/).
  I expect that you will see an improvement with Snappy 1.1.1.7 (FWIW, our
  load generator’s version of Snappy didn’t change between our 0.8.1.1 and
  0.8.2.1 testing, and we still saw the IO hit on the broker side, which
  seems to confirm this).
 
 
 
  Thanks,
 
  Matt Bruce
 
 
 
 
 
  From: Andrew Otto [mailto:ao...@wikimedia.org]
  Sent: Tuesday, August 11, 2015 3:15 PM
  To: users@kafka.apache.org
  Cc: Dan Andreescu dandree...@wikimedia.org; Joseph Allemandou jalleman...@wikimedia.org
  Subject: Re: 0.8.2.1 upgrade causes much more IO
 
 
 
  Hi Todd,
 
 
 
   We are using snappy!  And we are using version 1.1.1.6 as of our upgrade
   to 0.8.2.1 yesterday.  However, as far as I can tell, that is only
   relevant for Java producers, right?  Our main producers use librdkafka
   (the Kafka C lib) to produce, and in doing so use a built-in C version of
   snappy [1].

   Even so, your issue sounds very similar to mine, and I don’t have a full
   understanding of how brokers deal with compression, so I have updated the
   snappy java version to 1.1.1.7 on one of our brokers.  We’ll have to wait
   a while to see if the log sizes are actually smaller for data written to
   this broker.
 
 
 
   Thanks!

   [1] https://github.com/edenhill/librdkafka/blob/0.8.5/src/snappy.c
 
  On Aug 11, 2015, at 12:58, Todd Snyder tsny...@blackberry.com wrote:
 
 
 
  Hi Andrew,
 
 
 
   Are you using Snappy Compression by chance?  When we tested the 0.8.2.1
   upgrade initially, we saw similar results and tracked it down to a problem
   with Snappy version 1.1.1.6
   (https://issues.apache.org/jira/browse/KAFKA-2189).  We’re running with
   Snappy 1.1.1.7 now and the performance is back to where it used to be.
 
 
 
 
 
  Sent from my BlackBerry 10 smartphone on the TELUS network.
 
   From: Andrew Otto
   Sent: Tuesday, August 11, 2015 12:26 PM
   To: users@kafka.apache.org
   Reply To: users@kafka.apache.org
   Cc: Dan Andreescu; Joseph Allemandou
   Subject: 0.8.2.1 upgrade causes much more IO
 
 
 
  Hi all!
 
 
 
  Yesterday I did a production upgrade of our 4 broker Kafka cluster from
  0.8.1.1 to 0.8.2.1.
 
 
 
   When we did so, we were running our (varnishkafka) producers with
   request.required.acks = -1.  After switching to 0.8.2.1, producers saw
   produce response RTTs of 60 seconds.  I then switched to
   request.required.acks = 1, and producers settled down.  However, we then
   started seeing flapping ISRs about every 10 minutes.  We run Camus every
   10 minutes.  If we disable Camus, then ISRs don’t flap.
 
 
 
   All of these issues seem to be a side effect of a larger problem.  The
   total amount of network and disk IO that the Kafka brokers are doing after
   the upgrade to 0.8.2.1 has tripled.  We were previously seeing about 20
   MB/s incoming on broker interfaces; 0.8.2.1 knocks this up to around 60
   MB/s.  Disk writes have tripled accordingly.  Disk reads have also
   increased by a huge amount, although I suspect this is a consequence of
   more data flying around somehow dirtying the disk cache.
 
 
 
  You can see these changes in this dashboard:
  http://grafana.wikimedia.org/#/dashboard/db/kafka-0821-upgrade
 
 
 
   The upgrade started at around 2015-08-10 14:30, and was completed on all 4
   brokers within a couple of hours.
 
 
 
   Probably the most relevant is network rx_bytes on the brokers.
 
  We looked at Kafka .log file sizes and noticed that file sizes are indeed
  much larger than they were before this upgrade:
 
 
 
   # 0.8.1.1
   2015-08-10T04 38119109383
   2015-08-10T05 46172089174
   2015-08-10T06 46172182745
   2015-08-10T07 53151490032
   2015-08-10T08 53151892928
   2015-08-10T09 55836248198
   2015-08-10T10 57984054557
 
  

[DISCUSSION] Kafka 0.8.2.2 release?

2015-08-14 Thread Jun Rao
Hi, Everyone,

Since the release of Kafka 0.8.2.1, a number of people have reported an
issue with snappy compression (
https://issues.apache.org/jira/browse/KAFKA-2189). Basically, if they use
snappy in 0.8.2.1, they will experience a 2-3X space increase. The issue
has since been fixed in trunk (just a snappy jar upgrade). Since 0.8.3 is
still a few months away, it may make sense to do an 0.8.2.2 release just to
fix this issue. Any objections?

Thanks,

Jun


Re: [DISCUSSION] Kafka 0.8.2.2 release?

2015-08-14 Thread Ismael Juma
I think this is a good idea as the change is minimal on our side and it has
been tested in production for some time by the reporter.

Best,
Ismael

On Fri, Aug 14, 2015 at 1:15 PM, Jun Rao j...@confluent.io wrote:

 Hi, Everyone,

 Since the release of Kafka 0.8.2.1, a number of people have reported an
 issue with snappy compression (
 https://issues.apache.org/jira/browse/KAFKA-2189). Basically, if they use
 snappy in 0.8.2.1, they will experience a 2-3X space increase. The issue
 has since been fixed in trunk (just a snappy jar upgrade). Since 0.8.3 is
 still a few months away, it may make sense to do an 0.8.2.2 release just to
 fix this issue. Any objections?

 Thanks,

 Jun



Re: [DISCUSSION] Kafka 0.8.2.2 release?

2015-08-14 Thread Manikumar Reddy
+1  for 0.8.2.2 release

On Fri, Aug 14, 2015 at 5:49 PM, Ismael Juma ism...@juma.me.uk wrote:

 I think this is a good idea as the change is minimal on our side and it has
 been tested in production for some time by the reporter.

 Best,
 Ismael

 On Fri, Aug 14, 2015 at 1:15 PM, Jun Rao j...@confluent.io wrote:

  Hi, Everyone,
 
   Since the release of Kafka 0.8.2.1, a number of people have reported an
   issue with snappy compression
   (https://issues.apache.org/jira/browse/KAFKA-2189). Basically, if they
   use snappy in 0.8.2.1, they will experience a 2-3X space increase. The
   issue has since been fixed in trunk (just a snappy jar upgrade). Since
   0.8.3 is still a few months away, it may make sense to do an 0.8.2.2
   release just to fix this issue. Any objections?
 
  Thanks,
 
  Jun
 



Re: [DISCUSSION] Kafka 0.8.2.2 release?

2015-08-14 Thread Grant Henke
Just to clarify. Will KAFKA-2189 be the only patch in the release?

On Fri, Aug 14, 2015 at 7:35 AM, Manikumar Reddy ku...@nmsworks.co.in
wrote:

 +1  for 0.8.2.2 release

 On Fri, Aug 14, 2015 at 5:49 PM, Ismael Juma ism...@juma.me.uk wrote:

   I think this is a good idea as the change is minimal on our side and it
   has been tested in production for some time by the reporter.
 
   Best,
   Ismael
 
   On Fri, Aug 14, 2015 at 1:15 PM, Jun Rao j...@confluent.io wrote:
 
    Hi, Everyone,
 
    Since the release of Kafka 0.8.2.1, a number of people have reported an
    issue with snappy compression
    (https://issues.apache.org/jira/browse/KAFKA-2189). Basically, if they
    use snappy in 0.8.2.1, they will experience a 2-3X space increase. The
    issue has since been fixed in trunk (just a snappy jar upgrade). Since
    0.8.3 is still a few months away, it may make sense to do an 0.8.2.2
    release just to fix this issue. Any objections?
  
   Thanks,
  
   Jun
  
 




-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke