Re: Closing socket connection to /192.115.190.61. (kafka.network.Processor)

2015-06-18 Thread Joe Stein
What version of Kafka are you using? This was changed to debug level in
0.8.2.
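
If upgrading is not an option, the message can also be silenced from the broker's
log4j configuration; a minimal sketch, assuming the stock config/log4j.properties
layout is in use (raise the network processor's logger above INFO):

# Suppress the per-connection INFO lines from the socket server processor
log4j.logger.kafka.network.Processor=WARN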

~ Joestein
On Jun 18, 2015 10:39 PM, "bit1...@163.com"  wrote:

> Hi,
> I have started the Kafka server as a background process; however, the
> following INFO log appears on the console every 10 seconds.
> It looks like it is not an error since its log level is INFO. How can I suppress
> this annoying log? Thanks
>
>
> [2015-06-19 13:34:10,884] INFO Closing socket connection to /
> 192.115.190.61. (kafka.network.Processor)
>
>
>
> bit1...@163.com
>


Closing socket connection to /192.115.190.61. (kafka.network.Processor)

2015-06-18 Thread bit1...@163.com
Hi,
I have started the Kafka server as a background process; however, the following
INFO log appears on the console every 10 seconds.
It looks like it is not an error since its log level is INFO. How can I suppress this
annoying log? Thanks


[2015-06-19 13:34:10,884] INFO Closing socket connection to /192.115.190.61. 
(kafka.network.Processor)



bit1...@163.com


Re: Could not recover kafka broker

2015-06-18 Thread haosdent
The kafka broker was stuck. I restarted the whole cluster and it works now.
Thank you very much.

On Thu, Jun 18, 2015 at 7:37 PM, haosdent  wrote:

> Hello, I use kafka 0.8.2. I have a three-broker kafka cluster.
>
> I stopped one broker and copied recovery-point-offset-checkpoint to
> override replication-offset-checkpoint. After that, I started the broker.
>
> But I find that the broker cannot be added to the ISR anymore, and
> `logs/state-change.log` doesn't update anymore. I tried
> `bin/kafka-preferred-replica-election.sh`, but the broker still could not be
> added to the ISR. `log/server.log` and `logs/controller.log` don't have any
> WARN or ERROR logs. And when I check the network traffic of this server, the
> replication fetch for this broker has not started. I searched the user
> mailing list a lot, but I still don't have any ideas. How can I recover it? Thank
> you in advance.
>
> --
> Best Regards,
> Haosdent Huang
>



-- 
Best Regards,
Haosdent Huang


Re: duplicate messages at consumer

2015-06-18 Thread Kris K
Thanks Adam for your response.
I will have a mechanism to handle duplicates on the service consuming the
messages.
Just curious whether there is a way to identify the cause of the
duplicates - is there any log file that could help with this?

Regards,
Kris

On Wed, Jun 17, 2015 at 8:24 AM, Adam Shannon 
wrote:

> This is actually an expected consequence of using distributed systems. The
> kafka FAQ has a good answer
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-HowdoIgetexactly-oncemessagingfromKafka
> ?
>
> On Tue, Jun 16, 2015 at 11:06 PM, Kris K  wrote:
>
> > Hi,
> >
> > While testing message delivery using kafka, I realized that a few duplicate
> > messages got delivered by consumers in the same consumer group (two
> > consumers got the same message a few milliseconds apart).
> However,
> > I do not see any redundancy at the producer or broker. One more
> observation
> > is that - this is not happening when I use only one consumer thread.
> >
> > I am running 3 brokers (0.8.2.1) with 3 Zookeeper nodes. There are 3
> > partitions in the topic and the replication factor is 3. For producing, I am
> > using the new producer with compression.type=none.
> >
> > On the consumer end, I have 3 High level consumers in the same consumer
> > group running with one consumer thread each, on three different hosts.
> Auto
> > commit is set to true for consumer.
> >
> > Size of each message would range anywhere between 0.7 KB and  2 MB. The
> max
> > volume for this test is 100 messages/hr.
> >
> > I looked at controller log for any possibility of consumer rebalance
> during
> > this time, but did not find any. In the server log of all the brokers the
> > error java.io.IOException: Connection reset by peer is being written
> > almost continuously.
> >
> > So, is it possible to achieve exactly-once delivery with the current high
> > level consumer without needing an extra layer to remove redundancy?
> >
> > Could you please point me to any settings or logs that would help me tune
> > the configuration ?
> >
> > *PS: I tried searching for similar discussions, but could not find any.
> If
> > it's already been answered, please provide the link.
> >
> > Thanks,
> > Kris
> >
>
>
>
> --
> Adam Shannon | Software Engineer | Banno | Jack Henry
> 206 6th Ave Suite 1020 | Des Moines, IA 50309 | Cell: 515.867.8337
>


Broker Fails to restart

2015-06-18 Thread Zakee
Any ideas on why one of the brokers, which was down for a day, fails to restart
with the exception below? The 10-node cluster has been up and running fine for
quite a few weeks.

[2015-06-18 16:44:25,746] ERROR [app=broker] [main] There was an error in one 
of the threads during logs loading: java.lang.IllegalArgumentException: 
requirement failed: Attempt to append to a full index (size = 128). 
(kafka.log.LogManager)
[2015-06-18 16:44:25,747] FATAL [app=broker] [main] [Kafka Server 13], Fatal 
error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
java.lang.IllegalArgumentException: requirement failed: Attempt to append to a 
full index (size = 128).
at scala.Predef$.require(Predef.scala:233)
at 
kafka.log.OffsetIndex$$anonfun$append$1.apply$mcV$sp(OffsetIndex.scala:198)
at kafka.log.OffsetIndex$$anonfun$append$1.apply(OffsetIndex.scala:197)
at kafka.log.OffsetIndex$$anonfun$append$1.apply(OffsetIndex.scala:197)
at kafka.utils.Utils$.inLock(Utils.scala:535)
at kafka.log.OffsetIndex.append(OffsetIndex.scala:197)
at kafka.log.LogSegment.recover(LogSegment.scala:187)
at kafka.log.Log.recoverLog(Log.scala:205)
at kafka.log.Log.loadSegments(Log.scala:177)
at kafka.log.Log.<init>(Log.scala:67)
at 
kafka.log.LogManager$$anonfun$loadLogs$2$$anonfun$3$$anonfun$apply$7$$anonfun$apply$1.apply$mcV$sp(LogManager.scala:142)
at kafka.utils.Utils$$anon$1.run(Utils.scala:54)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:662)


Thanks
Zakee





Re: How to decrease number of replicas

2015-06-18 Thread Guangle Fan
Well, I solved my problem. For the record, the problem was that the prior
reassignment had not finished when the new reassignment was kicked off, in
which case the replica list won't change as expected. Given that our data is at
a relatively large scale, the reassignment had stayed without progress for a day.
The solution was to rolling-restart the stuck leaders for each partition and send
out the replica reassignment once again.
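
For reference, decreasing the replica count uses the same reassignment tool,
pointed at a JSON file that simply lists fewer replicas per partition; as noted
above, any earlier reassignment must have fully completed first, since only one
reassignment can be in progress at a time. A hedged sketch (topic name, broker
ids and file paths are illustrative):

{"version": 1,
 "partitions": [
   {"topic": "topic", "partition": 0, "replicas": [6, 2]}
 ]}

bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 \
  --reassignment-json-file decrease-replicas.json --execute
bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 \
  --reassignment-json-file decrease-replicas.json --verify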

On Thu, Jun 18, 2015 at 5:37 PM, Guangle Fan  wrote:

> If I use the same approach to reassign a smaller number of replicas to the
> same partition, I get this error:
>
> (0,5,1,6,2,3) are the current replicas, and (6) is the new list I want to
> assign to topic partition 0
>
> Assigned replicas (0,5,1,6,2,3) don't match the list of replicas for
> reassignment (6) for partition [topic,0]
>
> On Thu, Jun 18, 2015 at 5:03 PM, Guangle Fan  wrote:
>
>> Hi, Folks
>>
>> We have an online kafka cluster v0.8.1.1.
>> We ran a partition reassignment script which maps each partition to
>> 3 replicas, but the growth of data exceeded my expectations, and I really
>> need to decrease the replicas for each partition to 2 or 1.
>>
>> What's the best way to do this ?
>>
>> Thanks !
>>
>> GFan
>>
>
>


Re: Kafka 0.8.3 - New Consumer - user implemented partition.assignment.strategies?

2015-06-18 Thread Jiangjie Qin
Hi Olof,

I am just wondering what the benefit of rebalancing with a minimal number
of reassignments is here.

I am asking because in the new consumer the rebalance is actually quite
cheap on the consumer side - it is just updating a topic partition set. That
means the actual rebalance time on the consumer side is almost negligible no
matter how many partitions are reassigned.

Is it because the consumer has some sort of sticky partition
requirement? If that is the case, that seems to need an immutable partition
assignment policy.

Just curious about the motivation behind this.
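
For what it's worth, the "minimal number of reassignments" idea can be sketched
independently of any Kafka API: keep each partition with its previous owner
whenever that consumer is still in the group and under its fair share, and only
redistribute the leftovers. A rough illustration in Java (all names hypothetical,
not the PartitionAssignor contract):

import java.util.*;

public class StickyAssignmentSketch {
    public static Map<Integer, String> assign(List<String> consumers,
                                               int numPartitions,
                                               Map<Integer, String> previous) {
        Map<Integer, String> assignment = new HashMap<>();
        Map<String, Integer> load = new HashMap<>();
        int quota = (numPartitions + consumers.size() - 1) / consumers.size();
        List<Integer> leftovers = new ArrayList<>();
        for (int p = 0; p < numPartitions; p++) {
            String owner = previous.get(p);
            if (owner != null && consumers.contains(owner)
                    && load.getOrDefault(owner, 0) < quota) {
                assignment.put(p, owner);          // sticky: partition does not move
                load.merge(owner, 1, Integer::sum);
            } else {
                leftovers.add(p);                  // previous owner gone or over quota
            }
        }
        int i = 0;
        for (int p : leftovers) {                  // round-robin over remaining capacity
            while (load.getOrDefault(consumers.get(i % consumers.size()), 0) >= quota) {
                i++;
            }
            String c = consumers.get(i % consumers.size());
            assignment.put(p, c);
            load.merge(c, 1, Integer::sum);
            i++;
        }
        return assignment;
    }
}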

Thanks,

Jiangjie (Becket) Qin


On 6/11/15, 9:30 AM, "Johansson, Olof" 
wrote:

>Thanks Guozhang,
>
>I agree that it seems to be a common reassignment strategy that should be
>one of the pre-defined strategies. Do you have a Jira ticket for this
>specific case, and/or a Jira ticket to add user defined
>partitions.assignment.strategies that I can watch?
>
>/ Olof
>
>On 10/06/2015 14:35, "Guozhang Wang"  wrote:
>
>>Hi Olof,
>>
>>Yes, we have plans to allow a user-defined partitions.assignment strategy to
>>be passed in to the consumer coordinator; I am not sure whether this feature
>>will make it into the first release of the new consumer in 0.8.3, though.
>>Currently users still have to choose one of the server-defined strategies
>>upon registering themselves.
>>
>>On the other hand, I think "rebalance with a minimal number of
>>reassignment" should be quite a common reassignment strategy, and I think
>>it is possible to just add it into the server-defined strategies list.
>>
>>Guozhang
>>
>>
>>On Wed, Jun 10, 2015 at 9:32 AM, Johansson, Olof <
>>olof.johans...@thingworx.com> wrote:
>>
>>> Is it possible to have a consumer rebalancing partition assignment
>>> strategy that will rebalance with a minimal number of reassignments?
>>>
>>> Per the "Kafka 0.9 Consumer Rewrite Design" it should be possible to
>>> define my own partition assignment strategy:
>>> "partition.assignment.strategies - may take a comma separated list of
>>> properties that map the strategy's friendly name to the class that
>>> implements the strategy. This is used for any strategy implemented by
>>>the
>>> user and released to the Kafka cluster. By default, Kafka will include
>>>a
>>> set of strategies that can be used by the consumer."
>>>
>>> Is there a Jira ticket that tracks adding user defined
>>> partitions.assignment.strategies? In the latest source, range, and
>>> roundrobin are still the only possible values (hard-coded).
>>>
>>> I assume that any user implemented strategy would have to implement the
>>> PartitionAssignor trait. If so, by naively looking at the 0.8.3 source,
>>>a
>>> strategy that should do a minimal number of partition reassignments
>>>would
>>> need the ConsumerMetaData. That's not currently available in the
>>> PartitionAssignor contract - assign(topicsPerConsumer,
>>>partitionsPerTopic).
>>> Have there been any discussion to change the contract to pass
>>> ConsumerMetaData?
>>>
>>
>>
>>
>>-- 
>>-- Guozhang
>



noticing blocking of kafka-network-thread

2015-06-18 Thread Pete Wright
Hi There,
while investigating higher-than-expected CPU utilization on one of our
Kafka brokers, we noticed multiple instances of the kafka-network-thread
running in a BLOCKED state, all of which are waiting for a single thread
to release a lock.

Here is an example from a stack trace:

(blocked thread)
"kafka-network-thread-9092-10" - Thread t@50
   java.lang.Thread.State: BLOCKED
at sun.nio.ch.FileChannelImpl.size(FileChannelImpl.java:300)
- waiting to lock <2bb6fb92> (a java.lang.Object) owned by
"kafka-network-thread-9092-13" t@53
at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:565)
at kafka.log.FileMessageSet.writeTo(FileMessageSet.scala:147)
at kafka.api.PartitionDataSend.writeTo(FetchResponse.scala:70)
at kafka.network.MultiSend.writeTo(Transmission.scala:101)
at kafka.api.TopicDataSend.writeTo(FetchResponse.scala:125)
at kafka.network.MultiSend.writeTo(Transmission.scala:101)
at kafka.api.FetchResponseSend.writeTo(FetchResponse.scala:231)
at kafka.network.Processor.write(SocketServer.scala:472)
at kafka.network.Processor.run(SocketServer.scala:342)
at java.lang.Thread.run(Thread.java:745)



(thread 53)
"kafka-network-thread-9092-13" - Thread t@53
   java.lang.Thread.State: RUNNABLE
at sun.nio.ch.FileDispatcherImpl.size0(Native Method)
at sun.nio.ch.FileDispatcherImpl.size(FileDispatcherImpl.java:84)
at sun.nio.ch.FileChannelImpl.size(FileChannelImpl.java:309)
- locked <2bb6fb92> (a java.lang.Object)
at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:565)
at kafka.log.FileMessageSet.writeTo(FileMessageSet.scala:147)
at kafka.api.PartitionDataSend.writeTo(FetchResponse.scala:70)
at kafka.network.MultiSend.writeTo(Transmission.scala:101)
at kafka.api.TopicDataSend.writeTo(FetchResponse.scala:125)
at kafka.network.MultiSend.writeTo(Transmission.scala:101)
at kafka.api.FetchResponseSend.writeTo(FetchResponse.scala:231)
at kafka.network.Processor.write(SocketServer.scala:472)
at kafka.network.Processor.run(SocketServer.scala:342)
at java.lang.Thread.run(Thread.java:745)

   Locked ownable synchronizers:
- None



What is interesting is that:
1) we have seen the broker with high CPU utilization float around our
cluster, i.e. node 0 will have high load for a time, then node 2, etc.

2) when we do a thread dump on normally operating brokers we find no
BLOCKED threads

3) on every node where we capture thread dumps, the threads all seem to be
blocked by one thread.

Has anyone else seen this, and if not - is there an easy way to
determine which object these threads are seeing contention on?

Thanks in advance!
-pete


-- 
Pete Wright
Lead Systems Architect
Rubicon Project
pwri...@rubiconproject.com
310.309.9298


Re: NoSuchMethodError with Consumer Instantiation

2015-06-18 Thread Ewen Cheslack-Postava
It looks like you have mixed up versions of the kafka jars:

4. kafka_2.11-0.8.3-SNAPSHOT.jar
5. kafka_2.11-0.8.2.1.jar
6. kafka-clients-0.8.2.1.jar

I think org.apache.kafka.common.utils.Utils.newThread is very new, probably
post-0.8.2.1. The error is most likely caused by the kafka_2.11-0.8.3-SNAPSHOT.jar
being used, which then tries to call a method that should come from the
kafka-clients jar; since that jar is the older kafka-clients-0.8.2.1.jar, the
method can't be found.
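
A quick way to confirm which jar each class is actually being loaded from is to
print its code source; a minimal diagnostic sketch (class names taken from the
stack trace above, everything else illustrative):

public class WhichJar {
    public static void main(String[] args) throws Exception {
        String[] names = {
                "org.apache.kafka.common.utils.Utils",
                "kafka.utils.KafkaScheduler"};
        for (String name : names) {
            // Print the jar (code source) that the class was loaded from.
            Class<?> c = Class.forName(name);
            System.out.println(name + " -> "
                    + c.getProtectionDomain().getCodeSource().getLocation());
        }
    }
}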

-Ewen

On Thu, Jun 18, 2015 at 1:13 PM, Srividhya Anantharamakrishnan <
srivid...@hedviginc.com> wrote:

> Sorry for spamming, but any help would be greatly appreciated!
>
>
> On Thu, Jun 18, 2015 at 10:49 AM, Srividhya Anantharamakrishnan <
> srivid...@hedviginc.com> wrote:
>
> > The following are the jars in my classpath:
> >
> > 1. slf4j-log4j12-1.6.6.jar
> > 2. slf4j-api-1.6.6.jar
> > 3. zookeeper-3.4.6.jar
> > 4. kafka_2.11-0.8.3-SNAPSHOT.jar
> > 5. kafka_2.11-0.8.2.1.jar
> > 6. kafka-clients-0.8.2.1.jar
> > 7. metrics-core-2.2.0.jar
> > 8. scala-library-2.11.5.jar
> > 9. zkclient-0.3.jar
> >
> > Am I missing something?
> >
> > On Wed, Jun 17, 2015 at 9:15 PM, Jaikiran Pai 
> > wrote:
> >
> >> You probably have the wrong version of the Kafka jar(s) within your
> >> classpath. Which version of Kafka are you using and how have you setup
> the
> >> classpath?
> >>
> >> -Jaikiran
> >>
> >> On Thursday 18 June 2015 08:11 AM, Srividhya Anantharamakrishnan wrote:
> >>
> >>> Hi,
> >>>
> >>> I am trying to set up Kafka in our cluster and I am running into the
> >>> following error when Consumer is getting instantiated:
> >>>
> >>> java.lang.NoSuchMethodError:
> >>>
> >>>
> org.apache.kafka.common.utils.Utils.newThread(Ljava/lang/String;Ljava/lang/Runnable;Ljava/lang/Boolean;)Ljava/lang/Thread;
> >>>
> >>>  at
> >>> kafka.utils.KafkaScheduler$$anon$1.newThread(KafkaScheduler.scala:84)
> >>>
> >>>  at
> >>>
> >>>
> java.util.concurrent.ThreadPoolExecutor$Worker.<init>(ThreadPoolExecutor.java:610)
> >>>
> >>>  at
> >>>
> >>>
> java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:924)
> >>>
> >>>  at
> >>>
> >>>
> java.util.concurrent.ThreadPoolExecutor.ensurePrestart(ThreadPoolExecutor.java:1590)
> >>>
> >>>  at
> >>>
> >>>
> java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:333)
> >>>
> >>>  at
> >>>
> >>>
> java.util.concurrent.ScheduledThreadPoolExecutor.scheduleAtFixedRate(ScheduledThreadPoolExecutor.java:570)
> >>>
> >>>  at
> kafka.utils.KafkaScheduler.schedule(KafkaScheduler.scala:116)
> >>>
> >>>  at
> >>>
> >>>
> kafka.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:136)
> >>>
> >>>  at
> >>>
> >>>
> kafka.javaapi.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:65)
> >>>
> >>>  at
> >>>
> >>>
> kafka.javaapi.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:68)
> >>>
> >>>  at
> >>>
> >>>
> kafka.consumer.Consumer$.createJavaConsumerConnector(ConsumerConnector.scala:120)
> >>>
> >>>  at
> >>>
> >>>
> kafka.consumer.Consumer.createJavaConsumerConnector(ConsumerConnector.scala)
> >>>
> >>>
> >>> I am guessing that it is missing certain classpath references. If that
> is
> >>> the reason, could someone tell me which jar is it?
> >>>
> >>> If not, what is it that I am missing?
> >>>
> >>>
> >>> *KafkaConsumer:*
> >>>
> >>>
> >>> public KafkaConsumer(String topic)
> >>>
> >>> {
> >>>
> >>> * consumer =
> >>> Consumer.createJavaConsumerConnector(createConsumerConfig());
> >>> //line where the error is thrown*
> >>>
> >>>   this.topic = topic;
> >>>
> >>> }
> >>>
> >>>   private static ConsumerConfig createConsumerConfig()
> >>>
> >>> {
> >>>
> >>>   Properties props = new Properties();
> >>>
> >>>  props.put("zookeeper.connect", "IP:PORT");
> >>>
> >>>  props.put("group.id", "group1");
> >>>
> >>>  props.put("zookeeper.session.timeout.ms", "6000");
> >>>
> >>>  props.put("zookeeper.sync.time.ms", "2000");
> >>>
> >>>  props.put("auto.commit.interval.ms", "6");
> >>>
> >>>
> >>>  return new ConsumerConfig(props);
> >>>
> >>>   }
> >>>
> >>>
> >>> TIA!
> >>>
> >>>
> >>
> >
>



-- 
Thanks,
Ewen


Re: How to decrease number of replicas

2015-06-18 Thread Guangle Fan
If I use the same approach to reassign a smaller number of replicas to the
same partition, I get this error:

(0,5,1,6,2,3) are the current replicas, and (6) is the new list I want to
assign to topic partition 0

Assigned replicas (0,5,1,6,2,3) don't match the list of replicas for
reassignment (6) for partition [topic,0]

On Thu, Jun 18, 2015 at 5:03 PM, Guangle Fan  wrote:

> Hi, Folks
>
> We have an online kafka cluster v0.8.1.1.
> We ran a partition reassignment script which maps each partition to
> 3 replicas, but the growth of data exceeded my expectations, and I really need
> to decrease the replicas for each partition to 2 or 1.
>
> What's the best way to do this ?
>
> Thanks !
>
> GFan
>


Manual Offset Commits with High Level Consumer skipping messages

2015-06-18 Thread noah
We are in a situation where we need at least once delivery. We have a
thread that pulls messages off the consumer, puts them in a queue where
they go through a few async steps, and then after the final step, we want
to commit the offsets of the messages we have completed. There may be items
we have not completed still being processed, so
consumerConnector.commitOffsets() isn't an option for us.

We are manually committing offsets to Kafka (0.8.2.1) (auto commit is off.)

We have a simple test case that is supposed to verify that we don't lose
any messages if the Kafka server is shut down:

// there are 25 messages; we send a few now and a few after the server comes back up
for (TestMessageClass mess : messages.subList(0, mid)) {
producer.send(mess);
}

stopKafka(); // in memory KafkaServer
startKafka();

for (TestMessageClass mess : messages.subList(mid, total)) {
producer.send(mess);
}

int tries = 0;
while(testConsumer.received.size() < total && tries++ < 10) {
Thread.sleep(200);
}
assertEquals(keys(testConsumer.received),
keys(ImmutableSet.copyOf(messages)));

The test consumer is very simple:

ConsumerIterator iterator;
while(iterator.hasNext()) {
process(iterator.next());
}

// end of process:
   commit(messageAndMetadata.offset());

commit is basically the commit code from this page:
https://cwiki.apache.org/confluence/display/KAFKA/Committing+and+fetching+consumer+offsets+in+Kafka,
but runs the commit in a separate thread so it wont interfere with the
consumer.

Here is the strange thing: If we do not commit, the test passes every time.
Kafka comes back up and the high level consumer picks up right where it
left off. But if we do commit, it does not recover, or we lose messages.
With 1 partition, we only get some prefix of the messages produced before
stopKafka(). With 2, one of the partitions never gets any of the messages
sent in the second half, while the other gets a prefix, but not all of the
messages for that partition.

It seems like the most likely thing is that we are committing the wrong
offsets, but I cannot figure out how that is happening. Does the offset in
MessageAndMetadata not correspond to the offset in OffsetAndMetadata?

Or do we have to abandon the high level consumer entirely if we want to
manually commit in this way?
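
One detail worth double-checking in code like this: for Kafka's offset commit
API the stored offset is conventionally the position to resume from, i.e. the
offset of the last fully processed message plus one. Committing
MessageAndMetadata.offset() itself makes the consumer re-read that message after
a restart rather than skip ahead, so it would not by itself explain lost
messages, but the off-by-one is easy to get wrong in hand-rolled commit code. A
hedged sketch (commitOffset is a hypothetical wrapper around the
OffsetCommitRequest code from the wiki page above):

MessageAndMetadata<byte[], byte[]> record = iterator.next();
process(record);
// Commit the *next* position to read, not the offset of the record itself.
commitOffset(record.topic(), record.partition(), record.offset() + 1);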


Re: At-least-once guarantees with high-level consumer

2015-06-18 Thread Bhavesh Mistry
Hi Carl,

Producer-side retries can result in duplicated messages being sent to the
brokers: the same message stored under different offsets. Also, you may get
duplicates when the high-level consumer's offset has not been saved or committed
but you have already processed the data and your server restarts, etc.

To guarantee at-least-once processing across partitions (and across
servers), you will need to store a message hash or primary key in a
distributed LRU cache (with an eviction policy) like Hazelcast
and do de-duplication across partitions.
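
A single-JVM illustration of that dedup idea (in practice the "seen" set would
live in a distributed cache such as Hazelcast; all names here are illustrative):

import java.util.LinkedHashMap;
import java.util.Map;

// Bounded LRU "seen" set: firstTime() returns true only the first time a key
// is observed within the cache window, so the caller can skip duplicates.
public class SeenMessages {
    private final Map<String, Boolean> seen;

    public SeenMessages(final int capacity) {
        this.seen = new LinkedHashMap<String, Boolean>(capacity, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                return size() > capacity;   // evict least-recently-used keys
            }
        };
    }

    public synchronized boolean firstTime(String messageKeyOrHash) {
        return seen.put(messageKeyOrHash, Boolean.TRUE) == null;
    }
}

// Usage on the consuming side:
//   if (seenMessages.firstTime(hashOf(message))) { process(message); }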



I hope this helps!



Thanks,

Bhavesh


On Wed, Jun 17, 2015 at 1:49 AM, yewton  wrote:

> So Carl Heymann's ConsumerIterator.next hack approach is not reasonable?
>
> On 2015-06-17 08:12:50, Stevo Slavić wrote:
>
>> With auto-commit one can only have at-most-once delivery guarantee - after
>> commit but before the message is delivered for processing, or even after it is
>> delivered but before it is processed, things can fail, causing the event not to
>> be processed, which is basically the same outcome as if it was not delivered.
>>
>> On Mon, Jun 15, 2015 at 9:12 PM, Carl Heymann  wrote:
>>
>> > Hi
>> >
>> > ** Disclaimer: I know there's a new consumer API on the way, this mail is
>> > about the currently available API. I also apologise if the below has
>> > already been discussed previously. I did try to check previous discussions
>> > on ConsumerIterator **
>> >
>> > It seems to me that the high-level consumer would be able to support
>> > at-least-once messaging, even if one uses auto-commit, by changing
>> > kafka.consumer.ConsumerIterator.next() to call
>> > currentTopicInfo.resetConsumeOffset(..) _before_ super.next(). This way, a
>> > consumer thread for a KafkaStream could just loop:
>> >
>> > while (true) {
>> >     MyMessage message = iterator.next().message();
>> >     process(message);
>> > }
>> >
>> > Each call to "iterator.next()" then updates the offset to commit to the end
>> > of the message that was just processed. When offsets are committed for the
>> > ConsumerConnector (either automatically or manually), the commit will not
>> > include offsets of messages that haven't been fully processed.
>> >
>> > I've tested the following ConsumerIterator.next(), and it seems to work as
>> > I expect:
>> >
>> >   override def next(): MessageAndMetadata[K, V] = {
>> >     // New code: reset consumer offset to the end of the previously
>> >     // consumed message:
>> >     if (consumedOffset > -1L && currentTopicInfo != null) {
>> >       currentTopicInfo.resetConsumeOffset(consumedOffset)
>> >       val topic = currentTopicInfo.topic
>> >       trace("Setting %s consumed offset to %d".format(topic, consumedOffset))
>> >     }
>> >
>> >     // Old code, excluding reset:
>> >     val item = super.next()
>> >     if(consumedOffset < 0)
>> >       throw new KafkaException("Offset returned by the message set is invalid %d".format(consumedOffset))
>> >     val topic = currentTopicInfo.topic
>> >     consumerTopicStats.getConsumerTopicStats(topic).messageRate.mark()
>> >     consumerTopicStats.getConsumerAllTopicStats().messageRate.mark()
>> >     item
>> >   }
>> >
>> > I've seen several people asking about managing commit offsets manually with
>> > the high level consumer. I suspect that this approach (the modified
>> > ConsumerIterator) would scale better than having a separate
>> > ConsumerConnector per stream just so that you can commit offsets with
>> > at-least-once semantics. The downside of this approach is more duplicate
>> > deliveries after recovery from hard failure (but this is "at least once",
>> > right, not "exactly once").
>> >
>> > I don't propose that the code necessarily be changed like this in trunk, I
>> > just want to know if the approach seems reasonable.
>> >
>> > Regards
>> > Carl Heymann
>>
>


How to decrease number of replicas

2015-06-18 Thread Guangle Fan
Hi, Folks

We have an online kafka cluster v0.8.1.1.
We ran a partition reassignment script which maps each partition to
3 replicas, but the growth of data exceeded my expectations, and I really need
to decrease the replicas for each partition to 2 or 1.

What's the best way to do this ?

Thanks !

GFan


Re: NoSuchMethodError with Consumer Instantiation

2015-06-18 Thread Srividhya Anantharamakrishnan
Sorry for spamming, but any help would be greatly appreciated!


On Thu, Jun 18, 2015 at 10:49 AM, Srividhya Anantharamakrishnan <
srivid...@hedviginc.com> wrote:

> The following are the jars in my classpath:
>
> 1. slf4j-log4j12-1.6.6.jar
> 2. slf4j-api-1.6.6.jar
> 3. zookeeper-3.4.6.jar
> 4. kafka_2.11-0.8.3-SNAPSHOT.jar
> 5. kafka_2.11-0.8.2.1.jar
> 6. kafka-clients-0.8.2.1.jar
> 7. metrics-core-2.2.0.jar
> 8. scala-library-2.11.5.jar
> 9. zkclient-0.3.jar
>
> Am I missing something?
>
> On Wed, Jun 17, 2015 at 9:15 PM, Jaikiran Pai 
> wrote:
>
>> You probably have the wrong version of the Kafka jar(s) within your
>> classpath. Which version of Kafka are you using and how have you setup the
>> classpath?
>>
>> -Jaikiran
>>
>> On Thursday 18 June 2015 08:11 AM, Srividhya Anantharamakrishnan wrote:
>>
>>> Hi,
>>>
>>> I am trying to set up Kafka in our cluster and I am running into the
>>> following error when Consumer is getting instantiated:
>>>
>>> java.lang.NoSuchMethodError:
>>>
>>> org.apache.kafka.common.utils.Utils.newThread(Ljava/lang/String;Ljava/lang/Runnable;Ljava/lang/Boolean;)Ljava/lang/Thread;
>>>
>>>  at
>>> kafka.utils.KafkaScheduler$$anon$1.newThread(KafkaScheduler.scala:84)
>>>
>>>  at
>>>
>>> java.util.concurrent.ThreadPoolExecutor$Worker.<init>(ThreadPoolExecutor.java:610)
>>>
>>>  at
>>>
>>> java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:924)
>>>
>>>  at
>>>
>>> java.util.concurrent.ThreadPoolExecutor.ensurePrestart(ThreadPoolExecutor.java:1590)
>>>
>>>  at
>>>
>>> java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:333)
>>>
>>>  at
>>>
>>> java.util.concurrent.ScheduledThreadPoolExecutor.scheduleAtFixedRate(ScheduledThreadPoolExecutor.java:570)
>>>
>>>  at kafka.utils.KafkaScheduler.schedule(KafkaScheduler.scala:116)
>>>
>>>  at
>>>
>>> kafka.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:136)
>>>
>>>  at
>>>
>>> kafka.javaapi.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:65)
>>>
>>>  at
>>>
>>> kafka.javaapi.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:68)
>>>
>>>  at
>>>
>>> kafka.consumer.Consumer$.createJavaConsumerConnector(ConsumerConnector.scala:120)
>>>
>>>  at
>>>
>>> kafka.consumer.Consumer.createJavaConsumerConnector(ConsumerConnector.scala)
>>>
>>>
>>> I am guessing that it is missing certain classpath references. If that is
>>> the reason, could someone tell me which jar is it?
>>>
>>> If not, what is it that I am missing?
>>>
>>>
>>> *KafkaConsumer:*
>>>
>>>
>>> public KafkaConsumer(String topic)
>>>
>>> {
>>>
>>> * consumer =
>>> Consumer.createJavaConsumerConnector(createConsumerConfig());
>>> //line where the error is thrown*
>>>
>>>   this.topic = topic;
>>>
>>> }
>>>
>>>   private static ConsumerConfig createConsumerConfig()
>>>
>>> {
>>>
>>>   Properties props = new Properties();
>>>
>>>  props.put("zookeeper.connect", "IP:PORT");
>>>
>>>  props.put("group.id", "group1");
>>>
>>>  props.put("zookeeper.session.timeout.ms", "6000");
>>>
>>>  props.put("zookeeper.sync.time.ms", "2000");
>>>
>>>  props.put("auto.commit.interval.ms", "6");
>>>
>>>
>>>  return new ConsumerConfig(props);
>>>
>>>   }
>>>
>>>
>>> TIA!
>>>
>>>
>>
>


Re: How Netflix is using Kafka

2015-06-18 Thread Steve Brandon
Thanks Peter!

Great information, especially challenges and strategies. We're rolling out
Kafka now and it's interesting to see how others have done it.

On Thu, Jun 18, 2015 at 10:17 AM, Peter Hausel 
wrote:

> Hello,
>
> Thought you might find these resources interesting:
>
> http://techblog.netflix.com/2015/06/nts-real-time-streaming-for-test.html
> http://www.slideshare.net/wangxia5/netflix-kafka
>
> Cheers,
> Peter
>


Re: Need recommendation

2015-06-18 Thread Gwen Shapira
I'm assuming you are sending data in a continuous stream and not a
single large batch:

500GB a day = 20GB an hour = 5MB a second.

A minimal 3-node cluster should work. You also need enough storage for a
reasonable retention period (15 TB per month).
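
(For reference, the rough arithmetic: 500 GB/day ÷ 86,400 s/day ≈ 5.8 MB/s of
sustained ingest, and 500 GB/day × 30 days ≈ 15 TB of raw data per month, before
the topic replication factor multiplies the storage requirement.)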




On Thu, Jun 18, 2015 at 10:39 AM, Khanda.Rajat  wrote:
> Hi,
> I have a requirement of transferring around 500 GB of logs from app server to 
> hdfs per day. What will be the ideal kafka cluster size?
>
> Thanks
> Rajat
> CONFIDENTIALITY NOTICE: This message is the property of International Game 
> Technology PLC and/or its subsidiaries and may contain proprietary, 
> confidential or trade secret information. This message is intended solely for 
> the use of the addressee. If you are not the intended recipient and have 
> received this message in error, please delete this message from your system. 
> Any unauthorized reading, distribution, copying, or other use of this message 
> or its attachments is strictly prohibited.


Need recommendation

2015-06-18 Thread Khanda.Rajat
Hi,
I have a requirement of transferring around 500 GB of logs from app server to 
hdfs per day. What will be the ideal kafka cluster size?

Thanks
Rajat
CONFIDENTIALITY NOTICE: This message is the property of International Game 
Technology PLC and/or its subsidiaries and may contain proprietary, 
confidential or trade secret information. This message is intended solely for 
the use of the addressee. If you are not the intended recipient and have 
received this message in error, please delete this message from your system. 
Any unauthorized reading, distribution, copying, or other use of this message 
or its attachments is strictly prohibited.


Re: NoSuchMethodError with Consumer Instantiation

2015-06-18 Thread Srividhya Anantharamakrishnan
The following are the jars in my classpath:

1. slf4j-log4j12-1.6.6.jar
2. slf4j-api-1.6.6.jar
3. zookeeper-3.4.6.jar
4. kafka_2.11-0.8.3-SNAPSHOT.jar
5. kafka_2.11-0.8.2.1.jar
6. kafka-clients-0.8.2.1.jar
7. metrics-core-2.2.0.jar
8. scala-library-2.11.5.jar
9. zkclient-0.3.jar

Am I missing something?

On Wed, Jun 17, 2015 at 9:15 PM, Jaikiran Pai 
wrote:

> You probably have the wrong version of the Kafka jar(s) within your
> classpath. Which version of Kafka are you using and how have you setup the
> classpath?
>
> -Jaikiran
>
> On Thursday 18 June 2015 08:11 AM, Srividhya Anantharamakrishnan wrote:
>
>> Hi,
>>
>> I am trying to set up Kafka in our cluster and I am running into the
>> following error when Consumer is getting instantiated:
>>
>> java.lang.NoSuchMethodError:
>>
>> org.apache.kafka.common.utils.Utils.newThread(Ljava/lang/String;Ljava/lang/Runnable;Ljava/lang/Boolean;)Ljava/lang/Thread;
>>
>>  at
>> kafka.utils.KafkaScheduler$$anon$1.newThread(KafkaScheduler.scala:84)
>>
>>  at
>>
>> java.util.concurrent.ThreadPoolExecutor$Worker.<init>(ThreadPoolExecutor.java:610)
>>
>>  at
>>
>> java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:924)
>>
>>  at
>>
>> java.util.concurrent.ThreadPoolExecutor.ensurePrestart(ThreadPoolExecutor.java:1590)
>>
>>  at
>>
>> java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:333)
>>
>>  at
>>
>> java.util.concurrent.ScheduledThreadPoolExecutor.scheduleAtFixedRate(ScheduledThreadPoolExecutor.java:570)
>>
>>  at kafka.utils.KafkaScheduler.schedule(KafkaScheduler.scala:116)
>>
>>  at
>>
>> kafka.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:136)
>>
>>  at
>>
>> kafka.javaapi.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:65)
>>
>>  at
>>
>> kafka.javaapi.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:68)
>>
>>  at
>>
>> kafka.consumer.Consumer$.createJavaConsumerConnector(ConsumerConnector.scala:120)
>>
>>  at
>>
>> kafka.consumer.Consumer.createJavaConsumerConnector(ConsumerConnector.scala)
>>
>>
>> I am guessing that it is missing certain classpath references. If that is
>> the reason, could someone tell me which jar is it?
>>
>> If not, what is it that I am missing?
>>
>>
>> *KafkaConsumer:*
>>
>>
>> public KafkaConsumer(String topic)
>>
>> {
>>
>> * consumer = Consumer.createJavaConsumerConnector(createConsumerConfig());
>> //line where the error is thrown*
>>
>>   this.topic = topic;
>>
>> }
>>
>>   private static ConsumerConfig createConsumerConfig()
>>
>> {
>>
>>   Properties props = new Properties();
>>
>>  props.put("zookeeper.connect", "IP:PORT");
>>
>>  props.put("group.id", "group1");
>>
>>  props.put("zookeeper.session.timeout.ms", "6000");
>>
>>  props.put("zookeeper.sync.time.ms", "2000");
>>
>>  props.put("auto.commit.interval.ms", "6");
>>
>>
>>  return new ConsumerConfig(props);
>>
>>   }
>>
>>
>> TIA!
>>
>>
>


How Netflix is using Kafka

2015-06-18 Thread Peter Hausel
Hello,

Thought you might find these resources interesting:

http://techblog.netflix.com/2015/06/nts-real-time-streaming-for-test.html
http://www.slideshare.net/wangxia5/netflix-kafka

Cheers,
Peter


queued.max.requests Configuration

2015-06-18 Thread Grant Henke
The default value of queued.max.requests is 500. However, the sample
production config in the documentation (
http://kafka.apache.org/documentation.html#prodconfig)
sets queued.max.requests to 16.

Can anyone elaborate on the recommended value of 16 and the trade-offs of
increasing or decreasing this value from the default?

Thank you,
Grant

-- 
Grant Henke
Solutions Consultant | Cloudera
ghe...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Could not recover kafka broker

2015-06-18 Thread haosdent
Hello, I use kafka 0.8.2. I have a three-broker kafka cluster.

I stopped one broker and copied recovery-point-offset-checkpoint to
override replication-offset-checkpoint. After that, I started the broker.

But I find that the broker cannot be added to the ISR anymore, and
`logs/state-change.log` doesn't update anymore. I tried
`bin/kafka-preferred-replica-election.sh`, but the broker still could not be
added to the ISR. `log/server.log` and `logs/controller.log` don't have any
WARN or ERROR logs. And when I check the network traffic of this server, the
replication fetch for this broker has not started. I searched the user
mailing list a lot, but I still don't have any ideas. How can I recover it? Thank
you in advance.

-- 
Best Regards,
Haosdent Huang


Add the hard drive problem

2015-06-18 Thread 杨洪涛
Hi all,
   My server has only one hard drive, and disk I/O has now become a bottleneck. I
want to add two more hard disks, but I did not find any relevant upgrade
documentation online. Please help.
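
A common approach, assuming Kafka 0.8.x and that the new disks are mounted as
separate filesystems, is to list one log directory per disk in server.properties;
Kafka spreads newly created partitions across the listed directories, but existing
partitions are not moved automatically (paths are illustrative):

log.dirs=/data/disk1/kafka-logs,/data/disk2/kafka-logs,/data/disk3/kafka-logs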

Re: Issue with log4j Kafka Appender.

2015-06-18 Thread Manikumar Reddy
You can enable producer debug logging and verify. In 0.8.2.0 you can set the
compressionType, requiredNumAcks and syncSend producer config properties in
log4j.xml; a trunk build can additionally take a retries property.
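
A hedged sketch of those appender properties in log4j.properties form (values
illustrative); SyncSend in particular makes each append wait for the send, which
is one plausible explanation for why adding a sleep helped - an async producer's
buffered batch can be lost if the JVM exits before it is flushed:

log4j.appender.KAFKA.RequiredNumAcks=1
log4j.appender.KAFKA.SyncSend=true
log4j.appender.KAFKA.CompressionType=none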


Manikumar

On Thu, Jun 18, 2015 at 1:14 AM, Madhavi Sreerangam <
madhavi.sreeran...@gmail.com> wrote:

> I have configured my log4j with Kafka Appender.(Kafka version 0.8.2.0)
> Following are entries from my log4j file
>
> log4j.appender.KAFKA=kafka.producer.KafkaLog4jAppender
> log4j.appender.KAFKA.BrokerList=iakafka301p.dev.ch3.s.com:9092,
> iakafka302p.dev.ch3.s.com:9092,iakafka303p.dev.ch3.s.com:9092
> log4j.appender.KAFKA.Topic=dev-1.0_audit
> log4j.appender.KAFKA.Serializer=kafka.test.AppenderStringSerializer
> log4j.appender.KAFKA.layout=org.apache.log4j.PatternLayout
> log4j.appender.KAFKA.layout.ConversionPattern=%m-%d
>
> Kafka is configured with 3 servers, 3 partitions and 3 replicas.
> I have created a test method to publish the messages to kafka topic as
> follows
>
> private void testKAFKAlog(int noOfMessages){
> for(int i=0; i < noOfMessages; i++){
> KAFKA_LOG.info("Test Message: " + i);
> }
> }
> I could not see any messages published into the topic. Then I have modified
> the test method to introduce some wait between the requests as follows
>
> private void testKAFKAlog(int noOfMessages){
> for(int i=0; i < noOfMessages; i++){
> try {
> Thread.sleep(10);
> } catch (InterruptedException e) {
> e.printStackTrace();
> }
> KAFKA_LOG.info("Test Message: " + i);
> }
> }
>
> Then all the messages started publishing. I did this exercise a couple of
> times, with and without sleep between the requests. Messages got published
> only when there was a sleep between the requests.
> Can anyone help me here with what is wrong with the configuration I am
> using? (I can't afford a 10 ms wait for each message, as my application logs
> a few million messages on each run.)
> Is there any way that I can override the default ProducerConfig for the log4j
> kafka appender?
>


unclean leader election enable default value bug?

2015-06-18 Thread Zaiming Shi
Kafka 0.8.2.1

I have `unclean.leader.election.enable=false` in server.properties

I can see this log in server.log:
[2015-06-18 09:57:18,961] INFO Property unclean.leader.election.enable is
overridden to false (kafka.utils.VerifiableProperties)

Yet the topic was created with `unclean.leader.election.enable -> true`

This looks similar to this issue:
https://issues.apache.org/jira/browse/KAFKA-2114

Regards
-Zaiming