Re: Is there a way to add partition to a particular topic

2013-11-08 Thread Jun Rao
Delete topic doesn't work yet. We plan to fix it in trunk.

Thanks,

Jun


On Fri, Nov 8, 2013 at 6:30 PM, hsy...@gmail.com  wrote:

> It's in the branch, cool, I'll wait for its release. Actually, I find I can
> use ./kafka-delete-topic.sh and ./kafka-create-topic.sh with the same topic
> name and keep the broker running. It's interesting that delete topic doesn't
> actually remove the data from the brokers. So what I understand is as long
> as I deal with the error caught on the producer and consumer side, I can
> use this method instead of that add-topic script, am I correct? Do you have
> any concerns if I add topics this way?
>
>
> On Fri, Nov 8, 2013 at 6:00 PM, Guozhang Wang  wrote:
>
> > Hello,
> >
> > Please check the add-partition tool:
> >
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/Replication+tools#Replicationtools-5.AddPartitionTool
> >
> > Guozhang
> >
> >
> > On Fri, Nov 8, 2013 at 5:32 PM, hsy...@gmail.com 
> wrote:
> >
> > > Hi guys, since Kafka is able to add a new broker into the cluster at
> > > runtime, I'm wondering: is there a way to add a new partition for a
> > > specific topic at run time?  If not, what would you do if you want to
> > > add more partitions to a topic? Thanks!
> > >
> >
> >
> >
> > --
> > -- Guozhang
> >
>


Re: Is there a way to add partition to a particular topic

2013-11-08 Thread hsy...@gmail.com
I mean, I assume the messages not yet consumed before delete-topic will be
delivered before you create the same topic again, correct?


On Fri, Nov 8, 2013 at 6:30 PM, hsy...@gmail.com  wrote:

> It's in the branch, cool, I'll wait for its release. Actually, I find I
> can use ./kafka-delete-topic.sh and ./kafka-create-topic.sh with the same
> topic name and keep the broker running. It's interesting that delete topic
> doesn't actually remove the data from the brokers. So what I understand is
> as long as I deal with the error caught on the producer and consumer side,
> I can use this method instead of that add-topic script, am I correct? Do
> you have any concerns if I add topics this way?
>
>
> On Fri, Nov 8, 2013 at 6:00 PM, Guozhang Wang  wrote:
>
>> Hello,
>>
>> Please check the add-partition tool:
>>
>>
>> https://cwiki.apache.org/confluence/display/KAFKA/Replication+tools#Replicationtools-5.AddPartitionTool
>>
>> Guozhang
>>
>>
>> On Fri, Nov 8, 2013 at 5:32 PM, hsy...@gmail.com 
>> wrote:
>>
>> > Hi guys, since Kafka is able to add a new broker into the cluster at
>> > runtime, I'm wondering: is there a way to add a new partition for a
>> > specific topic at run time?  If not, what would you do if you want to
>> > add more partitions to a topic? Thanks!
>> >
>>
>>
>>
>> --
>> -- Guozhang
>>
>
>


Re: Is there a way to add partition to a particular topic

2013-11-08 Thread hsy...@gmail.com
It's in the branch, cool, I'll wait for its release. Actually, I find I can
use ./kafka-delete-topic.sh and ./kafka-create-topic.sh with the same topic
name and keep the broker running. It's interesting that delete topic doesn't
actually remove the data from the brokers. So what I understand is as long
as I deal with the error caught on the producer and consumer side, I can
use this method instead of that add-topic script, am I correct? Do you have
any concerns if I add topics this way?

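For reference, the delete-and-recreate sequence described above would look
roughly like the following with the 0.8 scripts (topic name, partition count,
and ZooKeeper address are placeholders, and exact flags may differ by build):

  # remove the topic's metadata from ZooKeeper (the data files stay on disk)
  bin/kafka-delete-topic.sh --topic my-topic --zookeeper localhost:2181
  # recreate the same topic, this time with more partitions
  bin/kafka-create-topic.sh --topic my-topic --partition 4 --replica 1 \
      --zookeeper localhost:2181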

On Fri, Nov 8, 2013 at 6:00 PM, Guozhang Wang  wrote:

> Hello,
>
> Please check the add-partition tool:
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/Replication+tools#Replicationtools-5.AddPartitionTool
>
> Guozhang
>
>
> On Fri, Nov 8, 2013 at 5:32 PM, hsy...@gmail.com  wrote:
>
> > Hi guys, since Kafka is able to add a new broker into the cluster at
> > runtime, I'm wondering: is there a way to add a new partition for a
> > specific topic at run time?  If not, what would you do if you want to
> > add more partitions to a topic? Thanks!
> >
>
>
>
> --
> -- Guozhang
>


Re: Is there a way to add partition to a particular topic

2013-11-08 Thread Guozhang Wang
Hello,

Please check the add-partition tool:

https://cwiki.apache.org/confluence/display/KAFKA/Replication+tools#Replicationtools-5.AddPartitionTool

Guozhang

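For reference, the tool described on that page ships as a script in 0.8. An
illustrative invocation (topic and ZooKeeper address are placeholders, and
exact flags may differ by build) that adds two partitions to a topic:

  bin/kafka-add-partitions.sh --topic my-topic --partition 2 \
      --zookeeper localhost:2181

Note that only newly produced data is spread over the added partitions, and
key-to-partition assignments change, which matters for semantic partitioning.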

On Fri, Nov 8, 2013 at 5:32 PM, hsy...@gmail.com  wrote:

> Hi guys, since Kafka is able to add a new broker into the cluster at
> runtime, I'm wondering: is there a way to add a new partition for a
> specific topic at run time?  If not, what would you do if you want to
> add more partitions to a topic? Thanks!
>



-- 
-- Guozhang


Is there a way to add partition to a particular topic

2013-11-08 Thread hsy...@gmail.com
Hi guys, since Kafka is able to add a new broker into the cluster at
runtime, I'm wondering: is there a way to add a new partition for a
specific topic at run time?  If not, what would you do if you want to
add more partitions to a topic? Thanks!


Re: Kafka client dies after rebalancing attempt

2013-11-08 Thread Joel Koshy
Actually from your original mail you do seem to have logs (somewhere -
either in a file or stdout).  Do you see zookeeper session expirations
in there prior to the rebalances?

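A quick way to check, assuming the logs land in the files named in the log4j
config quoted further down (paths here are illustrative):

  grep -in "expired" /kafka-log4j/server.log /path/to/consumer.log

Session expirations typically show up in the log shortly before the
rebalance attempts.
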
On Fri, Nov 08, 2013 at 04:11:15PM -0500, Ahmed H. wrote:
> Thanks for the input. Yes that directory is open for all users (rwx).
> 
> I don't think that the lack of logging is related to my consumer dying, but
> it doesn't help when trying to debug when I have no logs.
> 
> I am struggling to find a reason behind this. I deployed the same code, and
> same version of Kafka/Zookeeper locally and I am unable to reproduce it.
> Granted, my local setup does have a few different components, but it's a
> start.
> 
> Any other ideas on what to look for?
> 
> Thanks again for your help
> 
> 
> On Fri, Nov 8, 2013 at 4:00 PM, Joel Koshy  wrote:
> 
> > Do you have write permissions in /kafka-log4j? Your logs should be
> > going there (at least per your log4j config) - and you may want to use
> > a different log4j config for your consumer so it doesn't collide with
> > the broker's.
> >
> > I doubt the consumer thread dying issue is related to yours - again,
> > logs would help.
> >
> > Also, you may want to try with the latest HEAD as opposed to the beta.
> >
> > Thanks,
> >
> > Joel
> >
> > On Fri, Nov 08, 2013 at 01:18:07PM -0500, Ahmed H. wrote:
> > > Hello,
> > >
> > > I am using the beta right now.
> > >
> > > I'm not sure if it's GC or something else at this point. To be honest
> > I've
> > > never really fiddled with any GC settings before. The system can run for
> > as
> > > long as a day without failing, or as little as a few hours. The lack of
> > > pattern makes it a little harder to debug. As I mentioned before, the
> > > activity on this system is fairly consistent throughout the day.
> > >
> > > On the link that you sent, I see this, which could very well be the
> > reason:
> > >
> > >- One of the typical causes is that the application code that consumes
> > >messages somehow died and therefore killed the consumer thread. We
> > >recommend using a try/catch clause to log all Throwable in the
> > consumer
> > >logic.
> > >
> > > That is entirely possible. I wanted to check the kafka logs for any clues
> > > but for some reason, kafka is not writing any logs :/. Here is my log4j
> > > settings for kafka:
> > >
> > > log4j.rootLogger=INFO, stdout
> > > > log4j.appender.stdout=org.apache.log4j.ConsoleAppender
> > > > log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
> > > > log4j.appender.stdout.layout.ConversionPattern=[%d] %p %m (%c)%n
> > > > log4j.appender.kafkaAppender=org.apache.log4j.DailyRollingFileAppender
> > > > log4j.appender.kafkaAppender.DatePattern='.'yyyy-MM-dd-HH
> > > > log4j.appender.kafkaAppender.File=/kafka-log4j/server.log
> > > > log4j.appender.kafkaAppender.layout=org.apache.log4j.PatternLayout
> > > > log4j.appender.kafkaAppender.layout.ConversionPattern=[%d] %p %m (%c)%n
> > > >
> > > >
> > log4j.appender.stateChangeAppender=org.apache.log4j.DailyRollingFileAppender
> > > > log4j.appender.stateChangeAppender.DatePattern='.'yyyy-MM-dd-HH
> > > > log4j.appender.stateChangeAppender.File=/kafka-log4j/state-change.log
> > > >
> > log4j.appender.stateChangeAppender.layout=org.apache.log4j.PatternLayout
> > > > log4j.appender.stateChangeAppender.layout.ConversionPattern=[%d] %p %m
> > > > (%c)%n
> > > >
> > log4j.appender.requestAppender=org.apache.log4j.DailyRollingFileAppender
> > > > log4j.appender.requestAppender.DatePattern='.'yyyy-MM-dd-HH
> > > > log4j.appender.requestAppender.File=/kafka-log4j/kafka-request.log
> > > > log4j.appender.requestAppender.layout=org.apache.log4j.PatternLayout
> > > > log4j.appender.requestAppender.layout.ConversionPattern=[%d] %p %m
> > (%c)%n
> > > >
> > log4j.appender.controllerAppender=org.apache.log4j.DailyRollingFileAppender
> > > > log4j.appender.controllerAppender.DatePattern='.'yyyy-MM-dd-HH
> > > > log4j.appender.controllerAppender.File=/kafka-log4j/controller.log
> > > > log4j.appender.controllerAppender.layout=org.apache.log4j.PatternLayout
> > > > log4j.appender.controllerAppender.layout.ConversionPattern=[%d] %p %m
> > > > (%c)%n
> > > > log4j.logger.kafka=INFO, kafkaAppender
> > > > log4j.logger.kafka.network.RequestChannel$=TRACE, requestAppender
> > > > log4j.additivity.kafka.network.RequestChannel$=false
> > > > log4j.logger.kafka.request.logger=TRACE, requestAppender
> > > > log4j.additivity.kafka.request.logger=false
> > > > log4j.logger.kafka.controller=TRACE, controllerAppender
> > > > log4j.additivity.kafka.controller=false
> > > > log4j.logger.state.change.logger=TRACE, stateChangeAppender
> > > > log4j.additivity.state.change.logger=false
> > >
> > >
> > >
> > > Thanks
> > >
> > >
> > > On Thu, Nov 7, 2013 at 5:06 PM, Joel Koshy  wrote:
> > >
> > > > Can you see if this applies in your case:
> > > >
> > > >
> > https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Whyaretheremanyrebalancesinmyconsumerlog%3F
> > > >
> > > > Also, what version of kafka 0.8 are you using? If not the beta, then
> > > > what's the git hash?
> > > >
> > > > Joel

RE: kafka-reassign-partitions.sh --status-check-json-file not working

2013-11-08 Thread Joseph Lawson
So I think this was a copy-paste error where the quote symbol (") was being
pasted as something different, which messed up the JSON file and caused the
assignment issues.  After trying to replicate this issue, I was unsuccessful
unless I had the bad character.

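A quick way to catch this kind of paste artifact, assuming GNU grep is
available, is to scan the JSON file for non-ASCII bytes such as "smart
quotes" pasted in place of plain ones:

  grep -nP '[^\x00-\x7F]' topics-to-move.json

Any line it prints contains a non-ASCII character worth inspecting.
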
From: Joseph Lawson 
Sent: Tuesday, November 05, 2013 8:32 AM
To: users@kafka.apache.org
Subject: Re: kafka-reassign-partitions.sh --status-check-json-file not working

Guozhang,

I'll try to come up with some steps to replicate today.  I didn't notice
anything obvious in the logs.

Joe

Sent from my Droid Charge on Verizon 4G LTE

Guozhang Wang wrote:
Hello Joe,

Do you see any exceptions in the controller or state-change logs?

Where are these two topics originally located?

Guozhang


On Mon, Nov 4, 2013 at 9:47 AM, Joseph Lawson  wrote:

> Hi everyone,
>
> I'm using Kafka 0.8.0 from the git repository.
>
> I'm trying the following commands:
>
> bin/kafka-reassign-partitions.sh --topics-to-move-json-file
> topics-to-move.json --zookeeper
> zk-qa.us-e.roomkey.net:2181/kafka_stage/kafka_qa --broker-list "1"
> --execute
>
> where topics-to-move.json is:
>
> {"topics":
>  [{"topic": "topic1"},{"topic": "topic2"}],
>  "version":1
> }
>
> When I run:
>
> bin/kafka-reassign-partitions.sh --status-check-json-file
> topics-to-move.json --zookeeper
> zk-qa.us-e.roomkey.net:2181/kafka_stage/kafka_qa
>
> All I see is:
>
> Status of partition reassignment:
>
> I expect to see some status but I cannot see anything.
>
> When I run the reassign command again I get the following error implying
> that the reassignment is taking place:
>
> Partitions reassignment failed due to Partition reassignment currently in
> progress for Map(). Aborting operation
> kafka.common.AdminCommandFailedException: Partition reassignment currently
> in progress for Map(). Aborting operation
> at
> kafka.admin.ReassignPartitionsCommand.reassignPartitions(ReassignPartitionsCommand.scala:195)
> at
> kafka.admin.ReassignPartitionsCommand$.main(ReassignPartitionsCommand.scala:137)
> at
> kafka.admin.ReassignPartitionsCommand.main(ReassignPartitionsCommand.scala)
>
>
> When I look at the broker 1, I see nothing indicating that anything is
> happening.  I only have a replication factor of 1 on this topic.
>
> Is this a bug or am I doing this wrong?
>
> Thanks!
>
> Joe Lawson
>



--
-- Guozhang

Re: Getting my Kafka code example working out of the box

2013-11-08 Thread Chris Bedford
I had some trouble with Maven dependencies when I tried to get a simple
round-trip test going.  I worked past those and made my test available
here: https://github.com/buildlackey/cep/tree/master/kafka

It should run out of the box.

-cb

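For anyone following along, the core of that 0.8 producer example boils down
to roughly the sketch below (broker address, topic, and message text are
placeholders). If nothing reaches the console consumer, metadata.broker.list
and the topic name are the first things to check:

  import java.util.Properties;
  import kafka.javaapi.producer.Producer;
  import kafka.producer.KeyedMessage;
  import kafka.producer.ProducerConfig;

  public class ProducerSmokeTest {
      public static void main(String[] args) {
          Properties props = new Properties();
          // must point at a live 0.8 broker, not at ZooKeeper
          props.put("metadata.broker.list", "localhost:9092");
          props.put("serializer.class", "kafka.serializer.StringEncoder");
          props.put("request.required.acks", "1");
          Producer<String, String> producer =
              new Producer<String, String>(new ProducerConfig(props));
          // same topic the console consumer is subscribed to
          producer.send(new KeyedMessage<String, String>(
              "page_visits", "hello from the example"));
          producer.close();
      }
  }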

On Thu, Nov 7, 2013 at 5:35 PM, S L  wrote:

> Hi,
>
> This might be a really, really simple question but how do I get my test
> Kafka program working out of the box?  I followed the directions from
> http://kafka.apache.org/documentation.html#quickstart.  I started zk, the
> server, the producer and consumer.  I played with the producer, sending
> msgs to the consumer, which I saw show up in its terminal.  This was to the
> "page_visits" topic, as specified in the example program.
>
> I then went to
> https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+Producer+Example
> and copied the files into NetBeans and built it.  I also added the jar files
> kafka-assembly-0.8.0-deps.jar and kafka_2.8.0-0.8.0-beta1.jar to make the
> build work, if that makes a difference.
>
> Anyway, I then ran java -jar myjar.jar.  I expected to see a message show
> up in the consumer, but nothing shows up.  I sent another msg from the
> producer just to make sure the consumer was still connected, which it was
> b/c the consumer received it.
>
> I'm at wits' end.  I know this is an extremely simple question and
> everything looks identical to the links above and should be working.
>  However, I'm not getting a message in the consumer.  Much thanks for any
> help you can give me.  I'm about to ram my head into a wall.
>



-- 
Chris Bedford

Founder & Lead Lackey
Build Lackey Labs:  http://buildlackey.com
Go Grails!: http://blog.buildlackey.com


Re: Kafka client dies after rebalancing attempt

2013-11-08 Thread Ahmed H.
Thanks for the input. Yes that directory is open for all users (rwx).

I don't think that the lack of logging is related to my consumer dying, but
it doesn't help when trying to debug when I have no logs.

I am struggling to find a reason behind this. I deployed the same code, and
same version of Kafka/Zookeeper locally and I am unable to reproduce it.
Granted, my local setup does have a few different components, but it's a
start.

Any other ideas on what to look for?

Thanks again for your help


On Fri, Nov 8, 2013 at 4:00 PM, Joel Koshy  wrote:

> Do you have write permissions in /kafka-log4j? Your logs should be
> going there (at least per your log4j config) - and you may want to use
> a different log4j config for your consumer so it doesn't collide with
> the broker's.
>
> I doubt the consumer thread dying issue is related to yours - again,
> logs would help.
>
> Also, you may want to try with the latest HEAD as opposed to the beta.
>
> Thanks,
>
> Joel
>
> On Fri, Nov 08, 2013 at 01:18:07PM -0500, Ahmed H. wrote:
> > Hello,
> >
> > I am using the beta right now.
> >
> > I'm not sure if it's GC or something else at this point. To be honest
> I've
> > never really fiddled with any GC settings before. The system can run for
> as
> > long as a day without failing, or as little as a few hours. The lack of
> > pattern makes it a little harder to debug. As I mentioned before, the
> > activity on this system is fairly consistent throughout the day.
> >
> > On the link that you sent, I see this, which could very well be the
> reason:
> >
> >- One of the typical causes is that the application code that consumes
> >messages somehow died and therefore killed the consumer thread. We
> >recommend using a try/catch clause to log all Throwable in the
> consumer
> >logic.
> >
> > That is entirely possible. I wanted to check the kafka logs for any clues
> > but for some reason, kafka is not writing any logs :/. Here is my log4j
> > settings for kafka:
> >
> > log4j.rootLogger=INFO, stdout
> > > log4j.appender.stdout=org.apache.log4j.ConsoleAppender
> > > log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
> > > log4j.appender.stdout.layout.ConversionPattern=[%d] %p %m (%c)%n
> > > log4j.appender.kafkaAppender=org.apache.log4j.DailyRollingFileAppender
> > > log4j.appender.kafkaAppender.DatePattern='.'yyyy-MM-dd-HH
> > > log4j.appender.kafkaAppender.File=/kafka-log4j/server.log
> > > log4j.appender.kafkaAppender.layout=org.apache.log4j.PatternLayout
> > > log4j.appender.kafkaAppender.layout.ConversionPattern=[%d] %p %m (%c)%n
> > >
> > >
> log4j.appender.stateChangeAppender=org.apache.log4j.DailyRollingFileAppender
> > > log4j.appender.stateChangeAppender.DatePattern='.'yyyy-MM-dd-HH
> > > log4j.appender.stateChangeAppender.File=/kafka-log4j/state-change.log
> > >
> log4j.appender.stateChangeAppender.layout=org.apache.log4j.PatternLayout
> > > log4j.appender.stateChangeAppender.layout.ConversionPattern=[%d] %p %m
> > > (%c)%n
> > >
> log4j.appender.requestAppender=org.apache.log4j.DailyRollingFileAppender
> > > log4j.appender.requestAppender.DatePattern='.'yyyy-MM-dd-HH
> > > log4j.appender.requestAppender.File=/kafka-log4j/kafka-request.log
> > > log4j.appender.requestAppender.layout=org.apache.log4j.PatternLayout
> > > log4j.appender.requestAppender.layout.ConversionPattern=[%d] %p %m
> (%c)%n
> > >
> log4j.appender.controllerAppender=org.apache.log4j.DailyRollingFileAppender
> > > log4j.appender.controllerAppender.DatePattern='.'yyyy-MM-dd-HH
> > > log4j.appender.controllerAppender.File=/kafka-log4j/controller.log
> > > log4j.appender.controllerAppender.layout=org.apache.log4j.PatternLayout
> > > log4j.appender.controllerAppender.layout.ConversionPattern=[%d] %p %m
> > > (%c)%n
> > > log4j.logger.kafka=INFO, kafkaAppender
> > > log4j.logger.kafka.network.RequestChannel$=TRACE, requestAppender
> > > log4j.additivity.kafka.network.RequestChannel$=false
> > > log4j.logger.kafka.request.logger=TRACE, requestAppender
> > > log4j.additivity.kafka.request.logger=false
> > > log4j.logger.kafka.controller=TRACE, controllerAppender
> > > log4j.additivity.kafka.controller=false
> > > log4j.logger.state.change.logger=TRACE, stateChangeAppender
> > > log4j.additivity.state.change.logger=false
> >
> >
> >
> > Thanks
> >
> >
> > On Thu, Nov 7, 2013 at 5:06 PM, Joel Koshy  wrote:
> >
> > > Can you see if this applies in your case:
> > >
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Whyaretheremanyrebalancesinmyconsumerlog%3F
> > >
> > > Also, what version of kafka 0.8 are you using? If not the beta, then
> > > what's the git hash?
> > >
> > > Joel
> > >
> > > On Thu, Nov 07, 2013 at 02:51:41PM -0500, Ahmed H. wrote:
> > > > Hello all,
> > > >
> > > > I am not sure if this is a Kafka issue, or an issue with the client
> that
> > > I
> > > > am using.
> > > >
> > > > We have a fairly small setup, where everything sits on one server
> (Kafka
> > > > 0.8, and Zookeeper). The message frequency is not too high (1-2 per
> > > > second).

Re: Purgatory

2013-11-08 Thread Joel Koshy
Marc - thanks again for doing this.  Couple of suggestions:

- I would suggest removing the disclaimer and email quotes since this
  can become a stand-alone clean document on what the purgatory is and
  how it works.
- A diagram would be helpful - it could, say, show the watcher map and
  the expiration queue, and it will be especially useful if it can
  show the flow of producer/fetch requests through the purgatory. That
  would also help cut down a lot of the text in the doc.
- I think it would be preferable to have just high-level details in
  this document.  Internal details (such as the purge interval
  settings) can either be removed or moved (to say, a short faq or
  config section at the end).
- In the overview may want to comment on why we added it: i.e., it is
  the primary data structure we use for supporting long poll of
  producer/fetch requests. E.g., if we don't do this consumers would
  have to keep issuing fetch requests if there's no data yet - as
  opposed to just saying "respond when 'n' bytes of data are available
  or when 't' millisecs have elapsed, whichever is earlier."
- WRT your question on PurgatorySize - we added that just to keep a
  tab on how many requests are sitting in purgatory (including both
  watchers map and expiration queue) as a rough gauge of memory usage.
  Also the fetch/producer request gauges should not collide - the
  KafkaMetricsGroup class takes care of this. The CSV reporter might
  run into issues though - I thought we had fixed that but could be
  wrong.

Joel

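For a concrete sense of that last point, these are the consumer-side knobs
behind the long poll (names per the 0.8 consumer config docs; the values
here are only illustrative):

  # the broker parks the fetch request in purgatory until at least this
  # many bytes are available to return...
  fetch.min.bytes=1024
  # ...or until this much time has elapsed, whichever comes first
  fetch.wait.max.ms=500
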
On Thu, Nov 07, 2013 at 11:01:06PM -0800, Joel Koshy wrote:
> Excellent - thanks for putting that together! Will review it more
> carefully tomorrow and suggest some minor edits if required.
> 
> On Thu, Nov 07, 2013 at 10:45:40PM -0500, Marc Labbe wrote:
> > I've just added a page for purgatory, feel free to comment/modify at will.
> > I hope I didn't misinterpret too much of the code.
> > 
> > https://cwiki.apache.org/confluence/display/KAFKA/Request+Purgatory+(0.8)
> > 
> > I added a few questions of my own.
> > 
> > 
> > On Fri, Nov 1, 2013 at 9:43 PM, Joe Stein  wrote:
> > 
> > > To edit the Wiki you need to send an ICLA
> > > http://www.apache.org/licenses/#clas to Apache and then once that is done
> > > an email to priv...@kafka.apache.org (or to me and I will copy private)
> > > with your Wiki username and that you sent the ICLA to Apache.
> > >
> > > Then, I can add you to edit the Wiki.
> > >
> > > /***
> > >  Joe Stein
> > >  Founder, Principal Consultant
> > >  Big Data Open Source Security LLC
> > >  http://www.stealth.ly
> > >  Twitter: @allthingshadoop 
> > > /
> > >
> > >
> > > On Fri, Nov 1, 2013 at 9:08 PM, Marc Labbe  wrote:
> > >
> > > > Hi Joel,
> > > >
> > > > I used to have edit access to the wiki; I made a few additions to it a
> > > > while ago but it seems I don't have it anymore. It might have been lost
> > > > in the Confluence update. I would be glad to add what I have written if
> > > > I get it back. Otherwise, feel free to paste my words in one of the
> > > > pages; I don't intend on asking for copyrights for this :).
> > > >
> > > > marc
> > > >
> > > >
> > > > On Fri, Nov 1, 2013 at 4:32 PM, Joel Koshy  wrote:
> > > >
> > > > > Marc, thanks for writing that up. I think it is worth adding some
> > > > > details on the request-purgatory on a wiki (Jay had started a wiki
> > > > > page for kafka internals [1] a while ago, but we have not had time to
> > > > > add much to it since.) Your write-up could be reviewed and added
> > > > > there. Do you have edit permissions on the wiki?
> > > > >
> > > > > As for the purge interval config - yes the documentation can be
> > > > > improved a bit. It's one of those "internal" configs that generally
> > > > > don't need to be modified by users. The reason we added that was as
> > > > > follows:
> > > > > - We found that for low-volume topics, replica fetch requests were
> > > > > getting expired but sitting around in purgatory
> > > > > - This was because we were expiring them from the delay queue (used to
> > > > > track when requests should expire), but they were still sitting in the
> > > > > watcherFor map - i.e., they would get purged when the next producer
> > > > > request to that topic/partition arrived, but for low volume topics
> > > > > this could be a long time (or never in the worst case) and we would
> > > > > eventually run into an OOME.
> > > > > - So we needed to periodically go through the entire watcherFor map
> > > > > and explicitly remove those requests that had expired.
> > > > > - More details on this are in KAFKA-664.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Joel
> > > > >
> > > > > [1] https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Internals
> > > > >
> > > > > On Fri, Nov 1, 2013 at 12:33 PM, Marc Labbe  wrote:
> > > > > > Guozhang,
> > > > >

Re: Kafka client dies after rebalancing attempt

2013-11-08 Thread Joel Koshy
Do you have write permissions in /kafka-log4j? Your logs should be
going there (at least per your log4j config) - and you may want to use
a different log4j config for your consumer so it doesn't collide with
the broker's.

I doubt the consumer thread dying issue is related to yours - again,
logs would help.

Also, you may want to try with the latest HEAD as opposed to the beta.

Thanks,

Joel

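A minimal way to apply the second suggestion - giving the consumer JVM its
own log4j config so it stops colliding with the broker's files - is log4j's
standard override property at startup (the path and class name below are
placeholders):

  java -Dlog4j.configuration=file:/path/to/consumer-log4j.properties \
       -cp "$CLASSPATH" com.example.MyConsumer
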
On Fri, Nov 08, 2013 at 01:18:07PM -0500, Ahmed H. wrote:
> Hello,
> 
> I am using the beta right now.
> 
> I'm not sure if it's GC or something else at this point. To be honest I've
> never really fiddled with any GC settings before. The system can run for as
> long as a day without failing, or as little as a few hours. The lack of
> pattern makes it a little harder to debug. As I mentioned before, the
> activity on this system is fairly consistent throughout the day.
> 
> On the link that you sent, I see this, which could very well be the reason:
> 
>- One of the typical causes is that the application code that consumes
>messages somehow died and therefore killed the consumer thread. We
>recommend using a try/catch clause to log all Throwable in the consumer
>logic.
> 
> That is entirely possible. I wanted to check the kafka logs for any clues
> but for some reason, kafka is not writing any logs :/. Here is my log4j
> settings for kafka:
> 
> log4j.rootLogger=INFO, stdout
> > log4j.appender.stdout=org.apache.log4j.ConsoleAppender
> > log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
> > log4j.appender.stdout.layout.ConversionPattern=[%d] %p %m (%c)%n
> > log4j.appender.kafkaAppender=org.apache.log4j.DailyRollingFileAppender
> > log4j.appender.kafkaAppender.DatePattern='.'yyyy-MM-dd-HH
> > log4j.appender.kafkaAppender.File=/kafka-log4j/server.log
> > log4j.appender.kafkaAppender.layout=org.apache.log4j.PatternLayout
> > log4j.appender.kafkaAppender.layout.ConversionPattern=[%d] %p %m (%c)%n
> >
> > log4j.appender.stateChangeAppender=org.apache.log4j.DailyRollingFileAppender
> > log4j.appender.stateChangeAppender.DatePattern='.'yyyy-MM-dd-HH
> > log4j.appender.stateChangeAppender.File=/kafka-log4j/state-change.log
> > log4j.appender.stateChangeAppender.layout=org.apache.log4j.PatternLayout
> > log4j.appender.stateChangeAppender.layout.ConversionPattern=[%d] %p %m
> > (%c)%n
> > log4j.appender.requestAppender=org.apache.log4j.DailyRollingFileAppender
> > log4j.appender.requestAppender.DatePattern='.'yyyy-MM-dd-HH
> > log4j.appender.requestAppender.File=/kafka-log4j/kafka-request.log
> > log4j.appender.requestAppender.layout=org.apache.log4j.PatternLayout
> > log4j.appender.requestAppender.layout.ConversionPattern=[%d] %p %m (%c)%n
> > log4j.appender.controllerAppender=org.apache.log4j.DailyRollingFileAppender
> > log4j.appender.controllerAppender.DatePattern='.'yyyy-MM-dd-HH
> > log4j.appender.controllerAppender.File=/kafka-log4j/controller.log
> > log4j.appender.controllerAppender.layout=org.apache.log4j.PatternLayout
> > log4j.appender.controllerAppender.layout.ConversionPattern=[%d] %p %m
> > (%c)%n
> > log4j.logger.kafka=INFO, kafkaAppender
> > log4j.logger.kafka.network.RequestChannel$=TRACE, requestAppender
> > log4j.additivity.kafka.network.RequestChannel$=false
> > log4j.logger.kafka.request.logger=TRACE, requestAppender
> > log4j.additivity.kafka.request.logger=false
> > log4j.logger.kafka.controller=TRACE, controllerAppender
> > log4j.additivity.kafka.controller=false
> > log4j.logger.state.change.logger=TRACE, stateChangeAppender
> > log4j.additivity.state.change.logger=false
> 
> 
> 
> Thanks
> 
> 
> On Thu, Nov 7, 2013 at 5:06 PM, Joel Koshy  wrote:
> 
> > Can you see if this applies in your case:
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Whyaretheremanyrebalancesinmyconsumerlog%3F
> >
> > Also, what version of kafka 0.8 are you using? If not the beta, then
> > what's the git hash?
> >
> > Joel
> >
> > On Thu, Nov 07, 2013 at 02:51:41PM -0500, Ahmed H. wrote:
> > > Hello all,
> > >
> > > I am not sure if this is a Kafka issue, or an issue with the client that
> > I
> > > am using.
> > >
> > > We have a fairly small setup, where everything sits on one server (Kafka
> > > 0.8, and Zookeeper). The message frequency is not too high (1-2 per
> > second).
> > >
> > > The setup works fine for a certain period of time but at some point, it
> > > just dies, and exceptions are thrown. This is pretty much a daily
> > > occurrence, but there is no pattern. Based on the logs, it appears that
> > the
> > > Kafka client tries to rebalance with Zookeeper and fails, it tries and
> > > tries multiple times but after a few tries it gives up. Here is the stack
> > > trace:
> > >
> > > 04:56:07,234 INFO  [kafka.consumer.SimpleConsumer]
> > > >
> > (ConsumerFetcherThread-kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5-0-0)
> > > > Reconnect due to socket error: :
> > > > java.nio.channels.ClosedByInterruptException
> > > >  at
> > > >
> > java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)

Re: Kafka Hadoop Consumer issues

2013-11-08 Thread Abhi Basu
The copy-jars.sh script did not copy the hadoop consumer and kafka jars. I
have copied them to HDFS manually, but am still getting the same error.

Looks like I have all the reqd jars now:

[root@idh251-0 test]# hadoop fs -ls /tmp/kafka/lib
Warning: $HADOOP_HOME is deprecated.

Found 7 items
-rw-r--r--   3 root hadoop  29614 2013-11-08 10:11
/tmp/kafka/lib/hadoop-consumer-0.8-SNAPSHOT.jar
-rw-r--r--   3 root hadoop2559966 2013-11-08 10:05
/tmp/kafka/lib/kafka_2.8.0-0.8-SNAPSHOT.jar
-rw-r--r--   3 root hadoop   4502 2013-11-08 10:03
/tmp/kafka/lib/metrics-annotation-3.0.0-c0c8be71.jar
-rw-r--r--   3 root hadoop  80316 2013-11-08 10:03
/tmp/kafka/lib/metrics-core-3.0.0-c0c8be71.jar
-rw-r--r--   3 root hadoop 307858 2013-11-08 10:03
/tmp/kafka/lib/piggybank.jar
-rw-r--r--   3 root hadoop8807888 2013-11-08 10:07
/tmp/kafka/lib/scala-library.jar
-rw-r--r--   3 root hadoop  98961 2013-11-08 10:03
/tmp/kafka/lib/zkclient-20120522.jar

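One sanity check worth doing here, since the exception is thrown before the
job ever reaches HDFS: confirm the class is actually inside the jar that
run-class.sh puts on the local classpath (the path below follows the
classpath echoed by the script and may differ):

  jar tf ./target/scala_2.8.0/hadoop-consumer-0.8-SNAPSHOT.jar | grep SimpleKafkaETLJob

If grep prints nothing, the jar on the local classpath doesn't contain the
class, which would explain the error.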

On Fri, Nov 8, 2013 at 9:17 AM, Abhi Basu <9000r...@gmail.com> wrote:

> Still get the same error:
> [root@idh251-0 hadoop-consumer]# ./run-class.sh
> kafka.etl.impl.SimpleKafkaETLJob test/test.properties
>
> :./../../core/target/scala_2.8.0/kafka-*.jar:./../../contrib/hadoop-consumer/lib_managed/scala_2.8.0/compile/*.jar:./../../contrib/hadoop-consumer/target/scala_2.8.0/*.jar:./../../contrib/hadoop-consumer/lib/piggybank.jar:./../../project/boot/scala-2.8.0/lib/scala-library.jar
> Exception in thread "main" java.lang.NoClassDefFoundError:
> kafka/etl/impl/SimpleKafkaETLJob
> Caused by: java.lang.ClassNotFoundException:
> kafka.etl.impl.SimpleKafkaETLJob
> at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>  at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>  at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>  at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
> Could not find the main class: kafka.etl.impl.SimpleKafkaETLJob.  Program
> will exit.
> [root@idh251-0 hadoop-consumer]# hadoop fs -ls /tmp/kafka/lib
> Warning: $HADOOP_HOME is deprecated.
>
> Here is what is in the HDFS /tmp/kafka/lib folder after running the
> copy-jars.sh:
>
> Found 4 items
> -rw-r--r--   3 root hadoop   4502 2013-11-08 09:04
> /tmp/kafka/lib/metrics-annotation-3.0.0-c0c8be71.jar
> -rw-r--r--   3 root hadoop  80316 2013-11-08 09:04
> /tmp/kafka/lib/metrics-core-3.0.0-c0c8be71.jar
> -rw-r--r--   3 root hadoop 307858 2013-11-08 09:04
> /tmp/kafka/lib/piggybank.jar
> -rw-r--r--   3 root hadoop  98961 2013-11-08 09:04
> /tmp/kafka/lib/zkclient-20120522.jar
>
>
> On Fri, Nov 8, 2013 at 9:06 AM, Abhi Basu <9000r...@gmail.com> wrote:
>
>> Ok, sorry, missed a step, ran the copy-jars.sh and now retrying ...
>>
>>
>> On Fri, Nov 8, 2013 at 8:55 AM, Abhi Basu <9000r...@gmail.com> wrote:
>>
>>> Hi Neha:
>>>
>>> I was following the directions outlined here -
>>> https://github.com/apache/kafka/tree/0.8/contrib/hadoop-consumer. It
>>> does not mention anything about registering jars. Can you please provide
>>> more details?
>>>
>>> Thanks,
>>>
>>> Abhi
>>>
>>>
>>> On Fri, Nov 8, 2013 at 8:48 AM, Neha Narkhede 
>>> wrote:
>>>
 ClassNotFound means the Hadoop job is not able to find the related jar.
 Have you made sure the related jars are registered in the distributed
 cache?


 On Fri, Nov 8, 2013 at 8:40 AM, Abhi Basu <9000r...@gmail.com> wrote:

 > Can anyone help me with this issue? I feel like I am very close and am
 > probably making some silly config error.
 >
 > Kafka team, please provide more detailed notes on how to make this
 > component work.
 >
 > Thanks.
 >
 >
 > On Fri, Nov 8, 2013 at 5:23 AM, Abhi Basu <9000r...@gmail.com> wrote:
 >
> > SimpleKafkaETLJob class, as mentioned in the post.
 > >
 > > Thanks
 > >
 > > Abhi
 > >
 > > From Samsung Galaxy S4
 > > On Nov 7, 2013 8:34 PM, "Jun Rao"  wrote:
 > >
 > >> Which class is not found?
 > >>
 > >> Thanks,
 > >>
 > >> Jun
 > >>
 > >>
 > >> On Thu, Nov 7, 2013 at 11:56 AM, Abhi Basu <9000r...@gmail.com>
 wrote:
 > >>
 > >> > Let me describe my environment. Working on two nodes currently:
 > >> > 1.Single-node hadoop cluster (will refer as Node1)
 > >> > 2.Single node Kafka cluster  (will refer as Node2)
 > >> >
 > >> > Node 2 has 1 broker started with a topic (iot.test.stream) and
 one
 > >> command
 > >> > line producer and one command line consumer to test the kafka
 install.
 > >> > Producer can send messages and the Consumer is receiving it.
 > >> >
 > >> > Node 1 (hadoop cluster) has kafka hadoop consumer code built.
 Have
 > >> edited
 > >> > the /kafka-0.8/contrib/hadoop-consumer/test/test.properties file
 with
 > >> the
 > >> > following:

Re: Kafka client dies after rebalancing attempt

2013-11-08 Thread Ahmed H.
Hello,

I am using the beta right now.

I'm not sure if it's GC or something else at this point. To be honest I've
never really fiddled with any GC settings before. The system can run for as
long as a day without failing, or as little as a few hours. The lack of
pattern makes it a little harder to debug. As I mentioned before, the
activity on this system is fairly consistent throughout the day.

On the link that you sent, I see this, which could very well be the reason:

   - One of the typical causes is that the application code that consumes
   messages somehow died and therefore killed the consumer thread. We
   recommend using a try/catch clause to log all Throwable in the consumer
   logic.

That is entirely possible. I wanted to check the kafka logs for any clues
but for some reason, kafka is not writing any logs :/. Here is my log4j
settings for kafka:

log4j.rootLogger=INFO, stdout
> log4j.appender.stdout=org.apache.log4j.ConsoleAppender
> log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
> log4j.appender.stdout.layout.ConversionPattern=[%d] %p %m (%c)%n
> log4j.appender.kafkaAppender=org.apache.log4j.DailyRollingFileAppender
> log4j.appender.kafkaAppender.DatePattern='.'yyyy-MM-dd-HH
> log4j.appender.kafkaAppender.File=/kafka-log4j/server.log
> log4j.appender.kafkaAppender.layout=org.apache.log4j.PatternLayout
> log4j.appender.kafkaAppender.layout.ConversionPattern=[%d] %p %m (%c)%n
>
> log4j.appender.stateChangeAppender=org.apache.log4j.DailyRollingFileAppender
> log4j.appender.stateChangeAppender.DatePattern='.'yyyy-MM-dd-HH
> log4j.appender.stateChangeAppender.File=/kafka-log4j/state-change.log
> log4j.appender.stateChangeAppender.layout=org.apache.log4j.PatternLayout
> log4j.appender.stateChangeAppender.layout.ConversionPattern=[%d] %p %m
> (%c)%n
> log4j.appender.requestAppender=org.apache.log4j.DailyRollingFileAppender
> log4j.appender.requestAppender.DatePattern='.'yyyy-MM-dd-HH
> log4j.appender.requestAppender.File=/kafka-log4j/kafka-request.log
> log4j.appender.requestAppender.layout=org.apache.log4j.PatternLayout
> log4j.appender.requestAppender.layout.ConversionPattern=[%d] %p %m (%c)%n
> log4j.appender.controllerAppender=org.apache.log4j.DailyRollingFileAppender
> log4j.appender.controllerAppender.DatePattern='.'yyyy-MM-dd-HH
> log4j.appender.controllerAppender.File=/kafka-log4j/controller.log
> log4j.appender.controllerAppender.layout=org.apache.log4j.PatternLayout
> log4j.appender.controllerAppender.layout.ConversionPattern=[%d] %p %m
> (%c)%n
> log4j.logger.kafka=INFO, kafkaAppender
> log4j.logger.kafka.network.RequestChannel$=TRACE, requestAppender
> log4j.additivity.kafka.network.RequestChannel$=false
> log4j.logger.kafka.request.logger=TRACE, requestAppender
> log4j.additivity.kafka.request.logger=false
> log4j.logger.kafka.controller=TRACE, controllerAppender
> log4j.additivity.kafka.controller=false
> log4j.logger.state.change.logger=TRACE, stateChangeAppender
> log4j.additivity.state.change.logger=false



Thanks

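A minimal sketch of the FAQ's try/catch recommendation, assuming the 0.8
high-level consumer API (process() and log below stand in for the
application logic and logger):

  import kafka.consumer.ConsumerIterator;
  import kafka.consumer.KafkaStream;
  import kafka.message.MessageAndMetadata;

  // One consumer thread: catching Throwable around the application logic
  // keeps a failure there from silently killing the thread (and thereby
  // triggering repeated rebalances).
  void run(KafkaStream<byte[], byte[]> stream) {
      ConsumerIterator<byte[], byte[]> it = stream.iterator();
      while (it.hasNext()) {
          MessageAndMetadata<byte[], byte[]> msg = it.next();
          try {
              process(msg.message());  // application logic (placeholder)
          } catch (Throwable t) {
              log.error("consumer logic threw, continuing", t);
          }
      }
  }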

On Thu, Nov 7, 2013 at 5:06 PM, Joel Koshy  wrote:

> Can you see if this applies in your case:
>
> https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Whyaretheremanyrebalancesinmyconsumerlog%3F
>
> Also, what version of kafka 0.8 are you using? If not the beta, then
> what's the git hash?
>
> Joel
>
> On Thu, Nov 07, 2013 at 02:51:41PM -0500, Ahmed H. wrote:
> > Hello all,
> >
> > I am not sure if this is a Kafka issue, or an issue with the client that
> I
> > am using.
> >
> > We have a fairly small setup, where everything sits on one server (Kafka
> > 0.8, and Zookeeper). The message frequency is not too high (1-2 per
> second).
> >
> > The setup works fine for a certain period of time but at some point, it
> > just dies, and exceptions are thrown. This is pretty much a daily
> > occurrence, but there is no pattern. Based on the logs, it appears that
> the
> > Kafka client tries to rebalance with Zookeeper and fails, it tries and
> > tries multiple times but after a few tries it gives up. Here is the stack
> > trace:
> >
> > 04:56:07,234 INFO  [kafka.consumer.SimpleConsumer]
> > >
> (ConsumerFetcherThread-kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5-0-0)
> > > Reconnect due to socket error: :
> > > java.nio.channels.ClosedByInterruptException
> > >  at
> > >
> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
> > > [rt.jar:1.7.0_25]
> > > at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:402)
> > > [rt.jar:1.7.0_25]
> > >  at
> > > sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:220)
> > > [rt.jar:1.7.0_25]
> > > at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
> > > [rt.jar:1.7.0_25]
> > >  at
> > >
> java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385)
> > > [rt.jar:1.7.0_25]
> > > at kafka.utils.Utils$.read(Utils.scala:394)
> > > [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > >  at
> > >
> kafka.network.

Re: Purgatory

2013-11-08 Thread Guozhang Wang
Thanks Marc! I will also go through it and suggest some edits today.

Guozhang


On Fri, Nov 8, 2013 at 7:50 AM, Marc Labbe  wrote:

> Thx for the feedback. It is true I never mentioned anything about the
> impact on users or the fact this is mostly internal business in Kafka. I
> will try to rephrase some of this.
>
> Marc
> On Nov 8, 2013 10:10 AM, "Yu, Libo"  wrote:
>
> > I read it and tried to understand it. It would be great to add a summary
> > at the beginning about what it is and how it may impact a user.
> >
> > Regards,
> >
> > Libo
> >
> >
> > -----Original Message-----
> > From: Joel Koshy [mailto:jjkosh...@gmail.com]
> > Sent: Friday, November 08, 2013 2:01 AM
> > To: users@kafka.apache.org
> > Subject: Re: Purgatory
> >
> > Excellent - thanks for putting that together! Will review it more
> > carefully tomorrow and suggest some minor edits if required.
> >
> > On Thu, Nov 07, 2013 at 10:45:40PM -0500, Marc Labbe wrote:
> > > I've just added a page for purgatory, feel free to comment/modify at
> > will.
> > > I hope I didn't misinterpret too much of the code.
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/Request+Purgatory+(0
> > > .8)
> > >
> > > I added a few questions of my own.
> > >
> > >
> > > On Fri, Nov 1, 2013 at 9:43 PM, Joe Stein 
> wrote:
> > >
> > > > To edit the Wiki you need to send an ICLA
> > > > http://www.apache.org/licenses/#clas to Apache and then once that is
> > > > done an email to priv...@kafka.apache.org (or to me and I will copy
> > > > private) with your Wiki username and that you sent the ICLA to
> Apache.
> > > >
> > > > Then, I can add you to edit the Wiki.
> > > >
> > > > /***
> > > >  Joe Stein
> > > >  Founder, Principal Consultant
> > > >  Big Data Open Source Security LLC
> > > >  http://www.stealth.ly
> > > >  Twitter: @allthingshadoop 
> > > > /
> > > >
> > > >
> > > > On Fri, Nov 1, 2013 at 9:08 PM, Marc Labbe 
> wrote:
> > > >
> > > > > Hi Joel,
> > > > >
> > > > > I used to have edit access to the wiki; I made a few additions to it
> > > > > a while ago but it seems I don't have it anymore. It might have been
> > > > > lost in the Confluence update. I would be glad to add what I have
> > > > > written if I get it back. Otherwise, feel free to paste my words in
> > > > > one of the pages; I don't intend on asking for copyrights for this :).
> > > > >
> > > > > marc
> > > > >
> > > > >
> > > > > On Fri, Nov 1, 2013 at 4:32 PM, Joel Koshy 
> > wrote:
> > > > >
> > > > > > Marc, thanks for writing that up. I think it is worth adding
> > > > > > some details on the request-purgatory on a wiki (Jay had started
> > > > > > a wiki page for kafka internals [1] a while ago, but we have not
> > > > > > had time to add much to it since.) Your write-up could be
> > > > > > reviewed and added there. Do you have edit permissions on the
> wiki?
> > > > > >
> > > > > > As for the purge interval config - yes the documentation can be
> > > > > > improved a bit. It's one of those "internal" configs that
> > > > > > generally don't need to be modified by users. The reason we
> > > > > > added that was as
> > > > > > follows:
> > > > > > - We found that for low-volume topics, replica fetch requests
> > > > > > were getting expired but sitting around in purgatory
> > > > > > - This was because we were expiring them from the delay queue
> > > > > > (used to track when requests should expire), but they were still
> > > > > > sitting in the watcherFor map - i.e., they would get purged when
> > > > > > the next producer request to that topic/partition arrived, but
> > > > > > for low volume topics this could be a long time (or never in the
> > > > > > worst case) and we would eventually run into an OOME.
> > > > > > - So we needed to periodically go through the entire watcherFor
> > > > > > map and explicitly remove those requests that had expired.
> > > > > > - More details on this are in KAFKA-664.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Joel
> > > > > >
> > > > > > [1]
> > > > > > https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Internal
> > > > > > s
> > > > > >
> > > > > > On Fri, Nov 1, 2013 at 12:33 PM, Marc Labbe 
> > wrote:
> > > > > > > Guozhang,
> > > > > > >
> > > > > > > I have to agree with Priya the doc isn't very clear. Although
> > > > > > > the configuration is documented, it is simply rewording the
> > > > > > > name of the
> > > > > > config,
> > > > > > > which isn't particularly useful if you want more information
> > > > > > > about
> > > > what
> > > > > > the
> > > > > > > purgatory is. I searched the whole wiki and doc and could not
> > > > > > > find
> > > > > > anything
> > > > > > > very useful as opposed looking a the code. In this case,
> > > > > > > kafka.server.KafkaApis and kafka.server.RequestPurgatory will
> > > > > > > be your friends.
> > > > > > >
> > > > > > > I'll try to add to Joe's answer.

Re: Kafka Hadoop Consumer issues

2013-11-08 Thread Abhi Basu
Still get the same error:
[root@idh251-0 hadoop-consumer]# ./run-class.sh
kafka.etl.impl.SimpleKafkaETLJob test/test.properties
:./../../core/target/scala_2.8.0/kafka-*.jar:./../../contrib/hadoop-consumer/lib_managed/scala_2.8.0/compile/*.jar:./../../contrib/hadoop-consumer/target/scala_2.8.0/*.jar:./../../contrib/hadoop-consumer/lib/piggybank.jar:./../../project/boot/scala-2.8.0/lib/scala-library.jar
Exception in thread "main" java.lang.NoClassDefFoundError:
kafka/etl/impl/SimpleKafkaETLJob
Caused by: java.lang.ClassNotFoundException:
kafka.etl.impl.SimpleKafkaETLJob
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
Could not find the main class: kafka.etl.impl.SimpleKafkaETLJob.  Program
will exit.
[root@idh251-0 hadoop-consumer]# hadoop fs -ls /tmp/kafka/lib
Warning: $HADOOP_HOME is deprecated.

Here is what is in the HDFS /tmp/kafka/lib folder after running the
copy-jars.sh:

Found 4 items
-rw-r--r--   3 root hadoop   4502 2013-11-08 09:04
/tmp/kafka/lib/metrics-annotation-3.0.0-c0c8be71.jar
-rw-r--r--   3 root hadoop  80316 2013-11-08 09:04
/tmp/kafka/lib/metrics-core-3.0.0-c0c8be71.jar
-rw-r--r--   3 root hadoop 307858 2013-11-08 09:04
/tmp/kafka/lib/piggybank.jar
-rw-r--r--   3 root hadoop  98961 2013-11-08 09:04
/tmp/kafka/lib/zkclient-20120522.jar


On Fri, Nov 8, 2013 at 9:06 AM, Abhi Basu <9000r...@gmail.com> wrote:

> Ok, sorry, missed a step, ran the copy-jars.sh and now retrying ...
>
>
> On Fri, Nov 8, 2013 at 8:55 AM, Abhi Basu <9000r...@gmail.com> wrote:
>
>> Hi Neha:
>>
>> I was following the directions outlined here -
>> https://github.com/apache/kafka/tree/0.8/contrib/hadoop-consumer. It
>> does not mention anything about registering jars. Can you please provide
>> more details?
>>
>> Thanks,
>>
>> Abhi
>>
>>
>> On Fri, Nov 8, 2013 at 8:48 AM, Neha Narkhede wrote:
>>
>>> ClassNotFound means the Hadoop job is not able to find the related jar.
>>> Have you made sure the related jars are registered in the distributed
>>> cache?
>>>
>>>
>>> On Fri, Nov 8, 2013 at 8:40 AM, Abhi Basu <9000r...@gmail.com> wrote:
>>>
>>> > Can anyone help me with this issue? I feel like I am very close and am
>>> > probably making some silly config error.
>>> >
>>> > Kafka team, please provide more detailed notes on how to make this
>>> > component work.
>>> >
>>> > Thanks.
>>> >
>>> >
>>> > On Fri, Nov 8, 2013 at 5:23 AM, Abhi Basu <9000r...@gmail.com> wrote:
>>> >
>>> > > SimpleKafkaETLJob class, as mentioned in the post.
>>> > >
>>> > > Thanks
>>> > >
>>> > > Abhi
>>> > >
>>> > > From Samsung Galaxy S4
>>> > > On Nov 7, 2013 8:34 PM, "Jun Rao"  wrote:
>>> > >
>>> > >> Which class is not found?
>>> > >>
>>> > >> Thanks,
>>> > >>
>>> > >> Jun
>>> > >>
>>> > >>
>>> > >> On Thu, Nov 7, 2013 at 11:56 AM, Abhi Basu <9000r...@gmail.com>
>>> wrote:
>>> > >>
>>> > >> > Let me describe my environment. Working on two nodes currently:
>>> > >> > 1.Single-node hadoop cluster (will refer as Node1)
>>> > >> > 2.Single node Kafka cluster  (will refer as Node2)
>>> > >> >
>>> > >> > Node 2 has 1 broker started with a topic (iot.test.stream) and one
>>> > >> command
>>> > >> > line producer and one command line consumer to test the kafka
>>> install.
>>> > >> > Producer can send messages and the Consumer is receiving it.
>>> > >> >
>>> > >> > Node 1 (hadoop cluster) has kafka hadoop consumer code built. Have
>>> > >> edited
>>> > >> > the /kafka-0.8/contrib/hadoop-consumer/test/test.properties file
>>> with
>>> > >> the
>>> > >> > following:
>>> > >> >
>>> > >> > kafka.etl.topic=iot.test.stream
>>> > >> > hdfs.default.classpath.dir=/tmp/kafka/lib
>>> > >> > hadoop.job.ugi=kafka,hadoop
>>> > >> > kafka.server.uri=tcp://idh251-kafka:9095
>>> > >> > input=/tmp/kafka/data
>>> > >> > output=/tmp/kafka/output
>>> > >> > kafka.request.limit=-1
>>> > >> > ...
>>> > >> >
>>> > >> > I have copied the copy-jars.sh to /tmp/kafka/lib (on HDFS)
>>> > >> >
>>> > >> > Next I run the following on Node 1:
>>> > >> > ./run-class.sh kafka.etl.impl.SimpleKafkaETLJob
>>> test/test.properties
>>> > >> from
>>> > >> > the /kafka-0.8/contrib/hadoop-consumer folder and get a
>>> > >> > classnotfoundexception for kafka.etl.impl.SimpleKafkaETLJob class.
>>> > >> >
>>> > >> > What am I missing? I was thinking that running the sh file would
>>> allow
>>> > >> me
>>> > >> > to retrieve messages with the same topic name to HDFS from Node 2
>>> to
>>> > >> Node
>>> > >> > 1. I just want to do an end to end test to see that messages
>>> coming
>>> > into
>>> > >> > Kafka are being stored in HDFS with the minimal amount of code
>>> change
>>> > >> > required.
>>> > >> >
>>> > >> > Thanks,
>>> > >> >
>>> > >> > Abhi
>>> > >> >
>>>

Re: Kafka Hadoop Consumer issues

2013-11-08 Thread Abhi Basu
Ok, sorry, missed a step, ran the copy-jars.sh and now retrying ...


On Fri, Nov 8, 2013 at 8:55 AM, Abhi Basu <9000r...@gmail.com> wrote:

> Hi Neha:
>
> I was following the directions outlined here -
> https://github.com/apache/kafka/tree/0.8/contrib/hadoop-consumer. It does
> not mention anything about registering jars. Can you please provide more
> details?
>
> Thanks,
>
> Abhi
>
>
> On Fri, Nov 8, 2013 at 8:48 AM, Neha Narkhede wrote:
>
>> ClassNotFound means the Hadoop job is not able to find the related jar.
>> Have you made sure the related jars are registered in the distributed
>> cache?
>>
>>
>> On Fri, Nov 8, 2013 at 8:40 AM, Abhi Basu <9000r...@gmail.com> wrote:
>>
>> > Can anyone help me with this issue? I feel like I am very close and am
>> > probably making some silly config error.
>> >
>> > Kafka team, please provide more detailed notes on how to make this
>> > component work.
>> >
>> > Thanks.
>> >
>> >
>> > On Fri, Nov 8, 2013 at 5:23 AM, Abhi Basu <9000r...@gmail.com> wrote:
>> >
>> > > SimpleKafkaETLJob class, as mentioned in the post.
>> > >
>> > > Thanks
>> > >
>> > > Abhi
>> > >
>> > > From Samsung Galaxy S4
>> > > On Nov 7, 2013 8:34 PM, "Jun Rao"  wrote:
>> > >
>> > >> Which class is not found?
>> > >>
>> > >> Thanks,
>> > >>
>> > >> Jun
>> > >>
>> > >>
>> > >> On Thu, Nov 7, 2013 at 11:56 AM, Abhi Basu <9000r...@gmail.com>
>> wrote:
>> > >>
>> > >> > Let me describe my environment. Working on two nodes currently:
>> > >> > 1.Single-node hadoop cluster (will refer as Node1)
>> > >> > 2.Single node Kafka cluster  (will refer as Node2)
>> > >> >
>> > >> > Node 2 has 1 broker started with a topic (iot.test.stream) and one
>> > >> command
>> > >> > line producer and one command line consumer to test the kafka
>> install.
>> > >> > Producer can send messages and the Consumer is receiving it.
>> > >> >
>> > >> > Node 1 (hadoop cluster) has kafka hadoop consumer code built. Have
>> > >> edited
>> > >> > the /kafka-0.8/contrib/hadoop-consumer/test/test.properties file
>> with
>> > >> the
>> > >> > following:
>> > >> >
>> > >> > kafka.etl.topic=iot.test.stream
>> > >> > hdfs.default.classpath.dir=/tmp/kafka/lib
>> > >> > hadoop.job.ugi=kafka,hadoop
>> > >> > kafka.server.uri=tcp://idh251-kafka:9095
>> > >> > input=/tmp/kafka/data
>> > >> > output=/tmp/kafka/output
>> > >> > kafka.request.limit=-1
>> > >> > ...
>> > >> >
>> > >> > I have copied the copy-jars.sh to /tmp/kafka/lib (on HDFS)
>> > >> >
>> > >> > Next I run the following on Node 1:
>> > >> > ./run-class.sh kafka.etl.impl.SimpleKafkaETLJob
>> test/test.properties
>> > >> from
>> > >> > the /kafka-0.8/contrib/hadoop-consumer folder and get a
>> > >> > classnotfoundexception for kafka.etl.impl.SimpleKafkaETLJob class.
>> > >> >
>> > >> > What am I missing? I was thinking that running the sh file would
>> allow
>> > >> me
>> > >> > to retrieve messages with the same topic name to HDFS from Node 2
>> to
>> > >> Node
>> > >> > 1. I just want to do an end to end test to see that messages coming
>> > into
>> > >> > Kafka are being stored in HDFS with the minimal amount of code
>> change
>> > >> > required.
>> > >> >
>> > >> > Thanks,
>> > >> >
>> > >> > Abhi
>> > >> >
>> > >> > --
>> > >> > Abhi Basu
>> > >> >
>> > >>
>> > >
>> >
>> >
>> > --
>> > Abhi Basu
>> >
>>
>
>
>
> --
> Abhi Basu
>



-- 
Abhi Basu


Re: Kafka Hadoop Consumer issues

2013-11-08 Thread Abhi Basu
Hi Neha:

I was following the directions outlined here -
https://github.com/apache/kafka/tree/0.8/contrib/hadoop-consumer. It does
not mention anything about registering jars. Can you please provide more
details?

Thanks,

Abhi


On Fri, Nov 8, 2013 at 8:48 AM, Neha Narkhede wrote:

> ClassNotFound means the Hadoop job is not able to find the related jar.
> Have you made sure the related jars are registered in the distributed
> cache?
>
>
> On Fri, Nov 8, 2013 at 8:40 AM, Abhi Basu <9000r...@gmail.com> wrote:
>
> > Can anyone help me with this issue? I feel like I am very close and am
> > probably making some silly config error.
> >
> > Kafka team, please provide more detailed notes on how to make this
> > component work.
> >
> > Thanks.
> >
> >
> > On Fri, Nov 8, 2013 at 5:23 AM, Abhi Basu <9000r...@gmail.com> wrote:
> >
> > > SimpleKafkaETLJob class, as mentioned in the post.
> > >
> > > Thanks
> > >
> > > Abhi
> > >
> > > From Samsung Galaxy S4
> > > On Nov 7, 2013 8:34 PM, "Jun Rao"  wrote:
> > >
> > >> Which class is not found?
> > >>
> > >> Thanks,
> > >>
> > >> Jun
> > >>
> > >>
> > >> On Thu, Nov 7, 2013 at 11:56 AM, Abhi Basu <9000r...@gmail.com>
> wrote:
> > >>
> > >> > Let me describe my environment. Working on two nodes currently:
> > >> > 1.Single-node hadoop cluster (will refer as Node1)
> > >> > 2.Single node Kafka cluster  (will refer as Node2)
> > >> >
> > >> > Node 2 has 1 broker started with a topic (iot.test.stream) and one
> > >> command
> > >> > line producer and one command line consumer to test the kafka
> install.
> > >> > Producer can send messages and the Consumer is receiving it.
> > >> >
> > >> > Node 1 (hadoop cluster) has kafka hadoop consumer code built. Have
> > >> edited
> > >> > the /kafka-0.8/contrib/hadoop-consumer/test/test.properties file
> with
> > >> the
> > >> > following:
> > >> >
> > >> > kafka.etl.topic=iot.test.stream
> > >> > hdfs.default.classpath.dir=/tmp/kafka/lib
> > >> > hadoop.job.ugi=kafka,hadoop
> > >> > kafka.server.uri=tcp://idh251-kafka:9095
> > >> > input=/tmp/kafka/data
> > >> > output=/tmp/kafka/output
> > >> > kafka.request.limit=-1
> > >> > ...
> > >> >
> > >> > I have copied the copy-jars.sh to /tmp/kafka/lib (on HDFS)
> > >> >
> > >> > Next I run the following on Node 1:
> > >> > ./run-class.sh kafka.etl.impl.SimpleKafkaETLJob test/test.properties
> > >> from
> > >> > the /kafka-0.8/contrib/hadoop-consumer folder and get a
> > >> > classnotfoundexception for kafka.etl.impl.SimpleKafkaETLJob class.
> > >> >
> > >> > What am I missing? I was thinking that running the sh file would
> allow
> > >> me
> > >> > to retrieve messages with the same topic name to HDFS from Node 2 to
> > >> Node
> > >> > 1. I just want to do an end to end test to see that messages coming
> > into
> > >> > Kafka are being stored in HDFS with the minimal amount of code
> change
> > >> > required.
> > >> >
> > >> > Thanks,
> > >> >
> > >> > Abhi
> > >> >
> > >> > --
> > >> > Abhi Basu
> > >> >
> > >>
> > >
> >
> >
> > --
> > Abhi Basu
> >
>



-- 
Abhi Basu


Re: Kafka Hadoop Consumer issues

2013-11-08 Thread Neha Narkhede
ClassNotFound means the Hadoop job is not able to find the related jar.
Have you made sure the related jars are registered in the distributed cache?

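As I understand the contrib code, the registration is driven by the
hdfs.default.classpath.dir property in test.properties: jars found under
that HDFS directory are added to the job's classpath via the distributed
cache. So the fix should be along these lines (the jar names and the
/tmp/kafka/lib path are taken from this thread):

  hadoop fs -put hadoop-consumer-0.8-SNAPSHOT.jar /tmp/kafka/lib/
  hadoop fs -put kafka_2.8.0-0.8-SNAPSHOT.jar /tmp/kafka/lib/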

On Fri, Nov 8, 2013 at 8:40 AM, Abhi Basu <9000r...@gmail.com> wrote:

> Can anyone help me with this issue? I feel like I am very close and am
> probably making some silly config error.
>
> Kafka team, please provide more detailed notes on how to make this
> component work.
>
> Thanks.
>
>
> On Fri, Nov 8, 2013 at 5:23 AM, Abhi Basu <9000r...@gmail.com> wrote:
>
> > SimpleKafkaETLJob class, as mentioned in the post.
> >
> > Thanks
> >
> > Abhi
> >
> > From Samsung Galaxy S4
> > On Nov 7, 2013 8:34 PM, "Jun Rao"  wrote:
> >
> >> Which class is not found?
> >>
> >> Thanks,
> >>
> >> Jun
> >>
> >>
> >> On Thu, Nov 7, 2013 at 11:56 AM, Abhi Basu <9000r...@gmail.com> wrote:
> >>
> >> > Let me describe my environment. Working on two nodes currently:
> >> > 1.Single-node hadoop cluster (will refer as Node1)
> >> > 2.Single node Kafka cluster  (will refer as Node2)
> >> >
> >> > Node 2 has 1 broker started with a topic (iot.test.stream) and one
> >> command
> >> > line producer and one command line consumer to test the kafka install.
> >> > Producer can send messages and the Consumer is receiving it.
> >> >
> >> > Node 1 (hadoop cluster) has kafka hadoop consumer code built. Have
> >> edited
> >> > the /kafka-0.8/contrib/hadoop-consumer/test/test.properties file with
> >> the
> >> > following:
> >> >
> >> > kafka.etl.topic=iot.test.stream
> >> > hdfs.default.classpath.dir=/tmp/kafka/lib
> >> > hadoop.job.ugi=kafka,hadoop
> >> > kafka.server.uri=tcp://idh251-kafka:9095
> >> > input=/tmp/kafka/data
> >> > output=/tmp/kafka/output
> >> > kafka.request.limit=-1
> >> > ...
> >> >
> >> > I have copied the copy-jars.sh to /tmp/kafka/lib (on HDFS)
> >> >
> >> > Next I run the following on Node 1:
> >> > ./run-class.sh kafka.etl.impl.SimpleKafkaETLJob test/test.properties
> >> from
> >> > the /kafka-0.8/contrib/hadoop-consumer folder and get a
> >> > classnotfoundexception for kafka.etl.impl.SimpleKafkaETLJob class.
> >> >
> >> > What am I missing? I was thinking that running the sh file would allow
> >> me
> >> > to retrieve messages with the same topic name to HDFS from Node 2 to
> >> Node
> >> > 1. I just want to do an end to end test to see that messages coming
> into
> >> > Kafka are being stored in HDFS with the minimal amount of code change
> >> > required.
> >> >
> >> > Thanks,
> >> >
> >> > Abhi
> >> >
> >> > --
> >> > Abhi Basu
> >> >
> >>
> >
>
>
> --
> Abhi Basu
>


Re: Kafka Hadoop Consumer issues

2013-11-08 Thread Abhi Basu
Can anyone help me with this issue? I feel like I am very close and am
probably making some silly config error.

Kafka team, please provide more detailed notes on how to make this
component work.

Thanks.


On Fri, Nov 8, 2013 at 5:23 AM, Abhi Basu <9000r...@gmail.com> wrote:

> The SimpleKafkaETLJob class, as mentioned in the post.
>
> Thanks
>
> Abhi
>
> From Samsung Galaxy S4
> On Nov 7, 2013 8:34 PM, "Jun Rao"  wrote:
>
>> Which class is not found?
>>
>> Thanks,
>>
>> Jun
>>
>>
>> On Thu, Nov 7, 2013 at 11:56 AM, Abhi Basu <9000r...@gmail.com> wrote:
>>
>> > Let me describe my environment. Working on two nodes currently:
>> > 1.Single-node hadoop cluster (will refer as Node1)
>> > 2.Single node Kafka cluster  (will refer as Node2)
>> >
>> > Node 2 has 1 broker started with a topic (iot.test.stream) and one
>> command
>> > line producer and one command line consumer to test the kafka install.
>> > Producer can send messages and the Consumer is receiving it.
>> >
>> > Node 1 (hadoop cluster) has kafka hadoop consumer code built. Have
>> edited
>> > the /kafka-0.8/contrib/hadoop-consumer/test/test.properties file with
>> the
>> > following:
>> >
>> > kafka.etl.topic=iot.test.stream
>> > hdfs.default.classpath.dir=/tmp/kafka/lib
>> > hadoop.job.ugi=kafka,hadoop
>> > kafka.server.uri=tcp://idh251-kafka:9095
>> > input=/tmp/kafka/data
>> > output=/tmp/kafka/output
>> > kafka.request.limit=-1
>> > ...
>> >
>> > I have copied the copy-jars.sh to /tmp/kafka/lib (on HDFS)
>> >
>> > Next I run the following on Node 1:
>> > ./run-class.sh kafka.etl.impl.SimpleKafkaETLJob test/test.properties
>> from
>> > the /kafka-0.8/contrib/hadoop-consumer folder and get a
>> > classnotfoundexception for kafka.etl.impl.SimpleKafkaETLJob class.
>> >
>> > What am I missing? I was thinking that running the sh file would allow
>> me
>> > to retrieve messages with the same topic name to HDFS from Node 2 to
>> Node
>> > 1. I just want to do an end to end test to see that messages coming into
>> > Kafka are being stored in HDFS with the minimal amount of code change
>> > required.
>> >
>> > Thanks,
>> >
>> > Abhi
>> >
>> > --
>> > Abhi Basu
>> >
>>
>


-- 
Abhi Basu


Re: Default compression code in Kafka topics

2013-11-08 Thread Neha Narkhede
Currently, the only way to send compressed data to Kafka is by enabling
compression on the producer side. To move compression to the server side,
we have filed https://issues.apache.org/jira/browse/KAFKA-595
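
For reference, a minimal sketch of a 0.8 producer with compression enabled.
The broker address and topic names are placeholders; in 0.8,
compression.codec accepts none, gzip, or snappy, and compressed.topics can
optionally restrict compression to a subset of topics.

import java.util.Properties;

import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class CompressedProducerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("metadata.broker.list", "broker1:9092"); // placeholder
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        props.put("compression.codec", "snappy"); // none | gzip | snappy
        // Optionally compress only some topics; the rest go uncompressed:
        // props.put("compressed.topics", "clicks,impressions");

        Producer<String, String> producer =
            new Producer<String, String>(new ProducerConfig(props));
        producer.send(new KeyedMessage<String, String>(
            "clicks", "a message compressed on the producer side"));
        producer.close();
    }
}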

Thanks,
Neha


On Fri, Nov 8, 2013 at 8:23 AM, arathi maddula wrote:

> Hi,
>
> We have a cluster of Kafka servers, and we want the data of all topics on
> these servers to be compressed. Is there some configuration to achieve this?
> I was able to compress data by using the compression.codec property in
> ProducerConfig in the Kafka producer.
> But I wanted to know if there is a way of enabling topic compression
> without modifying anything in the ProducerConfig properties. That is, can I
> add compression.codec=snappy in server.properties and have all topics' data
> compressed? Is there anything I can do during topic creation with regard to
> compression?
>
>
> Thanks
> Arathi
>


Default compression code in Kafka topics

2013-11-08 Thread arathi maddula
Hi,

We have a cluster of Kafka servers, and we want the data of all topics on
these servers to be compressed. Is there some configuration to achieve this?
I was able to compress data by using the compression.codec property in
ProducerConfig in the Kafka producer.
But I wanted to know if there is a way of enabling topic compression
without modifying anything in the ProducerConfig properties. That is, can I
add compression.codec=snappy in server.properties and have all topics' data
compressed? Is there anything I can do during topic creation with regard to
compression?


Thanks
Arathi


Re: Commit Offset per topic

2013-11-08 Thread Roman Garcia
So, what would that offset-commit API be in the 0.8 version? Are there any
docs about it?
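
As Neha points out downthread, the high-level consumer in 0.8 exposes
ConsumerConnector.commitOffsets(). A minimal sketch follows; the ZooKeeper
address, group id, and topic name are placeholders. Note the caveat from
the thread: commitOffsets() commits the offsets of every partition the
connector currently owns, not just the stream you happen to be reading.

import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;

public class ManualCommitConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("zookeeper.connect", "zkhost:2181"); // placeholder
        props.put("group.id", "my-group");             // placeholder
        props.put("auto.commit.enable", "false");      // commit manually

        ConsumerConnector connector =
            Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
        Map<String, List<KafkaStream<byte[], byte[]>>> streams =
            connector.createMessageStreams(
                Collections.singletonMap("my-topic", 1));

        ConsumerIterator<byte[], byte[]> it =
            streams.get("my-topic").get(0).iterator();
        while (it.hasNext()) {
            byte[] payload = it.next().message();
            // ... process payload ...
            // Commits offsets for ALL partitions this connector owns:
            connector.commitOffsets();
        }
    }
}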


On Thu, Nov 7, 2013 at 10:56 PM, 小宇  wrote:

> The offset commit API could solve your problem; it's available in the 0.8 version.
>
> ---Sent from Boxer | http://getboxer.com
>
> Thanks Neha! I guess auto-commit it is for now...
>
>
>
>
>
> On Tue, Nov 5, 2013 at 5:08 AM, Neha Narkhede wrote:
>
>
>
> > Currently, the only way to achieve that is to use the SimpleConsumer API.
>
> > We are considering the feature you mentioned for the 0.9 release -
>
> >
>
> >
> https://cwiki.apache.org/confluence/display/KAFKA/Client+Rewrite#ClientRewrite-ConsumerAPI
>
> >
>
> > Thanks,
>
> > Neha
>
> >
>
> >
>
> > > On Mon, Nov 4, 2013 at 9:57 AM, Roman Garcia wrote:
>
> >
>
> > > Right Neha, thanks. I was aware of that particular API, but I process
>
> > > topics within separate threads (actors) so I cannot assume every
>
> > > topic/partition is supposed to commit its offset. Instead, I could use
> a
>
> > > per-stream (topic-partition) commit.
>
> > > Is this use case valid? Should there be a ticket to implement this?
>
> > > Regards,
>
> > > Roman
>
> > >
>
> > >
>
> > > On Mon, Nov 4, 2013 at 1:28 PM, Neha Narkhede wrote:
>
> > >
>
> > > > The API you are looking for is ConsumerConnector.commitOffsets().
>
> > > However,
>
> > > > not that it will commit offsets for all partitions that the
> particular
>
> > > > consumer connector owns at that time. And the set of partitions owned
>
> > by
>
> > > a
>
> > > > particular consumer connector can change over time.
>
> > > >
>
> > > > Thanks,
>
> > > > Neha
>
> > > >
>
> > > >
>
> > > > > On Mon, Nov 4, 2013 at 7:58 AM, Roman Garcia wrote:
>
> > > >
>
> > > > > Hi, I'm still using Kafka 0.7, and was wondering if there's a way I
>
> > can
>
> > > > > "manually" commit offsets per Topic/Partition.
>
> > > > >
>
> > > > > I was expecting something like KafkaMessageStream#commitOffset()
>
> > > > > I couldn't find an API for this. If there isn't any, is there a
>
> > reason?
>
> > > > >
>
> > > > > Regards,
>
> > > > > Roman
>
> > > > >
>
> > > >
>
> > >
>
> >
>
>


RE: Purgatory

2013-11-08 Thread Marc Labbe
Thanks for the feedback. It is true that I never mentioned anything about the
impact on users, or the fact that this is mostly internal to Kafka. I will
try to rephrase some of this.

Marc
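
For readers following along: Joel's explanation downthread describes two
views of the same delayed request, a delay queue that tracks when requests
expire and a watcherFor map that lets newly arriving data satisfy waiting
requests, plus a periodic purge (KAFKA-664) so that expired requests for
low-volume topics cannot pile up and cause an OOME. The Java sketch below is
purely conceptual; all names and bookkeeping are illustrative, not Kafka's
actual implementation.

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

// A request that is either satisfied by new data or expires on a timeout.
abstract class DelayedRequest implements Delayed {
    private final long expiresAtMs;
    volatile boolean satisfied = false;

    DelayedRequest(long delayMs) {
        this.expiresAtMs = System.currentTimeMillis() + delayMs;
    }
    public long getDelay(TimeUnit unit) {
        return unit.convert(expiresAtMs - System.currentTimeMillis(),
                            TimeUnit.MILLISECONDS);
    }
    public int compareTo(Delayed other) {
        return Long.compare(getDelay(TimeUnit.MILLISECONDS),
                            other.getDelay(TimeUnit.MILLISECONDS));
    }
    // e.g. "enough bytes available" for a fetch, "enough acks" for a produce.
    abstract boolean checkSatisfied();
}

class ToyPurgatory {
    // Ordered by expiry; a reaper thread would poll this to send timeouts.
    private final DelayQueue<DelayedRequest> expiryQueue =
        new DelayQueue<DelayedRequest>();
    // Keyed by e.g. topic-partition; arriving data checks these watchers.
    private final Map<String, List<DelayedRequest>> watcherFor =
        new ConcurrentHashMap<String, List<DelayedRequest>>();

    void watch(String key, DelayedRequest req) {
        expiryQueue.add(req);
        List<DelayedRequest> watchers = watcherFor.get(key);
        if (watchers == null) {
            watchers = new CopyOnWriteArrayList<DelayedRequest>();
            watcherFor.put(key, watchers);
        }
        watchers.add(req);
    }

    // Called when new data arrives for a key.
    void update(String key) {
        List<DelayedRequest> watchers = watcherFor.get(key);
        if (watchers == null) return;
        for (DelayedRequest req : watchers) {
            if (!req.satisfied && req.checkSatisfied()) {
                req.satisfied = true; // respond to the client here
            }
        }
    }

    // The periodic sweep: without it, a request that expired from the delay
    // queue would linger in watcherFor until the next request for the same
    // key arrived, which for a low-volume topic might be a long time or
    // never (the KAFKA-664 OOME scenario).
    void purgeSatisfiedAndExpired() {
        for (List<DelayedRequest> watchers : watcherFor.values()) {
            List<DelayedRequest> dead = new ArrayList<DelayedRequest>();
            for (DelayedRequest req : watchers) {
                if (req.satisfied ||
                        req.getDelay(TimeUnit.MILLISECONDS) <= 0) {
                    dead.add(req);
                }
            }
            watchers.removeAll(dead);
        }
    }
}
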
On Nov 8, 2013 10:10 AM, "Yu, Libo"  wrote:

> I read it and tried to understand it. It would be great to add a summary
> at the beginning about what it is and how it may impact a user.
>
> Regards,
>
> Libo
>
>
> -Original Message-
> From: Joel Koshy [mailto:jjkosh...@gmail.com]
> Sent: Friday, November 08, 2013 2:01 AM
> To: users@kafka.apache.org
> Subject: Re: Purgatory
>
> Excellent - thanks for putting that together! Will review it more
> carefully tomorrow and suggest some minor edits if required.
>
> On Thu, Nov 07, 2013 at 10:45:40PM -0500, Marc Labbe wrote:
> > I've just added a page for purgatory, feel free to comment/modify at
> will.
> > I hope I didn't misinterpret too much of the code.
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/Request+Purgatory+(0.8)
> >
> > I added a few questions of my own.
> >
> >
> > On Fri, Nov 1, 2013 at 9:43 PM, Joe Stein  wrote:
> >
> > > To edit the Wiki you need to send an ICLA
> > > http://www.apache.org/licenses/#clas to Apache and then once that is
> > > done an email to priv...@kafka.apache.org (or to me and I will copy
> > > private) with your Wiki username and that you sent the ICLA to Apache.
> > >
> > > Then, I can add you to edit the Wiki.
> > >
> > > /***
> > >  Joe Stein
> > >  Founder, Principal Consultant
> > >  Big Data Open Source Security LLC
> > >  http://www.stealth.ly
> > >  Twitter: @allthingshadoop 
> > > /
> > >
> > >
> > > On Fri, Nov 1, 2013 at 9:08 PM, Marc Labbe  wrote:
> > >
> > > > Hi Joel,
> > > >
> > > > I used to have edit to the wiki, I made a few additions to it a
> > > > while ago but it's seem I don't have it anymore. It might have
> > > > been lost in the confluence update. I would be glad to add what I
> > > > have written if I get it back. Otherwise, feel free to paste my
> > > > words in one of the pages, I don't intend on asking for copyrights
> for this :).
> > > >
> > > > marc
> > > >
> > > >
> > > > On Fri, Nov 1, 2013 at 4:32 PM, Joel Koshy 
> wrote:
> > > >
> > > > > Marc, thanks for writing that up. I think it is worth adding
> > > > > some details on the request-purgatory on a wiki (Jay had started
> > > > > a wiki page for kafka internals [1] a while ago, but we have not
> > > > > had time to add much to it since.) Your write-up could be
> > > > > reviewed and added there. Do you have edit permissions on the wiki?
> > > > >
> > > > > As for the purge interval config - yes the documentation can be
> > > > > improved a bit. It's one of those "internal" configs that
> > > > > generally don't need to be modified by users. The reason we
> > > > > added that was as
> > > > > follows:
> > > > > - We found that for low-volume topics, replica fetch requests
> > > > > were getting expired but sitting around in purgatory
> > > > > - This was because we were expiring them from the delay queue
> > > > > (used to track when requests should expire), but they were still
> > > > > sitting in the watcherFor map - i.e., they would get purged when
> > > > > the next producer request to that topic/partition arrived, but
> > > > > for low volume topics this could be a long time (or never in the
> > > > > worst case) and we would eventually run into an OOME.
> > > > > - So we needed to periodically go through the entire watcherFor
> > > > > map and explicitly remove those requests that had expired.
> > > > > - More details on this are in KAFKA-664.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Joel
> > > > >
> > > > > [1]
> > > > > https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Internals
> > > > >
> > > > > On Fri, Nov 1, 2013 at 12:33 PM, Marc Labbe 
> wrote:
> > > > > > Guozhang,
> > > > > >
> > > > > > I have to agree with Priya the doc isn't very clear. Although
> > > > > > the configuration is documented, it is simply rewording the
> > > > > > name of the
> > > > > config,
> > > > > > which isn't particularly useful if you want more information
> > > > > > about
> > > what
> > > > > the
> > > > > > purgatory is. I searched the whole wiki and doc and could not
> > > > > > find
> > > > > anything
> > > > > > very useful as opposed looking a the code. In this case,
> > > > > > kafka.server.KafkaApis and kafka.server.RequestPurgatory will
> > > > > > be your friends.
> > > > > >
> > > > > > I'll try to add to Joe's answer here, mostly just reporting
> > > > > > what's available in the Scala doc from the project. I am doing
> > > > > > this to
> > > > > understand
> > > > > > the mechanics myself btw.
> > > > > >
> > > > > > As Joe said, messages are not dropped by the purgatory but
> > > > > > simply
> > > > removed
> > > > > > from the purgatory when they are satisfied. Satisfaction
> > > > >

RE: Purgatory

2013-11-08 Thread Yu, Libo
I read it and tried to understand it. It would be great to add a summary
at the beginning about what it is and how it may impact a user.

Regards,

Libo


-Original Message-
From: Joel Koshy [mailto:jjkosh...@gmail.com] 
Sent: Friday, November 08, 2013 2:01 AM
To: users@kafka.apache.org
Subject: Re: Purgatory

Excellent - thanks for putting that together! Will review it more carefully 
tomorrow and suggest some minor edits if required.

On Thu, Nov 07, 2013 at 10:45:40PM -0500, Marc Labbe wrote:
> I've just added a page for purgatory, feel free to comment/modify at will.
> I hope I didn't misinterpret too much of the code.
> 
> https://cwiki.apache.org/confluence/display/KAFKA/Request+Purgatory+(0.8)
> 
> I added a few questions of my own.
> 
> 
> On Fri, Nov 1, 2013 at 9:43 PM, Joe Stein  wrote:
> 
> > To edit the Wiki you need to send an ICLA 
> > http://www.apache.org/licenses/#clas to Apache and then once that is 
> > done an email to priv...@kafka.apache.org (or to me and I will copy 
> > private) with your Wiki username and that you sent the ICLA to Apache.
> >
> > Then, I can add you to edit the Wiki.
> >
> > /***
> >  Joe Stein
> >  Founder, Principal Consultant
> >  Big Data Open Source Security LLC
> >  http://www.stealth.ly
> >  Twitter: @allthingshadoop 
> > /
> >
> >
> > On Fri, Nov 1, 2013 at 9:08 PM, Marc Labbe  wrote:
> >
> > > Hi Joel,
> > >
> > > I used to have edit to the wiki, I made a few additions to it a 
> > > while ago but it's seem I don't have it anymore. It might have 
> > > been lost in the confluence update. I would be glad to add what I 
> > > have written if I get it back. Otherwise, feel free to paste my 
> > > words in one of the pages, I don't intend on asking for copyrights for 
> > > this :).
> > >
> > > marc
> > >
> > >
> > > On Fri, Nov 1, 2013 at 4:32 PM, Joel Koshy  wrote:
> > >
> > > > Marc, thanks for writing that up. I think it is worth adding 
> > > > some details on the request-purgatory on a wiki (Jay had started 
> > > > a wiki page for kafka internals [1] a while ago, but we have not 
> > > > had time to add much to it since.) Your write-up could be 
> > > > reviewed and added there. Do you have edit permissions on the wiki?
> > > >
> > > > As for the purge interval config - yes the documentation can be 
> > > > improved a bit. It's one of those "internal" configs that 
> > > > generally don't need to be modified by users. The reason we 
> > > > added that was as
> > > > follows:
> > > > - We found that for low-volume topics, replica fetch requests 
> > > > were getting expired but sitting around in purgatory
> > > > - This was because we were expiring them from the delay queue 
> > > > (used to track when requests should expire), but they were still 
> > > > sitting in the watcherFor map - i.e., they would get purged when 
> > > > the next producer request to that topic/partition arrived, but 
> > > > for low volume topics this could be a long time (or never in the 
> > > > worst case) and we would eventually run into an OOME.
> > > > - So we needed to periodically go through the entire watcherFor 
> > > > map and explicitly remove those requests that had expired.
> > > > - More details on this are in KAFKA-664.
> > > >
> > > > Thanks,
> > > >
> > > > Joel
> > > >
> > > > [1] 
> > > > https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Internals
> > > >
> > > > On Fri, Nov 1, 2013 at 12:33 PM, Marc Labbe  wrote:
> > > > > Guozhang,
> > > > >
> > > > > I have to agree with Priya the doc isn't very clear. Although 
> > > > > the configuration is documented, it is simply rewording the 
> > > > > name of the
> > > > config,
> > > > > which isn't particularly useful if you want more information 
> > > > > about
> > what
> > > > the
> > > > > purgatory is. I searched the whole wiki and doc and could not 
> > > > > find
> > > > anything
> > > > > very useful as opposed looking a the code. In this case, 
> > > > > kafka.server.KafkaApis and kafka.server.RequestPurgatory will 
> > > > > be your friends.
> > > > >
> > > > > I'll try to add to Joe's answer here, mostly just reporting 
> > > > > what's available in the Scala doc from the project. I am doing 
> > > > > this to
> > > > understand
> > > > > the mechanics myself btw.
> > > > >
> > > > > As Joe said, messages are not dropped by the purgatory but 
> > > > > simply
> > > removed
> > > > > from the purgatory when they are satisfied. Satisfaction 
> > > > > conditions
> > are
> > > > > different for both fetch and produce requests and this is 
> > > > > implemented
> > > in
> > > > > their respective DelayedRequest implementation (DelayedFetch 
> > > > > and DelayedProduce).
> > > > >
> > > > > Requests purgatories are defined as follow in the code:
> > > > >  - ProducerRequestPurgatory: A holding pen for produce 
> > > > > requests
> > waiting
> > > > to
> > > > > 

RE: add partition tool in 0.8

2013-11-08 Thread Yu, Libo
Thanks for your reply, Joel.

Regards,

Libo


-Original Message-
From: Joel Koshy [mailto:jjkosh...@gmail.com] 
Sent: Thursday, November 07, 2013 5:00 PM
To: users@kafka.apache.org
Subject: Re: add partition tool in 0.8

> 
> kafka-add-partitions.sh is in 0.8 but not in 0.8-beta1. Therefore we
> cannot use this tool with 0.8-beta1. If I download the latest 0.8 and
> compile it, can I use its kafka-add-partitions.sh to add partitions to the
> topics that already exist in our 0.8-beta1 Kafka? Thanks.

Unfortunately, no - since there were changes in the controller
(broker) to support this. So you will need to upgrade to 0.8 before you can do 
this.

Thanks,

Joel



RE: which repo for kafka_2.10 ?

2013-11-08 Thread Liu, Raymond
Thanks. Good to know that it will make it into the formal 0.8.0 release soon.

I am really looking for the public Maven repo for porting Spark onto it
instead of running a local version ;) but I can probably make do with a
beta1 build at first.

Best Regards,
Raymond Liu

-Original Message-
From: Joe Stein [mailto:joe.st...@stealth.ly] 
Sent: Friday, November 08, 2013 9:15 PM
To: users@kafka.apache.org
Subject: Re: which repo for kafka_2.10 ?

0.8.0 is in the process of being released, and when that is done the Scala
2.10 build will be in Maven Central.

Until then you can do

./sbt "++2.10 publish-local"

from checking out the source of Kafka as Victor just said, yup.

You will be prompted to sign the jars, which you can do with a PGP key, or
you can remove the PGP signing plugin before running.

/***
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC
 http://www.stealth.ly
 Twitter: @allthingshadoop 
/


On Fri, Nov 8, 2013 at 8:11 AM, Viktor Kolodrevskiy < 
viktor.kolodrevs...@gmail.com> wrote:

> Are you looking for a Maven repo?
>
> You can always check out the sources from http://kafka.apache.org/code.html
> and build them yourself.
>
> 2013/11/8 Liu, Raymond :
> > If I want to use kafka_2.10 0.8.0-beta1, which repo should I go to? It
> > seems the Apache repo doesn't have it, while there are com.sksamuel.kafka
> > and com.twitter.tormenta-kafka_2.10.
> > Which one should I go to, or neither?
> >
> > Best Regards,
> > Raymond Liu
> >
>
>
>
> --
> Thanks,
> Viktor
>


Re: Kafka Hadoop Consumer issues

2013-11-08 Thread Abhi Basu
The SimpleKafkaETLJob class, as mentioned in the post.

Thanks

Abhi

From Samsung Galaxy S4
On Nov 7, 2013 8:34 PM, "Jun Rao"  wrote:

> Which class is not found?
>
> Thanks,
>
> Jun
>
>
> On Thu, Nov 7, 2013 at 11:56 AM, Abhi Basu <9000r...@gmail.com> wrote:
>
> > Let me describe my environment. Working on two nodes currently:
> > 1.Single-node hadoop cluster (will refer as Node1)
> > 2.Single node Kafka cluster  (will refer as Node2)
> >
> > Node 2 has 1 broker started with a topic (iot.test.stream) and one
> command
> > line producer and one command line consumer to test the kafka install.
> > Producer can send messages and the Consumer is receiving it.
> >
> > Node 1 (hadoop cluster) has kafka hadoop consumer code built. Have edited
> > the /kafka-0.8/contrib/hadoop-consumer/test/test.properties file with the
> > following:
> >
> > kafka.etl.topic=iot.test.stream
> > hdfs.default.classpath.dir=/tmp/kafka/lib
> > hadoop.job.ugi=kafka,hadoop
> > kafka.server.uri=tcp://idh251-kafka:9095
> > input=/tmp/kafka/data
> > output=/tmp/kafka/output
> > kafka.request.limit=-1
> > ...
> >
> > I have copied the copy-jars.sh to /tmp/kafka/lib (on HDFS)
> >
> > Next I run the following on Node 1:
> > ./run-class.sh kafka.etl.impl.SimpleKafkaETLJob test/test.properties from
> > the /kafka-0.8/contrib/hadoop-consumer folder and get a
> > classnotfoundexception for kafka.etl.impl.SimpleKafkaETLJob class.
> >
> > What am I missing? I was thinking that running the sh file would allow me
> > to retrieve messages with the same topic name to HDFS from Node 2 to Node
> > 1. I just want to do an end to end test to see that messages coming into
> > Kafka are being stored in HDFS with the minimal amount of code change
> > required.
> >
> > Thanks,
> >
> > Abhi
> >
> > --
> > Abhi Basu
> >
>


Re: which repo for kafka_2.10 ?

2013-11-08 Thread Joe Stein
0.8.0 is in the process of being released, and when that is done the Scala
2.10 build will be in Maven Central.

Until then you can do

./sbt "++2.10 publish-local"

from checking out the source of Kafka as Victor just said, yup.

You will be prompted to sign the jars, which you can do with a PGP key, or
you can remove the PGP signing plugin before running.

/***
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC
 http://www.stealth.ly
 Twitter: @allthingshadoop 
/


On Fri, Nov 8, 2013 at 8:11 AM, Viktor Kolodrevskiy <
viktor.kolodrevs...@gmail.com> wrote:

> Are you looking for a Maven repo?
>
> You can always check out the sources from http://kafka.apache.org/code.html
> and build them yourself.
>
> 2013/11/8 Liu, Raymond :
> > If I want to use kafka_2.10 0.8.0-beta1, which repo should I go to? It
> > seems the Apache repo doesn't have it, while there are com.sksamuel.kafka
> > and com.twitter.tormenta-kafka_2.10.
> > Which one should I go to, or neither?
> >
> > Best Regards,
> > Raymond Liu
> >
>
>
>
> --
> Thanks,
> Viktor
>


Re: which repo for kafka_2.10 ?

2013-11-08 Thread Viktor Kolodrevskiy
Are you looking for a Maven repo?

You can always check out the sources from http://kafka.apache.org/code.html
and build them yourself.

2013/11/8 Liu, Raymond :
> If I want to use kafka_2.10 0.8.0-beta1, which repo should I go to? It seems
> the Apache repo doesn't have it, while there are com.sksamuel.kafka and
> com.twitter.tormenta-kafka_2.10.
> Which one should I go to, or neither?
>
> Best Regards,
> Raymond Liu
>



-- 
Thanks,
Viktor