Re: Leader: -1 on Kafka

2014-04-22 Thread Jun Rao
If you only have one replica, then when the broker hosting that replica goes
down, the partition will have no leader. A broker can go down due to soft failures.
See
https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Whypartitionleadersmigratethemselvessometimes
?

Thanks,

Jun


On Mon, Apr 21, 2014 at 1:36 PM, Kashyap Mhaisekar wrote:

> Yes.
> I have 3 Kafka brokers, and I created one topic with 1 partition and a
> replication factor of 1 (the default options). All params are default. I was
> testing it under heavy load with a Storm topology reading from it, and
> suddenly the messages stopped. Kafka's list-topics script then shows the
> leader as -1.
>
> I got the same error even with 2 partitions and default replication.
>
> Regards,
> Kashyap
>
>
> On Mon, Apr 21, 2014 at 1:26 PM, Joel Koshy  wrote:
>
> > Can you describe your set up in more detail and also if you can
> > reproduce this easily? This can happen when none of the replicas for a
> > partition are available, but cannot comment further without details.
> >
> > On Mon, Apr 21, 2014 at 10:54:59AM -0500, Kashyap Mhaisekar wrote:
> > > Hi,
> > > At times, some of the Kafka topics end up showing the leader as -1. After
> > > this, the messages don't get added to the topic or consumed. I tried
> > > digging into why the leader turns -1 *(leader: -1)*.
> > >
> > > Is there a reason why this happens and how it can be resolved?
> > >
> > > Regards,
> > > kashyap
> >
> >
>


Log Retention in Kafka

2014-04-22 Thread Kashyap Mhaisekar
Hi,
I wanted to set the message expiry for messages on a Kafka topic. Is there
anything like this in Kafka?
I came across the properties *log.retention.hours* and
*topic.log.retention.hours*, and it was mentioned that
topic.log.retention.hours is a per-topic configuration.
I had some queries around it:
1. Does it mean that I need to specify the .log.retention.hours
in the Kafka config?
2. Can this property be overridden anywhere?
3. Is it possible for the producer to set a message expiry so that the
message expires after a configurable period of time?

Regards,
Kashyap


Re: Log Retention in Kafka

2014-04-22 Thread Joel Koshy
Which version of Kafka are you using?

You can read up on the configuration options here:
http://kafka.apache.org/documentation.html#configuration

You can specify time-based retention using log.retention.minutes, which
applies to all topics. You can override that on a per-topic basis; see
further down on the same page under "topic-level configuration".
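As an illustration, the broker-wide default lives in server.properties, while the per-topic override uses whatever topic-level key your Kafka version defines. The property names below are a sketch and differ across versions, so confirm them against the documentation page linked above:

```properties
# server.properties -- broker-wide default, applies to every topic
log.retention.minutes=10080

# Topic-level override (key name varies by version; for example, newer
# brokers accept retention.ms as a per-topic config set via the topics tool)
# retention.ms=86400000
```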

On Tue, Apr 22, 2014 at 02:34:24PM -0500, Kashyap Mhaisekar wrote:
> Hi,
> I wanted to set the message expiry for a message on a kafka topic. Is there
> anything like this in kafka?
> I came across a property - *log.retention.hours* and
> *topic.log.retention.hours*
> Had some queries around it.And it was mentioned that
> topic.log.retention.hours is per topic configuration.
> Had some queries around it -
> 1. Does it mean that I need to specific the .log.retention.hours
> in the kafka config?
> 2. Can this property be overriden anywhere?
> 3. Is it possible for the producer to set a message expiry so that the
> message expires after a configurable period of time?
> 
> Regards,
> Kashyap



estimating log.retention.bytes

2014-04-22 Thread Andrey Yegorov
Hi,

Please help me understand how one should estimate the upper limit for
log.retention.bytes in this situation.

Let's say a Kafka cluster has 3 machines (one broker per machine) with 15TB of
disk space per machine.
The cluster will have one topic with 30 partitions and replication factor 2.

My thinking is:
with replication, I'll have 60 partition replicas spread across the 3 machines,
hence 20 per machine.
The max space I can allocate per partition is 15TB/20 = 768GB.

Am I on the right track?

--
Andrey Yegorov
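For what it's worth, the arithmetic above can be sketched as a quick calculation. The 15 TB, 30 partitions, and replication factor come from the message; this is only a sketch, and it ignores index files, open-segment slack, and anything else sharing the disks:

```java
public class RetentionEstimate {
    public static void main(String[] args) {
        long diskBytesPerBroker = 15L * 1024 * 1024 * 1024 * 1024; // 15 TB (binary)
        int brokers = 3;
        int partitions = 30;
        int replicationFactor = 2;

        // 60 partition replicas spread evenly over 3 brokers -> 20 per broker.
        int replicasPerBroker = (partitions * replicationFactor) / brokers;

        // Upper bound for log.retention.bytes on each partition's log.
        long retentionBytesPerPartition = diskBytesPerBroker / replicasPerBroker;

        System.out.println(replicasPerBroker);                                  // 20
        System.out.println(retentionBytesPerPartition / (1024L * 1024 * 1024)); // 768 (GB)
    }
}
```

In practice you would set the limit with some headroom below this bound, since retention is enforced per log segment and the broker needs free space for segments still being written.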


RE: commitOffsets by partition 0.8-beta

2014-04-22 Thread Seshadri, Balaji
I'm writing the latest consumed offset to the ZooKeeper directory.

Say, for example, my last consumed message has an offset of 5 and I update that
in the path, but when I check the ZooKeeper path some time later it shows 6.

Does any other process update it?


From: Seshadri, Balaji
Sent: Friday, April 18, 2014 11:50 AM
To: 'users@kafka.apache.org'
Subject: RE: commitOffsets by partition 0.8-beta

Thanks Jun.


-Original Message-
From: Jun Rao [mailto:jun...@gmail.com]
Sent: Friday, April 18, 2014 11:37 AM
To: users@kafka.apache.org
Subject: Re: commitOffsets by partition 0.8-beta

We don't have the ability to commit offsets at the partition level yet. This
feature probably won't be available until we are done with the consumer
rewrite, which is 3-4 months away.

If you want to do something now and don't want to use SimpleConsumer, another
hacky way is to turn off auto offset commit and have the app write the offsets
to ZK at the right paths itself.

Thanks,

Jun


On Fri, Apr 18, 2014 at 10:02 AM, Seshadri, Balaji  wrote:

> Hi,
>
> We have a use case at DISH where we need to stop the consumer when we
> have issues proceeding further to the database or another back end.
>
> We update the offset manually for each consumed message. There are 4
> threads (e.g.) consuming from the same connector, and when one thread
> commits the offset there is a chance that data for all the other threads
> also gets committed.
>
> We don't want to go to prod with this, as we are about to take the first
> step of replacing a traditional broker with Kafka for a business-critical
> process. Is it OK if we add a commitOffset(topic, partition) method that
> commits only the consumed data for that particular thread?
>
> At this point we don't want to change our framework to use SimpleConsumer,
> as it is a lot of work for us.
>
> Please let us know the effect of committing the offset per partition being
> consumed by the thread. We have around 131 partitions per topic and around
> 20 topics.
>
> Thanks,
>
> Balaji
>
>
>


RE: commitOffsets by partition 0.8-beta

2014-04-22 Thread Seshadri, Balaji
Please find my code to commit the offset:

public void handleAfterConsumption(MessageAndMetadata mAndM) {
    // Decide per message whether to commit only this thread's partition
    // offset or fall back to the connector-wide commitOffsets().
    String commitPerThread = props.getProperty("commitperthread", "N");
    DESMetadata metadata = new DESMetadata(mAndM.topic(), consumerGroup,
            mAndM.partition(), mAndM.offset());
    if (isRunning()) {
        if (commitPerThread.equals("Y")) {
            commitOffset(metadata);
        } else {
            kafkaConsumer.commitOffsets();
        }
    }
}

public void commitOffset(DESMetadata metaData) {
    // Write this partition's offset directly to its ZooKeeper path.
    log.info("Update offsets only for -> " + metaData.toString());
    ZKGroupTopicDirs topicDirs =
            new ZKGroupTopicDirs(metaData.getGroupId(), metaData.getTopic());
    ZkUtils.updatePersistentPath(zkClient,
            topicDirs.consumerOffsetDir() + "/" + metaData.getPartitionNumber(),
            metaData.getOffSet() + "");
}
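For reference, the ZooKeeper node this writes to follows the 0.8 high-level consumer layout, /consumers/&lt;group&gt;/offsets/&lt;topic&gt;/&lt;partition&gt;. A minimal sketch of building that path (the group and topic names below are made up for illustration):

```java
public class OffsetPath {
    // Builds the offset path used by the 0.8 high-level consumer layout:
    //   /consumers/<group>/offsets/<topic>/<partition>
    static String offsetPath(String group, String topic, int partition) {
        return "/consumers/" + group + "/offsets/" + topic + "/" + partition;
    }

    public static void main(String[] args) {
        // Hypothetical group/topic names, for illustration only.
        System.out.println(offsetPath("des-group", "orders", 3));
        // -> /consumers/des-group/offsets/orders/3
    }
}
```

Note that Kafka's own commitOffsets() conventionally stores the offset of the *next* message to fetch (last consumed + 1), which is worth keeping in mind when comparing values seen in ZK.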

Thanks,

Balaji
-Original Message-
From: Seshadri, Balaji [mailto:balaji.sesha...@dish.com] 
Sent: Tuesday, April 22, 2014 8:10 PM
To: 'users@kafka.apache.org'
Subject: RE: commitOffsets by partition 0.8-beta

I'm updating the latest offset consumed to the zookeeper directory.

Say for eg if my last consumed message has offset of 5 i update it in the 
path,but when i check zookeeper path it has 6 after sometimes.

Does any other process updates it ?.


From: Seshadri, Balaji
Sent: Friday, April 18, 2014 11:50 AM
To: 'users@kafka.apache.org'
Subject: RE: commitOffsets by partition 0.8-beta

Thanks Jun.


-Original Message-
From: Jun Rao [mailto:jun...@gmail.com]
Sent: Friday, April 18, 2014 11:37 AM
To: users@kafka.apache.org
Subject: Re: commitOffsets by partition 0.8-beta

We don't have the ability to commit offset at the partition level now. This 
feature probably won't be available until we are done with the consumer 
rewrite, which is 3-4 months away.

If you want to do sth now and don't want to use SimpleConsumer, another hacky 
way is to turn off auto offset commit and write the offset to ZK in the right 
path yourself in the app.

Thanks,

Jun


On Fri, Apr 18, 2014 at 10:02 AM, Seshadri, Balaji  wrote:

> Hi,
>
> We have use case in DISH where we need to stop the consumer when we 
> have issues in proceeding further to database or another back end.
>
> We update offset manually for each consumed message. There are 4
> threads(e.g) consuming from same connector and when one thread commits 
> the offset there is chance that data for all other threads also get committed.
>
> We don't want to go with this to prod as we are going to take first 
> step of replacing traditional broker with Kafka for business critical 
> process, is it ok if we add commit Offset(Topic,partition) method that 
> commits only the consumed data for that particular thread.
>
> At this point we don't want to change our framework to use Simple 
> Consumer as it is lots of work for us.
>
> Please let us know the effect of committing the offset per partition 
> being consumed by the thread. We have around 131 partitions per topic 
> and around
>  20 topics.
>
> Thanks,
>
> Balaji
>
>
>


Re: estimating log.retention.bytes

2014-04-22 Thread Jun Rao
Yes, your estimate is correct.

Thanks,

Jun


On Tue, Apr 22, 2014 at 6:16 PM, Andrey Yegorov wrote:

> Hi,
>
> Please help me understand how one should estimate upper limit for
> log.retention.bytes in this situation.
>
> Let's say kafka cluster has 3 machines (broker per machine) with 15TB disk
> space per machine.
> Cluster will have one topic with 30 partitions and replication factor 2.
>
> My thinking is:
> with replication, I'll have 60 'partitions' spread across 3 machines hence
> 20 per machine.
> Max space I can allocate per partition is 15TB/20 = 768GB per partition.
>
> Am I on the right track?
>
> --
> Andrey Yegorov
>


Re: commitOffsets by partition 0.8-beta

2014-04-22 Thread Jun Rao
Do you have auto commit disabled?

Thanks,

Jun


On Tue, Apr 22, 2014 at 7:10 PM, Seshadri, Balaji
wrote:

> I'm updating the latest offset consumed to the zookeeper directory.
>
> Say for eg if my last consumed message has offset of 5 i update it in the
> path,but when i check zookeeper path it has 6 after sometimes.
>
> Does any other process updates it ?.
>
> 
> From: Seshadri, Balaji
> Sent: Friday, April 18, 2014 11:50 AM
> To: 'users@kafka.apache.org'
> Subject: RE: commitOffsets by partition 0.8-beta
>
> Thanks Jun.
>
>
> -Original Message-
> From: Jun Rao [mailto:jun...@gmail.com]
> Sent: Friday, April 18, 2014 11:37 AM
> To: users@kafka.apache.org
> Subject: Re: commitOffsets by partition 0.8-beta
>
> We don't have the ability to commit offset at the partition level now.
> This feature probably won't be available until we are done with the
> consumer rewrite, which is 3-4 months away.
>
> If you want to do sth now and don't want to use SimpleConsumer, another
> hacky way is to turn off auto offset commit and write the offset to ZK in
> the right path yourself in the app.
>
> Thanks,
>
> Jun
>
>
> On Fri, Apr 18, 2014 at 10:02 AM, Seshadri, Balaji <
> balaji.sesha...@dish.com
> > wrote:
>
> > Hi,
> >
> > We have use case in DISH where we need to stop the consumer when we
> > have issues in proceeding further to database or another back end.
> >
> > We update offset manually for each consumed message. There are 4
> > threads(e.g) consuming from same connector and when one thread commits
> > the offset there is chance that data for all other threads also get
> committed.
> >
> > We don't want to go with this to prod as we are going to take first
> > step of replacing traditional broker with Kafka for business critical
> > process, is it ok if we add commit Offset(Topic,partition) method that
> > commits only the consumed data for that particular thread.
> >
> > At this point we don't want to change our framework to use Simple
> > Consumer as it is lots of work for us.
> >
> > Please let us know the effect of committing the offset per partition
> > being consumed by the thread. We have around 131 partitions per topic
> > and around
> >  20 topics.
> >
> > Thanks,
> >
> > Balaji
> >
> >
> >
>


RE: commitOffsets by partition 0.8-beta

2014-04-22 Thread Seshadri, Balaji
Yes, I disabled it.

My doubt is whether the path should hold the next offset to be consumed or the
last consumed offset.

-Original Message-
From: Jun Rao [mailto:jun...@gmail.com] 
Sent: Tuesday, April 22, 2014 9:52 PM
To: users@kafka.apache.org
Subject: Re: commitOffsets by partition 0.8-beta

Do you have auto commit disabled?

Thanks,

Jun


On Tue, Apr 22, 2014 at 7:10 PM, Seshadri, Balaji
wrote:

> I'm updating the latest offset consumed to the zookeeper directory.
>
> Say for eg if my last consumed message has offset of 5 i update it in 
> the path,but when i check zookeeper path it has 6 after sometimes.
>
> Does any other process updates it ?.
>
> 
> From: Seshadri, Balaji
> Sent: Friday, April 18, 2014 11:50 AM
> To: 'users@kafka.apache.org'
> Subject: RE: commitOffsets by partition 0.8-beta
>
> Thanks Jun.
>
>
> -Original Message-
> From: Jun Rao [mailto:jun...@gmail.com]
> Sent: Friday, April 18, 2014 11:37 AM
> To: users@kafka.apache.org
> Subject: Re: commitOffsets by partition 0.8-beta
>
> We don't have the ability to commit offset at the partition level now.
> This feature probably won't be available until we are done with the 
> consumer rewrite, which is 3-4 months away.
>
> If you want to do sth now and don't want to use SimpleConsumer, 
> another hacky way is to turn off auto offset commit and write the 
> offset to ZK in the right path yourself in the app.
>
> Thanks,
>
> Jun
>
>
> On Fri, Apr 18, 2014 at 10:02 AM, Seshadri, Balaji < 
> balaji.sesha...@dish.com
> > wrote:
>
> > Hi,
> >
> > We have use case in DISH where we need to stop the consumer when we 
> > have issues in proceeding further to database or another back end.
> >
> > We update offset manually for each consumed message. There are 4
> > threads(e.g) consuming from same connector and when one thread 
> > commits the offset there is chance that data for all other threads 
> > also get
> committed.
> >
> > We don't want to go with this to prod as we are going to take first 
> > step of replacing traditional broker with Kafka for business 
> > critical process, is it ok if we add commit Offset(Topic,partition) 
> > method that commits only the consumed data for that particular thread.
> >
> > At this point we don't want to change our framework to use Simple 
> > Consumer as it is lots of work for us.
> >
> > Please let us know the effect of committing the offset per partition 
> > being consumed by the thread. We have around 131 partitions per 
> > topic and around
> >  20 topics.
> >
> > Thanks,
> >
> > Balaji
> >
> >
> >
>


Re: Log Retention in Kafka

2014-04-22 Thread Kashyap Mhaisekar
Thanks, Joel. I am using version 2.8.0.

Thanks,
Kashyap


On Tue, Apr 22, 2014 at 5:53 PM, Joel Koshy  wrote:

> Which version of Kafka are you using?
>
> You can read up on the configuration options here:
> http://kafka.apache.org/documentation.html#configuration
>
> You can specify time-based retention using log.retention.minutes which
> will apply to all topics. You can override that on per-topic basis -
> see further down in the above page under "topic-level configuration"
>
> On Tue, Apr 22, 2014 at 02:34:24PM -0500, Kashyap Mhaisekar wrote:
> > Hi,
> > I wanted to set the message expiry for a message on a kafka topic. Is
> there
> > anything like this in kafka?
> > I came across a property - *log.retention.hours* and
> > *topic.log.retention.hours*
> > Had some queries around it.And it was mentioned that
> > topic.log.retention.hours is per topic configuration.
> > Had some queries around it -
> > 1. Does it mean that I need to specific the
> .log.retention.hours
> > in the kafka config?
> > 2. Can this property be overriden anywhere?
> > 3. Is it possible for the producer to set a message expiry so that the
> > message expires after a configurable period of time?
> >
> > Regards,
> > Kashyap
>
>