Re: question about log.retention.bytes

2013-08-16 Thread Jun Rao
log.retention.bytes is for all topics that are not included in
log.retention.bytes.per.topic
(which defines a map of topic -> size).

Currently, we don't have a total size limit across all topics.

Thanks,

Jun


On Fri, Aug 16, 2013 at 2:00 PM, Paul Christian
wrote:

> According to the Kafka 8 documentation, under broker configuration, there
> are these parameters and their definitions:
>
> log.retention.bytes -1 The maximum size of the log before deleting it
> log.retention.bytes.per.topic "" The maximum size of the log for some
> specific topic before deleting it
>
> I'm curious what the first value, 'log.retention.bytes', is for if the second
> one is for per-topic logs, because aren't all logs generated per topic? Is
> this an aggregate value across topics?
>
> Related question: is there a parameter for Kafka where you can say "only
> hold this much TOTAL data across all topics (logs and indexes together)"?
> I.e., our hosts have this much available space, so set
> log.retention.whatever.aggregate == 75% of total disk space.
>


Re: question about log.retention.bytes

2013-08-19 Thread Paul Christian
Hi Jun,

Thank you for your reply. I'm still a little fuzzy on the concept.

Are you saying I can have topics A, B and C with

log.retention.bytes.per.topic.A = 15MB
log.retention.bytes.per.topic.B = 20MB
log.retention.bytes = 30MB

And thus topic C will get the value 30MB, since it isn't defined 'per topic'
like the others? You wrote:

> log.retention.bytes is for all topics that are not included in
> log.retention.bytes.per.topic
> (which defines a map of topic -> size).

Otherwise, log.retention.bytes.per.topic and log.retention.bytes seem very
similar to me.

Additionally, we've experimented with this value on our test cluster where
we set the log.retention.bytes to 11MB as a test. Below is a snippet from
our server.properties.

# A size-based retention policy for logs. Segments are pruned from the log as long as the remaining
# segments don't drop below log.retention.bytes.
log.retention.bytes=11534336

Here is the ls -lh output for one of the topics:

-rw-r--r-- 1 kafka service  10M Aug 19 15:45 07021913.index
-rw-r--r-- 1 kafka service 114M Aug 19 15:45 07021913.log

The index file size appears to reflect the property
log.index.size.max.bytes, but the log just keeps growing.


Re: question about log.retention.bytes

2013-08-19 Thread Jun Rao
For the first question, yes, topic C will get the value of 30MB.
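
The fallback described here can be sketched in a few lines of Python. This is only an illustration of the lookup rule, not Kafka's code: the function name and the dictionary are hypothetical, and the sizes are the ones from Paul's example.

```python
# Illustrative sketch only; retention_bytes_for is a hypothetical helper,
# not part of Kafka.
MB = 1024 * 1024


def retention_bytes_for(topic, per_topic, default_bytes):
    """Return the per-topic override if one exists, else the broker default."""
    return per_topic.get(topic, default_bytes)


# Values from the example in this thread.
per_topic = {"A": 15 * MB, "B": 20 * MB}  # log.retention.bytes.per.topic
default_bytes = 30 * MB                   # log.retention.bytes

for topic in ("A", "B", "C"):
    limit = retention_bytes_for(topic, per_topic, default_bytes)
    print(topic, limit // MB, "MB")
```

Topic C has no entry in the map, so it picks up the 30MB default.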

For the second question, log.retention.bytes only controls the segment log
file size, not the index. Typically, index file size is much smaller than
the log file. The index file of the last (active) segment is presized to
the max index size (defaults to 10MB). However, the size is trimmed as soon
as the segment rolls.
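
A rough Python sketch of the pruning rule quoted in the server.properties comment may help here. It is not Kafka's implementation. It assumes that retention removes whole segments, oldest first, and never removes the active (last) segment; the function name is hypothetical. Under those assumptions, a log that still consists of a single active segment is not trimmed, whatever log.retention.bytes is set to.

```python
# Illustrative sketch only; prune_segments is a hypothetical helper.
# Assumptions: whole segments are deleted oldest-first, and the active
# (last) segment is never deleted.
MB = 1024 * 1024


def prune_segments(segment_sizes, retention_bytes):
    """Return the segment sizes left after size-based pruning.

    segment_sizes is ordered oldest first; the last entry is the active
    segment. A negative retention_bytes means "no size limit".
    """
    remaining = list(segment_sizes)
    if retention_bytes < 0:
        return remaining
    total = sum(remaining)
    # Drop the oldest segment only while the rest still holds at least
    # retention_bytes, and always keep the active segment.
    while len(remaining) > 1 and total - remaining[0] >= retention_bytes:
        total -= remaining.pop(0)
    return remaining


# One 114MB active segment, 11MB limit: nothing can be pruned yet.
print(prune_segments([114 * MB], 11 * MB) == [114 * MB])

# Several rolled segments: only the oldest one is dropped.
print([s // MB for s in prune_segments([50 * MB, 50 * MB, 50 * MB, 20 * MB], 100 * MB)])
```

In Paul's listing the index file is 10M, the presized value mentioned above, which suggests the 114M log is still the active segment.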

Thanks,

Jun




Re: question about log.retention.bytes

2013-08-19 Thread Neha Narkhede
Paul,

I'm trying to understand the 2nd problem you reported. Are you saying that
you set log.retention.bytes=11534336 (11MB) but your log nevertheless grew
to 114MB, meaning the config option didn't work as expected?

Thanks,
Neha




Re: question about log.retention.bytes

2013-08-20 Thread Paul Christian
Jun,

For my first example, is that syntax correct? I.e.

log.retention.bytes.per.topic.A = 15MB
log.retention.bytes.per.topic.B = 20MB

I totally guessed there and was wondering if I guessed right. Otherwise, is
there a document with the proper formatting to fill out this map?

Thank you,

Paul


Re: question about log.retention.bytes

2013-08-20 Thread Paul Christian
Neha,

Correct, that is my question. We want to investigate capping our disk usage
so we don't fill up our hard drives. If you have any recommended
configurations or documents on these settings, please let us know.

Thank you,

Paul





Re: question about log.retention.bytes

2013-08-21 Thread Jun Rao
All per-topic configuration properties have the format of csv (e.g.,
"topic1:value1,topic2:value2"). Updated our website to make it clear.
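
As an illustration of that format, here is a small Python sketch that turns such a value into a topic -> size map. The parser is hypothetical and not Kafka's own. It assumes the values are plain byte counts, as in log.retention.bytes=11534336 earlier in the thread, rather than suffixed sizes like "15MB".

```python
# Illustrative sketch only; parse_per_topic is a hypothetical helper.
# Assumption: each value is a plain integer number of bytes.
def parse_per_topic(csv_value):
    """Parse "topic1:value1,topic2:value2" into a dict of topic -> int."""
    result = {}
    for entry in csv_value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate an empty string or a trailing comma
        topic, sep, value = entry.rpartition(":")
        if not sep or not topic:
            raise ValueError("expected topic:value, got %r" % entry)
        result[topic.strip()] = int(value)
    return result


# Topics A and B from the earlier example, written as byte counts.
print(parse_per_topic("A:15728640,B:20971520"))
```

With this format, the guessed syntax from the earlier message would become a single line, log.retention.bytes.per.topic=A:15728640,B:20971520, assuming byte values.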

Thanks,

Jun




RE: question about log.retention.bytes

2013-08-27 Thread Yu, Libo
Hi Jun,

In a previous email thread 
http://markmail.org/search/?q=kafka+log.retention.bytes#query:kafka%20log.retention.bytes+page:1+mid:qnt4pbq47goii2ui+state:results,
you said log.retention.bytes is for each partition. Could you clarify that?

Say I have a topic with three partitions and I want to limit the disk space
to 1GB for each partition.
Then log.retention.bytes should be set to 1GB (not 3GB). Is that right?

If I want to use log.retention.bytes.per.topic to set the same limit, should
it be set to 1GB or 3GB?

Libo




Re: question about log.retention.bytes

2013-08-27 Thread Jun Rao
For the first question, yes.

For the second one, this is documented in
http://kafka.apache.org/documentation.html#brokerconfigs

"Note that all per topic configuration properties below have the format of
csv (e.g., "topic1:value1,topic2:value2")."

Thanks,
Jun





Re: question about log.retention.bytes

2013-08-28 Thread Jay Kreps
I think the problem is that there is no way to understand the meaning of
that config from the docs, so people keep asking over and over again. The
docs make it sound like it is per topic and just say it is the "maximum
size before it is deleted", which makes no sense...

-jay



Re: question about log.retention.bytes

2013-08-28 Thread Jun Rao
Changed the wording in the doc.

Thanks,

Jun




RE: question about log.retention.bytes

2013-08-28 Thread Yu, Libo
Jay made a good point. The docs say it is for the topic, but actually
it is for one partition. Thanks, Jun.
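
Since the limit applies to each partition, a rough per-topic disk estimate is the partition count times the limit. The Python sketch below shows only that arithmetic. The function name is hypothetical, and the estimate assumes all of the topic's partitions sit on the same broker; it ignores index files and any data in the active segment beyond the limit.

```python
# Illustrative arithmetic only; approx_topic_bytes is a hypothetical helper.
# Assumptions: all partitions are on one broker; index files and active
# segment overshoot are ignored.
GB = 1024 ** 3


def approx_topic_bytes(partitions_on_broker, retention_bytes_per_partition):
    """Approximate retained log bytes for one topic on one broker."""
    return partitions_on_broker * retention_bytes_per_partition


# Three partitions limited to 1GB each retain roughly 3GB in total.
print(approx_topic_bytes(3, 1 * GB) // GB, "GB")
```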

Regards,

Libo

