Re: broker logs unusable after KAFKA-6150: Make Repartition Topics Transient

2018-07-15 Thread Guozhang Wang
Hello Henry, I saw your detailed explanations on the JIRA ticket.

Thinking about this a bit more, I think moving the start offset should be
as important as an event as truncating log tail, or rolling new segment etc
(their log4j levels are all INFO). But I also agree that with the streams
periodical delete records request to brokers it could be too much as a
regularly frequent operation.

Note that besides handling the delete records request, there are a couple
other call traces that can leave to this entry:

1. Replica truncating starting offset due to received fetch response from
the leader.
2. Partition truncating starting offset due to time / size based retention
policy.

For both of these other two cases, the deletion event should be logged as
INFO as they are important and happen less frequently. Given that, I'd
suggest the following: if you do not care about the cases 1) / 2) as well,
then you can simply override these two classes in the log4j properties file
to WARN --- I do not think it is a hack, since overriding per class is a
very common and useful way for users to finer control their granularities.
If you do care other cases that may trigger log head truncation, but want
to reduce the logging overhead for the event that is triggered by handling
a delete records request only, we can file a JIRA for that and maybe
consider augmented call trace to let Log class decide whether or not log
the event.


Guozhang


On Fri, Jul 6, 2018 at 5:58 PM, Henry Cai 
wrote:

> On server side, we use INFO for everything.
>
> Log4j setting can be a temporary hack, but we would like to keep INFO
> logging as the default.
>
> I think those two logging lines can be just downgraded into DEBUG loggings,
> moving start offset is not that eventful to be logged as INFO.
>
> On Fri, Jul 6, 2018 at 4:08 PM, Guozhang Wang  wrote:
>
> > Hello Henry,
> >
> > What's your server-side log4j settings? Could you use WARN on these two
> > classes: kafka.server.epoch.LeaderEpochFileCache and kafka.log.Log.
> >
> >
> >
> > Guozhang
> >
> >
> > On Fri, Jul 6, 2018 at 3:08 PM, Henry Cai 
> > wrote:
> >
> > > @guozhang
> > >
> > > After we moved to kafka-1.1.0 for our Kafka streams application, our
> > broker
> > > logs are polluted with loggings such as:
> > >
> > > [2018-07-06 21:59:26,170] INFO Cleared earliest 0 entries from epoch
> > cache
> > > based on passed offset 301483601 leaving 1 in EpochFile for partition
> > > inflight_spend_unified_staging-single_spend_agg_
> > > window_AD_GROUP-repartition-26
> > > (kafka.server.epoch.LeaderEpochFileCache)
> > >
> > > [2018-07-06 21:59:26,170] INFO [Log
> > > partition=inflight_spend_unified_staging-single_spend_
> > agg_window_AD_GROUP-
> > > repartition-1,
> > > dir=/mnt/kafka] Incrementing log start offset to 240548684
> > (kafka.log.Log)
> > >
> > >
> > > Thousands of them keeping rolling in the broker logs which makes server
> > > side log unusable.
> > >
> > >
> > > Looks like this is triggered by DELETE_RECORDS requests from
> > StreamsThread
> > > from 'KAFKA-6150: Make Repartition Topics Transient'
> > >
> > >
> > > Can you suppress these two INFO loggings on the server side if they are
> > > triggered by AdminClient.deleteRecords()
> > >
> > >
> > > We have thousands of partitions per broker, those deletes were
> happening
> > > too frequent.
> > >
> >
> >
> >
> > --
> > -- Guozhang
> >
>



-- 
-- Guozhang


Re: broker logs unusable after KAFKA-6150: Make Repartition Topics Transient

2018-07-06 Thread Henry Cai
On server side, we use INFO for everything.

Log4j setting can be a temporary hack, but we would like to keep INFO
logging as the default.

I think those two logging lines can be just downgraded into DEBUG loggings,
moving start offset is not that eventful to be logged as INFO.

On Fri, Jul 6, 2018 at 4:08 PM, Guozhang Wang  wrote:

> Hello Henry,
>
> What's your server-side log4j settings? Could you use WARN on these two
> classes: kafka.server.epoch.LeaderEpochFileCache and kafka.log.Log.
>
>
>
> Guozhang
>
>
> On Fri, Jul 6, 2018 at 3:08 PM, Henry Cai 
> wrote:
>
> > @guozhang
> >
> > After we moved to kafka-1.1.0 for our Kafka streams application, our
> broker
> > logs are polluted with loggings such as:
> >
> > [2018-07-06 21:59:26,170] INFO Cleared earliest 0 entries from epoch
> cache
> > based on passed offset 301483601 leaving 1 in EpochFile for partition
> > inflight_spend_unified_staging-single_spend_agg_
> > window_AD_GROUP-repartition-26
> > (kafka.server.epoch.LeaderEpochFileCache)
> >
> > [2018-07-06 21:59:26,170] INFO [Log
> > partition=inflight_spend_unified_staging-single_spend_
> agg_window_AD_GROUP-
> > repartition-1,
> > dir=/mnt/kafka] Incrementing log start offset to 240548684
> (kafka.log.Log)
> >
> >
> > Thousands of them keeping rolling in the broker logs which makes server
> > side log unusable.
> >
> >
> > Looks like this is triggered by DELETE_RECORDS requests from
> StreamsThread
> > from 'KAFKA-6150: Make Repartition Topics Transient'
> >
> >
> > Can you suppress these two INFO loggings on the server side if they are
> > triggered by AdminClient.deleteRecords()
> >
> >
> > We have thousands of partitions per broker, those deletes were happening
> > too frequent.
> >
>
>
>
> --
> -- Guozhang
>


Re: broker logs unusable after KAFKA-6150: Make Repartition Topics Transient

2018-07-06 Thread Guozhang Wang
Hello Henry,

What's your server-side log4j settings? Could you use WARN on these two
classes: kafka.server.epoch.LeaderEpochFileCache and kafka.log.Log.



Guozhang


On Fri, Jul 6, 2018 at 3:08 PM, Henry Cai 
wrote:

> @guozhang
>
> After we moved to kafka-1.1.0 for our Kafka streams application, our broker
> logs are polluted with loggings such as:
>
> [2018-07-06 21:59:26,170] INFO Cleared earliest 0 entries from epoch cache
> based on passed offset 301483601 leaving 1 in EpochFile for partition
> inflight_spend_unified_staging-single_spend_agg_
> window_AD_GROUP-repartition-26
> (kafka.server.epoch.LeaderEpochFileCache)
>
> [2018-07-06 21:59:26,170] INFO [Log
> partition=inflight_spend_unified_staging-single_spend_agg_window_AD_GROUP-
> repartition-1,
> dir=/mnt/kafka] Incrementing log start offset to 240548684 (kafka.log.Log)
>
>
> Thousands of them keeping rolling in the broker logs which makes server
> side log unusable.
>
>
> Looks like this is triggered by DELETE_RECORDS requests from StreamsThread
> from 'KAFKA-6150: Make Repartition Topics Transient'
>
>
> Can you suppress these two INFO loggings on the server side if they are
> triggered by AdminClient.deleteRecords()
>
>
> We have thousands of partitions per broker, those deletes were happening
> too frequent.
>



-- 
-- Guozhang


broker logs unusable after KAFKA-6150: Make Repartition Topics Transient

2018-07-06 Thread Henry Cai
@guozhang

After we moved to kafka-1.1.0 for our Kafka streams application, our broker
logs are polluted with loggings such as:

[2018-07-06 21:59:26,170] INFO Cleared earliest 0 entries from epoch cache
based on passed offset 301483601 leaving 1 in EpochFile for partition
inflight_spend_unified_staging-single_spend_agg_window_AD_GROUP-repartition-26
(kafka.server.epoch.LeaderEpochFileCache)

[2018-07-06 21:59:26,170] INFO [Log
partition=inflight_spend_unified_staging-single_spend_agg_window_AD_GROUP-repartition-1,
dir=/mnt/kafka] Incrementing log start offset to 240548684 (kafka.log.Log)


Thousands of them keeping rolling in the broker logs which makes server
side log unusable.


Looks like this is triggered by DELETE_RECORDS requests from StreamsThread
from 'KAFKA-6150: Make Repartition Topics Transient'


Can you suppress these two INFO loggings on the server side if they are
triggered by AdminClient.deleteRecords()


We have thousands of partitions per broker, those deletes were happening
too frequent.