[ 
https://issues.apache.org/jira/browse/KAFKA-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15364847#comment-15364847
 ] 

ASF GitHub Bot commented on KAFKA-3857:
---------------------------------------

GitHub user kiranptivo opened a pull request:

    https://github.com/apache/kafka/pull/1593

    KAFKA-3857 Additional log cleaner metrics

    Fixes KAFKA-3857
    
    Changes proposed in this pull request:
    
    The following log cleaner metrics have been added:
    1. num-runs: Cumulative number of successful log cleaner runs since the
       last broker restart.
    2. last-run-time: Time of the last log cleaner run.
    3. num-filthy-logs: Number of filthy logs. A nonzero value for an extended
       period of time indicates that the cleaner has not been successful in
       cleaning the logs.
    
    A note on num-filthy-logs: it is incremented whenever a filthy topic
    partition is added to the inProgress HashMap, and decremented once the
    cleaning succeeds or is aborted. The existing LogCleaner code does not
    provide a metric for checking whether a clean operation succeeded. There
    is an inProgress HashMap with topicPartition => LogCleaningInProgress
    entries, but the entries are removed from the HashMap even when the clean
    operation throws an exception. I therefore added the num-filthy-logs
    metric to differentiate a successful log clean from an exception case.
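
The bookkeeping described in the note on num-filthy-logs can be sketched roughly as follows. This is a minimal illustration in Java, not the code from the pull request (which modifies Scala sources); the class `FilthyLogTracker` and all of its method names are hypothetical, and partitions are represented as plain strings for brevity. The point it shows is that a failed clean removes the map entry but leaves the counter raised.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the bookkeeping described above; names are assumptions.
class FilthyLogTracker {
    // Plays the role of the inProgress map: topicPartition => cleaning state.
    private final Map<String, String> inProgress = new HashMap<>();
    // Backing value for a num-filthy-logs style gauge.
    private final AtomicInteger numFilthyLogs = new AtomicInteger();

    // A filthy partition is grabbed for cleaning: track it and raise the counter.
    synchronized void grab(String topicPartition) {
        if (inProgress.put(topicPartition, "LogCleaningInProgress") == null) {
            numFilthyLogs.incrementAndGet();
        }
    }

    // Cleaning finished successfully: drop the entry and lower the counter.
    synchronized void doneCleaning(String topicPartition) {
        if (inProgress.remove(topicPartition) != null) {
            numFilthyLogs.decrementAndGet();
        }
    }

    // Cleaning was aborted: treated like a successful finish for the counter.
    synchronized void abortCleaning(String topicPartition) {
        if (inProgress.remove(topicPartition) != null) {
            numFilthyLogs.decrementAndGet();
        }
    }

    // Cleaning threw an exception: the entry is removed, the counter is NOT lowered,
    // so the gauge stays nonzero and exposes the failure.
    synchronized void cleaningFailed(String topicPartition) {
        inProgress.remove(topicPartition);
    }

    int numFilthyLogs() {
        return numFilthyLogs.get();
    }

    synchronized int inProgressCount() {
        return inProgress.size();
    }
}
```

With three partitions grabbed, one finished, one aborted, and one failed, the map ends up empty while the counter remains at one, which is the signal the metric is meant to surface.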
    
    The code is ready. I have tested and verified the JMX metrics. There is
    one case I couldn't test, though: the one where numFilthyLogs is
    decremented in 'resumeCleaning(...)' at LogCleanerManager.scala line 188.
    It seems to be part of the workflow that aborts the cleaning of a
    particular partition. Any ideas on how to test this scenario?

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/TiVo/kafka log_cleaner_jmx_metrics

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/1593.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1593
    
----
commit f00de412f6b1f6568adef479687ae0df789f9c96
Author: Kiran Pillarisetty <pillarise...@pillarisetty-mbpr.tivo.com>
Date:   2016-06-14T17:40:26Z

    Create a couple of additional Log Cleaner JMX metrics
    log-clean-last-run: Log cleaner's last run time
    log-clean-runs: Number of log cleaner runs.

commit 7dc7511ee2b6d3cdf9df0c366fe23bf34d062a54
Author: Kiran Pillarisetty <pillarise...@tivo.com>
Date:   2016-06-14T20:24:00Z

    Created a couple of additional Log Cleaner JMX metrics
    log-clean-last-run: a metric to track last log cleaner run (unix timestamp)
    log-clean-runs: a metric to track number of log cleaner runs
    
    Committer: Kiran Pillarisetty <pillarise...@tivo.com>

commit 7f1214ff1118103dd639df717e988a22bad8033d
Author: Kiran Pillarisetty <pillarise...@tivo.com>
Date:   2016-07-01T22:14:57Z

    Add additional JMX metric to track successful cleaning of a log segment

commit 1ac346bb37008312e41035167dbfd75803595cd6
Author: Kiran Pillarisetty <pillarise...@tivo.com>
Date:   2016-07-01T22:17:25Z

    Add additional JMX metric to track successful cleaning of a log segment

commit 4f08d875e05c35bd7d7c849584b8b029031f884b
Author: Kiran Pillarisetty <pillarise...@tivo.com>
Date:   2016-07-05T22:23:20Z

    Metric name updated to num-filthy-logs. Metric incremented as it is grabbed
    for cleaning, and decremented once the cleaning is done, or if the cleaning
    is aborted

commit cd887c05bf1d56b7566c5b72b3ddf3bcdfb70898
Author: Kiran Pillarisetty <pillarise...@tivo.com>
Date:   2016-07-05T23:31:32Z

    Changed a metric name (number-of-runs to num-runs). Removed an extra \n
    around line 164. It is not present in the trunk

----


> Additional log cleaner metrics
> ------------------------------
>
>                 Key: KAFKA-3857
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3857
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Kiran Pillarisetty
>
> The proposal would be to add a couple of additional log cleaner metrics: 
> 1. Time of last log cleaner run 
> 2. Cumulative number of successful log cleaner runs since last broker restart.
> Existing log cleaner metrics (max-buffer-utilization-percent, 
> cleaner-recopy-percent, max-clean-time-secs, max-dirty-percent) do not 
> differentiate an idle log cleaner from a dead log cleaner. It would be useful 
> to have the above two metrics added, to indicate whether log cleaner is alive 
> (and successfully cleaning) or not.
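
As an illustration of how a monitor might use these metrics to tell a live cleaner from a dead or failing one, here is a minimal Java sketch. Everything in it is an assumption made for illustration: the class `CleanerLivenessCheck`, the `Status` values, the idea of a maximum tolerated silence, and the rule that a run of consecutive nonzero num-filthy-logs samples means failure. How the metric values are fetched (for example over JMX) is left out.

```java
import java.time.Duration;

// Hypothetical monitor-side check; names, thresholds, and rules are assumptions.
class CleanerLivenessCheck {
    enum Status { HEALTHY, STALE, FAILING }

    // nowMs and lastRunTimeMs are Unix timestamps in milliseconds.
    // recentFilthySamples holds the latest num-filthy-logs readings, oldest first.
    static Status classify(long nowMs, long lastRunTimeMs,
                           long[] recentFilthySamples, Duration maxSilence) {
        // No run reported within the tolerated window: the cleaner may be dead.
        if (nowMs - lastRunTimeMs > maxSilence.toMillis()) {
            return Status.STALE;
        }
        // Nonzero in every recent sample: logs were grabbed but never cleaned.
        boolean alwaysFilthy = recentFilthySamples.length > 0;
        for (long sample : recentFilthySamples) {
            if (sample <= 0) {
                alwaysFilthy = false;
                break;
            }
        }
        return alwaysFilthy ? Status.FAILING : Status.HEALTHY;
    }
}
```

A single nonzero reading is not treated as a failure here, since a cleaning pass that is simply in progress would also raise the value; only a sustained nonzero series is flagged.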



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
