[ 
https://issues.apache.org/jira/browse/KAFKA-612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13499572#comment-13499572
 ] 

Neha Narkhede edited comment on KAFKA-612 at 11/18/12 1:03 AM:
---------------------------------------------------------------

Overall, the fix looks good, here are a few review comments -

1. AbstractFetcherThread
1.1 I'm trying to understand what can cause a particular topic partition from 
the response object to have no offset in the fetchMap. Right now, we ignore 
this case altogether. I'm guessing if the leader changes after the broker sends 
the fetch response, this could happen, which is ok. Wondering if this is what 
was intended or if we should log and handle the error case ?
1.2 hasPartition and partCount could use a refactor. You can return a value 
from a try catch block, the finally block will still execute and unlock the 
fetch map lock.

2. ReplicaManager
The right way to measure leader election latency is at the controller by 
measuring the time after it sent a leader/isr request to the time it received 
the corresponding ACK fro
m the broker. Given that, we should probably move the following lines at the 
end of handleLeaderAndIsrRequest in KafkaApis.
    info("Completed leader and isr request %s".format(leaderAndISRRequest))
    replicaFetcherManager.shutdownEmptyFetcherThread()
 
3. AbstractFetcherThread 
3.1 fetchMap is a confusing name, this map is keeping track of fetch offsets 
per topic partition. I haven't looked at this code in much detail before, but 
when I read it while 
reviewing this patch, I thought the names of the maps could be improved. For 
example, the map in AbstractFetcherManager is called fetcherThreadMap and the 
map in this class is 
called fetcherMap.
3.2 How about renaming shutdownEmptyFetcherThread to shutdownIdleFetcherThreads 
?

                
      was (Author: nehanarkhede):
    Overall, the fix looks good, here are a few review comments -

1. AbstractFetcherThread
1.1 I'm trying to understand what can cause a particular topic partition from 
the response object to have no offset in the fetchMap. Right now, we ignore 
this case altogether. I'm guessing if the leader changes after the broker sends 
the fetch response, this could happen, which is ok. Wondering if this is what 
was intended or if we should log and handle the error case ?

1.2 hasPartition and partCount could use a refactor. You can return a value 
from a try catch block, the finally block will still execute and unlock the 
fetch map lock.
2. ReplicaManager

The right way to measure leader election latency is at the controller by 
measuring the time after it sent a leader/isr request to the time it received 
the corresponding ACK fro
m the broker. Given that, we should probably move the following lines at the 
end of handleLeaderAndIsrRequest in KafkaApis.

    info("Completed leader and isr request %s".format(leaderAndISRRequest))
    replicaFetcherManager.shutdownEmptyFetcherThread()
           3. AbstractFetcherThread 
3.1 fetchMap is a confusing name, this map is keeping track of fetch offsets 
per topic partition. I haven't looked at this code in much detail before, but 
when I read it while 
reviewing this patch, I thought the names of the maps could be improved. For 
example, the map in AbstractFetcherManager is called fetcherThreadMap and the 
map in this class is 
called fetcherMap.
3.2 How about renaming shutdownEmptyFetcherThread to shutdownIdleFetcherThreads 
?

                  
> move shutting down of fetcher thread out of critical path
> ---------------------------------------------------------
>
>                 Key: KAFKA-612
>                 URL: https://issues.apache.org/jira/browse/KAFKA-612
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.8
>            Reporter: Jun Rao
>            Assignee: Jun Rao
>         Attachments: kafka-612.patch
>
>
> Shutting down a fetch thread seems to take more than 200ms since we need to 
> interrupt the thread. Currently, we shutdown fetcher threads while processing 
> a leaderAndIsr request. This can delay some of the partitions to become a 
> leader.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to