[ https://issues.apache.org/jira/browse/NUTCH-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859299#comment-13859299 ]
Tien Nguyen Manh commented on NUTCH-1687:
-----------------------------------------

1. It seems redundant in this context.
2. I added the id so that the queues map can quickly delete a FetchItemQueue by its id; otherwise we would have to iterate from the start of queues.

> Pick queue in Round Robin
> -------------------------
>
>                 Key: NUTCH-1687
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1687
>             Project: Nutch
>          Issue Type: Improvement
>          Components: fetcher
>            Reporter: Tien Nguyen Manh
>            Priority: Minor
>             Fix For: 2.3, 1.8
>
>         Attachments: NUTCH-1687.patch
>
>
> Currently we choose the queue to pick a URL from by scanning from the start of
> the queues list, so queues at the start of the list have a higher chance of
> being picked first. That can cause a long-tail problem, where only a few
> queues remain available at the end and they still hold many URLs.
> public synchronized FetchItem getFetchItem() {
>   final Iterator<Map.Entry<String, FetchItemQueue>> it =
>     queues.entrySet().iterator(); ==> always resets to search for a queue from the start
>   while (it.hasNext()) {
> ....
> I think it is better to pick queues in round robin. That can reduce the time
> needed to find an available queue and ensures that all queues are picked in
> turn; and if we use TopN during generation, there is no long tail of queues
> at the end.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
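
To illustrate the round-robin idea proposed in the issue, here is a minimal sketch. It is not the attached patch and not Nutch code: the class `RoundRobinQueues`, its methods `add` and `next`, and the use of plain strings for URLs are all hypothetical. It also leaves out the fetcher's politeness checks (a queue that is temporarily unavailable is not skipped here). The sketch keeps a rotation of queue ids next to the id-to-queue map, so an empty queue can be dropped by its id without scanning from the start.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the fetcher's queue container (not Nutch code). */
class RoundRobinQueues {
    // Queue id (for example a host name) mapped to its pending URLs.
    private final Map<String, Deque<String>> queues = new HashMap<>();
    // Rotation order of queue ids; every id listed here has a non-empty queue.
    private final Deque<String> order = new ArrayDeque<>();

    /** Appends a URL to the queue with the given id, creating the queue if needed. */
    synchronized void add(String queueId, String url) {
        Deque<String> q = queues.get(queueId);
        if (q == null) {
            q = new ArrayDeque<>();
            queues.put(queueId, q);
            order.addLast(queueId); // a new queue joins the end of the rotation
        }
        q.addLast(url);
    }

    /** Returns the next URL in round-robin order, or null if nothing is queued. */
    synchronized String next() {
        String id = order.pollFirst();
        if (id == null) {
            return null;
        }
        Deque<String> q = queues.get(id);
        String url = q.pollFirst();
        if (q.isEmpty()) {
            queues.remove(id);  // delete the exhausted queue directly by its id
        } else {
            order.addLast(id);  // otherwise it goes to the back of the rotation
        }
        return url;
    }
}
```

Because each call resumes from the head of the rotation instead of restarting at the first queue, a queue that was just served waits until every other non-empty queue has had its turn.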