[ https://issues.apache.org/jira/browse/NUTCH-2947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17579781#comment-17579781 ]

Hudson commented on NUTCH-2947:
-------------------------------

SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #81 (See [https://ci-builds.apache.org/job/Nutch/job/Nutch-trunk/81/])
NUTCH-2947 Fetcher: keep state of empty but stateful fetch queues (snagel: [https://github.com/apache/nutch/commit/c862d24093554a980f706abaaa7f36fee7a03aa1])
* (edit) src/java/org/apache/nutch/fetcher/FetchItemQueues.java
* (edit) src/java/org/apache/nutch/fetcher/QueueFeeder.java
NUTCH-2947 Fetcher: keep state of empty but stateful fetch queues (snagel: [https://github.com/apache/nutch/commit/8cfa53f7db59d1aecf0718141d4f4f9d000fab2d])
* (edit) src/java/org/apache/nutch/fetcher/FetchItemQueues.java


> Fetcher: keep state of empty fetch queues unless queue feeder is finished
> -------------------------------------------------------------------------
>
>                 Key: NUTCH-2947
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2947
>             Project: Nutch
>          Issue Type: Bug
>          Components: fetcher
>    Affects Versions: 1.18
>            Reporter: Sebastian Nagel
>            Assignee: Sebastian Nagel
>            Priority: Major
>             Fix For: 1.19
>
>
> If a fetch queue is empty (containing no fetch items) it may be removed from
> the list of queues. This also removes the state of the fetch queue, namely the
> next fetch time and the exception counter. If the queue feeder is still
> active, it may happen that a queue removed earlier (i.e. one associated with
> the same host/domain/IP) is created again. In this case, certain aspects of
> fetcher politeness can no longer be guaranteed:
> - the fetch delay (via earliest next fetch time) and
> - the mechanism to block fetching from the same host/domain/IP with too many
> exceptions (NUTCH-769).
> The issue was observed in the fetcher logs while verifying NUTCH-2946. Note
> how the exception counter of the queue repeatedly falls back to 1:
> {noformat}
> ... 10:19:16,912 * queue foo.bar >> delayed next fetch by 50000 ms after 1 exceptions in queue
> ... 10:20:16,250 * queue foo.bar >> delayed next fetch by 79248 ms after 2 exceptions in queue
> ... 10:21:52,675 * queue foo.bar >> delayed next fetch by 50000 ms after 1 exceptions in queue
> ... 10:25:40,931 * queue foo.bar >> delayed next fetch by 50000 ms after 1 exceptions in queue
> ... 10:27:45,066 * queue foo.bar >> delayed next fetch by 79248 ms after 2 exceptions in queue
> ... 10:29:40,407 * queue foo.bar >> delayed next fetch by 100000 ms after 3 exceptions in queue
> ... 10:41:48,870 * queue foo.bar >> delayed next fetch by 50000 ms after 1 exceptions in queue
> ... 10:47:54,946 * queue foo.bar >> delayed next fetch by 50000 ms after 1 exceptions in queue
> ... 10:52:46,792 * queue foo.bar >> delayed next fetch by 50000 ms after 1 exceptions in queue
> ... 10:57:43,470 * queue foo.bar >> delayed next fetch by 50000 ms after 1 exceptions in queue
> ... 11:01:12,220 * queue foo.bar >> delayed next fetch by 50000 ms after 1 exceptions in queue
> ... 11:04:24,621 * queue foo.bar >> delayed next fetch by 50000 ms after 1 exceptions in queue
> ... 11:18:40,398 * queue foo.bar >> delayed next fetch by 50000 ms after 1 exceptions in queue
> ... 11:21:09,437 * queue foo.bar >> delayed next fetch by 50000 ms after 1 exceptions in queue
> ... 11:34:36,052 * queue foo.bar >> delayed next fetch by 50000 ms after 1 exceptions in queue
> ... 11:39:17,898 * queue foo.bar >> delayed next fetch by 50000 ms after 1 exceptions in queue
> ... 11:40:35,472 * queue foo.bar >> delayed next fetch by 50000 ms after 1 exceptions in queue
> ... 11:50:34,224 * queue foo.bar >> delayed next fetch by 50000 ms after 1 exceptions in queue
> ... 11:51:27,547 * queue foo.bar >> delayed next fetch by 50000 ms after 1 exceptions in queue
> ... 11:53:04,783 * queue foo.bar >> delayed next fetch by 50000 ms after 1 exceptions in queue
> ... 11:54:04,404 * queue foo.bar >> delayed next fetch by 79248 ms after 2 exceptions in queue
> ... 11:55:38,232 * queue foo.bar >> delayed next fetch by 100000 ms after 3 exceptions in queue
> ... 11:57:37,942 * queue foo.bar >> delayed next fetch by 116096 ms after 4 exceptions in queue
> ... 12:01:08,619 * queue foo.bar >> delayed next fetch by 50000 ms after 1 exceptions in queue
> ... 12:02:35,985 * queue foo.bar >> delayed next fetch by 50000 ms after 1 exceptions in queue
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)