[ 
https://issues.apache.org/jira/browse/NUTCH-2947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17579781#comment-17579781
 ] 

Hudson commented on NUTCH-2947:
-------------------------------

SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #81 (See 
[https://ci-builds.apache.org/job/Nutch/job/Nutch-trunk/81/])
NUTCH-2947 Fetcher: keep state of empty but stateful fetch queues (snagel: 
[https://github.com/apache/nutch/commit/c862d24093554a980f706abaaa7f36fee7a03aa1])
* (edit) src/java/org/apache/nutch/fetcher/FetchItemQueues.java
* (edit) src/java/org/apache/nutch/fetcher/QueueFeeder.java
NUTCH-2947 Fetcher: keep state of empty but stateful fetch queues (snagel: 
[https://github.com/apache/nutch/commit/8cfa53f7db59d1aecf0718141d4f4f9d000fab2d])
* (edit) src/java/org/apache/nutch/fetcher/FetchItemQueues.java


> Fetcher: keep state of empty fetch queues unless queue feeder is finished
> -------------------------------------------------------------------------
>
>                 Key: NUTCH-2947
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2947
>             Project: Nutch
>          Issue Type: Bug
>          Components: fetcher
>    Affects Versions: 1.18
>            Reporter: Sebastian Nagel
>            Assignee: Sebastian Nagel
>            Priority: Major
>             Fix For: 1.19
>
>
> If a fetch queue is empty (containing no fetch items) it may be removed from 
> the list of queues. This also removes the state of the fetch queue, namely the 
> next fetch time and the exception counter. If the queue feeder is still 
> active, it may happen that the same queue (i.e. associated with the same 
> host/domain/IP) that was removed before is created again. In this case, certain 
> aspects of fetcher politeness can no longer be guaranteed:
> - the fetch delay (via earliest next fetch time) and
> - the mechanism to block fetching from the same host/domain/IP with too many 
> exceptions (NUTCH-769).
> The issue was observed while verifying NUTCH-2946 in the fetcher logs:
> {noformat}
> ... 10:19:16,912 * queue foo.bar >> delayed next fetch by 50000 ms after 1 
> exceptions in queue
> ... 10:20:16,250 * queue foo.bar >> delayed next fetch by 79248 ms after 2 
> exceptions in queue
> ... 10:21:52,675 * queue foo.bar >> delayed next fetch by 50000 ms after 1 
> exceptions in queue
> ... 10:25:40,931 * queue foo.bar >> delayed next fetch by 50000 ms after 1 
> exceptions in queue
> ... 10:27:45,066 * queue foo.bar >> delayed next fetch by 79248 ms after 2 
> exceptions in queue
> ... 10:29:40,407 * queue foo.bar >> delayed next fetch by 100000 ms after 3 
> exceptions in queue
> ... 10:41:48,870 * queue foo.bar >> delayed next fetch by 50000 ms after 1 
> exceptions in queue
> ... 10:47:54,946 * queue foo.bar >> delayed next fetch by 50000 ms after 1 
> exceptions in queue
> ... 10:52:46,792 * queue foo.bar >> delayed next fetch by 50000 ms after 1 
> exceptions in queue
> ... 10:57:43,470 * queue foo.bar >> delayed next fetch by 50000 ms after 1 
> exceptions in queue
> ... 11:01:12,220 * queue foo.bar >> delayed next fetch by 50000 ms after 1 
> exceptions in queue
> ... 11:04:24,621 * queue foo.bar >> delayed next fetch by 50000 ms after 1 
> exceptions in queue
> ... 11:18:40,398 * queue foo.bar >> delayed next fetch by 50000 ms after 1 
> exceptions in queue
> ... 11:21:09,437 * queue foo.bar >> delayed next fetch by 50000 ms after 1 
> exceptions in queue
> ... 11:34:36,052 * queue foo.bar >> delayed next fetch by 50000 ms after 1 
> exceptions in queue
> ... 11:39:17,898 * queue foo.bar >> delayed next fetch by 50000 ms after 1 
> exceptions in queue
> ... 11:40:35,472 * queue foo.bar >> delayed next fetch by 50000 ms after 1 
> exceptions in queue
> ... 11:50:34,224 * queue foo.bar >> delayed next fetch by 50000 ms after 1 
> exceptions in queue
> ... 11:51:27,547 * queue foo.bar >> delayed next fetch by 50000 ms after 1 
> exceptions in queue
> ... 11:53:04,783 * queue foo.bar >> delayed next fetch by 50000 ms after 1 
> exceptions in queue
> ... 11:54:04,404 * queue foo.bar >> delayed next fetch by 79248 ms after 2 
> exceptions in queue
> ... 11:55:38,232 * queue foo.bar >> delayed next fetch by 100000 ms after 3 
> exceptions in queue
> ... 11:57:37,942 * queue foo.bar >> delayed next fetch by 116096 ms after 4 
> exceptions in queue
> ... 12:01:08,619 * queue foo.bar >> delayed next fetch by 50000 ms after 1 
> exceptions in queue
> ... 12:02:35,985 * queue foo.bar >> delayed next fetch by 50000 ms after 1 
> exceptions in queue
> {noformat}
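
The effect described in the quoted issue can be illustrated with a minimal sketch. This is not the code of FetchItemQueues.java; the class QueueStateSketch, its methods, and the flag keepStateWhileFeeding are hypothetical names introduced only for illustration. The delay formula is likewise an assumption: it is inferred from the logged values above (50000, 79248, 100000, 116096 ms match a base delay of 50000 ms multiplied by log2(1 + exceptions)), not taken from the Nutch sources.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of per-queue state handling; not Nutch's FetchItemQueues. */
class QueueStateSketch {

  /** State that is lost when an empty queue is removed. */
  static final class QueueState {
    int pending = 0;          // fetch items currently in the queue
    int exceptions = 0;       // exception counter
    long nextFetchTime = 0L;  // earliest next fetch time in ms
  }

  private final Map<String, QueueState> queues = new HashMap<>();
  private final boolean keepStateWhileFeeding; // assumed switch: false = old behaviour
  private boolean feederAlive = true;

  QueueStateSketch(boolean keepStateWhileFeeding) {
    this.keepStateWhileFeeding = keepStateWhileFeeding;
  }

  /** The queue feeder adds an item; a missing queue is created with fresh state. */
  void addItem(String queueId) {
    queues.computeIfAbsent(queueId, k -> new QueueState()).pending++;
  }

  /** Records a failed fetch and returns the exception count seen for this queue. */
  int failItem(String queueId, long now, long baseDelayMs) {
    QueueState q = queues.get(queueId);
    q.pending--;
    q.exceptions++;
    q.nextFetchTime = now + delayMs(q.exceptions, baseDelayMs);
    int count = q.exceptions;
    // Remove the empty queue unless its state must be kept for the feeder.
    boolean keep = keepStateWhileFeeding && feederAlive;
    if (q.pending == 0 && !keep) {
      queues.remove(queueId);
    }
    return count;
  }

  /** Once the feeder is finished, empty queues cannot reappear and may be dropped. */
  void feederFinished() {
    feederAlive = false;
    queues.entrySet().removeIf(e -> e.getValue().pending == 0);
  }

  /** Assumed delay formula, chosen only because it reproduces the logged values. */
  static long delayMs(int exceptions, long baseDelayMs) {
    return Math.round(baseDelayMs * Math.log(1.0 + exceptions) / Math.log(2.0));
  }

  public static void main(String[] args) {
    for (boolean keep : new boolean[] {false, true}) {
      QueueStateSketch s = new QueueStateSketch(keep);
      for (int i = 0; i < 3; i++) {
        s.addItem("foo.bar"); // one item at a time, so the queue empties each round
        int n = s.failItem("foo.bar", 0L, 50000L);
        System.out.println("keepState=" + keep + " exceptions=" + n
            + " delay=" + delayMs(n, 50000L) + " ms");
      }
    }
  }
}
```

With keepStateWhileFeeding set to false, the counter starts over each time the queue runs empty and is recreated, which corresponds to the repeated "after 1 exceptions" lines in the log. With the flag set to true, the counter keeps growing until the feeder has finished.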



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
