[ 
https://issues.apache.org/jira/browse/NUTCH-44?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12569428#action_12569428
 ] 

Andrzej Bialecki  commented on NUTCH-44:
----------------------------------------

The name of the property is somewhat misleading, because it applies to both the 
Web GUI and the OpenSearch servlet. Can we come up with a better (and shorter 
;) ) name?

Also, this patch doesn't solve the whole issue, though it addresses the 
specific scenario described by the reporter. In general, even if hitsPerPage is 
small, it is still very expensive to retrieve a page of results far down the 
list, e.g. results 10000-10010. Currently Nutch will attempt to retrieve 10 
results no matter what the starting point is, which is a potential vector for 
a DoS attack. Still, we can fix this issue first and address that problem in 
a new issue.
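
To make the two limits concrete, here is a rough sketch of the kind of clamping 
I have in mind. It is only an illustration: the class name, method names and 
limit values are all hypothetical and are not taken from the attached patch or 
from the existing Nutch code.

```java
// Hypothetical helper, for illustration only. None of these names exist in
// Nutch; the limits would presumably come from configuration properties.
final class SearchLimits {

    private SearchLimits() {}

    // Caps the page size requested through the hitsPerPage URL parameter.
    // Values below 1 are raised to 1 so that a request always asks for
    // at least one hit.
    static int clampHitsPerPage(int requested, int maxHitsPerPage) {
        if (requested < 1) {
            return 1;
        }
        return Math.min(requested, maxHitsPerPage);
    }

    // Caps the starting offset so that start + hitsPerPage never exceeds
    // maxTotalHits. This bounds the cost of requests for pages far down
    // the result list.
    static int clampStart(int requestedStart, int hitsPerPage, int maxTotalHits) {
        if (requestedStart < 0) {
            return 0;
        }
        int lastAllowedStart = Math.max(0, maxTotalHits - hitsPerPage);
        return Math.min(requestedStart, lastAllowedStart);
    }
}
```

With a page-size limit of 1000, a request for hitsPerPage=20000 would be served 
as 1000 hits. With a total limit of 1000 and a page size of 10, a request 
starting at 10000 would be pulled back to start at 990.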

> too many search results
> -----------------------
>
>                 Key: NUTCH-44
>                 URL: https://issues.apache.org/jira/browse/NUTCH-44
>             Project: Nutch
>          Issue Type: Bug
>          Components: web gui
>         Environment: web environment
>            Reporter: Emilijan Mirceski
>            Assignee: Dennis Kubes
>         Attachments: NUTCH-44.patch
>
>
> There should be a limitation (user defined) on the number of results the 
> search engine can return. 
> For example, if one modifies the search URL as:
> http://<my>/search.jsp?query=<some query>&hitsPerPage=20000&hitsPerSite=0
> The search will try to return 20,000 pages which isn't good for the server 
> side performance. 
> Is it possible to have a setting in the config xml files to control this?
> Thanks,
> Emilijan
