[ http://issues.apache.org/jira/browse/NUTCH-101?page=all ]
Fuad Efendi updated NUTCH-101:
--
Version: 0.6
0.7.1
> RobotRulesParser
>
>
> Key: NUTCH-101
> URL: http://issues.apache.org/jira/browse/NUTCH-101
>
[ http://issues.apache.org/jira/browse/NUTCH-101?page=comments#action_12331658 ]
Fuad Efendi commented on NUTCH-101:
---
1. There is a bug in method parseRules(byte[] robotContent):
...
StringTokenizer lineParser= new StringTokenizer(content, "\n\r");
...
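The comment above does not spell out the bug, so what follows is only a sketch of one property of the quoted call, not a statement of what NUTCH-101 actually reports. StringTokenizer never returns empty tokens, so with the delimiter set "\n\r" a CRLF pair and any run of blank lines collapse into a single separator. Blank lines, which conventionally separate records in robots.txt, are therefore invisible to code that reads lines this way. The class TokenizerDemo, its method names, and the sample robots.txt text are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

public class TokenizerDemo {

    // Reads lines the way the quoted statement does: every run of
    // '\n' and '\r' characters acts as one separator, so empty
    // lines are never returned as tokens.
    static List<String> tokenizerLines(String content) {
        List<String> lines = new ArrayList<>();
        StringTokenizer lineParser = new StringTokenizer(content, "\n\r");
        while (lineParser.hasMoreTokens()) {
            lines.add(lineParser.nextToken());
        }
        return lines;
    }

    // Hypothetical alternative (an assumption, not the Nutch fix):
    // split on CRLF, CR, or LF and keep empty strings, so blank
    // lines survive as "" entries.
    static List<String> splitLines(String content) {
        return Arrays.asList(content.split("\r\n|\r|\n", -1));
    }

    public static void main(String[] args) {
        // Made-up robots.txt content with a blank line between records.
        String robots = "User-agent: a\r\nDisallow: /x\r\n"
                      + "\r\n"
                      + "User-agent: b\r\nDisallow: /y\r\n";

        System.out.println(tokenizerLines(robots).size()); // blank line dropped
        System.out.println(splitLines(robots).size());     // blank lines kept
    }
}
```

Whether dropping blank lines matters depends on how parseRules uses them; the sketch only makes the tokenizer's behaviour visible so it can be checked against the parser's expectations.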
This explains pretty well what I am talking about:
http://blogs.sun.com/roller/page/ahl/20050418#dtracing_java
Earl
--- Earl Cahill <[EMAIL PROTECTED]> wrote:
> Just about to switch a box over to solaris 10, in part
> so I can try and help out with nutch profiling via
> dtrace. Wondering if a
> We are finding that the fetcher is crawling extremely slowly.
I am going to run some performance tests over this long weekend on my
home network, with Apache HTTPD and a browsable copy of
www.apache.org:
1. Nutch-0.7.1 with protocol-http plugin
2. Nutch-0.7.1 with protocol-httpclient
2. Mo