[ http://issues.apache.org/jira/browse/NUTCH-413?page=comments#action_12456832 ]

Dogacan Güney commented on NUTCH-413:
-------------------------------------

Are you sure about this? Running the fetcher (latest trunk) with the -noParsing option does not create any parse segments, while running the fetcher without it does create them. I even put the fetcher.parse property in nutch-site.xml (assuming that nutch-site.xml overrides command-line options), and it still works as expected.

> Fetcher ignores -noParsing command line option
> ----------------------------------------------
>
>                 Key: NUTCH-413
>                 URL: http://issues.apache.org/jira/browse/NUTCH-413
>             Project: Nutch
>          Issue Type: Bug
>          Components: fetcher
>    Affects Versions: 0.8.1
>         Environment: Fedora Core 6, nutch 0.8.1
>            Reporter: Jonathan Amir
>
> I believe that the patch applied in NUTCH-337 broke the fetcher. Now the
> fetcher ignores the -noParsing command-line option - the parsing occurs
> anyway.
> To the best of my understanding of nutch, I managed to trace the problem as
> follows in the code:
> In the Fetcher class, in line 473, -noParsing is evaluated properly and placed
> into a Configuration created by NutchConfiguration.create(). So far so good.
> In the same file, in line 280, the decision whether to parse or not depends
> on the local field "parsing". During execution, this field's value is true
> instead of false. This field is set to true by the method "configure", in line
> 357. The problem is that the method "configure" accepts a JobConf as a parameter,
> but the actual JobConf object that is passed to it is not the one used
> previously in line 473.
> The one that is actually passed to configure is a different object. I think
> it is created in line 422, but I am not sure about it.
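
For readers following the reporter's trace, here is a minimal sketch of the failure mode he suspects: a flag is stored on one configuration object, while the task is configured from a different one. It is not the Nutch code and has not been checked against trunk. java.util.Properties stands in for Configuration/JobConf, the class and method names are hypothetical, and the default of "true" for fetcher.parse is an assumption made for the illustration.

```java
import java.util.Properties;

// Hypothetical illustration only: none of these types are the real Nutch or
// Hadoop classes. Properties plays the role of Configuration/JobConf.
public class ConfigMismatchSketch {

    /** Mimics a task that reads its settings in configure(). */
    static class Task {
        private boolean parsing;

        void configure(Properties job) {
            // Assumed default: parse unless the key says otherwise.
            parsing = Boolean.parseBoolean(job.getProperty("fetcher.parse", "true"));
        }

        boolean isParsing() {
            return parsing;
        }
    }

    /** Suspected flow: the flag goes into one object, the task sees another. */
    static boolean runWithFreshConfig(boolean noParsing) {
        Properties cliConf = new Properties();
        if (noParsing) {
            cliConf.setProperty("fetcher.parse", "false");
        }
        Properties jobConf = new Properties(); // fresh object, flag never copied
        Task task = new Task();
        task.configure(jobConf);
        return task.isParsing();
    }

    /** Expected flow: the job configuration carries over the flag. */
    static boolean runWithSharedConfig(boolean noParsing) {
        Properties cliConf = new Properties();
        if (noParsing) {
            cliConf.setProperty("fetcher.parse", "false");
        }
        Properties jobConf = new Properties();
        jobConf.putAll(cliConf); // settings from the command line survive
        Task task = new Task();
        task.configure(jobConf);
        return task.isParsing();
    }

    public static void main(String[] args) {
        System.out.println("fresh config, -noParsing:  parsing=" + runWithFreshConfig(true));
        System.out.println("shared config, -noParsing: parsing=" + runWithSharedConfig(true));
    }
}
```

If the fetcher really built its job configuration the first way, -noParsing would be lost as described in the report. The comment above suggests that trunk behaves like the second variant, so the sketch only shows what to look for when comparing the two code paths.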