Fetcher ignores -noParsing command line option
----------------------------------------------
Key: NUTCH-413
URL: http://issues.apache.org/jira/browse/NUTCH-413
Project: Nutch
Issue Type: Bug
Components: fetcher
Affects Versions: 0.8.1
Environment: Fedora Core 6, nutch 0.8.1
Reporter: Jonathan Amir
I believe that the patch applied in NUTCH-337 broke the fetcher. Now the
fetcher ignores the -noParsing command-line option - the parsing occurs anyway.
To the best of my understanding of nutch, I managed to trace the problem as
follows in the code:
In the Fetcher class, at line 473, -noParsing is evaluated properly and placed
into a Configuration created by NutchConfiguration.create(). So far so good.
In the same file, at line 280, the decision whether to parse or not depends on
the local field "parsing". During execution, this field's value is true instead
of false. The field is set to true by the method "configure", at line 357. The
problem is that "configure" accepts a JobConf as a parameter, but the
actual JobConf object that is passed to it is not the one used previously at
line 473.
The one that is actually passed to configure is a different object. I think it
is created in line 422, but I am not sure about it.
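
To illustrate the suspected failure mode, here is a minimal, self-contained
sketch. It does not use the real Nutch or Hadoop classes: SketchConf,
SketchFetcher and ConfMismatchSketch are hypothetical stand-ins, and the
property key "fetcher.parse" and its default of true are assumptions made for
the example. It only shows how a flag set on one configuration object is lost
when a different, freshly created object is handed to configure().

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a Hadoop-style configuration (not the real class).
class SketchConf {
    private final Map<String, String> props = new HashMap<String, String>();

    void setBoolean(String key, boolean value) {
        props.put(key, Boolean.toString(value));
    }

    boolean getBoolean(String key, boolean defaultValue) {
        String v = props.get(key);
        return v == null ? defaultValue : Boolean.parseBoolean(v);
    }
}

// Hypothetical stand-in for the fetcher; only the "parsing" field is modelled.
class SketchFetcher {
    private boolean parsing;

    // Reads the flag from whatever configuration object it is given.
    void configure(SketchConf job) {
        // Assumed key and default; an unset key means "parse".
        parsing = job.getBoolean("fetcher.parse", true);
    }

    boolean isParsing() {
        return parsing;
    }
}

public class ConfMismatchSketch {
    // Mirrors the reported behaviour: the flag is set on one object,
    // but configure() receives a different, freshly created one.
    static boolean parsingWhenConfIsReplaced() {
        SketchConf cliConf = new SketchConf();
        cliConf.setBoolean("fetcher.parse", false); // effect of -noParsing

        SketchConf jobConf = new SketchConf();      // new object, flag not copied
        SketchFetcher fetcher = new SketchFetcher();
        fetcher.configure(jobConf);
        return fetcher.isParsing();
    }

    // Same flow, but the object carrying the flag is the one passed on.
    static boolean parsingWhenConfIsShared() {
        SketchConf cliConf = new SketchConf();
        cliConf.setBoolean("fetcher.parse", false); // effect of -noParsing

        SketchFetcher fetcher = new SketchFetcher();
        fetcher.configure(cliConf);
        return fetcher.isParsing();
    }

    public static void main(String[] args) {
        System.out.println("replaced conf, parsing = " + parsingWhenConfIsReplaced());
        System.out.println("shared conf, parsing = " + parsingWhenConfIsShared());
    }
}
```

In this sketch the first case falls back to the default and parses anyway,
which matches the symptom described above. Whether the real Fetcher loses the
setting in exactly this way would need to be confirmed against the code.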
_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers