I think you should open an issue in JIRA; some of the big guys seem to have forgotten us...
There are no performance problems compared with parsing the document itself... after parsing, we can walk the tree many times :)

Can NodeWalker stop after finding the first title, or will it walk to the end of the document?

From: Alexey Torochkov [mailto:all.net...@gmail.com]
Sent: August-28-09 5:50 PM
To: nutch-dev@lucene.apache.org
Subject: Re: Title inside body

I think it's still a good solution to make it configurable:

<name>parser.html.skip.body.title</name>
<value>false</value>

with true as the default.

I see only one performance problem with it: if the page doesn't have a title at all, NodeWalker will continue to walk all nodes (but actually that's not a problem).

Patch attached.

Should I create an issue for it in JIRA? Or does this patch have no chance of being applied?

--
Alexey Torochkov
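
On the question of whether the walk can stop at the first title: below is a minimal sketch of one way to do it. It does not use Nutch's NodeWalker and is not the attached patch. It is a plain stack-based walk over an org.w3c.dom tree. The class FirstTitleSketch, the method firstTitle and the skipBodyTitle flag are hypothetical names; the flag only mirrors the intent of the proposed parser.html.skip.body.title property. The assumption is that when the flag is true the walker never descends into <body>, which would also bound the walk for pages that have no title at all.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class FirstTitleSketch {

    /** Nodes visited by the last firstTitle() call, to make the early exit visible. */
    static int visited;

    /**
     * Depth-first walk in document order that returns as soon as a title
     * element is found. skipBodyTitle is a hypothetical flag standing in
     * for the proposed parser.html.skip.body.title property.
     */
    static String firstTitle(Node root, boolean skipBodyTitle) {
        visited = 0;
        Deque<Node> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            Node node = stack.pop();
            visited++;
            if (node.getNodeType() == Node.ELEMENT_NODE) {
                String name = node.getNodeName();
                if (skipBodyTitle && "body".equalsIgnoreCase(name)) {
                    continue; // assumption: ignore the whole body subtree
                }
                if ("title".equalsIgnoreCase(name)) {
                    return node.getTextContent().trim(); // stop at the first title
                }
            }
            // Push children in reverse so they are popped in document order.
            NodeList children = node.getChildNodes();
            for (int i = children.getLength() - 1; i >= 0; i--) {
                stack.push(children.item(i));
            }
        }
        return null; // no title found
    }

    /** Test helper: parses well-formed markup with the JDK's XML parser. */
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(
                "<html><head/><body><title>In body</title><p>text</p></body></html>");
        System.out.println(firstTitle(doc, true));  // body is skipped
        System.out.println(firstTitle(doc, false)); // title inside body is accepted
    }
}
```

With the real NodeWalker the same idea would presumably be a break out of the hasNext()/nextNode() loop once the title is found, plus skipChildren() on the body node; that is an assumption about how the patch could be written, not a description of it.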