[ https://issues.apache.org/jira/browse/NUTCH-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13910355#comment-13910355 ]
lufeng edited comment on NUTCH-1726 at 2/24/14 2:41 PM:
--------------------------------------------------------

Hi Markus

It seems that HeadingsFilter does not find nested nodes in my testing code, but I cannot reproduce your test result when I use the following steps to test our patch:

{code:java}
> svn checkout https://svn.apache.org/repos/asf/nutch/trunk nutch-svn2
> cd nutch-svn2
> patch -p0 < NUTCH-1726-trunk.patch
> ant
> cd src/plugin/headings/
> ant test
{code}

Everything seems OK.

Yes, you are right, maybe someone wants to ignore long headers. But should we allow setting the headings.maxlength option to -1 to disable this check? Someone may want to disable this feature.

Feng


was (Author: amuseme.lu):
Hi Markus

It seems that HeadingsFilter does not find nested nodes in my testing code, but I cannot reproduce your test result when I use the following steps to test our patch:

{code:bash}
> svn checkout https://svn.apache.org/repos/asf/nutch/trunk nutch-svn2
> cd nutch-svn2
> patch -p0 < NUTCH-1726-trunk.patch
> ant
> cd src/plugin/headings/
> ant test
{code}

Everything seems OK.

Yes, you are right, maybe someone wants to ignore long headers. But should we allow setting the headings.maxlength option to -1 to disable this check? Someone may want to disable this feature.

Feng

> HeadingsFilter does not find nested nodes
> -----------------------------------------
>
>                 Key: NUTCH-1726
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1726
>             Project: Nutch
>          Issue Type: Bug
>    Affects Versions: 1.7
>            Reporter: Markus Jelsma
>            Assignee: Markus Jelsma
>            Priority: Minor
>             Fix For: 1.8
>
>         Attachments: NUTCH-1726-trunk-v2.patch, NUTCH-1726-trunk.patch, NUTCH-1726-trunk.patch
>
>
> Filter won't find:
> {code}
> <h1><span>apache nutch</span></h1>
> {code}
> The getNodeValue() tries to read data from children but should traverse nodes instead.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
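
For reference, the behaviour described in the issue can be sketched in plain Java as below. This is only an illustration and not the code from NUTCH-1726-trunk.patch: the class `NestedHeadingText` and its methods `parseRoot`, `directChildText`, `nestedText` and `withinMaxLength` are hypothetical names, the JDK XML parser stands in for the HTML parser Nutch actually uses, and treating a negative `headings.maxlength` as "check disabled" is just the proposal from the comment above, not confirmed behaviour.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class NestedHeadingText {

  /** Parses a small well-formed fragment and returns its root element (illustration only). */
  static Node parseRoot(String xml) {
    try {
      return DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)))
          .getDocumentElement();
    } catch (Exception e) {
      throw new IllegalStateException("could not parse fragment", e);
    }
  }

  /** Reads text from direct children only, so text wrapped in a nested element is missed. */
  static String directChildText(Node node) {
    StringBuilder sb = new StringBuilder();
    NodeList children = node.getChildNodes();
    for (int i = 0; i < children.getLength(); i++) {
      Node child = children.item(i);
      if (child.getNodeType() == Node.TEXT_NODE) {
        sb.append(child.getNodeValue());
      }
    }
    return sb.toString().trim();
  }

  /** Traverses the whole subtree and collects every text node below the heading. */
  static String nestedText(Node node) {
    StringBuilder sb = new StringBuilder();
    appendText(node, sb);
    return sb.toString().trim();
  }

  private static void appendText(Node node, StringBuilder sb) {
    if (node.getNodeType() == Node.TEXT_NODE) {
      sb.append(node.getNodeValue());
      return;
    }
    NodeList children = node.getChildNodes();
    for (int i = 0; i < children.getLength(); i++) {
      appendText(children.item(i), sb);
    }
  }

  /** Assumed convention: a negative maxLength (e.g. -1) disables the length check. */
  static boolean withinMaxLength(String heading, int maxLength) {
    return maxLength < 0 || heading.length() <= maxLength;
  }

  public static void main(String[] args) {
    Node h1 = parseRoot("<h1><span>apache nutch</span></h1>");
    System.out.println("direct children: '" + directChildText(h1) + "'");
    System.out.println("full traversal:  '" + nestedText(h1) + "'");
    System.out.println("kept with -1:    " + withinMaxLength(nestedText(h1), -1));
  }
}
```

With the `<h1><span>apache nutch</span></h1>` example from the issue, the direct-children variant returns an empty string and the traversal returns the heading text. The standard DOM call `Node.getTextContent()` also returns the text of all descendants and may be a simpler option where it is available.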