[ https://issues.apache.org/jira/browse/NUTCH-624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vinci updated NUTCH-624:
------------------------

    Description:
I found that the parsed text from the default parser, Neko, in the 1.0 nightly is not easy to process: it just adds a space to the end of the tag.
For easier analysis, Neko (or another parser) should change the behaviour to
1. add a tab for an inline element
2. add a tab+newline at the end of a block-level element
instead of a space.
That will help other applications use the parsed text.

  was:
I found that the parsed text from the default parser, Neko, is not easy to process: it just adds a space to the end of the tag.
Can Neko (or another parser) change the behaviour to
1. add a tab (for an inline element)
2. add a tab+newline at the end of a block-level element
instead of a space, so we can have better parsed text?

    Affects Version/s: 1.0.0
              Summary: Better parsed text by default parser  (was: Better parsed text)

[Add more information on raising this issue]

> Better parsed text by default parser
> ------------------------------------
>
>                 Key: NUTCH-624
>                 URL: https://issues.apache.org/jira/browse/NUTCH-624
>             Project: Nutch
>          Issue Type: Improvement
>    Affects Versions: 1.0.0
>            Reporter: Vinci
>
> I found that the parsed text from the default parser, Neko, in the 1.0 nightly is not easy to
> process: it just adds a space to the end of the tag.
> For easier analysis, Neko (or another parser) should change the behaviour to
> 1. add a tab for an inline element
> 2. add a tab+newline at the end of a block-level element
> instead of a space.
> That will help other applications use the parsed text.
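
The separator scheme requested above can be illustrated with a minimal sketch. It is not Nutch or NekoHTML code: it uses Python's standard html.parser, and the names extract_text, SeparatorTextExtractor, INLINE_TAGS and BLOCK_TAGS are hypothetical, as are the (deliberately short) tag lists. It also assumes the separator is emitted where the closing tag occurs, which is how the description reads.

```python
from html.parser import HTMLParser

# Hypothetical, incomplete tag sets chosen for illustration only;
# they are not taken from Nutch or NekoHTML.
INLINE_TAGS = {"a", "b", "em", "i", "span", "strong"}
BLOCK_TAGS = {"div", "h1", "h2", "h3", "li", "p", "td", "tr"}


class SeparatorTextExtractor(HTMLParser):
    """Collects text and emits the separators proposed in the issue."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        # Text content is kept unchanged.
        self.parts.append(data)

    def handle_endtag(self, tag):
        # Assumption: separators go where the closing tag occurs.
        if tag in BLOCK_TAGS:
            self.parts.append("\t\n")   # tab + newline after a block element
        elif tag in INLINE_TAGS:
            self.parts.append("\t")     # tab after an inline element

    def text(self):
        return "".join(self.parts)


def extract_text(html):
    """Return the text of `html` with tab / tab+newline separators."""
    parser = SeparatorTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

With output in this shape, a downstream application could split on tab+newline to recover block boundaries and on tab to recover inline boundaries, which a single space does not allow.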