[ https://issues.apache.org/jira/browse/NUTCH-944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lewis John McGibbney updated NUTCH-944:
---------------------------------------

    Fix Version/s:     (was: 2.1)
                   2.2

> Increase the number of elements to look for URLs and add the ability to specify multiple attributes by elements
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: NUTCH-944
>                 URL: https://issues.apache.org/jira/browse/NUTCH-944
>             Project: Nutch
>          Issue Type: Improvement
>          Components: parser
>         Environment: GNU/Linux Fedora 12
>            Reporter: Jean-Francois Gingras
>            Priority: Minor
>             Fix For: 1.6, 2.2
>
>         Attachments: DOMContentUtils.java.path-1.0, DOMContentUtils.java.path-1.3
>
>
> Here is a patch for DOMContentUtils.java that increases the number of elements searched for URLs. It also adds the ability to specify multiple attributes per element, for example:
>
>     linkParams.put("frame", new LinkParams("frame", "longdesc,src", 0));
>     linkParams.put("object", new LinkParams("object", "classid,codebase,data,usemap", 0));
>     linkParams.put("video", new LinkParams("video", "poster,src", 0)); // HTML 5
>
> I have a patch for release-1.0 and one for branch-1.3.
> I would love to hear your comments about this.
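
For readers without the attachments at hand, the comma-separated attribute idea in the description could be sketched roughly as follows. This is not the attached patch: the LinkParams class here is a hypothetical stand-in that only mirrors the three-argument constructor shown in the description, and MultiAttrLinkParams, attrNames, defaultLinkParams and extractTargets are names invented for illustration. Elements are modelled as a plain attribute map rather than DOM nodes to keep the sketch self-contained.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch, not the code from the NUTCH-944 attachments.
final class MultiAttrLinkParams {

    // Stand-in for a per-element link description; attrName may hold
    // several attribute names separated by commas, e.g. "poster,src".
    static final class LinkParams {
        final String elName;
        final String attrName;
        final int childLen;

        LinkParams(String elName, String attrName, int childLen) {
            this.elName = elName;
            this.attrName = attrName;
            this.childLen = childLen;
        }

        // Splits the comma-separated list into individual attribute names.
        List<String> attrNames() {
            List<String> names = new ArrayList<>();
            for (String part : attrName.split(",")) {
                String name = part.trim();
                if (!name.isEmpty()) {
                    names.add(name);
                }
            }
            return names;
        }
    }

    // Registers the three example entries quoted in the issue description.
    static Map<String, LinkParams> defaultLinkParams() {
        Map<String, LinkParams> linkParams = new HashMap<>();
        linkParams.put("frame", new LinkParams("frame", "longdesc,src", 0));
        linkParams.put("object",
            new LinkParams("object", "classid,codebase,data,usemap", 0));
        linkParams.put("video", new LinkParams("video", "poster,src", 0));
        return linkParams;
    }

    // Returns the values of every configured link attribute that is present
    // on the element, in the order the attributes were declared.
    static List<String> extractTargets(Map<String, LinkParams> linkParams,
                                       String elName,
                                       Map<String, String> attrs) {
        LinkParams params = linkParams.get(elName.toLowerCase(Locale.ROOT));
        if (params == null) {
            return Collections.emptyList();
        }
        List<String> targets = new ArrayList<>();
        for (String name : params.attrNames()) {
            String value = attrs.get(name);
            if (value != null && !value.isEmpty()) {
                targets.add(value);
            }
        }
        return targets;
    }
}
```

Under these assumptions, a video element carrying both poster and src attributes would yield two candidate URLs instead of one, while attributes not listed for that element (width, for instance) are ignored.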