[jira] Aktualisiert: (NUTCH-20) Extract urls from plain texts
[ http://issues.apache.org/jira/browse/NUTCH-20?page=all ] Stephan Strittmatter updated NUTCH-20: -- Attachment: OutlinkExtractor.java anchor "null" causes NPE. changed to anchor as empty String. > Extract urls from plain texts > ---
[jira] Aktualisiert: (NUTCH-20) Extract urls from plain texts
[ http://issues.apache.org/jira/browse/NUTCH-20?page=all ] Stephan Strittmatter updated NUTCH-20: -- Description: Some parsers have no Outlinks returned. E.g. the Word-Parser. This class is able to extract (absolute) hyperlinks from a plain String (c