[jira] Updated: (NUTCH-20) Extract urls from plain texts

2005-08-02 Thread Stephan Strittmatter (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-20?page=all ]
Stephan Strittmatter updated NUTCH-20:
--
    Attachment: OutlinkExtractor.java
A "null" anchor causes an NPE. Changed the anchor to an empty String.
> Extract urls from plain texts
> ---

[jira] Updated: (NUTCH-20) Extract urls from plain texts

2005-08-02 Thread Stephan Strittmatter (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-20?page=all ]
Stephan Strittmatter updated NUTCH-20:
--
    Description: Some parsers, e.g. the Word parser, return no Outlinks. This class can extract (absolute) hyperlinks from a plain String (c
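
For readers unfamiliar with the idea, the following is a rough sketch of pulling absolute URLs out of a plain String and pairing each with an empty (never null) anchor. It is not the attached OutlinkExtractor.java and does not use the Nutch API: the class PlainTextLinkSketch, the nested Link type, the method extractLinks, and the simplified regular expression are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative sketch only; all names here are hypothetical, not Nutch classes. */
public class PlainTextLinkSketch {

    /** Hypothetical stand-in for an outlink: target URL plus anchor text. */
    public static final class Link {
        public final String url;
        public final String anchor;

        public Link(String url, String anchor) {
            this.url = url;
            // Normalize a missing anchor to "" so callers never hit an NPE.
            this.anchor = (anchor == null) ? "" : anchor;
        }
    }

    // Simplified pattern (an assumption): http, https, or ftp scheme, "://",
    // then a run of characters that are not whitespace, quotes, or angle brackets.
    private static final Pattern URL_PATTERN =
            Pattern.compile("\\b(?:https?|ftp)://[^\\s<>\"']+");

    /** Returns every absolute URL found in the text, in order of appearance. */
    public static List<Link> extractLinks(String plainText) {
        List<Link> links = new ArrayList<>();
        if (plainText == null) {
            return links;
        }
        Matcher matcher = URL_PATTERN.matcher(plainText);
        while (matcher.find()) {
            String url = stripTrailingPunctuation(matcher.group());
            // Plain text carries no anchor text, so pass null and let Link store "".
            links.add(new Link(url, null));
        }
        return links;
    }

    // Drops sentence punctuation that the greedy pattern picked up at the end.
    private static String stripTrailingPunctuation(String url) {
        int end = url.length();
        while (end > 0 && ".,;:!?)".indexOf(url.charAt(end - 1)) >= 0) {
            end--;
        }
        return url.substring(0, end);
    }
}
```

Calling PlainTextLinkSketch.extractLinks on the text of a parsed Word document would yield one Link per absolute URL, each with an empty anchor. A real extractor would likely need a stricter URL grammar than this pattern.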