Author: siren
Date: Tue Oct 24 08:27:55 2006
New Revision: 467357

URL: http://svn.apache.org/viewvc?view=rev&rev=467357
Log:
fix for NUTCH-379

Modified:
    lucene/nutch/branches/branch-0.8/CHANGES.txt
    lucene/nutch/branches/branch-0.8/src/java/org/apache/nutch/parse/ParseUtil.java

Modified: lucene/nutch/branches/branch-0.8/CHANGES.txt
URL: http://svn.apache.org/viewvc/lucene/nutch/branches/branch-0.8/CHANGES.txt?view=diff&rev=467357&r1=467356&r2=467357
==============================================================================
--- lucene/nutch/branches/branch-0.8/CHANGES.txt (original)
+++ lucene/nutch/branches/branch-0.8/CHANGES.txt Tue Oct 24 08:27:55 2006
@@ -4,6 +4,9 @@
 
  1. NUTCH-391 ParseUtil logs file contents to log file when it cannot find
     parser (siren)
+
+ 2. NUTCH-379 - ParseUtil does not pass through the content's URL
+    to the ParserFactory (Chris A. Mattmann via siren)
 
 Release 0.8.1 - 2006-09-24
 

Modified: lucene/nutch/branches/branch-0.8/src/java/org/apache/nutch/parse/ParseUtil.java
URL: http://svn.apache.org/viewvc/lucene/nutch/branches/branch-0.8/src/java/org/apache/nutch/parse/ParseUtil.java?view=diff&rev=467357&r1=467356&r2=467357
==============================================================================
--- lucene/nutch/branches/branch-0.8/src/java/org/apache/nutch/parse/ParseUtil.java (original)
+++ lucene/nutch/branches/branch-0.8/src/java/org/apache/nutch/parse/ParseUtil.java Tue Oct 24 08:27:55 2006
@@ -65,7 +65,8 @@
     Parser[] parsers = null;
 
     try {
-      parsers = this.parserFactory.getParsers(content.getContentType(), "");
+      parsers = this.parserFactory.getParsers(content.getContentType(),
+          content.getUrl() != null ? content.getUrl():"");
     } catch (ParserNotFound e) {
       if (LOG.isWarnEnabled()) {
         LOG.warn("No suitable parser found when trying to parse content " + content.getUrl() +