Nutch ended up crawling some HTML files that had a TXT extension.  Because
of this (I assume), it treated them as plain text and didn't strip out the
HTML markup.  So now I have weird formatting on my results page.

Is there a way to fix this on the Nutch side so it doesn't happen again?
