Hi all,

Is it possible to get Nutch to split its crawl results into sentences, so that each document contains a single sentence rather than a whole web page? I need this so that Solr indexes the crawl data one sentence at a time. The end result I want is for a search to return a list of sentences that match the query instead of a list of web pages.

Thanks, Michael

