: Iterator<ContentStream> getContentStreams();
:
: Consider the case where you iterate through a local file system.
right, a fixed-size in-memory array can be iterated, but an unbounded stream of objects from an external source can't always be read into an array effectively -- so when in doubt go with the Iterator (or my favorite: Iterable)

: In addition to RequestProcessors, maybe there should be a general
: DocumentProcessor
:
: interface SolrDocumentParser
: {
:   Document parse(ContentStream content);
: }
:
: solrconfig could register "text/html" -> HtmlDocumentParser, and
: RequestProcessors could share the same parser.

what else would the RequestProcessor do if it was delegating all of the parsing to something else?

-Hoss
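
For what it's worth, here is a rough sketch of the Iterable idea, not a proposal for the actual API. The ContentStream interface below is a hypothetical one-method stand-in for the type discussed above, and LazyContentStreams, fromSource, and the null-when-exhausted Supplier convention are all made up for illustration. The point is only that the caller can use a for-each loop while the source is pulled one element at a time instead of being copied into an array first.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Supplier;

public class LazyContentStreams {

    /** Hypothetical stand-in for the ContentStream type discussed in this thread. */
    interface ContentStream {
        String getName();
    }

    /**
     * Wraps a pull-style source (assumed to return null when exhausted) as an
     * Iterable. The iterator keeps a single element of lookahead, so the whole
     * source is never held in memory. Because every iterator pulls from the
     * same source, the result is effectively single-pass.
     */
    static Iterable<ContentStream> fromSource(Supplier<ContentStream> source) {
        return () -> new Iterator<ContentStream>() {
            private ContentStream lookahead = source.get();

            @Override
            public boolean hasNext() {
                return lookahead != null;
            }

            @Override
            public ContentStream next() {
                if (lookahead == null) {
                    throw new NoSuchElementException();
                }
                ContentStream current = lookahead;
                lookahead = source.get(); // fetch the next element only now
                return current;
            }
        };
    }

    public static void main(String[] args) {
        // Simulated external source that yields three streams, then null.
        int[] counter = {0};
        Supplier<ContentStream> source = () -> {
            if (counter[0] >= 3) {
                return null;
            }
            String name = "doc-" + counter[0]++;
            return () -> name;
        };

        for (ContentStream cs : fromSource(source)) {
            System.out.println(cs.getName());
        }
    }
}
```

A file system walk or a network feed could sit behind the same Supplier; the loop in main would not change.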