: I do like moving towards a separation of Document for indexing vs
: searching for 3.0.
:
: Disregarding for starters how we get there from here...
:
: Wouldn't we just want a base class (not an interface), say
: ReadOnlyField, that is used in documents retrieved by a reader? This
: class would also have Index.*, Store.*, TermVector.*, and
: isStored/Indexed/Tokenized/Compressed, etc, as these are recoverable
: from an index. Couldn't this be a concrete class, ie, the actual
: class instantiated when a Document is loaded from a reader?

Yes, but one of the peeves I've heard lots of people express over the
years is that they want to "decorate" the Documents returned by a
search, so that those documents can access alternate field stores and
metadata not in the index. (LUCENE-778 started out as a discussion of
wanting to pass custom subclasses of Document to writer.addDocument(),
but it also mentions wanting to get custom documents back from
IndexReader.)

Imagine you're writing an app that does a search with Lucene, and then
returns a List<Document> ...

    public List<Document> myMethod(options) {
      List<Document> docs = doSomeSearchStuff(indexreader, query, options);
      return docs;
    }

You've got a lot of downstream code that calls myMethod and
uses/propagates this List<Document> ... and then one day you decide
that for each document you also want to include some metadata that
Lucene doesn't know anything about. Your downstream client code is
happy to treat this new metadata just like any other field. You could
change the API of myMethod and jump through a lot of hoops changing all
of your other code; or, if "Document" is a simple interface, you could
do something like...

    public class MyDocumentWrapper implements Document {
      public MyDocumentWrapper(Document, otherData) {...}
      public static List<Document> wrapList(List<Document>, otherData) {...}
    }

    public List<Document> myMethod(options) {
      List<Document> docs = doSomeSearchStuff(indexreader, query, options);
      return MyDocumentWrapper.wrapList(docs, getOtherData(options));
    }

(If I remember right, there are some comments to this effect in
LUCENE-778 as well.)

: And then a subclass, IndexableField, that adds reader & tokenStream
: values, get/set boost, setters to change a field's value, etc.

IndexableField really shouldn't be a subclass of whatever class is
returned after a search is done ...
The methods used for accessing the "stored" value of a returned
document make as little sense in the context of IndexableField as the
setBoost/Reader/TokenStream functions of Document currently make when a
search is executed.

When all is said and done: an IndexableField and a SearchResultField
shouldn't have anything in common except *maybe* that they both have a
fieldName.

I think Yonik once argued that the ideal API for getting a Document out
of an IndexReader would be...

    /** @return map of field name to field values */
    public Map<String,String[]> getDocument(int id)

-Hoss
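
For illustration only, here is a rough, self-contained sketch of the
wrapper idea from earlier in this message. Everything in it is an
assumption made for the example: the Document interface is hypothetical
(it is not Lucene's org.apache.lucene.document.Document), StoredDocument
is a made-up stand-in for whatever a reader would return, and the
getValues method is just one plausible shape for a read-only accessor.
The backing Map<String,String[]> borrows the map-of-field-values idea
mentioned above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DecoratedDocuments {

    /** Hypothetical read-only search result; NOT Lucene's Document class. */
    interface Document {
        /** @return all values for the field, or an empty array if absent */
        String[] getValues(String fieldName);
    }

    /** Hypothetical result backed by a map of field name to field values. */
    static final class StoredDocument implements Document {
        private final Map<String, String[]> fields;

        StoredDocument(Map<String, String[]> fields) {
            this.fields = fields;
        }

        @Override
        public String[] getValues(String fieldName) {
            String[] values = fields.get(fieldName);
            return values == null ? new String[0] : values;
        }
    }

    /** Decorator: checks the extra metadata first, then the wrapped document. */
    static final class MyDocumentWrapper implements Document {
        private final Document delegate;
        private final Map<String, String[]> otherData;

        MyDocumentWrapper(Document delegate, Map<String, String[]> otherData) {
            this.delegate = delegate;
            this.otherData = otherData;
        }

        @Override
        public String[] getValues(String fieldName) {
            String[] extra = otherData.get(fieldName);
            return extra != null ? extra : delegate.getValues(fieldName);
        }

        /** Wraps every document in the list with the same extra metadata. */
        static List<Document> wrapList(List<Document> docs,
                                       Map<String, String[]> otherData) {
            List<Document> wrapped = new ArrayList<>(docs.size());
            for (Document doc : docs) {
                wrapped.add(new MyDocumentWrapper(doc, otherData));
            }
            return wrapped;
        }
    }

    public static void main(String[] args) {
        Map<String, String[]> stored = new HashMap<>();
        stored.put("title", new String[] {"Example title"});
        List<Document> docs = new ArrayList<>();
        docs.add(new StoredDocument(stored));

        // Metadata the index knows nothing about.
        Map<String, String[]> otherData = new HashMap<>();
        otherData.put("rating", new String[] {"5"});

        List<Document> wrapped = MyDocumentWrapper.wrapList(docs, otherData);
        System.out.println(wrapped.get(0).getValues("title")[0]);
        System.out.println(wrapped.get(0).getValues("rating")[0]);
    }
}
```

Downstream code that only sees List<Document> can read "rating" the
same way it reads "title", which is the point of keeping the result
type a small interface rather than a concrete class.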