[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13286909#comment-13286909 ]
Andrzej Bialecki commented on LUCENE-3312:
-------------------------------------------

Comments to patch 04:

* index.Document is an interface, I think for better extensibility in the future it could be an abstract class - who knows what we will want to put there in addition to the iterators...

* as noted on IRC, this strong decoupling of stored and indexed content poses some interesting questions:

** since you can add multiple fields with the same name, you can now add an arbitrary sequence of Stored and Indexed fields (all with the same name). This means that you can now store parts of a field that are not indexed, and parts of a field that are indexed but not stored.

** previously, if a field was flagged as indexed but didn't have a tokenStream, its String or Reader value would be used to create a token stream. Now if you want a value to be stored and indexed you have to add two fields with the same name - one StoredField and the other an IndexedField for which you create a token stream from the value. My assumption is that StoredField-s will never be used anymore as potential sources of token streams?

* maybe this is a good moment to change all getters that return arrays of fields or values to return List-s, since all the code is doing underneath is collecting them into lists and then converting to arrays?

* previously we allowed one to remove fields from document by name, are we going to allow this now separately for indexed and stored fields?

* minor nit: there's a grammar mistake in Field.setTokenStream(..): "TokenStream fields tokenized".
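To make the questions above concrete, here is a minimal, self-contained sketch of what a document with separate stored and indexed field sequences might look like. It does not use the classes from the patch: SketchDocument, SketchStoredField, SketchIndexedField, SimpleDocument and their methods are hypothetical names invented for illustration, and the whitespace split merely stands in for a real token stream. It shows an abstract class instead of an interface, same-named fields that are stored but not indexed (and the reverse), and getters that return Lists rather than arrays.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class DecoupledDocSketch {

    /** Hypothetical stored-only field: a name plus the value kept verbatim. */
    public static final class SketchStoredField {
        public final String name;
        public final String value;
        public SketchStoredField(String name, String value) {
            this.name = name;
            this.value = value;
        }
    }

    /** Hypothetical indexed-only field: a name plus already-produced tokens. */
    public static final class SketchIndexedField {
        public final String name;
        public final List<String> tokens;
        public SketchIndexedField(String name, List<String> tokens) {
            this.name = name;
            this.tokens = tokens;
        }
    }

    /** Abstract class rather than interface, so helpers can be added later. */
    public abstract static class SketchDocument {
        public abstract Iterable<SketchStoredField> storedFields();
        public abstract Iterable<SketchIndexedField> indexedFields();

        /** Returns a List directly instead of collecting and copying to an array. */
        public List<String> storedValues(String name) {
            List<String> out = new ArrayList<>();
            for (SketchStoredField f : storedFields()) {
                if (f.name.equals(name)) out.add(f.value);
            }
            return Collections.unmodifiableList(out);
        }

        /** All tokens of all indexed fields with the given name, in add order. */
        public List<String> indexedTokens(String name) {
            List<String> out = new ArrayList<>();
            for (SketchIndexedField f : indexedFields()) {
                if (f.name.equals(name)) out.addAll(f.tokens);
            }
            return Collections.unmodifiableList(out);
        }
    }

    public static final class SimpleDocument extends SketchDocument {
        private final List<SketchStoredField> stored = new ArrayList<>();
        private final List<SketchIndexedField> indexed = new ArrayList<>();

        public void addStored(String name, String value) {
            stored.add(new SketchStoredField(name, value));
        }

        public void addIndexed(String name, String text) {
            // Assumption: a lowercase whitespace split stands in for an analyzer.
            List<String> tokens = Arrays.asList(text.toLowerCase(Locale.ROOT).split("\\s+"));
            indexed.add(new SketchIndexedField(name, tokens));
        }

        @Override public Iterable<SketchStoredField> storedFields() { return stored; }
        @Override public Iterable<SketchIndexedField> indexedFields() { return indexed; }
    }

    /** Builds a document whose "body" is partly stored-only and partly indexed-only. */
    public static SimpleDocument buildExample() {
        SimpleDocument doc = new SimpleDocument();
        doc.addStored("body", "Preface, stored only");  // stored, never indexed
        doc.addStored("body", "Main text");             // stored ...
        doc.addIndexed("body", "Main text");            // ... and indexed: two adds
        doc.addIndexed("body", "hidden keywords");      // indexed, never stored
        return doc;
    }

    public static void main(String[] args) {
        SimpleDocument doc = buildExample();
        System.out.println("stored:  " + doc.storedValues("body"));
        System.out.println("indexed: " + doc.indexedTokens("body"));
    }
}
```

In this sketch a value that should be both stored and indexed has to be added twice under the same name, which is exactly the situation the second sub-point asks about. Removing fields by name would similarly need to say which of the two lists it operates on.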
> Break out StorableField from IndexableField
> -------------------------------------------
>
>                 Key: LUCENE-3312
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3312
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/index
>            Reporter: Michael McCandless
>            Assignee: Nikola Tankovic
>              Labels: gsoc2012, lucene-gsoc-12
>             Fix For: Field Type branch
>
>         Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch,
>                      lucene-3312-patch-03.patch, lucene-3312-patch-04.patch
>
>
> In the field type branch we have strongly decoupled
> Document/Field/FieldType impl from the indexer, by having only a
> narrow API (IndexableField) passed to IndexWriter. This frees apps up
> to use their own "documents" instead of the "user-space" impls we provide
> in oal.document.
> Similarly, with LUCENE-3309, we've done the same thing on the
> doc/field retrieval side (from IndexReader), with the
> StoredFieldsVisitor.
> But, maybe we should break out StorableField from IndexableField,
> such that when you index a doc you provide two Iterables -- one for the
> IndexableFields and one for the StorableFields. Either can be null.
> One downside is possible perf hit for fields that are both indexed &
> stored (ie, we visit them twice, lookup their name in a hash twice,
> etc.). But the upside is a cleaner separation of concerns in API....