[ http://issues.apache.org/jira/browse/JCR-415?page=comments#action_12459384 ]

Marcel Reutegger commented on JCR-415:
--------------------------------------

I would like to get this change into the next major release (1.3) and propose the following changes:

- Create a new module jackrabbit-text-extractors, which will initially contain the jackrabbit-extractor patch provided by Jukka
- Migrate the jackrabbit-text-filters into the new extractors module
- Add jackrabbit-text-filters as dependency to jackrabbit-core
- Remove the jackrabbit-text-filters module and do not create releases anymore for this module.

Jackrabbit would still support existing releases of jackrabbit-text-filters, but the interface TextFilter will be deprecated (see Jukka's patch) and developers are encouraged to use the new TextExtractor interface.

Does this make sense?

> Enhance indexing of binary content
> ----------------------------------
>
>                 Key: JCR-415
>                 URL: http://issues.apache.org/jira/browse/JCR-415
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: indexing
>    Affects Versions: 1.0, 1.0.1, 0.9
>            Reporter: Marcel Reutegger
>            Priority: Minor
>         Attachments: jackrabbit-extractor-r420472.patch,
> jackrabbit-query-r420472.patch, jackrabbit-query-r421461.patch,
> org.apache.jackrabbit.core.query-extractor.jpg,
> org.apache.jackrabbit.core.query.lucene-extractor.jpg,
> org.apache.jackrabbit.extractor.jpg
>
>
> Indexing of binary content should be enhanced in order to allow either
> configuration of which fields are indexed or provide better support for custom
> NodeIndexer implementations.
> The current design has a couple of flaws that should be addressed at the same
> time:
> - Reader instances are requested from the text filters even though the reader
> might never be used
> - Only jcr:data properties of nt:resource nodes are fulltext indexed
> - It is up to the text filter implementation to decide the Lucene field name
> for the text representation; responsibility should be moved to the
> NodeIndexer. A text filter should only provide a Reader instance.
> With those changes a custom NodeIndexer can then decide if a binary property
> has one or more representations in the index.
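
To illustrate the first flaw named in the issue description (Reader instances being requested even though they might never be used), here is a minimal Java sketch of deferring reader creation until the first read. The class name LazyReader and its isOpened() method are hypothetical and are not taken from Jackrabbit or from the attached patches; the sketch only shows the general idea under that assumption.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.function.Supplier;

/**
 * Hypothetical helper (not part of Jackrabbit): wraps a Reader factory and
 * creates the underlying Reader only when text is actually read.
 */
final class LazyReader extends Reader {
    private final Supplier<Reader> source; // assumed factory for the extracted text
    private Reader delegate;               // stays null until the first read

    LazyReader(Supplier<Reader> source) {
        this.source = source;
    }

    /** Returns true once the underlying Reader has been created. */
    boolean isOpened() {
        return delegate != null;
    }

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        if (delegate == null) {
            // The potentially expensive extraction starts here, not at construction.
            delegate = source.get();
        }
        return delegate.read(cbuf, off, len);
    }

    @Override
    public void close() throws IOException {
        // Nothing to release if the Reader was never requested.
        if (delegate != null) {
            delegate.close();
        }
    }
}
```

With a wrapper like this, an indexer could pass `new LazyReader(() -> new StringReader("extracted text"))` to whichever field it chooses, and no extraction work would happen if that field is never consumed.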