[ https://issues.apache.org/jira/browse/SOLR-1535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253263#comment-13253263 ]
Chris Male commented on SOLR-1535:
----------------------------------

Can't we just provide an abstraction so people can choose whatever format they want? You might use JSON out of the box, but Jan could implement an Avro alternative if he wanted to. That also gives us a way to grow the format as our needs change.

> Pre-analyzed field type
> -----------------------
>
>                 Key: SOLR-1535
>                 URL: https://issues.apache.org/jira/browse/SOLR-1535
>             Project: Solr
>          Issue Type: New Feature
>    Affects Versions: 1.5
>            Reporter: Andrzej Bialecki
>             Fix For: 4.0
>
>         Attachments: SOLR-1535.patch, SOLR-1535.patch, preanalyzed.patch,
> preanalyzed.patch
>
>
> PreAnalyzedFieldType provides functionality to index (and optionally store)
> content that was already processed and split into tokens using some external
> processing chain. This implementation defines a serialization format for
> sending tokens with any currently supported Attributes (e.g. type, posIncr,
> payload, ...). This data is de-serialized into a regular TokenStream that is
> returned in Field.tokenStreamValue() and thus added to the index as index
> terms, and optionally a stored part that is returned in Field.stringValue()
> and is then added as a stored value of the field.
> This field type is useful for integrating Solr with existing text-processing
> pipelines, such as third-party NLP systems.
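
To make the pluggable-format idea a bit more concrete, here is a rough, language-neutral sketch written in Python rather than against the Solr code base. Everything in it is an assumption made for illustration: the names PreAnalyzedParser, JsonPreAnalyzedParser, Token and parse_field are hypothetical, and the JSON layout (a "stored" string plus a "tokens" list with "term", "type", "posIncr" and "payload" keys) is invented here and is not the serialization format defined by the attached patches. It only shows the shape of the abstraction: one parser per wire format, each returning an optional stored value and a list of tokens.

```python
import json
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Token:
    """One pre-analyzed token with a few illustrative attributes."""
    term: str
    token_type: str = "word"
    pos_incr: int = 1
    payload: Optional[str] = None


class PreAnalyzedParser(ABC):
    """Hypothetical abstraction: one implementation per wire format."""

    @abstractmethod
    def parse(self, data: str) -> Tuple[Optional[str], List[Token]]:
        """Return (stored value or None, tokens to index)."""


class JsonPreAnalyzedParser(PreAnalyzedParser):
    """Parses the made-up JSON layout described in the lead-in."""

    def parse(self, data: str) -> Tuple[Optional[str], List[Token]]:
        doc = json.loads(data)
        tokens = [
            Token(
                term=t["term"],
                token_type=t.get("type", "word"),
                pos_incr=t.get("posIncr", 1),
                payload=t.get("payload"),
            )
            for t in doc.get("tokens", [])
        ]
        return doc.get("stored"), tokens


# Registry keyed by format name; an Avro parser could be added the same way.
PARSERS: Dict[str, PreAnalyzedParser] = {"json": JsonPreAnalyzedParser()}


def parse_field(fmt: str, data: str) -> Tuple[Optional[str], List[Token]]:
    """Look up the parser for `fmt` and decode one field value."""
    return PARSERS[fmt].parse(data)
```

With this shape, supporting another format means registering one more PreAnalyzedParser implementation under a new key, and callers of parse_field do not change.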