[ https://issues.apache.org/jira/browse/SOLR-477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12568116#action_12568116 ]

Yonik Seeley commented on SOLR-477:
-----------------------------------

Interface is the most important part... can someone cut'n'paste what the input + output looks like for casual reviewers?

> AnalysisRequestHandler
> ----------------------
>
>                 Key: SOLR-477
>                 URL: https://issues.apache.org/jira/browse/SOLR-477
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Grant Ingersoll
>            Assignee: Grant Ingersoll
>            Priority: Minor
>         Attachments: SOLR-477.patch, SOLR-477.patch
>
>
> Being able to programmatically access tokenization information can be quite
> useful not only in Solr, but in other NLP applications where token vectors
> are necessary.
> The patch to follow creates an AnalysisRequestHandler which processes a
> document through the analysis process and returns a response filled with
> tokens, their offsets, position inc., type and value.
> Patch also adds some character array processing to Xml and adds Token
> handling to XMLWriter.
> I only implemented Xml output, as I don't know JSON or the other types. If
> someone else is so motivated, they can add those.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.