[ https://issues.apache.org/jira/browse/LUCENE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16199160#comment-16199160 ]
Tim Allison commented on LUCENE-5317:
-------------------------------------

A prototype ASF 2.0-licensed application that demonstrates the utility of the concordance is available: https://github.com/mitre/rhapsode

> Concordance/Key Word In Context (KWIC) capability
> -------------------------------------------------
>
>                 Key: LUCENE-5317
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5317
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: core/search
>   Affects Versions: 4.5
>            Reporter: Tim Allison
>            Assignee: Tommaso Teofili
>              Labels: patch
>         Attachments: LUCENE-5317.patch, LUCENE-5317.patch,
> concordance_v1.patch.gz, lucene5317v1.patch, lucene5317v2.patch
>
>
> This patch enables a Lucene-powered concordance search capability.
> Concordances are extremely useful for linguists, lawyers and other analysts
> performing analytic search, as opposed to traditional snippeting/document
> retrieval tasks. By "analytic search," I mean that the user wants to browse
> every occurrence of a term (or at least the top n) in a subset of documents
> and see the words before and after it.
> Concordance technology is far simpler and less interesting than IR relevance
> models/methods, but it can be extremely useful for some use cases.
> Traditional concordance sort orders are available (sort on words before the
> target, words after, target then words before, and target then words after).
> Under the hood, this runs SpanQuery's getSpans() and reanalyzes the text to
> obtain character offsets. There is plenty of room for optimization and
> refactoring.
> Many thanks to my colleague, Jason Robinson, for input on the design of this
> patch.
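
For readers unfamiliar with concordances, the behaviour described in the issue (show the words before and after each occurrence of a target, then sort on that context) can be sketched in plain Java. This is an illustrative sketch only, not the code in the attached patches: it splits on whitespace instead of running SpanQuery's getSpans() and reanalyzing, and the names KwicSketch, Window, concordance, BY_PRE and BY_POST are hypothetical. The "words before" sort here keys on the word nearest the target first, which is an assumption about the traditional ordering rather than something the issue states.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

/** Illustrative KWIC sketch; hypothetical names, no Lucene dependency. */
final class KwicSketch {

    /** One concordance row: words before the target, the target, words after. */
    static final class Window {
        final String pre;
        final String target;
        final String post;

        Window(String pre, String target, String post) {
            this.pre = pre;
            this.target = target;
            this.post = post;
        }

        @Override
        public String toString() {
            return pre + " [" + target + "] " + post;
        }
    }

    /** Collects up to maxHits windows of `width` words around each case-insensitive match. */
    static List<Window> concordance(String text, String term, int width, int maxHits) {
        String trimmed = text.trim();
        String[] tokens = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
        List<String> words = Arrays.asList(tokens);
        List<Window> hits = new ArrayList<>();
        for (int i = 0; i < tokens.length && hits.size() < maxHits; i++) {
            if (!tokens[i].equalsIgnoreCase(term)) {
                continue;
            }
            int from = Math.max(0, i - width);
            int to = Math.min(tokens.length, i + width + 1);
            String pre = String.join(" ", words.subList(from, i));
            String post = String.join(" ", words.subList(i + 1, to));
            hits.add(new Window(pre, tokens[i], post));
        }
        return hits;
    }

    /** Reverses word order so the word nearest the target becomes the primary sort key. */
    static String reversedWords(String s) {
        List<String> words = new ArrayList<>(Arrays.asList(s.split(" ")));
        Collections.reverse(words);
        return String.join(" ", words);
    }

    /** Sort on the words after the target. */
    static final Comparator<Window> BY_POST =
            Comparator.comparing((Window w) -> w.post.toLowerCase(Locale.ROOT));

    /** Sort on the words before the target, nearest word first (assumed convention). */
    static final Comparator<Window> BY_PRE =
            Comparator.comparing((Window w) -> reversedWords(w.pre).toLowerCase(Locale.ROOT));
}
```

Calling KwicSketch.concordance(text, "the", 2, 10) returns one Window per occurrence of "the" with up to two words of context on each side; sorting the returned list with BY_POST or BY_PRE gives two of the sort orders the issue mentions. The "target then words before/after" orders could be built by chaining a comparator on the target with either of these.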