[ https://issues.apache.org/jira/browse/LUCENE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
David Smiley updated LUCENE-5317: --------------------------------- Fix Version/s: (was: 4.7) 4.8 > [PATCH] Concordance capability > ------------------------------ > > Key: LUCENE-5317 > URL: https://issues.apache.org/jira/browse/LUCENE-5317 > Project: Lucene - Core > Issue Type: New Feature > Components: core/search > Affects Versions: 4.5 > Reporter: Tim Allison > Labels: patch > Fix For: 4.8 > > Attachments: concordance_v1.patch.gz > > > This patch enables a Lucene-powered concordance search capability. > Concordances are extremely useful for linguists, lawyers and other analysts > performing analytic search vs. traditional snippeting/document retrieval > tasks. By "analytic search," I mean that the user wants to browse every time > a term appears (or at least the topn) in a subset of documents and see the > words before and after. > Concordance technology is far simpler and less interesting than IR relevance > models/methods, but it can be extremely useful for some use cases. > Traditional concordance sort orders are available (sort on words before the > target, words after, target then words before and target then words after). > Under the hood, this is running SpanQuery's getSpans() and reanalyzing to > obtain character offsets. There is plenty of room for optimizations and > refactoring. > Many thanks to my colleague, Jason Robinson, for input on the design of this > patch. -- This message was sent by Atlassian JIRA (v6.2#6252) --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org