[ https://issues.apache.org/jira/browse/LUCENE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14217330#comment-14217330 ]
Tim Allison edited comment on LUCENE-5317 at 11/19/14 3:12 AM:
---------------------------------------------------------------

I merged in my local updates and I pushed these to my fork on github [link|https://github.com/tballison/lucene-solr].

I didn't have luck posting this to the review board. When I tried to post it, I entered the base directory and was returned to the starting page without any error message. For the record, I'm sure that this is user error.

was (Author: talli...@mitre.org):
I merged in my local updates and I pushed these to my fork on github [link|https://github.com/tballison/lucene-solr].

I didn't have luck posting this to the review board. When I tried to post it, I entered the base directory and was returned to the starting page without any error message.

> [PATCH] Concordance capability
> ------------------------------
>
>                 Key: LUCENE-5317
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5317
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: core/search
>    Affects Versions: 4.5
>            Reporter: Tim Allison
>              Labels: patch
>             Fix For: 4.9
>
>         Attachments: LUCENE-5317.patch, concordance_v1.patch.gz,
> lucene5317v1.patch
>
>
> This patch enables a Lucene-powered concordance search capability.
> Concordances are extremely useful for linguists, lawyers and other analysts
> performing analytic search vs. traditional snippeting/document retrieval
> tasks. By "analytic search," I mean that the user wants to browse every time
> a term appears (or at least the topn) in a subset of documents and see the
> words before and after.
> Concordance technology is far simpler and less interesting than IR relevance
> models/methods, but it can be extremely useful for some use cases.
> Traditional concordance sort orders are available (sort on words before the
> target, words after, target then words before and target then words after).
> Under the hood, this is running SpanQuery's getSpans() and reanalyzing to
> obtain character offsets. There is plenty of room for optimizations and
> refactoring.
> Many thanks to my colleague, Jason Robinson, for input on the design of this
> patch.
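
For readers unfamiliar with concordances, the view and the four sort orders named in the issue description can be sketched in plain Java. This is only an illustration under stated assumptions, not the patch's code: it uses no Lucene classes (the patch itself is described as using SpanQuery's getSpans() plus re-analysis), it tokenizes with a simple regex split, and every name in it (ConcordanceSketch, Window, BY_PRE and so on) is hypothetical. Sorting "words before" by the nearest word first is also an assumption about the traditional ordering.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class ConcordanceSketch {

    /** One hit: the target term with the words before and after it. */
    static final class Window {
        final List<String> pre;
        final String target;
        final List<String> post;

        Window(List<String> pre, String target, List<String> post) {
            this.pre = pre;
            this.target = target;
            this.post = post;
        }

        /** Words before the target, nearest word first (assumed sort key). */
        String preKey() {
            List<String> reversed = new ArrayList<>(pre);
            Collections.reverse(reversed);
            return String.join(" ", reversed);
        }

        /** Words after the target, in reading order. */
        String postKey() {
            return String.join(" ", post);
        }

        @Override
        public String toString() {
            return String.join(" ", pre) + " [" + target + "] " + String.join(" ", post);
        }
    }

    // The four sort orders listed in the issue description.
    static final Comparator<Window> BY_PRE = Comparator.comparing(Window::preKey);
    static final Comparator<Window> BY_POST = Comparator.comparing(Window::postKey);
    static final Comparator<Window> BY_TARGET_PRE =
            Comparator.comparing((Window w) -> w.target).thenComparing(BY_PRE);
    static final Comparator<Window> BY_TARGET_POST =
            Comparator.comparing((Window w) -> w.target).thenComparing(BY_POST);

    /** Collects every occurrence of term with up to width words on each side. */
    static List<Window> concordance(String text, String term, int width,
                                    Comparator<Window> order) {
        // Naive tokenization; a real analyzer would be used in Lucene.
        List<String> tokens = Arrays.asList(text.toLowerCase().split("\\W+"));
        String wanted = term.toLowerCase();
        List<Window> windows = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            if (!tokens.get(i).equals(wanted)) {
                continue;
            }
            int from = Math.max(0, i - width);
            int to = Math.min(tokens.size(), i + 1 + width);
            windows.add(new Window(tokens.subList(from, i), tokens.get(i),
                    tokens.subList(i + 1, to)));
        }
        windows.sort(order);
        return windows;
    }

    public static void main(String[] args) {
        String text = "The quick fox ran. A lazy fox slept. The fox ate.";
        for (Window w : concordance(text, "fox", 2, BY_POST)) {
            System.out.println(w);
        }
    }
}
```

Swapping BY_POST for BY_PRE in main reorders the same hits by the words preceding the target. With a single search term the two "target then ..." orders behave like their plain counterparts; they only differ when a query matches several distinct targets.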