[jira] [Updated] (LUCENE-5317) [PATCH] Concordance capability
[ https://issues.apache.org/jira/browse/LUCENE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Allison updated LUCENE-5317:

Attachment: lucene5317v2.patch

I made the mistake of following instructions and tried {{/trunk}} and {{/trunk/}} yesterday. I tried with a git diff file yesterday, and I also just tried with a {{git diff --no-prefix}} file today, which seems to work with a traditional svn (patch attached). Today, I tried three variations of trunk. I'm still confident this is user error. Is there a size limit on diffs, or is there something screwy with the attached diff file?

[PATCH] Concordance capability

Key: LUCENE-5317
URL: https://issues.apache.org/jira/browse/LUCENE-5317
Project: Lucene - Core
Issue Type: New Feature
Components: core/search
Affects Versions: 4.5
Reporter: Tim Allison
Labels: patch
Fix For: 4.9
Attachments: LUCENE-5317.patch, concordance_v1.patch.gz, lucene5317v1.patch, lucene5317v2.patch

This patch enables a Lucene-powered concordance search capability.

Concordances are extremely useful for linguists, lawyers and other analysts performing analytic search vs. traditional snippeting/document retrieval tasks. By analytic search, I mean that the user wants to browse every time a term appears (or at least the top n) in a subset of documents and see the words before and after.

Concordance technology is far simpler and less interesting than IR relevance models/methods, but it can be extremely useful for some use cases.

Traditional concordance sort orders are available (sort on words before the target, words after, target then words before and target then words after).

Under the hood, this is running SpanQuery's getSpans() and reanalyzing to obtain character offsets. There is plenty of room for optimizations and refactoring.

Many thanks to my colleague, Jason Robinson, for input on the design of this patch.
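To make the concordance idea in the issue description concrete, here is a minimal Java sketch of a keyword-in-context listing with two of the sort orders mentioned. It is only an illustration and not code from the patch: it splits plain text on whitespace instead of running SpanQuery's getSpans() and reanalyzing, and every name in it (ConcordanceSketch, Window, concordance, BY_WORDS_AFTER, BY_WORDS_BEFORE) is hypothetical. Sorting the words before the target from the nearest word outward is also an assumption about what "sort on words before the target" means.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

/** Hypothetical illustration only; not code from the LUCENE-5317 patch. */
class ConcordanceSketch {

  /** One concordance line: the words before the target, the target, the words after. */
  static final class Window {
    public final String pre;
    public final String target;
    public final String post;

    Window(String pre, String target, String post) {
      this.pre = pre;
      this.target = target;
      this.post = post;
    }

    @Override
    public String toString() {
      return pre + " [" + target + "] " + post;
    }
  }

  /** Sort on the words after the target. */
  static final Comparator<Window> BY_WORDS_AFTER =
      Comparator.comparing((Window w) -> w.post);

  /** Sort on the words before the target, nearest word first (assumed convention). */
  static final Comparator<Window> BY_WORDS_BEFORE =
      Comparator.comparing((Window w) -> reverseWords(w.pre));

  /** Collects every occurrence of term with up to width words on each side. */
  static List<Window> concordance(String text, String term, int width) {
    // Whitespace tokenization stands in for a real Lucene Analyzer here.
    List<String> tokens = Arrays.asList(text.trim().split("\\s+"));
    List<Window> windows = new ArrayList<>();
    for (int i = 0; i < tokens.size(); i++) {
      if (!tokens.get(i).equalsIgnoreCase(term)) {
        continue;
      }
      int from = Math.max(0, i - width);
      int to = Math.min(tokens.size(), i + width + 1);
      windows.add(new Window(
          String.join(" ", tokens.subList(from, i)),
          tokens.get(i),
          String.join(" ", tokens.subList(i + 1, to))));
    }
    return windows;
  }

  /** Reverses word order so that comparison starts at the word nearest the target. */
  private static String reverseWords(String s) {
    List<String> words = new ArrayList<>(Arrays.asList(s.split(" ")));
    Collections.reverse(words);
    return String.join(" ", words);
  }

  public static void main(String[] args) {
    String text = "the cat sat on the mat and the dog sat on the rug";
    List<Window> hits = concordance(text, "sat", 2);
    hits.sort(BY_WORDS_BEFORE);
    for (Window w : hits) {
      System.out.println(w);
    }
  }
}
```

The "target then words before" and "target then words after" orders could be built the same way by chaining a comparator on the target with one of the two above via thenComparing.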
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5317) [PATCH] Concordance capability
[ https://issues.apache.org/jira/browse/LUCENE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Rowe updated LUCENE-5317:

Attachment: LUCENE-5317.patch

When I tried to make a new review request with your latest patch, I got this error:

{quote}
The specified diff file could not be parsed.
Line 2: No valid separator after the filename was found in the diff header
{quote}

I've successfully applied your patch to my svn checkout (using {{svn patch}}), and I'm posting it here unchanged.

Fix For: 4.9
Attachments: LUCENE-5317.patch, LUCENE-5317.patch, concordance_v1.patch.gz, lucene5317v1.patch, lucene5317v2.patch
[jira] [Updated] (LUCENE-5317) [PATCH] Concordance capability
[ https://issues.apache.org/jira/browse/LUCENE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Allison updated LUCENE-5317:

Attachment: lucene5317v1.patch

I merged in my local updates and pushed them to my fork on GitHub [link|https://github.com/tballison/lucene-solr].

I didn't have luck posting this to the review board. When I tried to post it, I entered the base directory and was returned to the starting page without any error message.

Fix For: 4.9
Attachments: LUCENE-5317.patch, concordance_v1.patch.gz, lucene5317v1.patch
[jira] [Updated] (LUCENE-5317) [PATCH] Concordance capability
[ https://issues.apache.org/jira/browse/LUCENE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Rowe updated LUCENE-5317:

Attachment: LUCENE-5317.patch

Sync'd Tim's patch up to current trunk:
- LUCENE-5449: _TestUtil -> TestUtil
- LUCENE-5569: AtomicReader -> LeafReader
- LUCENE-5984: ChainedFilter removed; replaced with BooleanFilter
- LUCENE-6010: OpenBitSet removed; replaced with FixedBitSet, except for in TokenCharOffsetRequests, where I switched it to Java's BitSet.
- Added Maven and IntelliJ config
- Cleaned up some code formatting issues:
-- indents -> 2 spaces per level
-- removed useless javadoc stuff (empty returns, empty params)
-- normalized whitespace
-- normalized curly brace placement
-- removed unused imports
-- converted some counted for loops to for-each

I left a few nocommits. I plan on reviewing more.

Fix For: 4.9
Attachments: LUCENE-5317.patch, concordance_v1.patch.gz
[jira] [Updated] (LUCENE-5317) [PATCH] Concordance capability
[ https://issues.apache.org/jira/browse/LUCENE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Allison updated LUCENE-5317:

Fix Version/s: (was: 4.8) 4.9

Fix For: 4.9
Attachments: concordance_v1.patch.gz
[jira] [Updated] (LUCENE-5317) [PATCH] Concordance capability
[ https://issues.apache.org/jira/browse/LUCENE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Smiley updated LUCENE-5317:

Fix Version/s: (was: 4.7) 4.8

Fix For: 4.8
Attachments: concordance_v1.patch.gz
[jira] [Updated] (LUCENE-5317) [PATCH] Concordance capability
[ https://issues.apache.org/jira/browse/LUCENE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Allison updated LUCENE-5317:

Attachment: concordance_v1.patch.gz

v1 of patch attached

Fix For: 4.6
Attachments: concordance_v1.patch.gz