[jira] [Commented] (LUCENE-5317) [PATCH] Concordance capability

2014-11-19 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14218436#comment-14218436
 ] 

Tim Allison commented on LUCENE-5317:
-

Switching to Chrome was the answer, apparently: 
[link|https://reviews.apache.org/r/28247/]

> [PATCH] Concordance capability
> --
>
>              Key: LUCENE-5317
>              URL: https://issues.apache.org/jira/browse/LUCENE-5317
>          Project: Lucene - Core
>       Issue Type: New Feature
>       Components: core/search
> Affects Versions: 4.5
>         Reporter: Tim Allison
>           Labels: patch
>          Fix For: 4.9
>
> Attachments: LUCENE-5317.patch, LUCENE-5317.patch, 
> concordance_v1.patch.gz, lucene5317v1.patch, lucene5317v2.patch
>
>
> This patch enables a Lucene-powered concordance search capability.
> Concordances are extremely useful for linguists, lawyers and other analysts 
> performing analytic search vs. traditional snippeting/document retrieval 
> tasks.  By "analytic search," I mean that the user wants to browse every 
> occurrence of a term (or at least the top n) in a subset of documents and see 
> the words before and after.
> Concordance technology is far simpler and less interesting than IR relevance 
> models/methods, but it can be extremely useful for some use cases.
> Traditional concordance sort orders are available (sort on words before the 
> target, words after, target then words before and target then words after).
> Under the hood, this is running SpanQuery's getSpans() and reanalyzing to 
> obtain character offsets.  There is plenty of room for optimizations and 
> refactoring.
> Many thanks to my colleague, Jason Robinson, for input on the design of this 
> patch.
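
To make the description above concrete, here is a minimal in-memory sketch of a concordance with the four sort orders it lists. It is an illustration only and is not the patch's implementation: it uses no Lucene APIs, tokenizes with a simple regex instead of an Analyzer, and the names Window, concordance and sort_windows are hypothetical. Sorting the "before" context nearest-word-first is also an assumption about what a traditional sort on preceding words means.

```python
import re
from typing import List, NamedTuple


class Window(NamedTuple):
    doc_id: int
    pre: List[str]   # words before the target, in reading order
    target: str
    post: List[str]  # words after the target, in reading order


def concordance(docs: List[str], term: str, width: int = 3) -> List[Window]:
    """Collect every occurrence of `term` with up to `width` context words per side."""
    term = term.lower()
    windows = []
    for doc_id, text in enumerate(docs):
        # Hypothetical stand-in for analysis: lowercase and split on word characters.
        tokens = re.findall(r"\w+", text.lower())
        for i, tok in enumerate(tokens):
            if tok == term:
                pre = tokens[max(0, i - width):i]
                post = tokens[i + 1:i + 1 + width]
                windows.append(Window(doc_id, pre, tok, post))
    return windows


def sort_windows(windows: List[Window], order: str = "pre") -> List[Window]:
    """Sort windows by one of the four orders named in the issue description."""
    keys = {
        # Assumption: compare the word nearest the target first, then move outward.
        "pre": lambda w: w.pre[::-1],
        "post": lambda w: w.post,
        "target_pre": lambda w: (w.target, w.pre[::-1]),
        "target_post": lambda w: (w.target, w.post),
    }
    return sorted(windows, key=keys[order])


if __name__ == "__main__":
    docs = ["The quick brown fox jumps", "A lazy fox sleeps; the fox runs"]
    for w in sort_windows(concordance(docs, "fox", width=2), "post"):
        print(w.doc_id, " ".join(w.pre), "[" + w.target + "]", " ".join(w.post))
```

With a single-term target the "target then ..." orders reduce to the plain before/after orders; they only differ when the target can match several surface forms, as it can with a span query.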



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5317) [PATCH] Concordance capability

2014-11-19 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14218407#comment-14218407
 ] 

Steve Rowe commented on LUCENE-5317:


One of the nice things about {{svn patch}} is that it automatically does the 
{{svn add}} stuff for you.

I've never tried rbtools, always used the web interface, no problems so far.




[jira] [Commented] (LUCENE-5317) [PATCH] Concordance capability

2014-11-19 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14218401#comment-14218401
 ] 

Tim Allison commented on LUCENE-5317:
-

Great.  Thank you.  I just tried svn diff from the svn checkout that I had 
patched with the correct git diff...with no luck.  I hadn't even svn-added the 
concordance directory, so the diff file was quite short.

Are you using rbtools or have you had luck with the web interface?

And success with installing rbtools: 
{noformat}
Searching for RBTools
Reading https://pypi.python.org/simple/RBTools/
Download error on https://pypi.python.org/simple/RBTools/: [Errno 10061] No connection could be made because the target machine actively refused it -- Some packages may not be found!
Couldn't find index page for 'RBTools' (maybe misspelled?)
Scanning index of all packages (this may take a while)
Reading https://pypi.python.org/simple/
Download error on https://pypi.python.org/simple/: [Errno 10061] No connection could be made because the target machine actively refused it -- Some packages may not be found!
No local packages or download links found for RBTools
error: Could not find suitable distribution for Requirement.parse('RBTools')
{noformat}




[jira] [Commented] (LUCENE-5317) [PATCH] Concordance capability

2014-11-19 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14217994#comment-14217994
 ] 

Steve Rowe commented on LUCENE-5317:


bq. I didn't have luck posting this to the review board. When I tried to post 
it, I entered the base directory and was returned to the starting page without 
any error message. For the record, I'm sure that this is user error.

I've successfully used {{trunk}} for the base directory in the past - what did 
you use?

> [PATCH] Concordance capability
> --
>
>              Key: LUCENE-5317
>              URL: https://issues.apache.org/jira/browse/LUCENE-5317
>          Project: Lucene - Core
>       Issue Type: New Feature
>       Components: core/search
> Affects Versions: 4.5
>         Reporter: Tim Allison
>           Labels: patch
>          Fix For: 4.9
>
> Attachments: LUCENE-5317.patch, concordance_v1.patch.gz, 
> lucene5317v1.patch



[jira] [Commented] (LUCENE-5317) [PATCH] Concordance capability

2014-10-31 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14192255#comment-14192255
 ] 

Tim Allison commented on LUCENE-5317:
-

I added my latest source code, along with standalone jars that work with 
4.10.2, to my lucene-addons [repo|https://github.com/tballison/lucene-addons] 
in case anyone wants to try the code as is.  There may be surprises.

The next step is to turn back to the lucene5317 branch in my fork and update 
the trunk code.

The biggest functional difference between the original patch in October and the 
current working code is that I added multivalued field handling.

> [PATCH] Concordance capability
> --
>
>              Key: LUCENE-5317
>              URL: https://issues.apache.org/jira/browse/LUCENE-5317
>          Project: Lucene - Core
>       Issue Type: New Feature
>       Components: core/search
> Affects Versions: 4.5
>         Reporter: Tim Allison
>           Labels: patch
>          Fix For: 4.9
>
> Attachments: LUCENE-5317.patch, concordance_v1.patch.gz



[jira] [Commented] (LUCENE-5317) [PATCH] Concordance capability

2014-10-29 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14188309#comment-14188309
 ] 

Tim Allison commented on LUCENE-5317:
-

Thank you, Steve.  I created a lucene5317 branch on my github 
[fork|https://github.com/tballison/lucene-solr].  I applied your patch and will 
start adding my local updates...there have been quite a few since I posted the 
initial patch. 

When I'm happy enough with that, I'll put the patch on rb.

Thank you, again.




[jira] [Commented] (LUCENE-5317) [PATCH] Concordance capability

2014-10-17 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14175274#comment-14175274
 ] 

Steve Rowe commented on LUCENE-5317:


Tim, FYI, I've used the ASF's ReviewBoard instance 
(https://reviews.apache.org/) a few times recently.  It's very nice for 
comparing two patches against each other, and it can be useful for detailed 
review too.

After creating an account there, the workflow is: manually upload a patch, 
assign a reviewer (either the "lucene" group, in which case review requests go 
to the dev list, or any RB account holder, including yourself), then publish.  
After that, anybody can review by clicking on one or more adjacent lines in a 
patch and attaching a comment, repeating until done, and then publishing the 
review.  The original review request creator can update the patch, and anybody 
can view the differences between any two versions of the patch and attach 
reviews to those differences as well.





[jira] [Commented] (LUCENE-5317) [PATCH] Concordance capability

2014-10-17 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14174938#comment-14174938
 ] 

Tim Allison commented on LUCENE-5317:
-

Steve, thank you!  I had abandoned hope and haven't been updating this patch on 
jira.  The current version in my local repo looks a bit different now.

Let me apply your patch and see what the diff is between your cleanup/fixes and 
my current version.


