Re: [jira] [Updated] (LUCENE-1889) FastVectorHighlighter: support for additional queries

2011-09-09 Thread Koji Sekiguchi
Thanks for the recovery, Robert!

Koji Sekiguchi from mobile


On 2011/09/09, at 14:49, Robert Muir (JIRA) j...@apache.org wrote:

 
 [ 
 https://issues.apache.org/jira/browse/LUCENE-1889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
  ]
 
 Robert Muir updated LUCENE-1889:
 
 
Attachment: LUCENE-1889_reader.patch
 
 Here is the patch I applied. It might not be the best approach; see the 
 TODO/note in the code.
 
 FastVectorHighlighter: support for additional queries
 -
 
Key: LUCENE-1889
URL: https://issues.apache.org/jira/browse/LUCENE-1889
Project: Lucene - Java
 Issue Type: Wish
 Components: modules/highlighter
   Reporter: Robert Muir
   Assignee: Koji Sekiguchi
   Priority: Minor
Fix For: 3.5, 4.0
 
Attachments: LUCENE-1889.patch, LUCENE-1889.patch, LUCENE-1889.patch, 
 LUCENE-1889_reader.patch
 
 
 I am using FastVectorHighlighter for some strange languages and it is 
 working well! 
 One thing I noticed immediately is that many query types are not highlighted 
 (MultiTermQuery, MultiPhraseQuery, etc.).
 Here is one thing Michael M posted in the original ticket:
 {quote}
 I think a nice [eventual] model would be if we could simply re-run the
 scorer on the single document (using InstantiatedIndex maybe, or
 simply some sort of wrapper on the term vectors which are already a
 mini-inverted-index for a single doc), but extend the scorer API to
 tell us the exact term occurrences that participated in a match (which
 I don't think is exposed today).
 {quote}
 Due to strange requirements I am using something similar to this (but 
 specialized to our case).
 I am doing strange things like forcing MultiTermQueries to rewrite into 
 boolean queries so they will be highlighted,
 and flattening MultiPhraseQueries into boolean OR'ed PhraseQueries.
 I do not think these things would be 'fast', but I had a few ideas that 
 might help:
 * looking at contrib/highlighter, you can support FilteredQuery in flatten() 
 by calling getQuery(), right?
 * maybe as a last resort, try Query.extractTerms()?
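
The two flattening ideas above (expanding a MultiPhraseQuery into OR'ed
phrases, and unwrapping a FilteredQuery via getQuery()) might look roughly
like the sketch below. It is an illustration only: ToyQuery, ToyPhrase,
ToyMultiPhrase, ToyFiltered and ToyFlattener are hypothetical stand-ins,
not Lucene classes, and this is not the code from any attached patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for query types; these are NOT the Lucene classes.
abstract class ToyQuery {}

// A plain phrase: one term per position.
final class ToyPhrase extends ToyQuery {
    final List<String> terms;
    ToyPhrase(List<String> terms) { this.terms = terms; }
    @Override public String toString() { return "\"" + String.join(" ", terms) + "\""; }
}

// A multi-phrase: a list of alternative terms for each position.
final class ToyMultiPhrase extends ToyQuery {
    final List<List<String>> positions;
    ToyMultiPhrase(List<List<String>> positions) { this.positions = positions; }
}

// Wraps an inner query; the filter part does not matter for highlighting.
final class ToyFiltered extends ToyQuery {
    private final ToyQuery inner;
    ToyFiltered(ToyQuery inner) { this.inner = inner; }
    ToyQuery getQuery() { return inner; }
}

final class ToyFlattener {
    // Collects the plain phrases that a highlighter would need to match.
    static void flatten(ToyQuery q, List<ToyPhrase> out) {
        if (q instanceof ToyFiltered) {
            // Unwrap and flatten the wrapped query.
            flatten(((ToyFiltered) q).getQuery(), out);
        } else if (q instanceof ToyMultiPhrase) {
            // Expand into every combination of one term per position.
            expand(((ToyMultiPhrase) q).positions, 0, new ArrayList<String>(), out);
        } else if (q instanceof ToyPhrase) {
            out.add((ToyPhrase) q);
        }
        // Any other query type is ignored in this sketch.
    }

    private static void expand(List<List<String>> positions, int i,
                               List<String> prefix, List<ToyPhrase> out) {
        if (i == positions.size()) {
            out.add(new ToyPhrase(new ArrayList<String>(prefix)));
            return;
        }
        for (String term : positions.get(i)) {
            prefix.add(term);
            expand(positions, i + 1, prefix, out);
            prefix.remove(prefix.size() - 1); // backtrack
        }
    }
}
```

The number of generated phrases is the product of the number of alternatives
at each position, which is one reason this kind of flattening may not be
'fast'.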
 
 --
 This message is automatically generated by JIRA.
 For more information on JIRA, see: http://www.atlassian.com/software/jira
 
 
 
 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org
 




[jira] [Updated] (LUCENE-1889) FastVectorHighlighter: support for additional queries

2011-09-09 Thread Mike Sokolov (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Sokolov updated LUCENE-1889:
-

Attachment: LUCENE-1889-solr.patch

Sorry, forgot to include changes to DefaultSolrHighlighter as well (it gets 
confusing maintaining multiple patches in the same build).

I do think the non-reader method should be deprecated, as in Robert's comment.

 FastVectorHighlighter: support for additional queries
 -

 Key: LUCENE-1889
 URL: https://issues.apache.org/jira/browse/LUCENE-1889
 Project: Lucene - Java
  Issue Type: Wish
  Components: modules/highlighter
Reporter: Robert Muir
Assignee: Koji Sekiguchi
Priority: Minor
 Fix For: 3.5, 4.0

 Attachments: LUCENE-1889-solr.patch, LUCENE-1889.patch, 
 LUCENE-1889.patch, LUCENE-1889.patch, LUCENE-1889_reader.patch





[jira] [Updated] (LUCENE-1889) FastVectorHighlighter: support for additional queries

2011-09-08 Thread Mike Sokolov (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Sokolov updated LUCENE-1889:
-

Attachment: LUCENE-1889.patch

Updated patch resolves the issue with possibly rewriting MTQs multiple times.
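
One generic way to avoid rewriting the same query object more than once is
to memoize the rewrite result by object identity. The sketch below assumes
that approach purely for illustration; the patch itself is not shown in this
thread, so this is not necessarily how it works. RewriteCache is a
hypothetical helper, not a Lucene class, and the rewriter function stands in
for whatever performs the actual rewrite.

```java
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper, not part of Lucene: remembers the rewritten form of
// each query object so the (possibly expensive) rewrite runs once per object.
final class RewriteCache<Q> {
    // Keyed by identity: two equal-but-distinct query objects are cached separately.
    private final Map<Q, Q> cache = new IdentityHashMap<Q, Q>();
    private final Function<Q, Q> rewriter;
    private int rewriteCalls = 0;

    RewriteCache(Function<Q, Q> rewriter) {
        this.rewriter = rewriter;
    }

    // Returns the cached rewrite if present, otherwise rewrites and stores it.
    // A rewriter that returns null is not cached and would be called again.
    Q rewrite(Q query) {
        Q rewritten = cache.get(query);
        if (rewritten == null) {
            rewritten = rewriter.apply(query);
            rewriteCalls++;
            cache.put(query, rewritten);
        }
        return rewritten;
    }

    // Number of times the underlying rewriter was actually invoked.
    int rewriteCalls() {
        return rewriteCalls;
    }
}
```

With a cache like this, a query that is reached twice (for example once on
its own and once as part of a phrase) would only pay for one rewrite.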

 FastVectorHighlighter: support for additional queries
 -

 Key: LUCENE-1889
 URL: https://issues.apache.org/jira/browse/LUCENE-1889
 Project: Lucene - Java
  Issue Type: Wish
  Components: modules/highlighter
Reporter: Robert Muir
Priority: Minor
 Attachments: LUCENE-1889.patch, LUCENE-1889.patch, LUCENE-1889.patch





[jira] [Updated] (LUCENE-1889) FastVectorHighlighter: support for additional queries

2011-09-08 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-1889:


Attachment: LUCENE-1889_reader.patch

Here is the patch I applied. It might not be the best approach; see the 
TODO/note in the code.

 FastVectorHighlighter: support for additional queries
 -

 Key: LUCENE-1889
 URL: https://issues.apache.org/jira/browse/LUCENE-1889
 Project: Lucene - Java
  Issue Type: Wish
  Components: modules/highlighter
Reporter: Robert Muir
Assignee: Koji Sekiguchi
Priority: Minor
 Fix For: 3.5, 4.0

 Attachments: LUCENE-1889.patch, LUCENE-1889.patch, LUCENE-1889.patch, 
 LUCENE-1889_reader.patch





[jira] [Updated] (LUCENE-1889) FastVectorHighlighter: support for additional queries

2011-09-04 Thread Mike Sokolov (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Sokolov updated LUCENE-1889:
-

Attachment: LUCENE-1889.patch

This patch adds support for highlighting MultiTermQuery in 
FastVectorHighlighter via Query.rewrite().  I left one FIXME (should that be 
nocommit?) that should be fairly easy to resolve: we currently rewrite() the 
same MTQ query twice in some circumstances - if it's in a phrase I think.  I'd 
be happy to sort that out if y'all decide to commit this.
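
For readers unfamiliar with why rewriting helps here: a multi-term query
such as a prefix query names no concrete terms until it is expanded against
the terms that actually occur in the index, and a highlighter can only mark
concrete terms. The sketch below illustrates that expansion on a toy term
dictionary. ToyPrefixRewriter and its sorted-set "index" are hypothetical
and are not Lucene's implementation of Query.rewrite().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;

// Hypothetical illustration, not Lucene code: expands a prefix into the
// concrete indexed terms that start with it.
final class ToyPrefixRewriter {
    // indexedTerms must use natural String ordering, so that all terms
    // sharing the prefix form one contiguous run starting at the prefix.
    static List<String> rewritePrefix(String prefix, SortedSet<String> indexedTerms) {
        List<String> matches = new ArrayList<String>();
        // tailSet(prefix) skips every term that sorts before the prefix.
        for (String term : indexedTerms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break; // past the contiguous run; no later term can match
            }
            matches.add(term);
        }
        return matches;
    }
}
```

The resulting term list plays the role of the OR'ed term queries that a
rewritten multi-term query would contain. Because the expansion depends on
the index contents, it needs access to the index at highlighting time.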

 FastVectorHighlighter: support for additional queries
 -

 Key: LUCENE-1889
 URL: https://issues.apache.org/jira/browse/LUCENE-1889
 Project: Lucene - Java
  Issue Type: Wish
  Components: modules/highlighter
Reporter: Robert Muir
Priority: Minor
 Attachments: LUCENE-1889.patch, LUCENE-1889.patch





[jira] [Updated] (LUCENE-1889) FastVectorHighlighter: support for additional queries

2011-06-25 Thread Mike Sokolov (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Sokolov updated LUCENE-1889:
-

Attachment: LUCENE-1889.patch

Patch includes FVH support for Wildcard-, Regexp- and PrefixQuery.  Change to 
Enwiki benchmark (to generate wildcard queries) should maybe not be committed; 
just providing this as a validation of this approach.

 FastVectorHighlighter: support for additional queries
 -

 Key: LUCENE-1889
 URL: https://issues.apache.org/jira/browse/LUCENE-1889
 Project: Lucene - Java
  Issue Type: Wish
  Components: modules/highlighter
Reporter: Robert Muir
Priority: Minor
 Attachments: LUCENE-1889.patch

