[jira] [Commented] (LUCENE-8753) New PostingFormat - UniformSplit

2019-04-03 Thread Bruno Roustant (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16809251#comment-16809251
 ] 

Bruno Roustant commented on LUCENE-8753:


{quote}I think this is similar to […] BlockTermsReader/Writer
{quote}
Indeed similar; it mainly differs from VariableGapTermsIndexWriter in the way 
it selects the best term to start a block: the selection is based on the 
minimal distinguishing prefix. The idea is to make the terms index FST more 
compact. That way, for a given max heap memory target, we can potentially have 
more blocks, and therefore smaller ones that are scanned faster. This 
requirement to consume less heap was strong with Lucene 7.1; it may be less so 
now with the recent off-heap FST.
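
To illustrate the selection described above, here is a minimal sketch, not the code of the patch. It assumes that the minimal distinguishing prefix of a term is the shortest prefix that sorts strictly after the previous term, and that the candidate block starts are the terms within a delta around the target block size. The class and method names are hypothetical, and Strings are used instead of bytes for readability.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; names are hypothetical, not the patch's API. */
class BlockStartSketch {

  /** Length of the shortest prefix of term that sorts strictly after previous. */
  static int mdpLength(String previous, String term) {
    int max = Math.min(previous.length(), term.length());
    int common = 0;
    while (common < max && previous.charAt(common) == term.charAt(common)) {
      common++;
    }
    return common + 1; // shared prefix plus the first distinguishing character
  }

  /**
   * Returns the indexes of the terms chosen to start each block.
   * sortedTerms must be sorted and free of duplicates.
   */
  static List<Integer> blockStarts(List<String> sortedTerms, int targetSize, int delta) {
    if (delta < 0 || delta >= targetSize) {
      throw new IllegalArgumentException("need 0 <= delta < targetSize");
    }
    List<Integer> starts = new ArrayList<>();
    if (sortedTerms.isEmpty()) {
      return starts;
    }
    starts.add(0); // the first term always starts the first block
    int blockStart = 0;
    while (true) {
      // Candidate window around the target block size.
      int from = blockStart + targetSize - delta;
      int to = Math.min(blockStart + targetSize + delta, sortedTerms.size() - 1);
      if (from > to) {
        break; // remaining terms stay in the last block
      }
      int best = from;
      int bestLength = mdpLength(sortedTerms.get(from - 1), sortedTerms.get(from));
      for (int i = from + 1; i <= to; i++) {
        int length = mdpLength(sortedTerms.get(i - 1), sortedTerms.get(i));
        if (length < bestLength) { // keep the candidate with the shortest prefix
          best = i;
          bestLength = length;
        }
      }
      starts.add(best);
      blockStart = best;
    }
    return starts;
  }
}
```

With a shorter distinguishing prefix per block, the keys that index the blocks are shorter, which is presumably what keeps the FST compact.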

 
{quote}Are you also doing something different to encode/decode postings?
{quote}
No, the postings are written with the regular PostingsWriterBase.

 
{quote}Can you post results on the full wikimediumall?
{quote}
 Good point. Will do tomorrow.

> New PostingFormat - UniformSplit
> 
>
> Key: LUCENE-8753
> URL: https://issues.apache.org/jira/browse/LUCENE-8753
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/codecs
>Affects Versions: 8.0
>    Reporter: Bruno Roustant
>Priority: Major
> Attachments: Uniform Split Technique.pdf, luceneutil.benchmark.txt
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This is a proposal to add a new PostingsFormat called "UniformSplit" with 4 
> objectives:
>  - Clear design and simple code.
>  - Easily extensible, for both the logic and the index format.
>  - Light memory usage with a very compact FST.
>  - Focus on efficient TermQuery, PhraseQuery and PrefixQuery performance.
> (the attached PDF explains the technique visually and in more detail)
>  The principle is to split the list of terms into blocks and use an FST to 
> access the block, but not as a prefix trie, rather with a seek-floor pattern. 
> For the selection of the blocks, there is a target average block size (number 
> of terms), with an allowed delta variation (10%) to compare the terms and 
> select the one with the minimal distinguishing prefix.
>  There are also several optimizations inside the block to make it more 
> compact and speed up the loading/scanning.
> The performance obtained is interesting with the luceneutil benchmark, 
> comparing UniformSplit with BlockTree. Find it in the first comment and also 
> attached for better formatting.
> Although the precise percentages vary between runs, three main points:
>  - TermQuery and PhraseQuery are improved.
>  - PrefixQuery and WildcardQuery are ok.
>  - Fuzzy queries are clearly less performant, because BlockTree is so 
> optimized for them.
> Compared to BlockTree, FST size is reduced by 15%, and segment writing time 
> is reduced by 20%. So this PostingsFormat scales to lots of docs, as 
> BlockTree does.
> This initial version passes all Lucene tests. Use “ant test 
> -Dtests.codec=UniformSplitTesting” to test with this PostingsFormat.
> Subjectively, we think we have fulfilled our goal of code simplicity. And we 
> have already exercised this PostingsFormat extensibility to create a 
> different flavor for our own use-case.
> Contributors: Juan Camilo Rodriguez Duran, Bruno Roustant, David Smiley



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-6613) TextField.analyzeMultiTerm should not throw exception when analyzer returns no term

2014-10-09 Thread Bruno Roustant (JIRA)
Bruno Roustant created SOLR-6613:


 Summary: TextField.analyzeMultiTerm should not throw exception 
when analyzer returns no term
 Key: SOLR-6613
 URL: https://issues.apache.org/jira/browse/SOLR-6613
 Project: Solr
  Issue Type: Bug
  Components: Schema and Analysis
Affects Versions: 4.3.1, 4.10.2, Trunk
Reporter: Bruno Roustant


In TextField.analyzeMultiTerm(), at the line

try {
  if (!source.incrementToken())
    throw new SolrException();

the method should not throw an exception if there is no token: having no 
token is legitimate, since all tokens may be filtered out (e.g. with a 
blocking Filter such as StopFilter).

In this case it should simply return null (as it already returns null in some 
cases, see the first line of the method).
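
The proposed behavior can be sketched as follows. This is only an illustration under stated assumptions: TokenSource is a hypothetical stand-in for the analysis chain (it is not Lucene's TokenStream), stopFiltered() mimics a StopFilter, and the exception type is a placeholder rather than SolrException.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Set;

/** Illustrative sketch only; not the Solr implementation. */
class MultiTermSketch {

  /** Hypothetical stand-in for an analysis chain; not Lucene's TokenStream. */
  interface TokenSource {
    boolean incrementToken(); // advances to the next token, false when exhausted
    String term();            // text of the current token
  }

  /** Emits the given tokens, skipping stop words (mimics a StopFilter). */
  static TokenSource stopFiltered(List<String> tokens, Set<String> stopWords) {
    Iterator<String> it = tokens.iterator();
    return new TokenSource() {
      private String current;

      @Override
      public boolean incrementToken() {
        while (it.hasNext()) {
          String candidate = it.next();
          if (!stopWords.contains(candidate)) {
            current = candidate;
            return true;
          }
        }
        return false;
      }

      @Override
      public String term() {
        return current;
      }
    };
  }

  /** Returns the single analyzed term, or null when every token was filtered out. */
  static String analyzeMultiTerm(String part, TokenSource source) {
    if (part == null || source == null) {
      return null;
    }
    if (!source.incrementToken()) {
      return null; // no token is legitimate: return null instead of throwing
    }
    String term = source.term();
    if (source.incrementToken()) {
      throw new IllegalArgumentException("too many terms for multi-term: " + part);
    }
    return term;
  }
}
```

Callers then have to be prepared for a null result.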






[jira] [Updated] (SOLR-6613) TextField.analyzeMultiTerm should not throw exception when analyzer returns no term

2014-10-09 Thread Bruno Roustant (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Roustant updated SOLR-6613:
-
Attachment: TestTextField.java




[jira] [Updated] (SOLR-6613) TextField.analyzeMultiTerm should not throw exception when analyzer returns no term

2014-10-09 Thread Bruno Roustant (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Roustant updated SOLR-6613:
-
Description: 
In TextField.analyzeMultiTerm()
at line
try {
  if (!source.incrementToken())
throw new SolrException();

The method should not throw an exception if there is no token: having no 
token is legitimate, since all tokens may be filtered out (e.g. with a 
blocking Filter such as StopFilter).

In this case it should simply return null (as it already returns null in some 
cases, see the first line of the method). However, SolrQueryParserBase also 
needs to be fixed to correctly handle a null returned by 
TextField.analyzeMultiTerm().

See attached TestTextField for the corresponding new test class.

  was:
In TextField.analyzeMultiTerm()
at line
try {
  if (!source.incrementToken())
throw new SolrException();

The method should not throw an exception if there is no token because having no 
token is legitimate because all tokens may be filtered out (e.g. with a 
blocking Filter such as StopFilter).

In this case it should simply return null (as it already returns null in some 
cases, see first line of method).





[jira] [Commented] (SOLR-11865) Refactor QueryElevationComponent to prepare query subset matching

2018-04-24 Thread Bruno Roustant (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16450065#comment-16450065
 ] 

Bruno Roustant commented on SOLR-11865:
---

Sorry for the delay.

Yes, if you can take it from here, that would be awesome!
 * Getters for defaults: you're right, there is no need. Please remove them.
 * keepElevationPriority as a constant in QEC: good point.
 * keepElevationPriority meaning:
Actually, the comment is not right; maybe the sorting has changed since the 
time I wrote it. I don't think it is linked anymore to forceElevation, 
since the ElevationComparatorSource can be added as a SortField even if 
forceElevation=false when one sorts by score.
The point is:
- with keepElevationPriority=true, the behavior is unchanged: the elevated 
documents (on top) are sorted by the order of the elevation rules and elevated 
ids in the config file.
- with keepElevationPriority=false, the behavior changes: the elevated 
documents (still on top) are in any order, and they may be re-ordered by other 
sort fields (this will allow the use of the efficient but unsorted 
TrieSubsetMatcher in the other patch).
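
The two behaviors can be illustrated with a small sketch. It is not the QueryElevationComponent code: the Doc class and sortedIds() are hypothetical, and a descending score stands in for the "other sort fields".

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; not the QueryElevationComponent implementation. */
class ElevationOrderSketch {

  static final class Doc {
    final String id;
    final float score;

    Doc(String id, float score) {
      this.id = id;
      this.score = score;
    }
  }

  /** elevatedIds lists the elevated ids in the order of the config file. */
  static List<String> sortedIds(List<Doc> docs, List<String> elevatedIds,
                                boolean keepElevationPriority) {
    Map<String, Integer> priority = new HashMap<>();
    for (int i = 0; i < elevatedIds.size(); i++) {
      priority.put(elevatedIds.get(i), i);
    }
    // Elevated documents always come first.
    Comparator<Doc> order = (a, b) ->
        Boolean.compare(!priority.containsKey(a.id), !priority.containsKey(b.id));
    if (keepElevationPriority) {
      // Keep the config-file order among the elevated documents.
      order = order.thenComparingInt(d -> priority.getOrDefault(d.id, Integer.MAX_VALUE));
    }
    // Remaining ties are broken by the other sort, here descending score.
    order = order.thenComparing((a, b) -> Float.compare(b.score, a.score));

    List<Doc> sorted = new ArrayList<>(docs);
    sorted.sort(order);
    List<String> ids = new ArrayList<>();
    for (Doc doc : sorted) {
      ids.add(doc.id);
    }
    return ids;
  }
}
```

With keepElevationPriority=false the elevated documents stay on top, but their relative order is left to the other sort.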

> Refactor QueryElevationComponent to prepare query subset matching
> -
>
> Key: SOLR-11865
> URL: https://issues.apache.org/jira/browse/SOLR-11865
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SearchComponents - other
>Affects Versions: master (8.0)
>Reporter: Bruno Roustant
>Priority: Minor
>  Labels: QueryComponent
> Fix For: master (8.0)
>
> Attachments: 
> 0001-Refactor-QueryElevationComponent-to-introduce-Elevat.patch, 
> 0002-Refactor-QueryElevationComponent-after-review.patch, 
> 0003-Remove-exception-handlers-and-refactor-getBoostDocs.patch, 
> SOLR-11865.patch
>
>
> The goal is to prepare a second improvement to support query terms subset 
> matching or query elevation rules.
> Before that, we need to refactor the QueryElevationComponent. We make it 
> extendible. We introduce the ElevationProvider interface which will be 
> implemented later in a second patch to support subset matching. The current 
> full-query match policy becomes a default simple MapElevationProvider.
> - Add overridable methods to handle exceptions during the component 
> initialization.
> - Add overridable methods to provide the default values for config properties.
> - No functional change beyond refactoring.
> - Adapt unit test.






[jira] [Comment Edited] (SOLR-11865) Refactor QueryElevationComponent to prepare query subset matching

2018-04-24 Thread Bruno Roustant (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16450065#comment-16450065
 ] 

Bruno Roustant edited comment on SOLR-11865 at 4/24/18 3:37 PM:


Sorry for the delay.

Yes, if you can take it from here, that would be awesome!
 * Getters for defaults: you're right, there is no need. Please remove them.
 * keepElevationPriority as a constant in QEC: good point.
 * keepElevationPriority meaning:
 Actually the comment is not right, maybe the sorting has changed since the 
time I wrote this comment. I don't think it is linked anymore to forceElevation 
since the ElevationComparatorSource can be added as a SortField even if 
forceElevation=false when one sorts by score.
 The point is

 - with keepElevationPriority=true, the behavior is unchanged, the elevated 
documents (on top) are sorted by the order of the elevation rules and elevated 
ids in the config file.
 - with keepElevationPriority=false, the behavior changes, the elevated 
documents (still on top) are in any order (this will allow the use of the 
efficient but unsorted TrieSubsetMatcher in the other patch), and they may be 
re-ordered by other sort fields.


was (Author: bruno.roustant):
Sorry for the delay.

Yes, if you can take it from here, that would be awesome!
 * Getters for defaults: you're right, there is no need. Please remove them.
 * keepElevationPriority as a constant in QEC: good point.
 * keepElevationPriority meaning:
Actually the comment is not right, maybe the sorting has changed since the time 
I wrote this comment. I don't think it is linked anymore to forceElevation 
since the ElevationComparatorSource can be added as a SortField even if 
forceElevation=false when one sort by score.
The point is
- with keepElevationPriority=true, the behavior is unchanged, the elevated 
documents (on top) are sorted by the order of the elevation rules and elevated 
ids in the config file.
- with keepElevationPriority=false, the behavior changes, the elevated 
documents (still on top) are in any order, and they may be re-ordered by other 
sort fields (this will allow the use of the efficient but unsorted 
TrieSubsetMatcher in the other patch).




[jira] [Commented] (SOLR-11865) Refactor QueryElevationComponent to prepare query subset matching

2018-04-27 Thread Bruno Roustant (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16456470#comment-16456470
 ] 

Bruno Roustant commented on SOLR-11865:
---

Actually the TrieSubsetMatcher introduced by the next patch does not support 
keepElevationPriority. If keepElevationPriority=true, this matcher is replaced 
by another, which keeps the order but which is less efficient. And this is done 
at component initialization time, in the inform() method (in 
loadElevationProvider()).

So I think it cannot be a query param because it is fixed in the data structure 
at initialization time.




[jira] [Created] (LUCENE-8292) Fix FilterLeafReader.FilterTermsEnum to delegate all seekExact methods

2018-05-03 Thread Bruno Roustant (JIRA)
Bruno Roustant created LUCENE-8292:
--

 Summary: Fix FilterLeafReader.FilterTermsEnum to delegate all 
seekExact methods
 Key: LUCENE-8292
 URL: https://issues.apache.org/jira/browse/LUCENE-8292
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Affects Versions: 7.2.1
Reporter: Bruno Roustant
 Fix For: trunk


FilterLeafReader#FilterTermsEnum wraps another TermsEnum and delegates many 
methods.

It misses some seekExact() methods, thus it is not possible for the delegate 
to override these methods to have specific behavior (unlike the TermsEnum API 
which allows that).

The fix is straightforward: simply override these seekExact() methods and 
delegate.
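
The effect can be reproduced with a simplified sketch. The classes below are hypothetical stand-ins, not Lucene's TermsEnum or FilterTermsEnum; in particular seekCeil() returns a boolean here instead of a seek status. PartialFilter forwards only seekCeil(), so the delegate's specialized seekExact() is never reached; FullFilter adds the missing delegation.

```java
import java.util.Arrays;
import java.util.TreeSet;

/** Illustrative sketch only; simplified stand-ins, not Lucene classes. */
class SeekDelegationSketch {

  abstract static class Terms {
    /** Positions on the smallest term >= target; returns true on an exact match. */
    abstract boolean seekCeil(String target);

    /** Default implementation: falls back to seekCeil. */
    boolean seekExact(String target) {
      return seekCeil(target);
    }
  }

  /** A delegate with a specialized seekExact that only looks for the exact term. */
  static class FastExactTerms extends Terms {
    final TreeSet<String> terms;
    int seekCeilCalls;
    int seekExactCalls;

    FastExactTerms(String... values) {
      this.terms = new TreeSet<>(Arrays.asList(values));
    }

    @Override
    boolean seekCeil(String target) {
      seekCeilCalls++;
      return target.equals(terms.ceiling(target));
    }

    @Override
    boolean seekExact(String target) {
      seekExactCalls++;
      return terms.contains(target);
    }
  }

  /** Forwards seekCeil only: seekExact falls back to the default above. */
  static class PartialFilter extends Terms {
    final Terms in;

    PartialFilter(Terms in) {
      this.in = in;
    }

    @Override
    boolean seekCeil(String target) {
      return in.seekCeil(target);
    }
  }

  /** Also forwards seekExact, so the delegate's specialization is used. */
  static class FullFilter extends PartialFilter {
    FullFilter(Terms in) {
      super(in);
    }

    @Override
    boolean seekExact(String target) {
      return in.seekExact(target);
    }
  }
}
```

Calling seekExact() through PartialFilter increments only seekCeilCalls on the delegate, while FullFilter increments seekExactCalls.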






[jira] [Updated] (LUCENE-8292) Fix FilterLeafReader.FilterTermsEnum to delegate all seekExact methods

2018-05-03 Thread Bruno Roustant (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Roustant updated LUCENE-8292:
---
Attachment: LUCENE-8292.patch
0001-Fix-FilterLeafReader.FilterTermsEnum-to-delegate-see.patch

> Fix FilterLeafReader.FilterTermsEnum to delegate all seekExact methods
> --
>
> Key: LUCENE-8292
> URL: https://issues.apache.org/jira/browse/LUCENE-8292
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Affects Versions: 7.2.1
>    Reporter: Bruno Roustant
>Priority: Major
> Fix For: trunk
>
> Attachments: 
> 0001-Fix-FilterLeafReader.FilterTermsEnum-to-delegate-see.patch, 
> LUCENE-8292.patch
>
>
> FilterLeafReader#FilterTermsEnum wraps another TermsEnum and delegates many 
> methods.
> It misses some seekExact() methods, thus it is not possible for the delegate 
> to override these methods to have specific behavior (unlike the TermsEnum API 
> which allows that).
> The fix is straightforward: simply override these seekExact() methods and 
> delegate.






[jira] [Commented] (LUCENE-8292) Fix FilterLeafReader.FilterTermsEnum to delegate all seekExact methods

2018-05-03 Thread Bruno Roustant (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16462368#comment-16462368
 ] 

Bruno Roustant commented on LUCENE-8292:


1- "Not possible to override": I was not clear. It is still possible for a 
delegate TermsEnum to override the seekExact() method. But it will never be 
called since the FilterTermsEnum above always calls seekCeil().

2- "Two more methods to override": You're right. Although normally the same 
code should be reusable, it should not be tedious. I see the trappy point.




[jira] [Commented] (LUCENE-8292) Fix FilterLeafReader.FilterTermsEnum to delegate all seekExact methods

2018-05-03 Thread Bruno Roustant (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16462394#comment-16462394
 ] 

Bruno Roustant commented on LUCENE-8292:


When looking at TermsEnum API, what I understand is that seekExact() defaults 
to calling seekCeil(), but if needed (not for correctness but for performance 
consideration) we can override it to have a specialized seek that searches only 
the exact term and does not have to position to the next term if not found.

This may have an impact for some TermsEnum extensions (a really noticeable 
impact in my case, that's why I noticed this issue). To me the current behavior 
of FilterTermsEnum is not correct with regard to TermsEnum API. (And I noticed 
that AssertingLeafReader overrides seekExact()).

Adding these two methods to FilterTermsEnum fixes correctness, even if I agree 
it makes more room for bugs.




[jira] [Comment Edited] (LUCENE-8292) Fix FilterLeafReader.FilterTermsEnum to delegate all seekExact methods

2018-05-03 Thread Bruno Roustant (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16462394#comment-16462394
 ] 

Bruno Roustant edited comment on LUCENE-8292 at 5/3/18 1:03 PM:


When looking at TermsEnum API, what I understand is that seekExact() defaults 
to calling seekCeil(), but if needed (not for correctness but for performance 
consideration) we can override it to have a specialized seek that searches only 
the exact term and does not have to position to the next term if not found.

This may have an impact for some TermsEnum extensions (a really noticeable 
impact in my case, that's why I noticed this issue). To me the current behavior 
of FilterTermsEnum is not correct with regard to TermsEnum API. (And I noticed 
that AssertingLeafReader overrides seekExact()).

Adding these two methods in FilterTermsEnum fixes correctness, even if I agree 
it makes more room for bugs.


was (Author: bruno.roustant):
When looking at TermsEnum API, what I understand is that seekExact() defaults 
to calling seekCeil(), but if needed (not for correctness but for performance 
consideration) we can override it to have a specialized seek that searches only 
the exact term and does not have to position to the next term if not found.

This may have an impact for some TermsEnum extensions (a really noticeable 
impact in my case, that's why I noticed this issue). To me the current behavior 
of FilterTermsEnum is not correct with regard to TermsEnum API. (And I noticed 
that AssertingLeafReader overrides seekExact()).

Adding this two methods in FilterTermsEnum fixes correctness, even if I agree 
it makes more room for bugs.




[jira] [Comment Edited] (LUCENE-8292) Fix FilterLeafReader.FilterTermsEnum to delegate all seekExact methods

2018-05-03 Thread Bruno Roustant (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16462409#comment-16462409
 ] 

Bruno Roustant edited comment on LUCENE-8292 at 5/3/18 1:08 PM:


Another option would be to modify the TermsEnum.seekExact() method and make it 
final, or have the javadoc be explicit that it should not be overridden. 
(though I don't like this option)


was (Author: bruno.roustant):
Another option would be to modify the TermsEnum.seekExact() method and make it 
final, or have the javadoc be explicit that it should not be overridden.




[jira] [Commented] (LUCENE-8292) Fix FilterLeafReader.FilterTermsEnum to delegate all seekExact methods

2018-05-03 Thread Bruno Roustant (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16462409#comment-16462409
 ] 

Bruno Roustant commented on LUCENE-8292:


Another option would be to modify the TermsEnum.seekExact() method and make it 
final, or have the javadoc be explicit that it should not be overridden.




[jira] [Commented] (LUCENE-8292) Fix FilterLeafReader.FilterTermsEnum to delegate all seekExact methods

2018-05-07 Thread Bruno Roustant (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16465767#comment-16465767
 ] 

Bruno Roustant commented on LUCENE-8292:


I just realized that the current no-default-override behavior is actually 
enforced by a test TestFilterLeafReader.testOverrideMethods.

I still think all methods should be overridden, but I understand that this may 
not be the expected behavior currently.




[jira] [Commented] (LUCENE-8292) Fix FilterLeafReader.FilterTermsEnum to delegate all seekExact methods

2018-05-07 Thread Bruno Roustant (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16465887#comment-16465887
 ] 

Bruno Roustant commented on LUCENE-8292:


[~dsmiley], if I create a subclass of FilterTermsEnum to override seekExact, 
how can I make other classes in Lucene create this subclass instead of 
FilterTermsEnum? Would I have to also override other classes or other factories?




[jira] [Commented] (LUCENE-8159) Add a copy constructor in AutomatonQuery to copy directly the compiled automaton

2018-02-27 Thread Bruno Roustant (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16378373#comment-16378373
 ] 

Bruno Roustant commented on LUCENE-8159:


{quote}I'd rather like to expose an expert constructor that takes a compiled 
automaton and expect users to compile the automaton themselves if they plan to 
reuse it in multiple queries?
{quote}
I can speak as such a "user", since I have this use case. We often build 
queries with the same prefix/wildcard query for multiple different fields (and 
sometimes many fields). As a user, I really appreciate being able to simply 
copy a PrefixQuery or WildcardQuery rather than building the automaton myself. 
The inner automaton inside PrefixQuery is hidden, and the logic is internal to 
the PrefixQuery. I don't want to have to know how it is built.

I agree with exposing the compiled automaton, although I find the copy 
constructor easier to use.
{quote}Should PrefixQuery & WildcardQuery & TermRangeQuery have the same 
constructors too?
{quote}
I indeed prepared the same copy constructors for these classes. I didn't have 
time to resubmit the patch yet, but that's the idea, yes.

> Add a copy constructor in AutomatonQuery to copy directly the compiled 
> automaton
> 
>
>                 Key: LUCENE-8159
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8159
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/search
>    Affects Versions: trunk
>            Reporter: Bruno Roustant
>            Assignee: David Smiley
>            Priority: Major
> Attachments: 
> 0001-Add-a-copy-constructor-in-AutomatonQuery-to-copy-dir.patch, 
> LUCENE-8159.patch
>
>
> When the query is composed of multiple AutomatonQuery with the same automaton 
> and which target different fields, it is much more efficient to reuse the 
> already compiled automaton by copying it directly and just changing the 
> target field.
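
The idea of reusing a compiled form across fields can be sketched as follows. This is only an illustration under stated assumptions, not the attached patch: CompiledPrefixQuery is a hypothetical class, and java.util.regex.Pattern stands in for a compiled automaton because it is likewise costly to build and safe to share once built.

```java
import java.util.regex.Pattern;

// Hypothetical sketch: java.util.regex.Pattern stands in for a compiled automaton.
final class CompiledPrefixQuery {
  static int compilations; // counts how often the expensive step ran

  final String field;
  final Pattern compiled;

  // Regular constructor: compiles the prefix into its matching structure.
  CompiledPrefixQuery(String field, String prefix) {
    this.field = field;
    this.compiled = Pattern.compile(Pattern.quote(prefix) + ".*");
    compilations++;
  }

  // Copy constructor: reuse the already compiled form, change only the field.
  CompiledPrefixQuery(CompiledPrefixQuery other, String newField) {
    this.field = newField;
    this.compiled = other.compiled;
  }

  boolean matches(String term) {
    return compiled.matcher(term).matches();
  }
}
```

A caller would build the query once for the first field and then copy it for each additional field, so the compile step runs a single time however many fields are targeted.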






[jira] [Comment Edited] (LUCENE-8159) Add a copy constructor in AutomatonQuery to copy directly the compiled automaton

2018-02-27 Thread Bruno Roustant (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16378373#comment-16378373
 ] 

Bruno Roustant edited comment on LUCENE-8159 at 2/27/18 10:42 AM:
--

{quote}I'd rather like to expose an expert constructor that takes a compiled 
automaton and expect users to compile the automaton themselves if they plan to 
reuse it in multiple queries?
{quote}
I can speak as such a "user", since I have this use case. We often build 
queries with the same prefix/wildcard query for multiple different fields (and 
sometimes many fields). As a user, I really appreciate being able to simply 
copy a PrefixQuery or WildcardQuery rather than building the automaton myself. 
The inner automaton inside PrefixQuery is hidden, and the logic is internal to 
the PrefixQuery. I don't want to have to know how it is built.

I agree with exposing the compiled automaton.
{quote}Should PrefixQuery & WildcardQuery & TermRangeQuery have the same 
constructors too?
{quote}
I indeed prepared the same copy constructors for these classes. I didn't have 
time to resubmit the patch yet, but that's the idea, yes.





[jira] [Comment Edited] (LUCENE-8159) Add a copy constructor in AutomatonQuery to copy directly the compiled automaton

2018-02-27 Thread Bruno Roustant (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16378373#comment-16378373
 ] 

Bruno Roustant edited comment on LUCENE-8159 at 2/27/18 10:44 AM:
--

{quote}I'd rather like to expose an expert constructor that takes a compiled 
automaton and expect users to compile the automaton themselves if they plan to 
reuse it in multiple queries?
{quote}
I can speak as such a "user", since I have this use case. We often build 
queries with the same prefix/wildcard query for multiple different fields (and 
sometimes many fields - in this case the optimization does help). As a user, I 
really appreciate being able to simply copy a PrefixQuery or WildcardQuery 
rather than building the automaton myself. The inner automaton inside 
PrefixQuery is hidden, and the logic is internal to the PrefixQuery. I don't 
want to have to know how it is built.

I agree with exposing the compiled automaton.
{quote}Should PrefixQuery & WildcardQuery & TermRangeQuery have the same 
constructors too?
{quote}
I indeed prepared the same copy constructors for these classes. I didn't have 
time to resubmit the patch yet, but that's the idea, yes.





[jira] [Comment Edited] (LUCENE-8159) Add a copy constructor in AutomatonQuery to copy directly the compiled automaton

2018-02-27 Thread Bruno Roustant (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16378373#comment-16378373
 ] 

Bruno Roustant edited comment on LUCENE-8159 at 2/27/18 11:29 AM:
--

{quote}I'd rather like to expose an expert constructor that takes a compiled 
automaton and expect users to compile the automaton themselves if they plan to 
reuse it in multiple queries?
{quote}
I can speak as such a "user", since I have this use case. We often build 
queries with the same prefix/wildcard query for multiple different fields (and 
sometimes many fields - in this case the optimization does help). As a user, I 
really appreciate being able to simply copy a PrefixQuery or WildcardQuery 
rather than building the automaton myself. The inner automaton inside 
PrefixQuery is hidden, and the logic is internal to the PrefixQuery. I don't 
want to have to know how it is built.

I agree with exposing the compiled automaton. But I still think PrefixQuery and 
WildcardQuery would benefit from a new constructor. This constructor cannot 
really take an arbitrary automaton as a parameter, since that could break the 
prefix/wildcard contract. So, to me, PrefixQuery and WildcardQuery should have 
their own copy constructors.
{quote}Should PrefixQuery & WildcardQuery & TermRangeQuery have the same 
constructors too?
{quote}
I indeed prepared the same copy constructors for these classes. I didn't have 
time to resubmit the patch yet, but that's the idea, yes.





[jira] [Commented] (LUCENE-8159) Add a copy constructor in AutomatonQuery to copy directly the compiled automaton

2018-02-28 Thread Bruno Roustant (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16380407#comment-16380407
 ] 

Bruno Roustant commented on LUCENE-8159:


[~rcmuir] could you be a little more explicit?

Without context I don't understand why a copy constructor is bad in Java in 
general.




[jira] [Comment Edited] (LUCENE-8159) Add a copy constructor in AutomatonQuery to copy directly the compiled automaton

2018-02-28 Thread Bruno Roustant (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16380407#comment-16380407
 ] 

Bruno Roustant edited comment on LUCENE-8159 at 2/28/18 2:58 PM:
-

[~rcmuir] could you be a little more explicit?

Without context I don't understand why a copy constructor is bad in Java in 
general.

Do you mean you prefer a copy method?

PrefixQuery copy(String field)





[jira] [Commented] (LUCENE-8159) Add a copy constructor in AutomatonQuery to copy directly the compiled automaton

2018-03-05 Thread Bruno Roustant (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385824#comment-16385824
 ] 

Bruno Roustant commented on LUCENE-8159:


Ok. I'll let you guys decide whether to discard this patch.

[~jpountz] I'm curious about searching a lot of fields.
{quote}searching over lots of fields is a bad practice
{quote}
Could you tell me why it is a bad practice? Is it due to the performance 
impact, or are there other reasons by design?

Generally, customer organizations love to have lots of fields. While I agree 
that sometimes they should revisit their data partitioning, there are cases 
where searching many fields helps (e.g. CRM, field-level security, ML ranking 
models based on field matches).




[jira] [Commented] (LUCENE-8292) Fix FilterLeafReader.FilterTermsEnum to delegate all seekExact methods

2018-05-15 Thread Bruno Roustant (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16475579#comment-16475579
 ] 

Bruno Roustant commented on LUCENE-8292:


Actually there is also another related issue with this 
FilterLeafReader#FilterTermsEnum delegate pattern.

It delegates neither termState() nor seekExact(BytesRef, TermState), which 
means the termState is never used, so term queries perform the same seek 
(seekCeil) twice instead of using the termState to improve performance 
(normally the termState is kept by TermContext#build()).

Practical example: When one configures a timeout for queries, internally an 
ExitableDirectoryReader is created, and its ExitableTermsEnum, which extends 
FilterTermsEnum, makes all term queries repeat twice the same seekCeil().
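
The double seek can be reproduced with a simplified model. Everything here is an assumption made for illustration rather than the Lucene API: ModelTermsEnum, CountingEnum, CeilOnlyFilter, StateAwareFilter and TermQueryFlow are hypothetical names, terms are strings, and a "term state" is just an opaque object. The sketch only shows why a filter that leaves termState() and the two-argument seekExact at their defaults pays for the lookup twice.

```java
// Hypothetical, simplified model (not the Lucene API).
abstract class ModelTermsEnum {
  abstract boolean seekCeil(String term);

  boolean seekExact(String term) {
    return seekCeil(term);
  }

  // Default: no reusable state is available.
  Object termState() {
    return null;
  }

  // Default: the state is ignored and a full seek is performed again.
  void seekExact(String term, Object state) {
    if (!seekExact(term)) {
      throw new IllegalArgumentException("term not found: " + term);
    }
  }
}

// The wrapped enum: counts real lookups and can reposition from a saved state.
final class CountingEnum extends ModelTermsEnum {
  int seeks; // number of real dictionary lookups
  private String current;

  @Override
  boolean seekCeil(String term) {
    seeks++;
    current = term;
    return true;
  }

  @Override
  Object termState() {
    return current;
  }

  @Override
  void seekExact(String term, Object state) {
    current = (String) state; // repositions without a lookup
  }
}

// Forwards seekCeil only, leaving the state-related methods at their defaults.
class CeilOnlyFilter extends ModelTermsEnum {
  protected final ModelTermsEnum in;

  CeilOnlyFilter(ModelTermsEnum in) {
    this.in = in;
  }

  @Override
  boolean seekCeil(String term) {
    return in.seekCeil(term);
  }
}

// Also forwards termState() and seekExact(term, state).
final class StateAwareFilter extends CeilOnlyFilter {
  StateAwareFilter(ModelTermsEnum in) {
    super(in);
  }

  @Override
  Object termState() {
    return in.termState();
  }

  @Override
  void seekExact(String term, Object state) {
    in.seekExact(term, state);
  }
}

final class TermQueryFlow {
  // Mimics a term query: look the term up once, keep its state, reposition later.
  static void run(ModelTermsEnum termsEnum, String term) {
    termsEnum.seekExact(term);
    Object state = termsEnum.termState();
    termsEnum.seekExact(term, state);
  }
}
```

Running TermQueryFlow.run through CeilOnlyFilter costs two lookups on the wrapped enum, while StateAwareFilter costs one.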




[jira] [Comment Edited] (LUCENE-8292) Fix FilterLeafReader.FilterTermsEnum to delegate all seekExact methods

2018-05-15 Thread Bruno Roustant (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16475579#comment-16475579
 ] 

Bruno Roustant edited comment on LUCENE-8292 at 5/15/18 9:57 AM:
-

Actually there is also another related issue with this 
FilterLeafReader#FilterTermsEnum delegate pattern.

It delegates neither termState() nor seekExact(BytesRef, TermState), which 
means the termState is never used, so term queries perform the same seek 
(seekCeil) twice instead of using the termState to improve performance 
(normally the termState is kept by TermContext#build()).

Practical example: When one configures a timeout for queries, internally an 
ExitableDirectoryReader is created. And its ExitableTermsEnum, which extends 
FilterTermsEnum, makes all term queries repeat twice the same seekCeil().





[jira] [Commented] (SOLR-11865) Refactor QueryElevationComponent to prepare query subset matching

2018-05-15 Thread Bruno Roustant (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16476032#comment-16476032
 ] 

Bruno Roustant commented on SOLR-11865:
---

Great! I agree with all your points [~dsmiley].

Indeed, the String IDs in Elevation would be clearer as BytesRefs. I also vote 
to apply the key String => indexed form conversion as early as possible, 
provided the code remains small.

> Refactor QueryElevationComponent to prepare query subset matching
> -
>
> Key: SOLR-11865
> URL: https://issues.apache.org/jira/browse/SOLR-11865
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SearchComponents - other
>Affects Versions: master (8.0)
>Reporter: Bruno Roustant
>Priority: Minor
>  Labels: QueryComponent
> Fix For: master (8.0)
>
> Attachments: 
> 0001-Refactor-QueryElevationComponent-to-introduce-Elevat.patch, 
> 0002-Refactor-QueryElevationComponent-after-review.patch, 
> 0003-Remove-exception-handlers-and-refactor-getBoostDocs.patch, 
> SOLR-11865.patch
>
>
> The goal is to prepare a second improvement to support query terms subset 
> matching or query elevation rules.
> Before that, we need to refactor the QueryElevationComponent. We make it 
> extendible. We introduce the ElevationProvider interface which will be 
> implemented later in a second patch to support subset matching. The current 
> full-query match policy becomes a default simple MapElevationProvider.
> - Add overridable methods to handle exceptions during the component 
> initialization.
> - Add overridable methods to provide the default values for config properties.
> - No functional change beyond refactoring.
> - Adapt unit test.






[jira] [Commented] (SOLR-11865) Refactor QueryElevationComponent to prepare query subset matching

2018-03-30 Thread Bruno Roustant (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16420272#comment-16420272
 ] 

Bruno Roustant commented on SOLR-11865:
---

1- InitializationExceptionHandler & LoadingExceptionHandler:

At Salesforce (i.e. in a multi-tenant context) we allow each organization admin 
to update the list of elevation rules dynamically. When some rules are updated, 
the core corresponding to the organization is updated to reload the elevation 
rules XML. It is important to note that the organization admin - the person who 
defines the elevation rules - is not a Solr admin expert. He needs to get clear 
feedback on any error that may prevent the rules from being loaded. The XML 
rules are considered dynamic config rather than static config.

In its original version, the QueryElevationComponent simply throws an exception.

In this new version, it differentiates the error cause and lets an extending 
class (e.g. specific Salesforce extension) override the loading exception and 
take appropriate actions (logging, warning, etc) instead of simply throwing the 
Solr exception.
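
The overridable-hook idea can be sketched like this. It is a minimal illustration and not the code in the patch: ElevationLoaderSketch, ReportingLoader, handleLoadingException and the "query=docId" line format are all hypothetical, chosen only to show a default that throws and an extension that reports instead.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of an overridable loading-error hook (not the Solr class).
class ElevationLoaderSketch {
  // Parses "query=docId" lines; a malformed line is reported through the hook.
  final List<String> load(List<String> lines) {
    List<String> rules = new ArrayList<>();
    for (String line : lines) {
      try {
        rules.add(parse(line));
      } catch (IllegalArgumentException e) {
        handleLoadingException(e);
      }
    }
    return rules;
  }

  private static String parse(String line) {
    int eq = line.indexOf('=');
    if (eq <= 0 || eq == line.length() - 1) {
      throw new IllegalArgumentException("Malformed elevation rule: " + line);
    }
    return line;
  }

  // Default behavior: fail the whole load by rethrowing.
  protected void handleLoadingException(IllegalArgumentException e) {
    throw e;
  }
}

// An extension that records feedback for the rule author instead of failing.
final class ReportingLoader extends ElevationLoaderSketch {
  final List<String> feedback = new ArrayList<>();

  @Override
  protected void handleLoadingException(IllegalArgumentException e) {
    feedback.add(e.getMessage());
  }
}
```

With the base class, one bad rule aborts loading; with the extension, the valid rules are kept and the error message is available to show to whoever edited the rules.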




[jira] [Commented] (SOLR-11865) Refactor QueryElevationComponent to prepare query subset matching

2018-03-30 Thread Bruno Roustant (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16420300#comment-16420300
 ] 

Bruno Roustant commented on SOLR-11865:
---

2- ElevationProvider should be immutable and simplified:

Good point. createElevationProvider() accepts the elevationBuilderMap. 
getElevationForQuery() does not throw IOException.

ElevationProvider.size is used by tests to verify the number of parsed rules. I 
added @VisibleForTesting annotation.
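
As a rough illustration of what an immutable, map-backed provider could look like: only the names ElevationProvider, MapElevationProvider, getElevationForQuery, elevationBuilderMap and size come from the discussion. The signatures, the use of String for queries and document IDs, and the constructor-based creation are assumptions made for this sketch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified shape; signatures are assumed, not taken from the patch.
interface ElevationProvider {
  // Declares no checked exception.
  List<String> getElevationForQuery(String query);

  // In the discussion, size is used by tests to count the parsed rules.
  int size();
}

final class MapElevationProvider implements ElevationProvider {
  private final Map<String, List<String>> elevations;

  // Defensive deep copy: later changes to the builder map do not leak in.
  MapElevationProvider(Map<String, List<String>> elevationBuilderMap) {
    Map<String, List<String>> copy = new HashMap<>();
    for (Map.Entry<String, List<String>> e : elevationBuilderMap.entrySet()) {
      copy.put(e.getKey(), Collections.unmodifiableList(new ArrayList<>(e.getValue())));
    }
    this.elevations = Collections.unmodifiableMap(copy);
  }

  // Full-query match: the whole query string must equal a rule key.
  @Override
  public List<String> getElevationForQuery(String query) {
    List<String> ids = elevations.get(query);
    return ids == null ? Collections.<String>emptyList() : ids;
  }

  @Override
  public int size() {
    return elevations.size();
  }
}
```

Because the provider copies its input at construction time, it can be shared across requests without synchronization, and reloading rules amounts to building a new provider and swapping the reference.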




[jira] [Commented] (SOLR-11865) Refactor QueryElevationComponent to prepare query subset matching

2018-03-30 Thread Bruno Roustant (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16420302#comment-16420302
 ] 

Bruno Roustant commented on SOLR-11865:
---

4- No "Can be overridden by extending this class".

Sure. Removed.




[jira] [Commented] (SOLR-11865) Refactor QueryElevationComponent to prepare query subset matching

2018-03-30 Thread Bruno Roustant (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16420301#comment-16420301
 ] 

Bruno Roustant commented on SOLR-11865:
---

3- The indentation around line ~671 (contents of the for loop) is messed up.

I didn't change that part. I'll try to fix the indentation.

> Refactor QueryElevationComponent to prepare query subset matching
> -
>
> Key: SOLR-11865
> URL: https://issues.apache.org/jira/browse/SOLR-11865
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SearchComponents - other
>Affects Versions: master (8.0)
>Reporter: Bruno Roustant
>Priority: Minor
>  Labels: QueryComponent
> Fix For: master (8.0)
>
> Attachments: 
> 0001-Refactor-QueryElevationComponent-to-introduce-Elevat.patch, 
> SOLR-11865.patch
>
>
> The goal is to prepare a second improvement to support query terms subset 
> matching or query elevation rules.
> Before that, we need to refactor the QueryElevationComponent. We make it 
> extendible. We introduce the ElevationProvider interface which will be 
> implemented later in a second patch to support subset matching. The current 
> full-query match policy becomes a default simple MapElevationProvider.
> - Add overridable methods to handle exceptions during the component 
> initialization.
> - Add overridable methods to provide the default values for config properties.
> - No functional change beyond refactoring.
> - Adapt unit test.
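
A minimal sketch of the provider split described in the issue summary above, assuming a much simplified shape. The method name getElevatedIds and the String-keyed map are illustrative assumptions, not the signatures used in the patch.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified interface: the one in the patch may look different.
interface ElevationProvider {
  /** Returns the ids to elevate for this query, or an empty list if no rule matches. */
  List<String> getElevatedIds(String queryString);
}

// Default policy: a rule applies only when the full query string matches its key.
class MapElevationProvider implements ElevationProvider {
  private final Map<String, List<String>> rules = new HashMap<>();

  MapElevationProvider(Map<String, List<String>> rules) {
    this.rules.putAll(rules);
  }

  @Override
  public List<String> getElevatedIds(String queryString) {
    return rules.getOrDefault(queryString, Collections.emptyList());
  }
}
```

With this shape, a later subset-matching policy would only need another implementation of the interface, and the component itself would not change.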






[jira] [Commented] (SOLR-11865) Refactor QueryElevationComponent to prepare query subset matching

2018-03-30 Thread Bruno Roustant (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16420308#comment-16420308
 ] 

Bruno Roustant commented on SOLR-11865:
---

5- Change comparator docVal (~line 1318) to use getOrDefault.

I didn't change that. Fixed.
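
For illustration only, a sketch of what the getOrDefault change looks like in a comparator. The class name, the priorities map and the default of 0 are assumptions made for the example, not the code at line ~1318.

```java
import java.util.Comparator;
import java.util.Map;

class DocValComparison {
  // Hypothetical map: document id -> elevation priority.
  static int priorityOf(Map<String, Integer> priorities, String id) {
    // Before: Integer p = priorities.get(id); return p == null ? 0 : p;
    return priorities.getOrDefault(id, 0);
  }

  // Orders ids by priority; ids without an entry get the default priority 0.
  static Comparator<String> byPriority(Map<String, Integer> priorities) {
    return Comparator.comparingInt(id -> priorityOf(priorities, id));
  }
}
```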




[jira] [Commented] (SOLR-11865) Refactor QueryElevationComponent to prepare query subset matching

2018-03-30 Thread Bruno Roustant (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16420311#comment-16420311
 ] 

Bruno Roustant commented on SOLR-11865:
---

6- Use {{localBoosts.addAll(boosted.keySet());}} at line ~661 instead of manual 
looping.

Again, I didn't change that (and I didn't want to touch existing code without 
reason).

I fixed it by removing localBoosts altogether, since it was an exact copy of 
boosts.keySet() (the boosts parameter is a map).
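
The three variants discussed in this point can be sketched as follows. The element types and method names are assumptions for the example. Only localBoosts, boosts and the addAll call come from the review.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class BoostKeys {
  // Manual looping, as in the pre-existing code.
  static Set<String> copyWithLoop(Map<String, Integer> boosts) {
    Set<String> localBoosts = new HashSet<>();
    for (String key : boosts.keySet()) {
      localBoosts.add(key);
    }
    return localBoosts;
  }

  // The reviewer's suggestion: a single bulk call.
  static Set<String> copyWithAddAll(Map<String, Integer> boosts) {
    Set<String> localBoosts = new HashSet<>();
    localBoosts.addAll(boosts.keySet());
    return localBoosts;
  }

  // The fix described above: no copy at all, use the key set view directly.
  static Set<String> noCopy(Map<String, Integer> boosts) {
    return boosts.keySet();
  }
}
```

Note that keySet() is a live view of the map, so dropping the copy is only safe when the set is read and not modified afterwards.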




[jira] [Commented] (SOLR-11865) Refactor QueryElevationComponent to prepare query subset matching

2018-03-30 Thread Bruno Roustant (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16420313#comment-16420313
 ] 

Bruno Roustant commented on SOLR-11865:
---

7- In parseExcludedMarkerFieldName and parseEditorialMarkerFieldName.

Removed the if () block.




[jira] [Commented] (SOLR-11865) Refactor QueryElevationComponent to prepare query subset matching

2018-03-30 Thread Bruno Roustant (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16420317#comment-16420317
 ] 

Bruno Roustant commented on SOLR-11865:
---

8- Use a UnaryOperator instead of IndexedValueProvider.

Good point. It is still clear with less code.
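
A rough sketch of why the custom interface becomes unnecessary. The method getIndexedValue and the String-to-String shape are a guess at what IndexedValueProvider looked like, not taken from the patch.

```java
import java.util.function.UnaryOperator;

class IndexedValues {
  // Before (hypothetical reconstruction): a dedicated one-method interface.
  interface IndexedValueProvider {
    String getIndexedValue(String readableValue);
  }

  // After: the standard functional interface carries the same String -> String mapping.
  static String toIndexed(String readableValue, UnaryOperator<String> indexedValueProvider) {
    return indexedValueProvider.apply(readableValue);
  }
}
```

Callers can then pass a lambda or method reference, for example toIndexed(value, s -> s.trim()), without declaring an extra type.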




[jira] [Commented] (SOLR-11865) Refactor QueryElevationComponent to prepare query subset matching

2018-03-30 Thread Bruno Roustant (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16420319#comment-16420319
 ] 

Bruno Roustant commented on SOLR-11865:
---

9- Make the constructor of ElevatingQuery protected.

Done.




[jira] [Commented] (SOLR-11865) Refactor QueryElevationComponent to prepare query subset matching

2018-03-30 Thread Bruno Roustant (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16420322#comment-16420322
 ] 

Bruno Roustant commented on SOLR-11865:
---

10- seen.contains(id) == false.

I didn't know this Lucene practice. It explains why I see this strange 
construct.

"I recommend against modifying existing lines" - that's what I tried (see 
points 3,5,6 above) and I thought this "!seen.contains(id)" was tiny and 
harmless.
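
For readers who have not met the construct, here are the two equivalent spellings side by side, in a small self-contained example. The deduplication loop is made up for illustration. Presumably the "== false" form is preferred because a leading "!" is easy to overlook when reading.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class SeenFilter {
  // Keeps the first occurrence of each id, preserving the input order.
  static List<String> firstOccurrences(List<String> ids) {
    Set<String> seen = new HashSet<>();
    List<String> result = new ArrayList<>();
    for (String id : ids) {
      // Same meaning as: if (!seen.contains(id))
      if (seen.contains(id) == false) {
        seen.add(id);
        result.add(id);
      }
    }
    return result;
  }
}
```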




[jira] [Comment Edited] (SOLR-11865) Refactor QueryElevationComponent to prepare query subset matching

2018-03-30 Thread Bruno Roustant (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16420322#comment-16420322
 ] 

Bruno Roustant edited comment on SOLR-11865 at 3/30/18 9:30 AM:


10- seen.contains(id) == false.

I didn't know this Lucene practice. It explains why I see this strange 
construct.

"I recommend against modifying existing lines" - that's what I tried (see 
points 3,5,6 above) and I thought this "!seen.contains(id)" was tiny and 
harmless. And that's a warning highlighted by IntelliJ by the way :)






[jira] [Commented] (SOLR-11865) Refactor QueryElevationComponent to prepare query subset matching

2018-03-30 Thread Bruno Roustant (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16420329#comment-16420329
 ] 

Bruno Roustant commented on SOLR-11865:
---

11- subsetMatch flag in ElevatingQuery.

Yes, the idea is to support some queries with subset match, and others without. 
This will be supported by the next ElevationProvider in the next patch.
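
To illustrate what a per-query subsetMatch flag could mean, here is a sketch under one assumption: a rule matches in subset mode when all of its terms appear among the query terms, in any order. Whitespace tokenization and the names below are simplifications of my own, and the follow-up patch may define the matching differently.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class SubsetMatch {
  // Naive whitespace tokenization, for illustration only.
  static Set<String> terms(String text) {
    return new HashSet<>(Arrays.asList(text.trim().toLowerCase(Locale.ROOT).split("\\s+")));
  }

  // subsetMatch == true: every rule term must appear in the user query.
  // subsetMatch == false: the full query string must equal the rule query.
  static boolean matches(String ruleQuery, String userQuery, boolean subsetMatch) {
    if (subsetMatch) {
      return terms(userQuery).containsAll(terms(ruleQuery));
    }
    return ruleQuery.equals(userQuery);
  }
}
```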




[jira] [Updated] (SOLR-11865) Refactor QueryElevationComponent to prepare query subset matching

2018-03-30 Thread Bruno Roustant (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Roustant updated SOLR-11865:
--
Attachment: SOLR-11865.patch




[jira] [Updated] (SOLR-11865) Refactor QueryElevationComponent to prepare query subset matching

2018-03-30 Thread Bruno Roustant (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Roustant updated SOLR-11865:
--
Attachment: (was: SOLR-11865.patch)




[jira] [Updated] (SOLR-11865) Refactor QueryElevationComponent to prepare query subset matching

2018-03-30 Thread Bruno Roustant (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Roustant updated SOLR-11865:
--
Attachment: 0002-Refactor-QueryElevationComponent-after-review.patch

> Refactor QueryElevationComponent to prepare query subset matching
> -
>
> Key: SOLR-11865
> URL: https://issues.apache.org/jira/browse/SOLR-11865
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SearchComponents - other
> Affects Versions: master (8.0)
> Reporter: Bruno Roustant
> Priority: Minor
>  Labels: QueryComponent
> Fix For: master (8.0)
>
> Attachments: 
> 0001-Refactor-QueryElevationComponent-to-introduce-Elevat.patch, 
> 0002-Refactor-QueryElevationComponent-after-review.patch, SOLR-11865.patch



[jira] [Updated] (SOLR-11865) Refactor QueryElevationComponent to prepare query subset matching

2018-03-30 Thread Bruno Roustant (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Roustant updated SOLR-11865:
--
Attachment: (was: SOLR-11865.patch)




[jira] [Updated] (SOLR-11865) Refactor QueryElevationComponent to prepare query subset matching

2018-03-30 Thread Bruno Roustant (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Roustant updated SOLR-11865:
--
Attachment: (was: 
0001-Refactor-QueryElevationComponent-to-introduce-Elevat.patch)




[jira] [Updated] (SOLR-11865) Refactor QueryElevationComponent to prepare query subset matching

2018-03-30 Thread Bruno Roustant (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Roustant updated SOLR-11865:
--
Attachment: (was: 
0002-Refactor-QueryElevationComponent-after-review.patch)




[jira] [Updated] (SOLR-11865) Refactor QueryElevationComponent to prepare query subset matching

2018-03-30 Thread Bruno Roustant (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Roustant updated SOLR-11865:
--
Attachment: SOLR-11865.patch
0002-Refactor-QueryElevationComponent-after-review.patch
0001-Refactor-QueryElevationComponent-to-introduce-Elevat.patch

> Refactor QueryElevationComponent to prepare query subset matching
> -
>
> Key: SOLR-11865
> URL: https://issues.apache.org/jira/browse/SOLR-11865
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SearchComponents - other
> Affects Versions: master (8.0)
> Reporter: Bruno Roustant
> Priority: Minor
>  Labels: QueryComponent
> Fix For: master (8.0)
>
> Attachments: 
> 0001-Refactor-QueryElevationComponent-to-introduce-Elevat.patch, 
> 0002-Refactor-QueryElevationComponent-after-review.patch, SOLR-11865.patch



[jira] [Commented] (SOLR-11865) Refactor QueryElevationComponent to prepare query subset matching

2018-03-30 Thread Bruno Roustant (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16420620#comment-16420620
 ] 

Bruno Roustant commented on SOLR-11865:
---

[~dsmiley] I uploaded a new patch. Is it better now?




[jira] [Updated] (SOLR-11865) Refactor QueryElevationComponent to prepare query subset matching

2018-04-05 Thread Bruno Roustant (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Roustant updated SOLR-11865:
--
Attachment: (was: SOLR-11865.patch)

> Refactor QueryElevationComponent to prepare query subset matching
> -
>
> Key: SOLR-11865
> URL: https://issues.apache.org/jira/browse/SOLR-11865
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SearchComponents - other
> Affects Versions: master (8.0)
> Reporter: Bruno Roustant
> Priority: Minor
>  Labels: QueryComponent
> Fix For: master (8.0)
>
> Attachments: 
> 0001-Refactor-QueryElevationComponent-to-introduce-Elevat.patch, 
> 0002-Refactor-QueryElevationComponent-after-review.patch



[jira] [Updated] (SOLR-11865) Refactor QueryElevationComponent to prepare query subset matching

2018-04-05 Thread Bruno Roustant (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Roustant updated SOLR-11865:
--
Attachment: SOLR-11865.patch
0003-Remove-exception-handlers-and-refactor-getBoostDocs.patch

> Refactor QueryElevationComponent to prepare query subset matching
> -
>
> Key: SOLR-11865
> URL: https://issues.apache.org/jira/browse/SOLR-11865
> Project: Solr
> Issue Type: Improvement
> Security Level: Public (Default Security Level. Issues are Public)
> Components: SearchComponents - other
> Affects Versions: master (8.0)
> Reporter: Bruno Roustant
> Priority: Minor
> Labels: QueryComponent
> Fix For: master (8.0)
>
> Attachments: 
> 0001-Refactor-QueryElevationComponent-to-introduce-Elevat.patch, 
> 0002-Refactor-QueryElevationComponent-after-review.patch, 
> 0003-Remove-exception-handlers-and-refactor-getBoostDocs.patch, 
> SOLR-11865.patch
>
>
> The goal is to prepare a second improvement to support query terms subset
> matching in query elevation rules.
> Before that, we need to refactor the QueryElevationComponent to make it
> extensible. We introduce the ElevationProvider interface, which will be
> implemented later in a second patch to support subset matching. The current
> full-query match policy becomes the default, a simple MapElevationProvider.
> - Add overridable methods to handle exceptions during the component 
> initialization.
> - Add overridable methods to provide the default values for config properties.
> - No functional change beyond refactoring.
> - Adapt unit test.






[jira] [Commented] (SOLR-11865) Refactor QueryElevationComponent to prepare query subset matching

2018-04-05 Thread Bruno Roustant (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16426589#comment-16426589
 ] 

Bruno Roustant commented on SOLR-11865:
---

New delta patch with the modification mentioned.

Eventually I'll squash the commits to produce a single patch that should be 
supported by "Yetus" (currently I simply use git format-patch and it produces 
three separate patch files for three commits).

> Refactor QueryElevationComponent to prepare query subset matching
> -
>
> Key: SOLR-11865
> URL: https://issues.apache.org/jira/browse/SOLR-11865
> Project: Solr
> Issue Type: Improvement
> Security Level: Public (Default Security Level. Issues are Public)
> Components: SearchComponents - other
> Affects Versions: master (8.0)
> Reporter: Bruno Roustant
> Priority: Minor
> Labels: QueryComponent
> Fix For: master (8.0)
>
> Attachments: 
> 0001-Refactor-QueryElevationComponent-to-introduce-Elevat.patch, 
> 0002-Refactor-QueryElevationComponent-after-review.patch, 
> 0003-Remove-exception-handlers-and-refactor-getBoostDocs.patch, 
> SOLR-11865.patch
>
>
> The goal is to prepare a second improvement to support query terms subset
> matching in query elevation rules.
> Before that, we need to refactor the QueryElevationComponent to make it
> extensible. We introduce the ElevationProvider interface, which will be
> implemented later in a second patch to support subset matching. The current
> full-query match policy becomes the default, a simple MapElevationProvider.
> - Add overridable methods to handle exceptions during the component 
> initialization.
> - Add overridable methods to provide the default values for config properties.
> - No functional change beyond refactoring.
> - Adapt unit test.






[jira] [Comment Edited] (SOLR-11865) Refactor QueryElevationComponent to prepare query subset matching

2018-04-05 Thread Bruno Roustant (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16426589#comment-16426589
 ] 

Bruno Roustant edited comment on SOLR-11865 at 4/5/18 7:44 AM:
---

New delta patch with the modifications mentioned.

Eventually I'll squash the commits to produce a single patch that should be 
supported by "Yetus" (currently I simply use git format-patch and it produces 
three separate patch files for three commits).


was (Author: bruno.roustant):
New delta patch with the modification mentioned.

Eventually I'll squash the commits to produce a single patch that should be 
supported by "Yetus" (currently I simply use git format-patch and it produces 
three separate patch files for three commits).

> Refactor QueryElevationComponent to prepare query subset matching
> -
>
> Key: SOLR-11865
> URL: https://issues.apache.org/jira/browse/SOLR-11865
> Project: Solr
> Issue Type: Improvement
> Security Level: Public (Default Security Level. Issues are Public)
> Components: SearchComponents - other
> Affects Versions: master (8.0)
> Reporter: Bruno Roustant
> Priority: Minor
> Labels: QueryComponent
> Fix For: master (8.0)
>
> Attachments: 
> 0001-Refactor-QueryElevationComponent-to-introduce-Elevat.patch, 
> 0002-Refactor-QueryElevationComponent-after-review.patch, 
> 0003-Remove-exception-handlers-and-refactor-getBoostDocs.patch, 
> SOLR-11865.patch
>
>
> The goal is to prepare a second improvement to support query terms subset
> matching in query elevation rules.
> Before that, we need to refactor the QueryElevationComponent to make it
> extensible. We introduce the ElevationProvider interface, which will be
> implemented later in a second patch to support subset matching. The current
> full-query match policy becomes the default, a simple MapElevationProvider.
> - Add overridable methods to handle exceptions during the component 
> initialization.
> - Add overridable methods to provide the default values for config properties.
> - No functional change beyond refactoring.
> - Adapt unit test.






[jira] [Created] (SOLR-11865) Refactor QueryElevationComponent to prepare query subset matching

2018-01-17 Thread Bruno Roustant (JIRA)
Bruno Roustant created SOLR-11865:
-

 Summary: Refactor QueryElevationComponent to prepare query subset 
matching
 Key: SOLR-11865
 URL: https://issues.apache.org/jira/browse/SOLR-11865
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
  Components: SearchComponents - other
Affects Versions: master (8.0)
Reporter: Bruno Roustant
 Fix For: master (8.0)


The goal is to prepare a second improvement to support query terms subset
matching in query elevation rules.

Before that, we need to refactor the QueryElevationComponent to make it
extensible. We introduce the ElevationProvider interface, which will be
implemented later in a second patch to support subset matching. The current
full-query match policy becomes the default, a simple MapElevationProvider.

- Add overridable methods to handle exceptions during the component 
initialization.
- Add overridable methods to provide the default values for config properties.
- No functional change beyond refactoring.
- Adapt unit test.
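
As a rough illustration of the provider split described above, here is a
minimal Java sketch. Only the names ElevationProvider and MapElevationProvider
come from this issue; the Elevation class, the getElevationForQuery method and
the constructor shape are assumptions made for the example and are not taken
from the patch.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical value type: the ids of the documents to elevate for one rule.
final class Elevation {
    final List<String> elevatedIds;

    Elevation(List<String> elevatedIds) {
        this.elevatedIds = elevatedIds;
    }
}

// Interface name taken from the issue; the single method is an assumed shape.
interface ElevationProvider {
    // Returns the elevation that applies to the query string, or null if none does.
    Elevation getElevationForQuery(String queryString);
}

// Full-query match policy: a rule applies only when the whole query equals its key.
final class MapElevationProvider implements ElevationProvider {
    private final Map<String, Elevation> elevationsByQuery;

    MapElevationProvider(Map<String, Elevation> elevationsByQuery) {
        // Defensive copy so later changes to the caller's map do not leak in.
        this.elevationsByQuery = new HashMap<>(elevationsByQuery);
    }

    @Override
    public Elevation getElevationForQuery(String queryString) {
        return elevationsByQuery.get(queryString);
    }
}
```

With this shape, a later subset-matching policy would only need to supply
another ElevationProvider implementation, leaving the component untouched.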






[jira] [Updated] (SOLR-11865) Refactor QueryElevationComponent to prepare query subset matching

2018-01-17 Thread Bruno Roustant (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Roustant updated SOLR-11865:
--
Attachment: SOLR-11865.patch
0001-Refactor-QueryElevationComponent-to-introduce-Elevat.patch

> Refactor QueryElevationComponent to prepare query subset matching
> -
>
> Key: SOLR-11865
> URL: https://issues.apache.org/jira/browse/SOLR-11865
> Project: Solr
> Issue Type: Improvement
> Security Level: Public (Default Security Level. Issues are Public)
> Components: SearchComponents - other
> Affects Versions: master (8.0)
> Reporter: Bruno Roustant
> Priority: Minor
> Labels: QueryComponent
> Fix For: master (8.0)
>
> Attachments: 
> 0001-Refactor-QueryElevationComponent-to-introduce-Elevat.patch, 
> SOLR-11865.patch
>
>
> The goal is to prepare a second improvement to support query terms subset
> matching in query elevation rules.
> Before that, we need to refactor the QueryElevationComponent to make it
> extensible. We introduce the ElevationProvider interface, which will be
> implemented later in a second patch to support subset matching. The current
> full-query match policy becomes the default, a simple MapElevationProvider.
> - Add overridable methods to handle exceptions during the component 
> initialization.
> - Add overridable methods to provide the default values for config properties.
> - No functional change beyond refactoring.
> - Adapt unit test.






[jira] [Created] (SOLR-11866) Support efficient subset matching in query elevation rules

2018-01-17 Thread Bruno Roustant (JIRA)
Bruno Roustant created SOLR-11866:
-

 Summary: Support efficient subset matching in query elevation rules
 Key: SOLR-11866
 URL: https://issues.apache.org/jira/browse/SOLR-11866
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
  Components: SearchComponents - other
Affects Versions: master (8.0)
Reporter: Bruno Roustant


Leverages the SOLR-11865 refactoring by introducing a
SubsetMatchElevationProvider in QueryElevationComponent. This provider calls a
new util class TrieSubsetMatcher to efficiently match all query elevation rules
whose subset of terms is contained in the current query's list of terms.
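
One way such trie-based subset matching can work is sketched below in Java.
This is not the TrieSubsetMatcher from the patch: the class name
SubsetTrieSketch and the methods addRule and findSubsetMatches are
hypothetical. The sketch assumes each rule is a set of terms, stores the sorted
terms of each rule as a path in a trie, and then walks the trie using the
sorted query terms.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical sketch: finds every rule whose term set is a subset of the query terms.
final class SubsetTrieSketch<V> {
    private static final class Node<V> {
        final Map<String, Node<V>> children = new HashMap<>();
        final List<V> values = new ArrayList<>();
    }

    private final Node<V> root = new Node<>();

    // Stores the rule under the path formed by its terms in sorted order.
    void addRule(Collection<String> terms, V value) {
        Node<V> node = root;
        for (String term : new TreeSet<>(terms)) {
            node = node.children.computeIfAbsent(term, t -> new Node<>());
        }
        node.values.add(value);
    }

    // Returns the values of all rules whose terms are all present in queryTerms.
    List<V> findSubsetMatches(Collection<String> queryTerms) {
        List<String> sorted = new ArrayList<>(new TreeSet<>(queryTerms));
        List<V> matches = new ArrayList<>();
        collect(root, sorted, 0, matches);
        return matches;
    }

    // Depth-first walk: from each node, try every remaining query term as the next edge.
    private void collect(Node<V> node, List<String> sorted, int from, List<V> matches) {
        matches.addAll(node.values);
        for (int i = from; i < sorted.size(); i++) {
            Node<V> child = node.children.get(sorted.get(i));
            if (child != null) {
                collect(child, sorted, i + 1, matches);
            }
        }
    }
}
```

Because both the rule terms and the query terms are sorted, each rule is
reachable by exactly one path, and only trie branches whose terms appear in the
query are visited.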






[jira] [Updated] (SOLR-11866) Support efficient subset matching in query elevation rules

2018-01-18 Thread Bruno Roustant (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Roustant updated SOLR-11866:
--
Attachment: SOLR-11866.patch
0001-New-SubsetMatchElevationProvider-in-QueryElevationCo.patch

> Support efficient subset matching in query elevation rules
> --
>
> Key: SOLR-11866
> URL: https://issues.apache.org/jira/browse/SOLR-11866
> Project: Solr
> Issue Type: Improvement
> Security Level: Public (Default Security Level. Issues are Public)
> Components: SearchComponents - other
> Affects Versions: master (8.0)
> Reporter: Bruno Roustant
> Priority: Major
> Attachments: 
> 0001-New-SubsetMatchElevationProvider-in-QueryElevationCo.patch, 
> SOLR-11866.patch
>
>
> Leverages the SOLR-11865 refactoring by introducing a
> SubsetMatchElevationProvider in QueryElevationComponent. This provider calls
> a new util class TrieSubsetMatcher to efficiently match all query elevation
> rules whose subset of terms is contained in the current query's list of terms.






[jira] [Created] (LUCENE-8159) Add a copy constructor in AutomatonQuery to copy directly the compiled automaton

2018-02-05 Thread Bruno Roustant (JIRA)
Bruno Roustant created LUCENE-8159:
--

 Summary: Add a copy constructor in AutomatonQuery to copy directly 
the compiled automaton
 Key: LUCENE-8159
 URL: https://issues.apache.org/jira/browse/LUCENE-8159
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/search
Affects Versions: trunk
Reporter: Bruno Roustant


When a query is composed of multiple AutomatonQuery instances that share the
same automaton but target different fields, it is much more efficient to reuse
the already compiled automaton by copying it directly and changing only the
target field.
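
The idea can be illustrated with a small stand-alone Java sketch. It does not
use Lucene: FieldPatternQuery is a hypothetical stand-in for AutomatonQuery,
and a java.util.regex.Pattern plays the role of the compiled automaton. The
point is only to show a copy constructor that shares the compiled object
instead of compiling it again.

```java
import java.util.regex.Pattern;

// Hypothetical stand-in for a query holding an expensive compiled matcher.
final class FieldPatternQuery {
    final String field;
    final Pattern compiled; // plays the role of the compiled automaton

    // Regular constructor: pays the compilation cost.
    FieldPatternQuery(String field, String regex) {
        this.field = field;
        this.compiled = Pattern.compile(regex);
    }

    // Copy constructor: shares the already compiled matcher, changes only the field.
    FieldPatternQuery(FieldPatternQuery other, String newField) {
        this.field = newField;
        this.compiled = other.compiled;
    }

    // True when the whole term text is accepted by the compiled matcher.
    boolean matches(String termText) {
        return compiled.matcher(termText).matches();
    }
}
```

Sharing is safe in this sketch because Pattern instances are immutable; the
same reasoning would require the compiled automaton to be immutable as well.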






[jira] [Updated] (LUCENE-8159) Add a copy constructor in AutomatonQuery to copy directly the compiled automaton

2018-02-05 Thread Bruno Roustant (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-8159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Roustant updated LUCENE-8159:
---
Attachment: LUCENE-8159.patch
0001-Add-a-copy-constructor-in-AutomatonQuery-to-copy-dir.patch

> Add a copy constructor in AutomatonQuery to copy directly the compiled 
> automaton
> 
>
> Key: LUCENE-8159
> URL: https://issues.apache.org/jira/browse/LUCENE-8159
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/search
> Affects Versions: trunk
> Reporter: Bruno Roustant
> Priority: Major
> Attachments: 
> 0001-Add-a-copy-constructor-in-AutomatonQuery-to-copy-dir.patch, 
> LUCENE-8159.patch
>
>
> When a query is composed of multiple AutomatonQuery instances that share the
> same automaton but target different fields, it is much more efficient to
> reuse the already compiled automaton by copying it directly and changing only
> the target field.





