[jira] [Commented] (LUCENE-9316) Incorporate all :precommit tasks into :check
[ https://issues.apache.org/jira/browse/LUCENE-9316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17084600#comment-17084600 ] Dawid Weiss commented on LUCENE-9316: - The way I always thought it should work would be for {{precommit}} to run a subset of all the checks. This subset should be composed of tasks that are fairly fast so that people can (and should) run them before they commit stuff. My personal opinion at the moment is that precommit tries to run too much (it shouldn't take longer than two minutes on average hardware). A {{gradlew check}} runs full validation: everything, including tests. Adding {{ -x test}} is a way of excluding tests from that set (which I know is tempting since tests run for so long). Another difference is that "precommit" is a top-level task we added that collects other things from subprojects while "check" is per-project. So, in theory, if you know you only worked on {{:solr:core}}, you could run {{-p solr/core check}} and this would run a full set of validation tasks for that project only. > Incorporate all :precommit tasks into :check > > > Key: LUCENE-9316 > URL: https://issues.apache.org/jira/browse/LUCENE-9316 > Project: Lucene - Core > Issue Type: Sub-task >Reporter: Dawid Weiss >Assignee: Dawid Weiss >Priority: Minor > Fix For: master (9.0) > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
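The invocations discussed in the comment can be summarised as follows (assuming a gradle-based lucene-solr checkout with the task layout described above):

```shell
# Fast subset of checks, meant to run before committing:
./gradlew precommit

# Full validation: everything, including tests:
./gradlew check

# Full validation, but exclude the (slow) test task:
./gradlew check -x test

# Full set of validation tasks for a single project only:
./gradlew -p solr/core check
```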
[jira] [Commented] (LUCENE-7788) fail precommit on unparameterised log messages and examine for wasted work/objects
[ https://issues.apache.org/jira/browse/LUCENE-7788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17084621#comment-17084621 ] Dawid Weiss commented on LUCENE-7788: - Looks all right to me, I guess. > fail precommit on unparameterised log messages and examine for wasted > work/objects > -- > > Key: LUCENE-7788 > URL: https://issues.apache.org/jira/browse/LUCENE-7788 > Project: Lucene - Core > Issue Type: Task >Reporter: Christine Poerschke >Assignee: Erick Erickson >Priority: Minor > Attachments: LUCENE-7788.patch, LUCENE-7788.patch, gradle_only.patch, > gradle_only.patch > > Time Spent: 0.5h > Remaining Estimate: 0h > > SOLR-10415 would be removing existing unparameterised log.trace messages use > and once that is in place then this ticket's one-line change would be for > 'ant precommit' to reject any future unparameterised log.trace message use. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
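For context, an unparameterised log call builds its message string (and calls toString on its arguments) even when the level is disabled, which is the wasted work the summary refers to. A minimal sketch with a stand-in logger (the class below is hypothetical; real code would use SLF4J's `log.trace(...)` overloads):

```java
// Stand-in logger, not the SLF4J API: only illustrates eager vs lazy formatting.
class TraceLog {
    static final boolean TRACE_ENABLED = false;

    static void trace(String msg) {              // unparameterised: msg is pre-built
        if (TRACE_ENABLED) System.out.println(msg);
    }

    static void trace(String fmt, Object arg) {  // parameterised: formats only if enabled
        if (TRACE_ENABLED) System.out.println(fmt.replace("{}", String.valueOf(arg)));
    }
}

class Doc {
    static int toStringCalls = 0;
    @Override public String toString() { toStringCalls++; return "doc"; }
}

class LogDemo {
    // Returns how many times Doc.toString ran while tracing was disabled.
    static int wastedToStrings() {
        Doc d = new Doc();
        Doc.toStringCalls = 0;
        TraceLog.trace("indexed: " + d);     // concatenation forces toString eagerly
        TraceLog.trace("indexed: {}", d);    // argument is never formatted
        return Doc.toStringCalls;
    }
}
```

With tracing off, only the unparameterised call pays for formatting; that is what a precommit check on unparameterised messages prevents.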
[jira] [Closed] (LUCENE-9300) Index corruption with doc values updates and addIndexes
[ https://issues.apache.org/jira/browse/LUCENE-9300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ignacio Vera closed LUCENE-9300. > Index corruption with doc values updates and addIndexes > --- > > Key: LUCENE-9300 > URL: https://issues.apache.org/jira/browse/LUCENE-9300 > Project: Lucene - Core > Issue Type: Bug >Reporter: Jim Ferenczi >Priority: Major > Fix For: master (9.0), 8.6, 8.5.1 > > Time Spent: 4h 10m > Remaining Estimate: 0h > > Today a doc values update creates a new field infos file that contains the > original field infos updated for the new generation as well as the new fields > created by the doc values update. > However existing fields are cloned through the global fields (shared in the > index writer) instead of the local ones (present in the segment). In practice > this is not an issue since field numbers are shared between segments created > by the same index writer. But this assumption doesn't hold for segments > created by different writers and added through > IndexWriter#addIndexes(Directory). In this case, the field number of the same > field can differ between segments so any doc values update can corrupt the > index by assigning the wrong field number to an existing field in the next > generation. > When this happens, queries and merges can access wrong fields without > throwing any error, leading to a silent corruption in the index. > > Since segments are not guaranteed to have the same field number consistently > we should ensure that doc values update preserves the segment's field number > when rewriting field infos. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
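The field-number mismatch described above can be illustrated with a toy model (hypothetical classes, not Lucene's actual FieldInfos code): numbers are handed out in registration order, so they are only consistent within one writer, and the same field name can map to different numbers in segments produced by different writers.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of per-writer field numbering: numbers are assigned in
// registration order, so they are only stable within a single writer.
class FieldNumbers {
    private final Map<String, Integer> byName = new HashMap<>();

    int numberFor(String field) {
        return byName.computeIfAbsent(field, f -> byName.size());
    }
}

class FieldNumberDemo {
    // True iff "title" gets the same field number in both writers.
    static boolean consistentAcrossWriters() {
        FieldNumbers writerA = new FieldNumbers();
        writerA.numberFor("id");                    // id -> 0
        int titleInA = writerA.numberFor("title");  // title -> 1

        FieldNumbers writerB = new FieldNumbers();
        int titleInB = writerB.numberFor("title");  // title -> 0

        return titleInA == titleInB;
    }
}
```

This is why a doc values update that resolves fields through the writer-global numbers can silently pick the wrong field for a segment added via addIndexes(Directory).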
[jira] [Commented] (LUCENE-7701) Refactor grouping collectors
[ https://issues.apache.org/jira/browse/LUCENE-7701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17084681#comment-17084681 ] Alan Woodward commented on LUCENE-7701: --- Sure, go ahead! > Refactor grouping collectors > > > Key: LUCENE-7701 > URL: https://issues.apache.org/jira/browse/LUCENE-7701 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Alan Woodward >Priority: Major > Fix For: 7.0 > > Attachments: LUCENE-7701.patch, LUCENE-7701.patch > > > Grouping currently works via abstract collectors, which need to be overridden > for each way of defining a group - currently we have two, 'term' (based on > SortedDocValues) and 'function' (based on ValueSources). These collectors > all have a lot of repeated code, and means that if you want to implement your > own group definitions, you need to override four or five different classes. > This would be easier to deal with if instead the 'group selection' code was > abstracted out into a single interface, and the various collectors were > changed to concrete implementations. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9324) Give IDs to SegmentCommitInfo
[ https://issues.apache.org/jira/browse/LUCENE-9324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17084682#comment-17084682 ] Simon Willnauer commented on LUCENE-9324: - I am trying to give a bit more context to this issue. Today we have _SegmentInfo_ which represents a segment once it's written to disk, for instance at flush or merge time. We have a randomly generated ID in _SegmentInfo_ that can be used to verify if two segments are the same. Since we use incremental numbers for segment naming, it's likely that two IndexWriters produce a segment with very similar contents and the same name. Yet, the _SegmentInfo_ ID would be different. In addition to this ID we also have checksums on files which can be used to verify identity in addition to the ID, but they should not be treated as identity by themselves since they are very weak checksums. Now segments also get _updated_, for instance when a document is marked as deleted or the segment receives a doc values update. The only thing that changes is the delete or update generation, which also allows two IndexWriters that opened two copies of a segment (with the same segment ID) to produce new delGens or dvGens that look identical from the outside but are actually different. This is a problem that we see quite frequently in Elasticsearch, and we'd like to prevent it, or at least have a better tool in our hands to distinguish _SegmentCommitInfo_ instances from one another. If we had an ID on SegmentCommitInfo that changes each time one of these generations changes, we could tell much more easily if only the updated files (which are often very small) need to be replaced in order to recover an index. The plan is to implement this in a very similar fashion as we did on the _SegmentInfo_, but also invalidate the ID once any of the generations change in order to force a new _SegmentCommitInfo_ ID for the new generation. 
Yet, the IDs would not be the same if two IndexWriters start from the same segment and make an identical change to it, i.e. it's not a replacement for a strong hash function. > Give IDs to SegmentCommitInfo > - > > Key: LUCENE-9324 > URL: https://issues.apache.org/jira/browse/LUCENE-9324 > Project: Lucene - Core > Issue Type: New Feature >Reporter: Adrien Grand >Priority: Minor > > We already have IDs in SegmentInfo, which are useful to uniquely identify > segments. Having IDs on SegmentCommitInfo would be useful too in order to > compare commits for equality and make snapshots incremental on generational > files too. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
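The proposed behaviour can be sketched as follows (illustrative names only; a UUID stands in for Lucene's 16-byte StringHelper.randomId, and this is not the actual SegmentCommitInfo code): the per-commit ID is created lazily and thrown away whenever a generation advances.

```java
import java.util.UUID;

// Illustrative sketch: the commit-level ID is invalidated whenever a
// generation advances, so a new generation always gets a fresh ID.
class CommitInfoSketch {
    private long delGen = -1;
    private String id;

    synchronized String getId() {
        if (id == null) {
            id = UUID.randomUUID().toString(); // new ID for the current generation
        }
        return id;
    }

    synchronized void advanceDelGen() {
        delGen++;
        id = null; // force a fresh ID for the new generation
    }
}
```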
[jira] [Created] (LUCENE-9326) Refactor SortField to better handle extensions
Alan Woodward created LUCENE-9326: - Summary: Refactor SortField to better handle extensions Key: LUCENE-9326 URL: https://issues.apache.org/jira/browse/LUCENE-9326 Project: Lucene - Core Issue Type: Improvement Reporter: Alan Woodward Assignee: Alan Woodward Working on LUCENE-9325 has made me realize that SortField needs some serious reworking: * we have a bunch of hard-coded types, but also a number of custom extensions, which make implementing new sort orders complicated in non-obvious ways * we refer to these hard-coded types in a number of places, in particular in index sorts, which means that you can't use a 'custom' sort here. For example, I can see it would be very useful to be able to index sort by distance from a particular point, but that's not currently possible. * the API separates out the comparator and whether or not it should be reversed, which adds an extra layer of complication to its use, particularly in cases where we have multiple sortfields. The whole thing could do with an overhaul. I think this can be broken up into a few stages by adding a new superclass abstraction which `SortField` will extend, and gradually moving functionality into this superclass. I plan on starting with index sorting, which will require a sort field to a) be able to merge sort documents coming from a list of readers, and b) serialize itself to and deserialize itself from SegmentInfo -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (SOLR-14409) Existing violations allow bypassing policy rules when adding new replicas
Andrzej Bialecki created SOLR-14409: --- Summary: Existing violations allow bypassing policy rules when adding new replicas Key: SOLR-14409 URL: https://issues.apache.org/jira/browse/SOLR-14409 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Components: AutoScaling Affects Versions: 8.5, master (9.0), 8.6 Reporter: Andrzej Bialecki Assignee: Andrzej Bialecki Steps to reproduce: * start with an empty cluster policy. * create a collection with as many replicas as there are nodes. * add one more replica to any node. Now this node has two replicas, all other nodes have one. * define the following cluster policy: {code:java} { 'set-cluster-policy': [ {'replica': '<2', 'shard': '#ANY', 'node': '#ANY', 'strict': true} ] } {code} This automatically creates a violation because of the existing layout. * try adding one more replica. This should fail because no node satisfies the rules (there must be at most 1 replica per node). However, the command succeeds and adds replica to the node that already has 2 replicas, which clearly violates the policy and makes matters even worse. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
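What the strict rule above is supposed to enforce can be sketched with a hypothetical helper (not the actual Solr policy engine): once every node already holds at least one replica, and one node holds two, no placement should be legal under 'replica': '<2'.

```java
import java.util.Map;

// Hypothetical check for the rule {'replica': '<2', 'shard': '#ANY', 'node': '#ANY'}:
// placing one more replica on `node` must keep its replica count under 2.
class PolicyCheck {
    static boolean wouldViolate(Map<String, Integer> replicasPerNode, String node) {
        return replicasPerNode.getOrDefault(node, 0) + 1 >= 2;
    }
}
```

In the reproduction above every candidate node would violate the rule, so the add-replica command ought to fail rather than pick the node that is already in violation.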
[jira] [Updated] (SOLR-14409) Existing violations allow bypassing policy rules when adding new replicas
[ https://issues.apache.org/jira/browse/SOLR-14409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrzej Bialecki updated SOLR-14409: Attachment: SOLR-14409.patch > Existing violations allow bypassing policy rules when adding new replicas > - > > Key: SOLR-14409 > URL: https://issues.apache.org/jira/browse/SOLR-14409 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: AutoScaling >Affects Versions: master (9.0), 8.5, 8.6 >Reporter: Andrzej Bialecki >Assignee: Andrzej Bialecki >Priority: Major > Attachments: SOLR-14409.patch > > > Steps to reproduce: > * start with an empty cluster policy. > * create a collection with as many replicas as there are nodes. > * add one more replica to any node. Now this node has two replicas, all > other nodes have one. > * define the following cluster policy: > {code:java} > { 'set-cluster-policy': [ {'replica': '<2', 'shard': '#ANY', 'node': '#ANY', > 'strict': true} ] } {code} > This automatically creates a violation because of the existing layout. > * try adding one more replica. This should fail because no node satisfies > the rules (there must be at most 1 replica per node). However, the command > succeeds and adds replica to the node that already has 2 replicas, which > clearly violates the policy and makes matters even worse. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (LUCENE-9288) poll_mirrors.py release script doesn't handle HTTPS
[ https://issues.apache.org/jira/browse/LUCENE-9288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ignacio Vera updated LUCENE-9288: - Status: Open (was: Open) > poll_mirrors.py release script doesn't handle HTTPS > --- > > Key: LUCENE-9288 > URL: https://issues.apache.org/jira/browse/LUCENE-9288 > Project: Lucene - Core > Issue Type: Bug > Components: general/tools >Affects Versions: 8.5, master (9.0) >Reporter: Alan Woodward >Priority: Major > > During the 8.5.0 release, the poll_mirrors.py script incorrectly reported > that the release artifacts were not on various mirrors or on maven central, > because it is configured to hit these endpoints using the `http` schema, > where most of them now only accept `https`. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (LUCENE-9288) poll_mirrors.py release script doesn't handle HTTPS
[ https://issues.apache.org/jira/browse/LUCENE-9288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ignacio Vera updated LUCENE-9288: - Attachment: (was: poll-mirrors.patch) > poll_mirrors.py release script doesn't handle HTTPS > --- > > Key: LUCENE-9288 > URL: https://issues.apache.org/jira/browse/LUCENE-9288 > Project: Lucene - Core > Issue Type: Bug > Components: general/tools >Affects Versions: master (9.0), 8.5 >Reporter: Alan Woodward >Priority: Major > Attachments: poll-mirrors.patch > > > During the 8.5.0 release, the poll_mirrors.py script incorrectly reported > that the release artifacts were not on various mirrors or on maven central, > because it is configured to hit these endpoints using the `http` schema, > where most of them now only accept `https`. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (LUCENE-9288) poll_mirrors.py release script doesn't handle HTTPS
[ https://issues.apache.org/jira/browse/LUCENE-9288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ignacio Vera updated LUCENE-9288: - Attachment: poll-mirrors.patch Status: Open (was: Open) Attached is the hack I have used for 8.5.1 release. I am sure there is a more elegant way but this makes the script work again. > poll_mirrors.py release script doesn't handle HTTPS > --- > > Key: LUCENE-9288 > URL: https://issues.apache.org/jira/browse/LUCENE-9288 > Project: Lucene - Core > Issue Type: Bug > Components: general/tools >Affects Versions: 8.5, master (9.0) >Reporter: Alan Woodward >Priority: Major > Attachments: poll-mirrors.patch > > > During the 8.5.0 release, the poll_mirrors.py script incorrectly reported > that the release artifacts were not on various mirrors or on maven central, > because it is configured to hit these endpoints using the `http` schema, > where most of them now only accept `https`. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14409) Existing violations allow bypassing policy rules when adding new replicas
[ https://issues.apache.org/jira/browse/SOLR-14409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17084714#comment-17084714 ] Andrzej Bialecki commented on SOLR-14409: - This patch illustrates the problem. This may be a bug in {{AddReplicaSuggester}} or in {{Suggester.isLessSerious / containsNewErrors}} . {{AddReplicaSuggester}} tries all nodes but they all produce new violations - except for the one that already has a violation. So for all other nodes the condition in {{AddReplicaSuggester:55 if (!containsNewErrors(errs))}} is always false because they produce new errors. As a side-effect of this the variable {{leastSeriousViolation}} is never assigned. Finally, for the node that already has a violation the change does not produce a new violation, it only increases the severity of the existing one - but the code doesn't check this because {{leastSeriousViolation}} is null, so it treats the current error as the least serious. This may be the conceptual problem here - if there were no other errors then shouldn't the current errors be the most serious? but in this case there's already a pre-existing violation on this node so perhaps the {{leastSeriousViolation}} should always be initialized with the existing violations? (I tried it and many unit tests started failing...) > Existing violations allow bypassing policy rules when adding new replicas > - > > Key: SOLR-14409 > URL: https://issues.apache.org/jira/browse/SOLR-14409 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: AutoScaling >Affects Versions: master (9.0), 8.5, 8.6 >Reporter: Andrzej Bialecki >Assignee: Andrzej Bialecki >Priority: Major > Attachments: SOLR-14409.patch > > > Steps to reproduce: > * start with an empty cluster policy. > * create a collection with as many replicas as there are nodes. > * add one more replica to any node. Now this node has two replicas, all > other nodes have one. 
> * define the following cluster policy: > {code:java} > { 'set-cluster-policy': [ {'replica': '<2', 'shard': '#ANY', 'node': '#ANY', > 'strict': true} ] } {code} > This automatically creates a violation because of the existing layout. > * try adding one more replica. This should fail because no node satisfies > the rules (there must be at most 1 replica per node). However, the command > succeeds and adds replica to the node that already has 2 replicas, which > clearly violates the policy and makes matters even worse. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
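The control flow described in the comment can be caricatured like this (hypothetical and heavily simplified, not the real Suggester code): candidates that would introduce a *new* violation are skipped, so the only surviving candidate is the node whose pre-existing violation merely got worse.

```java
import java.util.Map;

// Caricature of the flaw: a candidate is rejected only if it introduces a
// new violation, so worsening an existing violation slips through, and the
// null leastSeriousViolation lets that node win by default.
class SuggesterSketch {
    static String pickNode(Map<String, Integer> replicasPerNode) {
        String suggestion = null;
        for (String node : replicasPerNode.keySet()) {
            boolean hadViolation = replicasPerNode.get(node) >= 2;       // pre-existing
            boolean newViolation = !hadViolation
                    && replicasPerNode.get(node) + 1 >= 2;               // would be new
            if (newViolation) {
                continue;            // skipped: produces a new error
            }
            if (suggestion == null) {
                suggestion = node;   // leastSeriousViolation is null, so this "wins"
            }
        }
        return suggestion;
    }
}
```

With one node at two replicas and the rest at one, the only node that survives the filter is the one already violating the policy, matching the observed behaviour.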
[jira] [Updated] (LUCENE-9288) poll_mirrors.py release script doesn't handle HTTPS
[ https://issues.apache.org/jira/browse/LUCENE-9288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ignacio Vera updated LUCENE-9288: - Attachment: poll-mirrors.patch Status: Open (was: Open) > poll_mirrors.py release script doesn't handle HTTPS > --- > > Key: LUCENE-9288 > URL: https://issues.apache.org/jira/browse/LUCENE-9288 > Project: Lucene - Core > Issue Type: Bug > Components: general/tools >Affects Versions: 8.5, master (9.0) >Reporter: Alan Woodward >Priority: Major > Attachments: poll-mirrors.patch > > > During the 8.5.0 release, the poll_mirrors.py script incorrectly reported > that the release artifacts were not on various mirrors or on maven central, > because it is configured to hit these endpoints using the `http` schema, > where most of them now only accept `https`. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9288) poll_mirrors.py release script doesn't handle HTTPS
[ https://issues.apache.org/jira/browse/LUCENE-9288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17084731#comment-17084731 ] Alan Woodward commented on LUCENE-9288: --- +1, certainly a lot nicer than the monstrosity I came up with > poll_mirrors.py release script doesn't handle HTTPS > --- > > Key: LUCENE-9288 > URL: https://issues.apache.org/jira/browse/LUCENE-9288 > Project: Lucene - Core > Issue Type: Bug > Components: general/tools >Affects Versions: master (9.0), 8.5 >Reporter: Alan Woodward >Priority: Major > Attachments: poll-mirrors.patch > > > During the 8.5.0 release, the poll_mirrors.py script incorrectly reported > that the release artifacts were not on various mirrors or on maven central, > because it is configured to hit these endpoints using the `http` schema, > where most of them now only accept `https`. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14291) OldAnalyticsRequestConverter should support fields names with dots
[ https://issues.apache.org/jira/browse/SOLR-14291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17084774#comment-17084774 ] ASF subversion and git services commented on SOLR-14291: Commit b24b02840254f7e929a07658ec0f9066a2c5c366 in lucene-solr's branch refs/heads/master from Mikhail Khludnev [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=b24b028 ] SOLR-14291: fix regexps to handle dotted fields in Old Analytics params. > OldAnalyticsRequestConverter should support fields names with dots > -- > > Key: SOLR-14291 > URL: https://issues.apache.org/jira/browse/SOLR-14291 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: search, SearchComponents - other >Reporter: Anatolii Siuniaev >Assignee: Mikhail Khludnev >Priority: Trivial > Attachments: SOLR-14291.patch, SOLR-14291.patch, SOLR-14291.patch > > > If you send a query with range facets using old olap-style syntax (see pdf > [here|https://issues.apache.org/jira/browse/SOLR-5302]), > OldAnalyticsRequestConverter just silently (no exception thrown) omits > parameters like > {code:java} > olap..rangefacet..start > {code} > in case if __ has dots inside (for instance field name is > _Project.Value_). And thus no range facets are returned in response. > Probably the same happens in case of field faceting. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14291) OldAnalyticsRequestConverter should support fields names with dots
[ https://issues.apache.org/jira/browse/SOLR-14291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17084777#comment-17084777 ] ASF subversion and git services commented on SOLR-14291: Commit d448f950516a3610d4271af7282fc55b6f176297 in lucene-solr's branch refs/heads/branch_8x from Mikhail Khludnev [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=d448f95 ] SOLR-14291: fix regexps to handle dotted fields in Old Analytics params. > OldAnalyticsRequestConverter should support fields names with dots > -- > > Key: SOLR-14291 > URL: https://issues.apache.org/jira/browse/SOLR-14291 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: search, SearchComponents - other >Reporter: Anatolii Siuniaev >Assignee: Mikhail Khludnev >Priority: Trivial > Attachments: SOLR-14291.patch, SOLR-14291.patch, SOLR-14291.patch > > > If you send a query with range facets using old olap-style syntax (see pdf > [here|https://issues.apache.org/jira/browse/SOLR-5302]), > OldAnalyticsRequestConverter just silently (no exception thrown) omits > parameters like > {code:java} > olap..rangefacet..start > {code} > in case if __ has dots inside (for instance field name is > _Project.Value_). And thus no range facets are returned in response. > Probably the same happens in case of field faceting. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
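The commit message says the fix was to the regexps for the old Analytics params. The general issue can be illustrated with a simplified, hypothetical parameter shape (the real olap-style parameter names have more segments): a `[^.]+` field group cannot span a dotted field name such as `Project.Value`, while a reluctant `(.+?)` anchored on the literal tail can.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parameter shape "olap.<field>.rangefacet.start", simplified
// from the old olap-style syntax, to show why dotted field names fail.
class DottedFieldRegex {
    static final Pattern NAIVE = Pattern.compile("olap\\.([^.]+)\\.rangefacet\\.start");
    static final Pattern FIXED = Pattern.compile("olap\\.(.+?)\\.rangefacet\\.start");

    static String fieldName(Pattern p, String param) {
        Matcher m = p.matcher(param);
        return m.matches() ? m.group(1) : null; // null: parameter silently ignored
    }
}
```

The naive pattern silently fails to match (no exception thrown, the parameter is just dropped), which mirrors the reported behaviour of missing range facets in the response.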
[GitHub] [lucene-solr] s1monw opened a new pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo
s1monw opened a new pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo URL: https://github.com/apache/lucene-solr/pull/1434 We already have IDs in SegmentInfo, which are useful to uniquely identify segments. Having IDs on SegmentCommitInfo would be useful too in order to compare commits for equality and make snapshots incremental on generational files. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] s1monw commented on issue #1434: LUCENE-9324: Add an ID to SegmentCommitInfo
s1monw commented on issue #1434: LUCENE-9324: Add an ID to SegmentCommitInfo URL: https://github.com/apache/lucene-solr/pull/1434#issuecomment-614643954 Note: there is still a NOCOMMIT in this pr regarding BWC. It's just an idea we can build on or even move to using `null` as an indicator that there is no id. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mikemccand commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo
mikemccand commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo URL: https://github.com/apache/lucene-solr/pull/1434#discussion_r409553076 ## File path: lucene/core/src/java/org/apache/lucene/index/IndexWriter.java ## @@ -3081,7 +3081,7 @@ private SegmentCommitInfo copySegmentAsIs(SegmentCommitInfo info, String segName info.info.getUseCompoundFile(), info.info.getCodec(), info.info.getDiagnostics(), info.info.getId(), info.info.getAttributes(), info.info.getIndexSort()); SegmentCommitInfo newInfoPerCommit = new SegmentCommitInfo(newInfo, info.getDelCount(), info.getSoftDelCount(), info.getDelGen(), - info.getFieldInfosGen(), info.getDocValuesGen()); + info.getFieldInfosGen(), info.getDocValuesGen(), info.getId()); Review comment: This happens during `IndexWriter.addIndexes(Directory[])` right? I wonder whether we should give a new id instead of reusing the old one? E.g. the segment (likely) now has a new name, and is in a different `Directory`, and is copied/forked from a prior segment, so maybe it should get a new `id`? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mikemccand commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo
mikemccand commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo URL: https://github.com/apache/lucene-solr/pull/1434#discussion_r409556321 ## File path: lucene/core/src/java/org/apache/lucene/index/SegmentInfos.java ## @@ -374,7 +376,15 @@ public static final SegmentInfos readCommit(Directory directory, ChecksumIndexIn if (softDelCount + delCount > info.maxDoc()) { throw new CorruptIndexException("invalid deletion count: " + softDelCount + delCount + " vs maxDoc=" + info.maxDoc(), input); } - SegmentCommitInfo siPerCommit = new SegmentCommitInfo(info, delCount, softDelCount, delGen, fieldInfosGen, dvGen); + final byte[] sciId; + if (format > VERSION_74) { +sciId = new byte[StringHelper.ID_LENGTH]; +input.readBytes(sciId, 0, sciId.length); + } else { +sciId = infos.id; +// NOCOMMIT can we do this? it would at least give us consistent BWC but we can't identify the same SCI in different commits Review comment: Good question ... maybe we could use `info.getId()`? Then it'd be unique across `SCI`, but, shared across `SI` which is also weird. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mikemccand commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo
mikemccand commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo URL: https://github.com/apache/lucene-solr/pull/1434#discussion_r409554301 ## File path: lucene/core/src/java/org/apache/lucene/index/SegmentCommitInfo.java ## @@ -388,4 +399,17 @@ public SegmentCommitInfo clone() { final int getDelCount(boolean includeSoftDeletes) { return includeSoftDeletes ? getDelCount() + getSoftDelCount() : getDelCount(); } + + private void generationAdvanced() { +sizeInBytes = -1; +id = null; + } + + public byte[] getId() { +if (id == null) { + // we advanced a generation - need to generate a new ID + id = StringHelper.randomId(); Review comment: Do we need some thread safety here? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] s1monw commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo
s1monw commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo URL: https://github.com/apache/lucene-solr/pull/1434#discussion_r409575955 ## File path: lucene/core/src/java/org/apache/lucene/index/SegmentCommitInfo.java ## @@ -388,4 +399,17 @@ public SegmentCommitInfo clone() { final int getDelCount(boolean includeSoftDeletes) { return includeSoftDeletes ? getDelCount() + getSoftDelCount() : getDelCount(); } + + private void generationAdvanced() { +sizeInBytes = -1; +id = null; + } + + public byte[] getId() { +if (id == null) { + // we advanced a generation - need to generate a new ID + id = StringHelper.randomId(); Review comment: yeah good question, as far as I can tell we never read or write any of the member vars on this class unless we hold a lock that protects it. But it's a mess so I guess we should. Yet, if we do that we should make every method on this class synced no? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] s1monw commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo
s1monw commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo URL: https://github.com/apache/lucene-solr/pull/1434#discussion_r409577804 ## File path: lucene/core/src/java/org/apache/lucene/index/SegmentInfos.java ## @@ -374,7 +376,15 @@ public static final SegmentInfos readCommit(Directory directory, ChecksumIndexIn if (softDelCount + delCount > info.maxDoc()) { throw new CorruptIndexException("invalid deletion count: " + softDelCount + delCount + " vs maxDoc=" + info.maxDoc(), input); } - SegmentCommitInfo siPerCommit = new SegmentCommitInfo(info, delCount, softDelCount, delGen, fieldInfosGen, dvGen); + final byte[] sciId; + if (format > VERSION_74) { +sciId = new byte[StringHelper.ID_LENGTH]; +input.readBytes(sciId, 0, sciId.length); + } else { +sciId = infos.id; +// NOCOMMIT can we do this? it would at least give us consistent BWC but we can't identify the same SCI in different commits Review comment: I don't think we should use info.getId(), since that would mean the same SegmentInfo instance is treated the same even if two IWs made changes to it and its generations. The way we have it now, it's only considered the same if the overall commit is the same, which seems good?
[GitHub] [lucene-solr] s1monw commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo
s1monw commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo URL: https://github.com/apache/lucene-solr/pull/1434#discussion_r409578428 ## File path: lucene/core/src/java/org/apache/lucene/index/IndexWriter.java ## @@ -3081,7 +3081,7 @@ private SegmentCommitInfo copySegmentAsIs(SegmentCommitInfo info, String segName info.info.getUseCompoundFile(), info.info.getCodec(), info.info.getDiagnostics(), info.info.getId(), info.info.getAttributes(), info.info.getIndexSort()); SegmentCommitInfo newInfoPerCommit = new SegmentCommitInfo(newInfo, info.getDelCount(), info.getSoftDelCount(), info.getDelGen(), - info.getFieldInfosGen(), info.getDocValuesGen()); + info.getFieldInfosGen(), info.getDocValuesGen(), info.getId()); Review comment: We do share the `info.info.getId()` here as well, so I think we should be consistent?
[GitHub] [lucene-solr] ErickErickson closed pull request #1428: LUCENE-7788: fail precommit on unparameterised log.trace messages
ErickErickson closed pull request #1428: LUCENE-7788: fail precommit on unparameterised log.trace messages URL: https://github.com/apache/lucene-solr/pull/1428
[GitHub] [lucene-solr] ErickErickson commented on issue #1428: LUCENE-7788: fail precommit on unparameterised log.trace messages
ErickErickson commented on issue #1428: LUCENE-7788: fail precommit on unparameterised log.trace messages URL: https://github.com/apache/lucene-solr/pull/1428#issuecomment-614685762 Re-doing
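LUCENE-7788's "wasted work/objects" point is that an unparameterised log call builds its message string even when the level is disabled, whereas a parameterised (or deferred) call defers that cost. A minimal runnable sketch of the difference, using `java.util.logging`'s `Supplier` overload as a stdlib stand-in for SLF4J's `log.trace("state: {}", v)` idiom (class and method names are illustrative):

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Demonstrates the wasted work LUCENE-7788 targets: an unparameterised
// log call evaluates its message even when the level is disabled.
class LogCostDemo {
    static int expensiveCalls = 0;

    // Pretend this builds a costly dump of internal state.
    static String expensive() {
        expensiveCalls++;
        return "big dump of state";
    }

    static void logBoth(Logger log) {
        // Unparameterised: expensive() runs unconditionally, logged or not.
        log.fine("state: " + expensive());
        // Deferred (analogous to SLF4J's parameterised log.trace("state: {}", v)):
        // the supplier is only invoked if FINE is actually enabled.
        log.fine(() -> "state: " + expensive());
    }

    public static void main(String[] args) {
        Logger log = Logger.getLogger("demo");
        log.setLevel(Level.INFO); // FINE is disabled
        logBoth(log);
        // Only the unparameterised call paid the cost.
        System.out.println(expensiveCalls); // prints 1
    }
}
```

This is why a precommit check that rejects unparameterised `log.trace`/`log.debug` calls is worthwhile: the savings apply on every call in the common case where the level is off.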
[jira] [Commented] (SOLR-11632) Creating an collection with an empty node set logs a WARN
[ https://issues.apache.org/jira/browse/SOLR-11632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17084919#comment-17084919 ] Ilan Ginzburg commented on SOLR-11632: -- Shouldn't that log mention "nodes" instead of "cores" as a fix? Given only live nodes are considered for replica placement, even a non EMPTY set of nodes can lead to this. > Creating an collection with an empty node set logs a WARN > - > > Key: SOLR-11632 > URL: https://issues.apache.org/jira/browse/SOLR-11632 > Project: Solr > Issue Type: Improvement >Reporter: Varun Thacker >Priority: Minor > > When I create a collection with an empty node set I get a message like this > in the logs > {code} > 14127 WARN > (OverseerThreadFactory-12-thread-3-processing-n:127.0.0.1:61605_solr) > [n:127.0.0.1:61605_solr] o.a.s.c.CreateCollectionCmd It is unusual to > create a collection (backuprestore_restored) without cores. > {code} > Should we just remove this? A user who uses EMPTY will get this message. A > user who doesn't pass a set of candidate nodes then the collection creation > will fail anyways -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-9830) Once IndexWriter is closed due to some RunTimeException like FileSystemException, It never return to normal unless restart the Solr JVM
[ https://issues.apache.org/jira/browse/SOLR-9830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17084940#comment-17084940 ] Guilherme Zanetta Simoni commented on SOLR-9830: It happens on 8.x too. > Once IndexWriter is closed due to some RunTimeException like > FileSystemException, It never return to normal unless restart the Solr JVM > --- > > Key: SOLR-9830 > URL: https://issues.apache.org/jira/browse/SOLR-9830 > Project: Solr > Issue Type: Bug > Components: update >Affects Versions: 6.2 > Environment: Red Hat 4.4.7-3,SolrCloud >Reporter: Daisy.Yuan >Priority: Major > > 1. Collection coll_test, has 9 shards, each has two replicas in different > solr instances. > 2. When update documens to the collection use Solrj, inject the exhausted > handle fault to one solr instance like solr1. > 3. Update to col_test_shard3_replica1(It's leader) is failed due to > FileSystemException, and IndexWriter is closed. > 4. And clear the fault, the col_test_shard3_replica1 (is leader) is always > cannot be updated documens and the numDocs is always less than the standby > replica. > 5. After Solr instance restart, It can update documens and the numDocs is > consistent between the two replicas. > I think in this case in Solr Cloud mode, it should recovery itself and not > restart to recovery the solrcore update function. 
> 2016-12-01 14:13:00,932 | INFO | http-nio-21101-exec-20 | > [DWPT][http-nio-21101-exec-20]: now abort | > org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34) > 2016-12-01 14:13:00,932 | INFO | http-nio-21101-exec-20 | > [DWPT][http-nio-21101-exec-20]: done abort | > org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34) > 2016-12-01 14:13:00,932 | INFO | http-nio-21101-exec-20 | > [IW][http-nio-21101-exec-20]: hit exception updating document | > org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34) > 2016-12-01 14:13:00,933 | INFO | http-nio-21101-exec-20 | > [IW][http-nio-21101-exec-20]: hit tragic FileSystemException inside > updateDocument | > org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34) > 2016-12-01 14:13:00,933 | INFO | http-nio-21101-exec-20 | > [IW][http-nio-21101-exec-20]: rollback | > org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34) > 2016-12-01 14:13:00,933 | INFO | http-nio-21101-exec-20 | > [IW][http-nio-21101-exec-20]: all running merges have aborted | > org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34) > 2016-12-01 14:13:00,934 | INFO | http-nio-21101-exec-20 | > [IW][http-nio-21101-exec-20]: rollback: done finish merges | > org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34) > 2016-12-01 14:13:00,934 | INFO | http-nio-21101-exec-20 | > [DW][http-nio-21101-exec-20]: abort | > org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34) > 2016-12-01 14:13:00,939 | INFO | commitScheduler-46-thread-1 | > [DWPT][commitScheduler-46-thread-1]: flush postings as segment _4h9 > numDocs=3798 | > org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34) > 2016-12-01 14:13:00,940 | INFO | commitScheduler-46-thread-1 | > [DWPT][commitScheduler-46-thread-1]: now abort | > org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34) > 
2016-12-01 14:13:00,940 | INFO | commitScheduler-46-thread-1 | > [DWPT][commitScheduler-46-thread-1]: done abort | > org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34) > 2016-12-01 14:13:00,940 | INFO | http-nio-21101-exec-20 | > [DW][http-nio-21101-exec-20]: done abort success=true | > org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34) > 2016-12-01 14:13:00,940 | INFO | commitScheduler-46-thread-1 | > [DW][commitScheduler-46-thread-1]: commitScheduler-46-thread-1 > finishFullFlush success=false | > org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34) > 2016-12-01 14:13:00,940 | INFO | http-nio-21101-exec-20 | > [IW][http-nio-21101-exec-20]: rollback: > infos=_4g7(6.2.0):C59169/23684:delGen=4 _4gq(6.2.0):C67474/11636:delGen=1 > _4gg(6.2.0):C64067/15664:delGen=2 _4gr(6.2.0):C13131 _4gs(6.2.0):C966 > _4gt(6.2.0):C4543 _4gu(6.2.0):C6960 _4gv(6.2.0):C2544 | > org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34) > 2016-12-01 14:13:00,940 | INFO | commitScheduler-46-thread-1 | > [IW][commitScheduler-46-thread-1]: hit exception during NRT reader | > org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34) > 2016-12-01 14:13:00,967 | INFO | http-nio-21101-exec-20 |
[jira] [Commented] (LUCENE-9317) Resolve package name conflicts for StandardAnalyzer to allow Java module system support
[ https://issues.apache.org/jira/browse/LUCENE-9317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17085063#comment-17085063 ] Tomoko Uchida commented on LUCENE-9317: --- Hi [~oobles], you can use an "@" mention when you need feedback from a specific person or people, to get their attention (and sorry, I don't seem to be the right one here). [~uschindler], would you give some feedback or thoughts? {quote} Apologies that the commit is difficult to review. I staged the changes and moves in one commit when I should have done it as moves then changes. Let me know if you'd rather I redo it. {quote} I'm not sure what the preferred way is, but you could make a small, incomplete patch/PR to describe your idea for review. Or I think a detailed design discussion (without a patch) would also be okay before touching the codebase, since the problem you picked up is not about the Java implementation but about the package/module structure. > Resolve package name conflicts for StandardAnalyzer to allow Java module > system support > --- > > Key: LUCENE-9317 > URL: https://issues.apache.org/jira/browse/LUCENE-9317 > Project: Lucene - Core > Issue Type: Improvement > Components: core/other >Affects Versions: master (9.0) >Reporter: David Ryan >Priority: Major > Labels: build, features > > > To allow Lucene to be modularised there are a few preparatory tasks to be > completed prior to this being possible. The Java module system requires that > jars do not use the same package name in different jars. The lucene-core and > lucene-analyzers-common both share the package > org.apache.lucene.analysis.standard.
> Possible resolutions to this issue are discussed by Uwe on the mailing list > here: > > [http://mail-archives.apache.org/mod_mbox/lucene-dev/202004.mbox/%3CCAM21Rt8FHOq_JeUSELhsQJH0uN0eKBgduBQX4fQKxbs49TLqzA%40mail.gmail.com%3E] > {quote}About StandardAnalyzer: Unfortunately I aggressively complained a > while back when Mike McCandless wanted to move standard analyzer out of the > analysis package into core (“for convenience”). This was a bad step, and IMHO > we should revert that or completely rename the packages and everything. The > problem here is: As the analysis services are only part of lucene-analyzers, > we had to leave the factory classes there, but move the implementation > classes in core. The package has to be the same. The only way around that is > to move the analysis factory framework also to core (I would not be against > that). This would include all factory base classes and the service loading > stuff. Then we can move standard analyzer and some of the filters/tokenizers > including their factories to core an that problem would be solved. > {quote} > There are two options here, either move factory framework into core or revert > StandardAnalyzer back to lucene-analyzers. In the email, the solution lands > on reverting back as per the task list: > {quote}Add some preparatory issues to cleanup class hierarchy: Move Analysis > SPI to core / remove StandardAnalyzer and related classes out of core back to > anaysis > {quote}
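The constraint driving LUCENE-9317 is mechanical: the Java module system rejects "split packages", i.e. two modules that both contain the same package. A hypothetical sketch of why the current layout cannot be modularised as-is (the module names here are illustrative, not proposed names):

```java
// Hypothetical module descriptors for today's jar layout.

// lucene-core's module-info.java: holds the StandardAnalyzer implementation classes.
module lucene.core {
    exports org.apache.lucene.analysis.standard;
}

// lucene-analyzers-common's module-info.java: holds the factory classes
// in the very same package.
module lucene.analyzers.common {
    exports org.apache.lucene.analysis.standard;
}

// The module system refuses this at compile/launch time: a package may be
// contained in at most one module on the module path, so either the factory
// framework must move into core, or StandardAnalyzer must move back out.
```

This makes the two options in the issue description concrete: both resolve the conflict by ensuring `org.apache.lucene.analysis.standard` lives entirely in one jar.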
[GitHub] [lucene-solr] madrob commented on a change in pull request #1432: SOLR-14404 CoreContainer level custom requesthandlers
madrob commented on a change in pull request #1432: SOLR-14404 CoreContainer level custom requesthandlers URL: https://github.com/apache/lucene-solr/pull/1432#discussion_r409694609 ## File path: solr/core/src/java/org/apache/solr/core/CoreContainer.java ## @@ -648,8 +655,8 @@ public void load() { pkiAuthenticationPlugin.initializeMetrics(solrMetricsContext, "/authentication/pki"); TracerConfigurator.loadTracer(loader, cfg.getTracerConfiguratorPluginInfo(), getZkController().getZkStateReader()); packageLoader = new PackageLoader(this); - containerHandlers.getApiBag().register(new AnnotatedApi(packageLoader.getPackageAPI().editAPI), Collections.EMPTY_MAP); - containerHandlers.getApiBag().register(new AnnotatedApi(packageLoader.getPackageAPI().readAPI), Collections.EMPTY_MAP); + containerHandlers.getApiBag().registerObject(packageLoader.getPackageAPI().editAPI); Review comment: There's a minor inconsistency here between edit/read and write/read; maybe we can standardize on one?
[GitHub] [lucene-solr] madrob commented on a change in pull request #1432: SOLR-14404 CoreContainer level custom requesthandlers
madrob commented on a change in pull request #1432: SOLR-14404 CoreContainer level custom requesthandlers URL: https://github.com/apache/lucene-solr/pull/1432#discussion_r409691926 ## File path: solr/solrj/src/java/org/apache/solr/client/solrj/impl/BaseHttpSolrClient.java ## @@ -62,6 +63,9 @@ public static RemoteExecutionException create(String host, NamedList errResponse if (errObj != null) { Number code = (Number) getObjectByPath(errObj, true, Collections.singletonList("code")); String msg = (String) getObjectByPath(errObj, true, Collections.singletonList("msg")); +if(msg == null) msg = ""; Review comment: There's already a null check later; what kinds of messages do we get here that are useful and don't leak too much internals?
[GitHub] [lucene-solr] madrob commented on a change in pull request #1432: SOLR-14404 CoreContainer level custom requesthandlers
madrob commented on a change in pull request #1432: SOLR-14404 CoreContainer level custom requesthandlers URL: https://github.com/apache/lucene-solr/pull/1432#discussion_r409685318 ## File path: solr/core/src/test-files/runtimecode/sig.txt ## @@ -69,6 +69,14 @@ openssl dgst -sha1 -sign ../cryptokeys/priv_key512.pem expressible.jar.bin | ope ZOT11arAiPmPZYOHzqodiNnxO9pRyRozWZEBX8XGjU1/HJptFnZK+DI7eXnUtbNaMcbXE2Ze8hh4M/eGyhY8BQ== +openssl dgst -sha1 -sign priv_key512.pem containerplugin.v.1.jar.bin | openssl enc -base64 | sed 's/+/%2B/g' | tr -d \\n | sed Review comment: Does this need to be `../cryptokeys/priv_key512.pem`?
[GitHub] [lucene-solr] madrob commented on a change in pull request #1432: SOLR-14404 CoreContainer level custom requesthandlers
madrob commented on a change in pull request #1432: SOLR-14404 CoreContainer level custom requesthandlers URL: https://github.com/apache/lucene-solr/pull/1432#discussion_r409690893 ## File path: solr/core/src/test/org/apache/solr/handler/admin/TestApiFramework.java ## @@ -199,6 +200,25 @@ public void testPayload() { } + public void testApiWrapper() { +Class klas = ApiWithConstructor.class; Review comment: It's unclear what this test is testing.
[GitHub] [lucene-solr] madrob commented on a change in pull request #1432: SOLR-14404 CoreContainer level custom requesthandlers
madrob commented on a change in pull request #1432: SOLR-14404 CoreContainer level custom requesthandlers URL: https://github.com/apache/lucene-solr/pull/1432#discussion_r409689441 ## File path: solr/core/src/test/org/apache/solr/handler/TestContainerPlugin.java ## @@ -0,0 +1,250 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.solr.handler; + +import java.io.IOException; +import java.util.List; +import java.util.concurrent.Callable; + +import com.google.common.collect.ImmutableMap; +import org.apache.solr.api.Command; +import org.apache.solr.api.EndPoint; +import org.apache.solr.client.solrj.SolrClient; +import org.apache.solr.client.solrj.SolrServerException; +import org.apache.solr.client.solrj.impl.BaseHttpSolrClient; +import org.apache.solr.client.solrj.request.V2Request; +import org.apache.solr.client.solrj.request.beans.Package; +import org.apache.solr.client.solrj.request.beans.PluginMeta; +import org.apache.solr.client.solrj.response.V2Response; +import org.apache.solr.cloud.MiniSolrCloudCluster; +import org.apache.solr.cloud.SolrCloudTestCase; +import org.apache.solr.common.NavigableObject; +import org.apache.solr.common.util.Utils; +import org.apache.solr.filestore.PackageStoreAPI; +import org.apache.solr.filestore.TestDistribPackageStore; +import org.apache.solr.pkg.TestPackages; +import org.apache.solr.request.SolrQueryRequest; +import org.apache.solr.response.SolrQueryResponse; +import org.apache.solr.security.PermissionNameProvider; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; + +import static java.util.Collections.singletonMap; +import static org.apache.solr.client.solrj.SolrRequest.METHOD.GET; +import static org.apache.solr.client.solrj.SolrRequest.METHOD.POST; +import static org.apache.solr.filestore.TestDistribPackageStore.readFile; +import static org.apache.solr.filestore.TestDistribPackageStore.uploadKey; +import static org.hamcrest.CoreMatchers.containsString; + +public class TestContainerPlugin extends SolrCloudTestCase { + + @Before + public void setup() { +System.setProperty("enable.packages", "true"); + } + + @After + public void teardown() { +System.clearProperty("enable.packages"); + } + + @Test + public void testApi() throws Exception { +MiniSolrCloudCluster cluster = Review comment: cluster setup and 
teardown can go in Before/After methods
[GitHub] [lucene-solr] madrob commented on a change in pull request #1432: SOLR-14404 CoreContainer level custom requesthandlers
madrob commented on a change in pull request #1432: SOLR-14404 CoreContainer level custom requesthandlers URL: https://github.com/apache/lucene-solr/pull/1432#discussion_r409693320 ## File path: solr/core/src/test/org/apache/solr/handler/TestContainerPlugin.java ## @@ -0,0 +1,250 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.solr.handler; + +import java.io.IOException; +import java.util.List; +import java.util.concurrent.Callable; + +import com.google.common.collect.ImmutableMap; +import org.apache.solr.api.Command; +import org.apache.solr.api.EndPoint; +import org.apache.solr.client.solrj.SolrClient; +import org.apache.solr.client.solrj.SolrServerException; +import org.apache.solr.client.solrj.impl.BaseHttpSolrClient; +import org.apache.solr.client.solrj.request.V2Request; +import org.apache.solr.client.solrj.request.beans.Package; +import org.apache.solr.client.solrj.request.beans.PluginMeta; +import org.apache.solr.client.solrj.response.V2Response; +import org.apache.solr.cloud.MiniSolrCloudCluster; +import org.apache.solr.cloud.SolrCloudTestCase; +import org.apache.solr.common.NavigableObject; +import org.apache.solr.common.util.Utils; +import org.apache.solr.filestore.PackageStoreAPI; +import org.apache.solr.filestore.TestDistribPackageStore; +import org.apache.solr.pkg.TestPackages; +import org.apache.solr.request.SolrQueryRequest; +import org.apache.solr.response.SolrQueryResponse; +import org.apache.solr.security.PermissionNameProvider; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; + +import static java.util.Collections.singletonMap; +import static org.apache.solr.client.solrj.SolrRequest.METHOD.GET; +import static org.apache.solr.client.solrj.SolrRequest.METHOD.POST; +import static org.apache.solr.filestore.TestDistribPackageStore.readFile; +import static org.apache.solr.filestore.TestDistribPackageStore.uploadKey; +import static org.hamcrest.CoreMatchers.containsString; + +public class TestContainerPlugin extends SolrCloudTestCase { + + @Before + public void setup() { +System.setProperty("enable.packages", "true"); + } + + @After + public void teardown() { +System.clearProperty("enable.packages"); + } + + @Test + public void testApi() throws Exception { +MiniSolrCloudCluster cluster = +configureCluster(4) 
+.withJettyConfig(jetty -> jetty.enableV2(true)) +.configure(); +String errPath = "/error/details[0]/errorMessages[0]"; +try { + PluginMeta plugin = new PluginMeta(); + plugin.name = "testplugin"; + plugin.klass = C2.class.getName(); + V2Request req = new V2Request.Builder("/cluster/plugin") + .forceV2(true) + .withMethod(POST) + .withPayload(singletonMap("add", plugin)) + .build(); + expectError(req, cluster.getSolrClient(), errPath, "Must have a no-arg constructor or CoreContainer constructor and it must not be a non static inner class"); + + plugin.klass = C1.class.getName(); + expectError(req, cluster.getSolrClient(), errPath, "Invalid class, no @EndPoint annotation"); + + plugin.klass = C3.class.getName(); + req.process(cluster.getSolrClient()); + + V2Response rsp = new V2Request.Builder("/cluster/plugin") + .forceV2(true) + .withMethod(GET) + .build() + .process(cluster.getSolrClient()); + assertEquals(C3.class.getName(), rsp._getStr("/plugin/testplugin/class", null)); + + TestDistribPackageStore.assertResponseValues(10, + () -> new V2Request.Builder("/plugin/my/plugin") + .forceV2(true) + .withMethod(GET) + .build().process(cluster.getSolrClient()), + ImmutableMap.of("/testkey", "testval")); + + new V2Request.Builder("/cluster/plugin") + .withMethod(POST) + .forceV2(true) + .withPayload("{remove : testplugin}") + .build() + .process(cluster.getSolrClient()); + + rsp = new V2Request.Builder("/cluster/plugin") + .forceV2(true) + .withMethod(GET) + .build() + .process(cluster.getSolrClient()); + assertEquals(null,
[GitHub] [lucene-solr] madrob commented on a change in pull request #1432: SOLR-14404 CoreContainer level custom requesthandlers
madrob commented on a change in pull request #1432: SOLR-14404 CoreContainer level custom requesthandlers URL: https://github.com/apache/lucene-solr/pull/1432#discussion_r409680029 ## File path: solr/core/src/java/org/apache/solr/handler/admin/ContainerPluginsApi.java ## @@ -0,0 +1,178 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.solr.handler.admin; + +import java.io.IOException; +import java.lang.invoke.MethodHandles; +import java.util.ArrayList; +import java.util.LinkedHashMap; +import java.util.List; +import java.util.Map; +import java.util.function.Function; +import java.util.function.Supplier; + +import org.apache.solr.api.AnnotatedApi; +import org.apache.solr.api.Command; +import org.apache.solr.api.CustomContainerPlugins; +import org.apache.solr.api.EndPoint; +import org.apache.solr.api.PayloadObj; +import org.apache.solr.client.solrj.SolrRequest.METHOD; +import org.apache.solr.client.solrj.request.beans.PluginMeta; +import org.apache.solr.common.cloud.SolrZkClient; +import org.apache.solr.common.cloud.ZkStateReader; +import org.apache.solr.common.util.Utils; +import org.apache.solr.core.CoreContainer; +import org.apache.solr.request.SolrQueryRequest; +import org.apache.solr.response.SolrQueryResponse; +import org.apache.solr.security.PermissionNameProvider; +import org.apache.zookeeper.KeeperException; +import org.apache.zookeeper.data.Stat; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import static org.apache.lucene.util.IOUtils.closeWhileHandlingException; + + +public class ContainerPluginsApi { + private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass()); + + public static final String PLUGIN = "plugin"; + private final Supplier zkClientSupplier; + private final CoreContainer coreContainer; + public final Read readAPI = new Read(); + public final Edit editAPI = new Edit(); + + public ContainerPluginsApi(CoreContainer coreContainer) { +this.zkClientSupplier = coreContainer.zkClientSupplier; +this.coreContainer = coreContainer; + } + + @EndPoint(method = METHOD.GET, + path = "/cluster/plugin", + permission = PermissionNameProvider.Name.COLL_READ_PERM) + public class Read { + +@Command +public void list(SolrQueryRequest req, SolrQueryResponse rsp) throws IOException { + rsp.add(PLUGIN, 
plugins(zkClientSupplier)); +} + } + + @EndPoint(method = METHOD.POST, + path = "/cluster/plugin", + permission = PermissionNameProvider.Name.COLL_EDIT_PERM) + public class Edit { + +@Command(name = "add") +public void add(SolrQueryRequest req, SolrQueryResponse rsp, PayloadObj payload) throws IOException { + PluginMeta info = payload.get(); + validateConfig(payload, info); + if(payload.hasError()) return; + persistPlugins(map -> { +if (map.containsKey(info.name)) { + payload.addError(info.name + " already exists"); + return null; +} +map.put(info.name, info); +return map; + }); +} + +@Command(name = "remove") +public void remove(SolrQueryRequest req, SolrQueryResponse rsp, PayloadObj payload) throws IOException { Review comment: Should this be METHOD.DELETE instead of a post? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] madrob commented on a change in pull request #1432: SOLR-14404 CoreContainer level custom requesthandlers
madrob commented on a change in pull request #1432: SOLR-14404 CoreContainer level custom requesthandlers URL: https://github.com/apache/lucene-solr/pull/1432#discussion_r409688568 ## File path: solr/core/src/test/org/apache/solr/handler/TestContainerPlugin.java ## @@ -0,0 +1,250 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.solr.handler; + +import java.io.IOException; +import java.util.List; +import java.util.concurrent.Callable; + +import com.google.common.collect.ImmutableMap; +import org.apache.solr.api.Command; +import org.apache.solr.api.EndPoint; +import org.apache.solr.client.solrj.SolrClient; +import org.apache.solr.client.solrj.SolrServerException; +import org.apache.solr.client.solrj.impl.BaseHttpSolrClient; +import org.apache.solr.client.solrj.request.V2Request; +import org.apache.solr.client.solrj.request.beans.Package; +import org.apache.solr.client.solrj.request.beans.PluginMeta; +import org.apache.solr.client.solrj.response.V2Response; +import org.apache.solr.cloud.MiniSolrCloudCluster; +import org.apache.solr.cloud.SolrCloudTestCase; +import org.apache.solr.common.NavigableObject; +import org.apache.solr.common.util.Utils; +import org.apache.solr.filestore.PackageStoreAPI; +import org.apache.solr.filestore.TestDistribPackageStore; +import org.apache.solr.pkg.TestPackages; +import org.apache.solr.request.SolrQueryRequest; +import org.apache.solr.response.SolrQueryResponse; +import org.apache.solr.security.PermissionNameProvider; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; + +import static java.util.Collections.singletonMap; +import static org.apache.solr.client.solrj.SolrRequest.METHOD.GET; +import static org.apache.solr.client.solrj.SolrRequest.METHOD.POST; +import static org.apache.solr.filestore.TestDistribPackageStore.readFile; +import static org.apache.solr.filestore.TestDistribPackageStore.uploadKey; +import static org.hamcrest.CoreMatchers.containsString; + +public class TestContainerPlugin extends SolrCloudTestCase { + + @Before + public void setup() { +System.setProperty("enable.packages", "true"); + } + + @After + public void teardown() { +System.clearProperty("enable.packages"); + } + + @Test + public void testApi() throws Exception { +MiniSolrCloudCluster cluster = +configureCluster(4) 
+.withJettyConfig(jetty -> jetty.enableV2(true)) +.configure(); +String errPath = "/error/details[0]/errorMessages[0]"; +try { + PluginMeta plugin = new PluginMeta(); + plugin.name = "testplugin"; + plugin.klass = C2.class.getName(); + V2Request req = new V2Request.Builder("/cluster/plugin") + .forceV2(true) + .withMethod(POST) + .withPayload(singletonMap("add", plugin)) + .build(); + expectError(req, cluster.getSolrClient(), errPath, "Must have a no-arg constructor or CoreContainer constructor and it must not be a non static inner class"); + + plugin.klass = C1.class.getName(); + expectError(req, cluster.getSolrClient(), errPath, "Invalid class, no @EndPoint annotation"); + + plugin.klass = C3.class.getName(); + req.process(cluster.getSolrClient()); + + V2Response rsp = new V2Request.Builder("/cluster/plugin") + .forceV2(true) + .withMethod(GET) + .build() + .process(cluster.getSolrClient()); + assertEquals(C3.class.getName(), rsp._getStr("/plugin/testplugin/class", null)); + + TestDistribPackageStore.assertResponseValues(10, + () -> new V2Request.Builder("/plugin/my/plugin") + .forceV2(true) + .withMethod(GET) + .build().process(cluster.getSolrClient()), + ImmutableMap.of("/testkey", "testval")); + + new V2Request.Builder("/cluster/plugin") + .withMethod(POST) + .forceV2(true) + .withPayload("{remove : testplugin}") + .build() + .process(cluster.getSolrClient()); + + rsp = new V2Request.Builder("/cluster/plugin") + .forceV2(true) + .withMethod(GET) + .build() + .process(cluster.getSolrClient()); + assertEquals(null,
[GitHub] [lucene-solr] madrob commented on a change in pull request #1432: SOLR-14404 CoreContainer level custom requesthandlers
madrob commented on a change in pull request #1432: SOLR-14404 CoreContainer level custom requesthandlers URL: https://github.com/apache/lucene-solr/pull/1432#discussion_r409676264 ## File path: solr/core/src/java/org/apache/solr/api/ApiBag.java ## @@ -134,6 +142,14 @@ static void registerIntrospect(List l, PathTrie registry, Map
[GitHub] [lucene-solr] madrob commented on a change in pull request #1432: SOLR-14404 CoreContainer level custom requesthandlers
madrob commented on a change in pull request #1432: SOLR-14404 CoreContainer level custom requesthandlers URL: https://github.com/apache/lucene-solr/pull/1432#discussion_r409684159 ## File path: solr/core/src/java/org/apache/solr/pkg/PackageListeners.java ## @@ -63,13 +63,13 @@ public synchronized void removeListener(Listener listener) { } synchronized void packagesUpdated(List pkgs) { -MDCLoggingContext.setCore(core); Review comment: Why do we need this check? setCore already handles nulls.
[GitHub] [lucene-solr] madrob commented on a change in pull request #1432: SOLR-14404 CoreContainer level custom requesthandlers
madrob commented on a change in pull request #1432: SOLR-14404 CoreContainer level custom requesthandlers URL: https://github.com/apache/lucene-solr/pull/1432#discussion_r409687566 ## File path: solr/core/src/test-files/runtimecode/MyPlugin.java ## @@ -0,0 +1,43 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.handler; + +import org.apache.solr.api.Command; +import org.apache.solr.api.EndPoint; +import org.apache.solr.client.solrj.SolrRequest.METHOD; +import org.apache.solr.core.CoreContainer; +import org.apache.solr.request.SolrQueryRequest; +import org.apache.solr.response.SolrQueryResponse; +import org.apache.solr.security.PermissionNameProvider; + +@EndPoint(path = "/plugin/my/path", +method = METHOD.GET, +permission = PermissionNameProvider.Name.CONFIG_READ_PERM) +public class MyPlugin { Review comment: I assume this is used to generate containerplugin.v.1.jar.bin and v.2? Should we have two source files for those? Can we do this some other way, besides checking in binaries to the repo? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
[jira] [Commented] (SOLR-14387) SolrClient.getById() does not escape parameter separators within ids
[ https://issues.apache.org/jira/browse/SOLR-14387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17085095#comment-17085095 ] ASF subversion and git services commented on SOLR-14387: Commit 74ecc13816fb6aae6e512e2e9d815459e235a120 in lucene-solr's branch refs/heads/master from Markus Schuch [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=74ecc13 ] SOLR-14387 add testcase for ids with separators to GetByIdTest and fix SolrClient to escape ids properly > SolrClient.getById() does not escape parameter separators within ids > > > Key: SOLR-14387 > URL: https://issues.apache.org/jira/browse/SOLR-14387 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrJ >Affects Versions: 8.5 >Reporter: Markus Schuch >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Having a solr document with a comma in its id (e.g. "A,B"), > {{SolrClient.getById()}} is not able to retrieve this document, because it > queries {{/get?ids=A,B}} instead of {{/get?ids=A\,B}}. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Resolved] (SOLR-14387) SolrClient.getById() does not escape parameter separators within ids
[ https://issues.apache.org/jira/browse/SOLR-14387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Drob resolved SOLR-14387. -- Fix Version/s: master (9.0) Assignee: Mike Drob Resolution: Fixed This is a good catch and a good fix! Thanks for the patch, I've pushed it to master. I'm a little bit concerned that folks might be improperly relying on the old behavior to get multiple ids thinking that it is a convenient shortcut, so I'm hesitant to put this in branch_8x. Let me know if you disagree! > SolrClient.getById() does not escape parameter separators within ids > > > Key: SOLR-14387 > URL: https://issues.apache.org/jira/browse/SOLR-14387 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrJ >Affects Versions: 8.5 >Reporter: Markus Schuch >Assignee: Mike Drob >Priority: Major > Fix For: master (9.0) > > Time Spent: 10m > Remaining Estimate: 0h > > Having a solr document with a comma in its id (e.g. "A,B"), > {{SolrClient.getById()}} is not able to retrieve this document, because it > queries {{/get?ids=A,B}} instead of {{/get?ids=A\,B}}. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
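[Editor's note] The bug above comes down to joining ids into one request parameter without escaping the separator. The following is a minimal, illustrative Java sketch of that idea — the class and method names are made up for this example and are not the actual SolrJ patch:

```java
import java.util.List;

public class IdEscapeSketch {
    // Hypothetical helper: escape backslashes first, then commas, so a
    // single id containing a comma is not split into several ids by /get.
    static String escapeId(String id) {
        return id.replace("\\", "\\\\").replace(",", "\\,");
    }

    // Join ids into the single "ids" parameter value, escaping each one.
    static String joinIds(List<String> ids) {
        StringBuilder sb = new StringBuilder();
        for (String id : ids) {
            if (sb.length() > 0) sb.append(',');
            sb.append(escapeId(id));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The single document id "A,B" stays one id after escaping.
        System.out.println(joinIds(List.of("A,B")));
        // Two distinct ids remain separated by an unescaped comma.
        System.out.println(joinIds(List.of("x", "y")));
    }
}
```

With this escaping, {{getById("A,B")}} would issue {{/get?ids=A\,B}} rather than {{/get?ids=A,B}}, matching the fix described in the commit.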
[GitHub] [lucene-solr] madrob closed pull request #1404: SOLR-14387: SolrClient.getById() does not escape parameter separators within ids
madrob closed pull request #1404: SOLR-14387: SolrClient.getById() does not escape parameter separators within ids URL: https://github.com/apache/lucene-solr/pull/1404
[GitHub] [lucene-solr] madrob commented on issue #1404: SOLR-14387: SolrClient.getById() does not escape parameter separators within ids
madrob commented on issue #1404: SOLR-14387: SolrClient.getById() does not escape parameter separators within ids URL: https://github.com/apache/lucene-solr/pull/1404#issuecomment-614775130 Fixed in 74ecc138
[jira] [Updated] (SOLR-14409) Existing violations allow bypassing policy rules when adding new replicas
[ https://issues.apache.org/jira/browse/SOLR-14409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrzej Bialecki updated SOLR-14409: Priority: Critical (was: Major) > Existing violations allow bypassing policy rules when adding new replicas > - > > Key: SOLR-14409 > URL: https://issues.apache.org/jira/browse/SOLR-14409 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: AutoScaling >Affects Versions: master (9.0), 8.5, 8.6 >Reporter: Andrzej Bialecki >Assignee: Andrzej Bialecki >Priority: Critical > Attachments: SOLR-14409.patch > > > Steps to reproduce: > * start with an empty cluster policy. > * create a collection with as many replicas as there are nodes. > * add one more replica to any node. Now this node has two replicas, all > other nodes have one. > * define the following cluster policy: > {code:java} > { 'set-cluster-policy': [ {'replica': '<2', 'shard': '#ANY', 'node': '#ANY', > 'strict': true} ] } {code} > This automatically creates a violation because of the existing layout. > * try adding one more replica. This should fail because no node satisfies > the rules (there must be at most 1 replica per node). However, the command > succeeds and adds replica to the node that already has 2 replicas, which > clearly violates the policy and makes matters even worse. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14409) Existing violations allow bypassing policy rules when adding new replicas
[ https://issues.apache.org/jira/browse/SOLR-14409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17085111#comment-17085111 ] Andrzej Bialecki commented on SOLR-14409: - Escalating to Critical as it breaks the autoscaling placement for common scenarios. > Existing violations allow bypassing policy rules when adding new replicas > - > > Key: SOLR-14409 > URL: https://issues.apache.org/jira/browse/SOLR-14409 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: AutoScaling >Affects Versions: master (9.0), 8.5, 8.6 >Reporter: Andrzej Bialecki >Assignee: Andrzej Bialecki >Priority: Critical > Attachments: SOLR-14409.patch > > > Steps to reproduce: > * start with an empty cluster policy. > * create a collection with as many replicas as there are nodes. > * add one more replica to any node. Now this node has two replicas, all > other nodes have one. > * define the following cluster policy: > {code:java} > { 'set-cluster-policy': [ {'replica': '<2', 'shard': '#ANY', 'node': '#ANY', > 'strict': true} ] } {code} > This automatically creates a violation because of the existing layout. > * try adding one more replica. This should fail because no node satisfies > the rules (there must be at most 1 replica per node). However, the command > succeeds and adds replica to the node that already has 2 replicas, which > clearly violates the policy and makes matters even worse. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
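[Editor's note] The expected (strict) behavior described in the reproduction steps can be made concrete with a toy placement check — names and structure below are illustrative only, not Solr's actual autoscaling code:

```java
import java.util.Map;
import java.util.Optional;

public class StrictPolicySketch {
    // Toy model of a strict "'replica': '<2', 'node': '#ANY'" style rule:
    // return a node that can accept one more replica without violating the
    // per-node maximum, or empty if no node qualifies (placement must fail).
    static Optional<String> pickNode(Map<String, Integer> replicasPerNode, int maxPerNode) {
        return replicasPerNode.entrySet().stream()
                .filter(e -> e.getValue() + 1 <= maxPerNode) // adding one must still satisfy the rule
                .min(Map.Entry.comparingByValue())           // prefer the least-loaded node
                .map(Map.Entry::getKey);
    }

    public static void main(String[] args) {
        // Scenario from the issue: one node already has 2 replicas, the
        // others have 1, and the strict rule allows at most 1 per node.
        Map<String, Integer> layout = Map.of("n1", 2, "n2", 1, "n3", 1);
        // No node can take another replica, so the command should fail —
        // the reported bug is that Solr instead adds to the worst node.
        System.out.println(pickNode(layout, 1).isPresent());
    }
}
```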
[jira] [Assigned] (LUCENE-9324) Give IDs to SegmentCommitInfo
[ https://issues.apache.org/jira/browse/LUCENE-9324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer reassigned LUCENE-9324: --- Assignee: Simon Willnauer > Give IDs to SegmentCommitInfo > - > > Key: LUCENE-9324 > URL: https://issues.apache.org/jira/browse/LUCENE-9324 > Project: Lucene - Core > Issue Type: New Feature >Reporter: Adrien Grand >Assignee: Simon Willnauer >Priority: Minor > Time Spent: 1h > Remaining Estimate: 0h > > We already have IDs in SegmentInfo, which are useful to uniquely identify > segments. Having IDs on SegmentCommitInfo would be useful too in order to > compare commits for equality and make snapshots incremental on generational > files too.
[jira] [Updated] (LUCENE-9324) Give IDs to SegmentCommitInfo
[ https://issues.apache.org/jira/browse/LUCENE-9324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-9324: Fix Version/s: 8.6 master (9.0) > Give IDs to SegmentCommitInfo > - > > Key: LUCENE-9324 > URL: https://issues.apache.org/jira/browse/LUCENE-9324 > Project: Lucene - Core > Issue Type: New Feature >Reporter: Adrien Grand >Assignee: Simon Willnauer >Priority: Minor > Fix For: master (9.0), 8.6 > > Time Spent: 1h > Remaining Estimate: 0h > > We already have IDs in SegmentInfo, which are useful to uniquely identify > segments. Having IDs on SegmentCommitInfo would be useful too in order to > compare commits for equality and make snapshots incremental on generational > files too.
[jira] [Commented] (SOLR-14275) Policy calculations are very slow for large clusters and large operations
[ https://issues.apache.org/jira/browse/SOLR-14275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17085191#comment-17085191 ] David Smiley commented on SOLR-14275: - [~noble.paul] I recall from a slack message a few weeks ago that you made _tremendous_ progress. I'm guessing that's the most recent PR which corresponds to the same timeframe. What's the next step? Known gotchas / TODOs / trade-offs? Does it need a code review? > Policy calculations are very slow for large clusters and large operations > - > > Key: SOLR-14275 > URL: https://issues.apache.org/jira/browse/SOLR-14275 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: AutoScaling >Affects Versions: 7.7.2, 8.4.1 >Reporter: Andrzej Bialecki >Assignee: Andrzej Bialecki >Priority: Major > Labels: scaling > Attachments: SOLR-14275.patch, scenario.txt > > Time Spent: 0.5h > Remaining Estimate: 0h > > Replica placement calculations performed during collection creation take > extremely long time (several minutes) when using a large cluster and creating > a large collection (eg. 1000 nodes, 500 shards, 4 replicas). > Profiling shows that most of the time is spent in > {{Row.computeCacheIfAbsent}}, which probably doesn't reuse this cache as much > as it should. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (SOLR-14410) Switch from SysV init script to systemd service definition
Marius Ghita created SOLR-14410: --- Summary: Switch from SysV init script to systemd service definition Key: SOLR-14410 URL: https://issues.apache.org/jira/browse/SOLR-14410 Project: Solr Issue Type: Improvement Security Level: Public (Default Security Level. Issues are Public) Reporter: Marius Ghita Attachments: solr.service The proposed change will incorporate the attached service definition file in the solr installation script. More information in the [dev mailing list thread|[http://mail-archives.apache.org/mod_mbox/lucene-dev/202004.mbox/%3ccafszzzxs+zh1mrscsjftyxn0kod_+6fjobxd9zhxt66fhaz...@mail.gmail.com%3e]] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14410) Switch from SysV init script to systemd service definition
[ https://issues.apache.org/jira/browse/SOLR-14410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marius Ghita updated SOLR-14410: Description: The proposed change will incorporate the attached service definition file in the solr installation script. More information in the [http://mail-archives.apache.org/mod_mbox/lucene-dev/202004.mbox/%3ccafszzzxs+zh1mrscsjftyxn0kod_+6fjobxd9zhxt66fhaz...@mail.gmail.com%3e] was: The proposed change will incorporate the attached service definition file in the solr installation script. More information in the [dev mailing list thread|[http://mail-archives.apache.org/mod_mbox/lucene-dev/202004.mbox/%3ccafszzzxs+zh1mrscsjftyxn0kod_+6fjobxd9zhxt66fhaz...@mail.gmail.com%3e]] > Switch from SysV init script to systemd service definition > -- > > Key: SOLR-14410 > URL: https://issues.apache.org/jira/browse/SOLR-14410 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Marius Ghita >Priority: Major > Attachments: solr.service > > > The proposed change will incorporate the attached service definition file in > the solr installation script. > > More information in the > [http://mail-archives.apache.org/mod_mbox/lucene-dev/202004.mbox/%3ccafszzzxs+zh1mrscsjftyxn0kod_+6fjobxd9zhxt66fhaz...@mail.gmail.com%3e] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14410) Switch from SysV init script to systemd service definition
[ https://issues.apache.org/jira/browse/SOLR-14410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marius Ghita updated SOLR-14410: Description: The proposed change will incorporate the attached service definition file in the solr installation script. More information on the mailinglist [http://mail-archives.apache.org/mod_mbox/lucene-dev/202004.mbox/%3ccafszzzxs+zh1mrscsjftyxn0kod_+6fjobxd9zhxt66fhaz...@mail.gmail.com%3e] was: The proposed change will incorporate the attached service definition file in the solr installation script. More information in the [http://mail-archives.apache.org/mod_mbox/lucene-dev/202004.mbox/%3ccafszzzxs+zh1mrscsjftyxn0kod_+6fjobxd9zhxt66fhaz...@mail.gmail.com%3e] > Switch from SysV init script to systemd service definition > -- > > Key: SOLR-14410 > URL: https://issues.apache.org/jira/browse/SOLR-14410 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Marius Ghita >Priority: Major > Attachments: solr.service > > > The proposed change will incorporate the attached service definition file in > the solr installation script. > > More information on the mailinglist > [http://mail-archives.apache.org/mod_mbox/lucene-dev/202004.mbox/%3ccafszzzxs+zh1mrscsjftyxn0kod_+6fjobxd9zhxt66fhaz...@mail.gmail.com%3e] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] jpountz commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo
jpountz commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo URL: https://github.com/apache/lucene-solr/pull/1434#discussion_r409804272 ## File path: lucene/core/src/java/org/apache/lucene/index/SegmentInfos.java ## @@ -374,7 +376,15 @@ public static final SegmentInfos readCommit(Directory directory, ChecksumIndexIn if (softDelCount + delCount > info.maxDoc()) { throw new CorruptIndexException("invalid deletion count: " + softDelCount + delCount + " vs maxDoc=" + info.maxDoc(), input); } - SegmentCommitInfo siPerCommit = new SegmentCommitInfo(info, delCount, softDelCount, delGen, fieldInfosGen, dvGen); + final byte[] sciId; + if (format > VERSION_74) { +sciId = new byte[StringHelper.ID_LENGTH]; +input.readBytes(sciId, 0, sciId.length); + } else { +sciId = infos.id; +// NOCOMMIT can we do this? it would at least give us consistent BWC but we can't identify the same SCI in different commits Review comment: When we introduced SegmentInfos#getId, we returned `null` as an ID for older segments. This is probably a safer option here as well, as callers can fall back to whatever behavior makes sense for them such as using a strong hash of the commit files as an ID, or re-downloading all files of the commit all the time and giving up incrementality? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] jpountz commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo
jpountz commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo URL: https://github.com/apache/lucene-solr/pull/1434#discussion_r409795498 ## File path: lucene/core/src/java/org/apache/lucene/index/SegmentCommitInfo.java ## @@ -79,8 +85,7 @@ /** * Sole constructor. - * - * @param info + * @param info Review comment: nit: ```suggestion * @param info ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13132) Improve JSON "terms" facet performance when sorted by relatedness
[ https://issues.apache.org/jira/browse/SOLR-13132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17085216#comment-17085216 ] Chris M. Hostetter commented on SOLR-13132: --- Michael: I'm still working my way through the latest changes, but i'm mainly surprised by this new {{fullDomainAccs}} concept (my prior suggestions were trying to reduce the number of impacts on FacetFieldProcessor API and it's impls that don't care about sweeping, this seems to have just "changed" the nature of impacts) ... {quote}The main other change that followed as a consequence of this work is that I think it's necessary (if we may now replace {{collectAcc}} with null when full domain collection can be accomplished via sweep count collection only) to explicitly separate the read-access to full-domain SlotAccs so that output can be handled appropriately. Formerly, collectAcc served for both collection _and_ read-access for such SlotAccs, but the 1:1 correspondence that made that work is no longer a valid assumption; and can't use the other existing references that get read for output ({{accs}}, {{otherAccs}}, {{deferredAggs}}, etc.) b/c they're handled quite differently (on a 1-slot, 1-bucket-at-a-time a la carte way). I could be missing something here, but in any event that's what explains the introduction of the new {{FacetFieldProcessor.fullDomainAccs}} field. {quote} Interesting ... yeah, i guess i hadn't really considered how to get values out of the {{SweepingAccs}} once we removed them from the {{collectAcc}}. 
It seems weird that the processors now have to keep track of a list of {{List fullDomainAccs}} to call {{setValues}} on later when the {{countAcc}} is already tracking those ~same~ (logical) {{SlotAcc}} instances in the form of {{SweepingAcc}} instances - so perhaps it would be cleaner/simpler if: * {{SweepingAcc}} defined a {{setValues}} method (same method sig as {{SlotAcc}}) that by default loops over all the "other" {{SweepingAccs}} it wraps (similar to how {{MultiAcc.setValues}} works. * {{CountAcc}} overrides {{SlotAcc.setValues()}} to look something like... {code:java} @Override public void setValues(SimpleOrderedMap bucket, int slotNum) throws IOException { super.setValues(bucket, slotNum); baseSweepingAcc.setValues(bucket, slotNum); } {code} ...? I'm not suggesting/requesting that you make this change right now ... i haven't thought it through enough to have any confidence if it's better/worse then what you've got at the moment ... just trying to talk it through and see WDYT since you're the most familiar with this code: do you think that would simplify the overall changes/impl and help keep the "sweeping" logic in only the classes that care/know about sweeping? > Improve JSON "terms" facet performance when sorted by relatedness > -- > > Key: SOLR-13132 > URL: https://issues.apache.org/jira/browse/SOLR-13132 > Project: Solr > Issue Type: Improvement > Components: Facet Module >Affects Versions: 7.4, master (9.0) >Reporter: Michael Gibney >Priority: Major > Attachments: SOLR-13132-with-cache-01.patch, > SOLR-13132-with-cache.patch, SOLR-13132.patch, SOLR-13132_testSweep.patch > > Time Spent: 1.5h > Remaining Estimate: 0h > > When sorting buckets by {{relatedness}}, JSON "terms" facet must calculate > {{relatedness}} for every term. 
> The current implementation uses a standard uninverted approach (either > {{docValues}} or {{UnInvertedField}}) to get facet counts over the domain > base docSet, and then uses that initial pass as a pre-filter for a > second-pass, inverted approach of fetching docSets for each relevant term > (i.e., {{count > minCount}}?) and calculating intersection size of those sets > with the domain base docSet. > Over high-cardinality fields, the overhead of per-term docSet creation and > set intersection operations increases request latency to the point where > relatedness sort may not be usable in practice (for my use case, even after > applying the patch for SOLR-13108, for a field with ~220k unique terms per > core, QTime for high-cardinality domain docSets were, e.g.: cardinality > 1816684=9000ms, cardinality 5032902=18000ms). > The attached patch brings the above example QTimes down to a manageable > ~300ms and ~250ms respectively. The approach calculates uninverted facet > counts over domain base, foreground, and background docSets in parallel in a > single pass. This allows us to take advantage of the efficiencies built into > the standard uninverted {{FacetFieldProcessorByArray[DV|UIF]}}), and avoids > the per-term docSet creation and set intersection overhead. -- This message was sent by Atlassian Jira (v8.3.4#803005) ---
[GitHub] [lucene-solr] s1monw commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo
s1monw commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo URL: https://github.com/apache/lucene-solr/pull/1434#discussion_r409812796 ## File path: lucene/core/src/java/org/apache/lucene/index/SegmentInfos.java ## @@ -374,7 +376,15 @@ public static final SegmentInfos readCommit(Directory directory, ChecksumIndexIn if (softDelCount + delCount > info.maxDoc()) { throw new CorruptIndexException("invalid deletion count: " + softDelCount + delCount + " vs maxDoc=" + info.maxDoc(), input); } - SegmentCommitInfo siPerCommit = new SegmentCommitInfo(info, delCount, softDelCount, delGen, fieldInfosGen, dvGen); + final byte[] sciId; + if (format > VERSION_74) { +sciId = new byte[StringHelper.ID_LENGTH]; +input.readBytes(sciId, 0, sciId.length); + } else { +sciId = infos.id; +// NOCOMMIT can we do this? it would at least give us consistent BWC but we can't identify the same SCI in different commits Review comment: I think this case is a bit more complicated due to the changing nature of this ID. For each change (DV, llive docs, fields) we need to move to a new ID. Should we then just accept null and create a new ID once it changes or should we stick with `null` on these segments until they are written first time? Introducing `null` requires quite some changes in how we handle this which we can do, for sure. I still wonder if we can get away with stealing the _parent_ ID and have a smooth upgrade path. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
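[Editor's note] The trade-off under discussion — a version-gated read with a null fallback for commits written before the per-SegmentCommitInfo id existed — can be reduced to a small sketch. The constants and method below are illustrative stand-ins, not the real SegmentInfos code (the actual format constant values differ):

```java
import java.io.DataInput;
import java.io.IOException;

public class SciIdReadSketch {
    static final int ID_LENGTH = 16;  // mirrors StringHelper.ID_LENGTH
    static final int VERSION_74 = 9;  // placeholder format constant, illustrative only

    // Commits written by a new enough format carry a per-SegmentCommitInfo
    // id; for older commits, return null (jpountz's suggestion) so callers
    // can fall back — e.g. hash the commit files, or re-copy everything and
    // give up incrementality.
    static byte[] readSciId(DataInput input, int format) throws IOException {
        if (format > VERSION_74) {
            byte[] id = new byte[ID_LENGTH];
            input.readFully(id);
            return id;
        }
        return null; // pre-existing segment commit: no id available
    }
}
```

The null route keeps old and new commits distinguishable; reusing the parent SegmentInfo id (the alternative s1monw raises) would make two different generations of the same segment look identical.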
[jira] [Updated] (LUCENE-9300) Index corruption with doc values updates and addIndexes
[ https://issues.apache.org/jira/browse/LUCENE-9300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Ferenczi updated LUCENE-9300: - Fix Version/s: 7.7.3 > Index corruption with doc values updates and addIndexes > --- > > Key: LUCENE-9300 > URL: https://issues.apache.org/jira/browse/LUCENE-9300 > Project: Lucene - Core > Issue Type: Bug >Reporter: Jim Ferenczi >Priority: Major > Fix For: master (9.0), 7.7.3, 8.6, 8.5.1 > > Time Spent: 4h 10m > Remaining Estimate: 0h > > Today a doc values update creates a new field infos file that contains the > original field infos updated for the new generation as well as the new fields > created by the doc values update. > However existing fields are cloned through the global fields (shared in the > index writer) instead of the local ones (present in the segment). In practice > this is not an issue since field numbers are shared between segments created > by the same index writer. But this assumption doesn't hold for segments > created by different writers and added through > IndexWriter#addIndexes(Directory). In this case, the field number of the same > field can differ between segments so any doc values update can corrupt the > index by assigning the wrong field number to an existing field in the next > generation. > When this happens, queries and merges can access wrong fields without > throwing any error, leading to a silent corruption in the index. > > Since segments are not guaranteed to have the same field number consistently > we should ensure that doc values update preserves the segment's field number > when rewriting field infos.
[jira] [Commented] (LUCENE-9300) Index corruption with doc values updates and addIndexes
[ https://issues.apache.org/jira/browse/LUCENE-9300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17085233#comment-17085233 ] ASF subversion and git services commented on LUCENE-9300: - Commit dab53a8089f78d860f6e046d3676b9a04131addf in lucene-solr's branch refs/heads/branch_7_7 from Jim Ferenczi [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=dab53a8 ] LUCENE-9300: Fix field infos update on doc values update (#1394) Today a doc values update creates a new field infos file that contains the original field infos updated for the new generation as well as the new fields created by the doc values update. However existing fields are cloned through the global fields (shared in the index writer) instead of the local ones (present in the segment). In practice this is not an issue since field numbers are shared between segments created by the same index writer. But this assumption doesn't hold for segments created by different writers and added through IndexWriter#addIndexes(Directory). In this case, the field number of the same field can differ between segments so any doc values update can corrupt the index by assigning the wrong field number to an existing field in the next generation. When this happens, queries and merges can access wrong fields without throwing any error, leading to a silent corruption in the index. This change ensures that we preserve local field numbers when creating a new field infos generation. > Index corruption with doc values updates and addIndexes > --- > > Key: LUCENE-9300 > URL: https://issues.apache.org/jira/browse/LUCENE-9300 > Project: Lucene - Core > Issue Type: Bug >Reporter: Jim Ferenczi >Priority: Major > Fix For: master (9.0), 8.6, 8.5.1 > > Time Spent: 4h 10m > Remaining Estimate: 0h > > Today a doc values update creates a new field infos file that contains the > original field infos updated for the new generation as well as the new fields > created by the doc values update. 
> However existing fields are cloned through the global fields (shared in the > index writer) instead of the local ones (present in the segment). In practice > this is not an issue since field numbers are shared between segments created > by the same index writer. But this assumption doesn't hold for segments > created by different writers and added through > IndexWriter#addIndexes(Directory). In this case, the field number of the same > field can differ between segments so any doc values update can corrupt the > index by assigning the wrong field number to an existing field in the next > generation. > When this happens, queries and merges can access wrong fields without > throwing any error, leading to a silent corruption in the index. > > Since segments are not guaranteed to have the same field number consistently > we should ensure that doc values update preserves the segment's field number > when rewriting field infos.
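The global-versus-local field number mismatch behind LUCENE-9300 can be shown with a toy model. This is not Lucene code: the maps below stand in for per-segment FieldInfos, and all names are hypothetical.

```java
import java.util.Map;

public class FieldNumberMismatch {
    // Toy per-segment schemas: field name -> field number, assigned in the
    // order each writer first saw the field. Segments built by independent
    // writers (later combined via addIndexes) can disagree on the numbers.
    static final Map<String, Integer> SEGMENT_A = Map.of("id", 0, "price", 1);
    static final Map<String, Integer> SEGMENT_B = Map.of("price", 0, "id", 1);

    /** Resolve a field number using the segment's own (local) mapping. */
    static int localNumber(Map<String, Integer> segment, String field) {
        return segment.get(field);
    }

    public static void main(String[] args) {
        // A doc-values update on "price" must use each segment's local number.
        // Reusing segment A's number (1) inside segment B would silently write
        // the update under segment B's "id" field — the corruption described above.
        System.out.println(localNumber(SEGMENT_A, "price")); // 1
        System.out.println(localNumber(SEGMENT_B, "price")); // 0
    }
}
```

The fix in the commit message amounts to always resolving through the local mapping when rewriting a field infos generation.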
[GitHub] [lucene-solr] s1monw commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo
s1monw commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo URL: https://github.com/apache/lucene-solr/pull/1434#discussion_r409839154

File path: lucene/core/src/java/org/apache/lucene/index/SegmentInfos.java (same hunk as above)

Review comment: I pushed a new commit https://github.com/apache/lucene-solr/pull/1434/commits/f0a72f82bb17bd2582799aa25514ef764e012570 to address this.
[GitHub] [lucene-solr] mhitza opened a new pull request #1435: SOLR-14410: Switch from SysV init script to systemd service file
mhitza opened a new pull request #1435: SOLR-14410: Switch from SysV init script to systemd service file URL: https://github.com/apache/lucene-solr/pull/1435

# Description
Remove the init.d/solr SysV init script and use a systemd service file instead.

# Solution
I've tried to diverge as little as possible from the way the installation script used to work.

# Checklist
Please review the following and check all that apply:
- [x] I have reviewed the guidelines for [How to Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms to the standards described there to the best of my ability.
- [x] I have created a Jira issue and added the issue ID to my pull request title.
- [x] I have given Solr maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended)
- [x] I have developed this patch against the `master` branch.
- [ ] I have run `ant precommit` and the appropriate test suite.
- [ ] I have added tests for my changes.
- [ ] I have added documentation for the [Ref Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) (for Solr changes only).
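For readers unfamiliar with the target format, a systemd unit replacing a SysV init script for Solr might look roughly like the sketch below. This is an illustrative sketch only, not the unit shipped by the PR: the paths, user, and environment file location are assumptions.

```ini
# /etc/systemd/system/solr.service — illustrative sketch; paths and user are assumptions
[Unit]
Description=Apache Solr
After=network.target

[Service]
Type=forking
User=solr
EnvironmentFile=/etc/default/solr.in.sh
ExecStart=/opt/solr/bin/solr start
ExecStop=/opt/solr/bin/solr stop
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

Unlike an init script, the unit is declarative: start/stop ordering, restart policy, and the run-as user are handled by systemd rather than shell code.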
[GitHub] [lucene-solr] jpountz commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo
jpountz commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo URL: https://github.com/apache/lucene-solr/pull/1434#discussion_r409844046

File path: lucene/core/src/java/org/apache/lucene/index/SegmentInfos.java (same hunk as above)

Review comment: I don't understand the idea of stealing the parent ID, wouldn't it cause Lucene to consider commits equal when they are not, which would be a much worse problem than considering commits different when they are equal?
[GitHub] [lucene-solr] s1monw commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo
s1monw commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo URL: https://github.com/apache/lucene-solr/pull/1434#discussion_r409851521

File path: lucene/core/src/java/org/apache/lucene/index/SegmentInfos.java (same hunk as above)

Review comment:
> I don't understand the idea of stealing the parent ID, wouldn't it cause Lucene to consider commits equal when they are not, which would be a much worse problem than considering commits different when they are equal?

The idea was to use the `SegmentInfos` ID, which is different for every commit, as a default. It would not cause Lucene to consider commits equal when they are not. I just moved to using null instead; it was an idea that has downsides too, so we can just go with null. The real question is when do we assign an ID then? Once we write the SCI again, even if it didn't change? I think we should, but that would then bring back the same problem with the fallback.
[GitHub] [lucene-solr] jpountz commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo
jpountz commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo URL: https://github.com/apache/lucene-solr/pull/1434#discussion_r409866462

File path: lucene/core/src/java/org/apache/lucene/index/SegmentInfos.java (same hunk as above)

Review comment: Ah sorry, I got confused because I thought that "parent" was referring to SegmentInfo (no s) instead of SegmentInfos, but I agree that SegmentInfos is not great either.
[GitHub] [lucene-solr] janhoy commented on a change in pull request #1435: SOLR-14410: Switch from SysV init script to systemd service file
janhoy commented on a change in pull request #1435: SOLR-14410: Switch from SysV init script to systemd service file URL: https://github.com/apache/lucene-solr/pull/1435#discussion_r409880688 ## File path: solr/solr-ref-guide/src/taking-solr-to-production.adoc ## @@ -365,7 +373,7 @@ There is another issue once the heap reaches 32GB. Below 32GB, Java is able to u Because of the potential garbage collection issues and the particular issues that happen at 32GB, if a single instance would require a 64GB heap, performance is likely to improve greatly if the machine is set up with two nodes that each have a 31GB heap. -If your use case requires multiple instances, at a minimum you will need unique Solr home directories for each node you want to run; ideally, each home should be on a different physical disk so that multiple Solr nodes don’t have to compete with each other when accessing files on disk. Having different Solr home directories implies that you’ll need a different include file for each node. Moreover, if using the `/etc/init.d/solr` script to control Solr as a service, then you’ll need a separate script for each node. The easiest approach is to use the service installation script to add multiple services on the same host, such as: +If your use case requires multiple instances, at a minimum you will need unique Solr home directories for each node you want to run; ideally, each home should be on a different physical disk so that multiple Solr nodes don’t have to compete with each other when accessing files on disk. Having different Solr home directories implies that you’ll need a different include file for each node. Moreover, if using the `/etc/systemd/system/solr.service` script to control Solr, then you’ll need a separate service for each node. The easiest approach is to use the service installation script to add multiple services on the same host, such as: Review comment: Moreover, if using systemctl to control Solr... would be a better wording? 
[GitHub] [lucene-solr] mayya-sharipova commented on issue #1351: LUCENE-9280: Collectors to skip noncompetitive documents
mayya-sharipova commented on issue #1351: LUCENE-9280: Collectors to skip noncompetitive documents URL: https://github.com/apache/lucene-solr/pull/1351#issuecomment-614931785

I have run another round of benchmarks, this time comparing the performance of this PR vs master, as we don't need any special sort field. [Here](https://github.com/mayya-sharipova/luceneutil/commit/c3166e4fc44e7fcddcd1672112c96364d9f464e5) are the changes made to luceneutil.

**wikimedium10m**
```
Task                    QPS baseline  StdDev   QPS patch  StdDev    Pct diff
HighTermDayOfYearSort   50.93         (5.6%)   49.31      (10.9%)   -3.2% ( -18% -  14%)
TermDTSort              83.37         (5.9%)   129.95     (41.2%)   55.9% (   8% - 109%)
WARNING: cat=HighTermDayOfYearSort: hit counts differ: 541957 vs 541957+
WARNING: cat=TermDTSort: hit counts differ: 506054 vs 1861+
```

Here we have two sorts:
- Int sort on day of year. Slight decrease in performance: -3.2%. There was an attempt to do the optimization, but the optimization was never actually run, because [estimatedNumberOfMatches](https://github.com/apache/lucene-solr/pull/1351/files#diff-aff67e212aa0edd675ec31c068cb642bR268) was not selective enough each time. The reason is that the data here is a day of the year in the range [1, 366], and all segments contain varied values throughout.
- Long sort on a date field (msecSinceEpoch). Speedup: 55.9%.
[GitHub] [lucene-solr] mayya-sharipova edited a comment on issue #1351: LUCENE-9280: Collectors to skip noncompetitive documents
mayya-sharipova edited a comment on issue #1351: LUCENE-9280: Collectors to skip noncompetitive documents URL: https://github.com/apache/lucene-solr/pull/1351#issuecomment-614931785

I have run another round of benchmarks, this time comparing the performance of this PR vs master, as we don't need any special sort field. [Here](https://github.com/mayya-sharipova/luceneutil/commit/c3166e4fc44e7fcddcd1672112c96364d9f464e5) are the changes made to luceneutil.

**wikimedium10m**: 10 million docs
```
Task                    QPS baseline  StdDev   QPS patch  StdDev    Pct diff
HighTermDayOfYearSort   50.93         (5.6%)   49.31      (10.9%)   -3.2% ( -18% -  14%)
TermDTSort              83.37         (5.9%)   129.95     (41.2%)   55.9% (   8% - 109%)
WARNING: cat=HighTermDayOfYearSort: hit counts differ: 541957 vs 541957+
WARNING: cat=TermDTSort: hit counts differ: 506054 vs 1861+
```

**wikimediumall**: about 33 million docs
```
Task                    QPS baseline  StdDev   QPS patch  StdDev    Pct diff
HighTermDayOfYearSort   23.37         (4.4%)   21.76      (8.8%)    -6.9% (  -19% -   6%)
TermDTSort              31.86         (3.5%)   108.33     (49.6%)   240.0% ( 180% - 303%)
WARNING: cat=HighTermDayOfYearSort: hit counts differ: 1275574 vs 1275574+
WARNING: cat=TermDTSort: hit counts differ: 1474717 vs 1070+
```

Here we have two sorts:
- Int sort on day of year. Slight decrease in performance: **-6.9% – -3.2%**. There was an attempt to do the optimization, but the optimization was never actually run, because [estimatedNumberOfMatches](https://github.com/apache/lucene-solr/pull/1351/files#diff-aff67e212aa0edd675ec31c068cb642bR268) was not selective enough each time. The reason is that the data here is a day of the year in the range [1, 366], and all segments contain varied values throughout, so this data is not really a target for optimization.
- Long sort on a date field (msecSinceEpoch). Speedups: **55.9% – 240.0%**.
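The core idea being benchmarked — stop examining documents whose sort value cannot beat the current bottom of the top-k queue — can be shown with a toy top-k routine. This is only a sketch of the competitive-bottom cutoff; the actual PR pushes that cutoff down into Lucene's comparators and index structures so whole ranges of documents are never visited at all.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.PriorityQueue;

public class SkipNoncompetitive {
    /**
     * Top-k smallest values with a "competitive bottom" cutoff: once the
     * queue holds k entries, any candidate >= the current bottom cannot
     * enter the top k and is skipped without further work.
     */
    static long[] topK(long[] values, int k) {
        // max-heap holding the k smallest values seen so far; peek() is the bottom
        PriorityQueue<Long> heap = new PriorityQueue<>(Comparator.reverseOrder());
        for (long v : values) {
            if (heap.size() == k && v >= heap.peek()) continue; // noncompetitive: skip
            heap.offer(v);
            if (heap.size() > k) heap.poll(); // evict the old bottom
        }
        long[] out = new long[heap.size()];
        for (int i = out.length - 1; i >= 0; i--) out[i] = heap.poll(); // ascending
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(topK(new long[]{9, 3, 7, 1, 8, 2}, 3))); // [1, 2, 3]
    }
}
```

The benchmark pattern above matches this shape: on fields where the cutoff quickly becomes selective (msecSinceEpoch), most candidates are skipped and QPS jumps; on low-cardinality fields (day of year), the cutoff never becomes selective and the bookkeeping is pure overhead.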
[GitHub] [lucene-solr] mayya-sharipova commented on issue #1351: LUCENE-9280: Collectors to skip noncompetitive documents
mayya-sharipova commented on issue #1351: LUCENE-9280: Collectors to skip noncompetitive documents URL: https://github.com/apache/lucene-solr/pull/1351#issuecomment-614988778 @msokolov @jimczi @jpountz I was wondering if you have any additional comments on this change?
[jira] [Updated] (SOLR-14291) OldAnalyticsRequestConverter should support fields names with dots
[ https://issues.apache.org/jira/browse/SOLR-14291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Khludnev updated SOLR-14291: Fix Version/s: 8.6 Resolution: Fixed Status: Resolved (was: Patch Available) > OldAnalyticsRequestConverter should support fields names with dots > -- > > Key: SOLR-14291 > URL: https://issues.apache.org/jira/browse/SOLR-14291 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: search, SearchComponents - other >Reporter: Anatolii Siuniaev >Assignee: Mikhail Khludnev >Priority: Trivial > Fix For: 8.6 > > Attachments: SOLR-14291.patch, SOLR-14291.patch, SOLR-14291.patch > > > If you send a query with range facets using old olap-style syntax (see pdf > [here|https://issues.apache.org/jira/browse/SOLR-5302]), > OldAnalyticsRequestConverter just silently (no exception thrown) omits > parameters like > {code:java} > olap..rangefacet..start > {code} > in case if __ has dots inside (for instance field name is > _Project.Value_). And thus no range facets are returned in response. > Probably the same happens in case of field faceting.
[jira] [Commented] (SOLR-14291) OldAnalyticsRequestConverter should support fields names with dots
[ https://issues.apache.org/jira/browse/SOLR-14291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17085462#comment-17085462 ] Mikhail Khludnev commented on SOLR-14291: - Thanks, [~houston] and [~anatolii_siuniaev]! > OldAnalyticsRequestConverter should support fields names with dots > -- > > Key: SOLR-14291 > URL: https://issues.apache.org/jira/browse/SOLR-14291 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: search, SearchComponents - other >Reporter: Anatolii Siuniaev >Assignee: Mikhail Khludnev >Priority: Trivial > Fix For: 8.6 > > Attachments: SOLR-14291.patch, SOLR-14291.patch, SOLR-14291.patch > > > If you send a query with range facets using old olap-style syntax (see pdf > [here|https://issues.apache.org/jira/browse/SOLR-5302]), > OldAnalyticsRequestConverter just silently (no exception thrown) omits > parameters like > {code:java} > olap..rangefacet..start > {code} > in case if __ has dots inside (for instance field name is > _Project.Value_). And thus no range facets are returned in response. > Probably the same happens in case of field faceting.