[jira] [Commented] (LUCENE-9316) Incorporate all :precommit tasks into :check
[ https://issues.apache.org/jira/browse/LUCENE-9316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17084600#comment-17084600 ] Dawid Weiss commented on LUCENE-9316: - The way I always thought it should work would be for {{precommit}} to run a subset of all the checks. This subset should be composed of tasks that are fairly fast so that people can (and should) run them before they commit stuff. My personal opinion at the moment is that precommit tries to run too much (it shouldn't take longer than two minutes on average hardware). A {{gradlew check}} runs full validation: everything, including tests. Adding {{ -x test}} is a way of excluding tests from that set (which I know is tempting since tests run for so long). Another difference is that "precommit" is a top-level task we added that collects other things from subprojects while "check" is per-project. So, in theory, if you know you only worked on {{:solr:core}}, you could run {{-p solr/core check}} and this would run a full set of validation tasks for that project only. > Incorporate all :precommit tasks into :check > > > Key: LUCENE-9316 > URL: https://issues.apache.org/jira/browse/LUCENE-9316 > Project: Lucene - Core > Issue Type: Sub-task >Reporter: Dawid Weiss >Assignee: Dawid Weiss >Priority: Minor > Fix For: master (9.0) > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
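The invocations discussed in the comment can be summarised as follows (assuming a gradle-based lucene-solr checkout with the task layout described above):

```shell
# Fast subset of checks, meant to run before committing:
./gradlew precommit

# Full validation: everything, including tests:
./gradlew check

# Full validation, but exclude the (slow) test task:
./gradlew check -x test

# Full set of validation tasks for a single project only:
./gradlew -p solr/core check
```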
[jira] [Commented] (LUCENE-7788) fail precommit on unparameterised log messages and examine for wasted work/objects
[ https://issues.apache.org/jira/browse/LUCENE-7788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17084621#comment-17084621 ] Dawid Weiss commented on LUCENE-7788: - Looks all right to me, I guess. > fail precommit on unparameterised log messages and examine for wasted > work/objects > -- > > Key: LUCENE-7788 > URL: https://issues.apache.org/jira/browse/LUCENE-7788 > Project: Lucene - Core > Issue Type: Task >Reporter: Christine Poerschke >Assignee: Erick Erickson >Priority: Minor > Attachments: LUCENE-7788.patch, LUCENE-7788.patch, gradle_only.patch, > gradle_only.patch > > Time Spent: 0.5h > Remaining Estimate: 0h > > SOLR-10415 would be removing existing unparameterised log.trace messages use > and once that is in place then this ticket's one-line change would be for > 'ant precommit' to reject any future unparameterised log.trace message use. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
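For context, an unparameterised log call builds its message string (and calls toString on its arguments) even when the level is disabled, which is the wasted work the summary refers to. A minimal sketch with a stand-in logger (the class below is hypothetical; real code would use SLF4J's `log.trace(...)` overloads):

```java
// Stand-in logger, not the SLF4J API: only illustrates eager vs lazy formatting.
class TraceLog {
    static final boolean TRACE_ENABLED = false;

    static void trace(String msg) {              // unparameterised: msg is pre-built
        if (TRACE_ENABLED) System.out.println(msg);
    }

    static void trace(String fmt, Object arg) {  // parameterised: formats only if enabled
        if (TRACE_ENABLED) System.out.println(fmt.replace("{}", String.valueOf(arg)));
    }
}

class Doc {
    static int toStringCalls = 0;
    @Override public String toString() { toStringCalls++; return "doc"; }
}

class LogDemo {
    // Returns how many times Doc.toString ran while tracing was disabled.
    static int wastedToStrings() {
        Doc d = new Doc();
        Doc.toStringCalls = 0;
        TraceLog.trace("indexed: " + d);     // concatenation forces toString eagerly
        TraceLog.trace("indexed: {}", d);    // argument is never formatted
        return Doc.toStringCalls;
    }
}
```

With tracing off, only the unparameterised call pays for formatting; that is what a precommit check on unparameterised messages prevents.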
[jira] [Closed] (LUCENE-9300) Index corruption with doc values updates and addIndexes
[ https://issues.apache.org/jira/browse/LUCENE-9300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ignacio Vera closed LUCENE-9300. > Index corruption with doc values updates and addIndexes > --- > > Key: LUCENE-9300 > URL: https://issues.apache.org/jira/browse/LUCENE-9300 > Project: Lucene - Core > Issue Type: Bug >Reporter: Jim Ferenczi >Priority: Major > Fix For: master (9.0), 8.6, 8.5.1 > > Time Spent: 4h 10m > Remaining Estimate: 0h > > Today a doc values update creates a new field infos file that contains the > original field infos updated for the new generation as well as the new fields > created by the doc values update. > However existing fields are cloned through the global fields (shared in the > index writer) instead of the local ones (present in the segment). In practice > this is not an issue since field numbers are shared between segments created > by the same index writer. But this assumption doesn't hold for segments > created by different writers and added through > IndexWriter#addIndexes(Directory). In this case, the field number of the same > field can differ between segments so any doc values update can corrupt the > index by assigning the wrong field number to an existing field in the next > generation. > When this happens, queries and merges can access wrong fields without > throwing any error, leading to a silent corruption in the index. > > Since segments are not guaranteed to have the same field number consistently > we should ensure that doc values update preserves the segment's field number > when rewriting field infos. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
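The field-number mismatch described above can be illustrated with a toy model (hypothetical classes, not Lucene's actual FieldInfos code): numbers are handed out in registration order, so they are only consistent within one writer, and the same field name can map to different numbers in segments produced by different writers.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of per-writer field numbering: numbers are assigned in
// registration order, so they are only stable within a single writer.
class FieldNumbers {
    private final Map<String, Integer> byName = new HashMap<>();

    int numberFor(String field) {
        return byName.computeIfAbsent(field, f -> byName.size());
    }
}

class FieldNumberDemo {
    // True iff "title" gets the same field number in both writers.
    static boolean consistentAcrossWriters() {
        FieldNumbers writerA = new FieldNumbers();
        writerA.numberFor("id");                    // id -> 0
        int titleInA = writerA.numberFor("title");  // title -> 1

        FieldNumbers writerB = new FieldNumbers();
        int titleInB = writerB.numberFor("title");  // title -> 0

        return titleInA == titleInB;
    }
}
```

This is why a doc values update that resolves fields through the writer-global numbers can silently pick the wrong field for a segment added via addIndexes(Directory).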
[jira] [Commented] (LUCENE-7701) Refactor grouping collectors
[ https://issues.apache.org/jira/browse/LUCENE-7701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17084681#comment-17084681 ] Alan Woodward commented on LUCENE-7701: --- Sure, go ahead! > Refactor grouping collectors > > > Key: LUCENE-7701 > URL: https://issues.apache.org/jira/browse/LUCENE-7701 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Alan Woodward >Priority: Major > Fix For: 7.0 > > Attachments: LUCENE-7701.patch, LUCENE-7701.patch > > > Grouping currently works via abstract collectors, which need to be overridden > for each way of defining a group - currently we have two, 'term' (based on > SortedDocValues) and 'function' (based on ValueSources). These collectors > all have a lot of repeated code, and means that if you want to implement your > own group definitions, you need to override four or five different classes. > This would be easier to deal with if instead the 'group selection' code was > abstracted out into a single interface, and the various collectors were > changed to concrete implementations. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9324) Give IDs to SegmentCommitInfo
[ https://issues.apache.org/jira/browse/LUCENE-9324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17084682#comment-17084682 ] Simon Willnauer commented on LUCENE-9324: - I am trying to give a bit more context to this issue. Today we have _SegmentInfo_ which represents a segment once it's written to disk, for instance at flush or merge time. We have a randomly generated ID in _SegmentInfo_ that can be used to verify if two segments are the same. Since we use incremental numbers for segment naming, it's likely that two IndexWriters produce a segment with very similar contents and the same name. Yet, the _SegmentInfo_ ID would be different. In addition to this ID we also have checksums on files which can be used to verify identity in addition to the ID, but they should not be treated as identity by themselves since they are very weak checksums. Now segments also get _updated_, for instance when a document is marked as deleted or the segment receives a doc values update. The only thing that changes is the delete or update generation, which also allows two IndexWriters that opened two copies of a segment (with the same segment ID) to produce new delGens or dvGens that look identical from the outside but are actually different. This is a problem that we see quite frequently in Elasticsearch, and we'd like to prevent it, or at least have a better tool in our hands to distinguish _SegmentCommitInfo_ instances from one another. If we had an ID on SegmentCommitInfo that changes each time one of these generations changes, we could tell much more easily if only the updated files (which are often very small) need to be replaced in order to recover an index. The plan is to implement this in a very similar fashion as we did on the _SegmentInfo_, but also invalidate the ID once any of the generations change in order to force a new _SegmentCommitInfo_ ID for the new generation. 
Yet, the IDs would not be the same if two IndexWriters start from the same segment and make an identical change to it, i.e. it's not a replacement for a strong hash function. > Give IDs to SegmentCommitInfo > - > > Key: LUCENE-9324 > URL: https://issues.apache.org/jira/browse/LUCENE-9324 > Project: Lucene - Core > Issue Type: New Feature >Reporter: Adrien Grand >Priority: Minor > > We already have IDs in SegmentInfo, which are useful to uniquely identify > segments. Having IDs on SegmentCommitInfo would be useful too in order to > compare commits for equality and make snapshots incremental on generational > files too. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
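The proposed behaviour can be sketched as follows (illustrative names only; a UUID stands in for Lucene's 16-byte StringHelper.randomId, and this is not the actual SegmentCommitInfo code): the per-commit ID is created lazily and thrown away whenever a generation advances.

```java
import java.util.UUID;

// Illustrative sketch: the commit-level ID is invalidated whenever a
// generation advances, so a new generation always gets a fresh ID.
class CommitInfoSketch {
    private long delGen = -1;
    private String id;

    synchronized String getId() {
        if (id == null) {
            id = UUID.randomUUID().toString(); // new ID for the current generation
        }
        return id;
    }

    synchronized void advanceDelGen() {
        delGen++;
        id = null; // force a fresh ID for the new generation
    }
}
```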
[jira] [Created] (LUCENE-9326) Refactor SortField to better handle extensions
Alan Woodward created LUCENE-9326: - Summary: Refactor SortField to better handle extensions Key: LUCENE-9326 URL: https://issues.apache.org/jira/browse/LUCENE-9326 Project: Lucene - Core Issue Type: Improvement Reporter: Alan Woodward Assignee: Alan Woodward Working on LUCENE-9325 has made me realize that SortField needs some serious reworking: * we have a bunch of hard-coded types, but also a number of custom extensions, which make implementing new sort orders complicated in non-obvious ways * we refer to these hard-coded types in a number of places, in particular in index sorts, which means that you can't use a 'custom' sort here. For example, I can see it would be very useful to be able to index sort by distance from a particular point, but that's not currently possible. * the API separates out the comparator and whether or not it should be reversed, which adds an extra layer of complication to its use, particularly in cases where we have multiple sortfields. The whole thing could do with an overhaul. I think this can be broken up into a few stages by adding a new superclass abstraction which `SortField` will extend, and gradually moving functionality into this superclass. I plan on starting with index sorting, which will require a sort field to a) be able to merge sort documents coming from a list of readers, and b) serialize itself to and deserialize itself from SegmentInfo -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (SOLR-14409) Existing violations allow bypassing policy rules when adding new replicas
Andrzej Bialecki created SOLR-14409: --- Summary: Existing violations allow bypassing policy rules when adding new replicas Key: SOLR-14409 URL: https://issues.apache.org/jira/browse/SOLR-14409 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Components: AutoScaling Affects Versions: 8.5, master (9.0), 8.6 Reporter: Andrzej Bialecki Assignee: Andrzej Bialecki Steps to reproduce: * start with an empty cluster policy. * create a collection with as many replicas as there are nodes. * add one more replica to any node. Now this node has two replicas, all other nodes have one. * define the following cluster policy: {code:java} { 'set-cluster-policy': [ {'replica': '<2', 'shard': '#ANY', 'node': '#ANY', 'strict': true} ] } {code} This automatically creates a violation because of the existing layout. * try adding one more replica. This should fail because no node satisfies the rules (there must be at most 1 replica per node). However, the command succeeds and adds replica to the node that already has 2 replicas, which clearly violates the policy and makes matters even worse. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
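What the strict rule above is supposed to enforce can be sketched with a hypothetical helper (not the actual Solr policy engine): once every node already holds at least one replica, and one node holds two, no placement should be legal under 'replica': '<2'.

```java
import java.util.Map;

// Hypothetical check for the rule {'replica': '<2', 'shard': '#ANY', 'node': '#ANY'}:
// placing one more replica on `node` must keep its replica count under 2.
class PolicyCheck {
    static boolean wouldViolate(Map<String, Integer> replicasPerNode, String node) {
        return replicasPerNode.getOrDefault(node, 0) + 1 >= 2;
    }
}
```

In the reproduction above every candidate node would violate the rule, so the add-replica command ought to fail rather than pick the node that is already in violation.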
[jira] [Updated] (SOLR-14409) Existing violations allow bypassing policy rules when adding new replicas
[ https://issues.apache.org/jira/browse/SOLR-14409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrzej Bialecki updated SOLR-14409: Attachment: SOLR-14409.patch > Existing violations allow bypassing policy rules when adding new replicas > - > > Key: SOLR-14409 > URL: https://issues.apache.org/jira/browse/SOLR-14409 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: AutoScaling >Affects Versions: master (9.0), 8.5, 8.6 >Reporter: Andrzej Bialecki >Assignee: Andrzej Bialecki >Priority: Major > Attachments: SOLR-14409.patch > > > Steps to reproduce: > * start with an empty cluster policy. > * create a collection with as many replicas as there are nodes. > * add one more replica to any node. Now this node has two replicas, all > other nodes have one. > * define the following cluster policy: > {code:java} > { 'set-cluster-policy': [ {'replica': '<2', 'shard': '#ANY', 'node': '#ANY', > 'strict': true} ] } {code} > This automatically creates a violation because of the existing layout. > * try adding one more replica. This should fail because no node satisfies > the rules (there must be at most 1 replica per node). However, the command > succeeds and adds replica to the node that already has 2 replicas, which > clearly violates the policy and makes matters even worse. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (LUCENE-9288) poll_mirrors.py release script doesn't handle HTTPS
[ https://issues.apache.org/jira/browse/LUCENE-9288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ignacio Vera updated LUCENE-9288: - Status: Open (was: Open) > poll_mirrors.py release script doesn't handle HTTPS > --- > > Key: LUCENE-9288 > URL: https://issues.apache.org/jira/browse/LUCENE-9288 > Project: Lucene - Core > Issue Type: Bug > Components: general/tools >Affects Versions: 8.5, master (9.0) >Reporter: Alan Woodward >Priority: Major > > During the 8.5.0 release, the poll_mirrors.py script incorrectly reported > that the release artifacts were not on various mirrors or on maven central, > because it is configured to hit these endpoints using the `http` schema, > where most of them now only accept `https`. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (LUCENE-9288) poll_mirrors.py release script doesn't handle HTTPS
[ https://issues.apache.org/jira/browse/LUCENE-9288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ignacio Vera updated LUCENE-9288: - Attachment: (was: poll-mirrors.patch) > poll_mirrors.py release script doesn't handle HTTPS > --- > > Key: LUCENE-9288 > URL: https://issues.apache.org/jira/browse/LUCENE-9288 > Project: Lucene - Core > Issue Type: Bug > Components: general/tools >Affects Versions: master (9.0), 8.5 >Reporter: Alan Woodward >Priority: Major > Attachments: poll-mirrors.patch > > > During the 8.5.0 release, the poll_mirrors.py script incorrectly reported > that the release artifacts were not on various mirrors or on maven central, > because it is configured to hit these endpoints using the `http` schema, > where most of them now only accept `https`. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (LUCENE-9288) poll_mirrors.py release script doesn't handle HTTPS
[ https://issues.apache.org/jira/browse/LUCENE-9288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ignacio Vera updated LUCENE-9288: - Attachment: poll-mirrors.patch Status: Open (was: Open) Attached is the hack I have used for 8.5.1 release. I am sure there is a more elegant way but this makes the script work again. > poll_mirrors.py release script doesn't handle HTTPS > --- > > Key: LUCENE-9288 > URL: https://issues.apache.org/jira/browse/LUCENE-9288 > Project: Lucene - Core > Issue Type: Bug > Components: general/tools >Affects Versions: 8.5, master (9.0) >Reporter: Alan Woodward >Priority: Major > Attachments: poll-mirrors.patch > > > During the 8.5.0 release, the poll_mirrors.py script incorrectly reported > that the release artifacts were not on various mirrors or on maven central, > because it is configured to hit these endpoints using the `http` schema, > where most of them now only accept `https`. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14409) Existing violations allow bypassing policy rules when adding new replicas
[ https://issues.apache.org/jira/browse/SOLR-14409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17084714#comment-17084714 ] Andrzej Bialecki commented on SOLR-14409: - This patch illustrates the problem. This may be a bug in {{AddReplicaSuggester}} or in {{Suggester.isLessSerious / containsNewErrors}} . {{AddReplicaSuggester}} tries all nodes but they all produce new violations - except for the one that already has a violation. So for all other nodes the condition in {{AddReplicaSuggester:55 if (!containsNewErrors(errs))}} is always false because they produce new errors. As a side-effect of this the variable {{leastSeriousViolation}} is never assigned. Finally, for the node that already has a violation the change does not produce a new violation, it only increases the severity of the existing one - but the code doesn't check this because {{leastSeriousViolation}} is null, so it treats the current error as the least serious. This may be the conceptual problem here - if there were no other errors then shouldn't the current errors be the most serious? but in this case there's already a pre-existing violation on this node so perhaps the {{leastSeriousViolation}} should always be initialized with the existing violations? (I tried it and many unit tests started failing...) > Existing violations allow bypassing policy rules when adding new replicas > - > > Key: SOLR-14409 > URL: https://issues.apache.org/jira/browse/SOLR-14409 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: AutoScaling >Affects Versions: master (9.0), 8.5, 8.6 >Reporter: Andrzej Bialecki >Assignee: Andrzej Bialecki >Priority: Major > Attachments: SOLR-14409.patch > > > Steps to reproduce: > * start with an empty cluster policy. > * create a collection with as many replicas as there are nodes. > * add one more replica to any node. Now this node has two replicas, all > other nodes have one. 
> * define the following cluster policy: > {code:java} > { 'set-cluster-policy': [ {'replica': '<2', 'shard': '#ANY', 'node': '#ANY', > 'strict': true} ] } {code} > This automatically creates a violation because of the existing layout. > * try adding one more replica. This should fail because no node satisfies > the rules (there must be at most 1 replica per node). However, the command > succeeds and adds replica to the node that already has 2 replicas, which > clearly violates the policy and makes matters even worse. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
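The control flow described in the comment can be caricatured like this (hypothetical and heavily simplified, not the real Suggester code): candidates that would introduce a *new* violation are skipped, so the only surviving candidate is the node whose pre-existing violation merely got worse.

```java
import java.util.Map;

// Caricature of the flaw: a candidate is rejected only if it introduces a
// new violation, so worsening an existing violation slips through, and the
// null leastSeriousViolation lets that node win by default.
class SuggesterSketch {
    static String pickNode(Map<String, Integer> replicasPerNode) {
        String suggestion = null;
        for (String node : replicasPerNode.keySet()) {
            boolean hadViolation = replicasPerNode.get(node) >= 2;       // pre-existing
            boolean newViolation = !hadViolation
                    && replicasPerNode.get(node) + 1 >= 2;               // would be new
            if (newViolation) {
                continue;            // skipped: produces a new error
            }
            if (suggestion == null) {
                suggestion = node;   // leastSeriousViolation is null, so this "wins"
            }
        }
        return suggestion;
    }
}
```

With one node at two replicas and the rest at one, the only node that survives the filter is the one already violating the policy, matching the observed behaviour.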
[jira] [Updated] (LUCENE-9288) poll_mirrors.py release script doesn't handle HTTPS
[ https://issues.apache.org/jira/browse/LUCENE-9288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ignacio Vera updated LUCENE-9288: - Attachment: poll-mirrors.patch Status: Open (was: Open) > poll_mirrors.py release script doesn't handle HTTPS > --- > > Key: LUCENE-9288 > URL: https://issues.apache.org/jira/browse/LUCENE-9288 > Project: Lucene - Core > Issue Type: Bug > Components: general/tools >Affects Versions: 8.5, master (9.0) >Reporter: Alan Woodward >Priority: Major > Attachments: poll-mirrors.patch > > > During the 8.5.0 release, the poll_mirrors.py script incorrectly reported > that the release artifacts were not on various mirrors or on maven central, > because it is configured to hit these endpoints using the `http` schema, > where most of them now only accept `https`. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9288) poll_mirrors.py release script doesn't handle HTTPS
[ https://issues.apache.org/jira/browse/LUCENE-9288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17084731#comment-17084731 ] Alan Woodward commented on LUCENE-9288: --- +1, certainly a lot nicer than the monstrosity I came up with > poll_mirrors.py release script doesn't handle HTTPS > --- > > Key: LUCENE-9288 > URL: https://issues.apache.org/jira/browse/LUCENE-9288 > Project: Lucene - Core > Issue Type: Bug > Components: general/tools >Affects Versions: master (9.0), 8.5 >Reporter: Alan Woodward >Priority: Major > Attachments: poll-mirrors.patch > > > During the 8.5.0 release, the poll_mirrors.py script incorrectly reported > that the release artifacts were not on various mirrors or on maven central, > because it is configured to hit these endpoints using the `http` schema, > where most of them now only accept `https`. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14291) OldAnalyticsRequestConverter should support fields names with dots
[ https://issues.apache.org/jira/browse/SOLR-14291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17084774#comment-17084774 ] ASF subversion and git services commented on SOLR-14291: Commit b24b02840254f7e929a07658ec0f9066a2c5c366 in lucene-solr's branch refs/heads/master from Mikhail Khludnev [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=b24b028 ] SOLR-14291: fix regexps to handle dotted fields in Old Analytics params. > OldAnalyticsRequestConverter should support fields names with dots > -- > > Key: SOLR-14291 > URL: https://issues.apache.org/jira/browse/SOLR-14291 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: search, SearchComponents - other >Reporter: Anatolii Siuniaev >Assignee: Mikhail Khludnev >Priority: Trivial > Attachments: SOLR-14291.patch, SOLR-14291.patch, SOLR-14291.patch > > > If you send a query with range facets using old olap-style syntax (see pdf > [here|https://issues.apache.org/jira/browse/SOLR-5302]), > OldAnalyticsRequestConverter just silently (no exception thrown) omits > parameters like > {code:java} > olap..rangefacet..start > {code} > in case if __ has dots inside (for instance field name is > _Project.Value_). And thus no range facets are returned in response. > Probably the same happens in case of field faceting. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14291) OldAnalyticsRequestConverter should support fields names with dots
[ https://issues.apache.org/jira/browse/SOLR-14291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17084777#comment-17084777 ] ASF subversion and git services commented on SOLR-14291: Commit d448f950516a3610d4271af7282fc55b6f176297 in lucene-solr's branch refs/heads/branch_8x from Mikhail Khludnev [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=d448f95 ] SOLR-14291: fix regexps to handle dotted fields in Old Analytics params. > OldAnalyticsRequestConverter should support fields names with dots > -- > > Key: SOLR-14291 > URL: https://issues.apache.org/jira/browse/SOLR-14291 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: search, SearchComponents - other >Reporter: Anatolii Siuniaev >Assignee: Mikhail Khludnev >Priority: Trivial > Attachments: SOLR-14291.patch, SOLR-14291.patch, SOLR-14291.patch > > > If you send a query with range facets using old olap-style syntax (see pdf > [here|https://issues.apache.org/jira/browse/SOLR-5302]), > OldAnalyticsRequestConverter just silently (no exception thrown) omits > parameters like > {code:java} > olap..rangefacet..start > {code} > in case if __ has dots inside (for instance field name is > _Project.Value_). And thus no range facets are returned in response. > Probably the same happens in case of field faceting. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
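The commit message says the fix was to the regexps for the old Analytics params. The general issue can be illustrated with a simplified, hypothetical parameter shape (the real olap-style parameter names have more segments): a `[^.]+` field group cannot span a dotted field name such as `Project.Value`, while a reluctant `(.+?)` anchored on the literal tail can.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parameter shape "olap.<field>.rangefacet.start", simplified
// from the old olap-style syntax, to show why dotted field names fail.
class DottedFieldRegex {
    static final Pattern NAIVE = Pattern.compile("olap\\.([^.]+)\\.rangefacet\\.start");
    static final Pattern FIXED = Pattern.compile("olap\\.(.+?)\\.rangefacet\\.start");

    static String fieldName(Pattern p, String param) {
        Matcher m = p.matcher(param);
        return m.matches() ? m.group(1) : null; // null: parameter silently ignored
    }
}
```

The naive pattern silently fails to match (no exception thrown, the parameter is just dropped), which mirrors the reported behaviour of missing range facets in the response.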
[GitHub] [lucene-solr] s1monw opened a new pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo
s1monw opened a new pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo URL: https://github.com/apache/lucene-solr/pull/1434 We already have IDs in SegmentInfo, which are useful to uniquely identify segments. Having IDs on SegmentCommitInfo would be useful too in order to compare commits for equality and make snapshots incremental on generational files. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] s1monw commented on issue #1434: LUCENE-9324: Add an ID to SegmentCommitInfo
s1monw commented on issue #1434: LUCENE-9324: Add an ID to SegmentCommitInfo URL: https://github.com/apache/lucene-solr/pull/1434#issuecomment-614643954 Note: there is still a NOCOMMIT in this pr regarding BWC. It's just an idea we can build on or even move to using `null` as an indicator that there is no id. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mikemccand commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo
mikemccand commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo URL: https://github.com/apache/lucene-solr/pull/1434#discussion_r409553076 ## File path: lucene/core/src/java/org/apache/lucene/index/IndexWriter.java ## @@ -3081,7 +3081,7 @@ private SegmentCommitInfo copySegmentAsIs(SegmentCommitInfo info, String segName info.info.getUseCompoundFile(), info.info.getCodec(), info.info.getDiagnostics(), info.info.getId(), info.info.getAttributes(), info.info.getIndexSort()); SegmentCommitInfo newInfoPerCommit = new SegmentCommitInfo(newInfo, info.getDelCount(), info.getSoftDelCount(), info.getDelGen(), - info.getFieldInfosGen(), info.getDocValuesGen()); + info.getFieldInfosGen(), info.getDocValuesGen(), info.getId()); Review comment: This happens during `IndexWriter.addIndexes(Directory[])` right? I wonder whether we should give a new id instead of reusing the old one? E.g. the segment (likely) now has a new name, and is in a different `Directory`, and is copied/forked from a prior segment, so maybe it should get a new `id`? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mikemccand commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo
mikemccand commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo URL: https://github.com/apache/lucene-solr/pull/1434#discussion_r409556321 ## File path: lucene/core/src/java/org/apache/lucene/index/SegmentInfos.java ## @@ -374,7 +376,15 @@ public static final SegmentInfos readCommit(Directory directory, ChecksumIndexIn if (softDelCount + delCount > info.maxDoc()) { throw new CorruptIndexException("invalid deletion count: " + softDelCount + delCount + " vs maxDoc=" + info.maxDoc(), input); } - SegmentCommitInfo siPerCommit = new SegmentCommitInfo(info, delCount, softDelCount, delGen, fieldInfosGen, dvGen); + final byte[] sciId; + if (format > VERSION_74) { +sciId = new byte[StringHelper.ID_LENGTH]; +input.readBytes(sciId, 0, sciId.length); + } else { +sciId = infos.id; +// NOCOMMIT can we do this? it would at least give us consistent BWC but we can't identify the same SCI in different commits Review comment: Good question ... maybe we could use `info.getId()`? Then it'd be unique across `SCI`, but, shared across `SI` which is also weird. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mikemccand commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo
mikemccand commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo URL: https://github.com/apache/lucene-solr/pull/1434#discussion_r409554301 ## File path: lucene/core/src/java/org/apache/lucene/index/SegmentCommitInfo.java ## @@ -388,4 +399,17 @@ public SegmentCommitInfo clone() { final int getDelCount(boolean includeSoftDeletes) { return includeSoftDeletes ? getDelCount() + getSoftDelCount() : getDelCount(); } + + private void generationAdvanced() { +sizeInBytes = -1; +id = null; + } + + public byte[] getId() { +if (id == null) { + // we advanced a generation - need to generate a new ID + id = StringHelper.randomId(); Review comment: Do we need some thread safety here? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] s1monw commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo
s1monw commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo URL: https://github.com/apache/lucene-solr/pull/1434#discussion_r409575955 ## File path: lucene/core/src/java/org/apache/lucene/index/SegmentCommitInfo.java ## @@ -388,4 +399,17 @@ public SegmentCommitInfo clone() { final int getDelCount(boolean includeSoftDeletes) { return includeSoftDeletes ? getDelCount() + getSoftDelCount() : getDelCount(); } + + private void generationAdvanced() { +sizeInBytes = -1; +id = null; + } + + public byte[] getId() { +if (id == null) { + // we advanced a generation - need to generate a new ID + id = StringHelper.randomId(); Review comment: yeah good question, as far as I can tell we never read or write any of the member vars on this class unless we hold a lock that protects it. But it's a mess so I guess we should. Yet, if we do that we should make every method on this class synced no? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] s1monw commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo
s1monw commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo URL: https://github.com/apache/lucene-solr/pull/1434#discussion_r409577804 ## File path: lucene/core/src/java/org/apache/lucene/index/SegmentInfos.java ## @@ -374,7 +376,15 @@ public static final SegmentInfos readCommit(Directory directory, ChecksumIndexIn if (softDelCount + delCount > info.maxDoc()) { throw new CorruptIndexException("invalid deletion count: " + softDelCount + delCount + " vs maxDoc=" + info.maxDoc(), input); } - SegmentCommitInfo siPerCommit = new SegmentCommitInfo(info, delCount, softDelCount, delGen, fieldInfosGen, dvGen); + final byte[] sciId; + if (format > VERSION_74) { +sciId = new byte[StringHelper.ID_LENGTH]; +input.readBytes(sciId, 0, sciId.length); + } else { +sciId = infos.id; +// NOCOMMIT can we do this? it would at least give us consistent BWC but we can't identify the same SCI in different commits Review comment: I don't think we should use info.getId(), since that would mean the same SegmentInfo instance is treated the same even if two IWs made changes to it and its generations. The way we have it now, it's only considered the same if the overall commit is the same, which seems good?
[GitHub] [lucene-solr] s1monw commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo
s1monw commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo URL: https://github.com/apache/lucene-solr/pull/1434#discussion_r409578428 ## File path: lucene/core/src/java/org/apache/lucene/index/IndexWriter.java ## @@ -3081,7 +3081,7 @@ private SegmentCommitInfo copySegmentAsIs(SegmentCommitInfo info, String segName info.info.getUseCompoundFile(), info.info.getCodec(), info.info.getDiagnostics(), info.info.getId(), info.info.getAttributes(), info.info.getIndexSort()); SegmentCommitInfo newInfoPerCommit = new SegmentCommitInfo(newInfo, info.getDelCount(), info.getSoftDelCount(), info.getDelGen(), - info.getFieldInfosGen(), info.getDocValuesGen()); + info.getFieldInfosGen(), info.getDocValuesGen(), info.getId()); Review comment: We do share the `info.info.getId()` here as well, so I think we should be consistent?
[GitHub] [lucene-solr] ErickErickson closed pull request #1428: LUCENE-7788: fail precommit on unparameterised log.trace messages
ErickErickson closed pull request #1428: LUCENE-7788: fail precommit on unparameterised log.trace messages URL: https://github.com/apache/lucene-solr/pull/1428
[GitHub] [lucene-solr] ErickErickson commented on issue #1428: LUCENE-7788: fail precommit on unparameterised log.trace messages
ErickErickson commented on issue #1428: LUCENE-7788: fail precommit on unparameterised log.trace messages URL: https://github.com/apache/lucene-solr/pull/1428#issuecomment-614685762 Re-doing
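LUCENE-7788's "wasted work/objects" point is that an unparameterised log call builds its message string even when the level is disabled, whereas a parameterised (or deferred) call defers that cost. A minimal runnable sketch of the difference, using `java.util.logging`'s `Supplier` overload as a stdlib stand-in for SLF4J's `log.trace("state: {}", v)` idiom (class and method names are illustrative):

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Demonstrates the wasted work LUCENE-7788 targets: an unparameterised
// log call evaluates its message even when the level is disabled.
class LogCostDemo {
    static int expensiveCalls = 0;

    // Pretend this builds a costly dump of internal state.
    static String expensive() {
        expensiveCalls++;
        return "big dump of state";
    }

    static void logBoth(Logger log) {
        // Unparameterised: expensive() runs unconditionally, logged or not.
        log.fine("state: " + expensive());
        // Deferred (analogous to SLF4J's parameterised log.trace("state: {}", v)):
        // the supplier is only invoked if FINE is actually enabled.
        log.fine(() -> "state: " + expensive());
    }

    public static void main(String[] args) {
        Logger log = Logger.getLogger("demo");
        log.setLevel(Level.INFO); // FINE is disabled
        logBoth(log);
        // Only the unparameterised call paid the cost.
        System.out.println(expensiveCalls); // prints 1
    }
}
```

This is why a precommit check that rejects unparameterised `log.trace`/`log.debug` calls is worthwhile: the savings apply on every call in the common case where the level is off.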
[jira] [Commented] (SOLR-11632) Creating an collection with an empty node set logs a WARN
[ https://issues.apache.org/jira/browse/SOLR-11632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17084919#comment-17084919 ] Ilan Ginzburg commented on SOLR-11632: -- Shouldn't that log mention "nodes" instead of "cores" as a fix? Given only live nodes are considered for replica placement, even a non EMPTY set of nodes can lead to this. > Creating an collection with an empty node set logs a WARN > - > > Key: SOLR-11632 > URL: https://issues.apache.org/jira/browse/SOLR-11632 > Project: Solr > Issue Type: Improvement >Reporter: Varun Thacker >Priority: Minor > > When I create a collection with an empty node set I get a message like this > in the logs > {code} > 14127 WARN > (OverseerThreadFactory-12-thread-3-processing-n:127.0.0.1:61605_solr) > [n:127.0.0.1:61605_solr] o.a.s.c.CreateCollectionCmd It is unusual to > create a collection (backuprestore_restored) without cores. > {code} > Should we just remove this? A user who uses EMPTY will get this message. A > user who doesn't pass a set of candidate nodes then the collection creation > will fail anyways -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-9830) Once IndexWriter is closed due to some RunTimeException like FileSystemException, It never return to normal unless restart the Solr JVM
[ https://issues.apache.org/jira/browse/SOLR-9830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17084940#comment-17084940 ] Guilherme Zanetta Simoni commented on SOLR-9830: It happens on 8.x too. > Once IndexWriter is closed due to some RunTimeException like > FileSystemException, It never return to normal unless restart the Solr JVM > --- > > Key: SOLR-9830 > URL: https://issues.apache.org/jira/browse/SOLR-9830 > Project: Solr > Issue Type: Bug > Components: update >Affects Versions: 6.2 > Environment: Red Hat 4.4.7-3,SolrCloud >Reporter: Daisy.Yuan >Priority: Major > > 1. Collection coll_test, has 9 shards, each has two replicas in different > solr instances. > 2. When update documens to the collection use Solrj, inject the exhausted > handle fault to one solr instance like solr1. > 3. Update to col_test_shard3_replica1(It's leader) is failed due to > FileSystemException, and IndexWriter is closed. > 4. And clear the fault, the col_test_shard3_replica1 (is leader) is always > cannot be updated documens and the numDocs is always less than the standby > replica. > 5. After Solr instance restart, It can update documens and the numDocs is > consistent between the two replicas. > I think in this case in Solr Cloud mode, it should recovery itself and not > restart to recovery the solrcore update function. 
> 2016-12-01 14:13:00,932 | INFO | http-nio-21101-exec-20 | > [DWPT][http-nio-21101-exec-20]: now abort | > org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34) > 2016-12-01 14:13:00,932 | INFO | http-nio-21101-exec-20 | > [DWPT][http-nio-21101-exec-20]: done abort | > org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34) > 2016-12-01 14:13:00,932 | INFO | http-nio-21101-exec-20 | > [IW][http-nio-21101-exec-20]: hit exception updating document | > org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34) > 2016-12-01 14:13:00,933 | INFO | http-nio-21101-exec-20 | > [IW][http-nio-21101-exec-20]: hit tragic FileSystemException inside > updateDocument | > org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34) > 2016-12-01 14:13:00,933 | INFO | http-nio-21101-exec-20 | > [IW][http-nio-21101-exec-20]: rollback | > org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34) > 2016-12-01 14:13:00,933 | INFO | http-nio-21101-exec-20 | > [IW][http-nio-21101-exec-20]: all running merges have aborted | > org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34) > 2016-12-01 14:13:00,934 | INFO | http-nio-21101-exec-20 | > [IW][http-nio-21101-exec-20]: rollback: done finish merges | > org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34) > 2016-12-01 14:13:00,934 | INFO | http-nio-21101-exec-20 | > [DW][http-nio-21101-exec-20]: abort | > org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34) > 2016-12-01 14:13:00,939 | INFO | commitScheduler-46-thread-1 | > [DWPT][commitScheduler-46-thread-1]: flush postings as segment _4h9 > numDocs=3798 | > org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34) > 2016-12-01 14:13:00,940 | INFO | commitScheduler-46-thread-1 | > [DWPT][commitScheduler-46-thread-1]: now abort | > org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34) > 
2016-12-01 14:13:00,940 | INFO | commitScheduler-46-thread-1 | > [DWPT][commitScheduler-46-thread-1]: done abort | > org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34) > 2016-12-01 14:13:00,940 | INFO | http-nio-21101-exec-20 | > [DW][http-nio-21101-exec-20]: done abort success=true | > org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34) > 2016-12-01 14:13:00,940 | INFO | commitScheduler-46-thread-1 | > [DW][commitScheduler-46-thread-1]: commitScheduler-46-thread-1 > finishFullFlush success=false | > org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34) > 2016-12-01 14:13:00,940 | INFO | http-nio-21101-exec-20 | > [IW][http-nio-21101-exec-20]: rollback: > infos=_4g7(6.2.0):C59169/23684:delGen=4 _4gq(6.2.0):C67474/11636:delGen=1 > _4gg(6.2.0):C64067/15664:delGen=2 _4gr(6.2.0):C13131 _4gs(6.2.0):C966 > _4gt(6.2.0):C4543 _4gu(6.2.0):C6960 _4gv(6.2.0):C2544 | > org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34) > 2016-12-01 14:13:00,940 | INFO | commitScheduler-46-thread-1 | > [IW][commitScheduler-46-thread-1]: hit exception during NRT reader | > org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34) > 2016-12-01 14:13:00,967 | INFO | http-nio-21101-exec-20 |
[jira] [Commented] (LUCENE-9317) Resolve package name conflicts for StandardAnalyzer to allow Java module system support
[ https://issues.apache.org/jira/browse/LUCENE-9317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17085063#comment-17085063 ] Tomoko Uchida commented on LUCENE-9317: --- Hi [~oobles], you can use an "@" mention when you need feedback from a specific person or people, to get their attention (and sorry, I don't seem to be the right one here). [~uschindler], would you give some feedback or thoughts? {quote} Apologies that the commit is difficult to review. I staged the changes and moves in one commit when I should have done it as moves then changes. Let me know if you'd rather I redo it. {quote} I'm not sure what the preferred way is, but you could make a small, incomplete patch/PR to describe your idea for review. Or I think a detailed design discussion (without a patch) would also be okay before touching the codebase, since the problem you picked up is not about the Java implementation but about the package/module structure. > Resolve package name conflicts for StandardAnalyzer to allow Java module > system support > --- > > Key: LUCENE-9317 > URL: https://issues.apache.org/jira/browse/LUCENE-9317 > Project: Lucene - Core > Issue Type: Improvement > Components: core/other >Affects Versions: master (9.0) >Reporter: David Ryan >Priority: Major > Labels: build, features > > > To allow Lucene to be modularised there are a few preparatory tasks to be > completed prior to this being possible. The Java module system requires that > jars do not use the same package name in different jars. The lucene-core and > lucene-analyzers-common both share the package > org.apache.lucene.analysis.standard.
> Possible resolutions to this issue are discussed by Uwe on the mailing list > here: > > [http://mail-archives.apache.org/mod_mbox/lucene-dev/202004.mbox/%3CCAM21Rt8FHOq_JeUSELhsQJH0uN0eKBgduBQX4fQKxbs49TLqzA%40mail.gmail.com%3E] > {quote}About StandardAnalyzer: Unfortunately I aggressively complained a > while back when Mike McCandless wanted to move standard analyzer out of the > analysis package into core (“for convenience”). This was a bad step, and IMHO > we should revert that or completely rename the packages and everything. The > problem here is: As the analysis services are only part of lucene-analyzers, > we had to leave the factory classes there, but move the implementation > classes in core. The package has to be the same. The only way around that is > to move the analysis factory framework also to core (I would not be against > that). This would include all factory base classes and the service loading > stuff. Then we can move standard analyzer and some of the filters/tokenizers > including their factories to core an that problem would be solved. > {quote} > There are two options here, either move factory framework into core or revert > StandardAnalyzer back to lucene-analyzers. In the email, the solution lands > on reverting back as per the task list: > {quote}Add some preparatory issues to cleanup class hierarchy: Move Analysis > SPI to core / remove StandardAnalyzer and related classes out of core back to > anaysis > {quote}
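The constraint driving LUCENE-9317 is mechanical: the Java module system rejects "split packages", i.e. two modules that both contain the same package. A hypothetical sketch of why the current layout cannot be modularised as-is (the module names here are illustrative, not proposed names):

```java
// Hypothetical module descriptors for today's jar layout.

// lucene-core's module-info.java: holds the StandardAnalyzer implementation classes.
module lucene.core {
    exports org.apache.lucene.analysis.standard;
}

// lucene-analyzers-common's module-info.java: holds the factory classes
// in the very same package.
module lucene.analyzers.common {
    exports org.apache.lucene.analysis.standard;
}

// The module system refuses this at compile/launch time: a package may be
// contained in at most one module on the module path, so either the factory
// framework must move into core, or StandardAnalyzer must move back out.
```

This makes the two options in the issue description concrete: both resolve the conflict by ensuring `org.apache.lucene.analysis.standard` lives entirely in one jar.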
[GitHub] [lucene-solr] madrob commented on a change in pull request #1432: SOLR-14404 CoreContainer level custom requesthandlers
madrob commented on a change in pull request #1432: SOLR-14404 CoreContainer level custom requesthandlers URL: https://github.com/apache/lucene-solr/pull/1432#discussion_r409694609 ## File path: solr/core/src/java/org/apache/solr/core/CoreContainer.java ## @@ -648,8 +655,8 @@ public void load() { pkiAuthenticationPlugin.initializeMetrics(solrMetricsContext, "/authentication/pki"); TracerConfigurator.loadTracer(loader, cfg.getTracerConfiguratorPluginInfo(), getZkController().getZkStateReader()); packageLoader = new PackageLoader(this); - containerHandlers.getApiBag().register(new AnnotatedApi(packageLoader.getPackageAPI().editAPI), Collections.EMPTY_MAP); - containerHandlers.getApiBag().register(new AnnotatedApi(packageLoader.getPackageAPI().readAPI), Collections.EMPTY_MAP); + containerHandlers.getApiBag().registerObject(packageLoader.getPackageAPI().editAPI); Review comment: There's a minor inconsistency here between edit/read and write/read; maybe we can standardize on one?
[GitHub] [lucene-solr] madrob commented on a change in pull request #1432: SOLR-14404 CoreContainer level custom requesthandlers
madrob commented on a change in pull request #1432: SOLR-14404 CoreContainer level custom requesthandlers URL: https://github.com/apache/lucene-solr/pull/1432#discussion_r409691926 ## File path: solr/solrj/src/java/org/apache/solr/client/solrj/impl/BaseHttpSolrClient.java ## @@ -62,6 +63,9 @@ public static RemoteExecutionException create(String host, NamedList errResponse if (errObj != null) { Number code = (Number) getObjectByPath(errObj, true, Collections.singletonList("code")); String msg = (String) getObjectByPath(errObj, true, Collections.singletonList("msg")); +if(msg == null) msg = ""; Review comment: There's already a null check later; what kinds of messages do we get here that are useful and don't leak too much internals?
[GitHub] [lucene-solr] madrob commented on a change in pull request #1432: SOLR-14404 CoreContainer level custom requesthandlers
madrob commented on a change in pull request #1432: SOLR-14404 CoreContainer level custom requesthandlers URL: https://github.com/apache/lucene-solr/pull/1432#discussion_r409685318 ## File path: solr/core/src/test-files/runtimecode/sig.txt ## @@ -69,6 +69,14 @@ openssl dgst -sha1 -sign ../cryptokeys/priv_key512.pem expressible.jar.bin | ope ZOT11arAiPmPZYOHzqodiNnxO9pRyRozWZEBX8XGjU1/HJptFnZK+DI7eXnUtbNaMcbXE2Ze8hh4M/eGyhY8BQ== +openssl dgst -sha1 -sign priv_key512.pem containerplugin.v.1.jar.bin | openssl enc -base64 | sed 's/+/%2B/g' | tr -d \\n | sed Review comment: Does this need to be `../cryptokeys/priv_key512.pem`?
[GitHub] [lucene-solr] madrob commented on a change in pull request #1432: SOLR-14404 CoreContainer level custom requesthandlers
madrob commented on a change in pull request #1432: SOLR-14404 CoreContainer level custom requesthandlers URL: https://github.com/apache/lucene-solr/pull/1432#discussion_r409690893 ## File path: solr/core/src/test/org/apache/solr/handler/admin/TestApiFramework.java ## @@ -199,6 +200,25 @@ public void testPayload() { } + public void testApiWrapper() { +Class klas = ApiWithConstructor.class; Review comment: It's unclear what this test is testing.
[GitHub] [lucene-solr] madrob commented on a change in pull request #1432: SOLR-14404 CoreContainer level custom requesthandlers
madrob commented on a change in pull request #1432: SOLR-14404 CoreContainer level custom requesthandlers URL: https://github.com/apache/lucene-solr/pull/1432#discussion_r409689441 ## File path: solr/core/src/test/org/apache/solr/handler/TestContainerPlugin.java ## @@ -0,0 +1,250 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.solr.handler; + +import java.io.IOException; +import java.util.List; +import java.util.concurrent.Callable; + +import com.google.common.collect.ImmutableMap; +import org.apache.solr.api.Command; +import org.apache.solr.api.EndPoint; +import org.apache.solr.client.solrj.SolrClient; +import org.apache.solr.client.solrj.SolrServerException; +import org.apache.solr.client.solrj.impl.BaseHttpSolrClient; +import org.apache.solr.client.solrj.request.V2Request; +import org.apache.solr.client.solrj.request.beans.Package; +import org.apache.solr.client.solrj.request.beans.PluginMeta; +import org.apache.solr.client.solrj.response.V2Response; +import org.apache.solr.cloud.MiniSolrCloudCluster; +import org.apache.solr.cloud.SolrCloudTestCase; +import org.apache.solr.common.NavigableObject; +import org.apache.solr.common.util.Utils; +import org.apache.solr.filestore.PackageStoreAPI; +import org.apache.solr.filestore.TestDistribPackageStore; +import org.apache.solr.pkg.TestPackages; +import org.apache.solr.request.SolrQueryRequest; +import org.apache.solr.response.SolrQueryResponse; +import org.apache.solr.security.PermissionNameProvider; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; + +import static java.util.Collections.singletonMap; +import static org.apache.solr.client.solrj.SolrRequest.METHOD.GET; +import static org.apache.solr.client.solrj.SolrRequest.METHOD.POST; +import static org.apache.solr.filestore.TestDistribPackageStore.readFile; +import static org.apache.solr.filestore.TestDistribPackageStore.uploadKey; +import static org.hamcrest.CoreMatchers.containsString; + +public class TestContainerPlugin extends SolrCloudTestCase { + + @Before + public void setup() { +System.setProperty("enable.packages", "true"); + } + + @After + public void teardown() { +System.clearProperty("enable.packages"); + } + + @Test + public void testApi() throws Exception { +MiniSolrCloudCluster cluster = Review comment: cluster setup and 
teardown can go in Before/After methods
[GitHub] [lucene-solr] madrob commented on a change in pull request #1432: SOLR-14404 CoreContainer level custom requesthandlers
madrob commented on a change in pull request #1432: SOLR-14404 CoreContainer level custom requesthandlers URL: https://github.com/apache/lucene-solr/pull/1432#discussion_r409693320 ## File path: solr/core/src/test/org/apache/solr/handler/TestContainerPlugin.java ## @@ -0,0 +1,250 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.solr.handler; + +import java.io.IOException; +import java.util.List; +import java.util.concurrent.Callable; + +import com.google.common.collect.ImmutableMap; +import org.apache.solr.api.Command; +import org.apache.solr.api.EndPoint; +import org.apache.solr.client.solrj.SolrClient; +import org.apache.solr.client.solrj.SolrServerException; +import org.apache.solr.client.solrj.impl.BaseHttpSolrClient; +import org.apache.solr.client.solrj.request.V2Request; +import org.apache.solr.client.solrj.request.beans.Package; +import org.apache.solr.client.solrj.request.beans.PluginMeta; +import org.apache.solr.client.solrj.response.V2Response; +import org.apache.solr.cloud.MiniSolrCloudCluster; +import org.apache.solr.cloud.SolrCloudTestCase; +import org.apache.solr.common.NavigableObject; +import org.apache.solr.common.util.Utils; +import org.apache.solr.filestore.PackageStoreAPI; +import org.apache.solr.filestore.TestDistribPackageStore; +import org.apache.solr.pkg.TestPackages; +import org.apache.solr.request.SolrQueryRequest; +import org.apache.solr.response.SolrQueryResponse; +import org.apache.solr.security.PermissionNameProvider; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; + +import static java.util.Collections.singletonMap; +import static org.apache.solr.client.solrj.SolrRequest.METHOD.GET; +import static org.apache.solr.client.solrj.SolrRequest.METHOD.POST; +import static org.apache.solr.filestore.TestDistribPackageStore.readFile; +import static org.apache.solr.filestore.TestDistribPackageStore.uploadKey; +import static org.hamcrest.CoreMatchers.containsString; + +public class TestContainerPlugin extends SolrCloudTestCase { + + @Before + public void setup() { +System.setProperty("enable.packages", "true"); + } + + @After + public void teardown() { +System.clearProperty("enable.packages"); + } + + @Test + public void testApi() throws Exception { +MiniSolrCloudCluster cluster = +configureCluster(4) 
+.withJettyConfig(jetty -> jetty.enableV2(true)) +.configure(); +String errPath = "/error/details[0]/errorMessages[0]"; +try { + PluginMeta plugin = new PluginMeta(); + plugin.name = "testplugin"; + plugin.klass = C2.class.getName(); + V2Request req = new V2Request.Builder("/cluster/plugin") + .forceV2(true) + .withMethod(POST) + .withPayload(singletonMap("add", plugin)) + .build(); + expectError(req, cluster.getSolrClient(), errPath, "Must have a no-arg constructor or CoreContainer constructor and it must not be a non static inner class"); + + plugin.klass = C1.class.getName(); + expectError(req, cluster.getSolrClient(), errPath, "Invalid class, no @EndPoint annotation"); + + plugin.klass = C3.class.getName(); + req.process(cluster.getSolrClient()); + + V2Response rsp = new V2Request.Builder("/cluster/plugin") + .forceV2(true) + .withMethod(GET) + .build() + .process(cluster.getSolrClient()); + assertEquals(C3.class.getName(), rsp._getStr("/plugin/testplugin/class", null)); + + TestDistribPackageStore.assertResponseValues(10, + () -> new V2Request.Builder("/plugin/my/plugin") + .forceV2(true) + .withMethod(GET) + .build().process(cluster.getSolrClient()), + ImmutableMap.of("/testkey", "testval")); + + new V2Request.Builder("/cluster/plugin") + .withMethod(POST) + .forceV2(true) + .withPayload("{remove : testplugin}") + .build() + .process(cluster.getSolrClient()); + + rsp = new V2Request.Builder("/cluster/plugin") + .forceV2(true) + .withMethod(GET) + .build() + .process(cluster.getSolrClient()); + assertEquals(null,
[GitHub] [lucene-solr] madrob commented on a change in pull request #1432: SOLR-14404 CoreContainer level custom requesthandlers
madrob commented on a change in pull request #1432: SOLR-14404 CoreContainer level custom requesthandlers URL: https://github.com/apache/lucene-solr/pull/1432#discussion_r409680029 ## File path: solr/core/src/java/org/apache/solr/handler/admin/ContainerPluginsApi.java ## @@ -0,0 +1,178 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.solr.handler.admin; + +import java.io.IOException; +import java.lang.invoke.MethodHandles; +import java.util.ArrayList; +import java.util.LinkedHashMap; +import java.util.List; +import java.util.Map; +import java.util.function.Function; +import java.util.function.Supplier; + +import org.apache.solr.api.AnnotatedApi; +import org.apache.solr.api.Command; +import org.apache.solr.api.CustomContainerPlugins; +import org.apache.solr.api.EndPoint; +import org.apache.solr.api.PayloadObj; +import org.apache.solr.client.solrj.SolrRequest.METHOD; +import org.apache.solr.client.solrj.request.beans.PluginMeta; +import org.apache.solr.common.cloud.SolrZkClient; +import org.apache.solr.common.cloud.ZkStateReader; +import org.apache.solr.common.util.Utils; +import org.apache.solr.core.CoreContainer; +import org.apache.solr.request.SolrQueryRequest; +import org.apache.solr.response.SolrQueryResponse; +import org.apache.solr.security.PermissionNameProvider; +import org.apache.zookeeper.KeeperException; +import org.apache.zookeeper.data.Stat; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import static org.apache.lucene.util.IOUtils.closeWhileHandlingException; + + +public class ContainerPluginsApi { + private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass()); + + public static final String PLUGIN = "plugin"; + private final Supplier zkClientSupplier; + private final CoreContainer coreContainer; + public final Read readAPI = new Read(); + public final Edit editAPI = new Edit(); + + public ContainerPluginsApi(CoreContainer coreContainer) { +this.zkClientSupplier = coreContainer.zkClientSupplier; +this.coreContainer = coreContainer; + } + + @EndPoint(method = METHOD.GET, + path = "/cluster/plugin", + permission = PermissionNameProvider.Name.COLL_READ_PERM) + public class Read { + +@Command +public void list(SolrQueryRequest req, SolrQueryResponse rsp) throws IOException { + rsp.add(PLUGIN, 
plugins(zkClientSupplier)); +} + } + + @EndPoint(method = METHOD.POST, + path = "/cluster/plugin", + permission = PermissionNameProvider.Name.COLL_EDIT_PERM) + public class Edit { + +@Command(name = "add") +public void add(SolrQueryRequest req, SolrQueryResponse rsp, PayloadObj payload) throws IOException { + PluginMeta info = payload.get(); + validateConfig(payload, info); + if(payload.hasError()) return; + persistPlugins(map -> { +if (map.containsKey(info.name)) { + payload.addError(info.name + " already exists"); + return null; +} +map.put(info.name, info); +return map; + }); +} + +@Command(name = "remove") +public void remove(SolrQueryRequest req, SolrQueryResponse rsp, PayloadObj payload) throws IOException { Review comment: Should this be METHOD.DELETE instead of a post? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] madrob commented on a change in pull request #1432: SOLR-14404 CoreContainer level custom requesthandlers
madrob commented on a change in pull request #1432: SOLR-14404 CoreContainer level custom requesthandlers URL: https://github.com/apache/lucene-solr/pull/1432#discussion_r409688568 ## File path: solr/core/src/test/org/apache/solr/handler/TestContainerPlugin.java ## @@ -0,0 +1,250 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.solr.handler; + +import java.io.IOException; +import java.util.List; +import java.util.concurrent.Callable; + +import com.google.common.collect.ImmutableMap; +import org.apache.solr.api.Command; +import org.apache.solr.api.EndPoint; +import org.apache.solr.client.solrj.SolrClient; +import org.apache.solr.client.solrj.SolrServerException; +import org.apache.solr.client.solrj.impl.BaseHttpSolrClient; +import org.apache.solr.client.solrj.request.V2Request; +import org.apache.solr.client.solrj.request.beans.Package; +import org.apache.solr.client.solrj.request.beans.PluginMeta; +import org.apache.solr.client.solrj.response.V2Response; +import org.apache.solr.cloud.MiniSolrCloudCluster; +import org.apache.solr.cloud.SolrCloudTestCase; +import org.apache.solr.common.NavigableObject; +import org.apache.solr.common.util.Utils; +import org.apache.solr.filestore.PackageStoreAPI; +import org.apache.solr.filestore.TestDistribPackageStore; +import org.apache.solr.pkg.TestPackages; +import org.apache.solr.request.SolrQueryRequest; +import org.apache.solr.response.SolrQueryResponse; +import org.apache.solr.security.PermissionNameProvider; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; + +import static java.util.Collections.singletonMap; +import static org.apache.solr.client.solrj.SolrRequest.METHOD.GET; +import static org.apache.solr.client.solrj.SolrRequest.METHOD.POST; +import static org.apache.solr.filestore.TestDistribPackageStore.readFile; +import static org.apache.solr.filestore.TestDistribPackageStore.uploadKey; +import static org.hamcrest.CoreMatchers.containsString; + +public class TestContainerPlugin extends SolrCloudTestCase { + + @Before + public void setup() { +System.setProperty("enable.packages", "true"); + } + + @After + public void teardown() { +System.clearProperty("enable.packages"); + } + + @Test + public void testApi() throws Exception { +MiniSolrCloudCluster cluster = +configureCluster(4) 
+.withJettyConfig(jetty -> jetty.enableV2(true)) +.configure(); +String errPath = "/error/details[0]/errorMessages[0]"; +try { + PluginMeta plugin = new PluginMeta(); + plugin.name = "testplugin"; + plugin.klass = C2.class.getName(); + V2Request req = new V2Request.Builder("/cluster/plugin") + .forceV2(true) + .withMethod(POST) + .withPayload(singletonMap("add", plugin)) + .build(); + expectError(req, cluster.getSolrClient(), errPath, "Must have a no-arg constructor or CoreContainer constructor and it must not be a non static inner class"); + + plugin.klass = C1.class.getName(); + expectError(req, cluster.getSolrClient(), errPath, "Invalid class, no @EndPoint annotation"); + + plugin.klass = C3.class.getName(); + req.process(cluster.getSolrClient()); + + V2Response rsp = new V2Request.Builder("/cluster/plugin") + .forceV2(true) + .withMethod(GET) + .build() + .process(cluster.getSolrClient()); + assertEquals(C3.class.getName(), rsp._getStr("/plugin/testplugin/class", null)); + + TestDistribPackageStore.assertResponseValues(10, + () -> new V2Request.Builder("/plugin/my/plugin") + .forceV2(true) + .withMethod(GET) + .build().process(cluster.getSolrClient()), + ImmutableMap.of("/testkey", "testval")); + + new V2Request.Builder("/cluster/plugin") + .withMethod(POST) + .forceV2(true) + .withPayload("{remove : testplugin}") + .build() + .process(cluster.getSolrClient()); + + rsp = new V2Request.Builder("/cluster/plugin") + .forceV2(true) + .withMethod(GET) + .build() + .process(cluster.getSolrClient()); + assertEquals(null,
[GitHub] [lucene-solr] madrob commented on a change in pull request #1432: SOLR-14404 CoreContainer level custom requesthandlers
madrob commented on a change in pull request #1432: SOLR-14404 CoreContainer level custom requesthandlers URL: https://github.com/apache/lucene-solr/pull/1432#discussion_r409676264 ## File path: solr/core/src/java/org/apache/solr/api/ApiBag.java ## @@ -134,6 +142,14 @@ static void registerIntrospect(List l, PathTrie registry, Map
[GitHub] [lucene-solr] madrob commented on a change in pull request #1432: SOLR-14404 CoreContainer level custom requesthandlers
madrob commented on a change in pull request #1432: SOLR-14404 CoreContainer level custom requesthandlers URL: https://github.com/apache/lucene-solr/pull/1432#discussion_r409684159 ## File path: solr/core/src/java/org/apache/solr/pkg/PackageListeners.java ## @@ -63,13 +63,13 @@ public synchronized void removeListener(Listener listener) { } synchronized void packagesUpdated(List pkgs) { -MDCLoggingContext.setCore(core); Review comment: Why do we need this check? setCore already handles nulls.
[GitHub] [lucene-solr] madrob commented on a change in pull request #1432: SOLR-14404 CoreContainer level custom requesthandlers
madrob commented on a change in pull request #1432: SOLR-14404 CoreContainer level custom requesthandlers URL: https://github.com/apache/lucene-solr/pull/1432#discussion_r409687566 ## File path: solr/core/src/test-files/runtimecode/MyPlugin.java ## @@ -0,0 +1,43 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.handler; + +import org.apache.solr.api.Command; +import org.apache.solr.api.EndPoint; +import org.apache.solr.client.solrj.SolrRequest.METHOD; +import org.apache.solr.core.CoreContainer; +import org.apache.solr.request.SolrQueryRequest; +import org.apache.solr.response.SolrQueryResponse; +import org.apache.solr.security.PermissionNameProvider; + +@EndPoint(path = "/plugin/my/path", +method = METHOD.GET, +permission = PermissionNameProvider.Name.CONFIG_READ_PERM) +public class MyPlugin { Review comment: I assume this is used to generate containerplugin.v.1.jar.bin and v.2? Should we have two source files for those? Can we do this some other way, besides checking in binaries to the repo? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
[jira] [Commented] (SOLR-14387) SolrClient.getById() does not escape parameter separators within ids
[ https://issues.apache.org/jira/browse/SOLR-14387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17085095#comment-17085095 ] ASF subversion and git services commented on SOLR-14387: Commit 74ecc13816fb6aae6e512e2e9d815459e235a120 in lucene-solr's branch refs/heads/master from Markus Schuch [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=74ecc13 ] SOLR-14387 add testcase for ids with separators to GetByIdTest and fix SolrClient to escape ids properly > SolrClient.getById() does not escape parameter separators within ids > > > Key: SOLR-14387 > URL: https://issues.apache.org/jira/browse/SOLR-14387 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrJ >Affects Versions: 8.5 >Reporter: Markus Schuch >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Having a solr document with a comma in its id (e.g. "A,B"), > {{SolrClient.getById()}} is not able to retrieve this document, because it > queries {{/get?ids=A,B}} instead of {{/get?ids=A\,B}}. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Resolved] (SOLR-14387) SolrClient.getById() does not escape parameter separators within ids
[ https://issues.apache.org/jira/browse/SOLR-14387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Drob resolved SOLR-14387. -- Fix Version/s: master (9.0) Assignee: Mike Drob Resolution: Fixed This is a good catch and a good fix! Thanks for the patch, I've pushed it to master. I'm a little bit concerned that folks might be improperly relying on the old behavior to get multiple ids thinking that it is a convenient shortcut, so I'm hesitant to put this in branch_8x. Let me know if you disagree! > SolrClient.getById() does not escape parameter separators within ids > > > Key: SOLR-14387 > URL: https://issues.apache.org/jira/browse/SOLR-14387 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrJ >Affects Versions: 8.5 >Reporter: Markus Schuch >Assignee: Mike Drob >Priority: Major > Fix For: master (9.0) > > Time Spent: 10m > Remaining Estimate: 0h > > Having a solr document with a comma in its id (e.g. "A,B"), > {{SolrClient.getById()}} is not able to retrieve this document, because it > queries {{/get?ids=A,B}} instead of {{/get?ids=A\,B}}. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
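[Editor's note] The bug above comes down to joining ids into one request parameter without escaping the separator. The following is a minimal, illustrative Java sketch of that idea — the class and method names are made up for this example and are not the actual SolrJ patch:

```java
import java.util.List;

public class IdEscapeSketch {
    // Hypothetical helper: escape backslashes first, then commas, so a
    // single id containing a comma is not split into several ids by /get.
    static String escapeId(String id) {
        return id.replace("\\", "\\\\").replace(",", "\\,");
    }

    // Join ids into the single "ids" parameter value, escaping each one.
    static String joinIds(List<String> ids) {
        StringBuilder sb = new StringBuilder();
        for (String id : ids) {
            if (sb.length() > 0) sb.append(',');
            sb.append(escapeId(id));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The single document id "A,B" stays one id after escaping.
        System.out.println(joinIds(List.of("A,B")));
        // Two distinct ids remain separated by an unescaped comma.
        System.out.println(joinIds(List.of("x", "y")));
    }
}
```

With this escaping, {{getById("A,B")}} would issue {{/get?ids=A\,B}} rather than {{/get?ids=A,B}}, matching the fix described in the commit.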
[GitHub] [lucene-solr] madrob closed pull request #1404: SOLR-14387: SolrClient.getById() does not escape parameter separators within ids
madrob closed pull request #1404: SOLR-14387: SolrClient.getById() does not escape parameter separators within ids URL: https://github.com/apache/lucene-solr/pull/1404
[GitHub] [lucene-solr] madrob commented on issue #1404: SOLR-14387: SolrClient.getById() does not escape parameter separators within ids
madrob commented on issue #1404: SOLR-14387: SolrClient.getById() does not escape parameter separators within ids URL: https://github.com/apache/lucene-solr/pull/1404#issuecomment-614775130 Fixed in 74ecc138
[jira] [Updated] (SOLR-14409) Existing violations allow bypassing policy rules when adding new replicas
[ https://issues.apache.org/jira/browse/SOLR-14409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrzej Bialecki updated SOLR-14409: Priority: Critical (was: Major) > Existing violations allow bypassing policy rules when adding new replicas > - > > Key: SOLR-14409 > URL: https://issues.apache.org/jira/browse/SOLR-14409 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: AutoScaling >Affects Versions: master (9.0), 8.5, 8.6 >Reporter: Andrzej Bialecki >Assignee: Andrzej Bialecki >Priority: Critical > Attachments: SOLR-14409.patch > > > Steps to reproduce: > * start with an empty cluster policy. > * create a collection with as many replicas as there are nodes. > * add one more replica to any node. Now this node has two replicas, all > other nodes have one. > * define the following cluster policy: > {code:java} > { 'set-cluster-policy': [ {'replica': '<2', 'shard': '#ANY', 'node': '#ANY', > 'strict': true} ] } {code} > This automatically creates a violation because of the existing layout. > * try adding one more replica. This should fail because no node satisfies > the rules (there must be at most 1 replica per node). However, the command > succeeds and adds replica to the node that already has 2 replicas, which > clearly violates the policy and makes matters even worse. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14409) Existing violations allow bypassing policy rules when adding new replicas
[ https://issues.apache.org/jira/browse/SOLR-14409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17085111#comment-17085111 ] Andrzej Bialecki commented on SOLR-14409: - Escalating to Critical as it breaks the autoscaling placement for common scenarios. > Existing violations allow bypassing policy rules when adding new replicas > - > > Key: SOLR-14409 > URL: https://issues.apache.org/jira/browse/SOLR-14409 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: AutoScaling >Affects Versions: master (9.0), 8.5, 8.6 >Reporter: Andrzej Bialecki >Assignee: Andrzej Bialecki >Priority: Critical > Attachments: SOLR-14409.patch > > > Steps to reproduce: > * start with an empty cluster policy. > * create a collection with as many replicas as there are nodes. > * add one more replica to any node. Now this node has two replicas, all > other nodes have one. > * define the following cluster policy: > {code:java} > { 'set-cluster-policy': [ {'replica': '<2', 'shard': '#ANY', 'node': '#ANY', > 'strict': true} ] } {code} > This automatically creates a violation because of the existing layout. > * try adding one more replica. This should fail because no node satisfies > the rules (there must be at most 1 replica per node). However, the command > succeeds and adds replica to the node that already has 2 replicas, which > clearly violates the policy and makes matters even worse. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
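[Editor's note] The expected (strict) behavior described in the reproduction steps can be made concrete with a toy placement check — names and structure below are illustrative only, not Solr's actual autoscaling code:

```java
import java.util.Map;
import java.util.Optional;

public class StrictPolicySketch {
    // Toy model of a strict "'replica': '<2', 'node': '#ANY'" style rule:
    // return a node that can accept one more replica without violating the
    // per-node maximum, or empty if no node qualifies (placement must fail).
    static Optional<String> pickNode(Map<String, Integer> replicasPerNode, int maxPerNode) {
        return replicasPerNode.entrySet().stream()
                .filter(e -> e.getValue() + 1 <= maxPerNode) // adding one must still satisfy the rule
                .min(Map.Entry.comparingByValue())           // prefer the least-loaded node
                .map(Map.Entry::getKey);
    }

    public static void main(String[] args) {
        // Scenario from the issue: one node already has 2 replicas, the
        // others have 1, and the strict rule allows at most 1 per node.
        Map<String, Integer> layout = Map.of("n1", 2, "n2", 1, "n3", 1);
        // No node can take another replica, so the command should fail —
        // the reported bug is that Solr instead adds to the worst node.
        System.out.println(pickNode(layout, 1).isPresent());
    }
}
```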
[jira] [Assigned] (LUCENE-9324) Give IDs to SegmentCommitInfo
[ https://issues.apache.org/jira/browse/LUCENE-9324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer reassigned LUCENE-9324: --- Assignee: Simon Willnauer > Give IDs to SegmentCommitInfo > - > > Key: LUCENE-9324 > URL: https://issues.apache.org/jira/browse/LUCENE-9324 > Project: Lucene - Core > Issue Type: New Feature >Reporter: Adrien Grand >Assignee: Simon Willnauer >Priority: Minor > Time Spent: 1h > Remaining Estimate: 0h > > We already have IDs in SegmentInfo, which are useful to uniquely identify > segments. Having IDs on SegmentCommitInfo would be useful too in order to > compare commits for equality and make snapshots incremental on generational > files too.
[jira] [Updated] (LUCENE-9324) Give IDs to SegmentCommitInfo
[ https://issues.apache.org/jira/browse/LUCENE-9324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-9324: Fix Version/s: 8.6 master (9.0) > Give IDs to SegmentCommitInfo > - > > Key: LUCENE-9324 > URL: https://issues.apache.org/jira/browse/LUCENE-9324 > Project: Lucene - Core > Issue Type: New Feature >Reporter: Adrien Grand >Assignee: Simon Willnauer >Priority: Minor > Fix For: master (9.0), 8.6 > > Time Spent: 1h > Remaining Estimate: 0h > > We already have IDs in SegmentInfo, which are useful to uniquely identify > segments. Having IDs on SegmentCommitInfo would be useful too in order to > compare commits for equality and make snapshots incremental on generational > files too.
[jira] [Commented] (SOLR-14275) Policy calculations are very slow for large clusters and large operations
[ https://issues.apache.org/jira/browse/SOLR-14275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17085191#comment-17085191 ] David Smiley commented on SOLR-14275: - [~noble.paul] I recall from a slack message a few weeks ago that you made _tremendous_ progress. I'm guessing that's the most recent PR which corresponds to the same timeframe. What's the next step? Known gotchas / TODOs / trade-offs? Does it need a code review? > Policy calculations are very slow for large clusters and large operations > - > > Key: SOLR-14275 > URL: https://issues.apache.org/jira/browse/SOLR-14275 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: AutoScaling >Affects Versions: 7.7.2, 8.4.1 >Reporter: Andrzej Bialecki >Assignee: Andrzej Bialecki >Priority: Major > Labels: scaling > Attachments: SOLR-14275.patch, scenario.txt > > Time Spent: 0.5h > Remaining Estimate: 0h > > Replica placement calculations performed during collection creation take > extremely long time (several minutes) when using a large cluster and creating > a large collection (eg. 1000 nodes, 500 shards, 4 replicas). > Profiling shows that most of the time is spent in > {{Row.computeCacheIfAbsent}}, which probably doesn't reuse this cache as much > as it should. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (SOLR-14410) Switch from SysV init script to systemd service definition
Marius Ghita created SOLR-14410: --- Summary: Switch from SysV init script to systemd service definition Key: SOLR-14410 URL: https://issues.apache.org/jira/browse/SOLR-14410 Project: Solr Issue Type: Improvement Security Level: Public (Default Security Level. Issues are Public) Reporter: Marius Ghita Attachments: solr.service The proposed change will incorporate the attached service definition file in the solr installation script. More information in the [dev mailing list thread|[http://mail-archives.apache.org/mod_mbox/lucene-dev/202004.mbox/%3ccafszzzxs+zh1mrscsjftyxn0kod_+6fjobxd9zhxt66fhaz...@mail.gmail.com%3e]] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14410) Switch from SysV init script to systemd service definition
[ https://issues.apache.org/jira/browse/SOLR-14410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marius Ghita updated SOLR-14410: Description: The proposed change will incorporate the attached service definition file in the solr installation script. More information in the [http://mail-archives.apache.org/mod_mbox/lucene-dev/202004.mbox/%3ccafszzzxs+zh1mrscsjftyxn0kod_+6fjobxd9zhxt66fhaz...@mail.gmail.com%3e] was: The proposed change will incorporate the attached service definition file in the solr installation script. More information in the [dev mailing list thread|[http://mail-archives.apache.org/mod_mbox/lucene-dev/202004.mbox/%3ccafszzzxs+zh1mrscsjftyxn0kod_+6fjobxd9zhxt66fhaz...@mail.gmail.com%3e]] > Switch from SysV init script to systemd service definition > -- > > Key: SOLR-14410 > URL: https://issues.apache.org/jira/browse/SOLR-14410 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Marius Ghita >Priority: Major > Attachments: solr.service > > > The proposed change will incorporate the attached service definition file in > the solr installation script. > > More information in the > [http://mail-archives.apache.org/mod_mbox/lucene-dev/202004.mbox/%3ccafszzzxs+zh1mrscsjftyxn0kod_+6fjobxd9zhxt66fhaz...@mail.gmail.com%3e] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14410) Switch from SysV init script to systemd service definition
[ https://issues.apache.org/jira/browse/SOLR-14410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marius Ghita updated SOLR-14410: Description: The proposed change will incorporate the attached service definition file in the solr installation script. More information on the mailinglist [http://mail-archives.apache.org/mod_mbox/lucene-dev/202004.mbox/%3ccafszzzxs+zh1mrscsjftyxn0kod_+6fjobxd9zhxt66fhaz...@mail.gmail.com%3e] was: The proposed change will incorporate the attached service definition file in the solr installation script. More information in the [http://mail-archives.apache.org/mod_mbox/lucene-dev/202004.mbox/%3ccafszzzxs+zh1mrscsjftyxn0kod_+6fjobxd9zhxt66fhaz...@mail.gmail.com%3e] > Switch from SysV init script to systemd service definition > -- > > Key: SOLR-14410 > URL: https://issues.apache.org/jira/browse/SOLR-14410 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Marius Ghita >Priority: Major > Attachments: solr.service > > > The proposed change will incorporate the attached service definition file in > the solr installation script. > > More information on the mailinglist > [http://mail-archives.apache.org/mod_mbox/lucene-dev/202004.mbox/%3ccafszzzxs+zh1mrscsjftyxn0kod_+6fjobxd9zhxt66fhaz...@mail.gmail.com%3e] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] jpountz commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo
jpountz commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo URL: https://github.com/apache/lucene-solr/pull/1434#discussion_r409804272 ## File path: lucene/core/src/java/org/apache/lucene/index/SegmentInfos.java ## @@ -374,7 +376,15 @@ public static final SegmentInfos readCommit(Directory directory, ChecksumIndexIn if (softDelCount + delCount > info.maxDoc()) { throw new CorruptIndexException("invalid deletion count: " + softDelCount + delCount + " vs maxDoc=" + info.maxDoc(), input); } - SegmentCommitInfo siPerCommit = new SegmentCommitInfo(info, delCount, softDelCount, delGen, fieldInfosGen, dvGen); + final byte[] sciId; + if (format > VERSION_74) { +sciId = new byte[StringHelper.ID_LENGTH]; +input.readBytes(sciId, 0, sciId.length); + } else { +sciId = infos.id; +// NOCOMMIT can we do this? it would at least give us consistent BWC but we can't identify the same SCI in different commits Review comment: When we introduced SegmentInfos#getId, we returned `null` as an ID for older segments. This is probably a safer option here as well, as callers can fall back to whatever behavior makes sense for them such as using a strong hash of the commit files as an ID, or re-downloading all files of the commit all the time and giving up incrementality? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] jpountz commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo
jpountz commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo URL: https://github.com/apache/lucene-solr/pull/1434#discussion_r409795498 ## File path: lucene/core/src/java/org/apache/lucene/index/SegmentCommitInfo.java ## @@ -79,8 +85,7 @@ /** * Sole constructor. - * - * @param info + * @param info Review comment: nit: ```suggestion * @param info ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13132) Improve JSON "terms" facet performance when sorted by relatedness
[ https://issues.apache.org/jira/browse/SOLR-13132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17085216#comment-17085216 ] Chris M. Hostetter commented on SOLR-13132: --- Michael: I'm still working my way through the latest changes, but i'm mainly surprised by this new {{fullDomainAccs}} concept (my prior suggestions were trying to reduce the number of impacts on FacetFieldProcessor API and it's impls that don't care about sweeping, this seems to have just "changed" the nature of impacts) ... {quote}The main other change that followed as a consequence of this work is that I think it's necessary (if we may now replace {{collectAcc}} with null when full domain collection can be accomplished via sweep count collection only) to explicitly separate the read-access to full-domain SlotAccs so that output can be handled appropriately. Formerly, collectAcc served for both collection _and_ read-access for such SlotAccs, but the 1:1 correspondence that made that work is no longer a valid assumption; and can't use the other existing references that get read for output ({{accs}}, {{otherAccs}}, {{deferredAggs}}, etc.) b/c they're handled quite differently (on a 1-slot, 1-bucket-at-a-time a la carte way). I could be missing something here, but in any event that's what explains the introduction of the new {{FacetFieldProcessor.fullDomainAccs}} field. {quote} Interesting ... yeah, i guess i hadn't really considered how to get values out of the {{SweepingAccs}} once we removed them from the {{collectAcc}}. 
It seems weird that the processors now have to keep track of a list of {{List fullDomainAccs}} to call {{setValues}} on later when the {{countAcc}} is already tracking those ~same~ (logical) {{SlotAcc}} instances in the form of {{SweepingAcc}} instances - so perhaps it would be cleaner/simpler if: * {{SweepingAcc}} defined a {{setValues}} method (same method sig as {{SlotAcc}}) that by default loops over all the "other" {{SweepingAccs}} it wraps (similar to how {{MultiAcc.setValues}} works. * {{CountAcc}} overrides {{SlotAcc.setValues()}} to look something like... {code:java} @Override public void setValues(SimpleOrderedMap bucket, int slotNum) throws IOException { super.setValues(bucket, slotNum); baseSweepingAcc.setValues(bucket, slotNum); } {code} ...? I'm not suggesting/requesting that you make this change right now ... i haven't thought it through enough to have any confidence if it's better/worse then what you've got at the moment ... just trying to talk it through and see WDYT since you're the most familiar with this code: do you think that would simplify the overall changes/impl and help keep the "sweeping" logic in only the classes that care/know about sweeping? > Improve JSON "terms" facet performance when sorted by relatedness > -- > > Key: SOLR-13132 > URL: https://issues.apache.org/jira/browse/SOLR-13132 > Project: Solr > Issue Type: Improvement > Components: Facet Module >Affects Versions: 7.4, master (9.0) >Reporter: Michael Gibney >Priority: Major > Attachments: SOLR-13132-with-cache-01.patch, > SOLR-13132-with-cache.patch, SOLR-13132.patch, SOLR-13132_testSweep.patch > > Time Spent: 1.5h > Remaining Estimate: 0h > > When sorting buckets by {{relatedness}}, JSON "terms" facet must calculate > {{relatedness}} for every term. 
> The current implementation uses a standard uninverted approach (either > {{docValues}} or {{UnInvertedField}}) to get facet counts over the domain > base docSet, and then uses that initial pass as a pre-filter for a > second-pass, inverted approach of fetching docSets for each relevant term > (i.e., {{count > minCount}}?) and calculating intersection size of those sets > with the domain base docSet. > Over high-cardinality fields, the overhead of per-term docSet creation and > set intersection operations increases request latency to the point where > relatedness sort may not be usable in practice (for my use case, even after > applying the patch for SOLR-13108, for a field with ~220k unique terms per > core, QTime for high-cardinality domain docSets were, e.g.: cardinality > 1816684=9000ms, cardinality 5032902=18000ms). > The attached patch brings the above example QTimes down to a manageable > ~300ms and ~250ms respectively. The approach calculates uninverted facet > counts over domain base, foreground, and background docSets in parallel in a > single pass. This allows us to take advantage of the efficiencies built into > the standard uninverted {{FacetFieldProcessorByArray[DV|UIF]}}), and avoids > the per-term docSet creation and set intersection overhead. -- This message was sent by Atlassian Jira (v8.3.4#803005) ---
[GitHub] [lucene-solr] s1monw commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo
s1monw commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo URL: https://github.com/apache/lucene-solr/pull/1434#discussion_r409812796 ## File path: lucene/core/src/java/org/apache/lucene/index/SegmentInfos.java ## @@ -374,7 +376,15 @@ public static final SegmentInfos readCommit(Directory directory, ChecksumIndexIn if (softDelCount + delCount > info.maxDoc()) { throw new CorruptIndexException("invalid deletion count: " + softDelCount + delCount + " vs maxDoc=" + info.maxDoc(), input); } - SegmentCommitInfo siPerCommit = new SegmentCommitInfo(info, delCount, softDelCount, delGen, fieldInfosGen, dvGen); + final byte[] sciId; + if (format > VERSION_74) { +sciId = new byte[StringHelper.ID_LENGTH]; +input.readBytes(sciId, 0, sciId.length); + } else { +sciId = infos.id; +// NOCOMMIT can we do this? it would at least give us consistent BWC but we can't identify the same SCI in different commits Review comment: I think this case is a bit more complicated due to the changing nature of this ID. For each change (DV, llive docs, fields) we need to move to a new ID. Should we then just accept null and create a new ID once it changes or should we stick with `null` on these segments until they are written first time? Introducing `null` requires quite some changes in how we handle this which we can do, for sure. I still wonder if we can get away with stealing the _parent_ ID and have a smooth upgrade path. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
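[Editor's note] The trade-off under discussion — a version-gated read with a null fallback for commits written before the per-SegmentCommitInfo id existed — can be reduced to a small sketch. The constants and method below are illustrative stand-ins, not the real SegmentInfos code (the actual format constant values differ):

```java
import java.io.DataInput;
import java.io.IOException;

public class SciIdReadSketch {
    static final int ID_LENGTH = 16;  // mirrors StringHelper.ID_LENGTH
    static final int VERSION_74 = 9;  // placeholder format constant, illustrative only

    // Commits written by a new enough format carry a per-SegmentCommitInfo
    // id; for older commits, return null (jpountz's suggestion) so callers
    // can fall back — e.g. hash the commit files, or re-copy everything and
    // give up incrementality.
    static byte[] readSciId(DataInput input, int format) throws IOException {
        if (format > VERSION_74) {
            byte[] id = new byte[ID_LENGTH];
            input.readFully(id);
            return id;
        }
        return null; // pre-existing segment commit: no id available
    }
}
```

The null route keeps old and new commits distinguishable; reusing the parent SegmentInfo id (the alternative s1monw raises) would make two different generations of the same segment look identical.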
[jira] [Updated] (LUCENE-9300) Index corruption with doc values updates and addIndexes
[ https://issues.apache.org/jira/browse/LUCENE-9300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Ferenczi updated LUCENE-9300: - Fix Version/s: 7.7.3 > Index corruption with doc values updates and addIndexes > --- > > Key: LUCENE-9300 > URL: https://issues.apache.org/jira/browse/LUCENE-9300 > Project: Lucene - Core > Issue Type: Bug >Reporter: Jim Ferenczi >Priority: Major > Fix For: master (9.0), 7.7.3, 8.6, 8.5.1 > > Time Spent: 4h 10m > Remaining Estimate: 0h > > Today a doc values update creates a new field infos file that contains the > original field infos updated for the new generation as well as the new fields > created by the doc values update. > However existing fields are cloned through the global fields (shared in the > index writer) instead of the local ones (present in the segment). In practice > this is not an issue since field numbers are shared between segments created > by the same index writer. But this assumption doesn't hold for segments > created by different writers and added through > IndexWriter#addIndexes(Directory). In this case, the field number of the same > field can differ between segments so any doc values update can corrupt the > index by assigning the wrong field number to an existing field in the next > generation. > When this happens, queries and merges can access wrong fields without > throwing any error, leading to a silent corruption in the index. > > Since segments are not guaranteed to have the same field number consistently > we should ensure that doc values update preserves the segment's field number > when rewriting field infos.
[jira] [Commented] (LUCENE-9300) Index corruption with doc values updates and addIndexes
[ https://issues.apache.org/jira/browse/LUCENE-9300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17085233#comment-17085233 ] ASF subversion and git services commented on LUCENE-9300: - Commit dab53a8089f78d860f6e046d3676b9a04131addf in lucene-solr's branch refs/heads/branch_7_7 from Jim Ferenczi [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=dab53a8 ] LUCENE-9300: Fix field infos update on doc values update (#1394) Today a doc values update creates a new field infos file that contains the original field infos updated for the new generation as well as the new fields created by the doc values update. However existing fields are cloned through the global fields (shared in the index writer) instead of the local ones (present in the segment). In practice this is not an issue since field numbers are shared between segments created by the same index writer. But this assumption doesn't hold for segments created by different writers and added through IndexWriter#addIndexes(Directory). In this case, the field number of the same field can differ between segments so any doc values update can corrupt the index by assigning the wrong field number to an existing field in the next generation. When this happens, queries and merges can access wrong fields without throwing any error, leading to a silent corruption in the index. This change ensures that we preserve local field numbers when creating a new field infos generation. > Index corruption with doc values updates and addIndexes > --- > > Key: LUCENE-9300 > URL: https://issues.apache.org/jira/browse/LUCENE-9300 > Project: Lucene - Core > Issue Type: Bug >Reporter: Jim Ferenczi >Priority: Major > Fix For: master (9.0), 8.6, 8.5.1 > > Time Spent: 4h 10m > Remaining Estimate: 0h > > Today a doc values update creates a new field infos file that contains the > original field infos updated for the new generation as well as the new fields > created by the doc values update. 
> However existing fields are cloned through the global fields (shared in the > index writer) instead of the local ones (present in the segment). In practice > this is not an issue since field numbers are shared between segments created > by the same index writer. But this assumption doesn't hold for segments > created by different writers and added through > IndexWriter#addIndexes(Directory). In this case, the field number of the same > field can differ between segments so any doc values update can corrupt the > index by assigning the wrong field number to an existing field in the next > generation. > When this happens, queries and merges can access wrong fields without > throwing any error, leading to a silent corruption in the index. > > Since segments are not guaranteed to have the same field number consistently > we should ensure that doc values update preserves the segment's field number > when rewriting field infos.
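The global-versus-local field number mismatch behind LUCENE-9300 can be shown with a toy model. This is not Lucene code: the maps below stand in for per-segment FieldInfos, and all names are hypothetical.

```java
import java.util.Map;

public class FieldNumberMismatch {
    // Toy per-segment schemas: field name -> field number, assigned in the
    // order each writer first saw the field. Segments built by independent
    // writers (later combined via addIndexes) can disagree on the numbers.
    static final Map<String, Integer> SEGMENT_A = Map.of("id", 0, "price", 1);
    static final Map<String, Integer> SEGMENT_B = Map.of("price", 0, "id", 1);

    /** Resolve a field number using the segment's own (local) mapping. */
    static int localNumber(Map<String, Integer> segment, String field) {
        return segment.get(field);
    }

    public static void main(String[] args) {
        // A doc-values update on "price" must use each segment's local number.
        // Reusing segment A's number (1) inside segment B would silently write
        // the update under segment B's "id" field — the corruption described above.
        System.out.println(localNumber(SEGMENT_A, "price")); // 1
        System.out.println(localNumber(SEGMENT_B, "price")); // 0
    }
}
```

The fix in the commit message amounts to always resolving through the local mapping when rewriting a field infos generation.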
[GitHub] [lucene-solr] s1monw commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo
s1monw commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo URL: https://github.com/apache/lucene-solr/pull/1434#discussion_r409839154

File path: lucene/core/src/java/org/apache/lucene/index/SegmentInfos.java (same hunk as above)

Review comment: I pushed a new commit https://github.com/apache/lucene-solr/pull/1434/commits/f0a72f82bb17bd2582799aa25514ef764e012570 to address this.
[GitHub] [lucene-solr] mhitza opened a new pull request #1435: SOLR-14410: Switch from SysV init script to systemd service file
mhitza opened a new pull request #1435: SOLR-14410: Switch from SysV init script to systemd service file URL: https://github.com/apache/lucene-solr/pull/1435

# Description
Remove the init.d/solr SysV init script and use a systemd service file instead.

# Solution
I've tried to diverge as little as possible from the way the installation script used to work.

# Checklist
Please review the following and check all that apply:
- [x] I have reviewed the guidelines for [How to Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms to the standards described there to the best of my ability.
- [x] I have created a Jira issue and added the issue ID to my pull request title.
- [x] I have given Solr maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended)
- [x] I have developed this patch against the `master` branch.
- [ ] I have run `ant precommit` and the appropriate test suite.
- [ ] I have added tests for my changes.
- [ ] I have added documentation for the [Ref Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) (for Solr changes only).
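For readers unfamiliar with the target format, a systemd unit replacing a SysV init script for Solr might look roughly like the sketch below. This is an illustrative sketch only, not the unit shipped by the PR: the paths, user, and environment file location are assumptions.

```ini
# /etc/systemd/system/solr.service — illustrative sketch; paths and user are assumptions
[Unit]
Description=Apache Solr
After=network.target

[Service]
Type=forking
User=solr
EnvironmentFile=/etc/default/solr.in.sh
ExecStart=/opt/solr/bin/solr start
ExecStop=/opt/solr/bin/solr stop
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

Unlike an init script, the unit is declarative: start/stop ordering, restart policy, and the run-as user are handled by systemd rather than shell code.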
[GitHub] [lucene-solr] jpountz commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo
jpountz commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo URL: https://github.com/apache/lucene-solr/pull/1434#discussion_r409844046

File path: lucene/core/src/java/org/apache/lucene/index/SegmentInfos.java (same hunk as above)

Review comment: I don't understand the idea of stealing the parent ID, wouldn't it cause Lucene to consider commits equal when they are not, which would be a much worse problem than considering commits different when they are equal?
[GitHub] [lucene-solr] s1monw commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo
s1monw commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo URL: https://github.com/apache/lucene-solr/pull/1434#discussion_r409851521

File path: lucene/core/src/java/org/apache/lucene/index/SegmentInfos.java (same hunk as above)

Review comment:
> I don't understand the idea of stealing the parent ID, wouldn't it cause Lucene to consider commits equal when they are not, which would be a much worse problem than considering commits different when they are equal?

The idea was to use the `SegmentInfos` ID, which is different for every commit, as a default. It would not cause Lucene to consider commits equal when they are not. I just moved to using null instead; it was an idea that has downsides too, so we can just go with null. The real question is when do we assign an ID then? Once we write the SCI again, even if it didn't change? I think we should, but that would then bring back the same problem with the fallback.
[GitHub] [lucene-solr] jpountz commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo
jpountz commented on a change in pull request #1434: LUCENE-9324: Add an ID to SegmentCommitInfo URL: https://github.com/apache/lucene-solr/pull/1434#discussion_r409866462

File path: lucene/core/src/java/org/apache/lucene/index/SegmentInfos.java (same hunk as above)

Review comment: Ah sorry, I got confused because I thought that "parent" was referring to SegmentInfo (no s) instead of SegmentInfos, but I agree that SegmentInfos is not great either.
[GitHub] [lucene-solr] janhoy commented on a change in pull request #1435: SOLR-14410: Switch from SysV init script to systemd service file
janhoy commented on a change in pull request #1435: SOLR-14410: Switch from SysV init script to systemd service file URL: https://github.com/apache/lucene-solr/pull/1435#discussion_r409880688 ## File path: solr/solr-ref-guide/src/taking-solr-to-production.adoc ## @@ -365,7 +373,7 @@ There is another issue once the heap reaches 32GB. Below 32GB, Java is able to u Because of the potential garbage collection issues and the particular issues that happen at 32GB, if a single instance would require a 64GB heap, performance is likely to improve greatly if the machine is set up with two nodes that each have a 31GB heap. -If your use case requires multiple instances, at a minimum you will need unique Solr home directories for each node you want to run; ideally, each home should be on a different physical disk so that multiple Solr nodes don’t have to compete with each other when accessing files on disk. Having different Solr home directories implies that you’ll need a different include file for each node. Moreover, if using the `/etc/init.d/solr` script to control Solr as a service, then you’ll need a separate script for each node. The easiest approach is to use the service installation script to add multiple services on the same host, such as: +If your use case requires multiple instances, at a minimum you will need unique Solr home directories for each node you want to run; ideally, each home should be on a different physical disk so that multiple Solr nodes don’t have to compete with each other when accessing files on disk. Having different Solr home directories implies that you’ll need a different include file for each node. Moreover, if using the `/etc/systemd/system/solr.service` script to control Solr, then you’ll need a separate service for each node. The easiest approach is to use the service installation script to add multiple services on the same host, such as: Review comment: Moreover, if using systemctl to control Solr... would be a better wording? 
[GitHub] [lucene-solr] mayya-sharipova commented on issue #1351: LUCENE-9280: Collectors to skip noncompetitive documents
mayya-sharipova commented on issue #1351: LUCENE-9280: Collectors to skip noncompetitive documents URL: https://github.com/apache/lucene-solr/pull/1351#issuecomment-614931785

I have run another round of benchmarks, this time comparing the performance of this PR vs master, as we don't need any special sort field. [Here](https://github.com/mayya-sharipova/luceneutil/commit/c3166e4fc44e7fcddcd1672112c96364d9f464e5) are the changes made to luceneutil.

**wikimedium10m**
```
Task                    QPS baseline  StdDev   QPS patch  StdDev    Pct diff
HighTermDayOfYearSort   50.93         (5.6%)   49.31      (10.9%)   -3.2% ( -18% -  14%)
TermDTSort              83.37         (5.9%)   129.95     (41.2%)   55.9% (   8% - 109%)
WARNING: cat=HighTermDayOfYearSort: hit counts differ: 541957 vs 541957+
WARNING: cat=TermDTSort: hit counts differ: 506054 vs 1861+
```

Here we have two sorts:
- Int sort on day of year. Slight decrease in performance: -3.2%. There was an attempt to do the optimization, but the optimization was never actually run, because [estimatedNumberOfMatches](https://github.com/apache/lucene-solr/pull/1351/files#diff-aff67e212aa0edd675ec31c068cb642bR268) was not selective enough each time. The reason is that the data here is a day of the year in the range [1, 366], and all segments contain varied values throughout.
- Long sort on a date field (msecSinceEpoch). Speedup: 55.9%.
[GitHub] [lucene-solr] mayya-sharipova edited a comment on issue #1351: LUCENE-9280: Collectors to skip noncompetitive documents
mayya-sharipova edited a comment on issue #1351: LUCENE-9280: Collectors to skip noncompetitive documents URL: https://github.com/apache/lucene-solr/pull/1351#issuecomment-614931785

I have run another round of benchmarks, this time comparing the performance of this PR vs master, as we don't need any special sort field. [Here](https://github.com/mayya-sharipova/luceneutil/commit/c3166e4fc44e7fcddcd1672112c96364d9f464e5) are the changes made to luceneutil.

**wikimedium10m**: 10 million docs
```
Task                    QPS baseline  StdDev   QPS patch  StdDev    Pct diff
HighTermDayOfYearSort   50.93         (5.6%)   49.31      (10.9%)   -3.2% ( -18% -  14%)
TermDTSort              83.37         (5.9%)   129.95     (41.2%)   55.9% (   8% - 109%)
WARNING: cat=HighTermDayOfYearSort: hit counts differ: 541957 vs 541957+
WARNING: cat=TermDTSort: hit counts differ: 506054 vs 1861+
```

**wikimediumall**: about 33 million docs
```
Task                    QPS baseline  StdDev   QPS patch  StdDev    Pct diff
HighTermDayOfYearSort   23.37         (4.4%)   21.76      (8.8%)    -6.9% (  -19% -   6%)
TermDTSort              31.86         (3.5%)   108.33     (49.6%)   240.0% ( 180% - 303%)
WARNING: cat=HighTermDayOfYearSort: hit counts differ: 1275574 vs 1275574+
WARNING: cat=TermDTSort: hit counts differ: 1474717 vs 1070+
```

Here we have two sorts:
- Int sort on day of year. Slight decrease in performance: **-6.9% – -3.2%**. There was an attempt to do the optimization, but the optimization was never actually run, because [estimatedNumberOfMatches](https://github.com/apache/lucene-solr/pull/1351/files#diff-aff67e212aa0edd675ec31c068cb642bR268) was not selective enough each time. The reason is that the data here is a day of the year in the range [1, 366], and all segments contain varied values throughout, so this data is not really a target for optimization.
- Long sort on a date field (msecSinceEpoch). Speedups: **55.9% – 240.0%**.
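The core idea being benchmarked — stop examining documents whose sort value cannot beat the current bottom of the top-k queue — can be shown with a toy top-k routine. This is only a sketch of the competitive-bottom cutoff; the actual PR pushes that cutoff down into Lucene's comparators and index structures so whole ranges of documents are never visited at all.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.PriorityQueue;

public class SkipNoncompetitive {
    /**
     * Top-k smallest values with a "competitive bottom" cutoff: once the
     * queue holds k entries, any candidate >= the current bottom cannot
     * enter the top k and is skipped without further work.
     */
    static long[] topK(long[] values, int k) {
        // max-heap holding the k smallest values seen so far; peek() is the bottom
        PriorityQueue<Long> heap = new PriorityQueue<>(Comparator.reverseOrder());
        for (long v : values) {
            if (heap.size() == k && v >= heap.peek()) continue; // noncompetitive: skip
            heap.offer(v);
            if (heap.size() > k) heap.poll(); // evict the old bottom
        }
        long[] out = new long[heap.size()];
        for (int i = out.length - 1; i >= 0; i--) out[i] = heap.poll(); // ascending
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(topK(new long[]{9, 3, 7, 1, 8, 2}, 3))); // [1, 2, 3]
    }
}
```

The benchmark pattern above matches this shape: on fields where the cutoff quickly becomes selective (msecSinceEpoch), most candidates are skipped and QPS jumps; on low-cardinality fields (day of year), the cutoff never becomes selective and the bookkeeping is pure overhead.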
[GitHub] [lucene-solr] mayya-sharipova commented on issue #1351: LUCENE-9280: Collectors to skip noncompetitive documents
mayya-sharipova commented on issue #1351: LUCENE-9280: Collectors to skip noncompetitive documents URL: https://github.com/apache/lucene-solr/pull/1351#issuecomment-614988778 @msokolov @jimczi @jpountz I was wondering if you have any additional comments on this change?
[jira] [Updated] (SOLR-14291) OldAnalyticsRequestConverter should support fields names with dots
[ https://issues.apache.org/jira/browse/SOLR-14291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Khludnev updated SOLR-14291: Fix Version/s: 8.6 Resolution: Fixed Status: Resolved (was: Patch Available) > OldAnalyticsRequestConverter should support fields names with dots > -- > > Key: SOLR-14291 > URL: https://issues.apache.org/jira/browse/SOLR-14291 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: search, SearchComponents - other >Reporter: Anatolii Siuniaev >Assignee: Mikhail Khludnev >Priority: Trivial > Fix For: 8.6 > > Attachments: SOLR-14291.patch, SOLR-14291.patch, SOLR-14291.patch > > > If you send a query with range facets using old olap-style syntax (see pdf > [here|https://issues.apache.org/jira/browse/SOLR-5302]), > OldAnalyticsRequestConverter just silently (no exception thrown) omits > parameters like > {code:java} > olap..rangefacet..start > {code} > in case if __ has dots inside (for instance field name is > _Project.Value_). And thus no range facets are returned in response. > Probably the same happens in case of field faceting.
[jira] [Commented] (SOLR-14291) OldAnalyticsRequestConverter should support fields names with dots
[ https://issues.apache.org/jira/browse/SOLR-14291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17085462#comment-17085462 ] Mikhail Khludnev commented on SOLR-14291: - Thanks, [~houston] and [~anatolii_siuniaev]! > OldAnalyticsRequestConverter should support fields names with dots > -- > > Key: SOLR-14291 > URL: https://issues.apache.org/jira/browse/SOLR-14291 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: search, SearchComponents - other >Reporter: Anatolii Siuniaev >Assignee: Mikhail Khludnev >Priority: Trivial > Fix For: 8.6 > > Attachments: SOLR-14291.patch, SOLR-14291.patch, SOLR-14291.patch > > > If you send a query with range facets using old olap-style syntax (see pdf > [here|https://issues.apache.org/jira/browse/SOLR-5302]), > OldAnalyticsRequestConverter just silently (no exception thrown) omits > parameters like > {code:java} > olap..rangefacet..start > {code} > in case if __ has dots inside (for instance field name is > _Project.Value_). And thus no range facets are returned in response. > Probably the same happens in case of field faceting.