[jira] [Created] (LUCENE-9011) Updating breaks backward compatibility by throwing IndexFormatTooOldException in some cases
xia0c created LUCENE-9011:
-----------------------------

             Summary: Updating breaks backward compatibility by throwing IndexFormatTooOldException in some cases
                 Key: LUCENE-9011
                 URL: https://issues.apache.org/jira/browse/LUCENE-9011
             Project: Lucene - Core
          Issue Type: Bug
          Components: core/FSTs
    Affects Versions: 7.7.1
            Reporter: xia0c

When I try to update Lucene from 7.7.1 to the latest version, 8.2.0, the following code:

{code:java}
@Test
public void test() throws IOException {
    String fstFileName = "fst/slovaklemma_ascii.fst";
    File fstFile = new File(fstFileName);
    // FST.read declares IOException; this is the call that fails after the upgrade
    FST fst = FST.read(fstFile.toPath(), CharSequenceOutputs.getSingleton());
}
{code}

throws an IndexFormatTooOldException:

{code:java}
org.apache.lucene.index.IndexFormatTooOldException: Format version is not supported (resource org.apache.lucene.store.InputStreamDataInput@69d9c55): 4 (needs to be between 6 and 6). This version of Lucene only supports indexes created with release 6.0 and later.
	at org.apache.lucene.codecs.CodecUtil.checkHeaderNoMagic(CodecUtil.java:213)
	at org.apache.lucene.codecs.CodecUtil.checkHeader(CodecUtil.java:198)
	at org.apache.lucene.util.fst.FST.<init>(FST.java:275)
	at org.apache.lucene.util.fst.FST.<init>(FST.java:263)
	at org.apache.lucene.util.fst.FST.read(FST.java:487)
{code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13846) PreemptiveBasicAuthClientBuilderFactory use of static code blocks makes it unreliable in tests
[ https://issues.apache.org/jira/browse/SOLR-13846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953284#comment-16953284 ]

ASF subversion and git services commented on SOLR-13846:
--------------------------------------------------------

Commit 25968e3b75e5e9a4f2a64de10500aae10a257bdd in lucene-solr's branch refs/heads/branch_8_3 from Chris M. Hostetter
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=25968e3 ]

SOLR-13846: workaround - elliminate use of problematic PreemptiveBasicAuthClientBuilderFactory in tests that don't need it

(cherry picked from commit 939b3364e604a4a16b3c4c5f278c4d7f30f1354b)

> PreemptiveBasicAuthClientBuilderFactory use of static code blocks makes it unreliable in tests
> ----------------------------------------------------------------------------------------------
>
>                 Key: SOLR-13846
>                 URL: https://issues.apache.org/jira/browse/SOLR-13846
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public)
>            Reporter: Chris M. Hostetter
>            Priority: Major
>
> PreemptiveBasicAuthClientBuilderFactory uses static code blocks to initialize global static variables in a way that makes it largely unusable in tests.
> Notably: it uses {{System.getProperty(...)}} during classloading to read system properties that then affect the behavior of all future instances -- even if an individual test explicitly sets the system property in question before instantiating instances of this class.
> This means that if two tests that both use instances of PreemptiveBasicAuthClientBuilderFactory run in the same JVM, only the system properties set in the first test will be used by PreemptiveBasicAuthClientBuilderFactory in the *second* test (even though the system properties get reset by the test framework between runs).
> There are currently two tests using PreemptiveBasicAuthClientBuilderFactory, and depending on the order they run, one will fail...
> {noformat}
> $ ant test -Dtests.jvms=1 '-Dtests.class=*.TestQueryingOnDownCollection|*.BasicAuthOnSingleNodeTest' -Dtests.seed=EC8FB67A91689F48 -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=sl -Dtests.timezone=Asia/Baghdad -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
> ...
>    [junit4]   2> NOTE: reproduce with: ant test -Dtestcase=BasicAuthOnSingleNodeTest -Dtests.method=basicTest -Dtests.seed=EC8FB67A91689F48 -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=sl -Dtests.timezone=Asia/Baghdad -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
>    [junit4] ERROR   4.05s | BasicAuthOnSingleNodeTest.basicTest <<<
>    [junit4]    > Throwable #1: org.apache.solr.client.solrj.impl.BaseHttpSolrClient$RemoteSolrException: Error from server at http://127.0.0.1:37047/solr: Expected mime type application/octet-stream but got text/html.
>    [junit4]    > Error 401 Bad credentials
>    [junit4]    > HTTP ERROR 401
>    [junit4]    > Problem accessing /solr/authCollection/select. Reason:
>    [junit4]    > Bad credentials
>    [junit4]    > Powered by Jetty:// 9.4.19.v20190610
>    [junit4]    > 	at __randomizedtesting.SeedInfo.seed([EC8FB67A91689F48:1E7BA118D5CD927B]:0)
>    [junit4]    > 	at org.apache.solr.client.solrj.impl.Http2SolrClient.processErrorsAndResponse(Http2SolrClient.java:696)
>    [junit4]    > 	at org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:402)
>    [junit4]    > 	at org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:754)
>    [junit4]    > 	at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:207)
>    [junit4]    > 	at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:1003)
>    [junit4]    > 	at org.apache.solr.security.BasicAuthOnSingleNodeTest.basicTest(BasicAuthOnSingleNodeTest.java:72)
>    [junit4]    > 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>    [junit4]    > 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>    [junit4]    > 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>    [junit4]    > 	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>    [junit4]    > 	at java.base/java.lang.Thread.run(Thread.java:834)
> {noformat}
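For readers unfamiliar with the pitfall described in SOLR-13846, the following is a minimal, self-contained sketch of it. It does not use the real PreemptiveBasicAuthClientBuilderFactory: the classes {{EagerFactory}} and {{LazyFactory}} and the property name {{demo.auth.user}} are hypothetical stand-ins, chosen only to show why a value read in a static initializer ignores later {{System.setProperty}} calls in the same JVM.

```java
final class StaticInitDemo {

    /** Hypothetical stand-in: reads the property once, when the class is initialized. */
    static final class EagerFactory {
        // Evaluated exactly once per JVM (per class loader), on first use of the class.
        private static final String USER = System.getProperty("demo.auth.user", "none");

        String user() {
            return USER;
        }
    }

    /** Hypothetical alternative: reads the property each time an instance is created. */
    static final class LazyFactory {
        private final String user = System.getProperty("demo.auth.user", "none");

        String user() {
            return user;
        }
    }

    public static void main(String[] args) {
        // "First test" sets its credentials, then uses both factories.
        System.setProperty("demo.auth.user", "first");
        System.out.println("eager=" + new EagerFactory().user() + " lazy=" + new LazyFactory().user());

        // "Second test" sets different credentials in the same JVM.
        System.setProperty("demo.auth.user", "second");
        // The eager factory still reports the value captured at class initialization.
        System.out.println("eager=" + new EagerFactory().user() + " lazy=" + new LazyFactory().user());
    }
}
```

In this sketch the second line printed shows the eager factory still holding the first value while the lazy one picks up the new one, which mirrors the order-dependent test failure reported above. Whether the actual fix in Solr takes this per-instance approach is not stated in this thread.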
[jira] [Resolved] (SOLR-13741) AuditLoggerIntegrationTest hardening
[ https://issues.apache.org/jira/browse/SOLR-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris M. Hostetter resolved SOLR-13741.
---------------------------------------
    Resolution: Fixed

> AuditLoggerIntegrationTest hardening
> ------------------------------------
>
>                 Key: SOLR-13741
>                 URL: https://issues.apache.org/jira/browse/SOLR-13741
>             Project: Solr
>          Issue Type: Test
>      Security Level: Public(Default Security Level. Issues are Public)
>            Reporter: Chris M. Hostetter
>            Assignee: Chris M. Hostetter
>            Priority: Major
>             Fix For: master (9.0), 8.4
>
>         Attachments: SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch
>
>
> This issue started out as an investigation into possible test or code bugs uncovered while hardening AuditLoggerIntegrationTest against timing-related failures. The bugs that were identified as being in code were spun off into their own issues for tracking purposes, to raise visibility to end users. This issue remains for tracking the final hardening of the test and the fixing of some test bugs found along the way.
> Original jira description below...
>
> A while back I saw a weird non-reproducible failure from AuditLoggerIntegrationTest. When I started reading through that code, 2 things jumped out at me:
> # the way the 'delay' option works is brittle, and makes assumptions about CPU scheduling that aren't necessarily going to be true (and also suffers from the problem that Thread.sleep isn't guaranteed to sleep as long as you ask it to)
> # the existing {{waitForAuditEventCallbacks(number)}} logic works by checking the size of a (List) {{buffer}} of received events in a sleep/poll loop, until it contains at least N items -- but the code that adds items to that buffer in the async Callback thread runs _before_ the code that updates other state variables (like the global {{count}} and the patch specific {{resourceCounts}}), meaning that a test waiting on 3 events could "see" 3 events added to the buffer, but calling {{assertEquals(3, receiver.getTotalCount())}} could subsequently fail because that variable hadn't been updated yet.
> #2 was the source of the failures I was seeing, and while a quick fix for that specific problem would be to update all other state _before_ adding the event to the buffer, I set out to try and make more general improvements to the test:
> * eliminate the dependency on sleep loops by {{await}}-ing on concurrent data structures
> * harden the assertions made about the expected events received (updating some test methods that currently just assert the number of events received)
> * add new assertions that _only_ the expected events are received.
> In the process of doing this, I've found several oddities/discrepancies between things the test currently claims/asserts, and what *actually* happens under more rigorous scrutiny/assertions.
> I'll attach a patch shortly that has my (in progress) updates and includes copious nocommits about things that seem suspect. The summary of these concerns is:
> * SolrException status codes that do not match what the existing test says they should (but doesn't assert)
> * extra AuditEvents occurring that the existing test does not expect
> * AuditEvents for incorrect credentials that do not at all match the expected AuditEvent in the existing test -- which the current test seems to miss in its assertions because it's picking up some extra events triggered by previous requests earlier in the test that just happen to also match the assertions.
> ...it's not clear to me if the test logic is correct and these are "code bugs" or if the test is faulty.
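The ordering problem in item #2 of the description above, and the {{await}}-style alternative to sleep/poll loops, can be sketched as follows. This is an illustration only, not the code from any SOLR-13741 patch: the {{Receiver}} class, its {{onEvent}} and {{awaitEvents}} methods, and the choice of a {{LinkedBlockingQueue}} are assumptions made for the example. Only {{getTotalCount()}} echoes a name from the description.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class AwaitingReceiverDemo {

    /** Hypothetical receiver; not the actual class used by AuditLoggerIntegrationTest. */
    static final class Receiver {
        private final AtomicInteger count = new AtomicInteger();
        private final BlockingQueue<String> events = new LinkedBlockingQueue<>();

        /** Called from the asynchronous callback thread. */
        void onEvent(String event) {
            count.incrementAndGet(); // update derived state first...
            events.add(event);       // ...then publish, so a waiter never observes a stale count
        }

        /** Blocks until n events have arrived or the timeout elapses; no sleep/poll loop. */
        List<String> awaitEvents(int n, long timeout, TimeUnit unit) {
            long deadline = System.nanoTime() + unit.toNanos(timeout);
            List<String> received = new ArrayList<>();
            try {
                while (received.size() < n) {
                    String e = events.poll(deadline - System.nanoTime(), TimeUnit.NANOSECONDS);
                    if (e == null) {
                        throw new IllegalStateException("timed out after " + received.size() + " events");
                    }
                    received.add(e);
                }
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for events", ie);
            }
            return received;
        }

        int getTotalCount() {
            return count.get();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Receiver receiver = new Receiver();
        Thread callback = new Thread(() -> {
            for (int i = 0; i < 3; i++) {
                receiver.onEvent("event-" + i);
            }
        });
        callback.start();

        List<String> got = receiver.awaitEvents(3, 10, TimeUnit.SECONDS);
        // Each increment precedes the matching add, so the count is already 3 here.
        System.out.println(got + " count=" + receiver.getTotalCount());
        callback.join();
    }
}
```

The design point is that the queue hand-off gives the waiting thread a happens-before edge to everything the callback thread did before {{add}}, so updating the counter first makes the later assertion safe without any sleeping.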
[jira] [Updated] (SOLR-13741) AuditLoggerIntegrationTest hardening
[ https://issues.apache.org/jira/browse/SOLR-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris M. Hostetter updated SOLR-13741:
--------------------------------------
    Fix Version/s: 8.4
                   master (9.0)
      Description: 
This issue started out as an investigation into possible test or code bugs uncovered while hardening AuditLoggerIntegrationTest against timing-related failures. The bugs that were identified as being in code were spun off into their own issues for tracking purposes, to raise visibility to end users. This issue remains for tracking the final hardening of the test and the fixing of some test bugs found along the way.
[jira] [Commented] (SOLR-13741) possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest
[ https://issues.apache.org/jira/browse/SOLR-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953281#comment-16953281 ]

ASF subversion and git services commented on SOLR-13741:
--------------------------------------------------------

Commit 28c1049a258bbd060a80803c72e1c6cadc784dab in lucene-solr's branch refs/heads/branch_8x from Chris M. Hostetter
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=28c1049 ]

SOLR-13741: Harden AuditLoggerIntegrationTest

(cherry picked from commit 63e9bcf5d150e6324e5133a001613bd7f738a183)

> possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest
> ------------------------------------------------------------------------------
>
>                 Key: SOLR-13741
>                 URL: https://issues.apache.org/jira/browse/SOLR-13741
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public)
>            Reporter: Chris M. Hostetter
>            Assignee: Chris M. Hostetter
>            Priority: Major
>         Attachments: SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch
[jira] [Commented] (SOLR-13741) possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest
[ https://issues.apache.org/jira/browse/SOLR-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953279#comment-16953279 ]

ASF subversion and git services commented on SOLR-13741:
--------------------------------------------------------

Commit 63e9bcf5d150e6324e5133a001613bd7f738a183 in lucene-solr's branch refs/heads/master from Chris M. Hostetter
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=63e9bcf ]

SOLR-13741: Harden AuditLoggerIntegrationTest
[jira] [Commented] (LUCENE-9010) extend TopGroups.merge test coverage
[ https://issues.apache.org/jira/browse/LUCENE-9010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953274#comment-16953274 ]

Lucene/Solr QA commented on LUCENE-9010:
----------------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 10s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 46s{color} | {color:red} lucene_grouping generated 4 new + 107 unchanged - 0 fixed = 111 total (was 107) {color} |
| {color:green}+1{color} | {color:green} Release audit (RAT) {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Check forbidden APIs {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Validate source patterns {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 3s{color} | {color:green} grouping in the patch passed. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 5m 10s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | LUCENE-9010 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12983203/LUCENE-9010.patch |
| Optional Tests | compile javac unit ratsources checkforbiddenapis validatesourcepatterns |
| uname | Linux lucene2-us-west.apache.org 4.4.0-112-generic #135-Ubuntu SMP Fri Jan 19 11:48:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | ant |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-LUCENE-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh |
| git revision | master / ebc720c |
| ant | version: Apache Ant(TM) version 1.9.6 compiled on July 20 2018 |
| Default Java | LTS |
| javac | https://builds.apache.org/job/PreCommit-LUCENE-Build/210/artifact/out/diff-compile-javac-lucene_grouping.txt |
| Test Results | https://builds.apache.org/job/PreCommit-LUCENE-Build/210/testReport/ |
| modules | C: lucene/grouping U: lucene/grouping |
| Console output | https://builds.apache.org/job/PreCommit-LUCENE-Build/210/console |
| Powered by | Apache Yetus 0.7.0 http://yetus.apache.org |

This message was automatically generated.

> extend TopGroups.merge test coverage
> ------------------------------------
>
>                 Key: LUCENE-9010
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9010
>             Project: Lucene - Core
>          Issue Type: Sub-task
>            Reporter: Christine Poerschke
>            Priority: Minor
>         Attachments: LUCENE-9010.patch
>
>
> This sub-task proposes to add test coverage for the {{TopGroups.merge}} method, separately from but as preparation for LUCENE-8996 fixing the 'maxScore is sometimes missing' bug.
[jira] [Commented] (SOLR-13741) possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest
[ https://issues.apache.org/jira/browse/SOLR-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953269#comment-16953269 ]

Jan Høydahl commented on SOLR-13741:
------------------------------------

{quote}Jan: you just beat me to it ... my updated patch looks exactly like yours, but with more lazy whitespace :)
{quote}
:) I'll let you take it from here and do the merge.
[jira] [Commented] (SOLR-13741) possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest
[ https://issues.apache.org/jira/browse/SOLR-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953268#comment-16953268 ] Chris M. Hostetter commented on SOLR-13741: --- Jan: you just beat me to it ... my updated patch looks exactly like yours, but with more lazy whitespace :) Feel free to commit. > possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest > -- > > Key: SOLR-13741 > URL: https://issues.apache.org/jira/browse/SOLR-13741 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Attachments: SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, > SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, > SOLR-13741.patch > > > A while back i saw a weird non-reproducible failure from > AuditLoggerIntegrationTest. When i started reading through that code, 2 > things jumped out at me: > # the way the 'delay' option works is brittle, and makes assumptions about > CPU scheduling that aren't neccessarily going to be true (and also suffers > from the problem that Thread.sleep isn't garunteed to sleep as long as you > ask it too) > # the way the existing {{waitForAuditEventCallbacks(number)}} logic works by > checking the size of a (List) {{buffer}} of recieved events in a sleep/poll > loop, until it contains at least N items -- but the code that adds items to > that buffer in the async Callback thread async _before_ the code that updates > other state variables (like the global {{count}} and the patch specific > {{resourceCounts}}) meaning that a test waiting on 3 events could "see" 3 > events added to the buffer, but calling {{assertEquals(3, > receiver.getTotalCount())}} could subsequently fail because that variable > hadn't been udpated yet. 
> #2 was the source of the failures I was seeing, and while a quick fix for > that specific problem would be to update all other state _before_ adding the > event to the buffer, I set out to try and make more general improvements to > the test: > * eliminate the dependency on sleep loops by {{await}}-ing on concurrent data > structures > * harden the assertions made about the expected events received (updating > some test methods that currently just assert the number of events received) > * add new assertions that _only_ the expected events are received. > In the process of doing this, I've found several oddities/discrepancies > between things the test currently claims/asserts, and what *actually* happens > under more rigorous scrutiny/assertions. > I'll attach a patch shortly that has my (in progress) updates and includes > copious nocommits about things that seem suspect. The summary of these concerns > is: > * SolrException status codes that do not match what the existing test says > they should (but doesn't assert) > * extra AuditEvents occurring that the existing test does not expect > * AuditEvents for incorrect credentials that do not at all match the expected > AuditEvent in the existing test -- which the current test seems to miss in > its assertions because it's picking up some extra events triggered by > previous requests earlier in the test that just happen to also match the > assertions. > ...it's not clear to me if the test logic is correct and these are "code > bugs" or if the test is faulty.
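A minimal sketch of the "await on a concurrent data structure" idea from the description above, assuming a hypothetical receiver class (the names AwaitingReceiver, onEvent and waitForEvents are illustrative and are not taken from the actual test or patch). It updates the counter before publishing the event, and the waiting thread blocks on a queue with a timeout instead of polling in a sleep loop:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical receiver; NOT the CallbackReceiver used by AuditLoggerIntegrationTest.
final class AwaitingReceiver {
  private final AtomicInteger totalCount = new AtomicInteger();
  private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

  // Called from the async callback thread.
  void onEvent(String event) {
    totalCount.incrementAndGet(); // update all other state first...
    queue.add(event);             // ...and publish the event last
  }

  int getTotalCount() {
    return totalCount.get();
  }

  // Blocks until n events have arrived or the timeout elapses; no sleep/poll loop.
  List<String> waitForEvents(int n, long timeoutMillis) {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    List<String> received = new ArrayList<>();
    try {
      while (received.size() < n) {
        long remaining = Math.max(deadline - System.nanoTime(), 0L);
        String event = queue.poll(remaining, TimeUnit.NANOSECONDS);
        if (event == null) {
          throw new AssertionError("timed out after " + received.size() + " of " + n + " events");
        }
        received.add(event);
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for events", e);
    }
    return received;
  }

  // True if events beyond the ones already consumed are still queued,
  // which supports asserting that _only_ the expected events arrived.
  boolean hasUnexpectedEvents() {
    return !queue.isEmpty();
  }
}
```

Because the counter is incremented before the event is queued, a caller that has received 3 events from waitForEvents can rely on getTotalCount() being at least 3.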
[jira] [Commented] (SOLR-13677) All Metrics Gauges should be unregistered by the objects that registered them
[ https://issues.apache.org/jira/browse/SOLR-13677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953266#comment-16953266 ] Noble Paul commented on SOLR-13677: --- [~ab] can you raise a PR so that we can review easily? > All Metrics Gauges should be unregistered by the objects that registered them > - > > Key: SOLR-13677 > URL: https://issues.apache.org/jira/browse/SOLR-13677 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: metrics >Reporter: Noble Paul >Assignee: Andrzej Bialecki >Priority: Blocker > Fix For: 8.3 > > Attachments: SOLR-13677.patch > > Time Spent: 3h 50m > Remaining Estimate: 0h > > The life cycle of Metrics producers is managed by the core (mostly). So, if > the lifecycle of the object is different from that of the core itself, these > objects will never be unregistered from the metrics registry. This will lead > to memory leaks.
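As a rough illustration of the ownership rule in the issue title, the sketch below has each object remember the gauges it registered and remove exactly those when it is closed. The registry here is a plain Map and the class name GaugeOwnerSketch is hypothetical; this is not Solr's metrics API, only an assumption-laden model of the idea:

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical owner of gauges; the Map stands in for a real metrics registry.
final class GaugeOwnerSketch implements AutoCloseable {
  private final Map<String, Supplier<Object>> registry;
  private final Set<String> myGauges = new HashSet<>();

  GaugeOwnerSketch(Map<String, Supplier<Object>> registry) {
    this.registry = registry;
  }

  // Register a gauge and remember that this object owns it.
  void registerGauge(String name, Supplier<Object> gauge) {
    registry.put(name, gauge);
    myGauges.add(name);
  }

  // Unregister only the gauges this object registered, so the registry
  // no longer holds references to it after its own lifecycle ends.
  @Override
  public void close() {
    for (String name : myGauges) {
      registry.remove(name);
    }
    myGauges.clear();
  }
}
```

Tying removal to the owner's close() rather than to the core's shutdown is the design choice being illustrated: an object with a shorter lifecycle than the core cleans up after itself.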
[jira] [Commented] (SOLR-13741) possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest
[ https://issues.apache.org/jira/browse/SOLR-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953265#comment-16953265 ] Jan Høydahl commented on SOLR-13741: SOLR-13835 merged and updated this patch to master.
[jira] [Updated] (SOLR-13741) possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest
[ https://issues.apache.org/jira/browse/SOLR-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jan Høydahl updated SOLR-13741: --- Attachment: SOLR-13741.patch
[jira] [Updated] (SOLR-13835) HttpSolrCall produces incorrect extra AuditEvent on AuthorizationResponse.PROMPT
[ https://issues.apache.org/jira/browse/SOLR-13835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jan Høydahl updated SOLR-13835: --- Fix Version/s: 8.3.0 > HttpSolrCall produces incorrect extra AuditEvent on > AuthorizationResponse.PROMPT > > > Key: SOLR-13835 > URL: https://issues.apache.org/jira/browse/SOLR-13835 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: Authentication, Authorization >Reporter: Chris M. Hostetter >Assignee: Jan Høydahl >Priority: Major > Fix For: 8.3.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > spinning this out of SOLR-13741... > {quote} > Wrt the REJECTED + UNAUTHORIZED events I see the same as you, and I believe > there is a code bug, not a test bug. In HttpSolrCall#471 in the > {{authorize()}} call, if authResponse == PROMPT, it will actually match both > blocks and emit two audit events: > [https://github.com/apache/lucene-solr/blob/26ede632e6259eb9d16861a3c0f782c9c8999762/solr/core/src/java/org/apache/solr/servlet/HttpSolrCall.java#L475:L493] > > {code:java} > if (authResponse.statusCode == AuthorizationResponse.PROMPT.statusCode) {...} > if (!(authResponse.statusCode == HttpStatus.SC_ACCEPTED) && > !(authResponse.statusCode == HttpStatus.SC_OK)) {...} > {code} > When code==401, it is also true that code!=200. Intuitively there should be > both a sendError and return RETURN before line #484 in the first if block? > {quote} > This causes any and all {{REJECTED}} AuditEvent messages to be accompanied by > a corresponding {{UNAUTHORIZED}} AuditEvent. > It's not yet clear if, from the perspective of the external client, there are > any other bugs in behavior (TBD)
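The fall-through described in the quote can be reproduced with a simplified stand-in for the {{authorize()}} flow. This is a sketch under stated assumptions: the class AuthorizeSketch and its methods are hypothetical, events are modelled as plain strings, and the "fixed" variant only shows the early return suggested in the quote, which may differ from what the committed fix actually does:

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of the two consecutive if blocks; NOT the real HttpSolrCall code.
final class AuthorizeSketch {
  static final int SC_OK = 200;
  static final int SC_ACCEPTED = 202;
  static final int PROMPT = 401; // the quote above equates PROMPT with code 401

  // Mirrors the reported behavior: no return in the first block.
  static List<String> emitBuggy(int statusCode) {
    List<String> events = new ArrayList<>();
    if (statusCode == PROMPT) {
      events.add("REJECTED"); // control falls through to the next check
    }
    if (!(statusCode == SC_ACCEPTED) && !(statusCode == SC_OK)) {
      events.add("UNAUTHORIZED"); // also true for 401, so a second event is emitted
    }
    return events;
  }

  // Variant with the early return suggested in the quote.
  static List<String> emitFixed(int statusCode) {
    List<String> events = new ArrayList<>();
    if (statusCode == PROMPT) {
      events.add("REJECTED");
      return events; // stands in for sending the error and returning
    }
    if (!(statusCode == SC_ACCEPTED) && !(statusCode == SC_OK)) {
      events.add("UNAUTHORIZED");
    }
    return events;
  }
}
```

With a status of 401, emitBuggy yields both REJECTED and UNAUTHORIZED, while emitFixed yields only REJECTED; other non-2xx codes such as 403 still produce UNAUTHORIZED in both variants.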
[jira] [Resolved] (SOLR-13835) HttpSolrCall produces incorrect extra AuditEvent on AuthorizationResponse.PROMPT
[ https://issues.apache.org/jira/browse/SOLR-13835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jan Høydahl resolved SOLR-13835. Resolution: Fixed Pushed to master, branch_8x and branch_8_3
[jira] [Commented] (SOLR-13835) HttpSolrCall produces incorrect extra AuditEvent on AuthorizationResponse.PROMPT
[ https://issues.apache.org/jira/browse/SOLR-13835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953259#comment-16953259 ] ASF subversion and git services commented on SOLR-13835: Commit b58695c98ce1356efc27beeb338a8300f6f72346 in lucene-solr's branch refs/heads/branch_8_3 from Jan Høydahl [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=b58695c ] SOLR-13835 HttpSolrCall produces incorrect extra AuditEvent on AuthorizationResponse.PROMPT (#946) (cherry picked from commit 611c4f960e9472880e2ec24dda9336a59cd41426)
[jira] [Resolved] (SOLR-13852) TestCloudNestedDocsSort can use the same uniqueKey for both a parent and child doc
[ https://issues.apache.org/jira/browse/SOLR-13852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris M. Hostetter resolved SOLR-13852. --- Fix Version/s: master (9.0) 8.4 Assignee: Chris M. Hostetter Resolution: Fixed > TestCloudNestedDocsSort can use the same uniqueKey for both a parent and > child doc > -- > > Key: SOLR-13852 > URL: https://issues.apache.org/jira/browse/SOLR-13852 > Project: Solr > Issue Type: Test > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Fix For: 8.4, master (9.0) > > Attachments: thetaphi_Lucene-Solr-master-Linux_24903.log.txt > > > TestCloudNestedDocsSort uses randomly generated "id" values for all docs, > which not only means that two "parent" docs can be indexed with the same "id" > value, but also that a child doc might be indexed with the same "id" value as > a parent doc. > While nothing in Solr actively prevents this, it's documented as something > people shouldn't do, and can cause problems. > In particular, this has caused some assertion failures for some test seeds > due to how it interacts with SOLR-13851
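One way to avoid the collision described above is to draw every "id" from a single counter while leaving the rest of the document shape random. The sketch below assumes a hypothetical helper (UniqueKeySketch, newId, buildFamilies) and does not reflect how the committed fix to TestCloudNestedDocsSort is actually written:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical id allocator for nested test documents; not the real test code.
final class UniqueKeySketch {
  private int nextId = 0;

  // Every call returns a value never handed out before, for parents and children alike.
  String newId() {
    return "doc-" + (nextId++);
  }

  // Builds parent id -> child ids; only the number of children is random.
  Map<String, List<String>> buildFamilies(Random random, int parents, int maxChildren) {
    Map<String, List<String>> families = new LinkedHashMap<>();
    for (int p = 0; p < parents; p++) {
      List<String> children = new ArrayList<>();
      int numChildren = random.nextInt(maxChildren + 1);
      for (int c = 0; c < numChildren; c++) {
        children.add(newId());
      }
      families.put(newId(), children);
    }
    return families;
  }
}
```

Randomness is kept where the test benefits from it (how many children each parent has), while uniqueness of the key no longer depends on the seed.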
[jira] [Commented] (SOLR-13852) TestCloudNestedDocsSort can use the same uniqueKey for both a parent and child doc
[ https://issues.apache.org/jira/browse/SOLR-13852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953258#comment-16953258 ] ASF subversion and git services commented on SOLR-13852: Commit 3a67c82c9161454e3a7e6bf76cde7ed7e4018f28 in lucene-solr's branch refs/heads/branch_8x from Chris M. Hostetter [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=3a67c82 ] SOLR-13852: Fix TestCloudNestedDocsSort to ensure child docs are never created in a way that violates uniqueKey rules (cherry picked from commit ebc720c5b09ae06b8ab093b296bf87e4f6ed978f)
[jira] [Commented] (SOLR-13835) HttpSolrCall produces incorrect extra AuditEvent on AuthorizationResponse.PROMPT
[ https://issues.apache.org/jira/browse/SOLR-13835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953257#comment-16953257 ] ASF subversion and git services commented on SOLR-13835: Commit 5a074b0fe49ef863a162e7f5d55e351bc043c806 in lucene-solr's branch refs/heads/branch_8x from Jan Høydahl [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=5a074b0 ] SOLR-13835 HttpSolrCall produces incorrect extra AuditEvent on AuthorizationResponse.PROMPT (#946) (cherry picked from commit 611c4f960e9472880e2ec24dda9336a59cd41426)
[jira] [Comment Edited] (LUCENE-8987) Move Lucene web site from svn to git
[ https://issues.apache.org/jira/browse/LUCENE-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953165#comment-16953165 ] Jan Høydahl edited comment on LUCENE-8987 at 10/16/19 11:00 PM: Steps # Create new git repo '{{lucene-site}}' # Create folder structure and copy old site (excluding JavaDoc and online RefGuide) from svn into appropriate folder(s) # Adapt to make local Pelican site build work for building the barebones site, and commit to master branch # Add {{.asf.yaml}} file with a 'staging' profile for branch 'asf-staging', and a 'publish' profile for branch 'asf-site', and a 'pelican' directive to auto build from 'master' branch and put site into 'asf-staging' branch (/output folder). # Verify that the staging build kicks off and that a site appears in [lucene.staged.apache.org|https://lucene.staged.apache.org/] (note that this is different from lucene.staging.apache.org that the old CMS uses) # Find a solution for JavaDoc and RefGuide, which are *huge* amounts of statically generated HTML uploaded by the RM during build. ** These should just be put on a filesystem somewhere, outside of git ** Do some {{.htaccess}} magic to make them appear in the right locations of the site # Once the staging site is good, merge {{asf-staging}} into {{asf-site}} branch to publish. This will automatically disable CMS. 
# Commit a README-NOT-IN-USE file to old svn repo and make it read-only Note that also the RM guidelines need to be updated wrt * how to update website, download pages etc during a release * how to publish JavaDoc * how to publish RefGuide HTML was (Author: janhoy): Steps # Create new git repo '{{lucene-site'}} # Create folder structure and copy old site from svn into appropriate folder(s) # Adapt to make local Pelican site build work, and commit to master branch # Add {{.asf.yaml}} file with a 'staging' profile for branch asf-staging, and a 'publish' profile for branch 'asf-site' # Merge master branch into 'asf-staging' and verify that the staging build kicks off and that a site appears in [lucene.staged.apache.org|https://lucene.staged.apache.org/] (note that this is different from lucene.staging.apache.org that old CMS uses) # Iterate until the site is perfect for publishing # Merge master branch into 'asf-site' branch, which will publish to the real site and automatically disable old CMS # Commit a README-NOT-IN-USE file to old svn repo and make it read-only > Move Lucene web site from svn to git > > > Key: LUCENE-8987 > URL: https://issues.apache.org/jira/browse/LUCENE-8987 > Project: Lucene - Core > Issue Type: Task > Components: general/website >Reporter: Jan Høydahl >Assignee: Jan Høydahl >Priority: Major > Attachments: lucene-site-repo.png > > > INFRA just enabled [a new way of configuring website > build|https://s.apache.org/asfyaml] from a git branch, [see dev list > email|https://lists.apache.org/thread.html/b6f7e40bece5e83e27072ecc634a7815980c90240bc0a2ccb417f1fd@%3Cdev.lucene.apache.org%3E]. > It allows for automatic builds of both staging and production site, much > like the old CMS. We can choose to auto publish the html content of an > {{output/}} folder, or to have a bot build the site using > [Pelican|https://github.com/getpelican/pelican] from a {{content/}} folder. 
> The goal of this issue is to explore how this can be done for > [http://lucene.apache.org|http://lucene.apache.org/] by creating a new > git repo {{lucene-site}}, copying over the site from svn, seeing if it can be > "Pelicanized" easily and then testing staging. Benefits are that more people > will be able to edit the web site and we can take PRs from the public (with > GitHub preview of pages). > Non-goals: > * Create a new web site or a new graphic design > * Change from Markdown to Asciidoc
[jira] [Commented] (SOLR-13852) TestCloudNestedDocsSort can use the same uniqueKey for both a parent and child doc
[ https://issues.apache.org/jira/browse/SOLR-13852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953248#comment-16953248 ] ASF subversion and git services commented on SOLR-13852: Commit ebc720c5b09ae06b8ab093b296bf87e4f6ed978f in lucene-solr's branch refs/heads/master from Chris M. Hostetter [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=ebc720c ] SOLR-13852: Fix TestCloudNestedDocsSort to ensure child docs are never created in a way that violates uniqueKey rules
[jira] [Commented] (SOLR-13835) HttpSolrCall produces incorrect extra AuditEvent on AuthorizationResponse.PROMPT
[ https://issues.apache.org/jira/browse/SOLR-13835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953246#comment-16953246 ] ASF subversion and git services commented on SOLR-13835: Commit 611c4f960e9472880e2ec24dda9336a59cd41426 in lucene-solr's branch refs/heads/master from Jan Høydahl [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=611c4f9 ] SOLR-13835 HttpSolrCall produces incorrect extra AuditEvent on AuthorizationResponse.PROMPT (#946)
[GitHub] [lucene-solr] janhoy merged pull request #946: SOLR-13835 HttpSolrCall produces incorrect extra AuditEvent on AuthorizationResponse.PROMPT
janhoy merged pull request #946: SOLR-13835 HttpSolrCall produces incorrect extra AuditEvent on AuthorizationResponse.PROMPT URL: https://github.com/apache/lucene-solr/pull/946 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] ErickErickson commented on issue #888: SOLR-13774 add lucene/solr openjdk compatibility matrix to ref guide.
ErickErickson commented on issue #888: SOLR-13774 add lucene/solr openjdk compatibility matrix to ref guide. URL: https://github.com/apache/lucene-solr/pull/888#issuecomment-542910805 Hmmm, I think overall the idea of putting it on a Wiki page then linking to it from the ref guide makes sense. We can put some disclaimers in about testing etc. as well.
[jira] [Commented] (SOLR-13835) HttpSolrCall produces incorrect extra AuditEvent on AuthorizationResponse.PROMPT
[ https://issues.apache.org/jira/browse/SOLR-13835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953219#comment-16953219 ]

Chris M. Hostetter commented on SOLR-13835:
---

+1

> HttpSolrCall produces incorrect extra AuditEvent on AuthorizationResponse.PROMPT
>
> Key: SOLR-13835
> URL: https://issues.apache.org/jira/browse/SOLR-13835
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: Authentication, Authorization
> Reporter: Chris M. Hostetter
> Assignee: Jan Høydahl
> Priority: Major
> Time Spent: 20m
> Remaining Estimate: 0h
>
> spinning this out of SOLR-13741...
> {quote}
> Wrt the REJECTED + UNAUTHORIZED events I see the same as you, and I believe there is a code bug, not a test bug. In HttpSolrCall#471 in the {{authorize()}} call, if authResponse == PROMPT, it will actually match both blocks and emit two audit events:
> [https://github.com/apache/lucene-solr/blob/26ede632e6259eb9d16861a3c0f782c9c8999762/solr/core/src/java/org/apache/solr/servlet/HttpSolrCall.java#L475:L493]
>
> {code:java}
> if (authResponse.statusCode == AuthorizationResponse.PROMPT.statusCode) {...}
> if (!(authResponse.statusCode == HttpStatus.SC_ACCEPTED) && !(authResponse.statusCode == HttpStatus.SC_OK)) {...}
> {code}
> When code==401, it is also true that code!=200. Intuitively there should be both a sendError and a return RETURN before line #484 in the first if block?
> {quote}
> This causes any and all {{REJECTED}} AuditEvent messages to be accompanied by a corresponding {{UNAUTHORIZED}} AuditEvent.
> It's not yet clear if, from the perspective of the external client, there are any other bugs in behavior (TBD)
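The double-match described in the quoted comment can be reproduced in isolation. The following is only a minimal sketch, not the real HttpSolrCall code: the class name AuthorizeSketch, the methods buggy and fixed, and the use of plain strings for audit events are all hypothetical stand-ins. It assumes, as the comment states, that PROMPT carries status 401, and uses 200 and 202 for SC_OK and SC_ACCEPTED.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the authorize() branch; not the real Solr code. */
final class AuthorizeSketch {
    static final int SC_OK = 200;
    static final int SC_ACCEPTED = 202;
    static final int PROMPT = 401; // per the quoted comment: "When code==401 ..."

    /** Mirrors the reported shape: two independent ifs, so 401 matches both. */
    static List<String> buggy(int statusCode) {
        List<String> events = new ArrayList<>();
        if (statusCode == PROMPT) {
            events.add("REJECTED");
        }
        if (statusCode != SC_ACCEPTED && statusCode != SC_OK) {
            events.add("UNAUTHORIZED");
        }
        return events;
    }

    /** One possible fix: leave the method once the PROMPT case has been handled. */
    static List<String> fixed(int statusCode) {
        List<String> events = new ArrayList<>();
        if (statusCode == PROMPT) {
            events.add("REJECTED");
            return events; // stands in for sendError(...) followed by "return RETURN"
        }
        if (statusCode != SC_ACCEPTED && statusCode != SC_OK) {
            events.add("UNAUTHORIZED");
        }
        return events;
    }
}
```

With these stand-ins, buggy(401) collects both REJECTED and UNAUTHORIZED, while fixed(401) collects only REJECTED. Whether an early return or an else-if is the better shape in the real method is a separate question that the sketch does not settle.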
[jira] [Commented] (LUCENE-8987) Move Lucene web site from svn to git
[ https://issues.apache.org/jira/browse/LUCENE-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953171#comment-16953171 ]

Jan Høydahl commented on LUCENE-8987:
---

!lucene-site-repo.png|width=339!

> Move Lucene web site from svn to git
>
> Key: LUCENE-8987
> URL: https://issues.apache.org/jira/browse/LUCENE-8987
> Project: Lucene - Core
> Issue Type: Task
> Components: general/website
> Reporter: Jan Høydahl
> Assignee: Jan Høydahl
> Priority: Major
> Attachments: lucene-site-repo.png
>
> INFRA just enabled [a new way of configuring website build|https://s.apache.org/asfyaml] from a git branch, [see dev list email|https://lists.apache.org/thread.html/b6f7e40bece5e83e27072ecc634a7815980c90240bc0a2ccb417f1fd@%3Cdev.lucene.apache.org%3E].
> It allows for automatic builds of both staging and production site, much like the old CMS. We can choose to auto publish the html content of an {{output/}} folder, or to have a bot build the site using [Pelican|https://github.com/getpelican/pelican] from a {{content/}} folder.
> The goal of this issue is to explore how this can be done for [http://lucene.apache.org|http://lucene.apache.org/] by creating a new git repo {{lucene-site}}, copying over the site from svn, seeing if it can be "Pelicanized" easily and then testing staging. Benefits are that more people will be able to edit the web site and we can take PRs from the public (with GitHub preview of pages).
> Non-goals:
> * Create a new web site or a new graphic design
> * Change from Markdown to Asciidoc
[jira] [Updated] (LUCENE-8987) Move Lucene web site from svn to git
[ https://issues.apache.org/jira/browse/LUCENE-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jan Høydahl updated LUCENE-8987:
---
Attachment: lucene-site-repo.png

> Move Lucene web site from svn to git
>
> Key: LUCENE-8987
> URL: https://issues.apache.org/jira/browse/LUCENE-8987
> Project: Lucene - Core
> Issue Type: Task
> Components: general/website
> Reporter: Jan Høydahl
> Assignee: Jan Høydahl
> Priority: Major
> Attachments: lucene-site-repo.png
>
> INFRA just enabled [a new way of configuring website build|https://s.apache.org/asfyaml] from a git branch, [see dev list email|https://lists.apache.org/thread.html/b6f7e40bece5e83e27072ecc634a7815980c90240bc0a2ccb417f1fd@%3Cdev.lucene.apache.org%3E].
> It allows for automatic builds of both staging and production site, much like the old CMS. We can choose to auto publish the html content of an {{output/}} folder, or to have a bot build the site using [Pelican|https://github.com/getpelican/pelican] from a {{content/}} folder.
> The goal of this issue is to explore how this can be done for [http://lucene.apache.org|http://lucene.apache.org/] by creating a new git repo {{lucene-site}}, copying over the site from svn, seeing if it can be "Pelicanized" easily and then testing staging. Benefits are that more people will be able to edit the web site and we can take PRs from the public (with GitHub preview of pages).
> Non-goals:
> * Create a new web site or a new graphic design
> * Change from Markdown to Asciidoc
[jira] [Resolved] (SOLR-12786) Upgrade refGuide build to Asciidoctor 2.0.10 and new link structure
[ https://issues.apache.org/jira/browse/SOLR-12786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cassandra Targett resolved SOLR-12786. -- Fix Version/s: 8.3 Resolution: Fixed > Upgrade refGuide build to Asciidoctor 2.0.10 and new link structure > --- > > Key: SOLR-12786 > URL: https://issues.apache.org/jira/browse/SOLR-12786 > Project: Solr > Issue Type: Improvement > Components: Build, documentation >Affects Versions: 8.0 >Reporter: Jan Høydahl >Assignee: Cassandra Targett >Priority: Major > Fix For: 8.3 > > Attachments: SOLR-12786.patch > > Time Spent: 10m > Remaining Estimate: 0h > > Currently the refguide build requires asciidoctor 1.5.6.2. > People using {{gem install jekyll-asciidoc}} will end up with version 1.5.7, > causing different header ID syntax and the build will break. > Long term we should move to latest asciidoctor. > It is already documented in README how to install the older 1.5.6.2 version. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8987) Move Lucene web site from svn to git
[ https://issues.apache.org/jira/browse/LUCENE-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953165#comment-16953165 ]

Jan Høydahl commented on LUCENE-8987:
---

Steps
# Create new git repo {{lucene-site}}
# Create folder structure and copy old site from svn into appropriate folder(s)
# Adapt to make local Pelican site build work, and commit to master branch
# Add {{.asf.yaml}} file with a 'staging' profile for branch 'asf-staging', and a 'publish' profile for branch 'asf-site'
# Merge master branch into 'asf-staging' and verify that the staging build kicks off and that a site appears in [lucene.staged.apache.org|https://lucene.staged.apache.org/] (note that this is different from lucene.staging.apache.org that the old CMS uses)
# Iterate until the site is perfect for publishing
# Merge master branch into 'asf-site' branch, which will publish to the real site and automatically disable old CMS
# Commit a README-NOT-IN-USE file to old svn repo and make it read-only

> Move Lucene web site from svn to git
>
> Key: LUCENE-8987
> URL: https://issues.apache.org/jira/browse/LUCENE-8987
> Project: Lucene - Core
> Issue Type: Task
> Components: general/website
> Reporter: Jan Høydahl
> Assignee: Jan Høydahl
> Priority: Major
>
> INFRA just enabled [a new way of configuring website build|https://s.apache.org/asfyaml] from a git branch, [see dev list email|https://lists.apache.org/thread.html/b6f7e40bece5e83e27072ecc634a7815980c90240bc0a2ccb417f1fd@%3Cdev.lucene.apache.org%3E].
> It allows for automatic builds of both staging and production site, much like the old CMS. We can choose to auto publish the html content of an {{output/}} folder, or to have a bot build the site using [Pelican|https://github.com/getpelican/pelican] from a {{content/}} folder.
> The goal of this issue is to explore how this can be done for [http://lucene.apache.org|http://lucene.apache.org/] by creating a new git repo {{lucene-site}}, copying over the site from svn, seeing if it can be "Pelicanized" easily and then testing staging. Benefits are that more people will be able to edit the web site and we can take PRs from the public (with GitHub preview of pages).
> Non-goals:
> * Create a new web site or a new graphic design
> * Change from Markdown to Asciidoc
[jira] [Commented] (SOLR-12786) Upgrade refGuide build to Asciidoctor 2.0.10 and new link structure
[ https://issues.apache.org/jira/browse/SOLR-12786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953161#comment-16953161 ] ASF subversion and git services commented on SOLR-12786: Commit a27eabbd2132abcd47bb0a5f7c42fcafaded1d9a in lucene-solr's branch refs/heads/branch_8_3 from Cassandra Targett [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=a27eabb ] SOLR-12786: Update Ref Guide build tool versions & fix section links for new format requirements > Upgrade refGuide build to Asciidoctor 2.0.10 and new link structure > --- > > Key: SOLR-12786 > URL: https://issues.apache.org/jira/browse/SOLR-12786 > Project: Solr > Issue Type: Improvement > Components: Build, documentation >Affects Versions: 8.0 >Reporter: Jan Høydahl >Assignee: Cassandra Targett >Priority: Major > Attachments: SOLR-12786.patch > > Time Spent: 10m > Remaining Estimate: 0h > > Currently the refguide build requires asciidoctor 1.5.6.2. > People using {{gem install jekyll-asciidoc}} will end up with version 1.5.7, > causing different header ID syntax and the build will break. > Long term we should move to latest asciidoctor. > It is already documented in README how to install the older 1.5.6.2 version. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-12786) Upgrade refGuide build to Asciidoctor 2.0.10 and new link structure
[ https://issues.apache.org/jira/browse/SOLR-12786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953162#comment-16953162 ] ASF subversion and git services commented on SOLR-12786: Commit 2f11fd410a4ad707959f366ff7dda63c4cbbb4c4 in lucene-solr's branch refs/heads/branch_8_3 from Cassandra Targett [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=2f11fd4 ] SOLR-12786: add back explicit asciidoctor install for Jenkins build > Upgrade refGuide build to Asciidoctor 2.0.10 and new link structure > --- > > Key: SOLR-12786 > URL: https://issues.apache.org/jira/browse/SOLR-12786 > Project: Solr > Issue Type: Improvement > Components: Build, documentation >Affects Versions: 8.0 >Reporter: Jan Høydahl >Assignee: Cassandra Targett >Priority: Major > Attachments: SOLR-12786.patch > > Time Spent: 10m > Remaining Estimate: 0h > > Currently the refguide build requires asciidoctor 1.5.6.2. > People using {{gem install jekyll-asciidoc}} will end up with version 1.5.7, > causing different header ID syntax and the build will break. > Long term we should move to latest asciidoctor. > It is already documented in README how to install the older 1.5.6.2 version. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13824) JSON Request API ignores prematurely closing curly brace.
[ https://issues.apache.org/jira/browse/SOLR-13824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953152#comment-16953152 ]

Lucene/Solr QA commented on SOLR-13824:
---

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
|| || || || Prechecks ||
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
|| || || || master Compile Tests ||
| +1 | compile | 2m 21s | master passed |
|| || || || Patch Compile Tests ||
| +1 | compile | 4m 16s | the patch passed |
| +1 | javac | 4m 16s | the patch passed |
| +1 | Release audit (RAT) | 1m 26s | the patch passed |
| +1 | Check forbidden APIs | 1m 14s | the patch passed |
| +1 | Validate source patterns | 1m 14s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 2m 56s | ltr in the patch passed. |
| -1 | unit | 85m 21s | core in the patch failed. |
| -1 | unit | 12m 51s | solrj in the patch failed. |
| | | 110m 42s | |

|| Reason || Tests ||
| Failed junit tests | solr.core.TestSolrConfigHandler |
| | solr.cloud.autoscaling.AutoAddReplicasIntegrationTest |
| | solr.search.facet.TestJsonFacetRefinement |
| | solr.filestore.TestDistribPackageStore |
| | solr.cloud.autoscaling.AutoAddReplicasPlanActionTest |
| | solr.client.solrj.cloud.autoscaling.TestPolicyOld |
| | solr.client.solrj.cloud.autoscaling.TestPolicy |

|| Subsystem || Report/Notes ||
| JIRA Issue | SOLR-13824 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12983170/SOLR-13824.patch |
| Optional Tests | compile javac unit ratsources checkforbiddenapis validatesourcepatterns |
| uname | Linux lucene2-us-west.apache.org 4.4.0-112-generic #135-Ubuntu SMP Fri Jan 19 11:48:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | ant |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-SOLR-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh |
| git revision | master / b3d59a7 |
| ant | version: Apache Ant(TM) version 1.9.6 compiled on July 20 2018 |
| Default Java | LTS |
| unit | https://builds.apache.org/job/PreCommit-SOLR-Build/579/artifact/out/patch-unit-solr_core.txt |
| unit | https://builds.apache.org/job/PreCommit-SOLR-Build/579/artifact/out/patch-unit-solr_solrj.txt |
| Test Results | https://builds.apache.org/job/PreCommit-SOLR-Build/579/testReport/ |
| modules | C: solr/contrib/ltr solr/core solr/solrj U: solr |
| Console output | https://builds.apache.org/job/PreCommit-SOLR-Build/579/console |
| Powered by | Apache Yetus 0.7.0 http://yetus.apache.org |

This message was automatically generated.

> JSON Request API ignores prematurely closing curly brace.
>
> Key: SOLR-13824
> URL: https://issues.apache.org/jira/browse/SOLR-13824
> Project: Solr
> Issue Type: Improvement
> Security Level: Public (Default Security Level. Issues are Public)
> Components: JSON Request API
> Reporter: Mikhail Khludnev
> Priority: Major
> Attachments: SOLR-13824.patch, SOLR-13824.patch
>
> {code:java}
> json={query:"content:foo", facet:{zz:{field:id}}}
> {code}
> this works fine, but if we mistype {{}}} instead of {{,}}
> {code:java}
> json={query:"content:foo"} facet:{zz:{field:id}}}
> {code}
> the request is captured only partially; here is what we have under debug:
> {code:java}
> "json":{"query":"content:foo"},
> {code}
> I suppose it should throw an error with a 400 code.
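To illustrate the kind of check the issue asks for, here is a minimal sketch that detects leftover text after the first top-level object closes. It is not Solr's JSON parser and not the attached patch; the class TrailingContentCheck and the method hasTrailingContent are hypothetical names invented for this example, and the scan only understands braces and double-quoted strings.

```java
/** Hypothetical helper, not part of Solr: checks what follows the first top-level object. */
final class TrailingContentCheck {
    /**
     * Returns true if non-whitespace text follows the first balanced {...} block.
     * Braces inside double-quoted strings are ignored.
     */
    static boolean hasTrailingContent(String input) {
        int depth = 0;
        boolean inString = false;
        boolean started = false;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (inString) {
                if (c == '\\') {
                    i++; // skip the escaped character
                } else if (c == '"') {
                    inString = false;
                }
                continue;
            }
            if (c == '"') {
                inString = true;
            } else if (c == '{') {
                depth++;
                started = true;
            } else if (c == '}') {
                depth--;
                if (started && depth == 0) {
                    // The first object is closed; anything but whitespace after it is suspicious.
                    return !input.substring(i + 1).trim().isEmpty();
                }
            }
        }
        return false; // no complete top-level object found
    }
}
```

A caller could turn a true result into a 400 response, which is the behaviour the reporter suggests. The well-formed request from the description yields false, and the mistyped one yields true.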
[jira] [Commented] (LUCENE-8986) Add asf.yaml to our git repo
[ https://issues.apache.org/jira/browse/LUCENE-8986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953138#comment-16953138 ] Jan Høydahl commented on LUCENE-8986: - Please see my proposal in [GitHub Pull Request #958|https://github.com/apache/lucene-solr/pull/958], and feel free to provide a better project description or additional GitHub topic labels. Will commit this on Friday. > Add asf.yaml to our git repo > > > Key: LUCENE-8986 > URL: https://issues.apache.org/jira/browse/LUCENE-8986 > Project: Lucene - Core > Issue Type: Task >Reporter: Jan Høydahl >Assignee: Jan Høydahl >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Adding a {{asf.yaml}} file to our git repo allows us to control the > description, link and labels on Lucene-Solr project git page. See > https://lists.apache.org/thread.html/b6f7e40bece5e83e27072ecc634a7815980c90240bc0a2ccb417f1fd@%3Cdev.lucene.apache.org%3E > for more. > I'll post a PR with the suggested change -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (LUCENE-8986) Add asf.yaml to our git repo
[ https://issues.apache.org/jira/browse/LUCENE-8986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jan Høydahl updated LUCENE-8986: Component/s: general/website > Add asf.yaml to our git repo > > > Key: LUCENE-8986 > URL: https://issues.apache.org/jira/browse/LUCENE-8986 > Project: Lucene - Core > Issue Type: Task > Components: general/website >Reporter: Jan Høydahl >Assignee: Jan Høydahl >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Adding a {{asf.yaml}} file to our git repo allows us to control the > description, link and labels on Lucene-Solr project git page. See > https://lists.apache.org/thread.html/b6f7e40bece5e83e27072ecc634a7815980c90240bc0a2ccb417f1fd@%3Cdev.lucene.apache.org%3E > for more. > I'll post a PR with the suggested change -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] janhoy opened a new pull request #958: LUCENE-8986: Add asf.yaml to our git repo
janhoy opened a new pull request #958: LUCENE-8986: Add asf.yaml to our git repo
URL: https://github.com/apache/lucene-solr/pull/958

# Description

See https://issues.apache.org/jira/browse/LUCENE-8986

# Solution

Adding the .asf.yaml will edit GitHub project description, link and labels

# Checklist

Please review the following and check all that apply:

- [x] I have reviewed the guidelines for [How to Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms to the standards described there to the best of my ability.
- [ ] I have created a Jira issue and added the issue ID to my pull request title.
- [ ] I am authorized to contribute this code to the ASF and have removed any code I do not have a license to distribute.
- [ ] I have given Solr maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended)
- [ ] I have developed this patch against the `master` branch.
- [ ] I have run `ant precommit` and the appropriate test suite.
- [ ] I have added tests for my changes.
- [ ] I have added documentation for the [Ref Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) (for Solr changes only).
[jira] [Updated] (LUCENE-9010) extend TopGroups.merge test coverage
[ https://issues.apache.org/jira/browse/LUCENE-9010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christine Poerschke updated LUCENE-9010: Status: Patch Available (was: Open) > extend TopGroups.merge test coverage > > > Key: LUCENE-9010 > URL: https://issues.apache.org/jira/browse/LUCENE-9010 > Project: Lucene - Core > Issue Type: Sub-task >Reporter: Christine Poerschke >Priority: Minor > Attachments: LUCENE-9010.patch > > > This sub-task proposes to add test coverage for the {{TopGroups.merge}} > method, separately from but as preparation for LUCENE-8996 fixing the > 'maxScore is sometimes missing' bug. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8996) maxScore is sometimes missing from distributed grouped responses
[ https://issues.apache.org/jira/browse/LUCENE-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953080#comment-16953080 ]

Christine Poerschke commented on LUCENE-8996:
---

Looking at the {{TopGroupsTest}} portion of both the patch and the pull request for this ticket, I had some "there's a lot of numbers here" thoughts. It (subjectively, of course) seemed a little tricky to work out what they all are (numbers for shard index, doc id, group value, scores and hit counts, sometimes NaN not-a-number numbers), what they mean, and why the expected test results are correct.

The LUCENE-9010 sub-task proposes to split out the addition of test coverage for the existing code from the 'maxScore missing' fix here. The first proposed patch for it tries to reduce the "amount of numbers": for example, instead of integer group values 1 and 2 there are string group values "red" and "blue", and a narrative and local variable names (redAntScore, blueDragonflyScore, redSquirrelScore, blueWhaleScore) try to make it easier to work out what the {{expectedMaxScore}} value is.

> maxScore is sometimes missing from distributed grouped responses
>
> Key: LUCENE-8996
> URL: https://issues.apache.org/jira/browse/LUCENE-8996
> Project: Lucene - Core
> Issue Type: Bug
> Affects Versions: 5.3
> Reporter: Julien Massenet
> Priority: Minor
> Attachments: LUCENE-8996.patch, lucene_6_5-GroupingMaxScore.patch, lucene_solr_5_3-GroupingMaxScore.patch, master-GroupingMaxScore.patch
> Time Spent: 10m
> Remaining Estimate: 0h
>
> This issue occurs when using the grouping feature in distributed mode and sorting by score.
> Each group's {{docList}} in the response is supposed to contain a {{maxScore}} entry that holds the maximum score for that group. Using the current releases, it sometimes happens that this piece of information is not included:
> {code}
> {
>   "responseHeader": {
>     "status": 0,
>     "QTime": 42,
>     "params": {
>       "sort": "score desc",
>       "fl": "id,score",
>       "q": "_text_:\"72\"",
>       "group.limit": "2",
>       "group.field": "group2",
>       "group.sort": "score desc",
>       "group": "true",
>       "wt": "json",
>       "fq": "group2:72 OR group2:45"
>     }
>   },
>   "grouped": {
>     "group2": {
>       "matches": 567,
>       "groups": [
>         {
>           "groupValue": 72,
>           "doclist": {
>             "numFound": 562,
>             "start": 0,
>             "maxScore": 2.0378063,
>             "docs": [
>               { "id": "29!26551", "score": 2.0378063 },
>               { "id": "78!11462", "score": 2.0298104 }
>             ]
>           }
>         },
>         {
>           "groupValue": 45,
>           "doclist": {
>             "numFound": 5,
>             "start": 0,
>             "docs": [
>               { "id": "72!8569", "score": 1.8988966 },
>               { "id": "72!14075", "score": 1.5191172 }
>             ]
>           }
>         }
>       ]
>     }
>   }
> }
> {code}
> Looking into the issue, it comes from the fact that if a shard does not contain a document from that group, trying to merge its {{maxScore}} with real {{maxScore}} entries from other shards is invalid (it results in NaN).
> I'm attaching a patch containing a fix.
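The NaN behaviour described in the issue can be shown with plain floats. This is only a sketch under stated assumptions: it is not the code of TopGroups.merge or of the attached patch, the class MaxScoreMerge and both method names are hypothetical, and it assumes that a shard with no documents for a group reports its maxScore as NaN and that the merge is a simple maximum over shards. Returning NaN when no shard has a score is a choice made for this sketch only.

```java
/** Hypothetical illustration of merging per-shard maxScore values; not Lucene code. */
final class MaxScoreMerge {
    /** Naive merge: Math.max returns NaN as soon as either argument is NaN. */
    static float naiveMerge(float[] shardMaxScores) {
        float max = Float.NEGATIVE_INFINITY;
        for (float score : shardMaxScores) {
            max = Math.max(max, score);
        }
        return max;
    }

    /** NaN-aware merge: shards without a score are skipped instead of poisoning the result. */
    static float nanAwareMerge(float[] shardMaxScores) {
        float max = Float.NaN;
        for (float score : shardMaxScores) {
            if (Float.isNaN(score)) {
                continue; // this shard had no documents for the group
            }
            max = Float.isNaN(max) ? score : Math.max(max, score);
        }
        return max;
    }
}
```

Using the scores from the second group in the response above, naiveMerge over {1.8988966f, Float.NaN} gives NaN, which would explain a missing maxScore entry, while nanAwareMerge gives 1.8988966f.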
[jira] [Commented] (LUCENE-8996) maxScore is sometimes missing from distributed grouped responses
[ https://issues.apache.org/jira/browse/LUCENE-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953078#comment-16953078 ] Christine Poerschke commented on LUCENE-8996: - {quote}... If you merge two groups with no real maxScores the final result will be MIN_VALUE (NaN would make more sense imo) ... {quote} Yes, MIN_VALUE seems a quirky result for this edge case. Though if one were to change the existing behaviour it might be clearest to do that separately from the 'maxScore missing' fix here: here we are removing an erroneous case of 'maxScore missing' and changing away from MIN_VALUE would add a legitimate case of 'maxScore missing'. {quote}... this *should* never happen in theory because if no segment contains documents about group x it shouldn't be possible that we retrieve documents about group x in first place. ... {quote} I agree, in theory it should never happen though in practice I think there's a timing window of opportunity that could make it happen, though it would seem quite unlikely. The first pass of the distributed search could determine that there are segments with documents about group X but subsequently it could then be 'just so' that by the time the second pass of the search runs a few moments later the document(s) in group X have all been deleted? > maxScore is sometimes missing from distributed grouped responses > > > Key: LUCENE-8996 > URL: https://issues.apache.org/jira/browse/LUCENE-8996 > Project: Lucene - Core > Issue Type: Bug >Affects Versions: 5.3 >Reporter: Julien Massenet >Priority: Minor > Attachments: LUCENE-8996.patch, lucene_6_5-GroupingMaxScore.patch, > lucene_solr_5_3-GroupingMaxScore.patch, master-GroupingMaxScore.patch > > Time Spent: 10m > Remaining Estimate: 0h > > This issue occurs when using the grouping feature in distributed mode and > sorting by score. 
> Each group's {{docList}} in the response is supposed to contain a > {{maxScore}} entry that hold the maximum score for that group. Using the > current releases, it sometimes happens that this piece of information is not > included: > {code} > { > "responseHeader": { > "status": 0, > "QTime": 42, > "params": { > "sort": "score desc", > "fl": "id,score", > "q": "_text_:\"72\"", > "group.limit": "2", > "group.field": "group2", > "group.sort": "score desc", > "group": "true", > "wt": "json", > "fq": "group2:72 OR group2:45" > } > }, > "grouped": { > "group2": { > "matches": 567, > "groups": [ > { > "groupValue": 72, > "doclist": { > "numFound": 562, > "start": 0, > "maxScore": 2.0378063, > "docs": [ > { > "id": "29!26551", > "score": 2.0378063 > }, > { > "id": "78!11462", > "score": 2.0298104 > } > ] > } > }, > { > "groupValue": 45, > "doclist": { > "numFound": 5, > "start": 0, > "docs": [ > { > "id": "72!8569", > "score": 1.8988966 > }, > { > "id": "72!14075", > "score": 1.5191172 > } > ] > } > } > ] > } > } > } > {code} > Looking into the issue, it comes from the fact that if a shard does not > contain a document from that group, trying to merge its {{maxScore}} with > real {{maxScore}} entries from other shards is invalid (it results in NaN). > I'm attaching a patch containing a fix. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9010) extend TopGroups.merge test coverage
[ https://issues.apache.org/jira/browse/LUCENE-9010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953077#comment-16953077 ] Christine Poerschke commented on LUCENE-9010: - The attached proposed patch tries to reduce the "amount of numbers" in the test e.g. instead of integer group values 1 and 2 there's string group values "red" and "blue" and a narrative and local variable names (redAntScore, blueDragonflyScore, blueDragonflySize, redSquirrelScore, blueWhaleScore) try to make it easier to work out what the {{expectedMaxScore}} value is. > extend TopGroups.merge test coverage > > > Key: LUCENE-9010 > URL: https://issues.apache.org/jira/browse/LUCENE-9010 > Project: Lucene - Core > Issue Type: Sub-task >Reporter: Christine Poerschke >Priority: Minor > Attachments: LUCENE-9010.patch > > > This sub-task proposes to add test coverage for the {{TopGroups.merge}} > method, separately from but as preparation for LUCENE-8996 fixing the > 'maxScore is sometimes missing' bug. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (LUCENE-9010) extend TopGroups.merge test coverage
[ https://issues.apache.org/jira/browse/LUCENE-9010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christine Poerschke updated LUCENE-9010: Attachment: LUCENE-9010.patch > extend TopGroups.merge test coverage > > > Key: LUCENE-9010 > URL: https://issues.apache.org/jira/browse/LUCENE-9010 > Project: Lucene - Core > Issue Type: Sub-task >Reporter: Christine Poerschke >Priority: Minor > Attachments: LUCENE-9010.patch > > > This sub-task proposes to add test coverage for the {{TopGroups.merge}} > method, separately from but as preparation for LUCENE-8996 fixing the > 'maxScore is sometimes missing' bug. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (LUCENE-9010) extend TopGroups.merge test coverage
Christine Poerschke created LUCENE-9010: --- Summary: extend TopGroups.merge test coverage Key: LUCENE-9010 URL: https://issues.apache.org/jira/browse/LUCENE-9010 Project: Lucene - Core Issue Type: Sub-task Reporter: Christine Poerschke This sub-task proposes to add test coverage for the {{TopGroups.merge}} method, separately from but as preparation for LUCENE-8996 fixing the 'maxScore is sometimes missing' bug. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-13852) TestCloudNestedDocsSort can use the same uniqueKey for both a parent and child doc
[ https://issues.apache.org/jira/browse/SOLR-13852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris M. Hostetter updated SOLR-13852: -- Attachment: thetaphi_Lucene-Solr-master-Linux_24903.log.txt Status: Open (was: Open) attaching a jenkins log w/seed showing how this can cause failures due to the assertion logic introduced in SOLR-13851 > TestCloudNestedDocsSort can use the same uniqueKey for both a parent and > child doc > -- > > Key: SOLR-13852 > URL: https://issues.apache.org/jira/browse/SOLR-13852 > Project: Solr > Issue Type: Test > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Chris M. Hostetter >Priority: Major > Attachments: thetaphi_Lucene-Solr-master-Linux_24903.log.txt > > > TestCloudNestedDocsSort uses randomly generated "id" values for all docs, > which not only means that two "parent" docs can be indexed with the same "id" > value, but also that a child doc might be indexed with the same "id" value as > a parent doc. > While nothing in Solr actively prevents this, it's documented as something > people shouldn't do, and can cause problems. > In particular, this has caused some assertion failures for some test seeds > due to how it interacts with SOLR-13851 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (SOLR-13852) TestCloudNestedDocsSort can use the same uniqueKey for both a parent and child doc
Chris M. Hostetter created SOLR-13852: - Summary: TestCloudNestedDocsSort can use the same uniqueKey for both a parent and child doc Key: SOLR-13852 URL: https://issues.apache.org/jira/browse/SOLR-13852 Project: Solr Issue Type: Test Security Level: Public (Default Security Level. Issues are Public) Reporter: Chris M. Hostetter TestCloudNestedDocsSort uses randomly generated "id" values for all docs, which not only means that two "parent" docs can be indexed with the same "id" value, but also that a child doc might be indexed with the same "id" value as a parent doc. While nothing in Solr actively prevents this, it's documented as something people shouldn't do, and can cause problems. In particular, this has caused some assertion failures for some test seeds due to how it interacts with SOLR-13851 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
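One way a test could avoid the duplicate "id" values described in SOLR-13852 is to remember which ids it has already handed out. The sketch below is only an illustration under that assumption: UniqueIdSource and nextId are hypothetical names, not part of Solr or of TestCloudNestedDocsSort, and the actual fix may look quite different.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

/** Hypothetical helper, not part of Solr: hands out random ids that never repeat. */
final class UniqueIdSource {
  private final Random random;
  private final Set<String> used = new HashSet<>();

  UniqueIdSource(Random random) {
    this.random = random;
  }

  /** Returns a random id in [0, bound) that this source has not returned before. */
  String nextId(int bound) {
    if (used.size() >= bound) {
      throw new IllegalStateException("all " + bound + " ids are already in use");
    }
    String id;
    do {
      id = Integer.toString(random.nextInt(bound));
    } while (!used.add(id)); // Set.add returns false for a duplicate, so retry
    return id;
  }
}
```

Sharing one such source between parent and child docs would keep the ids random while ruling out a child reusing a parent's id.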
[jira] [Commented] (SOLR-13851) SolrIndexSearcher.getFirstMatch trips assertion if multiple matches
[ https://issues.apache.org/jira/browse/SOLR-13851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953070#comment-16953070 ] Chris M. Hostetter commented on SOLR-13851: --- Background: I recently noticed jenkins test failures from TestCloudNestedDocsSort that stemmed from this assertion error... {noformat} [junit4] 2> Server ErrorCaused by:java.lang.AssertionError [junit4] 2>at org.apache.solr.search.SolrIndexSearcher.lookupId(SolrIndexSearcher.java:710) [junit4] 2>at org.apache.solr.search.SolrIndexSearcher.getFirstMatch(SolrIndexSearcher.java:676) [junit4] 2>at org.apache.solr.handler.component.QueryComponent.doProcessSearchByIds(QueryComponent.java:1266) [junit4] 2>at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:351) [junit4] 2>at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:305) [junit4] 2>at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:198) [junit4] 2>at org.apache.solr.core.SolrCore.execute(SolrCore.java:2559) {noformat} At the core of the problem is that TestCloudNestedDocsSort does some things it shouldn't in terms of child doc uniqueKeys (which i'll track in a linked jira) ... but while using git bisect to identify when/where the failure was introduced, it identified GIT:1e63b32731bedf108aaeeb5d0a04d671f5663102 (SOLR-12366) as the first bad commit, and that's when i realized that prior to SOLR-12366 this (bad test) worked fine because {{getFirstMatch}} just did what it says: returned the first match (w/o complaining if there were multiples) > SolrIndexSearcher.getFirstMatch trips assertion if multiple matches > --- > > Key: SOLR-13851 > URL: https://issues.apache.org/jira/browse/SOLR-13851 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Chris M. Hostetter >Priority: Major > > the documentation for {{SolrIndexSearcher.getFirstMatch}} says... 
> {quote} > Returns the first document number containing the term t Returns > -1 if no document was found. This method is primarily intended for clients > that want to fetch documents using a unique identifier. > @return the first document number containing the term > {quote} > But SOLR-12366 refactored {{SolrIndexSearcher.getFirstMatch}} to eliminate > its previous implementation and replace it with a call to (a refactored > version of) {{SolrIndexSearcher.lookupId}} -- but the code in {{lookupId}} > was always designed *explicitly* for dealing with a uniqueKey field, and has > an assertion that once it finds a match _there will be no other matches in > the index_ > This means that even though {{getFirstMatch}} is _intended_ for fields that > are unique between documents, if it's used on a field that is not unique, it > can trip an assertion. > At a minimum we need to either "fix" {{getFirstMatch}} to behave as > documented, or fix its documentation. > Given that the current behavior has now been in place since Solr 7.4, and > given that all existing uses in "core" solr code are for looking up docs by > uniqueKey, it's probably best to simply fix the documentation, but we should > also consider replacing the assertion with an IllegalStateException, or > SolrException -- anything not dependent on having assertions enabled -- to > prevent silent bugs. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
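The closing suggestion in SOLR-13851, swapping the assertion for an exception that does not depend on assertions being enabled, can be illustrated with a small sketch. FirstMatchSketch and both of its methods are hypothetical stand-ins that take a precomputed list of matching doc ids; this is not the real SolrIndexSearcher code, and IllegalStateException is just one of the exception types the issue mentions.

```java
import java.util.List;

/** Hypothetical stand-in for a unique-key lookup; not the real SolrIndexSearcher code. */
final class FirstMatchSketch {

  /** Assert-based check: silently returns the first match when the JVM runs without -ea. */
  static int lookupWithAssert(List<Integer> matchingDocIds) {
    if (matchingDocIds.isEmpty()) {
      return -1;
    }
    assert matchingDocIds.size() == 1 : "more than one match";
    return matchingDocIds.get(0);
  }

  /** Explicit check: fails the same way whether or not assertions are enabled. */
  static int lookupStrict(List<Integer> matchingDocIds) {
    if (matchingDocIds.isEmpty()) {
      return -1;
    }
    if (matchingDocIds.size() > 1) {
      throw new IllegalStateException(
          "expected at most one match but found " + matchingDocIds.size());
    }
    return matchingDocIds.get(0);
  }
}
```

With assertions disabled, lookupWithAssert would hand back the first of several matches without complaint, which is the kind of silent bug the issue wants to prevent; lookupStrict reports the problem in every configuration.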
[jira] [Created] (SOLR-13851) SolrIndexSearcher.getFirstMatch trips assertion if multiple matches
Chris M. Hostetter created SOLR-13851: - Summary: SolrIndexSearcher.getFirstMatch trips assertion if multiple matches Key: SOLR-13851 URL: https://issues.apache.org/jira/browse/SOLR-13851 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Reporter: Chris M. Hostetter the documentation for {{SolrIndexSearcher.getFirstMatch}} says... {quote} Returns the first document number containing the term t Returns -1 if no document was found. This method is primarily intended for clients that want to fetch documents using a unique identifier. @return the first document number containing the term {quote} But SOLR-12366 refactored {{SolrIndexSearcher.getFirstMatch}} to eliminate its previous implementation and replace it with a call to (a refactored version of) {{SolrIndexSearcher.lookupId}} -- but the code in {{lookupId}} was always designed *explicitly* for dealing with a uniqueKey field, and has an assertion that once it finds a match _there will be no other matches in the index_ This means that even though {{getFirstMatch}} is _intended_ for fields that are unique between documents, if it's used on a field that is not unique, it can trip an assertion. At a minimum we need to either "fix" {{getFirstMatch}} to behave as documented, or fix its documentation. Given that the current behavior has now been in place since Solr 7.4, and given that all existing uses in "core" solr code are for looking up docs by uniqueKey, it's probably best to simply fix the documentation, but we should also consider replacing the assertion with an IllegalStateException, or SolrException -- anything not dependent on having assertions enabled -- to prevent silent bugs. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-12786) Upgrade refGuide build to Asciidoctor 20.10 and new link structure
[ https://issues.apache.org/jira/browse/SOLR-12786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cassandra Targett updated SOLR-12786: - Summary: Upgrade refGuide build to Asciidoctor 20.10 and new link structure (was: Upgrade refGuide build to Asciidoctor 1.5.7 and new link structure) > Upgrade refGuide build to Asciidoctor 20.10 and new link structure > -- > > Key: SOLR-12786 > URL: https://issues.apache.org/jira/browse/SOLR-12786 > Project: Solr > Issue Type: Improvement > Components: Build, documentation >Affects Versions: 8.0 >Reporter: Jan Høydahl >Assignee: Cassandra Targett >Priority: Major > Attachments: SOLR-12786.patch > > Time Spent: 10m > Remaining Estimate: 0h > > Currently the refguide build requires asciidoctor 1.5.6.2. > People using {{gem install jekyll-asciidoc}} will end up with version 1.5.7, > causing different header ID syntax and the build will break. > Long term we should move to latest asciidoctor. > It is already documented in README how to install the older 1.5.6.2 version. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-12786) Upgrade refGuide build to Asciidoctor 2.0.10 and new link structure
[ https://issues.apache.org/jira/browse/SOLR-12786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cassandra Targett updated SOLR-12786: - Summary: Upgrade refGuide build to Asciidoctor 2.0.10 and new link structure (was: Upgrade refGuide build to Asciidoctor 20.10 and new link structure) > Upgrade refGuide build to Asciidoctor 2.0.10 and new link structure > --- > > Key: SOLR-12786 > URL: https://issues.apache.org/jira/browse/SOLR-12786 > Project: Solr > Issue Type: Improvement > Components: Build, documentation >Affects Versions: 8.0 >Reporter: Jan Høydahl >Assignee: Cassandra Targett >Priority: Major > Attachments: SOLR-12786.patch > > Time Spent: 10m > Remaining Estimate: 0h > > Currently the refguide build requires asciidoctor 1.5.6.2. > People using {{gem install jekyll-asciidoc}} will end up with version 1.5.7, > causing different header ID syntax and the build will break. > Long term we should move to latest asciidoctor. > It is already documented in README how to install the older 1.5.6.2 version. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-12786) Upgrade refGuide build to Asciidoctor 1.5.7 and new link structure
[ https://issues.apache.org/jira/browse/SOLR-12786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953049#comment-16953049 ] ASF subversion and git services commented on SOLR-12786: Commit 802e97d6aa9806f495febc18790425cdcf12bece in lucene-solr's branch refs/heads/branch_8x from Cassandra Targett [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=802e97d ] SOLR-12786: Update Ref Guide build tool versions & fix section links for new format requirements > Upgrade refGuide build to Asciidoctor 1.5.7 and new link structure > -- > > Key: SOLR-12786 > URL: https://issues.apache.org/jira/browse/SOLR-12786 > Project: Solr > Issue Type: Improvement > Components: Build, documentation >Affects Versions: 8.0 >Reporter: Jan Høydahl >Assignee: Cassandra Targett >Priority: Major > Attachments: SOLR-12786.patch > > Time Spent: 10m > Remaining Estimate: 0h > > Currently the refguide build requires asciidoctor 1.5.6.2. > People using {{gem install jekyll-asciidoc}} will end up with version 1.5.7, > causing different header ID syntax and the build will break. > Long term we should move to latest asciidoctor. > It is already documented in README how to install the older 1.5.6.2 version. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-12786) Upgrade refGuide build to Asciidoctor 1.5.7 and new link structure
[ https://issues.apache.org/jira/browse/SOLR-12786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953050#comment-16953050 ] ASF subversion and git services commented on SOLR-12786: Commit dc47aa5b16f5cc75678070c5b1b5b7459b3690a4 in lucene-solr's branch refs/heads/branch_8x from Cassandra Targett [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=dc47aa5 ] SOLR-12786: add back explicit asciidoctor install for Jenkins build > Upgrade refGuide build to Asciidoctor 1.5.7 and new link structure > -- > > Key: SOLR-12786 > URL: https://issues.apache.org/jira/browse/SOLR-12786 > Project: Solr > Issue Type: Improvement > Components: Build, documentation >Affects Versions: 8.0 >Reporter: Jan Høydahl >Assignee: Cassandra Targett >Priority: Major > Attachments: SOLR-12786.patch > > Time Spent: 10m > Remaining Estimate: 0h > > Currently the refguide build requires asciidoctor 1.5.6.2. > People using {{gem install jekyll-asciidoc}} will end up with version 1.5.7, > causing different header ID syntax and the build will break. > Long term we should move to latest asciidoctor. > It is already documented in README how to install the older 1.5.6.2 version. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-12786) Upgrade refGuide build to Asciidoctor 1.5.7 and new link structure
[ https://issues.apache.org/jira/browse/SOLR-12786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953026#comment-16953026 ] ASF subversion and git services commented on SOLR-12786: Commit b3d59a7a8b5ed28ba985e54bcb7edd5c3b352302 in lucene-solr's branch refs/heads/master from Cassandra Targett [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=b3d59a7 ] SOLR-12786: add back explicit asciidoctor install for Jenkins build > Upgrade refGuide build to Asciidoctor 1.5.7 and new link structure > -- > > Key: SOLR-12786 > URL: https://issues.apache.org/jira/browse/SOLR-12786 > Project: Solr > Issue Type: Improvement > Components: Build, documentation >Affects Versions: 8.0 >Reporter: Jan Høydahl >Assignee: Cassandra Targett >Priority: Major > Attachments: SOLR-12786.patch > > Time Spent: 10m > Remaining Estimate: 0h > > Currently the refguide build requires asciidoctor 1.5.6.2. > People using {{gem install jekyll-asciidoc}} will end up with version 1.5.7, > causing different header ID syntax and the build will break. > Long term we should move to latest asciidoctor. > It is already documented in README how to install the older 1.5.6.2 version. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-9005) BooleanQuery.visit() incorrectly pulls subvisitors from its parent
[ https://issues.apache.org/jira/browse/LUCENE-9005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Woodward resolved LUCENE-9005. --- Fix Version/s: 8.3 master (9.0) Resolution: Fixed > BooleanQuery.visit() incorrectly pulls subvisitors from its parent > -- > > Key: LUCENE-9005 > URL: https://issues.apache.org/jira/browse/LUCENE-9005 > Project: Lucene - Core > Issue Type: Bug >Reporter: Alan Woodward >Assignee: Alan Woodward >Priority: Major > Fix For: master (9.0), 8.3 > > Attachments: LUCENE-9005.patch > > > BooleanQuery.visit() calls getSubVisitor once for each of its clause sets; > however, this sub visitor is called on the passed-in visitor, which means > that sub clauses get attached to its parent, rather than a visitor for that > particular BQ. > To illustrate, consider the following nested BooleanQuery: ("a b" (+c +d %e > f)); we have a top-level disjunction query containing one phrase query > (essentially a conjunction), and one boolean query containing both MUST, > FILTER and SHOULD clauses. When visiting, the top level query will pull a > SHOULD subvisitor, and pass both queries into it. The phrase query will pull > a MUST subvisitor and all its two terms. The nested boolean will pull a > MUST, and FILTER and a SHOULD; but these are all attached to the parent > SHOULD visitor - in particular, the MUST and FILTER clauses will end up being > attached to this SHOULD visitor, and be mis-interpreted as a disjunction. > To fix this, BQ should first pull a MUST visitor and visit its MUST clauses > using this visitor; SHOULD, FILTER and MUST_NOT clauses should then be pulled > from this top-level MUST visitor. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
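The difference between the behaviour described in LUCENE-9005 and the proposed fix can be seen in a toy model of the visitor tree. Everything below is hypothetical: VisitorSketch, Node, visitBuggy and visitFixed are illustrative names rather than Lucene's classes, and the model only handles MUST and SHOULD term lists for the nested query (+c +d f), leaving out the FILTER clause from the issue's example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical miniature of a query-visitor tree; these are not Lucene's classes. */
final class VisitorSketch {
  enum Occur { MUST, FILTER, SHOULD, MUST_NOT }

  /** Records its clause type, the terms visited directly, and any child visitors. */
  static final class Node {
    final Occur occur;
    final List<String> terms = new ArrayList<>();
    final List<Node> children = new ArrayList<>();

    Node(Occur occur) {
      this.occur = occur;
    }

    /** Creates a child visitor of the given type and attaches it to this node. */
    Node getSubVisitor(Occur childOccur) {
      Node child = new Node(childOccur);
      children.add(child);
      return child;
    }

    /** Renders the tree as TYPE[terms and children], space separated. */
    @Override
    public String toString() {
      List<String> parts = new ArrayList<>(terms);
      for (Node child : children) {
        parts.add(child.toString());
      }
      return occur + "[" + String.join(" ", parts) + "]";
    }
  }

  /** Behaviour described as the bug: every clause set hangs off the passed-in visitor. */
  static void visitBuggy(Node parent, List<String> must, List<String> should) {
    parent.getSubVisitor(Occur.MUST).terms.addAll(must);
    parent.getSubVisitor(Occur.SHOULD).terms.addAll(should);
  }

  /** Behaviour described as the fix: pull one MUST visitor for this query, nest the rest under it. */
  static void visitFixed(Node parent, List<String> must, List<String> should) {
    Node self = parent.getSubVisitor(Occur.MUST);
    self.terms.addAll(must);
    self.getSubVisitor(Occur.SHOULD).terms.addAll(should);
  }
}
```

Starting from a top-level SHOULD node, visitBuggy produces SHOULD[MUST[c d] SHOULD[f]], so the nested query's clauses are siblings under the parent and its own boundary is lost. visitFixed produces SHOULD[MUST[c d SHOULD[f]]], which keeps the nested query's clauses grouped under a visitor of their own.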
[jira] [Commented] (LUCENE-9005) BooleanQuery.visit() incorrectly pulls subvisitors from its parent
[ https://issues.apache.org/jira/browse/LUCENE-9005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952981#comment-16952981 ] ASF subversion and git services commented on LUCENE-9005: - Commit f7711d712472528b567ab975d0ed677bbd30ac12 in lucene-solr's branch refs/heads/master from Alan Woodward [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=f7711d7 ] LUCENE-9005: BooleanQuery.visit() pulls subvisitors from a top-level MUST visitor > BooleanQuery.visit() incorrectly pulls subvisitors from its parent > -- > > Key: LUCENE-9005 > URL: https://issues.apache.org/jira/browse/LUCENE-9005 > Project: Lucene - Core > Issue Type: Bug >Reporter: Alan Woodward >Assignee: Alan Woodward >Priority: Major > Attachments: LUCENE-9005.patch > > > BooleanQuery.visit() calls getSubVisitor once for each of its clause sets; > however, this sub visitor is called on the passed-in visitor, which means > that sub clauses get attached to its parent, rather than a visitor for that > particular BQ. > To illustrate, consider the following nested BooleanQuery: ("a b" (+c +d %e > f)); we have a top-level disjunction query containing one phrase query > (essentially a conjunction), and one boolean query containing both MUST, > FILTER and SHOULD clauses. When visiting, the top level query will pull a > SHOULD subvisitor, and pass both queries into it. The phrase query will pull > a MUST subvisitor and all its two terms. The nested boolean will pull a > MUST, and FILTER and a SHOULD; but these are all attached to the parent > SHOULD visitor - in particular, the MUST and FILTER clauses will end up being > attached to this SHOULD visitor, and be mis-interpreted as a disjunction. > To fix this, BQ should first pull a MUST visitor and visit its MUST clauses > using this visitor; SHOULD, FILTER and MUST_NOT clauses should then be pulled > from this top-level MUST visitor. 
[jira] [Commented] (LUCENE-9005) BooleanQuery.visit() incorrectly pulls subvisitors from its parent
[ https://issues.apache.org/jira/browse/LUCENE-9005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952980#comment-16952980 ] ASF subversion and git services commented on LUCENE-9005: - Commit 574e1e2d52d420dae41bee6b5d0e68799de8a1bd in lucene-solr's branch refs/heads/branch_8x from Alan Woodward [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=574e1e2 ] LUCENE-9005: BooleanQuery.visit() pulls subvisitors from a top-level MUST visitor > BooleanQuery.visit() incorrectly pulls subvisitors from its parent > -- > > Key: LUCENE-9005 > URL: https://issues.apache.org/jira/browse/LUCENE-9005 > Project: Lucene - Core > Issue Type: Bug >Reporter: Alan Woodward >Assignee: Alan Woodward >Priority: Major > Attachments: LUCENE-9005.patch > > > BooleanQuery.visit() calls getSubVisitor once for each of its clause sets; > however, this sub visitor is called on the passed-in visitor, which means > that sub clauses get attached to its parent, rather than a visitor for that > particular BQ. > To illustrate, consider the following nested BooleanQuery: ("a b" (+c +d %e > f)); we have a top-level disjunction query containing one phrase query > (essentially a conjunction), and one boolean query containing both MUST, > FILTER and SHOULD clauses. When visiting, the top level query will pull a > SHOULD subvisitor, and pass both queries into it. The phrase query will pull > a MUST subvisitor and all its two terms. The nested boolean will pull a > MUST, and FILTER and a SHOULD; but these are all attached to the parent > SHOULD visitor - in particular, the MUST and FILTER clauses will end up being > attached to this SHOULD visitor, and be mis-interpreted as a disjunction. > To fix this, BQ should first pull a MUST visitor and visit its MUST clauses > using this visitor; SHOULD, FILTER and MUST_NOT clauses should then be pulled > from this top-level MUST visitor. 
[jira] [Commented] (LUCENE-9005) BooleanQuery.visit() incorrectly pulls subvisitors from its parent
[ https://issues.apache.org/jira/browse/LUCENE-9005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952979#comment-16952979 ] ASF subversion and git services commented on LUCENE-9005: - Commit c19845775520108dce35feabfc081f606b34584f in lucene-solr's branch refs/heads/branch_8_3 from Alan Woodward [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=c198457 ] LUCENE-9005: BooleanQuery.visit() pulls subvisitors from a top-level MUST visitor > BooleanQuery.visit() incorrectly pulls subvisitors from its parent > -- > > Key: LUCENE-9005 > URL: https://issues.apache.org/jira/browse/LUCENE-9005 > Project: Lucene - Core > Issue Type: Bug >Reporter: Alan Woodward >Assignee: Alan Woodward >Priority: Major > Attachments: LUCENE-9005.patch > > > BooleanQuery.visit() calls getSubVisitor once for each of its clause sets; > however, this sub visitor is called on the passed-in visitor, which means > that sub clauses get attached to its parent, rather than a visitor for that > particular BQ. > To illustrate, consider the following nested BooleanQuery: ("a b" (+c +d %e > f)); we have a top-level disjunction query containing one phrase query > (essentially a conjunction), and one boolean query containing both MUST, > FILTER and SHOULD clauses. When visiting, the top level query will pull a > SHOULD subvisitor, and pass both queries into it. The phrase query will pull > a MUST subvisitor and all its two terms. The nested boolean will pull a > MUST, and FILTER and a SHOULD; but these are all attached to the parent > SHOULD visitor - in particular, the MUST and FILTER clauses will end up being > attached to this SHOULD visitor, and be mis-interpreted as a disjunction. > To fix this, BQ should first pull a MUST visitor and visit its MUST clauses > using this visitor; SHOULD, FILTER and MUST_NOT clauses should then be pulled > from this top-level MUST visitor. 
[jira] [Commented] (SOLR-12786) Upgrade refGuide build to Asciidoctor 1.5.7 and new link structure
[ https://issues.apache.org/jira/browse/SOLR-12786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952974#comment-16952974 ] ASF subversion and git services commented on SOLR-12786: Commit 621461fd1a51278c901399668c7d33a7474f4994 in lucene-solr's branch refs/heads/master from Cassandra Targett [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=621461f ] SOLR-12786: Update Ref Guide build tool versions & fix section links for new format requirements > Upgrade refGuide build to Asciidoctor 1.5.7 and new link structure > -- > > Key: SOLR-12786 > URL: https://issues.apache.org/jira/browse/SOLR-12786 > Project: Solr > Issue Type: Improvement > Components: Build, documentation >Affects Versions: 8.0 >Reporter: Jan Høydahl >Assignee: Cassandra Targett >Priority: Major > Attachments: SOLR-12786.patch > > Time Spent: 10m > Remaining Estimate: 0h > > Currently the refguide build requires asciidoctor 1.5.6.2. > People using {{gem install jekyll-asciidoc}} will end up with version 1.5.7, > causing different header ID syntax and the build will break. > Long term we should move to latest asciidoctor. > It is already documented in README how to install the older 1.5.6.2 version. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13808) Query DSL should let to cache filter
[ https://issues.apache.org/jira/browse/SOLR-13808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952969#comment-16952969 ] Mikhail Khludnev commented on SOLR-13808: - Ok. It seems like the plan is to # create a {!cache} query parser and hook it up via the existing DSL. The caveat for users is losing scoring. # enable cache by default for {!bool filter=... filter=..} # make sure that it is sensitive to the {!cache=false} local param for enclosing queries I'm fine with it and patches are welcome. > Query DSL should let to cache filter > > > Key: SOLR-13808 > URL: https://issues.apache.org/jira/browse/SOLR-13808 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Mikhail Khludnev >Priority: Major > > Query DSL lets one express Lucene BQ's filter > > {code:java} > { query: {bool: { filter: {term: {f:name,query:"foo bar"}}} }}{code} > However, one may easily need to cache it in the filter cache. This > might rely on ExtensibleQuery and QParser: > {code:java} > { query: {bool: { filter: {term: {f:name,query:"foo bar", cache:true}}} }} > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13677) All Metrics Gauges should be unregistered by the objects that registered them
[ https://issues.apache.org/jira/browse/SOLR-13677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952968#comment-16952968 ] Andrzej Bialecki commented on SOLR-13677: - [~noble.paul] I would appreciate your review. > All Metrics Gauges should be unregistered by the objects that registered them > - > > Key: SOLR-13677 > URL: https://issues.apache.org/jira/browse/SOLR-13677 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: metrics >Reporter: Noble Paul >Assignee: Andrzej Bialecki >Priority: Blocker > Fix For: 8.3 > > Attachments: SOLR-13677.patch > > Time Spent: 3h 50m > Remaining Estimate: 0h > > The life cycle of Metrics producers are managed by the core (mostly). So, if > the lifecycle of the object is different from that of the core itself, these > objects will never be unregistered from the metrics registry. This will lead > to memory leaks -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
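The principle in the SOLR-13677 title, that whoever registers a gauge should also unregister it, might be sketched as follows. GaugeRegistrySketch and GaugeOwnerSketch are hypothetical names invented for this illustration; they are not Solr's metrics API, and the attached patch may take a different approach.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Supplier;

/** Hypothetical long-lived registry; not Solr's metrics API. */
final class GaugeRegistrySketch {
  private final Map<String, Supplier<Object>> gauges = new HashMap<>();

  void register(String name, Supplier<Object> gauge) {
    gauges.put(name, gauge);
  }

  void unregister(String name) {
    gauges.remove(name);
  }

  int size() {
    return gauges.size();
  }
}

/** Hypothetical short-lived producer that cleans up exactly the gauges it registered. */
final class GaugeOwnerSketch implements AutoCloseable {
  private final GaugeRegistrySketch registry;
  private final Set<String> registered = new HashSet<>();

  GaugeOwnerSketch(GaugeRegistrySketch registry) {
    this.registry = registry;
  }

  void addGauge(String name, Supplier<Object> gauge) {
    registry.register(name, gauge);
    registered.add(name); // remember the name so close() can remove it later
  }

  @Override
  public void close() {
    for (String name : registered) {
      registry.unregister(name);
    }
    registered.clear();
  }
}
```

Because the owner tracks its own names, closing it removes only its gauges, and the registry no longer holds references that would keep the closed object alive.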
[GitHub] [lucene-solr] iverase commented on issue #865: LUCENE-8973: XYRectangle2D should work on float space
iverase commented on issue #865: LUCENE-8973: XYRectangle2D should work on float space URL: https://github.com/apache/lucene-solr/pull/865#issuecomment-542773405 I have updated the PR so the XYRectangle2D is now a Component2D This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13845) DELETEREPLICA API by "count" and "type"
[ https://issues.apache.org/jira/browse/SOLR-13845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952956#comment-16952956 ] Amrit Sarkar commented on SOLR-13845: - Uploaded clean PATCH for the improvement. Requesting feedback. > DELETEREPLICA API by "count" and "type" > --- > > Key: SOLR-13845 > URL: https://issues.apache.org/jira/browse/SOLR-13845 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Amrit Sarkar >Priority: Major > Attachments: SOLR-13845.patch > > > SOLR-9319 added support for deleting replicas by count. It would be great to > have the feature with added functionality the type of replica we want to > delete like we add replicas by count and type. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-13845) DELETEREPLICA API by "count" and "type"
[ https://issues.apache.org/jira/browse/SOLR-13845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amrit Sarkar updated SOLR-13845: Attachment: (was: STAR-13845.patch) > DELETEREPLICA API by "count" and "type" > --- > > Key: SOLR-13845 > URL: https://issues.apache.org/jira/browse/SOLR-13845 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Amrit Sarkar >Priority: Major > Attachments: SOLR-13845.patch > > > SOLR-9319 added support for deleting replicas by count. It would be great to > have the feature with added functionality the type of replica we want to > delete like we add replicas by count and type. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-13845) DELETEREPLICA API by "count" and "type"
[ https://issues.apache.org/jira/browse/SOLR-13845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amrit Sarkar updated SOLR-13845: Attachment: SOLR-13845.patch > DELETEREPLICA API by "count" and "type" > --- > > Key: SOLR-13845 > URL: https://issues.apache.org/jira/browse/SOLR-13845 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Amrit Sarkar >Priority: Major > Attachments: SOLR-13845.patch, STAR-13845.patch > > > SOLR-9319 added support for deleting replicas by count. It would be great to > have the feature with added functionality the type of replica we want to > delete like we add replicas by count and type. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] cbuescher commented on a change in pull request #913: LUCENE-8995: TopSuggestDocsCollector#collect should be able to signal rejection
cbuescher commented on a change in pull request #913: LUCENE-8995: TopSuggestDocsCollector#collect should be able to signal rejection URL: https://github.com/apache/lucene-solr/pull/913#discussion_r335541921 ## File path: lucene/suggest/src/test/org/apache/lucene/search/suggest/document/TestPrefixCompletionQuery.java ## @@ -253,6 +263,123 @@ public void testDocFiltering() throws Exception { iw.close(); } + /** + * Test that the correct amount of documents are collected if using a collector that also rejects documents. + */ + public void testCollectorThatRejects() throws Exception { +// use synonym analyzer to have multiple paths to same suggested document. This mock adds "dog" as synonym for "dogs" +Analyzer analyzer = new MockSynonymAnalyzer(); +RandomIndexWriter iw = new RandomIndexWriter(random(), dir, iwcWithSuggestField(analyzer, "suggest_field")); +List expectedResults = new ArrayList(); + +for (int docCount = 10; docCount > 0; docCount--) { + Document document = new Document(); + String value = "ab" + docCount + " dogs"; + document.add(new SuggestField("suggest_field", value, docCount)); + expectedResults.add(new Entry(value, docCount)); + iw.addDocument(document); +} + +if (rarely()) { + iw.commit(); +} + +DirectoryReader reader = iw.getReader(); +SuggestIndexSearcher indexSearcher = new SuggestIndexSearcher(reader); + +PrefixCompletionQuery query = new PrefixCompletionQuery(analyzer, new Term("suggest_field", "ab")); +int topN = 5; + +// use a TopSuggestDocsCollector that rejects results with duplicate docIds +TopSuggestDocsCollector collector = new TopSuggestDocsCollector(topN, false) { + + private Set seenDocIds = new HashSet<>(); + + @Override + public boolean collect(int docID, CharSequence key, CharSequence context, float score) throws IOException { + int globalDocId = docID + docBase; + boolean collected = false; + if (seenDocIds.contains(globalDocId) == false) { + super.collect(docID, key, context, score); + seenDocIds.add(globalDocId); + collected 
= true; + } + return collected; + } + + @Override + protected boolean canReject() { +return true; + } +}; + +indexSearcher.suggest(query, collector); +TopSuggestDocs suggestions = collector.get(); +assertSuggestions(suggestions, expectedResults.subList(0, topN).toArray(new Entry[0])); +assertTrue(suggestions.isComplete()); Review comment: I extended the existing test to the case where we try to get the top 10. In this case the queue would have a max depth of 15 and the reject count is 9. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] johtani commented on issue #935: LUCENE-4056: Japanese Tokenizer (Kuromoji) cannot build UniDic dictionary
johtani commented on issue #935: LUCENE-4056: Japanese Tokenizer (Kuromoji) cannot build UniDic dictionary URL: https://github.com/apache/lucene-solr/pull/935#issuecomment-542749849 Here is the output of `ant clean; ant build-dict` with ipadic: https://gist.github.com/johtani/b53e9e241e5b98519fb3ffe12b4164eb And here is the output with unidic, along with the `build.xml`: https://gist.github.com/johtani/91cfd2753aba2e001c1d39f47666ada7
[jira] [Updated] (SOLR-13847) Fix ref guide for autoscaling metric trigger
[ https://issues.apache.org/jira/browse/SOLR-13847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ROCHETEAU Antoine updated SOLR-13847: - Component/s: AutoScaling > Fix ref guide for autoscaling metric trigger > > > Key: SOLR-13847 > URL: https://issues.apache.org/jira/browse/SOLR-13847 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: AutoScaling, documentation >Affects Versions: 7.7.2, 8.2 >Reporter: ROCHETEAU Antoine >Priority: Minor > Attachments: metric_trigger_documentation.patch > > > Reported in the IRC channel, where I was asked to raise an issue. > The documentation for the autoscaling metric trigger has an error in the > description (it is not possible to set up a basic metric trigger by following the > current documentation). > [https://lucene.apache.org/solr/guide/8_1/solrcloud-autoscaling-triggers.html#metric-trigger] > metric:_group_:_prefix_ should be replaced by > metrics:_group_:_prefix_ (note the added "s") > This correction is also required in the example: > {{metrics:solr.node:CONTAINER.fs.coreRoot.usableSpace}} > This is confirmed by the source code, which explicitly uses "metrics:" (see > for example: org.apache.solr.cloud.autoscaling.sim.SimNodeStateProvider or > org.apache.solr.cloud.autoscaling.MetricTriggerIntegrationTest)
[GitHub] [lucene-solr] cbuescher commented on a change in pull request #913: LUCENE-8995: TopSuggestDocsCollector#collect should be able to signal rejection
cbuescher commented on a change in pull request #913: LUCENE-8995: TopSuggestDocsCollector#collect should be able to signal rejection URL: https://github.com/apache/lucene-solr/pull/913#discussion_r335502868 ## File path: lucene/suggest/src/test/org/apache/lucene/search/suggest/document/TestPrefixCompletionQuery.java ## @@ -253,6 +263,123 @@ public void testDocFiltering() throws Exception { iw.close(); } + /** + * Test that the correct amount of documents are collected if using a collector that also rejects documents. + */ + public void testCollectorThatRejects() throws Exception { +// use synonym analyzer to have multiple paths to same suggested document. This mock adds "dog" as synonym for "dogs" +Analyzer analyzer = new MockSynonymAnalyzer(); +RandomIndexWriter iw = new RandomIndexWriter(random(), dir, iwcWithSuggestField(analyzer, "suggest_field")); +List expectedResults = new ArrayList(); + +for (int docCount = 10; docCount > 0; docCount--) { + Document document = new Document(); + String value = "ab" + docCount + " dogs"; + document.add(new SuggestField("suggest_field", value, docCount)); + expectedResults.add(new Entry(value, docCount)); + iw.addDocument(document); +} + +if (rarely()) { + iw.commit(); +} + +DirectoryReader reader = iw.getReader(); +SuggestIndexSearcher indexSearcher = new SuggestIndexSearcher(reader); + +PrefixCompletionQuery query = new PrefixCompletionQuery(analyzer, new Term("suggest_field", "ab")); +int topN = 5; + +// use a TopSuggestDocsCollector that rejects results with duplicate docIds +TopSuggestDocsCollector collector = new TopSuggestDocsCollector(topN, false) { + + private Set seenDocIds = new HashSet<>(); + + @Override + public boolean collect(int docID, CharSequence key, CharSequence context, float score) throws IOException { + int globalDocId = docID + docBase; + boolean collected = false; + if (seenDocIds.contains(globalDocId) == false) { + super.collect(docID, key, context, score); + seenDocIds.add(globalDocId); + collected 
= true; + } + return collected; + } + + @Override + protected boolean canReject() { +return true; + } +}; + +indexSearcher.suggest(query, collector); +TopSuggestDocs suggestions = collector.get(); +assertSuggestions(suggestions, expectedResults.subList(0, topN).toArray(new Entry[0])); +assertTrue(suggestions.isComplete()); Review comment: This will happen if the estimated queue size in `NRTSuggester#lookup` is not large enough to account for all rejected documents, e.g. when in this particular test we try to get the top 5 of only 5 documents. In that case the queue size heuristic in `NRTSuggester#getMaxTopNSearcherQueueSize` will only size the queue to 7 (topN + numDocs/2), which is less than the number of topN + rejections, so the TopResults returned will have the `isComplete` flag set. I can add that case to the existing test if this helps.
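The arithmetic behind the two queue sizes quoted in this review thread (7 and 15) can be checked with a small sketch. It assumes only the simplified formula stated in the comment, topN + numDocs/2; the class and method names are hypothetical, and the real `NRTSuggester#getMaxTopNSearcherQueueSize` may take further factors into account.

```java
public class QueueSizeSketch {
    // Simplified heuristic as quoted in the review comment: topN + numDocs/2.
    // This is an assumption-based restatement, not the actual Lucene method.
    static int estimatedQueueSize(int topN, int numDocs) {
        return topN + numDocs / 2; // integer division, as with Java ints
    }

    public static void main(String[] args) {
        // top 5 of 5 documents
        System.out.println(estimatedQueueSize(5, 5));
        // top 10 of 10 documents
        System.out.println(estimatedQueueSize(10, 10));
    }
}
```

If the number of rejected entries plus topN exceeds this estimate, the queue can run out before topN accepted results are found, which is the situation the comment describes.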
[jira] [Commented] (LUCENE-9006) Ensure WordDelimiterGraphFilter always emits catenateAll token early
[ https://issues.apache.org/jira/browse/LUCENE-9006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952867#comment-16952867 ] David Smiley commented on LUCENE-9006: -- BTW this issue also fixes a bug in the offsets. The previous behavior resulted in the token "8other" having a start offset of 2 because it followed the token "other", whose start offset is, and should be, 2. Now that "8other" is emitted earlier, it can have the start offset it should have: 0. I was thinking about the core of the change here, which makes the sort consider the offset-based length. I think it's simpler/faster and perhaps more correct to just use the start offset. This change passes the tests, so I'm inclined to push that. > Ensure WordDelimiterGraphFilter always emits catenateAll token early > > > Key: LUCENE-9006 > URL: https://issues.apache.org/jira/browse/LUCENE-9006 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/analysis >Reporter: David Smiley >Assignee: David Smiley >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Ideally, the first token of WDGF is the preserveOriginal (if configured to > emit), and the second should be the catenateAll (if configured to emit). The > deprecated WDF does this but WDGF can sometimes put the first other token > earlier when there is a non-emitted candidate sub-token. > Example input "8-other" when only generateWordParts and catenateAll -- *not* > generateNumberParts. WDGF internally sees the '8' but moves on. Ultimately, > the "other" token and the catenated "8other" will appear at the same internal > position, which by luck fools the sorter into emitting "other" first.
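The ordering idea in this comment (break ties between tokens at the same position by start offset) can be illustrated with a minimal sketch. The `Tok` class, its fields, and the comparator are hypothetical and are not WordDelimiterGraphFilter's internal sorter; the offsets 0 and 2 are the ones mentioned in the comment for input "8-other".

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class TokenOrderSketch {
    // Hypothetical token holder for illustration; not a Lucene class.
    static final class Tok {
        final String text;
        final int position;
        final int startOffset;

        Tok(String text, int position, int startOffset) {
            this.text = text;
            this.position = position;
            this.startOffset = startOffset;
        }
    }

    // Sort by position first, then by start offset, and return the token texts.
    static List<String> order(List<Tok> toks) {
        List<Tok> sorted = new ArrayList<>(toks);
        sorted.sort(Comparator.comparingInt((Tok t) -> t.position)
                .thenComparingInt(t -> t.startOffset));
        List<String> out = new ArrayList<>();
        for (Tok t : sorted) {
            out.add(t.text);
        }
        return out;
    }

    // "other" and the catenated "8other" share a position; only offsets differ.
    static List<String> demo() {
        List<Tok> toks = new ArrayList<>();
        toks.add(new Tok("other", 0, 2));
        toks.add(new Tok("8other", 0, 0));
        return order(toks);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

With the start offset as the tie-breaker, the catenated token no longer depends on insertion order to come out first.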
[jira] [Commented] (SOLR-13105) A visual guide to Solr Math Expressions and Streaming Expressions
[ https://issues.apache.org/jira/browse/SOLR-13105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952862#comment-16952862 ] ASF subversion and git services commented on SOLR-13105: Commit 3a695853755fae8eaef06c8c37689308d93157f2 in lucene-solr's branch refs/heads/visual-guide from Joel Bernstein [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=3a69585 ] SOLR-13105: The Visual Guide to Streaming Expressions and Math Expressions > A visual guide to Solr Math Expressions and Streaming Expressions > - > > Key: SOLR-13105 > URL: https://issues.apache.org/jira/browse/SOLR-13105 > Project: Solr > Issue Type: New Feature >Reporter: Joel Bernstein >Assignee: Joel Bernstein >Priority: Major > Attachments: Screen Shot 2019-01-14 at 10.56.32 AM.png, Screen Shot > 2019-02-21 at 2.14.43 PM.png, Screen Shot 2019-03-03 at 2.28.35 PM.png, > Screen Shot 2019-03-04 at 7.47.57 PM.png, Screen Shot 2019-03-13 at 10.47.47 > AM.png, Screen Shot 2019-03-30 at 6.17.04 PM.png > > > Visualization is now a fundamental element of Solr Streaming Expressions and > Math Expressions. This ticket will create a visual guide to Solr Math > Expressions and Solr Streaming Expressions that includes *Apache Zeppelin* > visualization examples. > It will also cover using the JDBC expression to *analyze* and *visualize* > results from any JDBC compliant data source. > Intro from the guide: > {code:java} > Streaming Expressions exposes the capabilities of Solr Cloud as composable > functions. These functions provide a system for searching, transforming, > analyzing and visualizing data stored in Solr Cloud collections. > At a high level there are four main capabilities that will be explored in the > documentation: > * Searching, sampling and aggregating results from Solr. > * Transforming result sets after they are retrieved from Solr. > * Analyzing and modeling result sets using probability and statistics and > machine learning libraries. 
> * Visualizing result sets, aggregations and statistical models of the data. > {code} > > A few sample visualizations are attached to the ticket. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] janhoy commented on a change in pull request #946: SOLR-13835 HttpSolrCall produces incorrect extra AuditEvent on AuthorizationResponse.PROMPT
janhoy commented on a change in pull request #946: SOLR-13835 HttpSolrCall produces incorrect extra AuditEvent on AuthorizationResponse.PROMPT URL: https://github.com/apache/lucene-solr/pull/946#discussion_r335483832 ## File path: solr/core/src/test/org/apache/solr/security/BasicAuthIntegrationTest.java ## @@ -232,7 +234,7 @@ public void testBasicAuth() throws Exception { HttpSolrClient.RemoteSolrException e = expectThrows(HttpSolrClient.RemoteSolrException.class, () -> { new UpdateRequest().deleteByQuery("*:*").process(aNewClient, COLLECTION); }); -assertTrue(e.getMessage().contains("Unauthorized request")); +assertTrue(e.getMessage(), e.getMessage().contains("Authentication failed")); Review comment: Earlier both 401 and 403 responses would print the text "Unauthorized request" due to the fall-through. After this fix we also changed the text for 401 response, making this test fail. Don't know why Authorization plugin returns 401 though, the password is correct.. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-13824) JSON Request API ignores prematurely closing curly brace.
[ https://issues.apache.org/jira/browse/SOLR-13824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Khludnev updated SOLR-13824: Attachment: (was: SOLR-13824.patch) > JSON Request API ignores prematurely closing curly brace. > -- > > Key: SOLR-13824 > URL: https://issues.apache.org/jira/browse/SOLR-13824 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: JSON Request API >Reporter: Mikhail Khludnev >Priority: Major > Attachments: SOLR-13824.patch, SOLR-13824.patch > > > {code:java} > json={query:"content:foo", facet:{zz:{field:id}}} > {code} > This works fine, but if we mistype {{}}} instead of {{,}} > {code:java} > json={query:"content:foo"} facet:{zz:{field:id}}} > {code} > the request is captured only partially. Here is what we see under debug: > {code:java} > "json":{"query":"content:foo"}, > {code} > I suppose it should throw an error with a 400 code.
[jira] [Updated] (SOLR-13824) JSON Request API ignores prematurely closing curly brace.
[ https://issues.apache.org/jira/browse/SOLR-13824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Khludnev updated SOLR-13824: Attachment: SOLR-13824.patch > JSON Request API ignores prematurely closing curly brace. > -- > > Key: SOLR-13824 > URL: https://issues.apache.org/jira/browse/SOLR-13824 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: JSON Request API >Reporter: Mikhail Khludnev >Priority: Major > Attachments: SOLR-13824.patch, SOLR-13824.patch > > > {code:java} > json={query:"content:foo", facet:{zz:{field:id}}} > {code} > This works fine, but if we mistype {{}}} instead of {{,}} > {code:java} > json={query:"content:foo"} facet:{zz:{field:id}}} > {code} > the request is captured only partially. Here is what we see under debug: > {code:java} > "json":{"query":"content:foo"}, > {code} > I suppose it should throw an error with a 400 code.
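The failure mode described in SOLR-13824 (input left over after the first top-level object closes) can be detected with a simple brace-depth scan. The sketch below is only an illustration under that assumption; the class and method names are hypothetical, and it is not how Solr's JSON Request API parses its input.

```java
public class TrailingJsonSketch {
    // Returns the index just past the brace that closes the first top-level
    // object, or -1 if no top-level object is closed.
    static int endOfFirstObject(String s) {
        int depth = 0;
        boolean inString = false;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (inString) {
                if (c == '\\') {
                    i++; // skip the escaped character
                } else if (c == '"') {
                    inString = false;
                }
            } else if (c == '"') {
                inString = true;
            } else if (c == '{') {
                depth++;
            } else if (c == '}') {
                depth--;
                if (depth == 0) {
                    return i + 1;
                }
            }
        }
        return -1;
    }

    // True if non-whitespace input remains after the first object closes.
    static boolean hasTrailingContent(String s) {
        int end = endOfFirstObject(s);
        return end >= 0 && !s.substring(end).trim().isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(hasTrailingContent("{query:\"content:foo\", facet:{zz:{field:id}}}"));
        System.out.println(hasTrailingContent("{query:\"content:foo\"} facet:{zz:{field:id}}}"));
    }
}
```

A request handler could use a check like this to reject the mistyped request instead of silently keeping only the first object.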
[jira] [Commented] (SOLR-13105) A visual guide to Solr Math Expressions and Streaming Expressions
[ https://issues.apache.org/jira/browse/SOLR-13105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952800#comment-16952800 ] ASF subversion and git services commented on SOLR-13105: Commit 16dfdbca48f54baacf06a4ac68c75ca2841d9d34 in lucene-solr's branch refs/heads/SOLR-13105-visual from Joel Bernstein [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=16dfdbc ] SOLR-13105: Improve curve fitting docs 5 > A visual guide to Solr Math Expressions and Streaming Expressions > - > > Key: SOLR-13105 > URL: https://issues.apache.org/jira/browse/SOLR-13105 > Project: Solr > Issue Type: New Feature >Reporter: Joel Bernstein >Assignee: Joel Bernstein >Priority: Major > Attachments: Screen Shot 2019-01-14 at 10.56.32 AM.png, Screen Shot > 2019-02-21 at 2.14.43 PM.png, Screen Shot 2019-03-03 at 2.28.35 PM.png, > Screen Shot 2019-03-04 at 7.47.57 PM.png, Screen Shot 2019-03-13 at 10.47.47 > AM.png, Screen Shot 2019-03-30 at 6.17.04 PM.png > > > Visualization is now a fundamental element of Solr Streaming Expressions and > Math Expressions. This ticket will create a visual guide to Solr Math > Expressions and Solr Streaming Expressions that includes *Apache Zeppelin* > visualization examples. > It will also cover using the JDBC expression to *analyze* and *visualize* > results from any JDBC compliant data source. > Intro from the guide: > {code:java} > Streaming Expressions exposes the capabilities of Solr Cloud as composable > functions. These functions provide a system for searching, transforming, > analyzing and visualizing data stored in Solr Cloud collections. > At a high level there are four main capabilities that will be explored in the > documentation: > * Searching, sampling and aggregating results from Solr. > * Transforming result sets after they are retrieved from Solr. > * Analyzing and modeling result sets using probability and statistics and > machine learning libraries. 
> * Visualizing result sets, aggregations and statistical models of the data. > {code} > > A few sample visualizations are attached to the ticket. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13105) A visual guide to Solr Math Expressions and Streaming Expressions
[ https://issues.apache.org/jira/browse/SOLR-13105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952797#comment-16952797 ] ASF subversion and git services commented on SOLR-13105: Commit 11cc8460b5e90ebf8360a3b71f14794afcd2a7c8 in lucene-solr's branch refs/heads/SOLR-13105-visual from Joel Bernstein [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=11cc846 ] SOLR-13105: Improve curve fitting docs 4 > A visual guide to Solr Math Expressions and Streaming Expressions > - > > Key: SOLR-13105 > URL: https://issues.apache.org/jira/browse/SOLR-13105 > Project: Solr > Issue Type: New Feature >Reporter: Joel Bernstein >Assignee: Joel Bernstein >Priority: Major > Attachments: Screen Shot 2019-01-14 at 10.56.32 AM.png, Screen Shot > 2019-02-21 at 2.14.43 PM.png, Screen Shot 2019-03-03 at 2.28.35 PM.png, > Screen Shot 2019-03-04 at 7.47.57 PM.png, Screen Shot 2019-03-13 at 10.47.47 > AM.png, Screen Shot 2019-03-30 at 6.17.04 PM.png > > > Visualization is now a fundamental element of Solr Streaming Expressions and > Math Expressions. This ticket will create a visual guide to Solr Math > Expressions and Solr Streaming Expressions that includes *Apache Zeppelin* > visualization examples. > It will also cover using the JDBC expression to *analyze* and *visualize* > results from any JDBC compliant data source. > Intro from the guide: > {code:java} > Streaming Expressions exposes the capabilities of Solr Cloud as composable > functions. These functions provide a system for searching, transforming, > analyzing and visualizing data stored in Solr Cloud collections. > At a high level there are four main capabilities that will be explored in the > documentation: > * Searching, sampling and aggregating results from Solr. > * Transforming result sets after they are retrieved from Solr. > * Analyzing and modeling result sets using probability and statistics and > machine learning libraries. 
> * Visualizing result sets, aggregations and statistical models of the data. > {code} > > A few sample visualizations are attached to the ticket. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9006) Ensure WordDelimiterGraphFilter always emits catenateAll token early
[ https://issues.apache.org/jira/browse/LUCENE-9006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952799#comment-16952799 ] David Wayne Smiley commented on LUCENE-9006: Thanks for the explanation RE graphOffsetsAreCorrect. I guess there is no new concern with the PR then. I discovered this problem due to a custom filter that directly collaborates with a delegated WDGF instance. It assumes the first two tokens are preserveOriginal then catenateAll. This was the case with the now deprecated WDF. It's intuitive too, so it "looks" odd when it doesn't happen. I noticed in LUCENE-8730 a precedent for making the token orderings consistent, which makes sense to me. > Ensure WordDelimiterGraphFilter always emits catenateAll token early > > > Key: LUCENE-9006 > URL: https://issues.apache.org/jira/browse/LUCENE-9006 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/analysis >Reporter: David Wayne Smiley >Assignee: David Wayne Smiley >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Ideally, the first token of WDGF is the preserveOriginal (if configured to > emit), and the second should be the catenateAll (if configured to emit). The > deprecated WDF does this but WDGF can sometimes put the first other token > earlier when there is a non-emitted candidate sub-token. > Example input "8-other" when only generateWordParts and catenateAll -- *not* > generateNumberParts. WDGF internally sees the '8' but moves on. Ultimately, > the "other" token and the catenated "8other" will appear at the same internal > position, which by luck fools the sorter into emitting "other" first.
[jira] [Commented] (SOLR-13105) A visual guide to Solr Math Expressions and Streaming Expressions
[ https://issues.apache.org/jira/browse/SOLR-13105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952795#comment-16952795 ] ASF subversion and git services commented on SOLR-13105: Commit 48d9c76bc5c9efa9dfecd7a81783c753fef3bcd1 in lucene-solr's branch refs/heads/SOLR-13105-visual from Joel Bernstein [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=48d9c76 ] SOLR-13105: Improve curve fitting docs 3 > A visual guide to Solr Math Expressions and Streaming Expressions > - > > Key: SOLR-13105 > URL: https://issues.apache.org/jira/browse/SOLR-13105 > Project: Solr > Issue Type: New Feature >Reporter: Joel Bernstein >Assignee: Joel Bernstein >Priority: Major > Attachments: Screen Shot 2019-01-14 at 10.56.32 AM.png, Screen Shot > 2019-02-21 at 2.14.43 PM.png, Screen Shot 2019-03-03 at 2.28.35 PM.png, > Screen Shot 2019-03-04 at 7.47.57 PM.png, Screen Shot 2019-03-13 at 10.47.47 > AM.png, Screen Shot 2019-03-30 at 6.17.04 PM.png > > > Visualization is now a fundamental element of Solr Streaming Expressions and > Math Expressions. This ticket will create a visual guide to Solr Math > Expressions and Solr Streaming Expressions that includes *Apache Zeppelin* > visualization examples. > It will also cover using the JDBC expression to *analyze* and *visualize* > results from any JDBC compliant data source. > Intro from the guide: > {code:java} > Streaming Expressions exposes the capabilities of Solr Cloud as composable > functions. These functions provide a system for searching, transforming, > analyzing and visualizing data stored in Solr Cloud collections. > At a high level there are four main capabilities that will be explored in the > documentation: > * Searching, sampling and aggregating results from Solr. > * Transforming result sets after they are retrieved from Solr. > * Analyzing and modeling result sets using probability and statistics and > machine learning libraries. 
> * Visualizing result sets, aggregations and statistical models of the data. > {code} > > A few sample visualizations are attached to the ticket. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13105) A visual guide to Solr Math Expressions and Streaming Expressions
[ https://issues.apache.org/jira/browse/SOLR-13105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952793#comment-16952793 ] ASF subversion and git services commented on SOLR-13105: Commit 623a026321ad8746265e8c4526423ec29e321c7f in lucene-solr's branch refs/heads/SOLR-13105-visual from Joel Bernstein [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=623a026 ] SOLR-13105: Improve curve fitting docs 2 > A visual guide to Solr Math Expressions and Streaming Expressions > - > > Key: SOLR-13105 > URL: https://issues.apache.org/jira/browse/SOLR-13105 > Project: Solr > Issue Type: New Feature >Reporter: Joel Bernstein >Assignee: Joel Bernstein >Priority: Major > Attachments: Screen Shot 2019-01-14 at 10.56.32 AM.png, Screen Shot > 2019-02-21 at 2.14.43 PM.png, Screen Shot 2019-03-03 at 2.28.35 PM.png, > Screen Shot 2019-03-04 at 7.47.57 PM.png, Screen Shot 2019-03-13 at 10.47.47 > AM.png, Screen Shot 2019-03-30 at 6.17.04 PM.png > > > Visualization is now a fundamental element of Solr Streaming Expressions and > Math Expressions. This ticket will create a visual guide to Solr Math > Expressions and Solr Streaming Expressions that includes *Apache Zeppelin* > visualization examples. > It will also cover using the JDBC expression to *analyze* and *visualize* > results from any JDBC compliant data source. > Intro from the guide: > {code:java} > Streaming Expressions exposes the capabilities of Solr Cloud as composable > functions. These functions provide a system for searching, transforming, > analyzing and visualizing data stored in Solr Cloud collections. > At a high level there are four main capabilities that will be explored in the > documentation: > * Searching, sampling and aggregating results from Solr. > * Transforming result sets after they are retrieved from Solr. > * Analyzing and modeling result sets using probability and statistics and > machine learning libraries. 
> * Visualizing result sets, aggregations and statistical models of the data. > {code} > > A few sample visualizations are attached to the ticket. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13105) A visual guide to Solr Math Expressions and Streaming Expressions
[ https://issues.apache.org/jira/browse/SOLR-13105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952785#comment-16952785 ] ASF subversion and git services commented on SOLR-13105: Commit fd3d50c5807b3c1097bb4a7639f35bff94b11dc6 in lucene-solr's branch refs/heads/SOLR-13105-visual from Joel Bernstein [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=fd3d50c ] SOLR-13105: Improve curve fitting docs > A visual guide to Solr Math Expressions and Streaming Expressions > - > > Key: SOLR-13105 > URL: https://issues.apache.org/jira/browse/SOLR-13105 > Project: Solr > Issue Type: New Feature >Reporter: Joel Bernstein >Assignee: Joel Bernstein >Priority: Major > Attachments: Screen Shot 2019-01-14 at 10.56.32 AM.png, Screen Shot > 2019-02-21 at 2.14.43 PM.png, Screen Shot 2019-03-03 at 2.28.35 PM.png, > Screen Shot 2019-03-04 at 7.47.57 PM.png, Screen Shot 2019-03-13 at 10.47.47 > AM.png, Screen Shot 2019-03-30 at 6.17.04 PM.png > > > Visualization is now a fundamental element of Solr Streaming Expressions and > Math Expressions. This ticket will create a visual guide to Solr Math > Expressions and Solr Streaming Expressions that includes *Apache Zeppelin* > visualization examples. > It will also cover using the JDBC expression to *analyze* and *visualize* > results from any JDBC compliant data source. > Intro from the guide: > {code:java} > Streaming Expressions exposes the capabilities of Solr Cloud as composable > functions. These functions provide a system for searching, transforming, > analyzing and visualizing data stored in Solr Cloud collections. > At a high level there are four main capabilities that will be explored in the > documentation: > * Searching, sampling and aggregating results from Solr. > * Transforming result sets after they are retrieved from Solr. > * Analyzing and modeling result sets using probability and statistics and > machine learning libraries. 
> * Visualizing result sets, aggregations and statistical models of the data. > {code} > > A few sample visualizations are attached to the ticket. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-13836) Streaming Expression Query Parser
[ https://issues.apache.org/jira/browse/SOLR-13836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cassandra Targett updated SOLR-13836: - Component/s: streaming expressions query parsers > Streaming Expression Query Parser > - > > Key: SOLR-13836 > URL: https://issues.apache.org/jira/browse/SOLR-13836 > Project: Solr > Issue Type: New Feature > Security Level: Public(Default Security Level. Issues are Public) > Components: query parsers, streaming expressions >Reporter: Trey Grainger >Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h > > It is currently possible to hit the search handler in a streaming expression > ("search(...)"), but it is not currently possible to invoke a streaming > expression from within a regular search within the search handler. In some > cases, it would be useful to leverage the power of streaming expressions to > generate a result set and then join that result set with a normal set of > search results. > This isn't expected to be particularly efficient for high cardinality > streaming expression results, but it would be a pretty powerful feature that > could enable a bunch of use cases that aren't possible today within a normal > search. > h2. 
Example: > *Docs:* > {code:java} > curl -X POST -H "Content-Type: application/json" > http://localhost:8983/solr/food_collection/update?commit=true --data-binary ' > [ > {"id": "1", "name_s":"donut","vector_fs":[5.0,0.0,1.0,5.0,0.0,4.0,5.0,1.0]}, > {"id": "2", "name_s":"apple > juice","vector_fs":[1.0,5.0,0.0,0.0,0.0,4.0,4.0,3.0]}, > {"id": "3", > "name_s":"cappuccino","vector_fs":[0.0,5.0,3.0,0.0,4.0,1.0,2.0,3.0]}, > {"id": "4", "name_s":"cheese > pizza","vector_fs":[5.0,0.0,4.0,4.0,0.0,1.0,5.0,2.0]}, > {"id": "5", "name_s":"green > tea","vector_fs":[0.0,5.0,0.0,0.0,2.0,1.0,1.0,5.0]}, > {"id": "6", "name_s":"latte","vector_fs":[0.0,5.0,4.0,0.0,4.0,1.0,3.0,3.0]}, > {"id": "7", "name_s":"soda","vector_fs":[0.0,5.0,0.0,0.0,3.0,5.0,5.0,0.0]}, > {"id": "8", "name_s":"cheese bread > sticks","vector_fs":[5.0,0.0,4.0,5.0,0.0,1.0,4.0,2.0]}, > {"id": "9", "name_s":"water","vector_fs":[0.0,5.0,0.0,0.0,0.0,0.0,0.0,5.0]}, > {"id": "10", "name_s":"cinnamon bread > sticks","vector_fs":[5.0,0.0,1.0,5.0,0.0,3.0,4.0,2.0]} > ] > {code} > > *Query:* > {code:java} > http://localhost:8983/solr/food/select?q=*:*&fq=\{!streaming_expression}top(select(search(food,%20q=%22*:*%22,%20fl=%22id,vector_fs%22,%20sort=%22id%20asc%22),%20cosineSimilarity(vector_fs,%20array(5.1,0.0,1.0,5.0,0.0,4.0,5.0,1.0))%20as%20cos,%20id),%20n=5,%20sort=%22cos%20desc%22)&fl=id,name_s > {code} > > *Response:* > {code:java} > { > "responseHeader":{ > "zkConnected":true, > "status":0, > "QTime":7, > "params":{ > "q":"*:*", > "fl":"id,name_s", > "fq":"{!streaming_expression}top(select(search(food, q=\"*:*\", > fl=\"id,vector_fs\", sort=\"id asc\"), cosineSimilarity(vector_fs, > array(5.2,0.0,1.0,5.0,0.0,4.0,5.0,1.0)) as cos, id), n=5, sort=\"cos > desc\")"}}, > "response":{"numFound":5,"start":0,"docs":[ > { > "name_s":"donut", > "id":"1"}, > { > "name_s":"apple juice", > "id":"2"}, > { > "name_s":"cheese pizza", > "id":"4"}, > { > "name_s":"cheese bread sticks", > "id":"8"}, > { > "name_s":"cinnamon bread sticks", 
> "id":"10"}] > }} > {code} > The current implementation also supports the following additional parameters: > *f*: (optional) The field name from the streaming expression containing the > document ids upon which to filter. Defaults to the same uniqueKey field name > from your documents. > *method*: (optional) Any of termsFilter (default), booleanQuery, automaton, > docValuesTermsFilter. > The method may go away, especially if we find a more efficient way to join > the stream to the main query doc set. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
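For readers who want to sanity-check the example above, here is a minimal standalone sketch of the ranking that the top(... cosineSimilarity(...) ...) expression is described as computing. It is not Solr code: the class and method names (CosineTopN, topN, exampleDocs) are hypothetical, the ten vectors are copied from the *Docs* payload, and the query vector is the one from the request URL (first component 5.1). Zero-length vectors are not handled and would yield NaN.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Solr: ranks documents by cosine similarity.
final class CosineTopN {

  /** Cosine similarity of two equal-length vectors: dot(a, b) / (|a| * |b|). */
  static double cosineSimilarity(double[] a, double[] b) {
    if (a.length != b.length) {
      throw new IllegalArgumentException("vector lengths differ");
    }
    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.length; i++) {
      dot += a[i] * b[i];
      normA += a[i] * a[i];
      normB += b[i] * b[i];
    }
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
  }

  /** Returns the ids of the n vectors most similar to the query, best first. */
  static List<String> topN(Map<String, double[]> docs, double[] query, int n) {
    List<String> ids = new ArrayList<>(docs.keySet());
    ids.sort(Comparator.comparingDouble(
        (String id) -> cosineSimilarity(docs.get(id), query)).reversed());
    return new ArrayList<>(ids.subList(0, Math.min(n, ids.size())));
  }

  /** The ten example documents from the issue, keyed by id. */
  static Map<String, double[]> exampleDocs() {
    Map<String, double[]> docs = new LinkedHashMap<>();
    docs.put("1", new double[] {5.0, 0.0, 1.0, 5.0, 0.0, 4.0, 5.0, 1.0});  // donut
    docs.put("2", new double[] {1.0, 5.0, 0.0, 0.0, 0.0, 4.0, 4.0, 3.0});  // apple juice
    docs.put("3", new double[] {0.0, 5.0, 3.0, 0.0, 4.0, 1.0, 2.0, 3.0});  // cappuccino
    docs.put("4", new double[] {5.0, 0.0, 4.0, 4.0, 0.0, 1.0, 5.0, 2.0});  // cheese pizza
    docs.put("5", new double[] {0.0, 5.0, 0.0, 0.0, 2.0, 1.0, 1.0, 5.0});  // green tea
    docs.put("6", new double[] {0.0, 5.0, 4.0, 0.0, 4.0, 1.0, 3.0, 3.0});  // latte
    docs.put("7", new double[] {0.0, 5.0, 0.0, 0.0, 3.0, 5.0, 5.0, 0.0});  // soda
    docs.put("8", new double[] {5.0, 0.0, 4.0, 5.0, 0.0, 1.0, 4.0, 2.0});  // cheese bread sticks
    docs.put("9", new double[] {0.0, 5.0, 0.0, 0.0, 0.0, 0.0, 0.0, 5.0});  // water
    docs.put("10", new double[] {5.0, 0.0, 1.0, 5.0, 0.0, 3.0, 4.0, 2.0}); // cinnamon bread sticks
    return docs;
  }

  public static void main(String[] args) {
    double[] query = {5.1, 0.0, 1.0, 5.0, 0.0, 4.0, 5.0, 1.0};
    // Prints the five best ids; the filter query then keeps exactly these documents.
    System.out.println(topN(exampleDocs(), query, 5));
  }
}
```

The five ids selected this way are the same set that appears in the *Response* above (1, 2, 4, 8, 10). The response lists them in the main query's order, not by similarity, because the expression is only used as a filter.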
[jira] [Commented] (SOLR-13835) HttpSolrCall produces incorrect extra AuditEvent on AuthorizationResponse.PROMPT
[ https://issues.apache.org/jira/browse/SOLR-13835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952774#comment-16952774 ] Jan Høydahl commented on SOLR-13835: New commits to PR to explicitly handle known codes: * 401 => EventType.REJECTED * 403 => EventType.UNAUTHORIZED * 200/202 => EventType.AUTHORIZED * All other statuses => EventType.ERROR Please review. I think this should be mergeable now. > HttpSolrCall produces incorrect extra AuditEvent on > AuthorizationResponse.PROMPT > > > Key: SOLR-13835 > URL: https://issues.apache.org/jira/browse/SOLR-13835 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: Authentication, Authorization >Reporter: Chris M. Hostetter >Assignee: Jan Høydahl >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > spinning this out of SOLR-13741... > {quote} > Wrt the REJECTED + UNAUTHORIZED events I see the same as you, and I believe > there is a code bug, not a test bug. In HttpSolrCall#471 in the > {{authorize()}} call, if authResponse == PROMPT, it will actually match both > blocks and emit two audit events: > [https://github.com/apache/lucene-solr/blob/26ede632e6259eb9d16861a3c0f782c9c8999762/solr/core/src/java/org/apache/solr/servlet/HttpSolrCall.java#L475:L493] > > {code:java} > if (authResponse.statusCode == AuthorizationResponse.PROMPT.statusCode) {...} > if (!(authResponse.statusCode == HttpStatus.SC_ACCEPTED) && > !(authResponse.statusCode == HttpStatus.SC_OK)) {...} > {code} > When code==401, it is also true that code!=200. Intuitively there should be > both a sendError and return RETURN before line #484 in the first if block? > {quote} > This causes any and all {{REJECTED}} AuditEvent messages to be accompanied by > a corresponding {{UNAUTHORIZED}} AuditEvent. 
> It's not yet clear if, from the perspective of the external client, there are > any other bugs in behavior (TBD) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
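The double-event behaviour described above can be reproduced in isolation. The following is a simplified model, not the HttpSolrCall source: the class name AuditDispatchSketch and both methods are hypothetical, the status codes are plain ints, and the mapping in fixed() assumes that PROMPT (401) corresponds to the REJECTED event, as the issue description implies.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the audit dispatch; it only mirrors the control flow quoted in the issue.
final class AuditDispatchSketch {

  enum EventType { REJECTED, UNAUTHORIZED, AUTHORIZED, ERROR }

  /** Buggy shape: two independent if blocks, and the first one does not return. */
  static List<EventType> buggy(int status) {
    List<EventType> events = new ArrayList<>();
    if (status == 401) {
      events.add(EventType.REJECTED);
      // missing: send the error and return here
    }
    if (status != 202 && status != 200) {
      events.add(EventType.UNAUTHORIZED); // also true for 401, so a second event is emitted
      return events;
    }
    events.add(EventType.AUTHORIZED);
    return events;
  }

  /** One event per request: each known status maps to exactly one event type. */
  static List<EventType> fixed(int status) {
    switch (status) {
      case 401:
        return List.of(EventType.REJECTED);
      case 403:
        return List.of(EventType.UNAUTHORIZED);
      case 200:
      case 202:
        return List.of(EventType.AUTHORIZED);
      default:
        return List.of(EventType.ERROR);
    }
  }

  public static void main(String[] args) {
    System.out.println("buggy(401) = " + buggy(401));
    System.out.println("fixed(401) = " + fixed(401));
  }
}
```

In this model buggy(401) yields two events while fixed(401) yields one, which is the difference the test in SOLR-13741 tripped over.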
[jira] [Commented] (SOLR-13403) Terms component fails for DatePointField
[ https://issues.apache.org/jira/browse/SOLR-13403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952769#comment-16952769 ] Mikhail Khludnev commented on SOLR-13403: - The patch makes sense. > Terms component fails for DatePointField > > > Key: SOLR-13403 > URL: https://issues.apache.org/jira/browse/SOLR-13403 > Project: Solr > Issue Type: Bug > Components: SearchComponents - other >Reporter: Munendra S N >Assignee: Munendra S N >Priority: Major > Attachments: SOLR-13403.patch, SOLR-13403.patch, SOLR-13403.patch > > > Getting terms works for PointFields except DatePointField. For DatePointField, the > request fails with an NPE. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9006) Ensure WordDelimiterGraphFilter always emits catenateAll token early
[ https://issues.apache.org/jira/browse/LUCENE-9006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952770#comment-16952770 ] Jim Ferenczi commented on LUCENE-9006: -- I don't think your change affects the fact that we cannot set graphOffsetsAreCorrect when writing a test using the WDGF. Your test should fail the same way with graphOffsetsAreCorrect if you don't reorder the terms in the output. The other tests for the WDGF set this flag to false. I also wonder why you think there should be any order among the different forms that start at the same position. Are you relying on this order in a subsequent filter? Maybe we could mark the alternatives with a specific type, as is done for synonyms? That would make it easier to differentiate a splitting path from the original token. > Ensure WordDelimiterGraphFilter always emits catenateAll token early > > > Key: LUCENE-9006 > URL: https://issues.apache.org/jira/browse/LUCENE-9006 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/analysis >Reporter: David Wayne Smiley >Assignee: David Wayne Smiley >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Ideally, the first token of WDGF is the preserveOriginal (if configured to > emit), and the second should be the catenateAll (if configured to emit). The > deprecated WDF does this but WDGF can sometimes put the first other token > earlier when there is a non-emitted candidate sub-token. > Example input "8-other" when only generateWordParts and catenateAll -- *not* > generateNumberParts. WDGF internally sees the '8' but moves on. Ultimately, > the "other" token and the catenated "8other" will appear at the same internal > position, which by luck fools the sorter to emit "other" first. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13677) All Metrics Gauges should be unregistered by the objects that registered them
[ https://issues.apache.org/jira/browse/SOLR-13677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952764#comment-16952764 ] Andrzej Bialecki commented on SOLR-13677: - This is an updated patch, after cleanup of the issues listed in the review, and with some additional changes: * I excluded the {{scope}} from the context, because in most cases we can reuse the parent context for components with different scopes. * I converted some internal, non-pluggable components to use the new API. This is still a large change and needs more testing - I'll need another day to be reasonably sure that it doesn't break things. > All Metrics Gauges should be unregistered by the objects that registered them > - > > Key: SOLR-13677 > URL: https://issues.apache.org/jira/browse/SOLR-13677 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: metrics >Reporter: Noble Paul >Assignee: Andrzej Bialecki >Priority: Blocker > Fix For: 8.3 > > Attachments: SOLR-13677.patch > > Time Spent: 3h 50m > Remaining Estimate: 0h > > The life cycle of Metrics producers is managed by the core (mostly). So, if > the lifecycle of the object is different from that of the core itself, these > objects will never be unregistered from the metrics registry. This will lead > to memory leaks. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
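To illustrate the leak described in SOLR-13677 and the "unregister what you registered" idea, here is a minimal sketch. It does not use Solr's metrics API or the API introduced by the patch: GaugeRegistrySketch, registerGauge and unregisterGauges are hypothetical names, and ownership is tracked by object identity purely for illustration.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical registry, not Solr's: each gauge remembers which object registered it.
final class GaugeRegistrySketch {

  private static final class Entry {
    final Object owner;
    final Supplier<?> gauge;

    Entry(Object owner, Supplier<?> gauge) {
      this.owner = owner;
      this.gauge = gauge;
    }
  }

  private final Map<String, Entry> gauges = new ConcurrentHashMap<>();

  /** Registers a gauge and records the owner that is responsible for removing it. */
  void registerGauge(Object owner, String name, Supplier<?> gauge) {
    gauges.put(name, new Entry(owner, gauge));
  }

  /** Removes every gauge registered by the given owner; returns how many were removed. */
  int unregisterGauges(Object owner) {
    int removed = 0;
    for (Iterator<Entry> it = gauges.values().iterator(); it.hasNext(); ) {
      if (it.next().owner == owner) {
        it.remove();
        removed++;
      }
    }
    return removed;
  }

  /** Reads the current value of a gauge, or null if it is not registered. */
  Object value(String name) {
    Entry e = gauges.get(name);
    return e == null ? null : e.gauge.get();
  }

  int size() {
    return gauges.size();
  }
}
```

A short-lived component would call unregisterGauges(this) from its own close(). Without that call the registry keeps a reference to the gauge, and through the gauge's closure to the component, after the component is gone, which is the memory leak the issue describes.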
[GitHub] [lucene-solr] jimczi commented on a change in pull request #913: LUCENE-8995: TopSuggestDocsCollector#collect should be able to signal rejection
jimczi commented on a change in pull request #913: LUCENE-8995: TopSuggestDocsCollector#collect should be able to signal rejection URL: https://github.com/apache/lucene-solr/pull/913#discussion_r335431950 ## File path: lucene/suggest/src/test/org/apache/lucene/search/suggest/document/TestPrefixCompletionQuery.java ## @@ -253,6 +258,61 @@ public void testDocFiltering() throws Exception { iw.close(); } + /** + * Test that the correct amount of documents are collected if using a collector that also rejects documents. + */ + public void testCollectorThatRejects() throws Exception { +// use synonym analyzer to have multiple paths to same suggested document. This mock adds "dog" as synonym for "dogs" +Analyzer analyzer = new MockSynonymAnalyzer(); +RandomIndexWriter iw = new RandomIndexWriter(random(), dir, iwcWithSuggestField(analyzer, "suggest_field")); +List expectedResults = new ArrayList(); + +for (int docCount = 10; docCount > 0; docCount--) { + Document document = new Document(); + String value = "ab" + docCount + " dogs"; + document.add(new SuggestField("suggest_field", value, docCount)); + expectedResults.add(new Entry(value, docCount)); + iw.addDocument(document); +} + +if (rarely()) { + iw.commit(); +} + +DirectoryReader reader = iw.getReader(); +SuggestIndexSearcher indexSearcher = new SuggestIndexSearcher(reader); + +PrefixCompletionQuery query = new PrefixCompletionQuery(analyzer, new Term("suggest_field", "ab")); +int topN = 5; + +// use a TopSuggestDocsCollector that rejects results with duplicate docIds +TopSuggestDocsCollector collector = new TopSuggestDocsCollector(topN, false) { + + private Set seenDocIds = new HashSet<>(); + + @Override + public boolean collect(int docID, CharSequence key, CharSequence context, float score) throws IOException { + int globalDocId = docID + docBase; + boolean collected = false; + if (seenDocIds.contains(globalDocId) == false) { + super.collect(docID, key, context, score); + seenDocIds.add(globalDocId); + collected = 
true; + } + return collected; + } +}; + +indexSearcher.suggest(query, collector); +assertSuggestions(collector.get(), expectedResults.subList(0, topN).toArray(new Entry[0])); + +// TODO expecting true here, why false? Review comment: I'll open an issue. I also wonder if we shouldn't rely on the fact that the top suggest collector will also early terminate, so whenever we expect rejection (because of deleted docs or because we deduplicate on suggestions/doc) we could set the queue size to its maximum value (5000). Currently we have different heuristics that try to pick a sensible value automatically but there is no guarantee of admissibility. For instance if we want to deduplicate by document id we should ensure that the queue size is greater than `topN*maxAnalyzedValuesPerDoc` and we'd need to compute this value at index time. I may be completely off but it would be interesting to see the effects of setting the queue size to its maximum value on all searches. This way the admissibility is easier to reason about and we don't need to correlate it with the choice made by the heuristic. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
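The queue-size argument in the review comment above can be illustrated with a toy model. This is deliberately not the TopNSearcher algorithm: RejectingTopNSketch and collect are hypothetical, paths are reduced to a best-first list of doc ids, the queue bound is modelled as "only the first queueSize paths are ever examined", and a path is rejected when its doc id has already been collected.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy model (an assumption, not Lucene code) of a bounded search with a rejecting collector.
final class RejectingTopNSketch {

  /**
   * Walks the paths best-first and collects up to topN distinct doc ids,
   * but gives up once queueSize paths have been examined.
   */
  static List<Integer> collect(List<Integer> docIdsBestFirst, int topN, int queueSize) {
    Set<Integer> collected = new LinkedHashSet<>();
    int examined = 0;
    for (int docId : docIdsBestFirst) {
      if (examined >= queueSize || collected.size() >= topN) {
        break;
      }
      examined++;
      collected.add(docId); // add() returns false for a duplicate: the path is rejected
    }
    return new ArrayList<>(collected);
  }

  /** Builds the path list for docs maxDoc..1 where every doc is reachable via pathsPerDoc paths. */
  static List<Integer> paths(int maxDoc, int pathsPerDoc) {
    List<Integer> out = new ArrayList<>();
    for (int doc = maxDoc; doc > 0; doc--) {
      for (int p = 0; p < pathsPerDoc; p++) {
        out.add(doc);
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<Integer> paths = paths(10, 2); // e.g. "dog" and "dogs" both lead to the same doc
    System.out.println(collect(paths, 5, 5));
    System.out.println(collect(paths, 5, 5 * 2));
  }
}
```

With two paths per document and topN = 5, a bound of 5 collects only three distinct documents in this model, while a bound of topN * 2 collects all five. That is the same shape as the `topN*maxAnalyzedValuesPerDoc` condition mentioned in the comment.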
[jira] [Updated] (SOLR-13677) All Metrics Gauges should be unregistered by the objects that registered them
[ https://issues.apache.org/jira/browse/SOLR-13677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrzej Bialecki updated SOLR-13677: Attachment: SOLR-13677.patch > All Metrics Gauges should be unregistered by the objects that registered them > - > > Key: SOLR-13677 > URL: https://issues.apache.org/jira/browse/SOLR-13677 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: metrics >Reporter: Noble Paul >Assignee: Andrzej Bialecki >Priority: Blocker > Fix For: 8.3 > > Attachments: SOLR-13677.patch > > Time Spent: 3h 50m > Remaining Estimate: 0h > > The life cycle of Metrics producers is managed by the core (mostly). So, if > the lifecycle of the object is different from that of the core itself, these > objects will never be unregistered from the metrics registry. This will lead > to memory leaks. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] jimczi commented on a change in pull request #913: LUCENE-8995: TopSuggestDocsCollector#collect should be able to signal rejection
jimczi commented on a change in pull request #913: LUCENE-8995: TopSuggestDocsCollector#collect should be able to signal rejection URL: https://github.com/apache/lucene-solr/pull/913#discussion_r335426995 ## File path: lucene/suggest/src/java/org/apache/lucene/search/suggest/document/TopSuggestDocs.java ## @@ -116,19 +133,29 @@ public TopSuggestDocs(TotalHits totalHits, SuggestScoreDoc[] scoreDocs) { */ public static TopSuggestDocs merge(int topN, TopSuggestDocs[] shardHits) { SuggestScoreDocPriorityQueue priorityQueue = new SuggestScoreDocPriorityQueue(topN); +boolean allComplete = true; for (TopSuggestDocs shardHit : shardHits) { for (SuggestScoreDoc scoreDoc : shardHit.scoreLookupDocs()) { if (scoreDoc == priorityQueue.insertWithOverflow(scoreDoc)) { break; } } + allComplete &= shardHit.isComplete; } SuggestScoreDoc[] topNResults = priorityQueue.getResults(); if (topNResults.length > 0) { - return new TopSuggestDocs(new TotalHits(topNResults.length, TotalHits.Relation.EQUAL_TO), topNResults); + return new TopSuggestDocs(new TotalHits(topNResults.length, TotalHits.Relation.EQUAL_TO), topNResults, + allComplete); } else { return TopSuggestDocs.EMPTY; } } + /** + * Indicates if the list of results is complete or not. Might be false if the {@link TopNSearcher} rejected + * too many of the queued results. Review comment: The admissibility of the search is computed from the reject count so a value of `false` means that we exhausted all the paths but we had to reject all of them so the topN is truncated. It's hard to follow the full logic but it should be ok as long as it is ok to return less than the topN when there are more rejections than the queue size can handle ? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] jimczi commented on issue #904: LUCENE-8992: Share minimum score across segment in concurrent search
jimczi commented on issue #904: LUCENE-8992: Share minimum score across segment in concurrent search URL: https://github.com/apache/lucene-solr/pull/904#issuecomment-542656802 I pushed another commit to replace the modulo with a bitwise operation as suggested by @jpountz . That seemed to help a bit and since there are no regressions and some nice boosts I think it is ready for another review. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
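As background for the "replace the modulo with a bitwise operation" change mentioned above: when the divisor is a power of two, i % n and i & (n - 1) give the same result for non-negative i, and the mask avoids an integer division. The sketch below shows only that identity; the class and method names are hypothetical and the interval actually used in the PR is not shown in this message.

```java
// Hypothetical illustration of the modulo-to-mask identity; not code from the PR.
final class BitmaskSketch {

  /** True if n is a positive power of two (exactly one bit set). */
  static boolean isPowerOfTwo(int n) {
    return n > 0 && (n & (n - 1)) == 0;
  }

  /** Slot selection with a modulo; works for any positive n. */
  static int slotModulo(int i, int n) {
    return i % n;
  }

  /** Slot selection with a mask; only valid when n is a power of two. */
  static int slotMask(int i, int n) {
    if (!isPowerOfTwo(n)) {
      throw new IllegalArgumentException("n must be a power of two: " + n);
    }
    return i & (n - 1);
  }

  public static void main(String[] args) {
    for (int i = 0; i < 20; i++) {
      System.out.println(i + ": " + slotModulo(i, 8) + " " + slotMask(i, 8));
    }
  }
}
```

The two differ for negative i: the modulo can return a negative value, while the mask result is always in [0, n).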
[jira] [Updated] (SOLR-13824) JSON Request API ignores prematurely closing curly brace.
[ https://issues.apache.org/jira/browse/SOLR-13824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Khludnev updated SOLR-13824: Attachment: SOLR-13824.patch > JSON Request API ignores prematurely closing curly brace. > -- > > Key: SOLR-13824 > URL: https://issues.apache.org/jira/browse/SOLR-13824 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: JSON Request API >Reporter: Mikhail Khludnev >Priority: Major > Attachments: SOLR-13824.patch, SOLR-13824.patch > > > {code:java} > json={query:"content:foo", facet:{zz:{field:id}}} > {code} > This works fine, but if we mistype {{}}} instead of {{,}} > {code:java} > json={query:"content:foo"} facet:{zz:{field:id}}} > {code} > it's captured only partially; here's what we have under debug: > {code:java} > "json":{"query":"content:foo"}, > {code} > I suppose it should throw an error with a 400 code. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
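The check requested in SOLR-13824 amounts to "after the first top-level object closes, nothing but whitespace may follow". Below is a minimal standalone sketch of that rule. It is not the attached patch and does not use Solr's JSON parser: TrailingJsonCheck and its methods are hypothetical, and only double-quoted strings are recognised, whereas Solr's lenient JSON syntax accepts more than that.

```java
// Hypothetical validator, not the SOLR-13824 patch: rejects content after the first JSON object.
final class TrailingJsonCheck {

  /** Returns the index just past the brace that closes the first object, or -1 if there is none. */
  static int endOfFirstObject(String s) {
    int i = 0;
    while (i < s.length() && Character.isWhitespace(s.charAt(i))) {
      i++;
    }
    if (i == s.length() || s.charAt(i) != '{') {
      return -1;
    }
    int depth = 0;
    boolean inString = false;
    for (; i < s.length(); i++) {
      char c = s.charAt(i);
      if (inString) {
        if (c == '\\') {
          i++; // skip the escaped character
        } else if (c == '"') {
          inString = false;
        }
      } else if (c == '"') {
        inString = true;
      } else if (c == '{') {
        depth++;
      } else if (c == '}' && --depth == 0) {
        return i + 1;
      }
    }
    return -1; // ran out of input before the object was closed
  }

  /** Throws if the input is not a single object followed only by whitespace. */
  static void requireNoTrailingContent(String s) {
    int end = endOfFirstObject(s);
    if (end < 0) {
      throw new IllegalArgumentException("missing or unbalanced JSON object");
    }
    if (!s.substring(end).trim().isEmpty()) {
      throw new IllegalArgumentException("unexpected content after JSON object at offset " + end);
    }
  }
}
```

Applied to the two inputs from the issue, the first passes and the second is rejected at the offset where the stray ` facet:...` begins. A request handler could translate that exception into the 400 response suggested above.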
[jira] [Updated] (SOLR-13824) JSON Request API ignores prematurely closing curly brace.
[ https://issues.apache.org/jira/browse/SOLR-13824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Khludnev updated SOLR-13824: Attachment: (was: SOLR-13824.patch) > JSON Request API ignores prematurely closing curly brace. > -- > > Key: SOLR-13824 > URL: https://issues.apache.org/jira/browse/SOLR-13824 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: JSON Request API >Reporter: Mikhail Khludnev >Priority: Major > Attachments: SOLR-13824.patch > > > {code:java} > json={query:"content:foo", facet:{zz:{field:id}}} > {code} > This works fine, but if we mistype {{}}} instead of {{,}} > {code:java} > json={query:"content:foo"} facet:{zz:{field:id}}} > {code} > it's captured only partially; here's what we have under debug: > {code:java} > "json":{"query":"content:foo"}, > {code} > I suppose it should throw an error with a 400 code. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-13824) JSON Request API ignores prematurely closing curly brace.
[ https://issues.apache.org/jira/browse/SOLR-13824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Khludnev updated SOLR-13824: Attachment: SOLR-13824.patch > JSON Request API ignores prematurely closing curly brace. > -- > > Key: SOLR-13824 > URL: https://issues.apache.org/jira/browse/SOLR-13824 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: JSON Request API >Reporter: Mikhail Khludnev >Priority: Major > Attachments: SOLR-13824.patch, SOLR-13824.patch > > > {code:java} > json={query:"content:foo", facet:{zz:{field:id}}} > {code} > This works fine, but if we mistype {{}}} instead of {{,}} > {code:java} > json={query:"content:foo"} facet:{zz:{field:id}}} > {code} > it's captured only partially; here's what we have under debug: > {code:java} > "json":{"query":"content:foo"}, > {code} > I suppose it should throw an error with a 400 code. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8993) Change Maven POM repository URLs to https
[ https://issues.apache.org/jira/browse/LUCENE-8993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952709#comment-16952709 ] Uwe Schindler commented on LUCENE-8993: --- I had to revert the Apache Parent POM upgrade in 8.x and 8.3 branch, because the Apache Parent POM now needs a higher Maven minimum version, which we don't use yet in Lucene/Solr 8. > Change Maven POM repository URLs to https > - > > Key: LUCENE-8993 > URL: https://issues.apache.org/jira/browse/LUCENE-8993 > Project: Lucene - Core > Issue Type: Task > Components: general/build >Affects Versions: 7.7.2, 8.2, 8.1.1 >Reporter: Uwe Schindler >Assignee: Uwe Schindler >Priority: Major > Fix For: master (9.0), 8.3 > > Attachments: LUCENE-8993.patch > > > After fixing LUCENE-8807 I figured out today, that Lucene's build system uses > HTTPS URLs everywhere. But the POMs deployed to Maven central still use http > (I assumed that those are inherited from the ANT build). > This will fix it for later versions by changing the POM templates. Hopefully > this will not happen in Gradle! > [~markrmil...@gmail.com]: Can you make sure that the new Gradle build uses > HTTPS for all hard configured repositories (like Cloudera)? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8993) Change Maven POM repository URLs to https
[ https://issues.apache.org/jira/browse/LUCENE-8993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952706#comment-16952706 ] ASF subversion and git services commented on LUCENE-8993: - Commit 0c8e76764dd62728c61c415118584de04de6b022 in lucene-solr's branch refs/heads/branch_8_3 from Uwe Schindler [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=0c8e767 ] Revert "LUCENE-8993: Also update to latest version of Apache Parent POM" This reverts commit 9d21418dfcc5c884f45ab668579b0391965a18bb. This is needed because Lucene 8.x does not yet update minimum Maven version, but Apache Parent POM requires this. > Change Maven POM repository URLs to https > - > > Key: LUCENE-8993 > URL: https://issues.apache.org/jira/browse/LUCENE-8993 > Project: Lucene - Core > Issue Type: Task > Components: general/build >Affects Versions: 7.7.2, 8.2, 8.1.1 >Reporter: Uwe Schindler >Assignee: Uwe Schindler >Priority: Major > Fix For: master (9.0), 8.3 > > Attachments: LUCENE-8993.patch > > > After fixing LUCENE-8807 I figured out today, that Lucene's build system uses > HTTPS URLs everywhere. But the POMs deployed to Maven central still use http > (I assumed that those are inherited from the ANT build). > This will fix it for later versions by changing the POM templates. Hopefully > this will not happen in Gradle! > [~markrmil...@gmail.com]: Can you make sure that the new Gradle build uses > HTTPS for all hard configured repositories (like Cloudera)? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8993) Change Maven POM repository URLs to https
[ https://issues.apache.org/jira/browse/LUCENE-8993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952704#comment-16952704 ] ASF subversion and git services commented on LUCENE-8993: - Commit fa726bec50ddbe1819a2d32c06aff3837b948e9e in lucene-solr's branch refs/heads/branch_8x from Uwe Schindler [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=fa726be ] Revert "LUCENE-8993: Also update to latest version of Apache Parent POM" This reverts commit 9d21418dfcc5c884f45ab668579b0391965a18bb. This is needed because Lucene 8.x does not yet update minimum Maven version, but Apache Parent POM requires this. > Change Maven POM repository URLs to https > - > > Key: LUCENE-8993 > URL: https://issues.apache.org/jira/browse/LUCENE-8993 > Project: Lucene - Core > Issue Type: Task > Components: general/build >Affects Versions: 7.7.2, 8.2, 8.1.1 >Reporter: Uwe Schindler >Assignee: Uwe Schindler >Priority: Major > Fix For: master (9.0), 8.3 > > Attachments: LUCENE-8993.patch > > > After fixing LUCENE-8807 I figured out today, that Lucene's build system uses > HTTPS URLs everywhere. But the POMs deployed to Maven central still use http > (I assumed that those are inherited from the ANT build). > This will fix it for later versions by changing the POM templates. Hopefully > this will not happen in Gradle! > [~markrmil...@gmail.com]: Can you make sure that the new Gradle build uses > HTTPS for all hard configured repositories (like Cloudera)? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-12393) ExpandComponent only calculates the score of expanded docs when sorted by score
[ https://issues.apache.org/jira/browse/SOLR-12393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952701#comment-16952701 ] Lucene/Solr QA commented on SOLR-12393: --- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 5s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 5s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 5s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} Release audit (RAT) {color} | {color:green} 1m 5s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} Check forbidden APIs {color} | {color:green} 1m 5s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} Validate source patterns {color} | {color:green} 1m 5s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 47m 42s{color} | {color:green} core in the patch passed. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 51m 45s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | SOLR-12393 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12983079/SOLR-12393.patch | | Optional Tests | compile javac unit ratsources checkforbiddenapis validatesourcepatterns | | uname | Linux lucene1-us-west 4.15.0-54-generic #58-Ubuntu SMP Mon Jun 24 10:55:24 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | ant | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-SOLR-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh | | git revision | master / f7f6a37f337 | | ant | version: Apache Ant(TM) version 1.10.5 compiled on March 28 2019 | | Default Java | LTS | | Test Results | https://builds.apache.org/job/PreCommit-SOLR-Build/578/testReport/ | | modules | C: solr/core U: solr/core | | Console output | https://builds.apache.org/job/PreCommit-SOLR-Build/578/console | | Powered by | Apache Yetus 0.7.0 http://yetus.apache.org | This message was automatically generated. > ExpandComponent only calculates the score of expanded docs when sorted by > score > --- > > Key: SOLR-12393 > URL: https://issues.apache.org/jira/browse/SOLR-12393 > Project: Solr > Issue Type: Bug > Components: SearchComponents - other >Reporter: David Wayne Smiley >Assignee: Munendra S N >Priority: Major > Attachments: SOLR-12393.patch, SOLR-12393.patch, SOLR-12393.patch, > SOLR-12393.patch > > > If you use the ExpandComponent to show expanded docs and if you want the > score back (specified in "fl"), it will be NaN if the expanded docs are > sorted by anything other than the default score descending. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] cbuescher commented on a change in pull request #913: LUCENE-8995: TopSuggestDocsCollector#collect should be able to signal rejection
cbuescher commented on a change in pull request #913: LUCENE-8995: TopSuggestDocsCollector#collect should be able to signal rejection URL: https://github.com/apache/lucene-solr/pull/913#discussion_r335389135 ## File path: lucene/core/src/java/org/apache/lucene/util/fst/Util.java ## @@ -460,11 +460,6 @@ public void addStartPaths(FST.Arc node, T startOutput, boolean allowEmptyStri continue; } -if (results.size() == topN-1 && maxQueueDepth == topN) { - // Last path -- don't bother w/ queue anymore: - queue = null; Review comment: As far as I understand this optimization assumes we surely accept (and collect) the path later in L516s acceptResult(), which always seems to be the case for collectors that don't reject, but if the collector that is eventually called via NRTSuggesters acceptResult() chooses to reject this option, we were losing expected results. This surfaced in the prefix completion tests I added. @jimczi might be able to explain this a bit better than me. > Have you run the suggest benchmarks to see if removing this opto hurt performance? No, where are they and how can I run them? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-8928) BKDWriter could make splitting decisions based on the actual range of values
[ https://issues.apache.org/jira/browse/LUCENE-8928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ignacio Vera resolved LUCENE-8928. -- Fix Version/s: 8.4 Assignee: Ignacio Vera Resolution: Fixed > BKDWriter could make splitting decisions based on the actual range of values > > > Key: LUCENE-8928 > URL: https://issues.apache.org/jira/browse/LUCENE-8928 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Adrien Grand >Assignee: Ignacio Vera >Priority: Minor > Fix For: 8.4 > > Time Spent: 20m > Remaining Estimate: 0h > > Currently BKDWriter assumes that splitting on one dimension has no effect on > values in other dimensions. While this may be ok for geo points, this is > usually not true for ranges (or geo shapes, which are ranges too). Maybe we > could get better indexing by re-computing the range of values on each > dimension before making the choice of the split dimension? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
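The idea in LUCENE-8928, re-computing the range of values per dimension for the points that actually fall into a cell before choosing the split dimension, can be sketched as follows. This is a simplified illustration and not BKDWriter code: SplitDimSketch and widestDimension are hypothetical, points are plain int arrays, and the only criterion is "widest actual range", whereas the real writer applies further heuristics.

```java
// Hypothetical sketch, not BKDWriter: picks the split dimension from the actual value range.
final class SplitDimSketch {

  /**
   * Returns the dimension with the widest [min, max] range among points[from..to).
   * Ties are resolved in favour of the lowest dimension.
   */
  static int widestDimension(int[][] points, int from, int to) {
    if (from >= to) {
      throw new IllegalArgumentException("empty slice");
    }
    int numDims = points[from].length;
    int bestDim = 0;
    long bestRange = -1;
    for (int dim = 0; dim < numDims; dim++) {
      int min = Integer.MAX_VALUE;
      int max = Integer.MIN_VALUE;
      for (int i = from; i < to; i++) {
        min = Math.min(min, points[i][dim]);
        max = Math.max(max, points[i][dim]);
      }
      long range = (long) max - min; // widen to long so extreme values cannot overflow
      if (range > bestRange) {
        bestRange = range;
        bestDim = dim;
      }
    }
    return bestDim;
  }

  public static void main(String[] args) {
    int[][] points = {{0, 0}, {1, 100}, {2, 50}, {100, 1}, {101, 2}, {102, 3}};
    System.out.println(widestDimension(points, 0, points.length));
    System.out.println(widestDimension(points, 0, 3));
  }
}
```

In the sample data the whole set is widest on dimension 0, but after a split on dimension 0 the left half (the first three points) is widest on dimension 1. A writer that reuses the parent's ranges would miss this, which is the effect the issue describes for ranges and shapes.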
[jira] [Resolved] (LUCENE-8746) Make EdgeTree (aka ComponentTree) support different type of components
[ https://issues.apache.org/jira/browse/LUCENE-8746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ignacio Vera resolved LUCENE-8746.
--
Fix Version/s: 8.4
Assignee: Ignacio Vera
Resolution: Fixed

Thanks [~jpountz] for muting the test. I have pushed a fix and the tests seem happy. The issue was related to the order of the edges of decoded triangles. This is something that LUCENE-8997 should improve.

> Make EdgeTree (aka ComponentTree) support different type of components
>
> Key: LUCENE-8746
> URL: https://issues.apache.org/jira/browse/LUCENE-8746
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Ignacio Vera
> Assignee: Ignacio Vera
> Priority: Major
> Fix For: 8.4
>
> Time Spent: 3h 40m
> Remaining Estimate: 0h
>
> Currently the class {{EdgeTree}} is a bit confusing, as it is in reality a tree of components. The inner class {{Edge}} is the one that builds a tree of edges, which is used by Polygon2D and Line2D to represent their structure. The proposal is:
> 1) Create a new class called {{ComponentTree}}, which is in fact the current {{EdgeTree}}.
> 2) Modify {{EdgeTree}} to be in fact the inner class Edge.
> 3) Extract a {{Component}} interface so we can have different types of components in the same tree. This allows us to support heterogeneous trees of components.
> 4) Make {{Polygon2D}} and {{Line2D}} instances of the component interface.
> 5) With this change, {{LatLonShapePolygonQuery}} and {{LatLonShapeLineQuery}} can be replaced with one {{LatLonShapeComponentQuery}}.
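The proposal above (a shared component interface and one tree holding mixed component types) can be sketched as follows. This is an illustration only, not the Lucene implementation: Component here is a hypothetical, reduced interface, and Box and Disc are simple stand-ins for Polygon2D and Line2D so that the example stays self-contained.

```java
import java.util.Arrays;
import java.util.Comparator;

class ComponentTreeSketch {

  // Hypothetical minimal contract shared by all component types.
  interface Component {
    double minX();
    double maxX();
    boolean contains(double x, double y);
  }

  static final class Box implements Component {
    final double x0, y0, x1, y1;
    Box(double x0, double y0, double x1, double y1) {
      this.x0 = x0; this.y0 = y0; this.x1 = x1; this.y1 = y1;
    }
    public double minX() { return x0; }
    public double maxX() { return x1; }
    public boolean contains(double x, double y) {
      return x >= x0 && x <= x1 && y >= y0 && y <= y1;
    }
  }

  static final class Disc implements Component {
    final double cx, cy, r;
    Disc(double cx, double cy, double r) { this.cx = cx; this.cy = cy; this.r = r; }
    public double minX() { return cx - r; }
    public double maxX() { return cx + r; }
    public boolean contains(double x, double y) {
      double dx = x - cx, dy = y - cy;
      return dx * dx + dy * dy <= r * r;
    }
  }

  // Balanced binary tree over components sorted by minX. Each node keeps the
  // largest maxX of its subtree so whole subtrees can be skipped.
  static final class ComponentTree {
    private final Component component;
    private final ComponentTree left, right;
    private final double subtreeMaxX;

    private ComponentTree(Component[] sorted, int lo, int hi) {
      int mid = (lo + hi) >>> 1;
      component = sorted[mid];
      left = lo < mid ? new ComponentTree(sorted, lo, mid) : null;
      right = mid + 1 < hi ? new ComponentTree(sorted, mid + 1, hi) : null;
      double max = component.maxX();
      if (left != null) max = Math.max(max, left.subtreeMaxX);
      if (right != null) max = Math.max(max, right.subtreeMaxX);
      subtreeMaxX = max;
    }

    static ComponentTree build(Component... components) {
      if (components.length == 0) {
        throw new IllegalArgumentException("need at least one component");
      }
      Component[] sorted = components.clone();
      Arrays.sort(sorted, Comparator.comparingDouble(Component::minX));
      return new ComponentTree(sorted, 0, sorted.length);
    }

    boolean contains(double x, double y) {
      if (x > subtreeMaxX) return false; // nothing in this subtree reaches x
      if (component.contains(x, y)) return true;
      if (left != null && left.contains(x, y)) return true;
      // Every component on the right starts at or after component.minX().
      return right != null && x >= component.minX() && right.contains(x, y);
    }
  }

  public static void main(String[] args) {
    ComponentTree tree = ComponentTree.build(new Disc(5, 5, 1), new Box(0, 0, 1, 1));
    System.out.println(tree.contains(0.5, 0.5)); // true
    System.out.println(tree.contains(5, 5.5));   // true
    System.out.println(tree.contains(3, 3));     // false
  }
}
```

Because the tree depends only on the interface, a single query over the tree works for any mix of component types, which is the motivation given for replacing the two shape-specific queries with one.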
[jira] [Updated] (SOLR-13850) Atomic Updates with PreAnalyzedField
[ https://issues.apache.org/jira/browse/SOLR-13850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Oleksandr Drapushko updated SOLR-13850:
---
Description:
If you try to update non-pre-analyzed fields in a document using atomic updates, data in pre-analyzed fields (if there is any) will be lost.

*Steps to reproduce*

1. Index this document into techproducts
{code:json}
{
  "id": "a",
  "n_s": "s1",
  "pre": "{\"v\":\"1\",\"str\":\"Alaska\",\"tokens\":[{\"t\":\"alaska\",\"s\":0,\"e\":6,\"i\":1}]}"
}
{code}
2. Query the document
{code:json}
{
  "response":{"numFound":1,"start":0,"maxScore":1.0,"docs":[
    {
      "id":"a",
      "n_s":"s1",
      "pre":"Alaska",
      "_version_":1647475215142223872}]
}}
{code}
3. Update using atomic syntax
{code:json}
{
  "add": {
    "doc": {
      "id": "a",
      "n_s": {"set": "s2"}
}}}
{code}
4. Observe the warning in solr log

UI:
{noformat}
WARN x:techproducts_shard2_replica_n6 PreAnalyzedField Error parsing pre-analyzed field 'pre'
{noformat}
solr.log:
{noformat}
WARN (qtp1384454980-23) [c:techproducts s:shard2 r:core_node8 x:techproducts_shard2_replica_n6] o.a.s.s.PreAnalyzedField Error parsing pre-analyzed field 'pre' => java.io.IOException: Invalid JSON type java.lang.String, expected Map
  at org.apache.solr.schema.JsonPreAnalyzedParser.parse(JsonPreAnalyzedParser.java:86)
{noformat}
5. Query the document again
{code:json}
{
  "response":{"numFound":1,"start":0,"maxScore":1.0,"docs":[
    {
      "id":"a",
      "n_s":"s2",
      "_version_":1647475461695995904}]
}}
{code}
*Result*: There is no 'pre' field in the document anymore.

_My thoughts on it_

1. Data loss can be prevented if the warning is replaced with an error (by re-throwing the exception). Atomic updates for such documents still won't work, but the updates will be explicitly rejected.
2. Solr tries to read the document from the index, merge it with the input document, and re-index the result. But when it reads the indexed pre-analyzed fields, their format differs from the input format, so Solr cannot parse and re-index those fields properly.

> Atomic Updates with PreAnalyzedField
>
> Key: SOLR-13850
> URL: https://issues.apache.org/jira/browse/SOLR-13850
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Affects Versions: 7.7.2, 8.2
> Environment: Ubuntu 16.04 LTS / Java 8 (Zulu), Windows 10 / Java 11 (Oracle)
> Reporter: Oleksandr Drapushko
> Priority: Critical
> Labels: AtomicUpdate
[GitHub] [lucene-solr] cbuescher commented on a change in pull request #913: LUCENE-8995: TopSuggestDocsCollector#collect should be able to signal rejection
cbuescher commented on a change in pull request #913: LUCENE-8995: TopSuggestDocsCollector#collect should be able to signal rejection
URL: https://github.com/apache/lucene-solr/pull/913#discussion_r335372402

## File path: lucene/suggest/src/test/org/apache/lucene/search/suggest/document/TestPrefixCompletionQuery.java
## @@ -253,6 +263,123 @@ public void testDocFiltering() throws Exception {
     iw.close();
   }

+  /**
+   * Test that the correct amount of documents are collected if using a collector that also rejects documents.
+   */
+  public void testCollectorThatRejects() throws Exception {
+    // use synonym analyzer to have multiple paths to same suggested document. This mock adds "dog" as synonym for "dogs"
+    Analyzer analyzer = new MockSynonymAnalyzer();
+    RandomIndexWriter iw = new RandomIndexWriter(random(), dir, iwcWithSuggestField(analyzer, "suggest_field"));
+    List expectedResults = new ArrayList();
+
+    for (int docCount = 10; docCount > 0; docCount--) {
+      Document document = new Document();
+      String value = "ab" + docCount + " dogs";
+      document.add(new SuggestField("suggest_field", value, docCount));
+      expectedResults.add(new Entry(value, docCount));
+      iw.addDocument(document);
+    }
+
+    if (rarely()) {
+      iw.commit();
+    }
+
+    DirectoryReader reader = iw.getReader();
+    SuggestIndexSearcher indexSearcher = new SuggestIndexSearcher(reader);
+
+    PrefixCompletionQuery query = new PrefixCompletionQuery(analyzer, new Term("suggest_field", "ab"));
+    int topN = 5;
+
+    // use a TopSuggestDocsCollector that rejects results with duplicate docIds
+    TopSuggestDocsCollector collector = new TopSuggestDocsCollector(topN, false) {
+
+      private Set seenDocIds = new HashSet<>();
+
+      @Override
+      public boolean collect(int docID, CharSequence key, CharSequence context, float score) throws IOException {
+        int globalDocId = docID + docBase;
+        boolean collected = false;
+        if (seenDocIds.contains(globalDocId) == false) {

Review comment:
The collector is called multiple times with the same docID because the MockSynonymAnalyzer used in the test setup adds "dog" as a synonym for "dogs", so each document has two completion paths. This collector is meant to de-duplicate them. I added a note explaining this. This is a simplified version of the behaviour we observe in https://github.com/elastic/elasticsearch/issues/46445.
[jira] [Updated] (SOLR-13850) Atomic Updates with PreAnalyzedField
[ https://issues.apache.org/jira/browse/SOLR-13850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Oleksandr Drapushko updated SOLR-13850:
---
Description:
If you try to update non-pre-analyzed fields in a document using atomic updates, data in pre-analyzed fields (if there is any) will be lost.

*Steps to reproduce*

1. Index this document into techproducts
  {
    "id": "a",
    "n_s": "s1",
    "pre": "{\"v\":\"1\",\"str\":\"Alaska\",\"tokens\":[{\"t\":\"alaska\",\"s\":0,\"e\":6,\"i\":1}]}"
  }
2. Query the document
  {
    "response":{"numFound":1,"start":0,"maxScore":1.0,"docs":[
      {
        "id":"a",
        "n_s":"s1",
        "pre":"Alaska",
        "_version_":1647475215142223872}]
  }}
3. Update using atomic syntax
  {
    "add": {
      "doc": {
        "id": "a",
        "n_s": {"set": "s2"}
  }}}
4. Observe the warning in solr log
  UI:
    WARN x:techproducts_shard2_replica_n6 PreAnalyzedField Error parsing pre-analyzed field 'pre'
  solr.log:
    WARN (qtp1384454980-23) [c:techproducts s:shard2 r:core_node8 x:techproducts_shard2_replica_n6] o.a.s.s.PreAnalyzedField Error parsing pre-analyzed field 'pre' => java.io.IOException: Invalid JSON type java.lang.String, expected Map
      at org.apache.solr.schema.JsonPreAnalyzedParser.parse(JsonPreAnalyzedParser.java:86)
5. Query the document again
  {
    "response":{"numFound":1,"start":0,"maxScore":1.0,"docs":[
      {
        "id":"a",
        "n_s":"s2",
        "_version_":1647475461695995904}]
  }}

*Result*: There is no 'pre' field in the document anymore.

_My thoughts on it_

1. Data loss can be prevented if the warning is replaced with an error (by re-throwing the exception). Atomic updates for such documents still won't work, but the updates will be explicitly rejected.
2. Solr tries to read the document from the index, merge it with the input document, and re-index the result. But when it reads the indexed pre-analyzed fields, their format differs from the input format, so Solr cannot parse and re-index those fields properly.

Environment: Ubuntu 16.04 LTS / Java 8 (Zulu), Windows 10 / Java 11 (Oracle) (was: Ubuntu 16.04 LTS, Java 8 (Zulu) Windows 10, Java 11 (Oracle))
[jira] [Updated] (SOLR-13850) Atomic Updates with PreAnalyzedField
[ https://issues.apache.org/jira/browse/SOLR-13850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Oleksandr Drapushko updated SOLR-13850:
---
Description:
If you try to update non-pre-analyzed fields in a document using atomic updates, data in pre-analyzed fields (if there is any) will be lost.

Steps to reproduce

1. Index this document into techproducts
  {
    "id": "a",
    "n_s": "s1",
    "pre": "{\"v\":\"1\",\"str\":\"Alaska\",\"tokens\":[{\"t\":\"alaska\",\"s\":0,\"e\":6,\"i\":1}]}"
  }
2. Query the document
  {
    "response":{"numFound":1,"start":0,"maxScore":1.0,"docs":[
      {
        "id":"a",
        "n_s":"s1",
        "pre":"Alaska",
        "_version_":1647475215142223872}]
  }}
3. Update using atomic syntax
  {
    "add": {
      "doc": {
        "id": "a",
        "n_s": {"set": "s2"}
  }}}
4. Observe the warning in solr log
  UI:
    WARN x:techproducts_shard2_replica_n6 PreAnalyzedField Error parsing pre-analyzed field 'pre'
  solr.log:
    WARN (qtp1384454980-23) [c:techproducts s:shard2 r:core_node8 x:techproducts_shard2_replica_n6] o.a.s.s.PreAnalyzedField Error parsing pre-analyzed field 'pre' => java.io.IOException: Invalid JSON type java.lang.String, expected Map
      at org.apache.solr.schema.JsonPreAnalyzedParser.parse(JsonPreAnalyzedParser.java:86)
5. Query the document again
  {
    "response":{"numFound":1,"start":0,"maxScore":1.0,"docs":[
      {
        "id":"a",
        "n_s":"s2",
        "_version_":1647475461695995904}]
  }}

Result: There is no 'pre' field in the document anymore.

My thoughts on it

1. Data loss can be prevented if the warning is replaced with an error (by re-throwing the exception). Atomic updates for such documents still won't work, but the updates will be explicitly rejected.
2. Solr tries to read the document from the index, merge it with the input document, and re-index the result. But when it reads the indexed pre-analyzed fields, their format differs from the input format, so Solr cannot parse and re-index those fields properly.
[jira] [Created] (SOLR-13850) Atomic Updates with PreAnalyzedField
Oleksandr Drapushko created SOLR-13850:
--
Summary: Atomic Updates with PreAnalyzedField
Key: SOLR-13850
URL: https://issues.apache.org/jira/browse/SOLR-13850
Project: Solr
Issue Type: Bug
Security Level: Public (Default Security Level. Issues are Public)
Affects Versions: 8.2, 7.7.2
Environment: Ubuntu 16.04 LTS, Java 8 (Zulu) Windows 10, Java 11 (Oracle)
Reporter: Oleksandr Drapushko

If you try to update non-pre-analyzed fields in a document using atomic updates, data in pre-analyzed fields (if there is any) will be lost.

*Steps to reproduce*

1. Index this document into techproducts
  {
    "id": "a",
    "n_s": "s1",
    "pre": "{\"v\":\"1\",\"str\":\"Alaska\",\"tokens\":[{\"t\":\"alaska\",\"s\":0,\"e\":6,\"i\":1}]}"
  }
2. Query the document
  {
    "response":{"numFound":1,"start":0,"maxScore":1.0,"docs":[
      {
        "id":"a",
        "n_s":"s1",
        "pre":"Alaska",
        "_version_":1647475215142223872}]
  }}
3. Update using atomic syntax
  {
    "add": {
      "doc": {
        "id": "a",
        "n_s": {"set": "s2"}
  }}}
4. Observe the warning in solr log
  UI:
    WARN x:techproducts_shard2_replica_n6 PreAnalyzedField Error parsing pre-analyzed field 'pre'
  solr.log:
    WARN (qtp1384454980-23) [c:techproducts s:shard2 r:core_node8 x:techproducts_shard2_replica_n6] o.a.s.s.PreAnalyzedField Error parsing pre-analyzed field 'pre' => java.io.IOException: Invalid JSON type java.lang.String, expected Map
      at org.apache.solr.schema.JsonPreAnalyzedParser.parse(JsonPreAnalyzedParser.java:86)
5. Query the document again
  {
    "response":{"numFound":1,"start":0,"maxScore":1.0,"docs":[
      {
        "id":"a",
        "n_s":"s2",
        "_version_":1647475461695995904}]
  }}

*Result*: There is no 'pre' field in the document anymore.

_My thoughts on it_

1. Data loss can be prevented if the warning is replaced with an error (by re-throwing the exception). Atomic updates for such documents still won't work, but the updates will be explicitly rejected.
2. Solr tries to read the document from the index, merge it with the input document, and re-index the result. But when it reads the indexed pre-analyzed fields, their format differs from the input format, so Solr cannot parse and re-index those fields properly.
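The difference between the current behaviour (log a warning, lose the field) and the suggested one (re-throw, reject the update) can be sketched in isolation. Everything below is a hypothetical stand-in and not Solr code: parsePreAnalyzed only mimics the "expected Map" check reported in the log, and reindex is a toy read-merge-reindex step over plain maps.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

class PreAnalyzedMergeSketch {

  // Hypothetical parser stand-in: accepts only a JSON-object-like Map.
  static Map<?, ?> parsePreAnalyzed(Object value) throws IOException {
    if (!(value instanceof Map)) {
      throw new IOException(
          "Invalid JSON type " + value.getClass().getName() + ", expected Map");
    }
    return (Map<?, ?>) value;
  }

  // Toy atomic update: merge the stored document with the update, then
  // re-parse the pre-analyzed field before "re-indexing".
  static Map<String, Object> reindex(Map<String, Object> stored,
                                     Map<String, Object> update,
                                     String preField,
                                     boolean strict) throws IOException {
    Map<String, Object> merged = new LinkedHashMap<>(stored);
    merged.putAll(update);
    Object pre = merged.get(preField);
    if (pre != null) {
      try {
        parsePreAnalyzed(pre);
      } catch (IOException e) {
        if (strict) {
          throw e; // suggested behaviour: reject the whole update
        }
        // reported behaviour: warn and silently drop the field
        System.err.println("WARN Error parsing pre-analyzed field '" + preField + "'");
        merged.remove(preField);
      }
    }
    return merged;
  }

  public static void main(String[] args) throws IOException {
    Map<String, Object> stored = new HashMap<>();
    stored.put("id", "a");
    stored.put("n_s", "s1");
    stored.put("pre", "Alaska"); // stored form is a plain string, not a Map

    Map<String, Object> update = new HashMap<>();
    update.put("n_s", "s2");

    // Lenient: the update succeeds but 'pre' is gone.
    System.out.println(reindex(stored, update, "pre", false).containsKey("pre"));

    // Strict: the update is rejected and the stored document is untouched.
    try {
      reindex(stored, update, "pre", true);
    } catch (IOException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

In the strict variant the caller sees the failure and the original document keeps its 'pre' value, which matches the trade-off described in point 1: atomic updates on such documents still fail, but without silent data loss.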