date:20191016

[jira] [Created] (LUCENE-9011) Updating breaks backward compatibility by throwing IndexFormatTooOldException in some cases

2019-10-16 Thread xia0c (Jira)

xia0c created LUCENE-9011:
-

 Summary: Updating breaks backward compatibility by throwing IndexFormatTooOldException in some cases
 Key: LUCENE-9011
 URL: https://issues.apache.org/jira/browse/LUCENE-9011
 Project: Lucene - Core
 Issue Type: Bug
 Components: core/FSTs
 Affects Versions: 7.7.1
 Reporter: xia0c


When I update Lucene from 7.7.1 to the latest version, 8.2.0, the 
following code:


{code:java}
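@Test
public void test() throws IOException {
  // FST.read declares IOException, so the test has to declare (or handle) it.
  String fstFileName = "fst/slovaklemma_ascii.fst";
  File fstFile = new File(fstFileName);
  // CharSequenceOutputs yields CharsRef outputs, hence FST<CharsRef>.
  FST<CharsRef> fst = FST.read(fstFile.toPath(), CharSequenceOutputs.getSingleton());
}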
{code}

throws an IndexFormatTooOldException:

{code:java}
org.apache.lucene.index.IndexFormatTooOldException: Format version is not supported (resource org.apache.lucene.store.InputStreamDataInput@69d9c55): 4 (needs to be between 6 and 6). This version of Lucene only supports indexes created with release 6.0 and later.
    at org.apache.lucene.codecs.CodecUtil.checkHeaderNoMagic(CodecUtil.java:213)
    at org.apache.lucene.codecs.CodecUtil.checkHeader(CodecUtil.java:198)
    at org.apache.lucene.util.fst.FST.<init>(FST.java:275)
    at org.apache.lucene.util.fst.FST.<init>(FST.java:263)
    at org.apache.lucene.util.fst.FST.read(FST.java:487)
{code}
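
The message in the trace reports a stored format version of 4 and an accepted range of 6 to 6. A minimal sketch of that kind of range check, assuming nothing about Lucene's actual implementation: the class, method, and exception names below are hypothetical stand-ins, chosen only so the example compiles without Lucene on the classpath.

```java
/** Hypothetical, Lucene-free sketch of a header version range check. */
final class HeaderVersionCheckSketch {

  /** Stand-in for an "index format too old" failure (not Lucene's class). */
  static final class FormatTooOldException extends RuntimeException {
    FormatTooOldException(String message) {
      super(message);
    }
  }

  /** Stand-in for an "index format too new" failure (not Lucene's class). */
  static final class FormatTooNewException extends RuntimeException {
    FormatTooNewException(String message) {
      super(message);
    }
  }

  /**
   * Returns the stored version if it lies in [minVersion, maxVersion],
   * otherwise throws the matching exception.
   */
  static int checkVersion(int storedVersion, int minVersion, int maxVersion) {
    if (storedVersion < minVersion) {
      throw new FormatTooOldException(
          storedVersion + " (needs to be between " + minVersion + " and " + maxVersion + ")");
    }
    if (storedVersion > maxVersion) {
      throw new FormatTooNewException(
          storedVersion + " (needs to be between " + minVersion + " and " + maxVersion + ")");
    }
    return storedVersion;
  }

  public static void main(String[] args) {
    // A file written with format version 4, read by code that accepts only 6..6.
    try {
      checkVersion(4, 6, 6);
    } catch (FormatTooOldException e) {
      System.out.println("too old: " + e.getMessage());
    }
  }
}
```

Under this reading, the FST file would have to be regenerated with a newer Lucene version before 8.2.0 can read it; whether that is the intended upgrade path is for the Lucene developers to confirm.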




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[jira] [Commented] (SOLR-13846) PreemptiveBasicAuthClientBuilderFactory use of static code blocks makes it unreliable in tests

2019-10-16 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-13846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953284#comment-16953284
 ] 

ASF subversion and git services commented on SOLR-13846:


Commit 25968e3b75e5e9a4f2a64de10500aae10a257bdd in lucene-solr's branch 
refs/heads/branch_8_3 from Chris M. Hostetter
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=25968e3 ]

SOLR-13846: workaround - eliminate use of problematic 
PreemptiveBasicAuthClientBuilderFactory in tests that don't need it

(cherry picked from commit 939b3364e604a4a16b3c4c5f278c4d7f30f1354b)


> PreemptiveBasicAuthClientBuilderFactory use of static code blocks makes it 
> unreliable in tests
> --
>
> Key: SOLR-13846
> URL: https://issues.apache.org/jira/browse/SOLR-13846
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Reporter: Chris M. Hostetter
> Priority: Major
>
> PreemptiveBasicAuthClientBuilderFactory uses static code blocks to initialize 
> global static variables in a way that makes it largely unusable in tests.
> Notably: it uses {{System.getProperty(...)}} during classloading to read 
> system properties that then affect the behavior of all future instances -- 
> even if an individual test explicitly sets the system property in question 
> before instantiating instances of this class.
> This means that if two tests that both use instances of 
> PreemptiveBasicAuthClientBuilderFactory run in the same JVM, only the system 
> properties set in the first test will be used by 
> PreemptiveBasicAuthClientBuilderFactory in the *second* test (even though the 
> system properties get reset by the test framework between runs).
> There are currently two tests using PreemptiveBasicAuthClientBuilderFactory, 
> and depending on the order they run, one will fail...
> {noformat}
> $ ant test -Dtests.jvms=1 
> '-Dtests.class=*.TestQueryingOnDownCollection|*.BasicAuthOnSingleNodeTest' 
> -Dtests.seed=EC8FB67A91689F48 -Dtests.slow=true -Dtests.badapples=true 
> -Dtests.locale=sl -Dtests.timezone=Asia/Baghdad -Dtests.asserts=true 
> -Dtests.file.encoding=US-ASCII
> ...
>[junit4]   2> NOTE: reproduce with: ant test  
> -Dtestcase=BasicAuthOnSingleNodeTest -Dtests.method=basicTest 
> -Dtests.seed=EC8FB67A91689F48 -Dtests.slow=true -Dtests.badapples=true 
> -Dtests.locale=sl -Dtests.timezone=Asia/Baghdad -Dtests.asserts=true 
> -Dtests.file.encoding=US-ASCII
>[junit4] ERROR   4.05s | BasicAuthOnSingleNodeTest.basicTest <<<
>[junit4]> Throwable #1: 
> org.apache.solr.client.solrj.impl.BaseHttpSolrClient$RemoteSolrException: 
> Error from server at http://127.0.0.1:37047/solr: Expected mime type 
> application/octet-stream but got text/html. 
>[junit4]> Error 401 Bad credentials
>[junit4]> HTTP ERROR 401
>[junit4]> Problem accessing /solr/authCollection/select. Reason:
>[junit4]> Bad credentials
>[junit4]> Powered by Jetty:// 9.4.19.v20190610
>[junit4]>  at 
> __randomizedtesting.SeedInfo.seed([EC8FB67A91689F48:1E7BA118D5CD927B]:0)
>[junit4]>  at 
> org.apache.solr.client.solrj.impl.Http2SolrClient.processErrorsAndResponse(Http2SolrClient.java:696)
>[junit4]>  at 
> org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:402)
>[junit4]>  at 
> org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:754)
>[junit4]>  at 
> org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:207)
>[junit4]>  at 
> org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:1003)
>[junit4]>  at 
> org.apache.solr.security.BasicAuthOnSingleNodeTest.basicTest(BasicAuthOnSingleNodeTest.java:72)
>[junit4]>  at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>[junit4]>  at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>[junit4]>  at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>[junit4]>  at 
> java.base/java.lang.reflect.Method.invoke(Method.java:566)
>[junit4]>  at java.base/java.lang.Thread.run(Thread.java:834)
> {noformat}
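
The classloading behaviour described above can be reproduced without Solr. The following is only a sketch under stated assumptions: the property name and both factory classes are hypothetical and are not the real PreemptiveBasicAuthClientBuilderFactory code. It contrasts a value captured once in a static initializer with a value read each time an instance is created.

```java
/** Hypothetical sketch: static-initializer capture vs. per-instance lookup. */
final class StaticInitSketch {

  /** Hypothetical property name, used only for this illustration. */
  static final String PROP = "sketch.basicauth.credentials";

  /** Reads the property once, when the class is initialized by the JVM. */
  static final class EagerFactory {
    private static final String CREDENTIALS = System.getProperty(PROP);

    String credentials() {
      return CREDENTIALS;
    }
  }

  /** Reads the property every time an instance is constructed. */
  static final class LazyFactory {
    private final String credentials = System.getProperty(PROP);

    String credentials() {
      return credentials;
    }
  }

  public static void main(String[] args) {
    // "First test": sets the property, then uses both factories.
    System.setProperty(PROP, "first-user:first-pass");
    System.out.println("eager #1: " + new EagerFactory().credentials());
    System.out.println("lazy  #1: " + new LazyFactory().credentials());

    // "Second test" in the same JVM: changes the property before instantiating.
    System.setProperty(PROP, "second-user:second-pass");
    // EagerFactory was already initialized, so it still reports the first value.
    System.out.println("eager #2: " + new EagerFactory().credentials());
    // LazyFactory picks up the new value.
    System.out.println("lazy  #2: " + new LazyFactory().credentials());
  }
}
```

Whether the eventual fix moves the lookup into the constructor, as `LazyFactory` does here, or takes another route is not stated in this thread; the commit above only removes the factory from tests that do not need it.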




[jira] [Resolved] (SOLR-13741) AuditLoggerIntegrationTest hardening

2019-10-16 Thread Chris M. Hostetter (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris M. Hostetter resolved SOLR-13741.
---
Resolution: Fixed

> AuditLoggerIntegrationTest hardening
> 
>
> Key: SOLR-13741
> URL: https://issues.apache.org/jira/browse/SOLR-13741
> Project: Solr
> Issue Type: Test
> Security Level: Public (Default Security Level. Issues are Public)
> Reporter: Chris M. Hostetter
> Assignee: Chris M. Hostetter
> Priority: Major
> Fix For: master (9.0), 8.4
>
> Attachments: SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, 
> SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, 
> SOLR-13741.patch
>
>
> This issue started out as an investigation into possible test or code bugs 
> uncovered while hardening AuditLoggerIntegrationTest against timing-related 
> failures. The bugs that were identified as being in code were spun off into 
> their own issues for tracking purposes, to raise visibility to end users.
> This issue remains for tracking the final hardening of the test and the fixing 
> of some test bugs found along the way.
> Original Jira description below...
> 
> A while back I saw a weird non-reproducible failure from 
> AuditLoggerIntegrationTest. When I started reading through that code, 2 
> things jumped out at me:
> # the way the 'delay' option works is brittle, and makes assumptions about 
> CPU scheduling that aren't necessarily going to be true (and also suffers 
> from the problem that Thread.sleep isn't guaranteed to sleep as long as you 
> ask it to)
> # the existing {{waitForAuditEventCallbacks(number)}} logic works by 
> checking the size of a (List) {{buffer}} of received events in a sleep/poll 
> loop, until it contains at least N items -- but the code that adds items to 
> that buffer in the async Callback thread runs _before_ the code that updates 
> other state variables (like the global {{count}} and the patch specific 
> {{resourceCounts}}), meaning that a test waiting on 3 events could "see" 3 
> events added to the buffer, but calling {{assertEquals(3, 
> receiver.getTotalCount())}} could subsequently fail because that variable 
> hadn't been updated yet.
> #2 was the source of the failures I was seeing, and while a quick fix for 
> that specific problem would be to update all other state _before_ adding the 
> event to the buffer, I set out to try and make more general improvements to 
> the test:
> * eliminate the dependency on sleep loops by {{await}}-ing on concurrent data 
> structures
> * harden the assertions made about the expected events received (updating 
> some test methods that currently just assert the number of events received)
> * add new assertions that _only_ the expected events are received.
> In the process of doing this, I've found several oddities/discrepancies 
> between things the test currently claims/asserts, and what *actually* happens 
> under more rigorous scrutiny/assertions.
> I'll attach a patch shortly that has my (in progress) updates and includes 
> copious nocommits about things that seem suspect. The summary of these 
> concerns is:
> * SolrException status codes that do not match what the existing test says 
> they should (but doesn't assert)
> * extra AuditEvents occurring that the existing test does not expect
> * AuditEvents for incorrect credentials that do not at all match the expected 
> AuditEvent in the existing test -- which the current test seems to miss in 
> its assertions because it's picking up some extra events triggered by 
> previous requests earlier in the test that just happen to also match the 
> assertions.
> ...it's not clear to me if the test logic is correct and these are "code 
> bugs" or if the test is faulty.
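
The ordering problem in point #2 and the "await on a concurrent data structure" idea can be sketched together. This is not the committed patch: the class and method names below are hypothetical, and the sketch assumes events are plain strings. The receiver updates its counter before publishing the event, and the waiting side blocks on a `BlockingQueue` instead of sleeping and polling a list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of a callback receiver that a test can await on. */
final class AuditReceiverSketch {
  private final AtomicInteger totalCount = new AtomicInteger();
  private final BlockingQueue<String> events = new LinkedBlockingQueue<>();

  /** Called from the asynchronous callback thread. */
  void onEvent(String event) {
    // Update all derived state first...
    totalCount.incrementAndGet();
    // ...and publish the event last, so anyone who sees it also sees the count.
    events.add(event);
  }

  int getTotalCount() {
    return totalCount.get();
  }

  /** Blocks until n events have arrived, or fails once the overall timeout elapses. */
  List<String> awaitEvents(int n, long timeoutMillis) {
    List<String> received = new ArrayList<>();
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    try {
      while (received.size() < n) {
        long remainingNanos = deadline - System.nanoTime();
        String event = events.poll(remainingNanos, TimeUnit.NANOSECONDS);
        if (event == null) {
          throw new AssertionError(
              "timed out after " + received.size() + " of " + n + " expected events");
        }
        received.add(event);
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for events", e);
    }
    return received;
  }

  public static void main(String[] args) {
    AuditReceiverSketch receiver = new AuditReceiverSketch();
    new Thread(() -> {
      for (int i = 0; i < 3; i++) {
        receiver.onEvent("event-" + i);
      }
    }).start();
    List<String> got = receiver.awaitEvents(3, 5_000);
    System.out.println(got + " total=" + receiver.getTotalCount());
  }
}
```

Because the queue hand-off establishes a happens-before relationship between the producing and the consuming thread, a test that has taken three events out of the queue can rely on `getTotalCount()` being at least 3.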




[jira] [Updated] (SOLR-13741) AuditLoggerIntegrationTest hardening

2019-10-16 Thread Chris M. Hostetter (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris M. Hostetter updated SOLR-13741:
--
Fix Version/s: 8.4
   master (9.0)
  Description: 
This issue started out as an investigation into possible test or code bugs 
uncovered while hardening AuditLoggerIntegrationTest against timing-related 
failures. The bugs that were identified as being in code were spun off into 
their own issues for tracking purposes, to raise visibility to end users.

This issue remains for tracking the final hardening of the test and the fixing 
of some test bugs found along the way.

Original Jira description below...



A while back I saw a weird non-reproducible failure from 
AuditLoggerIntegrationTest. When I started reading through that code, 2 things 
jumped out at me:

# the way the 'delay' option works is brittle, and makes assumptions about CPU 
scheduling that aren't necessarily going to be true (and also suffers from the 
problem that Thread.sleep isn't guaranteed to sleep as long as you ask it to)
# the existing {{waitForAuditEventCallbacks(number)}} logic works by 
checking the size of a (List) {{buffer}} of received events in a sleep/poll 
loop, until it contains at least N items -- but the code that adds items to 
that buffer in the async Callback thread runs _before_ the code that updates 
other state variables (like the global {{count}} and the patch specific 
{{resourceCounts}}), meaning that a test waiting on 3 events could "see" 3 
events added to the buffer, but calling {{assertEquals(3, 
receiver.getTotalCount())}} could subsequently fail because that variable 
hadn't been updated yet.

#2 was the source of the failures I was seeing, and while a quick fix for that 
specific problem would be to update all other state _before_ adding the event 
to the buffer, I set out to try and make more general improvements to the test:

* eliminate the dependency on sleep loops by {{await}}-ing on concurrent data 
structures
* harden the assertions made about the expected events received (updating some 
test methods that currently just assert the number of events received)
* add new assertions that _only_ the expected events are received.

In the process of doing this, I've found several oddities/discrepancies between 
things the test currently claims/asserts, and what *actually* happens under 
more rigorous scrutiny/assertions.

I'll attach a patch shortly that has my (in progress) updates and includes 
copious nocommits about things that seem suspect. The summary of these concerns 
is:

* SolrException status codes that do not match what the existing test says they 
should (but doesn't assert)
* extra AuditEvents occurring that the existing test does not expect
* AuditEvents for incorrect credentials that do not at all match the expected 
AuditEvent in the existing test -- which the current test seems to miss in its 
assertions because it's picking up some extra events triggered by previous 
requests earlier in the test that just happen to also match the assertions.


...it's not clear to me if the test logic is correct and these are "code bugs" 
or if the test is faulty.



[jira] [Commented] (SOLR-13741) possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest

2019-10-16 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953281#comment-16953281
 ] 

ASF subversion and git services commented on SOLR-13741:


Commit 28c1049a258bbd060a80803c72e1c6cadc784dab in lucene-solr's branch 
refs/heads/branch_8x from Chris M. Hostetter
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=28c1049 ]

SOLR-13741: Harden AuditLoggerIntegrationTest

(cherry picked from commit 63e9bcf5d150e6324e5133a001613bd7f738a183)


> possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest
> --
>
> Key: SOLR-13741
> URL: https://issues.apache.org/jira/browse/SOLR-13741
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Reporter: Chris M. Hostetter
> Assignee: Chris M. Hostetter
> Priority: Major
> Attachments: SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, 
> SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, 
> SOLR-13741.patch

[jira] [Commented] (SOLR-13741) possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest

2019-10-16 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953279#comment-16953279
 ] 

ASF subversion and git services commented on SOLR-13741:


Commit 63e9bcf5d150e6324e5133a001613bd7f738a183 in lucene-solr's branch 
refs/heads/master from Chris M. Hostetter
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=63e9bcf ]

SOLR-13741: Harden AuditLoggerIntegrationTest


> possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest
> --
>
> Key: SOLR-13741
> URL: https://issues.apache.org/jira/browse/SOLR-13741
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Reporter: Chris M. Hostetter
> Assignee: Chris M. Hostetter
> Priority: Major
> Attachments: SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, 
> SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, 
> SOLR-13741.patch

[jira] [Commented] (LUCENE-9010) extend TopGroups.merge test coverage

2019-10-16 Thread Lucene/Solr QA (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953274#comment-16953274
 ] 

Lucene/Solr QA commented on LUCENE-9010:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 10s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 46s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 46s{color} | {color:red} lucene_grouping generated 4 new + 107 unchanged - 0 fixed = 111 total (was 107) {color} |
| {color:green}+1{color} | {color:green} Release audit (RAT) {color} | {color:green}  0m 46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Check forbidden APIs {color} | {color:green}  0m 46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Validate source patterns {color} | {color:green}  0m 46s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m  3s{color} | {color:green} grouping in the patch passed. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}  5m 10s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | LUCENE-9010 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12983203/LUCENE-9010.patch |
| Optional Tests |  compile  javac  unit  ratsources  checkforbiddenapis  validatesourcepatterns  |
| uname | Linux lucene2-us-west.apache.org 4.4.0-112-generic #135-Ubuntu SMP Fri Jan 19 11:48:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | ant |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-LUCENE-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh |
| git revision | master / ebc720c |
| ant | version: Apache Ant(TM) version 1.9.6 compiled on July 20 2018 |
| Default Java | LTS |
| javac | https://builds.apache.org/job/PreCommit-LUCENE-Build/210/artifact/out/diff-compile-javac-lucene_grouping.txt |
| Test Results | https://builds.apache.org/job/PreCommit-LUCENE-Build/210/testReport/ |
| modules | C: lucene/grouping U: lucene/grouping |
| Console output | https://builds.apache.org/job/PreCommit-LUCENE-Build/210/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> extend TopGroups.merge test coverage
> 
>
> Key: LUCENE-9010
> URL: https://issues.apache.org/jira/browse/LUCENE-9010
> Project: Lucene - Core
> Issue Type: Sub-task
> Reporter: Christine Poerschke
> Priority: Minor
> Attachments: LUCENE-9010.patch
>
>
> This sub-task proposes to add test coverage for the {{TopGroups.merge}} 
> method, separately from but as preparation for LUCENE-8996 fixing the 
> 'maxScore is sometimes missing' bug.




[jira] [Commented] (SOLR-13741) possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest

2019-10-16 Thread Jira



[ 
https://issues.apache.org/jira/browse/SOLR-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953269#comment-16953269
 ] 

Jan Høydahl commented on SOLR-13741:


{quote}Jan: you just beat me to it ... my updated patch looks exactly like 
yours, but with more lazy whitespace :)
{quote}
:)  I'll let you take it from here and do the merge.

> possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest
> --
>
> Key: SOLR-13741
> URL: https://issues.apache.org/jira/browse/SOLR-13741
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Reporter: Chris M. Hostetter
> Assignee: Chris M. Hostetter
> Priority: Major
> Attachments: SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, 
> SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, 
> SOLR-13741.patch

[jira] [Commented] (SOLR-13741) possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest

2019-10-16 Thread Chris M. Hostetter (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953268#comment-16953268
 ] 

Chris M. Hostetter commented on SOLR-13741:
---

Jan: you just beat me to it ... my updated patch looks exactly like yours, but 
with more lazy whitespace :)

Feel free to commit.

> possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest
> --
>
> Key: SOLR-13741
> URL: https://issues.apache.org/jira/browse/SOLR-13741
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Reporter: Chris M. Hostetter
> Assignee: Chris M. Hostetter
> Priority: Major
> Attachments: SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, 
> SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, 
> SOLR-13741.patch
>
>
> A while back I saw a weird non-reproducible failure from 
> AuditLoggerIntegrationTest.  When I started reading through that code, 2 
> things jumped out at me:
> # the way the 'delay' option works is brittle, and makes assumptions about 
> CPU scheduling that aren't necessarily going to be true (and also suffers 
> from the problem that Thread.sleep isn't guaranteed to sleep as long as you 
> ask it to)
> # the existing {{waitForAuditEventCallbacks(number)}} logic works by 
> checking the size of a (List) {{buffer}} of received events in a sleep/poll 
> loop, until it contains at least N items -- but the code that adds items to 
> that buffer in the async Callback thread runs _before_ the code that updates 
> other state variables (like the global {{count}} and the patch specific 
> {{resourceCounts}}), meaning that a test waiting on 3 events could "see" 3 
> events added to the buffer, but calling {{assertEquals(3, 
> receiver.getTotalCount())}} could subsequently fail because that variable 
> hadn't been updated yet.
> #2 was the source of the failures I was seeing, and while a quick fix for 
> that specific problem would be to update all other state _before_ adding the 
> event to the buffer, I set out to try and make more general improvements to 
> the test:
> * eliminate the dependency on sleep loops by {{await}}-ing on concurrent data 
> structures
> * harden the assertions made about the expected events received (updating 
> some test methods that currently just assert the number of events received)
> * add new assertions that _only_ the expected events are received.
> In the process of doing this, I've found several oddities/discrepancies 
> between things the test currently claims/asserts, and what *actually* happens 
> under more rigorous scrutiny/assertions.
> I'll attach a patch shortly that has my (in progress) updates and includes 
> copious nocommits about things that seem suspect.  The summary of these concerns 
> is:
> * SolrException status codes that do not match what the existing test says 
> they should (but doesn't assert)
> * extra AuditEvents occurring that the existing test does not expect
> * AuditEvents for incorrect credentials that do not at all match the expected 
> AuditEvent in the existing test -- which the current test seems to miss in 
> its assertions because it's picking up some extra events triggered by 
> previous requests earlier in the test that just happen to also match the 
> assertions.
> ...it's not clear to me if the test logic is correct and these are "code 
> bugs" or if the test is faulty.
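
To make the race in item #2 concrete, here is a minimal sketch of the two ideas in the description: update all other state before publishing the event, and wait on a concurrent data structure instead of a sleep/poll loop. The names SketchAuditReceiver, onEvent and awaitEvents are hypothetical and chosen for illustration only; this is not the code from the attached patches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the test's callback receiver; not the real Solr class.
class SketchAuditReceiver {
  private final AtomicInteger totalCount = new AtomicInteger();
  private final BlockingQueue<String> buffer = new LinkedBlockingQueue<>();

  // Called from the async callback thread.
  void onEvent(String event) {
    totalCount.incrementAndGet(); // update all other state first...
    buffer.add(event);            // ...and publish the event last
  }

  int getTotalCount() {
    return totalCount.get();
  }

  // Blocks until `expected` events have arrived or the timeout elapses.
  // Returns the events received so far (fewer than `expected` on timeout).
  List<String> awaitEvents(int expected, long timeoutMillis) {
    List<String> events = new ArrayList<>();
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    try {
      while (events.size() < expected) {
        long remaining = deadline - System.nanoTime();
        String e = buffer.poll(remaining, TimeUnit.NANOSECONDS);
        if (e == null) {
          break; // timed out
        }
        events.add(e);
      }
    } catch (InterruptedException ie) {
      Thread.currentThread().interrupt(); // preserve the interrupt for the caller
    }
    return events;
  }
}
```

Because the counter is incremented before the event is placed on the queue, a test thread that has taken N events from awaitEvents can rely on getTotalCount() being at least N.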




[jira] [Commented] (SOLR-13677) All Metrics Gauges should be unregistered by the objects that registered them

2019-10-16 Thread Noble Paul (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-13677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953266#comment-16953266
 ] 

Noble Paul commented on SOLR-13677:
---

[~ab] can you raise a PR so that we can review it easily?

> All Metrics Gauges should be unregistered by the objects that registered them
> -
>
> Key: SOLR-13677
> URL: https://issues.apache.org/jira/browse/SOLR-13677
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: metrics
>Reporter: Noble Paul
>Assignee: Andrzej Bialecki
>Priority: Blocker
> Fix For: 8.3
>
> Attachments: SOLR-13677.patch
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> The life cycle of Metrics producers is managed by the core (mostly). So, if 
> the lifecycle of an object is different from that of the core itself, that 
> object will never be unregistered from the metrics registry. This will lead 
> to memory leaks.
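
As an illustration of the idea in the issue title, the sketch below shows a producer that removes its own gauges when it is closed, so the registry never holds on to objects that outlive their usefulness. SketchGaugeRegistry and SketchCache are hypothetical classes written for this example; the real Solr metrics API and the attached patch may look quite different.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical minimal registry; the real metrics registry API is different.
class SketchGaugeRegistry {
  private final Map<String, Supplier<?>> gauges = new ConcurrentHashMap<>();
  private final Map<String, Object> owners = new ConcurrentHashMap<>();

  // Remembers which object registered the gauge so it can be cleaned up later.
  void registerGauge(Object owner, String name, Supplier<?> gauge) {
    gauges.put(name, gauge);
    owners.put(name, owner);
  }

  // Removes every gauge registered by `owner`; returns how many were removed.
  int unregisterGauges(Object owner) {
    int removed = 0;
    for (Map.Entry<String, Object> e : owners.entrySet()) {
      if (e.getValue() == owner && owners.remove(e.getKey(), owner)) {
        gauges.remove(e.getKey());
        removed++;
      }
    }
    return removed;
  }

  int size() {
    return gauges.size();
  }
}

// Hypothetical producer whose lifecycle is shorter than the core's.
class SketchCache implements AutoCloseable {
  private final SketchGaugeRegistry registry;
  private final Map<String, String> entries = new ConcurrentHashMap<>();

  SketchCache(SketchGaugeRegistry registry, String name) {
    this.registry = registry;
    // The gauge captures `entries`, so a forgotten gauge would keep this cache reachable.
    registry.registerGauge(this, name + ".size", entries::size);
  }

  @Override
  public void close() {
    registry.unregisterGauges(this); // the object that registered the gauges removes them
  }
}
```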




[jira] [Commented] (SOLR-13741) possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest

2019-10-16 Thread Jira



[ 
https://issues.apache.org/jira/browse/SOLR-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953265#comment-16953265
 ] 

Jan Høydahl commented on SOLR-13741:


SOLR-13835 merged and updated this patch to master.


[jira] [Updated] (SOLR-13741) possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest

2019-10-16 Thread Jira



 [ 
https://issues.apache.org/jira/browse/SOLR-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated SOLR-13741:
---
Attachment: SOLR-13741.patch


[jira] [Updated] (SOLR-13835) HttpSolrCall produces incorrect extra AuditEvent on AuthorizationResponse.PROMPT

2019-10-16 Thread Jira



 [ 
https://issues.apache.org/jira/browse/SOLR-13835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated SOLR-13835:
---
Fix Version/s: 8.3.0

> HttpSolrCall produces incorrect extra AuditEvent on 
> AuthorizationResponse.PROMPT
> 
>
> Key: SOLR-13835
> URL: https://issues.apache.org/jira/browse/SOLR-13835
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Authentication, Authorization
>Reporter: Chris M. Hostetter
>Assignee: Jan Høydahl
>Priority: Major
> Fix For: 8.3.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> spinning this out of SOLR-13741...
> {quote}
> Wrt the REJECTED + UNAUTHORIZED events I see the same as you, and I believe 
> there is a code bug, not a test bug. In HttpSolrCall#471 in the 
> {{authorize()}} call, if authResponse == PROMPT, it will actually match both 
> blocks and emit two audit events: 
> [https://github.com/apache/lucene-solr/blob/26ede632e6259eb9d16861a3c0f782c9c8999762/solr/core/src/java/org/apache/solr/servlet/HttpSolrCall.java#L475:L493]
>  
> {code:java}
> if (authResponse.statusCode == AuthorizationResponse.PROMPT.statusCode) {...}
> if (!(authResponse.statusCode == HttpStatus.SC_ACCEPTED) && 
> !(authResponse.statusCode == HttpStatus.SC_OK)) {...}
> {code}
> When code==401, it is also true that code!=200. Intuitively there should be 
> both a sendError and a return RETURN before line #484 in the first if block?
> {quote}
> This causes any and all {{REJECTED}} AuditEvent messages to be accompanied by 
> a corresponding {{UNAUTHORIZED}} AuditEvent.  
> It's not yet clear if, from the perspective of the external client, there are 
> any other bugs in behavior (TBD)
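
The control-flow problem described in the quote can be reproduced in isolation. The sketch below is a simplified, hypothetical model (AuthorizeFlowSketch is not the real HttpSolrCall code): it only records which audit event names each path would emit, and assumes the PROMPT status code is 401, as the quote implies. The "fixed" variant returns from the first block, which is one possible fix along the lines suggested above, not necessarily what the committed change does.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the control flow; not the real HttpSolrCall code.
class AuthorizeFlowSketch {
  static final int SC_OK = 200;
  static final int SC_ACCEPTED = 202;
  static final int PROMPT = 401; // assumed status code of AuthorizationResponse.PROMPT

  // Mirrors the reported shape: two independent ifs, so a 401 falls into both.
  static List<String> buggy(int statusCode) {
    List<String> events = new ArrayList<>();
    if (statusCode == PROMPT) {
      events.add("REJECTED");
    }
    if (statusCode != SC_ACCEPTED && statusCode != SC_OK) {
      events.add("UNAUTHORIZED");
    }
    return events;
  }

  // One possible fix: leave the method at the end of the first block.
  static List<String> fixed(int statusCode) {
    List<String> events = new ArrayList<>();
    if (statusCode == PROMPT) {
      events.add("REJECTED");
      return events; // stands in for sendError(...) followed by return RETURN
    }
    if (statusCode != SC_ACCEPTED && statusCode != SC_OK) {
      events.add("UNAUTHORIZED");
    }
    return events;
  }
}
```

With this model, buggy(401) yields both REJECTED and UNAUTHORIZED, while fixed(401) yields only REJECTED and other non-success codes such as 403 still yield UNAUTHORIZED.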




[jira] [Resolved] (SOLR-13835) HttpSolrCall produces incorrect extra AuditEvent on AuthorizationResponse.PROMPT

2019-10-16 Thread Jira



 [ 
https://issues.apache.org/jira/browse/SOLR-13835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl resolved SOLR-13835.

Resolution: Fixed

Pushed to master, branch_8x and branch_8_3


[jira] [Commented] (SOLR-13835) HttpSolrCall produces incorrect extra AuditEvent on AuthorizationResponse.PROMPT

2019-10-16 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-13835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953259#comment-16953259
 ] 

ASF subversion and git services commented on SOLR-13835:


Commit b58695c98ce1356efc27beeb338a8300f6f72346 in lucene-solr's branch 
refs/heads/branch_8_3 from Jan Høydahl
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=b58695c ]

SOLR-13835 HttpSolrCall produces incorrect extra AuditEvent on 
AuthorizationResponse.PROMPT (#946)

(cherry picked from commit 611c4f960e9472880e2ec24dda9336a59cd41426)


> HttpSolrCall produces incorrect extra AuditEvent on 
> AuthorizationResponse.PROMPT
> 
>
> Key: SOLR-13835
> URL: https://issues.apache.org/jira/browse/SOLR-13835
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Authentication, Authorization
>Reporter: Chris M. Hostetter
>Assignee: Jan Høydahl
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h

[jira] [Resolved] (SOLR-13852) TestCloudNestedDocsSort can use the same uniqueKey for both a parent and child doc

2019-10-16 Thread Chris M. Hostetter (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-13852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris M. Hostetter resolved SOLR-13852.
---
Fix Version/s: master (9.0)
   8.4
 Assignee: Chris M. Hostetter
   Resolution: Fixed

> TestCloudNestedDocsSort can use the same uniqueKey for both a parent and 
> child doc
> --
>
> Key: SOLR-13852
> URL: https://issues.apache.org/jira/browse/SOLR-13852
> Project: Solr
>  Issue Type: Test
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Chris M. Hostetter
>Assignee: Chris M. Hostetter
>Priority: Major
> Fix For: 8.4, master (9.0)
>
> Attachments: thetaphi_Lucene-Solr-master-Linux_24903.log.txt
>
>
> TestCloudNestedDocsSort uses randomly generated "id" values for all docs, 
> which not only means that two "parent" docs can be indexed with the same "id" 
> value, but also that a child doc might be indexed with the same "id" value as 
> a parent doc.
> While nothing in Solr actively prevents this, it's documented as something 
> people shouldn't do, and can cause problems.
> In particular, this has caused some assertion failures for some test seeds 
> due to how it interacts with SOLR-13851
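
One way a test can keep randomly generated ids while still respecting uniqueKey rules is to reject any id that has already been handed out, for parent and child docs alike. The helper below is a hypothetical sketch (UniqueIdSource is not taken from the actual commit) of that approach.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

// Hypothetical helper, not the code from the actual commit.
class UniqueIdSource {
  private final Random random;
  private final int bound;
  private final Set<String> used = new HashSet<>();

  // `bound` is the size of the id space: ids are drawn from [0, bound).
  UniqueIdSource(long seed, int bound) {
    this.random = new Random(seed);
    this.bound = bound;
  }

  // Keeps drawing random ids until one has not been handed out before.
  // Use the same source for parent and child docs so they can never share an id.
  String nextId() {
    if (used.size() >= bound) {
      throw new IllegalStateException("id space exhausted");
    }
    String id;
    do {
      id = Integer.toString(random.nextInt(bound));
    } while (!used.add(id));
    return id;
  }
}
```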




[jira] [Commented] (SOLR-13852) TestCloudNestedDocsSort can use the same uniqueKey for both a parent and child doc

2019-10-16 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-13852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953258#comment-16953258
 ] 

ASF subversion and git services commented on SOLR-13852:


Commit 3a67c82c9161454e3a7e6bf76cde7ed7e4018f28 in lucene-solr's branch 
refs/heads/branch_8x from Chris M. Hostetter
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=3a67c82 ]

SOLR-13852: Fix TestCloudNestedDocsSort to ensure child docs are never created 
in a way that violates uniqueKey rules

(cherry picked from commit ebc720c5b09ae06b8ab093b296bf87e4f6ed978f)


> TestCloudNestedDocsSort can use the same uniqueKey for both a parent and 
> child doc
> --
>
> Key: SOLR-13852
> URL: https://issues.apache.org/jira/browse/SOLR-13852
> Project: Solr
>  Issue Type: Test
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Chris M. Hostetter
>Priority: Major
> Attachments: thetaphi_Lucene-Solr-master-Linux_24903.log.txt

[jira] [Commented] (SOLR-13835) HttpSolrCall produces incorrect extra AuditEvent on AuthorizationResponse.PROMPT

2019-10-16 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-13835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953257#comment-16953257
 ] 

ASF subversion and git services commented on SOLR-13835:


Commit 5a074b0fe49ef863a162e7f5d55e351bc043c806 in lucene-solr's branch 
refs/heads/branch_8x from Jan Høydahl
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=5a074b0 ]

SOLR-13835 HttpSolrCall produces incorrect extra AuditEvent on 
AuthorizationResponse.PROMPT (#946)

(cherry picked from commit 611c4f960e9472880e2ec24dda9336a59cd41426)



[jira] [Comment Edited] (LUCENE-8987) Move Lucene web site from svn to git

2019-10-16 Thread Jira



[ 
https://issues.apache.org/jira/browse/LUCENE-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953165#comment-16953165
 ] 

Jan Høydahl edited comment on LUCENE-8987 at 10/16/19 11:00 PM:


Steps
 # Create new git repo '{{lucene-site}}'
 # Create folder structure and copy old site (excluding JavaDoc and online 
RefGuide) from svn into appropriate folder(s)
 # Adapt to make local Pelican site build work for building the barebones site, 
and commit to master branch
 # Add {{.asf.yaml}} file with a 'staging' profile for branch asf-staging, and 
a 'publish' profile for branch 'asf-site', and a 'pelican' directive to auto 
build from 'master' branch and put site into 'asf-staging' branch (/output 
folder).
 # Verify that the staging build kicks off and that a site appears in 
[lucene.staged.apache.org|https://lucene.staged.apache.org/] (note that this is 
different from lucene.staging.apache.org that old CMS uses)
 # Find a solution for JavaDoc and RefGuide, which are *huge* amounts of 
statically generated HTML uploaded by RM during build.
 ** These should just be put on a filesystem somewhere, outside of git
 ** Do some {{.htaccess}} magic to make them appear in the right locations of 
the site
 # Once the staging site is good, merge {{asf-staging}} into {{asf-site}} 
branch to publish. This will automatically disable CMS.
 # Commit a README-NOT-IN-USE file to old svn repo and make it read-only

Note that also the RM guidelines need to be updated wrt
 * how to update website, download pages etc during a release
 * how to publish JavaDoc
 * how to publish RefGuide HTML


was (Author: janhoy):
Steps
 # Create new git repo '{{lucene-site}}'
 # Create folder structure and copy old site from svn into appropriate folder(s)
 # Adapt to make local Pelican site build work, and commit to master branch
 # Add {{.asf.yaml}} file with a 'staging' profile for branch asf-staging, and 
a 'publish' profile for branch 'asf-site'
 # Merge master branch into 'asf-staging' and verify that the staging build 
kicks off and that a site appears in 
[lucene.staged.apache.org|https://lucene.staged.apache.org/] (note that this is 
different from lucene.staging.apache.org that old CMS uses)
 # Iterate until the site is perfect for publishing
 # Merge master branch into 'asf-site' branch, which will publish to the real 
site and automatically disable old CMS
 # Commit a README-NOT-IN-USE file to old svn repo and make it read-only

> Move Lucene web site from svn to git
> 
>
> Key: LUCENE-8987
> URL: https://issues.apache.org/jira/browse/LUCENE-8987
> Project: Lucene - Core
>  Issue Type: Task
>  Components: general/website
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
> Attachments: lucene-site-repo.png
>
>
> INFRA just enabled [a new way of configuring website 
> build|https://s.apache.org/asfyaml] from a git branch, [see dev list 
> email|https://lists.apache.org/thread.html/b6f7e40bece5e83e27072ecc634a7815980c90240bc0a2ccb417f1fd@%3Cdev.lucene.apache.org%3E].
>  It allows for automatic builds of both staging and production site, much 
> like the old CMS. We can choose to auto publish the html content of an 
> {{output/}} folder, or to have a bot build the site using 
> [Pelican|https://github.com/getpelican/pelican] from a {{content/}} folder.
> The goal of this issue is to explore how this can be done for 
> [http://lucene.apache.org|http://lucene.apache.org/] by creating a new 
> git repo {{lucene-site}}, copying over the site from svn, seeing if it can be 
> "Pelicanized" easily and then testing staging. Benefits are that more people 
> will be able to edit the web site and we can take PRs from the public (with 
> GitHub preview of pages).
> Non-goals:
>  * Create a new web site or a new graphic design
>  * Change from Markdown to Asciidoc




[jira] [Commented] (SOLR-13852) TestCloudNestedDocsSort can use the same uniqueKey for both a parent and child doc

2019-10-16 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-13852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953248#comment-16953248
 ] 

ASF subversion and git services commented on SOLR-13852:


Commit ebc720c5b09ae06b8ab093b296bf87e4f6ed978f in lucene-solr's branch 
refs/heads/master from Chris M. Hostetter
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=ebc720c ]

SOLR-13852: Fix TestCloudNestedDocsSort to ensure child docs are never created 
in a way that violates uniqueKey rules



[jira] [Commented] (SOLR-13835) HttpSolrCall produces incorrect extra AuditEvent on AuthorizationResponse.PROMPT

2019-10-16 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-13835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953246#comment-16953246
 ] 

ASF subversion and git services commented on SOLR-13835:


Commit 611c4f960e9472880e2ec24dda9336a59cd41426 in lucene-solr's branch 
refs/heads/master from Jan Høydahl
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=611c4f9 ]

SOLR-13835 HttpSolrCall produces incorrect extra AuditEvent on 
AuthorizationResponse.PROMPT (#946)




[GitHub] [lucene-solr] janhoy merged pull request #946: SOLR-13835 HttpSolrCall produces incorrect extra AuditEvent on AuthorizationResponse.PROMPT

2019-10-16 Thread GitBox

janhoy merged pull request #946: SOLR-13835 HttpSolrCall produces incorrect 
extra AuditEvent on AuthorizationResponse.PROMPT
URL: https://github.com/apache/lucene-solr/pull/946
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene-solr] ErickErickson commented on issue #888: SOLR-13774 add lucene/solr openjdk compatibility matrix to ref guide.

2019-10-16 Thread GitBox

ErickErickson commented on issue #888: SOLR-13774 add lucene/solr openjdk 
compatibility matrix to ref guide.
URL: https://github.com/apache/lucene-solr/pull/888#issuecomment-542910805
 
 
   Hmmm, I think overall the idea of putting it on a Wiki page then linking to 
it from the ref guide makes sense. We can put some disclaimers in about testing 
etc. as well.



[jira] [Commented] (SOLR-13835) HttpSolrCall produces incorrect extra AuditEvent on AuthorizationResponse.PROMPT

2019-10-16 Thread Chris M. Hostetter (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-13835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953219#comment-16953219
 ] 

Chris M. Hostetter commented on SOLR-13835:
---

+1

> HttpSolrCall produces incorrect extra AuditEvent on 
> AuthorizationResponse.PROMPT
> 
>
> Key: SOLR-13835
> URL: https://issues.apache.org/jira/browse/SOLR-13835
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Authentication, Authorization
>Reporter: Chris M. Hostetter
>Assignee: Jan Høydahl
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>

[jira] [Commented] (LUCENE-8987) Move Lucene web site from svn to git

2019-10-16 Thread Jira



[ 
https://issues.apache.org/jira/browse/LUCENE-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953171#comment-16953171
 ] 

Jan Høydahl commented on LUCENE-8987:
-

!lucene-site-repo.png|width=339!

> Move Lucene web site from svn to git
> 
>
> Key: LUCENE-8987
> URL: https://issues.apache.org/jira/browse/LUCENE-8987
> Project: Lucene - Core
>  Issue Type: Task
>  Components: general/website
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
> Attachments: lucene-site-repo.png
>
>
> INFRA just enabled [a new way of configuring website 
> build|https://s.apache.org/asfyaml] from a git branch, [see dev list 
> email|https://lists.apache.org/thread.html/b6f7e40bece5e83e27072ecc634a7815980c90240bc0a2ccb417f1fd@%3Cdev.lucene.apache.org%3E].
>  It allows for automatic builds of both staging and production site, much 
> like the old CMS. We can choose to auto publish the html content of an 
> {{output/}} folder, or to have a bot build the site using 
> [Pelican|https://github.com/getpelican/pelican] from a {{content/}} folder.
> The goal of this issue is to explore how this can be done for 
> [http://lucene.apache.org|http://lucene.apache.org/] by creating a new 
> git repo {{lucene-site}}, copy over the site from svn, see if it can be 
> "Pelicanized" easily and then test staging. Benefits are that more people 
> will be able to edit the web site and we can take PRs from the public (with 
> GitHub preview of pages).
> Non-goals:
>  * Create a new web site or a new graphic design
>  * Change from Markdown to Asciidoc




[jira] [Updated] (LUCENE-8987) Move Lucene web site from svn to git

2019-10-16 Thread Jira



 [ 
https://issues.apache.org/jira/browse/LUCENE-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated LUCENE-8987:

Attachment: lucene-site-repo.png

> Move Lucene web site from svn to git
> 
>
> Key: LUCENE-8987
> URL: https://issues.apache.org/jira/browse/LUCENE-8987
> Project: Lucene - Core
>  Issue Type: Task
>  Components: general/website
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
> Attachments: lucene-site-repo.png
>
>

[jira] [Resolved] (SOLR-12786) Upgrade refGuide build to Asciidoctor 2.0.10 and new link structure

2019-10-16 Thread Cassandra Targett (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-12786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cassandra Targett resolved SOLR-12786.
--
Fix Version/s: 8.3
   Resolution: Fixed

> Upgrade refGuide build to Asciidoctor 2.0.10 and new link structure
> ---
>
> Key: SOLR-12786
> URL: https://issues.apache.org/jira/browse/SOLR-12786
> Project: Solr
>  Issue Type: Improvement
>  Components: Build, documentation
>Affects Versions: 8.0
>Reporter: Jan Høydahl
>Assignee: Cassandra Targett
>Priority: Major
> Fix For: 8.3
>
> Attachments: SOLR-12786.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently the refguide build requires asciidoctor 1.5.6.2.
> People using {{gem install jekyll-asciidoc}} will end up with version 1.5.7, 
> causing different header ID syntax and the build will break.
> Long term we should move to latest asciidoctor.
> It is already documented in README how to install the older 1.5.6.2 version.
>  




[jira] [Commented] (LUCENE-8987) Move Lucene web site from svn to git

2019-10-16 Thread Jira



[ 
https://issues.apache.org/jira/browse/LUCENE-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953165#comment-16953165
 ] 

Jan Høydahl commented on LUCENE-8987:
-

Steps
 # Create new git repo {{lucene-site}}
 # Create folder structure and copy old site from svn into appropriate folder(s)
 # Adapt to make local Pelican site build work, and commit to master branch
 # Add {{.asf.yaml}} file with a 'staging' profile for branch asf-staging, and 
a 'publish' profile for branch 'asf-site'
 # Merge master branch into 'asf-staging' and verify that the staging build 
kicks off and that a site appears in 
[lucene.staged.apache.org|https://lucene.staged.apache.org/] (note that this is 
different from lucene.staging.apache.org that old CMS uses)
 # Iterate until the site is perfect for publishing
 # Merge master branch into 'asf-site' branch, which will publish to the real 
site and automatically disable old CMS
 # Commit a README-NOT-IN-USE file to old svn repo and make it read-only
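
Step 4 above could look roughly like the fragment below. This is a sketch only: the key names ({{staging}}, {{publish}}, {{profile}}, {{whoami}}) are given as I understand the INFRA .asf.yaml documentation linked from this issue and should be checked against it before use.

```yaml
# .asf.yaml (sketch; verify key names against the INFRA documentation)
staging:
  profile: ~            # default profile, i.e. no sub-site name in the staging host
  whoami: asf-staging   # only takes effect when the file lives on the asf-staging branch

publish:
  whoami: asf-site      # only takes effect when the file lives on the asf-site branch
```

Because master is merged into both branches, keeping both sections in one file and letting {{whoami}} select the active one avoids maintaining two diverging copies.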

> Move Lucene web site from svn to git
> 
>
> Key: LUCENE-8987
> URL: https://issues.apache.org/jira/browse/LUCENE-8987
> Project: Lucene - Core
>  Issue Type: Task
>  Components: general/website
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
>

[jira] [Commented] (SOLR-12786) Upgrade refGuide build to Asciidoctor 2.0.10 and new link structure

2019-10-16 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-12786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953161#comment-16953161
 ] 

ASF subversion and git services commented on SOLR-12786:


Commit a27eabbd2132abcd47bb0a5f7c42fcafaded1d9a in lucene-solr's branch 
refs/heads/branch_8_3 from Cassandra Targett
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=a27eabb ]

SOLR-12786: Update Ref Guide build tool versions & fix section links for new 
format requirements


> Upgrade refGuide build to Asciidoctor 2.0.10 and new link structure
> ---
>
> Key: SOLR-12786
> URL: https://issues.apache.org/jira/browse/SOLR-12786
> Project: Solr
>  Issue Type: Improvement
>  Components: Build, documentation
>Affects Versions: 8.0
>Reporter: Jan Høydahl
>Assignee: Cassandra Targett
>Priority: Major
> Attachments: SOLR-12786.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>

[jira] [Commented] (SOLR-12786) Upgrade refGuide build to Asciidoctor 2.0.10 and new link structure

2019-10-16 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-12786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953162#comment-16953162
 ] 

ASF subversion and git services commented on SOLR-12786:


Commit 2f11fd410a4ad707959f366ff7dda63c4cbbb4c4 in lucene-solr's branch 
refs/heads/branch_8_3 from Cassandra Targett
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=2f11fd4 ]

SOLR-12786: add back explicit asciidoctor install for Jenkins build


> Upgrade refGuide build to Asciidoctor 2.0.10 and new link structure
> ---
>
> Key: SOLR-12786
> URL: https://issues.apache.org/jira/browse/SOLR-12786
> Project: Solr
>  Issue Type: Improvement
>  Components: Build, documentation
>Affects Versions: 8.0
>Reporter: Jan Høydahl
>Assignee: Cassandra Targett
>Priority: Major
> Attachments: SOLR-12786.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>

[jira] [Commented] (SOLR-13824) JSON Request API ignores prematurely closing curly brace.

2019-10-16 Thread Lucene/Solr QA (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-13824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953152#comment-16953152
 ] 

Lucene/Solr QA commented on SOLR-13824:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 21s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  4m 16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Release audit (RAT) {color} | {color:green}  1m 26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Check forbidden APIs {color} | {color:green}  1m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Validate source patterns {color} | {color:green}  1m 14s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 56s{color} | {color:green} ltr in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 85m 21s{color} | {color:red} core in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 12m 51s{color} | {color:red} solrj in the patch failed. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}110m 42s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | solr.core.TestSolrConfigHandler |
|   | solr.cloud.autoscaling.AutoAddReplicasIntegrationTest |
|   | solr.search.facet.TestJsonFacetRefinement |
|   | solr.filestore.TestDistribPackageStore |
|   | solr.cloud.autoscaling.AutoAddReplicasPlanActionTest |
|   | solr.client.solrj.cloud.autoscaling.TestPolicyOld |
|   | solr.client.solrj.cloud.autoscaling.TestPolicy |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | SOLR-13824 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12983170/SOLR-13824.patch |
| Optional Tests |  compile  javac  unit  ratsources  checkforbiddenapis  validatesourcepatterns  |
| uname | Linux lucene2-us-west.apache.org 4.4.0-112-generic #135-Ubuntu SMP Fri Jan 19 11:48:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | ant |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-SOLR-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh |
| git revision | master / b3d59a7 |
| ant | version: Apache Ant(TM) version 1.9.6 compiled on July 20 2018 |
| Default Java | LTS |
| unit | https://builds.apache.org/job/PreCommit-SOLR-Build/579/artifact/out/patch-unit-solr_core.txt |
| unit | https://builds.apache.org/job/PreCommit-SOLR-Build/579/artifact/out/patch-unit-solr_solrj.txt |
|  Test Results | https://builds.apache.org/job/PreCommit-SOLR-Build/579/testReport/ |
| modules | C: solr/contrib/ltr solr/core solr/solrj U: solr |
| Console output | https://builds.apache.org/job/PreCommit-SOLR-Build/579/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> JSON Request API ignores prematurely closing curly brace. 
> --
>
> Key: SOLR-13824
> URL: https://issues.apache.org/jira/browse/SOLR-13824
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: JSON Request API
>Reporter: Mikhail Khludnev
>Priority: Major
> Attachments: SOLR-13824.patch, SOLR-13824.patch
>
>
> {code:java}
> json={query:"content:foo", facet:{zz:{field:id}}}
> {code}
> this works fine, but if we mistype {{}}} instead of {{,}}
> {code:java}
> json={query:"content:foo"} facet:{zz:{field:id}}}
> {code}
> It's captured only partially; here's what we have under debug:
> {code:java}
>   "json":{"query":"content:foo"},
> {code}
> I suppose it should throw an error with 400 code.  
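
One way to detect the situation described above is to check whether anything other than whitespace follows the first complete top-level object. The sketch below is a standalone illustration, not Solr's JSON Request API parser: the class and method names are hypothetical, and it only matches braces (skipping over double-quoted strings), so it works on the relaxed syntax used in the examples. A caller could map a positive result to a 400 response.

```java
final class TrailingJsonCheck {
    /** Returns the index just past the brace that closes the first top-level object, or -1. */
    static int endOfFirstObject(String s) {
        int depth = 0;
        boolean inString = false;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (inString) {
                if (c == '\\') {
                    i++;                 // skip the escaped character
                } else if (c == '"') {
                    inString = false;    // closing quote
                }
            } else if (c == '"') {
                inString = true;         // opening quote
            } else if (c == '{') {
                depth++;
            } else if (c == '}') {
                depth--;
                if (depth == 0) {
                    return i + 1;        // first top-level object ends here
                }
            }
        }
        return -1;                       // braces never balanced
    }

    /** True when non-whitespace text follows the first complete object. */
    static boolean hasTrailingContent(String s) {
        int end = endOfFirstObject(s);
        return end >= 0 && !s.substring(end).trim().isEmpty();
    }
}
```

For the mistyped request in the report, the first object ends right after `"content:foo"}` and the remaining ` facet:{zz:{field:id}}}` is flagged as trailing content; the well-formed request has none.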




[jira] [Commented] (LUCENE-8986) Add asf.yaml to our git repo

2019-10-16 Thread Jira



[ 
https://issues.apache.org/jira/browse/LUCENE-8986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953138#comment-16953138
 ] 

Jan Høydahl commented on LUCENE-8986:
-

Please see my proposal in [GitHub Pull Request 
#958|https://github.com/apache/lucene-solr/pull/958], and feel free to provide 
a better project description or additional GitHub topic labels.

Will commit this on Friday.

> Add asf.yaml to our git repo
> 
>
> Key: LUCENE-8986
> URL: https://issues.apache.org/jira/browse/LUCENE-8986
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Adding an {{asf.yaml}} file to our git repo allows us to control the 
> description, link and labels on Lucene-Solr project git page. See 
> https://lists.apache.org/thread.html/b6f7e40bece5e83e27072ecc634a7815980c90240bc0a2ccb417f1fd@%3Cdev.lucene.apache.org%3E
>  for more.
> I'll post a PR with the suggested change




[jira] [Updated] (LUCENE-8986) Add asf.yaml to our git repo

2019-10-16 Thread Jira



 [ 
https://issues.apache.org/jira/browse/LUCENE-8986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated LUCENE-8986:

Component/s: general/website

> Add asf.yaml to our git repo
> 
>
> Key: LUCENE-8986
> URL: https://issues.apache.org/jira/browse/LUCENE-8986
> Project: Lucene - Core
>  Issue Type: Task
>  Components: general/website
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>

[GitHub] [lucene-solr] janhoy opened a new pull request #958: LUCENE-8986: Add asf.yaml to our git repo

2019-10-16 Thread GitBox

janhoy opened a new pull request #958: LUCENE-8986: Add asf.yaml to our git repo
URL: https://github.com/apache/lucene-solr/pull/958
 
 
   # Description
   
   See https://issues.apache.org/jira/browse/LUCENE-8986
   
   # Solution
   
   Adding the .asf.yaml will edit GitHub project description, link and labels
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [x] I have reviewed the guidelines for [How to 
Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms 
to the standards described there to the best of my ability.
   - [ ] I have created a Jira issue and added the issue ID to my pull request 
title.
   - [ ] I am authorized to contribute this code to the ASF and have removed 
any code I do not have a license to distribute.
   - [ ] I have given Solr maintainers 
[access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork)
 to contribute to my PR branch. (optional but recommended)
   - [ ] I have developed this patch against the `master` branch.
   - [ ] I have run `ant precommit` and the appropriate test suite.
   - [ ] I have added tests for my changes.
   - [ ] I have added documentation for the [Ref 
Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) 
(for Solr changes only).
   



[jira] [Updated] (LUCENE-9010) extend TopGroups.merge test coverage

2019-10-16 Thread Christine Poerschke (Jira)



 [ 
https://issues.apache.org/jira/browse/LUCENE-9010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christine Poerschke updated LUCENE-9010:

Status: Patch Available  (was: Open)

> extend TopGroups.merge test coverage
> 
>
> Key: LUCENE-9010
> URL: https://issues.apache.org/jira/browse/LUCENE-9010
> Project: Lucene - Core
>  Issue Type: Sub-task
>Reporter: Christine Poerschke
>Priority: Minor
> Attachments: LUCENE-9010.patch
>
>
> This sub-task proposes to add test coverage for the {{TopGroups.merge}} 
> method, separately from but as preparation for LUCENE-8996 fixing the 
> 'maxScore is sometimes missing' bug.




[jira] [Commented] (LUCENE-8996) maxScore is sometimes missing from distributed grouped responses

2019-10-16 Thread Christine Poerschke (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953080#comment-16953080
 ] 

Christine Poerschke commented on LUCENE-8996:
-

Looking at the {{TopGroupsTest}} portion of both the patch and the pull request 
for this ticket I had some "there's a lot of numbers here" thoughts and it 
(subjectively, of course) seemed to me a little tricky to work out what they 
all are (numbers for shard index, numbers for doc id, numbers for group value, 
numbers for scores, numbers for hit counts, sometimes NaN not-a-number numbers) 
and what they mean and why/that the expected test results are correct.

The LUCENE-9010 sub-task proposes to split out the addition of test coverage 
for the existing code from the 'maxScore missing' fix here. The first 
proposed patch for it tries to reduce the "amount of numbers": e.g. instead of 
integer group values 1 and 2 there are string group values "red" and "blue", 
and a narrative and local variable names (redAntScore, blueDragonflyScore, 
redSquirrelScore, blueWhaleScore) try to make it easier to work out what the 
{{expectedMaxScore}} value is.

> maxScore is sometimes missing from distributed grouped responses
> 
>
> Key: LUCENE-8996
> URL: https://issues.apache.org/jira/browse/LUCENE-8996
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 5.3
>Reporter: Julien Massenet
>Priority: Minor
> Attachments: LUCENE-8996.patch, lucene_6_5-GroupingMaxScore.patch, 
> lucene_solr_5_3-GroupingMaxScore.patch, master-GroupingMaxScore.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This issue occurs when using the grouping feature in distributed mode and 
> sorting by score.
> Each group's {{docList}} in the response is supposed to contain a 
> {{maxScore}} entry that holds the maximum score for that group. Using the 
> current releases, it sometimes happens that this piece of information is not 
> included:
> {code}
> {
>   "responseHeader": {
> "status": 0,
> "QTime": 42,
> "params": {
>   "sort": "score desc",
>   "fl": "id,score",
>   "q": "_text_:\"72\"",
>   "group.limit": "2",
>   "group.field": "group2",
>   "group.sort": "score desc",
>   "group": "true",
>   "wt": "json",
>   "fq": "group2:72 OR group2:45"
> }
>   },
>   "grouped": {
> "group2": {
>   "matches": 567,
>   "groups": [
> {
>   "groupValue": 72,
>   "doclist": {
> "numFound": 562,
> "start": 0,
> "maxScore": 2.0378063,
> "docs": [
>   {
> "id": "29!26551",
> "score": 2.0378063
>   },
>   {
> "id": "78!11462",
> "score": 2.0298104
>   }
> ]
>   }
> },
> {
>   "groupValue": 45,
>   "doclist": {
> "numFound": 5,
> "start": 0,
> "docs": [
>   {
> "id": "72!8569",
> "score": 1.8988966
>   },
>   {
> "id": "72!14075",
> "score": 1.5191172
>   }
> ]
>   }
> }
>   ]
> }
>   }
> }
> {code}
> Looking into the issue, it comes from the fact that if a shard does not 
> contain a document from that group, trying to merge its {{maxScore}} with 
> real {{maxScore}} entries from other shards is invalid (it results in NaN).
> I'm attaching a patch containing a fix.
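
The NaN effect described above can be shown without any Lucene classes. The sketch below is not the attached patch and not the TopGroups.merge code: the class and method names are hypothetical, and it assumes the merge behaves like Java's Math.max, which returns NaN as soon as either argument is NaN. Returning NaN when no shard has a real score is a choice made for this sketch only.

```java
final class MaxScoreMergeSketch {
    /** Plain Math.max over all shards: a single NaN entry makes the result NaN. */
    static float naiveMerge(float[] shardMaxScores) {
        float merged = Float.NEGATIVE_INFINITY;
        for (float s : shardMaxScores) {
            merged = Math.max(merged, s);
        }
        return merged;
    }

    /** Skips NaN entries; yields NaN only when no shard reported a real score. */
    static float nanAwareMerge(float[] shardMaxScores) {
        float merged = Float.NaN;
        for (float s : shardMaxScores) {
            if (Float.isNaN(s)) {
                continue;                // shard had no document for this group
            }
            merged = Float.isNaN(merged) ? s : Math.max(merged, s);
        }
        return merged;
    }
}
```

With one shard reporting 1.8988966 for a group and another shard reporting NaN, the naive merge loses the score while the NaN-aware merge keeps it.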




[jira] [Commented] (LUCENE-8996) maxScore is sometimes missing from distributed grouped responses

2019-10-16 Thread Christine Poerschke (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953078#comment-16953078
 ] 

Christine Poerschke commented on LUCENE-8996:
-

{quote}... If you merge two groups with no real maxScores the final result will 
be MIN_VALUE (NaN would make more sense imo) ...
{quote}
Yes, MIN_VALUE seems a quirky result for this edge case. Though if one were to 
change the existing behaviour it might be clearest to do that separately from 
the 'maxScore missing' fix here: here we are removing an erroneous case of 
'maxScore missing' and changing away from MIN_VALUE would add a legitimate case 
of 'maxScore missing'.
{quote}... this *should* never happen in theory because if no segment contains 
documents about group x it shouldn't be possible that we retrieve documents 
about group x in first place. ...
{quote}
I agree, in theory it should never happen though in practice I think there's a 
timing window of opportunity that could make it happen, though it would seem 
quite unlikely. The first pass of the distributed search could determine that 
there are segments with documents about group X but subsequently it could then 
be 'just so' that by the time the second pass of the search runs a few moments 
later the document(s) in group X have all been deleted?

> maxScore is sometimes missing from distributed grouped responses
> 
>
> Key: LUCENE-8996
> URL: https://issues.apache.org/jira/browse/LUCENE-8996
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 5.3
>Reporter: Julien Massenet
>Priority: Minor
> Attachments: LUCENE-8996.patch, lucene_6_5-GroupingMaxScore.patch, 
> lucene_solr_5_3-GroupingMaxScore.patch, master-GroupingMaxScore.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>

[jira] [Commented] (LUCENE-9010) extend TopGroups.merge test coverage

2019-10-16 Thread Christine Poerschke (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953077#comment-16953077
 ] 

Christine Poerschke commented on LUCENE-9010:
-

The attached proposed patch tries to reduce the "amount of numbers" in the 
test: instead of integer group values 1 and 2 there are string group values 
"red" and "blue", and a narrative plus local variable names (redAntScore, 
blueDragonflyScore, blueDragonflySize, redSquirrelScore, blueWhaleScore) try to 
make it easier to work out what the {{expectedMaxScore}} value is.

> extend TopGroups.merge test coverage
> 
>
> Key: LUCENE-9010
> URL: https://issues.apache.org/jira/browse/LUCENE-9010
> Project: Lucene - Core
>  Issue Type: Sub-task
>Reporter: Christine Poerschke
>Priority: Minor
> Attachments: LUCENE-9010.patch
>
>
> This sub-task proposes to add test coverage for the {{TopGroups.merge}} 
> method, separately from but as preparation for LUCENE-8996 fixing the 
> 'maxScore is sometimes missing' bug.




[jira] [Updated] (LUCENE-9010) extend TopGroups.merge test coverage

2019-10-16 Thread Christine Poerschke (Jira)



 [ 
https://issues.apache.org/jira/browse/LUCENE-9010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christine Poerschke updated LUCENE-9010:

Attachment: LUCENE-9010.patch

> extend TopGroups.merge test coverage
> 
>
> Key: LUCENE-9010
> URL: https://issues.apache.org/jira/browse/LUCENE-9010
> Project: Lucene - Core
>  Issue Type: Sub-task
>Reporter: Christine Poerschke
>Priority: Minor
> Attachments: LUCENE-9010.patch
>
>
> This sub-task proposes to add test coverage for the {{TopGroups.merge}} 
> method, separately from but as preparation for LUCENE-8996 fixing the 
> 'maxScore is sometimes missing' bug.




[jira] [Created] (LUCENE-9010) extend TopGroups.merge test coverage

2019-10-16 Thread Christine Poerschke (Jira)

Christine Poerschke created LUCENE-9010:
---

 Summary: extend TopGroups.merge test coverage
 Key: LUCENE-9010
 URL: https://issues.apache.org/jira/browse/LUCENE-9010
 Project: Lucene - Core
  Issue Type: Sub-task
Reporter: Christine Poerschke


This sub-task proposes to add test coverage for the {{TopGroups.merge}} method, 
separately from but as preparation for LUCENE-8996 fixing the 'maxScore is 
sometimes missing' bug.




[jira] [Updated] (SOLR-13852) TestCloudNestedDocsSort can use the same uniqueKey for both a parent and child doc

2019-10-16 Thread Chris M. Hostetter (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-13852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris M. Hostetter updated SOLR-13852:
--
Attachment: thetaphi_Lucene-Solr-master-Linux_24903.log.txt
Status: Open  (was: Open)

attaching a jenkins log w/seed showing how this can cause failures due to the 
assertion logic introduced in SOLR-13851

> TestCloudNestedDocsSort can use the same uniqueKey for both a parent and 
> child doc
> --
>
> Key: SOLR-13852
> URL: https://issues.apache.org/jira/browse/SOLR-13852
> Project: Solr
>  Issue Type: Test
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Chris M. Hostetter
>Priority: Major
> Attachments: thetaphi_Lucene-Solr-master-Linux_24903.log.txt
>
>
> TestCloudNestedDocsSort uses randomly generated "id" values for all docs, 
> which not only means that two "parent" docs can be indexed with the same "id" 
> value, but also that a child doc might be indexed with the same "id" value as 
> a parent doc.
> While nothing in Solr actively prevents this, it's documented as something 
> people shouldn't do, and can cause problems.
> In particular, this has caused some assertion failures for some test seeds 
> due to how it interacts with SOLR-13851




[jira] [Created] (SOLR-13852) TestCloudNestedDocsSort can use the same uniqueKey for both a parent and child doc

2019-10-16 Thread Chris M. Hostetter (Jira)

Chris M. Hostetter created SOLR-13852:
-

 Summary: TestCloudNestedDocsSort can use the same uniqueKey for 
both a parent and child doc
 Key: SOLR-13852
 URL: https://issues.apache.org/jira/browse/SOLR-13852
 Project: Solr
  Issue Type: Test
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Chris M. Hostetter


TestCloudNestedDocsSort uses randomly generated "id" values for all docs, which 
not only means that two "parent" docs can be indexed with the same "id" value, 
but also that a child doc might be indexed with the same "id" value as a parent 
doc.

While nothing in Solr actively prevents this, it's documented as something 
people shouldn't do, and can cause problems.

In particular, this has caused some assertion failures for some test seeds due 
to how it interacts with SOLR-13851
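
One way a test could avoid such collisions is sketched below. This is an 
assumption about a possible approach, not the actual change made to 
TestCloudNestedDocsSort; the class UniqueIdSketch and its method nextId are 
hypothetical. Ids stay random, but a value that was already handed out to a 
parent or child doc is never returned twice.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

// Hypothetical test helper for illustration; not part of Solr.
final class UniqueIdSketch {
  private static final int ID_SPACE = 1000; // more ids than this cannot be handed out
  private final Random random;
  private final Set<String> used = new HashSet<>();

  UniqueIdSketch(long seed) {
    this.random = new Random(seed); // seeded so that a failing run can be reproduced
  }

  // Draws random ids until one is found that no earlier doc has used.
  String nextId() {
    if (used.size() >= ID_SPACE) {
      throw new IllegalStateException("id space exhausted");
    }
    String id;
    do {
      id = Integer.toString(random.nextInt(ID_SPACE));
    } while (!used.add(id)); // Set.add returns false for a duplicate
    return id;
  }
}
```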





[jira] [Commented] (SOLR-13851) SolrIndexSearcher.getFirstMatch trips assertion if multiple matches

2019-10-16 Thread Chris M. Hostetter (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-13851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953070#comment-16953070
 ] 

Chris M. Hostetter commented on SOLR-13851:
---

Background: I recently noticed jenkins test failures from 
TestCloudNestedDocsSort that stemmed from this assertion error...

{noformat}
   [junit4]   2> Server Error Caused by: java.lang.AssertionError
   [junit4]   2>   at org.apache.solr.search.SolrIndexSearcher.lookupId(SolrIndexSearcher.java:710)
   [junit4]   2>   at org.apache.solr.search.SolrIndexSearcher.getFirstMatch(SolrIndexSearcher.java:676)
   [junit4]   2>   at org.apache.solr.handler.component.QueryComponent.doProcessSearchByIds(QueryComponent.java:1266)
   [junit4]   2>   at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:351)
   [junit4]   2>   at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:305)
   [junit4]   2>   at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:198)
   [junit4]   2>   at org.apache.solr.core.SolrCore.execute(SolrCore.java:2559)
{noformat}

At the core of the problem is that TestCloudNestedDocsSort does some things it 
shouldn't in terms of child doc uniqueKeys (which i'll track in a linked jira) 
... but while using git bisect to identify when/where the failure was 
introduced, it identified GIT:1e63b32731bedf108aaeeb5d0a04d671f5663102 
(SOLR-12366) as the first bad commit, and that's when i realized that prior to 
SOLR-12366 this (bad test) worked fine because {{getFirstMatch}} just did what 
it says: returned the first match (w/o complaining if there were multiples)


> SolrIndexSearcher.getFirstMatch trips assertion if multiple matches
> ---
>
> Key: SOLR-13851
> URL: https://issues.apache.org/jira/browse/SOLR-13851
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Chris M. Hostetter
>Priority: Major
>
> the documentation for {{SolrIndexSearcher.getFirstMatch}} says...
> {quote}
> Returns the first document number containing the term t. Returns 
> -1 if no document was found. This method is primarily intended for clients 
> that want to fetch documents using a unique identifier.
> @return the first document number containing the term
> {quote}
> But SOLR-12366 refactored {{SolrIndexSearcher.getFirstMatch}} to eliminate 
> its previous implementation and replace it with a call to (a refactored 
> version of) {{SolrIndexSearcher.lookupId}} -- but the code in {{lookupId}} 
> was always designed *explicitly* for dealing with a uniqueKey field, and has 
> an assertion that once it finds a match _there will be no other matches in 
> the index_
> This means that even though {{getFirstMatch}} is _intended_ for fields that 
> are unique between documents, if it's used on a field that is not unique, it 
> can trip an assertion.
> At a minimum we need to either "fix" {{getFirstMatch}} to behave as 
> documented, or fix its documentation.
> Given that the current behavior has now been in place since Solr 7.4, and 
> given that all existing uses in "core" solr code are for looking up docs by 
> uniqueKey, it's probably best to simply fix the documentation, but we should 
> also consider replacing the assertion with an IllegalStateException, or 
> SolrException -- anything not dependent on having assertions enabled -- to 
> prevent silent bugs.




[jira] [Created] (SOLR-13851) SolrIndexSearcher.getFirstMatch trips assertion if multiple matches

2019-10-16 Thread Chris M. Hostetter (Jira)

Chris M. Hostetter created SOLR-13851:
-

 Summary: SolrIndexSearcher.getFirstMatch trips assertion if 
multiple matches
 Key: SOLR-13851
 URL: https://issues.apache.org/jira/browse/SOLR-13851
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Chris M. Hostetter


the documentation for {{SolrIndexSearcher.getFirstMatch}} says...

{quote}
Returns the first document number containing the term t. Returns -1 
if no document was found. This method is primarily intended for clients that 
want to fetch documents using a unique identifier.

@return the first document number containing the term
{quote}

But SOLR-12366 refactored {{SolrIndexSearcher.getFirstMatch}} to eliminate its 
previous implementation and replace it with a call to (a refactored version of) 
{{SolrIndexSearcher.lookupId}} -- but the code in {{lookupId}} was always 
designed *explicitly* for dealing with a uniqueKey field, and has an assertion 
that once it finds a match _there will be no other matches in the index_

This means that even though {{getFirstMatch}} is _intended_ for fields that are 
unique between documents, if it's used on a field that is not unique, it can 
trip an assertion.

At a minimum we need to either "fix" {{getFirstMatch}} to behave as documented, 
or fix its documentation.

Given that the current behavior has now been in place since Solr 7.4, and given 
that all existing uses in "core" solr code are for looking up docs by 
uniqueKey, it's probably best to simply fix the documentation, but we should 
also consider replacing the assertion with an IllegalStateException, or 
SolrException -- anything not dependent on having assertions enabled -- to 
prevent silent bugs.
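
The last suggestion above can be sketched roughly as follows. This is a toy 
stand-in and not Solr's {{lookupId}} code: the class UniqueLookupSketch and the 
method lookupUnique are hypothetical and work on a plain list instead of an 
index. The point is only that an explicit check fails the same way whether or 
not the JVM runs with assertions enabled, whereas an {{assert}} statement is 
skipped without -ea.

```java
import java.util.List;

// Hypothetical stand-in for a uniqueKey lookup; not Solr's implementation.
final class UniqueLookupSketch {

  // Returns the index of the single entry equal to key, or -1 if there is none.
  // Throws IllegalStateException on a second match, independent of the -ea JVM flag.
  static int lookupUnique(List<String> keys, String key) {
    int found = -1;
    for (int i = 0; i < keys.size(); i++) {
      if (keys.get(i).equals(key)) {
        if (found != -1) {
          throw new IllegalStateException(
              "key '" + key + "' is not unique: matches entries " + found + " and " + i);
        }
        found = i;
      }
    }
    return found;
  }
}
```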






[jira] [Updated] (SOLR-12786) Upgrade refGuide build to Asciidoctor 20.10 and new link structure

2019-10-16 Thread Cassandra Targett (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-12786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cassandra Targett updated SOLR-12786:
-
Summary: Upgrade refGuide build to Asciidoctor 20.10 and new link structure 
 (was: Upgrade refGuide build to Asciidoctor 1.5.7 and new link structure)

> Upgrade refGuide build to Asciidoctor 20.10 and new link structure
> --
>
> Key: SOLR-12786
> URL: https://issues.apache.org/jira/browse/SOLR-12786
> Project: Solr
>  Issue Type: Improvement
>  Components: Build, documentation
>Affects Versions: 8.0
>Reporter: Jan Høydahl
>Assignee: Cassandra Targett
>Priority: Major
> Attachments: SOLR-12786.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently the refguide build requires asciidoctor 1.5.6.2.
> People using {{gem install jekyll-asciidoc}} will end up with version 1.5.7, 
> causing different header ID syntax and the build will break.
> Long term we should move to latest asciidoctor.
> It is already documented in README how to install the older 1.5.6.2 version.
>  




[jira] [Updated] (SOLR-12786) Upgrade refGuide build to Asciidoctor 2.0.10 and new link structure

2019-10-16 Thread Cassandra Targett (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-12786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cassandra Targett updated SOLR-12786:
-
Summary: Upgrade refGuide build to Asciidoctor 2.0.10 and new link 
structure  (was: Upgrade refGuide build to Asciidoctor 20.10 and new link 
structure)

> Upgrade refGuide build to Asciidoctor 2.0.10 and new link structure
> ---
>
> Key: SOLR-12786
> URL: https://issues.apache.org/jira/browse/SOLR-12786
> Project: Solr
>  Issue Type: Improvement
>  Components: Build, documentation
>Affects Versions: 8.0
>Reporter: Jan Høydahl
>Assignee: Cassandra Targett
>Priority: Major
> Attachments: SOLR-12786.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently the refguide build requires asciidoctor 1.5.6.2.
> People using {{gem install jekyll-asciidoc}} will end up with version 1.5.7, 
> causing different header ID syntax and the build will break.
> Long term we should move to latest asciidoctor.
> It is already documented in README how to install the older 1.5.6.2 version.
>  




[jira] [Commented] (SOLR-12786) Upgrade refGuide build to Asciidoctor 1.5.7 and new link structure

2019-10-16 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-12786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953049#comment-16953049
 ] 

ASF subversion and git services commented on SOLR-12786:


Commit 802e97d6aa9806f495febc18790425cdcf12bece in lucene-solr's branch 
refs/heads/branch_8x from Cassandra Targett
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=802e97d ]

SOLR-12786: Update Ref Guide build tool versions & fix section links for new 
format requirements


> Upgrade refGuide build to Asciidoctor 1.5.7 and new link structure
> --
>
> Key: SOLR-12786
> URL: https://issues.apache.org/jira/browse/SOLR-12786
> Project: Solr
>  Issue Type: Improvement
>  Components: Build, documentation
>Affects Versions: 8.0
>Reporter: Jan Høydahl
>Assignee: Cassandra Targett
>Priority: Major
> Attachments: SOLR-12786.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently the refguide build requires asciidoctor 1.5.6.2.
> People using {{gem install jekyll-asciidoc}} will end up with version 1.5.7, 
> causing different header ID syntax and the build will break.
> Long term we should move to latest asciidoctor.
> It is already documented in README how to install the older 1.5.6.2 version.
>  




[jira] [Commented] (SOLR-12786) Upgrade refGuide build to Asciidoctor 1.5.7 and new link structure

2019-10-16 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-12786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953050#comment-16953050
 ] 

ASF subversion and git services commented on SOLR-12786:


Commit dc47aa5b16f5cc75678070c5b1b5b7459b3690a4 in lucene-solr's branch 
refs/heads/branch_8x from Cassandra Targett
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=dc47aa5 ]

SOLR-12786: add back explicit asciidoctor install for Jenkins build


> Upgrade refGuide build to Asciidoctor 1.5.7 and new link structure
> --
>
> Key: SOLR-12786
> URL: https://issues.apache.org/jira/browse/SOLR-12786
> Project: Solr
>  Issue Type: Improvement
>  Components: Build, documentation
>Affects Versions: 8.0
>Reporter: Jan Høydahl
>Assignee: Cassandra Targett
>Priority: Major
> Attachments: SOLR-12786.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently the refguide build requires asciidoctor 1.5.6.2.
> People using {{gem install jekyll-asciidoc}} will end up with version 1.5.7, 
> causing different header ID syntax and the build will break.
> Long term we should move to latest asciidoctor.
> It is already documented in README how to install the older 1.5.6.2 version.
>  




[jira] [Commented] (SOLR-12786) Upgrade refGuide build to Asciidoctor 1.5.7 and new link structure

2019-10-16 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-12786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953026#comment-16953026
 ] 

ASF subversion and git services commented on SOLR-12786:


Commit b3d59a7a8b5ed28ba985e54bcb7edd5c3b352302 in lucene-solr's branch 
refs/heads/master from Cassandra Targett
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=b3d59a7 ]

SOLR-12786: add back explicit asciidoctor install for Jenkins build


> Upgrade refGuide build to Asciidoctor 1.5.7 and new link structure
> --
>
> Key: SOLR-12786
> URL: https://issues.apache.org/jira/browse/SOLR-12786
> Project: Solr
>  Issue Type: Improvement
>  Components: Build, documentation
>Affects Versions: 8.0
>Reporter: Jan Høydahl
>Assignee: Cassandra Targett
>Priority: Major
> Attachments: SOLR-12786.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently the refguide build requires asciidoctor 1.5.6.2.
> People using {{gem install jekyll-asciidoc}} will end up with version 1.5.7, 
> causing different header ID syntax and the build will break.
> Long term we should move to latest asciidoctor.
> It is already documented in README how to install the older 1.5.6.2 version.
>  




[jira] [Resolved] (LUCENE-9005) BooleanQuery.visit() incorrectly pulls subvisitors from its parent

2019-10-16 Thread Alan Woodward (Jira)



 [ 
https://issues.apache.org/jira/browse/LUCENE-9005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward resolved LUCENE-9005.
---
Fix Version/s: 8.3
   master (9.0)
   Resolution: Fixed

> BooleanQuery.visit() incorrectly pulls subvisitors from its parent
> --
>
> Key: LUCENE-9005
> URL: https://issues.apache.org/jira/browse/LUCENE-9005
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Alan Woodward
>Assignee: Alan Woodward
>Priority: Major
> Fix For: master (9.0), 8.3
>
> Attachments: LUCENE-9005.patch
>
>
> BooleanQuery.visit() calls getSubVisitor once for each of its clause sets; 
> however, this sub visitor is called on the passed-in visitor, which means 
> that sub clauses get attached to its parent, rather than a visitor for that 
> particular BQ.
> To illustrate, consider the following nested BooleanQuery: ("a b" (+c +d %e 
> f)); we have a top-level disjunction query containing one phrase query 
> (essentially a conjunction), and one boolean query containing MUST, 
> FILTER and SHOULD clauses.  When visiting, the top level query will pull a 
> SHOULD subvisitor, and pass both queries into it.  The phrase query will pull 
> a MUST subvisitor and pass both of its terms to it.  The nested boolean will 
> pull a MUST, a FILTER and a SHOULD; but these are all attached to the parent 
> SHOULD visitor - in particular, the MUST and FILTER clauses will end up being 
> attached to this SHOULD visitor, and be mis-interpreted as a disjunction.
> To fix this, BQ should first pull a MUST visitor and visit its MUST clauses 
> using this visitor; SHOULD, FILTER and MUST_NOT clauses should then be pulled 
> from this top-level MUST visitor. 
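
The difference between the two traversal shapes described above can be shown 
with a toy model. The classes below are hypothetical and are NOT Lucene's 
QueryVisitor or BooleanQuery; they only record which visitor each clause set 
of the nested query (+c +d %e f) ends up attached to when the parent is a 
SHOULD visitor.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of a visitor tree; these are NOT Lucene's classes.
final class SubVisitorSketch {
  enum Occur { MUST, FILTER, SHOULD, MUST_NOT }

  // Records which Occur a visitor was pulled for and what was attached to it.
  static final class Visitor {
    final Occur occur;
    final List<String> terms = new ArrayList<>();
    final List<Visitor> children = new ArrayList<>();

    Visitor(Occur occur) {
      this.occur = occur;
    }

    Visitor getSubVisitor(Occur childOccur) {
      Visitor child = new Visitor(childOccur);
      children.add(child);
      return child;
    }
  }

  // Buggy shape: every clause set of (+c +d %e f) is pulled from the passed-in parent,
  // so MUST, FILTER and SHOULD all become direct children of the parent SHOULD visitor.
  static void visitBuggy(Visitor parent) {
    parent.getSubVisitor(Occur.MUST).terms.addAll(Arrays.asList("c", "d"));
    parent.getSubVisitor(Occur.FILTER).terms.add("e");
    parent.getSubVisitor(Occur.SHOULD).terms.add("f");
  }

  // Fixed shape: pull one MUST visitor for the query itself, visit the MUST clauses
  // with it, and pull the remaining sub visitors from that MUST visitor.
  static void visitFixed(Visitor parent) {
    Visitor self = parent.getSubVisitor(Occur.MUST);
    self.terms.addAll(Arrays.asList("c", "d"));
    self.getSubVisitor(Occur.FILTER).terms.add("e");
    self.getSubVisitor(Occur.SHOULD).terms.add("f");
  }
}
```

In the fixed shape the parent SHOULD visitor sees a single MUST child for the 
whole nested query, so its MUST and FILTER clauses are no longer read as 
alternatives of the outer disjunction.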




[jira] [Commented] (LUCENE-9005) BooleanQuery.visit() incorrectly pulls subvisitors from its parent

2019-10-16 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952981#comment-16952981
 ] 

ASF subversion and git services commented on LUCENE-9005:
-

Commit f7711d712472528b567ab975d0ed677bbd30ac12 in lucene-solr's branch 
refs/heads/master from Alan Woodward
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=f7711d7 ]

LUCENE-9005: BooleanQuery.visit() pulls subvisitors from a top-level MUST 
visitor


> BooleanQuery.visit() incorrectly pulls subvisitors from its parent
> --
>
> Key: LUCENE-9005
> URL: https://issues.apache.org/jira/browse/LUCENE-9005
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Alan Woodward
>Assignee: Alan Woodward
>Priority: Major
> Attachments: LUCENE-9005.patch
>
>
> BooleanQuery.visit() calls getSubVisitor once for each of its clause sets; 
> however, this sub visitor is called on the passed-in visitor, which means 
> that sub clauses get attached to its parent, rather than a visitor for that 
> particular BQ.
> To illustrate, consider the following nested BooleanQuery: ("a b" (+c +d %e 
> f)); we have a top-level disjunction query containing one phrase query 
> (essentially a conjunction), and one boolean query containing MUST, 
> FILTER and SHOULD clauses.  When visiting, the top level query will pull a 
> SHOULD subvisitor, and pass both queries into it.  The phrase query will pull 
> a MUST subvisitor and pass both of its terms to it.  The nested boolean will 
> pull a MUST, a FILTER and a SHOULD; but these are all attached to the parent 
> SHOULD visitor - in particular, the MUST and FILTER clauses will end up being 
> attached to this SHOULD visitor, and be mis-interpreted as a disjunction.
> To fix this, BQ should first pull a MUST visitor and visit its MUST clauses 
> using this visitor; SHOULD, FILTER and MUST_NOT clauses should then be pulled 
> from this top-level MUST visitor. 




[jira] [Commented] (LUCENE-9005) BooleanQuery.visit() incorrectly pulls subvisitors from its parent

2019-10-16 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952980#comment-16952980
 ] 

ASF subversion and git services commented on LUCENE-9005:
-

Commit 574e1e2d52d420dae41bee6b5d0e68799de8a1bd in lucene-solr's branch 
refs/heads/branch_8x from Alan Woodward
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=574e1e2 ]

LUCENE-9005: BooleanQuery.visit() pulls subvisitors from a top-level MUST 
visitor


> BooleanQuery.visit() incorrectly pulls subvisitors from its parent
> --
>
> Key: LUCENE-9005
> URL: https://issues.apache.org/jira/browse/LUCENE-9005
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Alan Woodward
>Assignee: Alan Woodward
>Priority: Major
> Attachments: LUCENE-9005.patch
>
>
> BooleanQuery.visit() calls getSubVisitor once for each of its clause sets; 
> however, this sub visitor is called on the passed-in visitor, which means 
> that sub clauses get attached to its parent, rather than a visitor for that 
> particular BQ.
> To illustrate, consider the following nested BooleanQuery: ("a b" (+c +d %e 
> f)); we have a top-level disjunction query containing one phrase query 
> (essentially a conjunction), and one boolean query containing MUST, 
> FILTER and SHOULD clauses.  When visiting, the top level query will pull a 
> SHOULD subvisitor, and pass both queries into it.  The phrase query will pull 
> a MUST subvisitor and pass both of its terms to it.  The nested boolean will 
> pull a MUST, a FILTER and a SHOULD; but these are all attached to the parent 
> SHOULD visitor - in particular, the MUST and FILTER clauses will end up being 
> attached to this SHOULD visitor, and be mis-interpreted as a disjunction.
> To fix this, BQ should first pull a MUST visitor and visit its MUST clauses 
> using this visitor; SHOULD, FILTER and MUST_NOT clauses should then be pulled 
> from this top-level MUST visitor. 




[jira] [Commented] (LUCENE-9005) BooleanQuery.visit() incorrectly pulls subvisitors from its parent

2019-10-16 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952979#comment-16952979
 ] 

ASF subversion and git services commented on LUCENE-9005:
-

Commit c19845775520108dce35feabfc081f606b34584f in lucene-solr's branch 
refs/heads/branch_8_3 from Alan Woodward
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=c198457 ]

LUCENE-9005: BooleanQuery.visit() pulls subvisitors from a top-level MUST 
visitor


> BooleanQuery.visit() incorrectly pulls subvisitors from its parent
> --
>
> Key: LUCENE-9005
> URL: https://issues.apache.org/jira/browse/LUCENE-9005
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Alan Woodward
>Assignee: Alan Woodward
>Priority: Major
> Attachments: LUCENE-9005.patch
>
>
> BooleanQuery.visit() calls getSubVisitor once for each of its clause sets; 
> however, this sub visitor is called on the passed-in visitor, which means 
> that sub clauses get attached to its parent, rather than a visitor for that 
> particular BQ.
> To illustrate, consider the following nested BooleanQuery: ("a b" (+c +d %e 
> f)); we have a top-level disjunction query containing one phrase query 
> (essentially a conjunction), and one boolean query containing MUST, 
> FILTER and SHOULD clauses.  When visiting, the top level query will pull a 
> SHOULD subvisitor, and pass both queries into it.  The phrase query will pull 
> a MUST subvisitor and pass both of its terms to it.  The nested boolean will 
> pull a MUST, a FILTER and a SHOULD; but these are all attached to the parent 
> SHOULD visitor - in particular, the MUST and FILTER clauses will end up being 
> attached to this SHOULD visitor, and be mis-interpreted as a disjunction.
> To fix this, BQ should first pull a MUST visitor and visit its MUST clauses 
> using this visitor; SHOULD, FILTER and MUST_NOT clauses should then be pulled 
> from this top-level MUST visitor. 




[jira] [Commented] (SOLR-12786) Upgrade refGuide build to Asciidoctor 1.5.7 and new link structure

2019-10-16 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-12786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952974#comment-16952974
 ] 

ASF subversion and git services commented on SOLR-12786:


Commit 621461fd1a51278c901399668c7d33a7474f4994 in lucene-solr's branch 
refs/heads/master from Cassandra Targett
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=621461f ]

SOLR-12786: Update Ref Guide build tool versions & fix section links for new 
format requirements


> Upgrade refGuide build to Asciidoctor 1.5.7 and new link structure
> --
>
> Key: SOLR-12786
> URL: https://issues.apache.org/jira/browse/SOLR-12786
> Project: Solr
>  Issue Type: Improvement
>  Components: Build, documentation
>Affects Versions: 8.0
>Reporter: Jan Høydahl
>Assignee: Cassandra Targett
>Priority: Major
> Attachments: SOLR-12786.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently the refguide build requires asciidoctor 1.5.6.2.
> People using {{gem install jekyll-asciidoc}} will end up with version 1.5.7, 
> causing different header ID syntax and the build will break.
> Long term we should move to latest asciidoctor.
> It is already documented in README how to install the older 1.5.6.2 version.
>  




[jira] [Commented] (SOLR-13808) Query DSL should let to cache filter

2019-10-16 Thread Mikhail Khludnev (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-13808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952969#comment-16952969
 ] 

Mikhail Khludnev commented on SOLR-13808:
-

Ok. It seems like the plan is to
 # create a \{!cache} query parser that can be hooked up via the existing DSL. 
The caveat for users is losing scoring. 
 # enable the cache by default for \{!bool filter=... filter=..}
 # make sure that it respects the \{!cache=false} local param on enclosing 
queries

I'm fine with it and patches are welcome. 

> Query DSL should let to cache filter
> 
>
> Key: SOLR-13808
> URL: https://issues.apache.org/jira/browse/SOLR-13808
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Mikhail Khludnev
>Priority: Major
>
> Query DSL lets one express a Lucene BQ's filter:
>  
> {code:java}
> { query: {bool: { filter: {term: {f:name,query:"foo bar"}}} }}{code}
> However, one might easily need to cache it in the filter cache. This 
> might rely on ExtensibleQuery and QParser: 
> {code:java}
> { query: {bool: { filter: {term: {f:name,query:"foo bar", cache:true}}} }}
> {code}
>  




[jira] [Commented] (SOLR-13677) All Metrics Gauges should be unregistered by the objects that registered them

2019-10-16 Thread Andrzej Bialecki (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-13677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952968#comment-16952968
 ] 

Andrzej Bialecki commented on SOLR-13677:
-

[~noble.paul] I would appreciate your review.

> All Metrics Gauges should be unregistered by the objects that registered them
> -
>
> Key: SOLR-13677
> URL: https://issues.apache.org/jira/browse/SOLR-13677
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: metrics
>Reporter: Noble Paul
>Assignee: Andrzej Bialecki
>Priority: Blocker
> Fix For: 8.3
>
> Attachments: SOLR-13677.patch
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> The life cycle of Metrics producers are managed by the core (mostly). So, if 
> the lifecycle of the object is different from that of the core itself, these 
> objects will never be unregistered from the metrics registry. This will lead 
> to memory leaks




[GitHub] [lucene-solr] iverase commented on issue #865: LUCENE-8973: XYRectangle2D should work on float space

2019-10-16 Thread GitBox

iverase commented on issue #865: LUCENE-8973: XYRectangle2D should work on 
float space
URL: https://github.com/apache/lucene-solr/pull/865#issuecomment-542773405
 
 
   I have updated the PR so the XYRectangle2D is now a Component2D


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (SOLR-13845) DELETEREPLICA API by "count" and "type"

2019-10-16 Thread Amrit Sarkar (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-13845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952956#comment-16952956
 ] 

Amrit Sarkar commented on SOLR-13845:
-

Uploaded clean PATCH for the improvement. Requesting feedback.

> DELETEREPLICA API by "count" and "type"
> ---
>
> Key: SOLR-13845
> URL: https://issues.apache.org/jira/browse/SOLR-13845
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Amrit Sarkar
>Priority: Major
> Attachments: SOLR-13845.patch
>
>
> SOLR-9319 added support for deleting replicas by count. It would be great to 
> extend that feature so we can also specify the type of replica to delete, 
> just as we add replicas by count and type.




[jira] [Updated] (SOLR-13845) DELETEREPLICA API by "count" and "type"

2019-10-16 Thread Amrit Sarkar (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-13845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amrit Sarkar updated SOLR-13845:

Attachment: (was: STAR-13845.patch)

> DELETEREPLICA API by "count" and "type"
> ---
>
> Key: SOLR-13845
> URL: https://issues.apache.org/jira/browse/SOLR-13845
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Amrit Sarkar
>Priority: Major
> Attachments: SOLR-13845.patch
>
>
> SOLR-9319 added support for deleting replicas by count. It would be great to 
> extend that feature so we can also specify the type of replica to delete, 
> just as we add replicas by count and type.




[jira] [Updated] (SOLR-13845) DELETEREPLICA API by "count" and "type"

2019-10-16 Thread Amrit Sarkar (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-13845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amrit Sarkar updated SOLR-13845:

Attachment: SOLR-13845.patch

> DELETEREPLICA API by "count" and "type"
> ---
>
> Key: SOLR-13845
> URL: https://issues.apache.org/jira/browse/SOLR-13845
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Amrit Sarkar
>Priority: Major
> Attachments: SOLR-13845.patch, STAR-13845.patch
>
>
> SOLR-9319 added support for deleting replicas by count. It would be great to 
> extend that feature so we can also specify the type of replica to delete, 
> just as we add replicas by count and type.




[GitHub] [lucene-solr] cbuescher commented on a change in pull request #913: LUCENE-8995: TopSuggestDocsCollector#collect should be able to signal rejection

2019-10-16 Thread GitBox

cbuescher commented on a change in pull request #913: LUCENE-8995: 
TopSuggestDocsCollector#collect should be able to signal rejection
URL: https://github.com/apache/lucene-solr/pull/913#discussion_r335541921
 
 

 ##
 File path: 
lucene/suggest/src/test/org/apache/lucene/search/suggest/document/TestPrefixCompletionQuery.java
 ##
 @@ -253,6 +263,123 @@ public void testDocFiltering() throws Exception {
 iw.close();
   }
 
+  /**
+   * Test that the correct amount of documents are collected if using a collector that also rejects documents.
+   */
+  public void testCollectorThatRejects() throws Exception {
+// use synonym analyzer to have multiple paths to same suggested document. This mock adds "dog" as synonym for "dogs"
+Analyzer analyzer = new MockSynonymAnalyzer();
+RandomIndexWriter iw = new RandomIndexWriter(random(), dir, iwcWithSuggestField(analyzer, "suggest_field"));
+List<Entry> expectedResults = new ArrayList<Entry>();
+
+for (int docCount = 10; docCount > 0; docCount--) {
+  Document document = new Document();
+  String value = "ab" + docCount + " dogs";
+  document.add(new SuggestField("suggest_field", value, docCount));
+  expectedResults.add(new Entry(value, docCount));
+  iw.addDocument(document);
+}
+
+if (rarely()) {
+  iw.commit();
+}
+
+DirectoryReader reader = iw.getReader();
+SuggestIndexSearcher indexSearcher = new SuggestIndexSearcher(reader);
+
+PrefixCompletionQuery query = new PrefixCompletionQuery(analyzer, new Term("suggest_field", "ab"));
+int topN = 5;
+
+// use a TopSuggestDocsCollector that rejects results with duplicate docIds
+TopSuggestDocsCollector collector = new TopSuggestDocsCollector(topN, false) {
+
+  private Set<Integer> seenDocIds = new HashSet<>();
+
+  @Override
+  public boolean collect(int docID, CharSequence key, CharSequence context, float score) throws IOException {
+  int globalDocId = docID + docBase;
+  boolean collected = false;
+  if (seenDocIds.contains(globalDocId) == false) {
+  super.collect(docID, key, context, score);
+  seenDocIds.add(globalDocId);
+  collected = true;
+  }
+  return collected;
+  }
+
+  @Override
+  protected boolean canReject() {
+return true;
+  }
+};
+
+indexSearcher.suggest(query, collector);
+TopSuggestDocs suggestions = collector.get();
+assertSuggestions(suggestions, expectedResults.subList(0, topN).toArray(new Entry[0]));
+assertTrue(suggestions.isComplete());
 
 Review comment:
   I extended the existing test to the case where we try getting the top 10. In 
this case the queue would have a max depth of 15, and the reject count is 9.



[GitHub] [lucene-solr] johtani commented on issue #935: LUCENE-4056: Japanese Tokenizer (Kuromoji) cannot build UniDic dictionary

2019-10-16 Thread GitBox

johtani commented on issue #935: LUCENE-4056: Japanese Tokenizer (Kuromoji) 
cannot build UniDic dictionary
URL: https://github.com/apache/lucene-solr/pull/935#issuecomment-542749849
 
 
   Here is the message for `ant clean; ant build-dict` with ipadic.
   https://gist.github.com/johtani/b53e9e241e5b98519fb3ffe12b4164eb
   
   And also the message with unidic and `build.xml` 
   https://gist.github.com/johtani/91cfd2753aba2e001c1d39f47666ada7
   



[jira] [Updated] (SOLR-13847) Fix ref guide for autoscaling metric trigger

2019-10-16 Thread ROCHETEAU Antoine (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-13847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ROCHETEAU Antoine updated SOLR-13847:
-
Component/s: AutoScaling

> Fix ref guide for autoscaling metric trigger
> 
>
> Key: SOLR-13847
> URL: https://issues.apache.org/jira/browse/SOLR-13847
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: AutoScaling, documentation
>Affects Versions: 7.7.2, 8.2
>Reporter: ROCHETEAU Antoine
>Priority: Minor
> Attachments: metric_trigger_documentation.patch
>
>
> Reported in the IRC channel, where I was asked to raise an issue.
> The documentation for the autoscaling metric trigger has an error in its 
> description (it's not possible to set up a basic metric trigger with the 
> current documentation).
> [https://lucene.apache.org/solr/guide/8_1/solrcloud-autoscaling-triggers.html#metric-trigger]
> metric:_group_:_prefix_ should be replaced by metrics:_group_:_prefix_ 
> (note the added "s").
> This correction is also required in the example:
> {{metrics:solr.node:CONTAINER.fs.coreRoot.usableSpace}}
> This is confirmed by the source code, which explicitly uses "metrics:" (see, 
> for example, org.apache.solr.cloud.autoscaling.sim.SimNodeStateProvider or 
> org.apache.solr.cloud.autoscaling.MetricTriggerIntegrationTest).




[GitHub] [lucene-solr] cbuescher commented on a change in pull request #913: LUCENE-8995: TopSuggestDocsCollector#collect should be able to signal rejection

2019-10-16 Thread GitBox

cbuescher commented on a change in pull request #913: LUCENE-8995: 
TopSuggestDocsCollector#collect should be able to signal rejection
URL: https://github.com/apache/lucene-solr/pull/913#discussion_r335502868
 
 

 ##
 File path: 
lucene/suggest/src/test/org/apache/lucene/search/suggest/document/TestPrefixCompletionQuery.java
 ##
 @@ -253,6 +263,123 @@ public void testDocFiltering() throws Exception {
 iw.close();
   }
 
+  /**
+   * Test that the correct amount of documents are collected if using a collector that also rejects documents.
+   */
+  public void testCollectorThatRejects() throws Exception {
+// use synonym analyzer to have multiple paths to same suggested document. This mock adds "dog" as synonym for "dogs"
+Analyzer analyzer = new MockSynonymAnalyzer();
+RandomIndexWriter iw = new RandomIndexWriter(random(), dir, iwcWithSuggestField(analyzer, "suggest_field"));
+List<Entry> expectedResults = new ArrayList<Entry>();
+
+for (int docCount = 10; docCount > 0; docCount--) {
+  Document document = new Document();
+  String value = "ab" + docCount + " dogs";
+  document.add(new SuggestField("suggest_field", value, docCount));
+  expectedResults.add(new Entry(value, docCount));
+  iw.addDocument(document);
+}
+
+if (rarely()) {
+  iw.commit();
+}
+
+DirectoryReader reader = iw.getReader();
+SuggestIndexSearcher indexSearcher = new SuggestIndexSearcher(reader);
+
+PrefixCompletionQuery query = new PrefixCompletionQuery(analyzer, new Term("suggest_field", "ab"));
+int topN = 5;
+
+// use a TopSuggestDocsCollector that rejects results with duplicate docIds
+TopSuggestDocsCollector collector = new TopSuggestDocsCollector(topN, false) {
+
+  private Set<Integer> seenDocIds = new HashSet<>();
+
+  @Override
+  public boolean collect(int docID, CharSequence key, CharSequence context, float score) throws IOException {
+  int globalDocId = docID + docBase;
+  boolean collected = false;
+  if (seenDocIds.contains(globalDocId) == false) {
+  super.collect(docID, key, context, score);
+  seenDocIds.add(globalDocId);
+  collected = true;
+  }
+  return collected;
+  }
+
+  @Override
+  protected boolean canReject() {
+return true;
+  }
+};
+
+indexSearcher.suggest(query, collector);
+TopSuggestDocs suggestions = collector.get();
+assertSuggestions(suggestions, expectedResults.subList(0, topN).toArray(new Entry[0]));
+assertTrue(suggestions.isComplete());
 
 Review comment:
   This will happen if the estimated queue size in `NRTSuggester#lookup` is not 
large enough to account for all rejected documents, e.g. when in this particular 
test we try to get the top 5 of only 5 documents. In that case the queue size 
heuristic in `NRTSuggester#getMaxTopNSearcherQueueSize` will only size the queue 
to 7 (topN + numDocs/2), which is less than the number of topN + rejections, so 
the TopResults returned will have the `isComplete` flag set.
   I can add that case to the existing test if this helps.
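
   The arithmetic in this comment can be sketched as follows. This is only a
minimal illustration, assuming the heuristic reduces to `topN + numDocs / 2` as
described above. `QueueSizeSketch`, `estimatedQueueSize` and
`queueCoversRejections` are hypothetical names, not part of `NRTSuggester`, and
the real method may take further inputs into account.

```java
public class QueueSizeSketch {

    // Hypothetical simplification of the heuristic described in the comment:
    // topN plus half the number of documents (integer division).
    static int estimatedQueueSize(int topN, int numDocs) {
        return topN + numDocs / 2;
    }

    // True when the queue can hold the topN accepted hits plus every rejected path.
    // The number of rejections is an input here; it depends on the collector.
    static boolean queueCoversRejections(int topN, int numDocs, int rejections) {
        return estimatedQueueSize(topN, numDocs) >= topN + rejections;
    }

    public static void main(String[] args) {
        // Top 5 of only 5 documents, the case discussed in the comment.
        System.out.println(estimatedQueueSize(5, 5));
        // Hypothetical rejection count of 5: the queue would be too small.
        System.out.println(queueCoversRejections(5, 5, 5));
    }
}
```

   With topN = 5 and 5 documents the sketch gives a queue size of 7, the figure
quoted above. Whether that is enough depends on how many paths the collector
rejects.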



[jira] [Commented] (LUCENE-9006) Ensure WordDelimiterGraphFilter always emits catenateAll token early

2019-10-16 Thread David Smiley (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952867#comment-16952867
 ] 

David Smiley commented on LUCENE-9006:
--

BTW this issue also fixes a bug in the offsets.  The previous behavior resulted 
in the token "8other" having start offset of 2 because it followed the token 
"other" which is and should be 2.  Now that "8other" is earlier, it can have 
the start offset it should -- 0.

I was thinking about the core of the change here to the sort to consider the 
offset based length.  I think it's simpler/faster and perhaps more correct to 
just use the start offset.  This change passes the tests, so I'm inclined to 
push that.
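
To illustrate the idea of ordering by start offset alone, here is a minimal 
sketch. It assumes a hypothetical `Tok` holder and a plain stable list sort; it 
is not WDGF's internal buffer or sorter. The offsets come from the "8-other" 
example: "8other" starts at 0 and "other" starts at 2.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class StartOffsetSortSketch {

    // Hypothetical stand-in for a buffered token; not a Lucene class.
    static final class Tok {
        final String term;
        final int startOffset;
        final int endOffset;

        Tok(String term, int startOffset, int endOffset) {
            this.term = term;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
        }
    }

    // Orders tokens by start offset only. List.sort is stable, so tokens that
    // share a start offset keep their original relative order.
    static List<String> sortedTerms(List<Tok> tokens) {
        List<Tok> copy = new ArrayList<>(tokens);
        copy.sort(Comparator.comparingInt((Tok t) -> t.startOffset));
        List<String> terms = new ArrayList<>();
        for (Tok t : copy) {
            terms.add(t.term);
        }
        return terms;
    }

    public static void main(String[] args) {
        // "other" is buffered first, as in the behavior described in the issue.
        List<Tok> buffered = Arrays.asList(
            new Tok("other", 2, 7),
            new Tok("8other", 0, 7));
        System.out.println(sortedTerms(buffered));
    }
}
```

In this sketch the catenated token sorts ahead of "other" simply because its 
start offset is smaller, with no need to compare lengths.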

> Ensure WordDelimiterGraphFilter always emits catenateAll token early
> 
>
> Key: LUCENE-9006
> URL: https://issues.apache.org/jira/browse/LUCENE-9006
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/analysis
>Reporter: David Smiley
>Assignee: David Smiley
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Ideally, the first token of WDGF is the preserveOriginal (if configured to 
> emit), and the second should be the catenateAll (if configured to emit).  The 
> deprecated WDF does this but WDGF can sometimes put the first other token 
> earlier when there is a non-emitted candidate sub-token.
> Example input "8-other" when only generateWordParts and catenateAll -- *not* 
> generateNumberParts.  WDGF internally sees the '8' but moves on.  Ultimately, 
> the "other" token and the catenated "8other" will appear at the same internal 
> position, which by luck fools the sorter to emit "other" first.




[jira] [Commented] (SOLR-13105) A visual guide to Solr Math Expressions and Streaming Expressions

2019-10-16 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-13105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952862#comment-16952862
 ] 

ASF subversion and git services commented on SOLR-13105:


Commit 3a695853755fae8eaef06c8c37689308d93157f2 in lucene-solr's branch 
refs/heads/visual-guide from Joel Bernstein
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=3a69585 ]

SOLR-13105: The Visual Guide to Streaming Expressions and Math Expressions


> A visual guide to Solr Math Expressions and Streaming Expressions
> -
>
> Key: SOLR-13105
> URL: https://issues.apache.org/jira/browse/SOLR-13105
> Project: Solr
>  Issue Type: New Feature
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
>Priority: Major
> Attachments: Screen Shot 2019-01-14 at 10.56.32 AM.png, Screen Shot 
> 2019-02-21 at 2.14.43 PM.png, Screen Shot 2019-03-03 at 2.28.35 PM.png, 
> Screen Shot 2019-03-04 at 7.47.57 PM.png, Screen Shot 2019-03-13 at 10.47.47 
> AM.png, Screen Shot 2019-03-30 at 6.17.04 PM.png
>
>
> Visualization is now a fundamental element of Solr Streaming Expressions and 
> Math Expressions. This ticket will create a visual guide to Solr Math 
> Expressions and Solr Streaming Expressions that includes *Apache Zeppelin* 
> visualization examples.
> It will also cover using the JDBC expression to *analyze* and *visualize* 
> results from any JDBC compliant data source.
> Intro from the guide:
> {code:java}
> Streaming Expressions exposes the capabilities of Solr Cloud as composable 
> functions. These functions provide a system for searching, transforming, 
> analyzing and visualizing data stored in Solr Cloud collections.
> At a high level there are four main capabilities that will be explored in the 
> documentation:
> * Searching, sampling and aggregating results from Solr.
> * Transforming result sets after they are retrieved from Solr.
> * Analyzing and modeling result sets using probability and statistics and 
> machine learning libraries.
> * Visualizing result sets, aggregations and statistical models of the data.
> {code}
>  
> A few sample visualizations are attached to the ticket.




[GitHub] [lucene-solr] janhoy commented on a change in pull request #946: SOLR-13835 HttpSolrCall produces incorrect extra AuditEvent on AuthorizationResponse.PROMPT

2019-10-16 Thread GitBox

janhoy commented on a change in pull request #946: SOLR-13835 HttpSolrCall 
produces incorrect extra AuditEvent on AuthorizationResponse.PROMPT
URL: https://github.com/apache/lucene-solr/pull/946#discussion_r335483832
 
 

 ##
 File path: 
solr/core/src/test/org/apache/solr/security/BasicAuthIntegrationTest.java
 ##
 @@ -232,7 +234,7 @@ public void testBasicAuth() throws Exception {
 HttpSolrClient.RemoteSolrException e = expectThrows(HttpSolrClient.RemoteSolrException.class, () -> {
   new UpdateRequest().deleteByQuery("*:*").process(aNewClient, COLLECTION);
 });
-assertTrue(e.getMessage().contains("Unauthorized request"));
+assertTrue(e.getMessage(), e.getMessage().contains("Authentication failed"));
 
 Review comment:
   Earlier both 401 and 403 responses would print the text "Unauthorized 
request" due to the fall-through. After this fix we also changed the text for 
the 401 response, making this test fail. I don't know why the Authorization 
plugin returns 401 though; the password is correct.
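
   The fall-through effect described here can be sketched like this. It is a 
hypothetical illustration, not the code in `HttpSolrCall`: the class and method 
names and the "OK" default are assumptions, and only the two message texts come 
from the diff above.

```java
public class AuthMessageSketch {

    // Before the fix, as described: 401 has no break and falls through to 403,
    // so both statuses produce the same text.
    static String messageWithFallThrough(int status) {
        String message;
        switch (status) {
            case 401: // no break here: falls through to the 403 branch
            case 403:
                message = "Unauthorized request";
                break;
            default:
                message = "OK"; // hypothetical placeholder for other statuses
        }
        return message;
    }

    // After the fix, as described: each status gets its own text.
    static String messageWithoutFallThrough(int status) {
        switch (status) {
            case 401:
                return "Authentication failed";
            case 403:
                return "Unauthorized request";
            default:
                return "OK"; // hypothetical placeholder for other statuses
        }
    }

    public static void main(String[] args) {
        System.out.println(messageWithFallThrough(401));
        System.out.println(messageWithoutFallThrough(401));
    }
}
```

   A test that asserts on the old text for a 401 response would therefore start 
failing once the two cases are separated, which matches the change to the 
assertion above.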



[jira] [Updated] (SOLR-13824) JSON Request API ignores prematurely closing curly brace.

2019-10-16 Thread Mikhail Khludnev (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-13824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Khludnev updated SOLR-13824:

Attachment: (was: SOLR-13824.patch)

> JSON Request API ignores prematurely closing curly brace. 
> --
>
> Key: SOLR-13824
> URL: https://issues.apache.org/jira/browse/SOLR-13824
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: JSON Request API
>Reporter: Mikhail Khludnev
>Priority: Major
> Attachments: SOLR-13824.patch, SOLR-13824.patch
>
>
> {code:java}
> json={query:"content:foo", facet:{zz:{field:id}}}
> {code}
> this works fine, but if we mistype {{}}} instead of {{,}}
> {code:java}
> json={query:"content:foo"} facet:{zz:{field:id}}}
> {code}
> It's captured only partially; here's what we have under debug
> {code:java}
>   "json":{"query":"content:foo"},
> {code}
> I suppose it should throw an error with 400 code.  




[jira] [Updated] (SOLR-13824) JSON Request API ignores prematurely closing curly brace.

2019-10-16 Thread Mikhail Khludnev (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-13824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Khludnev updated SOLR-13824:

Attachment: SOLR-13824.patch

> JSON Request API ignores prematurely closing curly brace. 
> --
>
> Key: SOLR-13824
> URL: https://issues.apache.org/jira/browse/SOLR-13824
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: JSON Request API
>Reporter: Mikhail Khludnev
>Priority: Major
> Attachments: SOLR-13824.patch, SOLR-13824.patch
>
>
> {code:java}
> json={query:"content:foo", facet:{zz:{field:id}}}
> {code}
> this works fine, but if we mistype {{}}} instead of {{,}}
> {code:java}
> json={query:"content:foo"} facet:{zz:{field:id}}}
> {code}
> It's captured only partially; here's what we have under debug
> {code:java}
>   "json":{"query":"content:foo"},
> {code}
> I suppose it should throw an error with 400 code.  




[jira] [Commented] (SOLR-13105) A visual guide to Solr Math Expressions and Streaming Expressions

2019-10-16 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-13105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952800#comment-16952800
 ] 

ASF subversion and git services commented on SOLR-13105:


Commit 16dfdbca48f54baacf06a4ac68c75ca2841d9d34 in lucene-solr's branch 
refs/heads/SOLR-13105-visual from Joel Bernstein
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=16dfdbc ]

SOLR-13105: Improve curve fitting docs 5


> A visual guide to Solr Math Expressions and Streaming Expressions
> -
>
> Key: SOLR-13105
> URL: https://issues.apache.org/jira/browse/SOLR-13105
> Project: Solr
>  Issue Type: New Feature
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
>Priority: Major
> Attachments: Screen Shot 2019-01-14 at 10.56.32 AM.png, Screen Shot 
> 2019-02-21 at 2.14.43 PM.png, Screen Shot 2019-03-03 at 2.28.35 PM.png, 
> Screen Shot 2019-03-04 at 7.47.57 PM.png, Screen Shot 2019-03-13 at 10.47.47 
> AM.png, Screen Shot 2019-03-30 at 6.17.04 PM.png
>
>
> Visualization is now a fundamental element of Solr Streaming Expressions and 
> Math Expressions. This ticket will create a visual guide to Solr Math 
> Expressions and Solr Streaming Expressions that includes *Apache Zeppelin* 
> visualization examples.
> It will also cover using the JDBC expression to *analyze* and *visualize* 
> results from any JDBC compliant data source.
> Intro from the guide:
> {code:java}
> Streaming Expressions exposes the capabilities of Solr Cloud as composable 
> functions. These functions provide a system for searching, transforming, 
> analyzing and visualizing data stored in Solr Cloud collections.
> At a high level there are four main capabilities that will be explored in the 
> documentation:
> * Searching, sampling and aggregating results from Solr.
> * Transforming result sets after they are retrieved from Solr.
> * Analyzing and modeling result sets using probability and statistics and 
> machine learning libraries.
> * Visualizing result sets, aggregations and statistical models of the data.
> {code}
>  
> A few sample visualizations are attached to the ticket.




[jira] [Commented] (SOLR-13105) A visual guide to Solr Math Expressions and Streaming Expressions

2019-10-16 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-13105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952797#comment-16952797
 ] 

ASF subversion and git services commented on SOLR-13105:


Commit 11cc8460b5e90ebf8360a3b71f14794afcd2a7c8 in lucene-solr's branch 
refs/heads/SOLR-13105-visual from Joel Bernstein
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=11cc846 ]

SOLR-13105: Improve curve fitting docs 4


> A visual guide to Solr Math Expressions and Streaming Expressions
> -
>
> Key: SOLR-13105
> URL: https://issues.apache.org/jira/browse/SOLR-13105
> Project: Solr
>  Issue Type: New Feature
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
>Priority: Major
> Attachments: Screen Shot 2019-01-14 at 10.56.32 AM.png, Screen Shot 
> 2019-02-21 at 2.14.43 PM.png, Screen Shot 2019-03-03 at 2.28.35 PM.png, 
> Screen Shot 2019-03-04 at 7.47.57 PM.png, Screen Shot 2019-03-13 at 10.47.47 
> AM.png, Screen Shot 2019-03-30 at 6.17.04 PM.png
>
>
> Visualization is now a fundamental element of Solr Streaming Expressions and 
> Math Expressions. This ticket will create a visual guide to Solr Math 
> Expressions and Solr Streaming Expressions that includes *Apache Zeppelin* 
> visualization examples.
> It will also cover using the JDBC expression to *analyze* and *visualize* 
> results from any JDBC compliant data source.
> Intro from the guide:
> {code:java}
> Streaming Expressions exposes the capabilities of Solr Cloud as composable 
> functions. These functions provide a system for searching, transforming, 
> analyzing and visualizing data stored in Solr Cloud collections.
> At a high level there are four main capabilities that will be explored in the 
> documentation:
> * Searching, sampling and aggregating results from Solr.
> * Transforming result sets after they are retrieved from Solr.
> * Analyzing and modeling result sets using probability and statistics and 
> machine learning libraries.
> * Visualizing result sets, aggregations and statistical models of the data.
> {code}
>  
> A few sample visualizations are attached to the ticket.




[jira] [Commented] (LUCENE-9006) Ensure WordDelimiterGraphFilter always emits catenateAll token early

2019-10-16 Thread David Wayne Smiley (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952799#comment-16952799
 ] 

David Wayne Smiley commented on LUCENE-9006:


Thanks for the explanation RE graphOffsetsAreCorrect.  I guess there is no new 
concern here with the PR then.

I discovered this problem due to a custom filter that directly collaborates 
with a delegated WDGF instance.  It assumes the first two tokens are 
preserveOriginal then catenateAll.  This was the case with the now deprecated 
WDF.  It's intuitive too, so "looks" odd when it doesn't happen.  I noticed in 
LUCENE-8730 a precedent for making the token orderings consistent, which makes 
sense to me.

> Ensure WordDelimiterGraphFilter always emits catenateAll token early
> 
>
> Key: LUCENE-9006
> URL: https://issues.apache.org/jira/browse/LUCENE-9006
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/analysis
>Reporter: David Wayne Smiley
>Assignee: David Wayne Smiley
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Ideally, the first token of WDGF is the preserveOriginal (if configured to 
> emit), and the second should be the catenateAll (if configured to emit).  The 
> deprecated WDF does this but WDGF can sometimes put the first other token 
> earlier when there is a non-emitted candidate sub-token.
> Example input "8-other" when only generateWordParts and catenateAll -- *not* 
> generateNumberParts.  WDGF internally sees the '8' but moves on.  Ultimately, 
> the "other" token and the catenated "8other" will appear at the same internal 
> position, which by luck fools the sorter to emit "other" first.




[jira] [Commented] (SOLR-13105) A visual guide to Solr Math Expressions and Streaming Expressions

2019-10-16 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-13105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952795#comment-16952795
 ] 

ASF subversion and git services commented on SOLR-13105:


Commit 48d9c76bc5c9efa9dfecd7a81783c753fef3bcd1 in lucene-solr's branch 
refs/heads/SOLR-13105-visual from Joel Bernstein
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=48d9c76 ]

SOLR-13105: Improve curve fitting docs 3


> A visual guide to Solr Math Expressions and Streaming Expressions
> -
>
> Key: SOLR-13105
> URL: https://issues.apache.org/jira/browse/SOLR-13105
> Project: Solr
>  Issue Type: New Feature
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
>Priority: Major
> Attachments: Screen Shot 2019-01-14 at 10.56.32 AM.png, Screen Shot 
> 2019-02-21 at 2.14.43 PM.png, Screen Shot 2019-03-03 at 2.28.35 PM.png, 
> Screen Shot 2019-03-04 at 7.47.57 PM.png, Screen Shot 2019-03-13 at 10.47.47 
> AM.png, Screen Shot 2019-03-30 at 6.17.04 PM.png
>
>
> Visualization is now a fundamental element of Solr Streaming Expressions and 
> Math Expressions. This ticket will create a visual guide to Solr Math 
> Expressions and Solr Streaming Expressions that includes *Apache Zeppelin* 
> visualization examples.
> It will also cover using the JDBC expression to *analyze* and *visualize* 
> results from any JDBC compliant data source.
> Intro from the guide:
> {code:java}
> Streaming Expressions exposes the capabilities of Solr Cloud as composable 
> functions. These functions provide a system for searching, transforming, 
> analyzing and visualizing data stored in Solr Cloud collections.
> At a high level there are four main capabilities that will be explored in the 
> documentation:
> * Searching, sampling and aggregating results from Solr.
> * Transforming result sets after they are retrieved from Solr.
> * Analyzing and modeling result sets using probability and statistics and 
> machine learning libraries.
> * Visualizing result sets, aggregations and statistical models of the data.
> {code}
>  
> A few sample visualizations are attached to the ticket.




[jira] [Commented] (SOLR-13105) A visual guide to Solr Math Expressions and Streaming Expressions

2019-10-16 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-13105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952793#comment-16952793
 ] 

ASF subversion and git services commented on SOLR-13105:


Commit 623a026321ad8746265e8c4526423ec29e321c7f in lucene-solr's branch 
refs/heads/SOLR-13105-visual from Joel Bernstein
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=623a026 ]

SOLR-13105: Improve curve fitting docs 2


> A visual guide to Solr Math Expressions and Streaming Expressions
> -
>
> Key: SOLR-13105
> URL: https://issues.apache.org/jira/browse/SOLR-13105
> Project: Solr
>  Issue Type: New Feature
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
>Priority: Major
> Attachments: Screen Shot 2019-01-14 at 10.56.32 AM.png, Screen Shot 
> 2019-02-21 at 2.14.43 PM.png, Screen Shot 2019-03-03 at 2.28.35 PM.png, 
> Screen Shot 2019-03-04 at 7.47.57 PM.png, Screen Shot 2019-03-13 at 10.47.47 
> AM.png, Screen Shot 2019-03-30 at 6.17.04 PM.png
>
>
> Visualization is now a fundamental element of Solr Streaming Expressions and 
> Math Expressions. This ticket will create a visual guide to Solr Math 
> Expressions and Solr Streaming Expressions that includes *Apache Zeppelin* 
> visualization examples.
> It will also cover using the JDBC expression to *analyze* and *visualize* 
> results from any JDBC compliant data source.
> Intro from the guide:
> {code:java}
> Streaming Expressions exposes the capabilities of Solr Cloud as composable 
> functions. These functions provide a system for searching, transforming, 
> analyzing and visualizing data stored in Solr Cloud collections.
> At a high level there are four main capabilities that will be explored in the 
> documentation:
> * Searching, sampling and aggregating results from Solr.
> * Transforming result sets after they are retrieved from Solr.
> * Analyzing and modeling result sets using probability and statistics and 
> machine learning libraries.
> * Visualizing result sets, aggregations and statistical models of the data.
> {code}
>  
> A few sample visualizations are attached to the ticket.




[jira] [Commented] (SOLR-13105) A visual guide to Solr Math Expressions and Streaming Expressions

2019-10-16 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-13105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952785#comment-16952785
 ] 

ASF subversion and git services commented on SOLR-13105:


Commit fd3d50c5807b3c1097bb4a7639f35bff94b11dc6 in lucene-solr's branch 
refs/heads/SOLR-13105-visual from Joel Bernstein
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=fd3d50c ]

SOLR-13105: Improve curve fitting docs


> A visual guide to Solr Math Expressions and Streaming Expressions
> -
>
> Key: SOLR-13105
> URL: https://issues.apache.org/jira/browse/SOLR-13105
> Project: Solr
>  Issue Type: New Feature
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
>Priority: Major
> Attachments: Screen Shot 2019-01-14 at 10.56.32 AM.png, Screen Shot 
> 2019-02-21 at 2.14.43 PM.png, Screen Shot 2019-03-03 at 2.28.35 PM.png, 
> Screen Shot 2019-03-04 at 7.47.57 PM.png, Screen Shot 2019-03-13 at 10.47.47 
> AM.png, Screen Shot 2019-03-30 at 6.17.04 PM.png
>
>
> Visualization is now a fundamental element of Solr Streaming Expressions and 
> Math Expressions. This ticket will create a visual guide to Solr Math 
> Expressions and Solr Streaming Expressions that includes *Apache Zeppelin* 
> visualization examples.
> It will also cover using the JDBC expression to *analyze* and *visualize* 
> results from any JDBC compliant data source.
> Intro from the guide:
> {code:java}
> Streaming Expressions exposes the capabilities of Solr Cloud as composable 
> functions. These functions provide a system for searching, transforming, 
> analyzing and visualizing data stored in Solr Cloud collections.
> At a high level there are four main capabilities that will be explored in the 
> documentation:
> * Searching, sampling and aggregating results from Solr.
> * Transforming result sets after they are retrieved from Solr.
> * Analyzing and modeling result sets using probability and statistics and 
> machine learning libraries.
> * Visualizing result sets, aggregations and statistical models of the data.
> {code}
>  
> A few sample visualizations are attached to the ticket.




[jira] [Updated] (SOLR-13836) Streaming Expression Query Parser

2019-10-16 Thread Cassandra Targett (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-13836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cassandra Targett updated SOLR-13836:
-
Component/s: streaming expressions
 query parsers

> Streaming Expression Query Parser
> -
>
> Key: SOLR-13836
> URL: https://issues.apache.org/jira/browse/SOLR-13836
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: query parsers, streaming expressions
>Reporter: Trey Grainger
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> It is currently possible to hit the search handler in a streaming expression 
> ("search(...)"), but it is not currently possible to invoke a streaming 
> expression from within a regular search within the search handler. In some 
> cases, it would be useful to leverage the power of streaming expressions to 
> generate a result set and then join that result set with a normal set of 
> search results.
> This isn't expected to be particularly efficient for high-cardinality 
> streaming expression results, but it would be a pretty powerful feature that 
> could enable a bunch of use cases that aren't possible today within a normal 
> search.
> h2. Example:
> *Docs:*
> {code:java}
> curl -X POST -H "Content-Type: application/json" "http://localhost:8983/solr/food/update?commit=true" --data-binary '
> [
> {"id": "1", "name_s":"donut","vector_fs":[5.0,0.0,1.0,5.0,0.0,4.0,5.0,1.0]},
> {"id": "2", "name_s":"apple juice","vector_fs":[1.0,5.0,0.0,0.0,0.0,4.0,4.0,3.0]},
> {"id": "3", "name_s":"cappuccino","vector_fs":[0.0,5.0,3.0,0.0,4.0,1.0,2.0,3.0]},
> {"id": "4", "name_s":"cheese pizza","vector_fs":[5.0,0.0,4.0,4.0,0.0,1.0,5.0,2.0]},
> {"id": "5", "name_s":"green tea","vector_fs":[0.0,5.0,0.0,0.0,2.0,1.0,1.0,5.0]},
> {"id": "6", "name_s":"latte","vector_fs":[0.0,5.0,4.0,0.0,4.0,1.0,3.0,3.0]},
> {"id": "7", "name_s":"soda","vector_fs":[0.0,5.0,0.0,0.0,3.0,5.0,5.0,0.0]},
> {"id": "8", "name_s":"cheese bread sticks","vector_fs":[5.0,0.0,4.0,5.0,0.0,1.0,4.0,2.0]},
> {"id": "9", "name_s":"water","vector_fs":[0.0,5.0,0.0,0.0,0.0,0.0,0.0,5.0]},
> {"id": "10", "name_s":"cinnamon bread sticks","vector_fs":[5.0,0.0,1.0,5.0,0.0,3.0,4.0,2.0]}
> ]'
> {code}
>  
> *Query:*
> {code:java}
> http://localhost:8983/solr/food/select?q=*:*&fq={!streaming_expression}top(select(search(food,%20q=%22*:*%22,%20fl=%22id,vector_fs%22,%20sort=%22id%20asc%22),%20cosineSimilarity(vector_fs,%20array(5.1,0.0,1.0,5.0,0.0,4.0,5.0,1.0))%20as%20cos,%20id),%20n=5,%20sort=%22cos%20desc%22)&fl=id,name_s
> {code}
>  
> *Response:*
> {code:java}
> {
>   "responseHeader":{
> "zkConnected":true,
> "status":0,
> "QTime":7,
> "params":{
>   "q":"*:*",
>   "fl":"id,name_s",
>   "fq":"{!streaming_expression}top(select(search(food, q=\"*:*\", fl=\"id,vector_fs\", sort=\"id asc\"), cosineSimilarity(vector_fs, array(5.1,0.0,1.0,5.0,0.0,4.0,5.0,1.0)) as cos, id), n=5, sort=\"cos desc\")"}},
>   "response":{"numFound":5,"start":0,"docs":[
>   {
> "name_s":"donut",
> "id":"1"},
>   {
> "name_s":"apple juice",
> "id":"2"},
>   {
> "name_s":"cheese pizza",
> "id":"4"},
>   {
> "name_s":"cheese bread sticks",
> "id":"8"},
>   {
> "name_s":"cinnamon bread sticks",
> "id":"10"}]
>   }}
> {code}
> The current implementation also supports the following additional parameters:
>  *f*: (optional) The field name from the streaming expression containing the 
> document ids upon which to filter. Defaults to the same uniqueKey field name 
> from your documents. 
>  *method*: (optional) Any of termsFilter (default), booleanQuery, automaton, 
> docValuesTermsFilter.
> The method may go away, especially if we find a more efficient way to join 
> the stream to the main query doc set.
>  
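The ranking in the example query rests on cosine similarity between each document's `vector_fs` and the query array. A minimal sketch of that computation, written in plain Java rather than taken from Solr's `cosineSimilarity` implementation (the class and method names are hypothetical), may help when checking the expected ordering by hand:

```java
/** Hypothetical helper; not Solr's implementation of cosineSimilarity. */
final class CosineSketch {
  private CosineSketch() {}

  /** Cosine of the angle between two equal-length vectors; NaN if either is all zeros. */
  static double cosineSimilarity(double[] a, double[] b) {
    if (a.length != b.length) {
      throw new IllegalArgumentException("vectors must have the same length");
    }
    double dot = 0.0;
    double normA = 0.0;
    double normB = 0.0;
    for (int i = 0; i < a.length; i++) {
      dot += a[i] * b[i];      // accumulate the dot product
      normA += a[i] * a[i];    // squared length of a
      normB += b[i] * b[i];    // squared length of b
    }
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
  }
}
```

With the sample documents above, the "donut" vector scores much closer to "cheese pizza" than to "water", which is consistent with the five documents returned in the response.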




[jira] [Commented] (SOLR-13835) HttpSolrCall produces incorrect extra AuditEvent on AuthorizationResponse.PROMPT

2019-10-16 Thread Jira



[ 
https://issues.apache.org/jira/browse/SOLR-13835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952774#comment-16952774
 ] 

Jan Høydahl commented on SOLR-13835:


New commits to PR to explicitly handle known codes:
 * 401 => EventType.REJECTED
 * 403 => EventType.UNAUTHORIZED
 * 200/202 => EventType.AUTHORIZED
 * All other statuses => EventType.ERROR

Please review. Think this should be mergeable now.
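
A minimal sketch of an explicit status-to-event mapping of this kind, assuming a stand-in enum rather than Solr's actual audit event class (the enum, class, and method names are hypothetical). Here 401 is mapped to the `REJECTED` event that the issue description associates with `PROMPT`:

```java
/** Hypothetical stand-in for the audit event types discussed in this thread. */
enum EventType { REJECTED, UNAUTHORIZED, AUTHORIZED, ERROR }

final class AuditStatusMapping {
  private AuditStatusMapping() {}

  /** Maps an authorization status code to exactly one event type. */
  static EventType eventTypeFor(int statusCode) {
    switch (statusCode) {
      case 401:
        return EventType.REJECTED;      // PROMPT: credentials are required
      case 403:
        return EventType.UNAUTHORIZED;  // authenticated but not permitted
      case 200:
      case 202:
        return EventType.AUTHORIZED;
      default:
        return EventType.ERROR;         // any status not handled explicitly
    }
  }
}
```

Because every status code falls into exactly one branch, a single request can no longer produce two events from this step.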

> HttpSolrCall produces incorrect extra AuditEvent on 
> AuthorizationResponse.PROMPT
> 
>
> Key: SOLR-13835
> URL: https://issues.apache.org/jira/browse/SOLR-13835
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Authentication, Authorization
>Reporter: Chris M. Hostetter
>Assignee: Jan Høydahl
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> spinning this out of SOLR-13741...
> {quote}
> Wrt the REJECTED + UNAUTHORIZED events I see the same as you, and I believe 
> there is a code bug, not a test bug. In HttpSolrCall#471 in the 
> {{authorize()}} call, if authResponse == PROMPT, it will actually match both 
> blocks and emit two audit events: 
> [https://github.com/apache/lucene-solr/blob/26ede632e6259eb9d16861a3c0f782c9c8999762/solr/core/src/java/org/apache/solr/servlet/HttpSolrCall.java#L475:L493]
>  
> {code:java}
> if (authResponse.statusCode == AuthorizationResponse.PROMPT.statusCode) {...}
> if (!(authResponse.statusCode == HttpStatus.SC_ACCEPTED) && 
> !(authResponse.statusCode == HttpStatus.SC_OK)) {...}
> {code}
> When code==401, it is also true that code!=200. Intuitively there should be 
> both a sendError and return RETURN before line #484 in the first if block?
> {quote}
> This causes any and all {{REJECTED}} AuditEvent messages to be accompanied by 
> a corresponding {{UNAUTHORIZED}} AuditEvent.  
> It's not yet clear if, from the perspective of the external client, there are 
> any other bugs in behavior (TBD)
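
The double emission described above can be reproduced in isolation. The following is a sketch under stated assumptions, not the actual `HttpSolrCall` code: events are plain strings, the constants and method names are hypothetical, and the two `if` conditions mirror the ones quoted in the description.

```java
import java.util.ArrayList;
import java.util.List;

final class PromptAuditSketch {
  static final int SC_OK = 200;
  static final int SC_ACCEPTED = 202;
  static final int SC_PROMPT = 401; // assumed status code of AuthorizationResponse.PROMPT

  private PromptAuditSketch() {}

  /** Two independent if blocks: a 401 matches both and yields two events. */
  static List<String> eventsWithFallThrough(int statusCode) {
    List<String> events = new ArrayList<>();
    if (statusCode == SC_PROMPT) {
      events.add("REJECTED");
    }
    if (statusCode != SC_ACCEPTED && statusCode != SC_OK) {
      events.add("UNAUTHORIZED");
    }
    return events;
  }

  /** Returning from the first block leaves a 401 with a single event. */
  static List<String> eventsWithEarlyReturn(int statusCode) {
    List<String> events = new ArrayList<>();
    if (statusCode == SC_PROMPT) {
      events.add("REJECTED");
      return events;
    }
    if (statusCode != SC_ACCEPTED && statusCode != SC_OK) {
      events.add("UNAUTHORIZED");
    }
    return events;
  }
}
```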




[jira] [Commented] (SOLR-13403) Terms component fails for DatePointField

2019-10-16 Thread Mikhail Khludnev (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-13403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952769#comment-16952769
 ] 

Mikhail Khludnev commented on SOLR-13403:
-

patch makes sense

> Terms component fails for DatePointField
> 
>
> Key: SOLR-13403
> URL: https://issues.apache.org/jira/browse/SOLR-13403
> Project: Solr
>  Issue Type: Bug
>  Components: SearchComponents - other
>Reporter: Munendra S N
>Assignee: Munendra S N
>Priority: Major
> Attachments: SOLR-13403.patch, SOLR-13403.patch, SOLR-13403.patch
>
>
> Getting terms works for PointFields except DatePointField. For DatePointField, 
> the request fails with an NPE.




[jira] [Commented] (LUCENE-9006) Ensure WordDelimiterGraphFilter always emits catenateAll token early

2019-10-16 Thread Jim Ferenczi (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952770#comment-16952770
 ] 

Jim Ferenczi commented on LUCENE-9006:
--

I don't think your change affects the fact that we cannot set 
graphOffsetsAreCorrect when writing a test using the WDGF. Your test should 
fail the same way with graphOffsetsAreCorrect if you don't reorder the terms in 
the output. The other tests for the WDGF set this flag to false. I also wonder 
why you think that there should be any order among the different forms that 
start at the same position. Are you relying on this order in a subsequent 
filter? Maybe we could mark the alternatives with a specific type, as is done 
for synonyms? This way it would be easier to differentiate a splitting 
path from the original token.

> Ensure WordDelimiterGraphFilter always emits catenateAll token early
> 
>
> Key: LUCENE-9006
> URL: https://issues.apache.org/jira/browse/LUCENE-9006
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/analysis
>Reporter: David Wayne Smiley
>Assignee: David Wayne Smiley
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Ideally, the first token of WDGF is the preserveOriginal (if configured to 
> emit), and the second should be the catenateAll (if configured to emit).  The 
> deprecated WDF does this but WDGF can sometimes put the first other token 
> earlier when there is a non-emitted candidate sub-token.
> Example input "8-other" when only generateWordParts and catenateAll -- *not* 
> generateNumberParts.  WDGF internally sees the '8' but moves on.  Ultimately, 
> the "other" token and the catenated "8other" will appear at the same internal 
> position, which by luck fools the sorter to emit "other" first.




[jira] [Commented] (SOLR-13677) All Metrics Gauges should be unregistered by the objects that registered them

2019-10-16 Thread Andrzej Bialecki (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-13677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952764#comment-16952764
 ] 

Andrzej Bialecki commented on SOLR-13677:
-

This is an updated patch, after cleanup of the issues listed in the review, and 
with some additional changes:
* I excluded the {{scope}} from the context, because in most cases we can reuse 
the parent context for components with different scopes.
* I converted some internal, non-pluggable components to use the new API.

This still is a large change and needs more testing - I'll need another day to 
be reasonably sure that it doesn't break things.


> All Metrics Gauges should be unregistered by the objects that registered them
> -
>
> Key: SOLR-13677
> URL: https://issues.apache.org/jira/browse/SOLR-13677
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: metrics
>Reporter: Noble Paul
>Assignee: Andrzej Bialecki
>Priority: Blocker
> Fix For: 8.3
>
> Attachments: SOLR-13677.patch
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> The life cycle of Metrics producers is managed by the core (mostly). So, if 
> the lifecycle of the object is different from that of the core itself, these 
> objects will never be unregistered from the metrics registry. This will lead 
> to memory leaks.
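
The ownership rule in the issue title can be sketched without any metrics library. This is an illustration only: `GaugeRegistry` and `GaugeOwner` are hypothetical stand-ins and do not reflect the API introduced by the patch.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Hypothetical stand-in for a shared metrics registry. */
final class GaugeRegistry {
  private final Map<String, Supplier<?>> gauges = new ConcurrentHashMap<>();

  void register(String name, Supplier<?> gauge) { gauges.put(name, gauge); }

  void remove(String name) { gauges.remove(name); }

  int size() { return gauges.size(); }
}

/** A component that remembers what it registered and removes exactly that on close. */
final class GaugeOwner implements AutoCloseable {
  private final GaugeRegistry registry;
  private final Set<String> registered = new HashSet<>();

  GaugeOwner(GaugeRegistry registry) { this.registry = registry; }

  void registerGauge(String name, Supplier<?> gauge) {
    registry.register(name, gauge);
    registered.add(name); // track ownership so close() can undo it
  }

  @Override
  public void close() {
    for (String name : registered) {
      registry.remove(name);
    }
    registered.clear();
  }
}
```

Tying removal to the owner's own `close()` means a short-lived component no longer depends on the core's lifecycle to release the references held by its gauges.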




[GitHub] [lucene-solr] jimczi commented on a change in pull request #913: LUCENE-8995: TopSuggestDocsCollector#collect should be able to signal rejection

2019-10-16 Thread GitBox

jimczi commented on a change in pull request #913: LUCENE-8995: 
TopSuggestDocsCollector#collect should be able to signal rejection
URL: https://github.com/apache/lucene-solr/pull/913#discussion_r335431950
 
 

 ##
 File path: 
lucene/suggest/src/test/org/apache/lucene/search/suggest/document/TestPrefixCompletionQuery.java
 ##
 @@ -253,6 +258,61 @@ public void testDocFiltering() throws Exception {
 iw.close();
   }
 
+  /**
+   * Test that the correct amount of documents are collected if using a collector that also rejects documents.
+   */
+  public void testCollectorThatRejects() throws Exception {
+    // use synonym analyzer to have multiple paths to same suggested document. This mock adds "dog" as synonym for "dogs"
+    Analyzer analyzer = new MockSynonymAnalyzer();
+    RandomIndexWriter iw = new RandomIndexWriter(random(), dir, iwcWithSuggestField(analyzer, "suggest_field"));
+    List<Entry> expectedResults = new ArrayList<Entry>();
+
+    for (int docCount = 10; docCount > 0; docCount--) {
+      Document document = new Document();
+      String value = "ab" + docCount + " dogs";
+      document.add(new SuggestField("suggest_field", value, docCount));
+      expectedResults.add(new Entry(value, docCount));
+      iw.addDocument(document);
+    }
+
+    if (rarely()) {
+      iw.commit();
+    }
+
+    DirectoryReader reader = iw.getReader();
+    SuggestIndexSearcher indexSearcher = new SuggestIndexSearcher(reader);
+
+    PrefixCompletionQuery query = new PrefixCompletionQuery(analyzer, new Term("suggest_field", "ab"));
+    int topN = 5;
+
+    // use a TopSuggestDocsCollector that rejects results with duplicate docIds
+    TopSuggestDocsCollector collector = new TopSuggestDocsCollector(topN, false) {
+
+      private Set<Integer> seenDocIds = new HashSet<>();
+
+      @Override
+      public boolean collect(int docID, CharSequence key, CharSequence context, float score) throws IOException {
+        int globalDocId = docID + docBase;
+        boolean collected = false;
+        if (seenDocIds.contains(globalDocId) == false) {
+          super.collect(docID, key, context, score);
+          seenDocIds.add(globalDocId);
+          collected = true;
+        }
+        return collected;
+      }
+    };
+
+    indexSearcher.suggest(query, collector);
+    assertSuggestions(collector.get(), expectedResults.subList(0, topN).toArray(new Entry[0]));
+
+    // TODO expecting true here, why false?
 
 Review comment:
   I'll open an issue. I also wonder if we shouldn't rely on the fact that the 
top suggest collector will also early terminate, so whenever we expect rejection 
(because of deleted docs or because we deduplicate on suggestions/doc) we could 
set the queue size to its maximum value (5000). Currently we have different 
heuristics that try to pick a sensible value automatically, but there is no 
guarantee of admissibility. For instance, if we want to deduplicate by document 
id we should ensure that the queue size is greater than 
`topN*maxAnalyzedValuesPerDoc`, and we'd need to compute this value at index 
time.
   I may be completely off, but it would be interesting to see the effects of 
setting the queue size to its maximum value on all searches. This way the 
admissibility is easier to reason about and we don't need to correlate it with 
the choice made by the heuristic.
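
The bound mentioned in this comment is simple arithmetic. A minimal sketch, assuming hypothetical helper names and treating `maxAnalyzedValuesPerDoc` as a value that would have to be known up front (the comment notes it is not currently computed):

```java
/** Hypothetical helper illustrating the admissibility bound; not part of the PR. */
final class SuggestQueueBound {
  private SuggestQueueBound() {}

  /** The product the queue size has to exceed when deduplicating by document id. */
  static long requiredLowerBound(int topN, int maxAnalyzedValuesPerDoc) {
    if (topN <= 0 || maxAnalyzedValuesPerDoc <= 0) {
      throw new IllegalArgumentException("both arguments must be positive");
    }
    return (long) topN * maxAnalyzedValuesPerDoc; // widen to long to avoid int overflow
  }

  /** True if the given queue size is strictly greater than the bound. */
  static boolean isAdmissible(int queueSize, int topN, int maxAnalyzedValuesPerDoc) {
    return queueSize > requiredLowerBound(topN, maxAnalyzedValuesPerDoc);
  }
}
```

For the test above, with `topN = 5` and assuming two analyzed paths per document ("dogs" and its synonym "dog"), the bound is 10, so the maximum queue size of 5000 would satisfy it while a queue of exactly 10 would not.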


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (SOLR-13677) All Metrics Gauges should be unregistered by the objects that registered them

2019-10-16 Thread Andrzej Bialecki (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-13677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrzej Bialecki updated SOLR-13677:

Attachment: SOLR-13677.patch

> All Metrics Gauges should be unregistered by the objects that registered them
> -
>
> Key: SOLR-13677
> URL: https://issues.apache.org/jira/browse/SOLR-13677
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: metrics
>Reporter: Noble Paul
>Assignee: Andrzej Bialecki
>Priority: Blocker
> Fix For: 8.3
>
> Attachments: SOLR-13677.patch
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> The life cycle of Metrics producers is managed by the core (mostly). So, if 
> the lifecycle of the object is different from that of the core itself, these 
> objects will never be unregistered from the metrics registry. This will lead 
> to memory leaks.




[GitHub] [lucene-solr] jimczi commented on a change in pull request #913: LUCENE-8995: TopSuggestDocsCollector#collect should be able to signal rejection

2019-10-16 Thread GitBox

jimczi commented on a change in pull request #913: LUCENE-8995: 
TopSuggestDocsCollector#collect should be able to signal rejection
URL: https://github.com/apache/lucene-solr/pull/913#discussion_r335426995
 
 

 ##
 File path: 
lucene/suggest/src/java/org/apache/lucene/search/suggest/document/TopSuggestDocs.java
 ##
 @@ -116,19 +133,29 @@ public TopSuggestDocs(TotalHits totalHits, SuggestScoreDoc[] scoreDocs) {
    */
   public static TopSuggestDocs merge(int topN, TopSuggestDocs[] shardHits) {
     SuggestScoreDocPriorityQueue priorityQueue = new SuggestScoreDocPriorityQueue(topN);
+    boolean allComplete = true;
     for (TopSuggestDocs shardHit : shardHits) {
       for (SuggestScoreDoc scoreDoc : shardHit.scoreLookupDocs()) {
         if (scoreDoc == priorityQueue.insertWithOverflow(scoreDoc)) {
           break;
         }
       }
+      allComplete &= shardHit.isComplete;
     }
     SuggestScoreDoc[] topNResults = priorityQueue.getResults();
     if (topNResults.length > 0) {
-      return new TopSuggestDocs(new TotalHits(topNResults.length, TotalHits.Relation.EQUAL_TO), topNResults);
+      return new TopSuggestDocs(new TotalHits(topNResults.length, TotalHits.Relation.EQUAL_TO), topNResults,
+          allComplete);
     } else {
       return TopSuggestDocs.EMPTY;
     }
   }
 
+  /**
+   * Indicates if the list of results is complete or not. Might be false if the {@link TopNSearcher} rejected
+   * too many of the queued results.
 
 Review comment:
   The admissibility of the search is computed from the reject count, so a value 
of `false` means that we exhausted all the paths but we had to reject all of 
them, so the topN is truncated. It's hard to follow the full logic, but it should 
be ok as long as it is ok to return fewer than topN results when there are more 
rejections than the queue size can handle?



[GitHub] [lucene-solr] jimczi commented on issue #904: LUCENE-8992: Share minimum score across segment in concurrent search

2019-10-16 Thread GitBox

jimczi commented on issue #904: LUCENE-8992: Share minimum score across segment 
in concurrent search
URL: https://github.com/apache/lucene-solr/pull/904#issuecomment-542656802
 
 
   I pushed another commit to replace the modulo with a bitwise operation as 
suggested by @jpountz . That seemed to help a bit and since there are no 
regressions and some nice boosts I think it is ready for another review.
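
The expression changed in that commit is not shown in this thread. As a general sketch of the technique, assuming a check that fires every `interval` collected hits and an `interval` that is a power of two (the class and method names are hypothetical):

```java
/** Hypothetical illustration of replacing a modulo check with a bit mask. */
final class IntervalCheck {
  private IntervalCheck() {}

  /** Straightforward version: true on every multiple of interval. */
  static boolean isBoundaryModulo(long count, int interval) {
    return count % interval == 0;
  }

  /**
   * Mask version: equivalent to the modulo check only when interval is a
   * positive power of two, because interval - 1 then has all lower bits set.
   */
  static boolean isBoundaryMask(long count, int interval) {
    return (count & (interval - 1)) == 0;
  }
}
```

The mask form avoids an integer division on a hot path, at the cost of constraining the interval to powers of two.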



[jira] [Updated] (SOLR-13824) JSON Request API ignores prematurely closing curly brace.

2019-10-16 Thread Mikhail Khludnev (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-13824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Khludnev updated SOLR-13824:

Attachment: SOLR-13824.patch

> JSON Request API ignores prematurely closing curly brace. 
> --
>
> Key: SOLR-13824
> URL: https://issues.apache.org/jira/browse/SOLR-13824
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: JSON Request API
>Reporter: Mikhail Khludnev
>Priority: Major
> Attachments: SOLR-13824.patch, SOLR-13824.patch
>
>
> {code:java}
> json={query:"content:foo", facet:{zz:{field:id}}}
> {code}
> this works fine, but if we mistype {{}}} instead of {{,}}
> {code:java}
> json={query:"content:foo"} facet:{zz:{field:id}}}
> {code}
> It's captured only partially; here's what we have under debug:
> {code:java}
>   "json":{"query":"content:foo"},
> {code}
> I suppose it should throw an error with 400 code.  
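
One way to detect the situation described above is to check whether anything other than whitespace follows the first complete top-level object. The sketch below is not Solr's parser and is not taken from the attached patch; the class and method names are hypothetical, and it only tracks braces and double-quoted strings.

```java
/** Hypothetical check for text left over after the first top-level {...} object. */
final class TrailingJsonCheck {
  private TrailingJsonCheck() {}

  /** Returns true if non-whitespace text follows the first complete top-level object. */
  static boolean hasTrailingContent(String json) {
    int depth = 0;
    boolean inString = false;
    for (int i = 0; i < json.length(); i++) {
      char c = json.charAt(i);
      if (inString) {
        if (c == '\\') {
          i++;                 // skip the escaped character
        } else if (c == '"') {
          inString = false;    // closing quote
        }
      } else if (c == '"') {
        inString = true;       // braces inside strings must not be counted
      } else if (c == '{') {
        depth++;
      } else if (c == '}') {
        depth--;
        if (depth == 0) {
          // the top-level object just closed; anything else is leftover input
          return !json.substring(i + 1).trim().isEmpty();
        }
      }
    }
    return false; // no complete top-level object was found
  }
}
```

A request handler could answer with a 400 status when this returns true, which is the behavior the description asks for.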




[jira] [Updated] (SOLR-13824) JSON Request API ignores prematurely closing curly brace.

2019-10-16 Thread Mikhail Khludnev (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-13824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Khludnev updated SOLR-13824:

Attachment: (was: SOLR-13824.patch)

> JSON Request API ignores prematurely closing curly brace. 
> --
>
> Key: SOLR-13824
> URL: https://issues.apache.org/jira/browse/SOLR-13824
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: JSON Request API
>Reporter: Mikhail Khludnev
>Priority: Major
> Attachments: SOLR-13824.patch
>
>
> {code:java}
> json={query:"content:foo", facet:{zz:{field:id}}}
> {code}
> this works fine, but if we mistype {{}}} instead of {{,}}
> {code:java}
> json={query:"content:foo"} facet:{zz:{field:id}}}
> {code}
> It's captured only partially; here's what we have under debug:
> {code:java}
>   "json":{"query":"content:foo"},
> {code}
> I suppose it should throw an error with 400 code.  




[jira] [Updated] (SOLR-13824) JSON Request API ignores prematurely closing curly brace.

2019-10-16 Thread Mikhail Khludnev (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-13824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Khludnev updated SOLR-13824:

Attachment: SOLR-13824.patch

> JSON Request API ignores prematurely closing curly brace. 
> --
>
> Key: SOLR-13824
> URL: https://issues.apache.org/jira/browse/SOLR-13824
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: JSON Request API
>Reporter: Mikhail Khludnev
>Priority: Major
> Attachments: SOLR-13824.patch, SOLR-13824.patch
>
>
> {code:java}
> json={query:"content:foo", facet:{zz:{field:id}}}
> {code}
> this works fine, but if we mistype {{}}} instead of {{,}}
> {code:java}
> json={query:"content:foo"} facet:{zz:{field:id}}}
> {code}
> It's captured only partially; here's what we have under debug:
> {code:java}
>   "json":{"query":"content:foo"},
> {code}
> I suppose it should throw an error with 400 code.  




[jira] [Commented] (LUCENE-8993) Change Maven POM repository URLs to https

2019-10-16 Thread Uwe Schindler (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-8993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952709#comment-16952709
 ] 

Uwe Schindler commented on LUCENE-8993:
---

I had to revert the Apache Parent POM upgrade in 8.x and 8.3 branch, because 
the Apache Parent POM now needs a higher Maven minimum version, which we don't 
use yet in Lucene/Solr 8.

> Change Maven POM repository URLs to https
> -
>
> Key: LUCENE-8993
> URL: https://issues.apache.org/jira/browse/LUCENE-8993
> Project: Lucene - Core
>  Issue Type: Task
>  Components: general/build
>Affects Versions: 7.7.2, 8.2, 8.1.1
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
>Priority: Major
> Fix For: master (9.0), 8.3
>
> Attachments: LUCENE-8993.patch
>
>
> After fixing LUCENE-8807 I figured out today, that Lucene's build system uses 
> HTTPS URLs everywhere. But the POMs deployed to Maven central still use http 
> (I assumed that those are inherited from the ANT build).
> This will fix it for later versions by changing the POM templates. Hopefully 
> this will not happen in Gradle!
> [~markrmil...@gmail.com]: Can you make sure that the new Gradle build uses 
> HTTPS for all hard configured repositories (like Cloudera)?




[jira] [Commented] (LUCENE-8993) Change Maven POM repository URLs to https

2019-10-16 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-8993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952706#comment-16952706
 ] 

ASF subversion and git services commented on LUCENE-8993:
-

Commit 0c8e76764dd62728c61c415118584de04de6b022 in lucene-solr's branch 
refs/heads/branch_8_3 from Uwe Schindler
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=0c8e767 ]

Revert "LUCENE-8993: Also update to latest version of Apache Parent POM"

This reverts commit 9d21418dfcc5c884f45ab668579b0391965a18bb.

This is needed because Lucene 8.x does not yet update minimum Maven version, 
but Apache Parent POM requires this.


> Change Maven POM repository URLs to https
> -
>
> Key: LUCENE-8993
> URL: https://issues.apache.org/jira/browse/LUCENE-8993
> Project: Lucene - Core
>  Issue Type: Task
>  Components: general/build
>Affects Versions: 7.7.2, 8.2, 8.1.1
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
>Priority: Major
> Fix For: master (9.0), 8.3
>
> Attachments: LUCENE-8993.patch
>
>
> After fixing LUCENE-8807 I figured out today, that Lucene's build system uses 
> HTTPS URLs everywhere. But the POMs deployed to Maven central still use http 
> (I assumed that those are inherited from the ANT build).
> This will fix it for later versions by changing the POM templates. Hopefully 
> this will not happen in Gradle!
> [~markrmil...@gmail.com]: Can you make sure that the new Gradle build uses 
> HTTPS for all hard configured repositories (like Cloudera)?




[jira] [Commented] (LUCENE-8993) Change Maven POM repository URLs to https

2019-10-16 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-8993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952704#comment-16952704
 ] 

ASF subversion and git services commented on LUCENE-8993:
-

Commit fa726bec50ddbe1819a2d32c06aff3837b948e9e in lucene-solr's branch 
refs/heads/branch_8x from Uwe Schindler
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=fa726be ]

Revert "LUCENE-8993: Also update to latest version of Apache Parent POM"

This reverts commit 9d21418dfcc5c884f45ab668579b0391965a18bb.

This is needed because Lucene 8.x does not yet update minimum Maven version, 
but Apache Parent POM requires this.


> Change Maven POM repository URLs to https
> -
>
> Key: LUCENE-8993
> URL: https://issues.apache.org/jira/browse/LUCENE-8993
> Project: Lucene - Core
>  Issue Type: Task
>  Components: general/build
>Affects Versions: 7.7.2, 8.2, 8.1.1
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
>Priority: Major
> Fix For: master (9.0), 8.3
>
> Attachments: LUCENE-8993.patch
>
>
> After fixing LUCENE-8807 I figured out today, that Lucene's build system uses 
> HTTPS URLs everywhere. But the POMs deployed to Maven central still use http 
> (I assumed that those are inherited from the ANT build).
> This will fix it for later versions by changing the POM templates. Hopefully 
> this will not happen in Gradle!
> [~markrmil...@gmail.com]: Can you make sure that the new Gradle build uses 
> HTTPS for all hard configured repositories (like Cloudera)?




[jira] [Commented] (SOLR-12393) ExpandComponent only calculates the score of expanded docs when sorted by score

2019-10-16 Thread Lucene/Solr QA (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-12393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952701#comment-16952701
 ] 

Lucene/Solr QA commented on SOLR-12393:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
5s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Release audit (RAT) {color} | 
{color:green}  1m  5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Check forbidden APIs {color} | 
{color:green}  1m  5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Validate source patterns {color} | 
{color:green}  1m  5s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 47m 
42s{color} | {color:green} core in the patch passed. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 51m 45s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | SOLR-12393 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12983079/SOLR-12393.patch |
| Optional Tests |  compile  javac  unit  ratsources  checkforbiddenapis  
validatesourcepatterns  |
| uname | Linux lucene1-us-west 4.15.0-54-generic #58-Ubuntu SMP Mon Jun 24 
10:55:24 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | ant |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-SOLR-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh
 |
| git revision | master / f7f6a37f337 |
| ant | version: Apache Ant(TM) version 1.10.5 compiled on March 28 2019 |
| Default Java | LTS |
|  Test Results | 
https://builds.apache.org/job/PreCommit-SOLR-Build/578/testReport/ |
| modules | C: solr/core U: solr/core |
| Console output | 
https://builds.apache.org/job/PreCommit-SOLR-Build/578/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> ExpandComponent only calculates the score of expanded docs when sorted by 
> score
> ---
>
> Key: SOLR-12393
> URL: https://issues.apache.org/jira/browse/SOLR-12393
> Project: Solr
>  Issue Type: Bug
>  Components: SearchComponents - other
>Reporter: David Wayne Smiley
>Assignee: Munendra S N
>Priority: Major
> Attachments: SOLR-12393.patch, SOLR-12393.patch, SOLR-12393.patch, 
> SOLR-12393.patch
>
>
> If you use the ExpandComponent to show expanded docs and if you want the 
> score back (specified in "fl"), it will be NaN if the expanded docs are 
> sorted by anything other than the default score descending.




[GitHub] [lucene-solr] cbuescher commented on a change in pull request #913: LUCENE-8995: TopSuggestDocsCollector#collect should be able to signal rejection

2019-10-16 Thread GitBox

cbuescher commented on a change in pull request #913: LUCENE-8995: 
TopSuggestDocsCollector#collect should be able to signal rejection
URL: https://github.com/apache/lucene-solr/pull/913#discussion_r335389135
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/util/fst/Util.java
 ##
 @@ -460,11 +460,6 @@ public void addStartPaths(FST.Arc node, T startOutput, 
boolean allowEmptyStri
   continue;
 }
 
-if (results.size() == topN-1 && maxQueueDepth == topN) {
-  // Last path -- don't bother w/ queue anymore:
-  queue = null;
 
 Review comment:
   As far as I understand, this optimization assumes we will surely accept (and 
collect) the path later in L516's acceptResult(), which always seems to be the 
case for collectors that don't reject. But if the collector that is eventually 
called via NRTSuggester's acceptResult() chooses to reject this option, we were 
losing expected results. This surfaced in the prefix completion tests I added. 
@jimczi might be able to explain this a bit better than me.
   
   > Have you run the suggest benchmarks to see if removing this opto hurt 
performance?
   
   No, where are they and how can I run them?
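
   The failure mode described above can be illustrated with a small stand-alone sketch. It is a toy model, not the real `Util.java` code: the class `TopNQueueSketch`, its `search` method, and the integer "paths" are all hypothetical, and the real condition also checked `maxQueueDepth == topN`, which this sketch leaves out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;
import java.util.function.IntPredicate;

/** Toy model of a top-N path search; NOT the real org.apache.lucene.util.fst.Util code. */
final class TopNQueueSketch {

  /**
   * Pops candidate costs in ascending order and keeps up to topN accepted ones.
   * When dropQueueOnLastPath is true, the queue is discarded once only one
   * result is still missing, which loosely mimics the removed optimization.
   */
  static List<Integer> search(List<Integer> candidates, int topN,
                              boolean dropQueueOnLastPath, IntPredicate accept) {
    PriorityQueue<Integer> queue = new PriorityQueue<>(candidates);
    List<Integer> results = new ArrayList<>();
    while (results.size() < topN && queue != null && !queue.isEmpty()) {
      int path = queue.poll();
      if (dropQueueOnLastPath && results.size() == topN - 1) {
        // "Last path -- don't bother w/ queue anymore"
        queue = null;
      }
      if (accept.test(path)) {
        results.add(path);
      }
      // If the path was rejected and the queue is gone, the loop ends one result short.
    }
    return results;
  }
}
```

   With five candidates, `topN = 3`, and a collector that rejects the third candidate, the variant that drops the queue stops with two results, while the variant that keeps it goes on to collect a third.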
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Resolved] (LUCENE-8928) BKDWriter could make splitting decisions based on the actual range of values

2019-10-16 Thread Ignacio Vera (Jira)



 [ 
https://issues.apache.org/jira/browse/LUCENE-8928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ignacio Vera resolved LUCENE-8928.
--
Fix Version/s: 8.4
 Assignee: Ignacio Vera
   Resolution: Fixed

> BKDWriter could make splitting decisions based on the actual range of values
> 
>
> Key: LUCENE-8928
> URL: https://issues.apache.org/jira/browse/LUCENE-8928
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Assignee: Ignacio Vera
>Priority: Minor
> Fix For: 8.4
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently BKDWriter assumes that splitting on one dimension has no effect on 
> values in other dimensions. While this may be ok for geo points, this is 
> usually not true for ranges (or geo shapes, which are ranges too). Maybe we 
> could get better indexing by re-computing the range of values on each 
> dimension before making the choice of the split dimension?
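
One way to picture the idea in the description is the sketch below. It assumes a "widest actual range wins" heuristic and plain `int[]` points. Both are simplifications for illustration: the class `SplitDimensionSketch` is hypothetical, and the real BKDWriter works on packed byte values.

```java
/** Sketch only: the real BKDWriter works on packed byte[] values, not int arrays. */
final class SplitDimensionSketch {

  /** Returns the index of the dimension whose actual values span the widest range. */
  static int widestDimension(int[][] points) {
    if (points.length == 0) {
      throw new IllegalArgumentException("no points");
    }
    int numDims = points[0].length;
    int bestDim = 0;
    long bestRange = -1;
    for (int dim = 0; dim < numDims; dim++) {
      // Re-compute min and max from the points that are really in this cell,
      // instead of trusting bounds inherited from the parent cell.
      int min = Integer.MAX_VALUE;
      int max = Integer.MIN_VALUE;
      for (int[] point : points) {
        min = Math.min(min, point[dim]);
        max = Math.max(max, point[dim]);
      }
      long range = (long) max - (long) min;
      if (range > bestRange) {
        bestRange = range;
        bestDim = dim;
      }
    }
    return bestDim;
  }
}
```

If an earlier split left a cell whose second dimension only covers 500..502 while the first still covers 0..10, this picks the first dimension, even if the inherited bounds for the second dimension were much wider.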




[jira] [Resolved] (LUCENE-8746) Make EdgeTree (aka ComponentTree) support different type of components

2019-10-16 Thread Ignacio Vera (Jira)



 [ 
https://issues.apache.org/jira/browse/LUCENE-8746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ignacio Vera resolved LUCENE-8746.
--
Fix Version/s: 8.4
 Assignee: Ignacio Vera
   Resolution: Fixed

Thanks [~jpountz] for muting the test. I have pushed a fix and it seems the tests are 
happy. The issue was related to the order of the edges of decoded triangles. This 
is something that LUCENE-8997 should improve.

> Make EdgeTree (aka ComponentTree) support different type of components
> --
>
> Key: LUCENE-8746
> URL: https://issues.apache.org/jira/browse/LUCENE-8746
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Ignacio Vera
>Assignee: Ignacio Vera
>Priority: Major
> Fix For: 8.4
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> Currently the class {{EdgeTree}} is a bit confusing as it is in reality a 
> tree of components. The inner class {{Edge}} is the one that builds a tree of 
> edges which is used by Polygon2D and Line2D to represent their structure.
> Here is what is proposed:
> 1) Create a new class called {{ComponentTree}} which is in fact the current 
> {{EdgeTree}}
> 2) Modify {{EdgeTree}} to be in fact the inner class Edge
> 3) Extract a {{Component}} interface so we can have different types of 
> components in the same tree. This allows us to support heterogeneous trees of 
> components.
> 4) Make {{Polygon2D}} and {{Line2D}} instances of the component interface.
> 5) With this change, {{LatLonShapePolygonQuery}} and {{LatLonShapeLineQuery}} 
> can be replaced with one {{LatLonShapeComponentQuery}}.
>  
>  
>  
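
The shared-interface idea in the proposal can be sketched roughly as follows. Everything here is hypothetical and simplified: `ComponentSketch`, `Box`, and `HorizontalSegment` are illustration-only names, and a flat list stands in for the tree structure.

```java
import java.util.List;

final class ComponentSketch {

  /** Hypothetical stand-in for the proposed Component interface. */
  interface Component {
    boolean contains(double x, double y);
  }

  /** Axis-aligned box, standing in for a polygon-like component. */
  static final class Box implements Component {
    private final double minX, maxX, minY, maxY;

    Box(double minX, double maxX, double minY, double maxY) {
      this.minX = minX;
      this.maxX = maxX;
      this.minY = minY;
      this.maxY = maxY;
    }

    @Override
    public boolean contains(double x, double y) {
      return x >= minX && x <= maxX && y >= minY && y <= maxY;
    }
  }

  /** Horizontal segment, standing in for a line-like component. */
  static final class HorizontalSegment implements Component {
    private final double y, minX, maxX;

    HorizontalSegment(double y, double minX, double maxX) {
      this.y = y;
      this.minX = minX;
      this.maxX = maxX;
    }

    @Override
    public boolean contains(double px, double py) {
      return py == y && px >= minX && px <= maxX;
    }
  }

  /** One query path for a mixed collection of components. */
  static boolean anyContains(List<Component> components, double x, double y) {
    for (Component component : components) {
      if (component.contains(x, y)) {
        return true;
      }
    }
    return false;
  }
}
```

Because the query only depends on the interface, a single query class can serve polygon-like and line-like components alike, which is the motivation given for replacing the two shape-specific queries with one.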




[jira] [Updated] (SOLR-13850) Atomic Updates with PreAnalyzedField

2019-10-16 Thread Oleksandr Drapushko (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-13850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleksandr Drapushko updated SOLR-13850:
---
Description: 
If you try to update non pre-analyzed fields in a document using atomic 
updates, data in pre-analyzed fields (if there is any) will be lost.

*Steps to reproduce*

1. Index this document into techproducts
{code:json}
{
  "id": "a",
  "n_s": "s1",
  "pre": 
"{\"v\":\"1\",\"str\":\"Alaska\",\"tokens\":[{\"t\":\"alaska\",\"s\":0,\"e\":6,\"i\":1}]}"
}
{code}

2. Query the document
{code:json}
{
  "response":{"numFound":1,"start":0,"maxScore":1.0,"docs":[
    {
      "id":"a",
      "n_s":"s1",
      "pre":"Alaska",
      "_version_":1647475215142223872}]
}}
{code}

3. Update using atomic syntax
{code:json}
{
  "add": {
    "doc": {
      "id": "a",
      "n_s": {"set": "s2"}
}}}
{code}

4. Observe the warning in solr log
UI:
{noformat}
 WARN x:techproducts_shard2_replica_n6 PreAnalyzedField Error parsing 
pre-analyzed field 'pre'
{noformat}

solr.log:
{noformat}
WARN (qtp1384454980-23) [c:techproducts s:shard2 r:core_node8 
x:techproducts_shard2_replica_n6] o.a.s.s.PreAnalyzedField Error parsing 
pre-analyzed field 'pre' => java.io.IOException: Invalid JSON type 
java.lang.String, expected Map
 at 
org.apache.solr.schema.JsonPreAnalyzedParser.parse(JsonPreAnalyzedParser.java:86)
{noformat}

5. Query the document again
{code:json}
{
  "response":{"numFound":1,"start":0,"maxScore":1.0,"docs":[
    {
      "id":"a",
      "n_s":"s2",
      "_version_":1647475461695995904}]
}}
{code}

*Result*: There is no 'pre' field in the document anymore.


_My thoughts on it_

1. Data loss can be prevented if the warning will be replaced with error 
(re-throwing exception). Atomic updates for such documents still won't work, 
but updates will be explicitly rejected.

2. Solr tries to read the document from index, merge it with input document and 
re-index the document, but when it reads indexed pre-analyzed fields the format 
is different, so Solr cannot parse and re-index those fields properly.

  was:
If you try to update non pre-analyzed fields in a document using atomic 
updates, data in pre-analyzed fields (if there is any) will be lost.

 

Steps to reproduce

1. Index this document into techproducts

{
  "id": "a",
  "n_s": "s1",
  "pre": "{\"v\":\"1\",\"str\":\"Alaska\",\"tokens\":[{\"t\":\"alaska\",\"s\":0,\"e\":6,\"i\":1}]}"
}

2. Query the document

{
  "response":{"numFound":1,"start":0,"maxScore":1.0,"docs":[
    {
      "id":"a",
      "n_s":"s1",
      "pre":"Alaska",
      "_version_":1647475215142223872}]
}}

3. Update using atomic syntax

{
  "add": {
    "doc": {
      "id": "a",
      "n_s": {"set": "s2"}
}}}

4. Observe the warning in solr log

UI:
WARN x:techproducts_shard2_replica_n6 PreAnalyzedField Error parsing 
pre-analyzed field 'pre'

solr.log:
WARN (qtp1384454980-23) [c:techproducts s:shard2 r:core_node8 
x:techproducts_shard2_replica_n6] o.a.s.s.PreAnalyzedField Error parsing 
pre-analyzed field 'pre' => java.io.IOException: Invalid JSON type 
java.lang.String, expected Map
at 
org.apache.solr.schema.JsonPreAnalyzedParser.parse(JsonPreAnalyzedParser.java:86)

5. Query the document again

{
  "response":{"numFound":1,"start":0,"maxScore":1.0,"docs":[
    {
      "id":"a",
      "n_s":"s2",
      "_version_":1647475461695995904}]
}}

Result: There is no 'pre' field in the document anymore.

 

My thoughts on it

1. Data loss can be prevented if the warning will be replaced with error 
(re-throwing exception). Atomic updates for such documents still won't work, 
but updates will be explicitly rejected.

2. Solr tries to read the document from index, merge it with input document and 
re-index the document, but when it reads indexed pre-analyzed fields the format 
is different, so Solr cannot parse and re-index those fields properly.


> Atomic Updates with PreAnalyzedField
> 
>
> Key: SOLR-13850
> URL: https://issues.apache.org/jira/browse/SOLR-13850
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 7.7.2, 8.2
> Environment: Ubuntu 16.04 LTS / Java 8 (Zulu), Windows 10 / Java 11 
> (Oracle)
>Reporter: Oleksandr Drapushko
>Priority: Critical
>  Labels: AtomicUpdate
>
> If you try to update non pre-analyzed fields in a document using atomic 
> updates, data in pre-analyzed fields (if there is any) will be lost.
> *Steps to reproduce*
> 1. Index this document into techproducts
> {code:json}
> {
>   "id": "a",
>   "n_s": "s1",
>   "pre": 
> "{\"v\":\"1\",\"str\":\"Alaska\",\"tokens\":[{\"t\":\"alaska\",\"s\":0,\"e\":6,\"i\":1}]}"
> }
> {code}
> 2. Query the document
> {code:json}
> {
>   "response":{"numFound":1,"start":0,"maxScore":1.0,"docs":[
>     {
>       "id":

[GitHub] [lucene-solr] cbuescher commented on a change in pull request #913: LUCENE-8995: TopSuggestDocsCollector#collect should be able to signal rejection

2019-10-16 Thread GitBox

cbuescher commented on a change in pull request #913: LUCENE-8995: 
TopSuggestDocsCollector#collect should be able to signal rejection
URL: https://github.com/apache/lucene-solr/pull/913#discussion_r335372402
 
 

 ##
 File path: 
lucene/suggest/src/test/org/apache/lucene/search/suggest/document/TestPrefixCompletionQuery.java
 ##
 @@ -253,6 +263,123 @@ public void testDocFiltering() throws Exception {
 iw.close();
   }
 
+  /**
+   * Test that the correct amount of documents are collected if using a 
collector that also rejects documents.
+   */
+  public void testCollectorThatRejects() throws Exception {
+// use synonym analyzer to have multiple paths to same suggested document. 
This mock adds "dog" as synonym for "dogs"
+Analyzer analyzer = new MockSynonymAnalyzer();
+RandomIndexWriter iw = new RandomIndexWriter(random(), dir, 
iwcWithSuggestField(analyzer, "suggest_field"));
+List expectedResults = new ArrayList();
+
+for (int docCount = 10; docCount > 0; docCount--) {
+  Document document = new Document();
+  String value = "ab" + docCount + " dogs";
+  document.add(new SuggestField("suggest_field", value, docCount));
+  expectedResults.add(new Entry(value, docCount));
+  iw.addDocument(document);
+}
+
+if (rarely()) {
+  iw.commit();
+}
+
+DirectoryReader reader = iw.getReader();
+SuggestIndexSearcher indexSearcher = new SuggestIndexSearcher(reader);
+
+PrefixCompletionQuery query = new PrefixCompletionQuery(analyzer, new 
Term("suggest_field", "ab"));
+int topN = 5;
+
+// use a TopSuggestDocsCollector that rejects results with duplicate docIds
+TopSuggestDocsCollector collector = new TopSuggestDocsCollector(topN, 
false) {
+
+  private Set seenDocIds = new HashSet<>();
+
+  @Override
+  public boolean collect(int docID, CharSequence key, CharSequence 
context, float score) throws IOException {
+  int globalDocId = docID + docBase;
+  boolean collected = false;
+  if (seenDocIds.contains(globalDocId) == false) {
 
 Review comment:
   The collector is called multiple times with the same docID because of the 
MockSynonymAnalyzer used in the test setup which adds "dog" for "dogs", so each 
document has two completion paths. This collector is meant to de-duplicate 
this. I added a note explaining this. This is a simplified version of the 
behaviour we observe in https://github.com/elastic/elasticsearch/issues/46445.
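
 For readers without the full diff, the de-duplication described in this comment could look roughly like the stand-alone sketch below. It does not extend `TopSuggestDocsCollector`; the class `DedupCollectorSketch` and its simplified `collect` signature are assumptions made so that the example runs without Lucene on the classpath.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Stand-alone sketch; class and method names are hypothetical, not Lucene API. */
final class DedupCollectorSketch {
  private final int docBase;
  private final Set<Integer> seenDocIds = new HashSet<>();
  private final List<String> collected = new ArrayList<>();

  DedupCollectorSketch(int docBase) {
    this.docBase = docBase;
  }

  /** Returns true if the suggestion was collected, false if it was rejected as a duplicate. */
  boolean collect(int docID, CharSequence key) {
    int globalDocId = docID + docBase;
    // Set.add returns false when the id was already present.
    if (!seenDocIds.add(globalDocId)) {
      return false;
    }
    collected.add(key.toString());
    return true;
  }

  List<String> collected() {
    return collected;
  }
}
```

 A second completion path for the same document (for example the "dog" synonym of "dogs") arrives with the same docID and is rejected, so the caller needs the boolean result to know that one of the topN slots is still free.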



[jira] [Updated] (SOLR-13850) Atomic Updates with PreAnalyzedField

2019-10-16 Thread Oleksandr Drapushko (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-13850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleksandr Drapushko updated SOLR-13850:
---
Description: 
If you try to update non pre-analyzed fields in a document using atomic 
updates, data in pre-analyzed fields (if there is any) will be lost.

 

*Steps to reproduce*

1. Index this document into techproducts

{
  "id": "a",
  "n_s": "s1",
  "pre": "{\"v\":\"1\",\"str\":\"Alaska\",\"tokens\":[{\"t\":\"alaska\",\"s\":0,\"e\":6,\"i\":1}]}"
}

2. Query the document

{
  "response":{"numFound":1,"start":0,"maxScore":1.0,"docs":[
    {
      "id":"a",
      "n_s":"s1",
      "pre":"Alaska",
      "_version_":1647475215142223872}]
}}

3. Update using atomic syntax

{
  "add": {
    "doc": {
      "id": "a",
      "n_s": {"set": "s2"}
}}}

4. Observe the warning in solr log

UI:
 WARN x:techproducts_shard2_replica_n6 PreAnalyzedField Error parsing 
pre-analyzed field 'pre'

solr.log:
 WARN (qtp1384454980-23) [c:techproducts s:shard2 r:core_node8 
x:techproducts_shard2_replica_n6] o.a.s.s.PreAnalyzedField Error parsing 
pre-analyzed field 'pre' => java.io.IOException: Invalid JSON type 
java.lang.String, expected Map
 at 
org.apache.solr.schema.JsonPreAnalyzedParser.parse(JsonPreAnalyzedParser.java:86)

5. Query the document again

{
  "response":{"numFound":1,"start":0,"maxScore":1.0,"docs":[
    {
      "id":"a",
      "n_s":"s2",
      "_version_":1647475461695995904}]
}}

*Result*: There is no 'pre' field in the document anymore.

 

_My thoughts on it_

1. Data loss can be prevented if the warning will be replaced with error 
(re-throwing exception). Atomic updates for such documents still won't work, 
but updates will be explicitly rejected.

2. Solr tries to read the document from index, merge it with input document and 
re-index the document, but when it reads indexed pre-analyzed fields the format 
is different, so Solr cannot parse and re-index those fields properly.

  was:
If you try to update non pre-analyzed fields in a document using atomic 
updates, data in pre-analyzed fields (if there is any) will be lost.


*Steps to reproduce*

1. Index this document into techproducts

{
  "id": "a",
  "n_s": "s1",
  "pre": "{\"v\":\"1\",\"str\":\"Alaska\",\"tokens\":[{\"t\":\"alaska\",\"s\":0,\"e\":6,\"i\":1}]}"
}

2. Query the document
{
  "response":{"numFound":1,"start":0,"maxScore":1.0,"docs":[
    {
      "id":"a",
      "n_s":"s1",
      "pre":"Alaska",
      "_version_":1647475215142223872}]
}}

3. Update using atomic syntax
{
  "add": {
    "doc": {
      "id": "a",
      "n_s": {"set": "s2"}
}}}

4. Observe the warning in solr log
UI:
WARN x:techproducts_shard2_replica_n6 PreAnalyzedField Error parsing 
pre-analyzed field 'pre'

solr.log:
WARN (qtp1384454980-23) [c:techproducts s:shard2 r:core_node8 
x:techproducts_shard2_replica_n6] o.a.s.s.PreAnalyzedField Error parsing 
pre-analyzed field 'pre' => java.io.IOException: Invalid JSON type 
java.lang.String, expected Map
at 
org.apache.solr.schema.JsonPreAnalyzedParser.parse(JsonPreAnalyzedParser.java:86)

5. Query the document again
{
  "response":{"numFound":1,"start":0,"maxScore":1.0,"docs":[
    {
      "id":"a",
      "n_s":"s2",
      "_version_":1647475461695995904}]
 }}

*Result*: There is no 'pre' field in the document anymore.


_My thoughts on it_

1. Data loss can be prevented if the warning will be replaced with error 
(re-throwing exception). Atomic updates for such documents still won't work, 
but updates will be explicitly rejected.

2. Solr tries to read the document from index, merge it with input document and 
re-index the document, but when it reads indexed pre-analyzed fields the format 
is different, so Solr cannot parse and re-index those fields properly.

Environment: Ubuntu 16.04 LTS / Java 8 (Zulu), Windows 10 / Java 11 
(Oracle)  (was: Ubuntu 16.04 LTS, Java 8 (Zulu)

Windows 10, Java 11 (Oracle))

> Atomic Updates with PreAnalyzedField
> 
>
> Key: SOLR-13850
> URL: https://issues.apache.org/jira/browse/SOLR-13850
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 7.7.2, 8.2
> Environment: Ubuntu 16.04 LTS / Java 8 (Zulu), Windows 10 / Java 11 
> (Oracle)
>Reporter: Oleksandr Drapushko
>Priority: Critical
>  Labels: AtomicUpdate
>
> If you try to update non pre-analyzed fields in a document using atomic 
> updates, data in pre-analyzed fields (if there is any) will be lost.
>  
> *Steps to reproduce*
> 1. Index this document into techproducts
> {{{}}
> {{  "id": "a",}}
> {{  "n_s": "s1",}}
> {{  "pre": 
> "\{\"v\":\"1\",\"str\":\"Alaska\",\"tokens\":[{\"t\":\"alaska\",\"s\":

[jira] [Updated] (SOLR-13850) Atomic Updates with PreAnalyzedField

2019-10-16 Thread Oleksandr Drapushko (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-13850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleksandr Drapushko updated SOLR-13850:
---
Description: 
If you try to update non pre-analyzed fields in a document using atomic 
updates, data in pre-analyzed fields (if there is any) will be lost.

 

Steps to reproduce

1. Index this document into techproducts

{
  "id": "a",
  "n_s": "s1",
  "pre": "{\"v\":\"1\",\"str\":\"Alaska\",\"tokens\":[{\"t\":\"alaska\",\"s\":0,\"e\":6,\"i\":1}]}"
}

2. Query the document

{
  "response":{"numFound":1,"start":0,"maxScore":1.0,"docs":[
    {
      "id":"a",
      "n_s":"s1",
      "pre":"Alaska",
      "_version_":1647475215142223872}]
}}

3. Update using atomic syntax

{
  "add": {
    "doc": {
      "id": "a",
      "n_s": {"set": "s2"}
}}}

4. Observe the warning in solr log

UI:
WARN x:techproducts_shard2_replica_n6 PreAnalyzedField Error parsing 
pre-analyzed field 'pre'

solr.log:
WARN (qtp1384454980-23) [c:techproducts s:shard2 r:core_node8 
x:techproducts_shard2_replica_n6] o.a.s.s.PreAnalyzedField Error parsing 
pre-analyzed field 'pre' => java.io.IOException: Invalid JSON type 
java.lang.String, expected Map
at 
org.apache.solr.schema.JsonPreAnalyzedParser.parse(JsonPreAnalyzedParser.java:86)

5. Query the document again

{
  "response":{"numFound":1,"start":0,"maxScore":1.0,"docs":[
    {
      "id":"a",
      "n_s":"s2",
      "_version_":1647475461695995904}]
}}

Result: There is no 'pre' field in the document anymore.

 

My thoughts on it

1. Data loss can be prevented if the warning will be replaced with error 
(re-throwing exception). Atomic updates for such documents still won't work, 
but updates will be explicitly rejected.

2. Solr tries to read the document from index, merge it with input document and 
re-index the document, but when it reads indexed pre-analyzed fields the format 
is different, so Solr cannot parse and re-index those fields properly.

  was:
If you try to update non pre-analyzed fields in a document using atomic 
updates, data in pre-analyzed fields (if there is any) will be lost.

 

*Steps to reproduce*

1. Index this document into techproducts

{
  "id": "a",
  "n_s": "s1",
  "pre": "{\"v\":\"1\",\"str\":\"Alaska\",\"tokens\":[{\"t\":\"alaska\",\"s\":0,\"e\":6,\"i\":1}]}"
}

2. Query the document

{
  "response":{"numFound":1,"start":0,"maxScore":1.0,"docs":[
    {
      "id":"a",
      "n_s":"s1",
      "pre":"Alaska",
      "_version_":1647475215142223872}]
}}

3. Update using atomic syntax

{
  "add": {
    "doc": {
      "id": "a",
      "n_s": {"set": "s2"}
}}}

4. Observe the warning in solr log

UI:
 WARN x:techproducts_shard2_replica_n6 PreAnalyzedField Error parsing 
pre-analyzed field 'pre'

solr.log:
 WARN (qtp1384454980-23) [c:techproducts s:shard2 r:core_node8 
x:techproducts_shard2_replica_n6] o.a.s.s.PreAnalyzedField Error parsing 
pre-analyzed field 'pre' => java.io.IOException: Invalid JSON type 
java.lang.String, expected Map
 at 
org.apache.solr.schema.JsonPreAnalyzedParser.parse(JsonPreAnalyzedParser.java:86)

5. Query the document again

{
  "response":{"numFound":1,"start":0,"maxScore":1.0,"docs":[
    {
      "id":"a",
      "n_s":"s2",
      "_version_":1647475461695995904}]
}}

*Result*: There is no 'pre' field in the document anymore.

 

_My thoughts on it_

1. Data loss can be prevented if the warning will be replaced with error 
(re-throwing exception). Atomic updates for such documents still won't work, 
but updates will be explicitly rejected.

2. Solr tries to read the document from index, merge it with input document and 
re-index the document, but when it reads indexed pre-analyzed fields the format 
is different, so Solr cannot parse and re-index those fields properly.


> Atomic Updates with PreAnalyzedField
> 
>
> Key: SOLR-13850
> URL: https://issues.apache.org/jira/browse/SOLR-13850
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 7.7.2, 8.2
> Environment: Ubuntu 16.04 LTS / Java 8 (Zulu), Windows 10 / Java 11 
> (Oracle)
>Reporter: Oleksandr Drapushko
>Priority: Critical
>  Labels: AtomicUpdate
>
> If you try to update non pre-analyzed fields in a document using atomic 
> updates, data in pre-analyzed fields (if there is any) will be lost.
>  
> Steps to reproduce
> 1. Index this document into techproducts
> {
>   "id": "a",
>   "n_s": "s1",
>   "pre": "{\"v\":\"1\",\"str\":\"Alaska\",\"tokens\":[{\"t\":\"alaska\",\"s\":0,\"e\":6,\"i\":1}]}"
> }
> 2. Query the document
> {
>   "response":{"numFound":1,"start":0,"maxScore":1.0,"docs":[
>     {
>       "id":"a",
>       "n_s":"s1",
>

[jira] [Created] (SOLR-13850) Atomic Updates with PreAnalyzedField

2019-10-16 Thread Oleksandr Drapushko (Jira)

Oleksandr Drapushko created SOLR-13850:
--

 Summary: Atomic Updates with PreAnalyzedField
 Key: SOLR-13850
 URL: https://issues.apache.org/jira/browse/SOLR-13850
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
Affects Versions: 8.2, 7.7.2
 Environment: Ubuntu 16.04 LTS, Java 8 (Zulu)

Windows 10, Java 11 (Oracle)
Reporter: Oleksandr Drapushko


If you try to update non pre-analyzed fields in a document using atomic 
updates, data in pre-analyzed fields (if there is any) will be lost.


*Steps to reproduce*

1. Index this document into techproducts

{
  "id": "a",
  "n_s": "s1",
  "pre": "{\"v\":\"1\",\"str\":\"Alaska\",\"tokens\":[{\"t\":\"alaska\",\"s\":0,\"e\":6,\"i\":1}]}"
}

2. Query the document
{
  "response":{"numFound":1,"start":0,"maxScore":1.0,"docs":[
    {
      "id":"a",
      "n_s":"s1",
      "pre":"Alaska",
      "_version_":1647475215142223872}]
}}

3. Update using atomic syntax
{
  "add": {
    "doc": {
      "id": "a",
      "n_s": {"set": "s2"}
}}}

4. Observe the warning in solr log
UI:
WARN x:techproducts_shard2_replica_n6 PreAnalyzedField Error parsing 
pre-analyzed field 'pre'

solr.log:
WARN (qtp1384454980-23) [c:techproducts s:shard2 r:core_node8 
x:techproducts_shard2_replica_n6] o.a.s.s.PreAnalyzedField Error parsing 
pre-analyzed field 'pre' => java.io.IOException: Invalid JSON type 
java.lang.String, expected Map
at 
org.apache.solr.schema.JsonPreAnalyzedParser.parse(JsonPreAnalyzedParser.java:86)

5. Query the document again
{
  "response":{"numFound":1,"start":0,"maxScore":1.0,"docs":[
    {
      "id":"a",
      "n_s":"s2",
      "_version_":1647475461695995904}]
 }}

*Result*: There is no 'pre' field in the document anymore.


_My thoughts on it_

1. Data loss can be prevented if the warning will be replaced with error 
(re-throwing exception). Atomic updates for such documents still won't work, 
but updates will be explicitly rejected.

2. Solr tries to read the document from index, merge it with input document and 
re-index the document, but when it reads indexed pre-analyzed fields the format 
is different, so Solr cannot parse and re-index those fields properly.
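
The two thoughts above can be pictured with a small stand-alone model. It is only a sketch under stated assumptions: `PreAnalyzedRoundTripSketch`, `parsePreAnalyzed`, and `reindex` are hypothetical names, the "parser" merely checks for a leading `{`, and an unchecked `IllegalArgumentException` stands in for the `IOException` seen in the log. None of this is Solr's actual update code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class PreAnalyzedRoundTripSketch {

  /** Toy parser: only accepts values that look like a JSON object. */
  static String parsePreAnalyzed(String value) {
    String trimmed = value.trim();
    if (!trimmed.startsWith("{")) {
      throw new IllegalArgumentException("Invalid JSON type java.lang.String, expected Map");
    }
    return trimmed;
  }

  /**
   * Rebuilds a document from its stored fields, as an atomic update would.
   * lenient = true models "log a warning and carry on": the unparsable field is dropped.
   * lenient = false models the suggested fix: the failure is re-thrown and the update is rejected.
   */
  static Map<String, String> reindex(Map<String, String> stored, String preField, boolean lenient) {
    Map<String, String> out = new LinkedHashMap<>();
    for (Map.Entry<String, String> field : stored.entrySet()) {
      if (!field.getKey().equals(preField)) {
        out.put(field.getKey(), field.getValue());
        continue;
      }
      try {
        out.put(preField, parsePreAnalyzed(field.getValue()));
      } catch (IllegalArgumentException e) {
        if (!lenient) {
          throw e;
        }
        // Lenient mode: the field is silently left out of the new document.
      }
    }
    return out;
  }
}
```

Feeding it the stored form of the document (where `pre` is just `Alaska`) reproduces both outcomes: in lenient mode the rebuilt document has no `pre` field, and in strict mode the update fails instead of losing data.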



