[jira] [Commented] (SOLR-14659) Remove restlet from Solr
[ https://issues.apache.org/jira/browse/SOLR-14659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17206601#comment-17206601 ] Noble Paul commented on SOLR-14659: --- This does not introduce any backward incompatible change. We should be able to commit this to 8.x > Remove restlet from Solr > > > Key: SOLR-14659 > URL: https://issues.apache.org/jira/browse/SOLR-14659 > Project: Solr > Issue Type: Improvement >Reporter: Noble Paul >Assignee: Timothy Potter >Priority: Major > Fix For: master (9.0) > > Time Spent: 1h > Remaining Estimate: 0h > > restlet is only used by managed resources. We can support that even without a > restlet. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1938: SOLR-14659: Remove restlet as dependency for the ManagedResource API
dsmiley commented on a change in pull request #1938: URL: https://github.com/apache/lucene-solr/pull/1938#discussion_r499113760

## File path: solr/core/src/java/org/apache/solr/rest/RestManager.java ##

@@ -326,44 +327,46 @@ public void doInit() throws ResourceException { } } }
   if (managedResource == null) {
-    if (Method.PUT.equals(getMethod()) || Method.POST.equals(getMethod())) {
+    final String method = getSolrRequest().getHttpMethod();
+    if ("PUT".equals(method) || "POST".equals(method)) {

Review comment: If you click the details link, it explains. It's pretty wild what it suggests... it's a stretch IMO. Good luck pulling that attack off. Besides, this is the HTTP method (fixed vocab); it's not a param. Our use of Muse here is very new; we haven't tweaked `.muse/config.toml` yet, and it needs some taming. https://docs.muse.dev/docs/configuring-muse/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
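The change under review swaps restlet's `Method` constants for plain HTTP method strings. A minimal sketch of the resulting check, with illustrative names (`isWriteMethod` is not the actual `RestManager` API; in the PR the string comes from `getSolrRequest().getHttpMethod()`):

```java
// Sketch of the string-based HTTP method dispatch adopted in the PR.
// HTTP methods are a fixed vocabulary, so an exact string comparison is
// sufficient here -- the point dsmiley makes about the Muse warning.
public class MethodCheckSketch {
    static boolean isWriteMethod(String httpMethod) {
        return "PUT".equals(httpMethod) || "POST".equals(httpMethod);
    }

    public static void main(String[] args) {
        System.out.println(isWriteMethod("PUT"));  // prints: true
        System.out.println(isWriteMethod("GET"));  // prints: false
    }
}
```

Note that `equals` is called on the constant, so a null method string simply yields `false` instead of a NullPointerException.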
[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1938: SOLR-14659: Remove restlet as dependency for the ManagedResource API
dsmiley commented on a change in pull request #1938: URL: https://github.com/apache/lucene-solr/pull/1938#discussion_r499113148

## File path: solr/core/src/java/org/apache/solr/rest/RestManager.java ##

@@ -16,15 +16,33 @@ */ package org.apache.solr.rest;
+import org.apache.solr.common.SolrException;

Review comment: I'd prefer you configure your IDE to keep java.* up front. FWIW this is in the IntelliJ config in the project.
[GitHub] [lucene-solr] thelabdude commented on pull request #1938: SOLR-14659: Remove restlet as dependency for the ManagedResource API
thelabdude commented on pull request #1938: URL: https://github.com/apache/lucene-solr/pull/1938#issuecomment-703034422 Ok, I'm good with that, mainly just wanted some buy-in from others ;-) Unless there are any objections, I'll move ahead with merging to master and 8.x ... thanks for the help @noblepaul
[jira] [Commented] (SOLR-14659) Remove restlet from Solr
[ https://issues.apache.org/jira/browse/SOLR-14659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17206585#comment-17206585 ] Noble Paul commented on SOLR-14659: --- I wish we could get rid of the {{RestManager}} name itself. There is no "REST" being managed; basically, it is just "managed resources". It has nothing to do with REST. But we will deal with that in another ticket. > Remove restlet from Solr > > > Key: SOLR-14659 > URL: https://issues.apache.org/jira/browse/SOLR-14659 > Project: Solr > Issue Type: Improvement >Reporter: Noble Paul >Assignee: Timothy Potter >Priority: Major > Fix For: master (9.0) > > Time Spent: 0.5h > Remaining Estimate: 0h > > restlet is only used by managed resources. We can support that even without a > restlet.
[GitHub] [lucene-solr] noblepaul commented on pull request #1938: SOLR-14659: Remove restlet as dependency for the ManagedResource API
noblepaul commented on pull request #1938: URL: https://github.com/apache/lucene-solr/pull/1938#issuecomment-703029595 @thelabdude if all the tests pass, I'd like this to get committed to 8.x itself.
[jira] [Commented] (SOLR-14749) Provide a clean API for cluster-level event processing
[ https://issues.apache.org/jira/browse/SOLR-14749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17206584#comment-17206584 ] Noble Paul commented on SOLR-14749: --- {quote}I had the impression we are having a proper discussion, both here and on the PR. Your -1 means you have serious technical objections, correct? {quote} My objections are not to the ticket; I'm objecting to the PR. I see too many different changes being made, and this is not healthy. I request you to make focused PRs (if required, make sub-JIRAs) so that others can make meaningful suggestions. > Provide a clean API for cluster-level event processing > -- > > Key: SOLR-14749 > URL: https://issues.apache.org/jira/browse/SOLR-14749 > Project: Solr > Issue Type: Improvement > Components: AutoScaling >Reporter: Andrzej Bialecki >Assignee: Andrzej Bialecki >Priority: Major > Labels: clean-api > Fix For: master (9.0) > > Time Spent: 14h 50m > Remaining Estimate: 0h > > This is a companion issue to SOLR-14613 and it aims at providing a clean, > strongly typed API for the functionality formerly known as "triggers" - that > is, a component for generating cluster-level events corresponding to changes > in the cluster state, and a pluggable API for processing these events. > The 8x triggers have been removed so this functionality is currently missing > in 9.0. However, this functionality is crucial for implementing the automatic > collection repair and re-balancing as the cluster state changes (nodes going > down / up, becoming overloaded / unused / decommissioned, etc). > For this reason we need this API and a default implementation of triggers > that at least can perform automatic collection repair (maintaining the > desired replication factor in presence of live node changes). > As before, the actual changes to the collections will be executed using > existing CollectionAdmin API, which in turn may use the placement plugins > from SOLR-14613. > h3.
Division of responsibility > * built-in Solr components (non-pluggable): > ** cluster state monitoring and event generation, > ** simple scheduler to periodically generate scheduled events > * plugins: > ** automatic collection repair on {{nodeLost}} events (provided by default) > ** re-balancing of replicas (periodic or on {{nodeAdded}} events) > ** reporting (eg. requesting additional node provisioning) > ** scheduled maintenance (eg. removing inactive shards after split) > h3. Other considerations > These plugins (unlike the placement plugins) need to execute on one > designated node in the cluster. Currently the easiest way to implement this > is to run them on the Overseer leader node.
[jira] [Updated] (SOLR-14911) Logger : UpdateLog Error Message : java.io.EOFException
[ https://issues.apache.org/jira/browse/SOLR-14911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] D updated SOLR-14911: - Summary: Logger : UpdateLog Error Message : java.io.EOFException (was: Looger : UpdateLog Error Message : java.io.EOFException) > Logger : UpdateLog Error Message : java.io.EOFException > --- > > Key: SOLR-14911 > URL: https://issues.apache.org/jira/browse/SOLR-14911 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrCLI >Affects Versions: 7.5 >Reporter: D >Priority: Blocker > Attachments: GC events.PNG > > > *Events:* > # GC logs showing continuous Full GC events. Log report attached. > # Core filling failed, showing less data than expected. > # The following warnings showed on the dashboard before the error. > |Level|Logger|Message| > |WARN false|ManagedIndexSchemaFactory|The schema has been upgraded to > managed, but the non-managed schema schema.xml is still loadable. > PLEASE REMOVE THIS FILE.| > |WARN false|ManagedIndexSchemaFactory|The schema has been upgraded to > managed, but the non-managed schema schema.xml is still loadable. > PLEASE REMOVE THIS FILE.| > |WARN false|SolrResourceLoader|Solr loaded a deprecated plugin/analysis class > [solr.TrieDateField]. Please consult documentation how to replace it > accordingly.| > |WARN false|ManagedIndexSchemaFactory|The schema has been upgraded to > managed, but the non-managed schema schema.xml is still loadable. > PLEASE REMOVE THIS FILE.| > |WARN false|UpdateLog|Starting log replay > tlog\{file=\data\tlog\tlog.0445482 refcount=2} > active=false starting pos=0 inSortedOrder=false| > * Total data in all cores around 8 GB > * *Other Configurations:* > ** -XX:+UseG1GC > ** -XX:+UseStringDeduplication > ** -XX:MaxGCPauseMillis=500 > ** -Xms15g > ** -Xmx15g > ** -Xss256k > * *OS Environment :* > ** Windows 10, > ** Filling cores by calling SQL query using jtds-1.3.1 library.
> ** Solr Version 7.5 > ** Runtime: Oracle Corporation OpenJDK 64-Bit Server VM 11.0.2 11.0.2+9 > ** Processors : 48 > ** System Physical Memory : 128 GB > ** Swap Space : 256GB > * solr-spec7.5.0 > ** solr-impl7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - > 2018-09-18 13:07:55 > * lucene-spec7.5.0 > ** lucene-impl7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - > 2018-09-18 13:01:1 > *Error Message :* > java.io.EOFException > at > org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:168) > at org.apache.solr.common.util.JavaBinCodec.readStr(JavaBinCodec.java:863) > at org.apache.solr.common.util.JavaBinCodec.readStr(JavaBinCodec.java:857) > at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:266) > at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:256) > at > org.apache.solr.common.util.JavaBinCodec.readSolrInputDocument(JavaBinCodec.java:603) > at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:315) > at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:256) > at org.apache.solr.common.util.JavaBinCodec.readArray(JavaBinCodec.java:747) > at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:272) > at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:256) > at > org.apache.solr.update.TransactionLog$LogReader.next(TransactionLog.java:673) > at org.apache.solr.update.UpdateLog$LogReplayer.doReplay(UpdateLog.java:1832) > at org.apache.solr.update.UpdateLog$LogReplayer.run(UpdateLog.java:1747) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:834)
[jira] [Commented] (LUCENE-9554) Expose pendingNumDocs from IndexWriter
[ https://issues.apache.org/jira/browse/LUCENE-9554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17206550#comment-17206550 ] ASF subversion and git services commented on LUCENE-9554: - Commit 77396dbf33944502814bc993c3c9db84b974 in lucene-solr's branch refs/heads/branch_8x from Nhat Nguyen [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=77396db ] LUCENE-9554: Expose IndexWriter#pendingNumDocs (#1941) Some applications can use the pendingNumDocs from IndexWriter to estimate that the number of documents of an index is very close to the hard limit so that it can reject writes without constructing Lucene documents. > Expose pendingNumDocs from IndexWriter > -- > > Key: LUCENE-9554 > URL: https://issues.apache.org/jira/browse/LUCENE-9554 > Project: Lucene - Core > Issue Type: Improvement > Components: core/index >Reporter: Nhat Nguyen >Priority: Minor > Time Spent: 1.5h > Remaining Estimate: 0h > > Some applications can use the pendingNumDocs from IndexWriter to estimate > that the number of documents of an index is reaching the hard limit so > that it can reject writes without constructing Lucene documents.
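As an illustration of the use case in the commit message, a caller could gate incoming batches on a pending-docs counter before building any Lucene documents. The sketch below models that check in plain Java; the class, method names, and the limit value are assumptions for the example, not the Lucene API (in Lucene the count would come from `IndexWriter` and the limit is `IndexWriter.MAX_DOCS`).

```java
// Illustrative model (not Lucene code) of rejecting writes up-front when
// an IndexWriter-style pending-docs count would exceed the hard limit.
public class CapacityGate {
    private final long maxDocs;    // stands in for IndexWriter.MAX_DOCS
    private long pendingNumDocs;   // stands in for the exposed pendingNumDocs

    public CapacityGate(long maxDocs, long pendingNumDocs) {
        this.maxDocs = maxDocs;
        this.pendingNumDocs = pendingNumDocs;
    }

    // Reject a batch before constructing any documents if it would
    // push the index past the limit.
    public boolean tryReserve(int batchSize) {
        if (pendingNumDocs + batchSize > maxDocs) {
            return false;  // caller fails the write early and cheaply
        }
        pendingNumDocs += batchSize;
        return true;
    }

    public static void main(String[] args) {
        CapacityGate gate = new CapacityGate(100, 95);
        System.out.println(gate.tryReserve(5));  // prints: true
        System.out.println(gate.tryReserve(1));  // prints: false
    }
}
```

The point of exposing the counter is exactly this early-rejection path: the expensive document-construction step is skipped entirely when the reservation fails.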
[jira] [Resolved] (LUCENE-9554) Expose pendingNumDocs from IndexWriter
[ https://issues.apache.org/jira/browse/LUCENE-9554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nhat Nguyen resolved LUCENE-9554. - Fix Version/s: 8.7 master (9.0) Resolution: Fixed > Expose pendingNumDocs from IndexWriter > -- > > Key: LUCENE-9554 > URL: https://issues.apache.org/jira/browse/LUCENE-9554 > Project: Lucene - Core > Issue Type: Improvement > Components: core/index >Reporter: Nhat Nguyen >Priority: Minor > Fix For: master (9.0), 8.7 > > Time Spent: 1.5h > Remaining Estimate: 0h > > Some applications can use the pendingNumDocs from IndexWriter to estimate > that the number of documents of an index is reaching the hard limit so > that it can reject writes without constructing Lucene documents.
[GitHub] [lucene-solr] dnhatn merged pull request #1944: LUCENE-9554: Expose IndexWriter#pendingNumDocs
dnhatn merged pull request #1944: URL: https://github.com/apache/lucene-solr/pull/1944
[GitHub] [lucene-solr] dnhatn opened a new pull request #1944: LUCENE-9554: Expose IndexWriter#pendingNumDocs
dnhatn opened a new pull request #1944: URL: https://github.com/apache/lucene-solr/pull/1944 Some applications can use the pendingNumDocs from IndexWriter to estimate that the number of documents of an index is very close to the hard limit so that it can reject writes without constructing Lucene documents. Backport of #1941
[jira] [Commented] (LUCENE-9554) Expose pendingNumDocs from IndexWriter
[ https://issues.apache.org/jira/browse/LUCENE-9554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17206528#comment-17206528 ] ASF subversion and git services commented on LUCENE-9554: - Commit 7e04e4d0ca3951e90c064f2fd04ca89772080c91 in lucene-solr's branch refs/heads/master from Nhat Nguyen [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=7e04e4d ] LUCENE-9554: Expose IndexWriter#pendingNumDocs (#1941) Some applications can use the pendingNumDocs from IndexWriter to estimate that the number of documents of an index is very close to the hard limit so that it can reject writes without constructing Lucene documents. > Expose pendingNumDocs from IndexWriter > -- > > Key: LUCENE-9554 > URL: https://issues.apache.org/jira/browse/LUCENE-9554 > Project: Lucene - Core > Issue Type: Improvement > Components: core/index >Reporter: Nhat Nguyen >Priority: Minor > Time Spent: 1h 10m > Remaining Estimate: 0h > > Some applications can use the pendingNumDocs from IndexWriter to estimate > that the number of documents of an index is reaching the hard limit so > that it can reject writes without constructing Lucene documents.
[GitHub] [lucene-solr] dnhatn merged pull request #1941: LUCENE-9554: Expose IndexWriter#pendingNumDocs
dnhatn merged pull request #1941: URL: https://github.com/apache/lucene-solr/pull/1941
[GitHub] [lucene-solr] dnhatn commented on pull request #1941: LUCENE-9554: Expose IndexWriter#pendingNumDocs
dnhatn commented on pull request #1941: URL: https://github.com/apache/lucene-solr/pull/1941#issuecomment-702967476 Thanks Mike!
[jira] [Commented] (SOLR-14766) Deprecate ManagedResources from Solr
[ https://issues.apache.org/jira/browse/SOLR-14766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17206513#comment-17206513 ] Timothy Potter commented on SOLR-14766: --- I moved the work over to SOLR-14659. The issue with removing the ManagedResources framework is that the LTR contrib relies on it for model / feature management. Of course we can change the LTR code to do something different, but for now it seems sufficient to keep ManagedResources but w/o Restlet. > Deprecate ManagedResources from Solr > > > Key: SOLR-14766 > URL: https://issues.apache.org/jira/browse/SOLR-14766 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Noble Paul >Assignee: Timothy Potter >Priority: Major > Labels: deprecation > Attachments: SOLR-14766.patch > > Time Spent: 1h > Remaining Estimate: 0h > > This feature has the following problems. > * It's insecure because it is using restlet > * Nobody knows that code enough to even remove the restlet dependency > * The Restlet dependency on Solr exists just because of this > We should deprecate this from 8.7 and remove it from master
[jira] [Updated] (SOLR-14659) Remove restlet from Solr
[ https://issues.apache.org/jira/browse/SOLR-14659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Potter updated SOLR-14659: -- Status: Patch Available (was: Open) > Remove restlet from Solr > > > Key: SOLR-14659 > URL: https://issues.apache.org/jira/browse/SOLR-14659 > Project: Solr > Issue Type: Improvement >Reporter: Noble Paul >Assignee: Timothy Potter >Priority: Major > Fix For: master (9.0) > > Time Spent: 20m > Remaining Estimate: 0h > > restlet is only used by managed resources. We can support that even without a > restlet.
[jira] [Commented] (SOLR-10321) Unified highlighter returns empty fields when using glob
[ https://issues.apache.org/jira/browse/SOLR-10321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17206507#comment-17206507 ] David Smiley commented on SOLR-10321: - Looking back at this... I think I'm inclined to just drop the empty snippet arrays. It wasn't, and still won't be, possible to know whether the document has a field value just by looking at the highlighting output, but that's better than needless empty output. I filed a LUCENE-side issue for the UH to address this, but I'd rather wait to address it until a real user comes along with the need, rather than a hypothetical one. > Unified highlighter returns empty fields when using glob > > > Key: SOLR-10321 > URL: https://issues.apache.org/jira/browse/SOLR-10321 > Project: Solr > Issue Type: Bug > Components: highlighter >Affects Versions: 6.4.2 >Reporter: Markus Jelsma >Priority: Minor > Fix For: 7.0 > > > {code} > q=lama=unified=content_* > {code} > returns: > {code} >name="http://www.nu.nl/weekend/3771311/dalai-lama-inspireert-westen.html;> > > > Nobelprijs Voorafgaand aan zijn bezoek aan Nederland is de dalai > <em>lama</em> in Noorwegen om te vieren dat 25 jaar geleden de > Nobelprijs voor de Vrede aan hem werd toegekend. Anders dan in Nederland > wordt de dalai <em>lama</em> niet ontvangen in het Noorse > parlement. > > > > > > > > > > > > {code} > FastVector and original do not emit: > {code} > > > > > > > > > {code}
[GitHub] [lucene-solr] dsmiley commented on pull request #1942: SOLR-14910: Use in-line tags for logger declarations in Gradle ValidateLogCalls that are non-standard, change //logok to //nowarn
dsmiley commented on pull request #1942: URL: https://github.com/apache/lucene-solr/pull/1942#issuecomment-702948130 > it was often an error to have logging calls with exception.getMessage() Ah; right, I agree with this very much and I thought I proposed this rule. > WDYT about adding a reference to SOLR-14523 to the output message? to the validation rule error? Yes; I think referring to JIRA issues is great -- gives people a place to go to learn/understand.
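The rule under discussion (the Gradle ValidateLogCalls check referenced in SOLR-14523) flags log calls built from `exception.getMessage()`. A minimal sketch of why, shown here with `java.util.logging` purely so the example is self-contained; Solr itself uses slf4j, where the preferred shape is `log.error("update failed", e)`:

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class LogCallsSketch {
    private static final Logger log = Logger.getLogger("sketch");

    public static void main(String[] args) {
        Exception e = new IllegalStateException("boom");

        // Discouraged: concatenating getMessage() keeps only the message
        // string; the exception class and the stack trace are lost (and
        // getMessage() can even be null for some exceptions).
        log.severe("update failed: " + e.getMessage());

        // Preferred: pass the throwable itself so the logger records the
        // full stack trace alongside the message.
        log.log(Level.SEVERE, "update failed", e);
    }
}
```

The same distinction holds in slf4j: the overload taking a trailing `Throwable` argument preserves the trace, while formatting the message string into the log line does not.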
[jira] [Created] (LUCENE-9556) UnifiedHighlighter: distinguish no field from no passages in field
David Smiley created LUCENE-9556: Summary: UnifiedHighlighter: distinguish no field from no passages in field Key: LUCENE-9556 URL: https://issues.apache.org/jira/browse/LUCENE-9556 Project: Lucene - Core Issue Type: Improvement Components: modules/highlighter Reporter: David Smiley The UnifiedHighlighter does not distinguish between highlighting a field that is absent on a document and highlighting a field that is present but for which no passages were found (which can happen for a variety of reasons) -- both cases produce null. While not a huge deal in general, it's an annoyance. It can be useful for the user/client to detect that the content was present but not highlightable, so that it might take some default action, like returning the whole field value (possibly to process in some way) or highlighting some other field.
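Until the highlighter exposes that distinction, a client can only apply the same fallback to both null cases. A sketch of the kind of default action the issue describes; the map-based response shape and all names here are simplifications for illustration, not the actual highlighter API:

```java
import java.util.Map;

public class HighlightFallback {
    // Returns the highlighted snippet when one exists; otherwise falls
    // back to the stored field value. A null snippet is ambiguous today:
    // it can mean "field absent" or "field present but no passages" --
    // the two cases LUCENE-9556 wants the UnifiedHighlighter to separate.
    static String display(Map<String, String> stored,
                          Map<String, String> highlights,
                          String field) {
        String snippet = highlights.get(field);
        if (snippet != null) {
            return snippet;  // highlighter produced passages
        }
        return stored.getOrDefault(field, "");
    }

    public static void main(String[] args) {
        Map<String, String> stored = Map.of("body", "plain text");
        // No highlight entry, so the stored value is shown instead.
        System.out.println(display(stored, Map.of(), "body"));  // prints: plain text
    }
}
```

With the proposed distinction, a client could skip the stored-value fallback for truly absent fields and reserve it for fields that matched the query but produced no passages.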
[GitHub] [lucene-solr] murblanc commented on a change in pull request #1758: SOLR-14749: Provide a clean API for cluster-level event processing, Initial draft.
murblanc commented on a change in pull request #1758: URL: https://github.com/apache/lucene-solr/pull/1758#discussion_r498982949

## File path: solr/core/src/java/org/apache/solr/cluster/events/impl/CollectionsRepairEventListener.java ##

@@ -0,0 +1,185 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.cluster.events.impl;
+
+import java.io.IOException;
+import java.lang.invoke.MethodHandles;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.concurrent.atomic.AtomicInteger;
+
+import org.apache.solr.client.solrj.SolrClient;
+import org.apache.solr.client.solrj.cloud.SolrCloudManager;
+import org.apache.solr.client.solrj.request.CollectionAdminRequest;
+import org.apache.solr.cloud.api.collections.Assign;
+import org.apache.solr.cluster.events.ClusterEvent;
+import org.apache.solr.cluster.events.ClusterEventListener;
+import org.apache.solr.cluster.events.NodesDownEvent;
+import org.apache.solr.cluster.events.ReplicasDownEvent;
+import org.apache.solr.common.cloud.ClusterState;
+import org.apache.solr.common.cloud.Replica;
+import org.apache.solr.common.cloud.ReplicaPosition;
+import org.apache.solr.core.CoreContainer;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * This is an illustration how to re-implement the combination of 8x
+ * NodeLostTrigger and AutoAddReplicasPlanAction to maintain the collection's replication factor.
+ * NOTE: there's no support for 'waitFor' yet.
+ * NOTE 2: this functionality would be probably more reliable when executed also as a
+ * periodically scheduled check - both as a reactive (listener) and proactive (scheduled) measure.
+ */
+public class CollectionsRepairEventListener implements ClusterEventListener {

Review comment: Is this class registered somewhere to be actually called? If so I missed that part.
[jira] [Resolved] (SOLR-14911) Looger : UpdateLog Error Message : java.io.EOFException
[ https://issues.apache.org/jira/browse/SOLR-14911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson resolved SOLR-14911. --- Resolution: Invalid Please raise questions like this on the user's list; we try to reserve JIRAs for known bugs/enhancements rather than usage questions. The JIRA system is not a support portal. See: http://lucene.apache.org/solr/community.html#mailing-lists-irc there are links to both Lucene and Solr mailing lists there. A _lot_ more people will see your question on that list and may be able to help more quickly. You might want to review: https://wiki.apache.org/solr/UsingMailingLists If it's determined that this really is a code issue or enhancement to Lucene or Solr and not a configuration/usage problem, we can raise a new JIRA or reopen this one. > Looger : UpdateLog Error Message : java.io.EOFException > --- > > Key: SOLR-14911 > URL: https://issues.apache.org/jira/browse/SOLR-14911 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrCLI >Affects Versions: 7.5 >Reporter: D >Priority: Blocker > Attachments: GC events.PNG > > > *Events:* > # GC logs showing continuous Full GC events. Log report attached. > # Core filling failed, showing less data than expected. > # The following warnings showed on the dashboard before the error. > |Level|Logger|Message| > |WARN false|ManagedIndexSchemaFactory|The schema has been upgraded to > managed, but the non-managed schema schema.xml is still loadable. > PLEASE REMOVE THIS FILE.| > |WARN false|ManagedIndexSchemaFactory|The schema has been upgraded to > managed, but the non-managed schema schema.xml is still loadable. > PLEASE REMOVE THIS FILE.| > |WARN false|SolrResourceLoader|Solr loaded a deprecated plugin/analysis class > [solr.TrieDateField].
Please consult documentation how to replace it > accordingly.| > |WARN false|ManagedIndexSchemaFactory|The schema has been upgraded to > managed, but the non-managed schema schema.xml is still loadable. > PLEASE REMOVE THIS FILE.| > |WARN false|UpdateLog|Starting log replay > tlog\{file=\data\tlog\tlog.0445482 refcount=2} > active=false starting pos=0 inSortedOrder=false| > * Total data in all cores around 8 GB > * *Other Configurations:* > ** -XX:+UseG1GC > ** -XX:+UseStringDeduplication > ** -XX:MaxGCPauseMillis=500 > ** -Xms15g > ** -Xmx15g > ** -Xss256k > * *OS Environment :* > ** Windows 10, > ** Filling cores by calling SQL query using jtds-1.3.1 library. > ** Solr Version 7.5 > ** Runtime: Oracle Corporation OpenJDK 64-Bit Server VM 11.0.2 11.0.2+9 > ** Processors : 48 > ** System Physical Memory : 128 GB > ** Swap Space : 256GB > * solr-spec7.5.0 > ** solr-impl7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - > 2018-09-18 13:07:55 > * lucene-spec7.5.0 > ** lucene-impl7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - > 2018-09-18 13:01:1 > *Error Message :* > java.io.EOFException > at > org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:168) > at org.apache.solr.common.util.JavaBinCodec.readStr(JavaBinCodec.java:863) > at org.apache.solr.common.util.JavaBinCodec.readStr(JavaBinCodec.java:857) > at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:266) > at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:256) > at > org.apache.solr.common.util.JavaBinCodec.readSolrInputDocument(JavaBinCodec.java:603) > at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:315) > at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:256) > at org.apache.solr.common.util.JavaBinCodec.readArray(JavaBinCodec.java:747) > at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:272) > at 
org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:256) > at > org.apache.solr.update.TransactionLog$LogReader.next(TransactionLog.java:673) > at org.apache.solr.update.UpdateLog$LogReplayer.doReplay(UpdateLog.java:1832) > at org.apache.solr.update.UpdateLog$LogReplayer.run(UpdateLog.java:1747) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at >
[jira] [Created] (SOLR-14911) Looger : UpdateLog Error Message : java.io.EOFException
D created SOLR-14911: Summary: Looger : UpdateLog Error Message : java.io.EOFException Key: SOLR-14911 URL: https://issues.apache.org/jira/browse/SOLR-14911 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Components: SolrCLI Affects Versions: 7.5 Reporter: D Attachments: GC events.PNG *Events:* # GC logs showing continuos Full GC events. Log report attached. # Core filling failed , showing less data than expected. # following warnings showing on dashboard before error. |Level|Logger|Message| |WARN false|ManagedIndexSchemaFactory|The schema has been upgraded to managed, but the non-managed schema schema.xml is still loadable. PLEASE REMOVE THIS FILE.| |WARN false|ManagedIndexSchemaFactory|The schema has been upgraded to managed, but the non-managed schema schema.xml is still loadable. PLEASE REMOVE THIS FILE.| |WARN false|SolrResourceLoader|Solr loaded a deprecated plugin/analysis class [solr.TrieDateField]. Please consult documentation how to replace it accordingly.| |WARN false|ManagedIndexSchemaFactory|The schema has been upgraded to managed, but the non-managed schema schema.xml is still loadable. PLEASE REMOVE THIS FILE.| |WARN false|UpdateLog|Starting log replay tlog\{file=\data\tlog\tlog.0445482 refcount=2} active=false starting pos=0 inSortedOrder=false| * Total data in all cores around 8 GB * *Other Configurations:* ** -XX:+UseG1GC ** -XX:+UseStringDeduplication ** -XX:MaxGCPauseMillis=500 ** -Xms15g ** -Xmx15g ** -Xss256k * *OS Environment :* ** Windows 10, ** Filling cores by calling SQL query using jtds-1.3.1 library. 
** Solr Version 7.5 ** Runtime: Oracle Corporation OpenJDK 64-Bit Server VM 11.0.2 11.0.2+9 ** Processors : 48 ** System Physical Memory : 128 GB ** Swap Space : 256GB * solr-spec7.5.0 ** solr-impl7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55 * lucene-spec7.5.0 ** lucene-impl7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:01:1 *Error Message :* java.io.EOFException at org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:168) at org.apache.solr.common.util.JavaBinCodec.readStr(JavaBinCodec.java:863) at org.apache.solr.common.util.JavaBinCodec.readStr(JavaBinCodec.java:857) at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:266) at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:256) at org.apache.solr.common.util.JavaBinCodec.readSolrInputDocument(JavaBinCodec.java:603) at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:315) at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:256) at org.apache.solr.common.util.JavaBinCodec.readArray(JavaBinCodec.java:747) at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:272) at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:256) at org.apache.solr.update.TransactionLog$LogReader.next(TransactionLog.java:673) at org.apache.solr.update.UpdateLog$LogReplayer.doReplay(UpdateLog.java:1832) at org.apache.solr.update.UpdateLog$LogReplayer.run(UpdateLog.java:1747) at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209) at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) at java.base/java.lang.Thread.run(Thread.java:834) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] murblanc commented on a change in pull request #1758: SOLR-14749: Provide a clean API for cluster-level event processing, Initial draft.
murblanc commented on a change in pull request #1758: URL: https://github.com/apache/lucene-solr/pull/1758#discussion_r498957288 ## File path: solr/core/src/java/org/apache/solr/cluster/events/ClusterEventProducer.java ## @@ -0,0 +1,100 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.solr.cluster.events; + +import org.apache.solr.cloud.ClusterSingleton; + +import java.util.Collections; +import java.util.Map; +import java.util.Objects; +import java.util.Set; +import java.util.concurrent.ConcurrentHashMap; + +/** + * Component that produces {@link ClusterEvent} instances. + */ +public interface ClusterEventProducer extends ClusterSingleton { + + String PLUGIN_NAME = "clusterEventProducer"; + + default String getName() { +return PLUGIN_NAME; + } + + /** + * Returns a modifiable map of event types and listeners to process events + * of a given type. + */ + Map<ClusterEvent.EventType, Set<ClusterEventListener>> getEventListeners(); + + /** + * Register an event listener for processing the specified event types. + * @param listener non-null listener. If the same instance of the listener is + * already registered it will be ignored. + * @param eventTypes non-empty array of event types that this listener + * is being registered for. 
If this is null or empty then all types will be used. + */ + default void registerListener(ClusterEventListener listener, ClusterEvent.EventType... eventTypes) throws Exception { +Objects.requireNonNull(listener); +if (eventTypes == null || eventTypes.length == 0) { + eventTypes = ClusterEvent.EventType.values(); +} +for (ClusterEvent.EventType type : eventTypes) { + Set<ClusterEventListener> perType = getEventListeners().computeIfAbsent(type, t -> ConcurrentHashMap.newKeySet()); + perType.add(listener); +} + } + + /** + * Unregister an event listener. + * @param listener non-null listener. + */ + default void unregisterListener(ClusterEventListener listener) { +Objects.requireNonNull(listener); +getEventListeners().forEach((type, listeners) -> { + listeners.remove(listener); +}); + } + + /** + * Unregister an event listener for specified event types. + * @param listener non-null listener. + * @param eventTypes event types from which the listener will be unregistered. If this + * is null or empty then all event types will be used + */ + default void unregisterListener(ClusterEventListener listener, ClusterEvent.EventType... eventTypes) { +Objects.requireNonNull(listener); +if (eventTypes == null || eventTypes.length == 0) { + eventTypes = ClusterEvent.EventType.values(); +} +for (ClusterEvent.EventType type : eventTypes) { + getEventListeners() + .getOrDefault(type, Collections.emptySet()) + .remove(listener); +} + } + + /** + * Fire an event. This method will call registered listeners that subscribed to the + * type of event being passed. + * @param event cluster event + */ + default void fireEvent(ClusterEvent event) { Review comment: Firing (publishing) events is definitely needed in solr core where events are created. Do we need to couple the publishing of events and managing the consumption of events as done here, or maybe have a simple publishing interface (with this single method) and a separate one for implementing the event distribution logic? 
This could eventually allow keeping in Solr core the things that really belong there (creating events related to solr internal state change and publishing them) while allowing to move the event bus logic outside of solr core. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
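The registration scheme discussed in this review — a map from event type to a concurrent set of listeners, populated via `computeIfAbsent(type, t -> ConcurrentHashMap.newKeySet())` as in the PR's default methods — can be sketched independently of Solr. This is a minimal illustration, not the actual Solr API: `EventType`, `Listener`, and `ListenerRegistry` are hypothetical stand-ins for `ClusterEvent.EventType`, `ClusterEventListener`, and `ClusterEventProducer`.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative stand-in for ClusterEvent.EventType.
enum EventType { NODE_ADDED, NODE_LOST, SCHEDULED }

// Illustrative stand-in for ClusterEventListener.
interface Listener { void onEvent(EventType type, String payload); }

public class ListenerRegistry {
    // type -> concurrent set of listeners. ConcurrentHashMap.newKeySet()
    // returns a thread-safe Set backed by a ConcurrentHashMap, as in the PR.
    private final Map<EventType, Set<Listener>> listeners = new ConcurrentHashMap<>();

    /** Register for the given types; null or empty means all types. */
    public void register(Listener l, EventType... types) {
        Objects.requireNonNull(l);
        if (types == null || types.length == 0) {
            types = EventType.values();
        }
        for (EventType t : types) {
            // Set semantics mean re-registering the same instance is a no-op.
            listeners.computeIfAbsent(t, k -> ConcurrentHashMap.newKeySet()).add(l);
        }
    }

    /** Unregister from all types. */
    public void unregister(Listener l) {
        Objects.requireNonNull(l);
        listeners.forEach((t, set) -> set.remove(l));
    }

    /** Deliver an event to every listener subscribed to its type. */
    public void fire(EventType type, String payload) {
        listeners.getOrDefault(type, Collections.emptySet())
                 .forEach(l -> l.onEvent(type, payload));
    }

    public int countFor(EventType type) {
        return listeners.getOrDefault(type, Collections.emptySet()).size();
    }
}
```

Note how murblanc's suggestion maps onto this sketch: `fire` is the single-method "publishing" side, while `register`/`unregister` are the consumption-management side that could live in a separate interface.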
[GitHub] [lucene-solr] dnhatn commented on a change in pull request #1941: LUCENE-9554: Expose IndexWriter#pendingNumDocs
dnhatn commented on a change in pull request #1941: URL: https://github.com/apache/lucene-solr/pull/1941#discussion_r498957345 ## File path: lucene/core/src/java/org/apache/lucene/index/IndexWriter.java ## @@ -5463,6 +5463,13 @@ private void tooManyDocs(long addedNumDocs) { throw new IllegalArgumentException("number of documents in the index cannot exceed " + actualMaxDocs + " (current document count is " + pendingNumDocs.get() + "; added numDocs is " + addedNumDocs + ")"); } + /** + * Returns the number of documents in the index including documents are being added (i.e., reserved). + */ Review comment: ++. I pushed https://github.com/apache/lucene-solr/pull/1941/commits/61d4fbc4a01cbfc50f386a8a88064e82d1027ee0
[GitHub] [lucene-solr] muse-dev[bot] commented on a change in pull request #1938: SOLR-14659: Remove restlet as dependency for the ManagedResource API
muse-dev[bot] commented on a change in pull request #1938: URL: https://github.com/apache/lucene-solr/pull/1938#discussion_r498956758 ## File path: solr/core/src/java/org/apache/solr/rest/RestManager.java ## @@ -160,6 +156,10 @@ private Pattern getReservedEndpointsPattern() { return Pattern.compile(builder.toString()); } +public boolean isPathRegistered(String s) { + return registered.containsKey(s); Review comment: *THREAD_SAFETY_VIOLATION:* Read/Write race. Non-private method `RestManager$Registry.isPathRegistered(...)` reads without synchronization from container `this.registered` via call to `Map.containsKey(...)`. Potentially races with write in method `RestManager$Registry.registerManagedResource(...)`. Reporting because another access to the same memory occurs on a background thread, although this access may not.
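The race Muse flags here — an unsynchronized `containsKey` that can interleave with a write in `registerManagedResource` — disappears if the backing map is itself a `ConcurrentHashMap`, whose individual reads and writes are thread-safe. The following is a hypothetical miniature of the pattern, not the actual `RestManager.Registry` code; `PathRegistry` and its fields are illustrative names.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical miniature of a path registry. A ConcurrentHashMap makes
// single reads (containsKey) and writes (putIfAbsent) individually safe
// without external synchronization.
public class PathRegistry {
    private final Map<String, Object> registered = new ConcurrentHashMap<>();

    // Safe to call from any thread; no lock needed for one read.
    public boolean isPathRegistered(String path) {
        return registered.containsKey(path);
    }

    // putIfAbsent avoids a check-then-act race between two registering
    // threads; returns true only for the thread that won the registration.
    public boolean register(String path, Object resource) {
        return registered.putIfAbsent(path, resource) == null;
    }
}
```

The caveat is that a concurrent map only removes the data race on each call. A check-then-act sequence spanning two calls (`isPathRegistered` followed by `register`) still needs the atomic `putIfAbsent`/`compute` form, which is part of what tools like Muse are conservatively warning about.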
[GitHub] [lucene-solr] murblanc commented on a change in pull request #1758: SOLR-14749: Provide a clean API for cluster-level event processing, Initial draft.
murblanc commented on a change in pull request #1758: URL: https://github.com/apache/lucene-solr/pull/1758#discussion_r498954021 ## File path: solr/core/src/java/org/apache/solr/cluster/events/ClusterEvent.java ## @@ -0,0 +1,57 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.solr.cluster.events; + +import org.apache.solr.common.MapWriter; + +import java.io.IOException; +import java.time.Instant; + +/** + * Cluster-level event. + */ +public interface ClusterEvent extends MapWriter { Review comment: The `MapWriter` related machinery (including the `writeMap` method) doesn't belong in the interfaces IMO. It should be hidden in the implementation if it's needed.
[jira] [Commented] (SOLR-14749) Provide a clean API for cluster-level event processing
[ https://issues.apache.org/jira/browse/SOLR-14749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17206361#comment-17206361 ] Andrzej Bialecki commented on SOLR-14749: - {quote}The PR is too complex. {quote} What part is too complex? The proposed interfaces or the proof-of-concept implementation? The APIs are quite simple IMHO. {quote}please do not make backward incompatible changes {quote} Sure, I reverted that change, although it would have been simple to accommodate the name change so that we preserve back-compat, and the current name is illogical - {{/cluster/plugin}} vs. {{/cluster/plugins}} when the location keeps information about multiple plugins. In any case, it's a trivial change so let's not focus on this. {quote}Here is my official -1 on this PR. We would like to have a proper discussion {quote} I had the impression we are having a proper discussion, both here and on the PR. Your -1 means you have serious technical objections, correct? If so then please explain in detail your objections to the proposed implementation, having in mind the goal and scope of this issue. If you have in mind an alternative approach that can satisfy these goals then I'm all ears. > Provide a clean API for cluster-level event processing > -- > > Key: SOLR-14749 > URL: https://issues.apache.org/jira/browse/SOLR-14749 > Project: Solr > Issue Type: Improvement > Components: AutoScaling >Reporter: Andrzej Bialecki >Assignee: Andrzej Bialecki >Priority: Major > Labels: clean-api > Fix For: master (9.0) > > Time Spent: 14h 20m > Remaining Estimate: 0h > > This is a companion issue to SOLR-14613 and it aims at providing a clean, > strongly typed API for the functionality formerly known as "triggers" - that > is, a component for generating cluster-level events corresponding to changes > in the cluster state, and a pluggable API for processing these events. > The 8x triggers have been removed so this functionality is currently missing > in 9.0. 
However, this functionality is crucial for implementing the automatic > collection repair and re-balancing as the cluster state changes (nodes going > down / up, becoming overloaded / unused / decommissioned, etc). > For this reason we need this API and a default implementation of triggers > that at least can perform automatic collection repair (maintaining the > desired replication factor in presence of live node changes). > As before, the actual changes to the collections will be executed using > existing CollectionAdmin API, which in turn may use the placement plugins > from SOLR-14613. > h3. Division of responsibility > * built-in Solr components (non-pluggable): > ** cluster state monitoring and event generation, > ** simple scheduler to periodically generate scheduled events > * plugins: > ** automatic collection repair on {{nodeLost}} events (provided by default) > ** re-balancing of replicas (periodic or on {{nodeAdded}} events) > ** reporting (eg. requesting additional node provisioning) > ** scheduled maintenance (eg. removing inactive shards after split) > h3. Other considerations > These plugins (unlike the placement plugins) need to execute on one > designated node in the cluster. Currently the easiest way to implement this > is to run them on the Overseer leader node.
[GitHub] [lucene-solr] cpoerschke commented on a change in pull request #1571: SOLR-14560: Interleaving for Learning To Rank
cpoerschke commented on a change in pull request #1571: URL: https://github.com/apache/lucene-solr/pull/1571#discussion_r498948018 ## File path: solr/contrib/ltr/src/java/org/apache/solr/ltr/search/LTRQParserPlugin.java ## @@ -146,93 +149,114 @@ public LTRQParser(String qstr, SolrParams localParams, SolrParams params, @Override public Query parse() throws SyntaxError { // ReRanking Model - final String modelName = localParams.get(LTRQParserPlugin.MODEL); - if ((modelName == null) || modelName.isEmpty()) { + final String[] modelNames = localParams.getParams(LTRQParserPlugin.MODEL); + if ((modelNames == null) || modelNames.length==0 || modelNames[0].isEmpty()) { throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, "Must provide model in the request"); } - - final LTRScoringModel ltrScoringModel = mr.getModel(modelName); - if (ltrScoringModel == null) { -throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, -"cannot find " + LTRQParserPlugin.MODEL + " " + modelName); - } - - final String modelFeatureStoreName = ltrScoringModel.getFeatureStoreName(); - final boolean extractFeatures = SolrQueryRequestContextUtils.isExtractingFeatures(req); - final String fvStoreName = SolrQueryRequestContextUtils.getFvStoreName(req); - // Check if features are requested and if the model feature store and feature-transform feature store are the same - final boolean featuresRequestedFromSameStore = (modelFeatureStoreName.equals(fvStoreName) || fvStoreName == null) ? 
extractFeatures:false; - if (threadManager != null) { - threadManager.setExecutor(req.getCore().getCoreContainer().getUpdateShardHandler().getUpdateExecutor()); - } - final LTRScoringQuery scoringQuery = new LTRScoringQuery(ltrScoringModel, - extractEFIParams(localParams), - featuresRequestedFromSameStore, threadManager); - - // Enable the feature vector caching if we are extracting features, and the features - // we requested are the same ones we are reranking with - if (featuresRequestedFromSameStore) { -scoringQuery.setFeatureLogger( SolrQueryRequestContextUtils.getFeatureLogger(req) ); + + LTRScoringQuery[] rerankingQueries = new LTRScoringQuery[modelNames.length]; + for (int i = 0; i < modelNames.length; i++) { +final LTRScoringQuery rerankingQuery; +if (!ORIGINAL_RANKING.equals(modelNames[i])) { Review comment: Ah, good point about the special "OriginalRanking" also appearing in the "[interleaving]" transformer! When using interleaving there's always at least two models to be interleaved, right? The models could all be actual models or one of them could be the "OriginalRanking" pseudo-model. I wonder if class inheritance might help us e.g.

```
class InterleavingLTRQParserPlugin extends LTRQParserPlugin
```

and (say) ``` ``` where

```
rq={!iltr model=myModelXYZ}
```

can be a convenience equivalent to

```
rq={!iltr model=OriginalRanking model=myModelXYZ}
```

i.e. if only one model is supplied then it is implied that the second model is the original ranking. And if the special "OriginalRanking" name doesn't suit someone (either because they already have a real model that happens to be called "OriginalRanking" or because they would prefer a different descriptor in the "[interleaving]" transformer output) then something like

```
rq={!iltr originalRankingModel=noModel model=noModel model=someModel}
```

would allow them to call the "OriginalRanking" something else e.g. "noModel" instead. We could even reject any "OriginalRanking" models that are actual models via something like

```
final String originalRankingModelName = localParams.get("originalRankingModel", "OriginalRanking" /* default */);
if (null != mr.getModel(originalRankingModelName)) {
  throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, "Found an actual '" + originalRankingModelName + "' model, please ...
}
```

However, the "[interleaving]" transformer still needs to know about the special name, hmm. At the moment we have

```
public static boolean isOriginalRanking(LTRScoringQuery rerankingQuery){
  return rerankingQuery.getScoringModel() == null;
}
```

in LTRQParserPlugin and in LTRInterleavingTransformerFactory

```
if (isOriginalRanking(rerankingQuery)) {
  doc.addField(name, ORIGINAL_RANKING);
} else {
  doc.addField(name, rerankingQuery.getScoringModel().getName());
}
```

and if we gave LTRScoringQuery a getScoringModelName method

```
public String getScoringModelName() {
  return ltrScoringModel.getName();
}
```
[GitHub] [lucene-solr] mikemccand commented on a change in pull request #1941: LUCENE-9554: Expose IndexWriter#pendingNumDocs
mikemccand commented on a change in pull request #1941: URL: https://github.com/apache/lucene-solr/pull/1941#discussion_r498937003 ## File path: lucene/core/src/java/org/apache/lucene/index/IndexWriter.java ## @@ -5463,6 +5463,13 @@ private void tooManyDocs(long addedNumDocs) { throw new IllegalArgumentException("number of documents in the index cannot exceed " + actualMaxDocs + " (current document count is " + pendingNumDocs.get() + "; added numDocs is " + addedNumDocs + ")"); } + /** + * Returns the number of documents in the index including documents are being added (i.e., reserved). + */ Review comment: Maybe add `@lucene.experimental`? We are exposing (slightly) internal details about `IndexWriter` so maybe we need to reserve the right to change this API in the future ...
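The accounting being exposed here is straightforward to model: an atomic counter that is bumped when documents are reserved and read by the new accessor, with the suggested `@lucene.experimental` tag in the javadoc. This sketch uses a plain `AtomicLong` in a toy class rather than `IndexWriter`'s internals; `PendingDocsCounter` and `reserve` are illustrative names, not Lucene API.

```java
import java.util.concurrent.atomic.AtomicLong;

// Toy stand-in for the IndexWriter document accounting being exposed.
public class PendingDocsCounter {
    private final AtomicLong pendingNumDocs = new AtomicLong();

    /** Reserve room for newly added documents before they are flushed. */
    public void reserve(long addedNumDocs) {
        pendingNumDocs.addAndGet(addedNumDocs);
    }

    /**
     * Returns the number of documents in the index, including documents
     * that are being added (i.e., reserved).
     *
     * @lucene.experimental
     */
    public long getPendingNumDocs() {
        return pendingNumDocs.get();
    }
}
```

The `@lucene.experimental` javadoc tag is Lucene's convention for marking an API whose signature may change across minor releases, which is exactly the reservation mikemccand is asking for.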
[GitHub] [lucene-solr] dnhatn commented on pull request #1941: LUCENE-9554: Expose IndexWriter#pendingNumDocs
dnhatn commented on pull request #1941: URL: https://github.com/apache/lucene-solr/pull/1941#issuecomment-702837408 > I was a little confused by why we separate negative and non-negative cases in IndexWriter... @mikemccand Thank you for looking. I reverted that change and will make it in a separate PR.
[jira] [Commented] (LUCENE-4510) when a test's heart beats it should also throw up (dump stack of all threads)
[ https://issues.apache.org/jira/browse/LUCENE-4510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17206317#comment-17206317 ] Michael McCandless commented on LUCENE-4510: Thanks [~dweiss], [~rcmuir] and [~uschindler]! I will attempt all of the above solutions when I next need it again :) > when a test's heart beats it should also throw up (dump stack of all threads) > - > > Key: LUCENE-4510 > URL: https://issues.apache.org/jira/browse/LUCENE-4510 > Project: Lucene - Core > Issue Type: Bug >Reporter: Michael McCandless >Assignee: Dawid Weiss >Priority: Major > > We've had numerous cases where tests were hung but the "operator" of that > particular Jenkins instance struggles to properly get a stack dump for all > threads and eg accidentally kills the process instead (rather awful that the > same powerful tool "kill" can be used to get stack traces and to destroy the > process...). > Is there some way the test infra could do this for us, eg when it prints the > HEARTBEAT message?
[GitHub] [lucene-solr] thelabdude commented on pull request #1938: SOLR-14659: Remove restlet as dependency for the ManagedResource API
thelabdude commented on pull request #1938: URL: https://github.com/apache/lucene-solr/pull/1938#issuecomment-702832996 @noblepaul so you had SOLR-14659 with fix version 8.7. Given all the tests pass, I think we could apply these changes for 8.7 but I'd be more comfortable if we just left 8.x as-is and apply these changes to 9.x ... wdyt?
[jira] [Commented] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document
[ https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17206314#comment-17206314 ] Michael McCandless commented on LUCENE-9444: [~goankur] can we re-resolve this one now? > Need an API to easily fetch facet labels for a field in a document > -- > > Key: LUCENE-9444 > URL: https://issues.apache.org/jira/browse/LUCENE-9444 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/facet >Affects Versions: 8.6 >Reporter: Ankur >Priority: Major > Labels: facet > Fix For: master (9.0), 8.7 > > Attachments: LUCENE-9444.patch, LUCENE-9444.patch, > LUCENE-9444.v2.patch > > Time Spent: 5h 10m > Remaining Estimate: 0h > > A facet field may be included in the list of fields whose values are to be > returned for each hit. > In order to get the facet labels for each hit we need to > # Create an instance of _DocValuesOrdinalsReader_ and invoke > _getReader(LeafReaderContext context)_ method to obtain an instance of > _OrdinalsSegmentReader()_ > # _OrdinalsSegmentReader.get(int docID, IntsRef ordinals)_ method is then > used to fetch and decode the binary payload in the document's BinaryDocValues > field. This provides the ordinals that refer to facet labels in the > taxonomy.** > # Lastly TaxonomyReader.getPath(ord) is used to fetch the labels to be > returned. > > Ideally there should be a simple API - *String[] getLabels(docId)* that hides > all the above details and gives us the string labels. This can be part of > *TaxonomyFacets* but that's just one idea. > I am opening this issue to get community feedback and suggestions. >
[GitHub] [lucene-solr] mayya-sharipova commented on pull request #1943: LUCENE-9555 Ensure scorerIterator is fresh for opt
mayya-sharipova commented on pull request #1943: URL: https://github.com/apache/lucene-solr/pull/1943#issuecomment-702831820 This patch also addresses the failure of `TestUnifiedHighlighterStrictPhrases.testBasics`. This test was using `ReqExclBulkScorer`, which led to `scorerIterator` being already on max document when we were trying to make a conjunction between `scorerIterator` and `collectorIterator`.
[jira] [Commented] (SOLR-14659) Remove restlet from Solr
[ https://issues.apache.org/jira/browse/SOLR-14659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17206313#comment-17206313 ] Timothy Potter commented on SOLR-14659: --- I'm changing the fix version to 9.0 for this effort. > Remove restlet from Solr > > > Key: SOLR-14659 > URL: https://issues.apache.org/jira/browse/SOLR-14659 > Project: Solr > Issue Type: Improvement >Reporter: Noble Paul >Assignee: Timothy Potter >Priority: Major > Fix For: master (9.0) > > > restlet is only used by managed resources. We can support that even without a > restlet.
[jira] [Assigned] (SOLR-14659) Remove restlet from Solr
[ https://issues.apache.org/jira/browse/SOLR-14659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Potter reassigned SOLR-14659: - Assignee: Timothy Potter > Remove restlet from Solr > > > Key: SOLR-14659 > URL: https://issues.apache.org/jira/browse/SOLR-14659 > Project: Solr > Issue Type: Improvement >Reporter: Noble Paul >Assignee: Timothy Potter >Priority: Major > Fix For: 8.7 > > > restlet is only used by managed resources. We can support that even without a > restlet.
[jira] [Updated] (SOLR-14659) Remove restlet from Solr
[ https://issues.apache.org/jira/browse/SOLR-14659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Potter updated SOLR-14659: -- Fix Version/s: (was: 8.7) master (9.0) > Remove restlet from Solr > > > Key: SOLR-14659 > URL: https://issues.apache.org/jira/browse/SOLR-14659 > Project: Solr > Issue Type: Improvement >Reporter: Noble Paul >Assignee: Timothy Potter >Priority: Major > Fix For: master (9.0) > > > restlet is only used by managed resources. We can support that even without a > restlet.
[GitHub] [lucene-solr] mayya-sharipova opened a new pull request #1943: LUCENE-9555 Ensure scorerIterator is fresh for opt
mayya-sharipova opened a new pull request #1943: URL: https://github.com/apache/lucene-solr/pull/1943 Some collectors provide iterators that can efficiently skip non-competitive docs. When using the DefaultBulkScorer#score function we create a conjunction of scorerIterator and collectorIterator. As collectorIterator always starts from a docID = -1, and for creation of the conjunction iterator we need all of its sub-iterators to be on the same doc, the creation of the conjunction iterator will fail if scorerIterator has already been advanced to some other document. This patch ensures that we create the conjunction between scorerIterator and collectorIterator only if scorerIterator has not been advanced yet. Relates to #1725 Relates to #1937
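The fix described above reduces to a freshness check: an iterator that has not yet been advanced reports `docID() == -1` (the same convention Lucene's `DocIdSetIterator` uses, with `Integer.MAX_VALUE` as `NO_MORE_DOCS`), so the conjunction may only be built in that state. This is a self-contained sketch with a minimal stand-in iterator interface, not Lucene's actual `DefaultBulkScorer` code; `Disi`, `FreshnessGuard`, and `canConjoin` are illustrative names.

```java
// Minimal stand-in for a doc-ID iterator: docID() == -1 means "not
// advanced yet", Integer.MAX_VALUE means exhausted (mirroring the
// DocIdSetIterator conventions the PR description relies on).
interface Disi {
    int docID();
    int nextDoc();
}

public class FreshnessGuard {
    /**
     * True only while it is still legal to wrap the scorer iterator in a
     * conjunction with a collector iterator. A conjunction requires all
     * sub-iterators to be positioned on the same doc; since the collector
     * iterator always starts at -1, the scorer iterator must still be at -1.
     */
    public static boolean canConjoin(Disi scorerIterator) {
        return scorerIterator.docID() == -1;
    }
}
```

In the `ReqExclBulkScorer` failure quoted in the follow-up comment, the scorer iterator had already been advanced to the max document, so this guard would correctly skip building the conjunction and fall back to the unoptimized path.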
[jira] [Updated] (LUCENE-9555) Sort optimization failure if scorerIterator is already advanced
[ https://issues.apache.org/jira/browse/LUCENE-9555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayya Sharipova updated LUCENE-9555:

Description: Some collectors provide iterators that can efficiently skip non-competitive docs. When using DefaultBulkScorer#score function we create a conjunction of scorerIterator and collectorIterator. The problem could be if scorerIterator has already been advanced. As collectorIterator always starts from a docID = -1, and for creation of conjunction iterator we need all of its sub-iterators to be on the same doc, the creation of conjunction iterator will fail. We need to create a conjunction between scorerIterator and collectorIterator only if scorerIterator has not been advanced yet. Relates to https://issues.apache.org/jira/browse/LUCENE-9280 Relates to https://issues.apache.org/jira/browse/LUCENE-9541

was: (the same text; a formatting-only edit)

> Sort optimization failure if scorerIterator is already advanced
>
> Key: LUCENE-9555
> URL: https://issues.apache.org/jira/browse/LUCENE-9555
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Mayya Sharipova
> Priority: Minor
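The guard described above can be sketched outside Lucene with a minimal stand-in interface. This is an illustrative sketch, not the actual patch: `DocIterator`, `canConjoin`, and `canUseCollectorIterator` are hypothetical names standing in for Lucene's `DocIdSetIterator` and the logic in `DefaultBulkScorer`/`ConjunctionDISI`. The key invariants are exactly the ones the issue states: a conjunction is only valid when every sub-iterator sits on the same doc, and the collector iterator (which always starts at docID == -1) may only be conjoined while the scorer iterator is still un-advanced.

```java
public class ConjunctionGuard {

  interface DocIterator {
    int docID(); // -1 means "not advanced yet", as with Lucene's DocIdSetIterator
  }

  // A conjunction iterator is only valid when every sub-iterator is
  // positioned on the same document.
  static boolean canConjoin(DocIterator... iterators) {
    final int first = iterators[0].docID();
    for (DocIterator it : iterators) {
      if (it.docID() != first) {
        return false;
      }
    }
    return true;
  }

  // The collector's skipping iterator always starts at docID == -1, so it
  // may only be conjoined with a scorer iterator that is still fresh.
  static boolean canUseCollectorIterator(DocIterator scorerIterator) {
    return scorerIterator.docID() == -1;
  }
}
```

With this check, a scorer iterator that has already been advanced simply skips the sort optimization instead of failing while building the conjunction.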
[jira] [Updated] (LUCENE-9555) Sort optimization failure if scorerIterator is already advanced
[ https://issues.apache.org/jira/browse/LUCENE-9555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayya Sharipova updated LUCENE-9555:

Description: Some collectors provide iterators that can efficiently skip non-competitive docs. When using DefaultBulkScorer#score function we create a conjunction of scorerIterator and collectorIterator. The problem could be if scorerIterator has already been advanced. As collectorIterator always starts from a docID = -1, and for creation of conjunction iterator we need all of its sub-iterators to be on the same doc, the creation of conjunction iterator will fail. We need to create a conjunction between scorerIterator and collectorIterator only if scorerIterator has not been advanced yet. Relates to https://issues.apache.org/jira/browse/LUCENE-9280 Relates to https://issues.apache.org/jira/browse/LUCENE-9541

was: Some collectors provide iterators that can efficiently skip non-competitive docs. When using DefaultBulkScorer#score function we create a conjunction of scorerIterator and collectorIterator. The problem could be that scorerIterator has already been advanced, while collectorIterator always starts from a docID = -1. We need to create a conjunction between scorerIterator and collectorIterator only if scorerIterator has not been advanced yet. Relates to https://issues.apache.org/jira/browse/LUCENE-9280 Relates to https://issues.apache.org/jira/browse/LUCENE-9541
[GitHub] [lucene-solr] mikemccand commented on a change in pull request #1912: LUCENE-9535: Try to do larger flushes.
mikemccand commented on a change in pull request #1912: URL: https://github.com/apache/lucene-solr/pull/1912#discussion_r498242843

## File path: lucene/core/src/java/org/apache/lucene/index/ApproximatePriorityQueue.java

@@ -0,0 +1,136 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.index;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.ListIterator;
+import java.util.function.Predicate;
+
+/**
+ * An approximate priority queue, which attempts to poll items by decreasing
+ * log of the weight, though exact ordering is not guaranteed.
+ * This class doesn't support null elements.
+ */
+final class ApproximatePriorityQueue<T> {
+
+  // Indexes between 0 and 63 are sparely populated, and indexes that are
+  // greater than or equal to 64 are densely populated
+  // Items closed to the beginning of this list are more likely to have a
+  // higher weight.
+  private final List<T> slots = new ArrayList<>(Long.SIZE);
+
+  // A bitset where ones indicate that the corresponding index in `slots` is taken.
+  private long usedSlots = 0L;
+
+  ApproximatePriorityQueue() {
+    for

Review comment (on "Items closed to the beginning of this list"): s/`closed`/`close`?

Review comment (on "Indexes between 0 and 63 are sparely populated"): s/`sparely`/`sparsely`?
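The mechanics the quoted comments describe can be sketched with plain JDK calls. This is a simplified sketch, not the actual Lucene class: it assumes (based on the comments) that the slot index is derived from the base-2 log of an item's weight via its leading zero count, so heavier items land nearer the front of the list, and that a `long` bitset tracks which slots are taken.

```java
public class LogWeightSlots {

  // Slot index = number of leading zero bits of the weight, so a larger
  // weight maps to a smaller slot index, i.e. nearer the front of the list.
  static int slotFor(long weight) {
    if (weight <= 0) {
      return Long.SIZE - 1; // treat non-positive weights as lightest
    }
    return Long.numberOfLeadingZeros(weight);
  }

  // `usedSlots` plays the role of the bitset in the quoted code:
  // bit i set means slot i is taken.
  static long markUsed(long usedSlots, int slot) {
    return usedSlots | (1L << slot);
  }

  static boolean isUsed(long usedSlots, int slot) {
    return (usedSlots & (1L << slot)) != 0;
  }
}
```

For example, a weight of 1 maps to slot 63 (the very back), while larger weights map to progressively smaller slot indexes; only an approximate ordering is preserved, which matches the class javadoc.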
[GitHub] [lucene-solr] ErickErickson commented on pull request #1942: SOLR-14910: Use in-line tags for logger declarations in Gradle ValidateLogCalls that are non-standard, change //logok to //nowarn
ErickErickson commented on pull request #1942: URL: https://github.com/apache/lucene-solr/pull/1942#issuecomment-702826141 It was decided (there's a phrase I hate, used to avoid taking responsibility for something)... Anyway, Andras Salamon and I decided that it was often an error to have logging calls with exception.getMessage() rather than the full stack trace, see SOLR-14523, and the checking code isn't sophisticated enough to figure out that this call to getMessage() has nothing to do with an exception. This is a case where it doesn't matter whether it's in an "if (loglevelenabled)" clause or not... WDYT about adding a reference to SOLR-14523 to the output message? That's not obvious on a code review, unfortunately...

> On Oct 2, 2020, at 11:22 AM, David Smiley wrote:
> What does our source validation complain about here?
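The concern behind SOLR-14523 is easy to demonstrate with plain JDK calls. This is an illustrative sketch, not Solr code: the two helpers stand in for what a logger records when passed `e.getMessage()` versus the throwable itself.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

public class LogDetail {

  // What log.error("...: {}", e.getMessage()) records: the message only,
  // with no indication of where the exception came from.
  static String messageOnly(Exception e) {
    return e.getMessage();
  }

  // What log.error("...", e) records: the message plus the full stack trace.
  static String fullTrace(Exception e) {
    StringWriter sw = new StringWriter();
    e.printStackTrace(new PrintWriter(sw, true));
    return sw.toString();
  }
}
```

The validator flags the getMessage() form because the stack trace is usually what you need when debugging; the difficulty Erick describes is that it cannot tell when getMessage() is called on something that is not an exception at all.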
[jira] [Created] (LUCENE-9555) Sort optimization failure if scorerIterator is already advanced
Mayya Sharipova created LUCENE-9555: --- Summary: Sort optimization failure if scorerIterator is already advanced Key: LUCENE-9555 URL: https://issues.apache.org/jira/browse/LUCENE-9555 Project: Lucene - Core Issue Type: Bug Reporter: Mayya Sharipova Some collectors provide iterators that can efficiently skip non-competitive docs. When using DefaultBulkScorer#score function we create a conjunction of scorerIterator and collectorIterator. The problem could be that scorerIterator has already been advanced, while collectorIterator always starts from a docID = -1. We need to create a conjunction between scorerIterator and collectorIterator only if scorerIterator has not been advanced yet. Relates to https://issues.apache.org/jira/browse/LUCENE-9280 Relates to https://issues.apache.org/jira/browse/LUCENE-9541
[GitHub] [lucene-solr] thelabdude commented on a change in pull request #1938: SOLR-14766: Remove restlet as dependency for the ManagedResource API
thelabdude commented on a change in pull request #1938: URL: https://github.com/apache/lucene-solr/pull/1938#discussion_r498916106

## File path: solr/core/src/java/org/apache/solr/rest/RestManager.java

@@ -326,44 +327,46 @@ public void doInit() throws ResourceException {
       }
     }
   }

   if (managedResource == null) {
-    if (Method.PUT.equals(getMethod()) || Method.POST.equals(getMethod())) {
+      final String method = getSolrRequest().getHttpMethod();
+      if ("PUT".equals(method) || "POST".equals(method)) {

Review comment: String equals unsafe? Not sure I understand what this warning is trying to convey?
[GitHub] [lucene-solr] thelabdude commented on pull request #1938: SOLR-14766: Remove restlet as dependency for the ManagedResource API
thelabdude commented on pull request #1938: URL: https://github.com/apache/lucene-solr/pull/1938#issuecomment-702821500 Thanks @noblepaul ... that's much cleaner. Patch applied.
[GitHub] [lucene-solr] iverase commented on pull request #1940: LUCENE-9552: Adds a LatLonPoint query that accepts an array of LatLonGeometries
iverase commented on pull request #1940: URL: https://github.com/apache/lucene-solr/pull/1940#issuecomment-702819188 I have two motivations for this change: 1) Finding all points within a distance of a line can be approximated by creating a polygon and two circles and finding all points that are inside any of those geometries. Currently three queries are needed inside a boolean OR; with this approach it can be executed in a single query. 2) API-wise, it simplifies the implementation as you don't need to know which geometry you are dealing with to create the query.
[GitHub] [lucene-solr] mikemccand commented on a change in pull request #1941: LUCENE-9554: Expose IndexWriter#pendingNumDocs
mikemccand commented on a change in pull request #1941: URL: https://github.com/apache/lucene-solr/pull/1941#discussion_r498907778

## File path: lucene/core/src/java/org/apache/lucene/index/IndexWriter.java

@@ -1072,7 +1073,14 @@ public IndexWriter(Directory d, IndexWriterConfig conf) throws IOException {
     config.getFlushPolicy().init(config);
     bufferedUpdatesStream = new BufferedUpdatesStream(infoStream);
-    docWriter = new DocumentsWriter(flushNotifications, segmentInfos.getIndexCreatedVersionMajor(), pendingNumDocs,
+    final IntConsumer reserveDocs = numDocs -> {
+      if (numDocs > 0) {
+        reserveDocs(numDocs);

Review comment: I'm confused why we are calling separate methods when `numDocs` is negative or not?
[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1942: SOLR-14910: Use in-line tags for logger declarations in Gradle ValidateLogCalls that are non-standard, change //logok to //n
dsmiley commented on a change in pull request #1942: URL: https://github.com/apache/lucene-solr/pull/1942#discussion_r498887554

## File path: solr/core/src/java/org/apache/solr/servlet/HttpSolrCall.java

@@ -493,7 +493,7 @@ Action authorize() throws IOException {
   }
   if (statusCode == AuthorizationResponse.FORBIDDEN.statusCode) {
     if (log.isDebugEnabled()) {
-      log.debug("UNAUTHORIZED auth header {} context : {}, msg: {}", req.getHeader("Authorization"), context, authResponse.getMessage()); // logOk
+      log.debug("UNAUTHORIZED auth header {} context : {}, msg: {}", req.getHeader("Authorization"), context, authResponse.getMessage()); // nowarn

Review comment: What does our source validation complain about here? Many of the logok/nowarn places look fine to me at a glance but I'm no match for the logging policeman ;-)
[GitHub] [lucene-solr] ErickErickson opened a new pull request #1942: SOLR-14910: Use in-line tags for logger declarations in Gradle ValidateLogCalls that are non-standard, change //logok to //nowarn
ErickErickson opened a new pull request #1942: URL: https://github.com/apache/lucene-solr/pull/1942 Most of this is just changing //logok to //nowarn. The substantive changes are in validate-log-calls.gradle, taking out the special handling for several files and _not_ producing failures if there's a //nowarn on the line instead, plus adding a few //nowarn tags in the files that were handled specially.
[GitHub] [lucene-solr] rmuir commented on pull request #1940: LUCENE-9552: Adds a LatLonPoint query that accepts an array of LatLonGeometries
rmuir commented on pull request #1940: URL: https://github.com/apache/lucene-solr/pull/1940#issuecomment-702790376 Why is this needed when the existing Polygon is a multipolygon and optimizes for that case? Instead of making an array of polygons, just use a single MultiPolygon?
[GitHub] [lucene-solr] cpoerschke commented on pull request #1890: Rename ConfigSetsAPITest to TestConfigSetsAPISolrCloud
cpoerschke commented on pull request #1890: URL: https://github.com/apache/lucene-solr/pull/1890#issuecomment-702776305 > ... if adding of `ConfigSetsAPITest` functionality to `TestConfigSetsAPI` might be better? ... Looking more closely, there seem to be sufficient differences between the tests, i.e. if they remain separate they would be clearer. However, `TestConfigSetsAPIShareSchema` could be a better alternative name than `TestConfigSetsAPISolrCloud`.
[GitHub] [lucene-solr] dnhatn opened a new pull request #1941: LUCENE-9554: Expose IndexWriter#pendingNumDocs
dnhatn opened a new pull request #1941: URL: https://github.com/apache/lucene-solr/pull/1941 Some applications can use the pendingNumDocs from IndexWriter to estimate whether the number of documents in an index is very close to the hard limit, so that they can reject writes without constructing Lucene documents.
[GitHub] [lucene-solr] cpoerschke merged pull request #1920: branch_8x: add two missing(?) solr/CHANGES.txt entries
cpoerschke merged pull request #1920: URL: https://github.com/apache/lucene-solr/pull/1920
[jira] [Created] (LUCENE-9554) Expose pendingNumDocs from IndexWriter
Nhat Nguyen created LUCENE-9554: --- Summary: Expose pendingNumDocs from IndexWriter Key: LUCENE-9554 URL: https://issues.apache.org/jira/browse/LUCENE-9554 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Nhat Nguyen Some applications can use the pendingNumDocs from IndexWriter to estimate whether the number of documents in an index is approaching the hard limit, so that they can reject writes without constructing Lucene documents.
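An application-side admission check built on the accessor this issue proposes could look roughly like the sketch below. The helper class and method names are hypothetical; the constant mirrors Lucene's `IndexWriter.MAX_DOCS` hard limit (`Integer.MAX_VALUE - 128`), and the `pendingNumDocs` argument would come from the exposed `IndexWriter` accessor.

```java
public class DocBudget {

  // Mirrors Lucene's IndexWriter.MAX_DOCS hard limit (Integer.MAX_VALUE - 128).
  static final long MAX_DOCS = 2_147_483_519L;

  // Reject a batch up front, before building any Lucene documents,
  // if it could push the writer past the hard document limit.
  static boolean wouldExceedLimit(long pendingNumDocs, int batchSize) {
    return pendingNumDocs + batchSize > MAX_DOCS;
  }
}
```

Note this is only an estimate, as the issue says: pendingNumDocs can change concurrently, so the writer's own reservation logic remains the authoritative check.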
[jira] [Assigned] (SOLR-14910) Use in-line tags for logger declarations in Gradle ValidateLogCalls that are non-standard, change //logok to //nowarn
[ https://issues.apache.org/jira/browse/SOLR-14910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson reassigned SOLR-14910: Assignee: Erick Erickson

> Use in-line tags for logger declarations in Gradle ValidateLogCalls that are
> non-standard, change //logok to //nowarn
>
> Key: SOLR-14910
> URL: https://issues.apache.org/jira/browse/SOLR-14910
> Project: Solr
> Issue Type: Improvement
> Security Level: Public (Default Security Level. Issues are Public)
> Reporter: Erick Erickson
> Assignee: Erick Erickson
> Priority: Minor
[GitHub] [lucene-solr] mayya-sharipova commented on a change in pull request #1937: LUCENE-9541 ConjunctionDISI sub-iterators check
mayya-sharipova commented on a change in pull request #1937: URL: https://github.com/apache/lucene-solr/pull/1937#discussion_r498851410

## File path: lucene/core/src/java/org/apache/lucene/search/ConjunctionDISI.java

@@ -140,6 +141,13 @@ private static void addTwoPhaseIterator(TwoPhaseIterator twoPhaseIter, List<DocIdSetIterator> allIterators, List<TwoPhaseIterator> twoPhaseIterators) {
+
+    // assert that all sub-iterators are on the same doc ID
+    int curDoc = allIterators.size() > 0 ? allIterators.get(0).docID() : twoPhaseIterators.get(0).approximation.docID();
+    boolean iteratorsOnTheSameDoc = allIterators.stream().allMatch(it -> it.docID() == curDoc);
+    iteratorsOnTheSameDoc = iteratorsOnTheSameDoc && twoPhaseIterators.stream().allMatch(it -> it.approximation().docID() == curDoc);
+    assert iteratorsOnTheSameDoc : "Sub-iterators of ConjunctionDISI are not the same document!";

Review comment: addressed in 74151e3

## File path: lucene/core/src/java/org/apache/lucene/search/ConjunctionDISI.java

@@ -227,6 +236,7 @@ private int doNext(int doc) throws IOException {
   @Override
   public int advance(int target) throws IOException {
+    assertItersOnSameDoc();

Review comment: addressed in 74151e3
[GitHub] [lucene-solr] iverase opened a new pull request #1940: LUCENE-9552: Adds a LatLonPoint query that accepts an array of LatLonGeometries
iverase opened a new pull request #1940: URL: https://github.com/apache/lucene-solr/pull/1940 New query that accepts an array of LatLonGeometries.
[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1921: SOLR-14829: Improve documentation for Request Handlers in RefGuide and solrconfig.xml
dsmiley commented on a change in pull request #1921: URL: https://github.com/apache/lucene-solr/pull/1921#discussion_r498832396

## File path: solr/solr-ref-guide/src/common-query-parameters.adoc

@@ -307,11 +307,13 @@ The `echoParams` parameter controls what information about request parameters is

The `echoParams` parameter accepts the following values:

-* `explicit`: This is the default value. Only parameters included in the actual request, plus the `_` parameter (which is a 64-bit numeric timestamp) will be added to the `params` section of the response header.
+* `explicit`: Only parameters included in the actual request, plus the `_` parameter (which is a 64-bit numeric timestamp) will be added to the `params` section of the response header.

Review comment: I did some digging and discovered that the admin UI will add this underscore. See `services.js`, which does this via `Date.now()` all over the place. I don't know what purpose it has; git blame shows it was added with that whole UI refactor of its day, but perhaps it predated it? AFAICT Solr isn't doing anything with it on the Solr side; I set multiple conditional debugger breakpoints across SolrParams subclasses but nowhere is "_" requested. My suspicion is that it was added to foil HTTP cache attempts? I did some JIRA digging and found https://issues.apache.org/jira/browse/SOLR-4311, which has a patch with the underscores, but it was for the version, not the timestamp. Given the underscore's use to defeat bad caching in that scenario, if I keep digging, I'll probably find it. In short, I don't think we should document this underscore param in the ref guide. It's not a special param to Solr; it's an oddity of our current admin UI.
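The cache-busting pattern the admin UI uses can be sketched in a few lines. This is an illustration of the pattern only, translated to Java; the real code in `services.js` is JavaScript and calls `Date.now()`. A throwaway `_` parameter carrying the current timestamp makes every request URL unique, so intermediaries and the browser never serve a stale cached response.

```java
public class CacheBuster {

  // Appends a `_=<timestamp>` parameter so that no two requests share a URL,
  // defeating HTTP response caching. Solr itself never reads this parameter.
  static String withCacheBuster(String url, long timestampMillis) {
    char sep = url.contains("?") ? '&' : '?';
    return url + sep + "_=" + timestampMillis;
  }
}
```

For example, `/solr/admin/info/system` becomes `/solr/admin/info/system?_=1601650000000`, which is why the `_` shows up in echoed params without being a real Solr parameter.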
[jira] [Updated] (SOLR-14910) Use in-line tags for logger declarations in Gradle ValidateLogCalls that are non-standard, change //logok to //nowarn
[ https://issues.apache.org/jira/browse/SOLR-14910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson updated SOLR-14910: Summary: Use in-line tags for logger declarations in Gradle ValidateLogCalls that are non-standard, change //logok to //nowarn (was: Use in-line tags for logger declarations in Gradle ValidateLogCalls that are non-standard)
[jira] [Commented] (SOLR-14910) Use in-line tags for logger declarations in Gradle ValidateLogCalls that are non-standard
[ https://issues.apache.org/jira/browse/SOLR-14910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17206151#comment-17206151 ] Erick Erickson commented on SOLR-14910: --- [~dsmiley] pointed out that having to muck with the gradle build system to handle non-standard logger declarations is yucky, and looking back at that code I don't know what I was thinking. We already have a //logok flag, and being able to ignore a declaration if that tag is present isn't nearly so yucky. For instance, HttpServer2.java requires an upper-case LOG in order for Hadoop to function (yuck) and we want our log variables to be just lower-case "log". Or SolrCore.java declares requestLog and slowLog, which are perfectly valid but aren't just "log". Along the way, David suggested that //nowarn is more general and can be used in other situations than a specific //logok, which makes sense. PR shortly. I'll commit this probably tomorrow absent objections.
[jira] [Created] (SOLR-14910) Use in-line tags for logger declarations in Gradle ValidateLogCalls that are non-standard
Erick Erickson created SOLR-14910:
-------------------------------------

             Summary: Use in-line tags for logger declarations in Gradle ValidateLogCalls that are non-standard
                 Key: SOLR-14910
                 URL: https://issues.apache.org/jira/browse/SOLR-14910
             Project: Solr
          Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
            Reporter: Erick Erickson
[GitHub] [lucene-solr] iverase opened a new pull request #1939: Adds a XYPoint query that accepts an array of XYGeometries
iverase opened a new pull request #1939:
URL: https://github.com/apache/lucene-solr/pull/1939

   New query that accepts an array of XYGeometries.
[jira] [Commented] (LUCENE-9004) Approximate nearest vector search
[ https://issues.apache.org/jira/browse/LUCENE-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17206088#comment-17206088 ]

Jim Ferenczi commented on LUCENE-9004:
--------------------------------------

> I think this is not so for the NSW algorithm; you just keep the docids, so
> N*M*4 is the cost. Still this doesn't change any of the scaling conclusions.

My comment was about the cost of building, where you need to keep the list sorted by distances.

+1 for the plan [~sokolov], and +1 to start with a simple NSW graph that we can make hierarchical later if needed. Filtering with a query is tricky and will require special treatment based on the internal implementation, so I agree that it would be simpler to consider it out of scope at the moment.

> Approximate nearest vector search
> ---------------------------------
>
>                 Key: LUCENE-9004
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9004
>             Project: Lucene - Core
>          Issue Type: New Feature
>            Reporter: Michael Sokolov
>            Priority: Major
>         Attachments: hnsw_layered_graph.png
>
>          Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> "Semantic" search based on machine-learned vector "embeddings" representing
> terms, queries and documents is becoming a must-have feature for a modern
> search engine. SOLR-12890 is exploring various approaches to this, including
> providing vector-based scoring functions. This is a spinoff issue from that.
> The idea here is to explore approximate nearest-neighbor search. Researchers
> have found an approach based on navigating a graph that partially encodes the
> nearest neighbor relation at multiple scales can provide accuracy > 95% (as
> compared to exact nearest neighbor calculations) at a reasonable cost. This
> issue will explore implementing HNSW (hierarchical navigable small-world)
> graphs for the purpose of approximate nearest vector search (often referred
> to as KNN or k-nearest-neighbor search).
> At a high level the way this algorithm works is this.
> First assume you have a graph that has a partial encoding of the nearest
> neighbor relation, with some short and some long-distance links. If this
> graph is built in the right way (has the hierarchical navigable small world
> property), then you can efficiently traverse it to find nearest neighbors
> (approximately) in log N time where N is the number of nodes in the graph. I
> believe this idea was pioneered in [1]. The great insight in that paper is
> that if you use the graph search algorithm to find the K nearest neighbors
> of a new document while indexing, and then link those neighbors
> (undirectedly, ie both ways) to the new document, then the graph that
> emerges will have the desired properties.
> The implementation I propose for Lucene is as follows. We need two new data
> structures to encode the vectors and the graph. We can encode vectors using
> a light wrapper around {{BinaryDocValues}} (we also want to encode the
> vector dimension and have efficient conversion from bytes to floats). For
> the graph we can use {{SortedNumericDocValues}} where the values we encode
> are the docids of the related documents. Encoding the interdocument
> relations using docids directly will make it relatively fast to traverse the
> graph since we won't need to lookup through an id-field indirection. This
> choice limits us to building a graph-per-segment since it would be
> impractical to maintain a global graph for the whole index in the face of
> segment merges. However graph-per-segment is very natural at search time -
> we can traverse each segment's graph independently and merge results as we
> do today for term-based search.
> At index time, however, merging graphs is somewhat challenging. While
> indexing we build a graph incrementally, performing searches to construct
> links among neighbors. When merging segments we must construct a new graph
> containing elements of all the merged segments.
> Ideally we would somehow preserve the work done when building the initial
> graphs, but at least as a start I'd propose we construct a new graph from
> scratch when merging. The process is going to be limited, at least
> initially, to graphs that can fit in RAM since we require random access to
> the entire graph while constructing it: in order to add links
> bidirectionally we must continually update existing documents.
> I think we want to express this API to users as a single joint
> {{KnnGraphField}} abstraction that joins together the vectors and the graph
> as a single field type. Mostly it just looks like a vector-valued field, but
> has this graph attached to it.
> I'll push a branch with my POC and would love to hear comments. It has many
> nocommits, basic design is not really set, there is no Query implementation
> and no integration with IndexSearcher, but it does work by
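The traversal sketched in the quoted description — start at some entry node and follow neighbor links that reduce the distance to the query — can be illustrated in a few lines. This is a minimal, self-contained sketch of the greedy descent step only, not Lucene code: it omits the bounded candidate queue, visited set, and multi-layer structure a real (H)NSW implementation needs, and it assumes the whole graph (vectors plus adjacency lists) fits in memory.

```java
// Minimal sketch of the greedy graph descent underlying NSW-style search:
// repeatedly move to whichever neighbor of the current node is closest to the
// query, stopping at a local minimum of the distance function. Real HNSW
// keeps a beam of candidates instead of a single node, which is why it
// achieves the high recall discussed above.
public class NswGreedySearch {

  // Squared Euclidean distance (the ordering is the same as for Euclidean).
  static double dist(float[] a, float[] b) {
    double s = 0;
    for (int i = 0; i < a.length; i++) {
      double d = a[i] - b[i];
      s += d * d;
    }
    return s;
  }

  /** Returns the node reached by greedy descent from {@code entry}. */
  static int search(float[][] vectors, int[][] neighbors, int entry, float[] query) {
    int cur = entry;
    double curDist = dist(vectors[cur], query);
    while (true) {
      int best = cur;
      double bestDist = curDist;
      for (int nb : neighbors[cur]) {        // examine all links of the current node
        double d = dist(vectors[nb], query);
        if (d < bestDist) {
          best = nb;
          bestDist = d;
        }
      }
      if (best == cur) {
        return cur;                          // no neighbor improves: local minimum
      }
      cur = best;
      curDist = bestDist;
    }
  }
}
```

Because the descent can stall in a local minimum, the returned node is only an approximate nearest neighbor; keeping a beam of several candidates (as HNSW does) trades extra distance computations for better recall.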
[GitHub] [lucene-solr] dweiss commented on a change in pull request #1905: LUCENE-9488 Release with Gradle Part 2
dweiss commented on a change in pull request #1905:
URL: https://github.com/apache/lucene-solr/pull/1905#discussion_r498666120

## File path: lucene/packaging/build.gradle
## @@ -0,0 +1,160 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+// This project puts together a "distribution", assembling dependencies from
+// various other projects.
+
+plugins {
+    id 'distribution'
+}
+
+description = 'Lucene distribution packaging'
+
+// Declare all subprojects that should be included in binary distribution.
+// By default everything is included, unless explicitly excluded.
+def includeInBinaries = project(":lucene").subprojects.findAll { subproject ->
+    return !(subproject.path in [
+        ":lucene:packaging",
+        ":lucene:analysis",
+        ":lucene:luke", // nocommit - Encountered duplicate path "luke/lib/log4j-core-2.13.2.jar"

Review comment:

   This wasn't obvious. Here's what happened. We create a configuration with a dependency on the "full project"; it looked like this:
   ```
   handler.add(confFull, project(path: includedProject.path), {...
   ```
   This is in fact a dependency on the configuration aptly named 'default', which includes all of the project's binaries and transitive dependencies (archives in general).
   Luke declared a separate configuration for dependencies called 'standalone', which added a few non-project dependencies (the 'implementation' configuration extended from it, so they were visible on the runtime classpath). The core of the problem was that Luke also exported an extra artifact that belonged to this 'standalone' configuration - a set of JARs assembled under a folder. So when the packaging process was collecting files for inclusion, it encountered log4j twice: a copy in that 'standalone' folder and a copy from transitive project dependencies.
   I recall you asked why this 'fail' on duplicate archive entries is needed. Well, it's useful to catch pearls like this one above...
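The "fail on duplicate archive entries" behavior that surfaced this problem can be sketched as a simple check: while collecting relative paths for the distribution, remember each one and report the first path contributed twice (here, once from Luke's 'standalone' folder and once from a transitive project dependency). The class and method names below are hypothetical illustrations, not the Gradle distribution plugin's actual code.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Illustrative sketch of a duplicate-entry guard for an archive/packaging
// step: scan the relative paths destined for the distribution in order and
// fail fast on the first path seen twice, instead of silently letting one
// copy overwrite the other.
public class DuplicateEntryCheck {
  static Optional<String> firstDuplicate(List<String> paths) {
    Set<String> seen = new HashSet<>();
    for (String p : paths) {
      if (!seen.add(p)) {
        return Optional.of(p); // second occurrence: report the colliding path
      }
    }
    return Optional.empty();   // all entries unique
  }
}
```

Failing eagerly like this is what turned a quiet packaging ambiguity into the explicit "Encountered duplicate path" error quoted in the diff, making the misconfigured 'standalone' artifact visible.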
[GitHub] [lucene-solr] dweiss commented on a change in pull request #1905: LUCENE-9488 Release with Gradle Part 2
dweiss commented on a change in pull request #1905:
URL: https://github.com/apache/lucene-solr/pull/1905#discussion_r498666282

## File path: lucene/packaging/build.gradle

Review comment:

   I've verified that it collects Luke properly and the launch script starts it fine (on Windows).
[jira] [Commented] (LUCENE-9548) Publish master (9.x) snapshots to https://repository.apache.org
[ https://issues.apache.org/jira/browse/LUCENE-9548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17206004#comment-17206004 ]

Dawid Weiss commented on LUCENE-9548:
-------------------------------------

Hi Uwe. I've changed credential property names to use those mentioned on that cwiki page, so in theory it should work if you run this from Jenkins:
{code}
gradlew mavenToApacheSnapshots
{code}
I also added other task aliases for convention tasks (which have fairly long names):
{code}
mavenToApacheSnapshots - Publish Maven JARs and POMs to Apache Snapshots repository: https://repository.apache.org/content/repositories/snapshots
mavenToLocalFolder - Publish Maven JARs and POMs locally to [...]\build\maven-local
mavenToLocalRepo - Publish Maven JARs and POMs to current user's local maven repository.
{code}
We should probably add an ApacheReleases repository too for final releases (or create a bundle uploadable to Nexus?).

> Publish master (9.x) snapshots to https://repository.apache.org
> ---------------------------------------------------------------
>
>                 Key: LUCENE-9548
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9548
>             Project: Lucene - Core
>          Issue Type: Task
>            Reporter: Dawid Weiss
>            Assignee: Dawid Weiss
>            Priority: Major
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> We should start publishing snapshot JARs to Apache repositories. I'm not sure
> how to set it all up with gradle but maybe there are other Apache projects
> that use gradle and we could peek at their config? Mostly it's about signing
> artifacts (how to pass credentials for signing) and setting up Nexus
> deployment repository.
[jira] [Comment Edited] (LUCENE-9548) Publish master (9.x) snapshots to https://repository.apache.org
[ https://issues.apache.org/jira/browse/LUCENE-9548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17206004#comment-17206004 ]

Dawid Weiss edited comment on LUCENE-9548 at 10/2/20, 6:58 AM:
---------------------------------------------------------------

Hi Uwe. I've changed credential property names to use those mentioned on that cwiki page, so in theory it should work if you run this from Jenkins:
{code}
gradlew mavenToApacheSnapshots
{code}
I also added other task aliases for convention tasks (which have fairly long names):
{code}
mavenToApacheSnapshots - Publish Maven JARs and POMs to Apache Snapshots repository: https://repository.apache.org/content/repositories/snapshots
mavenToLocalFolder - Publish Maven JARs and POMs locally to [...]\build\maven-local
mavenToLocalRepo - Publish Maven JARs and POMs to current user's local maven repository.
{code}
We should probably add an ApacheReleases repository too for final releases (or create a bundle uploadable to Nexus?).

> Publish master (9.x) snapshots to https://repository.apache.org
> ---------------------------------------------------------------
>
>                 Key: LUCENE-9548
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9548
>             Project: Lucene - Core
>          Issue Type: Task
>            Reporter: Dawid Weiss
>            Assignee: Dawid Weiss
>            Priority: Major
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> We should start publishing snapshot JARs to Apache repositories. I'm not sure
> how to set it all up with gradle but maybe there are other Apache projects
> that use gradle and we could peek at their config? Mostly it's about signing
> artifacts (how to pass credentials for signing) and setting up Nexus
> deployment repository.