[jira] [Commented] (SOLR-14659) Remove restlet from Solr
[ https://issues.apache.org/jira/browse/SOLR-14659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17206601#comment-17206601 ] Noble Paul commented on SOLR-14659: --- This does not introduce any backward incompatible change. We should be able to commit this to 8.x > Remove restlet from Solr > > > Key: SOLR-14659 > URL: https://issues.apache.org/jira/browse/SOLR-14659 > Project: Solr > Issue Type: Improvement >Reporter: Noble Paul >Assignee: Timothy Potter >Priority: Major > Fix For: master (9.0) > > Time Spent: 1h > Remaining Estimate: 0h > > restlet is only used by managed resources. We can support that even without a > restlet. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1938: SOLR-14659: Remove restlet as dependency for the ManagedResource API
dsmiley commented on a change in pull request #1938: URL: https://github.com/apache/lucene-solr/pull/1938#discussion_r499113760

## File path: solr/core/src/java/org/apache/solr/rest/RestManager.java ##

@@ -326,44 +327,46 @@ public void doInit() throws ResourceException { } } }
   if (managedResource == null) {
-    if (Method.PUT.equals(getMethod()) || Method.POST.equals(getMethod())) {
+    final String method = getSolrRequest().getHttpMethod();
+    if ("PUT".equals(method) || "POST".equals(method)) {

Review comment: If you click the details link, it explains. It's pretty wild what it suggests... it's a stretch IMO. Good luck pulling that attack off. Besides, this is the HTTP method (fixed vocab); it's not a param. Our use of Muse here is very new; we haven't tweaked `.muse/config.toml` yet, and it needs some taming. https://docs.muse.dev/docs/configuring-muse/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
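The change under review swaps restlet's `Method` constants for plain HTTP method strings. A minimal sketch of the resulting check, with illustrative names (`isWriteMethod` is not the actual `RestManager` API; in the PR the string comes from `getSolrRequest().getHttpMethod()`):

```java
// Sketch of the string-based HTTP method dispatch adopted in the PR.
// HTTP methods are a fixed vocabulary, so an exact string comparison is
// sufficient here -- the point dsmiley makes about the Muse warning.
public class MethodCheckSketch {
    static boolean isWriteMethod(String httpMethod) {
        return "PUT".equals(httpMethod) || "POST".equals(httpMethod);
    }

    public static void main(String[] args) {
        System.out.println(isWriteMethod("PUT"));  // prints: true
        System.out.println(isWriteMethod("GET"));  // prints: false
    }
}
```

Note that `equals` is called on the constant, so a null method string simply yields `false` instead of a NullPointerException.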
[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1938: SOLR-14659: Remove restlet as dependency for the ManagedResource API
dsmiley commented on a change in pull request #1938: URL: https://github.com/apache/lucene-solr/pull/1938#discussion_r499113148

## File path: solr/core/src/java/org/apache/solr/rest/RestManager.java ##

@@ -16,15 +16,33 @@ */ package org.apache.solr.rest;
+import org.apache.solr.common.SolrException;

Review comment: I'd prefer you configure your IDE to keep java.* up front. FWIW this is in the IntelliJ config in the project.
[GitHub] [lucene-solr] thelabdude commented on pull request #1938: SOLR-14659: Remove restlet as dependency for the ManagedResource API
thelabdude commented on pull request #1938: URL: https://github.com/apache/lucene-solr/pull/1938#issuecomment-703034422 Ok, I'm good with that, mainly just wanted some buy-in from others ;-) Unless there are any objections, I'll move ahead with merging to master and 8.x ... thanks for the help @noblepaul
[jira] [Commented] (SOLR-14659) Remove restlet from Solr
[ https://issues.apache.org/jira/browse/SOLR-14659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17206585#comment-17206585 ] Noble Paul commented on SOLR-14659: --- I wish we could get rid of the {{RestManager}} name itself. There is no "REST" being managed; basically, it is just "managed resources". It has nothing to do with REST. But we will deal with that in another ticket. > Remove restlet from Solr > > > Key: SOLR-14659 > URL: https://issues.apache.org/jira/browse/SOLR-14659 > Project: Solr > Issue Type: Improvement >Reporter: Noble Paul >Assignee: Timothy Potter >Priority: Major > Fix For: master (9.0) > > Time Spent: 0.5h > Remaining Estimate: 0h > > restlet is only used by managed resources. We can support that even without a > restlet.
[GitHub] [lucene-solr] noblepaul commented on pull request #1938: SOLR-14659: Remove restlet as dependency for the ManagedResource API
noblepaul commented on pull request #1938: URL: https://github.com/apache/lucene-solr/pull/1938#issuecomment-703029595 @thelabdude if all the tests pass, I'd like this to get committed to 8.x itself.
[jira] [Commented] (SOLR-14749) Provide a clean API for cluster-level event processing
[ https://issues.apache.org/jira/browse/SOLR-14749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17206584#comment-17206584 ] Noble Paul commented on SOLR-14749: --- {quote}I had the impression we are having a proper discussion, both here and on the PR. Your -1 means you have serious technical objections, correct? {quote} My objections are not to the ticket; I'm objecting to the PR. I see too many different changes being made, and this is not healthy. I request you to make focused PRs (if required, make sub-JIRAs) so that others can make meaningful suggestions. > Provide a clean API for cluster-level event processing > -- > > Key: SOLR-14749 > URL: https://issues.apache.org/jira/browse/SOLR-14749 > Project: Solr > Issue Type: Improvement > Components: AutoScaling >Reporter: Andrzej Bialecki >Assignee: Andrzej Bialecki >Priority: Major > Labels: clean-api > Fix For: master (9.0) > > Time Spent: 14h 50m > Remaining Estimate: 0h > > This is a companion issue to SOLR-14613 and it aims at providing a clean, > strongly typed API for the functionality formerly known as "triggers" - that > is, a component for generating cluster-level events corresponding to changes > in the cluster state, and a pluggable API for processing these events. > The 8x triggers have been removed so this functionality is currently missing > in 9.0. However, this functionality is crucial for implementing the automatic > collection repair and re-balancing as the cluster state changes (nodes going > down / up, becoming overloaded / unused / decommissioned, etc). > For this reason we need this API and a default implementation of triggers > that at least can perform automatic collection repair (maintaining the > desired replication factor in presence of live node changes). > As before, the actual changes to the collections will be executed using > existing CollectionAdmin API, which in turn may use the placement plugins > from SOLR-14613. > h3.
Division of responsibility > * built-in Solr components (non-pluggable): > ** cluster state monitoring and event generation, > ** simple scheduler to periodically generate scheduled events > * plugins: > ** automatic collection repair on {{nodeLost}} events (provided by default) > ** re-balancing of replicas (periodic or on {{nodeAdded}} events) > ** reporting (eg. requesting additional node provisioning) > ** scheduled maintenance (eg. removing inactive shards after split) > h3. Other considerations > These plugins (unlike the placement plugins) need to execute on one > designated node in the cluster. Currently the easiest way to implement this > is to run them on the Overseer leader node.
[jira] [Updated] (SOLR-14911) Logger : UpdateLog Error Message : java.io.EOFException
[ https://issues.apache.org/jira/browse/SOLR-14911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] D updated SOLR-14911: - Summary: Logger : UpdateLog Error Message : java.io.EOFException (was: Looger : UpdateLog Error Message : java.io.EOFException) > Logger : UpdateLog Error Message : java.io.EOFException > --- > > Key: SOLR-14911 > URL: https://issues.apache.org/jira/browse/SOLR-14911 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrCLI >Affects Versions: 7.5 >Reporter: D >Priority: Blocker > Attachments: GC events.PNG > > > *Events:* > # GC logs showing continuous Full GC events. Log report attached. > # Core filling failed, showing less data than expected. > # The following warnings showed on the dashboard before the error. > |Level|Logger|Message| > |WARN false|ManagedIndexSchemaFactory|The schema has been upgraded to > managed, but the non-managed schema schema.xml is still loadable. > PLEASE REMOVE THIS FILE.| > |WARN false|ManagedIndexSchemaFactory|The schema has been upgraded to > managed, but the non-managed schema schema.xml is still loadable. > PLEASE REMOVE THIS FILE.| > |WARN false|SolrResourceLoader|Solr loaded a deprecated plugin/analysis class > [solr.TrieDateField]. Please consult documentation how to replace it > accordingly.| > |WARN false|ManagedIndexSchemaFactory|The schema has been upgraded to > managed, but the non-managed schema schema.xml is still loadable. > PLEASE REMOVE THIS FILE.| > |WARN false|UpdateLog|Starting log replay > tlog\{file=\data\tlog\tlog.0445482 refcount=2} > active=false starting pos=0 inSortedOrder=false| > * Total data in all cores around 8 GB > * *Other Configurations:* > ** -XX:+UseG1GC > ** -XX:+UseStringDeduplication > ** -XX:MaxGCPauseMillis=500 > ** -Xms15g > ** -Xmx15g > ** -Xss256k > * *OS Environment :* > ** Windows 10, > ** Filling cores by calling SQL query using jtds-1.3.1 library.
> ** Solr Version 7.5 > ** Runtime: Oracle Corporation OpenJDK 64-Bit Server VM 11.0.2 11.0.2+9 > ** Processors : 48 > ** System Physical Memory : 128 GB > ** Swap Space : 256GB > * solr-spec7.5.0 > ** solr-impl7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - > 2018-09-18 13:07:55 > * lucene-spec7.5.0 > ** lucene-impl7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - > 2018-09-18 13:01:1 > *Error Message :* > java.io.EOFException > at > org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:168) > at org.apache.solr.common.util.JavaBinCodec.readStr(JavaBinCodec.java:863) > at org.apache.solr.common.util.JavaBinCodec.readStr(JavaBinCodec.java:857) > at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:266) > at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:256) > at > org.apache.solr.common.util.JavaBinCodec.readSolrInputDocument(JavaBinCodec.java:603) > at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:315) > at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:256) > at org.apache.solr.common.util.JavaBinCodec.readArray(JavaBinCodec.java:747) > at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:272) > at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:256) > at > org.apache.solr.update.TransactionLog$LogReader.next(TransactionLog.java:673) > at org.apache.solr.update.UpdateLog$LogReplayer.doReplay(UpdateLog.java:1832) > at org.apache.solr.update.UpdateLog$LogReplayer.run(UpdateLog.java:1747) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:834)
[jira] [Commented] (LUCENE-9554) Expose pendingNumDocs from IndexWriter
[ https://issues.apache.org/jira/browse/LUCENE-9554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17206550#comment-17206550 ] ASF subversion and git services commented on LUCENE-9554: - Commit 77396dbf33944502814bc993c3c9db84b974 in lucene-solr's branch refs/heads/branch_8x from Nhat Nguyen [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=77396db ] LUCENE-9554: Expose IndexWriter#pendingNumDocs (#1941) Some applications can use the pendingNumDocs from IndexWriter to estimate that the number of documents of an index is very close to the hard limit so that it can reject writes without constructing Lucene documents. > Expose pendingNumDocs from IndexWriter > -- > > Key: LUCENE-9554 > URL: https://issues.apache.org/jira/browse/LUCENE-9554 > Project: Lucene - Core > Issue Type: Improvement > Components: core/index >Reporter: Nhat Nguyen >Priority: Minor > Time Spent: 1.5h > Remaining Estimate: 0h > > Some applications can use the pendingNumDocs from IndexWriter to estimate > that the number of documents of an index is reaching the hard limit so > that it can reject writes without constructing Lucene documents.
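As an illustration of the use case in the commit message, a caller could gate incoming batches on a pending-docs counter before building any Lucene documents. The sketch below models that check in plain Java; the class, method names, and the limit value are assumptions for the example, not the Lucene API (in Lucene the count would come from `IndexWriter` and the limit is `IndexWriter.MAX_DOCS`).

```java
// Illustrative model (not Lucene code) of rejecting writes up-front when
// an IndexWriter-style pending-docs count would exceed the hard limit.
public class CapacityGate {
    private final long maxDocs;    // stands in for IndexWriter.MAX_DOCS
    private long pendingNumDocs;   // stands in for the exposed pendingNumDocs

    public CapacityGate(long maxDocs, long pendingNumDocs) {
        this.maxDocs = maxDocs;
        this.pendingNumDocs = pendingNumDocs;
    }

    // Reject a batch before constructing any documents if it would
    // push the index past the limit.
    public boolean tryReserve(int batchSize) {
        if (pendingNumDocs + batchSize > maxDocs) {
            return false;  // caller fails the write early and cheaply
        }
        pendingNumDocs += batchSize;
        return true;
    }

    public static void main(String[] args) {
        CapacityGate gate = new CapacityGate(100, 95);
        System.out.println(gate.tryReserve(5));  // prints: true
        System.out.println(gate.tryReserve(1));  // prints: false
    }
}
```

The point of exposing the counter is exactly this early-rejection path: the expensive document-construction step is skipped entirely when the reservation fails.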
[jira] [Resolved] (LUCENE-9554) Expose pendingNumDocs from IndexWriter
[ https://issues.apache.org/jira/browse/LUCENE-9554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nhat Nguyen resolved LUCENE-9554. - Fix Version/s: 8.7 master (9.0) Resolution: Fixed > Expose pendingNumDocs from IndexWriter > -- > > Key: LUCENE-9554 > URL: https://issues.apache.org/jira/browse/LUCENE-9554 > Project: Lucene - Core > Issue Type: Improvement > Components: core/index >Reporter: Nhat Nguyen >Priority: Minor > Fix For: master (9.0), 8.7 > > Time Spent: 1.5h > Remaining Estimate: 0h > > Some applications can use the pendingNumDocs from IndexWriter to estimate > that the number of documents of an index is reaching the hard limit so > that it can reject writes without constructing Lucene documents.
[GitHub] [lucene-solr] dnhatn merged pull request #1944: LUCENE-9554: Expose IndexWriter#pendingNumDocs
dnhatn merged pull request #1944: URL: https://github.com/apache/lucene-solr/pull/1944
[GitHub] [lucene-solr] dnhatn opened a new pull request #1944: LUCENE-9554: Expose IndexWriter#pendingNumDocs
dnhatn opened a new pull request #1944: URL: https://github.com/apache/lucene-solr/pull/1944 Some applications can use the pendingNumDocs from IndexWriter to estimate that the number of documents of an index is very close to the hard limit so that it can reject writes without constructing Lucene documents. Backport of #1941
[jira] [Commented] (LUCENE-9554) Expose pendingNumDocs from IndexWriter
[ https://issues.apache.org/jira/browse/LUCENE-9554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17206528#comment-17206528 ] ASF subversion and git services commented on LUCENE-9554: - Commit 7e04e4d0ca3951e90c064f2fd04ca89772080c91 in lucene-solr's branch refs/heads/master from Nhat Nguyen [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=7e04e4d ] LUCENE-9554: Expose IndexWriter#pendingNumDocs (#1941) Some applications can use the pendingNumDocs from IndexWriter to estimate that the number of documents of an index is very close to the hard limit so that it can reject writes without constructing Lucene documents. > Expose pendingNumDocs from IndexWriter > -- > > Key: LUCENE-9554 > URL: https://issues.apache.org/jira/browse/LUCENE-9554 > Project: Lucene - Core > Issue Type: Improvement > Components: core/index >Reporter: Nhat Nguyen >Priority: Minor > Time Spent: 1h 10m > Remaining Estimate: 0h > > Some applications can use the pendingNumDocs from IndexWriter to estimate > that the number of documents of an index is reaching the hard limit so > that it can reject writes without constructing Lucene documents.
[GitHub] [lucene-solr] dnhatn merged pull request #1941: LUCENE-9554: Expose IndexWriter#pendingNumDocs
dnhatn merged pull request #1941: URL: https://github.com/apache/lucene-solr/pull/1941
[GitHub] [lucene-solr] dnhatn commented on pull request #1941: LUCENE-9554: Expose IndexWriter#pendingNumDocs
dnhatn commented on pull request #1941: URL: https://github.com/apache/lucene-solr/pull/1941#issuecomment-702967476 Thanks Mike!
[jira] [Commented] (SOLR-14766) Deprecate ManagedResources from Solr
[ https://issues.apache.org/jira/browse/SOLR-14766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17206513#comment-17206513 ] Timothy Potter commented on SOLR-14766: --- I moved the work over to SOLR-14659. The issue with removing the ManagedResources framework is that the LTR contrib relies on it for model / feature management. Of course we can change the LTR code to do something different, but for now it seems sufficient to keep ManagedResources but w/o Restlet. > Deprecate ManagedResources from Solr > > > Key: SOLR-14766 > URL: https://issues.apache.org/jira/browse/SOLR-14766 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Noble Paul >Assignee: Timothy Potter >Priority: Major > Labels: deprecation > Attachments: SOLR-14766.patch > > Time Spent: 1h > Remaining Estimate: 0h > > This feature has the following problems. > * It's insecure because it is using restlet > * Nobody knows that code enough to even remove the restlet dependency > * The Restlet dependency on Solr exists just because of this > We should deprecate this from 8.7 and remove it from master
[jira] [Updated] (SOLR-14659) Remove restlet from Solr
[ https://issues.apache.org/jira/browse/SOLR-14659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Potter updated SOLR-14659: -- Status: Patch Available (was: Open) > Remove restlet from Solr > > > Key: SOLR-14659 > URL: https://issues.apache.org/jira/browse/SOLR-14659 > Project: Solr > Issue Type: Improvement >Reporter: Noble Paul >Assignee: Timothy Potter >Priority: Major > Fix For: master (9.0) > > Time Spent: 20m > Remaining Estimate: 0h > > restlet is only used by managed resources. We can support that even without a > restlet.
[jira] [Commented] (SOLR-10321) Unified highlighter returns empty fields when using glob
[ https://issues.apache.org/jira/browse/SOLR-10321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17206507#comment-17206507 ] David Smiley commented on SOLR-10321: - Looking back at this... I think I'm inclined to just drop the empty snippet arrays. It wasn't, and still won't be, possible to know whether the document has a field value just by looking at the highlighting output, but that's better than needless empty output. I filed a LUCENE-side issue for the UH to address this, but I'd rather wait to address it until a real user comes along with the need, rather than a hypothetical one. > Unified highlighter returns empty fields when using glob > > > Key: SOLR-10321 > URL: https://issues.apache.org/jira/browse/SOLR-10321 > Project: Solr > Issue Type: Bug > Components: highlighter >Affects Versions: 6.4.2 >Reporter: Markus Jelsma >Priority: Minor > Fix For: 7.0 > > > {code} > q=lama=unified=content_* > {code} > returns: > {code} >name="http://www.nu.nl/weekend/3771311/dalai-lama-inspireert-westen.html;> > > > Nobelprijs Voorafgaand aan zijn bezoek aan Nederland is de dalai > <em>lama</em> in Noorwegen om te vieren dat 25 jaar geleden de > Nobelprijs voor de Vrede aan hem werd toegekend. Anders dan in Nederland > wordt de dalai <em>lama</em> niet ontvangen in het Noorse > parlement. > > > > > > > > > > > > {code} > FastVector and original do not emit: > {code} > > > > > > > > > {code}
[GitHub] [lucene-solr] dsmiley commented on pull request #1942: SOLR-14910: Use in-line tags for logger declarations in Gradle ValidateLogCalls that are non-standard, change //logok to //nowarn
dsmiley commented on pull request #1942: URL: https://github.com/apache/lucene-solr/pull/1942#issuecomment-702948130 > it was often an error to have logging calls with exception.getMessage() Ah; right, I agree with this very much and I thought I proposed this rule. > WDYT about adding a reference to SOLR-14523 to the output message? to the validation rule error? Yes; I think referring to JIRA issues is great -- gives people a place to go to learn/understand.
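The rule under discussion (the Gradle ValidateLogCalls check referenced in SOLR-14523) flags log calls built from `exception.getMessage()`. A minimal sketch of why, shown here with `java.util.logging` purely so the example is self-contained; Solr itself uses slf4j, where the preferred shape is `log.error("update failed", e)`:

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class LogCallsSketch {
    private static final Logger log = Logger.getLogger("sketch");

    public static void main(String[] args) {
        Exception e = new IllegalStateException("boom");

        // Discouraged: concatenating getMessage() keeps only the message
        // string; the exception class and the stack trace are lost (and
        // getMessage() can even be null for some exceptions).
        log.severe("update failed: " + e.getMessage());

        // Preferred: pass the throwable itself so the logger records the
        // full stack trace alongside the message.
        log.log(Level.SEVERE, "update failed", e);
    }
}
```

The same distinction holds in slf4j: the overload taking a trailing `Throwable` argument preserves the trace, while formatting the message string into the log line does not.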
[jira] [Created] (LUCENE-9556) UnifiedHighlighter: distinguish no field from no passages in field
David Smiley created LUCENE-9556: Summary: UnifiedHighlighter: distinguish no field from no passages in field Key: LUCENE-9556 URL: https://issues.apache.org/jira/browse/LUCENE-9556 Project: Lucene - Core Issue Type: Improvement Components: modules/highlighter Reporter: David Smiley The UnifiedHighlighter does not distinguish between highlighting a field that is absent on a document and highlighting a field that is present but for which no passages were found (which can happen for a variety of reasons) -- both cases produce null. While not a huge deal in general, it's an annoyance. It can be useful for the user/client to detect that the content was present but not highlightable, so that it might take some default action, like returning the whole field value (possibly to process in some way) or highlighting some other field.
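Until the highlighter exposes that distinction, a client can only apply the same fallback to both null cases. A sketch of the kind of default action the issue describes; the map-based response shape and all names here are simplifications for illustration, not the actual highlighter API:

```java
import java.util.Map;

public class HighlightFallback {
    // Returns the highlighted snippet when one exists; otherwise falls
    // back to the stored field value. A null snippet is ambiguous today:
    // it can mean "field absent" or "field present but no passages" --
    // the two cases LUCENE-9556 wants the UnifiedHighlighter to separate.
    static String display(Map<String, String> stored,
                          Map<String, String> highlights,
                          String field) {
        String snippet = highlights.get(field);
        if (snippet != null) {
            return snippet;  // highlighter produced passages
        }
        return stored.getOrDefault(field, "");
    }

    public static void main(String[] args) {
        Map<String, String> stored = Map.of("body", "plain text");
        // No highlight entry, so the stored value is shown instead.
        System.out.println(display(stored, Map.of(), "body"));  // prints: plain text
    }
}
```

With the proposed distinction, a client could skip the stored-value fallback for truly absent fields and reserve it for fields that matched the query but produced no passages.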
[GitHub] [lucene-solr] murblanc commented on a change in pull request #1758: SOLR-14749: Provide a clean API for cluster-level event processing, Initial draft.
murblanc commented on a change in pull request #1758: URL: https://github.com/apache/lucene-solr/pull/1758#discussion_r498982949

## File path: solr/core/src/java/org/apache/solr/cluster/events/impl/CollectionsRepairEventListener.java ##

@@ -0,0 +1,185 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.cluster.events.impl;
+
+import java.io.IOException;
+import java.lang.invoke.MethodHandles;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.concurrent.atomic.AtomicInteger;
+
+import org.apache.solr.client.solrj.SolrClient;
+import org.apache.solr.client.solrj.cloud.SolrCloudManager;
+import org.apache.solr.client.solrj.request.CollectionAdminRequest;
+import org.apache.solr.cloud.api.collections.Assign;
+import org.apache.solr.cluster.events.ClusterEvent;
+import org.apache.solr.cluster.events.ClusterEventListener;
+import org.apache.solr.cluster.events.NodesDownEvent;
+import org.apache.solr.cluster.events.ReplicasDownEvent;
+import org.apache.solr.common.cloud.ClusterState;
+import org.apache.solr.common.cloud.Replica;
+import org.apache.solr.common.cloud.ReplicaPosition;
+import org.apache.solr.core.CoreContainer;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * This is an illustration how to re-implement the combination of 8x
+ * NodeLostTrigger and AutoAddReplicasPlanAction to maintain the collection's replication factor.
+ * NOTE: there's no support for 'waitFor' yet.
+ * NOTE 2: this functionality would be probably more reliable when executed also as a
+ * periodically scheduled check - both as a reactive (listener) and proactive (scheduled) measure.
+ */
+public class CollectionsRepairEventListener implements ClusterEventListener {

Review comment: Is this class registered somewhere to be actually called? If so I missed that part.
[jira] [Resolved] (SOLR-14911) Looger : UpdateLog Error Message : java.io.EOFException
[ https://issues.apache.org/jira/browse/SOLR-14911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson resolved SOLR-14911. --- Resolution: Invalid Please raise questions like this on the user's list; we try to reserve JIRAs for known bugs/enhancements rather than usage questions. The JIRA system is not a support portal. See: http://lucene.apache.org/solr/community.html#mailing-lists-irc there are links to both Lucene and Solr mailing lists there. A _lot_ more people will see your question on that list and may be able to help more quickly. You might want to review: https://wiki.apache.org/solr/UsingMailingLists If it's determined that this really is a code issue or enhancement to Lucene or Solr and not a configuration/usage problem, we can raise a new JIRA or reopen this one. > Looger : UpdateLog Error Message : java.io.EOFException > --- > > Key: SOLR-14911 > URL: https://issues.apache.org/jira/browse/SOLR-14911 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrCLI >Affects Versions: 7.5 >Reporter: D >Priority: Blocker > Attachments: GC events.PNG > > > *Events:* > # GC logs showing continuous Full GC events. Log report attached. > # Core filling failed, showing less data than expected. > # The following warnings showed on the dashboard before the error. > |Level|Logger|Message| > |WARN false|ManagedIndexSchemaFactory|The schema has been upgraded to > managed, but the non-managed schema schema.xml is still loadable. > PLEASE REMOVE THIS FILE.| > |WARN false|ManagedIndexSchemaFactory|The schema has been upgraded to > managed, but the non-managed schema schema.xml is still loadable. > PLEASE REMOVE THIS FILE.| > |WARN false|SolrResourceLoader|Solr loaded a deprecated plugin/analysis class > [solr.TrieDateField].
Please consult documentation how to replace it > accordingly.| > |WARN false|ManagedIndexSchemaFactory|The schema has been upgraded to > managed, but the non-managed schema schema.xml is still loadable. > PLEASE REMOVE THIS FILE.| > |WARN false|UpdateLog|Starting log replay > tlog\{file=\data\tlog\tlog.0445482 refcount=2} > active=false starting pos=0 inSortedOrder=false| > * Total data in all cores around 8 GB > * *Other Configurations:* > ** -XX:+UseG1GC > ** -XX:+UseStringDeduplication > ** -XX:MaxGCPauseMillis=500 > ** -Xms15g > ** -Xmx15g > ** -Xss256k > * *OS Environment :* > ** Windows 10, > ** Filling cores by calling SQL query using jtds-1.3.1 library. > ** Solr Version 7.5 > ** Runtime: Oracle Corporation OpenJDK 64-Bit Server VM 11.0.2 11.0.2+9 > ** Processors : 48 > ** System Physical Memory : 128 GB > ** Swap Space : 256GB > * solr-spec7.5.0 > ** solr-impl7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - > 2018-09-18 13:07:55 > * lucene-spec7.5.0 > ** lucene-impl7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - > 2018-09-18 13:01:1 > *Error Message :* > java.io.EOFException > at > org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:168) > at org.apache.solr.common.util.JavaBinCodec.readStr(JavaBinCodec.java:863) > at org.apache.solr.common.util.JavaBinCodec.readStr(JavaBinCodec.java:857) > at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:266) > at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:256) > at > org.apache.solr.common.util.JavaBinCodec.readSolrInputDocument(JavaBinCodec.java:603) > at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:315) > at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:256) > at org.apache.solr.common.util.JavaBinCodec.readArray(JavaBinCodec.java:747) > at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:272) > at 
org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:256) > at > org.apache.solr.update.TransactionLog$LogReader.next(TransactionLog.java:673) > at org.apache.solr.update.UpdateLog$LogReplayer.doReplay(UpdateLog.java:1832) > at org.apache.solr.update.UpdateLog$LogReplayer.run(UpdateLog.java:1747) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at >
[jira] [Created] (SOLR-14911) Looger : UpdateLog Error Message : java.io.EOFException
D created SOLR-14911: Summary: Looger : UpdateLog Error Message : java.io.EOFException Key: SOLR-14911 URL: https://issues.apache.org/jira/browse/SOLR-14911 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Components: SolrCLI Affects Versions: 7.5 Reporter: D Attachments: GC events.PNG *Events:* # GC logs showing continuos Full GC events. Log report attached. # Core filling failed , showing less data than expected. # following warnings showing on dashboard before error. |Level|Logger|Message| |WARN false|ManagedIndexSchemaFactory|The schema has been upgraded to managed, but the non-managed schema schema.xml is still loadable. PLEASE REMOVE THIS FILE.| |WARN false|ManagedIndexSchemaFactory|The schema has been upgraded to managed, but the non-managed schema schema.xml is still loadable. PLEASE REMOVE THIS FILE.| |WARN false|SolrResourceLoader|Solr loaded a deprecated plugin/analysis class [solr.TrieDateField]. Please consult documentation how to replace it accordingly.| |WARN false|ManagedIndexSchemaFactory|The schema has been upgraded to managed, but the non-managed schema schema.xml is still loadable. PLEASE REMOVE THIS FILE.| |WARN false|UpdateLog|Starting log replay tlog\{file=\data\tlog\tlog.0445482 refcount=2} active=false starting pos=0 inSortedOrder=false| * Total data in all cores around 8 GB * *Other Configurations:* ** -XX:+UseG1GC ** -XX:+UseStringDeduplication ** -XX:MaxGCPauseMillis=500 ** -Xms15g ** -Xmx15g ** -Xss256k * *OS Environment :* ** Windows 10, ** Filling cores by calling SQL query using jtds-1.3.1 library. 
** Solr Version 7.5 ** Runtime: Oracle Corporation OpenJDK 64-Bit Server VM 11.0.2 11.0.2+9 ** Processors : 48 ** System Physical Memory : 128 GB ** Swap Space : 256GB * solr-spec7.5.0 ** solr-impl7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55 * lucene-spec7.5.0 ** lucene-impl7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:01:1 *Error Message :* java.io.EOFException at org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:168) at org.apache.solr.common.util.JavaBinCodec.readStr(JavaBinCodec.java:863) at org.apache.solr.common.util.JavaBinCodec.readStr(JavaBinCodec.java:857) at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:266) at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:256) at org.apache.solr.common.util.JavaBinCodec.readSolrInputDocument(JavaBinCodec.java:603) at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:315) at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:256) at org.apache.solr.common.util.JavaBinCodec.readArray(JavaBinCodec.java:747) at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:272) at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:256) at org.apache.solr.update.TransactionLog$LogReader.next(TransactionLog.java:673) at org.apache.solr.update.UpdateLog$LogReplayer.doReplay(UpdateLog.java:1832) at org.apache.solr.update.UpdateLog$LogReplayer.run(UpdateLog.java:1747) at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209) at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) at java.base/java.lang.Thread.run(Thread.java:834) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] murblanc commented on a change in pull request #1758: SOLR-14749: Provide a clean API for cluster-level event processing, Initial draft.
murblanc commented on a change in pull request #1758: URL: https://github.com/apache/lucene-solr/pull/1758#discussion_r498957288 ## File path: solr/core/src/java/org/apache/solr/cluster/events/ClusterEventProducer.java ## @@ -0,0 +1,100 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.solr.cluster.events; + +import org.apache.solr.cloud.ClusterSingleton; + +import java.util.Collections; +import java.util.Map; +import java.util.Objects; +import java.util.Set; +import java.util.concurrent.ConcurrentHashMap; + +/** + * Component that produces {@link ClusterEvent} instances. + */ +public interface ClusterEventProducer extends ClusterSingleton { + + String PLUGIN_NAME = "clusterEventProducer"; + + default String getName() { +return PLUGIN_NAME; + } + + /** + * Returns a modifiable map of event types and listeners to process events + * of a given type. + */ + Map<ClusterEvent.EventType, Set<ClusterEventListener>> getEventListeners(); + + /** + * Register an event listener for processing the specified event types. + * @param listener non-null listener. If the same instance of the listener is + * already registered it will be ignored. + * @param eventTypes non-empty array of event types that this listener + * is being registered for. 
If this is null or empty then all types will be used. + */ + default void registerListener(ClusterEventListener listener, ClusterEvent.EventType... eventTypes) throws Exception { +Objects.requireNonNull(listener); +if (eventTypes == null || eventTypes.length == 0) { + eventTypes = ClusterEvent.EventType.values(); +} +for (ClusterEvent.EventType type : eventTypes) { + Set<ClusterEventListener> perType = getEventListeners().computeIfAbsent(type, t -> ConcurrentHashMap.newKeySet()); + perType.add(listener); +} + } + + /** + * Unregister an event listener. + * @param listener non-null listener. + */ + default void unregisterListener(ClusterEventListener listener) { +Objects.requireNonNull(listener); +getEventListeners().forEach((type, listeners) -> { + listeners.remove(listener); +}); + } + + /** + * Unregister an event listener for specified event types. + * @param listener non-null listener. + * @param eventTypes event types from which the listener will be unregistered. If this + * is null or empty then all event types will be used + */ + default void unregisterListener(ClusterEventListener listener, ClusterEvent.EventType... eventTypes) { +Objects.requireNonNull(listener); +if (eventTypes == null || eventTypes.length == 0) { + eventTypes = ClusterEvent.EventType.values(); +} +for (ClusterEvent.EventType type : eventTypes) { + getEventListeners() + .getOrDefault(type, Collections.emptySet()) + .remove(listener); +} + } + + /** + * Fire an event. This method will call registered listeners that subscribed to the + * type of event being passed. + * @param event cluster event + */ + default void fireEvent(ClusterEvent event) { Review comment: Firing (publishing) events is definitely needed in solr core where events are created. Do we need to couple the publishing of events and managing the consumption of events as done here, or maybe have a simple publishing interface (with this single method) and a separate one for implementing the event distribution logic? 
This could eventually allow keeping in Solr core the things that really belong there (creating events related to solr internal state change and publishing them) while allowing to move the event bus logic outside of solr core. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
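The registration scheme discussed in this review — a map from event type to a concurrent set of listeners, populated via `computeIfAbsent(type, t -> ConcurrentHashMap.newKeySet())` as in the PR's default methods — can be sketched independently of Solr. This is a minimal illustration, not the actual Solr API: `EventType`, `Listener`, and `ListenerRegistry` are hypothetical stand-ins for `ClusterEvent.EventType`, `ClusterEventListener`, and `ClusterEventProducer`.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative stand-in for ClusterEvent.EventType.
enum EventType { NODE_ADDED, NODE_LOST, SCHEDULED }

// Illustrative stand-in for ClusterEventListener.
interface Listener { void onEvent(EventType type, String payload); }

public class ListenerRegistry {
    // type -> concurrent set of listeners. ConcurrentHashMap.newKeySet()
    // returns a thread-safe Set backed by a ConcurrentHashMap, as in the PR.
    private final Map<EventType, Set<Listener>> listeners = new ConcurrentHashMap<>();

    /** Register for the given types; null or empty means all types. */
    public void register(Listener l, EventType... types) {
        Objects.requireNonNull(l);
        if (types == null || types.length == 0) {
            types = EventType.values();
        }
        for (EventType t : types) {
            // Set semantics mean re-registering the same instance is a no-op.
            listeners.computeIfAbsent(t, k -> ConcurrentHashMap.newKeySet()).add(l);
        }
    }

    /** Unregister from all types. */
    public void unregister(Listener l) {
        Objects.requireNonNull(l);
        listeners.forEach((t, set) -> set.remove(l));
    }

    /** Deliver an event to every listener subscribed to its type. */
    public void fire(EventType type, String payload) {
        listeners.getOrDefault(type, Collections.emptySet())
                 .forEach(l -> l.onEvent(type, payload));
    }

    public int countFor(EventType type) {
        return listeners.getOrDefault(type, Collections.emptySet()).size();
    }
}
```

Note how murblanc's suggestion maps onto this sketch: `fire` is the single-method "publishing" side, while `register`/`unregister` are the consumption-management side that could live in a separate interface.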
[GitHub] [lucene-solr] dnhatn commented on a change in pull request #1941: LUCENE-9554: Expose IndexWriter#pendingNumDocs
dnhatn commented on a change in pull request #1941: URL: https://github.com/apache/lucene-solr/pull/1941#discussion_r498957345 ## File path: lucene/core/src/java/org/apache/lucene/index/IndexWriter.java ## @@ -5463,6 +5463,13 @@ private void tooManyDocs(long addedNumDocs) { throw new IllegalArgumentException("number of documents in the index cannot exceed " + actualMaxDocs + " (current document count is " + pendingNumDocs.get() + "; added numDocs is " + addedNumDocs + ")"); } + /** + * Returns the number of documents in the index including documents are being added (i.e., reserved). + */ Review comment: ++. I pushed https://github.com/apache/lucene-solr/pull/1941/commits/61d4fbc4a01cbfc50f386a8a88064e82d1027ee0
[GitHub] [lucene-solr] muse-dev[bot] commented on a change in pull request #1938: SOLR-14659: Remove restlet as dependency for the ManagedResource API
muse-dev[bot] commented on a change in pull request #1938: URL: https://github.com/apache/lucene-solr/pull/1938#discussion_r498956758 ## File path: solr/core/src/java/org/apache/solr/rest/RestManager.java ## @@ -160,6 +156,10 @@ private Pattern getReservedEndpointsPattern() { return Pattern.compile(builder.toString()); } +public boolean isPathRegistered(String s) { + return registered.containsKey(s); Review comment: *THREAD_SAFETY_VIOLATION:* Read/Write race. Non-private method `RestManager$Registry.isPathRegistered(...)` reads without synchronization from container `this.registered` via call to `Map.containsKey(...)`. Potentially races with write in method `RestManager$Registry.registerManagedResource(...)`. Reporting because another access to the same memory occurs on a background thread, although this access may not.
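The race Muse flags here — an unsynchronized `containsKey` that can interleave with a write in `registerManagedResource` — disappears if the backing map is itself a `ConcurrentHashMap`, whose individual reads and writes are thread-safe. The following is a hypothetical miniature of the pattern, not the actual `RestManager.Registry` code; `PathRegistry` and its fields are illustrative names.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical miniature of a path registry. A ConcurrentHashMap makes
// single reads (containsKey) and writes (putIfAbsent) individually safe
// without external synchronization.
public class PathRegistry {
    private final Map<String, Object> registered = new ConcurrentHashMap<>();

    // Safe to call from any thread; no lock needed for one read.
    public boolean isPathRegistered(String path) {
        return registered.containsKey(path);
    }

    // putIfAbsent avoids a check-then-act race between two registering
    // threads; returns true only for the thread that won the registration.
    public boolean register(String path, Object resource) {
        return registered.putIfAbsent(path, resource) == null;
    }
}
```

The caveat is that a concurrent map only removes the data race on each call. A check-then-act sequence spanning two calls (`isPathRegistered` followed by `register`) still needs the atomic `putIfAbsent`/`compute` form, which is part of what tools like Muse are conservatively warning about.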
[GitHub] [lucene-solr] murblanc commented on a change in pull request #1758: SOLR-14749: Provide a clean API for cluster-level event processing, Initial draft.
murblanc commented on a change in pull request #1758: URL: https://github.com/apache/lucene-solr/pull/1758#discussion_r498954021 ## File path: solr/core/src/java/org/apache/solr/cluster/events/ClusterEvent.java ## @@ -0,0 +1,57 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.solr.cluster.events; + +import org.apache.solr.common.MapWriter; + +import java.io.IOException; +import java.time.Instant; + +/** + * Cluster-level event. + */ +public interface ClusterEvent extends MapWriter { Review comment: The `MapWriter` related machinery (including the `writeMap` method) doesn't belong in the interfaces IMO. It should be hidden in the implementation if it's needed.
[jira] [Commented] (SOLR-14749) Provide a clean API for cluster-level event processing
[ https://issues.apache.org/jira/browse/SOLR-14749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17206361#comment-17206361 ] Andrzej Bialecki commented on SOLR-14749: - {quote}The PR is too complex. {quote} What part is too complex? The proposed interfaces or the proof-of-concept implementation? The APIs are quite simple IMHO. {quote}please do not make backward incompatible changes {quote} Sure, I reverted that change, although it would have been simple to accommodate the name change so that we preserve back-compat, and the current name is illogical - {{/cluster/plugin}} vs. {{/cluster/plugins}} when the location keeps information about multiple plugins. In any case, it's a trivial change so let's not focus on this. {quote}Here is my official -1 on this PR. We would like to have a proper discussion {quote} I had the impression we are having a proper discussion, both here and on the PR. Your -1 means you have serious technical objections, correct? If so then please explain in detail your objections to the proposed implementation, having in mind the goal and scope of this issue. If you have in mind an alternative approach that can satisfy these goals then I'm all ears. > Provide a clean API for cluster-level event processing > -- > > Key: SOLR-14749 > URL: https://issues.apache.org/jira/browse/SOLR-14749 > Project: Solr > Issue Type: Improvement > Components: AutoScaling >Reporter: Andrzej Bialecki >Assignee: Andrzej Bialecki >Priority: Major > Labels: clean-api > Fix For: master (9.0) > > Time Spent: 14h 20m > Remaining Estimate: 0h > > This is a companion issue to SOLR-14613 and it aims at providing a clean, > strongly typed API for the functionality formerly known as "triggers" - that > is, a component for generating cluster-level events corresponding to changes > in the cluster state, and a pluggable API for processing these events. > The 8x triggers have been removed so this functionality is currently missing > in 9.0. 
However, this functionality is crucial for implementing the automatic > collection repair and re-balancing as the cluster state changes (nodes going > down / up, becoming overloaded / unused / decommissioned, etc). > For this reason we need this API and a default implementation of triggers > that at least can perform automatic collection repair (maintaining the > desired replication factor in presence of live node changes). > As before, the actual changes to the collections will be executed using > existing CollectionAdmin API, which in turn may use the placement plugins > from SOLR-14613. > h3. Division of responsibility > * built-in Solr components (non-pluggable): > ** cluster state monitoring and event generation, > ** simple scheduler to periodically generate scheduled events > * plugins: > ** automatic collection repair on {{nodeLost}} events (provided by default) > ** re-balancing of replicas (periodic or on {{nodeAdded}} events) > ** reporting (eg. requesting additional node provisioning) > ** scheduled maintenance (eg. removing inactive shards after split) > h3. Other considerations > These plugins (unlike the placement plugins) need to execute on one > designated node in the cluster. Currently the easiest way to implement this > is to run them on the Overseer leader node.
[GitHub] [lucene-solr] cpoerschke commented on a change in pull request #1571: SOLR-14560: Interleaving for Learning To Rank
cpoerschke commented on a change in pull request #1571: URL: https://github.com/apache/lucene-solr/pull/1571#discussion_r498948018 ## File path: solr/contrib/ltr/src/java/org/apache/solr/ltr/search/LTRQParserPlugin.java ## @@ -146,93 +149,114 @@ public LTRQParser(String qstr, SolrParams localParams, SolrParams params, @Override public Query parse() throws SyntaxError { // ReRanking Model - final String modelName = localParams.get(LTRQParserPlugin.MODEL); - if ((modelName == null) || modelName.isEmpty()) { + final String[] modelNames = localParams.getParams(LTRQParserPlugin.MODEL); + if ((modelNames == null) || modelNames.length==0 || modelNames[0].isEmpty()) { throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, "Must provide model in the request"); } - - final LTRScoringModel ltrScoringModel = mr.getModel(modelName); - if (ltrScoringModel == null) { -throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, -"cannot find " + LTRQParserPlugin.MODEL + " " + modelName); - } - - final String modelFeatureStoreName = ltrScoringModel.getFeatureStoreName(); - final boolean extractFeatures = SolrQueryRequestContextUtils.isExtractingFeatures(req); - final String fvStoreName = SolrQueryRequestContextUtils.getFvStoreName(req); - // Check if features are requested and if the model feature store and feature-transform feature store are the same - final boolean featuresRequestedFromSameStore = (modelFeatureStoreName.equals(fvStoreName) || fvStoreName == null) ? 
extractFeatures:false; - if (threadManager != null) { - threadManager.setExecutor(req.getCore().getCoreContainer().getUpdateShardHandler().getUpdateExecutor()); - } - final LTRScoringQuery scoringQuery = new LTRScoringQuery(ltrScoringModel, - extractEFIParams(localParams), - featuresRequestedFromSameStore, threadManager); - - // Enable the feature vector caching if we are extracting features, and the features - // we requested are the same ones we are reranking with - if (featuresRequestedFromSameStore) { -scoringQuery.setFeatureLogger( SolrQueryRequestContextUtils.getFeatureLogger(req) ); + + LTRScoringQuery[] rerankingQueries = new LTRScoringQuery[modelNames.length]; + for (int i = 0; i < modelNames.length; i++) { +final LTRScoringQuery rerankingQuery; +if (!ORIGINAL_RANKING.equals(modelNames[i])) { Review comment: Ah, good point about the special "OriginalRanking" also appearing in the "[interleaving]" transformer! When using interleaving there's always at least two models to be interleaved, right? The models could all be actual models or one of them could be the "OriginalRanking" pseudo-model. I wonder if class inheritance might help us e.g.

```
class InterleavingLTRQParserPlugin extends LTRQParserPlugin
```

and (say) ``` ``` where

```
rq={!iltr model=myModelXYZ}
```

can be a convenience equivalent to

```
rq={!iltr model=OriginalRanking model=myModelXYZ}
```

i.e. if only one model is supplied then it is implied that the second model is the original ranking. And if the special "OriginalRanking" name doesn't suit someone (either because they already have a real model that happens to be called "OriginalRanking" or because they would prefer a different descriptor in the "[interleaving]" transformer output) then something like

```
rq={!iltr originalRankingModel=noModel model=noModel model=someModel}
```

would allow them to call the "OriginalRanking" something else e.g. "noModel" instead. We could even reject any "OriginalRanking" models that are actual models via something like

```
final String originalRankingModelName = localParams.get("originalRankingModel", "OriginalRanking" /* default */);
if (null != mr.getModel(originalRankingModelName)) {
  throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, "Found an actual '" + originalRankingModelName + "' model, please ...
}
```

However, the "[interleaving]" transformer still needs to know about the special name, hmm. At the moment we have

```
public static boolean isOriginalRanking(LTRScoringQuery rerankingQuery){
  return rerankingQuery.getScoringModel() == null;
}
```

in LTRQParserPlugin and in LTRInterleavingTransformerFactory

```
if (isOriginalRanking(rerankingQuery)) {
  doc.addField(name, ORIGINAL_RANKING);
} else {
  doc.addField(name, rerankingQuery.getScoringModel().getName());
}
```

and if we gave LTRScoringQuery a getScoringModelName method

```
public String getScoringModelName() {
  return ltrScoringModel.getName();
}
```
[GitHub] [lucene-solr] mikemccand commented on a change in pull request #1941: LUCENE-9554: Expose IndexWriter#pendingNumDocs
mikemccand commented on a change in pull request #1941: URL: https://github.com/apache/lucene-solr/pull/1941#discussion_r498937003 ## File path: lucene/core/src/java/org/apache/lucene/index/IndexWriter.java ## @@ -5463,6 +5463,13 @@ private void tooManyDocs(long addedNumDocs) { throw new IllegalArgumentException("number of documents in the index cannot exceed " + actualMaxDocs + " (current document count is " + pendingNumDocs.get() + "; added numDocs is " + addedNumDocs + ")"); } + /** + * Returns the number of documents in the index including documents are being added (i.e., reserved). + */ Review comment: Maybe add `@lucene.experimental`? We are exposing (slightly) internal details about `IndexWriter` so maybe we need to reserve the right to change this API in the future ...
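The accounting being exposed here is straightforward to model: an atomic counter that is bumped when documents are reserved and read by the new accessor, with the suggested `@lucene.experimental` tag in the javadoc. This sketch uses a plain `AtomicLong` in a toy class rather than `IndexWriter`'s internals; `PendingDocsCounter` and `reserve` are illustrative names, not Lucene API.

```java
import java.util.concurrent.atomic.AtomicLong;

// Toy stand-in for the IndexWriter document accounting being exposed.
public class PendingDocsCounter {
    private final AtomicLong pendingNumDocs = new AtomicLong();

    /** Reserve room for newly added documents before they are flushed. */
    public void reserve(long addedNumDocs) {
        pendingNumDocs.addAndGet(addedNumDocs);
    }

    /**
     * Returns the number of documents in the index, including documents
     * that are being added (i.e., reserved).
     *
     * @lucene.experimental
     */
    public long getPendingNumDocs() {
        return pendingNumDocs.get();
    }
}
```

The `@lucene.experimental` javadoc tag is Lucene's convention for marking an API whose signature may change across minor releases, which is exactly the reservation mikemccand is asking for.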
[GitHub] [lucene-solr] dnhatn commented on pull request #1941: LUCENE-9554: Expose IndexWriter#pendingNumDocs
dnhatn commented on pull request #1941: URL: https://github.com/apache/lucene-solr/pull/1941#issuecomment-702837408 > I was a little confused by why we separate negative and non-negative cases in IndexWriter... @mikemccand Thank you for looking. I reverted that change and will make it in a separate PR.
[jira] [Commented] (LUCENE-4510) when a test's heart beats it should also throw up (dump stack of all threads)
[ https://issues.apache.org/jira/browse/LUCENE-4510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17206317#comment-17206317 ] Michael McCandless commented on LUCENE-4510: Thanks [~dweiss], [~rcmuir] and [~uschindler]! I will attempt all of the above solutions when I next need it again :) > when a test's heart beats it should also throw up (dump stack of all threads) > - > > Key: LUCENE-4510 > URL: https://issues.apache.org/jira/browse/LUCENE-4510 > Project: Lucene - Core > Issue Type: Bug >Reporter: Michael McCandless >Assignee: Dawid Weiss >Priority: Major > > We've had numerous cases where tests were hung but the "operator" of that > particular Jenkins instance struggles to properly get a stack dump for all > threads and eg accidentally kills the process instead (rather awful that the > same powerful tool "kill" can be used to get stack traces and to destroy the > process...). > Is there some way the test infra could do this for us, eg when it prints the > HEARTBEAT message?
[GitHub] [lucene-solr] thelabdude commented on pull request #1938: SOLR-14659: Remove restlet as dependency for the ManagedResource API
thelabdude commented on pull request #1938: URL: https://github.com/apache/lucene-solr/pull/1938#issuecomment-702832996 @noblepaul so you had SOLR-14659 with fix version 8.7. Given all the tests pass, I think we could apply these changes for 8.7 but I'd be more comfortable if we just left 8.x as-is and apply these changes to 9.x ... wdyt?
[jira] [Commented] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document
[ https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17206314#comment-17206314 ] Michael McCandless commented on LUCENE-9444: [~goankur] can we re-resolve this one now? > Need an API to easily fetch facet labels for a field in a document > -- > > Key: LUCENE-9444 > URL: https://issues.apache.org/jira/browse/LUCENE-9444 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/facet >Affects Versions: 8.6 >Reporter: Ankur >Priority: Major > Labels: facet > Fix For: master (9.0), 8.7 > > Attachments: LUCENE-9444.patch, LUCENE-9444.patch, > LUCENE-9444.v2.patch > > Time Spent: 5h 10m > Remaining Estimate: 0h > > A facet field may be included in the list of fields whose values are to be > returned for each hit. > In order to get the facet labels for each hit we need to > # Create an instance of _DocValuesOrdinalsReader_ and invoke > _getReader(LeafReaderContext context)_ method to obtain an instance of > _OrdinalsSegmentReader()_ > # _OrdinalsSegmentReader.get(int docID, IntsRef ordinals)_ method is then > used to fetch and decode the binary payload in the document's BinaryDocValues > field. This provides the ordinals that refer to facet labels in the > taxonomy.** > # Lastly TaxonomyReader.getPath(ord) is used to fetch the labels to be > returned. > > Ideally there should be a simple API - *String[] getLabels(docId)* that hides > all the above details and gives us the string labels. This can be part of > *TaxonomyFacets* but that's just one idea. > I am opening this issue to get community feedback and suggestions. >
[GitHub] [lucene-solr] mayya-sharipova commented on pull request #1943: LUCENE-9555 Ensure scorerIterator is fresh for opt
mayya-sharipova commented on pull request #1943: URL: https://github.com/apache/lucene-solr/pull/1943#issuecomment-702831820 This patch also addresses the failure of `TestUnifiedHighlighterStrictPhrases.testBasics`. This test was using `ReqExclBulkScorer`, which led to `scorerIterator` being already on max document when we were trying to make a conjunction between `scorerIterator` and `collectorIterator`.
[jira] [Commented] (SOLR-14659) Remove restlet from Solr
[ https://issues.apache.org/jira/browse/SOLR-14659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17206313#comment-17206313 ] Timothy Potter commented on SOLR-14659: --- I'm changing the fix version to 9.0 for this effort. > Remove restlet from Solr > > > Key: SOLR-14659 > URL: https://issues.apache.org/jira/browse/SOLR-14659 > Project: Solr > Issue Type: Improvement >Reporter: Noble Paul >Assignee: Timothy Potter >Priority: Major > Fix For: master (9.0) > > > restlet is only used by managed resources. We can support that even without a > restlet.
[jira] [Assigned] (SOLR-14659) Remove restlet from Solr
[ https://issues.apache.org/jira/browse/SOLR-14659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Potter reassigned SOLR-14659: - Assignee: Timothy Potter > Remove restlet from Solr > > > Key: SOLR-14659 > URL: https://issues.apache.org/jira/browse/SOLR-14659 > Project: Solr > Issue Type: Improvement >Reporter: Noble Paul >Assignee: Timothy Potter >Priority: Major > Fix For: 8.7 > > > restlet is only used by managed resources. We can support that even without a > restlet.
[jira] [Updated] (SOLR-14659) Remove restlet from Solr
[ https://issues.apache.org/jira/browse/SOLR-14659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Potter updated SOLR-14659: -- Fix Version/s: (was: 8.7) master (9.0) > Remove restlet from Solr > > > Key: SOLR-14659 > URL: https://issues.apache.org/jira/browse/SOLR-14659 > Project: Solr > Issue Type: Improvement >Reporter: Noble Paul >Assignee: Timothy Potter >Priority: Major > Fix For: master (9.0) > > > restlet is only used by managed resources. We can support that even without a > restlet.
[GitHub] [lucene-solr] mayya-sharipova opened a new pull request #1943: LUCENE-9555 Ensure scorerIterator is fresh for opt
mayya-sharipova opened a new pull request #1943: URL: https://github.com/apache/lucene-solr/pull/1943 Some collectors provide iterators that can efficiently skip non-competitive docs. When using the DefaultBulkScorer#score function we create a conjunction of scorerIterator and collectorIterator. As collectorIterator always starts from a docID = -1, and for creation of the conjunction iterator we need all of its sub-iterators to be on the same doc, the creation of the conjunction iterator will fail if scorerIterator has already been advanced to some other document. This patch ensures that we create the conjunction between scorerIterator and collectorIterator only if scorerIterator has not been advanced yet. Relates to #1725 Relates to #1937
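The fix described above reduces to a freshness check: an iterator that has not yet been advanced reports `docID() == -1` (the same convention Lucene's `DocIdSetIterator` uses, with `Integer.MAX_VALUE` as `NO_MORE_DOCS`), so the conjunction may only be built in that state. This is a self-contained sketch with a minimal stand-in iterator interface, not Lucene's actual `DefaultBulkScorer` code; `Disi`, `FreshnessGuard`, and `canConjoin` are illustrative names.

```java
// Minimal stand-in for a doc-ID iterator: docID() == -1 means "not
// advanced yet", Integer.MAX_VALUE means exhausted (mirroring the
// DocIdSetIterator conventions the PR description relies on).
interface Disi {
    int docID();
    int nextDoc();
}

public class FreshnessGuard {
    /**
     * True only while it is still legal to wrap the scorer iterator in a
     * conjunction with a collector iterator. A conjunction requires all
     * sub-iterators to be positioned on the same doc; since the collector
     * iterator always starts at -1, the scorer iterator must still be at -1.
     */
    public static boolean canConjoin(Disi scorerIterator) {
        return scorerIterator.docID() == -1;
    }
}
```

In the `ReqExclBulkScorer` failure quoted in the follow-up comment, the scorer iterator had already been advanced to the max document, so this guard would correctly skip building the conjunction and fall back to the unoptimized path.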
[jira] [Updated] (LUCENE-9555) Sort optimization failure if scorerIterator is already advanced
[ https://issues.apache.org/jira/browse/LUCENE-9555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayya Sharipova updated LUCENE-9555:

Description: Some collectors provide iterators that can efficiently skip non-competitive docs. When using DefaultBulkScorer#score function we create a conjunction of scorerIterator and collectorIterator. The problem could be if scorerIterator has already been advanced. As collectorIterator always starts from a docID = -1, and for creation of conjunction iterator we need all of its sub-iterators to be on the same doc, the creation of conjunction iterator will fail. We need to create a conjunction between scorerIterator and collectorIterator only if scorerIterator has not been advanced yet. Relates to https://issues.apache.org/jira/browse/LUCENE-9280 Relates to https://issues.apache.org/jira/browse/LUCENE-9541

was: (the same text; a formatting-only edit)

> Sort optimization failure if scorerIterator is already advanced
>
> Key: LUCENE-9555
> URL: https://issues.apache.org/jira/browse/LUCENE-9555
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Mayya Sharipova
> Priority: Minor
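The guard described above can be sketched outside Lucene with a minimal stand-in interface. This is an illustrative sketch, not the actual patch: `DocIterator`, `canConjoin`, and `canUseCollectorIterator` are hypothetical names standing in for Lucene's `DocIdSetIterator` and the logic in `DefaultBulkScorer`/`ConjunctionDISI`. The key invariants are exactly the ones the issue states: a conjunction is only valid when every sub-iterator sits on the same doc, and the collector iterator (which always starts at docID == -1) may only be conjoined while the scorer iterator is still un-advanced.

```java
public class ConjunctionGuard {

  interface DocIterator {
    int docID(); // -1 means "not advanced yet", as with Lucene's DocIdSetIterator
  }

  // A conjunction iterator is only valid when every sub-iterator is
  // positioned on the same document.
  static boolean canConjoin(DocIterator... iterators) {
    final int first = iterators[0].docID();
    for (DocIterator it : iterators) {
      if (it.docID() != first) {
        return false;
      }
    }
    return true;
  }

  // The collector's skipping iterator always starts at docID == -1, so it
  // may only be conjoined with a scorer iterator that is still fresh.
  static boolean canUseCollectorIterator(DocIterator scorerIterator) {
    return scorerIterator.docID() == -1;
  }
}
```

With this check, a scorer iterator that has already been advanced simply skips the sort optimization instead of failing while building the conjunction.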
[jira] [Updated] (LUCENE-9555) Sort optimization failure if scorerIterator is already advanced
[ https://issues.apache.org/jira/browse/LUCENE-9555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayya Sharipova updated LUCENE-9555:

Description: Some collectors provide iterators that can efficiently skip non-competitive docs. When using DefaultBulkScorer#score function we create a conjunction of scorerIterator and collectorIterator. The problem could be if scorerIterator has already been advanced. As collectorIterator always starts from a docID = -1, and for creation of conjunction iterator we need all of its sub-iterators to be on the same doc, the creation of conjunction iterator will fail. We need to create a conjunction between scorerIterator and collectorIterator only if scorerIterator has not been advanced yet. Relates to https://issues.apache.org/jira/browse/LUCENE-9280 Relates to https://issues.apache.org/jira/browse/LUCENE-9541

was: Some collectors provide iterators that can efficiently skip non-competitive docs. When using DefaultBulkScorer#score function we create a conjunction of scorerIterator and collectorIterator. The problem could be that scorerIterator has already been advanced, while collectorIterator always starts from a docID = -1. We need to create a conjunction between scorerIterator and collectorIterator only if scorerIterator has not been advanced yet. Relates to https://issues.apache.org/jira/browse/LUCENE-9280 Relates to https://issues.apache.org/jira/browse/LUCENE-9541
[GitHub] [lucene-solr] mikemccand commented on a change in pull request #1912: LUCENE-9535: Try to do larger flushes.
mikemccand commented on a change in pull request #1912: URL: https://github.com/apache/lucene-solr/pull/1912#discussion_r498242843

## File path: lucene/core/src/java/org/apache/lucene/index/ApproximatePriorityQueue.java

@@ -0,0 +1,136 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.index;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.ListIterator;
+import java.util.function.Predicate;
+
+/**
+ * An approximate priority queue, which attempts to poll items by decreasing
+ * log of the weight, though exact ordering is not guaranteed.
+ * This class doesn't support null elements.
+ */
+final class ApproximatePriorityQueue<T> {
+
+  // Indexes between 0 and 63 are sparely populated, and indexes that are
+  // greater than or equal to 64 are densely populated
+  // Items closed to the beginning of this list are more likely to have a
+  // higher weight.
+  private final List<T> slots = new ArrayList<>(Long.SIZE);
+
+  // A bitset where ones indicate that the corresponding index in `slots` is taken.
+  private long usedSlots = 0L;
+
+  ApproximatePriorityQueue() {
+    for

Review comment (on "Items closed to the beginning of this list"): s/`closed`/`close`?

Review comment (on "Indexes between 0 and 63 are sparely populated"): s/`sparely`/`sparsely`?
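The mechanics the quoted comments describe can be sketched with plain JDK calls. This is a simplified sketch, not the actual Lucene class: it assumes (based on the comments) that the slot index is derived from the base-2 log of an item's weight via its leading zero count, so heavier items land nearer the front of the list, and that a `long` bitset tracks which slots are taken.

```java
public class LogWeightSlots {

  // Slot index = number of leading zero bits of the weight, so a larger
  // weight maps to a smaller slot index, i.e. nearer the front of the list.
  static int slotFor(long weight) {
    if (weight <= 0) {
      return Long.SIZE - 1; // treat non-positive weights as lightest
    }
    return Long.numberOfLeadingZeros(weight);
  }

  // `usedSlots` plays the role of the bitset in the quoted code:
  // bit i set means slot i is taken.
  static long markUsed(long usedSlots, int slot) {
    return usedSlots | (1L << slot);
  }

  static boolean isUsed(long usedSlots, int slot) {
    return (usedSlots & (1L << slot)) != 0;
  }
}
```

For example, a weight of 1 maps to slot 63 (the very back), while larger weights map to progressively smaller slot indexes; only an approximate ordering is preserved, which matches the class javadoc.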
[GitHub] [lucene-solr] ErickErickson commented on pull request #1942: SOLR-14910: Use in-line tags for logger declarations in Gradle ValidateLogCalls that are non-standard, change //logok to //nowarn
ErickErickson commented on pull request #1942: URL: https://github.com/apache/lucene-solr/pull/1942#issuecomment-702826141 It was decided (there's a phrase I hate, used to avoid taking responsibility for something)... Anyway, Andras Salamon and I decided that it was often an error to have logging calls with exception.getMessage() rather than the full stack trace, see SOLR-14523, and the checking code isn't sophisticated enough to figure out that this call to getMessage() has nothing to do with an exception. This is a case where it doesn't matter whether it's in an "if (loglevelenabled)" clause or not... WDYT about adding a reference to SOLR-14523 to the output message? That's not obvious on a code review, unfortunately...

> On Oct 2, 2020, at 11:22 AM, David Smiley wrote:
> What does our source validation complain about here?
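The concern behind SOLR-14523 is easy to demonstrate with plain JDK calls. This is an illustrative sketch, not Solr code: the two helpers stand in for what a logger records when passed `e.getMessage()` versus the throwable itself.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

public class LogDetail {

  // What log.error("...: {}", e.getMessage()) records: the message only,
  // with no indication of where the exception came from.
  static String messageOnly(Exception e) {
    return e.getMessage();
  }

  // What log.error("...", e) records: the message plus the full stack trace.
  static String fullTrace(Exception e) {
    StringWriter sw = new StringWriter();
    e.printStackTrace(new PrintWriter(sw, true));
    return sw.toString();
  }
}
```

The validator flags the getMessage() form because the stack trace is usually what you need when debugging; the difficulty Erick describes is that it cannot tell when getMessage() is called on something that is not an exception at all.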
[jira] [Created] (LUCENE-9555) Sort optimization failure if scorerIterator is already advanced
Mayya Sharipova created LUCENE-9555: --- Summary: Sort optimization failure if scorerIterator is already advanced Key: LUCENE-9555 URL: https://issues.apache.org/jira/browse/LUCENE-9555 Project: Lucene - Core Issue Type: Bug Reporter: Mayya Sharipova Some collectors provide iterators that can efficiently skip non-competitive docs. When using DefaultBulkScorer#score function we create a conjunction of scorerIterator and collectorIterator. The problem could be that scorerIterator has already been advanced, while collectorIterator always starts from a docID = -1. We need to create a conjunction between scorerIterator and collectorIterator only if scorerIterator has not been advanced yet. Relates to https://issues.apache.org/jira/browse/LUCENE-9280 Relates to https://issues.apache.org/jira/browse/LUCENE-9541
[GitHub] [lucene-solr] thelabdude commented on a change in pull request #1938: SOLR-14766: Remove restlet as dependency for the ManagedResource API
thelabdude commented on a change in pull request #1938: URL: https://github.com/apache/lucene-solr/pull/1938#discussion_r498916106

## File path: solr/core/src/java/org/apache/solr/rest/RestManager.java

@@ -326,44 +327,46 @@ public void doInit() throws ResourceException {
       }
     }
   }

   if (managedResource == null) {
-    if (Method.PUT.equals(getMethod()) || Method.POST.equals(getMethod())) {
+      final String method = getSolrRequest().getHttpMethod();
+      if ("PUT".equals(method) || "POST".equals(method)) {

Review comment: String equals unsafe? Not sure I understand what this warning is trying to convey?
[GitHub] [lucene-solr] thelabdude commented on pull request #1938: SOLR-14766: Remove restlet as dependency for the ManagedResource API
thelabdude commented on pull request #1938: URL: https://github.com/apache/lucene-solr/pull/1938#issuecomment-702821500 Thanks @noblepaul ... that's much cleaner. Patch applied.
[GitHub] [lucene-solr] iverase commented on pull request #1940: LUCENE-9552: Adds a LatLonPoint query that accepts an array of LatLonGeometries
iverase commented on pull request #1940: URL: https://github.com/apache/lucene-solr/pull/1940#issuecomment-702819188 I have two motivations for this change: 1) Finding all points within a distance of a line can be approximated by creating a polygon and two circles and finding all points that are inside any of those geometries. Currently three queries are needed inside a boolean OR; with this approach it can be executed in a single query. 2) API-wise, it simplifies the implementation as you don't need to know which geometry you are dealing with to create the query.
[GitHub] [lucene-solr] mikemccand commented on a change in pull request #1941: LUCENE-9554: Expose IndexWriter#pendingNumDocs
mikemccand commented on a change in pull request #1941: URL: https://github.com/apache/lucene-solr/pull/1941#discussion_r498907778

## File path: lucene/core/src/java/org/apache/lucene/index/IndexWriter.java

@@ -1072,7 +1073,14 @@ public IndexWriter(Directory d, IndexWriterConfig conf) throws IOException {
     config.getFlushPolicy().init(config);
     bufferedUpdatesStream = new BufferedUpdatesStream(infoStream);
-    docWriter = new DocumentsWriter(flushNotifications, segmentInfos.getIndexCreatedVersionMajor(), pendingNumDocs,
+    final IntConsumer reserveDocs = numDocs -> {
+      if (numDocs > 0) {
+        reserveDocs(numDocs);

Review comment: I'm confused why we are calling separate methods when `numDocs` is negative or not?
[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1942: SOLR-14910: Use in-line tags for logger declarations in Gradle ValidateLogCalls that are non-standard, change //logok to //n
dsmiley commented on a change in pull request #1942: URL: https://github.com/apache/lucene-solr/pull/1942#discussion_r498887554

## File path: solr/core/src/java/org/apache/solr/servlet/HttpSolrCall.java

@@ -493,7 +493,7 @@ Action authorize() throws IOException {
   }
   if (statusCode == AuthorizationResponse.FORBIDDEN.statusCode) {
     if (log.isDebugEnabled()) {
-      log.debug("UNAUTHORIZED auth header {} context : {}, msg: {}", req.getHeader("Authorization"), context, authResponse.getMessage()); // logOk
+      log.debug("UNAUTHORIZED auth header {} context : {}, msg: {}", req.getHeader("Authorization"), context, authResponse.getMessage()); // nowarn

Review comment: What does our source validation complain about here? Many of the logok/nowarn places look fine to me at a glance but I'm no match for the logging policeman ;-)
[GitHub] [lucene-solr] ErickErickson opened a new pull request #1942: SOLR-14910: Use in-line tags for logger declarations in Gradle ValidateLogCalls that are non-standard, change //logok to //nowarn
ErickErickson opened a new pull request #1942: URL: https://github.com/apache/lucene-solr/pull/1942 Most of this is just changing //logok to //nowarn. The substantive changes are in validate-log-calls.gradle, taking out the special handling for several files and _not_ producing failures if there's a //nowarn on the line instead, plus adding a few //nowarn tags in the files that were handled specially.
[GitHub] [lucene-solr] rmuir commented on pull request #1940: LUCENE-9552: Adds a LatLonPoint query that accepts an array of LatLonGeometries
rmuir commented on pull request #1940: URL: https://github.com/apache/lucene-solr/pull/1940#issuecomment-702790376 Why is this needed when the existing Polygon is a multipolygon and optimizes for that case? Instead of making an array of polygons, just use a single MultiPolygon?
[GitHub] [lucene-solr] cpoerschke commented on pull request #1890: Rename ConfigSetsAPITest to TestConfigSetsAPISolrCloud
cpoerschke commented on pull request #1890: URL: https://github.com/apache/lucene-solr/pull/1890#issuecomment-702776305 > ... if adding of `ConfigSetsAPITest` functionality to `TestConfigSetsAPI` might be better? ... Looking more closely, there seem to be sufficient differences between the tests, i.e. if they remain separate they would be clearer. However, `TestConfigSetsAPIShareSchema` could be a better alternative name than `TestConfigSetsAPISolrCloud`.
[GitHub] [lucene-solr] dnhatn opened a new pull request #1941: LUCENE-9554: Expose IndexWriter#pendingNumDocs
dnhatn opened a new pull request #1941: URL: https://github.com/apache/lucene-solr/pull/1941 Some applications can use the pendingNumDocs from IndexWriter to estimate whether the number of documents in an index is very close to the hard limit, so that they can reject writes without constructing Lucene documents.
[GitHub] [lucene-solr] cpoerschke merged pull request #1920: branch_8x: add two missing(?) solr/CHANGES.txt entries
cpoerschke merged pull request #1920: URL: https://github.com/apache/lucene-solr/pull/1920
[jira] [Created] (LUCENE-9554) Expose pendingNumDocs from IndexWriter
Nhat Nguyen created LUCENE-9554: --- Summary: Expose pendingNumDocs from IndexWriter Key: LUCENE-9554 URL: https://issues.apache.org/jira/browse/LUCENE-9554 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Nhat Nguyen Some applications can use the pendingNumDocs from IndexWriter to estimate whether the number of documents in an index is approaching the hard limit, so that they can reject writes without constructing Lucene documents.
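An application-side admission check built on the accessor this issue proposes could look roughly like the sketch below. The helper class and method names are hypothetical; the constant mirrors Lucene's `IndexWriter.MAX_DOCS` hard limit (`Integer.MAX_VALUE - 128`), and the `pendingNumDocs` argument would come from the exposed `IndexWriter` accessor.

```java
public class DocBudget {

  // Mirrors Lucene's IndexWriter.MAX_DOCS hard limit (Integer.MAX_VALUE - 128).
  static final long MAX_DOCS = 2_147_483_519L;

  // Reject a batch up front, before building any Lucene documents,
  // if it could push the writer past the hard document limit.
  static boolean wouldExceedLimit(long pendingNumDocs, int batchSize) {
    return pendingNumDocs + batchSize > MAX_DOCS;
  }
}
```

Note this is only an estimate, as the issue says: pendingNumDocs can change concurrently, so the writer's own reservation logic remains the authoritative check.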
[jira] [Assigned] (SOLR-14910) Use in-line tags for logger declarations in Gradle ValidateLogCalls that are non-standard, change //logok to //nowarn
[ https://issues.apache.org/jira/browse/SOLR-14910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson reassigned SOLR-14910: Assignee: Erick Erickson

> Use in-line tags for logger declarations in Gradle ValidateLogCalls that are
> non-standard, change //logok to //nowarn
>
> Key: SOLR-14910
> URL: https://issues.apache.org/jira/browse/SOLR-14910
> Project: Solr
> Issue Type: Improvement
> Security Level: Public (Default Security Level. Issues are Public)
> Reporter: Erick Erickson
> Assignee: Erick Erickson
> Priority: Minor
[GitHub] [lucene-solr] mayya-sharipova commented on a change in pull request #1937: LUCENE-9541 ConjunctionDISI sub-iterators check
mayya-sharipova commented on a change in pull request #1937: URL: https://github.com/apache/lucene-solr/pull/1937#discussion_r498851410

## File path: lucene/core/src/java/org/apache/lucene/search/ConjunctionDISI.java

@@ -140,6 +141,13 @@ private static void addTwoPhaseIterator(TwoPhaseIterator twoPhaseIter, List<DocIdSetIterator> allIterators, List<TwoPhaseIterator> twoPhaseIterators) {
+
+    // assert that all sub-iterators are on the same doc ID
+    int curDoc = allIterators.size() > 0 ? allIterators.get(0).docID() : twoPhaseIterators.get(0).approximation.docID();
+    boolean iteratorsOnTheSameDoc = allIterators.stream().allMatch(it -> it.docID() == curDoc);
+    iteratorsOnTheSameDoc = iteratorsOnTheSameDoc && twoPhaseIterators.stream().allMatch(it -> it.approximation().docID() == curDoc);
+    assert iteratorsOnTheSameDoc : "Sub-iterators of ConjunctionDISI are not the same document!";

Review comment: addressed in 74151e3

## File path: lucene/core/src/java/org/apache/lucene/search/ConjunctionDISI.java

@@ -227,6 +236,7 @@ private int doNext(int doc) throws IOException {
   @Override
   public int advance(int target) throws IOException {
+    assertItersOnSameDoc();

Review comment: addressed in 74151e3
[GitHub] [lucene-solr] iverase opened a new pull request #1940: LUCENE-9552: Adds a LatLonPoint query that accepts an array of LatLonGeometries
iverase opened a new pull request #1940: URL: https://github.com/apache/lucene-solr/pull/1940 New query that accepts an array of LatLonGeometries.
[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1921: SOLR-14829: Improve documentation for Request Handlers in RefGuide and solrconfig.xml
dsmiley commented on a change in pull request #1921: URL: https://github.com/apache/lucene-solr/pull/1921#discussion_r498832396

## File path: solr/solr-ref-guide/src/common-query-parameters.adoc

@@ -307,11 +307,13 @@ The `echoParams` parameter controls what information about request parameters is

The `echoParams` parameter accepts the following values:

-* `explicit`: This is the default value. Only parameters included in the actual request, plus the `_` parameter (which is a 64-bit numeric timestamp) will be added to the `params` section of the response header.
+* `explicit`: Only parameters included in the actual request, plus the `_` parameter (which is a 64-bit numeric timestamp) will be added to the `params` section of the response header.

Review comment: I did some digging and discovered that the admin UI will add this underscore. See `services.js`, which does this via `Date.now()` all over the place. I don't know what purpose it has; git blame shows it was added with that whole UI refactor of its day, but perhaps it predated it? AFAICT Solr isn't doing anything with it on the Solr side; I set multiple conditional debugger breakpoints across SolrParams subclasses but nowhere is "_" requested. My suspicion is that it was added to foil HTTP cache attempts? I did some JIRA digging and found https://issues.apache.org/jira/browse/SOLR-4311, which has a patch with the underscores, but it was for the version, not the timestamp. Given the underscore's use to defeat bad caching in that scenario, if I keep digging, I'll probably find it. In short, I don't think we should document this underscore param in the ref guide. It's not a special param to Solr; it's an oddity of our current admin UI.
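The cache-busting pattern the admin UI uses can be sketched in a few lines. This is an illustration of the pattern only, translated to Java; the real code in `services.js` is JavaScript and calls `Date.now()`. A throwaway `_` parameter carrying the current timestamp makes every request URL unique, so intermediaries and the browser never serve a stale cached response.

```java
public class CacheBuster {

  // Appends a `_=<timestamp>` parameter so that no two requests share a URL,
  // defeating HTTP response caching. Solr itself never reads this parameter.
  static String withCacheBuster(String url, long timestampMillis) {
    char sep = url.contains("?") ? '&' : '?';
    return url + sep + "_=" + timestampMillis;
  }
}
```

For example, `/solr/admin/info/system` becomes `/solr/admin/info/system?_=1601650000000`, which is why the `_` shows up in echoed params without being a real Solr parameter.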
[jira] [Updated] (SOLR-14910) Use in-line tags for logger declarations in Gradle ValidateLogCalls that are non-standard, change //logok to //nowarn
[ https://issues.apache.org/jira/browse/SOLR-14910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson updated SOLR-14910: Summary: Use in-line tags for logger declarations in Gradle ValidateLogCalls that are non-standard, change //logok to //nowarn (was: Use in-line tags for logger declarations in Gradle ValidateLogCalls that are non-standard)
[jira] [Commented] (SOLR-14910) Use in-line tags for logger declarations in Gradle ValidateLogCalls that are non-standard
[ https://issues.apache.org/jira/browse/SOLR-14910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17206151#comment-17206151 ] Erick Erickson commented on SOLR-14910: --- [~dsmiley] pointed out that having to muck with the gradle build system to handle non-standard logger declarations is yucky, and looking back at that code I don't know what I was thinking. We already have a //logok flag, and being able to ignore a declaration if that tag is present isn't nearly so yucky. For instance, HttpServer2.java requires an upper-case LOG in order for Hadoop to function (yuck) and we want our log variables to be just lower-case "log". Or SolrCore.java declares requestLog and slowLog, which are perfectly valid but aren't just "log". Along the way, David suggested that //nowarn is more general and can be used in other situations than a specific //logok, which makes sense. PR shortly. I'll commit this probably tomorrow absent objections.
[jira] [Created] (SOLR-14910) Use in-line tags for logger declarations in Gradle ValidateLogCalls that are non-standard
Erick Erickson created SOLR-14910:
-------------------------------------

             Summary: Use in-line tags for logger declarations in Gradle ValidateLogCalls that are non-standard
                 Key: SOLR-14910
                 URL: https://issues.apache.org/jira/browse/SOLR-14910
             Project: Solr
          Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
            Reporter: Erick Erickson
[GitHub] [lucene-solr] iverase opened a new pull request #1939: Adds a XYPoint query that accepts an array of XYGeometries
iverase opened a new pull request #1939:
URL: https://github.com/apache/lucene-solr/pull/1939

   New query that accepts an array of XYGeometries.
[jira] [Commented] (LUCENE-9004) Approximate nearest vector search
[ https://issues.apache.org/jira/browse/LUCENE-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17206088#comment-17206088 ]

Jim Ferenczi commented on LUCENE-9004:
--------------------------------------

> I think this is not so for the NSW algorithm; you just keep the docids, so
> N*M*4 is the cost. Still this doesn't change any of the scaling conclusions.

My comment was about the cost of building, where you need to keep the list sorted by distances.

+1 for the plan [~sokolov], and +1 to start with a simple NSW graph that we can make hierarchical later if needed. Filtering with a query is tricky and will require special treatment based on the internal implementation, so I agree that it would be simpler to consider it out of scope at the moment.

> Approximate nearest vector search
> ---------------------------------
>
>                 Key: LUCENE-9004
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9004
>             Project: Lucene - Core
>          Issue Type: New Feature
>            Reporter: Michael Sokolov
>            Priority: Major
>         Attachments: hnsw_layered_graph.png
>
>          Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> "Semantic" search based on machine-learned vector "embeddings" representing
> terms, queries and documents is becoming a must-have feature for a modern
> search engine. SOLR-12890 is exploring various approaches to this, including
> providing vector-based scoring functions. This is a spinoff issue from that.
> The idea here is to explore approximate nearest-neighbor search. Researchers
> have found an approach based on navigating a graph that partially encodes the
> nearest neighbor relation at multiple scales can provide accuracy > 95% (as
> compared to exact nearest neighbor calculations) at a reasonable cost. This
> issue will explore implementing HNSW (hierarchical navigable small-world)
> graphs for the purpose of approximate nearest vector search (often referred
> to as KNN or k-nearest-neighbor search).
> At a high level the way this algorithm works is this.
> First assume you have a graph that has a partial encoding of the nearest
> neighbor relation, with some short and some long-distance links. If this
> graph is built in the right way (has the hierarchical navigable small world
> property), then you can efficiently traverse it to find nearest neighbors
> (approximately) in log N time where N is the number of nodes in the graph. I
> believe this idea was pioneered in [1]. The great insight in that paper is
> that if you use the graph search algorithm to find the K nearest neighbors
> of a new document while indexing, and then link those neighbors
> (undirectedly, ie both ways) to the new document, then the graph that
> emerges will have the desired properties.
> The implementation I propose for Lucene is as follows. We need two new data
> structures to encode the vectors and the graph. We can encode vectors using
> a light wrapper around {{BinaryDocValues}} (we also want to encode the
> vector dimension and have efficient conversion from bytes to floats). For
> the graph we can use {{SortedNumericDocValues}} where the values we encode
> are the docids of the related documents. Encoding the interdocument
> relations using docids directly will make it relatively fast to traverse the
> graph since we won't need to lookup through an id-field indirection. This
> choice limits us to building a graph-per-segment since it would be
> impractical to maintain a global graph for the whole index in the face of
> segment merges. However graph-per-segment is very natural at search time -
> we can traverse each segment's graph independently and merge results as we
> do today for term-based search.
> At index time, however, merging graphs is somewhat challenging. While
> indexing we build a graph incrementally, performing searches to construct
> links among neighbors. When merging segments we must construct a new graph
> containing elements of all the merged segments.
> Ideally we would somehow preserve the work done when building the initial
> graphs, but at least as a start I'd propose we construct a new graph from
> scratch when merging. The process is going to be limited, at least
> initially, to graphs that can fit in RAM since we require random access to
> the entire graph while constructing it: in order to add links
> bidirectionally we must continually update existing documents.
> I think we want to express this API to users as a single joint
> {{KnnGraphField}} abstraction that joins together the vectors and the graph
> as a single field type. Mostly it just looks like a vector-valued field, but
> has this graph attached to it.
> I'll push a branch with my POC and would love to hear comments. It has many
> nocommits, basic design is not really set, there is no Query implementation
> and no integration with IndexSearcher, but it does work by
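The traversal sketched in the quoted description — start at some entry node and follow neighbor links that reduce the distance to the query — can be illustrated in a few lines. This is a minimal, self-contained sketch of the greedy descent step only, not Lucene code: it omits the bounded candidate queue, visited set, and multi-layer structure a real (H)NSW implementation needs, and it assumes the whole graph (vectors plus adjacency lists) fits in memory.

```java
// Minimal sketch of the greedy graph descent underlying NSW-style search:
// repeatedly move to whichever neighbor of the current node is closest to the
// query, stopping at a local minimum of the distance function. Real HNSW
// keeps a beam of candidates instead of a single node, which is why it
// achieves the high recall discussed above.
public class NswGreedySearch {

  // Squared Euclidean distance (the ordering is the same as for Euclidean).
  static double dist(float[] a, float[] b) {
    double s = 0;
    for (int i = 0; i < a.length; i++) {
      double d = a[i] - b[i];
      s += d * d;
    }
    return s;
  }

  /** Returns the node reached by greedy descent from {@code entry}. */
  static int search(float[][] vectors, int[][] neighbors, int entry, float[] query) {
    int cur = entry;
    double curDist = dist(vectors[cur], query);
    while (true) {
      int best = cur;
      double bestDist = curDist;
      for (int nb : neighbors[cur]) {        // examine all links of the current node
        double d = dist(vectors[nb], query);
        if (d < bestDist) {
          best = nb;
          bestDist = d;
        }
      }
      if (best == cur) {
        return cur;                          // no neighbor improves: local minimum
      }
      cur = best;
      curDist = bestDist;
    }
  }
}
```

Because the descent can stall in a local minimum, the returned node is only an approximate nearest neighbor; keeping a beam of several candidates (as HNSW does) trades extra distance computations for better recall.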
[GitHub] [lucene-solr] dweiss commented on a change in pull request #1905: LUCENE-9488 Release with Gradle Part 2
dweiss commented on a change in pull request #1905:
URL: https://github.com/apache/lucene-solr/pull/1905#discussion_r498666120

## File path: lucene/packaging/build.gradle
## @@ -0,0 +1,160 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+// This project puts together a "distribution", assembling dependencies from
+// various other projects.
+
+plugins {
+    id 'distribution'
+}
+
+description = 'Lucene distribution packaging'
+
+// Declare all subprojects that should be included in binary distribution.
+// By default everything is included, unless explicitly excluded.
+def includeInBinaries = project(":lucene").subprojects.findAll { subproject ->
+    return !(subproject.path in [
+        ":lucene:packaging",
+        ":lucene:analysis",
+        ":lucene:luke", // nocommit - Encountered duplicate path "luke/lib/log4j-core-2.13.2.jar"

Review comment:

   This wasn't obvious. Here's what happened. We create a configuration with a dependency on the "full project"; it looked like this:
   ```
   handler.add(confFull, project(path: includedProject.path), {...
   ```
   This is in fact a dependency on the configuration aptly named 'default', which includes all of the project's binaries and transitive dependencies (archives in general).
   Luke declared a separate configuration for dependencies called 'standalone', which added a few non-project dependencies (the 'implementation' configuration extended from it, so they were visible on the runtime classpath). The core of the problem was that Luke also exported an extra artifact that belonged to this 'standalone' configuration - a set of JARs assembled under a folder. So when the packaging process was collecting files for inclusion, it encountered log4j twice: a copy in that 'standalone' folder and a copy from transitive project dependencies.
   I recall you asked why this 'fail' on duplicate archive entries is needed. Well, it's useful to catch pearls like this one above...
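The "fail on duplicate archive entries" behavior that surfaced this problem can be sketched as a simple check: while collecting relative paths for the distribution, remember each one and report the first path contributed twice (here, once from Luke's 'standalone' folder and once from a transitive project dependency). The class and method names below are hypothetical illustrations, not the Gradle distribution plugin's actual code.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Illustrative sketch of a duplicate-entry guard for an archive/packaging
// step: scan the relative paths destined for the distribution in order and
// fail fast on the first path seen twice, instead of silently letting one
// copy overwrite the other.
public class DuplicateEntryCheck {
  static Optional<String> firstDuplicate(List<String> paths) {
    Set<String> seen = new HashSet<>();
    for (String p : paths) {
      if (!seen.add(p)) {
        return Optional.of(p); // second occurrence: report the colliding path
      }
    }
    return Optional.empty();   // all entries unique
  }
}
```

Failing eagerly like this is what turned a quiet packaging ambiguity into the explicit "Encountered duplicate path" error quoted in the diff, making the misconfigured 'standalone' artifact visible.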
[GitHub] [lucene-solr] dweiss commented on a change in pull request #1905: LUCENE-9488 Release with Gradle Part 2
dweiss commented on a change in pull request #1905:
URL: https://github.com/apache/lucene-solr/pull/1905#discussion_r498666282

## File path: lucene/packaging/build.gradle

Review comment:

   I've verified that it collects Luke properly and the launch script starts it fine (on Windows).
[jira] [Commented] (LUCENE-9548) Publish master (9.x) snapshots to https://repository.apache.org
[ https://issues.apache.org/jira/browse/LUCENE-9548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17206004#comment-17206004 ]

Dawid Weiss commented on LUCENE-9548:
-------------------------------------

Hi Uwe. I've changed credential property names to use those mentioned on that cwiki page, so in theory it should work if you run this from Jenkins:
{code}
gradlew mavenToApacheSnapshots
{code}
I also added other task aliases for convention tasks (which have fairly long names):
{code}
mavenToApacheSnapshots - Publish Maven JARs and POMs to Apache Snapshots repository: https://repository.apache.org/content/repositories/snapshots
mavenToLocalFolder - Publish Maven JARs and POMs locally to [...]\build\maven-local
mavenToLocalRepo - Publish Maven JARs and POMs to current user's local maven repository.
{code}
We should probably add an ApacheReleases repository too for final releases (or create a bundle uploadable to Nexus?).

> Publish master (9.x) snapshots to https://repository.apache.org
> ---------------------------------------------------------------
>
>                 Key: LUCENE-9548
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9548
>             Project: Lucene - Core
>          Issue Type: Task
>            Reporter: Dawid Weiss
>            Assignee: Dawid Weiss
>            Priority: Major
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> We should start publishing snapshot JARs to Apache repositories. I'm not sure
> how to set it all up with gradle but maybe there are other Apache projects
> that use gradle and we could peek at their config? Mostly it's about signing
> artifacts (how to pass credentials for signing) and setting up Nexus
> deployment repository.
[jira] [Comment Edited] (LUCENE-9548) Publish master (9.x) snapshots to https://repository.apache.org
[ https://issues.apache.org/jira/browse/LUCENE-9548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17206004#comment-17206004 ]

Dawid Weiss edited comment on LUCENE-9548 at 10/2/20, 6:58 AM:
---------------------------------------------------------------

Hi Uwe. I've changed credential property names to use those mentioned on that cwiki page, so in theory it should work if you run this from Jenkins:
{code}
gradlew mavenToApacheSnapshots
{code}
I also added other task aliases for convention tasks (which have fairly long names):
{code}
mavenToApacheSnapshots - Publish Maven JARs and POMs to Apache Snapshots repository: https://repository.apache.org/content/repositories/snapshots
mavenToLocalFolder - Publish Maven JARs and POMs locally to [...]\build\maven-local
mavenToLocalRepo - Publish Maven JARs and POMs to current user's local maven repository.
{code}
We should probably add an ApacheReleases repository too for final releases (or create a bundle uploadable to Nexus?).

> Publish master (9.x) snapshots to https://repository.apache.org
> ---------------------------------------------------------------
>
>                 Key: LUCENE-9548
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9548
>             Project: Lucene - Core
>          Issue Type: Task
>            Reporter: Dawid Weiss
>            Assignee: Dawid Weiss
>            Priority: Major
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> We should start publishing snapshot JARs to Apache repositories. I'm not sure
> how to set it all up with gradle but maybe there are other Apache projects
> that use gradle and we could peek at their config? Mostly it's about signing
> artifacts (how to pass credentials for signing) and setting up Nexus
> deployment repository.