[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1724: SOLR-14684: CloudExitableDirectoryReaderTest failing about 25% of the time
dsmiley commented on a change in pull request #1724: URL: https://github.com/apache/lucene-solr/pull/1724#discussion_r468335971 ## File path: solr/solrj/src/java/org/apache/solr/client/solrj/impl/LBSolrClient.java ## @@ -155,6 +159,7 @@ public ServerIterator(Req req, Map zombieServers) { this.req = req; this.zombieServers = zombieServers; this.timeAllowedNano = getTimeAllowedInNanos(req.getRequest()); + log.info("TimeAllowedNano:{}", this.timeAllowedNano); Review comment: Are you sure we should log at info level here? This seems more like a debug situation. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
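The review point above (info vs. debug) is about keeping per-request detail out of production logs: a message emitted for every `ServerIterator` is debug-level noise at info. A minimal sketch of the idea using JDK logging (not Solr's actual SLF4J/Log4j setup — the logger name and message are illustrative only):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

public class LogLevelDemo {
    public static void main(String[] args) {
        Logger log = Logger.getLogger("ServerIterator");
        log.setUseParentHandlers(false);

        // Capture everything the logger actually publishes.
        final List<LogRecord> records = new ArrayList<>();
        Handler capture = new Handler() {
            @Override public void publish(LogRecord r) { records.add(r); }
            @Override public void flush() {}
            @Override public void close() {}
        };
        capture.setLevel(Level.ALL);
        log.addHandler(capture);

        // A typical production level: INFO and above pass, FINE (~debug) is dropped.
        log.setLevel(Level.INFO);

        long timeAllowedNano = 5_000_000L;
        // Per-request detail at debug level: filtered out in production.
        log.log(Level.FINE, "TimeAllowedNano:{0}", timeAllowedNano);
        // The same line at info level would be recorded on every request.
        log.log(Level.INFO, "server iterator created");

        System.out.println(records.size()); // only the INFO record was published
    }
}
```

The practical consequence is the one the reviewer raises: a per-request info line survives the default log configuration, while a debug line costs nothing unless someone turns debug logging on.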
[jira] [Commented] (SOLR-14680) Provide simple interfaces to our concrete SolrCloud classes
[ https://issues.apache.org/jira/browse/SOLR-14680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175239#comment-17175239 ] ASF subversion and git services commented on SOLR-14680: Commit 15ae014c598c0c02926ca3d7039f6389488e981e in lucene-solr's branch refs/heads/master from Noble Paul [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=15ae014 ] SOLR-14680: Provide simple interfaces to our cloud classes (only API) (#1694) > Provide simple interfaces to our concrete SolrCloud classes > --- > > Key: SOLR-14680 > URL: https://issues.apache.org/jira/browse/SOLR-14680 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Noble Paul >Assignee: Noble Paul >Priority: Minor > Time Spent: 10h 10m > Remaining Estimate: 0h > > All our current implementations of SolrCloud such as > # ClusterState > # DocCollection > # Slice > # Replica > etc are concrete classes. Providing alternate implementations or wrappers is > extremely difficult. > SOLR-14613 is attempting to create such interfaces to make their sdk simpler > The objective is not to have a comprehensive set of methods in these > interfaces. We will start out with a subset of required interfaces. We > guarantee is that signatures of methods in these interfaces will not be > deleted/changed . But we may add more methods as and when it suits us -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] noblepaul merged pull request #1694: SOLR-14680: Provide simple interfaces to our cloud classes (only API)
noblepaul merged pull request #1694: URL: https://github.com/apache/lucene-solr/pull/1694 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] dsmiley opened a new pull request #1735: LUCENE spell: Implement SuggestWord.toString
dsmiley opened a new pull request #1735: URL: https://github.com/apache/lucene-solr/pull/1735 This is simply an obvious toString impl on SuggestWord. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] madrob commented on a change in pull request #1726: SOLR-14722: timeAllowed should track from req creation
madrob commented on a change in pull request #1726: URL: https://github.com/apache/lucene-solr/pull/1726#discussion_r468315778 ## File path: solr/core/src/java/org/apache/solr/search/SolrQueryTimeoutImpl.java ## @@ -67,8 +69,21 @@ public boolean shouldExit() { } /** - * Method to set the time at which the timeOut should happen. - * @param timeAllowed set the time at which this thread should timeout. + * Sets or clears the time allowed based on how much time remains from the start of the request plus the configured + * {@link CommonParams#TIME_ALLOWED}. + */ + public static void set(SolrQueryRequest req) { +long timeAllowed = req.getParams().getLong(CommonParams.TIME_ALLOWED, -1L); +if (timeAllowed >= 0L) { + set(timeAllowed - (long)req.getRequestTimer().getTime()); // reduce by time already spent +} else { + reset(); +} + } + + /** + * Sets the time allowed (milliseconds), assuming we start a timer immediately. + * You should probably invoke {@link #set(SolrQueryRequest)} instead. */ public static void set(Long timeAllowed) { Review comment: should this be a primitive instead of a boxed type? ## File path: solr/core/src/java/org/apache/solr/search/SolrQueryTimeoutImpl.java ## @@ -67,8 +69,21 @@ public boolean shouldExit() { } /** - * Method to set the time at which the timeOut should happen. - * @param timeAllowed set the time at which this thread should timeout. + * Sets or clears the time allowed based on how much time remains from the start of the request plus the configured + * {@link CommonParams#TIME_ALLOWED}. + */ + public static void set(SolrQueryRequest req) { +long timeAllowed = req.getParams().getLong(CommonParams.TIME_ALLOWED, -1L); +if (timeAllowed >= 0L) { Review comment: Should be `>`, not `>=`. Doc on time allowed state that zero is no timeout, not immediate timeout. Looks like we were previously inconsistent about this. This is an automated message from the Apache Git Service. 
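The semantics under review above — reduce `timeAllowed` by the time already spent on the request, and treat zero (madrob's `>` vs `>=` point) or an absent parameter as "no timeout" — can be sketched with a hypothetical helper; this is not the actual `SolrQueryTimeoutImpl` code:

```java
// Sketch of the timeAllowed bookkeeping discussed in the review.
public class TimeAllowedDemo {

    /**
     * Returns the remaining time budget in milliseconds, or -1 for "no timeout".
     * Per the review comment, timeAllowed == 0 means "no timeout", not
     * "immediate timeout", so only strictly positive values start a clock.
     */
    static long remaining(long timeAllowedMs, long alreadySpentMs) {
        if (timeAllowedMs <= 0) {
            return -1; // unset (-1) or zero => no timeout
        }
        return timeAllowedMs - alreadySpentMs; // may go negative => already expired
    }

    public static void main(String[] args) {
        System.out.println(remaining(1000, 300)); // 700 ms left on the budget
        System.out.println(remaining(0, 300));    // zero means no timeout
        System.out.println(remaining(-1, 300));   // parameter absent
    }
}
```

Subtracting `req.getRequestTimer().getTime()` (as the patch does) is what makes `timeAllowed` track from request creation rather than from the point the searcher starts.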
[GitHub] [lucene-solr] noblepaul commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for placement plugin interface
noblepaul commented on a change in pull request #1684: URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r468260502 ## File path: solr/core/src/java/org/apache/solr/cluster/placement/plugins/SamplePluginMinimizeCores.java ## @@ -0,0 +1,132 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.solr.cluster.placement.plugins; + +import java.util.ArrayList; +import java.util.Collections; +import java.util.Comparator; +import java.util.HashSet; +import java.util.Iterator; +import java.util.Set; +import java.util.Map; + +import com.google.common.collect.Ordering; +import com.google.common.collect.TreeMultimap; +import org.apache.solr.cluster.placement.Cluster; +import org.apache.solr.cluster.placement.CoresCountPropertyValue; +import org.apache.solr.cluster.placement.CreateNewCollectionPlacementRequest; +import org.apache.solr.cluster.placement.Node; +import org.apache.solr.cluster.placement.PlacementException; +import org.apache.solr.cluster.placement.PlacementPlugin; +import org.apache.solr.cluster.placement.PropertyKey; +import org.apache.solr.cluster.placement.PropertyKeyFactory; +import org.apache.solr.cluster.placement.PropertyValue; +import org.apache.solr.cluster.placement.PropertyValueFetcher; +import org.apache.solr.cluster.placement.Replica; +import org.apache.solr.cluster.placement.ReplicaPlacement; +import org.apache.solr.cluster.placement.PlacementRequest; +import org.apache.solr.cluster.placement.PlacementPlan; +import org.apache.solr.cluster.placement.PlacementPlanFactory; +import org.apache.solr.common.util.SuppressForbidden; + +/** + * Implements placing replicas to minimize number of cores per {@link Node}, while not placing two replicas of the same + * shard on the same node. + * + * TODO: code not tested and never run, there are no implementation yet for used interfaces + */ +public class SamplePluginMinimizeCores implements PlacementPlugin { + + @SuppressForbidden(reason = "Ordering.arbitrary() has no equivalent in Comparator class. 
Rather reuse than copy.") + public PlacementPlan computePlacement(Cluster cluster, PlacementRequest placementRequest, PropertyKeyFactory propertyFactory, +PropertyValueFetcher propertyFetcher, PlacementPlanFactory placementPlanFactory) throws PlacementException { +// This plugin only supports Creating a collection. +if (!(placementRequest instanceof CreateNewCollectionPlacementRequest)) { + throw new PlacementException("This toy plugin only supports creating collections"); +} + +final CreateNewCollectionPlacementRequest reqCreateCollection = (CreateNewCollectionPlacementRequest) placementRequest; + +final int totalReplicasPerShard = reqCreateCollection.getNrtReplicationFactor() + +reqCreateCollection.getTlogReplicationFactor() + reqCreateCollection.getPullReplicationFactor(); + +if (cluster.getLiveNodes().size() < totalReplicasPerShard) { + throw new PlacementException("Cluster size too small for number of replicas per shard"); +} + +// Get number of cores on each Node +TreeMultimap nodesByCores = TreeMultimap.create(Comparator.naturalOrder(), Ordering.arbitrary()); Review comment: I believe the property fetching is overly complicated . We should probably make it a lot simpler. Basically, the only requirement is strong typing. `TreeMultimap nodesByCores = TreeMultimap.create(Comparator.naturalOrder(), Ordering.arbitrary());` This definitely is not the easiest code we could write. A user just wants to get an integer value for the # of cores in a node. ## File path: solr/core/src/java/org/apache/solr/cluster/placement/plugins/SamplePluginMinimizeCores.java ## @@ -0,0 +1,132 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. 
You may obtain a copy of the Lic
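The "simpler" shape the reviewer is asking for — "a user just wants to get an integer value for the # of cores in a node" — can be sketched with plain JDK collections instead of Guava's `TreeMultimap`/`Ordering.arbitrary()`. This is a hypothetical illustration, not the SOLR-14613 API; node names and counts are invented:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class CoresByNodeDemo {
    public static void main(String[] args) {
        // Stand-in for the fetched CoresCountPropertyValue per node.
        Map<String, Integer> coresPerNode = new HashMap<>();
        coresPerNode.put("node1", 7);
        coresPerNode.put("node2", 3);
        coresPerNode.put("node3", 3);

        // TreeMap keyed by core count: iterating gives nodes in ascending
        // load order, so the least-loaded nodes come first for placement.
        TreeMap<Integer, List<String>> nodesByCores = new TreeMap<>();
        for (Map.Entry<String, Integer> e : coresPerNode.entrySet()) {
            nodesByCores.computeIfAbsent(e.getValue(), k -> new ArrayList<>())
                        .add(e.getKey());
        }

        System.out.println(nodesByCores.firstKey()); // lowest core count in the cluster
    }
}
```

The `TreeMultimap` in the sample plugin does the same grouping in one structure; the review's objection is that the extra machinery (arbitrary tie-break ordering, property-key indirection) obscures what is conceptually a sorted map from core count to nodes.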
[jira] [Commented] (SOLR-13412) Make the Lucene Luke module available from a Solr distribution
[ https://issues.apache.org/jira/browse/SOLR-13412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175124#comment-17175124 ] Erick Erickson commented on SOLR-13412: --- Given the discussion on the dev list, I'm going to close this as "won't fix" absent objections. I think the most cogent comment is that we should enhance the Luke Request Handler if there's a need rather than try to awkwardly package a windowing app with a server distro. The origin of this Jira was "Hey! Luke has been integrated with Lucene, cool! Let's make it available from Solr". But as the discussion has continued, it seems like a poorer idea than it did at the start. > Make the Lucene Luke module available from a Solr distribution > -- > > Key: SOLR-13412 > URL: https://issues.apache.org/jira/browse/SOLR-13412 > Project: Solr > Issue Type: Improvement >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Major > Attachments: SOLR-13412.patch > > > Now that [~Tomoko Uchida] has put in a great effort to bring Luke into the > project, I think it would be good to be able to access it from a Solr distro. > I want to go to the right place under the Solr install directory and start > Luke up to examine the local indexes. > This ticket is explicitly _not_ about accessing it from the admin UI, Luke is > a stand-alone app that must be invoked on the node that has a Lucene index on > the local filesystem > We need to > * have it included in Solr when running "ant package". > * add some bits to the ref guide on how to invoke > ** Where to invoke it from > ** mention anything that has to be installed. > ** any other "gotchas" someone just installing Solr should be aware of. > * Ant should not be necessary. 
> * > > I'll assign this to myself to keep track of, but would not be offended in the > least if someone with more knowledge of "ant package" and the like wanted to > take it over ;) > If we can do it at all
[jira] [Comment Edited] (SOLR-14636) Provide a reference implementation for SolrCloud that is stable and fast.
[ https://issues.apache.org/jira/browse/SOLR-14636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17169538#comment-17169538 ] Mark Robert Miller edited comment on SOLR-14636 at 8/11/20, 12:36 AM: -- I ended up having some other responsibilities last week, so this milestone has been pushed out a week. was (Author: markrmiller): There is not enough fun that goes on in development anymore. Robert and I used to have that nailed. !solr-ref-branch.gif! > Provide a reference implementation for SolrCloud that is stable and fast. > - > > Key: SOLR-14636 > URL: https://issues.apache.org/jira/browse/SOLR-14636 > Project: Solr > Issue Type: Task >Reporter: Mark Robert Miller >Assignee: Mark Robert Miller >Priority: Major > Attachments: IMG_5575 (1).jpg, jenkins.png, solr-ref-branch.gif > > > SolrCloud powers critical infrastructure and needs the ability to run quickly > with stability. This reference implementation will allow for this. > *location*: [https://github.com/apache/lucene-solr/tree/reference_impl] > *status*: alpha > *speed*: ludicrous > *tests***: > * *core*: *extremely stable* with *ignores* > * *solrj*: *extremely stable* with *ignores* > * *test-framework*: *extremely stable* with *ignores* > * *contrib/analysis-extras*: *extremely stable* with *ignores* > * *contrib/analytics*: *extremely stable* with *ignores* > * *contrib/clustering*: *extremely stable* with *ignores* > * *contrib/dataimporthandler*: *extremely stable* with *ignores* > * *contrib/dataimporthandler-extras*: *extremely stable* with *ignores* > * *contrib/extraction*: *extremely stable* with *ignores* > * *contrib/jaegertracer-configurator*: *extremely stable* with *ignores* > * *contrib/langid*: *extremely stable* with *ignores* > * *contrib/prometheus-exporter*: *extremely stable* with *ignores* > * *contrib/velocity*: *extremely stable* with *ignores* > _* Running tests quickly and efficiently with strict policing will more > frequently find bugs and requires a period of hardening._ > _** Non Nightly currently, Nightly comes last._
[GitHub] [lucene-solr] noblepaul edited a comment on pull request #1730: SOLR-14680: Provide an implementation for the new SolrCluster API
noblepaul edited a comment on pull request #1730: URL: https://github.com/apache/lucene-solr/pull/1730#issuecomment-671651150 >what is the target use case of the interface and lazy implementation? The objectives are many - Totally refactor Solr code base to minimize dependencies on concrete classes. This enables us to do simulation and testing, make code more readable, and enable refactoring - As we move to a new mode for Solr with a lean core and packages/plugins, we want to have less API surface area against which the plugins are written. This enables the plugins to work against a wider range of versions without rewriting/recompiling - The `LazySolrCluster` will be the default impl for these interfaces. Because, this is the current behaviour (mostly). We expect fresh data to be available all times The problem with the existing classes implementing the interfaces is that, users of the APIs will cast these objects to the underlying concrete classes, which defeats the purpose
[GitHub] [lucene-solr] noblepaul edited a comment on pull request #1730: SOLR-14680: Provide an implementation for the new SolrCluster API
noblepaul edited a comment on pull request #1730: URL: https://github.com/apache/lucene-solr/pull/1730#issuecomment-671651150 >what is the target use case of the interface and lazy implementation? The objectives are many - The `LazySolrCluster` will be the default impl for these interfaces. Because, this is the current behaviour (mostly). We expect fresh data to be available all times - Totally refactor Solr code base to minimize dependencies on concrete classes. This enables us to do simulation and testing, make code more readable, and enable refactoring - As we move to a new mode for Solr with a lean core and packages/plugins, we want to have less API surface area against which the plugins are written. This enables the plugins to work against a wider range of versions without rewriting/recompiling The problem with the existing classes implementing the interfaces is that, users of the APIs will cast these objects to the underlying concrete classes, which defeats the purpose
[GitHub] [lucene-solr] noblepaul edited a comment on pull request #1730: SOLR-14680: Provide an implementation for the new SolrCluster API
noblepaul edited a comment on pull request #1730: URL: https://github.com/apache/lucene-solr/pull/1730#issuecomment-671651150 >what is the target use case of the interface and lazy implementation? The objectives are many - Totally refactor Solr code base to minimize dependencies on concrete classes. This enables us to do simulation and testing, make code more readable, and enable refactoring - As we move to a new mode for Solr with a lean core and packages/plugins, we want to have less API surface area against which the plugins are written. This enables the plugins to work against a wider range of versions without rewriting/recompiling The problem with the existing classes implementing the interfaces is that, users of the APIs will cast these objects to the underlying concrete classes, which defeats the purpose
[GitHub] [lucene-solr] noblepaul commented on pull request #1730: SOLR-14680: Provide an implementation for the new SolrCluster API
noblepaul commented on pull request #1730: URL: https://github.com/apache/lucene-solr/pull/1730#issuecomment-671651150 >what is the target use case of the interface and lazy implementation? The objectives are many - Totally refactor Solr code base to minimize dependencies on concrete classes. This enables us to do simulation and testing, make code more readable, and enable refactoring - As we move to a new mode for Solr with a lean core and packages/plugins, we want to have less API surface area against which the plugins are written. This enables the plugins to work against a wider range of versions without rewriting/recompiling The problem with the existing classes implementing the interfaces is that, users of the APIs will cast these objects to the underlying concrete classes, which defeats the purpose
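The interface-plus-lazy-implementation pattern described in this thread can be sketched in a few lines. Everything here is invented for illustration (`SolrCollectionView`, `LazyCollection`, the supplier standing in for a ZooKeeper read) — it is not the actual SOLR-14680 API:

```java
import java.util.function.Supplier;

public class LazyClusterDemo {

    // Plugins code against a small, stable interface rather than a
    // concrete class they could down-cast.
    interface SolrCollectionView {
        String name();
        int shardCount();
    }

    // A lazy default implementation: state is fetched fresh on each access,
    // matching the "we expect fresh data to be available at all times" goal.
    static class LazyCollection implements SolrCollectionView {
        private final String name;
        private final Supplier<Integer> shardFetcher; // stands in for a cluster-state read

        LazyCollection(String name, Supplier<Integer> shardFetcher) {
            this.name = name;
            this.shardFetcher = shardFetcher;
        }

        public String name() { return name; }
        public int shardCount() { return shardFetcher.get(); } // fetched on demand
    }

    public static void main(String[] args) {
        SolrCollectionView c = new LazyCollection("techproducts", () -> 2);
        System.out.println(c.name() + ":" + c.shardCount());
    }
}
```

Because callers only ever see `SolrCollectionView`, a test can swap in a simulated implementation, and the lazy one can defer and refresh its reads — the two benefits the comment cites, without exposing the concrete class for casting.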
[jira] [Commented] (SOLR-14354) HttpShardHandler send requests in async
[ https://issues.apache.org/jira/browse/SOLR-14354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175118#comment-17175118 ] Cao Manh Dat commented on SOLR-14354: - [~rishisankar] sure, if you can also do the benchmark that Ishan ask, it will be even better :D > HttpShardHandler send requests in async > --- > > Key: SOLR-14354 > URL: https://issues.apache.org/jira/browse/SOLR-14354 > Project: Solr > Issue Type: Improvement >Reporter: Cao Manh Dat >Assignee: Cao Manh Dat >Priority: Major > Fix For: master (9.0), 8.7 > > Attachments: image-2020-03-23-10-04-08-399.png, > image-2020-03-23-10-09-10-221.png, image-2020-03-23-10-12-00-661.png > > Time Spent: 4h > Remaining Estimate: 0h > > h2. 1. Current approach (problem) of Solr > Below is the diagram describe the model on how currently handling a request. > !image-2020-03-23-10-04-08-399.png! > The main-thread that handles the search requests, will submit n requests (n > equals to number of shards) to an executor. So each request will correspond > to a thread, after sending a request that thread basically do nothing just > waiting for response from other side. That thread will be swapped out and CPU > will try to handle another thread (this is called context switch, CPU will > save the context of the current thread and switch to another one). When some > data (not all) come back, that thread will be called to parsing these data, > then it will wait until more data come back. So there will be lots of context > switching in CPU. That is quite inefficient on using threads.Basically we > want less threads and most of them must busy all the time, because threads > are not free as well as context switching. That is the main idea behind > everything, like executor > h2. 2. Async call of Jetty HttpClient > Jetty HttpClient offers async API like this. > {code:java} > httpClient.newRequest("http://domain.com/path";) > // Add request hooks > .onRequestQueued(request -> { ... 
}) > .onRequestBegin(request -> { ... }) > // Add response hooks > .onResponseBegin(response -> { ... }) > .onResponseHeaders(response -> { ... }) > .onResponseContent((response, buffer) -> { ... }) > .send(result -> { ... }); {code} > Therefore after calling {{send()}} the thread will return immediately without > any block. Then when the client received the header from other side, it will > call {{onHeaders()}} listeners. When the client received some {{byte[]}} (not > all response) from the data it will call {{onContent(buffer)}} listeners. > When everything finished it will call {{onComplete}} listeners. One main > thing that will must notice here is all listeners should finish quick, if the > listener block, all further data of that request won’t be handled until the > listener finish. > h2. 3. Solution 1: Sending requests async but spin one thread per response > Jetty HttpClient already provides several listeners, one of them is > InputStreamResponseListener. This is how it is get used > {code:java} > InputStreamResponseListener listener = new InputStreamResponseListener(); > client.newRequest(...).send(listener); > // Wait for the response headers to arrive > Response response = listener.get(5, TimeUnit.SECONDS); > if (response.getStatus() == 200) { > // Obtain the input stream on the response content > try (InputStream input = listener.getInputStream()) { > // Read the response content > } > } {code} > In this case, there will be 2 thread > * one thread trying to read the response content from InputStream > * one thread (this is a short-live task) feeding content to above > InputStream whenever some byte[] is available. Note that if this thread > unable to feed data into InputStream, this thread will wait. 
> By using this one, the model of HttpShardHandler can be written into > something like this > {code:java} > handler.sendReq(req, (is) -> { > executor.submit(() -> > try (is) { > // Read the content from InputStream > } > ) > }) {code} > The first diagram will be changed into this > !image-2020-03-23-10-09-10-221.png! > Notice that although “sending req to shard1” is wide, it won’t take long time > since sending req is a very quick operation. With this operation, handling > threads won’t be spin up until first bytes are sent back. Notice that in this > approach we still have active threads waiting for more data from InputStream > h2. 4. Solution 2: Buffering data and handle it inside jetty’s thread. > Jetty have another listener called BufferingResponseListener. This is how it > is get used > {code:java} > client.newRequest(...).send(new BufferingResponseListener() { > public void onComplete(Result result) { > try { >
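The threading difference the issue describes — one blocked thread per in-flight shard request versus callbacks that run only when a response arrives — can be illustrated with plain JDK primitives. This is a sketch of the model, not Jetty's API and not the actual HttpShardHandler code; shard names and response bodies are invented:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncFanOutDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService io = Executors.newFixedThreadPool(2);
        List<CompletableFuture<String>> responses = new ArrayList<>();

        for (String shard : List.of("shard1", "shard2", "shard3")) {
            // supplyAsync stands in for the non-blocking send();
            // thenApply is the onComplete-style listener that parses the body.
            responses.add(CompletableFuture
                .supplyAsync(() -> shard + ":ok", io)
                .thenApply(body -> body.toUpperCase(Locale.ROOT)));
        }

        // Only the merge point blocks; no thread sat idle per request
        // waiting for bytes, which is the context-switch saving described.
        for (CompletableFuture<String> f : responses) {
            System.out.println(f.get());
        }
        io.shutdown();
    }
}
```

As with Jetty's listeners, the callback must stay short: work queued in `thenApply` runs on the I/O pool, so a slow parser there would stall other responses, which is why the issue discusses handing the content off to a separate executor.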
[jira] [Commented] (SOLR-13412) Make the Lucene Luke module available from a Solr distribution
[ https://issues.apache.org/jira/browse/SOLR-13412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175099#comment-17175099 ] Tomoko Uchida commented on SOLR-13412: -- FWIW, an Elasticsearch user notified us that he/she created a Dockerized version of Luke. We could revisit this. [https://github.com/DmitryKey/luke/issues/162] > Make the Lucene Luke module available from a Solr distribution > -- > > Key: SOLR-13412 > URL: https://issues.apache.org/jira/browse/SOLR-13412 > Project: Solr > Issue Type: Improvement >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Major > Attachments: SOLR-13412.patch > > > Now that [~Tomoko Uchida] has put in a great effort to bring Luke into the > project, I think it would be good to be able to access it from a Solr distro. > I want to go to the right place under the Solr install directory and start > Luke up to examine the local indexes. > This ticket is explicitly _not_ about accessing it from the admin UI, Luke is > a stand-alone app that must be invoked on the node that has a Lucene index on > the local filesystem > We need to > * have it included in Solr when running "ant package". > * add some bits to the ref guide on how to invoke > ** Where to invoke it from > ** mention anything that has to be installed. > ** any other "gotchas" someone just installing Solr should be aware of. > * Ant should not be necessary. > * > > I'll assign this to myself to keep track of, but would not be offended in the > least if someone with more knowledge of "ant package" and the like wanted to > take it over ;) > If we can do it at all
[GitHub] [lucene-solr] madrob opened a new pull request #1734: LUCENE-9453 Add sync around volatile write
madrob opened a new pull request #1734: URL: https://github.com/apache/lucene-solr/pull/1734 checkoutAndBlock is not synchronized, but has a non-atomic write to numPending. Meanwhile, all of the other writes to numPending are in sync methods. In this case it turns out to be ok because all of the code paths calling this method are already sync: `synchronized doAfterDocument -> checkout -> checkoutAndBlock` `checkoutLargestNonPendingWriter -> synchronized(this) -> checkout -> checkoutAndBlock` Making checkoutAndBlock synchronized protects us against future changes; it shouldn't cause any performance impact since the code paths will already be going through a sync block, and it will make an IntelliJ warning go away. Found via IntelliJ warnings. https://issues.apache.org/jira/browse/LUCENE-9453
[GitHub] [lucene-solr] madrob commented on a change in pull request #1732: Clean up many small fixes
madrob commented on a change in pull request #1732: URL: https://github.com/apache/lucene-solr/pull/1732#discussion_r468227099 ## File path: lucene/core/src/java/org/apache/lucene/index/DocumentsWriterFlushControl.java ## @@ -324,12 +324,12 @@ synchronized void doOnAbort(DocumentsWriterPerThread perThread) { } } - private void checkoutAndBlock(DocumentsWriterPerThread perThread) { + private synchronized void checkoutAndBlock(DocumentsWriterPerThread perThread) { Review comment: https://issues.apache.org/jira/browse/LUCENE-9453 I explain in that issue why I believe it is minor, but it will help to get more eyes on it
[jira] [Created] (LUCENE-9453) DocumentWriterFlushControl missing explicit sync on write
Mike Drob created LUCENE-9453: - Summary: DocumentWriterFlushControl missing explicit sync on write Key: LUCENE-9453 URL: https://issues.apache.org/jira/browse/LUCENE-9453 Project: Lucene - Core Issue Type: Bug Components: core/index Reporter: Mike Drob checkoutAndBlock is not synchronized, but has a non-atomic write to {{numPending}}. Meanwhile, all of the other writes to numPending are in sync methods. In this case it turns out to be ok because all of the code paths calling this method are already sync: {{synchronized doAfterDocument -> checkout -> checkoutAndBlock}} {{checkoutLargestNonPendingWriter -> synchronized(this) -> checkout -> checkoutAndBlock}} Making {{checkoutAndBlock}} synchronized protects us against future changes; it shouldn't cause any performance impact since the code paths will already be going through a sync block, and it will make an IntelliJ warning go away.
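The hazard the issue describes is worth making concrete: `volatile` guarantees visibility of individual reads and writes, but an increment is a read-modify-write, so two unsynchronized threads can interleave and lose updates. A minimal standalone sketch (not the Lucene code; the method name only echoes `checkoutAndBlock`) showing the proposed fix:

```java
public class SyncCounterDemo {
    private volatile int numPending = 0;

    // The fix proposed in the issue: synchronize the writer so the
    // read-modify-write on numPending happens atomically under the
    // object monitor, matching the other (already sync) writers.
    private synchronized void checkoutAndBlockLike() {
        numPending++;
    }

    public static void main(String[] args) throws InterruptedException {
        SyncCounterDemo d = new SyncCounterDemo();
        Runnable task = () -> {
            for (int i = 0; i < 100_000; i++) {
                d.checkoutAndBlockLike();
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join(); t2.join();

        // With synchronized, no increments are lost; without it, the
        // count would usually come up short.
        System.out.println(d.numPending);
    }
}
```

This also illustrates why the issue calls the bug minor today: if every caller already holds the monitor (as the two call paths listed do), the extra `synchronized` is uncontended and effectively free, serving only as protection against future call sites.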
[GitHub] [lucene-solr] madrob commented on a change in pull request #1732: Clean up many small fixes
madrob commented on a change in pull request #1732: URL: https://github.com/apache/lucene-solr/pull/1732#discussion_r468221630 ## File path: lucene/core/src/java/org/apache/lucene/index/DocValuesUpdate.java ## @@ -152,12 +152,12 @@ static BytesRef readFrom(DataInput in, BytesRef scratch) throws IOException { } NumericDocValuesUpdate(Term term, String field, Long value) { - this(term, field, value != null ? value.longValue() : -1, BufferedUpdates.MAX_INT, value != null); + this(term, field, value != null ? value : -1, BufferedUpdates.MAX_INT, value != null); } -private NumericDocValuesUpdate(Term term, String field, long value, int docIDUpTo, boolean hasValue) { Review comment: There were 16 instances of `Upto` and 4 of `UpTo` so I went with the more common one for consistency. Happy to switch the other way if it's more correct according to English. Looking it up now and looks like "upto" isn't a word? ## File path: lucene/core/src/java/org/apache/lucene/index/DocumentsWriterFlushControl.java ## @@ -324,12 +324,12 @@ synchronized void doOnAbort(DocumentsWriterPerThread perThread) { } } - private void checkoutAndBlock(DocumentsWriterPerThread perThread) { + private synchronized void checkoutAndBlock(DocumentsWriterPerThread perThread) { Review comment: I'll split this out.
[jira] [Commented] (SOLR-14687) Make child/parent query parsers natively aware of _nest_path_
[ https://issues.apache.org/jira/browse/SOLR-14687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175086#comment-17175086 ] Chris M. Hostetter commented on SOLR-14687: --- besides the fact that Jira's WYSIWYG editor lied to me and munged up some of the formatting of "STAR:STAR" and "UNDERSCORE nest UNDERSCORE path UNDERSCORE" in many places, something else had been nagging at me that i felt like i was overlooking, and i finally figured out what it is: I hadn't really accounted for docs that _have_ a "nest path" but whose path doesn't have any common ancestors with the {{parentPath}} specified – ie: how would {{/a/b/c}} hierarchy docs mixed in an index with docs having a hierarchy of {{/x/y/z}} wind up affecting each other? I *think* that what i described above would still mostly work for the "parent" parser – even if the "parent filter" generated by a {{parentPath="/a/b/c"}} as i described above didn't really "rule out" the other docs, because those still wouldn't match the "nest path with a prefix of /a/b/c" rule for the "children" – but it still wouldn't really be a "correct" "parents bit set filter" as the underlying code expects it to be, in terms of identifying all "non children" documents ... but I'm _pretty sure_ it would be broken for the "child" parser case, because some doc with an "/x" or "/x/y" path isn't going to be matched by the "parents filter bitset" so might get swallowed up in the list of children. The other thing that bugged me was the (mistaken & misguided) need to ' ... compute a list of all "prefix subpaths" ... ' – i'm not sure why i thought that was necessary, instead of just saying "must _NOT_ have a prefix of the specified path" – ie: {code:java} GIVEN:{!foo parentPath="/a/b/c"} ... 
INSTEAD OF:PARENT FILTER BITSET = ((*:* -_nest_path_:*) OR _nest_path_:(/a /a/b /a/b/c)) JUST USE:PARENT FILTER BITSET = (*:* -{prefix f="_nest_path_" v="/a/b/c/"}) {code} ...which (IIUC) should solve both problems, by matching: * docs w/o any nest path * docs with a nest path that does NOT start with /a/b/c/ ** which includes the immediate "/a/b/c" parents, as well as their ancestors, as well as any docs with completely orthogonal paths (like /x/y/z) But of course: in the case of {{parentPath="/"}} this would still simply be "docs w/o a nest path" That should work, right? I also think i made some mistakes/typos in my examples above in trying to articulate what the equivalent "old style" query would be, so let me restate all of the examples in full... {noformat} NEW: q={!parent parentPath="/a/b/c"}c_title:son OLD: q=(+{!field f="_nest_path_" v="/a/b/c"} +{!parent which=$ff v=$vv}) ff=(*:* -{prefix f="_nest_path_" v="/a/b/c/"}) vv=(+c_title:son +{prefix f="_nest_path_" v="/a/b/c/"}) {noformat} {noformat} NEW: q={!parent parentPath="/"}c_title:son OLD: q=(-_nest_path_:* +{!parent which=$ff v=$vv}) ff=(*:* -_nest_path_:*) vv=(+c_title:son +_nest_path_:*) {noformat} {noformat} NEW: q={!child parentPath="/a/b/c"}p_title:dad OLD: q={!child of=$ff v=$vv} ff=(*:* -{prefix f="_nest_path_" v="/a/b/c/"}) vv=(+p_title:dad +{field f="_nest_path_" v="/a/b/c"}) {noformat} {noformat} NEW: q={!child parentPath="/"}p_title:dad OLD: q={!child of=$ff v=$vv} ff=(*:* -_nest_path_:*) vv=(+p_title:dad +_nest_path_:*) {noformat} [~mkhl] - what do you think about this approach? do you see any flaws in the logic here? ... 
if the logic looks correct, I'd like to write it up as "how to create a *safe* of/which local param when using nest path" doc tip for SOLR-14383 and move forward there as a documentation improvement, even if there are still feature/implementation/syntax concerns/discussion to happen here as far as a "new feature" > Make child/parent query parsers natively aware of _nest_path_ > - > > Key: SOLR-14687 > URL: https://issues.apache.org/jira/browse/SOLR-14687 > Project: Solr > Issue Type: Sub-task >Reporter: Chris M. Hostetter >Priority: Major > > A long standing pain point of the parent/child QParsers is the "all parents" > bitmask/filter specified via the "which" and "of" params (respectively). > This is particularly tricky/painful to "get right" when dealing with > multi-level nested documents... > * > https://issues.apache.org/jira/browse/SOLR-14383?focusedCommentId=17166339&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17166339 > * > [https://lists.apache.org/thread.html/r7633a366dd76e7ce9d98e6b9f2a65da8af8240e846f789d938c8113f%40%3Csolr-user.lucene.apache.org%3E] > ...and it's *really* hard to get right when the nested structure isn't 100% > consistent among all docs: > * collections that mix docs w/o children and
[GitHub] [lucene-solr] madrob commented on a change in pull request #1732: Clean up many small fixes
madrob commented on a change in pull request #1732: URL: https://github.com/apache/lucene-solr/pull/1732#discussion_r468220684 ## File path: lucene/core/src/java/org/apache/lucene/codecs/blocktree/BlockTreeTermsWriter.java ## @@ -709,7 +709,7 @@ private PendingBlock writeBlock(int prefixLength, boolean isFloor, int floorLead PendingTerm term = (PendingTerm) ent; - assert StringHelper.startsWith(term.termBytes, prefix): "term.term=" + term.termBytes + " prefix=" + prefix; + assert StringHelper.startsWith(term.termBytes, prefix): "term.term=" + new String(term.termBytes) + " prefix=" + prefix; Review comment: Are these UTF-8? I wasn't sure, and hoped somebody would let me know during review.
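For context on the review question: Lucene term bytes are UTF-8 for ordinary text fields, but some fields store arbitrary binary terms, so a diagnostic message that decodes them should at least name a charset explicitly instead of relying on `new String(bytes)`, which uses the platform default. A small illustration (plain JDK, not Lucene code):

```java
import java.nio.charset.StandardCharsets;

public class Main {
    public static void main(String[] args) {
        // The two-byte UTF-8 encoding of "é".
        byte[] termBytes = {(byte) 0xC3, (byte) 0xA9};

        // new String(bytes) decodes with the platform default charset,
        // which varies by JVM configuration; naming the charset makes the
        // assertion message reproducible across machines.
        String decoded = new String(termBytes, StandardCharsets.UTF_8);
        System.out.println(decoded); // prints é
    }
}
```

For truly binary terms, a hex dump (or Lucene's own `BytesRef.toString`, which prints bytes in hex) would be a safer choice than any charset decode.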
[GitHub] [lucene-solr] madrob commented on a change in pull request #1732: Clean up many small fixes
madrob commented on a change in pull request #1732: URL: https://github.com/apache/lucene-solr/pull/1732#discussion_r468220535 ## File path: lucene/core/src/java/org/apache/lucene/analysis/Analyzer.java ## @@ -94,7 +94,7 @@ * Create a new Analyzer, reusing the same set of components per-thread * across calls to {@link #tokenStream(String, Reader)}. */ - public Analyzer() { Review comment: I understand that it's notionally an API change, but `abstract` classes have no reason for public constructors. We can make everything protected and the subclasses that people use will be able to pick it up. I was over-zealous in a couple places going to package instead of protected, I'll fix that up.
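The rationale can be shown with a toy example (illustrative names, not the actual Analyzer class): an abstract class can never be instantiated directly, so its constructor is only ever reached via `super()`, and `protected` already permits that from subclasses in any package.

```java
// Sketch: a public constructor on an abstract class grants no extra
// capability over a protected one, since `new Analyzerish(...)` is
// illegal anyway; only subclasses can invoke it, via super().
abstract class Analyzerish {
    private final String name;

    protected Analyzerish(String name) { // protected, not public
        this.name = name;
    }

    String name() {
        return name;
    }
}

// A subclass (which could live in any package) still compiles fine.
class MyAnalyzer extends Analyzerish {
    MyAnalyzer() {
        super("my");
    }
}

public class Main {
    public static void main(String[] args) {
        System.out.println(new MyAnalyzer().name()); // prints my
    }
}
```

Narrowing all the way to package-private, as the comment notes, would be too strict: it would break subclasses outside the package, which is why `protected` is the right middle ground.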
[jira] [Commented] (LUCENE-8776) Start offset going backwards has a legitimate purpose
[ https://issues.apache.org/jira/browse/LUCENE-8776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175081#comment-17175081 ] Roman commented on LUCENE-8776: --- I too suffer from the same issue; we have multi-token synonyms that can even overlap. I recognize the arguments against backward offsets but I find them surprisingly backwards: they are saying that the implementation dictates function, when the function is (for many people) the goal. The arguments also seem to say that the most efficient implementation (non-negative integer deltas) does not allow backward offsets, therefore backward offsets are a bug. Please recognize that the most elegant implementation sometimes means "as complex as needed" – it is not the same as "the simplest". If negative vints consume 5 bytes instead of 4, some people need to and are willing to pay that price. Their use cases cannot simply be 'boxed' into a world where one is only looking ahead and never back (NLP is one such world). Lucene is however inviting one particular solution: the implementation of vint seems not to mind if there is a negative offset (https://issues.apache.org/jira/browse/LUCENE-3738) and DefaultIndexingChain extends DocConsumer – the name 'Default' suggests that at some point in the past, Lucene developers wanted to provide other implementations. As it is *right now*, it is not easy to plug in a different 'DocConsumer' – that surely seems like an important omission! (one size fits all?). So if we just add a simple mechanism to instruct Lucene which DocConsumer to use, then all could be happy and not have to resort to dirty hacks or forks. The most efficient impl will be the default, yet it will allow us - dirty bastards - to shoot ourselves in the foot if we so desire. SOLR as well as ElasticSearch devs might not mind having the option in the future - it can come in handy. Wouldn't that be wonderful? Well, wonderful certainly not, just useful... could I do it? 
[~rcmuir] [~mikemccand] [~simonw] > Start offset going backwards has a legitimate purpose > - > > Key: LUCENE-8776 > URL: https://issues.apache.org/jira/browse/LUCENE-8776 > Project: Lucene - Core > Issue Type: Bug > Components: core/search >Affects Versions: 7.6 >Reporter: Ram Venkat >Priority: Major > > Here is the use case where startOffset can go backwards: > Say there is a line "Organic light-emitting-diode glows", and I want to run > span queries and highlight them properly. > During index time, light-emitting-diode is split into three words, which > allows me to search for 'light', 'emitting' and 'diode' individually. The > three words occupy adjacent positions in the index, as 'light' adjacent to > 'emitting' and 'light' at a distance of two words from 'diode' need to match > this word. So, the order of words after splitting are: Organic, light, > emitting, diode, glows. > But, I also want to search for 'organic' being adjacent to > 'light-emitting-diode' or 'light-emitting-diode' being adjacent to 'glows'. > The way I solved this was to also generate 'light-emitting-diode' at two > positions: (a) In the same position as 'light' and (b) in the same position > as 'glows', like below: > ||organic||light||emitting||diode||glows|| > | |light-emitting-diode| |light-emitting-diode| | > |0|1|2|3|4| > The positions of the two 'light-emitting-diode' are 1 and 3, but the offsets > are obviously the same. This works beautifully in Lucene 5.x in both > searching and highlighting with span queries. > But when I try this in Lucene 7.6, it hits the condition "Offsets must not go > backwards" at DefaultIndexingChain:818. This IllegalArgumentException is > being thrown without any comments on why this check is needed. As I explained > above, startOffset going backwards is perfectly valid, to deal with word > splitting and span operations on these specialized use cases. 
On the other > hand, it is not clear what value is added by this check and which highlighter > code is affected by offsets going backwards. This same check is done at > BaseTokenStreamTestCase:245. > I see others talk about how this check found bugs in WordDelimiter etc. but > it also prevents legitimate use cases. Can this check be removed? 
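The position/offset layout from the issue description can be modeled in a few lines of plain Java (a sketch, not Lucene's TokenStream API; the `Token` class here is illustrative) to show exactly which transition trips the "offsets must not go backwards" check:

```java
public class Main {
    // Illustrative token model (not Lucene's API): term text, position,
    // and character offsets into the original string.
    static final class Token {
        final String term;
        final int position, startOffset, endOffset;
        Token(String term, int position, int startOffset, int endOffset) {
            this.term = term;
            this.position = position;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
        }
    }

    // Mirrors the shape of the indexing-chain check: a token whose
    // startOffset is lower than the previous token's startOffset fails.
    static boolean offsetsGoBackwards(Token[] stream) {
        int last = -1;
        for (Token t : stream) {
            if (t.startOffset < last) return true;
            last = t.startOffset;
        }
        return false;
    }

    public static void main(String[] args) {
        // "Organic light-emitting-diode glows", with the compound token
        // injected at positions 1 AND 3, as described in the issue.
        Token[] stream = {
            new Token("organic", 0, 0, 7),
            new Token("light", 1, 8, 13),
            new Token("light-emitting-diode", 1, 8, 28),
            new Token("emitting", 2, 14, 22),
            new Token("diode", 3, 23, 28),
            new Token("light-emitting-diode", 3, 8, 28), // startOffset 8 < 23
            new Token("glows", 4, 29, 34),
        };
        System.out.println(offsetsGoBackwards(stream)); // true
    }
}
```

The second copy of `light-emitting-diode` reuses the surface form's offsets (8–28), so its startOffset regresses from the preceding `diode` token's 23 back to 8; that single transition is what the check rejects, even though positions remain monotonic.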
[GitHub] [lucene-solr] gautamworah96 opened a new pull request #1733: LUCENE-9450 Use BinaryDocValues in the taxonomy writer
gautamworah96 opened a new pull request #1733: URL: https://github.com/apache/lucene-solr/pull/1733 # Description This PR modifies the taxonomy writer and reader implementation to use BinaryDocValues instead of stored fields. The taxonomy index uses stored fields today and must do a number of stored field lookups for each query to resolve taxonomy ordinals back to human presentable facet labels. # Solution Change the storage format to use DocValues # Tests ant test fails because `.binaryValue()` throws a `NullPointerException`. To reproduce the error: `ant test -Dtestcase=TestExpressionAggregationFacetsExample -Dtests.method=testSimple -Dtests.seed=4544BD51622879A4 -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=si -Dtests.timezone=Antarctica/DumontDUrville -Dtests.asserts=true -Dtests.file.encoding=US-ASCII` gives ``` [junit4:pickseed] Seed property 'tests.seed' already defined: 4544BD51622879A4 [mkdir] Created dir: /Users/gauworah/opensource/mystuff/lucene-solr/lucene/build/demo/test/temp [junit4] says Привет! Master seed: 4544BD51622879A4 [junit4] Executing 1 suite with 1 JVM. [junit4] [junit4] Started J0 PID(76859@localhost). 
[junit4] Suite: org.apache.lucene.demo.facet.TestExpressionAggregationFacetsExample [junit4] 2> NOTE: reproduce with: ant test -Dtestcase=TestExpressionAggregationFacetsExample -Dtests.method=testSimple -Dtests.seed=4544BD51622879A4 -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=si -Dtests.timezone=Antarctica/DumontDUrville -Dtests.asserts=true -Dtests.file.encoding=US-ASCII [junit4] ERROR 0.61s | TestExpressionAggregationFacetsExample.testSimple <<< [junit4]> Throwable #1: java.lang.NullPointerException [junit4]>at __randomizedtesting.SeedInfo.seed([4544BD51622879A4:7DF799AF45DBAD75]:0) [junit4]>at org.apache.lucene.index.MultiDocValues$3.binaryValue(MultiDocValues.java:403) [junit4]>at org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyReader.getPath(DirectoryTaxonomyReader.java:328) [junit4]>at org.apache.lucene.facet.taxonomy.FloatTaxonomyFacets.getTopChildren(FloatTaxonomyFacets.java:151) [junit4]>at org.apache.lucene.demo.facet.ExpressionAggregationFacetsExample.search(ExpressionAggregationFacetsExample.java:107) [junit4]>at org.apache.lucene.demo.facet.ExpressionAggregationFacetsExample.runSearch(ExpressionAggregationFacetsExample.java:118) [junit4]>at org.apache.lucene.demo.facet.TestExpressionAggregationFacetsExample.testSimple(TestExpressionAggregationFacetsExample.java:28) [junit4]>at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [junit4]>at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) [junit4]>at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [junit4]>at java.base/java.lang.reflect.Method.invoke(Method.java:567) [junit4]>at java.base/java.lang.Thread.run(Thread.java:830) ``` 3 other tests also fail at the same line # Checklist Please review the following and check all that apply: - [x] I have reviewed the guidelines for [How to Contribute](https://wiki.apache.org/solr/HowToContribute) and my 
code conforms to the standards described there to the best of my ability. - [x] I have created a Jira issue and added the issue ID to my pull request title. - [x] I have given Solr maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended) - [x] I have developed this patch against the `master` branch. - [ ] I have run `ant precommit` and the appropriate test suite. - [ ] I have added tests for my changes. - [ ] I have added documentation for the [Ref Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) (for Solr changes only). **This is a draft PR**
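One plausible cause of the `NullPointerException` in this draft (an assumption on my part, not confirmed by the stack trace alone): since doc values became iterator-based, `advanceExact(docID)` must be called, and must return true, before `binaryValue()` is read; reading without positioning yields no current document. A minimal mock of that contract (illustrative classes, not the real Lucene API):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the iterator contract: a value accessor only returns data
// for the doc the iterator is positioned on; before any advance, or
// after a failed advance, it returns null, which surfaces as an NPE
// in a caller that dereferences the result.
class BinaryDocValuesIter {
    private final Map<Integer, byte[]> values;
    private Integer current; // doc the iterator is positioned on, or null

    BinaryDocValuesIter(Map<Integer, byte[]> values) {
        this.values = values;
    }

    boolean advanceExact(int docId) {
        current = values.containsKey(docId) ? docId : null;
        return current != null;
    }

    byte[] binaryValue() {
        return current == null ? null : values.get(current);
    }
}

public class Main {
    public static void main(String[] args) {
        Map<Integer, byte[]> vals = new HashMap<>();
        vals.put(3, "facet/label/path".getBytes());
        BinaryDocValuesIter dv = new BinaryDocValuesIter(vals);

        System.out.println(dv.binaryValue() == null); // true: not advanced yet
        dv.advanceExact(3);
        System.out.println(new String(dv.binaryValue()));
    }
}
```

If the taxonomy reader's `getPath` fetches the doc-values instance and immediately calls `binaryValue()` (the stored-fields idiom, which needed no positioning step), this is exactly the failure shape the stack trace shows.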
[jira] [Commented] (SOLR-13528) Rate limiting in Solr
[ https://issues.apache.org/jira/browse/SOLR-13528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175072#comment-17175072 ] ASF subversion and git services commented on SOLR-13528: Commit 424a9a6cfc64476b8d3fbee4f38733ffcb297f7c in lucene-solr's branch refs/heads/master from Cassandra Targett [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=424a9a6 ] SOLR-13528: fix heading levels > Rate limiting in Solr > - > > Key: SOLR-13528 > URL: https://issues.apache.org/jira/browse/SOLR-13528 > Project: Solr > Issue Type: New Feature >Reporter: Anshum Gupta >Assignee: Atri Sharma >Priority: Major > Time Spent: 9h 40m > Remaining Estimate: 0h > > In relation to SOLR-13527, Solr also needs a way to throttle update and > search requests based on usage metrics. This is the umbrella JIRA for both > update and search rate limiting.
[jira] [Comment Edited] (SOLR-13528) Rate limiting in Solr
[ https://issues.apache.org/jira/browse/SOLR-13528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175070#comment-17175070 ] Cassandra Targett edited comment on SOLR-13528 at 8/10/20, 9:34 PM: The Ref Guide docs in this commit were throwing some errors in the build (not failing it though) about inconsistent heading levels, which I only noticed because I was working on fixing our new Jenkins builds. I'm about to make a commit to fix that, but I noticed that the headings in question are all parameters users can configure. That's not generally how we structure parameters, especially when there is not a lot of text for each one (we have a couple examples of this being done, but I will someday get around to changing those to be like the majority of the parameters throughout the Guide, like in https://lucene.apache.org/solr/guide/8_6/detecting-languages-during-indexing.html#langid-parameters). I'm also uncomfortable with how the example parameters are shown (as separate source blocks). I think it might be simpler and more instructive for readers to restructure this a bit, to show a full configuration of the {{SolrRequestFilter}}, with all the parameters in a single block. Users can then copy/paste it more easily and will be less likely to miss a parameter. As it stands, I have to do a bit of mental gymnastics to figure out where this needs to go (full path to the file) and what it should look like. I'm fine doing this myself, I just wanted to let you know what I am going to do and why. Of course, if you'd like to do it, I'll be happy to let you. > Rate limiting in Solr > - > > Key: SOLR-13528 > URL: https://issues.apache.org/jira/browse/SOLR-13528 > Project: Solr > Issue Type: New Feature >Reporter: Anshum Gupta >Assignee: Atri Sharma >Priority: Major > Time Spent: 9h 40m > Remaining Estimate: 0h > > In relation to SOLR-13527, Solr also needs a way to throttle update and > search requests based on usage metrics. This is the umbrella JIRA for both > update and search rate limiting.
[jira] [Commented] (SOLR-13528) Rate limiting in Solr
[ https://issues.apache.org/jira/browse/SOLR-13528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175070#comment-17175070 ] Cassandra Targett commented on SOLR-13528: -- The Ref Guide docs in this commit were throwing some errors in the build (not failing it though) about inconsistent heading levels, which I only noticed because I was working on fixing our new Jenkins builds. I'm about to make a commit to fix that, but I noticed that the headings in question are all parameters users can configure. That's not generally how we structure parameters, especially when there is not a lot of text for each one (we have a couple examples of this being done, but I will someday get around to changing those to be like the majority of the parameters throughout the Guide, like in https://lucene.apache.org/solr/guide/8_6/detecting-languages-during-indexing.html#langid-parameters). I'm also uncomfortable with how the example parameters are shown (as separate source blocks). I think it might be simpler and more instructive for readers to restructure this a bit, to show a full configuration of the {{SolrRequestFilter}}, with all the parameters in a single block. Users can then copy/paste it more easily and will be less likely to miss a parameter. I'm fine doing this myself, I just wanted to let you know what I am going to do and why. Of course, if you'd like to do it, I'll be happy to let you. > Rate limiting in Solr > - > > Key: SOLR-13528 > URL: https://issues.apache.org/jira/browse/SOLR-13528 > Project: Solr > Issue Type: New Feature >Reporter: Anshum Gupta >Assignee: Atri Sharma >Priority: Major > Time Spent: 9h 40m > Remaining Estimate: 0h > > In relation to SOLR-13527, Solr also needs a way to throttle update and > search requests based on usage metrics. This is the umbrella JIRA for both > update and search rate limiting. 
[jira] [Resolved] (LUCENE-9452) Remove jenkins.build.ref.guide.sh
[ https://issues.apache.org/jira/browse/LUCENE-9452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cassandra Targett resolved LUCENE-9452. --- Fix Version/s: 8.7 Resolution: Fixed I only backported to branch_8x, but could also remove it from branch_8_6 if it's necessary to do so. > Remove jenkins.build.ref.guide.sh > - > > Key: LUCENE-9452 > URL: https://issues.apache.org/jira/browse/LUCENE-9452 > Project: Lucene - Core > Issue Type: Improvement > Components: general/build >Reporter: Cassandra Targett >Assignee: Cassandra Targett >Priority: Major > Fix For: 8.7 > > > After the move to Cloudbees (ci-builds.apache.org), the Ref Guide Jenkins > jobs stopped working. The {{dev-tools/scripts/jenkins.build.ref.guide.sh}} > script we used to build the Guide installed its own RVM and gemset for the > required gems to run with the Ant build and it was difficult to get the paths > right. Infra added the dependencies that we need to their Puppet-managed node > deploy process (see INFRA-20656) and now we don't need a script to do any of > that for us. > This issue is to track removing the script since it's no longer required. The > Ref Guide build jobs will just invoke Ant directly instead. > IIUC from SOLR-10568 when the script was added, there might still come a day > when there is a version mismatch between what was installed by default and > what our build needs, but I think it's fair to try to work with Infra to get > our needs met on the nodes instead of adding them to a script which makes > migration like this more complex. > All of these pre-build dependencies go away, however, when we move to Gradle, > so even if we have a version mismatch one time it won't be a persistent issue.
[jira] [Commented] (LUCENE-9452) Remove jenkins.build.ref.guide.sh
[ https://issues.apache.org/jira/browse/LUCENE-9452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175051#comment-17175051 ] ASF subversion and git services commented on LUCENE-9452: - Commit c5c1f43c0effaa2647312cc6ce8e5704bead020f in lucene-solr's branch refs/heads/branch_8x from Cassandra Targett [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=c5c1f43 ] LUCENE-9452: remove jenkins.build.ref.guide.sh as it's no longer needed > Remove jenkins.build.ref.guide.sh > - > > Key: LUCENE-9452 > URL: https://issues.apache.org/jira/browse/LUCENE-9452 > Project: Lucene - Core > Issue Type: Improvement > Components: general/build >Reporter: Cassandra Targett >Assignee: Cassandra Targett >Priority: Major > > After the move to Cloudbees (ci-builds.apache.org), the Ref Guide Jenkins > jobs stopped working. The {{dev-tools/scripts/jenkins.build.ref.guide.sh}} > script we used to build the Guide installed its own RVM and gemset for the > required gems to run with the Ant build and it was difficult to get the paths > right. Infra added the dependencies that we need to their Puppet-managed node > deploy process (see INFRA-20656) and now we don't need a script to do any of > that for us. > This issue is to track removing the script since it's no longer required. The > Ref Guide build jobs will just invoke Ant directly instead. > IIUC from SOLR-10568 when the script was added, there might still come a day > when there is a version mismatch between what was installed by default and > what our build needs, but I think it's fair to try to work with Infra to get > our needs met on the nodes instead of adding them to a script which makes > migration like this more complex. > All of these pre-build dependencies go away, however, when we move to Gradle, > so even if we have a version mismatch one time it won't be a persistent issue. 
[jira] [Commented] (LUCENE-9452) Remove jenkins.build.ref.guide.sh
[ https://issues.apache.org/jira/browse/LUCENE-9452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175050#comment-17175050 ] ASF subversion and git services commented on LUCENE-9452: - Commit a747051c6ae348a7f16cf684e4de6b49e27fc5c3 in lucene-solr's branch refs/heads/master from Cassandra Targett [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=a747051 ] LUCENE-9452: remove jenkins.build.ref.guide.sh as it's no longer needed > Remove jenkins.build.ref.guide.sh > - > > Key: LUCENE-9452 > URL: https://issues.apache.org/jira/browse/LUCENE-9452 > Project: Lucene - Core > Issue Type: Improvement > Components: general/build >Reporter: Cassandra Targett >Assignee: Cassandra Targett >Priority: Major > > After the move to Cloudbees (ci-builds.apache.org), the Ref Guide Jenkins > jobs stopped working. The {{dev-tools/scripts/jenkins.build.ref.guide.sh}} > script we used to build the Guide installed its own RVM and gemset for the > required gems to run with the Ant build and it was difficult to get the paths > right. Infra added the dependencies that we need to their Puppet-managed node > deploy process (see INFRA-20656) and now we don't need a script to do any of > that for us. > This issue is to track removing the script since it's no longer required. The > Ref Guide build jobs will just invoke Ant directly instead. > IIUC from SOLR-10568 when the script was added, there might still come a day > when there is a version mismatch between what was installed by default and > what our build needs, but I think it's fair to try to work with Infra to get > our needs met on the nodes instead of adding them to a script which makes > migration like this more complex. > All of these pre-build dependencies go away, however, when we move to Gradle, > so even if we have a version mismatch one time it won't be a persistent issue. 
[jira] [Commented] (SOLR-14354) HttpShardHandler send requests in async
[ https://issues.apache.org/jira/browse/SOLR-14354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175025#comment-17175025 ] Rishi Sankar commented on SOLR-14354: - [~caomanhdat] I am interested in this work as well, I am down to work on this if you'd like and do a PR with the API changes David suggested. > HttpShardHandler send requests in async > --- > > Key: SOLR-14354 > Project: Solr > Issue Type: Improvement >Reporter: Cao Manh Dat >Assignee: Cao Manh Dat >Priority: Major > Fix For: master (9.0), 8.7 > > Attachments: image-2020-03-23-10-04-08-399.png, > image-2020-03-23-10-09-10-221.png, image-2020-03-23-10-12-00-661.png > > Time Spent: 4h > Remaining Estimate: 0h > > h2. 1. Current approach (problem) of Solr > Below is a diagram describing the model of how a request is currently handled. > !image-2020-03-23-10-04-08-399.png! > The main thread that handles the search requests will submit n requests (n > equals the number of shards) to an executor. So each request will correspond > to a thread; after sending a request that thread basically does nothing, just > waiting for a response from the other side. That thread will be swapped out and the CPU > will try to handle another thread (this is called a context switch; the CPU > saves the context of the current thread and switches to another one). When some > data (not all) comes back, that thread will be called to parse the data, > then it will wait until more data comes back. So there will be lots of context > switching in the CPU. That is a quite inefficient use of threads. Basically we > want fewer threads, and most of them must be busy all the time, because threads > are not free, and neither is context switching. That is the main idea behind > everything, like executors. > h2. 2. Async call of Jetty HttpClient > Jetty HttpClient offers an async API like this. 
> {code:java} > httpClient.newRequest("http://domain.com/path") > // Add request hooks > .onRequestQueued(request -> { ... }) > .onRequestBegin(request -> { ... }) > // Add response hooks > .onResponseBegin(response -> { ... }) > .onResponseHeaders(response -> { ... }) > .onResponseContent((response, buffer) -> { ... }) > .send(result -> { ... }); {code} > Therefore after calling {{send()}} the thread returns immediately without blocking. When the client receives the headers from the other side, it calls the {{onHeaders()}} listeners. When the client receives some {{byte[]}} (not the whole response) it calls the {{onContent(buffer)}} listeners. When everything has finished it calls the {{onComplete}} listeners. One main thing to notice here is that all listeners should finish quickly; if a listener blocks, further data for that request won't be handled until the listener finishes. > h2. 3. Solution 1: Sending requests async but spinning up one thread per response > Jetty HttpClient already provides several listeners; one of them is InputStreamResponseListener. This is how it is used: > {code:java} > InputStreamResponseListener listener = new InputStreamResponseListener(); > client.newRequest(...).send(listener); > // Wait for the response headers to arrive > Response response = listener.get(5, TimeUnit.SECONDS); > if (response.getStatus() == 200) { > // Obtain the input stream on the response content > try (InputStream input = listener.getInputStream()) { > // Read the response content > } > } {code} > In this case, there will be two threads: > * one thread trying to read the response content from the InputStream > * one thread (a short-lived task) feeding content into the above InputStream whenever some byte[] is available. Note that if this thread is unable to feed data into the InputStream, it will wait.
> Using this, the model of HttpShardHandler can be rewritten into something like this: > {code:java} > handler.sendReq(req, (is) -> { > executor.submit(() -> > try (is) { > // Read the content from InputStream > } > ) > }) {code} > The first diagram then changes into this: > !image-2020-03-23-10-09-10-221.png! > Notice that although "sending req to shard1" is wide, it won't take a long time, since sending a request is a very quick operation. With this approach, handling threads won't be spun up until the first bytes are sent back. Notice that we still have active threads waiting for more data from the InputStream. > h2. 4. Solution 2: Buffering data and handling it inside Jetty's thread > Jetty has another listener called BufferingResponseListener. This is how it is used: > {code:java} > client.newRequest(...).send(new BufferingResponseListener() { > public void on
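The executor-per-response pattern of solution 1 can be sketched without Jetty at all. The following is a hypothetical, self-contained illustration (names like {{sendReq}} and {{sendAndAwait}} are stand-ins, not the actual HttpShardHandler or Jetty API): the sending thread returns immediately, and parsing of the response stream is submitted to a pooled thread.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: response bytes stand in for "first bytes arrived";
// parsing happens on a pooled thread, not the thread that sent the request.
public class AsyncShardSketch {
  private static final ExecutorService EXECUTOR =
      Executors.newFixedThreadPool(2, r -> {
        Thread t = new Thread(r);
        t.setDaemon(true); // don't keep the JVM alive for demo threads
        return t;
      });

  /** "Sends" a request and returns immediately; parsing runs on the executor. */
  public static Future<String> sendReq(byte[] responseBytes) {
    InputStream is = new ByteArrayInputStream(responseBytes);
    return EXECUTOR.submit(() -> {
      try (InputStream in = is) { // close the stream when parsing is done
        return new String(in.readAllBytes(), StandardCharsets.UTF_8);
      }
    });
  }

  /** Convenience for demos: send and block for the parsed result. */
  public static String sendAndAwait(byte[] responseBytes) {
    try {
      return sendReq(responseBytes).get();
    } catch (InterruptedException | ExecutionException e) {
      throw new RuntimeException(e);
    }
  }
}
```

In the real implementation the stream would be Jetty's InputStreamResponseListener stream rather than an in-memory buffer, but the thread hand-off shape is the same.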
[jira] [Created] (LUCENE-9452) Remove jenkins.build.ref.guide.sh
Cassandra Targett created LUCENE-9452: - Summary: Remove jenkins.build.ref.guide.sh Key: LUCENE-9452 URL: https://issues.apache.org/jira/browse/LUCENE-9452 Project: Lucene - Core Issue Type: Improvement Components: general/build Reporter: Cassandra Targett Assignee: Cassandra Targett After the move to Cloudbees (ci-builds.apache.org), the Ref Guide Jenkins jobs stopped working. The {{dev-tools/scripts/jenkins.build.ref.guide.sh}} script we used to build the Guide installed its own RVM and gemset for the required gems to run with the Ant build and it was difficult to get the paths right. Infra added the dependencies that we need to their Puppet-managed node deploy process (see INFRA-20656) and now we don't need a script to do any of that for us. This issue is to track removing the script since it's no longer required. The Ref Guide build jobs will just invoke Ant directly instead. IIUC from SOLR-10568 when the script was added, there might still come a day when there is a version mismatch between what was installed by default and what our build needs, but I think it's fair to try to work with Infra to get our needs met on the nodes instead of adding them to a script which makes migration like this more complex. All of these pre-build dependencies go away, however, when we move to Gradle, so even if we have a version mismatch one time it won't be a persistent issue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] dweiss commented on a change in pull request #1732: Clean up many small fixes
dweiss commented on a change in pull request #1732: URL: https://github.com/apache/lucene-solr/pull/1732#discussion_r468106643 ## File path: lucene/core/src/java/org/apache/lucene/codecs/blocktree/BlockTreeTermsWriter.java ## @@ -709,7 +709,7 @@ private PendingBlock writeBlock(int prefixLength, boolean isFloor, int floorLead PendingTerm term = (PendingTerm) ent; - assert StringHelper.startsWith(term.termBytes, prefix): "term.term=" + term.termBytes + " prefix=" + prefix; + assert StringHelper.startsWith(term.termBytes, prefix): "term.term=" + new String(term.termBytes) + " prefix=" + prefix; Review comment: This is wrong, uses default locale. ## File path: lucene/core/src/java/org/apache/lucene/analysis/Analyzer.java ## @@ -94,7 +94,7 @@ * Create a new Analyzer, reusing the same set of components per-thread * across calls to {@link #tokenStream(String, Reader)}. */ - public Analyzer() { Review comment: Can you not change those scopes in public API classes? This applies here and in other places -- protected changed to package-scope for source is not really an API-compatible change. ## File path: lucene/core/src/java/org/apache/lucene/index/DocumentsWriterFlushControl.java ## @@ -324,12 +324,12 @@ synchronized void doOnAbort(DocumentsWriterPerThread perThread) { } } - private void checkoutAndBlock(DocumentsWriterPerThread perThread) { + private synchronized void checkoutAndBlock(DocumentsWriterPerThread perThread) { Review comment: These are serious changes... you're adding synchronization on core classes. I don't think they should be piggybacked on top of trivial ones - I'm sure @s1monw would chip in whether this synchronization here makes sense but he'll probably overlook if it's a bulk of trivial changes on top. 
## File path: lucene/core/src/java/org/apache/lucene/index/DocValuesUpdate.java ## @@ -152,12 +152,12 @@ static BytesRef readFrom(DataInput in, BytesRef scratch) throws IOException { } NumericDocValuesUpdate(Term term, String field, Long value) { - this(term, field, value != null ? value.longValue() : -1, BufferedUpdates.MAX_INT, value != null); + this(term, field, value != null ? value : -1, BufferedUpdates.MAX_INT, value != null); } -private NumericDocValuesUpdate(Term term, String field, long value, int docIDUpTo, boolean hasValue) { Review comment: previous version was correct camel case (upTo). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
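The "uses default locale" review point above is about {{new String(byte[])}} decoding with the platform default charset, so the same assertion message can render differently across JVMs. A minimal standalone sketch (not Lucene code) of the deterministic alternatives:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrates why new String(byte[]) is a poor choice in assertion messages:
// it decodes with the platform default charset, which varies by environment.
public class BytesToStringDemo {
  public static String unstable(byte[] termBytes) {
    return new String(termBytes);           // platform-dependent decoding
  }

  public static String stable(byte[] termBytes) {
    return Arrays.toString(termBytes);      // always the raw byte values
  }

  public static String explicit(byte[] termBytes) {
    return new String(termBytes, StandardCharsets.UTF_8); // pinned charset
  }
}
```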
[GitHub] [lucene-solr] madrob opened a new pull request #1732: Clean up many small fixes
madrob opened a new pull request #1732: URL: https://github.com/apache/lucene-solr/pull/1732 * Abstract classes don't need public constructors since they can only be called by subclasses * Don't escape HTML characters in @code tags in javadoc * Fixed a few int/long arithmetic issues * Use Arrays.toString instead of implicit byte[].toString This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
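The "int/long arithmetic" class of fix in this PR can be shown with a tiny standalone example (illustrative names, not the actual changed code): multiplying two ints overflows before any widening to long happens, so the cast must come first.

```java
// Demonstrates the int/long arithmetic bug pattern and its fix.
public class IntLongArithmetic {
  public static long overflowing(int blockSize, int blockCount) {
    return blockSize * blockCount;          // int multiply: may silently wrap
  }

  public static long safe(int blockSize, int blockCount) {
    return (long) blockSize * blockCount;   // widen first, then multiply
  }
}
```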
[jira] [Commented] (LUCENE-2822) TimeLimitingCollector starts thread in static {} with no way to stop them
[ https://issues.apache.org/jira/browse/LUCENE-2822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175004#comment-17175004 ] David Smiley commented on LUCENE-2822: -- I could imagine an implementation that checks doc ID advancement (including docs not collected) every X docs, tracking how many nanoseconds the last X took, so that X can be adjusted to check more frequently as needed to meet the deadline. BTW it's unfortunate that ExitableDirectoryReader and TimeLimitingCollector don't even refer to each other in their javadocs; they don't use the same exception, they live in different packages, and they track time differently as well. Not user friendly. ExitableDirectoryReader is used earlier in the search process (covering query rewrite of wildcards, which is important), but _I think_ it spans to nearly the end of collection, since the query should be reading the index relating to the final doc collected. So I wonder if we need a TimeLimitingCollector at all? > TimeLimitingCollector starts thread in static {} with no way to stop them > - > > Key: LUCENE-2822 > URL: https://issues.apache.org/jira/browse/LUCENE-2822 > Project: Lucene - Core > Issue Type: Bug >Reporter: Robert Muir >Assignee: Simon Willnauer >Priority: Major > Fix For: 3.5, 4.0-ALPHA > > Attachments: LUCENE-2822.patch, LUCENE-2822.patch, LUCENE-2822.patch, > LUCENE-2822.patch > > > See the comment in LuceneTestCase. > If you even do Class.forName("TimeLimitingCollector") it starts up a thread > in a static method, and there isn't a way to kill it. > This is broken. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
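The adaptive-interval idea floated in this comment can be sketched roughly as follows. This is a hypothetical illustration, not Lucene API: all names and the shrink heuristic are made up, and the sketch only shrinks the interval so a whole interval can never overshoot the remaining budget by much.

```java
// Sketch: consult System.nanoTime() only every `interval` docs, and shrink
// the interval as the deadline approaches so checks become more frequent.
public class AdaptiveDeadlineCheck {
  private final long deadlineNanos;
  private long interval;        // docs between clock reads
  private long docsSinceCheck;
  private long lastCheckNanos;

  public AdaptiveDeadlineCheck(long budgetNanos, long initialInterval) {
    this.deadlineNanos = System.nanoTime() + budgetNanos;
    this.interval = initialInterval;
    this.lastCheckNanos = System.nanoTime();
  }

  /** Call once per advanced doc; returns true when time is up. */
  public boolean advanceAndCheck() {
    if (++docsSinceCheck < interval) {
      return false;             // no clock read on the hot path
    }
    long now = System.nanoTime();
    long nanosPerDoc = Math.max(1, (now - lastCheckNanos) / docsSinceCheck);
    long remaining = deadlineNanos - now;
    if (remaining <= 0) {
      return true;
    }
    // Heuristic: keep one interval's worth of docs under ~half the budget left.
    interval = Math.max(1, Math.min(interval, remaining / (2 * nanosPerDoc)));
    docsSinceCheck = 0;
    lastCheckNanos = now;
    return false;
  }
}
```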
[jira] [Commented] (LUCENE-8626) standardise test class naming
[ https://issues.apache.org/jira/browse/LUCENE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174997#comment-17174997 ] Dawid Weiss commented on LUCENE-8626: - bq. Still, without automated enforcement For LuceneTestCase subclasses an automatic enforcement of this is trivial: add a test rule (or before class hook) that checks test class name (it can go up the chain of superclasses but doesn't have to). The benefit of doing this vs. file name checks is that actual test suites would be verified - not any other class that isn't a test suite. It would also work across all projects. Including those that import lucene-test-framework (which may be problematic for people?). > standardise test class naming > - > > Key: LUCENE-8626 > URL: https://issues.apache.org/jira/browse/LUCENE-8626 > Project: Lucene - Core > Issue Type: Test >Reporter: Christine Poerschke >Priority: Major > Attachments: SOLR-12939.01.patch, SOLR-12939.02.patch, > SOLR-12939.03.patch, SOLR-12939_hoss_validation_groovy_experiment.patch > > > This was mentioned and proposed on the dev mailing list. Starting this ticket > here to start to make it happen? > History: This ticket was created as > https://issues.apache.org/jira/browse/SOLR-12939 ticket and then got > JIRA-moved to become https://issues.apache.org/jira/browse/LUCENE-8626 ticket. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14726) Streamline getting started experience
[ https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174995#comment-17174995 ] David Eric Pugh commented on SOLR-14726: I notice that on the page where we discuss curl, https://lucene.apache.org/solr/guide/8_6/introduction-to-solr-indexing.html#introduction-to-solr-indexing, there is a rather random comment about using wget or Perl. Thoughts on removing this line: Instead of curl, you can use utilities such as GNU wget (http://www.gnu.org/software/wget/) or manage GETs and POSTS with Perl, although the command line options will differ. > Streamline getting started experience > - > > Key: SOLR-14726 > URL: https://issues.apache.org/jira/browse/SOLR-14726 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Ishan Chattopadhyaya >Priority: Major > Labels: newdev > > The reference guide Solr tutorial is here: > https://lucene.apache.org/solr/guide/8_6/solr-tutorial.html > It needs to be simplified and easy to follow. Also, it should reflect our > best practices, that should also be followed in production. I have following > suggestions: > # Make it less verbose. It is too long. On my laptop, it required 35 page > downs button presses to get to the bottom of the page! > # First step of the tutorial should be to enable security (basic auth should > suffice). > # {{./bin/solr start -e cloud}} <-- All references of -e should be removed. > # All references of {{bin/solr post}} to be replaced with {{curl}} > # Convert all {{bin/solr create}} references to curl of collection creation > commands > # Add docker based startup instructions. > # Create a Jupyter Notebook version of the entire tutorial, make it so that > it can be easily executed from Google Colaboratory. 
Here's an example: > https://twitter.com/TheSearchStack/status/1289703715981496320 > # Provide downloadable Postman and Insomnia files so that the same tutorial > can be executed from those tools. Except for starting Solr, all other steps > should be possible to be carried out from those tools. > # Use V2 APIs everywhere in the tutorial > # Remove all example modes, sample data (films, tech products etc.), > configsets from Solr's distribution (instead let the examples refer to them > from github) > # Remove the post tool from Solr, curl should suffice. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] murblanc commented on pull request #1730: SOLR-14680: Provide an implementation for the new SolrCluster API
murblanc commented on pull request #1730: URL: https://github.com/apache/lucene-solr/pull/1730#issuecomment-671509879 > The current concrete classes do not use/implement these interfaces. These interfaces will only be a part of implementations. for instance, the `LazySolrCluster` is one of the impl. In the future we should add a couple more @noble what is the target use case of the interface and lazy implementation? I thought your aim was to create interfaces to existing internal classes, so I expected the existing classes to implement these interfaces and the interfaces to be used in the code in place of the actual classes... Maybe it's just me not understanding your intention here. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Resolved] (SOLR-14702) Remove Master and Slave from Code Base and Docs
[ https://issues.apache.org/jira/browse/SOLR-14702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tomas Eduardo Fernandez Lobbe resolved SOLR-14702. -- Fix Version/s: 8.7 master (9.0) Resolution: Fixed Thanks [~marcussorealheis] for driving this and everyone who contributed. This has been merged and backported. Lets take any followup tasks as new Jira issues. > Remove Master and Slave from Code Base and Docs > --- > > Key: SOLR-14702 > URL: https://issues.apache.org/jira/browse/SOLR-14702 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: master (9.0) >Reporter: Marcus Eagan >Priority: Critical > Fix For: master (9.0), 8.7 > > Attachments: SOLR-14742-testfix.patch > > Time Spent: 17h > Remaining Estimate: 0h > > Every time I read _master_ and _slave_, I get pissed. > I think about the last and only time I remember visiting my maternal great > grandpa in Alabama at four years old. He was a sharecropper before WWI, where > he lost his legs, and then he was back to being a sharecropper somehow after > the war. Crazy, I know. I don't know if the world still called his job > sharecropping in 1993, but he was basically a slave—in America. He lived in > the same shack that his father, and his grandfather (born a slave) lived in > down in Alabama. Believe it or not, my dad's (born in 1926) grandfather was > actually born a slave, freed shortly after birth by his owner father. I never > met him, though. He died in the 40s. > Anyway, I cannot police all terms in the repo and do not wish to. This > master/slave shit is archaic and misleading on technical grounds. Thankfully, > there's only a handful of files in code and documentation that still talk > about masters and slaves. We should replace all of them. > There are so many ways to reword it. 
In fact, unless anyone else objects or > wants to do the grunt work to help my stress levels, I will open the pull > request myself in effort to make this project and community more inviting to > people of all backgrounds and histories. We can have leader/follower, or > primary/secondary, but none of this Master/Slave nonsense. I'm sick of the > garbage. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14726) Streamline getting started experience
[ https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174951#comment-17174951 ] Ishan Chattopadhyaya commented on SOLR-14726: - The main idea behind using curl is not just to let the user post the documents. The main benefit I see is that developers in almost any programming language can easily understand what is happening with the curl commands and likely already know how to achieve the same in their language of choice. The universal familiarity of curl helps here. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14726) Streamline getting started experience
[ https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174948#comment-17174948 ] Ishan Chattopadhyaya commented on SOLR-14726: - bq. Everyone knows cURL. Agree with Marcus here. I think curl + jq should be sufficient. Is there anything else that the post tool can do which curl can't? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] murblanc commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for placement plugin interface
murblanc commented on a change in pull request #1684: URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r468065465 ## File path: solr/core/src/java/org/apache/solr/cluster/placement/PlacementPlugin.java ## @@ -0,0 +1,41 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.cluster.placement; + +/** + * Implemented by external plugins to control replica placement and movement on the search cluster (as well as other things + * such as cluster elasticity?) when cluster changes are required (initiated elsewhere, most likely following a Collection + * API call). + */ +public interface PlacementPlugin { Review comment: I believe we should let the plug-in manage this type of requirements rather than try to control it by the timing of when configs are passed. If there are licences to check the plug-in should cache the result and only confirm each time? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Comment Edited] (LUCENE-2822) TimeLimitingCollector starts thread in static {} with no way to stop them
[ https://issues.apache.org/jira/browse/LUCENE-2822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174930#comment-17174930 ] Uwe Schindler edited comment on LUCENE-2822 at 8/10/20, 5:18 PM: - In addition, the extra thread will soon be no issue anymore (the new thread impl coming with later java versions) - also known as fibers. was (Author: thetaphi): In addition, the extra thread will soon be no issue anymore (the new thread impl coming with later java versions). -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-2822) TimeLimitingCollector starts thread in static {} with no way to stop them
[ https://issues.apache.org/jira/browse/LUCENE-2822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174930#comment-17174930 ] Uwe Schindler commented on LUCENE-2822: --- In addition, the extra thread will soon be no issue anymore (the new thread impl coming with later java versions). -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-2822) TimeLimitingCollector starts thread in static {} with no way to stop them
[ https://issues.apache.org/jira/browse/LUCENE-2822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174917#comment-17174917 ] Uwe Schindler commented on LUCENE-2822: --- Hi, it depends on the operating system. Nanotime is still officially a syscall on all operating systems, but some implementations of libc make it a volatile read from some address space mapped into the process (e.g. macOS), only falling back to a syscall if the result can't be trusted. So in general, calling it on every hit is still a bad idea if you can't rely on it being cheap. I'd use some modulo operation and maybe call it every 1000 hits. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
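The modulo suggestion above can be sketched in a few lines. This is an illustrative stand-in, not Lucene's TimeLimitingCollector API: the clock is read once per 1000 hits rather than per hit.

```java
// Sketch: gate the System.nanoTime() read behind a counter so its cost is
// paid once every CHECK_EVERY hits instead of on every collected hit.
public class SampledTimeout {
  private static final int CHECK_EVERY = 1000;
  private final long deadlineNanos;
  private int hits;

  public SampledTimeout(long budgetNanos) {
    this.deadlineNanos = System.nanoTime() + budgetNanos;
  }

  /** Call once per collected hit; reads the clock only every CHECK_EVERY hits. */
  public boolean hitAndCheckExpired() {
    if (++hits % CHECK_EVERY != 0) {
      return false;               // cheap path: no clock read
    }
    return System.nanoTime() > deadlineNanos;
  }
}
```

The trade-off is that a timeout is detected up to 999 hits late, which is usually acceptable for a soft time limit.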
[jira] [Commented] (LUCENE-2822) TimeLimitingCollector starts thread in static {} with no way to stop them
[ https://issues.apache.org/jira/browse/LUCENE-2822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174461#comment-17174461 ] David Smiley commented on LUCENE-2822: -- [~uschindler] (or anyone), is System.nanoTime still considered expensive in modern JVMs to call once per collected doc? You commented about its expense above. Alternatively, nanoTime could be called when the doc collection delta exceeds, say, 100 docs since the last nanoTime check. I'm digging this old issue up because, where I work, we made some improvements to this utility many years ago to deal with thread starvation under load. As I look at it, I just don't like this Thread here at all, so I'm wondering if we can just remove it instead of enhancing its existing mechanism. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14726) Streamline getting started experience
[ https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174436#comment-17174436 ]

Marcus Eagan commented on SOLR-14726:
-------------------------------------

bq. My experience from delivering the Solr version of Think Like a Relevance Engineer is that MANY MANY people aren't able to install python. They may be on Windows, they may not have "Developer" permissions, they may have Python 2 versus 3, or it's just not something they use at all.

Yeah. I have watched very skilled engineers struggle with Python, pip, and virtual environments. I would vote absolutely against adding Python, even though it is my favorite and strongest language by far.

> Streamline getting started experience
> -------------------------------------
>
>                 Key: SOLR-14726
>                 URL: https://issues.apache.org/jira/browse/SOLR-14726
>             Project: Solr
>          Issue Type: Task
>   Security Level: Public (Default Security Level. Issues are Public)
>            Reporter: Ishan Chattopadhyaya
>            Priority: Major
>              Labels: newdev
>
> The reference guide Solr tutorial is here:
> https://lucene.apache.org/solr/guide/8_6/solr-tutorial.html
> It needs to be simplified and made easy to follow. It should also reflect our best practices, which should also be followed in production. I have the following suggestions:
> # Make it less verbose. It is too long. On my laptop, it took 35 Page Down presses to reach the bottom of the page!
> # The first step of the tutorial should be to enable security (basic auth should suffice).
> # {{./bin/solr start -e cloud}} <-- All references to -e should be removed.
> # All references to {{bin/solr post}} should be replaced with {{curl}}.
> # Convert all {{bin/solr create}} references to curl calls of the collection-creation commands.
> # Add Docker-based startup instructions.
> # Create a Jupyter Notebook version of the entire tutorial, so that it can be easily executed from Google Colaboratory. Here's an example: https://twitter.com/TheSearchStack/status/1289703715981496320
> # Provide downloadable Postman and Insomnia files so that the same tutorial can be executed from those tools. Except for starting Solr, all other steps should be possible to carry out from those tools.
> # Use v2 APIs everywhere in the tutorial.
> # Remove all example modes, sample data (films, tech products, etc.), and configsets from Solr's distribution (instead, let the examples refer to them from GitHub).
> # Remove the post tool from Solr; curl should suffice.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
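A side note on item 4 above: replacing {{bin/solr post}} with curl works because indexing is just an HTTP POST to the collection's update handler. As a hedged sketch in Java (the host {{localhost:8983}} and the collection name {{films}} are assumptions for illustration, and the request is only built here, never sent):

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Builds (but does not send) the HTTP request that `curl` would issue in
// place of `bin/solr post`. Host, port, and collection name are
// illustrative assumptions, not values from the tutorial.
public class PostDocsSketch {
    static HttpRequest buildUpdateRequest(String collection, String jsonDocs) {
        return HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8983/solr/" + collection + "/update?commit=true"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonDocs))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest req = buildUpdateRequest("films",
                "[{\"id\":\"1\",\"title\":\"Example\"}]");
        System.out.println(req.method() + " " + req.uri());
    }
}
```

The curl equivalent, again with the collection name assumed, would be something like `curl -X POST -H 'Content-Type: application/json' --data-binary @docs.json 'http://localhost:8983/solr/films/update?commit=true'`.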
[GitHub] [lucene-solr] madrob commented on a change in pull request #1731: LUCENE-9451 Sort.rewrite does not always return this when unchanged
madrob commented on a change in pull request #1731:
URL: https://github.com/apache/lucene-solr/pull/1731#discussion_r468031592

## File path: lucene/core/src/java/org/apache/lucene/search/DoubleValuesSource.java
## @@ -456,13 +456,16 @@ public String toString() {
   @Override
   public SortField rewrite(IndexSearcher searcher) throws IOException {
-    DoubleValuesSortField rewritten = new DoubleValuesSortField(producer.rewrite(searcher), reverse);
+    DoubleValuesSource rewrittenSource = producer.rewrite(searcher);
+    if (rewrittenSource == producer) {

Review comment:
This might be better as an object equality check instead of reference equality; I'm not sure.

----
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
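For readers without the full diff in front of them, the shape of the change (and the reference-vs-object-equality question) can be sketched with toy stand-ins. These simplified classes are not the real Lucene API (the real rewrite takes an IndexSearcher and throws IOException); they only show the allocation the PR avoids:

```java
// Toy stand-ins, not the real Lucene classes.
class ToyValuesSource {
    ToyValuesSource rewrite() { return this; } // already in rewritten form
}

class ToySortField {
    final ToyValuesSource producer;
    ToySortField(ToyValuesSource producer) { this.producer = producer; }

    // Mirrors the PR: return this (no new object) when the rewritten
    // source is the same instance. The review question is whether
    // rewrittenSource.equals(producer) would be the better test.
    ToySortField rewrite() {
        ToyValuesSource rewrittenSource = producer.rewrite();
        if (rewrittenSource == producer) {
            return this;
        }
        return new ToySortField(rewrittenSource);
    }
}
```

With the pre-patch code, rewrite always allocated a fresh sort field, so a caller's reference check against the original could never see an unchanged field.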
[GitHub] [lucene-solr] madrob opened a new pull request #1731: LUCENE-9451 Sort.rewrite does not always return this when unchanged
madrob opened a new pull request #1731:
URL: https://github.com/apache/lucene-solr/pull/1731

https://issues.apache.org/jira/browse/LUCENE-9451
[jira] [Commented] (SOLR-14726) Streamline getting started experience
[ https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174433#comment-17174433 ]

David Eric Pugh commented on SOLR-14726:
----------------------------------------

My experience from delivering the Solr version of _Think Like a Relevance Engineer_ is that MANY MANY people aren't able to install Python. They may be on Windows, they may not have "Developer" permissions, they may have Python 2 versus 3, or it's just not something they use at all. In fact, I'm working on stripping out the Python requirement (https://github.com/o19s/solr-tmdb#index-tmdb-movies) for the sample data set, and ideally hoping to use the Solr Admin -> Documents -> File Upload feature (though I see it may be tied to the Extracting Request Handler) to load a Solr-formatted .json file.
[jira] [Commented] (SOLR-14726) Streamline getting started experience
[ https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174434#comment-17174434 ]

Marcus Eagan commented on SOLR-14726:
-------------------------------------

bq. If we could rely on Python and felt free to ask people to install things, I would lean towards HTTPie instead of curl: https://httpie.org/

HTTPie is amazing, but still fringe in terms of adoption; everyone knows cURL. I think we should be able to point people to public datasets that are hosted elsewhere (maybe by one of us) rather than shipping Solr with example data. I'm happy to donate a public data repository for 10 years.
[jira] [Created] (LUCENE-9451) Sort.rewrite doesn't always return this when unchanged
Mike Drob created LUCENE-9451:
------------------------------

             Summary: Sort.rewrite doesn't always return this when unchanged
                 Key: LUCENE-9451
                 URL: https://issues.apache.org/jira/browse/LUCENE-9451
             Project: Lucene - Core
          Issue Type: Bug
          Components: core/search
    Affects Versions: 8.7
            Reporter: Mike Drob
            Assignee: Mike Drob

Sort.rewrite doesn't always return {{this}}, as advertised in its Javadoc, even when the underlying fields are unchanged. This is because the comparison uses reference equality. There are two solutions we can do here: 1) switch from reference equality to object equality, and 2) fix some of the underlying sort fields to not create unnecessary objects.

cc: [~jpountz] [~romseygeek]
[jira] [Commented] (SOLR-14726) Streamline getting started experience
[ https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174412#comment-17174412 ]

Alexandre Rafalovitch commented on SOLR-14726:
----------------------------------------------

If we could rely on Python and felt free to ask people to install things, I would lean towards HTTPie instead of curl: https://httpie.org/
[GitHub] [lucene-solr] HoustonPutman merged pull request #1716: SOLR-14706: Fix support for default autoscaling policy
HoustonPutman merged pull request #1716:
URL: https://github.com/apache/lucene-solr/pull/1716
[jira] [Commented] (SOLR-14706) Upgrading 8.6.0 to 8.6.1 causes collection creation to fail
[ https://issues.apache.org/jira/browse/SOLR-14706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174398#comment-17174398 ]

ASF subversion and git services commented on SOLR-14706:
--------------------------------------------------------

Commit 6e11a1c3f0599f1c918bc69c4f51928d23160e99 in lucene-solr's branch refs/heads/branch_8_6 from Houston Putman
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=6e11a1c ]

SOLR-14706: Fix support for default autoscaling policy (#1716)

> Upgrading 8.6.0 to 8.6.1 causes collection creation to fail
> -----------------------------------------------------------
>
>                 Key: SOLR-14706
>                 URL: https://issues.apache.org/jira/browse/SOLR-14706
>             Project: Solr
>          Issue Type: Bug
>   Security Level: Public (Default Security Level. Issues are Public)
>          Components: AutoScaling
>    Affects Versions: 8.7, 8.6.1
>        Environment: 8.6.1 upgraded from 8.6.0 with more than one node
>            Reporter: Gus Heck
>            Assignee: Houston Putman
>            Priority: Blocker
>             Fix For: 8.6.1
>
> The following steps will reproduce a situation in which collection creation fails with this stack trace:
> {code:java}
> 2020-08-03 12:17:58.617 INFO (OverseerThreadFactory-22-thread-1-processing-n:192.168.2.106:8981_solr) [ ] o.a.s.c.a.c.CreateCollectionCmd Create collection test861
> 2020-08-03 12:17:58.751 ERROR (OverseerThreadFactory-22-thread-1-processing-n:192.168.2.106:8981_solr) [ ] o.a.s.c.a.c.OverseerCollectionMessageHandler Collection: test861 operation: create failed:org.apache.solr.common.SolrException
>     at org.apache.solr.cloud.api.collections.CreateCollectionCmd.call(CreateCollectionCmd.java:347)
>     at org.apache.solr.cloud.api.collections.OverseerCollectionMessageHandler.processMessage(OverseerCollectionMessageHandler.java:264)
>     at org.apache.solr.cloud.OverseerTaskProcessor$Runner.run(OverseerTaskProcessor.java:517)
>     at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:212)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: Only one extra tag supported for the tag cores in {
>   "cores":"#EQUAL",
>   "node":"#ANY",
>   "strict":"false"}
>     at org.apache.solr.client.solrj.cloud.autoscaling.Clause.<init>(Clause.java:122)
>     at org.apache.solr.client.solrj.cloud.autoscaling.Clause.create(Clause.java:235)
>     at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>     at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1374)
>     at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
>     at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
>     at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
>     at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
>     at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
>     at org.apache.solr.client.solrj.cloud.autoscaling.Policy.<init>(Policy.java:144)
>     at org.apache.solr.client.solrj.cloud.autoscaling.AutoScalingConfig.getPolicy(AutoScalingConfig.java:372)
>     at org.apache.solr.cloud.api.collections.Assign.usePolicyFramework(Assign.java:300)
>     at org.apache.solr.cloud.api.collections.Assign.usePolicyFramework(Assign.java:277)
>     at org.apache.solr.cloud.api.collections.Assign$AssignStrategyFactory.create(Assign.java:661)
>     at org.apache.solr.cloud.api.collections.CreateCollectionCmd.buildReplicaPositions(CreateCollectionCmd.java:415)
>     at org.apache.solr.cloud.api.collections.CreateCollectionCmd.call(CreateCollectionCmd.java:192)
>     ... 6 more
> {code}
> Generalized steps:
> # Deploy 8.6.0 with separate data directories, create a collection to prove it's working
> # Download https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-8.6.1-RC1-reva32a3ac4e43f629df71e5ae30a3330be94b095f2/solr/solr-8.6.1.tgz
> # Stop the server on all nodes
> # Replace the 8.6.0 with 8.6.1
> # Start the server
> # Via the admin UI, create a collection
> # Observe failure warning box (with no text), check logs, find above trace
> Or more exactly, here are my actual commands with a checkout of the 8.6.0 tag in the working dir to which cloud.sh was configured:
> # /cloud.sh new -r upgrademe
> # Create collection named test860 via admin UI with _default
> # ./cloud.sh stop
> # cd upgrademe/
> # cp ../8_6_1_RC1/solr-8.6.1.tgz .
> # mv solr-8.6.0-SNAPSHOT old
>
[jira] [Commented] (LUCENE-2458) queryparser makes all CJK queries phrase queries regardless of analyzer
[ https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174394#comment-17174394 ]

Tomoko Uchida commented on LUCENE-2458:
---------------------------------------

Seems like a spam account?

> queryparser makes all CJK queries phrase queries regardless of analyzer
> ------------------------------------------------------------------------
>
>                 Key: LUCENE-2458
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2458
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/queryparser
>            Reporter: Robert Muir
>            Assignee: Robert Muir
>            Priority: Blocker
>             Fix For: 3.1, 4.0-ALPHA
>
>         Attachments: LUCENE-2458.patch, LUCENE-2458.patch, LUCENE-2458.patch, LUCENE-2458.patch
>
> The queryparser automatically makes *ALL* CJK, Thai, Lao, Myanmar, Tibetan, ... queries into phrase queries, even though you didn't ask for one, and there isn't a way to turn this off.
> This completely breaks Lucene for these languages, as it treats all queries like 'grep'.
> Example: if you query for f:abcd with StandardAnalyzer, where a, b, c, d are Chinese characters, you get a phrase query of "a b c d". If you use the CJK analyzer, it's no better: a phrase query of "ab bc cd". And if you use the smart Chinese analyzer, you get a phrase query like "ab cd". But the user didn't ask for one, and they cannot turn it off.
> The reason is that the code to form phrase queries is not internationally appropriate and assumes whitespace tokenization. If more than one token comes out of whitespace-delimited text, it's automatically a phrase query no matter what.
> The proposed patch fixes the core queryparser (with all backwards compat kept) to only form phrase queries when the double-quote operator is used. Implementing subclasses can always extend the QP and auto-generate whatever kind of queries they want that might completely break search for languages they don't care about, but core general-purpose QPs should be language independent.
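The fix described in the issue above (only form a phrase when the double-quote operator is used) can be illustrated with a toy decision function. The string-based "queries" and the OR-combination below are assumptions for illustration only, not the actual QueryParser output:

```java
import java.util.List;

// Toy model of the LUCENE-2458 fix: one whitespace-delimited chunk of CJK
// text analyzes into several tokens. Pre-patch, multiple tokens always
// became a phrase; post-patch, a phrase requires explicit double quotes.
public class PhraseDecisionSketch {
    static String buildQuery(List<String> tokens, boolean userQuoted) {
        if (tokens.size() == 1) {
            return tokens.get(0);
        }
        if (userQuoted) {
            return "\"" + String.join(" ", tokens) + "\""; // explicit phrase
        }
        return String.join(" OR ", tokens); // independent terms
    }

    public static void main(String[] args) {
        // Bigrams such as a CJK analyzer might emit for the chunk "abcd"
        List<String> bigrams = List.of("ab", "bc", "cd");
        System.out.println(buildQuery(bigrams, false));
        System.out.println(buildQuery(bigrams, true));
    }
}
```

Here the unquoted CJK chunk yields independent terms instead of an implicit phrase, which is the behavior change the patch introduced.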
[jira] [Comment Edited] (LUCENE-2458) queryparser makes all CJK queries phrase queries regardless of analyzer
[ https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174386#comment-17174386 ]

Mr. Aleem edited comment on LUCENE-2458 at 8/10/20, 3:26 PM:
-------------------------------------------------------------

Я думаю, [A Place To Download Free Software|https://piratesfile.com/] что лучший способ продвинуться вперед - добавить поле CJK в solr, которое по умолчанию [onthis link|https://piratesfile.com/hitfilm-pro-crack/]имеет противоположное[just simply click on this|https://piratesfile.com/hitfilm-pro-crack]or you can press on linkповедение Ինչպես նկատել է Կոժին, կարծես թե [Security Cameras Best Software|piratesfile.com/security-monitor-pro-crack]this Hit Film Software will (т.е. рассматривает разделенные токены [MAC HitFilm|https://piratesfile.com/hitfilm-pro-crack] полностью отдельные).
[jira] [Comment Edited] (LUCENE-2458) queryparser makes all CJK queries phrase queries regardless of analyzer
[ https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174386#comment-17174386 ] Mr. Aleem edited comment on LUCENE-2458 at 8/10/20, 3:26 PM: - Я думаю, [A Place To Download Free Software|https://piratesfile.com/] что лучший способ продвинуться вперед - добавить поле CJK в solr, которое по умолчанию [onthis link|https://piratesfile.com/hitfilm-pro-crack/]имеет противоположное[just simply click on this|https://piratesfile.com/hitfilm-pro-crack]or you can press on linkповедение Ինչպես նկատել է Կոժին, կարծես թե [Security Cameras Best Software|https://piratesfile.com/security-monitor-pro-crack]this Hit Film Software will (т.е. рассматривает разделенные токены [MAC HitFilm|https://piratesfile.com/hitfilm-pro-crack] полностью отдельные). was (Author: maleem): Я думаю, [A Place To Download Free Software|https://piratesfile.com/] что лучший способ продвинуться вперед - добавить поле CJK в solr, которое по умолчанию [onthis link|https://piratesfile.com/hitfilm-pro-crack/]имеет противоположное[just simply click on this|https://piratesfile.com/hitfilm-pro-crack]or you can press on linkповедение Ինչպես նկատել է Կոժին, կարծես թե [Security Cameras Best Software|piratesfile.com/security-monitor-pro-crack]this Hit Film Software will (т.е. рассматривает разделенные токены [MAC HitFilm|https://piratesfile.com/hitfilm-pro-crack] полностью отдельные). > queryparser makes all CJK queries phrase queries regardless of analyzer > --- > > Key: LUCENE-2458 > URL: https://issues.apache.org/jira/browse/LUCENE-2458 > Project: Lucene - Core > Issue Type: Bug > Components: core/queryparser >Reporter: Robert Muir >Assignee: Robert Muir >Priority: Blocker > Fix For: 3.1, 4.0-ALPHA > > Attachments: LUCENE-2458.patch, LUCENE-2458.patch, LUCENE-2458.patch, > LUCENE-2458.patch > > > The queryparser automatically makes *ALL* CJK, Thai, Lao, Myanmar, Tibetan, > ... 
queries into phrase queries, even though you didn't ask for one, and > there isn't a way to turn this off. > This completely breaks lucene for these languages, as it treats all queries > like 'grep'. > Example: if you query for f:abcd with standardanalyzer, where a,b,c,d are > chinese characters, you get a phrasequery of "a b c d". if you use cjk > analyzer, its no better, its a phrasequery of "ab bc cd", and if you use > smartchinese analyzer, you get a phrasequery like "ab cd". But the user > didn't ask for one, and they cannot turn it off. > The reason is that the code to form phrase queries is not internationally > appropriate and assumes whitespace tokenization. If more than one token comes > out of whitespace delimited text, its automatically a phrase query no matter > what. > The proposed patch fixes the core queryparser (with all backwards compat > kept) to only form phrase queries when the double quote operator is used. > Implementing subclasses can always extend the QP and auto-generate whatever > kind of queries they want that might completely break search for languages > they don't care about, but core general-purpose QPs should be language > independent. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Comment Edited] (LUCENE-2458) queryparser makes all CJK queries phrase queries regardless of analyzer
[ https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174386#comment-17174386 ] Mr. Aleem edited comment on LUCENE-2458 at 8/10/20, 3:19 PM: - Я думаю, [A Place To Download Free Software|https://piratesfile.com] что лучший способ продвинуться вперед - добавить поле CJK в solr, которое по умолчанию https://piratesfile.com/hitfilm-pro-crack/";>yeah this linkимеет противоположноеhttps://piratesfile.com/hitfilm-pro-crack/";>or you can press on linkповедение Ինչպես նկատել է Կոժին, կարծես թեhttps://piratesfile.com/hitfilm-pro-crack/";>this Hit Film Software will (т.е. рассматривает разделенные токены https://piratesfile.com/hitfilm-pro-crack/";>как полностью отдельные). was (Author: maleem): Я думаю, [#https://piratesfile.com/";>A Place to download All PC Software]что лучший способ продвинуться вперед - добавить поле CJK в solr, которое по умолчаниюhttps://piratesfile.com/hitfilm-pro-crack/";>yeah this linkимеет противоположноеhttps://piratesfile.com/hitfilm-pro-crack/";>or you can press on linkповедение Ինչպես նկատել է Կոժին, կարծես թեhttps://piratesfile.com/hitfilm-pro-crack/";>this Hit Film Software will (т.е. рассматривает разделенные токены https://piratesfile.com/hitfilm-pro-crack/";>как полностью отдельные). > queryparser makes all CJK queries phrase queries regardless of analyzer > --- > > Key: LUCENE-2458 > URL: https://issues.apache.org/jira/browse/LUCENE-2458 > Project: Lucene - Core > Issue Type: Bug > Components: core/queryparser >Reporter: Robert Muir >Assignee: Robert Muir >Priority: Blocker > Fix For: 3.1, 4.0-ALPHA > > Attachments: LUCENE-2458.patch, LUCENE-2458.patch, LUCENE-2458.patch, > LUCENE-2458.patch > > > The queryparser automatically makes *ALL* CJK, Thai, Lao, Myanmar, Tibetan, > ... queries into phrase queries, even though you didn't ask for one, and > there isn't a way to turn this off. 
> This completely breaks lucene for these languages, as it treats all queries > like 'grep'. > Example: if you query for f:abcd with standardanalyzer, where a,b,c,d are > chinese characters, you get a phrasequery of "a b c d". if you use cjk > analyzer, its no better, its a phrasequery of "ab bc cd", and if you use > smartchinese analyzer, you get a phrasequery like "ab cd". But the user > didn't ask for one, and they cannot turn it off. > The reason is that the code to form phrase queries is not internationally > appropriate and assumes whitespace tokenization. If more than one token comes > out of whitespace delimited text, its automatically a phrase query no matter > what. > The proposed patch fixes the core queryparser (with all backwards compat > kept) to only form phrase queries when the double quote operator is used. > Implementing subclasses can always extend the QP and auto-generate whatever > kind of queries they want that might completely break search for languages > they don't care about, but core general-purpose QPs should be language > independent. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Comment Edited] (LUCENE-2458) queryparser makes all CJK queries phrase queries regardless of analyzer
[ https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174386#comment-17174386 ] Mr. Aleem edited comment on LUCENE-2458 at 8/10/20, 3:17 PM: - Я думаю, [#https://piratesfile.com/";>A Place to download All PC Software]что лучший способ продвинуться вперед - добавить поле CJK в solr, которое по умолчаниюhttps://piratesfile.com/hitfilm-pro-crack/";>yeah this linkимеет противоположноеhttps://piratesfile.com/hitfilm-pro-crack/";>or you can press on linkповедение Ինչպես նկատել է Կոժին, կարծես թեhttps://piratesfile.com/hitfilm-pro-crack/";>this Hit Film Software will (т.е. рассматривает разделенные токены https://piratesfile.com/hitfilm-pro-crack/";>как полностью отдельные). was (Author: maleem): Я думаю, https://piratesfile.com/";>A Place to download All PC Softwareчто лучший способ продвинуться вперед - добавить поле CJK в solr, которое по умолчаниюhttps://piratesfile.com/hitfilm-pro-crack/";>yeah this linkимеет противоположноеhttps://piratesfile.com/hitfilm-pro-crack/";>or you can press on linkповедение Ինչպես նկատել է Կոժին, կարծես թեhttps://piratesfile.com/hitfilm-pro-crack/";>this Hit Film Software will (т.е. рассматривает разделенные токены https://piratesfile.com/hitfilm-pro-crack/";>как полностью отдельные). > queryparser makes all CJK queries phrase queries regardless of analyzer > --- > > Key: LUCENE-2458 > URL: https://issues.apache.org/jira/browse/LUCENE-2458 > Project: Lucene - Core > Issue Type: Bug > Components: core/queryparser >Reporter: Robert Muir >Assignee: Robert Muir >Priority: Blocker > Fix For: 3.1, 4.0-ALPHA > > Attachments: LUCENE-2458.patch, LUCENE-2458.patch, LUCENE-2458.patch, > LUCENE-2458.patch > > > The queryparser automatically makes *ALL* CJK, Thai, Lao, Myanmar, Tibetan, > ... queries into phrase queries, even though you didn't ask for one, and > there isn't a way to turn this off. 
> This completely breaks lucene for these languages, as it treats all queries > like 'grep'. > Example: if you query for f:abcd with standardanalyzer, where a,b,c,d are > chinese characters, you get a phrasequery of "a b c d". if you use cjk > analyzer, its no better, its a phrasequery of "ab bc cd", and if you use > smartchinese analyzer, you get a phrasequery like "ab cd". But the user > didn't ask for one, and they cannot turn it off. > The reason is that the code to form phrase queries is not internationally > appropriate and assumes whitespace tokenization. If more than one token comes > out of whitespace delimited text, its automatically a phrase query no matter > what. > The proposed patch fixes the core queryparser (with all backwards compat > kept) to only form phrase queries when the double quote operator is used. > Implementing subclasses can always extend the QP and auto-generate whatever > kind of queries they want that might completely break search for languages > they don't care about, but core general-purpose QPs should be language > independent. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-2458) queryparser makes all CJK queries phrase queries regardless of analyzer
[ https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174386#comment-17174386 ] Mr. Aleem commented on LUCENE-2458: --- I think the best way to move forward is to add a CJK field to Solr which by default has the opposite behavior, i.e. treats the split tokens as completely separate. As Koji noticed, it seems so. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Issue Comment Deleted] (LUCENE-2458) queryparser makes all CJK queries phrase queries regardless of analyzer
[ https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mr. Aleem updated LUCENE-2458: -- Comment: was deleted (was: I think the best way to move forward is to add a CJK field to Solr which by default has the opposite behavior, i.e. treats the split tokens as completely separate. As Koji noticed, it seems so.) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (LUCENE-2458) queryparser makes all CJK queries phrase queries regardless of analyzer
[ https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174385#comment-17174385 ] Mr. Aleem edited comment on LUCENE-2458 at 8/10/20, 3:15 PM: - I think the best way to move forward is to add a CJK field to Solr which by default has the opposite behavior, i.e. treats the split tokens as completely separate. As Koji noticed, it seems so. was (Author: maleem): I think the best way to move forward is to add a CJK field to Solr which by default has the opposite behavior, i.e. treats the split tokens as completely separate. As Koji noticed, it seems so. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (LUCENE-2458) queryparser makes all CJK queries phrase queries regardless of analyzer
[ https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174385#comment-17174385 ] Mr. Aleem commented on LUCENE-2458: --- I think the best way to move forward is to add a CJK field to Solr which by default has the opposite behavior, i.e. treats the split tokens as completely separate. As Koji noticed, it seems so. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [lucene-solr] murblanc commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for placement plugin interface
murblanc commented on a change in pull request #1684: URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r467947455 ## File path: solr/core/src/java/org/apache/solr/cluster/placement/PlacementPlanFactory.java ## @@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.cluster.placement;
+
+import java.util.Set;
+
+/**
+ * Allows plugins to create {@link PlacementPlan}s telling the Solr layer where to create replicas following the processing of
+ * a {@link PlacementRequest}. The Solr layer can (and will) check that the {@link PlacementPlan} conforms to the {@link PlacementRequest}
+ * (and if it does not, the requested operation will fail).
+ */
+public interface PlacementPlanFactory {
+  /**
+   * Creates a {@link PlacementPlan} for adding a new collection and its replicas.
+   *
+   * This is in support of {@link org.apache.solr.cloud.api.collections.CreateCollectionCmd}.
+   */
+  PlacementPlan createPlacementPlanNewCollection(CreateNewCollectionPlacementRequest request, String CollectionName, Set<ReplicaPlacement> replicaPlacements);
+
+  /**
+   * Creates a {@link PlacementPlan} for adding replicas to a given shard of an existing collection.
+   *
+   * This is in support (directly or indirectly) of {@link org.apache.solr.cloud.api.collections.AddReplicaCmd},
+   * {@link org.apache.solr.cloud.api.collections.CreateShardCmd}, {@link org.apache.solr.cloud.api.collections.ReplaceNodeCmd},
+   * {@link org.apache.solr.cloud.api.collections.MoveReplicaCmd}, {@link org.apache.solr.cloud.api.collections.SplitShardCmd},
+   * {@link org.apache.solr.cloud.api.collections.RestoreCmd} and {@link org.apache.solr.cloud.api.collections.MigrateCmd}
+   * (as well as of {@link org.apache.solr.cloud.api.collections.CreateCollectionCmd} in the specific case of
+   * {@link org.apache.solr.common.params.CollectionAdminParams#WITH_COLLECTION}, but this should be removed shortly and
+   * the section in parentheses of this comment should be removed when the {@code withCollection} javadoc link appears broken).
+   */
+  PlacementPlan createPlacementPlanAddReplicas(AddReplicasPlacementRequest request, String CollectionName, Set<ReplicaPlacement> replicaPlacements);
+
+  /**
+   * Creates a {@link ReplicaPlacement} needed to be passed to some/all {@link PlacementPlan} factory methods.
+   */
+  ReplicaPlacement createReplicaPlacement(String shardName, Node node, Replica.ReplicaType replicaType);
Review comment: That's how the plugin builds the replica placements it has decided on, in order to pass the set to the appropriate PlacementPlanFactory method. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] murblanc commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for placement plugin interface
murblanc commented on a change in pull request #1684: URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r467945960 ## File path: solr/core/src/java/org/apache/solr/cluster/placement/PlacementPlanFactory.java ## (same PlacementPlanFactory excerpt as in the previous comment; the line under review is:)
+  PlacementPlan createPlacementPlanAddReplicas(AddReplicasPlacementRequest request, String CollectionName, Set<ReplicaPlacement> replicaPlacements);
Review comment: Is the move-replica command picking the destination itself, or is the destination specified in the API call? If the latter, there will be no call to the placement plugin. And if the former, the fact that no files are to be moved is relatively transparent to the plugin: the plugin doesn't do any work itself, it just tells Solr where to put things. Solr code would then either create or move depending on which command it was executing. The only difference is that the placement computation for a move (if there is one) should take into account the lower load on the source node, since replicas will be moved off of it.
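The division of labor murblanc describes (the plugin only decides placements, Solr executes them) can be sketched with simplified stand-ins for the PR's interfaces. These mock types and method names are illustrative only, not the actual SOLR-14613 API:

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified stand-ins for the placement interfaces under discussion;
 *  the real SOLR-14613 API shapes differ. Illustrates that the plugin is
 *  a pure decision function: it returns data, never touches cores. */
public class PlacementSketch {

    /** Pure data describing where one replica should go. */
    static final class ReplicaPlacement {
        final String shard, node, type;
        ReplicaPlacement(String shard, String node, String type) {
            this.shard = shard; this.node = node; this.type = type;
        }
    }

    /** The "plugin" only decides where replicas go. Solr then executes the
     *  returned plan, creating or moving cores depending on which command
     *  (AddReplica, MoveReplica, ...) requested the placement. */
    static List<ReplicaPlacement> computePlacements(List<String> liveNodes, String shard, int replicas) {
        List<ReplicaPlacement> plan = new ArrayList<>();
        for (int i = 0; i < replicas && i < liveNodes.size(); i++) {
            plan.add(new ReplicaPlacement(shard, liveNodes.get(i), "NRT"));
        }
        return plan;
    }

    public static void main(String[] args) {
        List<String> nodes = List.of("node1:8983", "node2:8983", "node3:8983");
        for (ReplicaPlacement p : computePlacements(nodes, "shard1", 2)) {
            System.out.println(p.shard + " -> " + p.node + " (" + p.type + ")");
        }
    }
}
```

A real plugin would pass the computed placements to the factory (e.g. `createPlacementPlanAddReplicas`) rather than returning a bare list; the point here is only that the decision is side-effect free.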
[GitHub] [lucene-solr] murblanc commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for placement plugin interface
murblanc commented on a change in pull request #1684: URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r467943083 ## File path: solr/core/src/java/org/apache/solr/cluster/placement/Cluster.java ## @@ -0,0 +1,53 @@
+package org.apache.solr.cluster.placement;
+
+import java.io.IOException;
+import java.util.Optional;
+import java.util.Set;
+
+/**
+ * A representation of the (initial) cluster state, providing information on which nodes are part of the cluster and a way
+ * to get to more detailed info.
+ *
+ * This instance can also be used as a {@link PropertyValueSource} if {@link PropertyKey}'s need to be specified with
+ * a global cluster target.
+ */
+public interface Cluster extends PropertyValueSource {
+  /**
+   * @return current set of live nodes. Never null, never empty (Solr wouldn't call the plugin if empty
+   * since no useful work could then be done).
+   */
+  Set<Node> getLiveNodes();
+
+  /**
+   * Returns info about the given collection if one exists. Because it is not expected for plugins to request info about
+   * a large number of collections, requests can only be made one by one.
+   *
+   * This is also the reason we do not return a {@link java.util.Map} or {@link Set} of {@link SolrCollection}'s here: it would be
+   * wasteful to fetch all data and fill such a map when plugin code likely needs info about at most one or two collections.
+   */
+  Optional<SolrCollection> getCollection(String collectionName) throws IOException;
+
+  /**
+   * Allows getting all {@link SolrCollection} present in the cluster.
+   *
+   * WARNING: this call might be extremely inefficient on large clusters. Usage is discouraged.
+   */
+  Set<SolrCollection> getAllCollections();
Review comment: I don't think there was a call to get only the collection names (I'm away from my computer for a week or more, so I can't check). The only call on cluster state returned a DocCollection set or map... This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9450) Taxonomy index should use DocValues not StoredFields
[ https://issues.apache.org/jira/browse/LUCENE-9450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174343#comment-17174343 ] Michael McCandless commented on LUCENE-9450: +1, thanks [~gworah]! It is really silly that the taxonomy index uses stored fields today and must do a number of stored-field lookups for each query to resolve taxonomy ordinals back to human-presentable facet labels. At search time, after pulling the {{BinaryDocValues}}, you need to {{.advanceExact}} to that docid, confirm (maybe {{assert}}?) that the method returns {{true}}, then pull the {{.binaryValue()}}. Did you see an exception in tests when you tried your patch? The default {{Codec}} should throw an exception if you try to pull a {{.binaryValue()}} without first calling {{.advanceExact()}}, I hope. Also, at indexing time, it looks like you are no longer indexing the {{StringField}}, but I think you must keep indexing it, changing only {{Field.Store.YES}} to {{Field.Store.NO}}. This field is also stored in the inverted index and is what allows us to do the label -> ordinal lookup, I think. Maybe post some of the failing tests if those two fixes above still don't work? Thanks for tackling this! > Taxonomy index should use DocValues not StoredFields > > > Key: LUCENE-9450 > URL: https://issues.apache.org/jira/browse/LUCENE-9450 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/facet > Affects Versions: 8.5.2 > Reporter: Gautam Worah > Priority: Minor > Labels: performance > Attachments: wip_taxonomy_patch > > > The taxonomy index that maps binning labels to ordinals was created before Lucene added BinaryDocValues. > I've attached a WIP patch (does not pass tests currently) > Issue suggested by [~mikemccand] -- This message was sent by Atlassian Jira (v8.3.4#803005)
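The search-time calling pattern described here (call advanceExact first, check the returned boolean, then call binaryValue) can be shown against a minimal mock of the BinaryDocValues contract. The mock class below is hypothetical; the real Lucene class is a doc-values iterator with a larger API:

```java
import java.util.Map;

/** Minimal mock of Lucene's BinaryDocValues contract, only to illustrate
 *  the advanceExact-then-binaryValue calling pattern described above. */
public class TaxonomyLookupSketch {

    static class MockBinaryDocValues {
        private final Map<Integer, String> values; // ordinal doc -> facet label
        private int current = -1;

        MockBinaryDocValues(Map<Integer, String> values) { this.values = values; }

        /** True if the target doc has a value; must be called before binaryValue(). */
        boolean advanceExact(int target) {
            current = target;
            return values.containsKey(target);
        }

        /** Valid only after advanceExact() returned true, like the real codec. */
        String binaryValue() {
            String v = values.get(current);
            if (v == null) throw new IllegalStateException("advanceExact() did not position on a value");
            return v;
        }
    }

    /** Resolve a taxonomy ordinal to its facet label via docvalues instead of
     *  a stored-fields lookup. */
    static String labelForOrdinal(MockBinaryDocValues dv, int ordinal) {
        boolean found = dv.advanceExact(ordinal);
        assert found : "every taxonomy ordinal should have a label";
        return dv.binaryValue();
    }

    public static void main(String[] args) {
        MockBinaryDocValues dv =
            new MockBinaryDocValues(Map.of(0, "Music", 1, "Music/Rock"));
        System.out.println(labelForOrdinal(dv, 1));
    }
}
```

On the indexing side, the corresponding change would be to keep the inverted field for the label-to-ordinal direction (with storage turned off) and add a binary doc-values field for the ordinal-to-label direction, per the comment above.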
[jira] [Commented] (SOLR-13381) Unexpected docvalues type SORTED_NUMERIC Exception when grouping by a PointField facet
[ https://issues.apache.org/jira/browse/SOLR-13381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174336#comment-17174336 ] Cassandra Targett commented on SOLR-13381: -- The exact strategies your addition mentions in passing are discussed in greater detail in the section on reindexing strategies, https://lucene.apache.org/solr/guide/8_6/reindexing.html#reindexing-strategies. My feeling is this note is repetition that doesn't add value. I also don't like that it effectively becomes the 2nd sentence on the page, and it's overly general. Reindexing doesn't always require removing all documents first. Sometimes you _can_ just update the existing documents, but the page says a couple of times that dropping the index is the preferred approach. The only thing I could think of adding, really, is a sentence in the section on reindexing strategies repeating the point that if you change field types, you must reindex *from scratch*. For the folks who read the page and didn't understand this from the section on reindexing: could you perhaps share your thoughts on how we could have been clearer? I'm just skeptical that this one sentence at the top of the page is going to bring the point home if all the other discussion about it didn't. > Unexpected docvalues type SORTED_NUMERIC Exception when grouping by a > PointField facet > -- > > Key: SOLR-13381 > URL: https://issues.apache.org/jira/browse/SOLR-13381 > Project: Solr > Issue Type: Bug > Components: faceting > Affects Versions: 7.0, 7.6, 7.7, 7.7.1 > Environment: solr, solrcloud > Reporter: Zhu JiaJun > Assignee: Erick Erickson > Priority: Major > Attachments: SOLR-13381.patch, SOLR-13381.patch > > > Hey, > I got an "Unexpected docvalues type SORTED_NUMERIC" exception when I perform a group facet on an IntPointField. Debugging into the source code, the cause is that internally the docvalue type for PointField is "NUMERIC" (single value) or "SORTED_NUMERIC" (multi value), while the TermGroupFacetCollector class requires that the facet field have a "SORTED" or "SORTED_SET" docvalue type: [https://github.com/apache/lucene-solr/blob/2480b74887eff01f729d62a57b415d772f947c91/lucene/grouping/src/java/org/apache/lucene/search/grouping/TermGroupFacetCollector.java#L313] > When I change the schema for all int fields to TrieIntField, the group facet then works, since internally the docvalue type for TrieField is SORTED (single value) or SORTED_SET (multi value). > Given that "TrieField" is deprecated in Solr 7, please help with this grouping facet issue for PointField. I also commented on this issue in SOLR-7495. > In addition, all places with "${solr.tests.IntegerFieldType}" in the unit test files seem to be using "TrieIntField"; if changed to "IntPointField", some unit tests will fail, for example: [https://github.com/apache/lucene-solr/blob/3de0b3671998cc9bc723d10f1b31ce48cbd4fa64/solr/core/src/test/org/apache/solr/request/SimpleFacetsTest.java#L417] -- This message was sent by Atlassian Jira (v8.3.4#803005)
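The workaround described in the report (switching int fields back to a Trie type so group faceting sees SORTED/SORTED_SET docvalues, per the docvalue types the report lists) would look roughly like this in schema.xml; the field and type names here are illustrative:

```xml
<!-- Illustrative schema.xml fragment: per the report above, group.facet on an
     IntPointField fails with "Unexpected docvalues type SORTED_NUMERIC",
     while the deprecated TrieIntField produces SORTED/SORTED_SET docvalues
     that TermGroupFacetCollector accepts. -->
<fieldType name="tint" class="solr.TrieIntField" precisionStep="8" docValues="true"/>
<fieldType name="pint" class="solr.IntPointField" docValues="true"/>

<!-- Works as a group.facet field: -->
<field name="popularity" type="tint" indexed="true" stored="true"/>

<!-- Triggers the exception when used with group.facet=true: -->
<!-- <field name="popularity" type="pint" indexed="true" stored="true"/> -->
```

Changing a field's type this way requires reindexing from scratch, which is the point debated in the comment above.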
[jira] [Updated] (SOLR-13381) Unexpected docvalues type SORTED_NUMERIC Exception when grouping by a PointField facet
[ https://issues.apache.org/jira/browse/SOLR-13381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson updated SOLR-13381: -- Attachment: SOLR-13381.patch Status: Reopened (was: Reopened) [~ctargett] (or anyone) WDYT about the wordsmithing here? I verified that deleting all docs and committing actually does remove all the segments. I didn't want to get into a long explanation about segments, so I just left the rather cryptic comment on segments. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (SOLR-13381) Unexpected docvalues type SORTED_NUMERIC Exception when grouping by a PointField facet
[ https://issues.apache.org/jira/browse/SOLR-13381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson reassigned SOLR-13381: - Assignee: Erick Erickson > Unexpected docvalues type SORTED_NUMERIC Exception when grouping by a > PointField facet > -- > > Key: SOLR-13381 > URL: https://issues.apache.org/jira/browse/SOLR-13381 > Project: Solr > Issue Type: Bug > Components: faceting >Affects Versions: 7.0, 7.6, 7.7, 7.7.1 > Environment: solr, solrcloud >Reporter: Zhu JiaJun >Assignee: Erick Erickson >Priority: Major > Attachments: SOLR-13381.patch, SOLR-13381.patch > > > Hey, > I got an "Unexpected docvalues type SORTED_NUMERIC" exception when I perform > group facet on an IntPointField. Debugging into the source code, the cause is > that internally the docvalue type for PointField is "NUMERIC" (single value) > or "SORTED_NUMERIC" (multi value), while the TermGroupFacetCollector class > requires the facet field must have a "SORTED" or "SOTRTED_SET" docvalue type: > [https://github.com/apache/lucene-solr/blob/2480b74887eff01f729d62a57b415d772f947c91/lucene/grouping/src/java/org/apache/lucene/search/grouping/TermGroupFacetCollector.java#L313] > > When I change schema for all int field to TrieIntField, the group facet then > work. Since internally the docvalue type for TrieField is SORTED (single > value) or SORTED_SET (multi value). > Regarding that the "TrieField" is depreciated in Solr7, please help on this > grouping facet issue for PointField. I also commented this issue in SOLR-7495. 
> > In addition, all places of "${solr.tests.IntegerFieldType}" in the unit test > files seem to be using "TrieIntField"; if changed to "IntPointField", > some unit tests will fail, for example: > [https://github.com/apache/lucene-solr/blob/3de0b3671998cc9bc723d10f1b31ce48cbd4fa64/solr/core/src/test/org/apache/solr/request/SimpleFacetsTest.java#L417] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
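The type mismatch in the report above can be modeled in a few lines. The following is an illustrative, self-contained sketch — not Lucene's actual code; the `isGroupFacetCompatible` and `docValuesFor` helpers are hypothetical — showing why group faceting accepts Trie-backed fields (SORTED/SORTED_SET docvalues) but rejects Point-backed ones (NUMERIC/SORTED_NUMERIC):

```java
// Illustrative model of the docvalues-type check behind the reported
// exception. The enum mirrors Lucene's DocValuesType constants, but the
// helper methods here are hypothetical, not Lucene's actual API.
public class DocValuesCheckSketch {

  public enum DocValuesType { NUMERIC, SORTED_NUMERIC, SORTED, SORTED_SET }

  // Group faceting walks term ordinals, so it needs SORTED or SORTED_SET.
  public static boolean isGroupFacetCompatible(DocValuesType type) {
    return type == DocValuesType.SORTED || type == DocValuesType.SORTED_SET;
  }

  // Point-based fields produce NUMERIC / SORTED_NUMERIC docvalues,
  // while Trie-based fields produce SORTED / SORTED_SET.
  public static DocValuesType docValuesFor(String fieldClass, boolean multiValued) {
    if (fieldClass.startsWith("Trie")) {
      return multiValued ? DocValuesType.SORTED_SET : DocValuesType.SORTED;
    }
    return multiValued ? DocValuesType.SORTED_NUMERIC : DocValuesType.NUMERIC;
  }

  public static void main(String[] args) {
    // TrieIntField groups fine; IntPointField trips the type check.
    System.out.println(isGroupFacetCompatible(docValuesFor("TrieIntField", false)));  // true
    System.out.println(isGroupFacetCompatible(docValuesFor("IntPointField", false))); // false
  }
}
```

This is why switching the schema back to TrieIntField makes the group facet work: only the docvalues representation changes, not the stored values.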
[jira] [Reopened] (SOLR-13381) Unexpected docvalues type SORTED_NUMERIC Exception when grouping by a PointField facet
[ https://issues.apache.org/jira/browse/SOLR-13381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson reopened SOLR-13381: --- Good point, I'll add a note to the docs. > Unexpected docvalues type SORTED_NUMERIC Exception when grouping by a > PointField facet > -- > > Key: SOLR-13381 > URL: https://issues.apache.org/jira/browse/SOLR-13381 > Project: Solr > Issue Type: Bug > Components: faceting >Affects Versions: 7.0, 7.6, 7.7, 7.7.1 > Environment: solr, solrcloud >Reporter: Zhu JiaJun >Priority: Major > Attachments: SOLR-13381.patch > > > Hey, > I got an "Unexpected docvalues type SORTED_NUMERIC" exception when I perform > a group facet on an IntPointField. Debugging into the source code, the cause is > that internally the docvalue type for PointField is "NUMERIC" (single value) > or "SORTED_NUMERIC" (multi value), while the TermGroupFacetCollector class > requires that the facet field have a "SORTED" or "SORTED_SET" docvalue type: > [https://github.com/apache/lucene-solr/blob/2480b74887eff01f729d62a57b415d772f947c91/lucene/grouping/src/java/org/apache/lucene/search/grouping/TermGroupFacetCollector.java#L313] > > When I change the schema for all int fields to TrieIntField, the group facet then > works, since internally the docvalue type for TrieField is SORTED (single > value) or SORTED_SET (multi value). > Given that "TrieField" is deprecated in Solr 7, please help with this > grouping facet issue for PointField. I also commented on this issue in SOLR-7495. 
> > In addition, all places of "${solr.tests.IntegerFieldType}" in the unit test > files seem to be using "TrieIntField"; if changed to "IntPointField", > some unit tests will fail, for example: > [https://github.com/apache/lucene-solr/blob/3de0b3671998cc9bc723d10f1b31ce48cbd4fa64/solr/core/src/test/org/apache/solr/request/SimpleFacetsTest.java#L417] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14691) Metrics reporting should avoid creating objects
[ https://issues.apache.org/jira/browse/SOLR-14691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul updated SOLR-14691: -- Priority: Blocker (was: Major) > Metrics reporting should avoid creating objects > --- > > Key: SOLR-14691 > URL: https://issues.apache.org/jira/browse/SOLR-14691 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: metrics >Reporter: Andrzej Bialecki >Priority: Blocker > Fix For: 8.7 > > > {{MetricUtils}} unnecessarily creates a lot of short-lived objects (maps and > lists). This affects GC, especially since metrics are frequently polled by > clients. We should refactor it to use {{MapWriter}} as much as possible. > Alternatively we could provide our wrappers or subclasses of Codahale metrics > that implement {{MapWriter}}, then a lot of complexity in {{MetricUtils}} > wouldn't be needed at all. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
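The MapWriter idea from the issue above can be sketched in miniature. This is an illustrative stand-in, not Solr's real `MapWriter` interface or `MetricUtils` code: instead of materializing a `Map` (and lists) on every metrics poll, each metric streams its entries straight into a consumer, so frequent polling allocates no intermediate collections.

```java
import java.util.function.BiConsumer;

// Toy model of the "avoid creating objects" refactor: metrics push their
// entries to a consumer rather than building short-lived maps per poll.
public class MetricsMapWriterSketch {

  // Minimal stand-in for Solr's MapWriter: push entries, don't build a Map.
  public interface MapWriter {
    void writeMap(BiConsumer<String, Object> entryWriter);
  }

  // A counter that reports itself without allocating a temporary Map.
  public static final class CounterMetric implements MapWriter {
    private long count;
    public void inc() { count++; }
    @Override public void writeMap(BiConsumer<String, Object> entryWriter) {
      entryWriter.accept("count", count);
    }
  }

  // A serializer passes a consumer that appends straight to the output
  // buffer (single-entry demo; a real writer would handle commas, quoting).
  public static String toJson(MapWriter mw) {
    StringBuilder sb = new StringBuilder("{");
    mw.writeMap((k, v) -> sb.append('"').append(k).append("\":").append(v));
    return sb.append('}').toString();
  }

  public static void main(String[] args) {
    CounterMetric m = new CounterMetric();
    m.inc();
    m.inc();
    System.out.println(toJson(m)); // {"count":2}
  }
}
```

The GC win comes from the consumer writing directly to the response stream: nothing per-poll survives beyond the stack.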
[jira] [Updated] (SOLR-14691) Metrics reporting should avoid creating objects
[ https://issues.apache.org/jira/browse/SOLR-14691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul updated SOLR-14691: -- Fix Version/s: 8.7 > Metrics reporting should avoid creating objects > --- > > Key: SOLR-14691 > URL: https://issues.apache.org/jira/browse/SOLR-14691 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: metrics >Reporter: Andrzej Bialecki >Priority: Major > Fix For: 8.7 > > > {{MetricUtils}} unnecessarily creates a lot of short-lived objects (maps and > lists). This affects GC, especially since metrics are frequently polled by > clients. We should refactor it to use {{MapWriter}} as much as possible. > Alternatively we could provide our wrappers or subclasses of Codahale metrics > that implement {{MapWriter}}, then a lot of complexity in {{MetricUtils}} > wouldn't be needed at all. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] noblepaul commented on pull request #1694: SOLR-14680: Provide simple interfaces to our cloud classes (only API)
noblepaul commented on pull request #1694: URL: https://github.com/apache/lucene-solr/pull/1694#issuecomment-671305266 I intend to merge this soon This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] noblepaul commented on a change in pull request #1694: SOLR-14680: Provide simple interfaces to our cloud classes (only API)
noblepaul commented on a change in pull request #1694: URL: https://github.com/apache/lucene-solr/pull/1694#discussion_r467845512 ## File path: solr/solrj/src/java/org/apache/solr/cluster/api/SolrNode.java ## @@ -0,0 +1,39 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.cluster.api; + +import org.apache.solr.common.util.SimpleMap; + +/** A read only view of a Solr node */ +public interface SolrNode { + + /** The node name */ + String name(); + + /**Base http url for this node + * + * @param isV2 if true gives the /api endpoint , else /solr endpoint + */ + String baseUrl(boolean isV2); Review comment: done This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] s1monw commented on pull request #1623: LUCENE-8962: Merge segments on getReader
s1monw commented on pull request #1623: URL: https://github.com/apache/lucene-solr/pull/1623#issuecomment-671301636 @mikemccand I now understand why holding the _flushLock_ is illegal here. The problem is again the lock ordering in combination with the _commitLock_. One option that we have here is to remove the _flushLock_ altogether and replace its usage with the _commitLock_. I guess we need to find a better or new name for it, but I don't see where having two different locks buys us much, since they are both really just used to sync on administration of the IW. I personally also don't see why it would buy us anything in terms of concurrency. WDYT? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
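The hazard being discussed is classic lock-ordering: if one path takes the commit lock then the flush lock while another takes them in the opposite order, the two threads can deadlock. The sketch below is a toy illustration of the proposed consolidation — none of this is IndexWriter's actual code, and all names are hypothetical — where both administrative operations serialize on a single lock, so there is no acquisition order left to get wrong.

```java
// Toy model of collapsing two IW administration locks into one: with a
// single lock there is no lock ordering, hence no ordering deadlock.
public class LockOrderingSketch {
  private final Object stateLock = new Object(); // replaces flushLock + commitLock

  private int flushes;
  private int commits;

  // Both operations serialize on the same lock. In this toy model a
  // commit implies a flush, mirroring "sync on administration of the IW".
  public void flush()  { synchronized (stateLock) { flushes++; } }
  public void commit() { synchronized (stateLock) { flushes++; commits++; } }

  public int flushes() { synchronized (stateLock) { return flushes; } }
  public int commits() { synchronized (stateLock) { return commits; } }

  public static void main(String[] args) {
    LockOrderingSketch iw = new LockOrderingSketch();
    iw.flush();
    iw.commit();
    System.out.println(iw.flushes() + " flushes, " + iw.commits() + " commit");
  }
}
```

The concurrency cost is that flushes and commits can no longer overlap, which is the trade-off the comment argues is acceptable since the two locks were never really exercised concurrently anyway.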
[jira] [Commented] (SOLR-14641) PeerSync, remove canHandleVersionRanges check
[ https://issues.apache.org/jira/browse/SOLR-14641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174251#comment-17174251 ] Cao Manh Dat commented on SOLR-14641: - bq. I disagree. In general, whoever wishes to introduce a change should own the performance testing, no matter who actually does it. Others can volunteer, but ultimate obligation should remain with the committer introducing the change. I said that because I feel that you did not even take a look at the commit; if you did, you would see that a perf run here is not necessary. > PeerSync, remove canHandleVersionRanges check > - > > Key: SOLR-14641 > URL: https://issues.apache.org/jira/browse/SOLR-14641 > Project: Solr > Issue Type: Improvement >Reporter: Cao Manh Dat >Assignee: Cao Manh Dat >Priority: Major > Fix For: 8.7 > > Time Spent: 20m > Remaining Estimate: 0h > > SOLR-9207 introduced PeerSync with update ranges, which was committed in 6.2 and > 7.0. To maintain backward compatibility, at the time we introduced an endpoint > in RealTimeGetComponent to check whether a node supports that feature or not. > It served its purpose well, and it should be removed to reduce complexity and > save a request-response trip for asking that. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14684) CloudExitableDirectoryReaderTest failing about 25% of the time
[ https://issues.apache.org/jira/browse/SOLR-14684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174249#comment-17174249 ] Erick Erickson commented on SOLR-14684: --- [~caomanhdat] I was able to run 1,000 iterations with the patch overnight and got no failures, so this looks good! > CloudExitableDirectoryReaderTest failing about 25% of the time > -- > > Key: SOLR-14684 > URL: https://issues.apache.org/jira/browse/SOLR-14684 > Project: Solr > Issue Type: Test > Security Level: Public(Default Security Level. Issues are Public) > Components: Tests >Affects Versions: master (9.0) >Reporter: Erick Erickson >Priority: Major > Attachments: stdout > > Time Spent: 10m > Remaining Estimate: 0h > > If I beast this on my local machine, it fails (non-reproducibly, of course) > about 1/4 of the time. Log attached. The test itself hasn't changed in 11 > months or so. > It looks like occasionally the calls throw an error rather than return > partial results with a message: "Time allowed to handle this request > exceeded:[]". > It's been failing very intermittently for a couple of years, but the failure > rate really picked up in the last couple of weeks. IDK whether the failures > prior to the last couple of weeks have the same root cause. > I'll do some spelunking to see if I can pinpoint the commit that made this > happen, but it'll take a while. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14641) PeerSync, remove canHandleVersionRanges check
[ https://issues.apache.org/jira/browse/SOLR-14641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174244#comment-17174244 ] Ishan Chattopadhyaya commented on SOLR-14641: - bq. But that quite non-sense to me from the point of who did the commit to do performance test I disagree. In general, whoever wishes to introduce a change should own the performance testing, no matter who actually does it. Others can volunteer, but ultimate obligation should remain with the committer introducing the change. bq. this change just basically remove deprecated code rather than optimization. If you are *confident* this is just dead code removal, please feel free to go ahead _with this one_. Thanks for the clarification! > PeerSync, remove canHandleVersionRanges check > - > > Key: SOLR-14641 > URL: https://issues.apache.org/jira/browse/SOLR-14641 > Project: Solr > Issue Type: Improvement >Reporter: Cao Manh Dat >Assignee: Cao Manh Dat >Priority: Major > Fix For: 8.7 > > Time Spent: 20m > Remaining Estimate: 0h > > SOLR-9207 introduces PeerSync with updates range which committed in 6.2 and > 7.0. To maintain backward compatibility at the time we introduce an endpoint > in RealTimeGetComponent to check whether a node support that feature or not. > It served well its purpose and it should be removed to reduce complexity and > a request-response trip for asking that. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14641) PeerSync, remove canHandleVersionRanges check
[ https://issues.apache.org/jira/browse/SOLR-14641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174241#comment-17174241 ] Cao Manh Dat commented on SOLR-14641: - But that makes little sense to me from the point of view of whoever did the commit: doing a performance test for this one, since this change just basically removes deprecated code rather than being an optimization. Basically what we used to do here is * ask nodes whether they support versionRanges or not * if true (this is the default value since 7.0), go with versionRanges handling (instead of concrete versions). The change made by this issue is * always go with versionRanges, since we know that all other nodes support it, so it is quite wasteful to ask first. So if there is any performance regression, it already happened a long time ago. Anyway, I'm ok with reverting the change and letting your benchmark work finish if that makes things easier. > PeerSync, remove canHandleVersionRanges check > - > > Key: SOLR-14641 > URL: https://issues.apache.org/jira/browse/SOLR-14641 > Project: Solr > Issue Type: Improvement >Reporter: Cao Manh Dat >Assignee: Cao Manh Dat >Priority: Major > Fix For: 8.7 > > Time Spent: 20m > Remaining Estimate: 0h > > SOLR-9207 introduced PeerSync with update ranges, which was committed in 6.2 and > 7.0. To maintain backward compatibility, at the time we introduced an endpoint > in RealTimeGetComponent to check whether a node supports that feature or not. > It served its purpose well, and it should be removed to reduce complexity and > save a request-response trip for asking that. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] sigram commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for placement plugin interface
sigram commented on a change in pull request #1684: URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r467810904 ## File path: solr/core/src/java/org/apache/solr/cluster/placement/PropertyKeyFactory.java ## @@ -0,0 +1,61 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.cluster.placement; + +/** + * Factory used by the plugin to create property keys to request property values from Solr. + * + * Building of a {@link PropertyKey} requires specifying the target (context) from which the value of that key should be + * obtained. This is done by specifying the appropriate {@link PropertyValueSource}. + * For clarity, when only a single type of target is acceptable, the corresponding subtype of {@link PropertyValueSource} is used instead + * (for example {@link Node}). + */ +public interface PropertyKeyFactory { + /** + * Returns a property key to request the number of cores on a {@link Node}. + */ + PropertyKey createCoreCountKey(Node node); + + /** + * Returns a property key to request disk related info on a {@link Node}. + */ + PropertyKey createDiskInfoKey(Node node); + + /** + * Returns a property key to request the value of a system property on a {@link Node}. 
+ * @param systemPropertyName the name of the system property to retrieve. + */ + PropertyKey createSystemPropertyKey(Node node, String systemPropertyName); + + /** + * Returns a property key to request the value of a metric. + * + * Not all metrics make sense everywhere, but metrics can be applied to different objects. For example + * SEARCHER.searcher.indexCommitSize would make sense for a given replica of a given shard of a given collection, + * and possibly in other contexts. + * + * @param metricSource The registry of the metric. For example a specific {@link Replica}. + * @param metricName for example SEARCHER.searcher.indexCommitSize. + */ + PropertyKey createMetricKey(PropertyValueSource metricSource, String metricName); Review comment: One node usually hosts many replicas. Each of these replicas has a unique registry name, in the form of `solr.core.`, so we could build PropertyKey from Replica because all components of the full metrics name are known. This is not the case with `node`, `jvm` and `jetty` - I think we need to explicitly specify the registry name in these cases. (Edit: or implement a PropertyValueSource that is a facade for registry name, to keep the API here consistent) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
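The "facade for a registry name" idea floated at the end of the comment above can be sketched briefly. Everything below is hypothetical — these class names are not part of the actual proposal — but it shows how node-level registries (`node`, `jvm`, `jetty`) could be wrapped in a `PropertyValueSource` so that metric keys are built uniformly, while replica-scoped sources derive their registry name from the core (the comment notes core registries share the `solr.core.` prefix):

```java
// Hypothetical sketch of a registry-name facade keeping the PropertyKey
// API consistent across replica-scoped and node-scoped metric sources.
public class MetricRegistrySketch {

  // Minimal stand-in for the PropertyValueSource in the proposal.
  public interface PropertyValueSource {
    String metricRegistryName();
  }

  // Facade over a bare registry name, for node/jvm/jetty metrics where
  // the registry must be named explicitly.
  public static final class RegistryFacade implements PropertyValueSource {
    private final String name;
    public RegistryFacade(String name) { this.name = name; }
    @Override public String metricRegistryName() { return name; }
  }

  // A replica-scoped source derives its registry name from its core name;
  // per the comment, core registries share the "solr.core." prefix.
  public static final class ReplicaSource implements PropertyValueSource {
    private final String coreName;
    public ReplicaSource(String coreName) { this.coreName = coreName; }
    @Override public String metricRegistryName() { return "solr.core." + coreName; }
  }

  public static void main(String[] args) {
    System.out.println(new RegistryFacade("solr.jvm").metricRegistryName());
    System.out.println(new ReplicaSource("techproducts").metricRegistryName());
  }
}
```

With such a facade, `createMetricKey(source, metricName)` would take either kind of source without special-casing node-level registries.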
[jira] [Comment Edited] (SOLR-14641) PeerSync, remove canHandleVersionRanges check
[ https://issues.apache.org/jira/browse/SOLR-14641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174239#comment-17174239 ] Ishan Chattopadhyaya edited comment on SOLR-14641 at 8/10/20, 10:41 AM: bq. It doesn't make sense to asking everyone do a dedicated performance test before and after their commits. I am not requesting performance tests for every commit. But for those that affect the default code path for all/most users. PeerSync is enabled by default, as an example. bq. I believe the right way to ensure performance is coming up with something like lucene bench, so every downgrade and upgrade will be recorded and can be watched (per multiple commits). I totally agree, and that is where I'm going with https://github.com/thesearchstack/solr-bench. However, in the absence of that, there is absolutely no reason why we shouldn't perform performance testing manually before subjecting our users to the changes. I don't want us to repeat what happened with SOLR-14665 (where the commit happened without any performance testing, the issue was released and regression was caught only after the release. And what is worse is that a bugfix release has still not happened for that). was (Author: ichattopadhyaya): bq. It doesn't make sense to asking everyone do a dedicated performance test before and after their commits. I am not requesting performance tests for every commit. But for those that affect the default code path for all/most users. PeerSync is enabled by default, as an example. bq. I believe the right way to ensure performance is coming up with something like lucene bench, so every downgrade and upgrade will be recorded and can be watched (per multiple commits). I totally agree, and that is where I'm going with https://github.com/thesearchstack/solr-bench. However, in the absence of that, there is absolutely no reason why we shouldn't perform performance testing manually before subjecting our users to the changes. 
I don't want us to repeat what happened with SOLR-14665. > PeerSync, remove canHandleVersionRanges check > - > > Key: SOLR-14641 > URL: https://issues.apache.org/jira/browse/SOLR-14641 > Project: Solr > Issue Type: Improvement >Reporter: Cao Manh Dat >Assignee: Cao Manh Dat >Priority: Major > Fix For: 8.7 > > Time Spent: 20m > Remaining Estimate: 0h > > SOLR-9207 introduces PeerSync with updates range which committed in 6.2 and > 7.0. To maintain backward compatibility at the time we introduce an endpoint > in RealTimeGetComponent to check whether a node support that feature or not. > It served well its purpose and it should be removed to reduce complexity and > a request-response trip for asking that. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14641) PeerSync, remove canHandleVersionRanges check
[ https://issues.apache.org/jira/browse/SOLR-14641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174239#comment-17174239 ] Ishan Chattopadhyaya commented on SOLR-14641: - bq. It doesn't make sense to asking everyone do a dedicated performance test before and after their commits. I am not requesting performance tests for every commit. But for those that affect the default code path for all/most users. PeerSync is enabled by default, as an example. bq. I believe the right way to ensure performance is coming up with something like lucene bench, so every downgrade and upgrade will be recorded and can be watched (per multiple commits). I totally agree, and that is where I'm going with https://github.com/thesearchstack/solr-bench. However, in the absence of that, there is absolutely no reason why we shouldn't perform performance testing manually before subjecting our users to the changes. I don't want us to repeat what happened with SOLR-14665. > PeerSync, remove canHandleVersionRanges check > - > > Key: SOLR-14641 > URL: https://issues.apache.org/jira/browse/SOLR-14641 > Project: Solr > Issue Type: Improvement >Reporter: Cao Manh Dat >Assignee: Cao Manh Dat >Priority: Major > Fix For: 8.7 > > Time Spent: 20m > Remaining Estimate: 0h > > SOLR-9207 introduces PeerSync with updates range which committed in 6.2 and > 7.0. To maintain backward compatibility at the time we introduce an endpoint > in RealTimeGetComponent to check whether a node support that feature or not. > It served well its purpose and it should be removed to reduce complexity and > a request-response trip for asking that. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14354) HttpShardHandler send requests in async
[ https://issues.apache.org/jira/browse/SOLR-14354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174236#comment-17174236 ] Cao Manh Dat commented on SOLR-14354: - Ok then I will try my best to run it. > HttpShardHandler send requests in async > --- > > Key: SOLR-14354 > URL: https://issues.apache.org/jira/browse/SOLR-14354 > Project: Solr > Issue Type: Improvement >Reporter: Cao Manh Dat >Assignee: Cao Manh Dat >Priority: Major > Fix For: master (9.0), 8.7 > > Attachments: image-2020-03-23-10-04-08-399.png, > image-2020-03-23-10-09-10-221.png, image-2020-03-23-10-12-00-661.png > > Time Spent: 4h > Remaining Estimate: 0h > > h2. 1. Current approach (problem) of Solr > Below is a diagram describing how a request is currently handled. > !image-2020-03-23-10-04-08-399.png! > The main thread that handles the search request will submit n requests (n > equals the number of shards) to an executor. So each request will correspond > to a thread; after sending a request, that thread basically does nothing but > wait for the response from the other side. That thread will be swapped out and the CPU > will try to handle another thread (this is called a context switch; the CPU will > save the context of the current thread and switch to another one). When some > data (not all) comes back, that thread will be called to parse the data, > then it will wait until more data comes back. So there will be lots of context > switching in the CPU. That is a quite inefficient use of threads. Basically we > want fewer threads, and most of them must be busy all the time, because threads > are not free, and neither is context switching. That is the main idea behind > things like executors. > h2. 2. Async call of Jetty HttpClient > Jetty HttpClient offers an async API like this. > {code:java} > httpClient.newRequest("http://domain.com/path") > // Add request hooks > .onRequestQueued(request -> { ... }) > .onRequestBegin(request -> { ... 
}) > // Add response hooks > .onResponseBegin(response -> { ... }) > .onResponseHeaders(response -> { ... }) > .onResponseContent((response, buffer) -> { ... }) > .send(result -> { ... }); {code} > Therefore after calling {{send()}} the thread will return immediately without > blocking. Then when the client receives the headers from the other side, it will > call the {{onHeaders()}} listeners. When the client receives some {{byte[]}} (not > the whole response), it will call the {{onContent(buffer)}} listeners. > When everything is finished, it will call the {{onComplete}} listeners. One main > thing to notice here is that all listeners should finish quickly; if a > listener blocks, all further data of that request won’t be handled until the > listener finishes. > h2. 3. Solution 1: Sending requests async but spinning up one thread per response > Jetty HttpClient already provides several listeners, one of which is > InputStreamResponseListener. This is how it is used > {code:java} > InputStreamResponseListener listener = new InputStreamResponseListener(); > client.newRequest(...).send(listener); > // Wait for the response headers to arrive > Response response = listener.get(5, TimeUnit.SECONDS); > if (response.getStatus() == 200) { > // Obtain the input stream on the response content > try (InputStream input = listener.getInputStream()) { > // Read the response content > } > } {code} > In this case, there will be two threads: > * one thread trying to read the response content from the InputStream > * one thread (a short-lived task) feeding content to the above > InputStream whenever some byte[] is available. Note that if this thread is > unable to feed data into the InputStream, this thread will wait. 
> By using this one, the model of HttpShardHandler can be rewritten into > something like this > {code:java} > handler.sendReq(req, (is) -> { > executor.submit(() -> > try (is) { > // Read the content from InputStream > } > ) > }) {code} > The first diagram will be changed into this > !image-2020-03-23-10-09-10-221.png! > Notice that although “sending req to shard1” is wide, it won’t take a long time, > since sending a request is a very quick operation. With this approach, handling > threads won’t be spun up until the first bytes are sent back. Notice that in this > approach we still have active threads waiting for more data from the InputStream. > h2. 4. Solution 2: Buffering data and handling it inside Jetty’s thread. > Jetty has another listener called BufferingResponseListener. This is how it > is used > {code:java} > client.newRequest(...).send(new BufferingResponseListener() { > public void onComplete(Result result) { > try { > byte[] response = getContent(); > //handling response
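The async fan-out described in the issue above can be illustrated without Jetty at all. The sketch below uses plain `java.util.concurrent` (not Jetty's HttpClient, and not Solr's actual HttpShardHandler code; `sendToShard` is a hypothetical stand-in that just fakes a per-shard result): the caller submits one request per shard without parking a thread per response, and the responses are merged once all shards complete.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

// Toy model of the async fan-out/merge pattern behind the HttpShardHandler
// change: send to all shards, handle completions in callbacks, merge at the end.
public class AsyncShardSketch {

  // Hypothetical stand-in for an async request to one shard; here the
  // fake "hit count" is just the shard name's length.
  static CompletableFuture<Integer> sendToShard(String shard) {
    return CompletableFuture.supplyAsync(shard::length);
  }

  // Fan out to all shards; no thread blocks per shard while responses are
  // in flight. join() waits only at the final merge point.
  static int queryAllShards(List<String> shards) {
    List<CompletableFuture<Integer>> pending = shards.stream()
        .map(AsyncShardSketch::sendToShard)
        .collect(Collectors.toList());
    return pending.stream().mapToInt(CompletableFuture::join).sum();
  }

  public static void main(String[] args) {
    System.out.println(queryAllShards(List.of("shard1", "shard2"))); // 12
  }
}
```

The analogy to the issue's "Solution 2" is that the response handling runs on the library's own worker threads (here, the common fork-join pool), so the application only pays for threads while there is actual work to do.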
[jira] [Commented] (SOLR-14354) HttpShardHandler send requests in async
[ https://issues.apache.org/jira/browse/SOLR-14354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174234#comment-17174234 ] Ishan Chattopadhyaya commented on SOLR-14354: - bq. Ishan Chattopadhyaya, fair enough, do you want to do the benchmark? Sorry :-( I can work on setting up some automated benchmarking (basically, automated runs of https://github.com/thesearchstack/solr-bench), but I won't be able to finish this soon enough before 8.7 due to client priorities. As of now, I'm actively and aggressively working on a similar issue on a higher priority, SOLR-13933, and will set up both of them together on a public server once this is done. > HttpShardHandler send requests in async > --- > > Key: SOLR-14354 > URL: https://issues.apache.org/jira/browse/SOLR-14354 > Project: Solr > Issue Type: Improvement >Reporter: Cao Manh Dat >Assignee: Cao Manh Dat >Priority: Major > Fix For: master (9.0), 8.7 > > Attachments: image-2020-03-23-10-04-08-399.png, > image-2020-03-23-10-09-10-221.png, image-2020-03-23-10-12-00-661.png > > Time Spent: 4h > Remaining Estimate: 0h > > h2. 1. Current approach (problem) of Solr > Below is a diagram describing how a request is currently handled. > !image-2020-03-23-10-04-08-399.png! > The main thread that handles the search request will submit n requests (n > equals the number of shards) to an executor. So each request will correspond > to a thread; after sending a request, that thread basically does nothing but > wait for the response from the other side. That thread will be swapped out and the CPU > will try to handle another thread (this is called a context switch; the CPU will > save the context of the current thread and switch to another one). When some > data (not all) comes back, that thread will be called to parse the data, > then it will wait until more data comes back. So there will be lots of context > switching in the CPU. 
That is a quite inefficient use of threads. Basically we > want fewer threads, and most of them must be busy all the time, because threads > are not free, and neither is context switching. That is the main idea behind > things like executors. > h2. 2. Async call of Jetty HttpClient > Jetty HttpClient offers an async API like this. > {code:java} > httpClient.newRequest("http://domain.com/path") > // Add request hooks > .onRequestQueued(request -> { ... }) > .onRequestBegin(request -> { ... }) > // Add response hooks > .onResponseBegin(response -> { ... }) > .onResponseHeaders(response -> { ... }) > .onResponseContent((response, buffer) -> { ... }) > .send(result -> { ... }); {code} > Therefore after calling {{send()}} the thread will return immediately without > blocking. Then when the client receives the headers from the other side, it will > call the {{onHeaders()}} listeners. When the client receives some {{byte[]}} (not > the whole response), it will call the {{onContent(buffer)}} listeners. > When everything is finished, it will call the {{onComplete}} listeners. One main > thing to notice here is that all listeners should finish quickly; if a > listener blocks, all further data of that request won’t be handled until the > listener finishes. > h2. 3. Solution 1: Sending requests async but spinning up one thread per response > Jetty HttpClient already provides several listeners, one of which is > InputStreamResponseListener. 
This is how it is used: > {code:java} > InputStreamResponseListener listener = new InputStreamResponseListener(); > client.newRequest(...).send(listener); > // Wait for the response headers to arrive > Response response = listener.get(5, TimeUnit.SECONDS); > if (response.getStatus() == 200) { > // Obtain the input stream on the response content > try (InputStream input = listener.getInputStream()) { > // Read the response content > } > } {code} > In this case, there will be two threads: > * one thread trying to read the response content from the InputStream > * one thread (a short-lived task) feeding content to the above InputStream whenever some byte[] is available. Note that if this thread is unable to feed data into the InputStream, it will wait. > By using this, the model of HttpShardHandler can be rewritten into something like this: > {code:java} > handler.sendReq(req, (is) -> { > executor.submit(() -> > try (is) { > // Read the content from InputStream > } > ) > }) {code} > The first diagram will be changed into this: > !image-2020-03-23-10-09-10-221.png! > Notice that although “sending req to shard1” is wide, it won't take a long time, since sending a request is a very quick operation. With this approach, handling threads won't be spun up until the first bytes are sent back. Notice that in this approach we still have active threads waiting for more data from the InputStream. > h2. 4. Solution 2: Buffering data and handle it inside jetty’s thread > Jetty has another listener called BufferingResponseListener. This is how it is used: > {code:java} > client.newRequest(...).send(new BufferingResponseListener() { > public void onComplete(Result result) { > try { > byte[] response = getContent(); {code}
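The "solution 1" flow above (send asynchronously, then hand each response to a small executor) can be modelled with nothing but the JDK. The sketch below is illustrative only: `sendReq`, the shard names, and the fake ":ok" payload are invented for the example, and this is not Solr's HttpShardHandler or Jetty code. It only shows the shape of the callback model: the caller submits all shard requests without blocking, and a small fixed pool services the responses.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

public class AsyncShardModel {
    // Pretend "shard request": returns a future immediately,
    // the way Jetty's send(listener) returns without blocking.
    static CompletableFuture<String> sendReq(String shard, ExecutorService pool) {
        return CompletableFuture.supplyAsync(() -> shard + ":ok", pool);
    }

    public static List<String> queryAll(List<String> shards) {
        // Fewer pool threads than shards: threads are only busy while work exists.
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            List<CompletableFuture<String>> futures = shards.stream()
                    .map(s -> sendReq(s, pool))
                    .collect(Collectors.toList());
            // The caller thread is free until responses arrive; join only to gather results.
            return futures.stream().map(CompletableFuture::join).collect(Collectors.toList());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(queryAll(List.of("shard1", "shard2", "shard3")));
    }
}
```

The point mirrored from the discussion is that the number of pool threads no longer has to equal the number of shards.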
[jira] [Comment Edited] (SOLR-14641) PeerSync, remove canHandleVersionRanges check
[ https://issues.apache.org/jira/browse/SOLR-14641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174228#comment-17174228 ] Cao Manh Dat edited comment on SOLR-14641 at 8/10/20, 10:29 AM: I believe the right way to ensure performance is to come up with something like lucene bench, so every downgrade and upgrade will be recorded and can be watched (per multiple commits). It doesn't make sense to ask everyone to do a dedicated performance test before and after their commits. was (Author: caomanhdat): I believe the right way to ensure performance is to come up with something like lucene bench, so every downgrade and upgrade will be recorded and can be watched. It doesn't make sense to ask everyone to do a dedicated performance test before and after their commits. > PeerSync, remove canHandleVersionRanges check > - > > Key: SOLR-14641 > URL: https://issues.apache.org/jira/browse/SOLR-14641 > Project: Solr > Issue Type: Improvement >Reporter: Cao Manh Dat >Assignee: Cao Manh Dat >Priority: Major > Fix For: 8.7 > > Time Spent: 20m > Remaining Estimate: 0h > > SOLR-9207 introduced PeerSync with update ranges, which was committed in 6.2 and > 7.0. To maintain backward compatibility at the time, we introduced an endpoint > in RealTimeGetComponent to check whether a node supports that feature or not. > It served its purpose well, and it should now be removed to reduce complexity and > save a request-response trip. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14641) PeerSync, remove canHandleVersionRanges check
[ https://issues.apache.org/jira/browse/SOLR-14641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174226#comment-17174226 ] Cao Manh Dat commented on SOLR-14641: - I kinda hesitate to do such performance testing for this one; what is the reason behind that? This issue simply removes a code path that is no longer used.
[GitHub] [lucene-solr] sigram commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for placement plugin interface
sigram commented on a change in pull request #1684: URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r467801413 ## File path: solr/core/src/java/org/apache/solr/cluster/placement/Cluster.java ## @@ -0,0 +1,53 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.cluster.placement; + +import java.io.IOException; +import java.util.Optional; +import java.util.Set; + +/** + * A representation of the (initial) cluster state, providing information on which nodes are part of the cluster and a way + * to get to more detailed info. + * + * This instance can also be used as a {@link PropertyValueSource} if {@link PropertyKey}'s need to be specified with + * a global cluster target. + */ +public interface Cluster extends PropertyValueSource { + /** + * @return current set of live nodes. Never null, never empty (Solr wouldn't call the plugin if empty + * since no useful work could then be done). + */ + Set getLiveNodes(); + + /** + * Returns info about the given collection if one exists. Because it is not expected for plugins to request info about + * a large number of collections, requests can only be made one by one. 
+ * + * This is also the reason we do not return a {@link java.util.Map} or {@link Set} of {@link SolrCollection}'s here: it would be + * wasteful to fetch all data and fill such a map when plugin code likely needs info about at most one or two collections. + */ + Optional getCollection(String collectionName) throws IOException; + + /** + * Allows getting all {@link SolrCollection} present in the cluster. + * + * WARNING: this call might be extremely inefficient on large clusters. Usage is discouraged. + */ + Set getAllCollections(); Review comment: I meant just the list of names ... `Collection`, otherwise I agree it can be very inefficient. ## File path: solr/core/src/java/org/apache/solr/cluster/placement/PlacementPlanFactory.java ## @@ -0,0 +1,52 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.cluster.placement; + +import java.util.Set; + +/** + * Allows plugins to create {@link PlacementPlan}s telling the Solr layer where to create replicas following the processing of + * a {@link PlacementRequest}. The Solr layer can (and will) check that the {@link PlacementPlan} conforms to the {@link PlacementRequest} (and + * if it does not, the requested operation will fail). 
+ */ +public interface PlacementPlanFactory { + /** + * Creates a {@link PlacementPlan} for adding a new collection and its replicas. + * + * This is in support of {@link org.apache.solr.cloud.api.collections.CreateCollectionCmd}. + */ + PlacementPlan createPlacementPlanNewCollection(CreateNewCollectionPlacementRequest request, String CollectionName, Set replicaPlacements); + + /** + * Creates a {@link PlacementPlan} for adding replicas to a given shard of an existing collection. + * + * This is in support (directly or indirectly) of {@link org.apache.solr.cloud.api.collections.AddReplicaCmd}, + * {@link org.apache.solr.cloud.api.collections.CreateShardCmd}, {@link org.apache.solr.cloud.api.collections.ReplaceNodeCmd}, + * {@link org.apache.solr.cloud.api.collections.MoveReplicaCmd}, {@link org.apache.solr.cloud.api.collections.SplitShardCmd}, + * {@link org.apache.solr.cloud.api.collections.RestoreCmd} and {@link org.apache.solr.cloud.api.
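The reviewer's suggestion in the `Cluster` review thread — expose only the collection *names* cheaply and fetch detailed info one collection at a time — could look roughly like the sketch below. All names here (`getAllCollectionNames`, `demoCluster`, the in-memory stand-in) are hypothetical illustrations of the discussion, not the interfaces committed for SOLR-14613.

```java
import java.util.Collection;
import java.util.Optional;
import java.util.Set;

public class PlacementApiSketch {
    // Illustrative only: mirrors the review discussion, not the committed API.
    public interface SolrCollection { String getName(); }

    public interface Cluster {
        /** Cheap: just the names; callers then fetch details one by one. */
        Collection<String> getAllCollectionNames();
        /** Detailed info is fetched lazily, one collection at a time. */
        Optional<SolrCollection> getCollection(String name);
    }

    // Trivial in-memory stand-in to show the intended access pattern.
    public static Cluster demoCluster(Set<String> names) {
        return new Cluster() {
            public Collection<String> getAllCollectionNames() { return names; }
            public Optional<SolrCollection> getCollection(String n) {
                return names.contains(n) ? Optional.of((SolrCollection) () -> n)
                                         : Optional.empty();
            }
        };
    }

    public static void main(String[] args) {
        Cluster c = demoCluster(Set.of("products", "logs"));
        System.out.println(c.getAllCollectionNames().size());
        System.out.println(c.getCollection("logs").get().getName());
    }
}
```

This keeps the cheap call cheap (names only) while preserving the one-by-one lookup that the javadoc argues for on large clusters.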
[jira] [Commented] (SOLR-14641) PeerSync, remove canHandleVersionRanges check
[ https://issues.apache.org/jira/browse/SOLR-14641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174225#comment-17174225 ] Ishan Chattopadhyaya commented on SOLR-14641: - Since this is a change that affects all users by default, I would still prefer that we have performance testing numbers to make sure there is no performance regression.
[jira] [Commented] (SOLR-14354) HttpShardHandler send requests in async
[ https://issues.apache.org/jira/browse/SOLR-14354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174224#comment-17174224 ] Cao Manh Dat commented on SOLR-14354: - [~ichattopadhyaya], fair enough, do you want to do the benchmark?
[jira] [Commented] (SOLR-14641) PeerSync, remove canHandleVersionRanges check
[ https://issues.apache.org/jira/browse/SOLR-14641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174220#comment-17174220 ] Cao Manh Dat commented on SOLR-14641: - [~ichattopadhyaya] I don't think this will be a noticeable boost in time, since this request is very lightweight.
[jira] [Commented] (SOLR-14354) HttpShardHandler send requests in async
[ https://issues.apache.org/jira/browse/SOLR-14354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174221#comment-17174221 ] Ishan Chattopadhyaya commented on SOLR-14354: - For a change like this, I would like to see performance numbers. Unless we have that, I am not comfortable with releasing with this feature. If you would like to use https://github.com/thesearchstack/solr-bench, I can offer help and assistance. In the absence of performance numbers, I shall be inclined to request a revert of this change (veto).
[jira] [Commented] (SOLR-14641) PeerSync, remove canHandleVersionRanges check
[ https://issues.apache.org/jira/browse/SOLR-14641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174214#comment-17174214 ] Ishan Chattopadhyaya commented on SOLR-14641: - bq. it should be removed to [...] a request-response trip for asking that. [~caomanhdat], based on your comment, it seems this is also a performance optimization. What is the level of performance testing/benchmarking that has been done for this issue?
[GitHub] [lucene-solr] noblepaul commented on pull request #1730: SOLR-14680: Provide an implementation for the new SolrCluster API
noblepaul commented on pull request #1730: URL: https://github.com/apache/lucene-solr/pull/1730#issuecomment-671263417 I realized that I had to tweak the APIs after writing an implementation. I have updated the API-only PR #1694 to reflect the latest This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-6152) Pre-populating values into search parameters on the query page of solr admin
[ https://issues.apache.org/jira/browse/SOLR-6152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174211#comment-17174211 ] Jakob Furrer commented on SOLR-6152: New Patch: * Regression regarding the raw value field has been fixed. * The value in the 'qt' field is correctly applied to the browser address bar url and the REST response url. Note: When the 'qt' value starts with the '/' character, it is used in the resulting REST response url path (i.e. in front of the question mark). Otherwise, the value is appended like any other parameter (i.e. as "&qt=qt_value"). This is consistent with prior behavior. * The 'indent off' checkbox is persisted. Note that the checkbox displays the *inverse* value of the 'indent' parameter, i.e. "&indent=false" ticks the checkbox. * Regression fixed: When the "Basic authentication plugin" was used (configured in security.json), an endless loop of redirects between the login page and the query page occurred. > Pre-populating values into search parameters on the query page of solr admin > > > Key: SOLR-6152 > URL: https://issues.apache.org/jira/browse/SOLR-6152 > Project: Solr > Issue Type: Improvement > Components: Admin UI >Affects Versions: 4.3.1 >Reporter: Dmitry Kan >Assignee: Jan Høydahl >Priority: Major > Attachments: SOLR-6152.patch, SOLR-6152.patch, SOLR-6152.patch, > SOLR-6152.patch, copy_url_to_clipboard.png, copy_url_to_clipboard_v2.png, > prefilling_and_extending_the_multivalue_parameter_fq.png, > prepoluate_query_parameters_query_page.bmp > > > In some use cases, it is highly desirable to be able to pre-populate the > query page of solr admin with specific values. > In a particular use case of mine, the solr admin user must pass a date range > value without which the query would fail. > It isn't easy to remember the value format for non-solr experts, so I would > like to have a way of hooking that value "example" into the query page.
> See the screenshot attached, where I have inserted the fq parameter with date range into the Raw Query Parameters.
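The 'qt' routing rule described in the comment (a value with a leading '/' becomes part of the request path; anything else is appended as a "&qt=..." parameter) can be sketched as follows. This illustrates only the stated rule, not the admin UI's actual code, and the "/select" default path is an assumption of the sketch.

```java
public class QtUrlSketch {
    // Illustrative sketch of the rule described in the comment:
    // a 'qt' value starting with '/' becomes the request path itself,
    // otherwise it is appended like any other parameter ("&qt=...").
    static String buildUrl(String base, String qt, String params) {
        if (qt != null && qt.startsWith("/")) {
            return base + qt + "?" + params;
        }
        String suffix = (qt == null || qt.isEmpty()) ? "" : "&qt=" + qt;
        // "/select" as the default handler path is an assumption of this sketch.
        return base + "/select?" + params + suffix;
    }

    public static void main(String[] args) {
        System.out.println(buildUrl("http://localhost:8983/solr/core", "/browse", "q=*:*"));
        System.out.println(buildUrl("http://localhost:8983/solr/core", "qt_value", "q=*:*"));
    }
}
```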
[jira] [Updated] (SOLR-6152) Pre-populating values into search parameters on the query page of solr admin
[ https://issues.apache.org/jira/browse/SOLR-6152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Furrer updated SOLR-6152: --- Attachment: SOLR-6152.patch
[jira] [Commented] (LUCENE-9448) Make an equivalent to Ant's "run" target for Luke module
[ https://issues.apache.org/jira/browse/LUCENE-9448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174199#comment-17174199 ] Tomoko Uchida commented on LUCENE-9448: --- {quote}I always thought Luke is still a "stand-alone" tool so I suggested dependency assembly for a stand-alone tool {quote} I'm not sure if it is related... when Luke was integrated into Lucene, my very first suggestion was creating a stand-alone Luke app (zip/tar) that is separately distributed from Lucene, just like Solr; it was rejected and I did not argue about it. I just remembered that. > Make an equivalent to Ant's "run" target for Luke module > > > Key: LUCENE-9448 > URL: https://issues.apache.org/jira/browse/LUCENE-9448 > Project: Lucene - Core > Issue Type: Sub-task >Reporter: Tomoko Uchida >Priority: Minor > Attachments: LUCENE-9448.patch > > > With Ant build, Luke Swing app can be launched by "ant run" after checking > out the source code. "ant run" allows developers to immediately see the > effects of UI changes without creating the whole zip/tgz package (originally, > it was suggested when integrating Luke to Lucene). > In Gradle, {{:lucene:luke:run}} task would be easily implemented with > {{JavaExec}}, I think.
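The {{JavaExec}} idea mentioned in the issue might look roughly like this in a Gradle build script. This is a hedged sketch: the task wiring and the Luke main class name are assumptions for illustration, not the build change actually committed for LUCENE-9448.

```groovy
// Illustrative sketch only; the main class name is an assumption.
task run(type: JavaExec) {
  group = 'application'
  description = 'Launches the Luke Swing app from the checked-out sources.'
  classpath = sourceSets.main.runtimeClasspath
  mainClass = 'org.apache.lucene.luke.app.desktop.LukeMain'
}
```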
[GitHub] [lucene-solr] noblepaul edited a comment on pull request #1730: SOLR-14680: Provide an implementation for the new SolrCluster API
noblepaul edited a comment on pull request #1730: URL: https://github.com/apache/lucene-solr/pull/1730#issuecomment-671229892 > Separating out the new lazy implementations into another PR and keeping this one for adding interfaces to internal classes would have made reviewing easier. Yeah, that was the other PR #1694 , I have revived it @murblanc please review it > Are there places in the code where currently the concrete classes are used and that could be changed to use the interfaces instead? In other words, how/where would these interfaces be used? The current concrete classes do not use/implement these interfaces. These interfaces will only be a part of implementations; for instance, the `LazySolrCluster` is one such implementation. In the future we should add a couple more