[GitHub] [lucene-solr] sigram commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for plugin interface
sigram commented on a change in pull request #1684: URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r466853804

## File path: solr/core/src/java/org/apache/solr/cluster/placement/CreateNewCollectionRequest.java
## @@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.cluster.placement;
+
+import java.util.Set;
+
+/**
+ * Request for creating a new collection with a given set of shards and replication factor for various replica types.
+ * The expected {@link WorkOrder} corresponding to this {@link Request} is created using
+ * {@link WorkOrderFactory#createWorkOrderNewCollection}
+ *
+ * Note there is no need at this stage to allow the plugin to know each shard hash range for example, this can be handled
+ * by the Solr side implementation of this interface without needing the plugin to worry about it (the implementation of this interface on
+ * the Solr side can maintain the ranges for each shard).
+ *
+ * Same goes for the {@link org.apache.solr.core.ConfigSet} name or other collection parameters. They are needed for
+ * creating a Collection but likely do not have to be exposed to the plugin (this can easily be changed if needed by
+ * adding accessors here, the underlying Solr side implementation of this interface has the information).
+ */
+public interface CreateNewCollectionRequest extends Request {
+  /**
+   * The name of the collection to be created and for which placement should be computed.
+   *
+   * Compare this method with {@link AddReplicasRequest#getCollection()}, there the collection already exists so can be
+   * directly passed in the {@link Request}.
+   *
+   * When processing this request, plugin code doesn't have to worry about existing {@link Replica}'s for the collection
+   * given that the collection is assumed not to exist.
+   */
+  String getCollectionName();
+
+  Set<String> getShardNames();
+
+  /**
+   * Properties passed through the Collection API by the client creating the collection.
+   * See {@link SolrCollection#getCustomProperty(String)}.
+   *
+   * Given this {@link Request} is for creating a new collection, it is not possible to pass the custom property values through
+   * the {@link SolrCollection} object. That instance does not exist yet, and is the reason {@link #getCollectionName()} exists
+   * rather than a method returning {@link SolrCollection}...
+   */
+  String getCustomProperty(String customPropertyName);

Review comment: Ok.

## File path: solr/core/src/java/org/apache/solr/cluster/placement/AddReplicasRequest.java
## @@ -0,0 +1,62 @@
+/* ASF license header (identical to the one above) */
+
+package org.apache.solr.cluster.placement;
+
+import java.util.Set;
+
+/**
+ * Request for creating one or more {@link Replica}'s for one or more {@link Shard}'s of an existing {@link SolrCollection}.
+ * The shard might or might not already exist, plugin code can easily find out by using {@link SolrCollection#getShards()}
+ * and verifying if the shard name(s) from {@link #getShardNames()} are there.
+ *
+ * As opposed to {@link CreateNewCollectionRequest}, the set of {@link Node}s on which the replicas should be placed
+ * is specified (defaults to being equal to the set returned by {@link Cluster#getLiveNodes()}).
+ *
+ * There is no extension between this interface and {@link CreateNewCollectionRequest} in either direction
+ * or from a common ancestor
[GitHub] [lucene-solr] sigram commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for plugin interface
sigram commented on a change in pull request #1684: URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r466850645

## File path: solr/core/src/java/org/apache/solr/cluster/placement/PropertyKeyFactory.java
## @@ -0,0 +1,61 @@
+/* ASF license header */
+
+package org.apache.solr.cluster.placement;
+
+/**
+ * Factory used by the plugin to create property keys to request property values from Solr.
+ *
+ * Building of a {@link PropertyKey} requires specifying the target (context) from which the value of that key should be
+ * obtained. This is done by specifying the appropriate {@link PropertyValueSource}.
+ * For clarity, when only a single type of target is acceptable, the corresponding subtype of {@link PropertyValueSource} is used instead
+ * (for example {@link Node}).
+ */
+public interface PropertyKeyFactory {
+  /**
+   * Returns a property key to request the number of cores on a {@link Node}.
+   */
+  PropertyKey createCoreCountKey(Node node);
+
+  /**
+   * Returns a property key to request disk related info on a {@link Node}.
+   */
+  PropertyKey createDiskInfoKey(Node node);
+
+  /**
+   * Returns a property key to request the value of a system property on a {@link Node}.
+   * @param systemPropertyName the name of the system property to retrieve.
+   */
+  PropertyKey createSystemPropertyKey(Node node, String systemPropertyName);
+
+  /**
+   * Returns a property key to request the value of a metric.
+   *
+   * Not all metrics make sense everywhere, but metrics can be applied to different objects. For example
+   * SEARCHER.searcher.indexCommitSize would make sense for a given replica of a given shard of a given collection,
+   * and possibly in other contexts.
+   *
+   * @param metricSource The registry of the metric. For example a specific {@link Replica}.
+   * @param metricName for example SEARCHER.searcher.indexCommitSize.
+   */
+  PropertyKey createMetricKey(PropertyValueSource metricSource, String metricName);

Review comment: `SolrDispatchFilter.setupJvmMetrics` initializes per-JVM metrics. They appear in a separate `solr.jvm` registry, which is different from `solr.node`. In 99% of cases (practically always in production) a Solr node maps 1:1 to a JVM instance. In some cases (most notably tests) there can be multiple Solr nodes running in a single JVM, so it's N:1 - but never the other way around because it wouldn't make sense. So in some rare cases we will have multiple `solr.node` registries in one JVM (reachable via different API endpoints), but always a single `solr.jvm` registry (also reachable via different endpoints).
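To make the distinction concrete, here is a minimal sketch of how plugin code might build keys against the interfaces quoted above; `factory`, `node` and `replica` are assumed to be handed to the plugin by Solr, and this only compiles against the proposed (not yet merged) API:

```java
import org.apache.solr.cluster.placement.*;

// Hedged sketch against the proposed placement interfaces.
class MetricKeySketch {
  void buildKeys(PropertyKeyFactory factory, Node node, Replica replica) {
    // Node-scoped value: answered from the per-node "solr.node" registry.
    PropertyKey coreCount = factory.createCoreCountKey(node);
    // Replica-scoped metric, matching the javadoc example above.
    PropertyKey commitSize = factory.createMetricKey(replica, "SEARCHER.searcher.indexCommitSize");
    // A "solr.jvm" metric has no natural PropertyValueSource here yet: several
    // nodes can share one JVM (N:1), so Node alone does not identify that registry.
  }
}
```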
[GitHub] [lucene-solr] sigram commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for plugin interface
sigram commented on a change in pull request #1684: URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r466848207

## File path: solr/core/src/java/org/apache/solr/cluster/placement/PlacementPlugin.java
## @@ -0,0 +1,41 @@
+/* ASF license header */
+
+package org.apache.solr.cluster.placement;
+
+/**
+ * Implemented by external plugins to control replica placement and movement on the search cluster (as well as other things
+ * such as cluster elasticity?) when cluster changes are required (initiated elsewhere, most likely following a Collection
+ * API call).
+ */
+public interface PlacementPlugin {

Review comment: I think that we need to add an explicit mechanism for configuration of plugins, otherwise plugin implementors will have to use other Solr facilities anyway. Maybe add a `configure(Map config)` method?
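One possible shape of the suggested hook, purely illustrative — the PR does not define it, and the parameter type here is a guess:

```java
import java.util.Map;

public interface PlacementPlugin {
  /**
   * Hypothetical configuration hook: called once after instantiation with the
   * plugin configuration Solr read from wherever the plugin was declared.
   */
  default void configure(Map<String, Object> config) {}

  // ... placement methods from the proposal would follow here ...
}
```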
[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1602: SOLR-14582: Expose IWC.setMaxCommitMergeWaitMillis in Solr's index config
dsmiley commented on a change in pull request #1602: URL: https://github.com/apache/lucene-solr/pull/1602#discussion_r466818795

## File path: solr/core/src/java/org/apache/solr/update/SolrIndexConfig.java
## @@ -87,6 +100,7 @@ private SolrIndexConfig(SolrConfig solrConfig) {
     maxBufferedDocs = -1;
     ramBufferSizeMB = 100;
     ramPerThreadHardLimitMB = -1;
+    maxCommitMergeWaitMillis = -1;

Review comment: ok, good point.
[GitHub] [lucene-solr] atris commented on a change in pull request #1686: SOLR-13528: Implement Request Rate Limiters
atris commented on a change in pull request #1686: URL: https://github.com/apache/lucene-solr/pull/1686#discussion_r466813189

## File path: solr/core/src/java/org/apache/solr/servlet/RateLimitManager.java
## @@ -38,9 +41,14 @@
  * rate limiting is being done for a specific request type.
  */
 public class RateLimitManager {
+  private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+
   public final static int DEFAULT_CONCURRENT_REQUESTS = (Runtime.getRuntime().availableProcessors()) * 3;
   public final static long DEFAULT_SLOT_ACQUISITION_TIMEOUT_MS = -1;
   private final Map requestRateLimiterMap;
+
+  // IMPORTANT: The slot from the corresponding rate limiter should be acquired before adding the request
+  // to this map. Subsequently, the request should be deleted from the map before the slot is released.
   private final Map activeRequestsMap;

Review comment: This is already a ConcurrentHashMap. The comment is redundant, removing.
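For readers following along, the ordering invariant the removed comment described looks roughly like this — illustrative names only, not the actual RateLimitManager code, with `AutoCloseable` standing in for whatever handle the real limiter returns on slot acquisition:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.servlet.ServletRequest;

// Hedged sketch of the acquire/register ordering invariant.
class SlotOrderingSketch {
  private final Map<ServletRequest, AutoCloseable> activeRequests = new ConcurrentHashMap<>();

  boolean process(ServletRequest request, AutoCloseable slot) throws Exception {
    // The slot must already be acquired from the rate limiter at this point.
    activeRequests.put(request, slot);
    try {
      return true; // ... forward the request down the filter chain ...
    } finally {
      activeRequests.remove(request); // deregister before releasing the slot
      slot.close();                   // only now release the slot
    }
  }
}
```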
[GitHub] [lucene-solr] atris commented on a change in pull request #1686: SOLR-13528: Implement Request Rate Limiters
atris commented on a change in pull request #1686: URL: https://github.com/apache/lucene-solr/pull/1686#discussion_r466812986

## File path: solr/core/src/test/org/apache/solr/servlet/TestRequestRateLimiter.java
## @@ -102,31 +103,101 @@ public Boolean call() throws Exception {
       try {
         future.get();
       } catch (Exception e) {
-        assertTrue("Not true " + e.getMessage(), e.getMessage().contains("non ok status: 429, message:Too Many Requests"));
+        assertThat(e.getMessage(), containsString("non ok status: 429, message:Too Many Requests"));
       }
     }

     MockRequestRateLimiter mockQueryRateLimiter = (MockRequestRateLimiter) rateLimitManager.getRequestRateLimiter(SolrRequest.SolrRequestType.QUERY);

-    assertTrue("Incoming request count did not match. Expected == 25 incoming " + mockQueryRateLimiter.incomingRequestCount.get(),
-        mockQueryRateLimiter.incomingRequestCount.get() == 25);
+    assertEquals(25, mockQueryRateLimiter.incomingRequestCount.get());

     assertTrue("Incoming accepted new request count did not match. Expected 5 incoming " + mockQueryRateLimiter.acceptedNewRequestCount.get(),
         mockQueryRateLimiter.acceptedNewRequestCount.get() < 25);
     assertTrue("Incoming rejected new request count did not match. Expected 20 incoming " + mockQueryRateLimiter.rejectedRequestCount.get(),
         mockQueryRateLimiter.rejectedRequestCount.get() > 0);
-    assertTrue("Incoming total processed requests count did not match. Expected " + mockQueryRateLimiter.incomingRequestCount.get() + " incoming "
-        + (mockQueryRateLimiter.acceptedNewRequestCount.get() + mockQueryRateLimiter.rejectedRequestCount.get()),
-        (mockQueryRateLimiter.acceptedNewRequestCount.get() + mockQueryRateLimiter.rejectedRequestCount.get()) == mockQueryRateLimiter.incomingRequestCount.get());
+    assertEquals(mockQueryRateLimiter.acceptedNewRequestCount.get() + mockQueryRateLimiter.rejectedRequestCount.get(),
+        mockQueryRateLimiter.incomingRequestCount.get());
     } finally {
       executor.shutdown();
     }
   }

+  @Test
+  public void testSlotBorrowing() throws Exception {
+    CloudSolrClient client = cluster.getSolrClient();
+    client.setDefaultCollection(SECOND_COLLECTION);
+
+    CollectionAdminRequest.createCollection(SECOND_COLLECTION, 1, 1).process(client);
+    cluster.waitForActiveCollection(SECOND_COLLECTION, 1, 1);
+
+    SolrDispatchFilter solrDispatchFilter = cluster.getJettySolrRunner(0).getSolrDispatchFilter();
+
+    RequestRateLimiter.RateLimiterConfig queryRateLimiterConfig = new RequestRateLimiter.RateLimiterConfig(SolrRequest.SolrRequestType.QUERY,
+        true, 1, DEFAULT_SLOT_ACQUISITION_TIMEOUT_MS, 5 /* allowedRequests */, true /* isSlotBorrowing */);
+    RequestRateLimiter.RateLimiterConfig indexRateLimiterConfig = new RequestRateLimiter.RateLimiterConfig(SolrRequest.SolrRequestType.UPDATE,
+        true, 1, DEFAULT_SLOT_ACQUISITION_TIMEOUT_MS, 5 /* allowedRequests */, true /* isSlotBorrowing */);
+    // We are fine with a null FilterConfig here since we ensure that MockBuilder never invokes its parent
+    RateLimitManager.Builder builder = new MockBuilder(null /* dummy FilterConfig */, new MockRequestRateLimiter(queryRateLimiterConfig, 5), new MockRequestRateLimiter(indexRateLimiterConfig, 5));
+    RateLimitManager rateLimitManager = builder.build();
+
+    solrDispatchFilter.replaceRateLimitManager(rateLimitManager);
+
+    for (int i = 0; i < 100; i++) {
+      SolrInputDocument doc = new SolrInputDocument();
+
+      doc.setField("id", i);
+      doc.setField("text", "foo");
+      client.add(doc);
+    }
+
+    client.commit();
+
+    ExecutorService executor = ExecutorUtil.newMDCAwareCachedThreadPool("threadpool");
+    List<Callable<Boolean>> callableList = new ArrayList<>();
+    List<Future<Boolean>> futures;
+
+    try {
+      for (int i = 0; i < 25; i++) {
+        callableList.add(() -> {
+          try {
+            QueryResponse response = client.query(new SolrQuery("*:*"));
+
+            if (response.getResults().getNumFound() > 0) {
+              assertEquals(100, response.getResults().getNumFound());
+            }
+          } catch (Exception e) {
+            throw new RuntimeException(e.getMessage());
+          }

+          return true;
+        });
+      }
+
+      futures = executor.invokeAll(callableList);
+
+      for (Future future : futures) {
+        try {
+          future.get();

Review comment: assertTrue(future.get() != null); instead?
[jira] [Updated] (SOLR-14712) Standardize RPC calls in Solr
[ https://issues.apache.org/jira/browse/SOLR-14712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Noble Paul updated SOLR-14712:
------------------------------
    Description: 
We should have a standard mechanism to make a request to the right replica/node across solr code. This RPC mechanism assumes that
* The RPC mechanism is HTTP
* It is aware of all collections, shards & their topology etc
* it knows how to route a request to the correct core

This is agnostic of wire level formats, Solr documents etc. That is a layer above this. Anyone can use their own JSON parser or any other RPC wire level format on top of this

for example a code like this
{code}
private void invokeOverseerOp(String electionNode, String op) {
  ModifiableSolrParams params = new ModifiableSolrParams();
  ShardHandler shardHandler = shardHandlerFactory.getShardHandler();
  params.set(CoreAdminParams.ACTION, CoreAdminAction.OVERSEEROP.toString());
  params.set("op", op);
  params.set("qt", adminPath);
  params.set("electionNode", electionNode);
  ShardRequest sreq = new ShardRequest();
  sreq.purpose = 1;
  String replica = zkStateReader.getBaseUrlForNodeName(LeaderElector.getNodeName(electionNode));
  sreq.shards = new String[]{replica};
  sreq.actualShards = sreq.shards;
  sreq.params = params;
  shardHandler.submit(sreq, replica, sreq.params);
  shardHandler.takeCompletedOrError();
}
{code}
will be replaced with
{code}
private void invokeOverseerOp(String electionNode, String op) {
  RpcFactory factory = null;
  factory.createCallRouter()
      .toNode(electionNode)
      .createHttpRpc()
      .withMethod(SolrRequest.METHOD.GET)
      .addParam(CoreAdminParams.ACTION, CoreAdminAction.OVERSEEROP.toString())
      .addParam("op", op)
      .addParam("electionNode", electionNode)
      .addParam(ShardParams.SHARDS_PURPOSE, 1)
      .withV1Path(adminPath)
      .invoke();
}
{code}

  was:
We should have a standard mechanism to make a request to the right replica/node across solr code. This RPC mechanism assumes that
* The RPC mechanism is HTTP
* It is aware of all collections, shards & their topology etc
* it knows how to route a request to the correct core

This is agnostic of wire level formats, Solr documents etc. That is a layer above this. Anyone can use their own JSON parser or any other RPC wire level format on top of this

for example a code like this
{code}
private void invokeOverseerOp(String electionNode, String op) {
  ModifiableSolrParams params = new ModifiableSolrParams();
  ShardHandler shardHandler = shardHandlerFactory.getShardHandler();
  params.set(CoreAdminParams.ACTION, CoreAdminAction.OVERSEEROP.toString());
  params.set("op", op);
  params.set("qt", adminPath);
  params.set("electionNode", electionNode);
  ShardRequest sreq = new ShardRequest();
  sreq.purpose = 1;
  String replica = zkStateReader.getBaseUrlForNodeName(LeaderElector.getNodeName(electionNode));
  sreq.shards = new String[]{replica};
  sreq.actualShards = sreq.shards;
  sreq.params = params;
  shardHandler.submit(sreq, replica, sreq.params);
  shardHandler.takeCompletedOrError();
}
{code}
will be replaced with
{code}
private void invokeOverseerOp(String electionNode, String op) {
  RpcFactory factory = null;
  factory.createCallRouter()
      .toNode(electionNode)
      .createHttpRpc()
      .withMethod(SolrRequest.METHOD.GET)
      .addParam(CoreAdminParams.ACTION, CoreAdminAction.OVERSEEROP.toString())
      .addParam("op", op)
      .addParam("electionNode", electionNode)
      .addParam(ShardParams.SHARDS_PURPOSE, 1)
      .withV1Path(adminPath)
      .invoke();
}
{code}

> Standardize RPC calls in Solr
> -----------------------------
>
>                 Key: SOLR-14712
>                 URL: https://issues.apache.org/jira/browse/SOLR-14712
>             Project: Solr
>          Issue Type: Improvement
>   Security Level: Public(Default Security Level. Issues are Public)
>            Reporter: Noble Paul
>            Assignee: Noble Paul
>            Priority: Major
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> We should have a standard mechanism to make a request to the right
> replica/node across solr code.
> This RPC mechanism assumes that
> * The RPC mechanism is HTTP
> * It is aware of all collections, shards & their topology etc
> * it knows how to route a request to the correct core
> This is agnostic of wire level formats, Solr documents etc. That is a layer
> above this.
> Anyone can use their own JSON parser or any other RPC wire level format on
> top of this
> for example a code like this
> {code}
> private void invokeOverseerOp(String electionNode, String op) {
>   ModifiableSolrParams params = new ModifiableSolrParams();
>   ShardHandler shardHandler = shardHandlerFactory.getShardHandler();
>   params.set(CoreAdminParams.ACTION, C
[jira] [Updated] (SOLR-14712) Standardize RPC calls in Solr
[ https://issues.apache.org/jira/browse/SOLR-14712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Noble Paul updated SOLR-14712:
------------------------------
    Description: 
We should have a standard mechanism to make a request to the right replica/node across solr code. This RPC mechanism assumes that
* The RPC mechanism is HTTP
* It is aware of all collections, shards & their topology etc
* it knows how to route a request to the correct core

This is agnostic of wire level formats, Solr documents etc. That is a layer above this. Anyone can use their own JSON parser or any other RPC wire level format on top of this

for example a code like this
{code}
private void invokeOverseerOp(String electionNode, String op) {
  ModifiableSolrParams params = new ModifiableSolrParams();
  ShardHandler shardHandler = shardHandlerFactory.getShardHandler();
  params.set(CoreAdminParams.ACTION, CoreAdminAction.OVERSEEROP.toString());
  params.set("op", op);
  params.set("qt", adminPath);
  params.set("electionNode", electionNode);
  ShardRequest sreq = new ShardRequest();
  sreq.purpose = 1;
  String replica = zkStateReader.getBaseUrlForNodeName(LeaderElector.getNodeName(electionNode));
  sreq.shards = new String[]{replica};
  sreq.actualShards = sreq.shards;
  sreq.params = params;
  shardHandler.submit(sreq, replica, sreq.params);
  shardHandler.takeCompletedOrError();
}
{code}
will be replaced with
{code}
private void invokeOverseerOp(String electionNode, String op) {
  RpcFactory factory = null;
  factory.createCallRouter()
      .toNode(electionNode)
      .createHttpRpc()
      .withMethod(SolrRequest.METHOD.GET)
      .addParam(CoreAdminParams.ACTION, CoreAdminAction.OVERSEEROP.toString())
      .addParam("op", op)
      .addParam("electionNode", electionNode)
      .addParam(ShardParams.SHARDS_PURPOSE, 1)
      .withV1Path(adminPath)
      .invoke();
}
{code}

  was:
We should have a standard mechanism to make a request to the right replica/node across solr code. This RPC mechanism assumes that
* The RPC mechanism is HTTP
* It is aware of all collections, shards & their topology etc
* it knows how to route a request to the correct core

This is agnostic of wire level formats, Solr documents etc. That is a layer above this. Anyone can use their own JSON parser or any other RPC wire level format on top of this

for example a code like this
{code}
private void invokeOverseerOp(String electionNode, String op) {
  ModifiableSolrParams params = new ModifiableSolrParams();
  ShardHandler shardHandler = shardHandlerFactory.getShardHandler();
  params.set(CoreAdminParams.ACTION, CoreAdminAction.OVERSEEROP.toString());
  params.set("op", op);
  params.set("qt", adminPath);
  params.set("electionNode", electionNode);
  ShardRequest sreq = new ShardRequest();
  sreq.purpose = 1;
  String replica = zkStateReader.getBaseUrlForNodeName(LeaderElector.getNodeName(electionNode));
  sreq.shards = new String[]{replica};
  sreq.actualShards = sreq.shards;
  sreq.params = params;
  shardHandler.submit(sreq, replica, sreq.params);
  shardHandler.takeCompletedOrError();
}
{code}
will be replaced with
{code}
private void invokeOverseerOp(String electionNode, String op) {
  HttpRpcFactory factory = null;
  factory.create()
      .withHttpMethod(SolrRequest.METHOD.GET)
      .addParam(CoreAdminParams.ACTION, CoreAdminAction.OVERSEEROP.toString())
      .addParam("op", op)
      .addParam("electionNode", electionNode)
      .addParam(ShardParams.SHARDS_PURPOSE, "1")
      .withV1Uri(adminPath)
      .toNode(electionNode)
      .invoke();
}
{code}

> Standardize RPC calls in Solr
> -----------------------------
>
>                 Key: SOLR-14712
>                 URL: https://issues.apache.org/jira/browse/SOLR-14712
>             Project: Solr
>          Issue Type: Improvement
>   Security Level: Public(Default Security Level. Issues are Public)
>            Reporter: Noble Paul
>            Assignee: Noble Paul
>            Priority: Major
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> We should have a standard mechanism to make a request to the right
> replica/node across solr code.
> This RPC mechanism assumes that
> * The RPC mechanism is HTTP
> * It is aware of all collections, shards & their topology etc
> * it knows how to route a request to the correct core
> This is agnostic of wire level formats, Solr documents etc. That is a layer
> above this.
> Anyone can use their own JSON parser or any other RPC wire level format on
> top of this
> for example a code like this
> {code}
> private void invokeOverseerOp(String electionNode, String op) {
>   ModifiableSolrParams params = new ModifiableSolrParams();
>   ShardHandler shardHandler = shardHandlerFactory.getShardHandler();
>   params.set(CoreAdminParams.ACTION, CoreAdminAction.OVERSEEROP.toString());
>   params.set("op", op);
>   params.set("qt", adminPath);
>   params.set("electionNode", electionNode);
>   ShardRequ
[GitHub] [lucene-solr] anshumg commented on a change in pull request #1720: SOLR-14712 Standardize RPC calls in Solr
anshumg commented on a change in pull request #1720: URL: https://github.com/apache/lucene-solr/pull/1720#discussion_r466802599

## File path: solr/solrj/src/java/org/apache/solr/common/util/HttpRpc.java
## @@ -0,0 +1,95 @@
+/* ASF license header */
+package org.apache.solr.common.util;
+
+import org.apache.solr.client.solrj.SolrRequest;
+
+import java.util.Map;
+
+/**Abstract out HTTP aspects of the request

Review comment: New line ?

## File path: solr/solrj/src/java/org/apache/solr/client/solrj/impl/CloudSolrClient.java
## @@ -478,4 +479,11 @@ public Builder getThis() {
       return this;
     }
   }
+
+  private final RpcFactory factory = null;//TODO

Review comment: You don't intend to commit this, right?

## File path: solr/solrj/src/java/org/apache/solr/common/util/CallRouter.java
## @@ -0,0 +1,52 @@
+/* ASF license header */
+package org.apache.solr.common.util;
+
+public interface CallRouter {
+    /**
+     * send to a specific node. usually admin requests
+     */
+    CallRouter toNode(String nodeName);
+
+    /**
+     * Make a request to any replica of the shard of type
+     */
+    CallRouter toShard(String collection, String shard, ReplicaType type);
+
+    /**
+     * Identify the shard using the route key and send the request to a given replica type
+     */
+    CallRouter toShard(String collection, ReplicaType type, String routeKey);

Review comment: can we reorder this so the decision maker for the routing i.e. routeKey is the 2nd param like `CallRouter toShard(String collection, String shard, ReplicaType type);` ?

## File path: solr/solrj/src/java/org/apache/solr/common/util/RpcFactory.java
## @@ -0,0 +1,107 @@
+/* ASF license header */
+package org.apache.solr.common.util;
+
+import org.apache.solr.common.SolrException;
+import org.apache.solr.common.params.CommonParams;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.io.OutputStream;
+import java.util.function.Function;
+
+/**A factory that creates any type of RPC calls in Solr
+ * This is designed to create low level access to the RPC mechanism.
+ * This is agnostic of Solr documents or other internal concepts of Solr
+ * But it knows certain things
+ * a) how to locate a Solr core/replica
+ * b) basic HTTP access,
+ * c) serialization/deserialization is the responsibility of the code that is making a request
+ *
+ */
+public interface RpcFactory {
+
+    CallRouter createCallRouter();
+
+    HttpRpc createHttpRpc();
+
+    interface ResponseConsumer {
+        /**Allows this imp
[jira] [Resolved] (SOLR-14717) Writing parquets to solr shards
[ https://issues.apache.org/jira/browse/SOLR-14717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tomas Eduardo Fernandez Lobbe resolved SOLR-14717.
--------------------------------------------------
    Resolution: Invalid

Hi Kevin, Jira issues are typically for reporting bugs or suggesting features. Please ask these kinds of questions in the users list.

> Writing parquets to solr shards
> -------------------------------
>
>                 Key: SOLR-14717
>                 URL: https://issues.apache.org/jira/browse/SOLR-14717
>             Project: Solr
>          Issue Type: Wish
>   Security Level: Public(Default Security Level. Issues are Public)
>            Reporter: Kevin Van Lieshout
>            Priority: Major
>
> Is there any assistance around writing parquets from spark to solr shards, or is it possible to customize a DIH to import a parquet to a solr shard? Let me know if this is possible, or the best work around for this. Much appreciated, thanks
[GitHub] [lucene-solr] murblanc commented on pull request #1684: SOLR-14613: strongly typed initial proposal for plugin interface
murblanc commented on pull request #1684: URL: https://github.com/apache/lucene-solr/pull/1684#issuecomment-670233414

Thanks @sigram for the comments. They're useful, will update the PR tomorrow.
[GitHub] [lucene-solr] murblanc commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for plugin interface
murblanc commented on a change in pull request #1684: URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r466731274

## File path: solr/core/src/java/org/apache/solr/cluster/placement/ReplicaPlacement.java
## @@ -0,0 +1,29 @@
+/* ASF license header */
+
+package org.apache.solr.cluster.placement;
+
+/**
+ * Placement decision for a single {@link Replica}. Note this placement decision is used as part of a {@link WorkOrder},
+ * it does not directly lead to the plugin code getting a corresponding {@link Replica} instance, nor does it require the
+ * plugin to provide a {@link Shard} instance (the plugin code gets such instances for existing replicas and shards in the
+ * cluster but does not create them directly for adding new replicas for new or existing shards).
+ *
+ * Captures the {@link Shard} (via the shard name), {@link Node} and {@link Replica.ReplicaType} of a Replica to be created.
+ */
+public interface ReplicaPlacement {

Review comment: It does include the `Node`. See `WorkOrderFactory.createReplicaPlacement()`. It does not directly refer to a `Request`; the reference to `Request` is captured in the `WorkOrder` created using the same factory and in which the `ReplicaPlacement`s are used. Everything passed to the creation factories can be made accessible on the returned instances if needed (given it's captured in the underlying implementations), but I'm not convinced it's useful, so I kept it simple. The assumption is that plugin code creates these instances, so plugin code knows why, and keeps track of what each created instance refers to... But again, easy to add here and in other instances returned by factories (we might need to define subinterfaces then to make the appropriate values accessible - BTW that's how I started coding locally, but then simplified to limit the number of interfaces and for the reasons exposed above).
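To make the flow concrete, a short sketch of the plugin side under the interfaces discussed in this thread; the shape of `createReplicaPlacement` is inferred from the comment above, so treat the exact signature as a guess:

```java
import org.apache.solr.cluster.placement.*;

// Hedged sketch: the plugin creates placements through the factory, so the
// Node is captured at creation time even if ReplicaPlacement exposes no getter.
class PlacementSketch {
  ReplicaPlacement place(WorkOrderFactory factory, Node node) {
    return factory.createReplicaPlacement("shard1", node, Replica.ReplicaType.NRT);
  }
}
```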
[jira] [Commented] (LUCENE-8626) standardise test class naming
[ https://issues.apache.org/jira/browse/LUCENE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172712#comment-17172712 ]

Marcus Eagan commented on LUCENE-8626:
--------------------------------------

There are many, many areas where I am looking to improve the developer experience and the code hygiene. I'm not some guru of clean code or anything, but I am starting to go through my laundry list of things that drive me (and others) nuts and reduce the overall quality of the project. I intend to add a pre-commit check to enforce this and other standards as they come through.

> standardise test class naming
> ------------------------------
>
>                 Key: LUCENE-8626
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8626
>             Project: Lucene - Core
>          Issue Type: Test
>            Reporter: Christine Poerschke
>            Priority: Major
>         Attachments: SOLR-12939.01.patch, SOLR-12939.02.patch, SOLR-12939.03.patch, SOLR-12939_hoss_validation_groovy_experiment.patch
>
> This was mentioned and proposed on the dev mailing list. Starting this ticket here to start to make it happen?
> History: This ticket was created as https://issues.apache.org/jira/browse/SOLR-12939 ticket and then got JIRA-moved to become https://issues.apache.org/jira/browse/LUCENE-8626 ticket.
[jira] [Commented] (LUCENE-8626) standardise test class naming
[ https://issues.apache.org/jira/browse/LUCENE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172709#comment-17172709 ]

Marcus Eagan commented on LUCENE-8626:
--------------------------------------

[~cpoerschke] Thank you very much for kicking this effort off back in 2018. This issue has implications for many problems, and for what I am working on it is a detriment to my productivity. I'm sure I am not alone. If you don't mind, I'd like to take this effort a bit further and to completion via a PR.

> standardise test class naming
> ------------------------------
>
>                 Key: LUCENE-8626
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8626
>             Project: Lucene - Core
>          Issue Type: Test
>            Reporter: Christine Poerschke
>            Priority: Major
>         Attachments: SOLR-12939.01.patch, SOLR-12939.02.patch, SOLR-12939.03.patch, SOLR-12939_hoss_validation_groovy_experiment.patch
>
> This was mentioned and proposed on the dev mailing list. Starting this ticket here to start to make it happen?
> History: This ticket was created as https://issues.apache.org/jira/browse/SOLR-12939 ticket and then got JIRA-moved to become https://issues.apache.org/jira/browse/LUCENE-8626 ticket.
[GitHub] [lucene-solr] murblanc commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for plugin interface
murblanc commented on a change in pull request #1684: URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r466729056

## File path: solr/core/src/java/org/apache/solr/cluster/placement/PropertyKeyFactory.java
## @@ -0,0 +1,61 @@
+/* ASF license header */
+
+package org.apache.solr.cluster.placement;
+
+/**
+ * Factory used by the plugin to create property keys to request property values from Solr.
+ *
+ * Building of a {@link PropertyKey} requires specifying the target (context) from which the value of that key should be
+ * obtained. This is done by specifying the appropriate {@link PropertyValueSource}.
+ * For clarity, when only a single type of target is acceptable, the corresponding subtype of {@link PropertyValueSource} is used instead
+ * (for example {@link Node}).
+ */
+public interface PropertyKeyFactory {
+  /**
+   * Returns a property key to request the number of cores on a {@link Node}.
+   */
+  PropertyKey createCoreCountKey(Node node);
+
+  /**
+   * Returns a property key to request disk related info on a {@link Node}.
+   */
+  PropertyKey createDiskInfoKey(Node node);
+
+  /**
+   * Returns a property key to request the value of a system property on a {@link Node}.
+   * @param systemPropertyName the name of the system property to retrieve.
+   */
+  PropertyKey createSystemPropertyKey(Node node, String systemPropertyName);
+
+  /**
+   * Returns a property key to request the value of a metric.
+   *
+   * Not all metrics make sense everywhere, but metrics can be applied to different objects. For example
+   * SEARCHER.searcher.indexCommitSize would make sense for a given replica of a given shard of a given collection,
+   * and possibly in other contexts.
+   *
+   * @param metricSource The registry of the metric. For example a specific {@link Replica}.
+   * @param metricName for example SEARCHER.searcher.indexCommitSize.
+   */
+  PropertyKey createMetricKey(PropertyValueSource metricSource, String metricName);

Review comment: So these would be metrics that live on a node but that are accessed differently or with a different name? If we were able to distinguish by some other means, would Node be an appropriate PropertyValueSource? Another type of PropertyValueSource can be introduced, but it would then have to point to a specific JVM... Can you point me to examples of these two metrics in 8x or trunk?
[GitHub] [lucene-solr] murblanc commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for plugin interface
murblanc commented on a change in pull request #1684: URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r466727764

## File path: solr/core/src/java/org/apache/solr/cluster/placement/PropertyKeyFactory.java
## @@ -0,0 +1,61 @@
+/* ASF license header */
+
+package org.apache.solr.cluster.placement;
+
+/**
+ * Factory used by the plugin to create property keys to request property values from Solr.
+ *
+ * Building of a {@link PropertyKey} requires specifying the target (context) from which the value of that key should be
+ * obtained. This is done by specifying the appropriate {@link PropertyValueSource}.
+ * For clarity, when only a single type of target is acceptable, the corresponding subtype of {@link PropertyValueSource} is used instead
+ * (for example {@link Node}).
+ */
+public interface PropertyKeyFactory {
+  /**
+   * Returns a property key to request the number of cores on a {@link Node}.
+   */
+  PropertyKey createCoreCountKey(Node node);

Review comment: If we add new types of `PropertyKeys` we will have to add implementations for these new keys. Wouldn't we need to touch the Solr codebase anyway? Clients (plugins) using the interface would have to know about the new implementation classes and update their code to use them. Technically they could pass a class name through config or other means to use new implementations without code change, but is it a realistic scenario? What would they do with these keys? What are the values these keys will fetch and how will they be used? I'm not against making generic and highly flexible code but only if it's really needed. So if you have a real use case in mind that we should support, I'm open. Otherwise I'd rather keep things strongly typed for now (and as long as we only add stuff to these interfaces we're not breaking anything so we can add later).
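The trade-off under discussion, side by side; the generic variant below is hypothetical and not part of the PR:

```java
import org.apache.solr.cluster.placement.*;

class KeyStyleSketch {
  void compare(PropertyKeyFactory factory, Node node) {
    // Strongly typed, as proposed: new key types require changes in Solr,
    // but plugins get compile-time checking of what they ask for.
    PropertyKey cores = factory.createCoreCountKey(node);

    // Hypothetical stringly-typed alternative (NOT in the proposal): plugins
    // could name new keys without touching Solr, at the cost of type safety.
    // PropertyKey custom = factory.createKey(node, "coreCount");
  }
}
```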
[jira] [Commented] (SOLR-14718) Multiple flaws in tracking which UpdateCommand is associated with a given failure logged by ErrorReportingConcurrentUpdateSolrClient: "cmd=add{,id=(null)}"
[ https://issues.apache.org/jira/browse/SOLR-14718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172705#comment-17172705 ]

Chris M. Hostetter commented on SOLR-14718:
-------------------------------------------

# "{{add\{,id=(null)}}}" is what you get from an {{AddUpdateCommand.toString()}} if either:
** the document it contains has no uniqueKey (ie: not in use by the schema, not yet filled in by some custom processor, etc...) ... this situation is pretty rare in practice
** the document itself is null
# {{JavabinLoader.parseAndLoadDocs}} takes a "re-use" approach with {{AddUpdateCommand}} ...
** it initializes a single {{AddUpdateCommand addCmd}} for the whole request
** it calls {{addCmd.solrDoc = document; ... processor.processAdd(addCmd); addCmd.clear();}} for each document
# {{DistributedUpdateProcessor}} uses {{SolrCmdDistributor}} uses {{StreamingSolrClients}} to create & asynchronously process a {{Req}} for each of the individual _documents_
** but along the way it keeps a reference to the (Add) {{UpdateCommand}} that document came from, evidently in order to log info about it during error handling
** which is useless once {{JavabinLoader}} has nulled out the details of the {{AddUpdateCommand}}
*** IIRC there's code to "clone" the {{SolrInputDocument}} for local processing so we don't accidentally modify it in update processors while it's 'in flight' for async remote updates, but in this case it's the {{UpdateCommand}} that's getting modified, and i guess nothing clones that?
# BUT! ... even if we "fix" this {{AddUpdateCommand}} re-use (or clone the entire {{UpdateCommand}}, not just the SolrInputDocument) there appears to be another problem (which i think already affects things like the XML/JSON loaders when indexing multiple documents per request?)
** the way the {{Req}} (and thus {{UpdateCommand}}) is tracked for use if/when there is an error is as a member variable on the {{ErrorReportingConcurrentUpdateSolrClient}} that {{StreamingSolrClients.getSolrClient(Req)}} initializes
** Except...
** {{StreamingSolrClients}} maintains a {{Map solrClients}} "cache" of solr clients keyed off of the {{Req}} objects' {{req.node.getUrl()}}
*** So if a single {{SolrQueryRequest}} includes a batch of multiple documents destined for the same node (shard? leader?) then (AFAICT) any document which has a failure is going to be reported as the _first_ document in that batch
*** ie: instead of "{{add\{,id=(null)}}}" you might get a failure for "{{add\{,id=xxx}}}" even though 'xxx' may have been indexed just fine, but doc 'yyy' (which lives in the same shard as 'xxx') may have been the "add" command that really failed

> Multiple flaws in tracking which UpdateCommand is associated with a given
> failure logged by ErrorReportingConcurrentUpdateSolrClient:
> "cmd=add{,id=(null)}"
> --------------------------------------------------------------------------
>
>                 Key: SOLR-14718
>                 URL: https://issues.apache.org/jira/browse/SOLR-14718
>             Project: Solr
>          Issue Type: Bug
>   Security Level: Public(Default Security Level. Issues are Public)
>            Reporter: Chris M. Hostetter
>            Priority: Major
>
> Here's an example, taken from SOLR-13486, of an ERROR logged by
> {{ErrorReportingConcurrentUpdateSolrClient}} when a distributed update failure
> occurred...
> {noformat}
>    [junit4] 2> 1704143 ERROR
> (updateExecutor-6525-thread-1-processing-x:outOfSyncReplicasCannotBecomeLeader-false_shard1_replica_n1
> r:core_node2 null n:127.0.0.1:34940_solr
> c:outOfSyncReplicasCannotBecomeLeader-false s:shard1) [n:127.0.0.1:34940_solr
> c:outOfSyncReplicasCannotBecomeLeader-false s:shard1 r:core_node2
> x:outOfSyncReplicasCannotBecomeLeader-false_shard1_replica_n1]
> o.a.s.u.ErrorReportingConcurrentUpdateSolrClient Error when calling
> SolrCmdDistributor$Req: cmd=add{,id=(null)}; node=StdNode:
> http://127.0.0.1:40376/solr/outOfSyncReplicasCannotBecomeLeader-false_shard1_replica_n5/
> to
> http://127.0.0.1:40376/solr/outOfSyncReplicasCannotBecomeLeader-false_shard1_replica_n5/
>    [junit4] 2> => java.io.IOException: java.net.ConnectException:
> Connection refused
> {noformat}
> In this case the underlying cause was a ConnectException - but the same
> ERROR msg format is used regardless of the underlying Exception that was
> thrown - and it's the result of these two bits of code...
> {code:java}
> // ErrorReportingConcurrentUpdateSolrClient.handleError
> log.error("Error when calling {} to {}", req, req.node.getUrl(), ex);
>
> // Req.toString()...
> public String toString() {
>   StringBuilder sb = new StringBuilder();
>   sb.append("SolrCmdDistributor$Req: cmd=").append(cmd.toString());
>   sb.append("; node=").append(String.valueOf(node));
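Condensed illustration of points 1-2 of the comment above (not the actual JavabinLoader source; {{readNextDoc()}} stands in for the javabin stream iteration):

{code}
AddUpdateCommand addCmd = new AddUpdateCommand(req); // one instance per request
SolrInputDocument document;
while ((document = readNextDoc()) != null) {
  addCmd.solrDoc = document;
  processor.processAdd(addCmd); // may enqueue an async remote update that keeps
                                // a reference to this very addCmd instance...
  addCmd.clear();               // ...which is emptied here, so a later failure
                                // logs the command as "add{,id=(null)}"
}
{code}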
[jira] [Created] (SOLR-14718) Multiple flaws in tracking which UpdateCommand is associated with a given failure logged by ErrorReportingConcurrentUpdateSolrClient: "cmd=add{,id=(null)}"
Chris M. Hostetter created SOLR-14718:
-----------------------------------------
             Summary: Multiple flaws in tracking which UpdateCommand is associated with a given failure logged by ErrorReportingConcurrentUpdateSolrClient: "cmd=add{,id=(null)}"
                 Key: SOLR-14718
                 URL: https://issues.apache.org/jira/browse/SOLR-14718
             Project: Solr
          Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
            Reporter: Chris M. Hostetter

Here's an example, taken from SOLR-13486, of an ERROR logged by {{ErrorReportingConcurrentUpdateSolrClient}} when a distributed update failure occurred...
{noformat}
   [junit4] 2> 1704143 ERROR (updateExecutor-6525-thread-1-processing-x:outOfSyncReplicasCannotBecomeLeader-false_shard1_replica_n1 r:core_node2 null n:127.0.0.1:34940_solr c:outOfSyncReplicasCannotBecomeLeader-false s:shard1) [n:127.0.0.1:34940_solr c:outOfSyncReplicasCannotBecomeLeader-false s:shard1 r:core_node2 x:outOfSyncReplicasCannotBecomeLeader-false_shard1_replica_n1] o.a.s.u.ErrorReportingConcurrentUpdateSolrClient Error when calling SolrCmdDistributor$Req: cmd=add{,id=(null)}; node=StdNode: http://127.0.0.1:40376/solr/outOfSyncReplicasCannotBecomeLeader-false_shard1_replica_n5/ to http://127.0.0.1:40376/solr/outOfSyncReplicasCannotBecomeLeader-false_shard1_replica_n5/
   [junit4] 2> => java.io.IOException: java.net.ConnectException: Connection refused
{noformat}
In this case the underlying cause was a ConnectException - but the same ERROR msg format is used regardless of the underlying Exception that was thrown - and it's the result of these two bits of code...
{code:java}
// ErrorReportingConcurrentUpdateSolrClient.handleError
log.error("Error when calling {} to {}", req, req.node.getUrl(), ex);

// Req.toString()...
public String toString() {
  StringBuilder sb = new StringBuilder();
  sb.append("SolrCmdDistributor$Req: cmd=").append(cmd.toString());
  sb.append("; node=").append(String.valueOf(node));
  return sb.toString();
}
{code}
I was recently asked why the {{UpdateCommand cmd}} reported by the {{Req.toString()}} was *ALWAYS* showing up as {{add\{,id=(null)};}} (ie: an "empty" {{AddUpdateCommand}}) instead of correctly identifying which document was failing. In the above case of a "ConnectionException" this may not matter, but the same problem exists if an individual document has a problem, perhaps due to schema conflicts detected by the leader when some other node forwards TOLEADER.

Based on an audit of the code, there appears to be at least 2 diff bugs in Solr that can cause the "cmd" reported in these error situations to be wrong:
* UpdateCommand re-use in JavabinLoader
* ErrorReportingConcurrentUpdateSolrClient in StreamingSolrClients

...full notes to follow in comment.
[GitHub] [lucene-solr] murblanc commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for plugin interface
murblanc commented on a change in pull request #1684: URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r466724009 ## File path: solr/core/src/java/org/apache/solr/cluster/placement/CreateNewCollectionRequest.java ## @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.cluster.placement; + +import java.util.Set; + +/** + * Request for creating a new collection with a given set of shards and replication factor for various replica types. + * The expected {@link WorkOrder} corresponding to this {@link Request} is created using + * {@link WorkOrderFactory#createWorkOrderNewCollection} + * + * Note there is no need at this stage to allow the plugin to know each shard hash range for example, this can be handled + * by the Solr side implementation of this interface without needing the plugin to worry about it (the implementation of this interface on + * the Solr side can maintain the ranges for each shard). + * + * Same goes for the {@link org.apache.solr.core.ConfigSet} name or other collection parameters. They are needed for + * creating a Collection but likely do not have to be exposed to the plugin (this can easily be changed if needed by + * adding accessors here, the underlying Solr side implementation of this interface has the information). + */ +public interface CreateNewCollectionRequest extends Request { + /** + * The name of the collection to be created and for which placement should be computed. + * + * Compare this method with {@link AddReplicasRequest#getCollection()}, there the collection already exists so can be + * directly passed in the {@link Request}. + * + * When processing this request, plugin code doesn't have to worry about existing {@link Replica}'s for the collection + * given that the collection is assumed not to exist. + */ + String getCollectionName(); + + Set getShardNames(); + + /** + * Properties passed through the Collection API by the client creating the collection. + * See {@link SolrCollection#getCustomProperty(String)}. + * + * Given this {@link Request} is for creating a new collection, it is not possible to pass the custom property values through + * the {@link SolrCollection} object. That instance does not exist yet, and is the reason {@link #getCollectionName()} exists + * rather than a method returning {@link SolrCollection}... + */ + String getCustomProperty(String customPropertyName); Review comment: I can add such an enumeration (but then would skip non `String` properties, or just `toString()` everything) but unclear to me how a plugin would be basing placement decisions on properties it doesn't know about. Indeed the general idea is that the plugin does not do the calls and does not need to access all the information from the Collection API call. 
It is called for placement computation; the code on the Solr side knows everything the Collection API call has provided and will handle the CREATE command (sent to the Overseer) as it does today. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
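To make the "properties it knows about" point concrete: a plugin that expects a specific property could consult it as in the sketch below (the property name `placement.rack` and the class name are invented; only the accessors come from the interface above).
```java
import org.apache.solr.cluster.placement.CreateNewCollectionRequest;

// Hypothetical plugin fragment: only getCustomProperty(), getCollectionName()
// and getShardNames() come from the proposed interface; the rest is illustrative.
class RackAwarePlacementSketch {
  void inspect(CreateNewCollectionRequest request) {
    // A plugin can only act on properties it knows by name; unknown ones are ignored.
    String rack = request.getCustomProperty("placement.rack");
    if (rack != null) {
      // ...restrict candidate nodes to that rack before computing placements
      // for request.getCollectionName() and each shard in request.getShardNames()...
    }
  }
}
```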
[GitHub] [lucene-solr] murblanc commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for plugin interface
murblanc commented on a change in pull request #1684: URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r466722280 ## File path: solr/core/src/java/org/apache/solr/cluster/placement/Cluster.java ## @@ -0,0 +1,46 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.cluster.placement; + +import java.io.IOException; +import java.util.Optional; +import java.util.Set; + +/** + * A representation of the (initial) cluster state, providing information on which nodes are part of the cluster and a way + * to get to more detailed info. + * + * This instance can also be used as a {@link PropertyValueSource} if {@link PropertyKey}'s need to be specified with + * a global cluster target. + */ +public interface Cluster extends PropertyValueSource { + /** + * @return current set of live nodes. Never null, never empty (Solr wouldn't call the plugin if empty + * since no useful could then be done). + */ + Set<Node> getLiveNodes(); + + /** + * Returns info about the given collection if one exists. Because it is not expected for plugins to request info about + * a large number of collections, requests can only be made one by one. + * + * This is also the reason we do not return a {@link java.util.Map} or {@link Set} of {@link SolrCollection}'s here: it would be + * wasteful to fetch all data and fill such a map when plugin code likely needs info about at most one or two collections. + */ + Optional<SolrCollection> getCollection(String collectionName) throws IOException; Review comment: Ok. I don't really see a use case for this (interested if you have something specific in mind) but will add. My thinking was that plugins will be interested in the collection they need to compute placement for, and in other specific collections when "their" collection's properties reference another collection (for example, something along the lines of `withCollection`: even though we removed it, a plugin could reimplement it). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
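For the `withCollection`-style case mentioned, a plugin might resolve the referenced collection roughly as follows (a sketch: the property name is assumed, and `SolrCollection#getCustomProperty` is taken to return a `String` as it does on the request interfaces).
```java
import java.io.IOException;
import java.util.Optional;
import org.apache.solr.cluster.placement.Cluster;
import org.apache.solr.cluster.placement.SolrCollection;

class WithCollectionSketch {
  // Fetch the one other collection this collection's properties point at, if any.
  Optional<SolrCollection> resolveCompanion(Cluster cluster, SolrCollection collection)
      throws IOException {
    String companion = collection.getCustomProperty("withCollection"); // property name assumed
    return companion == null ? Optional.empty() : cluster.getCollection(companion);
  }
}
```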
[GitHub] [lucene-solr] murblanc commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for plugin interface
murblanc commented on a change in pull request #1684: URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r466721386 ## File path: solr/core/src/java/org/apache/solr/cluster/placement/AddReplicasRequest.java ## @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.cluster.placement; + +import java.util.Set; + +/** + * Request for creating one or more {@link Replica}'s for one or more {@link Shard}'s of an existing {@link SolrCollection}. + * The shard might or might not already exist, plugin code can easily find out by using {@link SolrCollection#getShards()} + * and verifying if the shard name(s) from {@link #getShardNames()} are there. + * + * As opposed to {@link CreateNewCollectionRequest}, the set of {@link Node}s on which the replicas should be placed + * is specified (defaults to being equal to the set returned by {@link Cluster#getLiveNodes()}). + * + * There is no extension between this interface and {@link CreateNewCollectionRequest} in either direction + * or from a common ancestor for readability. An ancestor could make sense and would be an "abstract interface" not intended + * to be implemented directly, but this does not exist in Java. + * + * Plugin code would likely treat the two types of requests differently since here existing {@link Replica}'s must be taken + * into account for placement whereas in {@link CreateNewCollectionRequest} no {@link Replica}'s are assumed to exist. + */ +public interface AddReplicasRequest extends Request { + /** + * The {@link SolrCollection} to add {@link Replica}(s) to. The replicas are to be added to a shard that might or might + * not yet exist when the plugin's {@link PlacementPlugin#computePlacement} is called. + */ + SolrCollection getCollection(); + + /** + * Shard name(s) for which new replicas placement should be computed. The shard(s) might exist or not (that's why this + * method returns a {@link Set} of {@link String}'s and not directly a set of {@link Shard} instances). + * + * Note the Collection API allows specifying the shard name or a {@code _route_} parameter. The Solr implementation will + * convert either specification into the relevant shard name so the plugin code doesn't have to worry about this. + */ + Set<String> getShardNames(); + + /** Replicas should only be placed on nodes from the set returned by this method. */ + Set<Node> getTargetNodes(); Review comment: My motivation here was to not have the plugin worry about it: if no specific subset of nodes was passed, this set is simply made equivalent to all live nodes. That way the plugin needs no special-case logic to deal with this. This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
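The effect for plugin authors, sketched below (the class name is invented; the accessor is the one under review):
```java
import java.util.Set;
import org.apache.solr.cluster.placement.AddReplicasRequest;
import org.apache.solr.cluster.placement.Node;

class TargetNodesSketch {
  void place(AddReplicasRequest request) {
    // No null/empty special case needed: when the caller did not restrict the nodes,
    // Solr hands the plugin a set equal to Cluster#getLiveNodes().
    Set<Node> candidates = request.getTargetNodes();
    for (Node node : candidates) {
      // ...score the node and assign replicas...
    }
  }
}
```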
[jira] [Created] (SOLR-14717) Writing parquets to solr shards
Kevin Van Lieshout created SOLR-14717: - Summary: Writing parquets to solr shards Key: SOLR-14717 URL: https://issues.apache.org/jira/browse/SOLR-14717 Project: Solr Issue Type: Wish Security Level: Public (Default Security Level. Issues are Public) Reporter: Kevin Van Lieshout Is there any assistance around writing Parquet files from Spark to Solr shards, or is it possible to customize a DIH to import a Parquet file into a Solr shard? Let me know if this is possible, or what the best workaround would be. Much appreciated, thanks -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
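One possible approach is to bypass DIH and index the Parquet rows from Spark with SolrJ. The sketch below assumes a SolrCloud cluster behind ZooKeeper at `zk1:2181`, a target collection `docs` with an `id` field, and the spark-sql and solrj artifacts on the classpath; all names and paths are illustrative.
```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.common.SolrInputDocument;
import org.apache.spark.api.java.function.ForeachPartitionFunction;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class ParquetToSolr {
  public static void main(String[] args) throws Exception {
    SparkSession spark = SparkSession.builder().appName("parquet-to-solr").getOrCreate();
    Dataset<Row> df = spark.read().parquet("hdfs:///data/events.parquet"); // path is illustrative

    // One SolrJ client per partition, so indexing runs in parallel on the executors.
    df.foreachPartition((ForeachPartitionFunction<Row>) rows -> {
      try (CloudSolrClient solr = new CloudSolrClient.Builder(
          Collections.singletonList("zk1:2181"), Optional.empty()).build()) {
        List<SolrInputDocument> batch = new ArrayList<>();
        while (rows.hasNext()) {
          Row row = rows.next();
          SolrInputDocument doc = new SolrInputDocument();
          doc.addField("id", String.valueOf((Object) row.getAs("id")));
          // ...map the remaining Parquet columns to Solr fields here...
          batch.add(doc);
        }
        if (!batch.isEmpty()) {
          // Visibility of the docs then depends on the collection's (auto)commit settings.
          solr.add("docs", batch);
        }
      }
    });
    spark.stop();
  }
}
```
Since CloudSolrClient routes each document by its id (the composite ID router hashes the id to a shard), the writer never needs to address individual shards directly.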
[GitHub] [lucene-solr] madrob commented on a change in pull request #1686: SOLR-13528: Implement Request Rate Limiters
madrob commented on a change in pull request #1686: URL: https://github.com/apache/lucene-solr/pull/1686#discussion_r466588870 ## File path: solr/core/src/test/org/apache/solr/servlet/TestRequestRateLimiter.java ## @@ -102,31 +103,101 @@ public Boolean call() throws Exception { try { future.get(); } catch (Exception e) { - assertTrue("Not true " + e.getMessage(), e.getMessage().contains("non ok status: 429, message:Too Many Requests")); + assertThat(e.getMessage(), containsString("non ok status: 429, message:Too Many Requests")); } } MockRequestRateLimiter mockQueryRateLimiter = (MockRequestRateLimiter) rateLimitManager.getRequestRateLimiter(SolrRequest.SolrRequestType.QUERY); - assertTrue("Incoming request count did not match. Expected == 25 incoming " + mockQueryRateLimiter.incomingRequestCount.get(), - mockQueryRateLimiter.incomingRequestCount.get() == 25); + assertEquals(mockQueryRateLimiter.incomingRequestCount.get(),25); Review comment: nit: swap the parameters. assertEquals(expected, actual) ## File path: solr/core/src/java/org/apache/solr/servlet/RateLimitManager.java ## @@ -92,17 +100,28 @@ public boolean handleRequest(HttpServletRequest request) throws InterruptedExcep * For each request rate limiter whose type that is not of the type of the request which got rejected, * check if slot borrowing is enabled. If enabled, try to acquire a slot. * If allotted, return else try next request type. + * + * @lucene.gexperimental -- Can cause slots to be blocked if a request borrows a slot and is itself long lived. Review comment: s/gexperimental/experimental ## File path: solr/core/src/test/org/apache/solr/servlet/TestRequestRateLimiter.java ## @@ -102,31 +103,101 @@ public Boolean call() throws Exception { try { future.get(); } catch (Exception e) { - assertTrue("Not true " + e.getMessage(), e.getMessage().contains("non ok status: 429, message:Too Many Requests")); + assertThat(e.getMessage(), containsString("non ok status: 429, message:Too Many Requests")); } } MockRequestRateLimiter mockQueryRateLimiter = (MockRequestRateLimiter) rateLimitManager.getRequestRateLimiter(SolrRequest.SolrRequestType.QUERY); - assertTrue("Incoming request count did not match. Expected == 25 incoming " + mockQueryRateLimiter.incomingRequestCount.get(), - mockQueryRateLimiter.incomingRequestCount.get() == 25); + assertEquals(mockQueryRateLimiter.incomingRequestCount.get(),25); assertTrue("Incoming accepted new request count did not match. Expected 5 incoming " + mockQueryRateLimiter.acceptedNewRequestCount.get(), mockQueryRateLimiter.acceptedNewRequestCount.get() < 25); assertTrue("Incoming rejected new request count did not match. Expected 20 incoming " + mockQueryRateLimiter.rejectedRequestCount.get(), mockQueryRateLimiter.rejectedRequestCount.get() > 0); - assertTrue("Incoming total processed requests count did not match. 
Expected " + mockQueryRateLimiter.incomingRequestCount.get() + " incoming " - + (mockQueryRateLimiter.acceptedNewRequestCount.get() + mockQueryRateLimiter.rejectedRequestCount.get()), - (mockQueryRateLimiter.acceptedNewRequestCount.get() + mockQueryRateLimiter.rejectedRequestCount.get()) == mockQueryRateLimiter.incomingRequestCount.get()); + assertEquals(mockQueryRateLimiter.acceptedNewRequestCount.get() + mockQueryRateLimiter.rejectedRequestCount.get(), Review comment: (expected, actual) ## File path: solr/core/src/test/org/apache/solr/servlet/TestRequestRateLimiter.java ## @@ -102,31 +103,101 @@ public Boolean call() throws Exception { try { future.get(); } catch (Exception e) { - assertTrue("Not true " + e.getMessage(), e.getMessage().contains("non ok status: 429, message:Too Many Requests")); + assertThat(e.getMessage(), containsString("non ok status: 429, message:Too Many Requests")); } } MockRequestRateLimiter mockQueryRateLimiter = (MockRequestRateLimiter) rateLimitManager.getRequestRateLimiter(SolrRequest.SolrRequestType.QUERY); - assertTrue("Incoming request count did not match. Expected == 25 incoming " + mockQueryRateLimiter.incomingRequestCount.get(), - mockQueryRateLimiter.incomingRequestCount.get() == 25); + assertEquals(mockQueryRateLimiter.incomingRequestCount.get(),25); assertTrue("Incoming accepted new request count did not match. Expected 5 incoming " + mockQueryRateLimiter.acceptedNewRequestCount.get(), mockQueryRateLimiter.acceptedNewRequestCount.get() < 25); assertTrue("Incoming rejected new request count did not match. Expected 20 incoming " + mockQueryRateLimiter.rejectedRequest
[jira] [Commented] (LUCENE-8626) standardise test class naming
[ https://issues.apache.org/jira/browse/LUCENE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172666#comment-17172666 ] Erick Erickson commented on LUCENE-8626: Hmmm, one other random thought. While I'd prefer some kind of enforcement, I'd claim that if we just changed the test file names however we agree, the fact that they'd all be consistent would make it less likely that new test classes are created with the abandoned pattern. I think it's worth making the change first, then worrying about enforcement. > standardise test class naming > - > > Key: LUCENE-8626 > URL: https://issues.apache.org/jira/browse/LUCENE-8626 > Project: Lucene - Core > Issue Type: Test >Reporter: Christine Poerschke >Priority: Major > Attachments: SOLR-12939.01.patch, SOLR-12939.02.patch, > SOLR-12939.03.patch, SOLR-12939_hoss_validation_groovy_experiment.patch > > > This was mentioned and proposed on the dev mailing list. Starting this ticket > here to start to make it happen? > History: This ticket was created as > https://issues.apache.org/jira/browse/SOLR-12939 ticket and then got > JIRA-moved to become https://issues.apache.org/jira/browse/LUCENE-8626 ticket. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-9959) SolrInfoMBean-s category and hierarchy cleanup
[ https://issues.apache.org/jira/browse/SOLR-9959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172658#comment-17172658 ] Andrzej Bialecki commented on SOLR-9959: It's a left-over from the refactoring in SOLR-13858; it needs to be deleted (it was actually in use when it was first introduced). I'll take care of this. > SolrInfoMBean-s category and hierarchy cleanup > -- > > Key: SOLR-9959 > URL: https://issues.apache.org/jira/browse/SOLR-9959 > Project: Solr > Issue Type: Improvement > Components: metrics >Affects Versions: 7.0 >Reporter: Andrzej Bialecki >Assignee: Andrzej Bialecki >Priority: Blocker > Fix For: 7.0 > > Attachments: SOLR-9959.patch, SOLR-9959.patch, SOLR-9959.patch > > > SOLR-9947 changed categories of some of {{SolrInfoMBean-s}}, and it also > added an alternative view in JMX, similar to the one produced by > {{SolrJmxReporter}}. > Some changes were left out from that issue because they would break the > back-compatibility in 6.x, but they should be done before 7.0: > * remove the old JMX view of {{SolrInfoMBean}}-s and improve the new one so > that it's more readable and useful. > * in many cases {{SolrInfoMBean.getName()}} just returns a FQCN, but it could > be more informative - eg. for highlighter or query plugins this could be the > symbolic name of a plugin that users know and use in configuration. > * top-level categories need more thought. On one hand it's best to minimize > their number, on the other hand they need to meaningfully represent the > functionality of components that use them. SOLR-9947 made some cosmetic > changes, but more discussion is necessary (eg. QUERY vs. SEARCHHANDLER) > * we should consider removing some of the methods in {{SolrInfoMBean}} that > usually don't return any useful information, eg. {{getDocs}}, {{getSource()}} > and {{getVersion()}}. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] gus-asf commented on a change in pull request #1716: SOLR-14706: Fix support for default autoscaling policy
gus-asf commented on a change in pull request #1716: URL: https://github.com/apache/lucene-solr/pull/1716#discussion_r466670298 ## File path: solr/solr-ref-guide/src/solr-upgrade-notes.adoc ## @@ -72,7 +84,9 @@ More information about this new feature is available in the section <
[GitHub] [lucene-solr] HoustonPutman commented on a change in pull request #1716: SOLR-14706: Fix support for default autoscaling policy
HoustonPutman commented on a change in pull request #1716: URL: https://github.com/apache/lucene-solr/pull/1716#discussion_r466657348 ## File path: solr/solrj/src/java/org/apache/solr/client/solrj/cloud/autoscaling/Clause.java ## @@ -117,10 +117,10 @@ private Clause(Map m) { strict = Boolean.parseBoolean(String.valueOf(m.getOrDefault("strict", "true"))); Optional globalTagName = m.keySet().stream().filter(Policy.GLOBAL_ONLY_TAGS::contains).findFirst(); if (globalTagName.isPresent()) { - globalTag = parse(globalTagName.get(), m); - if (m.size() > 2) { -throw new RuntimeException("Only one extra tag supported for the tag " + globalTagName.get() + " in " + toJSONString(m)); + if (m.size() > 3) { Review comment: I think that's the logic Gus used. It's equivalent to:
```java
private void validateGlobalTag(Map m, String tagName) {
  if (m.size() > 2) {
    if (!(m.containsKey("strict") && m.size() == 3)) {
      throw new RuntimeException("Only 'strict' and one extra tag supported for the tag " + tagName + " in " + toJSONString(m));
    }
  }
}
```
This will error whenever there are more than two keys, unless there are exactly three and one of them is `strict`. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] sigram commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for plugin interface
sigram commented on a change in pull request #1684: URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r466617275 ## File path: solr/core/src/java/org/apache/solr/cloud/api/collections/Assign.java ## @@ -569,14 +574,20 @@ public AssignStrategy create(ClusterState clusterState, DocCollection collection case RULES: List rules = new ArrayList<>(); for (Object map : ruleMaps) rules.add(new Rule((Map) map)); + @SuppressWarnings({"rawtypes"}) + List snitches = (List) collection.get(SNITCH); return new RulesBasedAssignStrategy(rules, snitches, clusterState); +case PLUGIN_PLACEMENT: + // TODO need to decide which plugin class to use. Global config (single plugin for all PLUGIN_PLACEMENT collections?) or per collection config? + // TODO hardconding a sample plugin for now. DO NOT MERGE this as is. + return new PlacementPluginAssignStrategy(new SamplePluginMinimizeCores()); default: throw new Assign.AssignmentException("Unknown strategy type: " + strategy); } } private enum Strategy { - LEGACY, RULES; + LEGACY, RULES, PLUGIN_PLACEMENT; Review comment: `Strategy` already describes how to perform placement. Maybe we should rename `Strategy` -> `Placement` and simply use `PLUGIN` here? ## File path: solr/core/src/java/org/apache/solr/cluster/placement/Cluster.java ## @@ -0,0 +1,46 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.cluster.placement; + +import java.io.IOException; +import java.util.Optional; +import java.util.Set; + +/** + * A representation of the (initial) cluster state, providing information on which nodes are part of the cluster and a way + * to get to more detailed info. + * + * This instance can also be used as a {@link PropertyValueSource} if {@link PropertyKey}'s need to be specified with + * a global cluster target. + */ +public interface Cluster extends PropertyValueSource { + /** + * @return current set of live nodes. Never null, never empty (Solr wouldn't call the plugin if empty + * since no useful could then be done). Review comment: `no useful` -> `no useful work` ## File path: solr/core/src/java/org/apache/solr/cluster/placement/SystemPropertyPropertyValue.java ## @@ -0,0 +1,28 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. 
You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.cluster.placement; + +/** + * A {@link PropertyValue} representing a System property on the target {@link Node}. + */ +public interface SystemPropertyPropertyValue extends PropertyValue { Review comment: Maybe rename it to `SyspropPropertyValue` to avoid this weird repetition? ## File path: solr/core/src/java/org/apache/solr/cluster/placement/MetricPropertyValue.java ## @@ -0,0 +1,30 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software +
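A sketch of the shape the suggested `Strategy` -> `Placement` rename would produce (the reviewer's proposal, not merged code):
```java
// Hypothetical: with the enum named for placement itself, the constant
// no longer needs its PLACEMENT qualifier.
class AssignSketch {
  enum Placement { LEGACY, RULES, PLUGIN }
}
```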
[GitHub] [lucene-solr] HoustonPutman commented on a change in pull request #1716: SOLR-14706: Fix support for default autoscaling policy
HoustonPutman commented on a change in pull request #1716: URL: https://github.com/apache/lucene-solr/pull/1716#discussion_r466643442 ## File path: solr/solr-ref-guide/src/solr-upgrade-notes.adoc ## @@ -34,17 +34,29 @@ Detailed steps for upgrading a Solr cluster are in the section <> below. -=== Solr 8.6.1 +=== Solr 8.6.1 (Upgrading from 8.6.0 only) + +See the https://cwiki.apache.org/confluence/display/SOLR/ReleaseNote861[8.6.1 Release Notes^] +for an overview of the fixes included in Solr 8.6.1. + +When upgrading to 8.6.1 users should be aware of the following major changes from 8.6.0. *Autoscaling* * As mentioned in the 8.6 upgrade notes, a default autoscaling policy was provided starting in 8.6.0. This default autoscaling policy resulted in increasingly slow collection creation calls in large clusters (50+ collections). + In 8.6.1 the default autoscaling policy has been removed, and clouds will not use autoscaling unless a policy has explicitly been created. -In order to fix the performance degradations introduced in 8.6.0, merely upgrade to 8.6.1. +If your cloud is running 8.6.0 and **not using an explicit autoscaling policy**, upgrade to 8.6.1 and remove the default cluster policy and preferences via the following command. +Replace `localhost:8983` with your Solr endpoint. ++ +``` +curl -X POST -H 'Content-type:application/json' -d '{set-cluster-policy : [], set-cluster-preferences : []}' http://localhost:8983/api/cluster/autoscaling Review comment: Hmmm, I'm a bit confused about that. So if I spin up a new cloud with 8.5, I don't see a `cluster-preferences`. It basically looks the same as an 8.6.1 cloud on which the above command has been run. So if we reverted the change from the defaulting patch, shouldn't an 8.5 cloud and an 8.6.1 cloud with identical `autoscaling.json` ZNodes behave the same? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9379) Directory based approach for index encryption
[ https://issues.apache.org/jira/browse/LUCENE-9379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172622#comment-17172622 ] Rajeswari Natarajan commented on LUCENE-9379: - We have a use case where we want to fit multiple indexes/tenants per collection, each index/tenant should have a separate key, and we would like to use the composite ID router. The composite ID router does not limit each index/tenant to one shard/directory. In this scenario, is OS-level encryption possible? > Directory based approach for index encryption > - > > Key: LUCENE-9379 > URL: https://issues.apache.org/jira/browse/LUCENE-9379 > Project: Lucene - Core > Issue Type: New Feature >Reporter: Bruno Roustant >Assignee: Bruno Roustant >Priority: Major > Time Spent: 2h 20m > Remaining Estimate: 0h > > +Important+: This Lucene Directory wrapper approach is to be considered only > if an OS level encryption is not possible. OS level encryption better fits > Lucene usage of OS cache, and thus is more performant. > But there are some use-case where OS level encryption is not possible. This > Jira issue was created to address those. > > > The goal is to provide optional encryption of the index, with a scope limited > to an encryptable Lucene Directory wrapper. > Encryption is at rest on disk, not in memory. > This simple approach should fit any Codec as it would be orthogonal, without > modifying APIs as much as possible. > Use a standard encryption method. Limit perf/memory impact as much as > possible. > Determine how callers provide encryption keys. They must not be stored on > disk. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Comment Edited] (LUCENE-8626) standardise test class naming
[ https://issues.apache.org/jira/browse/LUCENE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172538#comment-17172538 ] Michael Sokolov edited comment on LUCENE-8626 at 8/6/20, 6:58 PM: -- Personally, I prefer -prefixing- suffixing (um suffix is the one that comes after, right?), but more than that, I'd value consistency. Still, without automated enforcement, we won't get that either. So, I'd be -0 to this change unless it comes along with enforcement (banning files with the nonstandard naming scheme). Otherwise we'll just be back here again in a year... uh, actually reading the thread now I see we do have an enforcement mechanism, OK that's great. If we can come to some consensus here, then rename away! was (Author: sokolov): Personally, I prefer prefixing, but more than that, I'd value consistency. Still, without automated enforcement, we won't get that either. So, I'd be -0 to this change unless it comes along with enforcement (banning files with the nonstandard naming scheme). Otherwise we'll just be back here again in a year... uh, actually reading the thread now I see we do have an enforcement mechanism, OK that's great. If we can come to some consensus here, then rename away! > standardise test class naming > - > > Key: LUCENE-8626 > URL: https://issues.apache.org/jira/browse/LUCENE-8626 > Project: Lucene - Core > Issue Type: Test >Reporter: Christine Poerschke >Priority: Major > Attachments: SOLR-12939.01.patch, SOLR-12939.02.patch, > SOLR-12939.03.patch, SOLR-12939_hoss_validation_groovy_experiment.patch > > > This was mentioned and proposed on the dev mailing list. Starting this ticket > here to start to make it happen? > History: This ticket was created as > https://issues.apache.org/jira/browse/SOLR-12939 ticket and then got > JIRA-moved to become https://issues.apache.org/jira/browse/LUCENE-8626 ticket. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] sigram commented on a change in pull request #1716: SOLR-14706: Fix support for default autoscaling policy
sigram commented on a change in pull request #1716: URL: https://github.com/apache/lucene-solr/pull/1716#discussion_r466605349 ## File path: solr/solrj/src/java/org/apache/solr/client/solrj/cloud/autoscaling/Clause.java ## @@ -117,10 +117,10 @@ private Clause(Map m) { strict = Boolean.parseBoolean(String.valueOf(m.getOrDefault("strict", "true"))); Optional globalTagName = m.keySet().stream().filter(Policy.GLOBAL_ONLY_TAGS::contains).findFirst(); if (globalTagName.isPresent()) { - globalTag = parse(globalTagName.get(), m); - if (m.size() > 2) { -throw new RuntimeException("Only one extra tag supported for the tag " + globalTagName.get() + " in " + toJSONString(m)); + if (m.size() > 3) { Review comment: @gus-asf yes, that's the intent, though your pseudo-code is still incorrect - if there's no `strict` tag then `m.size() > 2` is already an error. In other words, for global tags we expect exactly 2 keys (tag and operand), with optional third key `strict`. ## File path: solr/solr-ref-guide/src/solr-upgrade-notes.adoc ## @@ -72,7 +84,9 @@ More information about this new feature is available in the section < newValues = new HashMap<>(scenario.cluster.getSimNodeStateProvider().simGetNodeValues(node)); Review comment: +1, this fixed a genuine bug in < 8.6.0 ## File path: solr/solrj/src/java/org/apache/solr/client/solrj/cloud/autoscaling/Clause.java ## @@ -686,7 +686,7 @@ boolean isShardAbsent() { for (Row r : session.matrix) { computedValueEvaluator.node = r.node; SealedClause sealedClause = getSealedClause(computedValueEvaluator); -if (!sealedClause.getGlobalTag().isPass(r)) { +if (r.isLive() && !sealedClause.getGlobalTag().isPass(r)) { Review comment: +1, this was a genuine bug in 8.5. ## File path: solr/solr-ref-guide/src/solr-upgrade-notes.adoc ## @@ -34,17 +34,29 @@ Detailed steps for upgrading a Solr cluster are in the section <> below. -=== Solr 8.6.1 +=== Solr 8.6.1 (Upgrading from 8.6.0 only) + +See the https://cwiki.apache.org/confluence/display/SOLR/ReleaseNote861[8.6.1 Release Notes^] +for an overview of the fixes included in Solr 8.6.1. + +When upgrading to 8.6.1 users should be aware of the following major changes from 8.6.0. *Autoscaling* * As mentioned in the 8.6 upgrade notes, a default autoscaling policy was provided starting in 8.6.0. This default autoscaling policy resulted in increasingly slow collection creation calls in large clusters (50+ collections). + In 8.6.1 the default autoscaling policy has been removed, and clouds will not use autoscaling unless a policy has explicitly been created. -In order to fix the performance degradations introduced in 8.6.0, merely upgrade to 8.6.1. +If your cloud is running 8.6.0 and **not using an explicit autoscaling policy**, upgrade to 8.6.1 and remove the default cluster policy and preferences via the following command. +Replace `localhost:8983` with your Solr endpoint. ++ +``` +curl -X POST -H 'Content-type:application/json' -d '{set-cluster-policy : [], set-cluster-preferences : []}' http://localhost:8983/api/cluster/autoscaling Review comment: Setting preferences to `[]` actually removes the default preferences that have always been implicitly present, so this changes the behavior as compared with 8.5 and earlier. We should only reset the cluster_policy to `[]`. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
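Following that last remark, the corrected command in the upgrade note would presumably reset only the policy, keeping the shape of the command quoted above:
```
curl -X POST -H 'Content-type:application/json' -d '{set-cluster-policy : []}' http://localhost:8983/api/cluster/autoscaling
```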
[GitHub] [lucene-solr] murblanc commented on pull request #1684: SOLR-14613: strongly typed initial proposal for plugin interface
murblanc commented on pull request #1684: URL: https://github.com/apache/lucene-solr/pull/1684#issuecomment-670099928 I have implemented the cluster state abstractions and added some (naive and temporary) wiring to select this assign strategy. Many parts are still missing, so this can't be merged at this stage; it is work in progress. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] tflobbe commented on a change in pull request #1602: SOLR-14582: Expose IWC.setMaxCommitMergeWaitMillis in Solr's index config
tflobbe commented on a change in pull request #1602: URL: https://github.com/apache/lucene-solr/pull/1602#discussion_r466606166 ## File path: solr/core/src/java/org/apache/solr/update/SolrIndexConfig.java ## @@ -68,6 +68,19 @@ public final double ramBufferSizeMB; public final int ramPerThreadHardLimitMB; + /** Review comment: I didn't want to drop the link/see by itself, so I added a single line introducing the functionality. After that I added specifics about Solr (solrconfig configuration and the warning setting). I honestly don't see the problem here. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] tflobbe commented on a change in pull request #1602: SOLR-14582: Expose IWC.setMaxCommitMergeWaitMillis in Solr's index config
tflobbe commented on a change in pull request #1602: URL: https://github.com/apache/lucene-solr/pull/1602#discussion_r466603545 ## File path: solr/core/src/test/org/apache/solr/update/SolrIndexConfigTest.java ## @@ -208,4 +210,16 @@ public void testToMap() throws Exception { assertEquals(mSizeExpected, m.size()); } + + public void testMaxCommitMergeWaitSeconds() throws Exception { +SolrConfig sc = new SolrConfig(TEST_PATH().resolve("collection1"), "solrconfig-test-misc.xml"); +assertEquals(-1, sc.indexConfig.maxCommitMergeWaitMillis); +assertEquals(IndexWriterConfig.DEFAULT_MAX_COMMIT_MERGE_WAIT_MILLIS, sc.indexConfig.toIndexWriterConfig(h.getCore()).getMaxCommitMergeWaitMillis()); +System.setProperty("solr.tests.maxCommitMergeWait", "10"); +sc = new SolrConfig(TEST_PATH().resolve("collection1"), "solrconfig-test-misc.xml"); +assertEquals(10, sc.indexConfig.maxCommitMergeWaitMillis); +assertEquals(10, sc.indexConfig.toIndexWriterConfig(h.getCore()).getMaxCommitMergeWaitMillis()); +System.clearProperty("solr.tests.maxCommitMergeWait"); Review comment: Yes, I checked. I believe the cleanup is after the class. I discussed with @madrob, and I'll move the cleanup to a `tearDown()` method This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
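The pattern being agreed on, sketched against the test above (assuming the test class extends the usual Solr test base whose `tearDown` runs as JUnit's `@After`; a sketch, not the final patch):
```java
@Override
public void tearDown() throws Exception {
  // Clearing in tearDown() (rather than at the end of the test body) guarantees the
  // property is removed even when an assertion earlier in the test fails.
  System.clearProperty("solr.tests.maxCommitMergeWait");
  super.tearDown();
}
```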
[GitHub] [lucene-solr] tflobbe commented on pull request #1719: SOLR-14702: Remove Master and Slave from Code Base and Docs (8.x)
tflobbe commented on pull request #1719: URL: https://github.com/apache/lucene-solr/pull/1719#issuecomment-670093537 > I thought was changed in my PR... They were; my point is that I didn't re-set these to legacy terms (after I backported your commit) because in this particular case we don't need it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] tflobbe commented on a change in pull request #1719: SOLR-14702: Remove Master and Slave from Code Base and Docs (8.x)
tflobbe commented on a change in pull request #1719: URL: https://github.com/apache/lucene-solr/pull/1719#discussion_r466597411 ## File path: solr/core/src/java/org/apache/solr/cloud/ReplicateFromLeader.java ## @@ -76,12 +76,12 @@ public void startReplication(boolean switchTransactionLog) throws InterruptedExc } log.info("Will start replication from leader with poll interval: {}", pollIntervalStr ); - NamedList slaveConfig = new NamedList<>(); - slaveConfig.add("fetchFromLeader", Boolean.TRUE); - slaveConfig.add(ReplicationHandler.SKIP_COMMIT_ON_MASTER_VERSION_ZERO, switchTransactionLog); - slaveConfig.add("pollInterval", pollIntervalStr); + NamedList followerConfig = new NamedList<>(); + followerConfig.add("fetchFromLeader", Boolean.TRUE); + followerConfig.add(ReplicationHandler.SKIP_COMMIT_ON_LEADER_VERSION_ZERO, switchTransactionLog); + followerConfig.add("pollInterval", pollIntervalStr); NamedList replicationConfig = new NamedList<>(); - replicationConfig.add("slave", slaveConfig); + replicationConfig.add("follower", followerConfig); Review comment: For 8.x branches, every time we set a parameter or configuration (for requests, etc) we use the legacy names. In this particular case, I'm using the new parameters even for setting, and the reason is that this is a configuration that is set and read locally only. The TLOG/PULL replicas create this configuration to set up their replication process. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14702) Remove Master and Slave from Code Base and Docs
[ https://issues.apache.org/jira/browse/SOLR-14702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172583#comment-17172583 ] Tomas Eduardo Fernandez Lobbe commented on SOLR-14702: -- Thanks Cassandra. Then, the followup tasks are all in progress here: * Update upgrade notes: https://github.com/apache/lucene-solr/pull/1718 * Make 8.x version of the PR: https://github.com/apache/lucene-solr/pull/1719. I'll wait to merge that one until we get some of the testing Marcus is working on: https://issues.apache.org/jira/browse/SOLR-14708 * Rename the "master/slave mode": https://issues.apache.org/jira/browse/SOLR-14716 > Remove Master and Slave from Code Base and Docs > --- > > Key: SOLR-14702 > URL: https://issues.apache.org/jira/browse/SOLR-14702 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: master (9.0) >Reporter: Marcus Eagan >Priority: Critical > Attachments: SOLR-14742-testfix.patch > > Time Spent: 16h 10m > Remaining Estimate: 0h > > Every time I read _master_ and _slave_, I get pissed. > I think about the last and only time I remember visiting my maternal great > grandpa in Alabama at four years old. He was a sharecropper before WWI, where > he lost his legs, and then he was back to being a sharecropper somehow after > the war. Crazy, I know. I don't know if the world still called his job > sharecropping in 1993, but he was basically a slave—in America. He lived in > the same shack that his father, and his grandfather (born a slave) lived in > down in Alabama. Believe it or not, my dad's (born in 1926) grandfather was > actually born a slave, freed shortly after birth by his owner father. I never > met him, though. He died in the 40s. > Anyway, I cannot police all terms in the repo and do not wish to. This > master/slave shit is archaic and misleading on technical grounds. Thankfully, > there's only a handful of files in code and documentation that still talk > about masters and slaves. We should replace all of them. > There are so many ways to reword it. In fact, unless anyone else objects or > wants to do the grunt work to help my stress levels, I will open the pull > request myself in effort to make this project and community more inviting to > people of all backgrounds and histories. We can have leader/follower, or > primary/secondary, but none of this Master/Slave nonsense. I'm sick of the > garbage. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1602: SOLR-14582: Expose IWC.setMaxCommitMergeWaitMillis in Solr's index config
dsmiley commented on a change in pull request #1602: URL: https://github.com/apache/lucene-solr/pull/1602#discussion_r466589294 ## File path: solr/core/src/test/org/apache/solr/update/SolrIndexConfigTest.java ## @@ -208,4 +210,16 @@ public void testToMap() throws Exception { assertEquals(mSizeExpected, m.size()); } + + public void testMaxCommitMergeWaitSeconds() throws Exception { +SolrConfig sc = new SolrConfig(TEST_PATH().resolve("collection1"), "solrconfig-test-misc.xml"); +assertEquals(-1, sc.indexConfig.maxCommitMergeWaitMillis); +assertEquals(IndexWriterConfig.DEFAULT_MAX_COMMIT_MERGE_WAIT_MILLIS, sc.indexConfig.toIndexWriterConfig(h.getCore()).getMaxCommitMergeWaitMillis()); +System.setProperty("solr.tests.maxCommitMergeWait", "10"); +sc = new SolrConfig(TEST_PATH().resolve("collection1"), "solrconfig-test-misc.xml"); +assertEquals(10, sc.indexConfig.maxCommitMergeWaitMillis); +assertEquals(10, sc.indexConfig.toIndexWriterConfig(h.getCore()).getMaxCommitMergeWaitMillis()); +System.clearProperty("solr.tests.maxCommitMergeWait"); Review comment: Please verify if this is so before removing. Perhaps there may be a difference in behavior between gradle & IntelliJ; so try both. I thought clearing wasn't necessary as well but I recall hearing from someone (@dweiss ?) that the auto-clearing wasn't working. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1602: SOLR-14582: Expose IWC.setMaxCommitMergeWaitMillis in Solr's index config
dsmiley commented on a change in pull request #1602: URL: https://github.com/apache/lucene-solr/pull/1602#discussion_r466587705 ## File path: solr/core/src/java/org/apache/solr/update/SolrIndexConfig.java ## @@ -129,8 +143,9 @@ public SolrIndexConfig(SolrConfig solrConfig, String prefix, SolrIndexConfig def true); useCompoundFile = solrConfig.getBool(prefix+"/useCompoundFile", def.useCompoundFile); - maxBufferedDocs=solrConfig.getInt(prefix+"/maxBufferedDocs",def.maxBufferedDocs); +maxBufferedDocs = solrConfig.getInt(prefix+"/maxBufferedDocs", def.maxBufferedDocs); ramBufferSizeMB = solrConfig.getDouble(prefix+"/ramBufferSizeMB", def.ramBufferSizeMB); +maxCommitMergeWaitMillis = solrConfig.getInt(prefix+"/maxCommitMergeWait", def.maxCommitMergeWaitMillis); Review comment: The "Wait" part is important too. "maxCommitMergeWaitTime" is my preference. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] tflobbe commented on pull request #1718: SOLR-14702: Add Upgrade Notes and CHANGES entry
tflobbe commented on pull request #1718: URL: https://github.com/apache/lucene-solr/pull/1718#issuecomment-670083339 Ah, good point. I seemed to remember some change to the release notes recently, but I couldn't remember what it was. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] tflobbe commented on a change in pull request #1602: SOLR-14582: Expose IWC.setMaxCommitMergeWaitMillis in Solr's index config
tflobbe commented on a change in pull request #1602: URL: https://github.com/apache/lucene-solr/pull/1602#discussion_r466565653 ## File path: solr/core/src/java/org/apache/solr/update/SolrIndexConfig.java ## @@ -129,8 +143,9 @@ public SolrIndexConfig(SolrConfig solrConfig, String prefix, SolrIndexConfig def true); useCompoundFile = solrConfig.getBool(prefix+"/useCompoundFile", def.useCompoundFile); - maxBufferedDocs=solrConfig.getInt(prefix+"/maxBufferedDocs",def.maxBufferedDocs); +maxBufferedDocs = solrConfig.getInt(prefix+"/maxBufferedDocs", def.maxBufferedDocs); ramBufferSizeMB = solrConfig.getDouble(prefix+"/ramBufferSizeMB", def.ramBufferSizeMB); +maxCommitMergeWaitMillis = solrConfig.getInt(prefix+"/maxCommitMergeWait", def.maxCommitMergeWaitMillis); Review comment: I had it with `MIllis` and I removed that because I didn't see anything else in the solrconfig that included the unit for time (it's always milliseconds). You are suggesting `maxCommitMergeWaitMillis` or `maxCommitMergeTime`? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] tflobbe commented on a change in pull request #1602: SOLR-14582: Expose IWC.setMaxCommitMergeWaitMillis in Solr's index config
tflobbe commented on a change in pull request #1602: URL: https://github.com/apache/lucene-solr/pull/1602#discussion_r466561784 ## File path: solr/core/src/java/org/apache/solr/update/SolrIndexConfig.java ## @@ -87,6 +100,7 @@ private SolrIndexConfig(SolrConfig solrConfig) { maxBufferedDocs = -1; ramBufferSizeMB = 100; ramPerThreadHardLimitMB = -1; +maxCommitMergeWaitMillis = -1; Review comment: I don't think that's a good idea. "-1" from the Solr perspective means "don't set the value". If someone has a custom merge policy where they have a different value set by code, we would be changing it to `org.apache.lucene.index.IndexWriterConfig#DEFAULT_MAX_COMMIT_MERGE_WAIT_MILLIS` without them having set anything on `solrconfig.xml`. > In the future, this constant might change, and that's good. That's fine, since we aren't calling `iwc.setMaxCommitMergeWaitMillis(...)` in the default case, we'll be taking the new default for all cores where no alternative configuration has been provided. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
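Condensed, the semantics described here amount to something like the sketch below (not the literal Solr code; `setMaxCommitMergeWaitMillis` is the IndexWriterConfig setter named in the PR title):
```java
class IndexWriterConfigWiringSketch {
  // -1 means "not configured in solrconfig.xml": leave the IndexWriterConfig untouched,
  // so Lucene's current default (or a custom merge policy's own value) wins.
  static void applyMaxCommitMergeWait(org.apache.lucene.index.IndexWriterConfig iwc,
                                      int maxCommitMergeWaitMillis) {
    if (maxCommitMergeWaitMillis > 0) {
      iwc.setMaxCommitMergeWaitMillis(maxCommitMergeWaitMillis);
    }
  }
}
```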
[jira] [Commented] (LUCENE-8626) standardise test class naming
[ https://issues.apache.org/jira/browse/LUCENE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172545#comment-17172545 ] Erick Erickson commented on LUCENE-8626: Contrariwise, I prefer suffixing on the theory that when looking at a directory, TestFoo, TestBar, TestBlivet require me to read past the "Test" before being able to see the class name, whereas FooTest, BarTest, BlivetTest are easier on the eyes. That said, I'll abide by whatever the person leading the charge decides; Michael Sokolov's comment about valuing consistency is germane. > standardise test class naming > - > > Key: LUCENE-8626 > URL: https://issues.apache.org/jira/browse/LUCENE-8626 > Project: Lucene - Core > Issue Type: Test >Reporter: Christine Poerschke >Priority: Major > Attachments: SOLR-12939.01.patch, SOLR-12939.02.patch, > SOLR-12939.03.patch, SOLR-12939_hoss_validation_groovy_experiment.patch > > > This was mentioned and proposed on the dev mailing list. Starting this ticket > here to start to make it happen? > History: This ticket was created as > https://issues.apache.org/jira/browse/SOLR-12939 ticket and then got > JIRA-moved to become https://issues.apache.org/jira/browse/LUCENE-8626 ticket. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-2458) queryparser makes all CJK queries phrase queries regardless of analyzer
[ https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172666#comment-17172540 ] Mr. Aleem commented on LUCENE-2458: --- As Koji noticed, it looks like the commit accidentally changed Solr's default behavior (i.e., it was not the last attached patch, but the commit). A query for pdp-11 now results in text:pdp OR text:11 instead of text:"pdp 11". Maybe we should change SolrQueryParser to use Version == LUCENE_24 (LUCENE_29 would also work). > queryparser makes all CJK queries phrase queries regardless of analyzer > --- > > Key: LUCENE-2458 > URL: https://issues.apache.org/jira/browse/LUCENE-2458 > Project: Lucene - Core > Issue Type: Bug > Components: core/queryparser >Reporter: Robert Muir >Assignee: Robert Muir >Priority: Blocker > Fix For: 3.1, 4.0-ALPHA > > Attachments: LUCENE-2458.patch, LUCENE-2458.patch, LUCENE-2458.patch, > LUCENE-2458.patch > > > The queryparser automatically makes *ALL* CJK, Thai, Lao, Myanmar, Tibetan, > ... queries into phrase queries, even though you didn't ask for one, and > there isn't a way to turn this off. > This completely breaks lucene for these languages, as it treats all queries > like 'grep'. > Example: if you query for f:abcd with standardanalyzer, where a,b,c,d are > chinese characters, you get a phrasequery of "a b c d". if you use cjk > analyzer, its no better, its a phrasequery of "ab bc cd", and if you use > smartchinese analyzer, you get a phrasequery like "ab cd". But the user > didn't ask for one, and they cannot turn it off. > The reason is that the code to form phrase queries is not internationally > appropriate and assumes whitespace tokenization. If more than one token comes > out of whitespace delimited text, its automatically a phrase query no matter > what. > The proposed patch fixes the core queryparser (with all backwards compat > kept) to only form phrase queries when the double quote operator is used. > Implementing subclasses can always extend the QP and auto-generate whatever > kind of queries they want that might completely break search for languages > they don't care about, but core general-purpose QPs should be language > independent. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9447) Make BEST_COMPRESSION compress more aggressively?
[ https://issues.apache.org/jira/browse/LUCENE-9447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172510#comment-17172510 ] Robert Muir commented on LUCENE-9447: - From my experiments on LUCENE-6100, increasing the block size is for the most part a workaround: it only helps hide the waste of rebooting the deflate dictionary from *scratch* for every block. That is what the "crappy preset" experiment there tried to show. Sadly I couldn't find a decent/simple way to use a preset dictionary that made me happy with what is available in the JDK. It doesn't expose some zlib methods that you would need (e.g. retrieving the current dictionary) and, as I mentioned, there were some inefficiencies, at least with how we had compression hooked in at the time. > Make BEST_COMPRESSION compress more aggressively? > - > > Key: LUCENE-9447 > URL: https://issues.apache.org/jira/browse/LUCENE-9447 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Adrien Grand >Priority: Minor > > The Lucene86 codec supports setting a "Mode" for stored fields compression, > that is either "BEST_SPEED", which translates to blocks of 16kB or 128 > documents (whichever is hit first) compressed with LZ4, or > "BEST_COMPRESSION", which translates to blocks of 60kB or 512 documents > compressed with DEFLATE with default compression level (6). > After looking at indices that spent most disk space on stored fields > recently, I noticed that there was quite some room for improvement by > increasing the block size even further:
> ||Block size||Stored fields size||
> |60kB|168412338|
> |128kB|130813639|
> |256kB|113587009|
> |512kB|104776378|
> |1MB|100367095|
> |2MB|98152464|
> |4MB|97034425|
> |8MB|96478746|
> For this specific dataset, I had 1M documents that each had about 2kB of > stored fields and quite some redundancy. > This makes me want to look into bumping this block size to maybe 256kB. It > would be interesting to re-do the experiments we did on LUCENE-6100 to see > how this affects the merging speed. That said I don't think it would be > terrible if the merging time increased a bit given that we already offer the > BEST_SPEED option for CPU-savvy users. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
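For readers following along: the JDK does let you seed a block with {{Deflater.setDictionary}}; what it lacks is a way to read the evolving dictionary back out of a running stream, which is why each block must be seeded from bytes the writer still has on hand. A minimal sketch of the preset-dictionary idea (illustration only, not the Lucene codec code):

{code:java}
import java.util.Arrays;
import java.util.zip.Deflater;

public class PresetDictionarySketch {
  /**
   * Compresses {@code block} as its own deflate stream, but seeds the
   * history buffer with {@code prevBlock} instead of starting from scratch.
   */
  static byte[] compressBlock(Deflater deflater, byte[] prevBlock, byte[] block) {
    deflater.reset(); // independent stream per block
    if (prevBlock != null) {
      deflater.setDictionary(prevBlock); // preset dictionary, set before any input
    }
    deflater.setInput(block);
    deflater.finish();
    byte[] out = new byte[64];
    int len = 0;
    while (!deflater.finished()) {
      if (len == out.length) {
        out = Arrays.copyOf(out, out.length * 2);
      }
      len += deflater.deflate(out, len, out.length - len);
    }
    return Arrays.copyOf(out, len);
  }
}
{code}

On the read side, {{Inflater.needsDictionary()}} / {{Inflater.setDictionary}} mirror this, so decompressing a block only needs that block plus its seed bytes.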
[jira] [Assigned] (SOLR-14582) Expose IWC.setMaxCommitMergeWaitMillis as an expert feature in Solr's index config
[ https://issues.apache.org/jira/browse/SOLR-14582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley reassigned SOLR-14582: --- Assignee: Tomas Eduardo Fernandez Lobbe (was: David Smiley) > Expose IWC.setMaxCommitMergeWaitMillis as an expert feature in Solr's index > config > -- > > Key: SOLR-14582 > URL: https://issues.apache.org/jira/browse/SOLR-14582 > Project: Solr > Issue Type: Improvement >Reporter: Tomas Eduardo Fernandez Lobbe >Assignee: Tomas Eduardo Fernandez Lobbe >Priority: Trivial > Time Spent: 20m > Remaining Estimate: 0h > > LUCENE-8962 added the ability to merge segments synchronously on commit. This > isn't done by default and the default {{MergePolicy}} won't do it, but custom > merge policies can take advantage of this. Solr allows plugging in custom > merge policies, so if someone wants to make use of this feature they could, > however, they need to set {{IndexWriterConfig.maxCommitMergeWaitSeconds}} to > something greater than 0. > Since this is an expert feature, I plan to document it only in javadoc and > not the ref guide. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1602: SOLR-14582: Expose IWC.setMaxCommitMergeWaitMillis in Solr's index config
dsmiley commented on a change in pull request #1602: URL: https://github.com/apache/lucene-solr/pull/1602#discussion_r466530152 ## File path: solr/core/src/java/org/apache/solr/update/SolrIndexConfig.java ## @@ -129,8 +143,9 @@ public SolrIndexConfig(SolrConfig solrConfig, String prefix, SolrIndexConfig def true); useCompoundFile = solrConfig.getBool(prefix+"/useCompoundFile", def.useCompoundFile); - maxBufferedDocs=solrConfig.getInt(prefix+"/maxBufferedDocs",def.maxBufferedDocs); +maxBufferedDocs = solrConfig.getInt(prefix+"/maxBufferedDocs", def.maxBufferedDocs); ramBufferSizeMB = solrConfig.getDouble(prefix+"/ramBufferSizeMB", def.ramBufferSizeMB); +maxCommitMergeWaitMillis = solrConfig.getInt(prefix+"/maxCommitMergeWait", def.maxCommitMergeWaitMillis); Review comment: I noticed you omitted "Millis" from the config setting's name. I think you should either add it for clarity, or, for consistency with some other time settings I see in solrconfig (e.g. maxTime), use the suffix "Time". This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1602: SOLR-14582: Expose IWC.setMaxCommitMergeWaitMillis in Solr's index config
dsmiley commented on a change in pull request #1602: URL: https://github.com/apache/lucene-solr/pull/1602#discussion_r466526966 ## File path: solr/core/src/java/org/apache/solr/update/SolrIndexConfig.java ## @@ -87,6 +100,7 @@ private SolrIndexConfig(SolrConfig solrConfig) { maxBufferedDocs = -1; ramBufferSizeMB = 100; ramPerThreadHardLimitMB = -1; +maxCommitMergeWaitMillis = -1; Review comment: Let's use `org.apache.lucene.index.IndexWriterConfig#DEFAULT_MAX_COMMIT_MERGE_WAIT_MILLIS`. In the future, this constant might change, and that's good. ## File path: solr/core/src/java/org/apache/solr/update/SolrIndexConfig.java ## @@ -68,6 +68,19 @@ public final double ramBufferSizeMB; public final int ramPerThreadHardLimitMB; + /** Review comment: I appreciate documentation but _redundant_ documentation -- not so much. Can't you do a @see or @link to one place -- `org.apache.lucene.index.IndexWriterConfig#setMaxCommitMergeWaitMillis`? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
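For context, the Lucene-side knob this maps onto looks roughly like the sketch below (setter and constant names as discussed in this PR; the Solr wiring itself is what the PR adds):

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriterConfig;

class MaxCommitMergeWaitSketch {
  static IndexWriterConfig configure() {
    IndexWriterConfig iwc = new IndexWriterConfig(new StandardAnalyzer());
    // Only has an effect if a custom MergePolicy returns merges on commit;
    // the stock default lives in IndexWriterConfig.DEFAULT_MAX_COMMIT_MERGE_WAIT_MILLIS.
    iwc.setMaxCommitMergeWaitMillis(1000);
    return iwc;
  }
}
```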
[jira] [Commented] (SOLR-14582) Expose IWC.setMaxCommitMergeWaitMillis as an expert feature in Solr's index config
[ https://issues.apache.org/jira/browse/SOLR-14582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172497#comment-17172497 ] Tomas Eduardo Fernandez Lobbe commented on SOLR-14582: -- David, there is a PR for this that I forgot to merge (don't know why it's not linked here): https://github.com/apache/lucene-solr/pull/1602 > Expose IWC.setMaxCommitMergeWaitMillis as an expert feature in Solr's index > config > -- > > Key: SOLR-14582 > URL: https://issues.apache.org/jira/browse/SOLR-14582 > Project: Solr > Issue Type: Improvement >Reporter: Tomas Eduardo Fernandez Lobbe >Assignee: David Smiley >Priority: Trivial > > LUCENE-8962 added the ability to merge segments synchronously on commit. This > isn't done by default and the default {{MergePolicy}} won't do it, but custom > merge policies can take advantage of this. Solr allows plugging in custom > merge policies, so if someone wants to make use of this feature they could, > however, they need to set {{IndexWriterConfig.maxCommitMergeWaitSeconds}} to > something greater than 0. > Since this is an expert feature, I plan to document it only in javadoc and > not the ref guide. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Assigned] (SOLR-14582) Expose IWC.setMaxCommitMergeWaitMillis as an expert feature in Solr's index config
[ https://issues.apache.org/jira/browse/SOLR-14582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley reassigned SOLR-14582: --- Assignee: David Smiley > Expose IWC.setMaxCommitMergeWaitMillis as an expert feature in Solr's index > config > -- > > Key: SOLR-14582 > URL: https://issues.apache.org/jira/browse/SOLR-14582 > Project: Solr > Issue Type: Improvement >Reporter: Tomas Eduardo Fernandez Lobbe >Assignee: David Smiley >Priority: Trivial > > LUCENE-8962 added the ability to merge segments synchronously on commit. This > isn't done by default and the default {{MergePolicy}} won't do it, but custom > merge policies can take advantage of this. Solr allows plugging in custom > merge policies, so if someone wants to make use of this feature they could, > however, they need to set {{IndexWriterConfig.maxCommitMergeWaitSeconds}} to > something greater than 0. > Since this is an expert feature, I plan to document it only in javadoc and > not the ref guide. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] dsmiley opened a new pull request #1723: SOLR prometheus: simplify concurrent collection
dsmiley opened a new pull request #1723: URL: https://github.com/apache/lucene-solr/pull/1723 The intent of this is to simplify some concurrent code in the Prometheus exporter that I think is too confusing / contorted -- particularly Async.java. Git blame points at @shalinmangar so I would love a review to see what you think. I played with a few different approaches, and ultimately realized that the existing code works around having a plain Executor instead of an ExecutorService, which would let us benefit from invokeAll. I wish Java didn't have a distinction between Executor & ExecutorService, but there is one, and IMO we should all just ignore plain Executor. I haven't run this in the field, but I could do so locally. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
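To illustrate the simplification (a sketch, not the PR's exact code; the type parameter stands in for the exporter's per-task result type): with an ExecutorService, "run these N tasks and wait for all of them" is one call, which is exactly the bookkeeping Async.java had to hand-roll on top of a bare Executor.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

class ScrapeAll {
  // Blocks until every task has completed; each Future then holds either
  // the task's result or the exception it threw.
  static <T> List<Future<T>> collectAll(ExecutorService executor, List<Callable<T>> tasks)
      throws InterruptedException {
    return executor.invokeAll(tasks);
  }
}
```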
[jira] [Commented] (SOLR-14654) Remove plugin loading from .system collection (for 9.0)
[ https://issues.apache.org/jira/browse/SOLR-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172459#comment-17172459 ] ASF subversion and git services commented on SOLR-14654: Commit 35bf1785ec2f4131694cf7f23a139dbb7291cc7c in lucene-solr's branch refs/heads/master from Cassandra Targett [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=35bf178 ] SOLR-14654: actually fix the Ref Guide build failure > Remove plugin loading from .system collection (for 9.0) > --- > > Key: SOLR-14654 > URL: https://issues.apache.org/jira/browse/SOLR-14654 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Noble Paul >Priority: Major > Fix For: master (9.0) > > Time Spent: 0.5h > Remaining Estimate: 0h > > This code must go from master > all places where "runtimeLib" can be used will be removed from 9.0. With > the new package system in place we don't need this anymore -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] HoustonPutman commented on pull request #1718: SOLR-14702: Add Upgrade Notes and CHANGES entry
HoustonPutman commented on pull request #1718: URL: https://github.com/apache/lucene-solr/pull/1718#issuecomment-669971947 I think adding something in the ref guide upgrade notes too would be worthwhile. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14716) Ref Guide: update leader/follower terminology
[ https://issues.apache.org/jira/browse/SOLR-14716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cassandra Targett updated SOLR-14716: - Description: The effort to remove oppressive terminology in SOLR-14702 led to somewhat awkward phrasing on how to refer to non-SolrCloud configurations, specifically "leader/follower mode", which is potentially very confusing since SolrCloud also has leaders and one could consider replicas to be followers. I propose that we standardize what we call these two modes as "coordinated mode" (SolrCloud) and "uncoordinated mode" (or "non-coordinated" if people prefer). I chose this because what really differentiates the two approaches is the ZooKeeper coordination for requests, configs, etc. There are other differences too, of course, but that's the biggest one that stuck out to me as a key differentiator and applicable in the naming. There are also places in the Ref Guide where we refer to "standalone mode", which in many cases means "any cluster not running SolrCloud". This has always been problematic, because the word "standalone" implies a single node, but it's of course pretty much always been possible to have a cluster of multiple nodes that don't run SolrCloud/ZK. This issue would address those examples also. Note that I'm not proposing replacing the word "SolrCloud" throughout the documentation. Instead I'll augment the use of the word "SolrCloud" with clarification that this term means "coordinated mode". Later if we ever replace SolrCloud references in code and fully remove that name, the conceptual groundwork will have already been laid for users. was: The effort to remove oppressive terminology in SOLR-14702 led to somewhat awkward phrasing on how to refer to non-SolrCloud configurations, specifically "leader/follower mode", which is potentially very confusing since SolrCloud also has leaders and one could consider replicas to be followers. I propose that we standardize what we call these two modes as "coordinated mode" (SolrCloud) and "uncoordinated mode" (or "non-coordinated" if people prefer). I chose this because what really differentiates the two approaches is the ZooKeeper coordination for requests. There are other differences too, of course, but that's the biggest one that stuck out to me as a key differentiator and applicable in the naming. There are also places in the Ref Guide where we refer to "standalone mode", which in many cases means "any cluster not running SolrCloud". This has always been problematic, because the word "standalone" implies a single node, but it's of course pretty much always been possible to have a cluster of multiple nodes that don't run SolrCloud/ZK. This issue would address those examples also. Note that I'm not proposing replacing the word "SolrCloud" throughout the documentation. Instead I'll augment the use of the word "SolrCloud" with clarification that this term means "coordinated mode". Later if we ever replace SolrCloud references in code and fully remove that name, the conceptual groundwork will have already been laid for users. > Ref Guide: update leader/follower terminology > - > > Key: SOLR-14716 > URL: https://issues.apache.org/jira/browse/SOLR-14716 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. 
Issues are Public) > Components: documentation >Reporter: Cassandra Targett >Priority: Major > > The effort to remove oppressive terminology in SOLR-14702 led to somewhat > awkward phrasing on how to refer to non-SolrCloud configurations, > specifically "leader/follower mode", which is potentially very confusing > since SolrCloud also has leaders and one could consider replicas to be > followers. > I propose that we standardize what we call these two modes as "coordinated > mode" (SolrCloud) and "uncoordinated mode" (or "non-coordinated" if people > prefer). I chose this because what really differentiates > the two approaches is the ZooKeeper coordination for requests, configs, etc. > There are other differences too, of course, but that's the biggest one that > stuck out to me as a key differentiator and applicable in the naming. > There are also places in the Ref Guide where we refer to "standalone mode", > which in many cases means "any cluster not running SolrCloud". This has > always been problematic, because the word "standalone" implies a single node, > but it's of course pretty much always been possible to have a cluster of > multiple nodes that don't run SolrCloud/ZK. This issue would address those > examples also. > Note that I'm not proposing replacing the word "SolrCloud" throughout the > documentation. Instead I'll augment the use of the word "SolrCloud" with > clarification that this term means "coordinated mode". Later if we ever replace > SolrCloud references in code and fully remove that name, the conceptual > groundwork will have already been laid for users.
[jira] [Created] (SOLR-14716) Ref Guide: update leader/follower terminology
Cassandra Targett created SOLR-14716: Summary: Ref Guide: update leader/follower terminology Key: SOLR-14716 URL: https://issues.apache.org/jira/browse/SOLR-14716 Project: Solr Issue Type: Improvement Security Level: Public (Default Security Level. Issues are Public) Components: documentation Reporter: Cassandra Targett The effort to remove oppressive terminology in SOLR-14702 led to somewhat awkward phrasing on how to refer to non-SolrCloud configurations, specifically "leader/follower mode", which is potentially very confusing since SolrCloud also has leaders and one could consider replicas to be followers. I propose that we standardize what we call these two modes as "coordinated mode" (SolrCloud) and "uncoordinated mode" (or "non-coordinated" if people prefer). I chose this because what really differentiates the two approaches is the ZooKeeper coordination for requests. There are other differences too, of course, but that's the biggest one that stuck out to me as a key differentiator and applicable in the naming. There are also places in the Ref Guide where we refer to "standalone mode", which in many cases means "any cluster not running SolrCloud". This has always been problematic, because the word "standalone" implies a single node, but it's of course pretty much always been possible to have a cluster of multiple nodes that don't run SolrCloud/ZK. This issue would address those examples also. Note that I'm not proposing replacing the word "SolrCloud" throughout the documentation. Instead I'll augment the use of the word "SolrCloud" with clarification that this term means "coordinated mode". Later if we ever replace SolrCloud references in code and fully remove that name, the conceptual groundwork will have already been laid for users. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14654) Remove plugin loading from .system collection (for 9.0)
[ https://issues.apache.org/jira/browse/SOLR-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172368#comment-17172368 ] Cassandra Targett commented on SOLR-14654: -- That didn't fix it, [~noble.paul]; it's the reference in the {{page-children}} section at the top of the file that creates the page hierarchy. That's where it needs to be removed. > Remove plugin loading from .system collection (for 9.0) > --- > > Key: SOLR-14654 > URL: https://issues.apache.org/jira/browse/SOLR-14654 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Noble Paul >Priority: Major > Fix For: master (9.0) > > Time Spent: 0.5h > Remaining Estimate: 0h > > This code must go from master > all places where "runtimeLib" can be used will be removed from 9.0. With > the new package system in place we don't need this anymore -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14654) Remove plugin loading from .system collection (for 9.0)
[ https://issues.apache.org/jira/browse/SOLR-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172360#comment-17172360 ] ASF subversion and git services commented on SOLR-14654: Commit ddbe9495fc4e3348dd4db653eb01c3f62c1e1a10 in lucene-solr's branch refs/heads/master from noblepaul [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=ddbe949 ] SOLR-14654: ref-guide build failure > Remove plugin loading from .system collection (for 9.0) > --- > > Key: SOLR-14654 > URL: https://issues.apache.org/jira/browse/SOLR-14654 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Noble Paul >Priority: Major > Fix For: master (9.0) > > Time Spent: 0.5h > Remaining Estimate: 0h > > This code must go from master > all places where "runtimeLib" can be used will be removed from 9.0. With > the new package system in place we don't need this anymore -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] jpountz edited a comment on pull request #1543: LUCENE-9378: Disable compression on binary values whose length is less than 32.
jpountz edited a comment on pull request #1543: URL: https://github.com/apache/lucene-solr/pull/1543#issuecomment-669927391 > But, couldn't we instead just subclass Lucene's default codec, override {{getDocValuesFormatPerField}} to subclass {{Lucene80DocValuesFormat}} (oh, I see, yeah we cannot do that -- this class is final, which makes sense). I was thinking since this (whether to compress each block) is purely a write time decision, it could still be done as Lucene80 doc values format SPI. To me we only guarantee backward compatibility for users of the default codec. With the approach you mentioned, indices would be backward compatible, but I'm seeing this as accidental rather than something we guarantee. > But then I wonder why not just add a boolean compress option to Lucene80DocValuesFormat? This is similar to the compression Mode we pass to stored fields and term vectors format at write time, and it'd allow users who would like to disable BINARY doc values compression to keep backwards compatibility. I wanted to look into whether we could avoid this as it would boil down to maintaining two doc-value formats, but this might be the best way forward as it looks like the heuristics we tried out above don't work well to disable compression for use-cases when it hurts more than it helps. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] jpountz commented on pull request #1543: LUCENE-9378: Disable compression on binary values whose length is less than 32.
jpountz commented on pull request #1543: URL: https://github.com/apache/lucene-solr/pull/1543#issuecomment-669927391 > But, couldn't we instead just subclass Lucene's default codec, override {{getDocValuesFormatPerField}} to subclass {{Lucene80DocValuesFormat}} (oh, I see, yeah we cannot do that -- this class is final, which makes sense). I was thinking since this (whether to compress each block) is purely a write time decision, it could still be done as Lucene80 doc values format SPI. The codec is final, but you can still do the same thing with FilterCodec. To me we only guarantee backward compatibility for users of the default codec. With the approach you mentioned, indices would be backward compatible, but I'm seeing this as accidental rather than something we guarantee. > But then I wonder why not just add a boolean compress option to Lucene80DocValuesFormat? This is similar to the compression Mode we pass to stored fields and term vectors format at write time, and it'd allow users who would like to disable BINARY doc values compression to keep backwards compatibility. I wanted to look into whether we could avoid this as it would boil down to maintaining two doc-value formats, but this might be the best way forward as it looks like the heuristics we tried out above don't work well to disable compression for use-cases when it hurts more than it helps. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
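A rough sketch of the FilterCodec route mentioned above (the codec name and the per-field format choice are hypothetical; and, per the caveat above, the index compatibility of such a codec is something the user owns rather than a Lucene guarantee):

```java
import org.apache.lucene.codecs.DocValuesFormat;
import org.apache.lucene.codecs.FilterCodec;
import org.apache.lucene.codecs.lucene86.Lucene86Codec;
import org.apache.lucene.codecs.perfield.PerFieldDocValuesFormat;

public final class MyCodec extends FilterCodec {
  // Route every field to whichever DocValuesFormat has the desired
  // compression trade-off ("MyUncompressedDV" is a made-up SPI name).
  private final DocValuesFormat docValues = new PerFieldDocValuesFormat() {
    @Override
    public DocValuesFormat getDocValuesFormatForField(String field) {
      return DocValuesFormat.forName("MyUncompressedDV");
    }
  };

  public MyCodec() {
    super("MyCodec", new Lucene86Codec());
  }

  @Override
  public DocValuesFormat docValuesFormat() {
    return docValues;
  }
}
```

Reading such an index back requires the codec to be registered via SPI, which is exactly the "accidental" compatibility being cautioned against.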
[jira] [Commented] (SOLR-14702) Remove Master and Slave from Code Base and Docs
[ https://issues.apache.org/jira/browse/SOLR-14702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172355#comment-17172355 ] Cassandra Targett commented on SOLR-14702: -- bq. I don't know how others will feel about dropping the "SolrCloud" term, someone suggested in the past I think? I didn't plan on wholesale replacing "SolrCloud" as a term, but this could be a first step is doing that. My thought was that I would try to make it clear that "SolrCloud" = "coordinated mode" and what we call SolrCloud really means that. A conceptual shift, as it were. Later when/if we get around to replacing the SolrCloud terminology in the code, we can eradicate the term from the docs. Since I'll be doing this as a separate thing from this issue, I'll file a new Jira and we can discuss what I'm thinking in more detail there to agree on the terminology. > Remove Master and Slave from Code Base and Docs > --- > > Key: SOLR-14702 > URL: https://issues.apache.org/jira/browse/SOLR-14702 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: master (9.0) >Reporter: Marcus Eagan >Priority: Critical > Attachments: SOLR-14742-testfix.patch > > Time Spent: 15h 50m > Remaining Estimate: 0h > > Every time I read _master_ and _slave_, I get pissed. > I think about the last and only time I remember visiting my maternal great > grandpa in Alabama at four years old. He was a sharecropper before WWI, where > he lost his legs, and then he was back to being a sharecropper somehow after > the war. Crazy, I know. I don't know if the world still called his job > sharecropping in 1993, but he was basically a slave—in America. He lived in > the same shack that his father, and his grandfather (born a slave) lived in > down in Alabama. Believe it or not, my dad's (born in 1926) grandfather was > actually born a slave, freed shortly after birth by his owner father. I never > met him, though. He died in the 40s. > Anyway, I cannot police all terms in the repo and do not wish to. This > master/slave shit is archaic and misleading on technical grounds. Thankfully, > there's only a handful of files in code and documentation that still talk > about masters and slaves. We should replace all of them. > There are so many ways to reword it. In fact, unless anyone else objects or > wants to do the grunt work to help my stress levels, I will open the pull > request myself in effort to make this project and community more inviting to > people of all backgrounds and histories. We can have leader/follower, or > primary/secondary, but none of this Master/Slave nonsense. I'm sick of the > garbage. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14557) Unable to parse local params followed by parenthesis like {!lucene}(gigabyte)
[ https://issues.apache.org/jira/browse/SOLR-14557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172345#comment-17172345 ] Mikhail Khludnev commented on SOLR-14557: - The trap syntax prevails at [https://lucene.apache.org/solr/guide/8_6/local-parameters-in-queries.html]; however, the safer one is mentioned there as well. I think it makes sense to ban the {{\\{!prefix}}} syntax from 9.0. > Unable to parse local params followed by parenthesis like {!lucene}(gigabyte) > - > > Key: SOLR-14557 > URL: https://issues.apache.org/jira/browse/SOLR-14557 > Project: Solr > Issue Type: Bug > Components: query parsers >Reporter: Mikhail Khludnev >Assignee: Mikhail Khludnev >Priority: Major > Labels: painful > Attachments: SOLR-14557.patch, SOLR-14557.patch, SOLR-14557.patch > > > h2. Solr 4.5 > {{/select?defType=edismax&q=\{!lucene}(foo)&debugQuery=true}} > > goes like > {code} > \{!lucene}(foo) > content:foo > LuceneQParser > {code} > fine > h2. Solr 8.2 > with luceneMatchVersion=4.5 following SOLR-11501 I know it's a grey zone but > it's a question of migrating existing queries. > {{/select?defType=edismax&q=\{!lucene}(foo)&debugQuery=true}} > goes like > {code} > "querystring":"\{!lucene}(foo)", > "parsedquery":"+DisjunctionMaxQuery(((Project.Address:lucene > Project.Address:foo) | (Project.OwnerType:lucene Project.OwnerType:foo) > "QParser":"ExtendedDismaxQParser", > {code} > blah... > but removing braces in 8.2 works perfectly fine > {code} > "querystring":"\{!lucene}foo", > "parsedquery":"+content:foo", > "parsedquery_toString":"+content:foo", > "QParser":"ExtendedDismaxQParser", > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
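For readers landing here, the two forms at issue, using the gigabyte example from the summary (the second is presumably the "safer" syntax the ref guide also mentions, where the sub-query travels inside the {{v}} local param):

{code}
q={!lucene}(gigabyte)        trap: the query body follows the closing brace
q={!lucene v='(gigabyte)'}   safer: the query body is passed via the v local param
{code}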
[GitHub] [lucene-solr] dweiss commented on a change in pull request #1721: LUCENE-9439: match region highlighter components
dweiss commented on a change in pull request #1721: URL: https://github.com/apache/lucene-solr/pull/1721#discussion_r466404602 ## File path: lucene/highlighter/src/java/org/apache/lucene/search/matchhighlight/MatchRegionRetriever.java ## @@ -0,0 +1,503 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.lucene.search.matchhighlight; + +import org.apache.lucene.analysis.Analyzer; +import org.apache.lucene.analysis.TokenStream; +import org.apache.lucene.analysis.tokenattributes.OffsetAttribute; +import org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute; +import org.apache.lucene.document.Document; +import org.apache.lucene.index.FieldInfo; +import org.apache.lucene.index.FieldInfos; +import org.apache.lucene.index.IndexReader; +import org.apache.lucene.index.LeafReader; +import org.apache.lucene.index.LeafReaderContext; +import org.apache.lucene.search.IndexSearcher; +import org.apache.lucene.search.Matches; +import org.apache.lucene.search.MatchesIterator; +import org.apache.lucene.search.Query; +import org.apache.lucene.search.QueryVisitor; +import org.apache.lucene.search.ScoreMode; +import org.apache.lucene.search.Weight; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.HashMap; +import java.util.HashSet; +import java.util.Iterator; +import java.util.List; +import java.util.Map; +import java.util.PrimitiveIterator; +import java.util.Set; +import java.util.TreeMap; +import java.util.TreeSet; +import java.util.function.Predicate; + +/** + * Utility class to compute a list of "hit regions" for a given query, searcher and + * document(s) using {@link Matches} API. + */ +public class MatchRegionRetriever { + private final List<LeafReaderContext> leaves; + private final Weight weight; + private final TreeSet<String> affectedFields; + private final Map<String, OffsetsRetrievalStrategy> offsetStrategies; + private final Set<String> preloadFields; + + public MatchRegionRetriever(IndexSearcher searcher, Query query, Analyzer analyzer) + throws IOException { +leaves = searcher.getIndexReader().leaves(); +assert checkOrderConsistency(leaves); + +weight = searcher.createWeight(query, ScoreMode.COMPLETE_NO_SCORES, 0); + +// Compute the subset of fields affected by this query so that we don't load or scan +// fields that are irrelevant. +affectedFields = new TreeSet<>(); +query.visit( +new QueryVisitor() { + @Override + public boolean acceptField(String field) { +affectedFields.add(field); +return false; + } +}); + +// Compute value offset retrieval strategy for all affected fields. +offsetStrategies = +computeOffsetStrategies(affectedFields, searcher.getIndexReader(), analyzer); + +// Ask offset strategies if they'll need field values. 
+preloadFields = new HashSet<>(); +offsetStrategies.forEach( +(field, strategy) -> { + if (strategy.requiresDocument()) { +preloadFields.add(field); + } +}); + +// Only preload those field values that can be affected by the query and are required +// by strategies. +preloadFields.retainAll(affectedFields); + } + + public void highlightDocuments(PrimitiveIterator.OfInt docIds, HitRegionConsumer consumer) + throws IOException { +if (leaves.isEmpty() || affectedFields.isEmpty()) { + return; +} + +Iterator<LeafReaderContext> ctx = leaves.iterator(); +LeafReaderContext currentContext = ctx.next(); +int previousDocId = -1; +Map<String, List<OffsetRange>> highlights = new TreeMap<>(); +while (docIds.hasNext()) { + int docId = docIds.nextInt(); + + if (docId < previousDocId) { +throw new RuntimeException("Input document IDs must be sorted (increasing)."); + } + previousDocId = docId; + + while (docId >= currentContext.docBase + currentContext.reader().maxDoc()) { +currentContext = ctx.next(); + } + + int contextRelativeDocId = docId - currentContext.docBase; + + // Only preload fields we may potentially need. + FieldValueProvider documentSupplier; + if (preloadFields.isEmpty()) { +documentSupplier = null; + } else {
[jira] [Created] (LUCENE-9447) Make BEST_COMPRESSION compress more aggressively?
Adrien Grand created LUCENE-9447: Summary: Make BEST_COMPRESSION compress more aggressively? Key: LUCENE-9447 URL: https://issues.apache.org/jira/browse/LUCENE-9447 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand The Lucene86 codec supports setting a "Mode" for stored fields compression, that is either "BEST_SPEED", which translates to blocks of 16kB or 128 documents (whichever is hit first) compressed with LZ4, or "BEST_COMPRESSION", which translates to blocks of 60kB or 512 documents compressed with DEFLATE with default compression level (6). After looking at indices that spent most disk space on stored fields recently, I noticed that there was quite some room for improvement by increasing the block size even further:
||Block size||Stored fields size||
|60kB|168412338|
|128kB|130813639|
|256kB|113587009|
|512kB|104776378|
|1MB|100367095|
|2MB|98152464|
|4MB|97034425|
|8MB|96478746|
For this specific dataset, I had 1M documents that each had about 2kB of stored fields and quite some redundancy. This makes me want to look into bumping this block size to maybe 256kB. It would be interesting to re-do the experiments we did on LUCENE-6100 to see how this affects the merging speed. That said I don't think it would be terrible if the merging time increased a bit given that we already offer the BEST_SPEED option for CPU-savvy users. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
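For reference, an application opts into this mode when configuring the writer; a minimal sketch (assuming the Mode-accepting codec constructor implied by the description above):

{code:java}
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.codecs.lucene50.Lucene50StoredFieldsFormat.Mode;
import org.apache.lucene.codecs.lucene86.Lucene86Codec;
import org.apache.lucene.index.IndexWriterConfig;

class BestCompressionSketch {
  static IndexWriterConfig config() {
    IndexWriterConfig iwc = new IndexWriterConfig(new StandardAnalyzer());
    // DEFLATE-compressed stored fields (60kB / 512-doc blocks today),
    // versus BEST_SPEED's LZ4 16kB / 128-doc blocks.
    iwc.setCodec(new Lucene86Codec(Mode.BEST_COMPRESSION));
    return iwc;
  }
}
{code}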
[jira] [Commented] (SOLR-14654) Remove plugin loading from .system collection (for 9.0)
[ https://issues.apache.org/jira/browse/SOLR-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172330#comment-17172330 ] Cassandra Targett commented on SOLR-14654: -- The Ref Guide changes in the last commit here, [~noble.paul], broke the Ref Guide build because while you deleted the page {{adding-custom-plugins-in-solrcloud-mode.adoc}}, you did not remove the reference to it from its prior parent, {{solr-plugins.adoc}}. (The Ref Guide builds are down now due to the CI migration, otherwise they would have been complaining about this.) > Remove plugin loading from .system collection (for 9.0) > --- > > Key: SOLR-14654 > URL: https://issues.apache.org/jira/browse/SOLR-14654 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Noble Paul >Priority: Major > Fix For: master (9.0) > > Time Spent: 0.5h > Remaining Estimate: 0h > > This code must go from master > all places where "runtimeLib" can be used will be removed from 9.0. With > the new package system in place we don't need this anymore -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] dweiss commented on a change in pull request #1721: LUCENE-9439: match region highlighter components
dweiss commented on a change in pull request #1721: URL: https://github.com/apache/lucene-solr/pull/1721#discussion_r466387137 ## File path: lucene/highlighter/src/java/org/apache/lucene/search/matchhighlight/MatchRegionRetriever.java ## + public void highlightDocuments(PrimitiveIterator.OfInt docIds, HitRegionConsumer consumer) Review comment: Will do, thanks. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] romseygeek commented on a change in pull request #1721: LUCENE-9439: match region highlighter components
romseygeek commented on a change in pull request #1721: URL: https://github.com/apache/lucene-solr/pull/1721#discussion_r466387150 ## File path: lucene/highlighter/src/java/org/apache/lucene/search/matchhighlight/MatchRegionRetriever.java ##
[GitHub] [lucene-solr] romseygeek commented on a change in pull request #1721: LUCENE-9439: match region highlighter components
romseygeek commented on a change in pull request #1721: URL: https://github.com/apache/lucene-solr/pull/1721#discussion_r466384339 ## File path: lucene/highlighter/src/java/org/apache/lucene/search/matchhighlight/MatchRegionRetriever.java ## + public void highlightDocuments(PrimitiveIterator.OfInt docIds, HitRegionConsumer consumer) Review comment: A wrapper method that takes `TopDocs` and sorts the internal ids sounds like the best option here. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
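Such a wrapper could be quite small; a sketch of the shape being discussed, as a method inside MatchRegionRetriever (name and parameter types assumed from this PR):

```java
// Accept hits in score order, then feed highlightDocuments() the
// increasing-docid order it requires.
public void highlightDocuments(org.apache.lucene.search.TopDocs topDocs, HitRegionConsumer consumer)
    throws java.io.IOException {
  int[] sorted = java.util.Arrays.stream(topDocs.scoreDocs).mapToInt(sd -> sd.doc).sorted().toArray();
  highlightDocuments(java.util.Arrays.stream(sorted).iterator(), consumer);
}
```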
[GitHub] [lucene-solr] dweiss commented on a change in pull request #1721: LUCENE-9439: match region highlighter components
dweiss commented on a change in pull request #1721: URL: https://github.com/apache/lucene-solr/pull/1721#discussion_r466380491

## File path: lucene/highlighter/src/java/org/apache/lucene/search/matchhighlight/MatchRegionRetriever.java

@@ -0,0 +1,503 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.search.matchhighlight;
+
+import org.apache.lucene.analysis.Analyzer;
+import org.apache.lucene.analysis.TokenStream;
+import org.apache.lucene.analysis.tokenattributes.OffsetAttribute;
+import org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute;
+import org.apache.lucene.document.Document;
+import org.apache.lucene.index.FieldInfo;
+import org.apache.lucene.index.FieldInfos;
+import org.apache.lucene.index.IndexReader;
+import org.apache.lucene.index.LeafReader;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.search.IndexSearcher;
+import org.apache.lucene.search.Matches;
+import org.apache.lucene.search.MatchesIterator;
+import org.apache.lucene.search.Query;
+import org.apache.lucene.search.QueryVisitor;
+import org.apache.lucene.search.ScoreMode;
+import org.apache.lucene.search.Weight;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.PrimitiveIterator;
+import java.util.Set;
+import java.util.TreeMap;
+import java.util.TreeSet;
+import java.util.function.Predicate;
+
+/**
+ * Utility class to compute a list of "hit regions" for a given query, searcher and
+ * document(s) using {@link Matches} API.
+ */
+public class MatchRegionRetriever {
+  private final List<LeafReaderContext> leaves;
+  private final Weight weight;
+  private final TreeSet<String> affectedFields;
+  private final Map<String, OffsetsRetrievalStrategy> offsetStrategies;
+  private final Set<String> preloadFields;
+
+  public MatchRegionRetriever(IndexSearcher searcher, Query query, Analyzer analyzer)
+      throws IOException {
+    leaves = searcher.getIndexReader().leaves();
+    assert checkOrderConsistency(leaves);
+
+    weight = searcher.createWeight(query, ScoreMode.COMPLETE_NO_SCORES, 0);
+
+    // Compute the subset of fields affected by this query so that we don't load or scan
+    // fields that are irrelevant.
+    affectedFields = new TreeSet<>();
+    query.visit(
+        new QueryVisitor() {
+          @Override
+          public boolean acceptField(String field) {
+            affectedFields.add(field);
+            return false;
+          }
+        });
+
+    // Compute value offset retrieval strategy for all affected fields.
+    offsetStrategies =
+        computeOffsetStrategies(affectedFields, searcher.getIndexReader(), analyzer);
+
+    // Ask offset strategies if they'll need field values.
+    preloadFields = new HashSet<>();
+    offsetStrategies.forEach(
+        (field, strategy) -> {
+          if (strategy.requiresDocument()) {
+            preloadFields.add(field);
+          }
+        });
+
+    // Only preload those field values that can be affected by the query and are required
+    // by strategies.
+    preloadFields.retainAll(affectedFields);
+  }
+
+  public void highlightDocuments(PrimitiveIterator.OfInt docIds, HitRegionConsumer consumer)

Review comment: Oh, there is another reason too. Internally this "streaming" method requires increasing document IDs so that the bookkeeping of leaf readers is simplified (no need to continuously bisect each document ID).
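The point about sorted IDs can be illustrated with a small sketch of the forward-only cursor the method keeps over the index leaves. This helper is illustrative only, not part of the PR; the comment's reference to ReaderUtil.subIndex() names the binary-search alternative that sorted input avoids.

{code:java}
import java.util.Iterator;

import org.apache.lucene.index.LeafReaderContext;

public final class LeafCursor {
  // With sorted doc IDs the cursor only ever moves forward, so resolving the
  // owning leaf is amortized O(1) per document; unsorted IDs would force a
  // lookup such as ReaderUtil.subIndex() (a binary search) for every ID.
  static LeafReaderContext advance(Iterator<LeafReaderContext> leafCursor,
                                   LeafReaderContext current, int docId) {
    while (docId >= current.docBase + current.reader().maxDoc()) {
      current = leafCursor.next();
    }
    return current;
  }
}
{code}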
[jira] [Created] (SOLR-14715) Update processor initialization should be skipped on PULL replicas
Erick Erickson created SOLR-14715:
-------------------------------------

             Summary: Update processor initialization should be skipped on PULL replicas
                 Key: SOLR-14715
                 URL: https://issues.apache.org/jira/browse/SOLR-14715
             Project: Solr
          Issue Type: Test
      Security Level: Public (Default Security Level. Issues are Public)
            Reporter: Erick Erickson

From the user's list:
{quote}
Our PULL replicas... fail to start with below exception when DocBasedVersionConstraintsProcessorFactory is added to UpdateProcessorChain.

Caused by: org.apache.solr.common.SolrException: updateLog must be enabled.
  at org.apache.solr.core.SolrCore.<init>(SolrCore.java:1014)
  at org.apache.solr.core.SolrCore.<init>(SolrCore.java:869)
  at org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1138)
  ... 45 more
Caused by: org.apache.solr.common.SolrException: updateLog must be enabled.
  at org.apache.solr.update.processor.DocBasedVersionConstraintsProcessorFactory.inform(DocBasedVersionConstraintsProcessorFactory.java:168)
  at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:696)
  at org.apache.solr.core.SolrCore.<init>(SolrCore.java:993)
  ... 47 more
{quote}
and Tomás' reply:
{quote}
This is an interesting bug. I'm wondering if we can completely skip the initialization of UpdateRequestProcessorFactories in PULL replicas...
{quote}
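To make the suggested direction concrete, here is a minimal sketch of the kind of guard Tomás describes. The placement of the check (a SolrCoreAware inform method, mirroring where the stack trace fails) and the class skeleton are assumptions for illustration, not the committed fix; CloudDescriptor, Replica.Type and SolrException are existing Solr classes.

{code:java}
import org.apache.solr.cloud.CloudDescriptor;
import org.apache.solr.common.SolrException;
import org.apache.solr.common.cloud.Replica;
import org.apache.solr.core.SolrCore;
import org.apache.solr.util.plugin.SolrCoreAware;

// Hypothetical factory skeleton; only the inform() guard is the point here.
public class PullReplicaAwareSketch implements SolrCoreAware {
  @Override
  public void inform(SolrCore core) {
    CloudDescriptor cloud = core.getCoreDescriptor().getCloudDescriptor();
    if (cloud != null && cloud.getReplicaType() == Replica.Type.PULL) {
      // A PULL replica only replicates segments and never applies updates
      // locally, so the updateLog requirement can be skipped instead of
      // failing core initialization.
      return;
    }
    if (core.getUpdateHandler().getUpdateLog() == null) {
      throw new SolrException(SolrException.ErrorCode.SERVER_ERROR,
          "updateLog must be enabled.");
    }
  }
}
{code}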
[GitHub] [lucene-solr] dweiss commented on pull request #1721: LUCENE-9439: match region highlighter components
dweiss commented on pull request #1721: URL: https://github.com/apache/lucene-solr/pull/1721#issuecomment-669879507

I don't use hamcrest but I'll take a look at what I can do. No need to pull in another library just for one test.
[GitHub] [lucene-solr] dweiss commented on a change in pull request #1721: LUCENE-9439: match region highlighter components
dweiss commented on a change in pull request #1721: URL: https://github.com/apache/lucene-solr/pull/1721#discussion_r466354710

## File path: lucene/highlighter/src/java/org/apache/lucene/search/matchhighlight/MatchRegionRetriever.java

[quoted license header, imports and constructor identical to the excerpt shown above]

+  public void highlightDocuments(PrimitiveIterator.OfInt docIds, HitRegionConsumer consumer)
+      throws IOException {
+    if (leaves.isEmpty() || affectedFields.isEmpty()) {
+      return;
+    }
+
+    Iterator<LeafReaderContext> ctx = leaves.iterator();
+    LeafReaderContext currentContext = ctx.next();
+    int previousDocId = -1;
+    Map<String, List<OffsetRange>> highlights = new TreeMap<>();
+    while (docIds.hasNext()) {
+      int docId = docIds.nextInt();
+
+      if (docId < previousDocId) {
+        throw new RuntimeException("Input document IDs must be sorted (increasing).");
+      }
+      previousDocId = docId;
+
+      while (docId >= currentContext.docBase + currentContext.reader().maxDoc()) {
+        currentContext = ctx.next();
+      }
+
+      int contextRelativeDocId = docId - currentContext.docBase;
+
+      // Only preload fields we may potentially need.
+      FieldValueProvider documentSupplier;
+      if (preloadFields.isEmpty()) {
+        documentSupplier = null;
+      } else {
[GitHub] [lucene-solr] dweiss commented on a change in pull request #1721: LUCENE-9439: match region highlighter components
dweiss commented on a change in pull request #1721: URL: https://github.com/apache/lucene-solr/pull/1721#discussion_r466353991

## File path: lucene/highlighter/src/java/org/apache/lucene/search/matchhighlight/MatchRegionRetriever.java

[quoted context identical to the highlightDocuments excerpt shown above]

Review comment: I can add another method that would wrap it up? The arbitrary sequence of document IDs is motivated by, ahem, private needs - the TopDocs case is covered, and the iterator (and consumer) can stream over a large number of documents.
[GitHub] [lucene-solr] dweiss commented on a change in pull request #1721: LUCENE-9439: match region highlighter components
dweiss commented on a change in pull request #1721: URL: https://github.com/apache/lucene-solr/pull/1721#discussion_r466353337

## File path: lucene/highlighter/src/java/org/apache/lucene/search/matchhighlight/MatchRegionRetriever.java

[quoted context identical to the highlightDocuments excerpt shown above]
[GitHub] [lucene-solr] dweiss commented on a change in pull request #1721: LUCENE-9439: match region highlighter components
dweiss commented on a change in pull request #1721: URL: https://github.com/apache/lucene-solr/pull/1721#discussion_r466352634

## File path: lucene/highlighter/src/java/org/apache/lucene/search/matchhighlight/MatchRegionRetriever.java

[quoted context identical to the highlightDocuments excerpt shown above]
[GitHub] [lucene-solr] dweiss commented on a change in pull request #1721: LUCENE-9439: match region highlighter components
dweiss commented on a change in pull request #1721: URL: https://github.com/apache/lucene-solr/pull/1721#discussion_r466350632

## File path: lucene/core/src/java/org/apache/lucene/search/DisjunctionMatchesIterator.java

@@ -201,8 +201,9 @@ protected boolean lessThan(MatchesIterator a, MatchesIterator b) {
   @Override
   public boolean next() throws IOException {
-    if (started == false) {
-      return started = true;
+    if (!started) {

Review comment: Change you must when asked!
[GitHub] [lucene-solr] romseygeek commented on pull request #1671: LUCENE-9427: Ensure unified highlighter considers all terms in fuzzy query.
romseygeek commented on pull request #1671: URL: https://github.com/apache/lucene-solr/pull/1671#issuecomment-669869944

Merged to master in 688583fc2d01c39bba63d19cf57bb5720eda1afd
[jira] [Resolved] (LUCENE-9427) Unified highlighter can fail to highlight fuzzy query
[ https://issues.apache.org/jira/browse/LUCENE-9427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Woodward resolved LUCENE-9427.
-----------------------------------
    Fix Version/s: 8.7
       Resolution: Fixed

Thanks [~jtibshirani]!

> Unified highlighter can fail to highlight fuzzy query
> ------------------------------------------------------
>
>                 Key: LUCENE-9427
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9427
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Julie Tibshirani
>            Priority: Major
>             Fix For: 8.7
>
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> If a fuzzy query corresponds to an exact match (for example it was created
> with maxEdits: 0), then the unified highlighter doesn't produce highlights
> for the matching terms.
> I think this is due to the fact that when visiting a fuzzy query, the exact
> terms are now consumed separately from automata. The unified highlighter
> doesn't account for the terms and misses highlighting them.
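The mechanics behind the fix are easier to see with a small visitor. The sketch below is illustrative, not the committed patch: consumeTerms and consumeTermsMatching are the real QueryVisitor callbacks, while the collecting fields are assumptions for the example. A highlighter must listen on both callbacks; before the fix, a FuzzyQuery built with maxEdits: 0 reported its term only through one of them, so automaton-only extraction missed it.

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

import org.apache.lucene.index.Term;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.QueryVisitor;
import org.apache.lucene.util.automaton.ByteRunAutomaton;

// Illustrative visitor: terms reported through only one callback are
// silently dropped by code that overrides just the other.
public class CollectingVisitor extends QueryVisitor {
  final List<Term> exactTerms = new ArrayList<>();
  final List<Supplier<ByteRunAutomaton>> automata = new ArrayList<>();

  @Override
  public void consumeTerms(Query query, Term... terms) {
    for (Term t : terms) {
      exactTerms.add(t); // exact terms (e.g. a plain TermQuery)
    }
  }

  @Override
  public void consumeTermsMatching(Query query, String field,
                                   Supplier<ByteRunAutomaton> automaton) {
    automata.add(automaton); // automaton-backed terms (e.g. FuzzyQuery)
  }
}
{code}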
[GitHub] [lucene-solr] romseygeek closed pull request #1671: LUCENE-9427: Ensure unified highlighter considers all terms in fuzzy query.
romseygeek closed pull request #1671: URL: https://github.com/apache/lucene-solr/pull/1671
[jira] [Commented] (LUCENE-9427) Unified highlighter can fail to highlight fuzzy query
[ https://issues.apache.org/jira/browse/LUCENE-9427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172273#comment-17172273 ]

ASF subversion and git services commented on LUCENE-9427:
---------------------------------------------------------

Commit b6806355c3cf7c866ab3b2302b78f2b478691876 in lucene-solr's branch refs/heads/branch_8x from Julie Tibshirani
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=b680635 ]

LUCENE-9427: Fuzzy query should always call consumeTermsMatching in visitor

> Unified highlighter can fail to highlight fuzzy query
> ------------------------------------------------------
>
> [issue description identical to the resolution notice above]
[jira] [Commented] (LUCENE-9427) Unified highlighter can fail to highlight fuzzy query
[ https://issues.apache.org/jira/browse/LUCENE-9427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172274#comment-17172274 ]

ASF subversion and git services commented on LUCENE-9427:
---------------------------------------------------------

Commit 688583fc2d01c39bba63d19cf57bb5720eda1afd in lucene-solr's branch refs/heads/master from Julie Tibshirani
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=688583f ]

LUCENE-9427: Fuzzy query should always call consumeTermsMatching in visitor

> Unified highlighter can fail to highlight fuzzy query
> ------------------------------------------------------
>
> [issue description identical to the resolution notice above]
[jira] [Updated] (SOLR-14714) Solr.cmd in windows loads the incorrect jetty module when using java>=9
[ https://issues.apache.org/jira/browse/SOLR-14714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Endika Posadas updated SOLR-14714:
----------------------------------
    Description: 
In solr.cmd, when using SSL, there is a check to verify which version of Java Solr is running with. If this version of Java is greater than or equal to Java 9, it will load the Jetty https module, while for Java 8 it will use https8. However, this Java version check is done before the Java major version variable has been assigned. As a result, Solr on Windows doesn't work when SSL is enabled.

To fix this issue, it is enough if the Java checks are done before the SSL checks.

I have attached a patch with the modifications.

  was:
In Solr.cmd, when using SSL, there is a check to verify what version of java solr is running with. If this version of Java is greater or equal java 9 it will load the jetty https module while for java 8 it will use https8. However, this java version check is done before the java major version variable has been assigned. As a result, Solr in windows doesn't work when SSL is enabled. To fix this issue is enough if java checks are done before SSL checks. I have attached a patch with the modifications.

> Solr.cmd in windows loads the incorrect jetty module when using java>=9
> ------------------------------------------------------------------------
>
>                 Key: SOLR-14714
>                 URL: https://issues.apache.org/jira/browse/SOLR-14714
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: scripts and tools
>    Affects Versions: 8.6
>         Environment: Windows environment running in Solr Cloud mode with SSL.
>            Reporter: Endika Posadas
>            Priority: Major
>         Attachments: load_java_info_first.patch
>
> In solr.cmd, when using SSL, there is a check to verify which version of Java
> Solr is running with. If this version of Java is greater than or equal to
> Java 9, it will load the Jetty https module, while for Java 8 it will use
> https8. However, this Java version check is done before the Java major
> version variable has been assigned. As a result, Solr on Windows doesn't work
> when SSL is enabled.
>
> To fix this issue, it is enough if the Java checks are done before the SSL
> checks.
>
> I have attached a patch with the modifications.
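The shape of the fix is simple to sketch in batch syntax, the language of solr.cmd itself. This is an illustration of the reordering only; the variable and label names (JAVA_MAJOR_VERSION, SOLR_SSL_ENABLED, SOLR_JETTY_CONFIG, :resolve_java_info) are assumptions, not the literal contents of solr.cmd or of the attached patch.

{code}
@REM Resolve the Java version info FIRST, so the version variable is populated...
call :resolve_java_info

@REM ...and only THEN pick the Jetty SSL module based on it.
IF "%SOLR_SSL_ENABLED%"=="true" (
  IF %JAVA_MAJOR_VERSION% GEQ 9 (
    set "SOLR_JETTY_CONFIG=--module=https"
  ) ELSE (
    set "SOLR_JETTY_CONFIG=--module=https8"
  )
)
{code}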
[GitHub] [lucene-solr] romseygeek commented on a change in pull request #1721: LUCENE-9439: match region highlighter components
romseygeek commented on a change in pull request #1721: URL: https://github.com/apache/lucene-solr/pull/1721#discussion_r466306845

## File path: lucene/highlighter/src/java/org/apache/lucene/search/matchhighlight/MatchRegionRetriever.java

[quoted context identical to the highlightDocuments excerpt shown above]
[jira] [Updated] (SOLR-14714) Solr.cmd in windows loads the incorrect jetty module when using java>=9
[ https://issues.apache.org/jira/browse/SOLR-14714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Endika Posadas updated SOLR-14714:
----------------------------------
    Summary: Solr.cmd in windows loads the incorrect jetty module when using java>=9  (was: Solr.cmd in windows loads the incorrect jetty module when using java>9)

> Solr.cmd in windows loads the incorrect jetty module when using java>=9
> ------------------------------------------------------------------------
>
> [issue fields and description identical to the update above]
[jira] [Created] (SOLR-14714) Solr.cmd in windows loads the incorrect jetty module when using java>9
Endika Posadas created SOLR-14714:
-------------------------------------

             Summary: Solr.cmd in windows loads the incorrect jetty module when using java>9
                 Key: SOLR-14714
                 URL: https://issues.apache.org/jira/browse/SOLR-14714
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
          Components: scripts and tools
    Affects Versions: 8.6
         Environment: Windows environment running in Solr Cloud mode with SSL.
            Reporter: Endika Posadas
         Attachments: load_java_info_first.patch

In Solr.cmd, when using SSL, there is a check to verify what version of java solr is running with. If this version of Java is greater or equal java 9 it will load the jetty https module while for java 8 it will use https8. However, this java version check is done before the java major version variable has been assigned. As a result, Solr in windows doesn't work when SSL is enabled.

To fix this issue is enough if java checks are done before SSL checks.

I have attached a patch with the modifications.
[jira] [Commented] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document
[ https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172213#comment-17172213 ]

Michael McCandless commented on LUCENE-9444:
--------------------------------------------

{quote}Should this class extend {{DocIdSetIterator}} to allow intersection with another {{DocIdSetIterator}} created from {{FacetsCollector.MatchingDocs.bits}}?
{quote}
I think so?

{quote}Making {{dim}} part of the ctor feels a bit restrictive; how about providing 2 separate APIs, one that accepts a dimension and another that does not?
{quote}
Maybe the method that produces the iterator could optionally take a {{dim}} to filter for only those labels under that dimension?

{quote}how about returning a {{java.util.Iterator}} instead of {{FacetLabel[]}}?
{quote}
+1

> Need an API to easily fetch facet labels for a field in a document
> -------------------------------------------------------------------
>
>                 Key: LUCENE-9444
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9444
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/facet
>    Affects Versions: 8.6
>            Reporter: Ankur
>            Priority: Major
>
> A facet field may be included in the list of fields whose values are to be
> returned for each hit.
> In order to get the facet labels for each hit we need to:
> # Create an instance of _DocValuesOrdinalsReader_ and invoke the
> _getReader(LeafReaderContext context)_ method to obtain an instance of
> _OrdinalsSegmentReader_.
> # The _OrdinalsSegmentReader.get(int docID, IntsRef ordinals)_ method is then
> used to fetch and decode the binary payload in the document's BinaryDocValues
> field. This provides the ordinals that refer to facet labels in the taxonomy.
> # Lastly, _TaxonomyReader.getPath(ord)_ is used to fetch the labels to be
> returned.
>
> Ideally there should be a simple API - *String[] getLabels(docId)* - that
> hides all the above details and gives us the string labels. This can be part
> of *TaxonomyFacets*, but that's just one idea.
> I am opening this issue to get community feedback and suggestions.
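The three steps in the issue description translate almost directly into code. Below is a minimal sketch of the kind of helper being requested: the helper's name, its signature, and the use of the default facet index field are assumptions, while DocValuesOrdinalsReader, OrdinalsSegmentReader.get and TaxonomyReader.getPath are the existing APIs the description names.

{code:java}
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.facet.FacetsConfig;
import org.apache.lucene.facet.taxonomy.DocValuesOrdinalsReader;
import org.apache.lucene.facet.taxonomy.FacetLabel;
import org.apache.lucene.facet.taxonomy.OrdinalsReader;
import org.apache.lucene.facet.taxonomy.TaxonomyReader;
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.util.IntsRef;

public final class FacetLabelHelper {
  // Sketch of the proposed convenience API: segmentDocId is relative to the
  // given leaf context, and the default "$facets" index field is assumed.
  public static List<FacetLabel> getLabels(LeafReaderContext context, int segmentDocId,
                                           TaxonomyReader taxoReader) throws IOException {
    OrdinalsReader ordsReader =
        new DocValuesOrdinalsReader(FacetsConfig.DEFAULT_INDEX_FIELD_NAME);
    OrdinalsReader.OrdinalsSegmentReader segmentReader = ordsReader.getReader(context);

    // Step 2 from the description: decode the ordinals packed into the
    // document's BinaryDocValues payload.
    IntsRef ordinals = new IntsRef();
    segmentReader.get(segmentDocId, ordinals);

    // Step 3: resolve each ordinal to its label through the taxonomy.
    List<FacetLabel> labels = new ArrayList<>(ordinals.length);
    for (int i = 0; i < ordinals.length; i++) {
      labels.add(taxoReader.getPath(ordinals.ints[ordinals.offset + i]));
    }
    return labels;
  }
}
{code}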
[jira] [Commented] (SOLR-14713) Single thread on streaming updates
[ https://issues.apache.org/jira/browse/SOLR-14713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172203#comment-17172203 ]

Cao Manh Dat commented on SOLR-14713:
-------------------------------------

I created a PR for this; it is not finished yet and has no tests so far. But the PR also solves the problem of incorrectly handling retried requests. Here is the scenario:
 * An {{UpdateRequest}} is converted to multiple {{Req}}s
 * Solr fails to send the second Req
 * Solr retries the first Req (since we only refer/point to the first one)
 * It succeeds
 * The whole UpdateRequest is reported as successful

> Single thread on streaming updates
> ----------------------------------
>
>                 Key: SOLR-14713
>                 URL: https://issues.apache.org/jira/browse/SOLR-14713
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public (Default Security Level. Issues are Public)
>            Reporter: Cao Manh Dat
>            Assignee: Cao Manh Dat
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Or greatly simplify SolrCmdDistributor.
> h2. The current way Solr fans out updates
> Currently, on receiving an updateRequest, Solr will create new
> UpdateProcessors for handling that request, then it parses documents from the
> request one by one and lets the processors handle them.
> {code:java}
> onReceiving(UpdateRequest update):
>   processors = createNewProcessors();
>   for (Document doc : update) {
>     processors.handle(doc)
>   }
> {code}
> Let's say the number of replicas in the current shard is N; updateProcessor
> will create N-1 queues and runners, one for each other replica. A runner is
> basically a thread that dequeues updates from its corresponding queue and
> sends them to the corresponding replica endpoint.
> Note 1: all runners share the same client, hence the same connection pool and
> thread pool.
> Note 2: a runner will send all documents of its UpdateRequest in a single
> HTTP POST request (to reduce the number of threads for handling requests on
> the other side). Therefore its lifetime equals the total time of handling its
> UpdateRequest. Below is a typical activity that happens in a runner's life
> cycle.
> h2. Problems of the current approach
> The current approach has two problems:
> - Problem 1: it uses lots of threads to fan out requests.
> - Problem 2, which is more important: it is very complex. Solr is also using
> ConcurrentUpdateSolrClient (CUSC for short) for that. The CUSC implementation
> allows using a single queue with multiple runners for the same queue (although
> we only use one runner at most); this raises the complexity of the whole flow.
> A single fix for one problem can raise multiple problems later, i.e. in
> SOLR-13975, on trying to handle the problem of the other endpoint hanging for
> a long time, we introduced a bug that lets the runner keep running even when
> the updateRequest is fully handled on the leader.
> h2. Doing everything in a single thread
> Since we already support sending requests in an async manner, why don't we
> let the main thread which is handling the update request send updates to all
> the others, without the need for runners or queues. The code will be something
> like this:
> {code:java}
> Class UpdateProcessor:
>    Map pendingOutStreams
>
>    func handleAddDoc(doc):
>      for (replica : replicas):
>        pendingOutStreams.get(replica).send(doc)
>
>    func onEndUpdateRequest():
>      pendingOutStreams.values().forEach(out -> closeAndHandleResponse(out))
> {code}
> By doing this we will use fewer threads and the code is much simpler and
> cleaner. Of course there will be some degradation in the time for handling an
> updateRequest, since we are doing it serially instead of concurrently. In a
> formal way it will be like this:
> {code:java}
> oldTime = timeForIndexing(update) + timeForSendingUpdates(update)
> newTime = timeForIndexing(update) + (N-1) * timeForSendingUpdates(update)
> {code}
> But I believe that timeForIndexing is much larger than timeForSendingUpdates,
> so we do not really need to be concerned about this. Even if that is really a
> problem, users can simply create more threads for indexing.
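The pseudocode above maps to roughly the following self-contained Java sketch. Everything here is illustrative: ReplicaStream stands in for an async HTTP output stream to one replica, and none of these names are actual Solr classes.

{code:java}
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class SingleThreadFanOut {
  interface ReplicaStream {
    void send(String doc) throws IOException;          // write one doc to the open stream
    void closeAndHandleResponse() throws IOException;  // finish the POST, check the reply
  }

  private final Map<String, ReplicaStream> pendingOutStreams = new LinkedHashMap<>();
  private final Function<String, ReplicaStream> streamOpener;

  SingleThreadFanOut(Function<String, ReplicaStream> streamOpener) {
    this.streamOpener = streamOpener;
  }

  // Called once per document: the request-handling thread itself writes to
  // every replica stream, so no per-replica runner threads or queues exist.
  void handleAddDoc(List<String> replicas, String doc) throws IOException {
    for (String replica : replicas) {
      pendingOutStreams.computeIfAbsent(replica, streamOpener).send(doc);
    }
  }

  // Called once at the end of the update request: close each stream serially
  // and handle its response on this same thread.
  void onEndUpdateRequest() throws IOException {
    for (ReplicaStream out : pendingOutStreams.values()) {
      out.closeAndHandleResponse();
    }
  }
}
{code}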
[GitHub] [lucene-solr] CaoManhDat opened a new pull request #1722: SOLR-14713: Single thread on streaming updates
CaoManhDat opened a new pull request #1722: URL: https://github.com/apache/lucene-solr/pull/1722
[jira] [Commented] (SOLR-14713) Single thread on streaming updates
[ https://issues.apache.org/jira/browse/SOLR-14713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172196#comment-17172196 ]

ASF subversion and git services commented on SOLR-14713:
---------------------------------------------------------

Commit 5986b4cc3c83ac89d014d85ec4ea53d303800fe7 in lucene-solr's branch refs/heads/jira/SOLR-14713 from Cao Manh Dat
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=5986b4c ]

SOLR-14713: Single thread on streaming updates

> Single thread on streaming updates
> ----------------------------------
>
> [issue description identical to the one quoted in the comment above]
[jira] [Created] (SOLR-14713) Single thread on streaming updates
Cao Manh Dat created SOLR-14713:
-----------------------------------

             Summary: Single thread on streaming updates
                 Key: SOLR-14713
                 URL: https://issues.apache.org/jira/browse/SOLR-14713
             Project: Solr
          Issue Type: Improvement
      Security Level: Public (Default Security Level. Issues are Public)
            Reporter: Cao Manh Dat
            Assignee: Cao Manh Dat

[issue description identical to the one quoted in the comment above]
[jira] [Commented] (LUCENE-9439) Matches API should enumerate hit fields that have no positions (no iterator)
[ https://issues.apache.org/jira/browse/LUCENE-9439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172186#comment-17172186 ]

Dawid Weiss commented on LUCENE-9439:
-------------------------------------

Hi Alan. Thank you for your feedback. Works like a charm. The "no-positions" strategy approach allows for some interesting deviations - one could add match regions for entire values or just for tokens returned from analysis (so you can "see" individual tokens over the value text).

I piggybacked a small fix to the disjunction matches iterator because it looked like a bug to me (unrelated). [1]

Otherwise it's really well separated from existing code and works great for me. For example, I tried interval queries and they just work out of the box. A more complex expression highlights more than it should, but this is related to the match range returned, so it is nicely decoupled from the "highlighting engine" itself. [2]

I think it's worth adding to Lucene. I would have to get rid of the assertj dependency first though. Or maybe we should add it and allow its use? The nice thing about assertj is that it formats assertion failures in a much better way, especially for stream or collection assertions.

[1] https://github.com/apache/lucene-solr/pull/1721/files#diff-f5538289e23aabdd53bc3bcbc59da342
[2] https://github.com/apache/lucene-solr/blob/c0562c1f2d789679432f9d72375aa3747e4b6526/lucene/highlighter/src/test/org/apache/lucene/search/matchhighlight/MatchRegionRetrieverTest.java#L335-L355

> Matches API should enumerate hit fields that have no positions (no iterator)
> -----------------------------------------------------------------------------
>
>                 Key: LUCENE-9439
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9439
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Dawid Weiss
>            Assignee: Dawid Weiss
>            Priority: Minor
>         Attachments: LUCENE-9439.patch, matchhighlighter.patch
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> I have been fiddling with the Matches API and it's great. There is one corner
> case that doesn't work for me though -- queries that affect fields without
> positions return {{MatchesUtil.MATCH_WITH_NO_TERMS}}, but this constant is
> problematic as it doesn't carry the field name that caused it (returns null).
> The associated fromSubMatches combines all these constants into one (or
> swallows them), which is another problem.
> I think it would be more consistent if MATCH_WITH_NO_TERMS was replaced with
> a true match (carrying the field name) returning an empty iterator (or a
> constant "empty" iterator, NO_TERMS).
> I have a very compelling use case: I wrote an "auto-highlighter" that runs on
> top of the Matches API and automatically picks up query-relevant fields and
> snippets. Everything works beautifully except for cases where fields are
> searchable but don't have any positions (token-like fields).
> I can work on a patch but wanted to reach out first - [~romseygeek]?
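For concreteness, the replacement the issue proposes could look roughly like the following. Matches and MatchesIterator are the real Lucene interfaces; this particular factory and its anonymous implementation are only a sketch of the idea (a match that remembers its field but exposes no positions), not the attached patch.

{code:java}
import java.io.IOException;
import java.util.Collection;
import java.util.Collections;
import java.util.Iterator;

import org.apache.lucene.search.Matches;
import org.apache.lucene.search.MatchesIterator;

public final class NoTermsMatches {
  // A "true" match for a positionless field: iterating the Matches yields the
  // field name (unlike the shared MATCH_WITH_NO_TERMS constant, which returns
  // null), while getMatches() exposes no position iterator.
  public static Matches matchWithNoTerms(String field) {
    return new Matches() {
      @Override
      public MatchesIterator getMatches(String f) throws IOException {
        return null; // no positions are available for this match
      }

      @Override
      public Collection<Matches> getSubMatches() {
        return Collections.emptyList();
      }

      @Override
      public Iterator<String> iterator() {
        return Collections.singleton(field).iterator(); // carries the field name
      }
    };
  }
}
{code}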
[GitHub] [lucene-solr] dweiss opened a new pull request #1721: LUCENE-9439: match region highlighter components
dweiss opened a new pull request #1721: URL: https://github.com/apache/lucene-solr/pull/1721
[GitHub] [lucene-solr] dweiss closed pull request #1689: LUCENE-9439: Matches API should enumerate hit fields that have no positions (support empty iterator)
dweiss closed pull request #1689: URL: https://github.com/apache/lucene-solr/pull/1689
[jira] [Updated] (SOLR-14712) Standardize RPC calls in Solr
[ https://issues.apache.org/jira/browse/SOLR-14712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul updated SOLR-14712:
------------------------------
Description:
We should have a standard mechanism to make a request to the right replica/node across Solr code. This RPC mechanism assumes that:
* The RPC mechanism is HTTP
* It is aware of all collections, shards & their topology, etc.
* It knows how to route a request to the correct core

This is agnostic of wire-level formats, Solr documents, etc. That is a layer above this. Anyone can use their own JSON parser or any other RPC wire-level format on top of this.

For example, code like this:
{code}
private void invokeOverseerOp(String electionNode, String op) {
  ModifiableSolrParams params = new ModifiableSolrParams();
  ShardHandler shardHandler = shardHandlerFactory.getShardHandler();
  params.set(CoreAdminParams.ACTION, CoreAdminAction.OVERSEEROP.toString());
  params.set("op", op);
  params.set("qt", adminPath);
  params.set("electionNode", electionNode);
  ShardRequest sreq = new ShardRequest();
  sreq.purpose = 1;
  String replica = zkStateReader.getBaseUrlForNodeName(LeaderElector.getNodeName(electionNode));
  sreq.shards = new String[]{replica};
  sreq.actualShards = sreq.shards;
  sreq.params = params;
  shardHandler.submit(sreq, replica, sreq.params);
  shardHandler.takeCompletedOrError();
}
{code}
will be replaced with:
{code}
private void invokeOverseerOp(String electionNode, String op) {
  HttpRpcFactory factory = null;
  factory.create()
      .withHttpMethod(SolrRequest.METHOD.GET)
      .addParam(CoreAdminParams.ACTION, CoreAdminAction.OVERSEEROP.toString())
      .addParam("op", op)
      .addParam("electionNode", electionNode)
      .addParam(ShardParams.SHARDS_PURPOSE, "1")
      .withV1Uri(adminPath)
      .toNode(electionNode)
      .invoke();
}
{code}

was:
We should have a standard mechanism to make a request to the right replica/node across Solr code. This RPC mechanism assumes that:
* The RPC mechanism is HTTP
* It is aware of all collections, shards & their topology, etc.
* It knows how to route a request to the correct core

This is agnostic of wire-level formats, Solr documents, etc. That is a layer above this. Anyone can use their own JSON parser or any other RPC wire-level format on top of this.

> Standardize RPC calls in Solr
> ------------------------------
>
>                 Key: SOLR-14712
>                 URL: https://issues.apache.org/jira/browse/SOLR-14712
>             Project: Solr
>          Issue Type: Improvement
>   Security Level: Public (Default Security Level. Issues are Public)
>            Reporter: Noble Paul
>            Assignee: Noble Paul
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> We should have a standard mechanism to make a request to the right replica/node across Solr code.
> This RPC mechanism assumes that:
> * The RPC mechanism is HTTP
> * It is aware of all collections, shards & their topology, etc.
> * It knows how to route a request to the correct core
> This is agnostic of wire-level formats, Solr documents, etc. That is a layer above this.
> Anyone can use their own JSON parser or any other RPC wire-level format on top of this.
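Since the HttpRpcFactory in the "will be replaced with" snippet is still notional, here is one possible shape for it, inferred purely from the call chain above. Every type and method name below is an assumption derived from that example, not a committed Solr API; in particular the return type of invoke() is a guess.

{code}
// Hypothetical fluent RPC builder inferred from the example above.
// None of these types exist in Solr yet (SOLR-14712 is a proposal);
// names, signatures, and the return type of invoke() are all assumptions.
import java.io.IOException;
import org.apache.solr.client.solrj.SolrRequest;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.common.util.NamedList;

public interface HttpRpcFactory {
  HttpRpc create();

  interface HttpRpc {
    HttpRpc withHttpMethod(SolrRequest.METHOD method); // GET, POST, ...
    HttpRpc addParam(String name, String value);       // accumulate request params
    HttpRpc withV1Uri(String path);                    // v1 request handler path
    HttpRpc toNode(String nodeName);                   // route to a specific node
    NamedList<Object> invoke() throws SolrServerException, IOException; // execute the call
  }
}
{code}

A builder along these lines would keep routing (toNode, and presumably shard/replica variants) separate from parameters and wire format, which matches the layering the description calls for.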
[jira] [Updated] (SOLR-14712) Standardize RPC calls in Solr
[ https://issues.apache.org/jira/browse/SOLR-14712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul updated SOLR-14712:
------------------------------
Description:
We should have a standard mechanism to make a request to the right replica/node across Solr code. This RPC mechanism assumes that:
* The RPC mechanism is HTTP
* It is aware of all collections, shards & their topology, etc.
* It knows how to route a request to the correct core

This is agnostic of wire-level formats, Solr documents, etc. That is a layer above this. Anyone can use their own JSON parser or any other RPC wire-level format on top of this.

was:
We should have a standard mechanism to make a request to the right replica/node across Solr code. This RPC mechanism assumes that:
* The RPC mechanism is HTTP
* It is aware of all collections, shards & their topology, etc.
* It knows how to route a request to the correct core

> Standardize RPC calls in Solr
> ------------------------------
>
>                 Key: SOLR-14712
>                 URL: https://issues.apache.org/jira/browse/SOLR-14712
>             Project: Solr
>          Issue Type: Improvement
>   Security Level: Public (Default Security Level. Issues are Public)
>            Reporter: Noble Paul
>            Assignee: Noble Paul
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> We should have a standard mechanism to make a request to the right replica/node across Solr code.
> This RPC mechanism assumes that:
> * The RPC mechanism is HTTP
> * It is aware of all collections, shards & their topology, etc.
> * It knows how to route a request to the correct core
> This is agnostic of wire-level formats, Solr documents, etc. That is a layer above this.
> Anyone can use their own JSON parser or any other RPC wire-level format on top of this.
[jira] [Commented] (LUCENE-8626) standardise test class naming
[ https://issues.apache.org/jira/browse/LUCENE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172101#comment-17172101 ] Christine Poerschke commented on LUCENE-8626:
---------------------------------------------
bq. This was mentioned and proposed on the dev mailing list. ...

The "Test Harness behaviour on a package run" [thread|https://lists.apache.org/thread.html/3a8e38e6ca9abe2e4be0c12cfd23d103cf60d0891c54df45e9c7bf18%40%3Cdev.lucene.apache.org%3E] led to this ticket at the end of 2018. Great to see the interest and effort resume now via the "Standardize Leading Test or Trailing Test" [thread|https://lists.apache.org/thread.html/rde0276272a86582c5e6f9456ad592233c9ed575579a0f812d88be486%40%3Cdev.lucene.apache.org%3E].

> standardise test class naming
> ------------------------------
>
>                 Key: LUCENE-8626
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8626
>             Project: Lucene - Core
>          Issue Type: Test
>            Reporter: Christine Poerschke
>            Priority: Major
>         Attachments: SOLR-12939.01.patch, SOLR-12939.02.patch, SOLR-12939.03.patch, SOLR-12939_hoss_validation_groovy_experiment.patch
>
> This was mentioned and proposed on the dev mailing list. Starting this ticket here to start to make it happen?
> History: This ticket was created as the https://issues.apache.org/jira/browse/SOLR-12939 ticket and then got JIRA-moved to become the https://issues.apache.org/jira/browse/LUCENE-8626 ticket.
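As an illustration of what such a convention check might look like (a from-scratch sketch, not the SOLR-12939 validation patches attached to the ticket; the chosen convention, a leading "Test" prefix, is just an example), a small scanner could flag non-conforming test file names:

{code}
// Illustrative sketch only: flags test classes using a trailing "...Test.java"
// name where the (assumed, for this example) convention is a leading "Test..." prefix.
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class TestNamingCheck {
  public static void main(String[] args) throws IOException {
    // Root of the test sources to scan; override via the first CLI argument.
    Path root = Paths.get(args.length > 0 ? args[0] : "src/test");
    try (Stream<Path> files = Files.walk(root)) {
      files.map(p -> p.getFileName().toString())
           .filter(name -> name.endsWith("Test.java"))  // trailing style
           .filter(name -> !name.startsWith("Test"))    // but not leading style
           .forEach(name -> System.out.println("non-conforming test class: " + name));
    }
  }
}
{code}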
[jira] [Updated] (SOLR-14712) Standardize RPC calls in Solr
[ https://issues.apache.org/jira/browse/SOLR-14712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul updated SOLR-14712:
------------------------------
Description:
We should have a standard mechanism to make a request to the right replica/node across Solr code. This RPC mechanism assumes that:
* The RPC mechanism is HTTP
* It is aware of all collections, shards & their topology, etc.
* It knows how to route a request to the correct core

was:
We should have a standard mechanism to make a request to the right replica/node across Solr code.

> Standardize RPC calls in Solr
> ------------------------------
>
>                 Key: SOLR-14712
>                 URL: https://issues.apache.org/jira/browse/SOLR-14712
>             Project: Solr
>          Issue Type: Improvement
>   Security Level: Public (Default Security Level. Issues are Public)
>            Reporter: Noble Paul
>            Assignee: Noble Paul
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> We should have a standard mechanism to make a request to the right replica/node across Solr code.
> This RPC mechanism assumes that:
> * The RPC mechanism is HTTP
> * It is aware of all collections, shards & their topology, etc.
> * It knows how to route a request to the correct core