[GitHub] [lucene-solr] sigram commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for plugin interface

2020-08-06 Thread GitBox


sigram commented on a change in pull request #1684:
URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r466853804



##
File path: 
solr/core/src/java/org/apache/solr/cluster/placement/CreateNewCollectionRequest.java
##
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.cluster.placement;
+
+import java.util.Set;
+
+/**
+ * Request for creating a new collection with a given set of shards and replication factor for
+ * various replica types.
+ * The expected {@link WorkOrder} corresponding to this {@link Request} is created using
+ * {@link WorkOrderFactory#createWorkOrderNewCollection}.
+ *
+ * Note there is no need at this stage for the plugin to know each shard's hash range, for
+ * example; this can be handled by the Solr side implementation of this interface without the
+ * plugin having to worry about it (the implementation of this interface on the Solr side can
+ * maintain the ranges for each shard).
+ *
+ * Same goes for the {@link org.apache.solr.core.ConfigSet} name and other collection
+ * parameters: they are needed for creating a Collection but likely do not have to be exposed
+ * to the plugin (this can easily be changed if needed by adding accessors here; the underlying
+ * Solr side implementation of this interface has the information).
+ */
+public interface CreateNewCollectionRequest extends Request {
+  /**
+   * The name of the collection to be created and for which placement should be computed.
+   *
+   * Compare this method with {@link AddReplicasRequest#getCollection()}; there the collection
+   * already exists, so it can be passed directly in the {@link Request}.
+   *
+   * When processing this request, plugin code doesn't have to worry about existing
+   * {@link Replica}'s for the collection, given that the collection is assumed not to exist.
+   */
+  String getCollectionName();
+
+  Set<String> getShardNames();
+
+  /**
+   * Properties passed through the Collection API by the client creating the collection.
+   * See {@link SolrCollection#getCustomProperty(String)}.
+   *
+   * Given this {@link Request} is for creating a new collection, it is not possible to pass
+   * the custom property values through the {@link SolrCollection} object. That instance does
+   * not exist yet, which is the reason {@link #getCollectionName()} exists rather than a
+   * method returning {@link SolrCollection}...
+   */
+  String getCustomProperty(String customPropertyName);

Review comment:
   Ok.

##
File path: 
solr/core/src/java/org/apache/solr/cluster/placement/AddReplicasRequest.java
##
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.cluster.placement;
+
+import java.util.Set;
+
+/**
+ * Request for creating one or more {@link Replica}'s for one or more {@link Shard}'s of an
+ * existing {@link SolrCollection}.
+ * The shard might or might not already exist; plugin code can easily find out by using
+ * {@link SolrCollection#getShards()} and checking whether the shard name(s) from
+ * {@link #getShardNames()} are there.
+ *
+ * As opposed to {@link CreateNewCollectionRequest}, the set of {@link Node}s on which the
+ * replicas should be placed is specified (defaults to being equal to the set returned by
+ * {@link Cluster#getLiveNodes()}).
+ *
+ * There is no extension between this interface and {@link CreateNewCollectionRequest} in
+ * either direction or from a common ancestor
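
For illustration, a minimal sketch of the shard-existence check described in the javadoc above, assuming the accessors quoted in this PR (the `Shard#getShardName()` accessor and the iterability of `getShards()` are my assumptions):

    import java.util.HashSet;
    import java.util.Set;

    // Hypothetical plugin-side check: which requested shards already exist?
    static void inspect(AddReplicasRequest request) {
      Set<String> existing = new HashSet<>();
      for (Shard shard : request.getCollection().getShards()) {
        existing.add(shard.getShardName()); // accessor name assumed
      }
      for (String shardName : request.getShardNames()) {
        boolean alreadyExists = existing.contains(shardName);
        // place replicas on the existing shard, or plan a brand new shard
      }
    }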

[GitHub] [lucene-solr] sigram commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for plugin interface

2020-08-06 Thread GitBox


sigram commented on a change in pull request #1684:
URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r466850645



##
File path: 
solr/core/src/java/org/apache/solr/cluster/placement/PropertyKeyFactory.java
##
@@ -0,0 +1,61 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.cluster.placement;
+
+/**
+ * Factory used by the plugin to create property keys to request property values from Solr.
+ *
+ * Building of a {@link PropertyKey} requires specifying the target (context) from which the
+ * value of that key should be obtained. This is done by specifying the appropriate
+ * {@link PropertyValueSource}.
+ * For clarity, when only a single type of target is acceptable, the corresponding subtype of
+ * {@link PropertyValueSource} is used instead (for example {@link Node}).
+ */
+public interface PropertyKeyFactory {
+  /**
+   * Returns a property key to request the number of cores on a {@link Node}.
+   */
+  PropertyKey createCoreCountKey(Node node);
+
+  /**
+   * Returns a property key to request disk related info on a {@link Node}.
+   */
+  PropertyKey createDiskInfoKey(Node node);
+
+  /**
+   * Returns a property key to request the value of a system property on a {@link Node}.
+   * @param systemPropertyName the name of the system property to retrieve.
+   */
+  PropertyKey createSystemPropertyKey(Node node, String systemPropertyName);
+
+  /**
+   * Returns a property key to request the value of a metric.
+   *
+   * Not all metrics make sense everywhere, but metrics can be applied to different objects.
+   * For example SEARCHER.searcher.indexCommitSize would make sense for a given replica of
+   * a given shard of a given collection, and possibly in other contexts.
+   *
+   * @param metricSource The registry of the metric. For example a specific {@link Replica}.
+   * @param metricName for example SEARCHER.searcher.indexCommitSize.
+   */
+  PropertyKey createMetricKey(PropertyValueSource metricSource, String metricName);

Review comment:
   `SolrDispatchFilter.setupJvmMetrics` initializes per-JVM metrics. They 
appear in a separate `solr.jvm` registry, which is different from `solr.node`.
   
   In 99% of cases (practically always in production) a Solr node maps 1:1 to a JVM 
instance. In some cases (most notably tests) there can be multiple Solr nodes running in a 
single JVM, so it's N:1 - but never the other way around, because that wouldn't make sense. 
So in some rare cases we will have multiple `solr.node` registries in one JVM (reachable via 
different API endpoints), but always a single `solr.jvm` registry (also reachable via 
different endpoints).
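   
   To illustrate that N:1 relationship with plain Dropwizard Metrics (the registry names 
here are illustrative, not the exact names Solr derives):
   
       import com.codahale.metrics.MetricRegistry;
       import com.codahale.metrics.SharedMetricRegistries;
       
       public class RegistrySketch {
         public static void main(String[] args) {
           // Two Solr nodes co-located in one JVM: each gets its own node registry...
           MetricRegistry nodeA = SharedMetricRegistries.getOrCreate("solr.node.nodeA");
           MetricRegistry nodeB = SharedMetricRegistries.getOrCreate("solr.node.nodeB");
           // ...but both share the single per-JVM registry.
           MetricRegistry jvm = SharedMetricRegistries.getOrCreate("solr.jvm");
       
           nodeA.counter("cores").inc();  // per-node metric, isolated per registry
           nodeB.counter("cores").inc();
           jvm.meter("gcEvents").mark();  // JVM-wide metric, shared by both nodes
         }
       }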








[GitHub] [lucene-solr] sigram commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for plugin interface

2020-08-06 Thread GitBox


sigram commented on a change in pull request #1684:
URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r466848207



##
File path: 
solr/core/src/java/org/apache/solr/cluster/placement/PlacementPlugin.java
##
@@ -0,0 +1,41 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.cluster.placement;
+
+/**
+ * Implemented by external plugins to control replica placement and movement on the search
+ * cluster (as well as other things such as cluster elasticity?) when cluster changes are
+ * required (initiated elsewhere, most likely following a Collection API call).
+ */
+public interface PlacementPlugin {

Review comment:
   I think we need to add an explicit mechanism for configuration of plugins; otherwise 
plugin implementors will have to fall back on other Solr facilities anyway.
   
   Maybe add a `configure(Map<String, Object> config)` method?
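   
   A minimal sketch of what such a hook could look like (the default no-op and the map's 
type parameters are my assumptions, not part of this PR):
   
       import java.util.Map;
       
       public interface PlacementPlugin {
         /** Called once after instantiation, before any placement request is processed. */
         default void configure(Map<String, Object> config) {
           // no-op by default; implementations pull their settings from the map
         }
       }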








[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1602: SOLR-14582: Expose IWC.setMaxCommitMergeWaitMillis in Solr's index config

2020-08-06 Thread GitBox


dsmiley commented on a change in pull request #1602:
URL: https://github.com/apache/lucene-solr/pull/1602#discussion_r466818795



##
File path: solr/core/src/java/org/apache/solr/update/SolrIndexConfig.java
##
@@ -87,6 +100,7 @@ private SolrIndexConfig(SolrConfig solrConfig) {
 maxBufferedDocs = -1;
 ramBufferSizeMB = 100;
 ramPerThreadHardLimitMB = -1;
+maxCommitMergeWaitMillis = -1;

Review comment:
   ok, good point.








[GitHub] [lucene-solr] atris commented on a change in pull request #1686: SOLR-13528: Implement Request Rate Limiters

2020-08-06 Thread GitBox


atris commented on a change in pull request #1686:
URL: https://github.com/apache/lucene-solr/pull/1686#discussion_r466813189



##
File path: solr/core/src/java/org/apache/solr/servlet/RateLimitManager.java
##
@@ -38,9 +41,14 @@
  * rate limiting is being done for a specific request type.
  */
 public class RateLimitManager {
+  private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+
   public final static int DEFAULT_CONCURRENT_REQUESTS = (Runtime.getRuntime().availableProcessors()) * 3;
   public final static long DEFAULT_SLOT_ACQUISITION_TIMEOUT_MS = -1;
   private final Map<String, RequestRateLimiter> requestRateLimiterMap;
+
+  // IMPORTANT: The slot from the corresponding rate limiter should be acquired before adding
+  // the request to this map. Subsequently, the request should be deleted from the map before
+  // the slot is released.
   private final Map<HttpServletRequest, RequestRateLimiter> activeRequestsMap;

Review comment:
   This is already a ConcurrentHashMap. The comment is redundant, removing.
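   
   For reference, a self-contained sketch of the acquire/register ordering the comment 
describes (field names and the semaphore-based slot pool are illustrative, not the PR's 
actual implementation):
   
       import java.util.Map;
       import java.util.concurrent.ConcurrentHashMap;
       import java.util.concurrent.Semaphore;
       
       public class SlotOrderingSketch {
         private final Semaphore slots = new Semaphore(10);
         private final Map<Object, Object> activeRequests = new ConcurrentHashMap<>();
       
         void handle(Object request) throws InterruptedException {
           slots.acquire();                        // 1. acquire the slot first
           try {
             activeRequests.put(request, request); // 2. register only while a slot is held
             // ... process the request ...
           } finally {
             activeRequests.remove(request);       // 3. deregister before releasing
             slots.release();                      // 4. release the slot last
           }
         }
       }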








[GitHub] [lucene-solr] atris commented on a change in pull request #1686: SOLR-13528: Implement Request Rate Limiters

2020-08-06 Thread GitBox


atris commented on a change in pull request #1686:
URL: https://github.com/apache/lucene-solr/pull/1686#discussion_r466812986



##
File path: 
solr/core/src/test/org/apache/solr/servlet/TestRequestRateLimiter.java
##
@@ -102,31 +103,101 @@ public Boolean call() throws Exception {
 try {
   future.get();
 } catch (Exception e) {
-  assertTrue("Not true " + e.getMessage(), 
e.getMessage().contains("non ok status: 429, message:Too Many Requests"));
+  assertThat(e.getMessage(), containsString("non ok status: 429, 
message:Too Many Requests"));
 }
   }
 
  MockRequestRateLimiter mockQueryRateLimiter = (MockRequestRateLimiter) rateLimitManager.getRequestRateLimiter(SolrRequest.SolrRequestType.QUERY);
 
-  assertTrue("Incoming request count did not match. Expected == 25  
incoming " + mockQueryRateLimiter.incomingRequestCount.get(),
-  mockQueryRateLimiter.incomingRequestCount.get() == 25);
+  assertEquals(mockQueryRateLimiter.incomingRequestCount.get(),25);
   assertTrue("Incoming accepted new request count did not match. Expected 
5 incoming " + mockQueryRateLimiter.acceptedNewRequestCount.get(),
   mockQueryRateLimiter.acceptedNewRequestCount.get() < 25);
   assertTrue("Incoming rejected new request count did not match. Expected 
20 incoming " + mockQueryRateLimiter.rejectedRequestCount.get(),
   mockQueryRateLimiter.rejectedRequestCount.get() > 0);
-  assertTrue("Incoming total processed requests count did not match. 
Expected " + mockQueryRateLimiter.incomingRequestCount.get() + " incoming "
-  + (mockQueryRateLimiter.acceptedNewRequestCount.get() + 
mockQueryRateLimiter.rejectedRequestCount.get()),
-  (mockQueryRateLimiter.acceptedNewRequestCount.get() + 
mockQueryRateLimiter.rejectedRequestCount.get()) == 
mockQueryRateLimiter.incomingRequestCount.get());
+  assertEquals(mockQueryRateLimiter.acceptedNewRequestCount.get() + 
mockQueryRateLimiter.rejectedRequestCount.get(),
+  mockQueryRateLimiter.incomingRequestCount.get());
+} finally {
+  executor.shutdown();
+}
+  }
+
+  @Test
+  public void testSlotBorrowing() throws Exception {
+    CloudSolrClient client = cluster.getSolrClient();
+    client.setDefaultCollection(SECOND_COLLECTION);
+
+    CollectionAdminRequest.createCollection(SECOND_COLLECTION, 1, 1).process(client);
+    cluster.waitForActiveCollection(SECOND_COLLECTION, 1, 1);
+
+    SolrDispatchFilter solrDispatchFilter = cluster.getJettySolrRunner(0).getSolrDispatchFilter();
+
+    RequestRateLimiter.RateLimiterConfig queryRateLimiterConfig = new RequestRateLimiter.RateLimiterConfig(SolrRequest.SolrRequestType.QUERY,
+        true, 1, DEFAULT_SLOT_ACQUISITION_TIMEOUT_MS, 5 /* allowedRequests */, true /* isSlotBorrowing */);
+    RequestRateLimiter.RateLimiterConfig indexRateLimiterConfig = new RequestRateLimiter.RateLimiterConfig(SolrRequest.SolrRequestType.UPDATE,
+        true, 1, DEFAULT_SLOT_ACQUISITION_TIMEOUT_MS, 5 /* allowedRequests */, true /* isSlotBorrowing */);
+    // We are fine with a null FilterConfig here since we ensure that MockBuilder never invokes its parent
+    RateLimitManager.Builder builder = new MockBuilder(null /* dummy FilterConfig */, new MockRequestRateLimiter(queryRateLimiterConfig, 5), new MockRequestRateLimiter(indexRateLimiterConfig, 5));
+    RateLimitManager rateLimitManager = builder.build();
+
+    solrDispatchFilter.replaceRateLimitManager(rateLimitManager);
+
+    for (int i = 0; i < 100; i++) {
+      SolrInputDocument doc = new SolrInputDocument();
+
+      doc.setField("id", i);
+      doc.setField("text", "foo");
+      client.add(doc);
+    }
+
+    client.commit();
+
+    ExecutorService executor = ExecutorUtil.newMDCAwareCachedThreadPool("threadpool");
+    List<Callable<Boolean>> callableList = new ArrayList<>();
+    List<Future<Boolean>> futures;
+
+    try {
+      for (int i = 0; i < 25; i++) {
+        callableList.add(() -> {
+          try {
+            QueryResponse response = client.query(new SolrQuery("*:*"));
+
+            if (response.getResults().getNumFound() > 0) {
+              assertEquals(100, response.getResults().getNumFound());
+            }
+          } catch (Exception e) {
+            throw new RuntimeException(e.getMessage());
+          }
+
+          return true;
+        });
+      }
+
+      futures = executor.invokeAll(callableList);
+
+      for (Future<Boolean> future : futures) {
+        try {
+          future.get();
Review comment:
   assertTrue(future.get() != null); instead?






[jira] [Updated] (SOLR-14712) Standardize RPC calls in Solr

2020-08-06 Thread Noble Paul (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-14712:
--
Description: 
We should have a standard mechanism to make a request to the right replica/node 
across Solr code.

This RPC mechanism assumes that
 * the RPC mechanism is HTTP
 * it is aware of all collections, shards & their topology, etc.
 * it knows how to route a request to the correct core

This is agnostic of wire-level formats, Solr documents, etc. That is a layer 
above this.

Anyone can use their own JSON parser or any other RPC wire-level format on top 
of this.

For example, code like this:

{code}
private void invokeOverseerOp(String electionNode, String op) {
  ModifiableSolrParams params = new ModifiableSolrParams();
  ShardHandler shardHandler = shardHandlerFactory.getShardHandler();
  params.set(CoreAdminParams.ACTION, CoreAdminAction.OVERSEEROP.toString());
  params.set("op", op);
  params.set("qt", adminPath);
  params.set("electionNode", electionNode);
  ShardRequest sreq = new ShardRequest();
  sreq.purpose = 1;
  String replica = zkStateReader.getBaseUrlForNodeName(LeaderElector.getNodeName(electionNode));
  sreq.shards = new String[]{replica};
  sreq.actualShards = sreq.shards;
  sreq.params = params;
  shardHandler.submit(sreq, replica, sreq.params);
  shardHandler.takeCompletedOrError();
}
{code}

will be replaced with
{code}
private void invokeOverseerOp(String electionNode, String op) {
  RpcFactory factory = null;
  factory.createCallRouter()
      .toNode(electionNode)
      .createHttpRpc()
      .withMethod(SolrRequest.METHOD.GET)
      .addParam(CoreAdminParams.ACTION, CoreAdminAction.OVERSEEROP.toString())
      .addParam("op", op)
      .addParam("electionNode", electionNode)
      .addParam(ShardParams.SHARDS_PURPOSE, 1)
      .withV1Path(adminPath)
      .invoke();
}
{code}

  was:
We should have a standard mechanism to make a request to the right replica/node 
across Solr code.

This RPC mechanism assumes that
 * the RPC mechanism is HTTP
 * it is aware of all collections, shards & their topology, etc.
 * it knows how to route a request to the correct core

This is agnostic of wire-level formats, Solr documents, etc. That is a layer 
above this.

Anyone can use their own JSON parser or any other RPC wire-level format on top 
of this.

For example, code like this:

{code}
private void invokeOverseerOp(String electionNode, String op) {
  ModifiableSolrParams params = new ModifiableSolrParams();
  ShardHandler shardHandler = shardHandlerFactory.getShardHandler();
  params.set(CoreAdminParams.ACTION, CoreAdminAction.OVERSEEROP.toString());
  params.set("op", op);
  params.set("qt", adminPath);
  params.set("electionNode", electionNode);
  ShardRequest sreq = new ShardRequest();
  sreq.purpose = 1;
  String replica = zkStateReader.getBaseUrlForNodeName(LeaderElector.getNodeName(electionNode));
  sreq.shards = new String[]{replica};
  sreq.actualShards = sreq.shards;
  sreq.params = params;
  shardHandler.submit(sreq, replica, sreq.params);
  shardHandler.takeCompletedOrError();
}
{code}

will be replaced with
{code}
private void invokeOverseerOp(String electionNode, String op) {
  RpcFactory factory = null;
  factory.createCallRouter()
      .toNode(electionNode)
      .createHttpRpc()
      .withMethod(SolrRequest.METHOD.GET)
      .addParam(CoreAdminParams.ACTION, CoreAdminAction.OVERSEEROP.toString())
      .addParam("op", op)
      .addParam("electionNode", electionNode)
      .addParam(ShardParams.SHARDS_PURPOSE, 1)
      .withV1Path(adminPath)
      .invoke();
}
{code}


> Standardize RPC calls in Solr
> -
>
> Key: SOLR-14712
> URL: https://issues.apache.org/jira/browse/SOLR-14712
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> We should have a standard mechanism to make a request to the right 
> replica/node across Solr code.
> This RPC mechanism assumes that
>  * the RPC mechanism is HTTP
>  * it is aware of all collections, shards & their topology, etc.
>  * it knows how to route a request to the correct core
> This is agnostic of wire-level formats, Solr documents, etc. That is a layer 
> above this.
> Anyone can use their own JSON parser or any other RPC wire-level format on 
> top of this.
> For example, code like this:
> {code}
> private void invokeOverseerOp(String electionNode, String op) {
>   ModifiableSolrParams params = new ModifiableSolrParams();
>   ShardHandler shardHandler = shardHandlerFactory.getShardHandler();
>   params.set(CoreAdminParams.ACTION, C

[jira] [Updated] (SOLR-14712) Standardize RPC calls in Solr

2020-08-06 Thread Noble Paul (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-14712:
--
Description: 
We should have a standard mechanism to make a request to the right replica/node 
across Solr code.

This RPC mechanism assumes that
 * the RPC mechanism is HTTP
 * it is aware of all collections, shards & their topology, etc.
 * it knows how to route a request to the correct core

This is agnostic of wire-level formats, Solr documents, etc. That is a layer 
above this.

Anyone can use their own JSON parser or any other RPC wire-level format on top 
of this.

For example, code like this:

{code}
private void invokeOverseerOp(String electionNode, String op) {
  ModifiableSolrParams params = new ModifiableSolrParams();
  ShardHandler shardHandler = shardHandlerFactory.getShardHandler();
  params.set(CoreAdminParams.ACTION, CoreAdminAction.OVERSEEROP.toString());
  params.set("op", op);
  params.set("qt", adminPath);
  params.set("electionNode", electionNode);
  ShardRequest sreq = new ShardRequest();
  sreq.purpose = 1;
  String replica = zkStateReader.getBaseUrlForNodeName(LeaderElector.getNodeName(electionNode));
  sreq.shards = new String[]{replica};
  sreq.actualShards = sreq.shards;
  sreq.params = params;
  shardHandler.submit(sreq, replica, sreq.params);
  shardHandler.takeCompletedOrError();
}
{code}

will be replaced with
{code}
private void invokeOverseerOp(String electionNode, String op) {
  RpcFactory factory = null;
  factory.createCallRouter()
      .toNode(electionNode)
      .createHttpRpc()
      .withMethod(SolrRequest.METHOD.GET)
      .addParam(CoreAdminParams.ACTION, CoreAdminAction.OVERSEEROP.toString())
      .addParam("op", op)
      .addParam("electionNode", electionNode)
      .addParam(ShardParams.SHARDS_PURPOSE, 1)
      .withV1Path(adminPath)
      .invoke();
}
{code}

  was:
We should have a standard mechanism to make a request to the right replica/node 
across Solr code.

This RPC mechanism assumes that
 * the RPC mechanism is HTTP
 * it is aware of all collections, shards & their topology, etc.
 * it knows how to route a request to the correct core

This is agnostic of wire-level formats, Solr documents, etc. That is a layer 
above this.

Anyone can use their own JSON parser or any other RPC wire-level format on top 
of this.

For example, code like this:

{code}
private void invokeOverseerOp(String electionNode, String op) {
  ModifiableSolrParams params = new ModifiableSolrParams();
  ShardHandler shardHandler = shardHandlerFactory.getShardHandler();
  params.set(CoreAdminParams.ACTION, CoreAdminAction.OVERSEEROP.toString());
  params.set("op", op);
  params.set("qt", adminPath);
  params.set("electionNode", electionNode);
  ShardRequest sreq = new ShardRequest();
  sreq.purpose = 1;
  String replica = zkStateReader.getBaseUrlForNodeName(LeaderElector.getNodeName(electionNode));
  sreq.shards = new String[]{replica};
  sreq.actualShards = sreq.shards;
  sreq.params = params;
  shardHandler.submit(sreq, replica, sreq.params);
  shardHandler.takeCompletedOrError();
}
{code}

will be replaced with
{code}
private void invokeOverseerOp(String electionNode, String op) {
  HttpRpcFactory factory = null;
  factory.create()
      .withHttpMethod(SolrRequest.METHOD.GET)
      .addParam(CoreAdminParams.ACTION, CoreAdminAction.OVERSEEROP.toString())
      .addParam("op", op)
      .addParam("electionNode", electionNode)
      .addParam(ShardParams.SHARDS_PURPOSE, "1")
      .withV1Uri(adminPath)
      .toNode(electionNode)
      .invoke();
}
{code}


> Standardize RPC calls in Solr
> -
>
> Key: SOLR-14712
> URL: https://issues.apache.org/jira/browse/SOLR-14712
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> We should have a standard mechanism to make a request to the right 
> replica/node across Solr code.
> This RPC mechanism assumes that
>  * the RPC mechanism is HTTP
>  * it is aware of all collections, shards & their topology, etc.
>  * it knows how to route a request to the correct core
> This is agnostic of wire-level formats, Solr documents, etc. That is a layer 
> above this.
> Anyone can use their own JSON parser or any other RPC wire-level format on 
> top of this.
> For example, code like this:
> {code}
> private void invokeOverseerOp(String electionNode, String op) {
>   ModifiableSolrParams params = new ModifiableSolrParams();
>   ShardHandler shardHandler = shardHandlerFactory.getShardHandler();
>   params.set(CoreAdminParams.ACTION, CoreAdminAction.OVERSEEROP.toString());
>   params.set("op", op);
>   params.set("qt", adminPath);
>   params.set("electionNode", electionNode);
>   ShardRequ

[GitHub] [lucene-solr] anshumg commented on a change in pull request #1720: SOLR-14712 Standardize RPC calls in Solr

2020-08-06 Thread GitBox


anshumg commented on a change in pull request #1720:
URL: https://github.com/apache/lucene-solr/pull/1720#discussion_r466802599



##
File path: solr/solrj/src/java/org/apache/solr/common/util/HttpRpc.java
##
@@ -0,0 +1,95 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.common.util;
+
+import org.apache.solr.client.solrj.SolrRequest;
+
+import java.util.Map;
+
+/**Abstract out HTTP aspects of the request

Review comment:
   New line ?

##
File path: 
solr/solrj/src/java/org/apache/solr/client/solrj/impl/CloudSolrClient.java
##
@@ -478,4 +479,11 @@ public Builder getThis() {
   return this;
 }
   }
+
+  private final RpcFactory factory = null;//TODO

Review comment:
   You don't intend to commit this, right?

##
File path: solr/solrj/src/java/org/apache/solr/common/util/CallRouter.java
##
@@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.common.util;
+
+public interface CallRouter {
+/**
+ * Send to a specific node. Usually admin requests.
+ */
+CallRouter toNode(String nodeName);
+
+/**
+ * Make a request to any replica of the shard of the given type.
+ */
+CallRouter toShard(String collection, String shard, ReplicaType type);
+
+/**
+ * Identify the shard using the route key and send the request to a given replica type.
+ */
+CallRouter toShard(String collection, ReplicaType type, String routeKey);

Review comment:
   Can we reorder this so the decision maker for the routing, i.e. routeKey, is the 2nd 
param, like `CallRouter toShard(String collection, String shard, ReplicaType type);`?
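   
   A sketch of the reordered signature being suggested (illustrative only, not what the PR 
currently has):
   
       // Routing decision maker (routeKey) moves up to the 2nd position,
       // mirroring toShard(String collection, String shard, ReplicaType type).
       CallRouter toShard(String collection, String routeKey, ReplicaType type);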

##
File path: solr/solrj/src/java/org/apache/solr/common/util/RpcFactory.java
##
@@ -0,0 +1,107 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.common.util;
+
+import org.apache.solr.common.SolrException;
+import org.apache.solr.common.params.CommonParams;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.io.OutputStream;
+import java.util.function.Function;
+
+/** A factory that creates any type of RPC call in Solr.
+ * This is designed to provide low-level access to the RPC mechanism.
+ * It is agnostic of Solr documents or other internal concepts of Solr,
+ * but it knows certain things:
+ * a) how to locate a Solr core/replica
+ * b) basic HTTP access
+ * c) serialization/deserialization is the responsibility of the code that is making a request
+ */
+public interface RpcFactory {
+
+CallRouter createCallRouter();
+
+HttpRpc createHttpRpc();
+
+
+interface ResponseConsumer {
+/**Allows this imp

[jira] [Resolved] (SOLR-14717) Writing parquets to solr shards

2020-08-06 Thread Tomas Eduardo Fernandez Lobbe (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomas Eduardo Fernandez Lobbe resolved SOLR-14717.
--
Resolution: Invalid

Hi Kevin, Jira issues are typically for reporting bugs or suggesting features. 
Please ask these kinds of questions on the users list.

> Writing parquets to solr shards
> ---
>
> Key: SOLR-14717
> URL: https://issues.apache.org/jira/browse/SOLR-14717
> Project: Solr
>  Issue Type: Wish
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Kevin Van Lieshout
>Priority: Major
>
> Is there any assistance around writing Parquet files from Spark to Solr shards, or 
> is it possible to customize a DIH to import a Parquet file to a Solr shard? Let me 
> know if this is possible, or the best workaround for this. Much appreciated, 
> thanks






[GitHub] [lucene-solr] murblanc commented on pull request #1684: SOLR-14613: strongly typed initial proposal for plugin interface

2020-08-06 Thread GitBox


murblanc commented on pull request #1684:
URL: https://github.com/apache/lucene-solr/pull/1684#issuecomment-670233414


   Thanks @sigram for the comments. They're useful, will update the PR tomorrow.






[GitHub] [lucene-solr] murblanc commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for plugin interface

2020-08-06 Thread GitBox


murblanc commented on a change in pull request #1684:
URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r466731274



##
File path: 
solr/core/src/java/org/apache/solr/cluster/placement/ReplicaPlacement.java
##
@@ -0,0 +1,29 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.cluster.placement;
+
+/**
+ * Placement decision for a single {@link Replica}. Note this placement decision is used as
+ * part of a {@link WorkOrder}; it does not directly lead to the plugin code getting a
+ * corresponding {@link Replica} instance, nor does it require the plugin to provide a
+ * {@link Shard} instance (the plugin code gets such instances for existing replicas and
+ * shards in the cluster but does not create them directly for adding new replicas for new
+ * or existing shards).
+ *
+ * Captures the {@link Shard} (via the shard name), {@link Node} and
+ * {@link Replica.ReplicaType} of a Replica to be created.
+ */
+public interface ReplicaPlacement {

Review comment:
   It does include the `Node`. See `WorkOrderFactory.createReplicaPlacement()`. It does not 
directly refer to a `Request`; the reference to `Request` is captured in the `WorkOrder` 
created using the same factory and in which the `ReplicaPlacement`s are used.
   Everything passed to the creation factories can be made accessible on the returned 
instances if needed (given it's captured in the underlying implementations), but I'm not 
convinced it's useful, so I kept it simple. The assumption is that plugin code creates these 
instances, so plugin code knows why, and keeps track of what each created instance refers 
to... But again, it's easy to add here and in other instances returned by factories (we 
might need to define subinterfaces then to make the appropriate values accessible - BTW 
that's how I started coding locally, but then simplified to limit the number of interfaces 
and for the reasons exposed above).
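   
   For what it's worth, a sketch of the call shape implied by the ReplicaPlacement javadoc 
(the exact signature of `createReplicaPlacement` is not shown in this excerpt, so the 
parameters here are inferred):
   
       // Inferred from "Captures the Shard (via the shard name), Node and ReplicaType".
       static ReplicaPlacement place(WorkOrderFactory factory, String shardName, Node node) {
         return factory.createReplicaPlacement(shardName, node, Replica.ReplicaType.NRT);
       }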








[jira] [Commented] (LUCENE-8626) standardise test class naming

2020-08-06 Thread Marcus Eagan (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172712#comment-17172712
 ] 

Marcus Eagan commented on LUCENE-8626:
--

There are many, many areas where I am looking to improve the developer 
experience and the code hygiene. I'm not some guru of clean code or anything, 
but I am starting to go through my laundry list of things that drive me (and 
others) nuts and reduce the overall quality of the project. 

I intend to add a pre-commit check to enforce this and other standards as they 
come through.

> standardise test class naming
> -
>
> Key: LUCENE-8626
> URL: https://issues.apache.org/jira/browse/LUCENE-8626
> Project: Lucene - Core
>  Issue Type: Test
>Reporter: Christine Poerschke
>Priority: Major
> Attachments: SOLR-12939.01.patch, SOLR-12939.02.patch, 
> SOLR-12939.03.patch, SOLR-12939_hoss_validation_groovy_experiment.patch
>
>
> This was mentioned and proposed on the dev mailing list. Starting this ticket 
> here to start to make it happen?
> History: This ticket was created as 
> https://issues.apache.org/jira/browse/SOLR-12939 ticket and then got 
> JIRA-moved to become https://issues.apache.org/jira/browse/LUCENE-8626 ticket.






[jira] [Commented] (LUCENE-8626) standardise test class naming

2020-08-06 Thread Marcus Eagan (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172709#comment-17172709
 ] 

Marcus Eagan commented on LUCENE-8626:
--

[~cpoerschke] Thank you very much for kicking this effort off back in 2018. 
This issue has many implications, and for what I am working on it is a detriment 
to my productivity. I'm sure I am not alone. 

If you don't mind, I'd like to take this effort a bit further and to completion 
via a PR. 

> standardise test class naming
> -
>
> Key: LUCENE-8626
> URL: https://issues.apache.org/jira/browse/LUCENE-8626
> Project: Lucene - Core
>  Issue Type: Test
>Reporter: Christine Poerschke
>Priority: Major
> Attachments: SOLR-12939.01.patch, SOLR-12939.02.patch, 
> SOLR-12939.03.patch, SOLR-12939_hoss_validation_groovy_experiment.patch
>
>
> This was mentioned and proposed on the dev mailing list. Starting this ticket 
> here to start to make it happen?
> History: This ticket was created as 
> https://issues.apache.org/jira/browse/SOLR-12939 ticket and then got 
> JIRA-moved to become https://issues.apache.org/jira/browse/LUCENE-8626 ticket.






[GitHub] [lucene-solr] murblanc commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for plugin interface

2020-08-06 Thread GitBox


murblanc commented on a change in pull request #1684:
URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r466729056



##
File path: 
solr/core/src/java/org/apache/solr/cluster/placement/PropertyKeyFactory.java
##
@@ -0,0 +1,61 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.cluster.placement;
+
+/**
+ * Factory used by the plugin to create property keys to request property values from Solr.
+ *
+ * Building of a {@link PropertyKey} requires specifying the target (context) from which the
+ * value of that key should be obtained. This is done by specifying the appropriate
+ * {@link PropertyValueSource}.
+ * For clarity, when only a single type of target is acceptable, the corresponding subtype of
+ * {@link PropertyValueSource} is used instead (for example {@link Node}).
+ */
+public interface PropertyKeyFactory {
+  /**
+   * Returns a property key to request the number of cores on a {@link Node}.
+   */
+  PropertyKey createCoreCountKey(Node node);
+
+  /**
+   * Returns a property key to request disk related info on a {@link Node}.
+   */
+  PropertyKey createDiskInfoKey(Node node);
+
+  /**
+   * Returns a property key to request the value of a system property on a {@link Node}.
+   * @param systemPropertyName the name of the system property to retrieve.
+   */
+  PropertyKey createSystemPropertyKey(Node node, String systemPropertyName);
+
+  /**
+   * Returns a property key to request the value of a metric.
+   *
+   * Not all metrics make sense everywhere, but metrics can be applied to different objects.
+   * For example SEARCHER.searcher.indexCommitSize would make sense for a given replica of
+   * a given shard of a given collection, and possibly in other contexts.
+   *
+   * @param metricSource The registry of the metric. For example a specific {@link Replica}.
+   * @param metricName for example SEARCHER.searcher.indexCommitSize.
+   */
+  PropertyKey createMetricKey(PropertyValueSource metricSource, String metricName);

Review comment:
   So these would be metrics that live on a node but that are accessed differently or with a 
different name? If we were able to distinguish them by some other means, would Node be an 
appropriate PropertyValueSource? Another type of PropertyValueSource can be introduced, but 
it would then have to point to a specific JVM...
   Can you point me to examples of these two metrics in 8x or trunk?








[GitHub] [lucene-solr] murblanc commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for plugin interface

2020-08-06 Thread GitBox


murblanc commented on a change in pull request #1684:
URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r466727764



##
File path: 
solr/core/src/java/org/apache/solr/cluster/placement/PropertyKeyFactory.java
##
@@ -0,0 +1,61 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.cluster.placement;
+
+/**
+ * Factory used by the plugin to create property keys to request property values from Solr.
+ *
+ * Building of a {@link PropertyKey} requires specifying the target (context) from which the
+ * value of that key should be obtained. This is done by specifying the appropriate
+ * {@link PropertyValueSource}.
+ * For clarity, when only a single type of target is acceptable, the corresponding subtype of
+ * {@link PropertyValueSource} is used instead (for example {@link Node}).
+ */
+public interface PropertyKeyFactory {
+  /**
+   * Returns a property key to request the number of cores on a {@link Node}.
+   */
+  PropertyKey createCoreCountKey(Node node);

Review comment:
   If we add new types of `PropertyKeys` we will have to add 
implementations for these new keys. Wouldn't we need to touch the Solr codebase 
anyway? Clients (plugins) using the interface would have to know about the new 
implementation classes and update their code to use them. Technically they 
could pass a class name through config or other means to use new 
implementations without code change, but is it a realistic scenario? What would 
they do with these keys? What are the values these keys will fetch and how will 
they be used?
   I'm not against making generic and highly flexible code but only if it's 
really needed. So if you have a real use case in mind that we should support, 
I'm open. Otherwise I'd rather keep things strongly typed for now (and as long 
as we only add stuff to these interfaces we're not breaking anything so we can 
add later).








[jira] [Commented] (SOLR-14718) Multiple flaws in tracking which UpdateCommand is associated with a given failure logged by ErrorReportingConcurrentUpdateSolrClient: "cmd=add{,id=(null)}"

2020-08-06 Thread Chris M. Hostetter (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172705#comment-17172705
 ] 

Chris M. Hostetter commented on SOLR-14718:
---

 
 # "{{add\{,id=(null)}}}" is what you get from an 
{{AddUpdateCommand.toString()}} if either:
 ** the document it contains has no uniqueKey (ie: not in use by the schema, 
not yet filled in by some custom processor, etc...) ... this situation is 
pretty rare in practice
 ** the document itself is null
 # {{JavabinLoader.parseAndLoadDocs}} takes a "re-use" approach with 
{{AddUpdateCommand}} ...
 ** it initializes a single {{AddUpdateCommand addCmd}} for the whole request
 ** it calls {{addCmd.solrDoc = document; ... processor.processAdd(addCmd); 
addCmd.clear();}} for each document
 # {{DistributedUpdateProcessor}} uses {{SolrCmdDistributor}}, which uses 
{{StreamingSolrClients}}, to create & asynchronously process a {{Req}} for each 
of the individual _documents_
 ** but along the way it keeps a reference to the (Add) {{UpdateCommand}} that 
document came from, evidently in order to log info about it during error 
handling
 ** which is useless once {{JavabinLoader}} has nulled out the details of the 
{{AddUpdateCommand}} (see the sketch after this list)
 *** IIRC there's code to "clone" the {{SolrInputDocument}} for local 
processing so we don't accidentally modify it in update processors while it's 
'in flight' for async remote updates, but in this case it's the 
{{UpdateCommand}} that's getting modified, and i guess nothing clones that?
 # BUT! ... even if we "fix" this {{AddUpdateCommand}} re-use (or clone the 
entire {{UpdateCommand}}, not just the SolrInputDocument) there appears to be 
another problem (which i think already affects things like the XML/JSON loaders 
when indexing multiple documents per request?
 ** the way the {{Req}} (and thus {{UpdateCommand}}) is tracked for use if/when there is an 
error is as a member variable on the {{ErrorReportingConcurrentUpdateSolrClient}} that 
{{StreamingSolrClients.getSolrClient(Req)}} initializes
 ** Except...
 ** {{StreamingSolrClients}} maintains a {{Map solrClients}} "cache" of solr clients key'ed 
off of the {{Req}} objects {{req.node.getUrl()}}
 *** So if a single {{SolrQueryRequest}} includes a batch of multiple documents 
destined for the same node (shard? leader?) then (AFAICT) any document which 
has a failure is going to be reported as the _first_ document in that batch
 *** ie: instead of "{{add\{,id=(null)}}}" you might get a failure for 
"{{add\{,id=xxx}}}" even though 'xxx' may have been indexed just fine, but doc 
'yyy' (which lives in the same shard as 'xxx') may have been the "add" command 
that really failed
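
A minimal, self-contained sketch (hypothetical types, not the real Solr classes) of the 
re-use flaw in points 2-3: the async error path retains a reference to a single mutable 
command that the loader clears after each document, so every error report sees an empty 
command.
{code:java}
import java.util.ArrayList;
import java.util.List;

public class CommandReuseSketch {
  static class AddCmd {
    Object solrDoc;
    void clear() { solrDoc = null; }
    public String toString() { return "add{,id=" + (solrDoc == null ? "(null)" : solrDoc) + "}"; }
  }

  public static void main(String[] args) {
    AddCmd addCmd = new AddCmd();                        // one instance for the whole request
    List<AddCmd> keptForErrorReporting = new ArrayList<>();
    for (Object document : List.of("xxx", "yyy")) {
      addCmd.solrDoc = document;
      keptForErrorReporting.add(addCmd);                 // reference kept for async error logs
      addCmd.clear();                                    // cleared before any error can fire
    }
    keptForErrorReporting.forEach(System.out::println);  // prints add{,id=(null)} twice
  }
}
{code}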

> Multiple flaws in tracking which UpdateCommand is associated with a given 
> failure logged by ErrorReportingConcurrentUpdateSolrClient: 
> "cmd=add{,id=(null)}"
> ---
>
> Key: SOLR-14718
> URL: https://issues.apache.org/jira/browse/SOLR-14718
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Chris M. Hostetter
>Priority: Major
>
> Here's an example, taken from SOLR-13486, of an ERROR logged by 
> {{ErrorReportingConcurrentUpdateSolrClient}} when a distributed update failure 
> occurred...
> {noformat}
>[junit4]   2> 1704143 ERROR 
> (updateExecutor-6525-thread-1-processing-x:outOfSyncReplicasCannotBecomeLeader-false_shard1_replica_n1
>  r:core_node2 null n:127.0.0.1:34940_solr 
> c:outOfSyncReplicasCannotBecomeLeader-false s:shard1) [n:127.0.0.1:34940_solr 
> c:outOfSyncReplicasCannotBecomeLeader-false s:shard1 r:core_node2 
> x:outOfSyncReplicasCannotBecomeLeader-false_shard1_replica_n1] 
> o.a.s.u.ErrorReportingConcurrentUpdateSolrClient Error when calling 
> SolrCmdDistributor$Req: cmd=add{,id=(null)}; node=StdNode: 
> http://127.0.0.1:40376/solr/outOfSyncReplicasCannotBecomeLeader-false_shard1_replica_n5/
>  to 
> http://127.0.0.1:40376/solr/outOfSyncReplicasCannotBecomeLeader-false_shard1_replica_n5/
>[junit4]   2>   => java.io.IOException: java.net.ConnectException: 
> Connection refused
> {noformat}
> In this case the underlying cause was a ConnectException - but the same 
> ERROR msg format is used regardless of the underlying Exception that was 
> thrown - and it's the result of these two bits of code...
> {code:java}
> // ErrorReportingConcurrentUpdateSolrClient.handleError
> log.error("Error when calling {} to {}", req, req.node.getUrl(), ex);
> // Req.toString()...
> public String toString() {
>   StringBuilder sb = new StringBuilder();
>   sb.append("SolrCmdDistributor$Req: cmd=").append(cmd.toString());
>   sb.append("; node=").append(String.valueOf(node));
>   

[jira] [Created] (SOLR-14718) Multiple flaws in tracking which UpdateCommand is associated with a given failure logged by ErrorReportingConcurrentUpdateSolrClient: "cmd=add{,id=(null)}"

2020-08-06 Thread Chris M. Hostetter (Jira)
Chris M. Hostetter created SOLR-14718:
-

 Summary: Multiple flaws in tracking which UpdateCommand is 
associated with a given failure logged by 
ErrorReportingConcurrentUpdateSolrClient: "cmd=add{,id=(null)}"
 Key: SOLR-14718
 URL: https://issues.apache.org/jira/browse/SOLR-14718
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Chris M. Hostetter


Here's an example, taken from SOLR-13486, of an ERROR logged by 
{{ErrorReportingConcurrentUpdateSolrClient}} when a distributed update failure 
occurred...
{noformat}
   [junit4]   2> 1704143 ERROR 
(updateExecutor-6525-thread-1-processing-x:outOfSyncReplicasCannotBecomeLeader-false_shard1_replica_n1
 r:core_node2 null n:127.0.0.1:34940_solr 
c:outOfSyncReplicasCannotBecomeLeader-false s:shard1) [n:127.0.0.1:34940_solr 
c:outOfSyncReplicasCannotBecomeLeader-false s:shard1 r:core_node2 
x:outOfSyncReplicasCannotBecomeLeader-false_shard1_replica_n1] 
o.a.s.u.ErrorReportingConcurrentUpdateSolrClient Error when calling 
SolrCmdDistributor$Req: cmd=add{,id=(null)}; node=StdNode: 
http://127.0.0.1:40376/solr/outOfSyncReplicasCannotBecomeLeader-false_shard1_replica_n5/
 to 
http://127.0.0.1:40376/solr/outOfSyncReplicasCannotBecomeLeader-false_shard1_replica_n5/
   [junit4]   2>   => java.io.IOException: java.net.ConnectException: 
Connection refused
{noformat}
In this case the underlying cause was a ConnectException - but the same 
ERROR msg format is used regardless of the underlying Exception that was thrown 
- and it's the result of these two bits of code...
{code:java}
// ErrorReportingConcurrentUpdateSolrClient.handleError
log.error("Error when calling {} to {}", req, req.node.getUrl(), ex);

// Req.toString()...
public String toString() {
  StringBuilder sb = new StringBuilder();
  sb.append("SolrCmdDistributor$Req: cmd=").append(cmd.toString());
  sb.append("; node=").append(String.valueOf(node));
  return sb.toString();
}
{code}
I was recently asked why the {{UpdateCommand cmd}} reported by the 
{{Req.toString()}} was *ALWAYS* showing up as {{add\{,id=(null)};}} (ie: an 
"empty" {{AddUpdateCommand}} ) instead of correctly identifying which document 
was failing.

In the above case of a "ConnectionException" this may not matter, but the same 
problem exists if an individual document has problem, perhaps due to schema 
conflictss detected by the leader when some other node forwards TOLEADER.

Based on an audit of the code, there appear to be at least 2 different bugs in Solr 
that can cause the "cmd" reported in these error situations to be wrong:
 * UpdateCommand re-use in JavabinLoader
 * ErrorReportingConcurrentUpdateSolrClient in StreamingSolrClients

...full notes to follow in comment.






[GitHub] [lucene-solr] murblanc commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for plugin interface

2020-08-06 Thread GitBox


murblanc commented on a change in pull request #1684:
URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r466724009



##
File path: 
solr/core/src/java/org/apache/solr/cluster/placement/CreateNewCollectionRequest.java
##
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.cluster.placement;
+
+import java.util.Set;
+
+/**
+ * Request for creating a new collection with a given set of shards and 
replication factor for various replica types.
+ * The expected {@link WorkOrder} corresponding to this {@link Request} is 
created using
+ * {@link WorkOrderFactory#createWorkOrderNewCollection}
+ *
+ * Note there is no need at this stage to allow the plugin to know each 
shard hash range for example, this can be handled
+ * by the Solr side implementation of this interface without needing the 
plugin to worry about it (the implementation of this interface on
+ * the Solr side can maintain the ranges for each shard).
+ *
+ * Same goes for the {@link org.apache.solr.core.ConfigSet} name or other 
collection parameters. They are needed for
+ * creating a Collection but likely do not have to be exposed to the plugin 
(this can easily be changed if needed by
+ * adding accessors here, the underlying Solr side implementation of this 
interface has the information).
+ */
+public interface CreateNewCollectionRequest extends Request {
+  /**
+   * The name of the collection to be created and for which placement 
should be computed.
+   *
+   * Compare this method with {@link AddReplicasRequest#getCollection()}, 
there the collection already exists so can be
+   * directly passed in the {@link Request}.
+   *
+   * When processing this request, plugin code doesn't have to worry about 
existing {@link Replica}'s for the collection
+   * given that the collection is assumed not to exist.
+   */
+  String getCollectionName();
+
+  Set<String> getShardNames();
+
+  /**
+   * Properties passed through the Collection API by the client creating 
the collection.
+   * See {@link SolrCollection#getCustomProperty(String)}.
+   *
+   * Given this {@link Request} is for creating a new collection, it is not 
possible to pass the custom property values through
+   * the {@link SolrCollection} object. That instance does not exist yet, and 
is the reason {@link #getCollectionName()} exists
+   * rather than a method returning {@link SolrCollection}...
+   */
+  String getCustomProperty(String customPropertyName);

Review comment:
   I can add such an enumeration (but then I would skip non `String` 
properties, or just `toString()` everything), but it's unclear to me how a plugin 
would base placement decisions on properties it doesn't know about.
   Indeed, the general idea is that the plugin does not make the calls and does 
not need access to all the information from the Collection API call. It is 
called only for placement computation; the code on the Solr side knows everything 
the Collection API call has provided and will handle the CREATE command (sent to 
the Overseer) just as it does today.
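
   For illustration, a plugin consuming this accessor might do something like the 
following sketch (the property name is hypothetical, not one Solr defines):
   ```java
   // Look up a property the plugin knows about by convention with its users.
   String rack = request.getCustomProperty("rack");
   if (rack != null) {
     // use the value in the placement computation
   }
   ```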





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] murblanc commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for plugin interface

2020-08-06 Thread GitBox


murblanc commented on a change in pull request #1684:
URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r466722280



##
File path: solr/core/src/java/org/apache/solr/cluster/placement/Cluster.java
##
@@ -0,0 +1,46 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.cluster.placement;
+
+import java.io.IOException;
+import java.util.Optional;
+import java.util.Set;
+
+/**
+ * A representation of the (initial) cluster state, providing information 
on which nodes are part of the cluster and a way
+ * to get to more detailed info.
+ *
+ * This instance can also be used as a {@link PropertyValueSource} if 
{@link PropertyKey}'s need to be specified with
+ * a global cluster target.
+ */
+public interface Cluster extends PropertyValueSource {
+  /**
+   * @return current set of live nodes. Never null, never empty 
(Solr wouldn't call the plugin if empty
+   * since no useful could then be done).
+   */
+  Set<Node> getLiveNodes();
+
+  /**
+   * Returns info about the given collection if one exists. Because it is 
not expected for plugins to request info about
+   * a large number of collections, requests can only be made one by one.
+   *
+   * This is also the reason we do not return a {@link java.util.Map} or 
{@link Set} of {@link SolrCollection}'s here: it would be
+   * wasteful to fetch all data and fill such a map when plugin code likely 
needs info about at most one or two collections.
+   */
+  Optional<SolrCollection> getCollection(String collectionName) throws IOException;

Review comment:
   Ok. I don't really see a use case for this (interested if you have 
something specific in mind) but will add. My thinking was that plugins will be 
interested in the Collection they need to compute placement for, and in other 
specific collections only if "their" collection's properties reference another 
collection (for example, something along the lines of `withCollection`: even 
though we removed it, a plugin could reimplement it).
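
   For illustration, the kind of lookup described above might read like this 
sketch against the proposed API (the `withCollection` property name is only an 
example, not part of the interface):
   ```java
   // One-by-one lookup of a referenced collection (IOException handling omitted).
   String other = myCollection.getCustomProperty("withCollection");
   if (other != null) {
     Optional<SolrCollection> ref = cluster.getCollection(other);
     // if present, factor ref's existing replica placement into the decision
   }
   ```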





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] murblanc commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for plugin interface

2020-08-06 Thread GitBox


murblanc commented on a change in pull request #1684:
URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r466721386



##
File path: 
solr/core/src/java/org/apache/solr/cluster/placement/AddReplicasRequest.java
##
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.cluster.placement;
+
+import java.util.Set;
+
+/**
+ * Request for creating one or more {@link Replica}'s for one or more 
{@link Shard}'s of an existing {@link SolrCollection}.
+ * The shard might or might not already exist, plugin code can easily find out 
by using {@link SolrCollection#getShards()}
+ * and verifying if the shard name(s) from {@link #getShardNames()} are there.
+ *
+ * As opposed to {@link CreateNewCollectionRequest}, the set of {@link 
Node}s on which the replicas should be placed
+ * is specified (defaults to being equal to the set returned by {@link 
Cluster#getLiveNodes()}).
+ *
+ * There is no extension between this interface and {@link 
CreateNewCollectionRequest} in either direction
+ * or from a common ancestor for readability. An ancestor could make sense and 
would be an "abstract interface" not intended
+ * to be implemented directly, but this does not exist in Java.
+ *
+ * Plugin code would likely treat the two types of requests differently 
since here existing {@link Replica}'s must be taken
+ * into account for placement whereas in {@link CreateNewCollectionRequest} no 
{@link Replica}'s are assumed to exist.
+ */
+public interface AddReplicasRequest extends Request {
+  /**
+   * The {@link SolrCollection} to add {@link Replica}(s) to. The replicas are 
to be added to a shard that might or might
+   * not yet exist when the plugin's {@link PlacementPlugin#computePlacement} 
is called.
+   */
+  SolrCollection getCollection();
+
+  /**
+   * Shard name(s) for which new replicas placement should be computed. The 
shard(s) might exist or not (that's why this
+   * method returns a {@link Set} of {@link String}'s and not directly a set 
of {@link Shard} instances).
+   *
+   * Note the Collection API allows specifying the shard name or a {@code 
_route_} parameter. The Solr implementation will
+   * convert either specification into the relevant shard name so the plugin 
code doesn't have to worry about this.
+   */
+  Set<String> getShardNames();
+
+  /** Replicas should only be placed on nodes from the set returned by this 
method. */
+  Set<Node> getTargetNodes();

Review comment:
   My motivation here was to not have the plugin worry about it: if no 
specific subset of nodes was passed, just make this set equivalent to all live 
nodes. That way no special logic is needed on the plugin side for dealing with this.
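
   For illustration, the Solr-side defaulting described above could look like 
this sketch (`Node`/`getLiveNodes` are from the proposed API; `requestedNodes` 
is a hypothetical local variable):
   ```java
   // If the Collection API call named no specific nodes, expose all live nodes,
   // so plugin code can always iterate getTargetNodes() without special-casing.
   Set<Node> targetNodes = (requestedNodes == null || requestedNodes.isEmpty())
       ? cluster.getLiveNodes()
       : requestedNodes;
   ```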





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (SOLR-14717) Writing parquets to solr shards

2020-08-06 Thread Kevin Van Lieshout (Jira)
Kevin Van Lieshout created SOLR-14717:
-

 Summary: Writing parquets to solr shards
 Key: SOLR-14717
 URL: https://issues.apache.org/jira/browse/SOLR-14717
 Project: Solr
  Issue Type: Wish
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Kevin Van Lieshout


Is there any assistance around writing parquets from spark to solr shards or is 
it possible to customize a DIH to import a parquet to a solr shard. Let me know 
if this is possible, or the best work around for this. Much appreciated, thanks



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] madrob commented on a change in pull request #1686: SOLR-13528: Implement Request Rate Limiters

2020-08-06 Thread GitBox


madrob commented on a change in pull request #1686:
URL: https://github.com/apache/lucene-solr/pull/1686#discussion_r466588870



##
File path: 
solr/core/src/test/org/apache/solr/servlet/TestRequestRateLimiter.java
##
@@ -102,31 +103,101 @@ public Boolean call() throws Exception {
 try {
   future.get();
 } catch (Exception e) {
-  assertTrue("Not true " + e.getMessage(), 
e.getMessage().contains("non ok status: 429, message:Too Many Requests"));
+  assertThat(e.getMessage(), containsString("non ok status: 429, 
message:Too Many Requests"));
 }
   }
 
   MockRequestRateLimiter mockQueryRateLimiter = (MockRequestRateLimiter) 
rateLimitManager.getRequestRateLimiter(SolrRequest.SolrRequestType.QUERY);
 
-  assertTrue("Incoming request count did not match. Expected == 25  
incoming " + mockQueryRateLimiter.incomingRequestCount.get(),
-  mockQueryRateLimiter.incomingRequestCount.get() == 25);
+  assertEquals(mockQueryRateLimiter.incomingRequestCount.get(),25);

Review comment:
   nit: swap the parameters. assertEquals(expected, actual)
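
   i.e. (expected first, actual second):
   ```java
   assertEquals(25, mockQueryRateLimiter.incomingRequestCount.get());
   ```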

##
File path: solr/core/src/java/org/apache/solr/servlet/RateLimitManager.java
##
@@ -92,17 +100,28 @@ public boolean handleRequest(HttpServletRequest request) 
throws InterruptedException
* For each request rate limiter whose type that is not of the type of the 
request which got rejected,
* check if slot borrowing is enabled. If enabled, try to acquire a slot.
* If allotted, return else try next request type.
+   *
+   * @lucene.gexperimental -- Can cause slots to be blocked if a request 
borrows a slot and is itself long lived.

Review comment:
   s/gexperimental/experimental

##
File path: 
solr/core/src/test/org/apache/solr/servlet/TestRequestRateLimiter.java
##
@@ -102,31 +103,101 @@ public Boolean call() throws Exception {
 try {
   future.get();
 } catch (Exception e) {
-  assertTrue("Not true " + e.getMessage(), 
e.getMessage().contains("non ok status: 429, message:Too Many Requests"));
+  assertThat(e.getMessage(), containsString("non ok status: 429, 
message:Too Many Requests"));
 }
   }
 
   MockRequestRateLimiter mockQueryRateLimiter = (MockRequestRateLimiter) 
rateLimitManager.getRequestRateLimiter(SolrRequest.SolrRequestType.QUERY);
 
-  assertTrue("Incoming request count did not match. Expected == 25  
incoming " + mockQueryRateLimiter.incomingRequestCount.get(),
-  mockQueryRateLimiter.incomingRequestCount.get() == 25);
+  assertEquals(mockQueryRateLimiter.incomingRequestCount.get(),25);
   assertTrue("Incoming accepted new request count did not match. Expected 
5 incoming " + mockQueryRateLimiter.acceptedNewRequestCount.get(),
   mockQueryRateLimiter.acceptedNewRequestCount.get() < 25);
   assertTrue("Incoming rejected new request count did not match. Expected 
20 incoming " + mockQueryRateLimiter.rejectedRequestCount.get(),
   mockQueryRateLimiter.rejectedRequestCount.get() > 0);
-  assertTrue("Incoming total processed requests count did not match. 
Expected " + mockQueryRateLimiter.incomingRequestCount.get() + " incoming "
-  + (mockQueryRateLimiter.acceptedNewRequestCount.get() + 
mockQueryRateLimiter.rejectedRequestCount.get()),
-  (mockQueryRateLimiter.acceptedNewRequestCount.get() + 
mockQueryRateLimiter.rejectedRequestCount.get()) == 
mockQueryRateLimiter.incomingRequestCount.get());
+  assertEquals(mockQueryRateLimiter.acceptedNewRequestCount.get() + 
mockQueryRateLimiter.rejectedRequestCount.get(),

Review comment:
   (expected, actual)

##
File path: 
solr/core/src/test/org/apache/solr/servlet/TestRequestRateLimiter.java
##
@@ -102,31 +103,101 @@ public Boolean call() throws Exception {
 try {
   future.get();
 } catch (Exception e) {
-  assertTrue("Not true " + e.getMessage(), 
e.getMessage().contains("non ok status: 429, message:Too Many Requests"));
+  assertThat(e.getMessage(), containsString("non ok status: 429, 
message:Too Many Requests"));
 }
   }
 
   MockRequestRateLimiter mockQueryRateLimiter = (MockRequestRateLimiter) 
rateLimitManager.getRequestRateLimiter(SolrRequest.SolrRequestType.QUERY);
 
-  assertTrue("Incoming request count did not match. Expected == 25  
incoming " + mockQueryRateLimiter.incomingRequestCount.get(),
-  mockQueryRateLimiter.incomingRequestCount.get() == 25);
+  assertEquals(mockQueryRateLimiter.incomingRequestCount.get(),25);
   assertTrue("Incoming accepted new request count did not match. Expected 
5 incoming " + mockQueryRateLimiter.acceptedNewRequestCount.get(),
   mockQueryRateLimiter.acceptedNewRequestCount.get() < 25);
   assertTrue("Incoming rejected new request count did not match. Expected 
20 incoming " + mockQueryRateLimiter.rejectedRequest

[jira] [Commented] (LUCENE-8626) standardise test class naming

2020-08-06 Thread Erick Erickson (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172666#comment-17172666
 ] 

Erick Erickson commented on LUCENE-8626:


Hmmm, one other random thought. While I'd prefer some kind of enforcement, I'd 
claim that if we just changed the test file names to whatever we agree on, the 
fact that they'd all be consistent would make it less likely that new test 
classes are created with the abandoned pattern. I think it's worth making the 
change first, then worrying about enforcement.

> standardise test class naming
> -
>
> Key: LUCENE-8626
> URL: https://issues.apache.org/jira/browse/LUCENE-8626
> Project: Lucene - Core
>  Issue Type: Test
>Reporter: Christine Poerschke
>Priority: Major
> Attachments: SOLR-12939.01.patch, SOLR-12939.02.patch, 
> SOLR-12939.03.patch, SOLR-12939_hoss_validation_groovy_experiment.patch
>
>
> This was mentioned and proposed on the dev mailing list. Starting this ticket 
> here to start to make it happen?
> History: This ticket was created as 
> https://issues.apache.org/jira/browse/SOLR-12939 ticket and then got 
> JIRA-moved to become https://issues.apache.org/jira/browse/LUCENE-8626 ticket.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-9959) SolrInfoMBean-s category and hierarchy cleanup

2020-08-06 Thread Andrzej Bialecki (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-9959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172658#comment-17172658
 ] 

Andrzej Bialecki commented on SOLR-9959:


It's a left-over from refactoring in SOLR-13858, it needs to be deleted (it was 
actually in use when it was first introduced) - I'll take care of this.

> SolrInfoMBean-s category and hierarchy cleanup
> --
>
> Key: SOLR-9959
> URL: https://issues.apache.org/jira/browse/SOLR-9959
> Project: Solr
>  Issue Type: Improvement
>  Components: metrics
>Affects Versions: 7.0
>Reporter: Andrzej Bialecki
>Assignee: Andrzej Bialecki
>Priority: Blocker
> Fix For: 7.0
>
> Attachments: SOLR-9959.patch, SOLR-9959.patch, SOLR-9959.patch
>
>
> SOLR-9947 changed categories of some of {{SolrInfoMBean-s}}, and it also 
> added an alternative view in JMX, similar to the one produced by 
> {{SolrJmxReporter}}.
> Some changes were left out from that issue because they would break the 
> back-compatibility in 6.x, but they should be done before 7.0:
> * remove the old JMX view of {{SolrInfoMBean}}-s and improve the new one so 
> that it's more readable and useful.
> * in many cases {{SolrInfoMBean.getName()}} just returns a FQCN, but it could 
> be more informative - eg. for highlighter or query plugins this could be the 
> symbolic name of a plugin that users know and use in configuration.
> * top-level categories need more thought. On one hand it's best to minimize 
> their number, on the other hand they need to meaningfully represent the 
> functionality of components that use them. SOLR-9947 made some cosmetic 
> changes, but more discussion is necessary (eg. QUERY vs. SEARCHHANDLER)
> * we should consider removing some of the methods in {{SolrInfoMBean}} that 
> usually don't return any useful information, eg. {{getDocs}}, {{getSource()}} 
> and {{getVersion()}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] gus-asf commented on a change in pull request #1716: SOLR-14706: Fix support for default autoscaling policy

2020-08-06 Thread GitBox


gus-asf commented on a change in pull request #1716:
URL: https://github.com/apache/lucene-solr/pull/1716#discussion_r466670298



##
File path: solr/solr-ref-guide/src/solr-upgrade-notes.adoc
##
@@ -72,7 +84,9 @@ More information about this new feature is available in the 
section <

[GitHub] [lucene-solr] HoustonPutman commented on a change in pull request #1716: SOLR-14706: Fix support for default autoscaling policy

2020-08-06 Thread GitBox


HoustonPutman commented on a change in pull request #1716:
URL: https://github.com/apache/lucene-solr/pull/1716#discussion_r466657348



##
File path: 
solr/solrj/src/java/org/apache/solr/client/solrj/cloud/autoscaling/Clause.java
##
@@ -117,10 +117,10 @@ private Clause(Map m) {
 strict = Boolean.parseBoolean(String.valueOf(m.getOrDefault("strict", 
"true")));
 Optional<String> globalTagName = 
m.keySet().stream().filter(Policy.GLOBAL_ONLY_TAGS::contains).findFirst();
 if (globalTagName.isPresent()) {
-  globalTag = parse(globalTagName.get(), m);
-  if (m.size() > 2) {
-throw new RuntimeException("Only one extra tag supported for the tag " 
+ globalTagName.get() + " in " + toJSONString(m));
+  if (m.size() > 3) {

Review comment:
   I think that's the logic Gus used. It's equivalent to:
   
   ```java
   private void validateGlobalTag(Map m, String tagName) {
   if (m.size() > 2) {
 if (!(m.containsKey("strict") && m.size() == 3)) {
   throw new RuntimeException("Only, 'strict' and one extra tag 
supported for the tag " + tagName + " in " + toJSONString(m));
 }
   }
 }
   ```
   
   This will error unless there are exactly 3 keys and one of them is `strict`.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] sigram commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for plugin interface

2020-08-06 Thread GitBox


sigram commented on a change in pull request #1684:
URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r466617275



##
File path: solr/core/src/java/org/apache/solr/cloud/api/collections/Assign.java
##
@@ -569,14 +574,20 @@ public AssignStrategy create(ClusterState clusterState, 
DocCollection collection
 case RULES:
   List<Rule> rules = new ArrayList<>();
   for (Object map : ruleMaps) rules.add(new Rule((Map) map));
+  @SuppressWarnings({"rawtypes"})
+  List snitches = (List) collection.get(SNITCH);
   return new RulesBasedAssignStrategy(rules, snitches, clusterState);
+case PLUGIN_PLACEMENT:
+  // TODO need to decide which plugin class to use. Global config 
(single plugin for all PLUGIN_PLACEMENT collections?) or per collection config?
+  // TODO hardcoding a sample plugin for now. DO NOT MERGE this as is.
+  return new PlacementPluginAssignStrategy(new 
SamplePluginMinimizeCores());
 default:
   throw new Assign.AssignmentException("Unknown strategy type: " + 
strategy);
   }
 }
 
 private enum Strategy {
-  LEGACY, RULES;
+  LEGACY, RULES, PLUGIN_PLACEMENT;

Review comment:
   `Strategy` already describes how to perform placement. Maybe we should 
rename `Strategy` -> `Placement` and simply use `PLUGIN` here?

##
File path: solr/core/src/java/org/apache/solr/cluster/placement/Cluster.java
##
@@ -0,0 +1,46 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.cluster.placement;
+
+import java.io.IOException;
+import java.util.Optional;
+import java.util.Set;
+
+/**
+ * A representation of the (initial) cluster state, providing information 
on which nodes are part of the cluster and a way
+ * to get to more detailed info.
+ *
+ * This instance can also be used as a {@link PropertyValueSource} if 
{@link PropertyKey}'s need to be specified with
+ * a global cluster target.
+ */
+public interface Cluster extends PropertyValueSource {
+  /**
+   * @return current set of live nodes. Never null, never empty 
(Solr wouldn't call the plugin if empty
+   * since no useful could then be done).

Review comment:
   `no useful` -> `no useful work`

##
File path: 
solr/core/src/java/org/apache/solr/cluster/placement/SystemPropertyPropertyValue.java
##
@@ -0,0 +1,28 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.cluster.placement;
+
+/**
+ * A {@link PropertyValue} representing a System property on the target {@link 
Node}.
+ */
+public interface SystemPropertyPropertyValue extends PropertyValue {

Review comment:
   Maybe rename it to `SyspropPropertyValue` to avoid this weird repetition?

##
File path: 
solr/core/src/java/org/apache/solr/cluster/placement/MetricPropertyValue.java
##
@@ -0,0 +1,30 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ 

[GitHub] [lucene-solr] HoustonPutman commented on a change in pull request #1716: SOLR-14706: Fix support for default autoscaling policy

2020-08-06 Thread GitBox


HoustonPutman commented on a change in pull request #1716:
URL: https://github.com/apache/lucene-solr/pull/1716#discussion_r466643442



##
File path: solr/solr-ref-guide/src/solr-upgrade-notes.adoc
##
@@ -34,17 +34,29 @@ Detailed steps for upgrading a Solr cluster are in the 
section <> 
below.
 
-=== Solr 8.6.1
+=== Solr 8.6.1 (Upgrading from 8.6.0 only)
+
+See the https://cwiki.apache.org/confluence/display/SOLR/ReleaseNote861[8.6.1 
Release Notes^]
+for an overview of the fixes included in Solr 8.6.1.
+
+When upgrading to 8.6.1 users should be aware of the following major changes 
from 8.6.0.
 
 *Autoscaling*
 
 * As mentioned in the 8.6 upgrade notes, a default autoscaling policy was 
provided starting in 8.6.0.
 This default autoscaling policy resulted in increasingly slow collection 
creation calls in large clusters (50+ collections).
 +
 In 8.6.1 the default autoscaling policy has been removed, and clouds will not 
use autoscaling unless a policy has explicitly been created.
-In order to fix the performance degradations introduced in 8.6.0, merely 
upgrade to 8.6.1.
+If your cloud is running 8.6.0 and **not using an explicit autoscaling 
policy**, upgrade to 8.6.1 and remove the default cluster policy and 
preferences via the following command.
+Replace `localhost:8983` with your Solr endpoint.
++
+```
+curl -X POST -H 'Content-type:application/json'  -d '{set-cluster-policy : [], 
set-cluster-preferences : []}' http://localhost:8983/api/cluster/autoscaling

Review comment:
   Hmmm, I'm a bit confused about that. So if I spin up a new cloud with 
8.5, I don't see a `cluster-preferences`. It basically looks the same as an 
8.6.1 cloud that has had the above command run. So if we reverted the change of 
the defaulting patch, shouldn't an 8.5 cloud and an 8.6.1 cloud with identical 
`autoscaling.json` ZNodes behave the same?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9379) Directory based approach for index encryption

2020-08-06 Thread Rajeswari Natarajan (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172622#comment-17172622
 ] 

Rajeswari Natarajan commented on LUCENE-9379:
-

We have a use case where we want to fit multiple indexes/tenants per collection, 
where each index/tenant should have a separate key, and we would like to use the 
composite ID router. The use of the composite ID router does not limit each 
index/tenant to one shard/directory. In this scenario, is OS-level encryption 
possible?

> Directory based approach for index encryption
> -
>
> Key: LUCENE-9379
> URL: https://issues.apache.org/jira/browse/LUCENE-9379
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Bruno Roustant
>Assignee: Bruno Roustant
>Priority: Major
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> +Important+: This Lucene Directory wrapper approach is to be considered only 
> if an OS level encryption is not possible. OS level encryption better fits 
> Lucene usage of OS cache, and thus is more performant.
> But there are some use-case where OS level encryption is not possible. This 
> Jira issue was created to address those.
> 
>  
> The goal is to provide optional encryption of the index, with a scope limited 
> to an encryptable Lucene Directory wrapper.
> Encryption is at rest on disk, not in memory.
> This simple approach should fit any Codec as it would be orthogonal, without 
> modifying APIs as much as possible.
> Use a standard encryption method. Limit perf/memory impact as much as 
> possible.
> Determine how callers provide encryption keys. They must not be stored on 
> disk.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Comment Edited] (LUCENE-8626) standardise test class naming

2020-08-06 Thread Michael Sokolov (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172538#comment-17172538
 ] 

Michael Sokolov edited comment on LUCENE-8626 at 8/6/20, 6:58 PM:
--

Personally, I prefer -prefixing- suffixing (um suffix is the one that comes 
after, right?), but more than that, I'd value consistency. Still, without 
automated enforcement, we won't get that either. So, I'd be -0 to this change 
unless it comes along with enforcement (banning files with the nonstandard 
naming scheme). Otherwise we'll just be back here again in a year...  uh, 
actually reading the thread now I see we do have an enforcement mechanism, OK 
that's great. If we can come to some consensus here, then rename away!


was (Author: sokolov):
Personally, I prefer prefixing, but more than that, I'd value consistency. 
Still, without automated enforcement, we won't get that either. So, I'd be -0 
to this change unless it comes along with enforcement (banning files with the 
nonstandard naming scheme). Otherwise we'll just be back here again in a 
year...  uh, actually reading the thread now I see we do have an enforcement 
mechanism, OK that's great. If we can come to some consensus here, then rename 
away!

> standardise test class naming
> -
>
> Key: LUCENE-8626
> URL: https://issues.apache.org/jira/browse/LUCENE-8626
> Project: Lucene - Core
>  Issue Type: Test
>Reporter: Christine Poerschke
>Priority: Major
> Attachments: SOLR-12939.01.patch, SOLR-12939.02.patch, 
> SOLR-12939.03.patch, SOLR-12939_hoss_validation_groovy_experiment.patch
>
>
> This was mentioned and proposed on the dev mailing list. Starting this ticket 
> here to start to make it happen?
> History: This ticket was created as 
> https://issues.apache.org/jira/browse/SOLR-12939 ticket and then got 
> JIRA-moved to become https://issues.apache.org/jira/browse/LUCENE-8626 ticket.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] sigram commented on a change in pull request #1716: SOLR-14706: Fix support for default autoscaling policy

2020-08-06 Thread GitBox


sigram commented on a change in pull request #1716:
URL: https://github.com/apache/lucene-solr/pull/1716#discussion_r466605349



##
File path: 
solr/solrj/src/java/org/apache/solr/client/solrj/cloud/autoscaling/Clause.java
##
@@ -117,10 +117,10 @@ private Clause(Map m) {
 strict = Boolean.parseBoolean(String.valueOf(m.getOrDefault("strict", 
"true")));
 Optional<String> globalTagName = 
m.keySet().stream().filter(Policy.GLOBAL_ONLY_TAGS::contains).findFirst();
 if (globalTagName.isPresent()) {
-  globalTag = parse(globalTagName.get(), m);
-  if (m.size() > 2) {
-throw new RuntimeException("Only one extra tag supported for the tag " 
+ globalTagName.get() + " in " + toJSONString(m));
+  if (m.size() > 3) {

Review comment:
   @gus-asf yes, that's the intent, though your pseudo-code is still 
incorrect - if there's no `strict` tag then `m.size() > 2` is already an error. 
In other words, for global tags we expect exactly 2 keys (tag and operand), 
with optional third key `strict`.
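
   Expressed as code, the rule just stated would be (a sketch, not the PR's 
exact fix; `globalTagName` and `toJSONString` as in the quoted hunk):
   ```java
   // Global tag maps must have exactly 2 keys (the tag and its operand);
   // a 3rd key is allowed only when it is "strict".
   boolean valid = m.size() == 2 || (m.size() == 3 && m.containsKey("strict"));
   if (!valid) {
     throw new RuntimeException("Only one extra tag (plus optional 'strict') supported for the tag "
         + globalTagName.get() + " in " + toJSONString(m));
   }
   ```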

##
File path: solr/solr-ref-guide/src/solr-upgrade-notes.adoc
##
@@ -72,7 +84,9 @@ More information about this new feature is available in the 
section <
Map<String, Object> newValues = new 
HashMap<>(scenario.cluster.getSimNodeStateProvider().simGetNodeValues(node));

Review comment:
   +1, this fixed a genuine bug in < 8.6.0

##
File path: 
solr/solrj/src/java/org/apache/solr/client/solrj/cloud/autoscaling/Clause.java
##
@@ -686,7 +686,7 @@ boolean isShardAbsent() {
   for (Row r : session.matrix) {
 computedValueEvaluator.node = r.node;
 SealedClause sealedClause = getSealedClause(computedValueEvaluator);
-if (!sealedClause.getGlobalTag().isPass(r)) {
+if (r.isLive() && !sealedClause.getGlobalTag().isPass(r)) {

Review comment:
   +1, this was a genuine bug in 8.5.

##
File path: solr/solr-ref-guide/src/solr-upgrade-notes.adoc
##
@@ -34,17 +34,29 @@ Detailed steps for upgrading a Solr cluster are in the 
section <> 
below.
 
-=== Solr 8.6.1
+=== Solr 8.6.1 (Upgrading from 8.6.0 only)
+
+See the https://cwiki.apache.org/confluence/display/SOLR/ReleaseNote861[8.6.1 
Release Notes^]
+for an overview of the fixes included in Solr 8.6.1.
+
+When upgrading to 8.6.1 users should be aware of the following major changes 
from 8.6.0.
 
 *Autoscaling*
 
 * As mentioned in the 8.6 upgrade notes, a default autoscaling policy was 
provided starting in 8.6.0.
 This default autoscaling policy resulted in increasingly slow collection 
creation calls in large clusters (50+ collections).
 +
 In 8.6.1 the default autoscaling policy has been removed, and clouds will not 
use autoscaling unless a policy has explicitly been created.
-In order to fix the performance degradations introduced in 8.6.0, merely 
upgrade to 8.6.1.
+If your cloud is running 8.6.0 and **not using an explicit autoscaling 
policy**, upgrade to 8.6.1 and remove the default cluster policy and 
preferences via the following command.
+Replace `localhost:8983` with your Solr endpoint.
++
+```
+curl -X POST -H 'Content-type:application/json'  -d '{set-cluster-policy : [], 
set-cluster-preferences : []}' http://localhost:8983/api/cluster/autoscaling

Review comment:
   Setting preferences to `[]` actually removes the default preferences 
that have always been implicitly present, so this changes the behavior as 
compared with 8.5 and earlier. We should only reset the cluster_policy to `[]`.
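
   Under that reading, the command in the docs would shrink to only resetting 
the policy (same endpoint placeholder as in the quoted docs):
   ```
   curl -X POST -H 'Content-type:application/json' -d '{set-cluster-policy : []}' http://localhost:8983/api/cluster/autoscaling
   ```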





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] murblanc commented on pull request #1684: SOLR-14613: strongly typed initial proposal for plugin interface

2020-08-06 Thread GitBox


murblanc commented on pull request #1684:
URL: https://github.com/apache/lucene-solr/pull/1684#issuecomment-670099928


   I have implemented the cluster state abstractions and added some (naive and 
temporary) wiring to select this assign strategy.
   A lot of parts are still missing, so this can't be merged at this stage. Work 
in progress.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] tflobbe commented on a change in pull request #1602: SOLR-14582: Expose IWC.setMaxCommitMergeWaitMillis in Solr's index config

2020-08-06 Thread GitBox


tflobbe commented on a change in pull request #1602:
URL: https://github.com/apache/lucene-solr/pull/1602#discussion_r466606166



##
File path: solr/core/src/java/org/apache/solr/update/SolrIndexConfig.java
##
@@ -68,6 +68,19 @@
 
   public final double ramBufferSizeMB;
   public final int ramPerThreadHardLimitMB;
+  /**

Review comment:
   I didn't want to drop the link/see by itself, so I added a single line 
introducing the functionality. After that I added specifics about Solr 
(solrconfig configuration and the warning setting). I honestly don't see the 
problem here.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] tflobbe commented on a change in pull request #1602: SOLR-14582: Expose IWC.setMaxCommitMergeWaitMillis in Solr's index config

2020-08-06 Thread GitBox


tflobbe commented on a change in pull request #1602:
URL: https://github.com/apache/lucene-solr/pull/1602#discussion_r466603545



##
File path: solr/core/src/test/org/apache/solr/update/SolrIndexConfigTest.java
##
@@ -208,4 +210,16 @@ public void testToMap() throws Exception {
 
 assertEquals(mSizeExpected, m.size());
   }
+  
+  public void testMaxCommitMergeWaitSeconds() throws Exception {
+SolrConfig sc = new SolrConfig(TEST_PATH().resolve("collection1"), 
"solrconfig-test-misc.xml");
+assertEquals(-1, sc.indexConfig.maxCommitMergeWaitMillis);
+assertEquals(IndexWriterConfig.DEFAULT_MAX_COMMIT_MERGE_WAIT_MILLIS, 
sc.indexConfig.toIndexWriterConfig(h.getCore()).getMaxCommitMergeWaitMillis());
+System.setProperty("solr.tests.maxCommitMergeWait", "10");
+sc = new SolrConfig(TEST_PATH().resolve("collection1"), 
"solrconfig-test-misc.xml");
+assertEquals(10, sc.indexConfig.maxCommitMergeWaitMillis);
+assertEquals(10, 
sc.indexConfig.toIndexWriterConfig(h.getCore()).getMaxCommitMergeWaitMillis());
+System.clearProperty("solr.tests.maxCommitMergeWait");

Review comment:
   Yes, I checked. I believe the cleanup happens after the class. I discussed 
with @madrob, and I'll move the cleanup to a `tearDown()` method.
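
   Roughly like this (a sketch, assuming the test inherits JUnit's 
`tearDown()` via SolrTestCaseJ4):
   ```java
   @Override
   public void tearDown() throws Exception {
     // ensure the property never leaks into other test methods
     System.clearProperty("solr.tests.maxCommitMergeWait");
     super.tearDown();
   }
   ```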





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] tflobbe commented on pull request #1719: SOLR-14702: Remove Master and Slave from Code Base and Docs (8.x)

2020-08-06 Thread GitBox


tflobbe commented on pull request #1719:
URL: https://github.com/apache/lucene-solr/pull/1719#issuecomment-670093537


   > I thought was changed in my PR...
   
   They were, my point is that I didn't re-set these to legacy terms (after I 
backported your commit) because in this particular case we don't need it.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] tflobbe commented on a change in pull request #1719: SOLR-14702: Remove Master and Slave from Code Base and Docs (8.x)

2020-08-06 Thread GitBox


tflobbe commented on a change in pull request #1719:
URL: https://github.com/apache/lucene-solr/pull/1719#discussion_r466597411



##
File path: solr/core/src/java/org/apache/solr/cloud/ReplicateFromLeader.java
##
@@ -76,12 +76,12 @@ public void startReplication(boolean switchTransactionLog) 
throws InterruptedExc
   }
   log.info("Will start replication from leader with poll interval: {}", 
pollIntervalStr );
 
-  NamedList<Object> slaveConfig = new NamedList<>();
-  slaveConfig.add("fetchFromLeader", Boolean.TRUE);
-  slaveConfig.add(ReplicationHandler.SKIP_COMMIT_ON_MASTER_VERSION_ZERO, 
switchTransactionLog);
-  slaveConfig.add("pollInterval", pollIntervalStr);
+  NamedList<Object> followerConfig = new NamedList<>();
+  followerConfig.add("fetchFromLeader", Boolean.TRUE);
+  
followerConfig.add(ReplicationHandler.SKIP_COMMIT_ON_LEADER_VERSION_ZERO, 
switchTransactionLog);
+  followerConfig.add("pollInterval", pollIntervalStr);
   NamedList<Object> replicationConfig = new NamedList<>();
-  replicationConfig.add("slave", slaveConfig);
+  replicationConfig.add("follower", followerConfig);

Review comment:
   For 8.x branches, every time we set a parameter or configuration (for 
requests, etc) we use the legacy names. In this particular case, I'm using the 
new parameters even for setting, and the reason is that this is a configuration 
that is set and read locally only. The TLOG/PULL replicas create this to configure 
their replication process.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14702) Remove Master and Slave from Code Base and Docs

2020-08-06 Thread Tomas Eduardo Fernandez Lobbe (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172583#comment-17172583
 ] 

Tomas Eduardo Fernandez Lobbe commented on SOLR-14702:
--

Thanks Cassandra. Then, the followup tasks are all in progress here:
* Update upgrade notes: https://github.com/apache/lucene-solr/pull/1718
* Make 8.x version of the PR: https://github.com/apache/lucene-solr/pull/1719. 
I'll wait to merge that one until we get some of the testing Marcus is working 
on: https://issues.apache.org/jira/browse/SOLR-14708
* Rename the "master/slave mode": 
https://issues.apache.org/jira/browse/SOLR-14716


> Remove Master and Slave from Code Base and Docs
> ---
>
> Key: SOLR-14702
> URL: https://issues.apache.org/jira/browse/SOLR-14702
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: master (9.0)
>Reporter: Marcus Eagan
>Priority: Critical
> Attachments: SOLR-14742-testfix.patch
>
>  Time Spent: 16h 10m
>  Remaining Estimate: 0h
>
> Every time I read _master_ and _slave_, I get pissed.
> I think about the last and only time I remember visiting my maternal great 
> grandpa in Alabama at four years old. He was a sharecropper before WWI, where 
> he lost his legs, and then he was back to being a sharecropper somehow after 
> the war. Crazy, I know. I don't know if the world still called his job 
> sharecropping in 1993, but he was basically a slave—in America. He lived in 
> the same shack that his father, and his grandfather (born a slave) lived in 
> down in Alabama. Believe it or not, my dad's (born in 1926) grandfather was 
> actually born a slave, freed shortly after birth by his owner father. I never 
> met him, though. He died in the 40s.
> Anyway, I cannot police all terms in the repo and do not wish to. This 
> master/slave shit is archaic and misleading on technical grounds. Thankfully, 
> there's only a handful of files in code and documentation that still talk 
> about masters and slaves. We should replace all of them.
> There are so many ways to reword it. In fact, unless anyone else objects or 
> wants to do the grunt work to help my stress levels, I will open the pull 
> request myself in an effort to make this project and community more inviting to 
> people of all backgrounds and histories. We can have leader/follower, or 
> primary/secondary, but none of this Master/Slave nonsense. I'm sick of the 
> garbage. 
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1602: SOLR-14582: Expose IWC.setMaxCommitMergeWaitMillis in Solr's index config

2020-08-06 Thread GitBox


dsmiley commented on a change in pull request #1602:
URL: https://github.com/apache/lucene-solr/pull/1602#discussion_r466589294



##
File path: solr/core/src/test/org/apache/solr/update/SolrIndexConfigTest.java
##
@@ -208,4 +210,16 @@ public void testToMap() throws Exception {
 
 assertEquals(mSizeExpected, m.size());
   }
+  
+  public void testMaxCommitMergeWaitSeconds() throws Exception {
+SolrConfig sc = new SolrConfig(TEST_PATH().resolve("collection1"), 
"solrconfig-test-misc.xml");
+assertEquals(-1, sc.indexConfig.maxCommitMergeWaitMillis);
+assertEquals(IndexWriterConfig.DEFAULT_MAX_COMMIT_MERGE_WAIT_MILLIS, 
sc.indexConfig.toIndexWriterConfig(h.getCore()).getMaxCommitMergeWaitMillis());
+System.setProperty("solr.tests.maxCommitMergeWait", "10");
+sc = new SolrConfig(TEST_PATH().resolve("collection1"), 
"solrconfig-test-misc.xml");
+assertEquals(10, sc.indexConfig.maxCommitMergeWaitMillis);
+assertEquals(10, 
sc.indexConfig.toIndexWriterConfig(h.getCore()).getMaxCommitMergeWaitMillis());
+System.clearProperty("solr.tests.maxCommitMergeWait");

Review comment:
   Please verify if this is so before removing.  Perhaps there may be a 
difference in behavior between gradle & IntelliJ; so try both.  I thought 
clearing wasn't necessary as well but I recall hearing from someone (@dweiss ?) 
that the auto-clearing wasn't working.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1602: SOLR-14582: Expose IWC.setMaxCommitMergeWaitMillis in Solr's index config

2020-08-06 Thread GitBox


dsmiley commented on a change in pull request #1602:
URL: https://github.com/apache/lucene-solr/pull/1602#discussion_r466587705



##
File path: solr/core/src/java/org/apache/solr/update/SolrIndexConfig.java
##
@@ -129,8 +143,9 @@ public SolrIndexConfig(SolrConfig solrConfig, String 
prefix, SolrIndexConfig def
 true);
 
 useCompoundFile = solrConfig.getBool(prefix+"/useCompoundFile", 
def.useCompoundFile);
-
maxBufferedDocs=solrConfig.getInt(prefix+"/maxBufferedDocs",def.maxBufferedDocs);
+maxBufferedDocs = solrConfig.getInt(prefix+"/maxBufferedDocs", 
def.maxBufferedDocs);
 ramBufferSizeMB = solrConfig.getDouble(prefix+"/ramBufferSizeMB", 
def.ramBufferSizeMB);
+maxCommitMergeWaitMillis = solrConfig.getInt(prefix+"/maxCommitMergeWait", 
def.maxCommitMergeWaitMillis);

Review comment:
   The "Wait" part is important too too.  "maxCommitMergeWaitTime" is my 
preference





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] tflobbe commented on pull request #1718: SOLR-14702: Add Upgrade Notes and CHANGES entry

2020-08-06 Thread GitBox


tflobbe commented on pull request #1718:
URL: https://github.com/apache/lucene-solr/pull/1718#issuecomment-670083339


   Ah, good point, I seemed to remember some change with the release notes 
recently but I couldn't remember what it was



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] tflobbe commented on a change in pull request #1602: SOLR-14582: Expose IWC.setMaxCommitMergeWaitMillis in Solr's index config

2020-08-06 Thread GitBox


tflobbe commented on a change in pull request #1602:
URL: https://github.com/apache/lucene-solr/pull/1602#discussion_r466565653



##
File path: solr/core/src/java/org/apache/solr/update/SolrIndexConfig.java
##
@@ -129,8 +143,9 @@ public SolrIndexConfig(SolrConfig solrConfig, String 
prefix, SolrIndexConfig def
 true);
 
 useCompoundFile = solrConfig.getBool(prefix+"/useCompoundFile", 
def.useCompoundFile);
-
maxBufferedDocs=solrConfig.getInt(prefix+"/maxBufferedDocs",def.maxBufferedDocs);
+maxBufferedDocs = solrConfig.getInt(prefix+"/maxBufferedDocs", 
def.maxBufferedDocs);
 ramBufferSizeMB = solrConfig.getDouble(prefix+"/ramBufferSizeMB", 
def.ramBufferSizeMB);
+maxCommitMergeWaitMillis = solrConfig.getInt(prefix+"/maxCommitMergeWait", 
def.maxCommitMergeWaitMillis);

Review comment:
   I had it with `Millis` and I removed that because I didn't see anything 
else in the solrconfig that included the unit for time (it's always 
milliseconds). Are you suggesting `maxCommitMergeWaitMillis` or 
`maxCommitMergeTime`?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] tflobbe commented on a change in pull request #1602: SOLR-14582: Expose IWC.setMaxCommitMergeWaitMillis in Solr's index config

2020-08-06 Thread GitBox


tflobbe commented on a change in pull request #1602:
URL: https://github.com/apache/lucene-solr/pull/1602#discussion_r466561784



##
File path: solr/core/src/java/org/apache/solr/update/SolrIndexConfig.java
##
@@ -87,6 +100,7 @@ private SolrIndexConfig(SolrConfig solrConfig) {
 maxBufferedDocs = -1;
 ramBufferSizeMB = 100;
 ramPerThreadHardLimitMB = -1;
+maxCommitMergeWaitMillis = -1;

Review comment:
   I don't think that's a good idea. "-1" from the Solr perspective means 
"don't set the value".  If someone has a custom merge policy where they have a 
different value set by code, we would be changing it to 
`org.apache.lucene.index.IndexWriterConfig#DEFAULT_MAX_COMMIT_MERGE_WAIT_MILLIS`
 without them having set anything on `solrconfig.xml`. 
   > In the future, this constant might change, and that's good.
   
   That's fine, since we aren't calling `iwc.setMaxCommitMergeWaitMillis(...)` 
in the default case, we'll be taking the new default for all cores where no 
alternative configuration has been provided.
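
   In other words the wiring stays conditional, roughly like this (a sketch; 
the setter name is taken from this PR's title, the field name from the hunk above):
   ```java
   // Only override Lucene's built-in default when solrconfig.xml provided a
   // value; -1 (the Solr-side default) leaves the IndexWriterConfig untouched.
   if (maxCommitMergeWaitMillis > 0) {
     iwc.setMaxCommitMergeWaitMillis(maxCommitMergeWaitMillis);
   }
   ```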





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8626) standardise test class naming

2020-08-06 Thread Erick Erickson (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172545#comment-17172545
 ] 

Erick Erickson commented on LUCENE-8626:


Contrariwise, I prefer suffixing on the theory that when looking at a 
directory, 
TestFoo
TestBar
TestBlivet

requires me to read past the "Test" before being able to see the class name, 
whereas FooTest, BarTest, BlivetTest are easier on the eyes. 

That said, I'll abide by whatever the person leading the charge decides; 
Michael Sokolov's comment about valuing consistency is germane.



> standardise test class naming
> -
>
> Key: LUCENE-8626
> URL: https://issues.apache.org/jira/browse/LUCENE-8626
> Project: Lucene - Core
>  Issue Type: Test
>Reporter: Christine Poerschke
>Priority: Major
> Attachments: SOLR-12939.01.patch, SOLR-12939.02.patch, 
> SOLR-12939.03.patch, SOLR-12939_hoss_validation_groovy_experiment.patch
>
>
> This was mentioned and proposed on the dev mailing list. Starting this ticket 
> here to start to make it happen?
> History: This ticket was created as 
> https://issues.apache.org/jira/browse/SOLR-12939 ticket and then got 
> JIRA-moved to become https://issues.apache.org/jira/browse/LUCENE-8626 ticket.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-2458) queryparser makes all CJK queries phrase queries regardless of analyzer

2020-08-06 Thread Mr. Aleem (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172540#comment-17172540
 ] 

Mr. Aleem commented on LUCENE-2458:
---

As Koji noticed, it looks like the commit accidentally changed the parser's 
default behavior (i.e., not the last attached patch, but the commit).
A query for pdp-11 now yields text:pdp OR text:11 instead of text:"pdp 11".

Maybe we should change SolrQueryParser to use the == LUCENE_24 version (or 
LUCENE_29 would also work). [Like given 
this|https://piratesfile.com/hitfilm-pro-crack]

> queryparser makes all CJK queries phrase queries regardless of analyzer
> ---
>
> Key: LUCENE-2458
> URL: https://issues.apache.org/jira/browse/LUCENE-2458
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/queryparser
>Reporter: Robert Muir
>Assignee: Robert Muir
>Priority: Blocker
> Fix For: 3.1, 4.0-ALPHA
>
> Attachments: LUCENE-2458.patch, LUCENE-2458.patch, LUCENE-2458.patch, 
> LUCENE-2458.patch
>
>
> The queryparser automatically makes *ALL* CJK, Thai, Lao, Myanmar, Tibetan, 
> ... queries into phrase queries, even though you didn't ask for one, and 
> there isn't a way to turn this off.
> This completely breaks lucene for these languages, as it treats all queries 
> like 'grep'.
> Example: if you query for f:abcd with StandardAnalyzer, where a,b,c,d are 
> Chinese characters, you get a phrase query of "a b c d". If you use the CJK 
> analyzer, it's no better: you get a phrase query of "ab bc cd", and if you 
> use the SmartChinese analyzer, you get a phrase query like "ab cd". But the 
> user didn't ask for one, and they cannot turn it off.
> The reason is that the code to form phrase queries is not internationally 
> appropriate and assumes whitespace tokenization. If more than one token comes 
> out of whitespace delimited text, its automatically a phrase query no matter 
> what.
> The proposed patch fixes the core queryparser (with all backwards compat 
> kept) to only form phrase queries when the double quote operator is used. 
> Implementing subclasses can always extend the QP and auto-generate whatever 
> kind of queries they want that might completely break search for languages 
> they don't care about, but core general-purpose QPs should be language 
> independent.
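
For context, a minimal sketch of the post-fix behavior using the option the 
patch introduced (modern QueryParser API; the example text and class name are 
illustrative, not from the issue):

{code}
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.Query;

public class NoAutoPhraseSketch {
  public static void main(String[] args) throws Exception {
    QueryParser qp = new QueryParser("f", new StandardAnalyzer());
    // After LUCENE-2458, phrase queries are only formed for explicit double
    // quotes unless this option is switched back on.
    qp.setAutoGeneratePhraseQueries(false);
    Query q = qp.parse("abcd"); // multi-token text now parses as a boolean query
    System.out.println(q);
  }
}
{code}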



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Comment Edited] (LUCENE-8626) standardise test class naming

2020-08-06 Thread Michael Sokolov (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172538#comment-17172538
 ] 

Michael Sokolov edited comment on LUCENE-8626 at 8/6/20, 4:52 PM:
--

Personally, I prefer prefixing, but more than that, I'd value consistency. 
Still, without automated enforcement, we won't get that either. So, I'd be -0 
to this change unless it comes along with enforcement (banning files with the 
nonstandard naming scheme). Otherwise we'll just be back here again in a 
year...  uh, actually reading the thread now I see we do have an enforcement 
mechanism, OK that's great. If we can come to some consensus here, then rename 
away!


was (Author: sokolov):
Personally, I prefer prefixing, but more than that, I'd value consistency. 
Still, without automated enforcement, we won't get that either. So, I'd be -0 
to this change unless it comes along with enforcement (banning files with the 
nonstandard naming scheme). Otherwise we'll just be back here again in a year

> standardise test class naming
> -
>
> Key: LUCENE-8626
> URL: https://issues.apache.org/jira/browse/LUCENE-8626
> Project: Lucene - Core
>  Issue Type: Test
>Reporter: Christine Poerschke
>Priority: Major
> Attachments: SOLR-12939.01.patch, SOLR-12939.02.patch, 
> SOLR-12939.03.patch, SOLR-12939_hoss_validation_groovy_experiment.patch
>
>
> This was mentioned and proposed on the dev mailing list. Starting this ticket 
> here to start to make it happen?
> History: This ticket was created as 
> https://issues.apache.org/jira/browse/SOLR-12939 ticket and then got 
> JIRA-moved to become https://issues.apache.org/jira/browse/LUCENE-8626 ticket.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8626) standardise test class naming

2020-08-06 Thread Michael Sokolov (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172538#comment-17172538
 ] 

Michael Sokolov commented on LUCENE-8626:
-

Personally, I prefer prefixing, but more than that, I'd value consistency. 
Still, without automated enforcement, we won't get that either. So, I'd be -0 
to this change unless it comes along with enforcement (banning files with the 
nonstandard naming scheme). Otherwise we'll just be back here again in a year

> standardise test class naming
> -
>
> Key: LUCENE-8626
> URL: https://issues.apache.org/jira/browse/LUCENE-8626
> Project: Lucene - Core
>  Issue Type: Test
>Reporter: Christine Poerschke
>Priority: Major
> Attachments: SOLR-12939.01.patch, SOLR-12939.02.patch, 
> SOLR-12939.03.patch, SOLR-12939_hoss_validation_groovy_experiment.patch
>
>
> This was mentioned and proposed on the dev mailing list. Starting this ticket 
> here to start to make it happen?
> History: This ticket was created as 
> https://issues.apache.org/jira/browse/SOLR-12939 ticket and then got 
> JIRA-moved to become https://issues.apache.org/jira/browse/LUCENE-8626 ticket.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9447) Make BEST_COMPRESSION compress more aggressively?

2020-08-06 Thread Robert Muir (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172510#comment-17172510
 ] 

Robert Muir commented on LUCENE-9447:
-

From my experiments on LUCENE-6100, increasing the block size is for the most 
part a workaround: it only helps hide the waste of rebooting the DEFLATE 
dictionary from *scratch* for every block. That is what the "crappy preset" 
experiment tried to show there.

Sadly I couldn't find a decent/simple way to use a preset dictionary that made 
me happy with what is available in the JDK. It doesn't expose some zlib 
methods that you would need (e.g. retrieving the current dictionary) and, as I 
mentioned, there were some inefficiencies, at least with how we had 
compression hooked in at the time.
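
To make the preset-dictionary idea concrete, here is a rough sketch built only 
on what the JDK does expose ({{Deflater.setDictionary}}). It illustrates the 
idea, not LUCENE-6100 code, and the class/method names are made up; the read 
side must be able to re-derive the same dictionary, since the JDK gives no way 
to pull the current dictionary back out of zlib.

{code}
import java.io.ByteArrayOutputStream;
import java.util.zip.Deflater;

public class PresetDictSketch {
  // Compress one block, seeding DEFLATE with bytes from the previous block so
  // the dictionary is not rebuilt from scratch for every block.
  static byte[] compressBlock(byte[] block, byte[] presetDict) {
    Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
    if (presetDict != null) {
      deflater.setDictionary(presetDict); // must be called before input is consumed
    }
    deflater.setInput(block);
    deflater.finish();
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buf = new byte[4096];
    while (!deflater.finished()) {
      out.write(buf, 0, deflater.deflate(buf));
    }
    deflater.end();
    return out.toByteArray();
  }
}
{code}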




> Make BEST_COMPRESSION compress more aggressively?
> -
>
> Key: LUCENE-9447
> URL: https://issues.apache.org/jira/browse/LUCENE-9447
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Priority: Minor
>
> The Lucene86 codec supports setting a "Mode" for stored fields compression, 
> that is either "BEST_SPEED", which translates to blocks of 16kB or 128 
> documents (whichever is hit first) compressed with LZ4, or 
> "BEST_COMPRESSION", which translates to blocks of 60kB or 512 documents 
> compressed with DEFLATE with default compression level (6).
> After looking at indices that spent most disk space on stored fields 
> recently, I noticed that there was quite some room for improvement by 
> increasing the block size even further:
> ||Block size||Stored fields size||
> |60kB|168412338|
> |128kB|130813639|
> |256kB|113587009|
> |512kB|104776378|
> |1MB|100367095|
> |2MB|98152464|
> |4MB|97034425|
> |8MB|96478746|
> For this specific dataset, I had 1M documents that each had about 2kB of 
> stored fields each and quite some redundancy.
> This makes me want to look into bumping this block size to maybe 256kB. It 
> would be interesting to re-do the experiments we did on LUCENE-6100 to see 
> how this affects the merging speed. That said I don't think it would be 
> terrible if the merging time increased a bit given that we already offer the 
> BEST_SPEED option for CPU-savvy users.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Assigned] (SOLR-14582) Expose IWC.setMaxCommitMergeWaitMillis as an expert feature in Solr's index config

2020-08-06 Thread David Smiley (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley reassigned SOLR-14582:
---

Assignee: Tomas Eduardo Fernandez Lobbe  (was: David Smiley)

> Expose IWC.setMaxCommitMergeWaitMillis as an expert feature in Solr's index 
> config
> --
>
> Key: SOLR-14582
> URL: https://issues.apache.org/jira/browse/SOLR-14582
> Project: Solr
>  Issue Type: Improvement
>Reporter: Tomas Eduardo Fernandez Lobbe
>Assignee: Tomas Eduardo Fernandez Lobbe
>Priority: Trivial
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> LUCENE-8962 added the ability to merge segments synchronously on commit. This 
> isn't done by default and the default {{MergePolicy}} won't do it, but custom 
> merge policies can take advantage of this. Solr allows plugging in custom 
> merge policies, so if someone wants to make use of this feature they could, 
> however, they need to set {{IndexWriterConfig.maxCommitMergeWaitSeconds}} to 
> something greater than 0.
> Since this is an expert feature, I plan to document it only in javadoc and 
> not the ref guide.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1602: SOLR-14582: Expose IWC.setMaxCommitMergeWaitMillis in Solr's index config

2020-08-06 Thread GitBox


dsmiley commented on a change in pull request #1602:
URL: https://github.com/apache/lucene-solr/pull/1602#discussion_r466530152



##
File path: solr/core/src/java/org/apache/solr/update/SolrIndexConfig.java
##
@@ -129,8 +143,9 @@ public SolrIndexConfig(SolrConfig solrConfig, String prefix, SolrIndexConfig def
         true);
 
     useCompoundFile = solrConfig.getBool(prefix+"/useCompoundFile", def.useCompoundFile);
-    maxBufferedDocs=solrConfig.getInt(prefix+"/maxBufferedDocs",def.maxBufferedDocs);
+    maxBufferedDocs = solrConfig.getInt(prefix+"/maxBufferedDocs", def.maxBufferedDocs);
     ramBufferSizeMB = solrConfig.getDouble(prefix+"/ramBufferSizeMB", def.ramBufferSizeMB);
+    maxCommitMergeWaitMillis = solrConfig.getInt(prefix+"/maxCommitMergeWait", def.maxCommitMergeWaitMillis);

Review comment:
   I noticed you omitted a "Millis" setting.  I think either you should add 
this for clarity, or for consistency with some other times I see in solrconfig 
(e.g. maxTime), use suffix "Time".





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1602: SOLR-14582: Expose IWC.setMaxCommitMergeWaitMillis in Solr's index config

2020-08-06 Thread GitBox


dsmiley commented on a change in pull request #1602:
URL: https://github.com/apache/lucene-solr/pull/1602#discussion_r466526966



##
File path: solr/core/src/java/org/apache/solr/update/SolrIndexConfig.java
##
@@ -87,6 +100,7 @@ private SolrIndexConfig(SolrConfig solrConfig) {
     maxBufferedDocs = -1;
     ramBufferSizeMB = 100;
     ramPerThreadHardLimitMB = -1;
+    maxCommitMergeWaitMillis = -1;

Review comment:
   Let's use 
`org.apache.lucene.index.IndexWriterConfig#DEFAULT_MAX_COMMIT_MERGE_WAIT_MILLIS`. 
In the future, this constant might change, and that's good.

##
File path: solr/core/src/java/org/apache/solr/update/SolrIndexConfig.java
##
@@ -68,6 +68,19 @@
 
   public final double ramBufferSizeMB;
   public final int ramPerThreadHardLimitMB;
+  /**

Review comment:
   I appreciate documentation, but _redundant_ documentation -- not so much. 
Can't you do a @see or @link to one place -- 
`org.apache.lucene.index.IndexWriterConfig#setMaxCommitMergeWaitMillis`?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14582) Expose IWC.setMaxCommitMergeWaitMillis as an expert feature in Solr's index config

2020-08-06 Thread Tomas Eduardo Fernandez Lobbe (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172497#comment-17172497
 ] 

Tomas Eduardo Fernandez Lobbe commented on SOLR-14582:
--

David, there is a PR for this that I forgot to merge (don't know why it's not 
linked here): https://github.com/apache/lucene-solr/pull/1602

> Expose IWC.setMaxCommitMergeWaitMillis as an expert feature in Solr's index 
> config
> --
>
> Key: SOLR-14582
> URL: https://issues.apache.org/jira/browse/SOLR-14582
> Project: Solr
>  Issue Type: Improvement
>Reporter: Tomas Eduardo Fernandez Lobbe
>Assignee: David Smiley
>Priority: Trivial
>
> LUCENE-8962 added the ability to merge segments synchronously on commit. This 
> isn't done by default and the default {{MergePolicy}} won't do it, but custom 
> merge policies can take advantage of this. Solr allows plugging in custom 
> merge policies, so if someone wants to make use of this feature they could, 
> however, they need to set {{IndexWriterConfig.maxCommitMergeWaitSeconds}} to 
> something greater than 0.
> Since this is an expert feature, I plan to document it only in javadoc and 
> not the ref guide.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Assigned] (SOLR-14582) Expose IWC.setMaxCommitMergeWaitMillis as an expert feature in Solr's index config

2020-08-06 Thread David Smiley (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley reassigned SOLR-14582:
---

Assignee: David Smiley

> Expose IWC.setMaxCommitMergeWaitMillis as an expert feature in Solr's index 
> config
> --
>
> Key: SOLR-14582
> URL: https://issues.apache.org/jira/browse/SOLR-14582
> Project: Solr
>  Issue Type: Improvement
>Reporter: Tomas Eduardo Fernandez Lobbe
>Assignee: David Smiley
>Priority: Trivial
>
> LUCENE-8962 added the ability to merge segments synchronously on commit. This 
> isn't done by default and the default {{MergePolicy}} won't do it, but custom 
> merge policies can take advantage of this. Solr allows plugging in custom 
> merge policies, so if someone wants to make use of this feature they could, 
> however, they need to set {{IndexWriterConfig.maxCommitMergeWaitSeconds}} to 
> something greater than 0.
> Since this is an expert feature, I plan to document it only in javadoc and 
> not the ref guide.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dsmiley opened a new pull request #1723: SOLR prometheus: simplify concurrent collection

2020-08-06 Thread GitBox


dsmiley opened a new pull request #1723:
URL: https://github.com/apache/lucene-solr/pull/1723


   The intent of this is to simplify some concurrent code in the Prometheus 
exporter that I think is too confusing / contorted -- particularly Async.java. 
Git blame points at @shalinmangar, so I would love a review to see what you 
think. I played with a few different approaches, and ultimately realized that 
we're working around using an Executor instead of an ExecutorService in order 
to benefit from invokeAll. I wish Java didn't have a distinction between 
Executor & ExecutorService, but there is one, and IMO we should all just 
ignore plain Executor.
   
   I haven't run this in the field, but I could do so locally.
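
   For reference, a minimal sketch (toy tasks, not the exporter's actual code) 
of what `invokeAll` buys over hand-rolled async plumbing on a plain `Executor`:

   ```java
   import java.util.List;
   import java.util.concurrent.Callable;
   import java.util.concurrent.ExecutorService;
   import java.util.concurrent.Executors;
   import java.util.concurrent.Future;

   public class InvokeAllSketch {
     public static void main(String[] args) throws Exception {
       ExecutorService pool = Executors.newFixedThreadPool(4);
       // Stand-ins for per-node metric fetches.
       List<Callable<String>> tasks = List.of(
           () -> "metrics from node1",
           () -> "metrics from node2");
       // invokeAll blocks until every task completes and returns Futures in
       // submission order; plain Executor has no equivalent method.
       List<Future<String>> results = pool.invokeAll(tasks);
       for (Future<String> f : results) {
         System.out.println(f.get()); // a failed task throws here rather than vanishing
       }
       pool.shutdown();
     }
   }
   ```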



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14654) Remove plugin loading from .system collection (for 9.0)

2020-08-06 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172459#comment-17172459
 ] 

ASF subversion and git services commented on SOLR-14654:


Commit 35bf1785ec2f4131694cf7f23a139dbb7291cc7c in lucene-solr's branch 
refs/heads/master from Cassandra Targett
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=35bf178 ]

SOLR-14654: actually fix the Ref Guide build failure


> Remove plugin loading from .system collection (for 9.0)
> ---
>
> Key: SOLR-14654
> URL: https://issues.apache.org/jira/browse/SOLR-14654
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Priority: Major
> Fix For: master (9.0)
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> This code must go from master.
> All places where "runtimeLib" can be used will be removed in 9.0. With the 
> new package system in place we don't need this anymore.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] HoustonPutman commented on pull request #1718: SOLR-14702: Add Upgrade Notes and CHANGES entry

2020-08-06 Thread GitBox


HoustonPutman commented on pull request #1718:
URL: https://github.com/apache/lucene-solr/pull/1718#issuecomment-669971947


   I think adding something in the ref guide upgrade notes too would be 
worthwhile.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14716) Ref Guide: update leader/follower terminology

2020-08-06 Thread Cassandra Targett (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cassandra Targett updated SOLR-14716:
-
Description: 
The effort to remove oppressive terminology in SOLR-14702 led to somewhat 
awkward phrasing on how to refer to non-SolrCloud configurations, specifically 
"leader/follower mode", which is potentially very confusing since SolrCloud 
also has leaders and one could consider replicas to be followers.

I propose that we standardize what we call these two modes as "coordinated 
mode" (SolrCloud) and "uncoordinated mode" (or "non-coordinated" if people 
prefer). I chose this because, in thinking about it, what really 
differentiates the two approaches is the ZooKeeper coordination for requests, 
configs, etc. There are other differences too, of course, but that's the 
biggest one that stuck out to me as a key differentiator, and it is applicable 
in the naming.

There are also places in the Ref Guide where we refer to "standalone mode", 
which in many cases means "any cluster not running SolrCloud". This has always 
been problematic, because the word "standalone" implies a single node, but it's 
of course pretty much always been possible to have a cluster of multiple nodes 
that don't run SolrCloud/ZK. This issue would address those examples also.

Note that I'm not proposing replacing the word "SolrCloud" throughout the 
documentation. Instead I'll augment the use of the word "SolrCloud" with 
clarification that this term means "coordinated mode". Later if we ever replace 
SolrCloud references in code and fully remove that name, the conceptual 
groundwork will have already been laid for users.

  was:
The effort to remove oppressive terminology in SOLR-14702 led to somewhat 
awkward phrasing on how to refer to non-SolrCloud configurations, specifically 
"leader/follower mode", which is potentially very confusing since SolrCloud 
also has leaders and one could consider replicas to be followers.

I propose that we standardize what we call these two modes as "coordinated 
mode" (SolrCloud) and "uncoordinated mode" (or "non-coordinated" if people 
prefer). I chose this because, in thinking about it, what really 
differentiates the two approaches is the ZooKeeper coordination for requests. 
There are other differences too, of course, but that's the biggest one that 
stuck out to me as a key differentiator, and it is applicable in the naming.

There are also places in the Ref Guide where we refer to "standalone mode", 
which in many cases means "any cluster not running SolrCloud". This has always 
been problematic, because the word "standalone" implies a single node, but it's 
of course pretty much always been possible to have a cluster of multiple nodes 
that don't run SolrCloud/ZK. This issue would address those examples also.

Note that I'm not proposing replacing the word "SolrCloud" throughout the 
documentation. Instead I'll augment the use of the word "SolrCloud" with 
clarification that this term means "coordinated mode". Later if we ever replace 
SolrCloud references in code and fully remove that name, the conceptual 
groundwork will have already been laid for users.


> Ref Guide: update leader/follower terminology
> -
>
> Key: SOLR-14716
> URL: https://issues.apache.org/jira/browse/SOLR-14716
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Reporter: Cassandra Targett
>Priority: Major
>
> The effort to remove oppressive terminology in SOLR-14702 led to somewhat 
> awkward phrasing on how to refer to non-SolrCloud configurations, 
> specifically "leader/follower mode", which is potentially very confusing 
> since SolrCloud also has leaders and one could consider replicas to be 
> followers.
> I propose that we standardize what we call these two modes as "coordinated 
> mode" (SolrCloud) and "uncoordinated mode" (or "non-coordinated" if people 
> prefer). I chose this because, in thinking about it, what really 
> differentiates the two approaches is the ZooKeeper coordination for 
> requests, configs, etc. There are other differences too, of course, but 
> that's the biggest one that stuck out to me as a key differentiator, and it 
> is applicable in the naming.
> There are also places in the Ref Guide where we refer to "standalone mode", 
> which in many cases means "any cluster not running SolrCloud". This has 
> always been problematic, because the word "standalone" implies a single node, 
> but it's of course pretty much always been possible to have a cluster of 
> multiple nodes that don't run SolrCloud/ZK. This issue would address those 
> examples also.
> Note that I'm not proposing replacing the word "SolrCloud" throughout the 
> documentation. Instead I'll augment the use of the word "SolrCloud" with 
> clarification that this term means "coordinated mode". Later if we ever 
> replace SolrCloud references in code and fully remove that name, the 
> conceptual groundwork will have already been laid for users.

[jira] [Created] (SOLR-14716) Ref Guide: update leader/follower terminology

2020-08-06 Thread Cassandra Targett (Jira)
Cassandra Targett created SOLR-14716:


 Summary: Ref Guide: update leader/follower terminology
 Key: SOLR-14716
 URL: https://issues.apache.org/jira/browse/SOLR-14716
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
  Components: documentation
Reporter: Cassandra Targett


The effort to remove oppressive terminology in SOLR-14702 led to somewhat 
awkward phrasing on how to refer to non-SolrCloud configurations, specifically 
"leader/follower mode", which is potentially very confusing since SolrCloud 
also has leaders and one could consider replicas to be followers.

I propose that we standardize what we call these two modes as "coordinated 
mode" (SolrCloud) and "uncoordinated mode" (or "non-coordinated" if people 
prefer). I chose this because, in thinking about it, what really 
differentiates the two approaches is the ZooKeeper coordination for requests. 
There are other differences too, of course, but that's the biggest one that 
stuck out to me as a key differentiator, and it is applicable in the naming.

There are also places in the Ref Guide where we refer to "standalone mode", 
which in many cases means "any cluster not running SolrCloud". This has always 
been problematic, because the word "standalone" implies a single node, but it's 
of course pretty much always been possible to have a cluster of multiple nodes 
that don't run SolrCloud/ZK. This issue would address those examples also.

Note that I'm not proposing replacing the word "SolrCloud" throughout the 
documentation. Instead I'll augment the use of the word "SolrCloud" with 
clarification that this term means "coordinated mode". Later if we ever replace 
SolrCloud references in code and fully remove that name, the conceptual 
groundwork will have already been laid for users.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14654) Remove plugin loading from .system collection (for 9.0)

2020-08-06 Thread Cassandra Targett (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172368#comment-17172368
 ] 

Cassandra Targett commented on SOLR-14654:
--

That didn't fix it, [~noble.paul], it's the reference in the {{page-children}} 
section at the top of the file that creates the page hierarchy. That's where it 
needs to be removed.

> Remove plugin loading from .system collection (for 9.0)
> ---
>
> Key: SOLR-14654
> URL: https://issues.apache.org/jira/browse/SOLR-14654
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Priority: Major
> Fix For: master (9.0)
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> This code must go from master.
> All places where "runtimeLib" can be used will be removed in 9.0. With the 
> new package system in place we don't need this anymore.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14654) Remove plugin loading from .system collection (for 9.0)

2020-08-06 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172360#comment-17172360
 ] 

ASF subversion and git services commented on SOLR-14654:


Commit ddbe9495fc4e3348dd4db653eb01c3f62c1e1a10 in lucene-solr's branch 
refs/heads/master from noblepaul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=ddbe949 ]

SOLR-14654: ref-guide build failure


> Remove plugin loading from .system collection (for 9.0)
> ---
>
> Key: SOLR-14654
> URL: https://issues.apache.org/jira/browse/SOLR-14654
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Priority: Major
> Fix For: master (9.0)
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> This code must go from master.
> All places where "runtimeLib" can be used will be removed in 9.0. With the 
> new package system in place we don't need this anymore.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] jpountz edited a comment on pull request #1543: LUCENE-9378: Disable compression on binary values whose length is less than 32.

2020-08-06 Thread GitBox


jpountz edited a comment on pull request #1543:
URL: https://github.com/apache/lucene-solr/pull/1543#issuecomment-669927391


   > But, couldn't we instead just subclass Lucene's default codec, override 
{{getDocValuesFormatPerField}} to subclass {{Lucene80DocValuesFormat}} (oh, I 
see, yeah we cannot do that -- this class is final, which makes sense). I was 
thinking since this (whether to compress each block) is purely a write time 
decision, it could still be done as Lucene80 doc values format SPI.
   
   To me we only guarantee backward compatibility for users of the default 
codec. With the approach you mentioned, indices would be backward compatible, 
but I'm seeing this as accidental rather than something we guarantee.
   
   > But then I wonder why not just add a boolean compress option to 
Lucene80DocValuesFormat? This is similar to the compression Mode we pass to 
stored fields and term vectors format at write time, and it'd allow users who 
would like to disable BINARY doc values compression to keep backwards 
compatibility.
   
   I wanted to look into whether we could avoid this as it would boil down to 
maintaining two doc-value formats, but this might be the best way forward as it 
looks like the heuristics we tried out above don't work well to disable 
compression for use-cases when it hurts more than it helps.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] jpountz commented on pull request #1543: LUCENE-9378: Disable compression on binary values whose length is less than 32.

2020-08-06 Thread GitBox


jpountz commented on pull request #1543:
URL: https://github.com/apache/lucene-solr/pull/1543#issuecomment-669927391


   > But, couldn't we instead just subclass Lucene's default codec, override 
{{getDocValuesFormatPerField}} to subclass {{Lucene80DocValuesFormat}} (oh, I 
see, yeah we cannot do that -- this class is final, which makes sense). I was 
thinking since this (whether to compress each block) is purely a write time 
decision, it could still be done as Lucene80 doc values format SPI.
   
   The codec is final, but you can still do the same thing with FilterCodec.
   
   To me we only guarantee backward compatibility for users of the default 
codec. With the approach you mentioned, indices would be backward compatible, 
but I'm seeing this as accidental rather than something we guarantee.
   
   > But then I wonder why not just add a boolean compress option to 
Lucene80DocValuesFormat? This is similar to the compression Mode we pass to 
stored fields and term vectors format at write time, and it'd allow users who 
would like to disable BINARY doc values compression to keep backwards 
compatibility.
   
   I wanted to look into whether we could avoid this as it would boil down to 
maintaining two doc-value formats, but this might be the best way forward as it 
looks like the heuristics we tried out above don't work well to disable 
compression for use-cases when it hurts more than it helps.
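
   For completeness, a sketch of the `FilterCodec` route mentioned above; 
`UncompressedBinaryDocValuesFormat` is a hypothetical user-supplied format, 
and backward compatibility of such an index is then the user's concern rather 
than the default codec's:

   ```java
   import org.apache.lucene.codecs.DocValuesFormat;
   import org.apache.lucene.codecs.FilterCodec;
   import org.apache.lucene.codecs.lucene86.Lucene86Codec;

   public final class MyCodec extends FilterCodec {
     // Hypothetical user-written format that skips BINARY compression.
     private final DocValuesFormat dvFormat = new UncompressedBinaryDocValuesFormat();

     public MyCodec() {
       super("MyCodec", new Lucene86Codec()); // delegate everything else to the default codec
     }

     @Override
     public DocValuesFormat docValuesFormat() {
       // Replaces the whole per-field format; wrap in a PerFieldDocValuesFormat
       // instead if per-field dispatch should be preserved.
       return dvFormat;
     }
   }
   ```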



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14702) Remove Master and Slave from Code Base and Docs

2020-08-06 Thread Cassandra Targett (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172355#comment-17172355
 ] 

Cassandra Targett commented on SOLR-14702:
--

bq. I don't know how others will feel about dropping the "SolrCloud" term, 
someone suggested in the past I think? 

I didn't plan on wholesale replacing "SolrCloud" as a term, but this could be 
a first step in doing that. My thought was that I would try to make it clear that 
"SolrCloud" = "coordinated mode" and what we call SolrCloud really means that. 
A conceptual shift, as it were. Later when/if we get around to replacing the 
SolrCloud terminology in the code, we can eradicate the term from the docs.

Since I'll be doing this as a separate thing from this issue, I'll file a new 
Jira and we can discuss what I'm thinking in more detail there to agree on the 
terminology.

> Remove Master and Slave from Code Base and Docs
> ---
>
> Key: SOLR-14702
> URL: https://issues.apache.org/jira/browse/SOLR-14702
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: master (9.0)
>Reporter: Marcus Eagan
>Priority: Critical
> Attachments: SOLR-14742-testfix.patch
>
>  Time Spent: 15h 50m
>  Remaining Estimate: 0h
>
> Every time I read _master_ and _slave_, I get pissed.
> I think about the last and only time I remember visiting my maternal great 
> grandpa in Alabama at four years old. He was a sharecropper before WWI, where 
> he lost his legs, and then he was back to being a sharecropper somehow after 
> the war. Crazy, I know. I don't know if the world still called his job 
> sharecropping in 1993, but he was basically a slave—in America. He lived in 
> the same shack that his father, and his grandfather (born a slave) lived in 
> down in Alabama. Believe it or not, my dad's (born in 1926) grandfather was 
> actually born a slave, freed shortly after birth by his owner father. I never 
> met him, though. He died in the 40s.
> Anyway, I cannot police all terms in the repo and do not wish to. This 
> master/slave shit is archaic and misleading on technical grounds. Thankfully, 
> there's only a handful of files in code and documentation that still talk 
> about masters and slaves. We should replace all of them.
> There are so many ways to reword it. In fact, unless anyone else objects or 
> wants to do the grunt work to help my stress levels, I will open the pull 
> request myself in effort to make this project and community more inviting to 
> people of all backgrounds and histories. We can have leader/follower, or 
> primary/secondary, but none of this Master/Slave nonsense. I'm sick of the 
> garbage. 
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14557) Unable to parse local params followed by parenthesis like {!lucene}(gigabyte)

2020-08-06 Thread Mikhail Khludnev (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172345#comment-17172345
 ] 

Mikhail Khludnev commented on SOLR-14557:
-

The trap syntax prevails at 
[https://lucene.apache.org/solr/guide/8_6/local-parameters-in-queries.html], 
but the safer one is mentioned there as well. I think it makes sense to ban 
the {{\{!prefix}}} syntax from 9.0. 

> Unable to parse local params followed by parenthesis like {!lucene}(gigabyte)
> -
>
> Key: SOLR-14557
> URL: https://issues.apache.org/jira/browse/SOLR-14557
> Project: Solr
>  Issue Type: Bug
>  Components: query parsers
>Reporter: Mikhail Khludnev
>Assignee: Mikhail Khludnev
>Priority: Major
>  Labels: painful
> Attachments: SOLR-14557.patch, SOLR-14557.patch, SOLR-14557.patch
>
>
> h2. Solr 4.5
> {{/select?defType=edismax&q=\{!lucene}(foo)&debugQuery=true}} 
>  
>  goes like
>  {code}
>  \{!lucene}(foo)
>  content:foo
>  LuceneQParser
> {code}
> fine
> h2. Solr 8.2 
> with luceneMatchVersion=4.5 following SOLR-11501 I know it's a grey zone but 
> it's a question of migrating existing queries. 
> {{/select?defType=edismax&q=\{!lucene}(foo)&debugQuery=true}} 
> goes like 
> {code}
> "querystring":"\{!lucene}(foo)",
>  "parsedquery":"+DisjunctionMaxQuery(((Project.Address:lucene 
> Project.Address:foo) | (Project.OwnerType:lucene Project.OwnerType:foo) 
>  "QParser":"ExtendedDismaxQParser",
> {code}
> blah... 
> but removing braces in 8.2 works perfectly fine 
> {code}
> "querystring":"\{!lucene}foo",
>  "parsedquery":"+content:foo",
>  "parsedquery_toString":"+content:foo",
>  "QParser":"ExtendedDismaxQParser",
>  {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dweiss commented on a change in pull request #1721: LUCENE-9439: match region highlighter components

2020-08-06 Thread GitBox


dweiss commented on a change in pull request #1721:
URL: https://github.com/apache/lucene-solr/pull/1721#discussion_r466404602



##
File path: 
lucene/highlighter/src/java/org/apache/lucene/search/matchhighlight/MatchRegionRetriever.java
##
@@ -0,0 +1,503 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.search.matchhighlight;
+
+import org.apache.lucene.analysis.Analyzer;
+import org.apache.lucene.analysis.TokenStream;
+import org.apache.lucene.analysis.tokenattributes.OffsetAttribute;
+import org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute;
+import org.apache.lucene.document.Document;
+import org.apache.lucene.index.FieldInfo;
+import org.apache.lucene.index.FieldInfos;
+import org.apache.lucene.index.IndexReader;
+import org.apache.lucene.index.LeafReader;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.search.IndexSearcher;
+import org.apache.lucene.search.Matches;
+import org.apache.lucene.search.MatchesIterator;
+import org.apache.lucene.search.Query;
+import org.apache.lucene.search.QueryVisitor;
+import org.apache.lucene.search.ScoreMode;
+import org.apache.lucene.search.Weight;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.PrimitiveIterator;
+import java.util.Set;
+import java.util.TreeMap;
+import java.util.TreeSet;
+import java.util.function.Predicate;
+
+/**
+ * Utility class to compute a list of "hit regions" for a given query, 
searcher and
+ * document(s) using {@link Matches} API.
+ */
+public class MatchRegionRetriever {
+  private final List<LeafReaderContext> leaves;
+  private final Weight weight;
+  private final TreeSet<String> affectedFields;
+  private final Map<String, OffsetsRetrievalStrategy> offsetStrategies;
+  private final Set<String> preloadFields;
+
+  public MatchRegionRetriever(IndexSearcher searcher, Query query, Analyzer 
analyzer)
+  throws IOException {
+leaves = searcher.getIndexReader().leaves();
+assert checkOrderConsistency(leaves);
+
+weight = searcher.createWeight(query, ScoreMode.COMPLETE_NO_SCORES, 0);
+
+// Compute the subset of fields affected by this query so that we don't 
load or scan
+// fields that are irrelevant.
+affectedFields = new TreeSet<>();
+query.visit(
+new QueryVisitor() {
+  @Override
+  public boolean acceptField(String field) {
+affectedFields.add(field);
+return false;
+  }
+});
+
+// Compute value offset retrieval strategy for all affected fields.
+offsetStrategies =
+computeOffsetStrategies(affectedFields, searcher.getIndexReader(), 
analyzer);
+
+// Ask offset strategies if they'll need field values.
+preloadFields = new HashSet<>();
+offsetStrategies.forEach(
+(field, strategy) -> {
+  if (strategy.requiresDocument()) {
+preloadFields.add(field);
+  }
+});
+
+// Only preload those field values that can be affected by the query and 
are required
+// by strategies.
+preloadFields.retainAll(affectedFields);
+  }
+
+  public void highlightDocuments(PrimitiveIterator.OfInt docIds, 
HitRegionConsumer consumer)
+  throws IOException {
+if (leaves.isEmpty() || affectedFields.isEmpty()) {
+  return;
+}
+
+Iterator<LeafReaderContext> ctx = leaves.iterator();
+LeafReaderContext currentContext = ctx.next();
+int previousDocId = -1;
+Map<String, List<OffsetRange>> highlights = new TreeMap<>();
+while (docIds.hasNext()) {
+  int docId = docIds.nextInt();
+
+  if (docId < previousDocId) {
+throw new RuntimeException("Input document IDs must be sorted 
(increasing).");
+  }
+  previousDocId = docId;
+
+  while (docId >= currentContext.docBase + 
currentContext.reader().maxDoc()) {
+currentContext = ctx.next();
+  }
+
+  int contextRelativeDocId = docId - currentContext.docBase;
+
+  // Only preload fields we may potentially need.
+  FieldValueProvider documentSupplier;
+  if (preloadFields.isEmpty()) {
+documentSupplier = null;
+  } else {
+

[jira] [Created] (LUCENE-9447) Make BEST_COMPRESSION compress more aggressively?

2020-08-06 Thread Adrien Grand (Jira)
Adrien Grand created LUCENE-9447:


 Summary: Make BEST_COMPRESSION compress more aggressively?
 Key: LUCENE-9447
 URL: https://issues.apache.org/jira/browse/LUCENE-9447
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Adrien Grand


The Lucene86 codec supports setting a "Mode" for stored fields compression, 
that is either "BEST_SPEED", which translates to blocks of 16kB or 128 
documents (whichever is hit first) compressed with LZ4, or "BEST_COMPRESSION", 
which translates to blocks of 60kB or 512 documents compressed with DEFLATE 
with default compression level (6).

After looking at indices that spent most disk space on stored fields recently, 
I noticed that there was quite some room for improvement by increasing the 
block size even further:
||Block size||Stored fields size||
|60kB|168412338|
|128kB|130813639|
|256kB|113587009|
|512kB|104776378|
|1MB|100367095|
|2MB|98152464|
|4MB|97034425|
|8MB|96478746|

For this specific dataset, I had 1M documents that each had about 2kB of stored 
fields each and quite some redundancy.

This makes me want to look into bumping this block size to maybe 256kB. It 
would be interesting to re-do the experiments we did on LUCENE-6100 to see how 
this affects the merging speed. That said I don't think it would be terrible if 
the merging time increased a bit given that we already offer the BEST_SPEED 
option for CPU-savvy users.
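
For reference, a minimal sketch of how an application opts into this mode 
(Lucene 8.6 API names; the wrapper class is illustrative):

{code}
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.codecs.lucene50.Lucene50StoredFieldsFormat;
import org.apache.lucene.codecs.lucene86.Lucene86Codec;
import org.apache.lucene.index.IndexWriterConfig;

public class BestCompressionSketch {
  static IndexWriterConfig config() {
    IndexWriterConfig iwc = new IndexWriterConfig(new StandardAnalyzer());
    // BEST_COMPRESSION currently means 60kB / 512-doc DEFLATE blocks; the
    // block-size bump discussed above would change what this mode does, not
    // how it is selected.
    iwc.setCodec(new Lucene86Codec(Lucene50StoredFieldsFormat.Mode.BEST_COMPRESSION));
    return iwc;
  }
}
{code}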



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14654) Remove plugin loading from .system collection (for 9.0)

2020-08-06 Thread Cassandra Targett (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172330#comment-17172330
 ] 

Cassandra Targett commented on SOLR-14654:
--

The Ref Guide changes in the last commit here [~noble.paul] broke the Ref Guide 
build because while you deleted the page 
{{adding-custom-plugins-in-solrcloud-mode.adoc}}, you did not remove the 
reference to it from its prior parent, {{solr-plugins.adoc}}.

(The Ref Guide builds are down now due to the CI migration, otherwise they 
would have been complaining about this.)

> Remove plugin loading from .system collection (for 9.0)
> ---
>
> Key: SOLR-14654
> URL: https://issues.apache.org/jira/browse/SOLR-14654
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Priority: Major
> Fix For: master (9.0)
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> This code must go from master.
> All places where "runtimeLib" can be used will be removed in 9.0. With the 
> new package system in place we don't need this anymore.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dweiss commented on a change in pull request #1721: LUCENE-9439: match region highlighter components

2020-08-06 Thread GitBox


dweiss commented on a change in pull request #1721:
URL: https://github.com/apache/lucene-solr/pull/1721#discussion_r466387137



##
File path: 
lucene/highlighter/src/java/org/apache/lucene/search/matchhighlight/MatchRegionRetriever.java
##
@@ -0,0 +1,503 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.search.matchhighlight;
+
+import org.apache.lucene.analysis.Analyzer;
+import org.apache.lucene.analysis.TokenStream;
+import org.apache.lucene.analysis.tokenattributes.OffsetAttribute;
+import org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute;
+import org.apache.lucene.document.Document;
+import org.apache.lucene.index.FieldInfo;
+import org.apache.lucene.index.FieldInfos;
+import org.apache.lucene.index.IndexReader;
+import org.apache.lucene.index.LeafReader;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.search.IndexSearcher;
+import org.apache.lucene.search.Matches;
+import org.apache.lucene.search.MatchesIterator;
+import org.apache.lucene.search.Query;
+import org.apache.lucene.search.QueryVisitor;
+import org.apache.lucene.search.ScoreMode;
+import org.apache.lucene.search.Weight;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.PrimitiveIterator;
+import java.util.Set;
+import java.util.TreeMap;
+import java.util.TreeSet;
+import java.util.function.Predicate;
+
+/**
+ * Utility class to compute a list of "hit regions" for a given query, 
searcher and
+ * document(s) using {@link Matches} API.
+ */
+public class MatchRegionRetriever {
+  private final List<LeafReaderContext> leaves;
+  private final Weight weight;
+  private final TreeSet<String> affectedFields;
+  private final Map<String, OffsetsRetrievalStrategy> offsetStrategies;
+  private final Set<String> preloadFields;
+
+  public MatchRegionRetriever(IndexSearcher searcher, Query query, Analyzer 
analyzer)
+  throws IOException {
+leaves = searcher.getIndexReader().leaves();
+assert checkOrderConsistency(leaves);
+
+weight = searcher.createWeight(query, ScoreMode.COMPLETE_NO_SCORES, 0);
+
+// Compute the subset of fields affected by this query so that we don't 
load or scan
+// fields that are irrelevant.
+affectedFields = new TreeSet<>();
+query.visit(
+new QueryVisitor() {
+  @Override
+  public boolean acceptField(String field) {
+affectedFields.add(field);
+return false;
+  }
+});
+
+// Compute value offset retrieval strategy for all affected fields.
+offsetStrategies =
+computeOffsetStrategies(affectedFields, searcher.getIndexReader(), 
analyzer);
+
+// Ask offset strategies if they'll need field values.
+preloadFields = new HashSet<>();
+offsetStrategies.forEach(
+(field, strategy) -> {
+  if (strategy.requiresDocument()) {
+preloadFields.add(field);
+  }
+});
+
+// Only preload those field values that can be affected by the query and 
are required
+// by strategies.
+preloadFields.retainAll(affectedFields);
+  }
+
+  public void highlightDocuments(PrimitiveIterator.OfInt docIds, 
HitRegionConsumer consumer)

Review comment:
   Will do, thanks.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] romseygeek commented on a change in pull request #1721: LUCENE-9439: match region highlighter components

2020-08-06 Thread GitBox


romseygeek commented on a change in pull request #1721:
URL: https://github.com/apache/lucene-solr/pull/1721#discussion_r466387150



##
File path: 
lucene/highlighter/src/java/org/apache/lucene/search/matchhighlight/MatchRegionRetriever.java
##
@@ -0,0 +1,503 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.search.matchhighlight;
+
+import org.apache.lucene.analysis.Analyzer;
+import org.apache.lucene.analysis.TokenStream;
+import org.apache.lucene.analysis.tokenattributes.OffsetAttribute;
+import org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute;
+import org.apache.lucene.document.Document;
+import org.apache.lucene.index.FieldInfo;
+import org.apache.lucene.index.FieldInfos;
+import org.apache.lucene.index.IndexReader;
+import org.apache.lucene.index.LeafReader;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.search.IndexSearcher;
+import org.apache.lucene.search.Matches;
+import org.apache.lucene.search.MatchesIterator;
+import org.apache.lucene.search.Query;
+import org.apache.lucene.search.QueryVisitor;
+import org.apache.lucene.search.ScoreMode;
+import org.apache.lucene.search.Weight;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.PrimitiveIterator;
+import java.util.Set;
+import java.util.TreeMap;
+import java.util.TreeSet;
+import java.util.function.Predicate;
+
+/**
+ * Utility class to compute a list of "hit regions" for a given query, 
searcher and
+ * document(s) using {@link Matches} API.
+ */
+public class MatchRegionRetriever {
+  private final List<LeafReaderContext> leaves;
+  private final Weight weight;
+  private final TreeSet<String> affectedFields;
+  private final Map<String, OffsetsRetrievalStrategy> offsetStrategies;
+  private final Set<String> preloadFields;
+
+  public MatchRegionRetriever(IndexSearcher searcher, Query query, Analyzer 
analyzer)
+  throws IOException {
+leaves = searcher.getIndexReader().leaves();
+assert checkOrderConsistency(leaves);
+
+weight = searcher.createWeight(query, ScoreMode.COMPLETE_NO_SCORES, 0);
+
+// Compute the subset of fields affected by this query so that we don't 
load or scan
+// fields that are irrelevant.
+affectedFields = new TreeSet<>();
+query.visit(
+new QueryVisitor() {
+  @Override
+  public boolean acceptField(String field) {
+affectedFields.add(field);
+return false;
+  }
+});
+
+// Compute value offset retrieval strategy for all affected fields.
+offsetStrategies =
+computeOffsetStrategies(affectedFields, searcher.getIndexReader(), 
analyzer);
+
+// Ask offset strategies if they'll need field values.
+preloadFields = new HashSet<>();
+offsetStrategies.forEach(
+(field, strategy) -> {
+  if (strategy.requiresDocument()) {
+preloadFields.add(field);
+  }
+});
+
+// Only preload those field values that can be affected by the query and 
are required
+// by strategies.
+preloadFields.retainAll(affectedFields);
+  }
+
+  public void highlightDocuments(PrimitiveIterator.OfInt docIds, 
HitRegionConsumer consumer)
+  throws IOException {
+if (leaves.isEmpty() || affectedFields.isEmpty()) {
+  return;
+}
+
+Iterator<LeafReaderContext> ctx = leaves.iterator();
+LeafReaderContext currentContext = ctx.next();
+int previousDocId = -1;
+Map<String, List<OffsetRange>> highlights = new TreeMap<>();
+while (docIds.hasNext()) {
+  int docId = docIds.nextInt();
+
+  if (docId < previousDocId) {
+throw new RuntimeException("Input document IDs must be sorted 
(increasing).");
+  }
+  previousDocId = docId;
+
+  while (docId >= currentContext.docBase + 
currentContext.reader().maxDoc()) {
+currentContext = ctx.next();
+  }
+
+  int contextRelativeDocId = docId - currentContext.docBase;
+
+  // Only preload fields we may potentially need.
+  FieldValueProvider documentSupplier;
+  if (preloadFields.isEmpty()) {
+documentSupplier = null;
+  } else {
+

[GitHub] [lucene-solr] romseygeek commented on a change in pull request #1721: LUCENE-9439: match region highlighter components

2020-08-06 Thread GitBox


romseygeek commented on a change in pull request #1721:
URL: https://github.com/apache/lucene-solr/pull/1721#discussion_r466384339



##
File path: 
lucene/highlighter/src/java/org/apache/lucene/search/matchhighlight/MatchRegionRetriever.java
##
@@ -0,0 +1,503 @@
[... quoted MatchRegionRetriever.java excerpt identical to the code above, elided ...]
+  public void highlightDocuments(PrimitiveIterator.OfInt docIds, HitRegionConsumer consumer)

Review comment:
   A wrapper method that takes `TopDocs` and sorts the internal ids sounds 
like the best option here.
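   A hedged sketch of such a wrapper (the `highlightDocuments` signature is 
   taken from the excerpt above; the adapter class name is illustrative, not 
   part of this PR):

```java
import java.io.IOException;
import java.util.Arrays;
import org.apache.lucene.search.TopDocs;

public final class TopDocsHighlightAdapter {
  private TopDocsHighlightAdapter() {}

  /**
   * Extracts the global doc IDs from TopDocs, sorts them (the streaming API
   * requires increasing IDs) and hands them to the retriever.
   */
  public static void highlight(
      MatchRegionRetriever retriever, TopDocs topDocs, HitRegionConsumer consumer)
      throws IOException {
    int[] docIds =
        Arrays.stream(topDocs.scoreDocs).mapToInt(sd -> sd.doc).sorted().toArray();
    retriever.highlightDocuments(Arrays.stream(docIds).iterator(), consumer);
  }
}
```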





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dweiss commented on a change in pull request #1721: LUCENE-9439: match region highlighter components

2020-08-06 Thread GitBox


dweiss commented on a change in pull request #1721:
URL: https://github.com/apache/lucene-solr/pull/1721#discussion_r466380491



##
File path: 
lucene/highlighter/src/java/org/apache/lucene/search/matchhighlight/MatchRegionRetriever.java
##
@@ -0,0 +1,503 @@
[... quoted MatchRegionRetriever.java excerpt identical to the code above, elided ...]
+  public void highlightDocuments(PrimitiveIterator.OfInt docIds, HitRegionConsumer consumer)

Review comment:
   Oh, there is another reason too. Internally this "streaming" method 
requires increasing document IDs so that the bookkeeping of leaf readers is 
simplified (no need to continuously bisect each document ID).
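   For contrast, a sketch of the per-document lookup that sorted IDs make 
   unnecessary, using `ReaderUtil.subIndex` from lucene-core (an assumption 
   about how unsorted IDs would have to be handled, not code from this PR):

```java
import java.util.List;
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.index.ReaderUtil;

class LeafBisection {
  /** With unsorted IDs, every lookup would pay this O(log #leaves) bisection. */
  static int toLeafRelative(int docId, List<LeafReaderContext> leaves) {
    LeafReaderContext leaf = leaves.get(ReaderUtil.subIndex(docId, leaves));
    return docId - leaf.docBase;
  }
}
```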





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (SOLR-14715) Update processor initialization should be skipped on PULL replicas

2020-08-06 Thread Erick Erickson (Jira)
Erick Erickson created SOLR-14715:
-

 Summary: Update processor initialization should be skipped on PULL 
replicas
 Key: SOLR-14715
 URL: https://issues.apache.org/jira/browse/SOLR-14715
 Project: Solr
  Issue Type: Test
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Erick Erickson


From the user's list:
{quote}
Our PULL replicas... fail to start with below
exception when DocBasedVersionConstraintsProcessorFactory is added to
UpdateProcessorChain.

Caused by: org.apache.solr.common.SolrException: updateLog must be enabled.
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:1014)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:869)
    at
org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1138)
    ... 45 more
Caused by: org.apache.solr.common.SolrException: updateLog must be enabled.
    at
org.apache.solr.update.processor.DocBasedVersionConstraintsProcessorFactory.inform(DocBasedVersionConstraintsProcessorFactory.java:168)
    at
org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:696)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:993)
    ... 47 more
{quote}

and Tomás' reply:

{quote}
This is an interesting bug. I’m wondering if we can completely skip the
initialization of UpdateRequestProcessorFactories in PULL replicas...
{quote}
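
A hypothetical sketch of the kind of guard such a fix could add in 
DocBasedVersionConstraintsProcessorFactory.inform (CloudDescriptor and 
Replica.Type are existing Solr APIs, but this exact check is an assumption, not 
a committed fix):

{code:java}
@Override
public void inform(SolrCore core) {
  CloudDescriptor cloud = core.getCoreDescriptor().getCloudDescriptor();
  if (cloud != null && cloud.getReplicaType() == Replica.Type.PULL) {
    // PULL replicas never index locally, so the updateLog requirement
    // does not apply; skip the rest of the initialization checks.
    return;
  }
  if (core.getUpdateHandler().getUpdateLog() == null) {
    throw new SolrException(SolrException.ErrorCode.SERVER_ERROR,
        "updateLog must be enabled.");
  }
}
{code}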
 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dweiss commented on pull request #1721: LUCENE-9439: match region highlighter components

2020-08-06 Thread GitBox


dweiss commented on pull request #1721:
URL: https://github.com/apache/lucene-solr/pull/1721#issuecomment-669879507


   I don't use hamcrest but I'll take a look at what I can do. No need to pull 
in another library just for one test.
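
   For illustration, one way a typical hamcrest assertion can be expressed with 
   plain JUnit (hypothetical test data, just to show the trade-off):

```java
import static org.junit.Assert.assertEquals;

import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import org.junit.Test;

public class NoHamcrestExample {
  @Test
  public void orderInsensitiveWithoutHamcrest() {
    // hamcrest would be: assertThat(actual, containsInAnyOrder("foo", "bar"));
    List<String> actual = Arrays.asList("bar", "foo");
    assertEquals(new HashSet<>(Arrays.asList("foo", "bar")), new HashSet<>(actual));
  }
}
```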



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dweiss commented on a change in pull request #1721: LUCENE-9439: match region highlighter components

2020-08-06 Thread GitBox


dweiss commented on a change in pull request #1721:
URL: https://github.com/apache/lucene-solr/pull/1721#discussion_r466354710



##
File path: 
lucene/highlighter/src/java/org/apache/lucene/search/matchhighlight/MatchRegionRetriever.java
##
@@ -0,0 +1,503 @@
[... quoted MatchRegionRetriever.java excerpt identical to the code above, elided ...]

[GitHub] [lucene-solr] dweiss commented on a change in pull request #1721: LUCENE-9439: match region highlighter components

2020-08-06 Thread GitBox


dweiss commented on a change in pull request #1721:
URL: https://github.com/apache/lucene-solr/pull/1721#discussion_r466353991



##
File path: 
lucene/highlighter/src/java/org/apache/lucene/search/matchhighlight/MatchRegionRetriever.java
##
@@ -0,0 +1,503 @@
[... quoted MatchRegionRetriever.java excerpt identical to the code above, elided ...]
+  public void highlightDocuments(PrimitiveIterator.OfInt docIds, HitRegionConsumer consumer)

Review comment:
   I can add another method that would wrap it up? The arbitrary sequence 
of document IDs is motivated by, ahem, private needs - the TopDocs is 
contained, the iterator (and consumer) can stream over a large number of 
documents.
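
   A hedged usage sketch of that streaming shape (MatchRegionRetriever and 
   HitRegionConsumer are from this PR; the helper method itself is 
   illustrative, and IDs from `IntStream.range` are naturally increasing):

```java
import java.io.IOException;
import java.util.stream.IntStream;
import org.apache.lucene.search.IndexSearcher;

class StreamAllDocs {
  /** Streams every doc in the index through the highlighter, no TopDocs needed. */
  static void highlightAll(MatchRegionRetriever retriever, IndexSearcher searcher,
                           HitRegionConsumer consumer) throws IOException {
    retriever.highlightDocuments(
        IntStream.range(0, searcher.getIndexReader().maxDoc()).iterator(), consumer);
  }
}
```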





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dweiss commented on a change in pull request #1721: LUCENE-9439: match region highlighter components

2020-08-06 Thread GitBox


dweiss commented on a change in pull request #1721:
URL: https://github.com/apache/lucene-solr/pull/1721#discussion_r466353337



##
File path: 
lucene/highlighter/src/java/org/apache/lucene/search/matchhighlight/MatchRegionRetriever.java
##
@@ -0,0 +1,503 @@
[... quoted MatchRegionRetriever.java excerpt identical to the code above, elided ...]

[GitHub] [lucene-solr] dweiss commented on a change in pull request #1721: LUCENE-9439: match region highlighter components

2020-08-06 Thread GitBox


dweiss commented on a change in pull request #1721:
URL: https://github.com/apache/lucene-solr/pull/1721#discussion_r466352634



##
File path: 
lucene/highlighter/src/java/org/apache/lucene/search/matchhighlight/MatchRegionRetriever.java
##
@@ -0,0 +1,503 @@
[... quoted MatchRegionRetriever.java excerpt identical to the code above, elided ...]

[GitHub] [lucene-solr] dweiss commented on a change in pull request #1721: LUCENE-9439: match region highlighter components

2020-08-06 Thread GitBox


dweiss commented on a change in pull request #1721:
URL: https://github.com/apache/lucene-solr/pull/1721#discussion_r466350632



##
File path: 
lucene/core/src/java/org/apache/lucene/search/DisjunctionMatchesIterator.java
##
@@ -201,8 +201,9 @@ protected boolean lessThan(MatchesIterator a, MatchesIterator b) {
 
   @Override
   public boolean next() throws IOException {
-    if (started == false) {
-      return started = true;
+    if (!started) {

Review comment:
   Change you must when asked!





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] romseygeek commented on pull request #1671: LUCENE-9427: Ensure unified highlighter considers all terms in fuzzy query.

2020-08-06 Thread GitBox


romseygeek commented on pull request #1671:
URL: https://github.com/apache/lucene-solr/pull/1671#issuecomment-669869944


   Merged to master in 688583fc2d01c39bba63d19cf57bb5720eda1afd



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-9427) Unified highlighter can fail to highlight fuzzy query

2020-08-06 Thread Alan Woodward (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward resolved LUCENE-9427.
---
Fix Version/s: 8.7
   Resolution: Fixed

Thanks [~jtibshirani]!

> Unified highlighter can fail to highlight fuzzy query
> -
>
> Key: LUCENE-9427
> URL: https://issues.apache.org/jira/browse/LUCENE-9427
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Julie Tibshirani
>Priority: Major
> Fix For: 8.7
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> If a fuzzy query corresponds to an exact match (for example it has 
> maxEdits: 0), then the unified highlighter doesn't produce highlights for the 
> matching terms.
> I think this is due to the fact that when visiting a fuzzy query, the exact 
> terms are now consumed separately from automata. The unified highlighter 
> doesn't account for the terms and misses highlighting them.
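
For illustration, a sketch of the visitor interaction at issue (the 
QueryVisitor callbacks are existing Lucene API; the field and term values are 
made up):

{code:java}
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Supplier;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.FuzzyQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.QueryVisitor;
import org.apache.lucene.util.automaton.ByteRunAutomaton;

class FuzzyVisitDemo {
  static Set<Term> collectExactTerms() {
    Query q = new FuzzyQuery(new Term("body", "lucene"), 0); // maxEdits=0: exact match
    Set<Term> exactTerms = new HashSet<>();
    q.visit(
        new QueryVisitor() {
          @Override
          public void consumeTerms(Query query, Term... terms) {
            exactTerms.addAll(Arrays.asList(terms)); // exact terms arrive here
          }

          @Override
          public void consumeTermsMatching(
              Query query, String field, Supplier<ByteRunAutomaton> automaton) {
            // Highlighters hooking only this automaton callback missed
            // exact-match fuzzy terms before the fix.
          }
        });
    return exactTerms;
  }
}
{code}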



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] romseygeek closed pull request #1671: LUCENE-9427: Ensure unified highlighter considers all terms in fuzzy query.

2020-08-06 Thread GitBox


romseygeek closed pull request #1671:
URL: https://github.com/apache/lucene-solr/pull/1671


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9427) Unified highlighter can fail to highlight fuzzy query

2020-08-06 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172273#comment-17172273
 ] 

ASF subversion and git services commented on LUCENE-9427:
-

Commit b6806355c3cf7c866ab3b2302b78f2b478691876 in lucene-solr's branch 
refs/heads/branch_8x from Julie Tibshirani
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=b680635 ]

LUCENE-9427: Fuzzy query should always call consumeTermsMatching in visitor


> Unified highlighter can fail to highlight fuzzy query
> -
>
> Key: LUCENE-9427
> URL: https://issues.apache.org/jira/browse/LUCENE-9427
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Julie Tibshirani
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> If a fuzzy query corresponds to an exact match (for example it has 
> maxEdits: 0), then the unified highlighter doesn't produce highlights for the 
> matching terms.
> I think this is due to the fact that when visiting a fuzzy query, the exact 
> terms are now consumed separately from automata. The unified highlighter 
> doesn't account for the terms and misses highlighting them.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9427) Unified highlighter can fail to highlight fuzzy query

2020-08-06 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172274#comment-17172274
 ] 

ASF subversion and git services commented on LUCENE-9427:
-

Commit 688583fc2d01c39bba63d19cf57bb5720eda1afd in lucene-solr's branch 
refs/heads/master from Julie Tibshirani
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=688583f ]

LUCENE-9427: Fuzzy query should always call consumeTermsMatching in visitor


> Unified highlighter can fail to highlight fuzzy query
> -
>
> Key: LUCENE-9427
> URL: https://issues.apache.org/jira/browse/LUCENE-9427
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Julie Tibshirani
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> If a fuzzy query corresponds to an exact match (for example it has 
> maxEdits: 0), then the unified highlighter doesn't produce highlights for the 
> matching terms.
> I think this is due to the fact that when visiting a fuzzy query, the exact 
> terms are now consumed separately from automata. The unified highlighter 
> doesn't account for the terms and misses highlighting them.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14714) Solr.cmd in windows loads the incorrect jetty module when using java>=9

2020-08-06 Thread Endika Posadas (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Endika Posadas updated SOLR-14714:
--
Description: 
In Solr.cmd, when using SSL, there is a check to verify which version of Java 
Solr is running with. If this version of Java is greater than or equal to Java 
9, it will load the jetty https module, while for Java 8 it will use https8.

However, this Java version check is done before the Java major version variable 
has been assigned. As a result, Solr on Windows doesn't work when SSL is 
enabled.

 

To fix this issue, it is enough if java checks are done before SSL checks.

 

I have attached a patch with the modifications.

  was:
In Solr.cmd, when using SSL, there is a check to verify what version of java 
solr is running with. If this version of Java is greater or equal java 9 it 
will load the jetty https module while for java 8 it will use https8. 

However, this java version check is done before the java major version variable 
has been assigned. As a result, Solr in windows doesn't work when SSL is 
enabled.

 

To fix this issue is enough if java checks are done before SSL checks.

 

I have attached a patch with the modifications.


> Solr.cmd in windows loads the incorrect jetty module when using java>=9
> ---
>
> Key: SOLR-14714
> URL: https://issues.apache.org/jira/browse/SOLR-14714
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: scripts and tools
>Affects Versions: 8.6
> Environment: Windows environment running in Solr Cloud mode with SSL.
>Reporter: Endika Posadas
>Priority: Major
> Attachments: load_java_info_first.patch
>
>
> In Solr.cmd, when using SSL, there is a check to verify what version of java 
> solr is running with. If this version of Java is greater or equal java 9 it 
> will load the jetty https module while for java 8 it will use https8.
> However, this java version check is done before the java major version 
> variable has been assigned. As a result, Solr in windows doesn't work when 
> SSL is enabled.
>  
> To fix this issue, it is enough if java checks are done before SSL checks.
>  
> I have attached a patch with the modifications.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] romseygeek commented on a change in pull request #1721: LUCENE-9439: match region highlighter components

2020-08-06 Thread GitBox


romseygeek commented on a change in pull request #1721:
URL: https://github.com/apache/lucene-solr/pull/1721#discussion_r466306845



##
File path: 
lucene/highlighter/src/java/org/apache/lucene/search/matchhighlight/MatchRegionRetriever.java
##
@@ -0,0 +1,503 @@
[... quoted MatchRegionRetriever.java excerpt identical to the code above, elided ...]

[jira] [Updated] (SOLR-14714) Solr.cmd in windows loads the incorrect jetty module when using java>=9

2020-08-06 Thread Endika Posadas (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Endika Posadas updated SOLR-14714:
--
Summary: Solr.cmd in windows loads the incorrect jetty module when using 
java>=9  (was: Solr.cmd in windows loads the incorrect jetty module when using 
java>9)

> Solr.cmd in windows loads the incorrect jetty module when using java>=9
> ---
>
> Key: SOLR-14714
> URL: https://issues.apache.org/jira/browse/SOLR-14714
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: scripts and tools
>Affects Versions: 8.6
> Environment: Windows environment running in Solr Cloud mode with SSL.
>Reporter: Endika Posadas
>Priority: Major
> Attachments: load_java_info_first.patch
>
>
> In Solr.cmd, when using SSL, there is a check to verify what version of java 
> solr is running with. If this version of Java is greater or equal java 9 it 
> will load the jetty https module while for java 8 it will use https8. 
> However, this java version check is done before the java major version 
> variable has been assigned. As a result, Solr in windows doesn't work when 
> SSL is enabled.
>  
> To fix this issue is enough if java checks are done before SSL checks.
>  
> I have attached a patch with the modifications.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (SOLR-14714) Solr.cmd in windows loads the incorrect jetty module when using java>9

2020-08-06 Thread Endika Posadas (Jira)
Endika Posadas created SOLR-14714:
-

 Summary: Solr.cmd in windows loads the incorrect jetty module when 
using java>9
 Key: SOLR-14714
 URL: https://issues.apache.org/jira/browse/SOLR-14714
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: scripts and tools
Affects Versions: 8.6
 Environment: Windows environment running in Solr Cloud mode with SSL.
Reporter: Endika Posadas
 Attachments: load_java_info_first.patch

In Solr.cmd, when using SSL, there is a check to verify what version of java 
solr is running with. If this version of Java is greater or equal java 9 it 
will load the jetty https module while for java 8 it will use https8. 

However, this java version check is done before the java major version variable 
has been assigned. As a result, Solr in windows doesn't work when SSL is 
enabled.

 

To fix this issue is enough if java checks are done before SSL checks.

 

I have attached a patch with the modifications.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-08-06 Thread Michael McCandless (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172213#comment-17172213
 ] 

Michael McCandless commented on LUCENE-9444:


{quote}Should this class extend {{DocIdSetIterator}} to allow intersection 
with another {{DocIdSetIterator}} created from 
{{FacetsCollector.MatchingDocs.bits}} ?
{quote}
I think so?
{quote}Making {{dim}} part of ctor feels a bit restrictive, how about providing 
2 separate APIs, one that accepts dimension and another that does not ?
{quote}
Maybe the method that produces the iterator could optionally take a {{dim}} to 
filter for only those labels under that dimension?
{quote}how about returning a {{java.util.Iterator}} instead of 
{{FacetLabel[]}} ?
{quote}
+1

 

> Need an API to easily fetch facet labels for a field in a document
> --
>
> Key: LUCENE-9444
> URL: https://issues.apache.org/jira/browse/LUCENE-9444
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Affects Versions: 8.6
>Reporter: Ankur
>Priority: Major
>
> A facet field may be included in the list of fields whose values are to be 
> returned for each hit.
> In order to get the facet labels for each hit we need to
>  # Create an instance of _DocValuesOrdinalsReader_ and invoke 
> _getReader(LeafReaderContext context)_ method to obtain an instance of 
> _OrdinalsSegmentReader()_
>  # _OrdinalsSegmentReader.get(int docID, IntsRef ordinals)_ method is then 
> used to fetch and decode the binary payload in the document's BinaryDocValues 
> field. This provides the ordinals that refer to facet labels in the 
> taxonomy.
>  # Lastly TaxonomyReader.getPath(ord) is used to fetch the labels to be 
> returned.
>  
> Ideally there should be a simple API - *String[] getLabels(docId)* that hides 
> all the above details and gives us the string labels. This can be part of 
> *TaxonomyFacets* but that's just one idea.
> I am opening this issue to get community feedback and suggestions.
>  
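
For reference, a sketch assembling the three steps above into the kind of 
helper being proposed (the method shape is illustrative only; the 
DocValuesOrdinalsReader/TaxonomyReader calls are existing Lucene API):

{code:java}
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.lucene.facet.taxonomy.DocValuesOrdinalsReader;
import org.apache.lucene.facet.taxonomy.FacetLabel;
import org.apache.lucene.facet.taxonomy.OrdinalsReader;
import org.apache.lucene.facet.taxonomy.TaxonomyReader;
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.util.IntsRef;

class FacetLabelHelper {
  /** Illustrative shape of the proposed getLabels(docId) convenience. */
  static List<FacetLabel> getLabels(
      LeafReaderContext context, int segmentDocId, TaxonomyReader taxoReader)
      throws IOException {
    // Step 1: obtain an OrdinalsSegmentReader for this segment.
    OrdinalsReader.OrdinalsSegmentReader ords =
        new DocValuesOrdinalsReader().getReader(context);
    // Step 2: decode the ordinals stored in the document's BinaryDocValues field.
    IntsRef ordinals = new IntsRef();
    ords.get(segmentDocId, ordinals);
    // Step 3: resolve each ordinal to its facet label via the taxonomy.
    List<FacetLabel> labels = new ArrayList<>(ordinals.length);
    for (int i = 0; i < ordinals.length; i++) {
      labels.add(taxoReader.getPath(ordinals.ints[ordinals.offset + i]));
    }
    return labels;
  }
}
{code}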



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14713) Single thread on streaming updates

2020-08-06 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172203#comment-17172203
 ] 

Cao Manh Dat commented on SOLR-14713:
-

I created a PR for this; it is not finished yet and has no tests so far. But the 
PR also solves the problem of incorrectly handling retried requests. Here is the 
scenario:
 * {{UpdateRequest}} is converted to multiple {{Req}}s
 * Solr fails to send the second Req
 * Solr retries the first Req (since we only refer/point to the first one)
 * It succeeds
 * The whole UpdateRequest is reported as a success.

 

> Single thread on streaming updates
> --
>
> Key: SOLR-14713
> URL: https://issues.apache.org/jira/browse/SOLR-14713
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Or greatly simplify SolrCmdDistributor
> h2. Current way of fanning out updates in Solr
> Currently, on receiving an updateRequest, Solr will create a new 
> UpdateProcessor for handling that request, then it parses documents one by 
> one from the request and lets the processor handle them.
> {code:java}
> onReceiving(UpdateRequest update):
>   processors = createNewProcessors();
>   for (Document doc : update) {
>     processors.handle(doc)
>   }
> {code}
> Let's say the number of replicas in the current shard is N; updateProcessor 
> will create N-1 queues and runners, one for each other replica.
>  A Runner is basically a thread that dequeues updates from its corresponding 
> queue and sends them to the corresponding replica endpoint.
> Note 1: all Runners share the same client, hence the same connection pool and 
> thread pool. 
>  Note 2: A runner will send all documents of its UpdateRequest in a single 
> HTTP POST request (to reduce the number of threads for handling requests on 
> the other side). Therefore its lifetime equals the total time of handling its 
> UpdateRequest. Below is a typical activity that happens in a runner's life 
> cycle.
> h2. Problems of the current approach
> The current approach has two problems:
>  - Problem 1: It uses lots of threads to fan out requests.
>  - Problem 2, which is more important: it is very complex. Solr is also using 
> ConcurrentUpdateSolrClient (CUSC for short) for that; the CUSC implementation 
> allows using a single queue but multiple runners for the same queue (although 
> we only use one runner at most), and this raises the complexity of the whole 
> flow up to the top. A single fix for a problem can raise multiple problems 
> later, e.g.: in SOLR-13975, while trying to handle the problem of the other 
> endpoint hanging for too long, we introduced a bug that lets the runner keep 
> running even when the updateRequest is fully handled in the leader.
> h2. Doing everything in a single thread
> Since we already support sending requests in an async manner, why don't we 
> let the main thread which is handling the update request send the updates to 
> all the others, without the need for runners or queues. The code will be 
> something like this:
> {code:java}
> Class UpdateProcessor:
>   Map pendingOutStreams
>
>   func handleAddDoc(doc):
>     for (replica : replicas):
>       pendingOutStreams.get(replica).send(doc)
>
>   func onEndUpdateRequest():
>     pendingOutStreams.values().forEach(out -> closeAndHandleResponse(out))
> {code}
>  
> By doing this we will use fewer threads and the code will be much simpler and 
> cleaner. Of course there will be some degradation in the time for handling an 
> updateRequest, since we are doing it serially instead of concurrently. In a 
> formal way it will be like this:
> {code:java}
> oldTime = timeForIndexing(update) + timeForSendingUpdates(update)
> newTime = timeForIndexing(update) + (N-1) * timeForSendingUpdates(update)
> {code}
> But I believe that timeForIndexing is much greater than timeForSendingUpdates, 
> so we do not really need to be concerned about this. Even if that really is a 
> problem, users can simply create more threads for indexing.
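
A minimal Java sketch of the pseudocode above, with stand-in types (Replica, 
Document and OutStream are illustrative placeholders, not Solr's actual 
classes):

{code:java}
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

interface Replica {}
interface Document {}
interface OutStream {
  void send(Document doc) throws IOException;
}

class SingleThreadFanOut {
  private final Map<Replica, OutStream> pendingOutStreams = new HashMap<>();

  /** Streams each incoming doc to every replica inline; no queues, no runner threads. */
  void handleAddDoc(Document doc) throws IOException {
    for (OutStream out : pendingOutStreams.values()) {
      out.send(doc);
    }
  }

  /** Closes all replica streams and collects their responses at request end. */
  void onEndUpdateRequest() {
    pendingOutStreams.values().forEach(this::closeAndHandleResponse);
  }

  private void closeAndHandleResponse(OutStream out) {
    // close the async stream and verify the replica's response status here
  }
}
{code}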



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] CaoManhDat opened a new pull request #1722: SOLR-14713: Single thread on streaming updates

2020-08-06 Thread GitBox


CaoManhDat opened a new pull request #1722:
URL: https://github.com/apache/lucene-solr/pull/1722


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14713) Single thread on streaming updates

2020-08-06 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172196#comment-17172196
 ] 

ASF subversion and git services commented on SOLR-14713:


Commit 5986b4cc3c83ac89d014d85ec4ea53d303800fe7 in lucene-solr's branch 
refs/heads/jira/SOLR-14713 from Cao Manh Dat
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=5986b4c ]

SOLR-14713: Single thread on streaming updates


> Single thread on streaming updates
> --
>
> Key: SOLR-14713
> URL: https://issues.apache.org/jira/browse/SOLR-14713
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
>
> Or great simplify SolrCmdDistributor
> h2. Current way for fan out updates of Solr
> Currently on receiving an updateRequest, Solr will create a new 
> UpdateProcessors for handling that request, then it parses one by one 
> document from the request and let’s processor handle it.
> {code:java}
> onReceiving(UpdateRequest update):
>   processors = createNewProcessors();
>   for (Document doc : update) {
> processors.handle(doc)
> }
> {code}
> Let’s say the number of replicas in the current shard is N, updateProcessor 
> will create N-1 queues and runners for each other replica.
>  Runner is basically a thread that dequeues updates from its corresponding 
> queue and sends it to a corresponding replica endpoint.
> Note 1: all Runners share the same client hence connection pool and same 
> thread pool. 
>  Note 2: A runner will send all documents of its UpdateRequest in a single 
> HTTP POST request (to reduce the number of threads for handling requests on 
> the other side). Therefore its lifetime equals the total time of handling its 
> UpdateRequest. Below is a typical activity that happens in a runner's life 
> cycle.
> h2. Problems of the current approach
> The current approach has two problems:
>  - Problem 1: it uses lots of threads for fanning out requests.
>  - Problem 2, which is more important: it is very complex. Solr is also using
> ConcurrentUpdateSolrClient (CUSC for short) for this; the CUSC implementation
> allows using a single queue but multiple runners for the same queue (although
> we only use one runner at most), and this raises the complexity of the whole
> flow up to the top. A single fix for one problem can raise multiple problems
> later, e.g. in SOLR-13975, while trying to handle the case where the other
> endpoint hangs for a long time, we introduced a bug that lets the runner keep
> running even when the updateRequest is fully handled in the leader.
> h2. Doing everything in a single thread
> Since we already support sending requests in an async manner, why don’t we
> let the main thread that is handling the update request send updates to all
> the others, without the need for runners or queues. The code will be
> something like this
> {code:java}
> class UpdateProcessor:
>   Map pendingOutStreams
>
>   func handleAddDoc(doc):
>     for (replica : replicas):
>       pendingOutStreams.get(replica).send(doc)
>
>   func onEndUpdateRequest():
>     pendingOutStreams.values().forEach(out -> closeAndHandleResponse(out))
> {code}
> By doing this we will use fewer threads and the code will be much simpler and
> cleaner. Of course there will be some slowdown in the time for handling
> an updateRequest, since we are doing it serially instead of concurrently. In a
> formal way it will be like this
> {code:java}
> oldTime = timeForIndexing(update) + timeForSendingUpdates(update)
> newTime = timeForIndexing(update) + (N-1) * timeForSendingUpdates(update)
> {code}
> But I believe that timeForIndexing is much larger than timeForSendingUpdates,
> so we do not really need to be concerned about this. Even if that really is a
> problem, users can simply create more threads for indexing.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (SOLR-14713) Single thread on streaming updates

2020-08-06 Thread Cao Manh Dat (Jira)
Cao Manh Dat created SOLR-14713:
---

 Summary: Single thread on streaming updates
 Key: SOLR-14713
 URL: https://issues.apache.org/jira/browse/SOLR-14713
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Cao Manh Dat
Assignee: Cao Manh Dat


Or greatly simplify SolrCmdDistributor
h2. Current way of fanning out updates in Solr

Currently, on receiving an updateRequest, Solr will create a new chain of
UpdateProcessors for handling that request, then it parses documents from the
request one by one and lets the processor handle them.
{code:java}
onReceiving(UpdateRequest update):
  processors = createNewProcessors();
  for (Document doc : update) {
processors.handle(doc)
}
{code}
Let’s say the number of replicas in the current shard is N; the updateProcessor
will create N-1 queues and runners, one for each other replica.
A runner is basically a thread that dequeues updates from its corresponding
queue and sends them to the corresponding replica endpoint.

Note 1: all runners share the same client, hence the same connection pool and
the same thread pool.
Note 2: a runner will send all documents of its UpdateRequest in a single HTTP
POST request (to reduce the number of threads for handling requests on the
other side). Therefore its lifetime equals the total time of handling its
UpdateRequest. Below is a typical activity that happens in a runner's life
cycle.
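
To make that pattern concrete, here is a minimal hedged sketch of the
per-replica queue-and-runner idea; ReplicaRunner, END and sendInSamePost are
illustrative names for this sketch, not the actual SolrCmdDistributor/CUSC code:
{code:java}
import java.util.concurrent.BlockingQueue;

// Illustrative sketch of one queue + one runner thread per remote replica.
class ReplicaRunner implements Runnable {
  static final String END = "__END__";        // poison pill marking end of the update request
  private final BlockingQueue<String> queue;  // serialized updates destined for one replica

  ReplicaRunner(BlockingQueue<String> queue) {
    this.queue = queue;
  }

  @Override
  public void run() {
    try {
      // The runner lives as long as its UpdateRequest: every dequeued update
      // is streamed into the same HTTP POST body (see Note 2 above).
      String update;
      while (!(update = queue.take()).equals(END)) {
        sendInSamePost(update);               // write through the shared client/connection pool
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  private void sendInSamePost(String update) {
    // network write elided in this sketch
  }
}
{code}
With N replicas, the leader would create N-1 such queues and submit N-1 runners
to a shared thread pool, which is exactly the thread usage Problem 1 below
points at.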
h2. Problems of the current approach

The current approach has two problems:
 - Problem 1: it uses lots of threads for fanning out requests.
 - Problem 2, which is more important: it is very complex. Solr is also using
ConcurrentUpdateSolrClient (CUSC for short) for this; the CUSC implementation
allows using a single queue but multiple runners for the same queue (although
we only use one runner at most), and this raises the complexity of the whole
flow up to the top. A single fix for one problem can raise multiple problems
later, e.g. in SOLR-13975, while trying to handle the case where the other
endpoint hangs for a long time, we introduced a bug that lets the runner keep
running even when the updateRequest is fully handled in the leader.

h2. Doing everything in a single thread

Since we already support sending requests in an async manner, why don’t we let
the main thread that is handling the update request send updates to all the
others, without the need for runners or queues. The code will be something
like this
{code:java}
class UpdateProcessor:
  Map pendingOutStreams

  func handleAddDoc(doc):
    for (replica : replicas):
      pendingOutStreams.get(replica).send(doc)

  func onEndUpdateRequest():
    pendingOutStreams.values().forEach(out -> closeAndHandleResponse(out))
{code}

By doing this we will use fewer threads and the code will be much simpler and
cleaner. Of course there will be some slowdown in the time for handling an
updateRequest, since we are doing it serially instead of concurrently. In a
formal way it will be like this
{code:java}
oldTime = timeForIndexing(update) + timeForSendingUpdates(update)
newTime = timeForIndexing(update) + (N-1) * timeForSendingUpdates(update)
{code}
But I believe that timeForIndexing is much larger than timeForSendingUpdates,
so we do not really need to be concerned about this. With hypothetical numbers,
if indexing a batch takes 100 ms and streaming it to one replica takes 2 ms,
then with N = 3 the total only grows from 102 ms to 104 ms. Even if that really
is a problem, users can simply create more threads for indexing.
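
A hedged Java rendering of the pseudocode above may help; Replica handling and
the real network streams are elided, and openStreamTo is a hypothetical helper,
so this is a sketch of the idea rather than actual Solr code:
{code:java}
import java.io.IOException;
import java.io.OutputStream;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative single-threaded fan-out following the pseudocode above.
class SingleThreadDistributor {
  private final Map<String, OutputStream> pendingOutStreams = new HashMap<>();

  SingleThreadDistributor(List<String> replicas) throws IOException {
    for (String replica : replicas) {
      pendingOutStreams.put(replica, openStreamTo(replica));
    }
  }

  // Called by the request-handling thread itself: no queues, no runner threads.
  void handleAddDoc(byte[] doc) throws IOException {
    for (OutputStream out : pendingOutStreams.values()) {
      out.write(doc);   // each replica has one long-lived open stream
    }
  }

  // At the end of the update request, close every stream and handle responses.
  void onEndUpdateRequest() throws IOException {
    for (OutputStream out : pendingOutStreams.values()) {
      out.close();      // stands in for closeAndHandleResponse(out)
    }
  }

  private OutputStream openStreamTo(String replica) throws IOException {
    // a real implementation would open an async HTTP POST body to the replica
    return OutputStream.nullOutputStream(); // JDK 11+ no-op stream for the sketch
  }
}
{code}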



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9439) Matches API should enumerate hit fields that have no positions (no iterator)

2020-08-06 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172186#comment-17172186
 ] 

Dawid Weiss commented on LUCENE-9439:
-

Hi Alan. Thank you for your feedback. Works like a charm. The "no-positions" 
strategy approach allows for some interesting variations: one could add match 
regions for entire field values or just for the tokens returned from analysis 
(so you can "see" individual tokens over the value text).

I piggybacked a small fix to the disjunction matches iterator because it looked 
like a bug to me (unrelated). [1]

Otherwise it's really well separated from the existing code and works great for 
me. For example, I tried interval queries and they just work out of the box. A 
more complex expression highlights more than it should, but this is related to 
the match range returned, so it is nicely decoupled from the "highlighting 
engine" itself. [2]

I think it's worth adding to Lucene. We would have to get rid of the assertj 
dependency first, though. Or maybe we should add it and allow its use? The nice 
thing about assertj is that it formats assertion failures in a much better way, 
especially for stream or collection assertions.

[1] 
https://github.com/apache/lucene-solr/pull/1721/files#diff-f5538289e23aabdd53bc3bcbc59da342
[2] 
https://github.com/apache/lucene-solr/blob/c0562c1f2d789679432f9d72375aa3747e4b6526/lucene/highlighter/src/test/org/apache/lucene/search/matchhighlight/MatchRegionRetrieverTest.java#L335-L355
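
For readers who haven't used the API under discussion, a minimal sketch
(assuming a prepared IndexSearcher, Query and top-level docId) of how per-field
match regions are enumerated; fields indexed without positions are exactly the
corner case the issue below is about:
{code:java}
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Matches;
import org.apache.lucene.search.MatchesIterator;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreMode;
import org.apache.lucene.search.Weight;

class MatchesDemo {
  // Prints the match positions for every field that matched the query in docId.
  static void printMatches(IndexSearcher searcher, Query query, int docId) throws Exception {
    Weight weight =
        searcher.createWeight(searcher.rewrite(query), ScoreMode.COMPLETE_NO_SCORES, 1f);
    for (LeafReaderContext ctx : searcher.getIndexReader().leaves()) {
      int leafDoc = docId - ctx.docBase;
      if (leafDoc < 0 || leafDoc >= ctx.reader().maxDoc()) {
        continue; // docId lives in a different segment
      }
      Matches matches = weight.matches(ctx, leafDoc);
      if (matches == null) {
        return; // query does not match this document
      }
      for (String field : matches) {            // Matches enumerates matching fields
        MatchesIterator it = matches.getMatches(field);
        if (it == null) {
          continue; // positionless fields can yield no iterator: the corner case above
        }
        while (it.next()) {
          System.out.println(field + ": " + it.startPosition() + "-" + it.endPosition());
        }
      }
    }
  }
}
{code}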


> Matches API should enumerate hit fields that have no positions (no iterator)
> 
>
> Key: LUCENE-9439
> URL: https://issues.apache.org/jira/browse/LUCENE-9439
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Minor
> Attachments: LUCENE-9439.patch, matchhighlighter.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> I have been fiddling with the Matches API and it's great. There is one corner
> case that doesn't work for me though -- queries that affect fields without
> positions return {{MatchesUtils.MATCH_WITH_NO_TERMS}}, but this constant is
> problematic as it doesn't carry the field name that caused it (returns null).
> The associated fromSubMatches combines all these constants into one (or
> swallows them), which is another problem.
> I think it would be more consistent if MATCH_WITH_NO_TERMS were replaced with
> a true match (carrying the field name) returning an empty iterator (or a
> constant "empty" iterator, NO_TERMS).
> I have a very compelling use case: I wrote an "auto-highlighter" that runs on
> top of the Matches API and automatically picks up query-relevant fields and
> snippets. Everything works beautifully except for cases where fields are
> searchable but don't have any positions (token-like fields).
> I can work on a patch but wanted to reach out first - [~romseygeek]?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dweiss opened a new pull request #1721: LUCENE-9439: match region highlighter components

2020-08-06 Thread GitBox


dweiss opened a new pull request #1721:
URL: https://github.com/apache/lucene-solr/pull/1721


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dweiss closed pull request #1689: LUCENE-9439: Matches API should enumerate hit fields that have no positions (support empty iterator)

2020-08-06 Thread GitBox


dweiss closed pull request #1689:
URL: https://github.com/apache/lucene-solr/pull/1689


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14712) Standardize RPC calls in Solr

2020-08-06 Thread Noble Paul (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-14712:
--
Description: 
We should have a standard mechanism to make a request to the right replica/node
across Solr code.

This RPC mechanism assumes that
 * The RPC mechanism is HTTP
 * It is aware of all collections, shards & their topology, etc.
 * It knows how to route a request to the correct core

This is agnostic of wire-level formats, Solr documents, etc. That is a layer
above this.

Anyone can use their own JSON parser or any other RPC wire-level format on top
of this.

For example, code like this

{code}
private void invokeOverseerOp(String electionNode, String op) {
  ModifiableSolrParams params = new ModifiableSolrParams();
  ShardHandler shardHandler = shardHandlerFactory.getShardHandler();
  params.set(CoreAdminParams.ACTION, CoreAdminAction.OVERSEEROP.toString());
  params.set("op", op);
  params.set("qt", adminPath);
  params.set("electionNode", electionNode);
  ShardRequest sreq = new ShardRequest();
  sreq.purpose = 1;
  String replica =
      zkStateReader.getBaseUrlForNodeName(LeaderElector.getNodeName(electionNode));
  sreq.shards = new String[]{replica};
  sreq.actualShards = sreq.shards;
  sreq.params = params;
  shardHandler.submit(sreq, replica, sreq.params);
  shardHandler.takeCompletedOrError();
}
{code}

will be replaced with
{code}
private void invokeOverseerOp(String electionNode, String op) {
  HttpRpcFactory factory = null;
  factory.create()
      .withHttpMethod(SolrRequest.METHOD.GET)
      .addParam(CoreAdminParams.ACTION, CoreAdminAction.OVERSEEROP.toString())
      .addParam("op", op)
      .addParam("electionNode", electionNode)
      .addParam(ShardParams.SHARDS_PURPOSE, "1")
      .withV1Uri(adminPath)
      .toNode(electionNode)
      .invoke();
}
{code}
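
The fluent API above is still a proposal; as a rough sketch under that
assumption (none of these interfaces exist in Solr yet, the names simply follow
the example above), the builder behind it could be declared along these lines:
{code}
import org.apache.solr.client.solrj.SolrRequest;

// Hypothetical sketch of the proposed fluent RPC builder, not committed code.
interface HttpRpc {
  HttpRpc withHttpMethod(SolrRequest.METHOD method);
  HttpRpc addParam(String name, String value);
  HttpRpc withV1Uri(String uri);
  HttpRpc toNode(String nodeName);   // topology-aware routing happens behind this call
  Object invoke() throws Exception;  // response handling deliberately left open in this sketch
}

interface HttpRpcFactory {
  HttpRpc create();
}
{code}
Keeping routing behind toNode() is what makes the caller agnostic of
collections, shards and topology, per the assumptions listed above.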

  was:
We should have a standard mechanism to make a request to the right replica/node
across Solr code.

This RPC mechanism assumes that
 * The RPC mechanism is HTTP
 * It is aware of all collections, shards & their topology, etc.
 * It knows how to route a request to the correct core

This is agnostic of wire-level formats, Solr documents, etc. That is a layer
above this.

Anyone can use their own JSON parser or any other RPC wire-level format on top
of this.


> Standardize RPC calls in Solr
> -
>
> Key: SOLR-14712
> URL: https://issues.apache.org/jira/browse/SOLR-14712
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We should have a standard mechanism to make a request to the right
> replica/node across Solr code.
> This RPC mechanism assumes that
>  * The RPC mechanism is HTTP
>  * It is aware of all collections, shards & their topology, etc.
>  * It knows how to route a request to the correct core
>  This is agnostic of wire-level formats, Solr documents, etc. That is a layer
> above this.
> Anyone can use their own JSON parser or any other RPC wire-level format on
> top of this.
> For example, code like this
> {code}
> private void invokeOverseerOp(String electionNode, String op) {
>   ModifiableSolrParams params = new ModifiableSolrParams();
>   ShardHandler shardHandler = shardHandlerFactory.getShardHandler();
>   params.set(CoreAdminParams.ACTION, CoreAdminAction.OVERSEEROP.toString());
>   params.set("op", op);
>   params.set("qt", adminPath);
>   params.set("electionNode", electionNode);
>   ShardRequest sreq = new ShardRequest();
>   sreq.purpose = 1;
>   String replica =
>       zkStateReader.getBaseUrlForNodeName(LeaderElector.getNodeName(electionNode));
>   sreq.shards = new String[]{replica};
>   sreq.actualShards = sreq.shards;
>   sreq.params = params;
>   shardHandler.submit(sreq, replica, sreq.params);
>   shardHandler.takeCompletedOrError();
> }
> {code}
> will be replaced with
> {code}
> private void invokeOverseerOp(String electionNode, String op) {
>   HttpRpcFactory factory = null;
>   factory.create()
>       .withHttpMethod(SolrRequest.METHOD.GET)
>       .addParam(CoreAdminParams.ACTION, CoreAdminAction.OVERSEEROP.toString())
>       .addParam("op", op)
>       .addParam("electionNode", electionNode)
>       .addParam(ShardParams.SHARDS_PURPOSE, "1")
>       .withV1Uri(adminPath)
>       .toNode(electionNode)
>       .invoke();
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14712) Standardize RPC calls in Solr

2020-08-06 Thread Noble Paul (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-14712:
--
Description: 
We should have a standard mechanism to make a request to the right replica/node
across Solr code.

This RPC mechanism assumes that
 * The RPC mechanism is HTTP
 * It is aware of all collections, shards & their topology, etc.
 * It knows how to route a request to the correct core

This is agnostic of wire-level formats, Solr documents, etc. That is a layer
above this.

Anyone can use their own JSON parser or any other RPC wire-level format on top
of this.

  was:
We should have a standard mechanism to make a request to the right replica/node
across Solr code.

This RPC mechanism assumes that
* The RPC mechanism is HTTP
* It is aware of all collections, shards & their topology, etc.
* It knows how to route a request to the correct core


> Standardize RPC calls in Solr
> -
>
> Key: SOLR-14712
> URL: https://issues.apache.org/jira/browse/SOLR-14712
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We should have a standard mechanism to make a request to the right
> replica/node across Solr code.
> This RPC mechanism assumes that
>  * The RPC mechanism is HTTP
>  * It is aware of all collections, shards & their topology, etc.
>  * It knows how to route a request to the correct core
>  This is agnostic of wire-level formats, Solr documents, etc. That is a layer
> above this.
> Anyone can use their own JSON parser or any other RPC wire-level format on
> top of this.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8626) standardise test class naming

2020-08-06 Thread Christine Poerschke (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172101#comment-17172101
 ] 

Christine Poerschke commented on LUCENE-8626:
-

bq. This was mentioned and proposed on the dev mailing list. ...

The "Test Harness behaviour on a package run" [thread| 
https://lists.apache.org/thread.html/3a8e38e6ca9abe2e4be0c12cfd23d103cf60d0891c54df45e9c7bf18%40%3Cdev.lucene.apache.org%3E]
 had led to the ticket here at the end of 2018.

Great to see the interest and effort resume via the "Standardize Leading Test 
or Trailing Test" 
[thread|https://lists.apache.org/thread.html/rde0276272a86582c5e6f9456ad592233c9ed575579a0f812d88be486%40%3Cdev.lucene.apache.org%3E]
 now.

> standardise test class naming
> -
>
> Key: LUCENE-8626
> URL: https://issues.apache.org/jira/browse/LUCENE-8626
> Project: Lucene - Core
>  Issue Type: Test
>Reporter: Christine Poerschke
>Priority: Major
> Attachments: SOLR-12939.01.patch, SOLR-12939.02.patch, 
> SOLR-12939.03.patch, SOLR-12939_hoss_validation_groovy_experiment.patch
>
>
> This was mentioned and proposed on the dev mailing list. Starting this ticket 
> here to start to make it happen?
> History: This ticket was created as 
> https://issues.apache.org/jira/browse/SOLR-12939 ticket and then got 
> JIRA-moved to become https://issues.apache.org/jira/browse/LUCENE-8626 ticket.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14712) Standardize RPC calls in Solr

2020-08-06 Thread Noble Paul (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-14712:
--
Description: 
We should have a standard mechanism to make a request to the right replica/node
across Solr code.

This RPC mechanism assumes that
* The RPC mechanism is HTTP
* It is aware of all collections, shards & their topology, etc.
* It knows how to route a request to the correct core

  was:
We should have a standard mechanism to make a request to the right replica/node
across Solr code.

 


> Standardize RPC calls in Solr
> -
>
> Key: SOLR-14712
> URL: https://issues.apache.org/jira/browse/SOLR-14712
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We should have a standard mechanism to make a request to the right
> replica/node across Solr code.
> This RPC mechanism assumes that
> * The RPC mechanism is HTTP
> * It is aware of all collections, shards & their topology, etc.
> * It knows how to route a request to the correct core



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org


