[GitHub] [lucene-solr] NazerkeBS commented on a change in pull request #1527: SOLR-14384 Stack SolrRequestInfo

2020-05-25 Thread GitBox


NazerkeBS commented on a change in pull request #1527:
URL: https://github.com/apache/lucene-solr/pull/1527#discussion_r429763566



##
File path: solr/core/src/java/org/apache/solr/request/SolrRequestInfo.java
##
@@ -38,7 +40,13 @@
 
 
 public class SolrRequestInfo {
-  protected final static ThreadLocal<SolrRequestInfo> threadLocal = new ThreadLocal<>();
+
+  protected final static int capacity = 150;

Review comment:
   Initially I set it to 100, but one test (CloudAuthStreamTest) was failing due to this capacity, so I increased it to 150.
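
   A minimal sketch of the stacking idea under review (the class shape, names, and the use of Object for the element type are illustrative, not the actual patch): a bounded per-thread Deque, where the capacity guards against pushes that never get a matching pop.

import java.util.ArrayDeque;
import java.util.Deque;

public class RequestInfoStack {
  private static final int CAPACITY = 150; // the value discussed above

  private static final ThreadLocal<Deque<Object>> STACK =
      ThreadLocal.withInitial(ArrayDeque::new);

  public static void push(Object info) {
    Deque<Object> stack = STACK.get();
    if (stack.size() >= CAPACITY) {
      // Hitting the cap almost certainly means a push without a matching pop.
      throw new IllegalStateException("request info stack overflow");
    }
    stack.push(info);
  }

  public static Object peek() { return STACK.get().peek(); }

  public static void pop() { STACK.get().pop(); }
}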








[jira] [Commented] (LUCENE-9286) FST arc.copyOf clones BitTables and this can lead to excessive memory use

2020-05-25 Thread Adrien Grand (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17115802#comment-17115802
 ] 

Adrien Grand commented on LUCENE-9286:
--

FYI, I was just digging into a Kuromoji regression introduced in 8.4 that made 
analysis run about 6x slower. Interestingly, the slowdown was on both branch_8_4 
and branch_8_5 but not on branch_8x, and git bisect pointed to this commit 
as the fix for the regression.

> FST arc.copyOf clones BitTables and this can lead to excessive memory use
> -
>
> Key: LUCENE-9286
> URL: https://issues.apache.org/jira/browse/LUCENE-9286
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 8.5
>Reporter: Dawid Weiss
>Assignee: Bruno Roustant
>Priority: Major
> Fix For: 8.6
>
> Attachments: screen-[1].png
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> I see a dramatic increase in the amount of memory required for construction 
> of (arguably large) automata. It currently OOMs with 8GB of memory consumed 
> for bit tables. I am pretty sure this didn't require so much memory before 
> (the automaton is ~50MB after construction).
> Something bad happened in between. Thoughts, [~broustant], [~sokolov]?
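
To illustrate the pattern behind the report (a hand-rolled sketch, not Lucene's actual FST code): deep-copying a per-arc bit table on every copy multiplies memory use, while sharing an immutable table costs nothing per copy.

class ArcSketch {
  long[] bitTable; // can be large for dense nodes

  // Deep copy: every call allocates another table -- the blow-up described here.
  ArcSketch copyOfCloning(ArcSketch other) {
    this.bitTable = (other.bitTable == null) ? null : other.bitTable.clone();
    return this;
  }

  // Shared reference: safe as long as the table is never mutated after
  // construction, and O(1) per copy.
  ArcSketch copyOfSharing(ArcSketch other) {
    this.bitTable = other.bitTable;
    return this;
  }
}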






[jira] [Commented] (LUCENE-9286) FST arc.copyOf clones BitTables and this can lead to excessive memory use

2020-05-25 Thread Bruno Roustant (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17115809#comment-17115809
 ] 

Bruno Roustant commented on LUCENE-9286:


Thanks for the info, Adrien.

> FST arc.copyOf clones BitTables and this can lead to excessive memory use
> -
>
> Key: LUCENE-9286
> URL: https://issues.apache.org/jira/browse/LUCENE-9286
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 8.5
>Reporter: Dawid Weiss
>Assignee: Bruno Roustant
>Priority: Major
> Fix For: 8.6
>
> Attachments: screen-[1].png
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> I see a dramatic increase in the amount of memory required for construction 
> of (arguably large) automata. It currently OOMs with 8GB of memory consumed 
> for bit tables. I am pretty sure this didn't require so much memory before 
> (the automaton is ~50MB after construction).
> Something bad happened in between. Thoughts, [~broustant], [~sokolov]?






[GitHub] [lucene-solr] janhoy commented on a change in pull request #1528: SOLR-12823: remove /clusterstate.json

2020-05-25 Thread GitBox


janhoy commented on a change in pull request #1528:
URL: https://github.com/apache/lucene-solr/pull/1528#discussion_r429775163



##
File path: solr/core/src/java/org/apache/solr/cloud/autoscaling/sim/SimClusterStateProvider.java
##
@@ -790,6 +790,33 @@ public void simRemoveReplica(String nodeId, String collection, String coreNodeNa
   }
 
   /**
+<<< HEAD
+===
+   * Save clusterstate.json to {@link DistribStateManager}.
+   * @return saved state
+   */
+  private ClusterState saveClusterState(ClusterState state) throws IOException {
+    ensureNotClosed();
+
+    // TODO: this method is emptied of its content in order to compile. We're not saving the cluster state that has to be saved collection per collection in separate state.json files.
+    // TODO: DO NOT CHECK THIS IN. Check with AB how to update sim to stateFormat 2
+
+    //byte[] data = Utils.toJSON(state);
+    //try {
+    //  VersionedData oldData = stateManager.getData(ZkStateReader.CLUSTER_STATE);
+    //  int version = oldData != null ? oldData.getVersion() : 0;
+    //  assert clusterStateVersion == version : "local clusterStateVersion out of sync";
+    //  stateManager.setData(ZkStateReader.CLUSTER_STATE, data, version);
+    //  log.debug("** saved cluster state version {}", version);
+    //  clusterStateVersion++;
+    //} catch (Exception e) {
+    //  throw new IOException(e);
+    //}
+    return state;
+  }
+
+  /**
+>>> SOLR-12823: remove /clusterstate.json

Review comment:
   ???

##
File path: solr/core/src/java/org/apache/solr/cloud/ZkController.java
##
@@ -491,6 +494,41 @@ public boolean isClosed() {
     assert ObjectReleaseTracker.track(this);
   }
 
+  /**
+   * Verifies if /clusterstate.json exists in Zookeeper, and if it does and is not empty, refuses to start and outputs
+   * a helpful message regarding collection migration.
+   *
+   * If /clusterstate.json exists and is empty, it is removed.
+   */
+  private void checkNoOldClusterstate(final SolrZkClient zkClient) throws InterruptedException {
+    try {
+      if (!zkClient.exists(ZkStateReader.UNSUPPORTED_CLUSTER_STATE, true)) {
+        return;
+      }
+
+      final byte[] data = zkClient.getData(ZkStateReader.UNSUPPORTED_CLUSTER_STATE, null, null, true);
+
+      if (Arrays.equals("{}".getBytes(StandardCharsets.UTF_8), data)) {
+        // Empty json. This log will only occur once.
+        log.warn("{} no longer supported starting with Solr 9. Found empty file on Zookeeper, deleting it.", ZkStateReader.UNSUPPORTED_CLUSTER_STATE);
+        zkClient.delete(ZkStateReader.UNSUPPORTED_CLUSTER_STATE, -1, true);

Review comment:
   I was thinking of the rolling upgrade scenario - if someone upgrades from 8.x to 9.0 one node at a time, the first node upgraded will delete /clusterstate.json. Will that cause any kind of failures or exceptions in the remaining nodes, if they have a watch on the znode or something?
   
   A way to mitigate it could be to let only the Overseer do the delete, and tell people to upgrade the Overseer last?

##
File path: solr/core/src/java/org/apache/solr/cloud/api/collections/RestoreCmd.java
##
@@ -160,9 +159,6 @@ public void call(ClusterState state, ZkNodeProps message, NamedList results) thr
       Map propMap = new HashMap<>();
       propMap.put(Overseer.QUEUE_OPERATION, CREATE.toString());
       propMap.put("fromApi", "true"); // mostly true.  Prevents autoCreated=true in the collection state.
-      if (properties.get(STATE_FORMAT) == null) {

Review comment:
   What happens if someone backs up an 8.5 collection with stateFormat=1 and then tries to restore it in 9.0? Not very likely, since that collection was probably created pre-7.0 and it would not load in 9.0 anyway. But should we simply throw an exception here if STATE_FORMAT is 1?
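
   A sketch of the guard being suggested (hypothetical helper; the property handling and message are illustrative, not the actual patch):

import org.apache.solr.common.SolrException;

final class RestoreStateFormatCheck {
  static void rejectOldStateFormat(Object stateFormat) {
    if (stateFormat != null && !"2".equals(stateFormat.toString())) {
      throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
          "Collections backed up with stateFormat=" + stateFormat + " cannot be restored in 9.0");
    }
  }
}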

##
File path: solr/solrj/src/java/org/apache/solr/client/solrj/impl/BaseHttpClusterStateProvider.java
##
@@ -138,8 +138,7 @@ private ClusterState fetchClusterState(SolrClient client, String collection, Map
     Set<String> liveNodes = new HashSet<>((List<String>)(cluster.get("live_nodes")));
     this.liveNodes = liveNodes;
     liveNodesTimestamp = System.nanoTime();
-    //TODO SOLR-11877 we don't know the znode path; CLUSTER_STATE is probably wrong leading to bad stateFormat

Review comment:
   Remember to close SOLR-11877 after this

##
File path: solr/solrj/src/java/org/apache/solr/common/cloud/ClusterState.java
##
@@ -210,47 +200,42 @@ public boolean liveNodesContain(String name) {
   @Override
   public String toString() {
     StringBuilder sb = new StringBuilder();
-    sb.append("znodeVersion: ").append(znodeVersion);
-    sb.append("\n");
     sb.append("live nodes:").append(liveNodes);
     sb.append("\n");
     sb.append("collections:").append(collectionStates);
     return sb.toString();
   }
 
-  public static ClusterState load(Integer version, byte[] bytes, Set<String> liveNodes)

[jira] [Commented] (SOLR-14347) Autoscaling placement wrong when concurrent replica placements are calculated

2020-05-25 Thread Andrzej Bialecki (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17115821#comment-17115821
 ] 

Andrzej Bialecki commented on SOLR-14347:
-

[~murblanc] sure, please go ahead.

> Autoscaling placement wrong when concurrent replica placements are calculated
> -
>
> Key: SOLR-14347
> URL: https://issues.apache.org/jira/browse/SOLR-14347
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: AutoScaling
>Affects Versions: 8.5
>Reporter: Andrzej Bialecki
>Assignee: Andrzej Bialecki
>Priority: Critical
> Fix For: 8.6
>
> Attachments: SOLR-14347.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Steps to reproduce:
>  * create a cluster of a few nodes (tested with 7 nodes)
>  * define per-collection policies that distribute replicas exclusively on 
> different nodes per policy
>  * concurrently create a few collections, each using a different policy
>  * resulting replica placement will be seriously wrong, causing many policy 
> violations
> Running the same scenario but instead creating collections sequentially 
> results in no violations.
> I suspect this is caused by incorrect locking level for all collection 
> operations (as defined in {{CollectionParams.CollectionAction}}) that create 
> new replica placements - i.e. CREATE, ADDREPLICA, MOVEREPLICA, DELETENODE, 
> REPLACENODE, SPLITSHARD, RESTORE, REINDEXCOLLECTION. All of these operations 
> use the policy engine to create new replica placements, and as a result they 
> change the cluster state. However, currently these operations are locked (in 
> {{OverseerCollectionMessageHandler.lockTask}} ) using 
> {{LockLevel.COLLECTION}}. In practice this means that the lock is held only 
> for the particular collection that is being modified.
> A straightforward fix for this issue is to change the locking level to 
> CLUSTER (and I confirm this fixes the scenario described above). However, 
> this effectively serializes all collection operations listed above, which 
> will result in general slow-down of all collection operations.
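
A toy sketch of the trade-off (names are illustrative, not the actual lockTask implementation): a COLLECTION-level lock key only serializes operations on the same collection, so two concurrent CREATEs of different collections compute placements from the same stale snapshot, while a CLUSTER-level key serializes them all.

enum LockLevel { CLUSTER, COLLECTION }

final class PlacementLocking {
  static String lockKey(LockLevel level, String collection) {
    // COLLECTION: one lock per collection -> concurrent placements are possible.
    // CLUSTER: a single global lock -> placements are computed one at a time.
    return (level == LockLevel.CLUSTER) ? "/locks/cluster"
                                        : "/locks/collections/" + collection;
  }
}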






[GitHub] [lucene-solr] murblanc commented on a change in pull request #1528: SOLR-12823: remove /clusterstate.json

2020-05-25 Thread GitBox


murblanc commented on a change in pull request #1528:
URL: https://github.com/apache/lucene-solr/pull/1528#discussion_r429812067



##
File path: solr/core/src/java/org/apache/solr/cloud/api/collections/RestoreCmd.java
##
@@ -160,9 +159,6 @@ public void call(ClusterState state, ZkNodeProps message, NamedList results) thr
       Map propMap = new HashMap<>();
       propMap.put(Overseer.QUEUE_OPERATION, CREATE.toString());
       propMap.put("fromApi", "true"); // mostly true.  Prevents autoCreated=true in the collection state.
-      if (properties.get(STATE_FORMAT) == null) {

Review comment:
   Good point. Will update.








[GitHub] [lucene-solr] murblanc commented on a change in pull request #1528: SOLR-12823: remove /clusterstate.json

2020-05-25 Thread GitBox


murblanc commented on a change in pull request #1528:
URL: https://github.com/apache/lucene-solr/pull/1528#discussion_r429811700



##
File path: solr/core/src/java/org/apache/solr/cloud/ZkController.java
##
@@ -491,6 +494,41 @@ public boolean isClosed() {
     assert ObjectReleaseTracker.track(this);
   }
 
+  /**
+   * Verifies if /clusterstate.json exists in Zookeeper, and if it does and is not empty, refuses to start and outputs
+   * a helpful message regarding collection migration.
+   *
+   * If /clusterstate.json exists and is empty, it is removed.
+   */
+  private void checkNoOldClusterstate(final SolrZkClient zkClient) throws InterruptedException {
+    try {
+      if (!zkClient.exists(ZkStateReader.UNSUPPORTED_CLUSTER_STATE, true)) {
+        return;
+      }
+
+      final byte[] data = zkClient.getData(ZkStateReader.UNSUPPORTED_CLUSTER_STATE, null, null, true);
+
+      if (Arrays.equals("{}".getBytes(StandardCharsets.UTF_8), data)) {
+        // Empty json. This log will only occur once.
+        log.warn("{} no longer supported starting with Solr 9. Found empty file on Zookeeper, deleting it.", ZkStateReader.UNSUPPORTED_CLUSTER_STATE);
+        zkClient.delete(ZkStateReader.UNSUPPORTED_CLUSTER_STATE, -1, true);

Review comment:
   I don't think that's an issue: if /clusterstate.json is non-empty, no node running 9.0 will start.
   If /clusterstate.json exists and is empty, the first starting node on 9.0 will delete it as it starts.
   8.x nodes that might start afterwards (who knows) will, I believe, recreate the file (which will be deleted again when a 9.0 node starts).








[GitHub] [lucene-solr] murblanc commented on a change in pull request #1528: SOLR-12823: remove /clusterstate.json

2020-05-25 Thread GitBox


murblanc commented on a change in pull request #1528:
URL: https://github.com/apache/lucene-solr/pull/1528#discussion_r429815739



##
File path: solr/core/src/test/org/apache/solr/cloud/OverseerTest.java
##
@@ -181,16 +180,21 @@ public void close() {
   zkStateReader.close();
 }
 
+/**
+ * Create a collection.
+ * Note there's a similar but slightly different {@link OverseerTest#createCollection(String, int)}.

Review comment:
   I hesitated. There are a few variations in how collections are created. I don't know if these variations are on purpose or end up doing the same thing and are there for historical reasons, and I tried to keep the tests as identical to the way they were before as possible. IIRC the method was created where previously the code was just duplicated around, so there's some progress :), but I didn't want to change the logic of any test beyond what was strictly necessary.

##
File path: solr/core/src/java/org/apache/solr/cloud/autoscaling/sim/SimClusterStateProvider.java
##
@@ -790,6 +790,33 @@ public void simRemoveReplica(String nodeId, String collection, String coreNodeNa
   }
 
   /**
+<<< HEAD
+===
+   * Save clusterstate.json to {@link DistribStateManager}.
+   * @return saved state
+   */
+  private ClusterState saveClusterState(ClusterState state) throws IOException {
+    ensureNotClosed();
+
+    // TODO: this method is emptied of its content in order to compile. We're not saving the cluster state that has to be saved collection per collection in separate state.json files.
+    // TODO: DO NOT CHECK THIS IN. Check with AB how to update sim to stateFormat 2
+
+    //byte[] data = Utils.toJSON(state);
+    //try {
+    //  VersionedData oldData = stateManager.getData(ZkStateReader.CLUSTER_STATE);
+    //  int version = oldData != null ? oldData.getVersion() : 0;
+    //  assert clusterStateVersion == version : "local clusterStateVersion out of sync";
+    //  stateManager.setData(ZkStateReader.CLUSTER_STATE, data, version);
+    //  log.debug("** saved cluster state version {}", version);
+    //  clusterStateVersion++;
+    //} catch (Exception e) {
+    //  throw new IOException(e);
+    //}
+    return state;
+  }
+
+  /**
+>>> SOLR-12823: remove /clusterstate.json

Review comment:
   Ouch.








[GitHub] [lucene-solr] murblanc commented on a change in pull request #1528: SOLR-12823: remove /clusterstate.json

2020-05-25 Thread GitBox


murblanc commented on a change in pull request #1528:
URL: https://github.com/apache/lucene-solr/pull/1528#discussion_r429816694



##
File path: solr/core/src/test/org/apache/solr/cloud/TestZkChroot.java
##
@@ -27,8 +27,10 @@
 import org.apache.solr.core.CoreContainer;
 import org.junit.After;
 import org.junit.Before;
+import org.junit.Ignore;
 import org.junit.Test;
 
+// TODO: this class tries to test Zookeeper using Solr abstractions, but ZK implies the code is running in cloud mode. It doesn't work.

Review comment:
   I agree (on the followup option).
   I tried to see if there's a way to do the chroot on the mini cluster or elsewhere, but nothing obvious came up after an hour or two of hacking; that's why I suggest leaving it for later.








[GitHub] [lucene-solr] murblanc commented on a change in pull request #1528: SOLR-12823: remove /clusterstate.json

2020-05-25 Thread GitBox


murblanc commented on a change in pull request #1528:
URL: https://github.com/apache/lucene-solr/pull/1528#discussion_r429817130



##
File path: solr/solrj/src/java/org/apache/solr/client/solrj/impl/BaseHttpClusterStateProvider.java
##
@@ -138,8 +138,7 @@ private ClusterState fetchClusterState(SolrClient client, String collection, Map
     Set<String> liveNodes = new HashSet<>((List<String>)(cluster.get("live_nodes")));
     this.liveNodes = liveNodes;
     liveNodesTimestamp = System.nanoTime();
-    //TODO SOLR-11877 we don't know the znode path; CLUSTER_STATE is probably wrong leading to bad stateFormat

Review comment:
   Yes, it's mentioned in the PR description.








[GitHub] [lucene-solr] murblanc commented on a change in pull request #1528: SOLR-12823: remove /clusterstate.json

2020-05-25 Thread GitBox


murblanc commented on a change in pull request #1528:
URL: https://github.com/apache/lucene-solr/pull/1528#discussion_r429817725



##
File path: solr/solrj/src/java/org/apache/solr/common/cloud/ClusterState.java
##
@@ -210,47 +200,42 @@ public boolean liveNodesContain(String name) {
   @Override
   public String toString() {
     StringBuilder sb = new StringBuilder();
-    sb.append("znodeVersion: ").append(znodeVersion);
-    sb.append("\n");
     sb.append("live nodes:").append(liveNodes);
     sb.append("\n");
     sb.append("collections:").append(collectionStates);
     return sb.toString();
   }
 
-  public static ClusterState load(Integer version, byte[] bytes, Set<String> liveNodes) {
-    return load(version, bytes, liveNodes, ZkStateReader.CLUSTER_STATE);
-  }
   /**
-   * Create ClusterState from json string that is typically stored in zookeeper.
+   * Create a ClusterState from Json.
    * 
-   * @param version zk version of the clusterstate.json file (bytes)
-   * @param bytes clusterstate.json as a byte array
+   * @param bytes a byte array of a Json representation of a mapping from collection name to the Json representation of a
+   *          {@link DocCollection} as written by {@link #write(JSONWriter)}. It can represent one or more collections.
    * @param liveNodes list of live nodes
    * @return the ClusterState
    */
-  public static ClusterState load(Integer version, byte[] bytes, Set<String> liveNodes, String znode) {
-    // System.out.println(" ClusterState.load:" + (bytes==null ? null : new String(bytes)));
+  public static ClusterState createFromJson(int version, byte[] bytes, Set<String> liveNodes) {
     if (bytes == null || bytes.length == 0) {
-      return new ClusterState(version, liveNodes, Collections.emptyMap());
+      return new ClusterState(liveNodes, Collections.emptyMap());
    }
     Map<String, Object> stateMap = (Map<String, Object>) Utils.fromJSON(bytes);
-    return load(version, stateMap, liveNodes, znode);
+    return createFromData(version, stateMap, liveNodes);
   }
 
-  public static ClusterState load(Integer version, Map<String, Object> stateMap, Set<String> liveNodes, String znode) {
+  public static ClusterState createFromData(int version, Map<String, Object> stateMap, Set<String> liveNodes) {

Review comment:
   As you wish, let me know and I'll change it as I rework the other 
comments.
   `createFromIntMapAndSet` :)








[GitHub] [lucene-solr] janhoy commented on a change in pull request #1528: SOLR-12823: remove /clusterstate.json

2020-05-25 Thread GitBox


janhoy commented on a change in pull request #1528:
URL: https://github.com/apache/lucene-solr/pull/1528#discussion_r429829357



##
File path: solr/core/src/java/org/apache/solr/cloud/ZkController.java
##
@@ -491,6 +494,41 @@ public boolean isClosed() {
     assert ObjectReleaseTracker.track(this);
   }
 
+  /**
+   * Verifies if /clusterstate.json exists in Zookeeper, and if it does and is not empty, refuses to start and outputs
+   * a helpful message regarding collection migration.
+   *
+   * If /clusterstate.json exists and is empty, it is removed.
+   */
+  private void checkNoOldClusterstate(final SolrZkClient zkClient) throws InterruptedException {
+    try {
+      if (!zkClient.exists(ZkStateReader.UNSUPPORTED_CLUSTER_STATE, true)) {
+        return;
+      }
+
+      final byte[] data = zkClient.getData(ZkStateReader.UNSUPPORTED_CLUSTER_STATE, null, null, true);
+
+      if (Arrays.equals("{}".getBytes(StandardCharsets.UTF_8), data)) {
+        // Empty json. This log will only occur once.
+        log.warn("{} no longer supported starting with Solr 9. Found empty file on Zookeeper, deleting it.", ZkStateReader.UNSUPPORTED_CLUSTER_STATE);
+        zkClient.delete(ZkStateReader.UNSUPPORTED_CLUSTER_STATE, -1, true);

Review comment:
   Running 8.x nodes might get a callback since they have a watch. I have not checked the callback handling code, but if the code doesn't handle the DELETE operation then either nothing happens or an exception is thrown...
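
   A sketch of the failure mode in question, using the plain ZooKeeper Watcher API (the handler shown is hypothetical, not Solr's actual callback code):

import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;

class ClusterStateWatcher implements Watcher {
  @Override
  public void process(WatchedEvent event) {
    switch (event.getType()) {
      case NodeDataChanged:
        // re-read /clusterstate.json and reset the watch
        break;
      case NodeDeleted:
        // An 8.x handler without this branch would either ignore the event
        // or fail when it tries to re-read the now-missing znode.
        break;
      default:
        break;
    }
  }
}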








[GitHub] [lucene-solr] janhoy commented on a change in pull request #1528: SOLR-12823: remove /clusterstate.json

2020-05-25 Thread GitBox


janhoy commented on a change in pull request #1528:
URL: https://github.com/apache/lucene-solr/pull/1528#discussion_r429830491



##
File path: solr/core/src/test/org/apache/solr/cloud/OverseerTest.java
##
@@ -181,16 +180,21 @@ public void close() {
   zkStateReader.close();
 }
 
+/**
+ * Create a collection.
+ * Note there's a similar but slightly different {@link OverseerTest#createCollection(String, int)}.

Review comment:
   Ok. If/when the big Overseer / Curator rewrite happens, I guess it will be cleaned up then...








[GitHub] [lucene-solr] janhoy commented on a change in pull request #1528: SOLR-12823: remove /clusterstate.json

2020-05-25 Thread GitBox


janhoy commented on a change in pull request #1528:
URL: https://github.com/apache/lucene-solr/pull/1528#discussion_r429831171



##
File path: solr/solrj/src/java/org/apache/solr/common/cloud/ClusterState.java
##
@@ -210,47 +200,42 @@ public boolean liveNodesContain(String name) {
   @Override
   public String toString() {
     StringBuilder sb = new StringBuilder();
-    sb.append("znodeVersion: ").append(znodeVersion);
-    sb.append("\n");
     sb.append("live nodes:").append(liveNodes);
     sb.append("\n");
     sb.append("collections:").append(collectionStates);
     return sb.toString();
   }
 
-  public static ClusterState load(Integer version, byte[] bytes, Set<String> liveNodes) {
-    return load(version, bytes, liveNodes, ZkStateReader.CLUSTER_STATE);
-  }
   /**
-   * Create ClusterState from json string that is typically stored in zookeeper.
+   * Create a ClusterState from Json.
    * 
-   * @param version zk version of the clusterstate.json file (bytes)
-   * @param bytes clusterstate.json as a byte array
+   * @param bytes a byte array of a Json representation of a mapping from collection name to the Json representation of a
+   *          {@link DocCollection} as written by {@link #write(JSONWriter)}. It can represent one or more collections.
    * @param liveNodes list of live nodes
    * @return the ClusterState
    */
-  public static ClusterState load(Integer version, byte[] bytes, Set<String> liveNodes, String znode) {
-    // System.out.println(" ClusterState.load:" + (bytes==null ? null : new String(bytes)));
+  public static ClusterState createFromJson(int version, byte[] bytes, Set<String> liveNodes) {
     if (bytes == null || bytes.length == 0) {
-      return new ClusterState(version, liveNodes, Collections.emptyMap());
+      return new ClusterState(liveNodes, Collections.emptyMap());
     }
     Map<String, Object> stateMap = (Map<String, Object>) Utils.fromJSON(bytes);
-    return load(version, stateMap, liveNodes, znode);
+    return createFromData(version, stateMap, liveNodes);
   }
 
-  public static ClusterState load(Integer version, Map<String, Object> stateMap, Set<String> liveNodes, String znode) {
+  public static ClusterState createFromData(int version, Map<String, Object> stateMap, Set<String> liveNodes) {

Review comment:
   I don't have strong feelings, just that 'data' could be anything :)








[jira] [Commented] (LUCENE-9378) Configurable compression for BinaryDocValues

2020-05-25 Thread Adrien Grand (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17115885#comment-17115885
 ] 

Adrien Grand commented on LUCENE-9378:
--

bq. Another factor that probably plays a role here is how compressible the data 
is

I looked a bit more into the data we use for benchmarking. wikibigall already 
makes titles quite compressible given how linefile docs are sorted by title in 
the input file. wikimedium makes it way more compressible given how it splits 
articles into 1kB chunks that all share the same title, creating many duplicate 
values for adjacent doc IDs. This makes wikimedium titles a best-case scenario 
for compression (and thus a worst-case scenario for search speed) and I'd 
expect performance numbers to be significantly different between wikimediumall 
and wikibigall, and again between wikibigall and a shuffled copy of wikibigall 
that would no longer sort by title.

Are wikimedium titles representative of the data that you are indexing into 
binary doc values at Amazon, i.e. are adjacent doc IDs likely to get the exact 
same value? If that's the case, then we could probably add ad-hoc compression 
for this case, which would have a better runtime than LZ4, and we could 
automatically make the decision at index time instead of requiring users to 
configure a flag.
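
A sketch of the kind of ad-hoc scheme floated here (illustrative only, not a proposed codec): when adjacent doc IDs repeat the exact same bytes, run-length encoding stores each distinct value once, and decoding is a plain lookup with no general-purpose decompression.

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class AdjacentDedup {
  static final class Run {
    final byte[] value;
    int count;
    Run(byte[] value) { this.value = value; this.count = 1; }
  }

  // Collapse per-document values into (value, runLength) pairs.
  static List<Run> encode(List<byte[]> perDocValues) {
    List<Run> runs = new ArrayList<>();
    for (byte[] v : perDocValues) {
      if (!runs.isEmpty() && Arrays.equals(runs.get(runs.size() - 1).value, v)) {
        runs.get(runs.size() - 1).count++;
      } else {
        runs.add(new Run(v));
      }
    }
    return runs;
  }
}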

> Configurable compression for BinaryDocValues
> 
>
> Key: LUCENE-9378
> URL: https://issues.apache.org/jira/browse/LUCENE-9378
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Viral Gandhi
>Priority: Minor
>
> Lucene 8.5.1 includes a change to always [compress 
> BinaryDocValues|https://issues.apache.org/jira/browse/LUCENE-9211]. This 
> caused a ~30% reduction in our red-line QPS (throughput). 
> We think users should be given some way to opt in to this compression 
> feature instead of it always being enabled, which can have a substantial 
> query-time cost, as we saw during our upgrade. [~mikemccand] suggested one 
> possible approach: introducing a *mode* in Lucene80DocValuesFormat 
> (COMPRESSED and UNCOMPRESSED) and allowing users to create a custom Codec 
> subclassing the default Codec and picking the format they want.
> The idea is similar to Lucene50StoredFieldsFormat, which has two modes, 
> Mode.BEST_SPEED and Mode.BEST_COMPRESSION.
> Here's a related issue for adding a benchmark covering BINARY doc values 
> query-time performance - [https://github.com/mikemccand/luceneutil/issues/61]






[jira] [Commented] (SOLR-14498) BlockCache gets stuck not accepting new stores

2020-05-25 Thread Andrzej Bialecki (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17115903#comment-17115903
 ] 

Andrzej Bialecki commented on SOLR-14498:
-

[~jakubzytka] If I understand this correctly, simply upgrading to Caffeine 2.8.4 
should solve the problem?

> BlockCache gets stuck not accepting new stores
> --
>
> Key: SOLR-14498
> URL: https://issues.apache.org/jira/browse/SOLR-14498
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: query
>Affects Versions: 6.5, 6.6.5, master (9.0), 7.7.3, 8.5.1
>Reporter: Jakub Zytka
>Assignee: Andrzej Bialecki
>Priority: Major
>
> {{BlockCache}} uses two components: "storage", i.e. {{banks}} and "eviction 
> mechanism", i.e {{cache}}, implemented by caffeine cache.
> The relation between them is that "storage" enforces a strict limit for the 
> number of entries (
> {{numberOfBlocksPerBank * numberOfBanks}}) whereas the "eviction mechanism" 
> takes care of freeing entries from the storage thanks to {{maximumSize}} set 
> for the caffeine cache to {{numberOfBlocksPerBank * numberOfBanks - 1}}.
> The storage relies on caffeine cache to eventually free at least 1 entry from 
> the storage. If that doesn't happen the {{BlockCache}} starts to fail all new 
> stores.
> As it turns out, the caffeine cache may not reduce its size to the desired 
> {{maximumSize}} for as long as no {{put}} or {{getIfPresent}} which *finds an 
> entry* is executed.
> With a sufficiently unlucky read pattern, the block cache may be rendered 
> useless (0 hit ratio):
> cache poisoned by non-reusable entries; new, reusable entries are not stored 
> and thus not reused.
> Further info may be found in 
> [https://github.com/ben-manes/caffeine/issues/420]
>  
> The change in caffeine that triggers its internal cleanup mechanism regardless 
> of whether getIfPresent gets a hit has been implemented in 
> [https://github.com/ben-manes/caffeine/commit/7239bb0dda2af1e7301e8f66a5df28215b5173bc]
> and is due to be released in caffeine 2.8.4
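
A sketch of the interaction described above, using Caffeine's public API (the block-cache wiring is illustrative, not Solr's actual BlockCache): before 2.8.4 a getIfPresent() miss does not run pending maintenance, so an explicit cleanUp() is one way to force the evictions that free storage slots.

import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.RemovalCause;

class BlockCacheSketch {
  static final int MAX_BLOCKS = 1024; // stands in for numberOfBlocksPerBank * numberOfBanks

  final Cache<Long, byte[]> cache = Caffeine.newBuilder()
      .maximumSize(MAX_BLOCKS - 1) // eviction is what frees the fixed-size slots
      .removalListener((Long id, byte[] block, RemovalCause cause) -> releaseSlot(id))
      .build();

  byte[] fetch(long blockId) {
    byte[] block = cache.getIfPresent(blockId);
    if (block == null) {
      // Pre-2.8.4 workaround: force pending maintenance on a miss so that
      // evictions (and thus slot reuse) are not deferred indefinitely.
      cache.cleanUp();
    }
    return block;
  }

  void releaseSlot(long blockId) {
    // return the fixed-size storage slot to the free list
  }
}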






[GitHub] [lucene-solr] sigram commented on a change in pull request #1506: SOLR-14470: Add streaming expressions to /export handler

2020-05-25 Thread GitBox


sigram commented on a change in pull request #1506:
URL: https://github.com/apache/lucene-solr/pull/1506#discussion_r429847378



##
File path: solr/core/src/java/org/apache/solr/handler/export/ExportWriter.java
##
@@ -216,14 +376,53 @@ public void write(OutputStream os) throws IOException {
       return;
     }
 
+    String expr = params.get(StreamParams.EXPR);
+    if (expr != null) {
+      StreamFactory streamFactory = initialStreamContext.getStreamFactory();
+      try {
+        StreamExpression expression = StreamExpressionParser.parse(expr);
+        if (streamFactory.isEvaluator(expression)) {
+          streamExpression = new StreamExpression(StreamParams.TUPLE);
+          streamExpression.addParameter(new StreamExpressionNamedParameter(StreamParams.RETURN_VALUE, expression));
+        } else {
+          streamExpression = expression;
+        }
+      } catch (Exception e) {
+        writeException(e, writer, true);
+        return;
+      }
+      streamContext = new StreamContext();
+      streamContext.setRequestParams(params);
+      // nocommit enforce this?
+      streamContext.setLocal(true);
+
+      streamContext.workerID = 0;
+      streamContext.numWorkers = 1;
+      streamContext.setSolrClientCache(initialStreamContext.getSolrClientCache());
+      streamContext.setModelCache(initialStreamContext.getModelCache());
+      streamContext.setObjectCache(initialStreamContext.getObjectCache());
+      streamContext.put("core", req.getCore().getName());

Review comment:
   It's consistent with other places that create StreamContext.








[GitHub] [lucene-solr] sigram commented on a change in pull request #1506: SOLR-14470: Add streaming expressions to /export handler

2020-05-25 Thread GitBox


sigram commented on a change in pull request #1506:
URL: https://github.com/apache/lucene-solr/pull/1506#discussion_r429849300



##
File path: solr/core/src/java/org/apache/solr/handler/export/ExportWriter.java
##
@@ -285,22 +484,48 @@ protected void addDocsToItemWriter(List leaves, IteratorWrite
   protected void writeDocs(SolrQueryRequest req, IteratorWriter.ItemWriter writer, Sort sort) throws IOException {
     List<LeafReaderContext> leaves = req.getSearcher().getTopReaderContext().leaves();
     SortDoc sortDoc = getSortDoc(req.getSearcher(), sort.getSort());
-    int count = 0;
     final int queueSize = Math.min(DOCUMENT_BATCH_SIZE, totalHits);
 
     SortQueue queue = new SortQueue(queueSize, sortDoc);
     SortDoc[] outDocs = new SortDoc[queueSize];
 
-    while (count < totalHits) {
-      identifyLowestSortingUnexportedDocs(leaves, sortDoc, queue);
-      int outDocsIndex = transferBatchToArrayForOutput(queue, outDocs);
-
-      count += (outDocsIndex + 1);
-      addDocsToItemWriter(leaves, writer, outDocs, outDocsIndex);
+    if (streamExpression != null) {
+      streamContext.put(SORT_DOCS_KEY, outDocs);
+      streamContext.put(SORT_QUEUE_KEY, queue);
+      streamContext.put(SORT_DOC_KEY, sortDoc);
+      streamContext.put(TOTAL_HITS_KEY, totalHits);
+      streamContext.put(EXPORT_WRITER_KEY, this);
+      streamContext.put(LEAF_READERS_KEY, leaves);
+      TupleStream tupleStream = createTupleStream();
+      tupleStream.open();
+      for (;;) {
+        final Tuple t = tupleStream.read();
+        if (t == null) {
+          break;
+        }
+        if (t.EOF) {
+          break;
+        }
+        writer.add((MapWriter) ew -> t.writeMap(ew));
+      }
+      tupleStream.close();
+    } else {
+      int count = 0;
+      while (count < totalHits) {

Review comment:
   It's the same structure as in `writeDocs(...)`. Yeah, I can change it in 
both places.








[GitHub] [lucene-solr] sigram commented on a change in pull request #1506: SOLR-14470: Add streaming expressions to /export handler

2020-05-25 Thread GitBox


sigram commented on a change in pull request #1506:
URL: https://github.com/apache/lucene-solr/pull/1506#discussion_r429849300



##
File path: solr/core/src/java/org/apache/solr/handler/export/ExportWriter.java
##
@@ -285,22 +484,48 @@ protected void addDocsToItemWriter(List leaves, IteratorWrite
   protected void writeDocs(SolrQueryRequest req, IteratorWriter.ItemWriter writer, Sort sort) throws IOException {
     List<LeafReaderContext> leaves = req.getSearcher().getTopReaderContext().leaves();
     SortDoc sortDoc = getSortDoc(req.getSearcher(), sort.getSort());
-    int count = 0;
     final int queueSize = Math.min(DOCUMENT_BATCH_SIZE, totalHits);
 
     SortQueue queue = new SortQueue(queueSize, sortDoc);
     SortDoc[] outDocs = new SortDoc[queueSize];
 
-    while (count < totalHits) {
-      identifyLowestSortingUnexportedDocs(leaves, sortDoc, queue);
-      int outDocsIndex = transferBatchToArrayForOutput(queue, outDocs);
-
-      count += (outDocsIndex + 1);
-      addDocsToItemWriter(leaves, writer, outDocs, outDocsIndex);
+    if (streamExpression != null) {
+      streamContext.put(SORT_DOCS_KEY, outDocs);
+      streamContext.put(SORT_QUEUE_KEY, queue);
+      streamContext.put(SORT_DOC_KEY, sortDoc);
+      streamContext.put(TOTAL_HITS_KEY, totalHits);
+      streamContext.put(EXPORT_WRITER_KEY, this);
+      streamContext.put(LEAF_READERS_KEY, leaves);
+      TupleStream tupleStream = createTupleStream();
+      tupleStream.open();
+      for (;;) {
+        final Tuple t = tupleStream.read();
+        if (t == null) {
+          break;
+        }
+        if (t.EOF) {
+          break;
+        }
+        writer.add((MapWriter) ew -> t.writeMap(ew));
+      }
+      tupleStream.close();
+    } else {
+      int count = 0;
+      while (count < totalHits) {

Review comment:
   It's the same structure as in `writeDocs(...)`. Yeah, I can change it in 
both places.
   
   Edit: hah, that's the one in writeDocs ... I forgot I refactored the other 
one already. Sure, I can change it.








[GitHub] [lucene-solr] sigram commented on a change in pull request #1506: SOLR-14470: Add streaming expressions to /export handler

2020-05-25 Thread GitBox


sigram commented on a change in pull request #1506:
URL: https://github.com/apache/lucene-solr/pull/1506#discussion_r429852167



##
File path: 
solr/core/src/java/org/apache/solr/handler/sql/FilterCalciteConnection.java
##
@@ -0,0 +1,382 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.handler.sql;
+
+import java.lang.reflect.Type;
+import java.sql.Array;
+import java.sql.Blob;
+import java.sql.CallableStatement;
+import java.sql.Clob;
+import java.sql.DatabaseMetaData;
+import java.sql.NClob;
+import java.sql.PreparedStatement;
+import java.sql.SQLClientInfoException;
+import java.sql.SQLException;
+import java.sql.SQLWarning;
+import java.sql.SQLXML;
+import java.sql.Savepoint;
+import java.sql.Statement;
+import java.sql.Struct;
+import java.util.Map;
+import java.util.Properties;
+import java.util.concurrent.Executor;
+
+import org.apache.calcite.adapter.java.JavaTypeFactory;
+import org.apache.calcite.config.CalciteConnectionConfig;
+import org.apache.calcite.jdbc.CalciteConnection;
+import org.apache.calcite.jdbc.CalcitePrepare;
+import org.apache.calcite.linq4j.Enumerator;
+import org.apache.calcite.linq4j.Queryable;
+import org.apache.calcite.linq4j.tree.Expression;
+import org.apache.calcite.schema.SchemaPlus;
+
+/**
+ * A filter that contains another {@link CalciteConnection} and
+ * allows adding pre- post-method behaviors.
+ */
+class FilterCalciteConnection implements CalciteConnection {

Review comment:
   This is no longer needed; not sure how it ended up here.








[GitHub] [lucene-solr] sigram commented on a change in pull request #1506: SOLR-14470: Add streaming expressions to /export handler

2020-05-25 Thread GitBox


sigram commented on a change in pull request #1506:
URL: https://github.com/apache/lucene-solr/pull/1506#discussion_r429854938



##
File path: solr/solrj/src/java/org/apache/solr/client/solrj/io/Tuple.java
##
@@ -44,24 +45,35 @@
   public boolean EOF;
   public boolean EXCEPTION;
 
-  public Map fields = new HashMap();
+  public Map fields = new HashMap<>(2);
   public List fieldNames;
   public Map fieldLabels;
 
-  public Tuple(){
+  public Tuple() {
 // just an empty tuple
   }
   
-  public Tuple(Map fields) {
-if(fields.containsKey("EOF")) {
-  EOF = true;
+  public Tuple(Map fields) {

Review comment:
   Added.








[GitHub] [lucene-solr] sigram commented on a change in pull request #1506: SOLR-14470: Add streaming expressions to /export handler

2020-05-25 Thread GitBox


sigram commented on a change in pull request #1506:
URL: https://github.com/apache/lucene-solr/pull/1506#discussion_r429854808



##
File path: solr/solrj/src/java/org/apache/solr/common/params/StreamParams.java
##
@@ -0,0 +1,41 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.common.params;
+
+/**
+ * Stream Parameters and Properties.

Review comment:
   +1








[GitHub] [lucene-solr] sigram commented on a change in pull request #1506: SOLR-14470: Add streaming expressions to /export handler

2020-05-25 Thread GitBox


sigram commented on a change in pull request #1506:
URL: https://github.com/apache/lucene-solr/pull/1506#discussion_r429854601



##
File path: solr/solrj/src/java/org/apache/solr/client/solrj/io/stream/StatsStream.java
##
@@ -266,91 +252,106 @@ public void close() throws IOException {
   }
 
   public Tuple read() throws IOException {
-    if(index == 0) {
-      ++index;
+    if(!done) {
+      done = true;
       return tuple;
     } else {
-      Map fields = new HashMap();
-      fields.put("EOF", true);
-      Tuple tuple = new Tuple(fields);
-      return tuple;
+      return Tuple.EOF();
     }
   }
 
-  private String getJsonFacetString(Metric[] _metrics) {
-    StringBuilder buf = new StringBuilder();
-    appendJson(buf, _metrics);
-    return "{"+buf.toString()+"}";
+  public StreamComparator getStreamSort() {
+    return null;
   }
 
-  private void appendJson(StringBuilder buf,
-                          Metric[] _metrics) {
-
-    int metricCount = 0;
+  private void addStats(ModifiableSolrParams params, Metric[] _metrics) {
+    Map<String, List<String>> m = new HashMap<>();
     for(Metric metric : _metrics) {
-      String identifier = metric.getIdentifier();
-      if(!identifier.startsWith("count(")) {
-        if(metricCount>0) {
-          buf.append(",");
+      String metricId = metric.getIdentifier();
+      if(metricId.contains("(")) {
+        metricId = metricId.substring(0, metricId.length()-1);
+        String[] parts = metricId.split("\\(");
+        String function = parts[0];
+        String column = parts[1];
+        List<String> stats = m.get(column);
+
+        if(stats == null) {
+          stats = new ArrayList<>();
        }
-        if(identifier.startsWith("per(")) {
-          buf.append("\"facet_").append(metricCount).append("\":\"").append(identifier.replaceFirst("per", "percentile")).append('"');
-        } else if(identifier.startsWith("std(")) {
-          buf.append("\"facet_").append(metricCount).append("\":\"").append(identifier.replaceFirst("std", "stddev")).append('"');
-        } else {
-          buf.append("\"facet_").append(metricCount).append("\":\"").append(identifier).append('"');
+
+        if(!column.equals("*")) {
+          m.put(column, stats);
+        }
+
+        if(function.equals("min")) {
+          stats.add("min");
+        } else if(function.equals("max")) {
+          stats.add("max");
+        } else if(function.equals("sum")) {
+          stats.add("sum");
+        } else if(function.equals("avg")) {
+          stats.add("mean");
+        } else if(function.equals("count")) {
+          this.doCount = true;
        }
-        ++metricCount;
      }
     }
-  }
 
-  private void getTuples(NamedList response,
-                         Metric[] metrics) {
+    for(Entry<String, List<String>> entry : m.entrySet()) {
+      StringBuilder buf = new StringBuilder();
+      List<String> stats = entry.getValue();
+      buf.append("{!");
+
+      for(String stat : stats) {
+        buf.append(stat).append("=").append("true ");
+      }
 
-    this.tuple = new Tuple(new HashMap());
-    NamedList facets = (NamedList)response.get("facets");
-    fillTuple(tuple, facets, metrics);
+      buf.append("}").append(entry.getKey());
+      params.add("stats.field", buf.toString());
+    }
   }
 
-  private void fillTuple(Tuple t,
-                         NamedList nl,
-                         Metric[] _metrics) {
+  private Tuple getTuple(NamedList response) {
+    Tuple tuple = new Tuple();
+    SolrDocumentList solrDocumentList = (SolrDocumentList) response.get("response");
+
+    long count = solrDocumentList.getNumFound();
 
-    if(nl == null) {
-      return;
+    if(doCount) {
+      tuple.put("count(*)", count);
     }
 
-    int m = 0;
-    for(Metric metric : _metrics) {
-      String identifier = metric.getIdentifier();
-      if(!identifier.startsWith("count(")) {
-        if(nl.get("facet_"+m) != null) {
-          Object d = nl.get("facet_" + m);
-          if(d instanceof Number) {
-            if (metric.outputLong) {
-              t.put(identifier, Math.round(((Number)d).doubleValue()));
-            } else {
-              t.put(identifier, ((Number)d).doubleValue());
-            }
-          } else {
-            t.put(identifier, d);
-          }
+    if(count != 0) {
+      NamedList stats = (NamedList)response.get("stats");
+      NamedList statsFields = (NamedList)stats.get("stats_fields");
+
+      for(int i=0; i

[GitHub] [lucene-solr] radu-gheorghe opened a new pull request #1536: Documented node.sysprop shard preference

2020-05-25 Thread GitBox


radu-gheorghe opened a new pull request #1536:
URL: https://github.com/apache/lucene-solr/pull/1536


   For https://issues.apache.org/jira/browse/SOLR-13445
   






[jira] [Created] (SOLR-14511) Add documentation for node.sysprop shard preference

2020-05-25 Thread Radu Gheorghe (Jira)
Radu Gheorghe created SOLR-14511:


 Summary: Add documentation for node.sysprop shard preference
 Key: SOLR-14511
 URL: https://issues.apache.org/jira/browse/SOLR-14511
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
  Components: documentation
Reporter: Radu Gheorghe


Pull request here: [https://github.com/apache/lucene-solr/pull/1536]






[jira] [Commented] (LUCENE-9352) Add cost function to Scorable

2020-05-25 Thread Adrien Grand (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116086#comment-17116086
 ] 

Adrien Grand commented on LUCENE-9352:
--

+1

> Add cost function to Scorable
> -
>
> Key: LUCENE-9352
> URL: https://issues.apache.org/jira/browse/LUCENE-9352
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Mayya Sharipova
>Priority: Minor
>
> {{Scorable.cost() function could be useful in optimizations.}}
> For example, the ability for collectors to skip non-competitive documents 
> introduced in LUCENE-9280 is based on the cost of the corresponding Scorable.






[jira] [Commented] (SOLR-14498) BlockCache gets stuck not accepting new stores

2020-05-25 Thread Jakub Zytka (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116144#comment-17116144
 ] 

Jakub Zytka commented on SOLR-14498:


Upgrading to caffeine 2.8.4 should solve the problem.

> BlockCache gets stuck not accepting new stores
> --
>
> Key: SOLR-14498
> URL: https://issues.apache.org/jira/browse/SOLR-14498
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: query
>Affects Versions: 6.5, 6.6.5, master (9.0), 7.7.3, 8.5.1
>Reporter: Jakub Zytka
>Assignee: Andrzej Bialecki
>Priority: Major
>
> {{BlockCache}} uses two components: "storage", i.e. {{banks}} and "eviction 
> mechanism", i.e {{cache}}, implemented by caffeine cache.
> The relation between them is that "storage" enforces a strict limit for the 
> number of entries (
> {{numberOfBlocksPerBank * numberOfBanks}}) whereas the "eviction mechanism" 
> takes care of freeing entries from the storage thanks to {{maximumSize}} set 
> for the caffeine cache to {{numberOfBlocksPerBank * numberOfBanks - 1}}.
> The storage relies on caffeine cache to eventually free at least 1 entry from 
> the storage. If that doesn't happen the {{BlockCache}} starts to fail all new 
> stores.
> As it turns out, the caffeine cache may not reduce its size to the desired 
> {{maximumSize}} for as long as no {{put}} or {{getIfPresent}} which *finds an 
> entry* is executed.
> With a sufficiently unlucky read pattern, the block cache may be rendered 
> useless (0 hit ratio):
> cache poisoned by non-reusable entries; new, reusable entries are not stored 
> and thus not reused.
> Further info may be found in 
> [https://github.com/ben-manes/caffeine/issues/420]
>  
> The change in caffeine that triggers its internal cleanup mechanism regardless 
> of whether getIfPresent gets a hit has been implemented in 
> [https://github.com/ben-manes/caffeine/commit/7239bb0dda2af1e7301e8f66a5df28215b5173bc]
> and is due to be released in caffeine 2.8.4






[jira] [Commented] (LUCENE-9378) Configurable compression for BinaryDocValues

2020-05-25 Thread Michael Sokolov (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116167#comment-17116167
 ] 

Michael Sokolov commented on LUCENE-9378:
-

{quote}... binary doc values at Amazon, ie. are adjacent doc IDs likely to get 
the exact same value?
{quote}
Not identical values, but the field we use as a sorting tiebreaker, which is 
binary doc values, will be in sorted order in the index, so adjacent values may 
be very similar: all the same length, drawn from a limited set of characters, 
and maybe sequential ones differing only in the last character or few. Highly 
compressible, I would think? And we probably usually only need a relatively 
small number from each block.
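
For concreteness, that access pattern is a good fit for front coding (shared-prefix compression), sketched here under the stated assumptions about the data; the values themselves are made up:

import java.nio.charset.StandardCharsets;

final class FrontCoding {
  static int sharedPrefix(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length), i = 0;
    while (i < n && a[i] == b[i]) i++;
    return i;
  }

  public static void main(String[] args) {
    byte[] prev = "order-000123".getBytes(StandardCharsets.UTF_8);
    byte[] next = "order-000124".getBytes(StandardCharsets.UTF_8);
    int prefix = sharedPrefix(prev, next);
    // Store only (prefixLen, suffix): 11 shared bytes + 1 suffix byte
    // instead of the full 12 bytes.
    System.out.println(prefix + " shared + " + (next.length - prefix) + " suffix byte(s)");
  }
}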

> Configurable compression for BinaryDocValues
> 
>
> Key: LUCENE-9378
> URL: https://issues.apache.org/jira/browse/LUCENE-9378
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Viral Gandhi
>Priority: Minor
>
> Lucene 8.5.1 includes a change to always [compress 
> BinaryDocValues|https://issues.apache.org/jira/browse/LUCENE-9211]. This 
> caused a ~30% reduction in our red-line QPS (throughput). 
> We think users should be given some way to opt in to this compression 
> feature instead of it always being enabled, which can have a substantial 
> query-time cost, as we saw during our upgrade. [~mikemccand] suggested one 
> possible approach: introducing a *mode* in Lucene80DocValuesFormat 
> (COMPRESSED and UNCOMPRESSED) and allowing users to create a custom Codec 
> subclassing the default Codec and picking the format they want.
> The idea is similar to Lucene50StoredFieldsFormat, which has two modes, 
> Mode.BEST_SPEED and Mode.BEST_COMPRESSION.
> Here's a related issue for adding a benchmark covering BINARY doc values 
> query-time performance - [https://github.com/mikemccand/luceneutil/issues/61]






[jira] [Commented] (LUCENE-9352) Add cost function to Scorable

2020-05-25 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116201#comment-17116201
 ] 

David Smiley commented on LUCENE-9352:
--

Is this about the cost of the score() call alone, as opposed to, say, the DISI? I 
suggest that a cost method here be named something like scoreCost() to help 
differentiate this cost from others.

> Add cost function to Scorable
> -
>
> Key: LUCENE-9352
> URL: https://issues.apache.org/jira/browse/LUCENE-9352
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Mayya Sharipova
>Priority: Minor
>
> {{Scorable.cost() function could be useful in optimizations.}}
> For example, the ability for collectors to skip non-competitive documents 
> introduced in LUCENE-9280 is based on the cost of the corresponding Scorable.






[jira] [Commented] (LUCENE-9352) Add cost function to Scorable

2020-05-25 Thread Adrien Grand (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116236#comment-17116236
 ] 

Adrien Grand commented on LUCENE-9352:
--

It would be the same cost as DocIdSetIterator#cost.

> Add cost function to Scorable
> -
>
> Key: LUCENE-9352
> URL: https://issues.apache.org/jira/browse/LUCENE-9352
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Mayya Sharipova
>Priority: Minor
>
> {{Scorable.cost() function could be useful in optimizations.}}
> For example, the ability for collectors to skip non-competitive documents 
> introduced in LUCENE-9280 is based on the cost of the corresponding Scorable.






[jira] [Commented] (LUCENE-9352) Add cost function to Scorable

2020-05-25 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116242#comment-17116242
 ] 

David Smiley commented on LUCENE-9352:
--

Okay; all the more reason for an unambiguous name ;-). Maybe 
{{iteratorCost()}}
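
For concreteness, the shape being debated might look like this (hypothetical sketch; neither method name exists in Lucene at this point):

abstract class ScorableSketch {
  abstract float score() throws java.io.IOException;

  // Same semantics as DocIdSetIterator#cost(): an upper-bound estimate of how
  // many documents the underlying iterator may match. The name says whose
  // cost it mirrors.
  abstract long iteratorCost();
}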

> Add cost function to Scorable
> -
>
> Key: LUCENE-9352
> URL: https://issues.apache.org/jira/browse/LUCENE-9352
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Mayya Sharipova
>Priority: Minor
>
> {{Scorable.cost() function could be useful in optimizations.}}
> For example, the ability for collectors to skip non-competitive documents 
> introduced in LUCENE-9280 is based on the cost of the corresponding Scorable.






[jira] [Created] (SOLR-14512) Require java 8 upgrade

2020-05-25 Thread Akhila John (Jira)
Akhila John created SOLR-14512:
--

 Summary: Require java 8 upgrade
 Key: SOLR-14512
 URL: https://issues.apache.org/jira/browse/SOLR-14512
 Project: Solr
  Issue Type: Task
  Security Level: Public (Default Security Level. Issues are Public)
 Environment: Production
Reporter: Akhila John


Hi Team, 

We use Solr 5.3.1 for Sitecore 8.2.

We need to upgrade the Java version to 'Java 8 Update 251' and remove/upgrade 
Wireshark to 3.2.3 on our application servers.
Could you please advise if this would have any impact on Solr? Does Solr 
5.3.1 support Java 8?


