[GitHub] [lucene-jira-archive] mocobeta commented on issue #1: Fix markup conversion error

2022-06-29 Thread GitBox


mocobeta commented on issue #1:
URL: https://github.com/apache/lucene-jira-archive/issues/1#issuecomment-1169860211

   Thanks for reporting.
   
   I found that Jira's numbered-list markup (`#`) is not converted correctly and is interpreted as headers in Markdown.
   
   Jira dump
   ```
   "body": "I'm definitely not an expert on this but after some research I 
found:\r\n # The real problem probably is we're assuming object alignment in 32 
bit jvm is 4 bytes but they're actually default into 8 bytes in HotSpot JVM and 
can't be anything less than 8 bytes 
([https://stackoverflow.com/questions/44468639/memory-alignment-of-java-classes)]\r\n
 # Object header may create offset for object alignment, like in your jol 
analysis, the header is 12 bytes long and thus created a 12%8=4 bytes offset, 
so that the target array size should cover those and that's why for {{byte[]}} 
4,12,20... sizes are optimal, but I\u00a0*think* the header length can vary 
depend on either jvm or system, since I've seen some post with 2 mark words in 
the header which makes header 16 bytes\r\n\r\nSo there should be something we 
could optimize here, but probably need to figure out a way to identify how many 
bytes are in array header, ah 
[RamUsageEstimator|https://github.com/apache/lucene/blob/main/lucene
 /core/src/java/org/apache/lucene/util/RamUsageEstimator.java#L179,L187] listed 
the details out, the 64 bit machine's header is already aligned so we don't 
need to worry about the offset, and 32 bit machine's header is constant 12 
bytes so with a 4 bytes offset.",
   ```
   
   Converted markdown data
   ```
   "body": "I'm definitely not an expert on this but after some research I 
found:\r\n # The real problem probably is we're assuming object alignment in 32 
bit jvm is 4 bytes but they're actually default into 8 bytes in HotSpot JVM and 
can't be anything less than 8 bytes 
(\r\n
 # Object header may create offset for object alignment, like in your jol 
analysis, the header is 12 bytes long and thus created a 12%8=4 bytes offset, 
so that the target array size should cover those and that's why for `byte[]` 
4,12,20... sizes are optimal, but I\u00a0**think** the header length can vary 
depend on either jvm or system, since I've seen some post with 2 mark words in 
the header which makes header 16 bytes\r\n\r\nSo there should be something we 
could optimize here, but probably need to figure out a way to identify how many 
bytes are in array header, ah 
[RamUsageEstimator](https://github.com/apache/lucene/blob/main/lucen
 e/core/src/java/org/apache/lucene/util/RamUsageEstimator.java#L179,L187) 
listed the details out, the 64 bit machine's header is already aligned so we 
don't need to worry about the offset, and 32 bit machine's header is constant 
12 bytes so with a 4 bytes offset.\n\nAuthor: Patrick Zhai (`@zhaih`)\nCreated: 
2022-06-09T07:07:05.021+\nUpdated: 2022-06-09T07:07:05.021+\n",
   ```
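
A minimal sketch of one way the `#` items could be rewritten before the text reaches a Markdown renderer. It assumes a Jira list item is a line made of optional whitespace, one or more `#`, whitespace, and the item text; the name `convert_numbered_lists` is hypothetical and is not taken from the lucene-jira-archive code. It does not try to skip `{code}` blocks, so a real converter would need more care.

```python
import re

# Hypothetical helper, not the project's actual converter.
# Jira marks ordered list items with leading '#' ('##' for nested items),
# which Markdown would otherwise read as headers.
_JIRA_NUMBERED_ITEM = re.compile(r"^[ \t]*(#+)[ \t]+(.*)$")


def convert_numbered_lists(text: str) -> str:
    """Rewrite Jira '#' list items as Markdown '1.' items."""
    out = []
    for line in text.splitlines():  # also splits on '\r\n'
        m = _JIRA_NUMBERED_ITEM.match(line)
        if m:
            depth = len(m.group(1))
            indent = "   " * (depth - 1)  # three spaces per nesting level
            # Markdown renumbers ordered lists, so every item can use '1.'
            out.append(f"{indent}1. {m.group(2)}")
        else:
            out.append(line)
    return "\n".join(out)
```

Applied to the dump above, the two ` # ...` lines would become `1. ...` items instead of headers.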
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] [lucene-jira-archive] mocobeta opened a new issue, #3: Create mapping on Jira user id -> GitHub account

2022-06-29 Thread GitBox


mocobeta opened a new issue, #3:
URL: https://github.com/apache/lucene-jira-archive/issues/3

   To correctly map Jira user ids in issues (reporter/assignee/author) to GitHub accounts, we need an account mapping file.
   This could be inferred from https://github.com/orgs/apache/people?
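
A minimal sketch of how such a mapping file might be loaded and applied, assuming a CSV file with the columns `jira_id` and `github_account`. The file layout, the column names, and the functions `load_account_map` and `to_mention` are all hypothetical; the issue does not specify a format.

```python
import csv
import io
from typing import Dict


def load_account_map(csv_text: str) -> Dict[str, str]:
    """Parse 'jira_id,github_account' rows; rows without an account are skipped."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        row["jira_id"]: row["github_account"]
        for row in reader
        if row["github_account"]
    }


def to_mention(jira_id: str, account_map: Dict[str, str]) -> str:
    """Return '@account' when a mapping exists, else the Jira id unchanged."""
    github = account_map.get(jira_id)
    return f"@{github}" if github else jira_id


# Placeholder ids, not real accounts.
accounts = load_account_map("jira_id,github_account\njdoe,jdoe-gh\nghost,\n")
print(to_mention("jdoe", accounts))   # mapped user
print(to_mention("ghost", accounts))  # unmapped user falls back to the Jira id
```

Falling back to the Jira id keeps unmapped reporters and assignees visible in the migrated issue rather than dropping them.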





[GitHub] [lucene-jira-archive] mocobeta opened a new issue, #2: Archive all Jira attachments

2022-06-29 Thread GitBox


mocobeta opened a new issue, #2:
URL: https://github.com/apache/lucene-jira-archive/issues/2

   All attachments should be archived in `attachments/`. They will be referenced from the migrated issues in https://github.com/apache/lucene.
   For files with the same names, we keep the latest versions only. (Jira shows 
links to the latest versions for attachments, so old versions are safely 
omitted.)
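
A minimal sketch of the "keep the latest version only" rule, assuming each attachment record carries a `filename` and a `created` timestamp in the form `2022-06-09T07:07:05.021+0000`. The record layout and the function name `latest_attachments` are assumptions made for illustration, not the archive tool's actual code.

```python
from datetime import datetime

# Assumed timestamp layout, e.g. '2022-06-09T07:07:05.021+0000'.
_TS_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"


def latest_attachments(attachments):
    """Keep one record per filename: the one with the newest 'created' time."""
    latest = {}
    for att in attachments:
        created = datetime.strptime(att["created"], _TS_FORMAT)
        name = att["filename"]
        if name not in latest or created > latest[name][0]:
            latest[name] = (created, att)
    return [att for _, att in latest.values()]
```

Only the records returned here would need to be downloaded into `attachments/`; older files with the same name are skipped.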





[GitHub] [lucene-jira-archive] dweiss commented on issue #1: Fix markup conversion error

2022-06-29 Thread GitBox


dweiss commented on issue #1:
URL: https://github.com/apache/lucene-jira-archive/issues/1#issuecomment-1169793482

   
![image](https://user-images.githubusercontent.com/199470/176411524-9d1a8998-09cb-4544-9890-282ba1ff8b31.png)
   
   This is what the bold block-issue looks like.





[GitHub] [lucene-jira-archive] mocobeta opened a new issue, #1: Fix markup conversion error

2022-06-29 Thread GitBox


mocobeta opened a new issue, #1:
URL: https://github.com/apache/lucene-jira-archive/issues/1

   There are various errors in converting Jira markup to Markdown.
   
   For example:
   - tables are broken
   - bullet lists converted to bold blocks (?)
   - bullet lists include unnecessary spaces between items
   - indents are not preserved
   - ...
   
   This issue tries to figure out the root causes of the errors and fix them.





[GitHub] [lucene] zacharymorn commented on a diff in pull request #767: LUCENE-10436: Deprecate DocValuesFieldExistsQuery, NormsFieldExistsQuery and KnnVectorFieldExistsQuery with FieldExistsQuery

2022-04-03 Thread GitBox


zacharymorn commented on code in PR #767:
URL: https://github.com/apache/lucene/pull/767#discussion_r841312352


##
lucene/core/src/java/org/apache/lucene/search/FieldExistsQuery.java:
##
@@ -0,0 +1,223 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.search;
+
+import java.io.IOException;
+import java.util.Objects;
+import org.apache.lucene.index.DocValues;
+import org.apache.lucene.index.DocValuesType;
+import org.apache.lucene.index.FieldInfo;
+import org.apache.lucene.index.FieldInfos;
+import org.apache.lucene.index.IndexOptions;
+import org.apache.lucene.index.IndexReader;
+import org.apache.lucene.index.LeafReader;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.index.PointValues;
+import org.apache.lucene.index.Terms;
+
+/**
+ * A {@link Query} that matches documents that contain either a {@link
+ * org.apache.lucene.document.KnnVectorField}, or a field that indexes norms or doc values.
+ */
+public class FieldExistsQuery extends Query {
+  private String field;
+
+  /** Create a query that will match that have a value for the given {@code field}. */
+  public FieldExistsQuery(String field) {
+this.field = Objects.requireNonNull(field);
+  }
+
+  public String getField() {
+return field;
+  }
+
+  @Override
+  public String toString(String field) {
+return "FieldExistsQuery [field=" + this.field + "]";
+  }
+
+  @Override
+  public void visit(QueryVisitor visitor) {
+if (visitor.acceptField(field)) {
+  visitor.visitLeaf(this);
+}
+  }
+
+  @Override
+  public boolean equals(Object other) {
+return sameClassAs(other) && field.equals(((FieldExistsQuery) other).field);
+  }
+
+  @Override
+  public int hashCode() {
+final int prime = 31;
+int hash = classHash();
+hash = prime * hash + field.hashCode();
+return hash;
+  }
+
+  @Override
+  public Query rewrite(IndexReader reader) throws IOException {
+boolean allReadersRewritable = true;
+
+for (LeafReaderContext context : reader.leaves()) {
+  LeafReader leaf = context.reader();
+  FieldInfos fieldInfos = leaf.getFieldInfos();
+  FieldInfo fieldInfo = fieldInfos.fieldInfo(field);
+
+  if (fieldInfo == null) {
+allReadersRewritable = false;
+break;
+  }
+
+  if (fieldInfo.hasNorms()) { // the field indexes norms
+if (reader.getDocCount(field) != reader.maxDoc()) {
+  allReadersRewritable = false;
+  break;
+}
+  } else if (fieldInfo.getVectorDimension() != 0) { // the field indexes vectors
+if (leaf.getVectorValues(field).size() != reader.maxDoc()) {
+  allReadersRewritable = false;
+  break;
+}
+  } else if (fieldInfo.getDocValuesType() != DocValuesType.NONE
+  || leaf.terms(field) != null
+  || leaf.getPointValues(field) != null) { // the field indexes doc values or points

Review Comment:
   I gave that a try in https://github.com/apache/lucene/pull/767/commits/d4d9a3f5b79fa778f29c48d5ab0fee43dab14561, but it failed a few tests (which I fixed), mostly because `BinaryPoint` or `StringField` fields don't pass the condition `fieldInfo.getDocValuesType() != DocValuesType.NONE`. Could you let me know if it looks correct to you?






[GitHub] [lucene] zacharymorn commented on a diff in pull request #767: LUCENE-10436: Deprecate DocValuesFieldExistsQuery, NormsFieldExistsQuery and KnnVectorFieldExistsQuery with FieldExistsQuery

2022-04-03 Thread GitBox


zacharymorn commented on code in PR #767:
URL: https://github.com/apache/lucene/pull/767#discussion_r841309603


##
lucene/core/src/java/org/apache/lucene/search/DocValuesFieldExistsQuery.java:
##
@@ -31,42 +28,21 @@
 /**
  * A {@link Query} that matches documents that have a value for a given field as reported by doc
  * values iterators.
+ *
+ * @deprecated Use {@link org.apache.lucene.search.FieldExistsQuery} instead.
  */
-public final class DocValuesFieldExistsQuery extends Query {
-
-  private final String field;
+@Deprecated
+public final class DocValuesFieldExistsQuery extends FieldExistsQuery {
+  private String field;
 
   /** Create a query that will match documents which have a value for the given {@code field}. */
   public DocValuesFieldExistsQuery(String field) {
+super(field);
 this.field = Objects.requireNonNull(field);
   }
 
-  public String getField() {
-return field;
-  }
-
-  @Override
-  public boolean equals(Object other) {
-return sameClassAs(other) && field.equals(((DocValuesFieldExistsQuery) other).field);
-  }
-
-  @Override
-  public int hashCode() {
-return 31 * classHash() + field.hashCode();
-  }
-
-  @Override
-  public String toString(String field) {
-return "DocValuesFieldExistsQuery [field=" + this.field + "]";
-  }
-
-  @Override
-  public void visit(QueryVisitor visitor) {
-if (visitor.acceptField(field)) {
-  visitor.visitLeaf(this);
-}
-  }
-
+  // nocommit this seems to be generalizable to norms and knn as well given LUCENE-9334, and thus
+  // could be moved to the new FieldExistsQuery?

Review Comment:
   Ok I see.






[GitHub] [lucene-solr-sandbox] anshumg opened a new pull request #1: Create .asf.yaml

2021-02-05 Thread GitBox


anshumg opened a new pull request #1:
URL: https://github.com/apache/lucene-solr-sandbox/pull/1


   






[GitHub] [lucene-solr-operator] HoustonPutman opened a new pull request #179: Migrating CI/CD from travis to github actions.

2021-01-15 Thread GitBox


HoustonPutman opened a new pull request #179:
URL: https://github.com/apache/lucene-solr-operator/pull/179


   






[GitHub] [lucene-solr-operator] anshumg commented on issue #173: Helm repository is not available anymore

2021-01-14 Thread GitBox


anshumg commented on issue #173:
URL: https://github.com/apache/lucene-solr-operator/issues/173#issuecomment-760558019


   Just to be clear, @HoustonPutman meant 'smooth' 😄 






[GitHub] [lucene-solr-operator] dalbani commented on issue #173: Helm repository is not available anymore

2021-01-14 Thread GitBox


dalbani commented on issue #173:
URL: https://github.com/apache/lucene-solr-operator/issues/173#issuecomment-760550211


   Okay, great, thanks a lot.






[GitHub] [lucene-solr-operator] HoustonPutman commented on issue #173: Helm repository is not available anymore

2021-01-14 Thread GitBox


HoustonPutman commented on issue #173:
URL: https://github.com/apache/lucene-solr-operator/issues/173#issuecomment-760550041


   Great question, I should have addressed it.
   
   It looks like we will unfortunately not be able to move to apache/solr-operator for a while. It will eventually move there, but I can't give a timeline. We will definitely be more vocal about the change beforehand next time and try to make the transition smoothless.






[GitHub] [lucene-solr-operator] dalbani commented on issue #173: Helm repository is not available anymore

2021-01-14 Thread GitBox


dalbani commented on issue #173:
URL: https://github.com/apache/lucene-solr-operator/issues/173#issuecomment-760547409


   But, as you said, will the Git project and Helm repo eventually move to 
`apache/solr-operator` instead of `apache/lucene-solr-operator`?






[GitHub] [lucene-solr-operator] HoustonPutman closed issue #173: Helm repository is not available anymore

2021-01-14 Thread GitBox


HoustonPutman closed issue #173:
URL: https://github.com/apache/lucene-solr-operator/issues/173


   






[GitHub] [lucene-solr-operator] HoustonPutman merged pull request #178: Change the repo location to apache

2021-01-14 Thread GitBox


HoustonPutman merged pull request #178:
URL: https://github.com/apache/lucene-solr-operator/pull/178


   






[GitHub] [lucene-solr-operator] anshumg commented on pull request #176: Standardize headers to be ASL2.0, move copyright to NOTICE

2021-01-14 Thread GitBox


anshumg commented on pull request #176:
URL: https://github.com/apache/lucene-solr-operator/pull/176#issuecomment-760545292


   Thanks for taking a look @klaporte :) 






[GitHub] [lucene-solr-operator] anshumg closed issue #175: Standardize headers to be ASL2.0, move copyright to NOTICE

2021-01-14 Thread GitBox


anshumg closed issue #175:
URL: https://github.com/apache/lucene-solr-operator/issues/175


   






[GitHub] [lucene-solr-operator] anshumg merged pull request #176: Standardize headers to be ASL2.0, move copyright to NOTICE

2021-01-14 Thread GitBox


anshumg merged pull request #176:
URL: https://github.com/apache/lucene-solr-operator/pull/176


   






[GitHub] [lucene-solr-operator] HoustonPutman opened a new pull request #178: Change the repo location to apache

2021-01-14 Thread GitBox


HoustonPutman opened a new pull request #178:
URL: https://github.com/apache/lucene-solr-operator/pull/178


   Fixes: #173 






[GitHub] [lucene-solr-operator] HoustonPutman closed pull request #174: Create .asf.yaml

2021-01-14 Thread GitBox


HoustonPutman closed pull request #174:
URL: https://github.com/apache/lucene-solr-operator/pull/174


   






[GitHub] [lucene-solr-operator] HoustonPutman merged pull request #177: Add .asf.yaml

2021-01-14 Thread GitBox


HoustonPutman merged pull request #177:
URL: https://github.com/apache/lucene-solr-operator/pull/177


   






[GitHub] [lucene-solr-operator] HoustonPutman opened a new pull request #177: Add .asf.yaml

2021-01-14 Thread GitBox


HoustonPutman opened a new pull request #177:
URL: https://github.com/apache/lucene-solr-operator/pull/177


   






[GitHub] [lucene-solr-operator] klaporte commented on pull request #176: Standardize headers to be ASL2.0, move copyright to NOTICE

2021-01-14 Thread GitBox


klaporte commented on pull request #176:
URL: https://github.com/apache/lucene-solr-operator/pull/176#issuecomment-760495787


   Looks good to me. Thanks @anshumg  & @HoustonPutman 






[GitHub] [lucene-solr-operator] anshumg commented on pull request #176: Standardize headers to be ASL2.0, move copyright to NOTICE

2021-01-14 Thread GitBox


anshumg commented on pull request #176:
URL: https://github.com/apache/lucene-solr-operator/pull/176#issuecomment-760490146


   FYI, still waiting for Infra to remove the required checks before I can 
squash and merge.






[GitHub] [lucene-solr-operator] anshumg commented on pull request #176: Standardize headers to be ASL2.0, move copyright to NOTICE

2021-01-14 Thread GitBox


anshumg commented on pull request #176:
URL: https://github.com/apache/lucene-solr-operator/pull/176#issuecomment-760474459


   @HoustonPutman  - reading more about the license and the requirements around 
that. We might be able to remove that too. I'll do that as another PR if needed 
once I have clarity.
   
   The only thing missing here is the fix in `check-license.sh` 






[GitHub] [lucene-solr-operator] anshumg commented on a change in pull request #176: Standardize headers to be ASL2.0, move copyright to NOTICE

2021-01-14 Thread GitBox


anshumg commented on a change in pull request #176:
URL: https://github.com/apache/lucene-solr-operator/pull/176#discussion_r557694683



##
File path: NOTICE.txt
##
@@ -0,0 +1,30 @@
+==
+ Apache Solr
+ Copyright 2006-2021 The Apache Software Foundation
+==
+
+This product includes software developed at
+The Apache Software Foundation (http://www.apache.org/).
+
+Includes software from other Apache Software Foundation projects,
+including, but not limited to:
+  - Apache Lucene Java

Review comment:
   Yes, but we might not have to add Solr as it's the same project. I think 
the Lucene bit that exists in 'Solr' is just cruft from 10 years ago :) It'll 
be required again in the future, but for the operator, we shouldn't need Lucene 
in this list.








[GitHub] [lucene-solr-operator] HoustonPutman commented on a change in pull request #176: Standardize headers to be ASL2.0, move copyright to NOTICE

2021-01-14 Thread GitBox


HoustonPutman commented on a change in pull request #176:
URL: https://github.com/apache/lucene-solr-operator/pull/176#discussion_r557689208



##
File path: NOTICE.txt
##
@@ -0,0 +1,30 @@
+==
+ Apache Solr
+ Copyright 2006-2021 The Apache Software Foundation
+==
+
+This product includes software developed at
+The Apache Software Foundation (http://www.apache.org/).
+
+Includes software from other Apache Software Foundation projects,
+including, but not limited to:
+  - Apache Lucene Java

Review comment:
   I imagine we are going to have to fill this out with the Go dependencies we use?

##
File path: NOTICE.txt
##
@@ -0,0 +1,30 @@
+==
+ Apache Solr

Review comment:
   Should this be Apache Solr Operator?

##
File path: NOTICE.txt
##
@@ -0,0 +1,30 @@
+==
+ Apache Solr
+ Copyright 2006-2021 The Apache Software Foundation
+==
+
+This product includes software developed at
+The Apache Software Foundation (http://www.apache.org/).
+
+Includes software from other Apache Software Foundation projects,
+including, but not limited to:
+  - Apache Lucene Java

Review comment:
   So this project doesn't actually contain software from Apache Lucene, 
right? We use the APIs, but not the libraries.








[GitHub] [lucene-solr-operator] anshumg opened a new pull request #176: Standardize headers to be ASL2.0, move copyright to NOTICE

2021-01-14 Thread GitBox


anshumg opened a new pull request #176:
URL: https://github.com/apache/lucene-solr-operator/pull/176


   






[GitHub] [lucene-solr-operator] anshumg opened a new issue #175: Standardize headers to be ASL2.0, move copyright to NOTICE

2021-01-14 Thread GitBox


anshumg opened a new issue #175:
URL: https://github.com/apache/lucene-solr-operator/issues/175


   As per the discussion here : https://issues.apache.org/jira/browse/LEGAL-553
   
   Standardize the file headers to be ASL2.0 text, and move the copyright to 
NOTICE.txt






[GitHub] [lucene-solr-operator] HoustonPutman opened a new pull request #174: Create .asf.yaml

2021-01-14 Thread GitBox


HoustonPutman opened a new pull request #174:
URL: https://github.com/apache/lucene-solr-operator/pull/174


   






[GitHub] [lucene-solr-operator] HoustonPutman commented on issue #173: Helm repository is not available anymore

2021-01-14 Thread GitBox


HoustonPutman commented on issue #173:
URL: https://github.com/apache/lucene-solr-operator/issues/173#issuecomment-760309717


   In the meantime, you should be able to use this instead: (Just a fix until 
the real changes have been made)
   
   ```
   helm repo add solr-operator https://raw.githubusercontent.com/apache/lucene-solr-operator/master/docs/charts
   ```






[GitHub] [lucene-solr-operator] HoustonPutman commented on issue #173: Helm repository is not available anymore

2021-01-14 Thread GitBox


HoustonPutman commented on issue #173:
URL: 
https://github.com/apache/lucene-solr-operator/issues/173#issuecomment-760276148


   Hey everyone, sorry about this. I'll try to have it available as soon as possible.
   
   Unfortunately, the repo was moved to `apache/lucene-solr-operator` instead of `apache/solr-operator`, so it will have to be moved again. I will try to make it work in the meantime, and I'll post an update here and on the Slack channel when it is available.






[GitHub] [lucene-solr-operator] vladiceanu commented on issue #173: Helm repository is not available anymore

2021-01-14 Thread GitBox


vladiceanu commented on issue #173:
URL: 
https://github.com/apache/lucene-solr-operator/issues/173#issuecomment-760272753


   @bsankara any news on this? thanks






[GitHub] [lucene-solr-operator] benediktarnold commented on issue #173: Helm repository is not available anymore

2021-01-14 Thread GitBox


benediktarnold commented on issue #173:
URL: 
https://github.com/apache/lucene-solr-operator/issues/173#issuecomment-760132855


   I don't know when it happened, but since solr-operator is now an Apache project, it makes sense that the chart isn't hosted on Bloomberg's GitHub page anymore. I'm looking for an alternative as well.






[GitHub] [lucene-solr-operator] dalbani opened a new issue #173: Helm repository is not available anymore

2021-01-14 Thread GitBox


dalbani opened a new issue #173:
URL: https://github.com/apache/lucene-solr-operator/issues/173


   We used to use the Helm repository located at 
https://bloomberg.github.io/solr-operator/charts, but it seems not to be 
available anymore.
   Is there a new location? The documentation still mentions the old location.






[GitHub] [lucene-solr] dsmiley closed pull request #855: SOLR-13739: Improve performance on huge schema updates

2019-09-13 Thread GitBox
dsmiley closed pull request #855: SOLR-13739: Improve performance on huge 
schema updates
URL: https://github.com/apache/lucene-solr/pull/855
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] [lucene-solr] atris commented on issue #877: LUCENE-8978: Maximal Bottom Score Based Early Termination

2019-09-13 Thread GitBox
atris commented on issue #877: LUCENE-8978: Maximal Bottom Score Based Early 
Termination
URL: https://github.com/apache/lucene-solr/pull/877#issuecomment-531276346
 
 
   Hmm, the precommit failure seems to be coming from `JdbcDataSource.java`?





[GitHub] [lucene-solr] atris commented on issue #877: LUCENE-8978: Maximal Bottom Score Based Early Termination

2019-09-13 Thread GitBox
atris commented on issue #877: LUCENE-8978: Maximal Bottom Score Based Early 
Termination
URL: https://github.com/apache/lucene-solr/pull/877#issuecomment-531275454
 
 
   @jimczi Here is the link:
   
   
https://issues.apache.org/jira/browse/LUCENE-8978?focusedCommentId=16929277&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16929277





[GitHub] [lucene-solr] iverase commented on issue #627: LUCENE-8746: Make EdgeTree (aka ComponentTree) support different type of components

2019-09-13 Thread GitBox
iverase commented on issue #627: LUCENE-8746: Make EdgeTree (aka ComponentTree) 
support different type of components
URL: https://github.com/apache/lucene-solr/pull/627#issuecomment-531271483
 
 
   See #878 





[GitHub] [lucene-solr] iverase closed pull request #627: LUCENE-8746: Make EdgeTree (aka ComponentTree) support different type of components

2019-09-13 Thread GitBox
iverase closed pull request #627: LUCENE-8746: Make EdgeTree (aka 
ComponentTree) support different type of components
URL: https://github.com/apache/lucene-solr/pull/627
 
 
   





[GitHub] [lucene-solr] iverase opened a new pull request #878: LUCENE-8746: Refactor EdgeTree

2019-09-13 Thread GitBox
iverase opened a new pull request #878: LUCENE-8746: Refactor EdgeTree 
URL: https://github.com/apache/lucene-solr/pull/878
 
 
   Another attempt at refactoring EdgeTree. This PR splits the EdgeTree class into two and adds a new interface:
   
   * Component2D: An interface defining an object that knows its bounding box and can perform some spatial operations.
   * ComponentTree: An interval tree containing the different components (e.g. polygon or line).
   * EdgeTree: An interval tree containing the edges of a component (polygon or line edges).
   
   Unfortunately the PR touches quite a lot of files, but most of them are test files. The benchmark results look good: points show the same performance, and shapes show a performance increase (we no longer compute the bounding box of the triangle many times).

   |Approach|Shape|M hits/sec Dev|M hits/sec Base|Diff|QPS Dev|QPS Base|Diff|Hit count Dev|Hit count Base|Diff|
   |---|---|---|---|---|---|---|---|---|---|---|
   |points|polyRussia|13.97|13.98|-0%|3.98|3.99|-0%|3508846|3508846| 0%|
   |points|poly 10|73.35|71.84| 2%|46.39|45.43| 2%|355809475|355809475| 0%|
   |points|polyMedium|8.73|8.67| 1%|106.99|106.24| 1%|2693559|2693559| 0%|
   |shapes|polyRussia|6.88|5.73|20%|1.96|1.63|20%|3508846|3508846| 0%|
   |shapes|poly 10|27.98|27.03| 4%|17.69|17.10| 4%|355809475|355809475| 0%|
   |shapes|polyMedium|2.79|2.75| 2%|34.19|33.66| 2%|2693559|2693559| 0%|
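
For readers who want a picture of the split described in this PR, here is a minimal sketch in Java. It is not the code from the PR: the method names (`minX`, `contains`, ...), the `Rect` stand-in component, and the tree layout are all assumptions made for illustration. It only shows the idea of a component that knows its bounding box, and a tree of components that uses subtree bounding boxes to skip work.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical interface: an object that knows its bounding box and answers a spatial query. */
interface Component2D {
  double minX();
  double maxX();
  double minY();
  double maxY();
  boolean contains(double x, double y);
}

/** Hypothetical component: an axis-aligned rectangle standing in for a polygon or line. */
final class Rect implements Component2D {
  private final double minX, maxX, minY, maxY;

  Rect(double minX, double maxX, double minY, double maxY) {
    this.minX = minX;
    this.maxX = maxX;
    this.minY = minY;
    this.maxY = maxY;
  }

  public double minX() { return minX; }
  public double maxX() { return maxX; }
  public double minY() { return minY; }
  public double maxY() { return maxY; }

  public boolean contains(double x, double y) {
    return x >= minX && x <= maxX && y >= minY && y <= maxY;
  }
}

/** Balanced tree of components; each node keeps the bounding box of its whole subtree. */
final class ComponentTree implements Component2D {
  private final Component2D component;
  private ComponentTree left, right;
  private double minX, maxX, minY, maxY;

  private ComponentTree(Component2D c) {
    component = c;
    minX = c.minX();
    maxX = c.maxX();
    minY = c.minY();
    maxY = c.maxY();
  }

  static ComponentTree create(Component2D... components) {
    if (components.length == 0) {
      throw new IllegalArgumentException("at least one component is required");
    }
    Component2D[] sorted = components.clone();
    Arrays.sort(sorted, Comparator.comparingDouble(Component2D::minX));
    return build(sorted, 0, sorted.length - 1);
  }

  private static ComponentTree build(Component2D[] c, int lo, int hi) {
    if (lo > hi) {
      return null;
    }
    int mid = (lo + hi) >>> 1;
    ComponentTree node = new ComponentTree(c[mid]);
    node.left = build(c, lo, mid - 1);
    node.right = build(c, mid + 1, hi);
    // Grow this node's box so that it covers both subtrees.
    for (ComponentTree child : new ComponentTree[] {node.left, node.right}) {
      if (child != null) {
        node.minX = Math.min(node.minX, child.minX);
        node.maxX = Math.max(node.maxX, child.maxX);
        node.minY = Math.min(node.minY, child.minY);
        node.maxY = Math.max(node.maxY, child.maxY);
      }
    }
    return node;
  }

  public double minX() { return minX; }
  public double maxX() { return maxX; }
  public double minY() { return minY; }
  public double maxY() { return maxY; }

  public boolean contains(double x, double y) {
    // Prune: the point is outside the box of everything stored at or below this node.
    if (x < minX || x > maxX || y < minY || y > maxY) {
      return false;
    }
    if (component.contains(x, y)) {
      return true;
    }
    if (left != null && left.contains(x, y)) {
      return true;
    }
    return right != null && right.contains(x, y);
  }
}
```

Because the tree itself implements the same interface as its members, a caller can treat a single shape and a collection of shapes uniformly, which is one plausible reason for introducing the interface.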
   





[GitHub] [lucene-solr] jimczi commented on issue #877: LUCENE-8978: Maximal Bottom Score Based Early Termination

2019-09-13 Thread GitBox
jimczi commented on issue #877: LUCENE-8978: Maximal Bottom Score Based Early 
Termination
URL: https://github.com/apache/lucene-solr/pull/877#issuecomment-531266829
 
 
   Can you run another benchmark now that we propagate the global min score to the scorer?





[GitHub] [lucene-solr] atris commented on issue #877: LUCENE-8978: Maximal Bottom Score Based Early Termination

2019-09-13 Thread GitBox
atris commented on issue #877: LUCENE-8978: Maximal Bottom Score Based Early 
Termination
URL: https://github.com/apache/lucene-solr/pull/877#issuecomment-531261223
 
 
   @jimczi Updated





[GitHub] [lucene-solr] atris commented on a change in pull request #877: LUCENE-8978: Maximal Bottom Score Based Early Termination

2019-09-13 Thread GitBox
atris commented on a change in pull request #877: LUCENE-8978: Maximal Bottom 
Score Based Early Termination
URL: https://github.com/apache/lucene-solr/pull/877#discussion_r324221245
 
 

 ##
 File path: 
lucene/core/src/java/org/apache/lucene/search/TopScoreDocCollector.java
 ##
 @@ -138,14 +152,20 @@ public void collect(int doc) throws IOException {
   if (score > after.score || (score == after.score && doc <= afterDoc)) {
 // hit was collected on a previous page
 if (totalHitsRelation == TotalHits.Relation.EQUAL_TO && hitsThresholdChecker.isThresholdReached()) {
+  // Since the queue is prepopulated with sentinel objects, getting here means that the local
+  // priority queue is full
+  if (bottomValueChecker != null) {
 
 Review comment:
   Agreed. The reason I did not put it in `updateMinCompetitiveScore` is that the function had already become significantly heavier with the additions in this PR. Added now.





[GitHub] [lucene-solr] jimczi commented on a change in pull request #877: LUCENE-8978: Maximal Bottom Score Based Early Termination

2019-09-13 Thread GitBox
jimczi commented on a change in pull request #877: LUCENE-8978: Maximal Bottom 
Score Based Early Termination
URL: https://github.com/apache/lucene-solr/pull/877#discussion_r324218422
 
 

 ##
 File path: 
lucene/core/src/java/org/apache/lucene/search/TopScoreDocCollector.java
 ##
 @@ -155,6 +175,11 @@ public void collect(int doc) throws IOException {
   pqTop.doc = doc + docBase;
   pqTop.score = score;
   pqTop = pq.updateTop();
+
+  if (bottomValueChecker != null && bottomValueChecker.getBottomValue() > 0) {
 
 Review comment:
   The update below should be called only when the total hits threshold is used and the queue is full, so wouldn't it be easier to set it in `updateMinCompetitiveScore`, as we do to propagate the min score to the scorer?
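
The suggestion in this review comment can be pictured with a small, self-contained sketch. Nothing below is Lucene code: `MinScoreSink` stands in for the scorer, and `SharedBottomScore` and `SketchCollector` are hypothetical names. The point is only that the "threshold reached and queue full" check and the propagation of the shared bottom score live in one method.

```java
/** Stand-in for the scorer: only the single call this sketch needs. */
interface MinScoreSink {
  void setMinCompetitiveScore(float minScore);
}

/** Hypothetical holder for the largest bottom score reported by any collector. */
final class SharedBottomScore {
  private float max = 0f;

  synchronized void offer(float score) {
    if (score > max) {
      max = score;
    }
  }

  synchronized float get() {
    return max;
  }
}

/** Hypothetical collector reduced to the bookkeeping that matters here. */
final class SketchCollector {
  private final SharedBottomScore shared; // null when there is a single collector
  private final int totalHitsThreshold;
  private int hits;
  private boolean queueFull;
  private float localBottom;

  SketchCollector(SharedBottomScore shared, int totalHitsThreshold) {
    this.shared = shared;
    this.totalHitsThreshold = totalHitsThreshold;
  }

  /** Records one hit and the current state of the local queue, then runs the update. */
  void onHit(float bottomOfLocalQueue, boolean localQueueFull, MinScoreSink scorer) {
    hits++;
    localBottom = bottomOfLocalQueue;
    queueFull = localQueueFull;
    updateMinCompetitiveScore(scorer);
  }

  /** The one place where the local and the shared bottom reach the scorer. */
  private void updateMinCompetitiveScore(MinScoreSink scorer) {
    if (hits <= totalHitsThreshold || !queueFull) {
      return; // threshold not reached or queue not full: nothing to propagate yet
    }
    float min = localBottom;
    if (shared != null) {
      shared.offer(localBottom);
      min = Math.max(min, shared.get());
    }
    scorer.setMinCompetitiveScore(min);
  }
}
```

With this shape the collect path only calls `updateMinCompetitiveScore`, and a collector whose own bottom is low still benefits from a higher bottom published by another collector.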





[GitHub] [lucene-solr] jimczi commented on a change in pull request #877: LUCENE-8978: Maximal Bottom Score Based Early Termination

2019-09-13 Thread GitBox
jimczi commented on a change in pull request #877: LUCENE-8978: Maximal Bottom 
Score Based Early Termination
URL: https://github.com/apache/lucene-solr/pull/877#discussion_r324218620
 
 

 ##
 File path: 
lucene/core/src/java/org/apache/lucene/search/TopScoreDocCollector.java
 ##
 @@ -138,14 +152,20 @@ public void collect(int doc) throws IOException {
   if (score > after.score || (score == after.score && doc <= afterDoc)) {
 // hit was collected on a previous page
 if (totalHitsRelation == TotalHits.Relation.EQUAL_TO && hitsThresholdChecker.isThresholdReached()) {
+  // Since the queue is prepopulated with sentinel objects, getting here means that the local
+  // priority queue is full
+  if (bottomValueChecker != null) {
 
 Review comment:
   Wouldn't it be easier to put this logic in `updateMinCompetitiveScore` as well?





[GitHub] [lucene-solr] jimczi commented on a change in pull request #877: LUCENE-8978: Maximal Bottom Score Based Early Termination

2019-09-13 Thread GitBox
jimczi commented on a change in pull request #877: LUCENE-8978: Maximal Bottom 
Score Based Early Termination
URL: https://github.com/apache/lucene-solr/pull/877#discussion_r324216889
 
 

 ##
 File path: 
lucene/core/src/java/org/apache/lucene/search/BottomValueChecker.java
 ##
 @@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.search;
+
+import java.util.concurrent.atomic.AtomicBoolean;
+import java.util.concurrent.atomic.AtomicInteger;
+
+/**
+ * Maintains the bottom value across multiple collectors
+ */
+abstract class BottomValueChecker {
 
 Review comment:
   Let's resolve this later. We're still not sure that we'll use the same mechanism, so shouldn't we focus on the top score docs collector, where we only need a float at the moment?
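
As an illustration of what a float-only holder could look like, here is a minimal sketch. It is an assumption, not what the PR ended up with: the class name `MaxBottomScore` is hypothetical, and the lock-free update through `AtomicInteger` is swapped in for the `volatile` field plus `synchronized` block shown in the diff above. It also assumes scores are non-negative, so the initial value of `0f` is a safe lower bound.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical float-only holder for the maximum bottom score across collectors. */
final class MaxBottomScore {
  // Raw bits of the current maximum; starts at 0f (scores are assumed to be non-negative).
  private final AtomicInteger bits = new AtomicInteger(Float.floatToIntBits(0f));

  /** Raises the shared maximum if the given score is larger; NaN is ignored. */
  void offer(float score) {
    int candidate = Float.floatToIntBits(score);
    // The function receives (current, candidate) and may be retried under contention.
    bits.accumulateAndGet(
        candidate,
        (cur, next) -> Float.intBitsToFloat(next) > Float.intBitsToFloat(cur) ? next : cur);
  }

  float get() {
    return Float.intBitsToFloat(bits.get());
  }
}
```

A primitive `float` in the API also avoids the boxing that comes with a `Float` parameter and return type.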





[GitHub] [lucene-solr] atris commented on issue #876: Use The Passed In Threshold Value in doConcurrentSearchWithThreshold

2019-09-13 Thread GitBox
atris commented on issue #876: Use The Passed In Threshold Value in 
doConcurrentSearchWithThreshold
URL: https://github.com/apache/lucene-solr/pull/876#issuecomment-531233493
 
 
   Thanks!





[GitHub] [lucene-solr] msokolov commented on issue #876: Use The Passed In Threshold Value in doConcurrentSearchWithThreshold

2019-09-13 Thread GitBox
msokolov commented on issue #876: Use The Passed In Threshold Value in 
doConcurrentSearchWithThreshold
URL: https://github.com/apache/lucene-solr/pull/876#issuecomment-531233209
 
 
   thanks! merged





[GitHub] [lucene-solr] msokolov merged pull request #876: Use The Passed In Threshold Value in doConcurrentSearchWithThreshold

2019-09-13 Thread GitBox
msokolov merged pull request #876: Use The Passed In Threshold Value in 
doConcurrentSearchWithThreshold
URL: https://github.com/apache/lucene-solr/pull/876
 
 
   





[GitHub] [lucene-solr] atris commented on issue #876: Use The Passed In Threshold Value in doConcurrentSearchWithThreshold

2019-09-13 Thread GitBox
atris commented on issue #876: Use The Passed In Threshold Value in 
doConcurrentSearchWithThreshold
URL: https://github.com/apache/lucene-solr/pull/876#issuecomment-531230646
 
 
   Thanks, missed the camel casing, fixed now





[GitHub] [lucene-solr] msokolov commented on issue #876: Use The Passed In Threshold Value in doConcurrentSearchWithThreshold

2019-09-13 Thread GitBox
msokolov commented on issue #876: Use The Passed In Threshold Value in 
doConcurrentSearchWithThreshold
URL: https://github.com/apache/lucene-solr/pull/876#issuecomment-531229416
 
 
   I'll commit it. As I looked at this I couldn't help noticing a little spelling nit: we have `thresHold`, but "threshold" is a single word, so I think it would look nicer without the camel case :)





[GitHub] [lucene-solr] atris commented on issue #877: LUCENE-8978: Maximal Bottom Score Based Early Termination

2019-09-13 Thread GitBox
atris commented on issue #877: LUCENE-8978: Maximal Bottom Score Based Early 
Termination
URL: https://github.com/apache/lucene-solr/pull/877#issuecomment-531228068
 
 
   @jimczi Updated, please see.
   
   One thing we could do is cache the bottom docID (doc + docBase) in 
BottomValueChecker, but as you said, it might not be worth it.





[GitHub] [lucene-solr] atris commented on a change in pull request #877: LUCENE-8978: Maximal Bottom Score Based Early Termination

2019-09-13 Thread GitBox
atris commented on a change in pull request #877: LUCENE-8978: Maximal Bottom 
Score Based Early Termination
URL: https://github.com/apache/lucene-solr/pull/877#discussion_r324181290
 
 

 ##
 File path: 
lucene/core/src/java/org/apache/lucene/search/BottomValueChecker.java
 ##
 @@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.search;
+
+import java.util.concurrent.atomic.AtomicBoolean;
+import java.util.concurrent.atomic.AtomicInteger;
+
+/**
+ * Maintains the bottom value across multiple collectors
+ */
+abstract class BottomValueChecker {
+  /** Maintains global bottom score as the maximum of all bottom scores */
+  private static class MaximumBottomScoreChecker extends BottomValueChecker {
+private volatile float maxMinScore;
+private final AtomicBoolean bottomValueAvailable = new AtomicBoolean();
+
+@Override
+public void updateThreadLocalBottomValue(Float value) {
 
 Review comment:
   Removed





[GitHub] [lucene-solr] atris commented on a change in pull request #877: LUCENE-8978: Maximal Bottom Score Based Early Termination

2019-09-13 Thread GitBox
atris commented on a change in pull request #877: LUCENE-8978: Maximal Bottom 
Score Based Early Termination
URL: https://github.com/apache/lucene-solr/pull/877#discussion_r324146082
 
 

 ##
 File path: 
lucene/core/src/java/org/apache/lucene/search/BottomValueChecker.java
 ##
 @@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.search;
+
+import java.util.concurrent.atomic.AtomicBoolean;
+import java.util.concurrent.atomic.AtomicInteger;
+
+/**
+ * Maintains the bottom value across multiple collectors
+ */
+abstract class BottomValueChecker {
+  /** Maintains global bottom score as the maximum of all bottom scores */
+  private static class MaximumBottomScoreChecker extends BottomValueChecker {
+private volatile float maxMinScore;
+private final AtomicBoolean bottomValueAvailable = new AtomicBoolean();
+
+@Override
+public void updateThreadLocalBottomValue(Float value) {
 
 Review comment:
   Due to the generic





[GitHub] [lucene-solr] atris commented on a change in pull request #877: LUCENE-8978: Maximal Bottom Score Based Early Termination

2019-09-13 Thread GitBox
atris commented on a change in pull request #877: LUCENE-8978: Maximal Bottom 
Score Based Early Termination
URL: https://github.com/apache/lucene-solr/pull/877#discussion_r324162028
 
 

 ##
 File path: 
lucene/core/src/java/org/apache/lucene/search/TopScoreDocCollector.java
 ##
 @@ -74,11 +75,25 @@ public void collect(int doc) throws IOException {
   totalHits++;
   hitsThresholdChecker.incrementHitCount();
 
-  if (score <= pqTop.score) {
+  boolean nonCompetitiveHit;
+  if (bottomValueChecker != null && bottomValueChecker.isGlobalBottomValueAvailable()) {
+nonCompetitiveHit = score <= (float) bottomValueChecker.getBottomValue();
+  } else {
+nonCompetitiveHit = score <= pqTop.score;
+  }
+
+  if (nonCompetitiveHit) {
 if (totalHitsRelation == TotalHits.Relation.EQUAL_TO && hitsThresholdChecker.isThresholdReached()) {
   // we just reached totalHitsThreshold, we can start setting the min
   // competitive score now
   updateMinCompetitiveScore(scorer);
+
+  // Since the queue is prepopulated with sentinel objects, getting here means that the local
+  // priority queue is full
+  if (bottomValueChecker != null) {
+bottomValueChecker.updateThreadLocalBottomValue(pqTop.score);
+scorer.setMinCompetitiveScore((float) bottomValueChecker.getBottomValue());
 
 Review comment:
   Ah, understood now. Moved, thanks.





[GitHub] [lucene-solr] atris commented on a change in pull request #877: LUCENE-8978: Maximal Bottom Score Based Early Termination

2019-09-13 Thread GitBox
atris commented on a change in pull request #877: LUCENE-8978: Maximal Bottom 
Score Based Early Termination
URL: https://github.com/apache/lucene-solr/pull/877#discussion_r324161934
 
 

 ##
 File path: 
lucene/core/src/java/org/apache/lucene/search/BottomValueChecker.java
 ##
 @@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.search;
+
+import java.util.concurrent.atomic.AtomicBoolean;
+import java.util.concurrent.atomic.AtomicInteger;
+
+/**
+ * Maintains the bottom value across multiple collectors
+ */
+abstract class BottomValueChecker {
+  /** Maintains global bottom score as the maximum of all bottom scores */
+  private static class MaximumBottomScoreChecker extends BottomValueChecker {
+private volatile float maxMinScore;
+private final AtomicBoolean bottomValueAvailable = new AtomicBoolean();
+
+@Override
+public void updateThreadLocalBottomValue(Float value) {
+  if (value <= maxMinScore) {
+return;
+  }
+  synchronized (this) {
+if (value > maxMinScore) {
+  maxMinScore = value;
+}
+  }
+  bottomValueAvailable.compareAndSet(false, true);
+}
+
+@Override
+public Float getBottomValue() {
 
 Review comment:
   Removed for now





[GitHub] [lucene-solr] atris commented on a change in pull request #877: LUCENE-8978: Maximal Bottom Score Based Early Termination

2019-09-13 Thread GitBox
atris commented on a change in pull request #877: LUCENE-8978: Maximal Bottom 
Score Based Early Termination
URL: https://github.com/apache/lucene-solr/pull/877#discussion_r324146126
 
 

 ##
 File path: 
lucene/core/src/java/org/apache/lucene/search/BottomValueChecker.java
 ##
 @@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.search;
+
+import java.util.concurrent.atomic.AtomicBoolean;
+import java.util.concurrent.atomic.AtomicInteger;
+
+/**
+ * Maintains the bottom value across multiple collectors
+ */
+abstract class BottomValueChecker {
+  /** Maintains global bottom score as the maximum of all bottom scores */
+  private static class MaximumBottomScoreChecker extends BottomValueChecker {
+private volatile float maxMinScore;
+private final AtomicBoolean bottomValueAvailable = new AtomicBoolean();
+
+@Override
+public void updateThreadLocalBottomValue(Float value) {
+  if (value <= maxMinScore) {
+return;
+  }
+  synchronized (this) {
+if (value > maxMinScore) {
+  maxMinScore = value;
+}
+  }
+  bottomValueAvailable.compareAndSet(false, true);
+}
+
+@Override
+public Float getBottomValue() {
 
 Review comment:
   Please see above





[GitHub] [lucene-solr] atris commented on issue #876: Use The Passed In Threshold Value in doConcurrentSearchWithThreshold

2019-09-13 Thread GitBox
atris commented on issue #876: Use The Passed In Threshold Value in 
doConcurrentSearchWithThreshold
URL: https://github.com/apache/lucene-solr/pull/876#issuecomment-531226956
 
 
   Can we merge this one? This is a simple fix to an existing test method.





[GitHub] [lucene-solr] jpountz merged pull request #868: LUCENE-8975: Code Cleanup: Use entryset for map iteration wherever possible.

2019-09-13 Thread GitBox
jpountz merged pull request #868: LUCENE-8975: Code Cleanup: Use entryset for 
map iteration wherever possible.
URL: https://github.com/apache/lucene-solr/pull/868
 
 
   





[GitHub] [lucene-solr] atris commented on issue #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache

2019-09-13 Thread GitBox
atris commented on issue #815: LUCENE-8213: Introduce Asynchronous Caching in 
LRUQueryCache
URL: https://github.com/apache/lucene-solr/pull/815#issuecomment-531221868
 
 
   Any further thoughts on this one?





[GitHub] [lucene-solr] jpountz commented on issue #875: merge master

2019-09-13 Thread GitBox
jpountz commented on issue #875: merge master
URL: https://github.com/apache/lucene-solr/pull/875#issuecomment-531221639
 
 
   Probably opened by mistake.





[GitHub] [lucene-solr] jpountz closed pull request #875: merge master

2019-09-13 Thread GitBox
jpountz closed pull request #875: merge master
URL: https://github.com/apache/lucene-solr/pull/875
 
 
   





[GitHub] [lucene-solr] jpountz commented on a change in pull request #865: LUCENE-8973: XYRectangle2D should work on float space

2019-09-13 Thread GitBox
jpountz commented on a change in pull request #865: LUCENE-8973: XYRectangle2D 
should work on float space
URL: https://github.com/apache/lucene-solr/pull/865#discussion_r324172772
 
 

 ##
 File path: lucene/sandbox/src/java/org/apache/lucene/geo/XYRectangle2D.java
 ##
 @@ -16,42 +16,156 @@
  */
 package org.apache.lucene.geo;
 
-import static org.apache.lucene.geo.XYEncodingUtils.decode;
-import static org.apache.lucene.geo.XYEncodingUtils.encode;
+import java.util.Arrays;
+import java.util.Objects;
+
+import org.apache.lucene.index.PointValues;
+
+import static org.apache.lucene.geo.GeoUtils.orient;
 
 /**
  * 2D rectangle implementation containing cartesian spatial logic.
  *
  * @lucene.internal
  */
-public class XYRectangle2D extends Rectangle2D {
+public class XYRectangle2D  {
 
-  protected XYRectangle2D(double minX, double maxX, double minY, double maxY) {
-super(encode(minX), encode(maxX), encode(minY), encode(maxY));
+  private final float minX;
+  private final float maxX;
+  private final float minY;
+  private final float maxY;
+
+  protected XYRectangle2D(float minX, float maxX, float minY, float maxY) {
+this.minX =  minX;
+this.maxX =  maxX;
+this.minY =  minY;
+this.maxY =  maxY;
   }
 
-  /** Builds a Rectangle2D from rectangle */
-  public static XYRectangle2D create(XYRectangle rectangle) {
-return new XYRectangle2D(rectangle.minX, rectangle.maxX, rectangle.minY, rectangle.maxY);
+  public boolean contains(float x, float y) {
+return x >= this.minX && x <= this.maxX && y >= this.minY && y <= this.maxY;
   }
 
-  @Override
-  public boolean crossesDateline() {
+  public PointValues.Relation relate(float minX, float maxX, float minY, float maxY) {
+if (this.minX > maxX || this.maxX < minX || this.minY > maxY || this.maxY < minY) {
+  return PointValues.Relation.CELL_OUTSIDE_QUERY;
+}
+if (minX >= this.minX && maxX <= this.maxX && minY >= this.minY && maxY <= this.maxY) {
+  return PointValues.Relation.CELL_INSIDE_QUERY;
+}
+return PointValues.Relation.CELL_CROSSES_QUERY;
+  }
+
+  public PointValues.Relation relateTriangle(float aX, float aY, float bX, float bY, float cX, float cY) {
+// compute bounding box of triangle
+float tMinX = StrictMath.min(StrictMath.min(aX, bX), cX);
+float tMaxX = StrictMath.max(StrictMath.max(aX, bX), cX);
+float tMinY = StrictMath.min(StrictMath.min(aY, bY), cY);
+float tMaxY = StrictMath.max(StrictMath.max(aY, bY), cY);
+
+if (tMaxX < minX || tMinX > maxX || tMinY > maxY || tMaxY < minY) {
+  return PointValues.Relation.CELL_OUTSIDE_QUERY;
+}
+
+int edgesContain = numberOfCorners(aX, aY, bX, bY, cX, cY);
+if (edgesContain == 3) {
+  return PointValues.Relation.CELL_INSIDE_QUERY;
+} else if (edgesContain != 0) {
+  return PointValues.Relation.CELL_CROSSES_QUERY;
+} else if (Tessellator.pointInTriangle(minX, minY, aX, aY, bX, bY, cX, cY)
+   || edgesIntersect(aX, aY, bX, bY)
+   || edgesIntersect(bX, bY, cX, cY)
+   || edgesIntersect(cX, cY, aX, aY)) {
+  return PointValues.Relation.CELL_CROSSES_QUERY;
+}
+return PointValues.Relation.CELL_OUTSIDE_QUERY;
+  }
+
+  private  boolean edgesIntersect(float ax, float ay, float bx, float by) {
+// shortcut: if edge is a point (occurs w/ Line shapes); simply check bbox w/ point
+if (ax == bx && ay == by) {
+  return false;
+}
+
+// shortcut: check bboxes of edges are disjoint
+if ( Math.max(ax, bx) < minX || Math.min(ax, bx) > maxX || Math.min(ay, by) > maxY || Math.max(ay, by) < minY) {
+  return false;
+}
+
+// top
+if (orient(ax, ay, bx, by, minX, maxY) * orient(ax, ay, bx, by, maxX, maxY) <= 0 &&
+orient(minX, maxY, maxX, maxY, ax, ay) * orient(minX, maxY, maxX, maxY, bx, by) <= 0) {
+  return true;
+}
+
+// right
+if (orient(ax, ay, bx, by, maxX, maxY) * orient(ax, ay, bx, by, maxX, minY) <= 0 &&
+orient(maxX, maxY, maxX, minY, ax, ay) * orient(maxX, maxY, maxX, minY, bx, by) <= 0) {
+  return true;
+}
+
+// bottom
+if (orient(ax, ay, bx, by, maxX, minY) * orient(ax, ay, bx, by, minX, minY) <= 0 &&
+orient(maxX, minY, minX, minY, ax, ay) * orient(maxX, minY, minX, minY, bx, by) <= 0) {
+  return true;
+}
+
+// left
+if (orient(ax, ay, bx, by, minX, minY) * orient(ax, ay, bx, by, minX, maxY) <= 0 &&
+orient(minX, minY, minX, maxY, ax, ay) * orient(minX, minY, minX, maxY, bx, by) <= 0) {
+  return true;
+}
 return false;
   }
 
+  private int numberOfCorners(float ax, float ay, float bx, float by, float cx, float cy) {
+int containsCount = 0;
+if (contains(ax, ay)) {
+  containsCount++;
+}
+if (contains(bx, by)) {
+  containsCount++;
+}
+if (contains(cx, cy)) {
+  containsCount++;
+}
+return containsCount;
+  }
+
+  @Override
+  public boolean equals(

[GitHub] [lucene-solr] iverase commented on issue #770: LUCENE-8746: Component2D topology library that works on encoded space

2019-09-13 Thread GitBox
iverase commented on issue #770: LUCENE-8746: Component2D topology library that 
works on encoded space
URL: https://github.com/apache/lucene-solr/pull/770#issuecomment-531219042
 
 
   Closing this PR. I am opening a new one with a narrower scope.





[GitHub] [lucene-solr] iverase closed pull request #770: LUCENE-8746: Component2D topology library that works on encoded space

2019-09-13 Thread GitBox
iverase closed pull request #770: LUCENE-8746: Component2D topology library 
that works on encoded space
URL: https://github.com/apache/lucene-solr/pull/770
 
 
   





[GitHub] [lucene-solr] atris commented on a change in pull request #877: LUCENE-8978: Maximal Bottom Score Based Early Termination

2019-09-13 Thread GitBox
atris commented on a change in pull request #877: LUCENE-8978: Maximal Bottom 
Score Based Early Termination
URL: https://github.com/apache/lucene-solr/pull/877#discussion_r324145887
 
 

 ##
 File path: 
lucene/core/src/java/org/apache/lucene/search/BottomValueChecker.java
 ##
 @@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.search;
+
+import java.util.concurrent.atomic.AtomicBoolean;
+import java.util.concurrent.atomic.AtomicInteger;
+
+/**
+ * Maintains the bottom value across multiple collectors
+ */
+abstract class BottomValueChecker {
 
 Review comment:
   Since we will need to reuse the class for `TopFieldCollector` as well.





[GitHub] [lucene-solr] atris commented on a change in pull request #877: LUCENE-8978: Maximal Bottom Score Based Early Termination

2019-09-13 Thread GitBox
atris commented on a change in pull request #877: LUCENE-8978: Maximal Bottom 
Score Based Early Termination
URL: https://github.com/apache/lucene-solr/pull/877#discussion_r324141173
 
 

 ##
 File path: 
lucene/core/src/java/org/apache/lucene/search/TopScoreDocCollector.java
 ##
 @@ -74,11 +75,25 @@ public void collect(int doc) throws IOException {
   totalHits++;
   hitsThresholdChecker.incrementHitCount();
 
-  if (score <= pqTop.score) {
+  boolean nonCompetitiveHit;
+  if (bottomValueChecker != null && bottomValueChecker.isGlobalBottomValueAvailable()) {
+nonCompetitiveHit = score <= (float) bottomValueChecker.getBottomValue();
+  } else {
+nonCompetitiveHit = score <= pqTop.score;
+  }
+
+  if (nonCompetitiveHit) {
 if (totalHitsRelation == TotalHits.Relation.EQUAL_TO && hitsThresholdChecker.isThresholdReached()) {
   // we just reached totalHitsThreshold, we can start setting the min
   // competitive score now
   updateMinCompetitiveScore(scorer);
+
+  // Since the queue is prepopulated with sentinel objects, getting here means that the local
+  // priority queue is full
+  if (bottomValueChecker != null) {
+bottomValueChecker.updateThreadLocalBottomValue(pqTop.score);
+scorer.setMinCompetitiveScore((float) bottomValueChecker.getBottomValue());
 
 Review comment:
   If the global minimum score is equal to the local minimum score, shouldn't we 
discard this document since, as you said, we collect in docID order?





[GitHub] [lucene-solr] atris commented on a change in pull request #877: LUCENE-8978: Maximal Bottom Score Based Early Termination

2019-09-13 Thread GitBox
atris commented on a change in pull request #877: LUCENE-8978: Maximal Bottom 
Score Based Early Termination
URL: https://github.com/apache/lucene-solr/pull/877#discussion_r324141301
 
 

 ##
 File path: 
lucene/core/src/java/org/apache/lucene/search/TopScoreDocCollector.java
 ##
 @@ -74,11 +75,25 @@ public void collect(int doc) throws IOException {
   totalHits++;
   hitsThresholdChecker.incrementHitCount();
 
-  if (score <= pqTop.score) {
 
 Review comment:
   Agreed, +1.





[GitHub] [lucene-solr] KoenDG commented on a change in pull request #868: LUCENE-8975: Code Cleanup: Use entryset for map iteration wherever possible.

2019-09-13 Thread GitBox
KoenDG commented on a change in pull request #868: LUCENE-8975: Code Cleanup: 
Use entryset for map iteration wherever possible.
URL: https://github.com/apache/lucene-solr/pull/868#discussion_r324133535
 
 

 ##
 File path: lucene/CHANGES.txt
 ##
 @@ -65,6 +65,8 @@ Other
 
 * LUCENE-8768: Fix Javadocs build in Java 11. (Namgyu Kim)
 
+* LUCENE-8975: Code Cleanup: Use entryset for map iteration wherever possible.
+
 
 Review comment:
   Done





[GitHub] [lucene-solr] jimczi commented on a change in pull request #877: LUCENE-8978: Maximal Bottom Score Based Early Termination

2019-09-13 Thread GitBox
jimczi commented on a change in pull request #877: LUCENE-8978: Maximal Bottom 
Score Based Early Termination
URL: https://github.com/apache/lucene-solr/pull/877#discussion_r324130344
 
 

 ##
 File path: 
lucene/core/src/java/org/apache/lucene/search/BottomValueChecker.java
 ##
 @@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.search;
+
+import java.util.concurrent.atomic.AtomicBoolean;
+import java.util.concurrent.atomic.AtomicInteger;
+
+/**
+ * Maintains the bottom value across multiple collectors
+ */
+abstract class BottomValueChecker {
 
 Review comment:
   Why is a generic needed?





[GitHub] [lucene-solr] jimczi commented on a change in pull request #877: LUCENE-8978: Maximal Bottom Score Based Early Termination

2019-09-13 Thread GitBox
jimczi commented on a change in pull request #877: LUCENE-8978: Maximal Bottom 
Score Based Early Termination
URL: https://github.com/apache/lucene-solr/pull/877#discussion_r324130518
 
 

 ##
 File path: 
lucene/core/src/java/org/apache/lucene/search/BottomValueChecker.java
 ##
 @@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.search;
+
+import java.util.concurrent.atomic.AtomicBoolean;
+import java.util.concurrent.atomic.AtomicInteger;
+
+/**
+ * Maintains the bottom value across multiple collectors
+ */
+abstract class BottomValueChecker {
+  /** Maintains global bottom score as the maximum of all bottom scores */
+  private static class MaximumBottomScoreChecker extends BottomValueChecker {
+private volatile float maxMinScore;
+private final AtomicBoolean bottomValueAvailable = new AtomicBoolean();
+
+@Override
+public void updateThreadLocalBottomValue(Float value) {
 
 Review comment:
   Just use a simple `float`?





[GitHub] [lucene-solr] jimczi commented on a change in pull request #877: LUCENE-8978: Maximal Bottom Score Based Early Termination

2019-09-13 Thread GitBox
jimczi commented on a change in pull request #877: LUCENE-8978: Maximal Bottom 
Score Based Early Termination
URL: https://github.com/apache/lucene-solr/pull/877#discussion_r324125480
 
 

 ##
 File path: 
lucene/core/src/java/org/apache/lucene/search/TopScoreDocCollector.java
 ##
 @@ -74,11 +75,25 @@ public void collect(int doc) throws IOException {
   totalHits++;
   hitsThresholdChecker.incrementHitCount();
 
-  if (score <= pqTop.score) {
+  boolean nonCompetitiveHit;
+  if (bottomValueChecker != null && bottomValueChecker.isGlobalBottomValueAvailable()) {
+nonCompetitiveHit = score <= (float) bottomValueChecker.getBottomValue();
+  } else {
+nonCompetitiveHit = score <= pqTop.score;
+  }
+
+  if (nonCompetitiveHit) {
 if (totalHitsRelation == TotalHits.Relation.EQUAL_TO && hitsThresholdChecker.isThresholdReached()) {
   // we just reached totalHitsThreshold, we can start setting the min
   // competitive score now
   updateMinCompetitiveScore(scorer);
+
+  // Since the queue is prepopulated with sentinel objects, getting here means that the local
+  // priority queue is full
+  if (bottomValueChecker != null) {
+bottomValueChecker.updateThreadLocalBottomValue(pqTop.score);
+scorer.setMinCompetitiveScore((float) bottomValueChecker.getBottomValue());
 
 Review comment:
   Can you move this to `updateMinCompetitiveScore`? If the global minimum 
score is equal to the local minimum score, we can require the next float value, 
since we tie-break on doc id and collect in doc id order within the segment. So 
we need to reconcile the local minimum score and the global one, and call 
`setMinCompetitiveScore` only once.
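
   A minimal sketch of that reconciliation, assuming the local bottom score is 
bumped with `Math.nextUp` while the shared bottom is used as-is. The class and 
method names (`MinScoreReconciler`, `reconcile`) are hypothetical and are not 
part of the PR:

```java
/** Hypothetical helper for illustration; not part of Lucene or of this PR. */
final class MinScoreReconciler {
  private MinScoreReconciler() {}

  /**
   * Combines the bottom score of the local queue with the bottom score shared
   * across collectors into a single value, so the scorer is updated once.
   */
  static float reconcile(float localBottom, float globalBottom) {
    // Later hits in this segment have larger doc ids, so a hit that ties with
    // the local bottom cannot win: require the next representable float.
    float localMin = Math.nextUp(localBottom);
    // Assumption: the global bottom may come from a doc with a larger id in
    // another segment, so hits that tie with it are kept (no nextUp here).
    return Math.max(localMin, globalBottom);
  }
}
```

   With equal local and global bottoms the result is the next float above that 
value, which is the case described in the comment; the caller would then pass 
the single reconciled value to `setMinCompetitiveScore`.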





[GitHub] [lucene-solr] jimczi commented on a change in pull request #877: LUCENE-8978: Maximal Bottom Score Based Early Termination

2019-09-13 Thread GitBox
jimczi commented on a change in pull request #877: LUCENE-8978: Maximal Bottom 
Score Based Early Termination
URL: https://github.com/apache/lucene-solr/pull/877#discussion_r324122789
 
 

 ##
 File path: 
lucene/core/src/java/org/apache/lucene/search/BottomValueChecker.java
 ##
 @@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.search;
+
+import java.util.concurrent.atomic.AtomicBoolean;
+import java.util.concurrent.atomic.AtomicInteger;
+
+/**
+ * Maintains the bottom value across multiple collectors
+ */
+abstract class BottomValueChecker {
+  /** Maintains global bottom score as the maximum of all bottom scores */
+  private static class MaximumBottomScoreChecker extends BottomValueChecker {
+private volatile float maxMinScore;
+private final AtomicBoolean bottomValueAvailable = new AtomicBoolean();
+
+@Override
+public void updateThreadLocalBottomValue(Float value) {
+  if (value <= maxMinScore) {
+return;
+  }
+  synchronized (this) {
+if (value > maxMinScore) {
+  maxMinScore = value;
+}
+  }
+  bottomValueAvailable.compareAndSet(false, true);
 
 Review comment:
   Is this needed? The default `maxMinScore` is 0, so you can just check for 
that in the collector.
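
   A sketch of what that could look like, assuming scores are never negative 
so that 0 can stand for "no bottom value published yet". The class name 
`MaxBottomScoreSketch` and its methods are hypothetical; it also uses a 
primitive `float` instead of `Float`:

```java
/** Hypothetical simplification for illustration; not the code in this PR. */
final class MaxBottomScoreSketch {
  // Defaults to 0, which doubles as the "nothing published yet" marker.
  // Assumption: real scores are non-negative, so 0 is never a useful bottom.
  private volatile float maxMinScore;

  /** Raises the shared bottom if the given local bottom is larger. */
  void update(float value) {
    if (value <= maxMinScore) {
      return; // cheap unsynchronized check, as in the original diff
    }
    synchronized (this) {
      if (value > maxMinScore) {
        maxMinScore = value;
      }
    }
  }

  float get() {
    return maxMinScore;
  }

  /** Replaces the AtomicBoolean flag: a positive value means "available". */
  boolean isAvailable() {
    return maxMinScore > 0f;
  }
}
```

   The collector can then compare against `get()` directly; while the value is 
still 0 the comparison filters nothing, so no separate flag is required.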





[GitHub] [lucene-solr] jimczi commented on a change in pull request #877: LUCENE-8978: Maximal Bottom Score Based Early Termination

2019-09-13 Thread GitBox
jimczi commented on a change in pull request #877: LUCENE-8978: Maximal Bottom 
Score Based Early Termination
URL: https://github.com/apache/lucene-solr/pull/877#discussion_r324130957
 
 

 ##
 File path: 
lucene/core/src/java/org/apache/lucene/search/TopScoreDocCollector.java
 ##
 @@ -89,8 +104,12 @@ public void collect(int doc) throws IOException {
   pqTop.score = score;
   pqTop = pq.updateTop();
   updateMinCompetitiveScore(scorer);
-}
 
+  if (bottomValueChecker != null && bottomValueChecker.isGlobalBottomValueAvailable()) {
 
 Review comment:
   Same here, this should move to `updateMinCompetitiveScore`. As is, it 
doesn't respect the total hits threshold, and the bottom value might be greater 
than the local bottom score.
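
   One way to read this, sketched with a stand-in class rather than the real 
collector: the update is skipped until the hit threshold is reached, and the 
scorer is only touched when the reconciled value actually rises. All names 
here (`GatedMinScoreUpdater`, `onHit`, `update`, `setterCalls`) are 
hypothetical, and "threshold reached" is assumed to mean more hits than the 
threshold:

```java
/** Hypothetical stand-in for the collector logic; not the code in this PR. */
final class GatedMinScoreUpdater {
  private final int totalHitsThreshold;
  private int hitCount;
  private float minCompetitiveScore;
  // Counts how often a real scorer's setMinCompetitiveScore would be called.
  int setterCalls;

  GatedMinScoreUpdater(int totalHitsThreshold) {
    this.totalHitsThreshold = totalHitsThreshold;
  }

  void onHit() {
    hitCount++;
  }

  float minCompetitiveScore() {
    return minCompetitiveScore;
  }

  /** Single place where both the local and the global bottom are applied. */
  void update(float localBottom, float globalBottom) {
    // Assumption: hits must keep being counted until the threshold is passed,
    // so no minimum score may be set before that point.
    if (hitCount <= totalHitsThreshold) {
      return;
    }
    float candidate = Math.max(Math.nextUp(localBottom), globalBottom);
    if (candidate > minCompetitiveScore) {
      minCompetitiveScore = candidate;
      setterCalls++;
    }
  }
}
```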





[GitHub] [lucene-solr] jimczi commented on a change in pull request #877: LUCENE-8978: Maximal Bottom Score Based Early Termination

2019-09-13 Thread GitBox
jimczi commented on a change in pull request #877: LUCENE-8978: Maximal Bottom 
Score Based Early Termination
URL: https://github.com/apache/lucene-solr/pull/877#discussion_r324130644
 
 

 ##
 File path: 
lucene/core/src/java/org/apache/lucene/search/BottomValueChecker.java
 ##
 @@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.search;
+
+import java.util.concurrent.atomic.AtomicBoolean;
+import java.util.concurrent.atomic.AtomicInteger;
+
+/**
+ * Maintains the bottom value across multiple collectors
+ */
+abstract class BottomValueChecker {
+  /** Maintains global bottom score as the maximum of all bottom scores */
+  private static class MaximumBottomScoreChecker extends BottomValueChecker {
+private volatile float maxMinScore;
+private final AtomicBoolean bottomValueAvailable = new AtomicBoolean();
+
+@Override
+public void updateThreadLocalBottomValue(Float value) {
+  if (value <= maxMinScore) {
+return;
+  }
+  synchronized (this) {
+if (value > maxMinScore) {
+  maxMinScore = value;
+}
+  }
+  bottomValueAvailable.compareAndSet(false, true);
+}
+
+@Override
+public Float getBottomValue() {
 
 Review comment:
   Same here, a `float` is enough.





[GitHub] [lucene-solr] jimczi commented on a change in pull request #877: LUCENE-8978: Maximal Bottom Score Based Early Termination

2019-09-13 Thread GitBox
jimczi commented on a change in pull request #877: LUCENE-8978: Maximal Bottom 
Score Based Early Termination
URL: https://github.com/apache/lucene-solr/pull/877#discussion_r324129841
 
 

 ##
 File path: 
lucene/core/src/java/org/apache/lucene/search/TopScoreDocCollector.java
 ##
 @@ -74,11 +75,25 @@ public void collect(int doc) throws IOException {
   totalHits++;
   hitsThresholdChecker.incrementHitCount();
 
-  if (score <= pqTop.score) {
 
 Review comment:
   I find this simpler:
   `if (score <= pqTop.score && (bottomValueChecker == null || score < bottomValueChecker.getBottomValue())) {`
   This is redundant for the segment that holds the current max minimum score, 
but we can optimize this case later, when the logic is fully implemented.
   When comparing against the global minimum score we need to use less than 
(`<`) rather than `<=`. This is important since we tiebreak on document id and 
the global minimum score can come from a document id that is **after** this 
local segment. We can also optimize this part and record the global document id 
that is associated with the current max score in the BottomValueChecker, but 
that's not necessary in the first iteration IMO.
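
   The tiebreak argument above can be sketched as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical, and the ranking rule (higher score first, smaller doc id on ties) is the one described in the comment, not code copied from the collector.

```java
// Hypothetical illustration of the tiebreak reasoning; not the PR's code.
final class GlobalBottomTieBreak {
  /** Higher score ranks first; on equal scores the smaller doc id wins. */
  static boolean ranksAhead(float score, int doc, float bottomScore, int bottomDoc) {
    if (score != bottomScore) {
      return score > bottomScore;
    }
    return doc < bottomDoc;
  }

  /**
   * Skip test when only the global bottom score is known, not its doc id.
   * The comparison has to be strict: a hit that ties on score may still win
   * on doc id, so it must not be skipped.
   */
  static boolean canSkipOnScoreAlone(float score, float globalBottomScore) {
    return score < globalBottomScore;
  }
}
```

   With a global bottom of score 1.0 at doc 40, a local hit with score 1.0 at doc 5 ranks ahead of it, so a `<=` check would wrongly drop a competitive hit.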
   





[GitHub] [lucene-solr] atris commented on a change in pull request #877: LUCENE-8978: Maximal Bottom Score Based Early Termination

2019-09-13 Thread GitBox
atris commented on a change in pull request #877: LUCENE-8978: Maximal Bottom 
Score Based Early Termination
URL: https://github.com/apache/lucene-solr/pull/877#discussion_r324116784
 
 

 ##
 File path: 
lucene/core/src/java/org/apache/lucene/search/TopScoreDocCollector.java
 ##
 @@ -156,6 +187,10 @@ public void collect(int doc) throws IOException {
   pqTop.score = score;
   pqTop = pq.updateTop();
   updateMinCompetitiveScore(scorer);
 
 Review comment:
   Hmm, yeah, not sure what happened. I beasted the tests again -- came in clean





[GitHub] [lucene-solr] atris commented on issue #877: LUCENE-8978: Maximal Bottom Score Based Early Termination

2019-09-13 Thread GitBox
atris commented on issue #877: LUCENE-8978: Maximal Bottom Score Based Early 
Termination
URL: https://github.com/apache/lucene-solr/pull/877#issuecomment-531174857
 
 
   @jimczi Thanks for the comments -- updated the PR. Please let me know your 
thoughts
   





[GitHub] [lucene-solr] jimczi commented on a change in pull request #877: LUCENE-8978: Maximal Bottom Score Based Early Termination

2019-09-13 Thread GitBox
jimczi commented on a change in pull request #877: LUCENE-8978: Maximal Bottom 
Score Based Early Termination
URL: https://github.com/apache/lucene-solr/pull/877#discussion_r324105474
 
 

 ##
 File path: 
lucene/core/src/java/org/apache/lucene/search/TopScoreDocCollector.java
 ##
 @@ -156,6 +187,10 @@ public void collect(int doc) throws IOException {
   pqTop.score = score;
   pqTop = pq.updateTop();
   updateMinCompetitiveScore(scorer);
 
 Review comment:
   I don't understand how it could break this test. We should propagate the min 
score only if the total hit threshold is reached and this test indexes 100 docs 
only while the default threshold is 1,000. So we should never skip hits in this 
test, am I missing something ?
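
   A tiny sketch of the argument above, using the numbers from the comment (100 docs, threshold of 1,000). The names are hypothetical, and the exact comparison used to decide that the threshold is reached (`>` here) is an assumption rather than something confirmed in this thread.

```java
// Hypothetical model of "propagate the min score only once the threshold is reached".
final class MinScorePropagation {
  /** Assumption: the threshold counts as reached once more hits than the threshold were collected. */
  static boolean thresholdReached(int totalHits, int totalHitsThreshold) {
    return totalHits > totalHitsThreshold;
  }

  /** Counts how many collected hits would be allowed to update the scorer's minimum score. */
  static int countPropagations(int numDocs, int totalHitsThreshold) {
    int propagations = 0;
    for (int totalHits = 1; totalHits <= numDocs; totalHits++) {
      if (thresholdReached(totalHits, totalHitsThreshold)) {
        propagations++;
      }
    }
    return propagations;
  }
}
```

   Under that assumption `countPropagations(100, 1000)` is 0, which is why no hit should be skipped in a test that indexes only 100 documents.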





[GitHub] [lucene-solr] jimczi commented on a change in pull request #877: LUCENE-8978: Maximal Bottom Score Based Early Termination

2019-09-13 Thread GitBox
jimczi commented on a change in pull request #877: LUCENE-8978: Maximal Bottom 
Score Based Early Termination
URL: https://github.com/apache/lucene-solr/pull/877#discussion_r324105474
 
 

 ##
 File path: 
lucene/core/src/java/org/apache/lucene/search/TopScoreDocCollector.java
 ##
 @@ -156,6 +187,10 @@ public void collect(int doc) throws IOException {
   pqTop.score = score;
   pqTop = pq.updateTop();
   updateMinCompetitiveScore(scorer);
 
 Review comment:
   I don't understand how it could break this test. We should propagate the min 
score only if the total hit threshold is reached and this test indexes 100 docs 
only while the default threshold is 10,000. So we should never skip hits in 
this test, am I missing something ?





[GitHub] [lucene-solr] atris commented on a change in pull request #877: LUCENE-8978: Maximal Bottom Score Based Early Termination

2019-09-13 Thread GitBox
atris commented on a change in pull request #877: LUCENE-8978: Maximal Bottom 
Score Based Early Termination
URL: https://github.com/apache/lucene-solr/pull/877#discussion_r324103471
 
 

 ##
 File path: 
lucene/core/src/java/org/apache/lucene/search/BottomValueChecker.java
 ##
 @@ -0,0 +1,56 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.search;
+
+import java.util.concurrent.atomic.AtomicBoolean;
+import java.util.concurrent.atomic.AtomicInteger;
+
+/**
+ * Maintains the bottom value across multiple collectors
+ */
+abstract class BottomValueChecker {
+  /** Maintains global bottom score as the maximum of all bottom scores */
+  private static class MaximumBottomScoreChecker extends BottomValueChecker {
+    private final AtomicInteger globalBottomValue = new AtomicInteger();
+    private final AtomicBoolean bottomValueAvailable = new AtomicBoolean();
+
+    @Override
+    public void updateThreadLocalBottomValue(Float value) {
+      globalBottomValue.updateAndGet(currentValue ->
+          Float.intBitsToFloat(currentValue) < value ? Float.floatToIntBits(value) : currentValue);
 
 Review comment:
   Fixed, thanks





[GitHub] [lucene-solr] atris commented on a change in pull request #877: LUCENE-8978: Maximal Bottom Score Based Early Termination

2019-09-13 Thread GitBox
atris commented on a change in pull request #877: LUCENE-8978: Maximal Bottom 
Score Based Early Termination
URL: https://github.com/apache/lucene-solr/pull/877#discussion_r324103214
 
 

 ##
 File path: 
lucene/core/src/java/org/apache/lucene/search/TopScoreDocCollector.java
 ##
 @@ -156,6 +187,10 @@ public void collect(int doc) throws IOException {
   pqTop.score = score;
   pqTop = pq.updateTop();
   updateMinCompetitiveScore(scorer);
 
 Review comment:
   @jimczi That breaks `testIndexSearcher.testCount` though





[GitHub] [lucene-solr] jpountz commented on a change in pull request #868: LUCENE-8975: Code Cleanup: Use entryset for map iteration wherever possible.

2019-09-13 Thread GitBox
jpountz commented on a change in pull request #868: LUCENE-8975: Code Cleanup: 
Use entryset for map iteration wherever possible.
URL: https://github.com/apache/lucene-solr/pull/868#discussion_r324101947
 
 

 ##
 File path: lucene/CHANGES.txt
 ##
 @@ -65,6 +65,8 @@ Other
 
 * LUCENE-8768: Fix Javadocs build in Java 11. (Namgyu Kim)
 
+* LUCENE-8975: Code Cleanup: Use entryset for map iteration wherever possible.
+
 
 Review comment:
   Can you move it under `Lucene 8.3.0` instead of `Lucene 9.0.0`? I plan to 
backport this to 8.x.





[GitHub] [lucene-solr] jimczi commented on a change in pull request #877: LUCENE-8978: Maximal Bottom Score Based Early Termination

2019-09-13 Thread GitBox
jimczi commented on a change in pull request #877: LUCENE-8978: Maximal Bottom 
Score Based Early Termination
URL: https://github.com/apache/lucene-solr/pull/877#discussion_r324098325
 
 

 ##
 File path: 
lucene/core/src/java/org/apache/lucene/search/TopScoreDocCollector.java
 ##
 @@ -156,6 +187,10 @@ public void collect(int doc) throws IOException {
   pqTop.score = score;
   pqTop = pq.updateTop();
   updateMinCompetitiveScore(scorer);
 
 Review comment:
   We should propagate the global minimum score to the underlying scorer 
(`Scorer#setMinCompetitiveScore`) too. The optimization above that compares 
the current doc with the global minimum score is a good one, but propagating the 
global minimum score should be more effective since it will allow skipping hits 
directly in the scorer.
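
   A rough sketch of what such propagation could look like. `MinScoreSink` is a hypothetical stand-in for the scorer, and the combination rule is an assumption: ties with the local bottom lose the doc id tiebreak, so the local bound is bumped with `Math.nextUp`, while the global bottom is passed through unchanged because a tie with it may still be competitive.

```java
// Hypothetical stand-in for the scorer's setMinCompetitiveScore hook.
interface MinScoreSink {
  void setMinCompetitiveScore(float minScore);
}

// Sketch only; not the PR's implementation.
final class MinScorePropagator {
  /** Pushes the stricter of the local and global bounds to the scorer and returns it. */
  static float propagate(MinScoreSink scorer, float localBottom, float globalBottom) {
    // Local ties are not competitive, global ties may be (see the tiebreak discussion).
    float min = Math.max(Math.nextUp(localBottom), globalBottom);
    scorer.setMinCompetitiveScore(min);
    return min;
  }
}
```

   The benefit over filtering in `collect` is that the scorer can then skip whole runs of documents instead of scoring each one and discarding it afterwards.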





[GitHub] [lucene-solr] jimczi commented on a change in pull request #877: LUCENE-8978: Maximal Bottom Score Based Early Termination

2019-09-13 Thread GitBox
jimczi commented on a change in pull request #877: LUCENE-8978: Maximal Bottom 
Score Based Early Termination
URL: https://github.com/apache/lucene-solr/pull/877#discussion_r324097168
 
 

 ##
 File path: 
lucene/core/src/java/org/apache/lucene/search/BottomValueChecker.java
 ##
 @@ -0,0 +1,56 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.search;
+
+import java.util.concurrent.atomic.AtomicBoolean;
+import java.util.concurrent.atomic.AtomicInteger;
+
+/**
+ * Maintains the bottom value across multiple collectors
+ */
+abstract class BottomValueChecker {
+  /** Maintains global bottom score as the maximum of all bottom scores */
+  private static class MaximumBottomScoreChecker extends BottomValueChecker {
+    private final AtomicInteger globalBottomValue = new AtomicInteger();
+    private final AtomicBoolean bottomValueAvailable = new AtomicBoolean();
+
+    @Override
+    public void updateThreadLocalBottomValue(Float value) {
+      globalBottomValue.updateAndGet(currentValue ->
+          Float.intBitsToFloat(currentValue) < value ? Float.floatToIntBits(value) : currentValue);
 
 Review comment:
   Instead of using an AtomicInteger, you could maybe use double-checked 
locking with a volatile field? Something like:
   ```
   public void updateThreadLocalBottomValue(float value) {
     if (value <= maxMinScore) {
       return;
     }
     synchronized (this) {
       if (value > maxMinScore) {
         maxMinScore = value;
       }
     }
   }
   ```
   This would avoid the float to int conversion and should limit contention?
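
   For completeness, here is a self-contained version of that idea that can be run on its own. The class name, the `NEGATIVE_INFINITY` starting value and the `demo` driver are assumptions added for illustration; only the update logic mirrors the suggestion above.

```java
// Sketch of a "maximum so far" holder using a volatile field plus double-checked locking.
final class MaxBottomScore {
  // volatile gives visibility across threads; float reads and writes are atomic.
  private volatile float maxMinScore = Float.NEGATIVE_INFINITY; // assumed initial value

  void update(float value) {
    if (value <= maxMinScore) {
      return; // fast path: no lock when the value cannot raise the maximum
    }
    synchronized (this) {
      // Re-check under the lock: another thread may have raised the maximum meanwhile.
      if (value > maxMinScore) {
        maxMinScore = value;
      }
    }
  }

  float get() {
    return maxMinScore;
  }

  /** Hypothetical driver: each thread publishes a disjoint range of values. */
  static float demo(int threads, int perThread) {
    MaxBottomScore shared = new MaxBottomScore();
    Thread[] workers = new Thread[threads];
    for (int t = 0; t < threads; t++) {
      final int base = t * perThread;
      workers[t] = new Thread(() -> {
        for (int i = 0; i < perThread; i++) {
          shared.update(base + i);
        }
      });
      workers[t].start();
    }
    for (Thread worker : workers) {
      try {
        worker.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException(e);
      }
    }
    return shared.get();
  }
}
```

   Because the maximum only ever increases, most calls return on the unlocked fast path once the value has stabilized, which is where the reduced contention would come from.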





[GitHub] [lucene-solr] atris commented on issue #877: LUCENE-8978: Maximal Bottom Score Based Early Termination

2019-09-13 Thread GitBox
atris commented on issue #877: LUCENE-8978: Maximal Bottom Score Based Early 
Termination
URL: https://github.com/apache/lucene-solr/pull/877#issuecomment-531142523
 
 
   I ran luceneutil with concurrent mode and wikimedium2m -- no degradation to 
QPS and some tasks show a reduction of around 30% in latencies. Full results at:
   
   
https://issues.apache.org/jira/browse/LUCENE-8978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16929030#comment-16929030





[GitHub] [lucene-solr] eribeiro commented on a change in pull request #864: SOLR-13101 : Shared storage support in SolrCloud

2019-09-12 Thread GitBox
eribeiro commented on a change in pull request #864: SOLR-13101 : Shared 
storage support in SolrCloud
URL: https://github.com/apache/lucene-solr/pull/864#discussion_r324030543
 
 

 ##
 File path: 
solr/core/src/java/org/apache/solr/store/blob/client/LocalStorageClient.java
 ##
 @@ -0,0 +1,259 @@
+package org.apache.solr.store.blob.client;
+
+import java.io.File;
+import java.io.FileInputStream;
+import java.io.IOException;
+import java.io.InputStream;
+import java.io.PrintWriter;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.nio.file.Paths;
+import java.nio.file.StandardCopyOption;
+import java.util.Collection;
+import java.util.Comparator;
+import java.util.List;
+import java.util.stream.Collectors;
+
+/**
+ * Class that handles reads and writes of solr blob files to the local file 
system.
+ */
+public class LocalStorageClient implements CoreStorageClient {
+  
+  /** The directory on the local file system where blobs will be stored. */
+  public static final String BLOB_STORE_LOCAL_FS_ROOT_DIR_PROPERTY = 
"blob.local.dir";
+  
+  private final String blobStoreRootDir = 
System.getProperty(BLOB_STORE_LOCAL_FS_ROOT_DIR_PROPERTY, 
"/tmp/BlobStoreLocal/");
+
+  public LocalStorageClient() throws IOException {
+File rootDir = new File(blobStoreRootDir);
+rootDir.mkdirs(); // Might create the directory... or not
+if (!rootDir.isDirectory()) {
+  throw new IOException("Can't create local Blob root directory " + 
rootDir.getAbsolutePath());
+}
+  }
+
+  private File getCoreRootDir(String blobName) {
+return new File(BlobClientUtils.concatenatePaths(blobStoreRootDir, 
blobName));
+  }
+
+  @Override
+  public String pushStream(String blobName, InputStream is, long 
contentLength, String fileNamePrefix) throws BlobException {
+try {
+  createCoreStorage(blobName);
+  String blobPath = createNewNonExistingBlob(blobName, fileNamePrefix);
+
+  Files.copy(is, Paths.get(getBlobAbsolutePath(blobPath)), 
StandardCopyOption.REPLACE_EXISTING);
+
+  assert new File(getBlobAbsolutePath(blobPath)).length() == contentLength;
+
+  return blobPath;
+} catch (Exception ex) {
+  throw new BlobException(ex);
+}
+  }
+
+  /**
+   * Picks a unique name for a new blob for the given core.
+   * The current implementation creates a file, but eventually we just pick up 
a random blob name then delegate to S3...
+   * @return the blob file name, including the "path" part of the name
+   */
+  private String createNewNonExistingBlob(String blobName, String 
fileNamePrefix) throws BlobException {
+try {
+  String blobPath = BlobClientUtils.generateNewBlobCorePath(blobName, 
fileNamePrefix);
+  final File blobFile = new File(getBlobAbsolutePath(blobPath));
+  if (blobFile.exists()) {
+// Not expecting this ever to happen. In theory we could just do 
"continue" here to try a new
+// name. For now throwing an exception to make sure we don't run into 
this...
+// continue;
+throw new IllegalStateException("The random file name chosen using 
UUID already exists. Very worrying! " + blobFile.getAbsolutePath());
+  }
+
+  return blobPath;
+} catch (Exception ex) {
+  throw new BlobException(ex);
+}
+  }
+
+  @Override
+  public InputStream pullStream(String blobPath) throws BlobException {
+try {
+  File blobFile = new File(getBlobAbsolutePath(blobPath));
+  return new FileInputStream(blobFile);
+} catch (Exception ex) {
+  throw new BlobException(ex);
+}
+  }
+
+  @Override
+  public void pushCoreMetadata(String sharedStoreName, String 
blobCoreMetadataName, BlobCoreMetadata bcm) throws BlobException {
+try {
+  createCoreStorage(sharedStoreName);
+  ToFromJson converter = new ToFromJson<>();
+  String json = converter.toJson(bcm);
+
+  // Constant path under which the core metadata is stored in the Blob 
store (the only blob stored under a constant path!)
+  String blobMetadataPath = 
getBlobAbsolutePath(getBlobMetadataName(sharedStoreName, blobCoreMetadataName));
+  final File blobMetadataFile = new File(blobMetadataPath); 
+
+  // Writing to the file assumed atomic, the file cannot be observed 
midway. Might not hold here but should be the case
+  // with a real S3 implementation.
+  try (PrintWriter out = new PrintWriter(blobMetadataFile)){
+out.println(json);
+  }  
+} catch (Exception ex) {
+  throw new BlobException(ex);
+}
+  }
+
+  @Override
+  public BlobCoreMetadata pullCoreMetadata(String sharedStoreName, String 
blobCoreMetadataName) throws BlobException {
+try {
+  if (!coreMetadataExists(sharedStoreName, blobCoreMetadataName)) {
+return null;
+  }
+  
+  String blobMetadataPath = 
getBlobAbsolutePath(getBlobMetadataName(sharedStoreName, blobCoreMetadataName));
+  File blobMetadataFile = new File(blobMetadataPath); 
+  
+  String json = new

[GitHub] [lucene-solr] eribeiro commented on a change in pull request #864: SOLR-13101 : Shared storage support in SolrCloud

2019-09-12 Thread GitBox
eribeiro commented on a change in pull request #864: SOLR-13101 : Shared 
storage support in SolrCloud
URL: https://github.com/apache/lucene-solr/pull/864#discussion_r324027378
 
 

 ##
 File path: 
solr/core/src/java/org/apache/solr/store/blob/client/BlobCoreMetadataBuilder.java
 ##
 @@ -0,0 +1,95 @@
+package org.apache.solr.store.blob.client;
+
+import java.util.*;
 
 Review comment:
   We usually avoid wildcard imports in Apache projects, iirc. 





[GitHub] [lucene-solr] eribeiro commented on a change in pull request #864: SOLR-13101 : Shared storage support in SolrCloud

2019-09-12 Thread GitBox
eribeiro commented on a change in pull request #864: SOLR-13101 : Shared 
storage support in SolrCloud
URL: https://github.com/apache/lucene-solr/pull/864#discussion_r323982385
 
 

 ##
 File path: 
solr/core/src/java/org/apache/solr/store/blob/process/CorePullTask.java
 ##
 @@ -0,0 +1,452 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.store.blob.process;
+
+import java.io.File;
+import java.lang.invoke.MethodHandles;
+import java.util.Collection;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.Set;
+
+import org.apache.solr.cloud.ZkController;
+import org.apache.solr.common.cloud.DocCollection;
+import org.apache.solr.common.cloud.Replica;
+import org.apache.solr.core.CoreContainer;
+import org.apache.solr.core.CoreDescriptor;
+import org.apache.solr.core.SolrCore;
+import org.apache.solr.store.blob.client.BlobCoreMetadata;
+import org.apache.solr.store.blob.client.CoreStorageClient;
+import org.apache.solr.store.blob.metadata.CorePushPull;
+import org.apache.solr.store.blob.metadata.ServerSideMetadata;
+import org.apache.solr.store.blob.metadata.SharedStoreResolutionUtil;
+import 
org.apache.solr.store.blob.metadata.SharedStoreResolutionUtil.SharedMetadataResolutionResult;
+import org.apache.solr.store.blob.process.CorePullerFeeder.PullCoreInfo;
+import org.apache.solr.store.blob.provider.BlobStorageProvider;
+import org.apache.solr.store.blob.util.BlobStoreUtils;
+import org.apache.solr.store.blob.util.DeduplicatingList;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import com.google.common.base.Throwables;
+import com.google.common.collect.Maps;
+import com.google.common.collect.Sets;
+
+/**
+ * Code for pulling updates on a specific core to the Blob store. see 
{@CorePushTask} for the push version of this.
+ */
+public class CorePullTask implements DeduplicatingList.Deduplicatable {
+  private static final Logger log = 
LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+
+  /**
+   * Minimum delay between to pull retries for a given core. Setting this 
higher than the push retry to reduce noise
+   * we get from a flood of queries for a stale core
+   * 
+   * TODO: make configurable
+   */
+  private static final long MIN_RETRY_DELAY_MS = 2;
+
+  /** Cores currently being pulled and timestamp of pull start (to identify 
stuck ones in logs) */
+  private static final HashMap pullsInFlight = Maps.newHashMap();
+
+  /** Cores unknown locally that got created as part of the pull process but 
for which no data has been pulled yet
+   * from Blob store. If we ignore this transitory state, these cores can be 
accessed locally and simply look empty.
+   * We'd rather treat threads attempting to access such cores like threads 
attempting to access an unknown core and
+   * do a pull (or more likely wait for an ongoing pull to finish).
+   *
+   * When this lock has to be taken as well as {@link #pullsInFlight}, then 
{@link #pullsInFlight} has to be taken first.
+   * Reading this set implies acquiring the monitor of the set (as if 
@GuardedBy("itself")), but writing to the set
+   * additionally implies holding the {@link #pullsInFlight}. This guarantees 
that while {@link #pullsInFlight}
+   * is held, no element in the set is changing.
+   */
+  private static final Set coresCreatedNotPulledYet = 
Sets.newHashSet();
+
+  private final CoreContainer coreContainer;
+  private final PullCoreInfo pullCoreInfo;
+  private final long queuedTimeMs;
+  private int attempts;
+  private long lastAttemptTimestamp;
+  private final PullCoreCallback callback;
+
+  CorePullTask(CoreContainer coreContainer, PullCoreInfo pullCoreInfo, 
PullCoreCallback callback) {
+this(coreContainer, pullCoreInfo, System.currentTimeMillis(), 0, 0L, 
callback);
+  }
+
+  private CorePullTask(CoreContainer coreContainer, PullCoreInfo pullCoreInfo, 
long queuedTimeMs, int attempts,
+  long lastAttemptTimestamp, PullCoreCallback callback) {
+this.coreContainer = coreContainer;
+this.pullCoreInfo = pullCoreInfo;
+this.queuedTimeMs = queuedTimeMs;
+this.attempts = attempts;
+this.lastAttemptTimestamp = lastAttemptTimestamp;
+this.callback = callback;
+  }
+
+  

[GitHub] [lucene-solr] eribeiro commented on a change in pull request #864: SOLR-13101 : Shared storage support in SolrCloud

2019-09-12 Thread GitBox
eribeiro commented on a change in pull request #864: SOLR-13101 : Shared 
storage support in SolrCloud
URL: https://github.com/apache/lucene-solr/pull/864#discussion_r323967328
 
 

 ##
 File path: 
solr/core/src/java/org/apache/solr/handler/admin/RequestApplyUpdatesOp.java
 ##
 @@ -68,4 +74,20 @@ public void execute(CoreAdminHandler.CallInfo it) throws 
Exception {
   if (it.req != null) it.req.close();
 }
   }
+
+
+  private void pushToSharedStore(SolrCore core) {
+    // Push the index to blob storage before we set our state to ACTIVE
+    CloudDescriptor cloudDesc = core.getCoreDescriptor().getCloudDescriptor();
+    if (cloudDesc.getReplicaType().equals(Replica.Type.SHARED)) {
 
 Review comment:
   `Replica.Type.SHARED` is an enum constant, so this line could be written with `==` as below, right?
   
   ```suggestion
   if (cloudDesc.getReplicaType() == Replica.Type.SHARED) {
   ```
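
A minimal sketch of why `==` is a reasonable choice for enum comparison. `ReplicaType` and `EnumComparisonSketch` are hypothetical stand-ins for illustration and are not part of the PR; the real enum is `Replica.Type`. Enum constants are singletons, so identity comparison gives the same answer as `equals()` for non-null values and additionally tolerates a `null` left-hand side.

```java
// Hypothetical stand-in for Replica.Type, used only for this illustration.
enum ReplicaType { NRT, TLOG, PULL, SHARED }

class EnumComparisonSketch {
  // equals() is invoked on the argument, so a null type throws NullPointerException.
  static boolean isSharedWithEquals(ReplicaType type) {
    return type.equals(ReplicaType.SHARED);
  }

  // == compares references; each enum constant exists exactly once, so this is
  // equivalent for non-null values and simply returns false for null.
  static boolean isSharedWithIdentity(ReplicaType type) {
    return type == ReplicaType.SHARED;
  }

  public static void main(String[] args) {
    System.out.println(isSharedWithIdentity(ReplicaType.SHARED)); // true
    System.out.println(isSharedWithIdentity(null));               // false
  }
}
```

With `==` the compiler also rejects a comparison between unrelated enum types, which `equals(Object)` would silently accept.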


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] [lucene-solr] eribeiro commented on a change in pull request #864: SOLR-13101 : Shared storage support in SolrCloud

2019-09-12 Thread GitBox
eribeiro commented on a change in pull request #864: SOLR-13101 : Shared 
storage support in SolrCloud
URL: https://github.com/apache/lucene-solr/pull/864#discussion_r323982385
 
 

 ##
 File path: 
solr/core/src/java/org/apache/solr/store/blob/process/CorePullTask.java
 ##
 @@ -0,0 +1,452 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.store.blob.process;
+
+import java.io.File;
+import java.lang.invoke.MethodHandles;
+import java.util.Collection;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.Set;
+
+import org.apache.solr.cloud.ZkController;
+import org.apache.solr.common.cloud.DocCollection;
+import org.apache.solr.common.cloud.Replica;
+import org.apache.solr.core.CoreContainer;
+import org.apache.solr.core.CoreDescriptor;
+import org.apache.solr.core.SolrCore;
+import org.apache.solr.store.blob.client.BlobCoreMetadata;
+import org.apache.solr.store.blob.client.CoreStorageClient;
+import org.apache.solr.store.blob.metadata.CorePushPull;
+import org.apache.solr.store.blob.metadata.ServerSideMetadata;
+import org.apache.solr.store.blob.metadata.SharedStoreResolutionUtil;
+import 
org.apache.solr.store.blob.metadata.SharedStoreResolutionUtil.SharedMetadataResolutionResult;
+import org.apache.solr.store.blob.process.CorePullerFeeder.PullCoreInfo;
+import org.apache.solr.store.blob.provider.BlobStorageProvider;
+import org.apache.solr.store.blob.util.BlobStoreUtils;
+import org.apache.solr.store.blob.util.DeduplicatingList;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import com.google.common.base.Throwables;
+import com.google.common.collect.Maps;
+import com.google.common.collect.Sets;
+
+/**
+ * Code for pulling updates on a specific core to the Blob store. see 
{@CorePushTask} for the push version of this.
+ */
+public class CorePullTask implements DeduplicatingList.Deduplicatable {
+  private static final Logger log = 
LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+
+  /**
+   * Minimum delay between to pull retries for a given core. Setting this 
higher than the push retry to reduce noise
+   * we get from a flood of queries for a stale core
+   * 
+   * TODO: make configurable
+   */
+  private static final long MIN_RETRY_DELAY_MS = 2;
+
+  /** Cores currently being pulled and timestamp of pull start (to identify 
stuck ones in logs) */
+  private static final HashMap pullsInFlight = Maps.newHashMap();
+
+  /** Cores unknown locally that got created as part of the pull process but 
for which no data has been pulled yet
+   * from Blob store. If we ignore this transitory state, these cores can be 
accessed locally and simply look empty.
+   * We'd rather treat threads attempting to access such cores like threads 
attempting to access an unknown core and
+   * do a pull (or more likely wait for an ongoing pull to finish).
+   *
+   * When this lock has to be taken as well as {@link #pullsInFlight}, then 
{@link #pullsInFlight} has to be taken first.
+   * Reading this set implies acquiring the monitor of the set (as if 
@GuardedBy("itself")), but writing to the set
+   * additionally implies holding the {@link #pullsInFlight}. This guarantees 
that while {@link #pullsInFlight}
+   * is held, no element in the set is changing.
+   */
+  private static final Set coresCreatedNotPulledYet = 
Sets.newHashSet();
+
+  private final CoreContainer coreContainer;
+  private final PullCoreInfo pullCoreInfo;
+  private final long queuedTimeMs;
+  private int attempts;
+  private long lastAttemptTimestamp;
+  private final PullCoreCallback callback;
+
+  CorePullTask(CoreContainer coreContainer, PullCoreInfo pullCoreInfo, 
PullCoreCallback callback) {
+this(coreContainer, pullCoreInfo, System.currentTimeMillis(), 0, 0L, 
callback);
+  }
+
+  private CorePullTask(CoreContainer coreContainer, PullCoreInfo pullCoreInfo, 
long queuedTimeMs, int attempts,
+  long lastAttemptTimestamp, PullCoreCallback callback) {
+this.coreContainer = coreContainer;
+this.pullCoreInfo = pullCoreInfo;
+this.queuedTimeMs = queuedTimeMs;
+this.attempts = attempts;
+this.lastAttemptTimestamp = lastAttemptTimestamp;
+this.callback = callback;
+  }
+
+  

[GitHub] [lucene-solr] eribeiro commented on a change in pull request #864: SOLR-13101 : Shared storage support in SolrCloud

2019-09-12 Thread GitBox
eribeiro commented on a change in pull request #864: SOLR-13101 : Shared 
storage support in SolrCloud
URL: https://github.com/apache/lucene-solr/pull/864#discussion_r324026158
 
 

 ##
 File path: 
solr/core/src/java/org/apache/solr/store/blob/client/BlobCoreMetadata.java
 ##
 @@ -0,0 +1,284 @@
+package org.apache.solr.store.blob.client;
+
+import java.util.Arrays;
+import java.util.HashSet;
+import java.util.Set;
+import java.util.UUID;
+
+/**
+ * Object defining metadata stored in blob store for a Shared Collection shard 
and its builders.  
+ * This metadata includes all actual segment files as well as the segments_N 
file of the commit point.
+ * 
+ * This object is serialized to/from Json and stored in the blob store as a 
blob.
+ */
+public class BlobCoreMetadata {
+
+/**
+ * Name of the shard index data that is shared by all replicas belonging 
to that shard. This 
+ * name is to decouple the core name that Solr manages from the name of 
the core on blob store. 
+ */
+private final String sharedBlobName;
+
+/**
+ * Unique identifier of this metadata, that changes on every update to the 
metadata (except generating a new corrupt metadata
+ * through {@link #getCorruptOf}).
+ */
+private final String uniqueIdentifier;
+
+/**
+ * Indicates that a Solr (search) server pulled this core and was then 
unable to open or use it. This flag is used as
+ * an indication to servers pushing blobs for that core into Blob Store to 
push a complete set of files if they have
+ * a locally working copy rather than just diffs (files missing on Blob 
Store).
+ */
+private final boolean isCorrupt;
+
+/**
+ * Indicates that this core has been deleted by the client. This flag is 
used as a marker to prevent other servers
+ * from pushing their version of this core to blob and to allow local copy 
cleanup.
+ */
+private final boolean isDeleted;
+
+/**
+ * The array of files that constitute the current commit point of the core 
(as known by the Blob store).
+ * This array is not ordered! There are no duplicate entries in it either 
(see how it's built in {@link BlobCoreMetadataBuilder}).
+ */
+private final BlobFile[] blobFiles;
+
+/**
+ * Files marked for delete but not yet removed from the Blob store. Each 
such file contains information indicating when
+ * it was marked for delete so we can actually remove the corresponding 
blob (and the entry from this array in the metadata)
+ * when it's safe to do so even if there are (unexpected) conflicting 
updates to the blob store by multiple solr servers...
+ * TODO: we might want to separate the metadata blob with the deletes as 
it's not required to always fetch the delete list when checking freshness of 
local core...
+ */
+private final BlobFileToDelete[] blobFilesToDelete;
+
+/**
+ * This is the constructor called by {@link BlobCoreMetadataBuilder}.
+ * It always builds non "isCorrupt" and non "isDeleted" metadata. 
+ * The only way to build an instance of "isCorrupt" metadata is to use 
{@link #getCorruptOf} and for "isDeleted" use {@link #getDeletedOf()}
+ */
+BlobCoreMetadata(String sharedBlobName, BlobFile[] blobFiles, 
BlobFileToDelete[] blobFilesToDelete) {
+this(sharedBlobName, blobFiles, blobFilesToDelete, 
UUID.randomUUID().toString(), false,
+false);
+}
+
+private BlobCoreMetadata(String sharedBlobName, BlobFile[] blobFiles, 
BlobFileToDelete[] blobFilesToDelete, 
+String uniqueIdentifier, boolean isCorrupt, boolean isDeleted) {
+this.sharedBlobName = sharedBlobName;
+this.blobFiles = blobFiles;
+this.blobFilesToDelete = blobFilesToDelete;
+this.uniqueIdentifier = uniqueIdentifier;
+this.isCorrupt = isCorrupt;
+this.isDeleted = isDeleted;
+}
+
+/**
+ * Given a non corrupt {@link BlobCoreMetadata} instance, creates an 
equivalent one based on it but marked as corrupt.
+ * The new instance keeps all the rest of the metadata unchanged, 
including the {@link #uniqueIdentifier}.
+ */
+public BlobCoreMetadata getCorruptOf() {
+assert !isCorrupt;
+return new BlobCoreMetadata(sharedBlobName, blobFiles, 
blobFilesToDelete, uniqueIdentifier, true, isDeleted);
+}
+
+/**
+ * Given a {@link BlobCoreMetadata} instance, creates an equivalent one 
based on it but marked as deleted.
+ * 
+ * The new instance keeps all the rest of the metadata unchanged, 
including the {@link #uniqueIdentifier}.
+ */
+public BlobCoreMetadata getDeletedOf() {
+assert !isDeleted;
+return new BlobCoreMetadata(sharedBlobName, blobFiles, 
blobFilesToDelete, uniqueIdentifier, isCorrupt, true);
+}
+
+/**
+ * Returns true if the Blob metadata was marked as deleted
+ */
+public boolean getIsDeleted() {
+return isDeleted;
+}
+
+/**
+ 

[GitHub] [lucene-solr] eribeiro commented on a change in pull request #864: SOLR-13101 : Shared storage support in SolrCloud

2019-09-12 Thread GitBox
eribeiro commented on a change in pull request #864: SOLR-13101 : Shared 
storage support in SolrCloud
URL: https://github.com/apache/lucene-solr/pull/864#discussion_r323984413
 
 

 ##
 File path: 
solr/core/src/java/org/apache/solr/cloud/autoscaling/IndexSizeTrigger.java
 ##
 @@ -31,6 +31,7 @@
 import java.util.concurrent.ConcurrentHashMap;
 import java.util.concurrent.TimeUnit;
 import java.util.concurrent.atomic.AtomicLong;
+import java.util.Locale;
 
 Review comment:
   Is this import being used?





[GitHub] [lucene-solr] eribeiro commented on a change in pull request #864: SOLR-13101 : Shared storage support in SolrCloud

2019-09-12 Thread GitBox
eribeiro commented on a change in pull request #864: SOLR-13101 : Shared 
storage support in SolrCloud
URL: https://github.com/apache/lucene-solr/pull/864#discussion_r323981156
 
 

 ##
 File path: 
solr/core/src/java/org/apache/solr/store/blob/process/CorePullTask.java
 ##
 @@ -0,0 +1,452 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.store.blob.process;
+
+import java.io.File;
+import java.lang.invoke.MethodHandles;
+import java.util.Collection;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.Set;
+
+import org.apache.solr.cloud.ZkController;
+import org.apache.solr.common.cloud.DocCollection;
+import org.apache.solr.common.cloud.Replica;
+import org.apache.solr.core.CoreContainer;
+import org.apache.solr.core.CoreDescriptor;
+import org.apache.solr.core.SolrCore;
+import org.apache.solr.store.blob.client.BlobCoreMetadata;
+import org.apache.solr.store.blob.client.CoreStorageClient;
+import org.apache.solr.store.blob.metadata.CorePushPull;
+import org.apache.solr.store.blob.metadata.ServerSideMetadata;
+import org.apache.solr.store.blob.metadata.SharedStoreResolutionUtil;
+import 
org.apache.solr.store.blob.metadata.SharedStoreResolutionUtil.SharedMetadataResolutionResult;
+import org.apache.solr.store.blob.process.CorePullerFeeder.PullCoreInfo;
+import org.apache.solr.store.blob.provider.BlobStorageProvider;
+import org.apache.solr.store.blob.util.BlobStoreUtils;
+import org.apache.solr.store.blob.util.DeduplicatingList;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import com.google.common.base.Throwables;
+import com.google.common.collect.Maps;
+import com.google.common.collect.Sets;
+
+/**
+ * Code for pulling updates on a specific core to the Blob store. see 
{@CorePushTask} for the push version of this.
+ */
+public class CorePullTask implements DeduplicatingList.Deduplicatable {
+  private static final Logger log = 
LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+
+  /**
+   * Minimum delay between to pull retries for a given core. Setting this 
higher than the push retry to reduce noise
+   * we get from a flood of queries for a stale core
+   * 
+   * TODO: make configurable
+   */
+  private static final long MIN_RETRY_DELAY_MS = 2;
+
+  /** Cores currently being pulled and timestamp of pull start (to identify 
stuck ones in logs) */
+  private static final HashMap pullsInFlight = Maps.newHashMap();
+
+  /** Cores unknown locally that got created as part of the pull process but 
for which no data has been pulled yet
+   * from Blob store. If we ignore this transitory state, these cores can be 
accessed locally and simply look empty.
+   * We'd rather treat threads attempting to access such cores like threads 
attempting to access an unknown core and
+   * do a pull (or more likely wait for an ongoing pull to finish).
+   *
+   * When this lock has to be taken as well as {@link #pullsInFlight}, then 
{@link #pullsInFlight} has to be taken first.
+   * Reading this set implies acquiring the monitor of the set (as if 
@GuardedBy("itself")), but writing to the set
+   * additionally implies holding the {@link #pullsInFlight}. This guarantees 
that while {@link #pullsInFlight}
+   * is held, no element in the set is changing.
+   */
+  private static final Set coresCreatedNotPulledYet = 
Sets.newHashSet();
+
+  private final CoreContainer coreContainer;
+  private final PullCoreInfo pullCoreInfo;
+  private final long queuedTimeMs;
+  private int attempts;
+  private long lastAttemptTimestamp;
+  private final PullCoreCallback callback;
+
+  CorePullTask(CoreContainer coreContainer, PullCoreInfo pullCoreInfo, 
PullCoreCallback callback) {
+this(coreContainer, pullCoreInfo, System.currentTimeMillis(), 0, 0L, 
callback);
+  }
+
+  private CorePullTask(CoreContainer coreContainer, PullCoreInfo pullCoreInfo, 
long queuedTimeMs, int attempts,
+  long lastAttemptTimestamp, PullCoreCallback callback) {
+this.coreContainer = coreContainer;
+this.pullCoreInfo = pullCoreInfo;
+this.queuedTimeMs = queuedTimeMs;
+this.attempts = attempts;
+this.lastAttemptTimestamp = lastAttemptTimestamp;
+this.callback = callback;
+  }
+
+  

[GitHub] [lucene-solr] eribeiro commented on a change in pull request #864: SOLR-13101 : Shared storage support in SolrCloud

2019-09-12 Thread GitBox
eribeiro commented on a change in pull request #864: SOLR-13101 : Shared 
storage support in SolrCloud
URL: https://github.com/apache/lucene-solr/pull/864#discussion_r323974231
 
 

 ##
 File path: 
solr/core/src/java/org/apache/solr/store/blob/provider/BlobStorageProvider.java
 ##
 @@ -0,0 +1,62 @@
+package org.apache.solr.store.blob.provider;
+
+import java.io.IOException;
+import java.lang.invoke.MethodHandles;
+
+import org.apache.solr.common.SolrException;
+import org.apache.solr.store.blob.client.BlobException;
+import org.apache.solr.store.blob.client.BlobStorageClientBuilder;
+import org.apache.solr.store.blob.client.BlobstoreProviderType;
+import org.apache.solr.store.blob.client.CoreStorageClient;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import com.amazonaws.SdkClientException;
+
+/**
+ * Class that provides access to the shared storage client (blob client) and
+ * handles initiation of such client. This class serves as the provider for all
+ * blob store communication channels.
+ */
+public class BlobStorageProvider {
+
+  private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+
+  private CoreStorageClient storageClient;
 
 Review comment:
   ```suggestion
 private volatile CoreStorageClient storageClient;
   ```
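
A minimal sketch of why `volatile` matters here, assuming the field is initialized lazily with an unsynchronized fast-path read followed by a synchronized slow path (double-checked locking). `LazyClientHolder` and its `Supplier` factory are hypothetical names for illustration, not code from the PR. Without `volatile`, the Java memory model allows a thread on the fast path to observe a non-null reference to an object whose construction is not yet visible to it.

```java
import java.util.function.Supplier;

// Hypothetical holder illustrating double-checked locking on a volatile field.
class LazyClientHolder<T> {
  // volatile guarantees that a thread seeing a non-null reference also sees
  // the fully constructed object behind it.
  private volatile T client;
  private final Supplier<T> factory;

  LazyClientHolder(Supplier<T> factory) {
    this.factory = factory;
  }

  T get() {
    T local = client;            // one volatile read on the fast path
    if (local != null) {
      return local;
    }
    synchronized (this) {
      local = client;            // re-check: another thread may have initialized it
      if (local == null) {
        local = factory.get();
        client = local;          // volatile write publishes the instance
      }
      return local;
    }
  }
}
```

Copying the field into a local variable keeps the fast path to a single volatile read.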





[GitHub] [lucene-solr] eribeiro commented on a change in pull request #864: SOLR-13101 : Shared storage support in SolrCloud

2019-09-12 Thread GitBox
eribeiro commented on a change in pull request #864: SOLR-13101 : Shared 
storage support in SolrCloud
URL: https://github.com/apache/lucene-solr/pull/864#discussion_r323972376
 
 

 ##
 File path: 
solr/core/src/java/org/apache/solr/store/blob/client/S3StorageClient.java
 ##
 @@ -0,0 +1,385 @@
+package org.apache.solr.store.blob.client;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.util.*;
+import java.util.function.Consumer;
+import java.util.stream.Collectors;
+
+import org.apache.solr.common.StringUtils;
+import org.apache.solr.util.FileUtils;
+
+import com.amazonaws.AmazonClientException;
+import com.amazonaws.AmazonServiceException;
+import com.amazonaws.regions.Regions;
+import com.amazonaws.services.s3.AmazonS3;
+import com.amazonaws.services.s3.AmazonS3ClientBuilder;
+import com.amazonaws.services.s3.model.*;
+import com.amazonaws.services.s3.model.DeleteObjectsRequest.KeyVersion;
+import com.google.common.collect.Iterables;
+
+import org.apache.solr.store.blob.client.BlobCoreMetadata;
+import org.apache.solr.store.blob.client.BlobClientUtils;
+import org.apache.solr.store.blob.client.ToFromJson;
+
+/**
+ * This class implements an AmazonS3 client for reading and writing search 
index
+ * data to AWS S3.
+ */
+public class S3StorageClient implements CoreStorageClient {
+
+  private final AmazonS3 s3Client;
+
+  /** The S3 bucket where we write all of our blobs to */
+  private final String blobBucketName;
+
+  // S3 has a hard limit of 1000 keys per batch delete request
+  private static final int MAX_KEYS_PER_BATCH_DELETE = 1000;
+
+  /**
+   * Construct a new S3StorageClient that is an implementation of the
+   * CoreStorageClient using AWS S3 as the underlying blob store service 
provider.
+   */
+  public S3StorageClient() throws IOException {
+String credentialsFilePath = 
AmazonS3Configs.CREDENTIALS_FILE_PATH.getValue();
+
+// requires credentials file on disk to authenticate with S3
+if (!FileUtils.fileExists(credentialsFilePath)) {
+  throw new IOException("Credentials file does not exist in " + 
credentialsFilePath);
+}
+
+/*
+ * default s3 client builder loads credentials from disk and handles token 
refreshes
+ */
+AmazonS3ClientBuilder builder = AmazonS3ClientBuilder.standard();
+s3Client = builder
+.withPathStyleAccessEnabled(true)
+.withRegion(Regions.fromName(AmazonS3Configs.REGION.getValue()))
+.build();
+
+blobBucketName = AmazonS3Configs.BUCKET_NAME.getValue();
+  }
+
+  @Override
+  public void pushCoreMetadata(String sharedStoreName, String 
blobCoreMetadataName, BlobCoreMetadata bcm)
+  throws BlobException {
+try {
+  ToFromJson converter = new ToFromJson<>();
+  String json = converter.toJson(bcm);
+
+  String blobCoreMetadataPath = getBlobMetadataPath(sharedStoreName, 
blobCoreMetadataName);
+  /*
+   * Encodes contents of the string into an S3 object. If no exception is 
thrown
+   * then the object is guaranteed to have been stored
+   */
+  s3Client.putObject(blobBucketName, blobCoreMetadataPath, json);
+} catch (AmazonServiceException ase) {
+  throw handleAmazonServiceException(ase);
+} catch (AmazonClientException ace) {
+  throw new BlobClientException(ace);
+} catch (Exception ex) {
+  throw new BlobException(ex);
+}
+  }
+
+  @Override
+  public BlobCoreMetadata pullCoreMetadata(String sharedStoreName, String 
blobCoreMetadataName) throws BlobException {
+try {
+  String blobCoreMetadataPath = getBlobMetadataPath(sharedStoreName, 
blobCoreMetadataName);
+
+  if (!coreMetadataExists(sharedStoreName, blobCoreMetadataName)) {
+return null;
+  }
+
+  String decodedJson = s3Client.getObjectAsString(blobBucketName, 
blobCoreMetadataPath);
+  ToFromJson converter = new ToFromJson<>();
+  return converter.fromJson(decodedJson, BlobCoreMetadata.class);
+} catch (AmazonServiceException ase) {
+  throw handleAmazonServiceException(ase);
+} catch (AmazonClientException ace) {
+  throw new BlobClientException(ace);
+} catch (Exception ex) {
+  throw new BlobException(ex);
+}
+  }
+
+  @Override
+  public InputStream pullStream(String path) throws BlobException {
+try {
+  S3Object requestedObject = s3Client.getObject(blobBucketName, path);
+  // This InputStream instance needs to be closed by the caller
+  return requestedObject.getObjectContent();
+} catch (AmazonServiceException ase) {
+  throw handleAmazonServiceException(ase);
+} catch (AmazonClientException ace) {
+  throw new BlobClientException(ace);
+} catch (Exception ex) {
+  throw new BlobException(ex);
+}
+  }
+
+  @Override
+  public String pushStream(String blobName, InputStream is, long 
contentLength, String fileNamePrefix)
+  throws BlobException {
+try {
+  /*
+   * This object metadata is associated per 

[GitHub] [lucene-solr] eribeiro commented on a change in pull request #864: SOLR-13101 : Shared storage support in SolrCloud

2019-09-12 Thread GitBox
eribeiro commented on a change in pull request #864: SOLR-13101 : Shared 
storage support in SolrCloud
URL: https://github.com/apache/lucene-solr/pull/864#discussion_r323974906
 
 

 ##
 File path: 
solr/core/src/java/org/apache/solr/store/blob/provider/BlobStorageProvider.java
 ##
 @@ -0,0 +1,62 @@
+package org.apache.solr.store.blob.provider;
+
+import java.io.IOException;
+import java.lang.invoke.MethodHandles;
+
+import org.apache.solr.common.SolrException;
+import org.apache.solr.store.blob.client.BlobException;
+import org.apache.solr.store.blob.client.BlobStorageClientBuilder;
+import org.apache.solr.store.blob.client.BlobstoreProviderType;
+import org.apache.solr.store.blob.client.CoreStorageClient;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import com.amazonaws.SdkClientException;
+
+/**
+ * Class that provides access to the shared storage client (blob client) and
+ * handles initiation of such client. This class serves as the provider for all
+ * blob store communication channels.
+ */
+public class BlobStorageProvider {
+
+  private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+
+  private CoreStorageClient storageClient;
+
+  public CoreStorageClient getClient() {
+    if (storageClient != null) {
+      return storageClient;
+    }
+
+    return getClient(BlobstoreProviderType.getConfiguredProvider());
+  }
+
+  private synchronized CoreStorageClient getClient(BlobstoreProviderType blobStorageProviderType) {
+    if (storageClient != null) {
 
 Review comment:
   Lines 37-39 duplicate lines 29-31. Maybe remove the redundant lines in the 
`getClient()` method? 
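
A minimal sketch of what keeping only the synchronized check could look like. `SynchronizedLazyClient` and the `Supplier` factory are hypothetical names for illustration, not code from the PR. The check inside the synchronized method is the one that prevents two threads from both creating a client, so it is the unsynchronized check that can be dropped safely. The trade-off is that every call then acquires the lock.

```java
import java.util.function.Supplier;

// Hypothetical holder with a single null check, guarded by the instance monitor.
class SynchronizedLazyClient<T> {
  private T client;                    // guarded by "this"
  private final Supplier<T> factory;

  SynchronizedLazyClient(Supplier<T> factory) {
    this.factory = factory;
  }

  // All reads and writes of the field happen under the lock, so no volatile
  // modifier is needed and only one thread can ever run the factory.
  synchronized T get() {
    if (client == null) {
      client = factory.get();
    }
    return client;
  }
}
```

If lock acquisition on every call turns out to be measurable, keeping both checks is still correct, provided the field is declared `volatile`.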





[GitHub] [lucene-solr] eribeiro commented on a change in pull request #864: SOLR-13101 : Shared storage support in SolrCloud

2019-09-12 Thread GitBox
eribeiro commented on a change in pull request #864: SOLR-13101 : Shared 
storage support in SolrCloud
URL: https://github.com/apache/lucene-solr/pull/864#discussion_r323973484
 
 

 ##
 File path: 
solr/core/src/java/org/apache/solr/store/blob/process/CorePullTracker.java
 ##
 @@ -0,0 +1,208 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.store.blob.process;
+
+import java.io.IOException;
+import java.lang.invoke.MethodHandles;
+import java.util.Map;
+
+import javax.servlet.http.HttpServletRequest;
+
+import org.apache.solr.client.solrj.cloud.autoscaling.VersionedData;
+import org.apache.solr.common.SolrException;
+import org.apache.solr.common.cloud.DocCollection;
+import org.apache.solr.common.cloud.Slice;
+import org.apache.solr.common.cloud.ZkStateReader;
+import org.apache.solr.common.util.Utils;
+import org.apache.solr.core.CoreContainer;
+import org.apache.solr.core.SolrCore;
+import org.apache.solr.servlet.SolrRequestParsers;
+import org.apache.solr.store.blob.metadata.PushPullData;
+import org.apache.solr.store.blob.process.CorePullerFeeder.PullCoreInfo;
+import org.apache.solr.store.blob.util.BlobStoreUtils;
+import org.apache.solr.store.blob.util.DeduplicatingList;
+import org.apache.solr.store.shared.metadata.SharedShardMetadataController;
+
+import com.google.common.annotations.VisibleForTesting;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Tracks cores that are being queried and if necessary enqueues them for pull 
from blob store
+ */
+public class CorePullTracker {
+  private static final Logger log = 
LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+
+  static private final int TRACKING_LIST_MAX_SIZE = 50;
+
+  private final DeduplicatingList coresToPull;
+
+  /* Config value that enables core pulls */
+  @VisibleForTesting
+  public static boolean isBackgroundPullEnabled = true; // TODO : make 
configurable
+
+  // Let's define these paths in yet another place in the code...
+  private static final String QUERY_PATH_PREFIX = "/select";
+  private static final String SPELLCHECK_PATH_PREFIX = "/spellcheck";
+  private static final String RESULTPROMOTION_PATH_PREFIX = 
"/result_promotion";
+  private static final String INDEXLOOKUP_PATH_PREFIX = "/indexLookup";
+  private static final String HIGHLIGHT_PATH_PREFIX = "/highlight";
+  private static final String BACKUP_PATH_PREFIX = "/backup";
+
+  public CorePullTracker() {
+coresToPull = new DeduplicatingList<>(TRACKING_LIST_MAX_SIZE, new 
CorePullerFeeder.PullCoreInfoMerger());
+  }
+
+  /**
+   * If the local core is stale, enqueues it to be pulled in from blob
+   * TODO: add stricter checks so that we don't pull on every request
+   */
+  public void enqueueForPullIfNecessary(String requestPath, SolrCore core, 
String collectionName,
+  CoreContainer cores) throws IOException, SolrException {
+// Initialize variables
+String coreName = core.getName();
+String shardName = 
core.getCoreDescriptor().getCloudDescriptor().getShardId();
+SharedShardMetadataController sharedShardMetadataController = 
cores.getSharedStoreManager().getSharedShardMetadataController();
+DocCollection collection = 
cores.getZkController().getClusterState().getCollection(collectionName);
+
+Slice shard = collection.getSlicesMap().get(shardName);
+if (shard != null) {
+  try {
+if (!collection.getActiveSlices().contains(shard)) {
+  // unclear if there are side effects but logging for now
+  log.warn("Enqueueing a pull for shard " + shardName + " that is 
inactive!");
+}
+log.info("Enqueue a pull for collection=" + collectionName + " shard=" 
+ shardName + " coreName=" + coreName);
+// creates the metadata node if it doesn't exist
+sharedShardMetadataController.ensureMetadataNodeExists(collectionName, 
shardName);
+
+/*
+ * Get the metadataSuffix value from ZooKeeper or from a cache if an 
entry exists for the 
+ * given collection and shardName. If the leader has already changed, 
the conditional update
+ * later will fail and invalidate the cache entry if it exists. 
+ */
+VersionedD
