[GitHub] [lucene-solr] dweiss commented on pull request #1721: LUCENE-9439: match region highlighter components

2020-08-12 Thread GitBox


dweiss commented on pull request #1721:
URL: https://github.com/apache/lucene-solr/pull/1721#issuecomment-673296232


   I will split the original issue into two - one adjusting Matches API 
slightly and the other with highlighter components on top of that API; I'll 
provide some more high-level description then. In short: it's a different angle 
to highlighting query hits. I think it's simpler internally (relies on matches 
API almost entirely - it has no knowledge of query types, etc, occasionally 
reverse-engineering offsets from positions). And from the outside it's really 
flexible because it gives you all the controls to build your own highlighting 
code (however you wish it to work). It doesn't have all the bells and whistles 
of UnifiedHighlighter but at the same time there's nothing preventing you from 
adding these (snippet scoring is very simple, for example).
   
   If you take a look at the test case you'll see that it works really nicely, in 
spite of not knowing anything about underlying query types. The remaining 
classes (passage selector, overlap conflict resolution) are auxiliary classes I 
already had for my own needs (again - none of the existing code in Lucene 
fulfilled all my expectations) so I added them in, perhaps they'll be useful.
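
For readers unfamiliar with the term, overlap conflict resolution over match offsets can be sketched in plain Java roughly as follows. This is not the code from the PR: `OffsetRanges`, `Range` and `mergeOverlaps` are hypothetical names, and merging overlapping ranges into one is only one possible policy (dropping or truncating the later range are others).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of the PR: merges overlapping highlight ranges.
final class OffsetRanges {
  /** A half-open character range [from, to) within the field value. */
  static final class Range {
    final int from;
    final int to;

    Range(int from, int to) {
      this.from = from;
      this.to = to;
    }
  }

  /** Returns ranges sorted by start offset, with overlapping ranges merged into one. */
  static List<Range> mergeOverlaps(List<Range> ranges) {
    List<Range> sorted = new ArrayList<>(ranges);
    sorted.sort(Comparator.comparingInt((Range r) -> r.from));

    List<Range> merged = new ArrayList<>();
    for (Range r : sorted) {
      Range last = merged.isEmpty() ? null : merged.get(merged.size() - 1);
      if (last != null && r.from < last.to) {
        // Overlap: extend the previous range instead of emitting a nested one.
        merged.set(merged.size() - 1, new Range(last.from, Math.max(last.to, r.to)));
      } else {
        merged.add(r);
      }
    }
    return merged;
  }
}
```

Merged ranges never nest, so highlight markup built from them stays well-formed.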
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9450) Taxonomy index should use DocValues not StoredFields

2020-08-12 Thread Gautam Worah (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176793#comment-17176793
 ] 

Gautam Worah commented on LUCENE-9450:
--

Thanks for reviewing the PR.

Based on feedback received from the previous revision I've posted a new PR 
(revision 2).

(Copied from a comment on GitHub)

The new PR has the following changes:
 # Use {{ordinal}} instead of {{catIDInteger}} (IntelliJ says that boxing is 
anyways not needed, perhaps we can remove?)
 # Use the correct {{values}} instance that has advanced to the correct 
{{docId}}
 # Use {{ReaderUtil}} to get to the {{leaf}} and then use {{LeafReader}} 
instead of using the higher level {{MultiDocValues}} call

TEST:
 {{ant test}} in the {{lucene-solr/lucene/facet}} directory (passes 
successfully)
 {{ant precommit}}

I've not added any new tests because this PR changes a low level implementation 
detail and the current tests already cover this (we have a simple {{assert}} 
that checks that {{.advanceExact}} returns {{true}}).
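
To illustrate the third point: per-leaf doc values are addressed by a leaf-local doc ID, so the global ordinal first has to be mapped to its leaf and then rebased. In Lucene that mapping is done with {{ReaderUtil.subIndex}} and the leaf's {{docBase}}. The sketch below reproduces only the arithmetic in plain Java, without Lucene classes. {{LeafLookup}}, {{leafIndex}} and {{localDocId}} are hypothetical names, and it assumes the doc bases are distinct (no empty leaves).

```java
import java.util.Arrays;

// Plain-Java illustration only; LeafLookup and its methods are hypothetical names.
final class LeafLookup {
  /**
   * Returns the index of the leaf containing the global doc ID, given each
   * leaf's starting doc ID (docBase) in ascending order, starting at 0.
   * Assumes all doc bases are distinct.
   */
  static int leafIndex(int globalDocId, int[] docBases) {
    int pos = Arrays.binarySearch(docBases, globalDocId);
    // On a miss, binarySearch returns -(insertionPoint) - 1; the owning leaf
    // is the one just before the insertion point.
    return pos >= 0 ? pos : -pos - 2;
  }

  /** Converts a global doc ID into a doc ID local to its leaf. */
  static int localDocId(int globalDocId, int[] docBases) {
    return globalDocId - docBases[leafIndex(globalDocId, docBases)];
  }
}
```

The local ID is what the leaf-level {{advanceExact}} call would receive in place of the global ordinal.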


> Taxonomy index should use DocValues not StoredFields
> 
>
> Key: LUCENE-9450
> URL: https://issues.apache.org/jira/browse/LUCENE-9450
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Affects Versions: 8.5.2
>Reporter: Gautam Worah
>Priority: Minor
>  Labels: performance
> Attachments: wip_taxonomy_patch
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The taxonomy index that maps binning labels to ordinals was created before 
> Lucene added BinaryDocValues.
> I've attached a WIP patch (does not pass tests currently)
> Issue suggested by [~mikemccand]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)




[GitHub] [lucene-solr] gautamworah96 commented on a change in pull request #1733: LUCENE-9450 Use BinaryDocValues in the taxonomy writer

2020-08-12 Thread GitBox


gautamworah96 commented on a change in pull request #1733:
URL: https://github.com/apache/lucene-solr/pull/1733#discussion_r469727743



##
File path: 
lucene/facet/src/java/org/apache/lucene/facet/taxonomy/directory/DirectoryTaxonomyReader.java
##
@@ -323,8 +323,10 @@ public FacetLabel getPath(int ordinal) throws IOException {
   }
 }
 
-Document doc = indexReader.document(ordinal);
-FacetLabel ret = new 
FacetLabel(FacetsConfig.stringToPath(doc.get(Consts.FULL)));
+boolean found = MultiDocValues.getBinaryValues(indexReader, 
Consts.FULL).advanceExact(catIDInteger);

Review comment:
   Thank you for looking at it so closely (and helping in debugging).
   
   The new PR has the following changes:
   1. Use `ordinal`  instead of `catIDInteger` (IntelliJ says that boxing is 
anyways not needed, perhaps we can remove?)
   2. Use the correct `values` instance that has advanced to the correct `docId`
   3. Use `ReaderUtil` to get to the `leaf` and then use `LeafReader` instead 
of using the higher level `MultiDocValues` call
   
   TEST:
   `ant test` in the `lucene-solr/lucene/facet` directory (passes successfully)
   `ant precommit`
   
   I've not added any new tests because this PR changes a low level 
implementation detail and the current tests already cover this
   
   The next step is to run lucene benchmarks!
   








[GitHub] [lucene-solr] dweiss commented on a change in pull request #1743: Gradual naming convention enforcement.

2020-08-12 Thread GitBox


dweiss commented on a change in pull request #1743:
URL: https://github.com/apache/lucene-solr/pull/1743#discussion_r469724935



##
File path: 
lucene/test-framework/src/java/org/apache/lucene/util/VerifyTestClassNamingConvention.java
##
@@ -0,0 +1,98 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.util;
+
+import com.carrotsearch.randomizedtesting.RandomizedContext;
+import org.junit.Assume;
+
+import java.io.BufferedReader;
+import java.io.IOException;
+import java.io.InputStreamReader;
+import java.io.UncheckedIOException;
+import java.io.Writer;
+import java.nio.charset.StandardCharsets;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.nio.file.Paths;
+import java.nio.file.StandardOpenOption;
+import java.util.HashSet;
+import java.util.Set;
+import java.util.regex.Pattern;
+
+/**
+ * Enforce test naming convention.
+ */
+public class VerifyTestClassNamingConvention extends AbstractBeforeAfterRule {
+  public static final Pattern ALLOWED_CONVENTION = 
Pattern.compile("(.+?)\\.Test[^.]+");
+
+  private static Set<String> exceptions;
+  static {
+try {
+  exceptions = new HashSet<>();
+  try (BufferedReader is =
+ new BufferedReader(
+ new InputStreamReader(
+   
VerifyTestClassNamingConvention.class.getResourceAsStream("test-naming-exceptions.txt"),
+   StandardCharsets.UTF_8))) {
+is.lines().forEach(exceptions::add);
+  }
+} catch (IOException e) {
+  throw new UncheckedIOException(e);
+}
+  }
+
+  @Override
+  protected void before() throws Exception {
+if (TestRuleIgnoreTestSuites.isRunningNested()) {
+  // Ignore nested test suites that test the test framework itself.
+  return;
+}
+
+String suiteName = RandomizedContext.current().getTargetClass().getName();
+
+// You can use this helper method to dump all suite names to a file.
+// Run gradle with one worker so that it doesn't try to append to the same
+// file from multiple processes:
+//
+// gradlew  test --max-workers 1 -Dtests.useSecurityManager=false
+//
+// dumpSuiteNamesOnly(suiteName);
+
+if (!ALLOWED_CONVENTION.matcher(suiteName).matches()) {
+  // if this class exists on the exception list, leave it.
+  if (!exceptions.contains("!" + suiteName)) {
+throw new AssertionError("Suite must follow Test*.java naming 
convention: "

Review comment:
   Again - this was just to show how it can be done; I used prefix 
convention because it was mentioned on the list, that's it.
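
For anyone who wants to try the prefix convention outside the test framework, here is a standalone sketch of the check. It reuses the regular expression from the patch, but `NamingCheck` and `isAccepted` are hypothetical names, and the exceptions set here holds plain class names, whereas the patch looks up `"!" + suiteName`.

```java
import java.util.Set;
import java.util.regex.Pattern;

// Standalone sketch of the check; NamingCheck is a hypothetical name.
final class NamingCheck {
  // Same expression as ALLOWED_CONVENTION in the patch: a package prefix,
  // then a simple class name that starts with "Test".
  static final Pattern ALLOWED = Pattern.compile("(.+?)\\.Test[^.]+");

  /** True if the suite follows the convention or is explicitly exempted. */
  static boolean isAccepted(String suiteName, Set<String> exceptions) {
    return ALLOWED.matcher(suiteName).matches() || exceptions.contains(suiteName);
  }
}
```

Because `matches()` requires the whole name to match, a suffix-style name such as `FooTest` is rejected unless it is on the exceptions list.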








[GitHub] [lucene-solr] dweiss commented on a change in pull request #1743: Gradual naming convention enforcement.

2020-08-12 Thread GitBox


dweiss commented on a change in pull request #1743:
URL: https://github.com/apache/lucene-solr/pull/1743#discussion_r469724710



##
File path: 
lucene/test-framework/src/java/org/apache/lucene/util/VerifyTestClassNamingConvention.java
##
@@ -0,0 +1,98 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.util;
+
+import com.carrotsearch.randomizedtesting.RandomizedContext;
+import org.junit.Assume;
+
+import java.io.BufferedReader;
+import java.io.IOException;
+import java.io.InputStreamReader;
+import java.io.UncheckedIOException;
+import java.io.Writer;
+import java.nio.charset.StandardCharsets;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.nio.file.Paths;
+import java.nio.file.StandardOpenOption;
+import java.util.HashSet;
+import java.util.Set;
+import java.util.regex.Pattern;
+
+/**
+ * Enforce test naming convention.
+ */
+public class VerifyTestClassNamingConvention extends AbstractBeforeAfterRule {
+  public static final Pattern ALLOWED_CONVENTION = 
Pattern.compile("(.+?)\\.Test[^.]+");
+
+  private static Set<String> exceptions;
+  static {
+try {
+  exceptions = new HashSet<>();
+  try (BufferedReader is =
+ new BufferedReader(
+ new InputStreamReader(
+   
VerifyTestClassNamingConvention.class.getResourceAsStream("test-naming-exceptions.txt"),
+   StandardCharsets.UTF_8))) {
+is.lines().forEach(exceptions::add);
+  }
+} catch (IOException e) {
+  throw new UncheckedIOException(e);
+}
+  }
+
+  @Override
+  protected void before() throws Exception {
+if (TestRuleIgnoreTestSuites.isRunningNested()) {
+  // Ignore nested test suites that test the test framework itself.
+  return;
+}
+
+String suiteName = RandomizedContext.current().getTargetClass().getName();
+
+// You can use this helper method to dump all suite names to a file.
+// Run gradle with one worker so that it doesn't try to append to the same
+// file from multiple processes:
+//
+// gradlew  test --max-workers 1 -Dtests.useSecurityManager=false
+//
+// dumpSuiteNamesOnly(suiteName);
+
+if (!ALLOWED_CONVENTION.matcher(suiteName).matches()) {
+  // if this class exists on the exception list, leave it.
+  if (!exceptions.contains("!" + suiteName)) {
+throw new AssertionError("Suite must follow Test*.java naming 
convention: "
+  + suiteName);
+  }
+}
+  }
+
+  private void dumpSuiteNamesOnly(String suiteName) throws IOException {
+// Has to be a global unique path (not a temp file because temp files
+// are different for each JVM).
+Path temporaryFile = Paths.get("c:\\_tmp\\test-naming-exceptions.txt");

Review comment:
   Mike, this patch was just to show how it can be done. And this method 
only collects test cases for exclusion - it's not used and could be removed 
since in theory we'd only want to take away from once-generated exceptions 
file. I left it to show how I collected the list in the first place.








[GitHub] [lucene-solr] dweiss commented on pull request #1721: LUCENE-9439: match region highlighter components

2020-08-12 Thread GitBox


dweiss commented on pull request #1721:
URL: https://github.com/apache/lucene-solr/pull/1721#issuecomment-673287936


   I think it's best explained by looking at the code, David. It's really tiny. 
Take a look at the tests and then dig into the class that retrieves offsets 
from matches - the rest is basically supporting stuff.






[jira] [Commented] (SOLR-14702) Remove Master and Slave from Code Base and Docs

2020-08-12 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176780#comment-17176780
 ] 

David Smiley commented on SOLR-14702:
-

Thanks everyone!  I'm late to the comment party.  I'm really glad to see 
"leader/follower" now used consistently in both SolrCloud and standalone mode 
for the reasons Tomas gave.

> Remove Master and Slave from Code Base and Docs
> ---
>
> Key: SOLR-14702
> URL: https://issues.apache.org/jira/browse/SOLR-14702
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: master (9.0)
>Reporter: Marcus Eagan
>Priority: Critical
> Fix For: master (9.0), 8.7
>
> Attachments: SOLR-14742-testfix.patch
>
>  Time Spent: 17h
>  Remaining Estimate: 0h
>
> Every time I read _master_ and _slave_, I get pissed.
> I think about the last and only time I remember visiting my maternal great 
> grandpa in Alabama at four years old. He was a sharecropper before WWI, where 
> he lost his legs, and then he was back to being a sharecropper somehow after 
> the war. Crazy, I know. I don't know if the world still called his job 
> sharecropping in 1993, but he was basically a slave—in America. He lived in 
> the same shack that his father, and his grandfather (born a slave) lived in 
> down in Alabama. Believe it or not, my dad's (born in 1926) grandfather was 
> actually born a slave, freed shortly after birth by his owner father. I never 
> met him, though. He died in the 40s.
> Anyway, I cannot police all terms in the repo and do not wish to. This 
> master/slave shit is archaic and misleading on technical grounds. Thankfully, 
> there's only a handful of files in code and documentation that still talk 
> about masters and slaves. We should replace all of them.
> There are so many ways to reword it. In fact, unless anyone else objects or 
> wants to do the grunt work to help my stress levels, I will open the pull 
> request myself in effort to make this project and community more inviting to 
> people of all backgrounds and histories. We can have leader/follower, or 
> primary/secondary, but none of this Master/Slave nonsense. I'm sick of the 
> garbage. 
>  






[jira] [Comment Edited] (SOLR-14730) Build new SolrJ APIs without concrete classes like NamedList/Map

2020-08-12 Thread Noble Paul (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176777#comment-17176777
 ] 

Noble Paul edited comment on SOLR-14730 at 8/13/20, 5:51 AM:
-

This API will live side-by-side with the existing SolrJ APIs. 




So, we can commit this to the branch_8x as and when they are ready. There is no 
problem in using NamedList in the server. 

 

The deserializing code can deserialize a NamedList into any interface that we 
define. 

 

{quote} Before starting in earnest, I suggest doing just one API as an example 
for peer review before going further.{quote}

Totally. This should be 100% peer reviewed before merging in
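
As a rough sketch of the "deserialize into any interface" idea, a dynamic proxy can expose weakly typed key/value data through a typed interface. This is only one possible approach and not necessarily how SolrJ will do it: a plain {{Map}} stands in for {{NamedList}}, and {{MapBackedViews}}, {{NodeInfo}} and {{view}} are hypothetical names.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.Map;

// Illustration only: MapBackedViews and NodeInfo are hypothetical, and a plain
// Map stands in for NamedList.
final class MapBackedViews {
  /** Example of a typed, read-only view that API users would program against. */
  interface NodeInfo {
    String name();
    Integer port();
  }

  /** Exposes the map through the given interface; method names are used as keys. */
  static <T> T view(Class<T> type, Map<String, Object> data) {
    InvocationHandler handler = (proxy, method, args) -> {
      if (method.getDeclaringClass() == Object.class) {
        // Delegate toString/hashCode/equals to the backing map.
        return method.invoke(data, args);
      }
      return data.get(method.getName());
    };
    Object proxy = Proxy.newProxyInstance(type.getClassLoader(), new Class<?>[] {type}, handler);
    return type.cast(proxy);
  }
}
```

Callers then depend only on the interface, so the backing representation can change without breaking them.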

 


was (Author: noble.paul):
This API will live side-by-side with the existing SolrJ APIs. 




So, we can commit this to the branch_8x as and when they are ready. There is no 
problem in using NamedList in the server. 

 

The deserializing code can deserialize a NamedList into any interface that we 
define. 

 

 bq.Before starting in earnest, I suggest doing just one API as an example for 
peer review before going further.

Totally. This should be 100% peer reviewed before merging in

 

> Build new SolrJ APIs without concrete classes like NamedList/Map
> 
>
> Key: SOLR-14730
> URL: https://issues.apache.org/jira/browse/SOLR-14730
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Priority: Major
>  Labels: clean-api
>
> We must minimize weakly typed code. Our public APIs should be programmed 
> against interfaces and wherever possible use POJOs






[jira] [Commented] (SOLR-14730) Build new SolrJ APIs without concrete classes like NamedList/Map

2020-08-12 Thread Noble Paul (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176777#comment-17176777
 ] 

Noble Paul commented on SOLR-14730:
---

This API will live side-by-side with the existing SolrJ APIs. 




So, we can commit this to the branch_8x as and when they are ready. There is no 
problem in using NamedList in the server. 

 

The deserializing code can deserialize a NamedList into any interface that we 
define. 

 

 bq.Before starting in earnest, I suggest doing just one API as an example for 
peer review before going further.

Totally. This should be 100% peer reviewed before merging in

 

> Build new SolrJ APIs without concrete classes like NamedList/Map
> 
>
> Key: SOLR-14730
> URL: https://issues.apache.org/jira/browse/SOLR-14730
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Priority: Major
>  Labels: clean-api
>
> We must minimize weakly typed code. Our public APIs should be programmed 
> against interfaces and wherever possible use POJOs






[jira] [Commented] (SOLR-14730) Build new SolrJ APIs without concrete classes like NamedList/Map

2020-08-12 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176775#comment-17176775
 ] 

David Smiley commented on SOLR-14730:
-

I assume 9x only due to API breaks?  Is this for the SolrJ module only, or, at 
least for now, should the changes stay there to limit scope?  After all, 
NamedList is used freaking everywhere in Solr.
Before starting in earnest, I suggest doing just one API as an example for peer 
review before going further.

> Build new SolrJ APIs without concrete classes like NamedList/Map
> 
>
> Key: SOLR-14730
> URL: https://issues.apache.org/jira/browse/SOLR-14730
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Priority: Major
>  Labels: clean-api
>
> We must minimize weakly typed code. Our public APIs should be programmed 
> against interfaces and wherever possible use POJOs






[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2020-08-12 Thread Atri Sharma (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176774#comment-17176774
 ] 

Atri Sharma commented on SOLR-13350:


I have been bringing this up to date and fixing some outstanding comments at:

 

[https://github.com/atris/lucene-solr/tree/solr-13350]

> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13350.patch, SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.
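
The idea in the description can be sketched conceptually in plain Java: each segment is searched by its own task into a private partial result, and a final reduce step merges the partials. None of the names below are Lucene or Solr APIs; arrays of scores stand in for segments, and `ConcurrentSegmentSearch` and `countMatches` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Conceptual sketch in plain Java; none of these names are Lucene or Solr APIs.
final class ConcurrentSegmentSearch {
  /** Counts matches in each "segment" on its own task, then reduces the partial counts. */
  static int countMatches(List<int[]> segments, int threshold, int threads)
      throws InterruptedException, ExecutionException {
    ExecutorService executor = Executors.newFixedThreadPool(threads);
    try {
      // One independent partial result per segment, so tasks share no mutable state.
      List<Future<Integer>> partials = new ArrayList<>();
      for (int[] segment : segments) {
        partials.add(executor.submit(() -> {
          int hits = 0;
          for (int score : segment) {
            if (score >= threshold) {
              hits++;
            }
          }
          return hits;
        }));
      }
      // Reduce step: merge the per-segment results into the final answer.
      int total = 0;
      for (Future<Integer> partial : partials) {
        total += partial.get();
      }
      return total;
    } finally {
      executor.shutdown();
    }
  }
}
```

Whether this lowers latency in practice depends on segment sizes and how busy the thread pool is, which is what the issue proposes to measure.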






[GitHub] [lucene-solr] dsmiley commented on pull request #1721: LUCENE-9439: match region highlighter components

2020-08-12 Thread GitBox


dsmiley commented on pull request #1721:
URL: https://github.com/apache/lucene-solr/pull/1721#issuecomment-673267859


   Can a description be posted somewhere (e.g. description of this PR; it's 
blank) describing this highlighter?  I specifically wonder how it positions 
itself relative to the other highlighters.  e.g. what use-case does this solve 
that is new?






[jira] [Commented] (SOLR-14687) Make child/parent query parsers natively aware of _nest_path_

2020-08-12 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176768#comment-17176768
 ] 

David Smiley commented on SOLR-14687:
-

I'm glad to see you are tackling this, Hoss!  You are thorough as usual.

> Make child/parent query parsers natively aware of _nest_path_
> -
>
> Key: SOLR-14687
> URL: https://issues.apache.org/jira/browse/SOLR-14687
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Chris M. Hostetter
>Priority: Major
>
> A long standing pain point of the parent/child QParsers is the "all parents" 
> bitmask/filter specified via the "which" and "of" params (respectively).
> This is particularly tricky/painful to "get right" when dealing with 
> multi-level nested documents...
>  * 
> https://issues.apache.org/jira/browse/SOLR-14383?focusedCommentId=17166339&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17166339
>  * 
> [https://lists.apache.org/thread.html/r7633a366dd76e7ce9d98e6b9f2a65da8af8240e846f789d938c8113f%40%3Csolr-user.lucene.apache.org%3E]
> ...and it's *really* hard to get right when the nested structure isn't 100% 
> consistent among all docs:
>  * collections that mix docs w/o children and docs that have children.
>  ** Ex: blog posts, some of which have child docs that are "comments", but 
> some don't
>  * when some "types" of documents can exist at multiple levels:
>  ** Ex: top level "product" documents, which may have 2 types of children: 
> "skus" and "manuals", but "skus" may also have their own sku-specific child 
> "manuals"
> BUT! ... now that we have some semi-native support for the {{_nest_path_}} 
> field, I think it may be possible to offer an "easier to use" variant syntax 
> of the parent/child QParsers that directly depends on these fields. This new 
> syntax should be optional – and purely syntactic sugar. "expert" users should 
> be able to do all the same things using the existing syntax (possibly more 
> efficiently depending on what invariants exist in their data model)






[GitHub] [lucene-solr] atris commented on pull request #1737: SOLR-14615: Implement CPU Utilization Based Circuit Breaker

2020-08-12 Thread GitBox


atris commented on pull request #1737:
URL: https://github.com/apache/lucene-solr/pull/1737#issuecomment-673261051


   @madrob Any further thoughts on this?






[jira] [Updated] (LUCENE-9373) Allow FunctionMatchQuery to customize matchCost of TwoPhaseIterator

2020-08-12 Thread David Smiley (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley updated LUCENE-9373:
-
Attachment: LUCENE-9373.patch
  Assignee: David Smiley
Status: Open  (was: Open)

> Allow FunctionMatchQuery to customize matchCost of TwoPhaseIterator
> ---
>
> Key: LUCENE-9373
> URL: https://issues.apache.org/jira/browse/LUCENE-9373
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/queries
>Reporter: David Smiley
>Assignee: David Smiley
>Priority: Major
>  Labels: newdev
> Attachments: LUCENE-9373.patch, LUCENE-9373.patch
>
>
> FunctionMatchQuery internally has a TwoPhaseIterator using a constant 
> matchCost.  If it were customizable by the query, the user could control this 
> ordering.  I propose an optional matchCost via an overloaded constructor.  
> Ideally the DoubleValues abstraction would have a matchCost but it doesn't, 
> and even if it did, the user might just want real control over this at a 
> query construction/parse level.
> See similar LUCENE-9114
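
To show why the cost value matters, here is a conceptual sketch of cost-based ordering in plain Java: when several per-document confirmation checks must all pass, running the cheapest first lets an early rejection skip the expensive ones. `CostOrdering`, `CostedCheck` and `matchesAll` are hypothetical names, not Lucene classes, and the sketch sorts on every call for brevity where a real implementation would sort once.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.IntPredicate;

// Conceptual sketch; CostedCheck and matchesAll are hypothetical, not Lucene classes.
final class CostOrdering {
  /** A per-document confirmation step with an estimated cost. */
  static final class CostedCheck {
    final float matchCost;
    final IntPredicate matches;

    CostedCheck(float matchCost, IntPredicate matches) {
      this.matchCost = matchCost;
      this.matches = matches;
    }
  }

  /** Runs the cheapest checks first so an early rejection skips the expensive ones. */
  static boolean matchesAll(int doc, List<CostedCheck> checks) {
    List<CostedCheck> ordered = new ArrayList<>(checks);
    ordered.sort(Comparator.comparingDouble((CostedCheck c) -> c.matchCost));
    for (CostedCheck check : ordered) {
      if (!check.matches.test(doc)) {
        return false;
      }
    }
    return true;
  }
}
```

With a constant cost, every such check looks equally expensive, so the caller has no way to influence which one runs first.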






[jira] [Commented] (SOLR-14712) Standardize internal Solr-to-Solr RPC API

2020-08-12 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176763#comment-17176763
 ] 

David Smiley commented on SOLR-14712:
-

The "will be replaced with" code is gorgeous! Thanks, Noble.

> Standardize internal Solr-to-Solr RPC API
> -
>
> Key: SOLR-14712
> URL: https://issues.apache.org/jira/browse/SOLR-14712
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>  Labels: clean-api
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> We should have a standard mechanism to make a request to the right 
> replica/node across solr code.
> This RPC mechanism assumes that
>  * The RPC mechanism is HTTP
>  * It is aware of all collections, shards, and their topology
>  * It knows how to route a request to the correct core
>  This is agnostic of wire-level formats, Solr documents, etc. That is a layer 
> above this.
> Anyone can use their own JSON parser or any other RPC wire-level format on 
> top of this.
> For example, code like this 
> {code}
> private void invokeOverseerOp(String electionNode, String op) {
> ModifiableSolrParams params = new ModifiableSolrParams();
>  ShardHandler shardHandler = shardHandlerFactory.getShardHandler();
>  params.set(CoreAdminParams.ACTION, CoreAdminAction.OVERSEEROP.toString());
>  params.set("op", op);
>  params.set("qt", adminPath);
>  params.set("electionNode", electionNode);
>  ShardRequest sreq = new ShardRequest();
>  sreq.purpose = 1;
>  String replica = 
> zkStateReader.getBaseUrlForNodeName(LeaderElector.getNodeName(electionNode));
>  sreq.shards = new String[]{replica};
>  sreq.actualShards = sreq.shards;
>  sreq.params = params;
>  shardHandler.submit(sreq, replica, sreq.params);
>  shardHandler.takeCompletedOrError();
> }
> {code}
> will be replaced with
> {code}
> private void invokeOverseerOp(String electionNode, String op) {
>  RpcFactory factory = null;
> factory.createCallRouter()
> .toNode(electionNode)
> .createHttpRpc()
> .withMethod(SolrRequest.METHOD.GET)
> .addParam(CoreAdminParams.ACTION, 
> CoreAdminAction.OVERSEEROP.toString())
> .addParam("op", op)
> .addParam("electionNode", electionNode)
> .addParam(ShardParams.SHARDS_PURPOSE, 1)
> .withV1Path(adminPath)
> .invoke();
> }
> {code}






[jira] [Updated] (SOLR-14712) Standardize internal Solr-to-Solr RPC API

2020-08-12 Thread David Smiley (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley updated SOLR-14712:

Summary: Standardize internal Solr-to-Solr RPC API  (was: Standardize RPC 
calls in Solr)

> Standardize internal Solr-to-Solr RPC API
> -
>
> Key: SOLR-14712
> URL: https://issues.apache.org/jira/browse/SOLR-14712
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>  Labels: clean-api
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> We should have a standard mechanism to make a request to the right 
> replica/node across Solr code.
> This RPC mechanism assumes that
>  * The RPC mechanism is HTTP
>  * It is aware of all collections, shards & their topology etc.
>  * It knows how to route a request to the correct core
>  This is agnostic of wire-level formats, Solr documents etc. That is a layer 
> above this.
> Anyone can use their own JSON parser or any other RPC wire-level format on 
> top of this.
> For example, code like this 
> {code}
> private void invokeOverseerOp(String electionNode, String op) {
>   ModifiableSolrParams params = new ModifiableSolrParams();
>   ShardHandler shardHandler = shardHandlerFactory.getShardHandler();
>   params.set(CoreAdminParams.ACTION, CoreAdminAction.OVERSEEROP.toString());
>   params.set("op", op);
>   params.set("qt", adminPath);
>   params.set("electionNode", electionNode);
>   ShardRequest sreq = new ShardRequest();
>   sreq.purpose = 1;
>   String replica = zkStateReader.getBaseUrlForNodeName(LeaderElector.getNodeName(electionNode));
>   sreq.shards = new String[]{replica};
>   sreq.actualShards = sreq.shards;
>   sreq.params = params;
>   shardHandler.submit(sreq, replica, sreq.params);
>   shardHandler.takeCompletedOrError();
> }
> {code}
> will be replaced with
> {code}
> private void invokeOverseerOp(String electionNode, String op) {
>   RpcFactory factory = null; // placeholder; the proposal does not show how the factory is obtained
>   factory.createCallRouter()
>       .toNode(electionNode)
>       .createHttpRpc()
>       .withMethod(SolrRequest.METHOD.GET)
>       .addParam(CoreAdminParams.ACTION, CoreAdminAction.OVERSEEROP.toString())
>       .addParam("op", op)
>       .addParam("electionNode", electionNode)
>       .addParam(ShardParams.SHARDS_PURPOSE, 1)
>       .withV1Path(adminPath)
>       .invoke();
> }
> {code}
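
The fluent style proposed above can be sketched without any Solr classes. This is only an illustration under stated assumptions: `HttpRpcSketch` and its `describe()` method are hypothetical names, `toNode` here takes a base URL rather than an election node, and nothing is sent over the wire. It only shows how a chain of builder calls can collect the method, path and parameters that the old code assembled by hand.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical fluent request builder, loosely modeled on the call chain in the
// proposal. It is not the RpcFactory API, which this thread does not define.
final class HttpRpcSketch {
  private final String nodeBaseUrl;
  private final Map<String, String> params = new LinkedHashMap<>();
  private String method = "GET";
  private String path = "";

  private HttpRpcSketch(String nodeBaseUrl) {
    this.nodeBaseUrl = nodeBaseUrl;
  }

  // Assumption: the caller already knows the node's base URL.
  static HttpRpcSketch toNode(String nodeBaseUrl) {
    return new HttpRpcSketch(nodeBaseUrl);
  }

  HttpRpcSketch withMethod(String method) {
    this.method = method;
    return this;
  }

  HttpRpcSketch addParam(String name, Object value) {
    params.put(name, String.valueOf(value));
    return this;
  }

  HttpRpcSketch withV1Path(String path) {
    this.path = path;
    return this;
  }

  // Renders the request line instead of sending it (no URL encoding is applied).
  String describe() {
    StringJoiner query = new StringJoiner("&");
    params.forEach((name, value) -> query.add(name + "=" + value));
    return method + " " + nodeBaseUrl + path + "?" + query;
  }

  public static void main(String[] args) {
    System.out.println(
        HttpRpcSketch.toNode("http://host:8983/solr")
            .withMethod("GET")
            .addParam("action", "OVERSEEROP")
            .addParam("op", "rejoin")
            .withV1Path("/admin/cores")
            .describe());
  }
}
```

The point of the sketch is that the caller never touches a params object or a shard array; routing and transport details stay behind the builder.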






[jira] [Comment Edited] (SOLR-14636) Provide a reference implementation for SolrCloud that is stable and fast.

2020-08-12 Thread Mark Robert Miller (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176231#comment-17176231
 ] 

Mark Robert Miller edited comment on SOLR-14636 at 8/13/20, 4:58 AM:
-

Hi [~markrmiller], as curious as I was, I tried to compile it and load it with 
custom libs, cores and data to see how it performs. Unfortunately I cannot get 
it to run, nor does the internal ZK listen to upconfig calls.

 

This is what I get from the logs:
{code:java}
2020-08-12 10:09:42.999 INFO  (main) [   ] o.a.s.c.CorePropertiesLocator Found 0 core definitions underneath /home/markus/temp/lucene-solr/solr/server/solr
2020-08-12 10:09:42.999 INFO  (main) [   ] o.a.s.c.ParWork No work collected to submit
2020-08-12 10:10:18.169 WARN  (solr-jetty-thread-1-thread-12) [   ] o.e.j.s.HttpChannel /solr/ => java.lang.NullPointerException
    at org.apache.solr.servlet.SolrQoSFilter.doFilter(SolrQoSFilter.java:63)
java.lang.NullPointerException: null
    at org.apache.solr.servlet.SolrQoSFilter.doFilter(SolrQoSFilter.java:63) ~[solr-core-9.0.0-SNAPSHOT.jar:9.0.0-SNAPSHOT 750d69b54614ed1cfa895c455da3efd7f0d71cc3 - markus - 2020-08-12 12:05:18]
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1604) ~[jetty-servlet-9.4.27.v20200227.jar:9.4.27.v20200227]
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:545) ~[jetty-servlet-9.4.27.v20200227.jar:9.4.27.v20200227]
 {code}
My tree is up to date. Any thoughts?

 



was (Author: markus17):
Hi [~markrmiller], as curious as I was, I tried to compile it and load it with 
custom libs, cores and data to see how it performs. Unfortunately I cannot get 
it to run, nor does the internal ZK listen to upconfig calls.

 

This is what I get from the logs:
{code:java}
2020-08-12 10:09:42.999 INFO  (main) [   ] o.a.s.c.CorePropertiesLocator Found 0 core definitions underneath /home/markus/temp/lucene-solr/solr/server/solr
2020-08-12 10:09:42.999 INFO  (main) [   ] o.a.s.c.ParWork No work collected to submit
2020-08-12 10:10:18.169 WARN  (solr-jetty-thread-1-thread-12) [   ] o.e.j.s.HttpChannel /solr/ => java.lang.NullPointerException
    at org.apache.solr.servlet.SolrQoSFilter.doFilter(SolrQoSFilter.java:63)
java.lang.NullPointerException: null
    at org.apache.solr.servlet.SolrQoSFilter.doFilter(SolrQoSFilter.java:63) ~[solr-core-9.0.0-SNAPSHOT.jar:9.0.0-SNAPSHOT 750d69b54614ed1cfa895c455da3efd7f0d71cc3 - markus - 2020-08-12 12:05:18]
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1604) ~[jetty-servlet-9.4.27.v20200227.jar:9.4.27.v20200227]
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:545) ~[jetty-servlet-9.4.27.v20200227.jar:9.4.27.v20200227]
 {code}
My tree is up to date. Any thoughts?

 

 

> Provide a reference implementation for SolrCloud that is stable and fast.
> -
>
> Key: SOLR-14636
> URL: https://issues.apache.org/jira/browse/SOLR-14636
> Project: Solr
>  Issue Type: Task
>Reporter: Mark Robert Miller
>Assignee: Mark Robert Miller
>Priority: Major
> Attachments: IMG_5575 (1).jpg, jenkins.png, solr-ref-branch.gif
>
>
> SolrCloud powers critical infrastructure and needs the ability to run quickly 
> with stability. This reference implementation will allow for this.
> *location*: [https://github.com/apache/lucene-solr/tree/reference_impl]
> *status*: alpha
> *speed*: ludicrous
> *tests***:
>  * *core*: {color:#00875a}*extremely stable*{color} with 
> *{color:#de350b}ignores{color}*
>  * *solrj*: {color:#00875a}*extremely stable*{color} with 
> {color:#de350b}*ignores*{color}
>  * *test-framework*: *extremely stable* with {color:#de350b}*ignores*{color}
>  * *contrib/analysis-extras*: *extremely stable* with 
> {color:#de350b}*ignores*{color}
>  * *contrib/analytics*: {color:#00875a}*extremely stable*{color} with 
> {color:#de350b}*ignores*{color}
>  * *contrib/clustering*: {color:#00875a}*extremely stable*{color} with 
> *{color:#de350b}ignores{color}*
>  * *contrib/dataimporthandler*: {color:#00875a}*extremely stable*{color} with 
> {color:#de350b}*ignores*{color}
>  * *contrib/dataimporthandler-extras*: {color:#00875a}*extremely 
> stable*{color} with *{color:#de350b}ignores{color}*
>  * *contrib/extraction*: {color:#00875a}*extremely stable*{color} with 
> {color:#de350b}*ignores*{color}
>  * *contrib/jaegertracer-configurator*: {color:#00875a}*extremely 
> stable*{color} with {color:#de350b}*ignores*{color}
>  * *contrib/langid*: {color:#00875a}*extremely stable*{color} with 
> {color:#de350b}*ignores*{color}
>  * *contrib/prometheus-exporter*: {color:#00875a}*extremely stable*{color} 
>

[jira] [Commented] (LUCENE-9373) Allow FunctionMatchQuery to customize matchCost of TwoPhaseIterator

2020-08-12 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176757#comment-17176757
 ] 

David Smiley commented on LUCENE-9373:
--

Thanks for the patch [~Maxim Glazkov] !  I applied it in my IDE and made some 
trivial changes (e.g. I prefer "final" to be _after_ "static"), and some 
javadoc edits to link directly to TwoPhaseIterator (TPI) matchCost.  I learned 
something new from your patch: the javadoc {@value} reference.

I noticed that there's an equals & hashCode that do not take this new matchCost 
into consideration.  I suppose that's for the best because it's only an 
implementation hint; it should not change any semantics.  I added a comment 
about that.

I wondered -- what if one day DoubleValues finally has its own matchCost -- 
what then?  It would not obsolete what we have here because the matchCost here 
is both explicit (maybe the user-developer knows better what the matchCost 
should be), and furthermore the cost is more than the DoubleValues -- it's also 
that of the predicate.

[~romseygeek] Unrelated to this issue/patch but pertinent to MatchCostQuery: I 
see MatchCostQuery uses a DoublePredicate and it includes this in the 
equals/hashcode as it should.  Shouldn't we advise the caller in javadocs that 
the predicate _must_ implement equals/hashcode _if_ this query might be 
cached?  Failing to do so (e.g. a lambda impl) will mean the query will never 
get a cache hit.  Maybe the very use of DoublePredicate is just asking for 
trouble and we should define a class similarly with equals/hashcode as abstract?
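
To make the caching concern concrete, here is a small Lucene-free sketch. `GreaterThan` and `PredicateEqualityDemo` are hypothetical names used only for illustration. A lambda gets no value-based equals/hashCode, so in practice two lambdas with the same logic compare as different objects, whereas a predicate class can compare by its configuration.

```java
import java.util.function.DoublePredicate;

// Hypothetical predicate with value-based equality; not a Lucene class.
final class GreaterThan implements DoublePredicate {
  private final double threshold;

  GreaterThan(double threshold) {
    this.threshold = threshold;
  }

  @Override
  public boolean test(double value) {
    return value > threshold;
  }

  @Override
  public boolean equals(Object o) {
    return o instanceof GreaterThan
        && Double.compare(((GreaterThan) o).threshold, threshold) == 0;
  }

  @Override
  public int hashCode() {
    return Double.hashCode(threshold);
  }
}

final class PredicateEqualityDemo {
  // Builds the same test as GreaterThan, but as a lambda.
  static DoublePredicate lambdaGreaterThan(double threshold) {
    return value -> value > threshold;
  }

  public static void main(String[] args) {
    // Lambdas do not define value-based equality, so this comparison depends
    // on object identity rather than on the threshold.
    System.out.println(lambdaGreaterThan(0.5).equals(lambdaGreaterThan(0.5)));
    // The class compares by threshold, so equal configurations are equal.
    System.out.println(new GreaterThan(0.5).equals(new GreaterThan(0.5)));
  }
}
```

A query that folds the predicate into its own equals/hashCode inherits whichever behavior the predicate has, which is why only the class-based variant can produce a cache hit across separately built queries.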

> Allow FunctionMatchQuery to customize matchCost of TwoPhaseIterator
> ---
>
> Key: LUCENE-9373
> URL: https://issues.apache.org/jira/browse/LUCENE-9373
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/queries
>Reporter: David Smiley
>Priority: Major
>  Labels: newdev
> Attachments: LUCENE-9373.patch
>
>
> FunctionMatchQuery internally has a TwoPhaseIterator using a constant 
> matchCost.  If it were customizable by the query, the user could control this 
> ordering.  I propose an optional matchCost via an overloaded constructor.  
> Ideally the DoubleValues abstraction would have a matchCost but it doesn't, 
> and even if it did, the user might just want real control over this at a 
> query construction/parse level.
> See similar LUCENE-9114
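
Why the ordering matters can be shown with a toy model. This is a sketch under stated assumptions, not Lucene code: `CostedCheck` is a hypothetical stand-in for a two-phase check, and `matchesAll` assumes that a conjunction runs its checks in ascending matchCost order and stops at the first failure.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.IntPredicate;

// Hypothetical stand-in for a two-phase check: a per-document test plus a cost hint.
final class CostedCheck {
  final float matchCost;
  final IntPredicate matches;
  int evaluations = 0; // how many times the test was actually run

  CostedCheck(float matchCost, IntPredicate matches) {
    this.matchCost = matchCost;
    this.matches = matches;
  }

  boolean test(int doc) {
    evaluations++;
    return matches.test(doc);
  }
}

final class MatchCostOrderingDemo {
  // Runs the checks cheapest-first and stops at the first one that fails.
  static boolean matchesAll(List<CostedCheck> checks, int doc) {
    List<CostedCheck> ordered = new ArrayList<>(checks);
    ordered.sort(Comparator.comparingDouble(c -> c.matchCost));
    for (CostedCheck check : ordered) {
      if (!check.test(doc)) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    CostedCheck expensive = new CostedCheck(100f, doc -> doc % 3 == 0);
    CostedCheck cheap = new CostedCheck(1f, doc -> doc % 2 == 0);
    boolean matched = matchesAll(Arrays.asList(expensive, cheap), 3);
    // The cheap check rejects doc 3, so the expensive one is never evaluated.
    System.out.println(matched + " " + expensive.evaluations);
  }
}
```

With a constant matchCost the caller has no way to say which check is the expensive one; an explicit value lets the cheap check run first.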






[jira] [Commented] (SOLR-14680) Provide simple interfaces to our concrete SolrCloud classes

2020-08-12 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176750#comment-17176750
 ] 

ASF subversion and git services commented on SOLR-14680:


Commit 0b55c94ad664278d4ef5079810a709c1cb7c4457 in lucene-solr's branch 
refs/heads/master from noblepaul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=0b55c94 ]

SOLR-14680: make jdk 8 compatible


> Provide simple interfaces to our concrete SolrCloud classes
> ---
>
> Key: SOLR-14680
> URL: https://issues.apache.org/jira/browse/SOLR-14680
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Minor
>  Time Spent: 10.5h
>  Remaining Estimate: 0h
>
> All our current implementations of SolrCloud such as 
> # ClusterState
> # DocCollection
> # Slice
> # Replica
> etc are concrete classes. Providing alternate implementations or wrappers is 
> extremely difficult. 
> SOLR-14613 is attempting to create such interfaces to make their SDK simpler.
> The objective is not to have a comprehensive set of methods in these 
> interfaces. We will start out with a subset of required interfaces. Our 
> guarantee is that signatures of methods in these interfaces will not be 
> deleted/changed. But we may add more methods as and when it suits us.
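
As a rough illustration of the idea in the description, the sketch below puts a narrow interface in front of a concrete class. All names (`ReplicaView`, `ConcreteReplica`, `ReplicaAdapter`, `ClusterViewDemo`) are hypothetical and are not the interfaces added by this issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical narrow interface; the names are illustrative, not Solr's actual API.
interface ReplicaView {
  String name();

  String nodeName();
}

// Stand-in for an existing concrete class that callers used to depend on directly.
final class ConcreteReplica {
  final String name;
  final String nodeName;

  ConcreteReplica(String name, String nodeName) {
    this.name = name;
    this.nodeName = nodeName;
  }
}

// Adapter exposing the concrete class through the interface.
final class ReplicaAdapter implements ReplicaView {
  private final ConcreteReplica delegate;

  ReplicaAdapter(ConcreteReplica delegate) {
    this.delegate = delegate;
  }

  @Override
  public String name() {
    return delegate.name;
  }

  @Override
  public String nodeName() {
    return delegate.nodeName;
  }
}

final class ClusterViewDemo {
  // Written against the interface, so any implementation or wrapper is accepted.
  static List<String> replicaNamesOnNode(List<? extends ReplicaView> replicas, String node) {
    List<String> names = new ArrayList<>();
    for (ReplicaView replica : replicas) {
      if (replica.nodeName().equals(node)) {
        names.add(replica.name());
      }
    }
    return names;
  }

  public static void main(String[] args) {
    List<ReplicaView> replicas = Arrays.asList(
        new ReplicaAdapter(new ConcreteReplica("core_node1", "host1:8983_solr")),
        new ReplicaAdapter(new ConcreteReplica("core_node2", "host2:8983_solr")));
    System.out.println(replicaNamesOnNode(replicas, "host1:8983_solr"));
  }
}
```

Because the interface only grows and never changes existing signatures, code such as `replicaNamesOnNode` keeps compiling when methods are added later.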






[jira] [Commented] (SOLR-14680) Provide simple interfaces to our concrete SolrCloud classes

2020-08-12 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176746#comment-17176746
 ] 

ASF subversion and git services commented on SOLR-14680:


Commit 898b75d4333c6a45da5f44e31dadf0f005f49391 in lucene-solr's branch 
refs/heads/branch_8x from Noble Paul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=898b75d ]

SOLR-14680: Provide an implementation for the new SolrCluster API (#1730)


> Provide simple interfaces to our concrete SolrCloud classes
> ---
>
> Key: SOLR-14680
> URL: https://issues.apache.org/jira/browse/SOLR-14680
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Minor
>  Time Spent: 10.5h
>  Remaining Estimate: 0h
>
> All our current implementations of SolrCloud such as 
> # ClusterState
> # DocCollection
> # Slice
> # Replica
> etc are concrete classes. Providing alternate implementations or wrappers is 
> extremely difficult. 
> SOLR-14613 is attempting to create such interfaces to make their SDK simpler.
> The objective is not to have a comprehensive set of methods in these 
> interfaces. We will start out with a subset of required interfaces. Our 
> guarantee is that signatures of methods in these interfaces will not be 
> deleted/changed. But we may add more methods as and when it suits us.






[jira] [Commented] (SOLR-14680) Provide simple interfaces to our concrete SolrCloud classes

2020-08-12 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176747#comment-17176747
 ] 

ASF subversion and git services commented on SOLR-14680:


Commit 3b3e46a0b4a97946d7ff3c1b54a30a092d302b4e in lucene-solr's branch 
refs/heads/branch_8x from noblepaul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=3b3e46a ]

SOLR-14680: make jdk 8 compatible


> Provide simple interfaces to our concrete SolrCloud classes
> ---
>
> Key: SOLR-14680
> URL: https://issues.apache.org/jira/browse/SOLR-14680
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Minor
>  Time Spent: 10.5h
>  Remaining Estimate: 0h
>
> All our current implementations of SolrCloud such as 
> # ClusterState
> # DocCollection
> # Slice
> # Replica
> etc are concrete classes. Providing alternate implementations or wrappers is 
> extremely difficult. 
> SOLR-14613 is attempting to create such interfaces to make their SDK simpler.
> The objective is not to have a comprehensive set of methods in these 
> interfaces. We will start out with a subset of required interfaces. Our 
> guarantee is that signatures of methods in these interfaces will not be 
> deleted/changed. But we may add more methods as and when it suits us.






[jira] [Commented] (SOLR-14680) Provide simple interfaces to our concrete SolrCloud classes

2020-08-12 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176742#comment-17176742
 ] 

ASF subversion and git services commented on SOLR-14680:


Commit d517361bb1fbcb040f1c7cda103127ed0f3b6a2c in lucene-solr's branch 
refs/heads/master from Noble Paul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=d517361 ]

SOLR-14680: Provide an implementation for the new SolrCluster API (#1730)



> Provide simple interfaces to our concrete SolrCloud classes
> ---
>
> Key: SOLR-14680
> URL: https://issues.apache.org/jira/browse/SOLR-14680
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Minor
>  Time Spent: 10.5h
>  Remaining Estimate: 0h
>
> All our current implementations of SolrCloud such as 
> # ClusterState
> # DocCollection
> # Slice
> # Replica
> etc are concrete classes. Providing alternate implementations or wrappers is 
> extremely difficult. 
> SOLR-14613 is attempting to create such interfaces to make their SDK simpler.
> The objective is not to have a comprehensive set of methods in these 
> interfaces. We will start out with a subset of required interfaces. Our 
> guarantee is that signatures of methods in these interfaces will not be 
> deleted/changed. But we may add more methods as and when it suits us.






[jira] [Commented] (SOLR-14680) Provide simple interfaces to our concrete SolrCloud classes

2020-08-12 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176743#comment-17176743
 ] 

ASF subversion and git services commented on SOLR-14680:


Commit fc76180394964a275f47031f5a7ca8e1c47bc425 in lucene-solr's branch 
refs/heads/branch_8x from Noble Paul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=fc76180 ]

SOLR-14680: Provide simple interfaces to our cloud classes  (only API) (#1694)



> Provide simple interfaces to our concrete SolrCloud classes
> ---
>
> Key: SOLR-14680
> URL: https://issues.apache.org/jira/browse/SOLR-14680
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Minor
>  Time Spent: 10.5h
>  Remaining Estimate: 0h
>
> All our current implementations of SolrCloud such as 
> # ClusterState
> # DocCollection
> # Slice
> # Replica
> etc are concrete classes. Providing alternate implementations or wrappers is 
> extremely difficult. 
> SOLR-14613 is attempting to create such interfaces to make their SDK simpler.
> The objective is not to have a comprehensive set of methods in these 
> interfaces. We will start out with a subset of required interfaces. Our 
> guarantee is that signatures of methods in these interfaces will not be 
> deleted/changed. But we may add more methods as and when it suits us.






[jira] [Updated] (SOLR-14712) Standardize RPC calls in Solr

2020-08-12 Thread Noble Paul (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-14712:
--
Labels: clean-api  (was: )

> Standardize RPC calls in Solr
> -
>
> Key: SOLR-14712
> URL: https://issues.apache.org/jira/browse/SOLR-14712
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>  Labels: clean-api
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> We should have a standard mechanism to make a request to the right 
> replica/node across Solr code.
> This RPC mechanism assumes that
>  * The RPC mechanism is HTTP
>  * It is aware of all collections, shards & their topology etc.
>  * It knows how to route a request to the correct core
>  This is agnostic of wire-level formats, Solr documents etc. That is a layer 
> above this.
> Anyone can use their own JSON parser or any other RPC wire-level format on 
> top of this.
> For example, code like this 
> {code}
> private void invokeOverseerOp(String electionNode, String op) {
>   ModifiableSolrParams params = new ModifiableSolrParams();
>   ShardHandler shardHandler = shardHandlerFactory.getShardHandler();
>   params.set(CoreAdminParams.ACTION, CoreAdminAction.OVERSEEROP.toString());
>   params.set("op", op);
>   params.set("qt", adminPath);
>   params.set("electionNode", electionNode);
>   ShardRequest sreq = new ShardRequest();
>   sreq.purpose = 1;
>   String replica = zkStateReader.getBaseUrlForNodeName(LeaderElector.getNodeName(electionNode));
>   sreq.shards = new String[]{replica};
>   sreq.actualShards = sreq.shards;
>   sreq.params = params;
>   shardHandler.submit(sreq, replica, sreq.params);
>   shardHandler.takeCompletedOrError();
> }
> {code}
> will be replaced with
> {code}
> private void invokeOverseerOp(String electionNode, String op) {
>   RpcFactory factory = null; // placeholder; the proposal does not show how the factory is obtained
>   factory.createCallRouter()
>       .toNode(electionNode)
>       .createHttpRpc()
>       .withMethod(SolrRequest.METHOD.GET)
>       .addParam(CoreAdminParams.ACTION, CoreAdminAction.OVERSEEROP.toString())
>       .addParam("op", op)
>       .addParam("electionNode", electionNode)
>       .addParam(ShardParams.SHARDS_PURPOSE, 1)
>       .withV1Path(adminPath)
>       .invoke();
> }
> {code}






[GitHub] [lucene-solr] noblepaul merged pull request #1730: SOLR-14680: Provide an implementation for the new SolrCluster API

2020-08-12 Thread GitBox


noblepaul merged pull request #1730:
URL: https://github.com/apache/lucene-solr/pull/1730


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org






[jira] [Commented] (LUCENE-8776) Start offset going backwards has a legitimate purpose

2020-08-12 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176735#comment-17176735
 ] 

David Smiley commented on LUCENE-8776:
--

I really like the relatively new offset ordering constraint _as a default_ for 
some of the reasons Simon gave. There's this sensibility to it; it just makes 
sense intuitively in a way that needs no defense.  I upgraded a bunch of old 
code to follow this rule and I'm happier with the upgraded version of those 
components than the prior behavior.  *But* then there's some 
interesting/advanced cases where the rule is simply impossible to follow.  What 
I'd like to see is a way for expert users to toggle this off.  Perhaps a 
setting on the Analyzer passed to IndexWriterConfig or just some other setting 
on IndexWriterConfig.  Or maybe a read-only setting on the OffsetAttribute 
(requiring a custom impl)?  That'd be my preference.  I think a pluggable 
IndexingChain thing seems too invasive / too difficult to get right.  I'm 
willing to roll up my sleeves and make this setting happen if I know in advance 
I'm not going to be vetoed on principle.

> Start offset going backwards has a legitimate purpose
> -
>
> Key: LUCENE-8776
> URL: https://issues.apache.org/jira/browse/LUCENE-8776
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/search
>Affects Versions: 7.6
>Reporter: Ram Venkat
>Priority: Major
>
> Here is the use case where startOffset can go backwards:
> Say there is a line "Organic light-emitting-diode glows", and I want to run 
> span queries and highlight them properly. 
> During index time, light-emitting-diode is split into three words, which 
> allows me to search for 'light', 'emitting' and 'diode' individually. The 
> three words occupy adjacent positions in the index, as 'light' adjacent to 
> 'emitting' and 'light' at a distance of two words from 'diode' need to match 
> this word. So, the order of words after splitting is: Organic, light, 
> emitting, diode, glows. 
> But, I also want to search for 'organic' being adjacent to 
> 'light-emitting-diode' or 'light-emitting-diode' being adjacent to 'glows'. 
> The way I solved this was to also generate 'light-emitting-diode' at two 
> positions: (a) In the same position as 'light' and (b) in the same position 
> as 'diode', like below:
> ||organic||light||emitting||diode||glows||
> | |light-emitting-diode| |light-emitting-diode| |
> |0|1|2|3|4|
> The positions of the two 'light-emitting-diode' are 1 and 3, but the offsets 
> are obviously the same. This works beautifully in Lucene 5.x in both 
> searching and highlighting with span queries. 
> But when I try this in Lucene 7.6, it hits the condition "Offsets must not go 
> backwards" at DefaultIndexingChain:818. This IllegalArgumentException is 
> being thrown without any comments on why this check is needed. As I explained 
> above, startOffset going backwards is perfectly valid, to deal with word 
> splitting and span operations on these specialized use cases. On the other 
> hand, it is not clear what value is added by this check and which highlighter 
> code is affected by offsets going backwards. This same check is done at 
> BaseTokenStreamTestCase:245. 
> I see others talk about how this check found bugs in WordDelimiter etc. but 
> it also prevents legitimate use cases. Can this check be removed?  
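
The token layout from the table above can be replayed in a few lines of plain Java to show where the offsets go backwards. This is a simplified stand-in for the indexing-time check, not the code in DefaultIndexingChain: `SketchToken` and `OffsetOrderDemo` are hypothetical names, and the offsets are computed by hand from the string "Organic light-emitting-diode glows".

```java
import java.util.Arrays;
import java.util.List;

// Minimal token model: term text, position increment, and character offsets.
final class SketchToken {
  final String term;
  final int positionIncrement;
  final int startOffset;
  final int endOffset;

  SketchToken(String term, int positionIncrement, int startOffset, int endOffset) {
    this.term = term;
    this.positionIncrement = positionIncrement;
    this.startOffset = startOffset;
    this.endOffset = endOffset;
  }
}

final class OffsetOrderDemo {
  // Tokens for "Organic light-emitting-diode glows", in stream order.
  static List<SketchToken> tokens() {
    return Arrays.asList(
        new SketchToken("organic", 1, 0, 7),
        new SketchToken("light", 1, 8, 13),
        new SketchToken("light-emitting-diode", 0, 8, 28),
        new SketchToken("emitting", 1, 14, 22),
        new SketchToken("diode", 1, 23, 28),
        new SketchToken("light-emitting-diode", 0, 8, 28),
        new SketchToken("glows", 1, 29, 34));
  }

  // Index of the first token whose start offset is smaller than the previous
  // token's start offset, or -1 if the offsets never go backwards.
  static int firstBackwardsOffset(List<SketchToken> stream) {
    int lastStartOffset = 0;
    for (int i = 0; i < stream.size(); i++) {
      int start = stream.get(i).startOffset;
      if (start < lastStartOffset) {
        return i;
      }
      lastStartOffset = start;
    }
    return -1;
  }

  // Absolute position of the token at the given index (first token is position 0).
  static int positionOf(List<SketchToken> stream, int index) {
    int position = -1;
    for (int i = 0; i <= index; i++) {
      position += stream.get(i).positionIncrement;
    }
    return position;
  }

  public static void main(String[] args) {
    List<SketchToken> stream = tokens();
    int index = firstBackwardsOffset(stream);
    System.out.println(stream.get(index).term + " at position " + positionOf(stream, index));
  }
}
```

The second 'light-emitting-diode' token sits at position 3 but starts at offset 8, after 'diode' started at offset 23, which is exactly the situation the check rejects.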






[jira] [Resolved] (SOLR-14748) Correct SSL/Auth startup warning

2020-08-12 Thread Jason Gerlowski (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gerlowski resolved SOLR-14748.

Fix Version/s: 8.7
   master (9.0)
   Resolution: Fixed

> Correct SSL/Auth startup warning
> 
>
> Key: SOLR-14748
> URL: https://issues.apache.org/jira/browse/SOLR-14748
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: master (9.0), 8.6
>Reporter: Jason Gerlowski
>Assignee: Jason Gerlowski
>Priority: Minor
> Fix For: master (9.0), 8.7
>
>
> In 8.4 (as a part of SOLR-13972) I added some warn logging that was intended 
> to warn users if they were using auth without SSL (and therefore exposing 
> their passwords over the network).
> But apparently I inverted the check logic prior to committing and the error 
> now displays incorrectly.
> Fixing this.
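
For reference, the intended condition can be written down as a truth table. This is only a sketch of the behavior described above; `AuthSslWarningSketch` and `shouldWarn` are hypothetical names and the actual Solr code is not shown in this thread.

```java
final class AuthSslWarningSketch {
  // Intended condition: warn only when auth is enabled and SSL is not.
  static boolean shouldWarn(boolean authEnabled, boolean sslEnabled) {
    return authEnabled && !sslEnabled;
  }

  public static void main(String[] args) {
    boolean[] flags = {false, true};
    for (boolean auth : flags) {
      for (boolean ssl : flags) {
        System.out.println("auth=" + auth + " ssl=" + ssl + " warn=" + shouldWarn(auth, ssl));
      }
    }
  }
}
```

Only the auth-without-SSL combination should produce the warning; the other three should stay silent.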






[jira] [Commented] (SOLR-14748) Correct SSL/Auth startup warning

2020-08-12 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176732#comment-17176732
 ] 

ASF subversion and git services commented on SOLR-14748:


Commit 50b6cf1e3b51801926ef312a2a047a87d8ab6630 in lucene-solr's branch 
refs/heads/branch_8x from Jason Gerlowski
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=50b6cf1 ]

SOLR-14748: Correct condition on startup auth/ssl logging


> Correct SSL/Auth startup warning
> 
>
> Key: SOLR-14748
> URL: https://issues.apache.org/jira/browse/SOLR-14748
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: master (9.0), 8.6
>Reporter: Jason Gerlowski
>Assignee: Jason Gerlowski
>Priority: Minor
>
> In 8.4 (as a part of SOLR-13972) I added some warn logging that was intended 
> to warn users if they were using auth without SSL (and therefore exposing 
> their passwords over the network).
> But apparently I inverted the check logic prior to committing and the error 
> now displays incorrectly.
> Fixing this.






[jira] [Commented] (SOLR-14636) Provide a reference implementation for SolrCloud that is stable and fast.

2020-08-12 Thread Mark Robert Miller (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176729#comment-17176729
 ] 

Mark Robert Miller commented on SOLR-14636:
---

Thanks [~ichattopadhyaya], you have kept me motivated since last year. It 
sounds a bit bombastic to claim it’s so much better, and like a huge Infinite 
Jest Novel, it will be difficult to see what’s there fully. The number of 
things that I have fixed that only go wrong or slow down on rarer occasions is 
staggering. With just me, the version of Solr is ready to push past the 
competition. If I can get other devs working on it and we can actually make 
larger more fundamental changes, the sky is the limit. 

> Provide a reference implementation for SolrCloud that is stable and fast.
> -
>
> Key: SOLR-14636
> URL: https://issues.apache.org/jira/browse/SOLR-14636
> Project: Solr
>  Issue Type: Task
>Reporter: Mark Robert Miller
>Assignee: Mark Robert Miller
>Priority: Major
> Attachments: IMG_5575 (1).jpg, jenkins.png, solr-ref-branch.gif
>
>
> SolrCloud powers critical infrastructure and needs the ability to run quickly 
> with stability. This reference implementation will allow for this.
> *location*: [https://github.com/apache/lucene-solr/tree/reference_impl]
> *status*: alpha
> *speed*: ludicrous
> *tests***:
>  * *core*: {color:#00875a}*extremely stable*{color} with 
> *{color:#de350b}ignores{color}*
>  * *solrj*: {color:#00875a}*extremely stable*{color} with 
> {color:#de350b}*ignores*{color}
>  * *test-framework*: *extremely stable* with {color:#de350b}*ignores*{color}
>  * *contrib/analysis-extras*: *extremely stable* with 
> {color:#de350b}*ignores*{color}
>  * *contrib/analytics*: {color:#00875a}*extremely stable*{color} with 
> {color:#de350b}*ignores*{color}
>  * *contrib/clustering*: {color:#00875a}*extremely stable*{color} with 
> *{color:#de350b}ignores{color}*
>  * *contrib/dataimporthandler*: {color:#00875a}*extremely stable*{color} with 
> {color:#de350b}*ignores*{color}
>  * *contrib/dataimporthandler-extras*: {color:#00875a}*extremely 
> stable*{color} with *{color:#de350b}ignores{color}*
>  * *contrib/extraction*: {color:#00875a}*extremely stable*{color} with 
> {color:#de350b}*ignores*{color}
>  * *contrib/jaegertracer-configurator*: {color:#00875a}*extremely 
> stable*{color} with {color:#de350b}*ignores*{color}
>  * *contrib/langid*: {color:#00875a}*extremely stable*{color} with 
> {color:#de350b}*ignores*{color}
>  * *contrib/prometheus-exporter*: {color:#00875a}*extremely stable*{color} 
> with {color:#de350b}*ignores*{color}
>  * *contrib/velocity*: {color:#00875a}*extremely stable*{color} with 
> {color:#de350b}*ignores*{color}
> _* Running tests quickly and efficiently with strict policing will more 
> frequently find bugs and requires a period of hardening._
>  _** Non Nightly currently, Nightly comes last._






[jira] [Commented] (SOLR-14636) Provide a reference implementation for SolrCloud that is stable and fast.

2020-08-12 Thread Mark Robert Miller (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176702#comment-17176702
 ] 

Mark Robert Miller commented on SOLR-14636:
---

Sorry about that delay. I didn’t realize I was on the hook for other stuff all 
last week, and I’ve gotten pretty ambitious about thread management. There is 
just no corner of the code that couldn’t use improvement. The tests are fast 
as hell, but still not as fast as they should be, and I’m shifting into second 
gear there. At the same time, it’s a big juggling trick, I’m getting wiped, and 
the time pressure grows in multiple ways. But I need a good tester like you; I 
will try to have something by Friday. 

> Provide a reference implementation for SolrCloud that is stable and fast.
> -
>
> Key: SOLR-14636
> URL: https://issues.apache.org/jira/browse/SOLR-14636
> Project: Solr
>  Issue Type: Task
>Reporter: Mark Robert Miller
>Assignee: Mark Robert Miller
>Priority: Major
> Attachments: IMG_5575 (1).jpg, jenkins.png, solr-ref-branch.gif
>
>
> SolrCloud powers critical infrastructure and needs the ability to run quickly 
> with stability. This reference implementation will allow for this.
> *location*: [https://github.com/apache/lucene-solr/tree/reference_impl]
> *status*: alpha
> *speed*: ludicrous
> *tests***:
>  * *core*: {color:#00875a}*extremely stable*{color} with 
> *{color:#de350b}ignores{color}*
>  * *solrj*: {color:#00875a}*extremely stable*{color} with 
> {color:#de350b}*ignores*{color}
>  * *test-framework*: *extremely stable* with {color:#de350b}*ignores*{color}
>  * *contrib/analysis-extras*: *extremely stable* with 
> {color:#de350b}*ignores*{color}
>  * *contrib/analytics*: {color:#00875a}*extremely stable*{color} with 
> {color:#de350b}*ignores*{color}
>  * *contrib/clustering*: {color:#00875a}*extremely stable*{color} with 
> *{color:#de350b}ignores{color}*
>  * *contrib/dataimporthandler*: {color:#00875a}*extremely stable*{color} with 
> {color:#de350b}*ignores*{color}
>  * *contrib/dataimporthandler-extras*: {color:#00875a}*extremely 
> stable*{color} with *{color:#de350b}ignores{color}*
>  * *contrib/extraction*: {color:#00875a}*extremely stable*{color} with 
> {color:#de350b}*ignores*{color}
>  * *contrib/jaegertracer-configurator*: {color:#00875a}*extremely 
> stable*{color} with {color:#de350b}*ignores*{color}
>  * *contrib/langid*: {color:#00875a}*extremely stable*{color} with 
> {color:#de350b}*ignores*{color}
>  * *contrib/prometheus-exporter*: {color:#00875a}*extremely stable*{color} 
> with {color:#de350b}*ignores*{color}
>  * *contrib/velocity*: {color:#00875a}*extremely stable*{color} with 
> {color:#de350b}*ignores*{color}
> _* Running tests quickly and efficiently with strict policing will more 
> frequently find bugs and requires a period of hardening._
>  _** Non Nightly currently, Nightly comes last._






[jira] [Commented] (SOLR-14687) Make child/parent query parsers natively aware of _nest_path_

2020-08-12 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176681#comment-17176681
 ] 

ASF subversion and git services commented on SOLR-14687:


Commit 3095dd23958aa3dd9190ca698e1258f994ad374c in lucene-solr's branch 
refs/heads/jira/SOLR-14383 from Chris M. Hostetter
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=3095dd2 ]

SOLR-14383: minor typo/inconsistency fix in sample doc ids

Also add a unit test of heuristic for 'safe' of/which params based on 
_nest_path_ before trying to figure out how to document it clearly

(This is basically the manual 'long form' way of writing child/parent queries 
using the (hypothetical) syntactic sugar described in SOLR-14687)


> Make child/parent query parsers natively aware of _nest_path_
> -
>
> Key: SOLR-14687
> URL: https://issues.apache.org/jira/browse/SOLR-14687
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Chris M. Hostetter
>Priority: Major
>
> A long standing pain point of the parent/child QParsers is the "all parents" 
> bitmask/filter specified via the "which" and "of" params (respectively).
> This is particularly tricky/painful to "get right" when dealing with 
> multi-level nested documents...
>  * 
> https://issues.apache.org/jira/browse/SOLR-14383?focusedCommentId=17166339&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17166339
>  * 
> [https://lists.apache.org/thread.html/r7633a366dd76e7ce9d98e6b9f2a65da8af8240e846f789d938c8113f%40%3Csolr-user.lucene.apache.org%3E]
> ...and it's *really* hard to get right when the nested structure isn't 100% 
> consistent among all docs:
>  * collections that mix docs w/o children and docs that have children.
>  ** Ex: blog posts, some of which have child docs that are "comments", but 
> some don't
>  * when some "types" of documents can exist at multiple levels:
>  ** Ex: top level "product" documents, which may have 2 types of children: 
> "skus" and "manuals", but "skus" may also have their own sku-specific child 
> "manuals"
> BUT! ... now that we have some semi-native support for the {{_nest_path_}} 
> field, I think it may be possible to offer an "easier to use" variant syntax 
> of the parent/child QParsers that directly depends on these fields. This new 
> syntax should be optional – and purely syntactic sugar. "expert" users should 
> be able to do all the same things using the existing syntax (possibly more 
> efficiently depending on what invariants exist in their data model)
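
For illustration, the manual "long form" that this ticket wants to hide behind syntactic sugar can be sketched as plain string building. This is a minimal sketch, assuming that root-level documents are exactly the documents without any `_nest_path_` value, so that `*:* -_nest_path_:*` can serve as the "all parents" filter when the parents of interest are top-level documents. The class and method names are hypothetical, and the field names in `main` are made up.

```java
/** Hypothetical helper; it only builds local-params strings and has no Solr dependency. */
final class NestPathQueries {
  // Assumption: root documents carry no _nest_path_ value, so this matches all of them.
  static final String ALL_ROOT_DOCS = "*:* -_nest_path_:*";

  /**
   * Root documents that have at least one descendant matching childQuery.
   * The child query itself must not match any root document.
   */
  static String parentsOf(String childQuery) {
    return "{!parent which=\"" + ALL_ROOT_DOCS + "\"}" + childQuery;
  }

  /** Descendants of the root documents that match parentQuery. */
  static String childrenOf(String parentQuery) {
    return "{!child of=\"" + ALL_ROOT_DOCS + "\"}" + parentQuery;
  }

  public static void main(String[] args) {
    System.out.println(parentsOf("comment_t:solr"));
    System.out.println(childrenOf("id:P11"));
  }
}
```

Multi-level cases (for example, treating "skus" as the parents of their own "manuals") need a different filter than the root-only one shown here, which is the part the description calls hard to get right.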






[jira] [Commented] (SOLR-14383) Fix indexing-nested-documents.adoc XML/JSON examples to be accurate, consistent, and clear

2020-08-12 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176680#comment-17176680
 ] 

ASF subversion and git services commented on SOLR-14383:


Commit 3095dd23958aa3dd9190ca698e1258f994ad374c in lucene-solr's branch 
refs/heads/jira/SOLR-14383 from Chris M. Hostetter
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=3095dd2 ]

SOLR-14383: minor typo/inconsistency fix in sample doc ids

Also add a unit test of heuristic for 'safe' of/which params based on 
_nest_path_ before trying to figure out how to document it clearly

(This is basically the manual 'long form' way of writing child/parent queries 
using the (hypothetical) syntactic sugar described in SOLR-14687)


> Fix indexing-nested-documents.adoc XML/JSON examples to be accurate, 
> consistent, and clear
> --
>
> Key: SOLR-14383
> URL: https://issues.apache.org/jira/browse/SOLR-14383
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Chris M. Hostetter
>Assignee: Chris M. Hostetter
>Priority: Major
> Attachments: SOLR-14383.patch, SOLR-14383.patch, SOLR-14383.patch, 
> SOLR-14383.patch
>
>
> As reported on solr-user@lucene by Peter Pimley...
> {noformat}
> The page "Indexing Nested Documents" has an XML example showing two
> different ways of adding nested documents:
> https://lucene.apache.org/solr/guide/8_5/indexing-nested-documents.html#xml-examples
> The text says:
>   "It illustrates two styles of adding child documents: the first is
> associated via a field "comment" (preferred), and the second is done
> in the classic way now referred to as an "anonymous" or "unlabelled"
> child document."
> However in the XML directly below there is no field named "comment".
> There is one named "content" and another named "comments" (plural),
> but no field named "comment".  In fact, looking at the Json example
> immediately below, I wonder if the XML element currently named
> "content" should be named "comments", and what is currently marked
> "comments" should be "content"?
> Secondly, in the Json example it says:
>   "The labelled relationship here is one child document but could have
> been wrapped in array brackets."
> However in the actual Json, the parent document (ID=1) with a labelled
> relationship has two child documents (IDs 2 and 3), and they are
> already in array brackets.
> {noformat}
> * The 2 examples (XML and JSON) should be updated to contain *structurally* 
> identical content (i.e., same number of documents, with same field values, and 
> same hierarchical relationships) to focus on demonstrating the syntax 
> differences (i.e., things like the special {{\_childDocuments\_}} key in JSON)
> * The paragraphs describing the examples should be updated to:
> ** refer to the correct field names -- since both "comments" and "content" 
> fields exist in the examples, it's impossible for novice users to even 
> understand where the "typo" might be in the descriptions (I'm pretty 
> knowledgeable about Solr and even I'm second-guessing myself as to what the 
> intent of these paragraphs is)
> ** refer to documents by {{"id"}} value, not just descriptors like "first" 
> and "second" 
> * it might be worth considering rewriting this section to use "callouts": 
> https://asciidoctor.org/docs/user-manual/#callouts -- similar to how we use 
> them in other sections like this: 
> https://lucene.apache.org/solr/guide/8_5/uploading-data-with-index-handlers.html#sending-json-update-commands






[GitHub] [lucene-solr] anshumg commented on pull request #1744: SOLR-14731: Add SingleThreaded Annotation to Class

2020-08-12 Thread GitBox


anshumg commented on pull request #1744:
URL: https://github.com/apache/lucene-solr/pull/1744#issuecomment-673180596


   The annotation usage is correct; however, I haven't looked at the code 
itself to validate whether it's meant to be single threaded. Perhaps other 
folks who've been in here can chime in on that.






[jira] [Commented] (SOLR-5986) Don't allow runaway queries from harming Solr cluster health or search performance

2020-08-12 Thread Anshum Gupta (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-5986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176669#comment-17176669
 ] 

Anshum Gupta commented on SOLR-5986:


Not at the time. I think the TLC (TimeLimitingCollector) provided some value in 
terms of tracking and exiting in a phase that the ExitableDirectoryReader didn't 
cover. The use case required both to exist. Ideally, I would have loved to have 
a single clean approach that tracked the time in one place, but that would've 
required a very different level of plumbing.

If the collection, rather than the query expansion/rewriting, really takes most 
of the time, I think you would still want TLC as a user.
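
The point about needing a check in both phases can be pictured with a toy model: one shared deadline that is consulted separately while terms are expanded and while hits are collected. This is only an illustrative sketch. `TimeBudget`, `rewrite`, and `collect` are hypothetical names and do not correspond to the actual `TimeLimitingCollector` or `ExitableDirectoryReader` APIs.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical illustration; not Solr's TimeLimitingCollector or ExitableDirectoryReader. */
final class TimeBudget {
  private final long deadlineNanos;

  TimeBudget(long allowedMillis) {
    this.deadlineNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(allowedMillis);
  }

  boolean expired() {
    // The subtraction form stays correct even if nanoTime() wraps around.
    return System.nanoTime() - deadlineNanos >= 0;
  }

  void check(String phase) {
    if (expired()) {
      throw new IllegalStateException("time budget exceeded during " + phase);
    }
  }

  /** Stand-in for query expansion/rewriting: one check per enumerated term. */
  static int rewrite(TimeBudget budget, int termCount) {
    int expanded = 0;
    for (int i = 0; i < termCount; i++) {
      budget.check("rewrite");
      expanded++;
    }
    return expanded;
  }

  /** Stand-in for hit collection: one check per collected document. */
  static int collect(TimeBudget budget, int docCount) {
    int collected = 0;
    for (int i = 0; i < docCount; i++) {
      budget.check("collect");
      collected++;
    }
    return collected;
  }

  public static void main(String[] args) {
    TimeBudget budget = new TimeBudget(60_000);
    System.out.println(rewrite(budget, 1_000) + " terms, " + collect(budget, 1_000) + " docs");
  }
}
```

If only one of the two loops checked the budget, a request that spends its time in the other loop would run past the deadline, which is the gap the comment describes.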

> Don't allow runaway queries from harming Solr cluster health or search 
> performance
> --
>
> Key: SOLR-5986
> URL: https://issues.apache.org/jira/browse/SOLR-5986
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Reporter: Steve Davids
>Assignee: Anshum Gupta
>Priority: Critical
> Fix For: 5.0
>
> Attachments: SOLR-5986-fixtests.patch, SOLR-5986-fixtests.patch, 
> SOLR-5986-fixtests.patch, SOLR-5986.patch, SOLR-5986.patch, SOLR-5986.patch, 
> SOLR-5986.patch, SOLR-5986.patch, SOLR-5986.patch, SOLR-5986.patch, 
> SOLR-5986.patch, SOLR-5986.patch, SOLR-5986.patch, SOLR-5986.patch, 
> SOLR-5986.patch
>
>
> The intent of this ticket is to have all distributed search requests stop 
> wasting CPU cycles on requests that have already timed out or are so 
> complicated that they won't be able to execute. We have come across a case 
> where a nasty wildcard query within a proximity clause was causing the 
> cluster to enumerate terms for hours even though the query timeout was set to 
> minutes. This caused a noticeable slowdown within the system, which made us 
> restart the replicas that happened to service that one request. In the worst 
> case, users with a relatively low zk timeout value will have nodes start 
> dropping from the cluster due to long GC pauses.
> [~amccurry] built a mechanism into Apache Blur to help with the issue in 
> BLUR-142 (see commit comment for code, though look at the latest code on the 
> trunk for newer bug fixes).
> Solr should be able to either prevent these problematic queries from running 
> by some heuristic (possibly estimated size of heap usage) or be able to 
> execute a thread interrupt on all query threads once the time threshold is 
> met. This issue mirrors what others have discussed on the mailing list: 
> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200903.mbox/%3c856ac15f0903272054q2dbdbd19kea3c5ba9e105b...@mail.gmail.com%3E






[jira] [Commented] (SOLR-14636) Provide a reference implementation for SolrCloud that is stable and fast.

2020-08-12 Thread Mark Robert Miller (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176664#comment-17176664
 ] 

Mark Robert Miller commented on SOLR-14636:
---

Hey [~markus17], I was hoping to have things ready for others to play with last 
week, but now the hope is this week. I’ll share some getting started doc then. 
Would be great if you were able to take it for a spin. 







[jira] [Commented] (SOLR-14748) Correct SSL/Auth startup warning

2020-08-12 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176638#comment-17176638
 ] 

ASF subversion and git services commented on SOLR-14748:


Commit a6515ca38f9813730d16f1d8eaba953e4cd130ca in lucene-solr's branch 
refs/heads/master from Jason Gerlowski
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=a6515ca ]

SOLR-14748: Correct condition on startup auth/ssl logging


> Correct SSL/Auth startup warning
> 
>
> Key: SOLR-14748
> URL: https://issues.apache.org/jira/browse/SOLR-14748
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: master (9.0), 8.6
>Reporter: Jason Gerlowski
>Assignee: Jason Gerlowski
>Priority: Minor
>
> In 8.4 (as a part of SOLR-13972) I added some warn logging that was intended 
> to warn users if they were using auth without SSL (and therefore exposing 
> their passwords over the network).
> But apparently I inverted the check logic prior to committing and the error 
> now displays incorrectly.
> Fixing this.
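
As a minimal sketch of the intended condition, assuming two boolean flags are available at startup: the warning should fire only when authentication is on and SSL is off. The class and method names below are hypothetical and are not taken from the Solr commit.

```java
/** Hypothetical illustration of the intended startup check; not the code from the commit. */
final class StartupSecurityCheck {
  /** True when credentials could travel in clear text: auth is enabled but SSL is not. */
  static boolean shouldWarn(boolean authEnabled, boolean sslEnabled) {
    return authEnabled && !sslEnabled;
  }

  public static void main(String[] args) {
    boolean[] flags = {false, true};
    // Print the full truth table so an inverted condition is easy to spot.
    for (boolean auth : flags) {
      for (boolean ssl : flags) {
        System.out.println("auth=" + auth + " ssl=" + ssl + " warn=" + shouldWarn(auth, ssl));
      }
    }
  }
}
```

Writing the condition as a small pure function makes it cheap to cover all four flag combinations in a unit test.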






[jira] [Resolved] (SOLR-13412) Make the Lucene Luke module available from a Solr distribution

2020-08-12 Thread Erick Erickson (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-13412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson resolved SOLR-13412.
---
Resolution: Won't Fix

After discussion, adding a pre-packaged Luke app with Solr isn't a good 
architectural fit, so we shouldn't do it.

> Make the Lucene Luke module available from a Solr distribution
> --
>
> Key: SOLR-13412
> URL: https://issues.apache.org/jira/browse/SOLR-13412
> Project: Solr
>  Issue Type: Improvement
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
> Attachments: SOLR-13412.patch
>
>
> Now that [~Tomoko Uchida] has put in a great effort to bring Luke into the 
> project, I think it would be good to be able to access it from a Solr distro.
> I want to go to the right place under the Solr install directory and start 
> Luke up to examine the local indexes. 
> This ticket is explicitly _not_ about accessing it from the admin UI, Luke is 
> a stand-alone app that must be invoked on the node that has a Lucene index on 
> the local filesystem
> We need to 
>  * have it included in Solr when running "ant package". 
>  * add some bits to the ref guide on how to invoke
>  ** Where to invoke it from
>  ** mention anything that has to be installed.
>  ** any other "gotchas" someone just installing Solr should be aware of.
>  * Ant should not be necessary.
> I'll assign this to myself to keep track of, but would not be offended in the 
> least if someone with more knowledge of "ant package" and the like wanted to 
> take it over ;)
> If we can do it at all






[GitHub] [lucene-solr] mayya-sharipova commented on pull request #1725: LUCENE-9449 Skip docs with _doc sort and "after"

2020-08-12 Thread GitBox


mayya-sharipova commented on pull request #1725:
URL: https://github.com/apache/lucene-solr/pull/1725#issuecomment-673129734


   @jimczi Thank you for the initial feedback.  I tried to address it; could 
you please continue the review?






[GitHub] [lucene-solr] mayya-sharipova commented on a change in pull request #1725: LUCENE-9449 Skip docs with _doc sort and "after"

2020-08-12 Thread GitBox


mayya-sharipova commented on a change in pull request #1725:
URL: https://github.com/apache/lucene-solr/pull/1725#discussion_r469568119



##
File path: 
lucene/core/src/java/org/apache/lucene/search/FilteringDocLeafComparator.java
##
@@ -0,0 +1,157 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.search;
+
+import org.apache.lucene.index.LeafReaderContext;
+
+import java.io.IOException;
+
+/**
+ * This comparator is used when there is sort by _doc asc together with "after" FieldDoc.
+ * The comparator provides an iterator that can quickly skip to the desired "after" document.
+ */
+public class FilteringDocLeafComparator implements FilteringLeafFieldComparator {

Review comment:
   I like `AfterDocLeafComparator`, but I renamed it to 
`FilteringAfterDocLeafComparator` for consistency with all the other filtering 
comparators. Please let me know if you would still like it renamed.
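
The idea stated in the class Javadoc can be sketched without Lucene: when hits are sorted by ascending doc ID and paging resumes after a given document, only doc IDs greater than the "after" document are competitive inside a segment. This is a minimal sketch under stated assumptions: the "after" doc ID is global, each segment has a `docBase` offset, and the names `AfterDocSkipSketch` and `firstCompetitiveDoc` are hypothetical rather than part of the PR.

```java
/** Hypothetical illustration of skipping to the "after" document; not the PR's code. */
final class AfterDocSkipSketch {
  // Sentinel for "nothing left in this segment", mirroring the usual iterator convention.
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;

  /**
   * Returns the first segment-local doc ID that can still be competitive.
   *
   * @param afterGlobalDoc global doc ID of the last hit on the previous page
   * @param docBase global doc ID of the segment's first document
   * @param maxDoc number of documents in the segment
   */
  static int firstCompetitiveDoc(int afterGlobalDoc, int docBase, int maxDoc) {
    long first = (long) afterGlobalDoc - docBase + 1;
    if (first < 0) {
      first = 0; // the whole segment lies after the "after" document
    }
    return first >= maxDoc ? NO_MORE_DOCS : (int) first;
  }

  public static void main(String[] args) {
    System.out.println(firstCompetitiveDoc(41, 0, 100));
    System.out.println(firstCompetitiveDoc(41, 100, 50));
    System.out.println(firstCompetitiveDoc(149, 100, 50));
  }
}
```

An iterator that starts at this position never visits the documents already returned on earlier pages, which is where the speed-up would come from.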








[GitHub] [lucene-solr] mayya-sharipova commented on a change in pull request #1725: LUCENE-9449 Skip docs with _doc sort and "after"

2020-08-12 Thread GitBox


mayya-sharipova commented on a change in pull request #1725:
URL: https://github.com/apache/lucene-solr/pull/1725#discussion_r469566632



##
File path: lucene/core/src/java/org/apache/lucene/search/FieldValueHitQueue.java
##
@@ -121,7 +121,7 @@ protected boolean lessThan(final Entry hitA, final Entry hitB) {
   }
   
   // prevent instantiation and extension.
-  private FieldValueHitQueue(SortField[] fields, int size, boolean filterNonCompetitiveDocs) {
+  private FieldValueHitQueue(SortField[] fields, int size, boolean filterNonCompetitiveDocs, boolean hasAfter) {

Review comment:
   At this point in time, topValue for the comparators is not set yet; that's 
why we need `hasAfter`.
   As an alternative to this implementation, we can pass `FieldDoc after` to 
`FieldValueHitQueue.create` and call setTopValue during `FieldValueHitQueue` 
creation.








[GitHub] [lucene-solr] mayya-sharipova commented on a change in pull request #1725: LUCENE-9449 Skip docs with _doc sort and "after"

2020-08-12 Thread GitBox


mayya-sharipova commented on a change in pull request #1725:
URL: https://github.com/apache/lucene-solr/pull/1725#discussion_r469565524



##
File path: 
lucene/core/src/java/org/apache/lucene/search/FilteringFieldComparator.java
##
@@ -68,10 +68,12 @@ public int compareValues(T first, T second) {
    * @param comparator – comparator to wrap
    * @param reverse – if this sort is reverse
    * @param singleSort – true if this sort is based on a single field and there are no other sort fields for tie breaking
+   * @param hasAfter – true if this sort has after FieldDoc
    * @return comparator wrapped as a filtering comparator or the original comparator if the filtering functionality
    * is not implemented for it
    */
-  public static FieldComparator wrapToFilteringComparator(FieldComparator comparator, boolean reverse, boolean singleSort) {
+  public static FieldComparator wrapToFilteringComparator(FieldComparator comparator, boolean reverse, boolean singleSort,
+  boolean hasAfter) {

Review comment:
   At that moment topValue is not set yet; it will be set later in the 
constructor of `PagingFieldCollector`.
   
   








[jira] [Commented] (LUCENE-9457) Why is Kuromoji tokenization throughput bimodal?

2020-08-12 Thread Michael McCandless (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176609#comment-17176609
 ] 

Michael McCandless commented on LUCENE-9457:


{quote}It's one of those things that are exciting to debug, take days to 
complete and sometimes never reach any reasonable explanation. :)
{quote}
LOL I fear you have already handled too many such cases!

> Why is Kuromoji tokenization throughput bimodal?
> 
>
> Key: LUCENE-9457
> URL: https://issues.apache.org/jira/browse/LUCENE-9457
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Priority: Major
>
> With the recent accidental regression of Japanese (Kuromoji) tokenization 
> throughput due to exciting FST optimizations, we [added new nightly Lucene 
> benchmarks|https://github.com/mikemccand/luceneutil/issues/64] to measure 
> tokenization throughput for {{JapaneseTokenizer}}: 
> [https://home.apache.org/~mikemccand/lucenebench/analyzers.html]
> It has already been running for ~5-6 weeks now!  But for some reason, it 
> looks bi-modal?  "Normally" it is ~.45 M tokens/sec, but for two data points 
> it dropped down to ~.33 M tokens/sec, which is odd.  It could be hotspot 
> noise maybe?  But would be good to get to the root cause and fix it if 
> possible.
> Hotspot noise that randomly steals ~27% of your tokenization throughput is no 
> good!!
> Or does anyone have any other ideas of what could be bi-modal in Kuromoji?  I 
> don't think [this performance 
> test|https://github.com/mikemccand/luceneutil/blob/master/src/main/perf/TestAnalyzerPerf.java]
>  has any randomness in it...
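
The "~27%" figure follows directly from the two throughput levels quoted above. The small sketch below redoes that arithmetic; the class and method names are hypothetical, and the inputs are the approximate values from the description rather than measured data.

```java
/** Hypothetical helper that redoes the arithmetic from the issue description. */
final class ThroughputDrop {
  /** Relative drop from the normal level to the slow level, in percent. */
  static double relativeDropPercent(double normal, double slow) {
    return (normal - slow) / normal * 100.0;
  }

  public static void main(String[] args) {
    // ~0.45 M tokens/sec normally versus ~0.33 M tokens/sec in the slow mode:
    // (0.45 - 0.33) / 0.45 is about 0.267, i.e. roughly a 27% loss.
    System.out.println(relativeDropPercent(0.45, 0.33));
  }
}
```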






[GitHub] [lucene-solr] mikemccand commented on a change in pull request #1743: Gradual naming convention enforcement.

2020-08-12 Thread GitBox


mikemccand commented on a change in pull request #1743:
URL: https://github.com/apache/lucene-solr/pull/1743#discussion_r469551162



##
File path: 
lucene/test-framework/src/java/org/apache/lucene/util/VerifyTestClassNamingConvention.java
##
@@ -0,0 +1,98 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.util;
+
+import com.carrotsearch.randomizedtesting.RandomizedContext;
+import org.junit.Assume;
+
+import java.io.BufferedReader;
+import java.io.IOException;
+import java.io.InputStreamReader;
+import java.io.UncheckedIOException;
+import java.io.Writer;
+import java.nio.charset.StandardCharsets;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.nio.file.Paths;
+import java.nio.file.StandardOpenOption;
+import java.util.HashSet;
+import java.util.Set;
+import java.util.regex.Pattern;
+
+/**
+ * Enforce test naming convention.
+ */
+public class VerifyTestClassNamingConvention extends AbstractBeforeAfterRule {
+  public static final Pattern ALLOWED_CONVENTION = Pattern.compile("(.+?)\\.Test[^.]+");
+
+  private static Set<String> exceptions;
+  static {
+    try {
+      exceptions = new HashSet<>();
+      try (BufferedReader is =
+             new BufferedReader(
+                 new InputStreamReader(
+                   VerifyTestClassNamingConvention.class.getResourceAsStream("test-naming-exceptions.txt"),
+                   StandardCharsets.UTF_8))) {
+        is.lines().forEach(exceptions::add);
+      }
+    } catch (IOException e) {
+      throw new UncheckedIOException(e);
+    }
+  }
+
+  @Override
+  protected void before() throws Exception {
+    if (TestRuleIgnoreTestSuites.isRunningNested()) {
+      // Ignore nested test suites that test the test framework itself.
+      return;
+    }
+
+    String suiteName = RandomizedContext.current().getTargetClass().getName();
+
+    // You can use this helper method to dump all suite names to a file.
+    // Run gradle with one worker so that it doesn't try to append to the same
+    // file from multiple processes:
+    //
+    // gradlew  test --max-workers 1 -Dtests.useSecurityManager=false
+    //
+    // dumpSuiteNamesOnly(suiteName);
+
+    if (!ALLOWED_CONVENTION.matcher(suiteName).matches()) {
+      // if this class exists on the exception list, leave it.
+      if (!exceptions.contains("!" + suiteName)) {
+        throw new AssertionError("Suite must follow Test*.java naming convention: "
+            + suiteName);
+      }
+    }
+  }
+
+  private void dumpSuiteNamesOnly(String suiteName) throws IOException {
+    // Has to be a global unique path (not a temp file because temp files
+    // are different for each JVM).
+    Path temporaryFile = Paths.get("c:\\_tmp\\test-naming-exceptions.txt");

Review comment:
   Hmm this is effectively a `// nocommit` right?  I.e. we must find the 
right place for this to live in the sources?
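
The naming check in the quoted diff can be tried in isolation. The sketch below reuses the `ALLOWED_CONVENTION` pattern exactly as it appears in the diff; the wrapper class, the `followsConvention` method, and the sample class names are hypothetical and only serve the demonstration.

```java
import java.util.regex.Pattern;

/** Hypothetical demo around the pattern quoted in the diff above. */
final class NamingConventionDemo {
  // Same regular expression as ALLOWED_CONVENTION in the diff: a package prefix,
  // then a simple class name that starts with "Test".
  static final Pattern ALLOWED_CONVENTION = Pattern.compile("(.+?)\\.Test[^.]+");

  static boolean followsConvention(String suiteName) {
    // matches() requires the whole fully qualified name to fit the pattern.
    return ALLOWED_CONVENTION.matcher(suiteName).matches();
  }

  public static void main(String[] args) {
    // Sample names are made up for the demonstration.
    System.out.println(followsConvention("org.example.util.TestVersion"));
    System.out.println(followsConvention("org.example.util.VersionTest"));
  }
}
```

Because the pattern requires a literal dot before `Test`, a suffix-style name such as `VersionTest` is rejected unless it is listed in the exceptions file.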

##
File path: 
lucene/test-framework/src/java/org/apache/lucene/util/VerifyTestClassNamingConvention.java
##
@@ -0,0 +1,98 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.util;
+
+import com.carrotsearch.randomizedtesting.RandomizedContext;
+import org.junit.Assume;
+
+import java.io.BufferedReader;
+import java.io.IOException;
+import java.io.InputStreamReader;
+import java.io.UncheckedIOException;
+import java.io.Writer;
+import java.nio.charset.StandardCharsets;
+import java.nio.file.Files;
+

[jira] [Commented] (LUCENE-8626) standardise test class naming

2020-08-12 Thread Michael McCandless (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176607#comment-17176607
 ] 

Michael McCandless commented on LUCENE-8626:


Ahh thanks [~dweiss], yeah +1 for this approach – progress not perfection!

Hmm did you link the PR here?  OK looks like: 
[https://github.com/apache/lucene-solr/pull/1743]

I left a couple comments on the PR.

> standardise test class naming
> -
>
> Key: LUCENE-8626
> URL: https://issues.apache.org/jira/browse/LUCENE-8626
> Project: Lucene - Core
>  Issue Type: Test
>Reporter: Christine Poerschke
>Priority: Major
> Attachments: SOLR-12939.01.patch, SOLR-12939.02.patch, 
> SOLR-12939.03.patch, SOLR-12939_hoss_validation_groovy_experiment.patch
>
>
> This was mentioned and proposed on the dev mailing list. Starting this ticket 
> here to start to make it happen?
> History: This ticket was created as 
> https://issues.apache.org/jira/browse/SOLR-12939 ticket and then got 
> JIRA-moved to become https://issues.apache.org/jira/browse/LUCENE-8626 ticket.






[GitHub] [lucene-solr] mikemccand commented on pull request #1623: LUCENE-8962: Merge segments on getReader

2020-08-12 Thread GitBox


mikemccand commented on pull request #1623:
URL: https://github.com/apache/lucene-solr/pull/1623#issuecomment-673112665


   Awesome, thanks @s1monw!  I will try to have a look soon.  I kicked off 
beasting of all Lucene (core + modules) tests with this change ... no failures 
yet after 31 iterations.






[jira] [Commented] (SOLR-5986) Don't allow runaway queries from harming Solr cluster health or search performance

2020-08-12 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-5986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176591#comment-17176591
 ] 

David Smiley commented on SOLR-5986:


[~anshum] did you consider _removing_ Solr's use of TimeLimitingCollector?  I 
believe ExitableDirectoryReader is superior in its span of effect to TLC.  TLC 
_might_ extend slightly longer (?) but I don't think it's worth the complexity 
of maintaining competing mechanisms.  I can file an issue... what do you think?

> Don't allow runaway queries from harming Solr cluster health or search 
> performance
> --
>
> Key: SOLR-5986
> URL: https://issues.apache.org/jira/browse/SOLR-5986
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Reporter: Steve Davids
>Assignee: Anshum Gupta
>Priority: Critical
> Fix For: 5.0
>
> Attachments: SOLR-5986-fixtests.patch, SOLR-5986-fixtests.patch, 
> SOLR-5986-fixtests.patch, SOLR-5986.patch, SOLR-5986.patch, SOLR-5986.patch, 
> SOLR-5986.patch, SOLR-5986.patch, SOLR-5986.patch, SOLR-5986.patch, 
> SOLR-5986.patch, SOLR-5986.patch, SOLR-5986.patch, SOLR-5986.patch, 
> SOLR-5986.patch
>
>
> The intent of this ticket is to have all distributed search requests stop 
> wasting CPU cycles on requests that have already timed out or are so 
> complicated that they won't be able to execute. We have come across a case 
> where a nasty wildcard query within a proximity clause was causing the 
> cluster to enumerate terms for hours even though the query timeout was set to 
> minutes. This caused a noticeable slowdown within the system which made us 
> restart the replicas that happened to service that one request. In the worst 
> case, users with a relatively low zk timeout value will have nodes start 
> dropping from the cluster due to long GC pauses.
> [~amccurry] built a mechanism into Apache Blur to help with the issue in 
> BLUR-142 (see commit comment for code, though look at the latest code on the 
> trunk for newer bug fixes).
> Solr should be able to either prevent these problematic queries from running 
> by some heuristic (possibly estimated size of heap usage) or be able to 
> execute a thread interrupt on all query threads once the time threshold is 
> met. This issue mirrors what others have discussed on the mailing list: 
> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200903.mbox/%3c856ac15f0903272054q2dbdbd19kea3c5ba9e105b...@mail.gmail.com%3E






[jira] [Commented] (LUCENE-8319) A Time-limiting collector that works with CollectorManagers

2020-08-12 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176585#comment-17176585
 ] 

David Smiley commented on LUCENE-8319:
--

Note that there is also ExitableDirectoryReader which seems to compete with 
TimeLimitingCollector.  IMO EDR is better because it extends earlier, to the 
query rewrite phase, and TLC has maybe no advantages?  I'd rather see TLC 
removed.  Anyway, I bring this up because I'm not sure how EDR plays with 
concurrent search.  Maybe just fine, maybe there is a parallel concern there 
with the proposal above.

> A Time-limiting collector that works with CollectorManagers
> ---
>
> Key: LUCENE-8319
> URL: https://issues.apache.org/jira/browse/LUCENE-8319
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: Tony Xu
>Priority: Minor
>
> Currently Lucene has *TimeLimitingCollector* to support time-bound collection 
> and it will throw 
> *TimeExceededException* if timeout happens. This only works nicely with the 
> single-thread low-level API from the IndexSearcher. The method signature is --
> *void search(List<LeafReaderContext> leaves, Weight weight, Collector collector)*
> The intended use is to always enclose the searcher.search(query, collector) 
> call with a try ... catch and handle the timeout exception. Unfortunately 
> when working with a *CollectorManager* in the multi-thread search context, 
> the *TimeExceededException* thrown during collecting one leaf slice will be 
> re-thrown by *IndexSearcher* without calling *CollectorManager*'s reduce(), 
> even if other slices are successfully collected. The signature 
> of the search api with *CollectorManager* is --
> *<C extends Collector, T> T search(Query query, CollectorManager<C, T> collectorManager)*
>  
> The good news is that IndexSearcher handles *CollectionTerminatedException* 
> gracefully by ignoring it. We can either wrap TimeLimitingCollector and throw 
>  *CollectionTerminatedException* when timeout happens or simply replace 
> *TimeExceededException* with *CollectionTerminatedException*. Either way, 
> we also need to maintain a flag that indicates whether a timeout occurred so 
> that the user knows it's a partial collection.
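
The proposal quoted above can be illustrated with a small stand-alone sketch. It does not use Lucene: SliceTerminatedException stands in for CollectionTerminatedException, and TimeLimitedSliceCollector, PartialSearchSketch and the injected clock are hypothetical names chosen for illustration. Slices are processed sequentially here to keep the example short; the point is only that a timeout in one slice ends that slice quietly, the reduce step still runs, and a shared flag records that the result is partial.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.LongSupplier;

/** Stand-in for Lucene's CollectionTerminatedException (hypothetical, not the real class). */
final class SliceTerminatedException extends RuntimeException {}

/** Hypothetical per-slice collector that stops quietly once the deadline has passed. */
final class TimeLimitedSliceCollector {
  private final long deadline;
  private final LongSupplier clock;
  private final AtomicBoolean timedOut; // shared by all slices of one search
  final List<Integer> hits = new ArrayList<>();

  TimeLimitedSliceCollector(long deadline, LongSupplier clock, AtomicBoolean timedOut) {
    this.deadline = deadline;
    this.clock = clock;
    this.timedOut = timedOut;
  }

  void collect(int doc) {
    if (clock.getAsLong() >= deadline) {
      timedOut.set(true); // remember that the overall result is partial
      throw new SliceTerminatedException();
    }
    hits.add(doc);
  }
}

final class PartialSearchSketch {
  /** Collects every slice, swallows the termination signal, and still runs the reduce step. */
  static List<Integer> search(
      List<int[]> slices, long deadline, LongSupplier clock, AtomicBoolean timedOut) {
    List<Integer> reduced = new ArrayList<>();
    for (int[] slice : slices) {
      TimeLimitedSliceCollector collector =
          new TimeLimitedSliceCollector(deadline, clock, timedOut);
      try {
        for (int doc : slice) {
          collector.collect(doc);
        }
      } catch (SliceTerminatedException e) {
        // Deliberately ignored: keep whatever this slice collected before the deadline.
      }
      reduced.addAll(collector.hits);
    }
    return reduced;
  }

  public static void main(String[] args) {
    long[] ticks = {0}; // fake clock that advances by one on every read
    AtomicBoolean timedOut = new AtomicBoolean();
    List<Integer> hits =
        search(List.of(new int[] {1, 2}, new int[] {3, 4, 5}), 3, () -> ticks[0]++, timedOut);
    System.out.println(hits + " partial=" + timedOut.get());
  }
}
```

Injecting the clock keeps the sketch deterministic; a real implementation would presumably read a shared timer instead.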






[GitHub] [lucene-solr] s1monw commented on pull request #1623: LUCENE-8962: Merge segments on getReader

2020-08-12 Thread GitBox


s1monw commented on pull request #1623:
URL: https://github.com/apache/lucene-solr/pull/1623#issuecomment-673088302


   @mikemccand @msokolov @msfroh I pushed a new and slightly more complex but 
afaik correct approach to do the merge during getReader. Would be great to get 
some feedback. I think it's still pretty contained.






[jira] [Commented] (LUCENE-8626) standardise test class naming

2020-08-12 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176578#comment-17176578
 ] 

Dawid Weiss commented on LUCENE-8626:
-

bq. 1) do the renaming at once

I pushed the PR which makes it easier to do it in smaller steps (it requires any 
new classes to follow the convention though). Lucene may be easy as patches are 
smaller; Solr may be more problematic.
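
A minimal sketch of how such an incremental check could work, assuming a regular expression for the Test* prefix and a set of grandfathered class names. The pattern, the class name NamingConventionSketch and the shape of the exception list are assumptions for illustration; the PR's actual regex and exception-file format are not shown in this thread.

```java
import java.util.Set;
import java.util.regex.Pattern;

final class NamingConventionSketch {
  // Assumed pattern: an optional package prefix, then a simple class name starting with "Test".
  static final Pattern ALLOWED_CONVENTION = Pattern.compile("(.+\\.)?Test[^.]+");

  /** Fails for suites that neither follow the convention nor appear on the exception list. */
  static void check(String suiteName, Set<String> exceptions) {
    if (!ALLOWED_CONVENTION.matcher(suiteName).matches() && !exceptions.contains(suiteName)) {
      throw new AssertionError("Suite must follow Test*.java naming convention: " + suiteName);
    }
  }

  public static void main(String[] args) {
    Set<String> grandfathered = Set.of("org.example.LegacyFooTest");
    check("org.example.TestFoo", grandfathered);       // follows the convention
    check("org.example.LegacyFooTest", grandfathered); // tolerated until it is renamed
    try {
      check("org.example.BrandNewTest", grandfathered);
    } catch (AssertionError e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Shrinking the exception list over time gives the "smaller steps" migration path while new classes are held to the convention immediately.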







[jira] [Commented] (LUCENE-9457) Why is Kuromoji tokenization throughput bimodal?

2020-08-12 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176574#comment-17176574
 ] 

Dawid Weiss commented on LUCENE-9457:
-

It's one of those things that are exciting to debug, take days to complete and 
sometimes never reach any reasonable explanation. :)

> Why is Kuromoji tokenization throughput bimodal?
> 
>
> Key: LUCENE-9457
> URL: https://issues.apache.org/jira/browse/LUCENE-9457
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Priority: Major
>
> With the recent accidental regression of Japanese (Kuromoji) tokenization 
> throughput due to exciting FST optimizations, we [added new nightly Lucene 
> benchmarks|https://github.com/mikemccand/luceneutil/issues/64] to measure 
> tokenization throughput for {{JapaneseTokenizer}}: 
> [https://home.apache.org/~mikemccand/lucenebench/analyzers.html]
> It has already been running for ~5-6 weeks now!  But for some reason, it 
> looks bi-modal?  "Normally" it is ~.45 M tokens/sec, but for two data points 
> it dropped down to ~.33 M tokens/sec, which is odd.  It could be hotspot 
> noise maybe?  But would be good to get to the root cause and fix it if 
> possible.
> Hotspot noise that randomly steals ~27% of your tokenization throughput is no 
> good!!
> Or does anyone have any other ideas of what could be bi-modal in Kuromoji?  I 
> don't think [this performance 
> test|https://github.com/mikemccand/luceneutil/blob/master/src/main/perf/TestAnalyzerPerf.java]
>  has any randomness in it...






[jira] [Commented] (LUCENE-9450) Taxonomy index should use DocValues not StoredFields

2020-08-12 Thread Gautam Worah (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176561#comment-17176561
 ] 

Gautam Worah commented on LUCENE-9450:
--

Thanks for the feedback [~mikemccand] !

Unfortunately, there were no errors at the line where I called 
{{binaryValue()}} (without calling {{.advanceExact()}})

However, the docs for the {{.binaryValue()}} function do say that

```
It is illegal to call this method after {@link #advanceExact(int)} returned {@code false}.
```

I've submitted a Revision 1 
[PR|https://github.com/apache/lucene-solr/pull/1733] with the following changes:
 * Added a call to {{advanceExact()}} before calling {{binaryValue()}} and 
an {{assert}} to check that the field exists in the index

 * Re-added the {{StringField}} with the {{Field.Store.YES}} changed to 
{{Field.Store.NO}}.

 * I've not added new tests at the moment. Trying to get the existing ones to 
work first.
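
The position-then-read pattern from the first bullet can be sketched without Lucene on the classpath. SparseBinaryValues below is a hypothetical stand-in that only mimics the advanceExact/binaryValue contract; unlike the behaviour reported above, it deliberately throws when the value is read without a successful advanceExact, so that a misuse is visible. TaxonomyLabelLookupSketch and its method are also made-up names.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;

/** Hypothetical stand-in for a doc-values iterator; not Lucene's BinaryDocValues. */
final class SparseBinaryValues {
  private final Map<Integer, byte[]> values;
  private byte[] current; // null until advanceExact has succeeded

  SparseBinaryValues(Map<Integer, byte[]> values) {
    this.values = values;
  }

  /** Positions on the given doc and reports whether it has a value. */
  boolean advanceExact(int doc) {
    current = values.get(doc);
    return current != null;
  }

  /** Only legal after advanceExact returned true. */
  byte[] binaryValue() {
    if (current == null) {
      throw new IllegalStateException("binaryValue() called without a successful advanceExact()");
    }
    return current;
  }
}

final class TaxonomyLabelLookupSketch {
  /** Resolves an ordinal to its label, checking the result of advanceExact first. */
  static String labelFor(SparseBinaryValues docValues, int ordinal) {
    if (!docValues.advanceExact(ordinal)) {
      throw new IllegalStateException("no label stored for ordinal " + ordinal);
    }
    return new String(docValues.binaryValue(), StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    SparseBinaryValues docValues =
        new SparseBinaryValues(Map.of(0, "author/bob".getBytes(StandardCharsets.UTF_8)));
    System.out.println(labelFor(docValues, 0));
  }
}
```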

> Taxonomy index should use DocValues not StoredFields
> 
>
> Key: LUCENE-9450
> URL: https://issues.apache.org/jira/browse/LUCENE-9450
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Affects Versions: 8.5.2
>Reporter: Gautam Worah
>Priority: Minor
>  Labels: performance
> Attachments: wip_taxonomy_patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> The taxonomy index that maps binning labels to ordinals was created before 
> Lucene added BinaryDocValues.
> I've attached a WIP patch (does not pass tests currently)
> Issue suggested by [~mikemccand]






[jira] [Commented] (LUCENE-9457) Why is Kuromoji tokenization throughput bimodal?

2020-08-12 Thread Michael McCandless (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176559#comment-17176559
 ] 

Michael McCandless commented on LUCENE-9457:


Yeah that is one possible theory, but, this machine (dedicated physical box) is 
very idle and only runs Lucene's nightly benchmarks.  Also, the other 
benchmarks run on those same timestamps (e.g. the other analyzers) did not also 
seem to show a performance drop.  So I think it is not likely a time specific 
environmental issue ...







[jira] [Commented] (LUCENE-8626) standardise test class naming

2020-08-12 Thread Michael McCandless (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176557#comment-17176557
 ] 

Michael McCandless commented on LUCENE-8626:


{quote}I'm myself more of a suffix-style kind of guy (so that prefix class 
searches work for both the class and its test in any environment)
{quote}
I had thought this issue was really a butter side up / Sneetches / Law of 
Triviality / bike shedding sort of situation, but it is not ;)

This justification makes sense to me, so I think suffix is indeed better than 
prefix, as long as we can 1) do the renaming at once, and 2) enforce that 
consistent naming going forwards.

+1 to rename all of Lucene's tests to use {{Test.java}} suffix.







[jira] [Commented] (LUCENE-9439) Matches API should enumerate hit fields that have no positions (no iterator)

2020-08-12 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176555#comment-17176555
 ] 

Dawid Weiss commented on LUCENE-9439:
-

It does, doesn't it? I am in favor of your suggestion to split the patch into 
two separate issues - makes sense. I'll do it (I have an away day tomorrow, but 
hopefully on Friday).

And you're right that something resembling an "end product" highlighter would 
be nice too. Although I believe part of the power is in these components being 
so nicely decoupled... You can easily build pretty much any kind of 
highlighting you wish with these... For example, I built a component that uses 
hit highlighting over several fields but always fills in certain fields with 
the default "snippet" if they're not part of the query hit.  Yes, it does 
require custom code as opposed to just configuration but it also opens up a 
great freedom of choice in how you want your highlighter to work.
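
As a rough illustration of the kind of custom component described above, here is a sketch that merges highlighted snippets with default snippets for fields that must always be shown. It is plain Java and does not use the components from the patch; SnippetFallbackSketch, render and all parameter names are hypothetical, and the "default snippet" is assumed to be simply the leading characters of the stored value.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SnippetFallbackSketch {
  /**
   * Starts from the highlighted snippets of fields that were part of the query hit and fills in
   * a truncated stored value for every always-shown field that has no highlight.
   */
  static Map<String, String> render(
      Map<String, String> storedValues,
      Map<String, String> highlighted,
      List<String> alwaysShown,
      int maxDefaultLength) {
    Map<String, String> result = new LinkedHashMap<>(highlighted);
    for (String field : alwaysShown) {
      String value = storedValues.get(field);
      if (value != null && !result.containsKey(field)) {
        result.put(field, value.substring(0, Math.min(value.length(), maxDefaultLength)));
      }
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, String> stored =
        Map.of("title", "Lucene in Action", "body", "Lucene is a fast search library");
    Map<String, String> highlighted = Map.of("body", "a <b>fast</b> search library");
    System.out.println(render(stored, highlighted, List.of("title", "body"), 6));
  }
}
```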

I'll have to think how to best showcase this without putting everything in one 
box.

> Matches API should enumerate hit fields that have no positions (no iterator)
> 
>
> Key: LUCENE-9439
> URL: https://issues.apache.org/jira/browse/LUCENE-9439
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Minor
> Attachments: LUCENE-9439.patch, matchhighlighter.patch
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> I have been fiddling with Matches API and it's great. There is one corner 
> case that doesn't work for me though -- queries that affect fields without 
> positions return {{MatchesUtil.MATCH_WITH_NO_TERMS}} but this constant is 
> problematic as it doesn't carry the field name that caused it (returns null).
> The associated fromSubMatches combines all these constants into one (or 
> swallows them) which is another problem.
> I think it would be more consistent if MATCH_WITH_NO_TERMS was replaced with 
> a true match (carrying field name) returning an empty iterator (or a constant 
> "empty" iterator NO_TERMS).
> I have a very compelling use case: I wrote an "auto-highlighter" that runs on 
> top of Matches API and automatically picks up query-relevant fields and 
> snippets. Everything works beautifully except for cases where fields are 
> searchable but don't have any positions (token-like fields).
> I can work on a patch but wanted to reach out first - [~romseygeek]?






[jira] [Commented] (LUCENE-9447) Make BEST_COMPRESSION compress more aggressively?

2020-08-12 Thread Michael McCandless (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176554#comment-17176554
 ] 

Michael McCandless commented on LUCENE-9447:


+1 to simply switch to bigger default block size (256 KB seems good) for now.  
At least for this particular corpus, the reduction is massive (~33%).
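
For reference, the ~33% figure can be reproduced from the sizes in the table of the issue description (quoted below). The sketch only does the arithmetic; BlockSizeSavingsSketch and reductionPercent are made-up names.

```java
final class BlockSizeSavingsSketch {
  /** Size reduction of candidate relative to baseline, in percent. */
  static double reductionPercent(long baselineBytes, long candidateBytes) {
    return 100.0 * (baselineBytes - candidateBytes) / baselineBytes;
  }

  public static void main(String[] args) {
    // Stored fields sizes taken from the table in the LUCENE-9447 description.
    String[] blockSizes = {"128kB", "256kB", "512kB", "1MB", "2MB", "4MB", "8MB"};
    long[] storedFieldsSizes = {
      130813639L, 113587009L, 104776378L, 100367095L, 98152464L, 97034425L, 96478746L
    };
    long baseline = 168412338L; // the current 60kB block size
    for (int i = 0; i < blockSizes.length; i++) {
      System.out.printf(
          "%s: %.1f%% smaller than 60kB%n",
          blockSizes[i], reductionPercent(baseline, storedFieldsSizes[i]));
    }
  }
}
```

The output also makes the diminishing returns beyond 256kB easy to see for this dataset.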

It would be nice if we could auto-adapt the block size based on how 
compressible the stored fields really are, dynamically tuning the index size vs 
CPU cost of doc retrieval, but that can come later.

But, do we have any benchmarks that measure the CPU impact to retrieving stored 
fields?  That is the downside of compressing bigger blocks, right?  Higher 
per-hit decode cost (if the hit is in a new block).

E.g. Lucene's facet implementation relies on this, since resolving its int 
ordinals to human friendly facet labels is done by loading a document for each 
ordinal.  [~gworah] is working on switching to doc values in LUCENE-9450 to 
reduce this cost.

Sharing the compression dictionary across blocks would be amazing, but that is 
surely complex, and would indeed likely reduce how often we could bulk-copy 
compressed blocks during merging.  But, maybe that is OK?  Increasing indexing 
cost in order to get a smaller index is often a good tradeoff?  Does {{zlib}} 
maybe support merging dictionaries / quickly re-writing a previously compressed 
output based on a new dictionary?  Maybe we (later!) could switch to a 
different implementation that would offer such "expert" APIs?

> Make BEST_COMPRESSION compress more aggressively?
> -
>
> Key: LUCENE-9447
> URL: https://issues.apache.org/jira/browse/LUCENE-9447
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Priority: Minor
>
> The Lucene86 codec supports setting a "Mode" for stored fields compression, 
> that is either "BEST_SPEED", which translates to blocks of 16kB or 128 
> documents (whichever is hit first) compressed with LZ4, or 
> "BEST_COMPRESSION", which translates to blocks of 60kB or 512 documents 
> compressed with DEFLATE with default compression level (6).
> After looking at indices that spent most disk space on stored fields 
> recently, I noticed that there was quite some room for improvement by 
> increasing the block size even further:
> ||Block size||Stored fields size||
> |60kB|168412338|
> |128kB|130813639|
> |256kB|113587009|
> |512kB|104776378|
> |1MB|100367095|
> |2MB|98152464|
> |4MB|97034425|
> |8MB|96478746|
> For this specific dataset, I had 1M documents that each had about 2kB of 
> stored fields and quite some redundancy.
> This makes me want to look into bumping this block size to maybe 256kB. It 
> would be interesting to re-do the experiments we did on LUCENE-6100 to see 
> how this affects the merging speed. That said I don't think it would be 
> terrible if the merging time increased a bit given that we already offer the 
> BEST_SPEED option for CPU-savvy users.






[jira] [Created] (SOLR-14748) Correct SSL/Auth startup warning

2020-08-12 Thread Jason Gerlowski (Jira)
Jason Gerlowski created SOLR-14748:
--

 Summary: Correct SSL/Auth startup warning
 Key: SOLR-14748
 URL: https://issues.apache.org/jira/browse/SOLR-14748
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
Affects Versions: 8.6, master (9.0)
Reporter: Jason Gerlowski
Assignee: Jason Gerlowski


In 8.4 (as a part of SOLR-13972) I added some warn logging that was intended to 
warn users if they were using auth without SSL (and therefore exposing their 
passwords over the network).

But apparently I inverted the check logic prior to committing and the error now 
displays incorrectly.

Fixing this.
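
The intended condition, as described above, is "auth enabled and SSL not enabled". A minimal sketch of that predicate follows; the class and method names are hypothetical and this is not the actual Solr startup code.

```java
final class AuthSslWarningSketch {
  /** Warn only when authentication is on while SSL is off (credentials would travel in clear). */
  static boolean shouldWarn(boolean authEnabled, boolean sslEnabled) {
    return authEnabled && !sslEnabled;
  }

  public static void main(String[] args) {
    boolean[] flags = {false, true};
    for (boolean auth : flags) {
      for (boolean ssl : flags) {
        System.out.println("auth=" + auth + " ssl=" + ssl + " warn=" + shouldWarn(auth, ssl));
      }
    }
  }
}
```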






[GitHub] [lucene-solr] murblanc commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for placement plugin interface

2020-08-12 Thread GitBox


murblanc commented on a change in pull request #1684:
URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r469455229



##
File path: solr/core/src/java/org/apache/solr/cluster/placement/Cluster.java
##
@@ -0,0 +1,53 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.cluster.placement;
+
+import java.io.IOException;
+import java.util.Optional;
+import java.util.Set;
+
+/**
+ * A representation of the (initial) cluster state, providing information on which nodes are part of the cluster and a way
+ * to get to more detailed info.
+ *
+ * This instance can also be used as a {@link PropertyValueSource} if {@link PropertyKey}'s need to be specified with
+ * a global cluster target.
+ */
+public interface Cluster extends PropertyValueSource {
+  /**
+   * @return current set of live nodes. Never null, never empty (Solr wouldn't call the plugin if empty
+   * since no useful work could then be done).
+   */
+  Set getLiveNodes();
+
+  /**
+   * Returns info about the given collection if one exists. Because it is not expected for plugins to request info about
+   * a large number of collections, requests can only be made one by one.
+   *
+   * This is also the reason we do not return a {@link java.util.Map} or {@link Set} of {@link SolrCollection}'s here: it would be
+   * wasteful to fetch all data and fill such a map when plugin code likely needs info about at most one or two collections.
+   */
+  Optional<SolrCollection> getCollection(String collectionName) throws IOException;
+
+  /**
+   * Allows getting all {@link SolrCollection} present in the cluster.
+   *
+   * WARNING: this call might be extremely inefficient on large clusters. Usage is discouraged.
+   */
+  Set<SolrCollection> getAllCollections();

Review comment:
   Side note: I need (and plan to eventually make) SolrCloud to scale to 
hundreds of thousands of collections. Anything that’s in O(n) or worse in number 
of collections will not fly for this scale. Implementations can be inefficient 
(and can change) but let’s try to keep interfaces efficient.
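
To make the point concrete, here is a small stand-alone sketch of a by-name lookup whose cost depends on the number of collections a plugin asks about, not on the number of collections in the cluster. It is not the Solr interface under review: OnDemandClusterSketch, its loader function and the use of String as the collection info type are assumptions for illustration.

```java
import java.util.Optional;
import java.util.function.Function;

final class OnDemandClusterSketch {
  private final Function<String, Optional<String>> loader; // hypothetical: name -> collection info
  private int fetches;

  OnDemandClusterSketch(Function<String, Optional<String>> loader) {
    this.loader = loader;
  }

  /** Fetches info for one collection only; nothing else in the cluster is touched. */
  Optional<String> getCollection(String collectionName) {
    fetches++;
    return loader.apply(collectionName);
  }

  /** Number of fetches performed so far. */
  int fetchCount() {
    return fetches;
  }

  public static void main(String[] args) {
    // Pretend the cluster holds a very large number of collections named c0, c1, ...
    OnDemandClusterSketch cluster =
        new OnDemandClusterSketch(
            name -> name.startsWith("c") ? Optional.of("info:" + name) : Optional.empty());
    System.out.println(cluster.getCollection("c42"));
    System.out.println(cluster.getCollection("unknown"));
    System.out.println("fetches=" + cluster.fetchCount());
  }
}
```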








[GitHub] [lucene-solr] murblanc commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for placement plugin interface

2020-08-12 Thread GitBox


murblanc commented on a change in pull request #1684:
URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r469452206



##
File path: solr/core/src/java/org/apache/solr/cluster/placement/Cluster.java
##
@@ -0,0 +1,53 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.cluster.placement;
+
+import java.io.IOException;
+import java.util.Optional;
+import java.util.Set;
+
+/**
+ * A representation of the (initial) cluster state, providing information on which nodes are part of the cluster and a way
+ * to get to more detailed info.
+ *
+ * This instance can also be used as a {@link PropertyValueSource} if {@link PropertyKey}'s need to be specified with
+ * a global cluster target.
+ */
+public interface Cluster extends PropertyValueSource {
+  /**
+   * @return current set of live nodes. Never null, never empty (Solr wouldn't call the plugin if empty
+   * since no useful work could then be done).
+   */
+  Set getLiveNodes();
+
+  /**
+   * Returns info about the given collection if one exists. Because it is not expected for plugins to request info about
+   * a large number of collections, requests can only be made one by one.
+   *
+   * This is also the reason we do not return a {@link java.util.Map} or {@link Set} of {@link SolrCollection}'s here: it would be
+   * wasteful to fetch all data and fill such a map when plugin code likely needs info about at most one or two collections.
+   */
+  Optional getCollection(String collectionName) throws IOException;
+
+  /**
+   * Allows getting all {@link SolrCollection} present in the cluster.
+   *
+   * WARNING: this call might be extremely inefficient on large clusters. Usage is discouraged.
+   */
+  Set getAllCollections();

Review comment:
   Ok. Can return names here.
   I still don’t see the use case for this method though. I’d assume that if 
we need to fetch collection names we don’t know, we might want to fetch names 
that match a given pattern. Maybe make this method accept some form of 
filtering? (Something that can be implemented efficiently if we ever want to, 
not an “accept” function that forces iterating over all collection names 
anyway).
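
As a rough illustration of the kind of filter suggested here, the sketch below uses a name prefix, which a sorted index of names can answer as a range lookup instead of a scan over every collection. `CollectionNameIndex` and `namesWithPrefix` are hypothetical names invented for this example; they are not part of the PR or of Solr, and the sketch assumes collection names never contain the character U+FFFF.

```java
import java.util.NavigableSet;
import java.util.SortedSet;
import java.util.TreeSet;

/** Hypothetical sketch: not part of the PR or of Solr. */
final class CollectionNameIndex {
  // Sorted set, so a prefix lookup becomes a range query rather than a full scan.
  private final NavigableSet<String> names = new TreeSet<>();

  void add(String collectionName) {
    names.add(collectionName);
  }

  /**
   * Returns the names starting with the given prefix. The lookup touches
   * only the matching range, unlike an arbitrary "accept" predicate, which
   * would have to be evaluated against every name.
   */
  SortedSet<String> namesWithPrefix(String prefix) {
    if (prefix.isEmpty()) {
      return new TreeSet<>(names);
    }
    // Assumption: names never contain U+FFFF, so prefix + U+FFFF sorts
    // after every name that starts with the prefix.
    String upperBound = prefix + Character.MAX_VALUE;
    return new TreeSet<>(names.subSet(prefix, true, upperBound, false));
  }
}
```

A prefix is only one possible shape for such a filter; the point is that the interface exposes a restricted query that an implementation can serve efficiently later, even if the first implementation is naive.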








[GitHub] [lucene-solr] murblanc commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for placement plugin interface

2020-08-12 Thread GitBox


murblanc commented on a change in pull request #1684:
URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r469449162



##
File path: solr/core/src/java/org/apache/solr/cluster/placement/PlacementPlugin.java
##
@@ -0,0 +1,41 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.cluster.placement;
+
+/**
+ * Implemented by external plugins to control replica placement and movement on the search cluster (as well as other things
+ * such as cluster elasticity?) when cluster changes are required (initiated elsewhere, most likely following a Collection
+ * API call).
+ */
+public interface PlacementPlugin {

Review comment:
   Either a constructor convention or (my preference, though I’m not sure how 
popular it is in the Solr codebase) a factory: what is added to the Solr config 
when configuring a plugin is a plugin factory, and that factory is called to 
get plugin instances. One more interface but cleaner code.
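
A minimal sketch of the factory option, to make the comparison concrete. All names and signatures below (`PlacementPluginSketch`, `PlacementPluginFactorySketch`, `createPluginInstance`, the `minReplicas` key) are made up for illustration and are not taken from the PR.

```java
import java.util.Map;

/** Hypothetical stand-in for the plugin interface; not the PR's PlacementPlugin. */
interface PlacementPluginSketch {
  String describe();
}

/** What would be registered in the Solr config: a factory, not the plugin itself. */
interface PlacementPluginFactorySketch {
  PlacementPluginSketch createPluginInstance(Map<String, String> config);
}

/** Example factory: reads the config once and returns a configured plugin. */
final class ExampleFactory implements PlacementPluginFactorySketch {
  @Override
  public PlacementPluginSketch createPluginInstance(Map<String, String> config) {
    // "minReplicas" is an invented config key used only in this sketch.
    String minReplicas = config.getOrDefault("minReplicas", "1");
    return () -> "plugin(minReplicas=" + minReplicas + ")";
  }
}
```

With this shape the plugin class needs no particular constructor signature; the cost is the extra interface mentioned above.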








[jira] [Commented] (LUCENE-8776) Start offset going backwards has a legitimate purpose

2020-08-12 Thread Roman (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176524#comment-17176524
 ] 

Roman commented on LUCENE-8776:
---

[~simonw] no doubt that the decisions are not always easy, I appreciate the 
attention you are giving the matter. The arguments presented here are meant to 
provide more information and if you decide that the matter at hand has no 
merit, there is no point arguing (and no bad blood). It is actually easier for 
us to fork Lucene – but it *seems* (and the stress is 'seems') wrong for Lucene 
to intentionally limit itself in what it was doing so well, so I'm trying to 
nudge the case.

 

Consider this: I have disabled the checks in DefaultIndexingChain and rerun the 
full suite of tests; these tests started failing:

 

7.7: org.apache.lucene.index.TestPostingsOffsets

8.6: org.apache.lucene.index.TestPostingsOffsets

 

You'll notice that the *only* new tests failing are those that enforce the 
check. Maybe there are no tests written for index integrity?

 

As to the issue at hand: whether the implementation/extension can be made 
'pluggable'. My view is the following: if you constrain what is *already* in 
Lucene, you are forcing people to make forks. We are somewhere in between - it 
would be easy to provide an option to plug in a custom chain. It would cost little to 
give them the option (with BIG NEON WARNINGS pasted all over if necessary).

 

Ok, that's an argument by practicality – not a strong one. But how about the 
"nothing is broken" part? (Yes, the tests that enforce the condition are 
failing – but nothing else is broken.) Use cases are broken: there are 
already two examples of projects that got this complex scenario right (our 
project is one such example) - I asked in the forum, and [~dsmiley] and 
[~gh_at] struggled in their work with the same issue. I don't mean to drag 
them in to weigh in (but I wouldn't mind, obviously ;)), I'm just 
trying to illustrate that the limitations break real-world scenarios.

And the benefits still seem to be in the realm of "future possibilities". 
Sure, that is not to be dismissed lightly, they are important concerns. But if 
we as engineers choose the most efficient over the most optimal, we would 
always be building "houses" without windows (these things lose energy, make 
people fall from heights, require cleaning - are incredibly wasteful!)

The next thing I could test is to run a performance test with a tokenizer chain 
which allows backward postings and one which employs the flatten tokenizer 
and report the results. But I'm going to do that only if the case is really open for 
consideration, otherwise it would be a waste of time.
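
For readers following the thread, the condition under discussion can be reduced to the sketch below. It is a simplified, standalone illustration and not Lucene's actual DefaultIndexingChain code; the class and method names are invented. The start offsets are worked out by hand from the example sentence in the issue description ("Organic light-emitting-diode glows"), with the compound token emitted once at the position of 'light' and once at the position of 'diode'.

```java
/** Simplified, standalone illustration; this is not Lucene's actual code. */
final class OffsetOrderCheck {
  /**
   * Returns the index of the first token whose start offset is lower than
   * the previous token's start offset, or -1 if offsets never go backwards.
   */
  static int firstBackwardOffset(int[] startOffsets) {
    int last = 0;
    for (int i = 0; i < startOffsets.length; i++) {
      if (startOffsets[i] < last) {
        return i; // an indexing chain enforcing the check would reject this token
      }
      last = startOffsets[i];
    }
    return -1;
  }

  // Tokens in emission order: organic, light, light-emitting-diode,
  // emitting, diode, light-emitting-diode (second copy), glows.
  static final int[] ISSUE_EXAMPLE = {0, 8, 8, 14, 23, 8, 29};
}
```

In this stream the second copy of the compound token (index 5) starts at offset 8 after 'diode' started at offset 23, which is exactly the case the check rejects.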

> Start offset going backwards has a legitimate purpose
> -
>
> Key: LUCENE-8776
> URL: https://issues.apache.org/jira/browse/LUCENE-8776
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/search
>Affects Versions: 7.6
>Reporter: Ram Venkat
>Priority: Major
>
> Here is the use case where startOffset can go backwards:
> Say there is a line "Organic light-emitting-diode glows", and I want to run 
> span queries and highlight them properly. 
> During index time, light-emitting-diode is split into three words, which 
> allows me to search for 'light', 'emitting' and 'diode' individually. The 
> three words occupy adjacent positions in the index, as 'light' adjacent to 
> 'emitting' and 'light' at a distance of two words from 'diode' need to match 
> this word. So, the order of words after splitting are: Organic, light, 
> emitting, diode, glows. 
> But, I also want to search for 'organic' being adjacent to 
> 'light-emitting-diode' or 'light-emitting-diode' being adjacent to 'glows'. 
> The way I solved this was to also generate 'light-emitting-diode' at two 
> positions: (a) In the same position as 'light' and (b) in the same position 
> as 'glows', like below:
> ||organic||light||emitting||diode||glows||
> | |light-emitting-diode| |light-emitting-diode| |
> |0|1|2|3|4|
> The positions of the two 'light-emitting-diode' are 1 and 3, but the offsets 
> are obviously the same. This works beautifully in Lucene 5.x in both 
> searching and highlighting with span queries. 
> But when I try this in Lucene 7.6, it hits the condition "Offsets must not go 
> backwards" at DefaultIndexingChain:818. This IllegalArgumentException is 
> being thrown without any comments on why this check is needed. As I explained 
> above, startOffset going backwards is perfectly valid, to deal with word 
> splitting and span operations on these specialized use cases. On the other 
> hand, it is not clear what value is added by this check and which highlighter 
> code is affected by offsets going backwards. This same check is done at 
> BaseTokenStreamTestCase:245. 
> I 

[GitHub] [lucene-solr] murblanc commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for placement plugin interface

2020-08-12 Thread GitBox


murblanc commented on a change in pull request #1684:
URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r469447213



##
File path: solr/core/src/java/org/apache/solr/cluster/placement/PlacementPlugin.java
##
@@ -0,0 +1,41 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.cluster.placement;
+
+/**
+ * Implemented by external plugins to control replica placement and movement on the search cluster (as well as other things
+ * such as cluster elasticity?) when cluster changes are required (initiated elsewhere, most likely following a Collection
+ * API call).
+ */
+public interface PlacementPlugin {

Review comment:
   Perfectly fine with me. The contract will be for the plugin to provide a 
constructor accepting config to create a new Plugin instance, then that 
instance will be called for each placement computation (and if the plugin 
doesn’t care about config, no arg constructor would be fine). Multiple plugin 
instances might be in use at the same time if so called by Solr (config changes 
or other reasons).
   This means the plugin instance must be reentrant and its member variables 
can’t directly be used as if specific to a single computation. Not a big deal, 
plugin implementor can delegate internally to an instance of another class if 
they want that option.
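
The delegation idea in the last sentence could look roughly like the sketch below: the shared plugin instance holds only immutable configuration, and each call creates a private object for its working state. The names (`ReentrantPluginSketch`, `computePlacement`, `Computation`) and the toy placement logic are assumptions made for this example, not code from the PR.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: names and logic are illustrative, not from the PR. */
final class ReentrantPluginSketch {
  // Immutable after construction, so safe to share between concurrent calls.
  private final int replicasPerNode;

  ReentrantPluginSketch(int replicasPerNode) {
    this.replicasPerNode = replicasPerNode;
  }

  /** May be invoked concurrently; keeps no per-call state in its own fields. */
  List<String> computePlacement(List<String> liveNodes) {
    return new Computation(replicasPerNode, liveNodes).run();
  }

  /** One instance per computation, so its fields can hold working state. */
  private static final class Computation {
    private final int replicasPerNode;
    private final List<String> liveNodes;
    private final List<String> assignments = new ArrayList<>();

    Computation(int replicasPerNode, List<String> liveNodes) {
      this.replicasPerNode = replicasPerNode;
      this.liveNodes = liveNodes;
    }

    List<String> run() {
      for (String node : liveNodes) {
        for (int i = 0; i < replicasPerNode; i++) {
          assignments.add(node + "#" + i);
        }
      }
      return assignments;
    }
  }
}
```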








[jira] [Deleted] (SOLR-14745) CLONE - Hannah Zacharski

2020-08-12 Thread Ishan Chattopadhyaya (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya deleted SOLR-14745:



> CLONE - Hannah Zacharski
> 
>
> Key: SOLR-14745
> URL: https://issues.apache.org/jira/browse/SOLR-14745
> Project: Solr
>  Issue Type: Wish
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hannah Slaughter Zacharski
>Priority: Major
>  Labels: LeaderElector
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)




[jira] [Deleted] (SOLR-14747) CLONE - Michael Zacharski

2020-08-12 Thread Ishan Chattopadhyaya (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya deleted SOLR-14747:



> CLONE - Michael Zacharski
> -
>
> Key: SOLR-14747
> URL: https://issues.apache.org/jira/browse/SOLR-14747
> Project: Solr
>  Issue Type: Wish
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hannah Slaughter Zacharski
>Priority: Major
>
> Hannah Slaughter Zacharski






[jira] [Deleted] (SOLR-14746) Michael Zacharski

2020-08-12 Thread Ishan Chattopadhyaya (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya deleted SOLR-14746:



> Michael Zacharski
> -
>
> Key: SOLR-14746
> URL: https://issues.apache.org/jira/browse/SOLR-14746
> Project: Solr
>  Issue Type: Wish
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hannah Slaughter Zacharski
>Priority: Major
>
> Hannah Slaughter Zacharski






[jira] [Deleted] (SOLR-14744) Hannah Zacharski

2020-08-12 Thread Ishan Chattopadhyaya (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya deleted SOLR-14744:



> Hannah Zacharski
> 
>
> Key: SOLR-14744
> URL: https://issues.apache.org/jira/browse/SOLR-14744
> Project: Solr
>  Issue Type: Wish
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hannah Slaughter Zacharski
>Priority: Major
>  Labels: LeaderElector
>







[jira] [Commented] (SOLR-10317) Solr Nightly Benchmarks

2020-08-12 Thread Ishan Chattopadhyaya (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-10317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176507#comment-17176507
 ] 

Ishan Chattopadhyaya commented on SOLR-10317:
-

bq. Ishan Chattopadhyaya, David Smiley: I'm using the Icecat dataset 
(https://github.com/querqy/chorus#sample-data-details) for a project. Today we 
have lots of attributed products (over 8000 attributes across the roughly 100K 
products). We're working on adding actual price data to this dataset, which it 
currently doesn't have, and then I'm mulling over some ideas to generate 
reasonable queries (and judgement lists) to use with this dataset at scale.

[~epugh], I'll give it a try. Thanks for the tip!

> Solr Nightly Benchmarks
> ---
>
> Key: SOLR-10317
> URL: https://issues.apache.org/jira/browse/SOLR-10317
> Project: Solr
>  Issue Type: Task
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
>  Labels: gsoc2017, mentor
> Attachments: 
> Narang-Vivek-SOLR-10317-Solr-Nightly-Benchmarks-FINAL-PROPOSAL.pdf, 
> Narang-Vivek-SOLR-10317-Solr-Nightly-Benchmarks.docx, SOLR-10317.patch, 
> SOLR-10317.patch, Screenshot from 2017-07-30 20-30-05.png, 
> changes-lucene-20160907.json, changes-solr-20160907.json, managed-schema, 
> solrconfig.xml
>
>
> The benchmarking suite is now here: 
> [https://github.com/thesearchstack/solr-bench]
> Actual datasets and queries are TBD yet.
>  
> --- Original description ---
>  Solr needs nightly benchmarks reporting. Similar Lucene benchmarks can be 
> found here, [https://home.apache.org/~mikemccand/lucenebench/].
>  
>  Preferably, we need:
>  # A suite of benchmarks that build Solr from a commit point, start Solr 
> nodes, both in SolrCloud and standalone mode, and record timing information 
> of various operations like indexing, querying, faceting, grouping, 
> replication etc.
>  # It should be possible to run them either as an independent suite or as a 
> Jenkins job, and we should be able to report timings as graphs (Jenkins has 
> some charting plugins).
>  # The code should eventually be integrated in the Solr codebase, so that it 
> never goes out of date.
>  
>  There is some prior work / discussion:
>  # [https://github.com/shalinmangar/solr-perf-tools] (Shalin)
>  # [https://github.com/chatman/solr-upgrade-tests/blob/master/BENCHMARKS.md] 
> (Ishan/Vivek)
>  # SOLR-2646 & SOLR-9863 (Mark Miller)
>  # [https://home.apache.org/~mikemccand/lucenebench/] (Mike McCandless)
>  # [https://github.com/lucidworks/solr-scale-tk] (Tim Potter)
>  
>  There is support for building, starting, indexing/querying and stopping Solr 
> in some of these frameworks above. However, the benchmarks run are very 
> limited. Any of these can be a starting point, or a new framework can as well 
> be used. The motivation is to be able to cover every functionality of Solr 
> with a corresponding benchmark that is run every night.
>  
>  Proposing this as a GSoC 2017 project. I'm willing to mentor, and I'm sure 
> [~shalinmangar] and [~[markrmil...@gmail.com|mailto:markrmil...@gmail.com]] 
> would help here.






[jira] [Commented] (SOLR-10317) Solr Nightly Benchmarks

2020-08-12 Thread Ishan Chattopadhyaya (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-10317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176506#comment-17176506
 ] 

Ishan Chattopadhyaya commented on SOLR-10317:
-

Thanks for the feedback, [~mdrob]. I'll address all your suggestions for 
downloads, especially the one about verification.
As for point 7, chmod errors are unexpected. I'll take a look.
As for point 8, the final results are worth comparing against the same test run 
on another build or another commit point.
As for point 9, indexing threads correspond to how many client threads are used 
in indexing the documents; query threads are essentially a measure of 
concurrency while making requests. Increasing query threads helps gauge the 
performance under high concurrent load.
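
To illustrate what "query threads" means as a concurrency knob, here is a minimal sketch using a fixed-size thread pool. It is not how solr-bench is implemented (that is not shown in this thread); `QueryLoadSketch` and its parameters are hypothetical, and the query itself is passed in as an opaque task.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch; not taken from the benchmark suite. */
final class QueryLoadSketch {
  /**
   * Submits totalQueries tasks to a pool of queryThreads workers and returns
   * how many reported success. At most queryThreads queries are in flight at
   * any time, which is what makes the setting a measure of concurrency.
   */
  static int run(int queryThreads, int totalQueries, Callable<Boolean> query) {
    ExecutorService pool = Executors.newFixedThreadPool(queryThreads);
    try {
      List<Future<Boolean>> pending = new ArrayList<>();
      for (int i = 0; i < totalQueries; i++) {
        pending.add(pool.submit(query));
      }
      int succeeded = 0;
      for (Future<Boolean> result : pending) {
        try {
          if (result.get()) {
            succeeded++;
          }
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt(); // preserve the interrupt status
        } catch (Exception e) {
          // A failed query simply does not count as a success in this sketch.
        }
      }
      return succeeded;
    } finally {
      pool.shutdown();
    }
  }
}
```

Raising `queryThreads` while keeping `totalQueries` fixed changes how many requests hit the server at once, not how much work is done in total.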







[jira] [Updated] (SOLR-14747) CLONE - Michael Zacharski

2020-08-12 Thread Hannah Slaughter Zacharski (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hannah Slaughter Zacharski updated SOLR-14747:
--
Description: Hannah Slaughter Zacharski

> CLONE - Michael Zacharski
> -
>
> Key: SOLR-14747
> URL: https://issues.apache.org/jira/browse/SOLR-14747
> Project: Solr
>  Issue Type: Wish
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hannah Slaughter Zacharski
>Priority: Major
> Attachments: 992CAFDB-165C-45BC-8F43-67450A77AE3E.jpeg
>
>
> Hannah Slaughter Zacharski






[jira] [Updated] (SOLR-14746) Michael Zacharski

2020-08-12 Thread Hannah Slaughter Zacharski (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hannah Slaughter Zacharski updated SOLR-14746:
--
Description: Hannah Slaughter Zacharski

> Michael Zacharski
> -
>
> Key: SOLR-14746
> URL: https://issues.apache.org/jira/browse/SOLR-14746
> Project: Solr
>  Issue Type: Wish
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hannah Slaughter Zacharski
>Priority: Major
> Attachments: 5677B1CD-335A-4883-A97F-A1D8EE26CC17.jpeg
>
>
> Hannah Slaughter Zacharski






[jira] [Updated] (SOLR-14747) CLONE - Michael Zacharski

2020-08-12 Thread Hannah Slaughter Zacharski (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hannah Slaughter Zacharski updated SOLR-14747:
--
Attachment: 992CAFDB-165C-45BC-8F43-67450A77AE3E.jpeg

> CLONE - Michael Zacharski
> -
>
> Key: SOLR-14747
> URL: https://issues.apache.org/jira/browse/SOLR-14747
> Project: Solr
>  Issue Type: Wish
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hannah Slaughter Zacharski
>Priority: Major
> Attachments: 992CAFDB-165C-45BC-8F43-67450A77AE3E.jpeg
>
>







[jira] [Updated] (SOLR-14746) Michael Zacharski

2020-08-12 Thread Hannah Slaughter Zacharski (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hannah Slaughter Zacharski updated SOLR-14746:
--
Attachment: 5677B1CD-335A-4883-A97F-A1D8EE26CC17.jpeg

> Michael Zacharski
> -
>
> Key: SOLR-14746
> URL: https://issues.apache.org/jira/browse/SOLR-14746
> Project: Solr
>  Issue Type: Wish
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hannah Slaughter Zacharski
>Priority: Major
> Attachments: 5677B1CD-335A-4883-A97F-A1D8EE26CC17.jpeg
>
>







[jira] [Created] (SOLR-14747) CLONE - Michael Zacharski

2020-08-12 Thread Hannah Slaughter Zacharski (Jira)
Hannah Slaughter Zacharski created SOLR-14747:
-

 Summary: CLONE - Michael Zacharski
 Key: SOLR-14747
 URL: https://issues.apache.org/jira/browse/SOLR-14747
 Project: Solr
  Issue Type: Wish
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Hannah Slaughter Zacharski









[jira] [Created] (SOLR-14746) Michael Zacharski

2020-08-12 Thread Hannah Slaughter Zacharski (Jira)
Hannah Slaughter Zacharski created SOLR-14746:
-

 Summary: Michael Zacharski
 Key: SOLR-14746
 URL: https://issues.apache.org/jira/browse/SOLR-14746
 Project: Solr
  Issue Type: Wish
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Hannah Slaughter Zacharski









[jira] [Commented] (SOLR-10317) Solr Nightly Benchmarks

2020-08-12 Thread Mike Drob (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-10317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176504#comment-17176504
 ] 

Mike Drob commented on SOLR-10317:
--

Ishan,

I tried to use the benchmarking framework you have put up to test 8.6.1 RC and 
ran into a lot of difficulty. It was not a great experience for me. I tried to 
go with the {{config-prebuilt.json}}
 # Please provide {{mvnw}} or {{gradlew}} for ease of getting started
 # (minor) Please address your maven build warnings
 # When it's prebuilt, I don't want to download a JDK. I deleted this from my 
local copy.
 # Don't download zookeeper directly from {{archive.apache.org}}, please use a 
mirror.
 # Start script downloads zookeeper 3.5.6, but then the first thing the java 
program does is to download zookeeper 3.4.14
 # If you're going to download things, then verify checksums and signatures. I 
generally don't trust anything that I download from the internet, and I do not 
appreciate how this repository is very lax about trust.
 # I get a flood of chmod errors when I try to start.
 # I get some "final results" as JSON but it is not at all clear to me how to 
compare these with other results.
 # In the config, I see a lot of options for tweaking threads, but it is not 
clear what these values correspond to, or what I will actually be testing if I 
change them.
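
Point 6 (verifying downloads) could be addressed along the lines of the sketch below, which checks a downloaded artifact against a published SHA-512 checksum using only the JDK. `ChecksumVerifier` is a hypothetical helper, not code from the benchmark repository, and it covers checksums only; signature (PGP) verification would need additional tooling that is not shown here.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;

/** Hypothetical helper for illustration; not code from the benchmark repository. */
final class ChecksumVerifier {
  /** Returns the lowercase hex SHA-512 digest of the given bytes. */
  static String sha512Hex(byte[] data) {
    MessageDigest digest;
    try {
      digest = MessageDigest.getInstance("SHA-512");
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("SHA-512 is not available in this JVM", e);
    }
    StringBuilder hex = new StringBuilder();
    for (byte b : digest.digest(data)) {
      hex.append(String.format("%02x", b));
    }
    return hex.toString();
  }

  /** True if the data hashes to the published checksum (case and surrounding whitespace ignored). */
  static boolean matches(byte[] data, String publishedHex) {
    byte[] actual = sha512Hex(data).getBytes(StandardCharsets.US_ASCII);
    byte[] expected =
        publishedHex.trim().toLowerCase(Locale.ROOT).getBytes(StandardCharsets.US_ASCII);
    // Constant-time comparison of the two hex strings.
    return MessageDigest.isEqual(actual, expected);
  }
}
```

A download step would read the artifact bytes, call `matches` with the checksum published alongside the release, and abort if it returns false.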







[jira] [Updated] (SOLR-14745) CLONE - Hannah Zacharski

2020-08-12 Thread Hannah Slaughter Zacharski (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hannah Slaughter Zacharski updated SOLR-14745:
--
Attachment: 08C5783A-84BE-4AFC-BC36-F71D844CCB91.jpeg
Status: Open  (was: Open)

> CLONE - Hannah Zacharski
> 
>
> Key: SOLR-14745
> URL: https://issues.apache.org/jira/browse/SOLR-14745
> Project: Solr
>  Issue Type: Wish
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Admin UI, Authentication, Authorization, Backup/Restore, 
> Build, config-api, contrib - Clustering, contrib - DataImportHandler, contrib 
> - LangId, contrib - LTR, contrib - morphlines-cell, Data-driven Schema, 
> documentation, Facet Module, Package Manager, Parallel SQL, replication 
> (scripts), Schema and Analysis, scripts and tools, Server, SolrCLI, 
> UpdateRequestProcessors
>Affects Versions: 8.5.2
>Reporter: Hannah Slaughter Zacharski
>Priority: Major
>  Labels: LeaderElector
> Fix For: master (9.0)
>
> Attachments: 08292BE4-FC7D-46DE-9918-05DBDDA49EC5.tiff, 
> 08C5783A-84BE-4AFC-BC36-F71D844CCB91.jpeg, 
> 272C4C18-0404-43FA-8FBF-1C60A67E6CAB.jpeg, 
> BF53BDD5-3F2A-4CA4-8EC0-5B92A8582E9B.jpeg, 
> D2A3C4E2-516F-48D4-A1DE-2F58D9DA7220.jpeg
>
>







[jira] [Updated] (SOLR-14745) CLONE - Hannah Zacharski

2020-08-12 Thread Hannah Slaughter Zacharski (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hannah Slaughter Zacharski updated SOLR-14745:
--
Attachment: 08292BE4-FC7D-46DE-9918-05DBDDA49EC5.tiff
Status: Open  (was: Open)

> CLONE - Hannah Zacharski
> 
>
> Key: SOLR-14745
> URL: https://issues.apache.org/jira/browse/SOLR-14745
> Project: Solr
>  Issue Type: Wish
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Admin UI, Authentication, Authorization, Backup/Restore, 
> Build, config-api, contrib - Clustering, contrib - DataImportHandler, contrib 
> - LangId, contrib - LTR, contrib - morphlines-cell, Data-driven Schema, 
> documentation, Facet Module, Package Manager, Parallel SQL, replication 
> (scripts), Schema and Analysis, scripts and tools, Server, SolrCLI, 
> UpdateRequestProcessors
>Affects Versions: 8.5.2
>Reporter: Hannah Slaughter Zacharski
>Priority: Major
>  Labels: LeaderElector
> Fix For: master (9.0)
>
> Attachments: 08292BE4-FC7D-46DE-9918-05DBDDA49EC5.tiff, 
> 08C5783A-84BE-4AFC-BC36-F71D844CCB91.jpeg, 
> 272C4C18-0404-43FA-8FBF-1C60A67E6CAB.jpeg, 
> BF53BDD5-3F2A-4CA4-8EC0-5B92A8582E9B.jpeg, 
> D2A3C4E2-516F-48D4-A1DE-2F58D9DA7220.jpeg
>
>







[jira] [Updated] (SOLR-14745) CLONE - Hannah Zacharski

2020-08-12 Thread Hannah Slaughter Zacharski (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hannah Slaughter Zacharski updated SOLR-14745:
--
   Attachment: 272C4C18-0404-43FA-8FBF-1C60A67E6CAB.jpeg
Fix Version/s: master (9.0)
Affects Version/s: 8.5.2
   Status: Open  (was: Open)

> CLONE - Hannah Zacharski
> 
>
> Key: SOLR-14745
> URL: https://issues.apache.org/jira/browse/SOLR-14745
> Project: Solr
>  Issue Type: Wish
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Admin UI, Authentication, Authorization, Backup/Restore, 
> Build, config-api, contrib - Clustering, contrib - DataImportHandler, contrib 
> - LangId, contrib - LTR, contrib - morphlines-cell, Data-driven Schema, 
> documentation, Facet Module, Package Manager, Parallel SQL, replication 
> (scripts), Schema and Analysis, scripts and tools, Server, SolrCLI, 
> UpdateRequestProcessors
>Affects Versions: 8.5.2
>Reporter: Hannah Slaughter Zacharski
>Priority: Major
>  Labels: LeaderElector
> Fix For: master (9.0)
>
> Attachments: 272C4C18-0404-43FA-8FBF-1C60A67E6CAB.jpeg, 
> BF53BDD5-3F2A-4CA4-8EC0-5B92A8582E9B.jpeg, 
> D2A3C4E2-516F-48D4-A1DE-2F58D9DA7220.jpeg
>
>







[jira] [Updated] (SOLR-14745) CLONE - Hannah Zacharski

2020-08-12 Thread Hannah Slaughter Zacharski (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hannah Slaughter Zacharski updated SOLR-14745:
--
Attachment: D2A3C4E2-516F-48D4-A1DE-2F58D9DA7220.jpeg
Status: Open  (was: Open)

> CLONE - Hannah Zacharski
> 
>
> Key: SOLR-14745
> URL: https://issues.apache.org/jira/browse/SOLR-14745
> Project: Solr
>  Issue Type: Wish
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Admin UI, Authentication, Authorization, Backup/Restore, 
> Build, config-api, contrib - Clustering, contrib - DataImportHandler, contrib 
> - LangId, contrib - LTR, contrib - morphlines-cell, Data-driven Schema, 
> documentation, Facet Module, Package Manager, Parallel SQL, replication 
> (scripts), Schema and Analysis, scripts and tools, Server, SolrCLI, 
> UpdateRequestProcessors
>Affects Versions: 8.5.2
>Reporter: Hannah Slaughter Zacharski
>Priority: Major
>  Labels: LeaderElector
> Fix For: master (9.0)
>
> Attachments: 272C4C18-0404-43FA-8FBF-1C60A67E6CAB.jpeg, 
> BF53BDD5-3F2A-4CA4-8EC0-5B92A8582E9B.jpeg, 
> D2A3C4E2-516F-48D4-A1DE-2F58D9DA7220.jpeg
>
>







[jira] [Updated] (SOLR-14744) Hannah Zacharski

2020-08-12 Thread Hannah Slaughter Zacharski (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hannah Slaughter Zacharski updated SOLR-14744:
--
Labels: LeaderElector  (was: LeaderElector l)

> Hannah Zacharski
> 
>
> Key: SOLR-14744
> URL: https://issues.apache.org/jira/browse/SOLR-14744
> Project: Solr
>  Issue Type: Wish
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Admin UI, Authentication, Authorization, Backup/Restore, 
> Build, config-api, contrib - Clustering, contrib - DataImportHandler, contrib 
> - LangId, contrib - LTR, contrib - morphlines-cell, Data-driven Schema, 
> documentation, Facet Module, Package Manager, Parallel SQL, replication 
> (scripts), Schema and Analysis, scripts and tools, Server, SolrCLI, 
> UpdateRequestProcessors
>Reporter: Hannah Slaughter Zacharski
>Priority: Major
>  Labels: LeaderElector
> Attachments: BF53BDD5-3F2A-4CA4-8EC0-5B92A8582E9B.jpeg
>
>







[jira] [Updated] (SOLR-14744) Hannah Zacharski

2020-08-12 Thread Hannah Slaughter Zacharski (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hannah Slaughter Zacharski updated SOLR-14744:
--
Labels: LeaderElector l  (was: )

> Hannah Zacharski
> 
>
> Key: SOLR-14744
> URL: https://issues.apache.org/jira/browse/SOLR-14744
> Project: Solr
>  Issue Type: Wish
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Admin UI, Authentication, Authorization, Backup/Restore, 
> Build, config-api, contrib - Clustering, contrib - DataImportHandler, contrib 
> - LangId, contrib - LTR, contrib - morphlines-cell, Data-driven Schema, 
> documentation, Facet Module, Package Manager, Parallel SQL, replication 
> (scripts), Schema and Analysis, scripts and tools, Server, SolrCLI, 
> UpdateRequestProcessors
>Reporter: Hannah Slaughter Zacharski
>Priority: Major
>  Labels: LeaderElector, l
> Attachments: BF53BDD5-3F2A-4CA4-8EC0-5B92A8582E9B.jpeg
>
>







[jira] [Created] (SOLR-14745) CLONE - Hannah Zacharski

2020-08-12 Thread Hannah Slaughter Zacharski (Jira)
Hannah Slaughter Zacharski created SOLR-14745:
-

 Summary: CLONE - Hannah Zacharski
 Key: SOLR-14745
 URL: https://issues.apache.org/jira/browse/SOLR-14745
 Project: Solr
  Issue Type: Wish
  Security Level: Public (Default Security Level. Issues are Public)
  Components: Admin UI, Authentication, Authorization, Backup/Restore, 
Build, config-api, contrib - Clustering, contrib - DataImportHandler, contrib - 
LangId, contrib - LTR, contrib - morphlines-cell, Data-driven Schema, 
documentation, Facet Module, Package Manager, Parallel SQL, replication 
(scripts), Schema and Analysis, scripts and tools, Server, SolrCLI, 
UpdateRequestProcessors
Reporter: Hannah Slaughter Zacharski
 Attachments: BF53BDD5-3F2A-4CA4-8EC0-5B92A8582E9B.jpeg








[jira] [Updated] (SOLR-14744) Hannah Zacharski

2020-08-12 Thread Hannah Slaughter Zacharski (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hannah Slaughter Zacharski updated SOLR-14744:
--
Attachment: BF53BDD5-3F2A-4CA4-8EC0-5B92A8582E9B.jpeg

> Hannah Zacharski
> 
>
> Key: SOLR-14744
> URL: https://issues.apache.org/jira/browse/SOLR-14744
> Project: Solr
>  Issue Type: Wish
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Admin UI, Authentication, Authorization, Backup/Restore, 
> Build, config-api, contrib - Clustering, contrib - DataImportHandler, contrib 
> - LangId, contrib - LTR, contrib - morphlines-cell, Data-driven Schema, 
> documentation, Facet Module, Package Manager, Parallel SQL, replication 
> (scripts), Schema and Analysis, scripts and tools, Server, SolrCLI, 
> UpdateRequestProcessors
>Reporter: Hannah Slaughter Zacharski
>Priority: Major
> Attachments: BF53BDD5-3F2A-4CA4-8EC0-5B92A8582E9B.jpeg
>
>







[jira] [Created] (SOLR-14744) Hannah Zacharski

2020-08-12 Thread Hannah Slaughter Zacharski (Jira)
Hannah Slaughter Zacharski created SOLR-14744:
-

 Summary: Hannah Zacharski
 Key: SOLR-14744
 URL: https://issues.apache.org/jira/browse/SOLR-14744
 Project: Solr
  Issue Type: Wish
  Security Level: Public (Default Security Level. Issues are Public)
  Components: Admin UI, Authentication, Authorization, Backup/Restore, 
Build, config-api, contrib - Clustering, contrib - DataImportHandler, contrib - 
LangId, contrib - LTR, contrib - morphlines-cell, Data-driven Schema, 
documentation, Facet Module, Package Manager, Parallel SQL, replication 
(scripts), Schema and Analysis, scripts and tools, Server, SolrCLI, 
UpdateRequestProcessors
Reporter: Hannah Slaughter Zacharski









[jira] [Deleted] (SOLR-14742) Katarzyna Zacharski

2020-08-12 Thread Ishan Chattopadhyaya (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya deleted SOLR-14742:



> Katarzyna Zacharski
> ---
>
> Key: SOLR-14742
> URL: https://issues.apache.org/jira/browse/SOLR-14742
> Project: Solr
>  Issue Type: Wish
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hannah Slaughter Zacharski
>Priority: Major
>







[jira] [Deleted] (SOLR-14741) CLONE - Boone Zacharski

2020-08-12 Thread Ishan Chattopadhyaya (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya deleted SOLR-14741:



> CLONE - Boone Zacharski
> ---
>
> Key: SOLR-14741
> URL: https://issues.apache.org/jira/browse/SOLR-14741
> Project: Solr
>  Issue Type: Wish
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hannah Slaughter Zacharski
>Priority: Major
>







[jira] [Deleted] (SOLR-14743) CLONE - Katarzyna Zacharski

2020-08-12 Thread Ishan Chattopadhyaya (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya deleted SOLR-14743:



> CLONE - Katarzyna Zacharski
> ---
>
> Key: SOLR-14743
> URL: https://issues.apache.org/jira/browse/SOLR-14743
> Project: Solr
>  Issue Type: Wish
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hannah Slaughter Zacharski
>Priority: Major
>







[jira] [Deleted] (SOLR-14740) Boone Zacharski

2020-08-12 Thread Ishan Chattopadhyaya (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya deleted SOLR-14740:



> Boone Zacharski
> ---
>
> Key: SOLR-14740
> URL: https://issues.apache.org/jira/browse/SOLR-14740
> Project: Solr
>  Issue Type: Wish
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hannah Slaughter Zacharski
>Priority: Major
>







[jira] [Resolved] (SOLR-14151) Make schema components load from packages

2020-08-12 Thread Ishan Chattopadhyaya (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya resolved SOLR-14151.
-
Fix Version/s: 8.7
   Resolution: Fixed

> Make schema components load from packages
> -
>
> Key: SOLR-14151
> URL: https://issues.apache.org/jira/browse/SOLR-14151
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>  Labels: packagemanager
> Fix For: 8.7
>
>  Time Spent: 10h 10m
>  Remaining Estimate: 0h
>
> Example:
> {code:xml}
>  
> 
>   
>generateNumberParts="0" catenateWords="0"
>   catenateNumbers="0" catenateAll="0"/>
>   
>   
> 
>   
> {code}
> * When a package is updated, the entire {{IndexSchema}} object is refreshed, 
> but the {{SolrCore}} object is not reloaded
> * Any component can be prefixed with the package name
> * The semantics of loading plugins remain the same as for the components 
> in {{solrconfig.xml}}
> * Plugins can be registered using the Schema API
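> A minimal sketch of what a package-prefixed schema component might look like, 
> assuming a hypothetical package named {{mypkg}} that provides a hypothetical 
> class {{com.example.MyTokenizerFactory}} (neither name comes from this 
> issue; the {{package:class}} form follows the convention already used for 
> plugins in {{solrconfig.xml}}):
> {code:xml}
> <!-- Hypothetical field type: "mypkg" and the class name are placeholders -->
> <fieldType name="text_mypkg" class="solr.TextField">
>   <analyzer>
>     <!-- Loaded from the package "mypkg" rather than the core classpath -->
>     <tokenizer class="mypkg:com.example.MyTokenizerFactory"/>
>     <!-- Unprefixed components still resolve the usual way -->
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
> </fieldType>
> {code}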






[jira] [Comment Edited] (SOLR-14597) Advanced Query Parser

2020-08-12 Thread Ishan Chattopadhyaya (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176494#comment-17176494
 ] 

Ishan Chattopadhyaya edited comment on SOLR-14597 at 8/12/20, 5:14 PM:
---

I think we should build a separate package for these, rather than loading them 
by default. The support for this is now in place: SOLR-14151.


was (Author: ichattopadhyaya):
I think we should build a separate package for these, rather than loading them 
by default.

> Advanced Query Parser
> -
>
> Key: SOLR-14597
> URL: https://issues.apache.org/jira/browse/SOLR-14597
> Project: Solr
>  Issue Type: New Feature
>  Components: query parsers
>Affects Versions: 8.6
>Reporter: Mike Nibeck
>Assignee: Gus Heck
>Priority: Major
>
> This JIRA ticket tracks the progress of SIP-9, the Advanced Query Parser that 
> is being donated by the Library of Congress. A full description of the 
> feature can be found on the SIP page.
> [https://cwiki.apache.org/confluence/display/SOLR/SIP-9+Advanced+Query+Parser]
> Briefly, this parser provides a comprehensive syntax for users who use 
> search on a daily basis. It also reserves a smaller set of punctuators than 
> other parsers. This facilitates easier handling of acronyms and punctuated 
> patterns with meaning (such as C++ or 401(k)). The new syntax opens up some 
> advanced features while also preventing access to arbitrary features via 
> local parameters. This parser will be safe for accepting user queries 
> directly with minimal pre-parsing, but for use cases beyond its established 
> features, alternate query paths (using other parsers) will need to be supplied.
> The code drop is being prepared and will be supplied as soon as we receive 
> guidance from the PMC regarding the proper process. Given that the Library 
> already has a signed CCLA we need to understand which of these (or other 
> processes) apply:
> [http://incubator.apache.org/ip-clearance/ip-clearance-template.html]
> and 
> [https://www.apache.org/licenses/contributor-agreements.html#grants]






[jira] [Created] (SOLR-14743) CLONE - Katarzyna Zacharski

2020-08-12 Thread Hannah Slaughter Zacharski (Jira)
Hannah Slaughter Zacharski created SOLR-14743:
-

 Summary: CLONE - Katarzyna Zacharski
 Key: SOLR-14743
 URL: https://issues.apache.org/jira/browse/SOLR-14743
 Project: Solr
  Issue Type: Wish
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Hannah Slaughter Zacharski
 Attachments: CF7F5D9A-731E-41D0-A807-AE3AC1A23CE9.jpeg








[jira] [Created] (SOLR-14742) Katarzyna Zacharski

2020-08-12 Thread Hannah Slaughter Zacharski (Jira)
Hannah Slaughter Zacharski created SOLR-14742:
-

 Summary: Katarzyna Zacharski
 Key: SOLR-14742
 URL: https://issues.apache.org/jira/browse/SOLR-14742
 Project: Solr
  Issue Type: Wish
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Hannah Slaughter Zacharski
 Attachments: CF7F5D9A-731E-41D0-A807-AE3AC1A23CE9.jpeg








[jira] [Commented] (SOLR-14597) Advanced Query Parser

2020-08-12 Thread Ishan Chattopadhyaya (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176494#comment-17176494
 ] 

Ishan Chattopadhyaya commented on SOLR-14597:
-

I think we should build a separate package for these, rather than loading them 
by default.

> Advanced Query Parser
> -
>
> Key: SOLR-14597
> URL: https://issues.apache.org/jira/browse/SOLR-14597
> Project: Solr
>  Issue Type: New Feature
>  Components: query parsers
>Affects Versions: 8.6
>Reporter: Mike Nibeck
>Assignee: Gus Heck
>Priority: Major
>
> This JIRA ticket tracks the progress of SIP-9, the Advanced Query Parser that 
> is being donated by the Library of Congress. A full description of the 
> feature can be found on the SIP page.
> [https://cwiki.apache.org/confluence/display/SOLR/SIP-9+Advanced+Query+Parser]
> Briefly, this parser provides a comprehensive syntax for users who use 
> search on a daily basis. It also reserves a smaller set of punctuators than 
> other parsers. This facilitates easier handling of acronyms and punctuated 
> patterns with meaning (such as C++ or 401(k)). The new syntax opens up some 
> advanced features while also preventing access to arbitrary features via 
> local parameters. This parser will be safe for accepting user queries 
> directly with minimal pre-parsing, but for use cases beyond its established 
> features, alternate query paths (using other parsers) will need to be supplied.
> The code drop is being prepared and will be supplied as soon as we receive 
> guidance from the PMC regarding the proper process. Given that the Library 
> already has a signed CCLA we need to understand which of these (or other 
> processes) apply:
> [http://incubator.apache.org/ip-clearance/ip-clearance-template.html]
> and 
> [https://www.apache.org/licenses/contributor-agreements.html#grants]






[jira] [Created] (SOLR-14741) CLONE - Boone Zacharski

2020-08-12 Thread Hannah Slaughter Zacharski (Jira)
Hannah Slaughter Zacharski created SOLR-14741:
-

 Summary: CLONE - Boone Zacharski
 Key: SOLR-14741
 URL: https://issues.apache.org/jira/browse/SOLR-14741
 Project: Solr
  Issue Type: Wish
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Hannah Slaughter Zacharski
 Attachments: FEA2CE24-FF3B-40FB-80D5-0AAF749C0526.tiff








[jira] [Created] (SOLR-14740) Boone Zacharski

2020-08-12 Thread Hannah Slaughter Zacharski (Jira)
Hannah Slaughter Zacharski created SOLR-14740:
-

 Summary: Boone Zacharski
 Key: SOLR-14740
 URL: https://issues.apache.org/jira/browse/SOLR-14740
 Project: Solr
  Issue Type: Wish
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Hannah Slaughter Zacharski
 Attachments: FEA2CE24-FF3B-40FB-80D5-0AAF749C0526.tiff








[jira] [Deleted] (SOLR-14739) CLONE - Lena Slaughter

2020-08-12 Thread Ishan Chattopadhyaya (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya deleted SOLR-14739:



> CLONE - Lena Slaughter
> --
>
> Key: SOLR-14739
> URL: https://issues.apache.org/jira/browse/SOLR-14739
> Project: Solr
>  Issue Type: Wish
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hannah Slaughter Zacharski
>Priority: Major
>







[jira] [Deleted] (SOLR-14732) Lena Slaughter

2020-08-12 Thread Ishan Chattopadhyaya (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya deleted SOLR-14732:



> Lena Slaughter
> --
>
> Key: SOLR-14732
> URL: https://issues.apache.org/jira/browse/SOLR-14732
> Project: Solr
>  Issue Type: Wish
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hannah Slaughter Zacharski
>Priority: Major
>







[jira] [Deleted] (SOLR-14737) Michael Zacharski

2020-08-12 Thread Ishan Chattopadhyaya (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya deleted SOLR-14737:



> Michael Zacharski
> -
>
> Key: SOLR-14737
> URL: https://issues.apache.org/jira/browse/SOLR-14737
> Project: Solr
>  Issue Type: Wish
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hannah Slaughter Zacharski
>Priority: Major
>







[jira] [Deleted] (SOLR-14738) CLONE - Mosby Zacharski

2020-08-12 Thread Ishan Chattopadhyaya (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya deleted SOLR-14738:



> CLONE - Mosby Zacharski
> ---
>
> Key: SOLR-14738
> URL: https://issues.apache.org/jira/browse/SOLR-14738
> Project: Solr
>  Issue Type: Wish
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hannah Slaughter Zacharski
>Priority: Major
>






