[jira] [Commented] (LUCENE-9215) replace checkJavaDocs.py with doclet

2020-02-10 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17034206#comment-17034206
 ] 

Dawid Weiss commented on LUCENE-9215:
-

I think eventually we should leverage the object model the Java parser provides 
for us here. It is clumsy sometimes (visitor pattern everywhere) but it is 
elegant and backward-compatible with new features, should they be added to the 
language.

Just for future reference - the custom doclet could be a separate task (still 
javadoc but with a custom doclet) and then it'd be independent. There is some 
prior unrelated (but copy-able) code I've worked on in Carrot2 that extracts 
Javadoc snippets from existing classes. 

https://github.com/carrot2/carrot2/tree/master/infra/jsondoclet/src/main/java/com/carrotsearch/jsondoclet

and its application here:
https://github.com/carrot2/carrot2/blob/master/core/build.gradle#L53-L73
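For reference, a minimal sketch of what a visitor-based check over the Java object model could look like with the JDK's language-model and Compiler Tree APIs. All names here (MissingDocScanner, findMissing) are invented for illustration; this is not the proposed Lucene doclet.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import javax.lang.model.element.Element;
import javax.lang.model.element.TypeElement;
import javax.lang.model.util.ElementScanner9;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;
import com.sun.source.util.DocTrees;
import com.sun.source.util.JavacTask;

public class MissingDocScanner {

  /** Compiles {@code source} in memory and returns type names lacking a javadoc comment. */
  static List<String> findMissing(String source) throws Exception {
    JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
    JavaFileObject file =
        new SimpleJavaFileObject(URI.create("string:///Demo.java"), JavaFileObject.Kind.SOURCE) {
          @Override
          public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return source;
          }
        };
    JavacTask task =
        (JavacTask) compiler.getTask(null, null, diag -> {}, null, null, List.of(file));
    DocTrees docTrees = DocTrees.instance(task);
    List<String> missing = new ArrayList<>();
    // Visitor pattern: ElementScanner walks the element tree for us.
    ElementScanner9<Void, Void> scanner =
        new ElementScanner9<>() {
          @Override
          public Void visitType(TypeElement e, Void unused) {
            if (docTrees.getDocCommentTree(e) == null) {
              missing.add(e.getSimpleName().toString());
            }
            return super.visitType(e, unused);
          }
        };
    for (Element e : task.analyze()) {
      scanner.scan(e, null);
    }
    return missing;
  }

  public static void main(String[] args) throws Exception {
    System.out.println(findMissing("public class Demo { }"));
  }
}
```

The visitor is clumsy (as noted above) but stays valid as the language grows: new constructs get new visit methods with sensible defaults.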

> replace checkJavaDocs.py with doclet
> 
>
> Key: LUCENE-9215
> URL: https://issues.apache.org/jira/browse/LUCENE-9215
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Priority: Major
> Attachments: LUCENE-9215_prototype.patch
>
>
> The current checker runs regular expressions against html, and it breaks when 
> newer java versions change the html output. This is not particularly fun to 
> fix: see LUCENE-9213.
> Java releases happen often now, and when I compared the generated html of a 
> simple class across 11, 12 and 13, it surprised me how much changes. So I 
> think we want to avoid parsing their HTML.
> Javadoc's {{Xdoclint}} feature has a "missing checker", but it is black/white: 
> either everything is fully documented or it's not. And while you can 
> enable/disable doclint checks per-package, this also seems black/white 
> (either all checks or no checks at all).
> On the other hand, the python checker is able to check per-package at 
> different granularities (package, class, method). That makes it possible to 
> iteratively improve the situation.
> With the doclet api we could implement checks in pure java, for example to 
> match the checkJavaDocs.py logic:
> {code}
> private void checkComment(Element element) {
>   var tree = docTrees.getDocCommentTree(element);
>   if (tree == null) {
>     error(element, "javadocs are missing");
>   } else {
>     var normalized = tree.getFirstSentence().get(0).toString()
>         .replace('\u00A0', ' ')
>         .trim()
>         .toLowerCase(Locale.ROOT);
>     if (normalized.isEmpty()) {
>       error(element, "blank javadoc comment");
>     } else if (normalized.startsWith("licensed to the apache software foundation") ||
>                normalized.startsWith("copyright 2004 the apache software foundation")) {
>       error(element, "comment is really a license");
>     }
>   }
> }
> {code}
> If there are problems, they just appear as errors in the output of 
> {{javadoc}} as usual:
> {noformat}
> javadoc: error - org.apache.lucene.nodoc (package): javadocs are missing
> /home/rmuir/workspace/lucene-solr/lucene/core/src/java/org/apache/lucene/search/spans/SpanNearQuery.java:190: error - SpanNearWeight (class): javadocs are missing
> /home/rmuir/workspace/lucene-solr/lucene/core/src/java/org/apache/lucene/search/spans/SpanContainingQuery.java:54: error - SpanContainingWeight (class): javadocs are missing
> /home/rmuir/workspace/lucene-solr/lucene/core/src/java/org/apache/lucene/search/spans/SpanWithinQuery.java:55: error - SpanWithinWeight (class): javadocs are missing
> /home/rmuir/workspace/lucene-solr/lucene/core/src/java/org/apache/lucene/search/spans/SpanTermQuery.java:94: error - SpanTermWeight (class): javadocs are missing
> /home/rmuir/workspace/lucene-solr/lucene/core/src/java/org/apache/lucene/search/spans/SpanNotQuery.java:109: error - SpanNotWeight (class): javadocs are missing
> /home/rmuir/workspace/lucene-solr/lucene/core/src/java/org/apache/lucene/search/spans/SpanOrQuery.java:139: error - SpanOrWeight (class): javadocs are missing
> /home/rmuir/workspace/lucene-solr/lucene/core/src/java/org/apache/lucene/search/spans/SpanPositionCheckQuery.java:77: error - SpanPositionCheckWeight (class): javadocs are missing
> /home/rmuir/workspace/lucene-solr/lucene/core/src/java/org/apache/lucene/search/MultiCollectorManager.java:61: error - Collectors (class): javadocs are missing
> /home/rmuir/workspace/lucene-solr/lucene/core/src/java/org/apache/lucene/search/MultiCollectorManager.java:89: error - LeafCollectors (class): javadocs are missing
> /home/rmuir/workspace/lucene-solr/lucene/core/src/java/org/apache/lucene/util/PagedBytes.java:353: error - PagedBytesDataOutput (class): javadocs are missing
> {noformat}

[GitHub] [lucene-solr] atris commented on issue #1214: LUCENE-9074: Slice Allocation Control Plane For Concurrent Searches

2020-02-10 Thread GitBox
atris commented on issue #1214: LUCENE-9074: Slice Allocation Control Plane For 
Concurrent Searches
URL: https://github.com/apache/lucene-solr/pull/1214#issuecomment-584514132
 
 
   @jpountz Updated, please see and let me know your thoughts


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] atris commented on a change in pull request #1214: LUCENE-9074: Slice Allocation Control Plane For Concurrent Searches

2020-02-10 Thread GitBox
atris commented on a change in pull request #1214: LUCENE-9074: Slice 
Allocation Control Plane For Concurrent Searches
URL: https://github.com/apache/lucene-solr/pull/1214#discussion_r377481946
 
 

 ##
 File path: 
lucene/core/src/java/org/apache/lucene/search/QueueSizeBasedExecutionControlPlane.java
 ##
 @@ -0,0 +1,88 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.search;
+
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.List;
+import java.util.concurrent.Executor;
+import java.util.concurrent.Future;
+import java.util.concurrent.FutureTask;
+import java.util.concurrent.RejectedExecutionException;
+import java.util.concurrent.ThreadPoolExecutor;
+
+/**
+ * Implementation of SliceExecutionControlPlane with queue backpressure based thread allocation
+ */
+public class QueueSizeBasedExecutionControlPlane implements SliceExecutionControlPlane {
+  private static final double LIMITING_FACTOR = 1.5;
+  private static final int NUMBER_OF_PROCESSORS = Runtime.getRuntime().availableProcessors();
+
+  private Executor executor;
+
+  public QueueSizeBasedExecutionControlPlane(Executor executor) {
+    this.executor = executor;
+  }
+
+  @Override
+  public <T> List<Future<T>> invokeAll(Collection<FutureTask<T>> tasks) {
+    boolean isThresholdCheckEnabled = true;
+
+    if (tasks == null) {
+      throw new IllegalArgumentException("Tasks is null");
+    }
+
+    if (executor == null) {
+      throw new IllegalArgumentException("Executor is null");
+    }
+
+    ThreadPoolExecutor threadPoolExecutor = null;
+    if ((executor instanceof ThreadPoolExecutor) == false) {
 Review comment:
   Agreed. Reverted the Executor changes and added the abstraction while 
updating the docs for IndexSearcher
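For readers following along, a rough sketch of the queue-size backpressure idea discussed in this PR: if the pool's work queue is already saturated, run the slice on the caller thread instead of submitting it. The class name, threshold, and method signature below are assumptions for illustration, not the actual patch.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.Executor;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
import java.util.concurrent.ThreadPoolExecutor;

public class QueueBackpressureSketch {
  // Mirrors the LIMITING_FACTOR idea: allow roughly 1.5 waiting tasks per core.
  private static final int QUEUE_LIMIT =
      (int) (1.5 * Runtime.getRuntime().availableProcessors());

  private final Executor executor;

  public QueueBackpressureSketch(Executor executor) {
    this.executor = executor;
  }

  public <T> List<Future<T>> invokeAll(Collection<? extends Callable<T>> tasks) {
    List<Future<T>> futures = new ArrayList<>();
    ThreadPoolExecutor pool =
        (executor instanceof ThreadPoolExecutor) ? (ThreadPoolExecutor) executor : null;
    for (Callable<T> task : tasks) {
      FutureTask<T> future = new FutureTask<>(task);
      // Backpressure: when the pool's queue is saturated, execute on the caller thread.
      if (pool != null && pool.getQueue().size() >= QUEUE_LIMIT) {
        future.run();
      } else {
        executor.execute(future);
      }
      futures.add(future);
    }
    return futures;
  }
}
```

Running saturated work on the caller thread is the same degradation strategy as `ThreadPoolExecutor.CallerRunsPolicy`, applied per-slice rather than on rejection.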


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9209) fix javadocs to be html5, enable doclint html checks, remove jtidy

2020-02-10 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17034202#comment-17034202
 ] 

Dawid Weiss commented on LUCENE-9209:
-

Thanks Robert, looks great.

> fix javadocs to be html5, enable doclint html checks, remove jtidy
> --
>
> Key: LUCENE-9209
> URL: https://issues.apache.org/jira/browse/LUCENE-9209
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Priority: Major
> Fix For: master (9.0)
>
> Attachments: LUCENE-9209.patch, LUCENE-9209_current_state.patch
>
>
> Currently doclint is very angry about all the {{}} elements and similar 
> stuff going on. We claim to be emitting html5 documentation so it is about 
> time to clean it up.
> Then the html check can simply be enabled and we can remove the jtidy stuff 
> completely.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9201) Port documentation-lint task to Gradle build

2020-02-10 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17034199#comment-17034199
 ] 

Dawid Weiss commented on LUCENE-9201:
-

bq. we would need a special task to collect all javadocs to one place for 
publishing documentation

It is my personal preference to have project-scope granularity. This way you 
can run a project-scoped task (like {{gradlew -p lucene/core javadoc}}). My 
personal take on assembling "distributions" is to have a separate project that 
just takes what it needs from other projects and puts it together (with any 
tweaks required). This makes it easier to reason about how a distribution is 
assembled and from where, while each project just takes care of itself. 

Again - the above isn't a convention. It's just a style I gradually developed 
that has been working for me in other projects. If you take a look at the 
current Solr packaging project it's pretty much what I have in mind:

https://github.com/apache/lucene-solr/blob/master/solr/packaging/build.gradle

Let me look at the patch again later today (digging myself out of the vacation 
hole).


> Port documentation-lint task to Gradle build
> 
>
> Key: LUCENE-9201
> URL: https://issues.apache.org/jira/browse/LUCENE-9201
> Project: Lucene - Core
>  Issue Type: Sub-task
>Affects Versions: master (9.0)
>Reporter: Tomoko Uchida
>Assignee: Tomoko Uchida
>Priority: Major
> Attachments: javadocGRADLE.png, javadocHTML4.png, javadocHTML5.png
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Ant build's "documentation-lint" target consists of those two sub targets.
>  * "-ecj-javadoc-lint" (Javadoc linting by ECJ)
>  * "-documentation-lint"(Missing javadocs / broken links check by python 
> scripts)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Comment Edited] (LUCENE-9201) Port documentation-lint task to Gradle build

2020-02-10 Thread Tomoko Uchida (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17034121#comment-17034121
 ] 

Tomoko Uchida edited comment on LUCENE-9201 at 2/11/20 4:43 AM:


One small thing about the equivalent of "ant documentation" (the gradle 
built-in Javadoc task or our customized one):

I think it'd be better if the javadoc generation task output all javadocs to a 
module-wide common directory (e.g., {{lucene/build/docs}} or 
{{solr/build/docs}}), just like the ant build does, instead of to each module's 
build directory. This makes things easier for the succeeding "broken links 
check" (running {{checkJavadocLinks.py}} - or its replacement?) and for release 
managers' work, which should include updating the official documentation site 
([https://cwiki.apache.org/confluence/display/LUCENE/ReleaseTodo#ReleaseTodo-Pushdocs,changesandjavadocstotheCMSproductiontree]).


was (Author: tomoko uchida):
One small thing about the equivalent "ant documentation" (gradle built-in 
Javadoc task or our customized one), 

I think it'd be better the javadoc generation task should output all javadocs 
to module-wide common directory (e.g., {{lucene/build/docs}} or 
{{solr/build/docs}}) just ant build does, instead of each module's build 
directory. This makes things easy for succeeding "broken links check" (running 
{{checkJavadocLinks.py}} - or its replacement?) and release managers work that 
should includes updating the official documentation site 
([https://cwiki.apache.org/confluence/display/LUCENE/ReleaseTodo#ReleaseTodo-Pushdocs,changesandjavadocstotheCMSproductiontree]).

> Port documentation-lint task to Gradle build
> 
>
> Key: LUCENE-9201
> URL: https://issues.apache.org/jira/browse/LUCENE-9201
> Project: Lucene - Core
>  Issue Type: Sub-task
>Affects Versions: master (9.0)
>Reporter: Tomoko Uchida
>Assignee: Tomoko Uchida
>Priority: Major
> Attachments: javadocGRADLE.png, javadocHTML4.png, javadocHTML5.png
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Ant build's "documentation-lint" target consists of those two sub targets.
>  * "-ecj-javadoc-lint" (Javadoc linting by ECJ)
>  * "-documentation-lint"(Missing javadocs / broken links check by python 
> scripts)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Resolved] (SOLR-14247) IndexSizeTriggerMixedBoundsTest does a lot of sleeping

2020-02-10 Thread Mike Drob (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob resolved SOLR-14247.
--
Fix Version/s: master (9.0)
 Assignee: Mike Drob
   Resolution: Fixed

Thanks for the additional testing, Erick!

I committed the changes; if this ends up failing on Jenkins due to less 
powerful hardware, then we will go back and revisit.
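As an aside, the usual pattern for removing such sleeps is to wait on an explicit signal with a bounded timeout instead of a fixed pause; a hedged sketch of that technique (names are invented, this is not the actual test code):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class LatchInsteadOfSleep {

  /**
   * Runs {@code work} on another thread and waits for it to finish, with a
   * hard timeout. Returns true if the work completed in time.
   */
  static boolean runAndAwait(Runnable work, long timeoutSeconds) throws InterruptedException {
    CountDownLatch done = new CountDownLatch(1);
    new Thread(() -> {
      work.run();
      done.countDown(); // signal completion instead of making the caller sleep
    }).start();
    // Proceeds the moment the event fires; fails fast if it never does.
    return done.await(timeoutSeconds, TimeUnit.SECONDS);
  }
}
```

Compared to `Thread.sleep`, this removes the fixed delay on the happy path while still bounding how long a broken test can hang.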

> IndexSizeTriggerMixedBoundsTest does a lot of sleeping
> --
>
> Key: SOLR-14247
> URL: https://issues.apache.org/jira/browse/SOLR-14247
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Tests
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Minor
> Fix For: master (9.0)
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When I run tests locally, the slowest reported test is always 
> IndexSizeTriggerMixedBoundsTest  coming in at around 2 minutes.
> I took a look at the code and discovered that at least 80s of that is all 
> sleeps!
> There might need to be more synchronization and ordering added back in, but 
> when I removed all of the sleeps the test still passed locally for me, so I'm 
> not too sure what the point was or why we were slowing the system down so 
> much.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14247) IndexSizeTriggerMixedBoundsTest does a lot of sleeping

2020-02-10 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17034104#comment-17034104
 ] 

ASF subversion and git services commented on SOLR-14247:


Commit 71b869381ef0090a6e96eccbc9924ebdb4f57306 in lucene-solr's branch 
refs/heads/master from Mike
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=71b8693 ]

SOLR-14247 Remove unneeded sleeps (#1244)



> IndexSizeTriggerMixedBoundsTest does a lot of sleeping
> --
>
> Key: SOLR-14247
> URL: https://issues.apache.org/jira/browse/SOLR-14247
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Tests
>Reporter: Mike Drob
>Priority: Minor
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When I run tests locally, the slowest reported test is always 
> IndexSizeTriggerMixedBoundsTest  coming in at around 2 minutes.
> I took a look at the code and discovered that at least 80s of that is all 
> sleeps!
> There might need to be more synchronization and ordering added back in, but 
> when I removed all of the sleeps the test still passed locally for me, so I'm 
> not too sure what the point was or why we were slowing the system down so 
> much.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] madrob merged pull request #1244: SOLR-14247 Remove unneeded sleeps

2020-02-10 Thread GitBox
madrob merged pull request #1244: SOLR-14247 Remove unneeded sleeps
URL: https://github.com/apache/lucene-solr/pull/1244
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Reopened] (SOLR-14245) Validate Replica / ReplicaInfo on creation

2020-02-10 Thread Chris M. Hostetter (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris M. Hostetter reopened SOLR-14245:
---

we're seeing a spike in (reproducible) jenkins failures for 
ReplicaListTransformerTest.testTransform that correspond with the commits on 
this issue...

 
{noformat}

   [junit4]   2> NOTE: reproduce with: ant test -Dtestcase=ReplicaListTransformerTest -Dtests.method=testTransform -Dtests.seed=51F4546B22050419 -Dtests.multiplier=3 -Dtests.slow=true -Dtests.locale=saq -Dtests.timezone=Asia/Omsk -Dtests.asserts=true -Dtests.file.encoding=ISO-8859-1
   [junit4] FAILURE 0.01s J2 | ReplicaListTransformerTest.testTransform <<<
   [junit4]> Throwable #1: java.lang.AssertionError: expected:<1> but 
was:<2>
   [junit4]>at 
__randomizedtesting.SeedInfo.seed([51F4546B22050419:5B1FFCC907E1DD14]:0)
   [junit4]>at 
org.apache.solr.client.solrj.routing.ReplicaListTransformerTest.testTransform(ReplicaListTransformerTest.java:144)
   [junit4]>at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   [junit4]>at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   [junit4]>at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   [junit4]>at 
java.base/java.lang.reflect.Method.invoke(Method.java:566)
   [junit4]>at java.base/java.lang.Thread.run(Thread.java:834)
{noformat}

> Validate Replica / ReplicaInfo on creation
> --
>
> Key: SOLR-14245
> URL: https://issues.apache.org/jira/browse/SOLR-14245
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Reporter: Andrzej Bialecki
>Assignee: Andrzej Bialecki
>Priority: Minor
> Fix For: 8.5
>
>
> Replica / ReplicaInfo should be immutable and their fields should be 
> validated on creation.
> Some users reported that very rarely during a failed collection CREATE or 
> DELETE, or when the Overseer task queue becomes corrupted, Solr may write to 
> ZK incomplete replica infos (eg. node_name = null).
> This problem is difficult to reproduce but we should add safeguards anyway to 
> prevent writing such corrupted replica info to ZK.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-14254) Index backcompat break between 8.3.1 and 8.4.1

2020-02-10 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17034015#comment-17034015
 ] 

David Smiley edited comment on SOLR-14254 at 2/10/20 11:35 PM:
---

CC [~jpountz] I think Hossman has a point that the "FST50" name should have 
changed when the index version changed - to something like FST84, if 8.4 has a 
different version than prior.  It'd be cool if a user could say 
postingsFormat="FST"; clearly out of scope, I know.

Anyway... Cassandra, to answer your question: the upgrade advice is to 
re-index if that's easiest.  Another option is for a user to stay at the 
previous version, change the postingsFormat to be non-existent so it becomes 
the default, and then trigger an optimize, which will rewrite it to something 
that will survive to the next version.  Then upgrade the software.  Then, to 
get back to an FST based format, change the postingsFormat config back and do 
an optimize again.  Whether or not users should use postingsFormat=FST50 or 
let it default is a trade-off in performance vs compatibility.  FST kicks 
butt, but perhaps you won't notice if you send small amounts of text (e.g. 
queries).  If someone is desperate, the old code can be moved into Solr and 
packaged up as a plugin.


was (Author: dsmiley):
CC [~jpountz] I think Hossman has a point that the "FST50" name should have 
changed when the index version changed to like FST84 if 8.4 has a different 
version than prior.  It'd be cool if a user could say postingsFormat="FST"; 
clearly out of scope I know.

Anyway... Cassandra, to answer your question: upgrade advise is re-index if 
that's easiest.  Another option is for a user to stay at the previous version 
and change the postingsFormat to be non-existent to be the default and then 
trigger an optimize which will rewrite it to something that will survive to the 
next version.  Then upgrade the software, then change the version back 
(optionally).  Wether or not users should use postingsFormat=FST50 or let it 
default is a trade-off in performance vs compatibility.  FST kicks but but 
perhaps you won't notice if you send small amounts of text (e.g. queries).  If 
someone is desperate, the old code can be moved into Solr and packaged up as a 
plugin.

> Index backcompat break between 8.3.1 and 8.4.1
> --
>
> Key: SOLR-14254
> URL: https://issues.apache.org/jira/browse/SOLR-14254
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Jason Gerlowski
>Priority: Major
>
> I believe I found a backcompat break between 8.4.1 and 8.3.1.
> I encountered this when a Solr 8.3.1 cluster was upgraded to 8.4.1.  On 8.4. 
> nodes, several collections had cores fail to come up with 
> {{CorruptIndexException}}:
> {code}
> 2020-02-10 20:58:26.136 ERROR 
> (coreContainerWorkExecutor-2-thread-1-processing-n:192.168.1.194:8983_solr) [ 
>   ] o.a.s.c.CoreContainer Error waiting for SolrCore to be loaded on startup 
> => org.apache.sol
> r.common.SolrException: Unable to create core 
> [testbackcompat_shard1_replica_n1]
> at 
> org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1313)
> org.apache.solr.common.SolrException: Unable to create core 
> [testbackcompat_shard1_replica_n1]
> at 
> org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1313)
>  ~[?:?]
> at 
> org.apache.solr.core.CoreContainer.lambda$load$13(CoreContainer.java:788) 
> ~[?:?]
> at 
> com.codahale.metrics.InstrumentedExecutorService$InstrumentedCallable.call(InstrumentedExecutorService.java:202)
>  ~[metrics-core-4.0.5.jar:4.0.5]
> at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
> at 
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:210)
>  ~[?:?]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>  ~[?:?]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>  ~[?:?]
> at java.lang.Thread.run(Thread.java:834) [?:?]
> Caused by: org.apache.solr.common.SolrException: Error opening new searcher
> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:1072) ~[?:?]
> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:901) ~[?:?]
> at 
> org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1292)
>  ~[?:?]
> ... 7 more
> Caused by: org.apache.solr.common.SolrException: Error opening new searcher
> at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:2182) 
> ~[?:?]
> at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:2302) 
> ~[?:?]
> at 

[GitHub] [lucene-solr] ErickErickson commented on issue #1248: LUCENE-9134: Port ant-regenerate tasks to Gradle build

2020-02-10 Thread GitBox
ErickErickson commented on issue #1248: LUCENE-9134: Port ant-regenerate tasks 
to Gradle build
URL: https://github.com/apache/lucene-solr/pull/1248#issuecomment-584409136
 
 
   I made the changes Mike mentioned, but I won't create another PR for a bit 
to give others a chance to look


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14254) Index backcompat break between 8.3.1 and 8.4.1

2020-02-10 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17034015#comment-17034015
 ] 

David Smiley commented on SOLR-14254:
-

CC [~jpountz] I think Hossman has a point that the "FST50" name should have 
changed when the index version changed - to something like FST84, if 8.4 has a 
different version than prior.  It'd be cool if a user could say 
postingsFormat="FST"; clearly out of scope, I know.

Anyway... Cassandra, to answer your question: the upgrade advice is to 
re-index if that's easiest.  Another option is for a user to stay at the 
previous version, change the postingsFormat to be non-existent so it becomes 
the default, and then trigger an optimize, which will rewrite it to something 
that will survive to the next version.  Then upgrade the software, then change 
the version back (optionally).  Whether or not users should use 
postingsFormat=FST50 or let it default is a trade-off in performance vs 
compatibility.  FST kicks butt, but perhaps you won't notice if you send small 
amounts of text (e.g. queries).  If someone is desperate, the old code can be 
moved into Solr and packaged up as a plugin.

> Index backcompat break between 8.3.1 and 8.4.1
> --
>
> Key: SOLR-14254
> URL: https://issues.apache.org/jira/browse/SOLR-14254
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Jason Gerlowski
>Priority: Major
>
> I believe I found a backcompat break between 8.4.1 and 8.3.1.
> I encountered this when a Solr 8.3.1 cluster was upgraded to 8.4.1.  On 8.4. 
> nodes, several collections had cores fail to come up with 
> {{CorruptIndexException}}:
> {code}
> 2020-02-10 20:58:26.136 ERROR 
> (coreContainerWorkExecutor-2-thread-1-processing-n:192.168.1.194:8983_solr) [ 
>   ] o.a.s.c.CoreContainer Error waiting for SolrCore to be loaded on startup 
> => org.apache.sol
> r.common.SolrException: Unable to create core 
> [testbackcompat_shard1_replica_n1]
> at 
> org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1313)
> org.apache.solr.common.SolrException: Unable to create core 
> [testbackcompat_shard1_replica_n1]
> at 
> org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1313)
>  ~[?:?]
> at 
> org.apache.solr.core.CoreContainer.lambda$load$13(CoreContainer.java:788) 
> ~[?:?]
> at 
> com.codahale.metrics.InstrumentedExecutorService$InstrumentedCallable.call(InstrumentedExecutorService.java:202)
>  ~[metrics-core-4.0.5.jar:4.0.5]
> at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
> at 
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:210)
>  ~[?:?]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>  ~[?:?]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>  ~[?:?]
> at java.lang.Thread.run(Thread.java:834) [?:?]
> Caused by: org.apache.solr.common.SolrException: Error opening new searcher
> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:1072) ~[?:?]
> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:901) ~[?:?]
> at 
> org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1292)
>  ~[?:?]
> ... 7 more
> Caused by: org.apache.solr.common.SolrException: Error opening new searcher
> at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:2182) 
> ~[?:?]
> at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:2302) 
> ~[?:?]
> at org.apache.solr.core.SolrCore.initSearcher(SolrCore.java:1132) 
> ~[?:?]
> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:1013) ~[?:?]
> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:901) ~[?:?]
> at 
> org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1292)
>  ~[?:?]
> ... 7 more
> Caused by: org.apache.lucene.index.CorruptIndexException: codec mismatch: 
> actual codec=Lucene50PostingsWriterDoc vs expected 
> codec=Lucene84PostingsWriterDoc 
> (resource=MMapIndexInput(path="/Users/jasongerlowski/run/solrdata/data/testbackcompat_shard1_replica_n1/data/index/_0_FST50_0.doc"))
> at 
> org.apache.lucene.codecs.CodecUtil.checkHeaderNoMagic(CodecUtil.java:208) 
> ~[?:?]
> at org.apache.lucene.codecs.CodecUtil.checkHeader(CodecUtil.java:198) 
> ~[?:?]
> at 
> org.apache.lucene.codecs.CodecUtil.checkIndexHeader(CodecUtil.java:255) ~[?:?]
> at 
> org.apache.lucene.codecs.lucene84.Lucene84PostingsReader.<init>(Lucene84PostingsReader.java:82)
>  ~[?:?]
> at 
> org.apache.lucene.codecs.memory.FSTPostingsFormat.fieldsProducer(FSTPostingsFormat.java:66)
>  ~[?:?]
> 

[GitHub] [lucene-solr] ErickErickson commented on a change in pull request #1248: LUCENE-9134: Port ant-regenerate tasks to Gradle build

2020-02-10 Thread GitBox
ErickErickson commented on a change in pull request #1248: LUCENE-9134: Port 
ant-regenerate tasks to Gradle build
URL: https://github.com/apache/lucene-solr/pull/1248#discussion_r377376693
 
 

 ##
 File path: gradle/generation/util.gradle
 ##
 @@ -0,0 +1,107 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+apply plugin: "de.undercouch.download"
+
+configure(rootProject) {
+  configurations {
+    utilgen
+  }
+
+  dependencies {
+  }
+
+  task utilgen {
+    description "Regenerate sources for ...lucene/util/automaton and ...lucene/util/packed."
+    group "generation"
+
+    dependsOn ":lucene:core:utilGenPacked"
+    dependsOn ":lucene:core:utilGenLev"
+  }
+}
+
+
+task installMoman(type: Download) {
+  def momanDir = new File(buildDir, "moman").getAbsolutePath()
+  def momanZip = new File(momanDir, "moman.zip").getAbsolutePath()
+
+  src "https://bitbucket.org/jpbarrette/moman/get/5c5c2a1e4dea.zip"
+  dest momanZip
+  onlyIfModified true
+
+  doLast {
+    logger.lifecycle("Downloading moman to: ${buildDir}")
+    ant.unzip(src: momanZip, dest: momanDir, overwrite: "true") {
+      ant.cutdirsmapper(dirs: "1")
+    }
+  }
+}
+
+configure(project(":lucene:core")) {
+  task utilGenPacked(dependsOn: installMoman) {
+    description "Regenerate util/PackedBulkOperationsPacked*.java and Packed64SingleBlock.java"
+    group "generation"
+
+    def workDir = "src/java/org/apache/lucene/util/packed"
+
+    doLast {
+      ['gen_BulkOperation.py', 'gen_Packed64SingleBlock.py'].each { prog ->
+        logger.lifecycle("Executing: ${prog} in ${workDir}")
+        project.exec {
+          workingDir workDir
+          executable "python"
+          args = ['-B', "${prog}"]
+        }
+      }
+      // Correct line endings for Windows.
+      ['Packed64SingleBlock.java', 'BulkOperation*.java'].each { files ->
 
 Review comment:
  True, I'll change it.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] ErickErickson commented on a change in pull request #1248: LUCENE-9134: Port ant-regenerate tasks to Gradle build

2020-02-10 Thread GitBox
ErickErickson commented on a change in pull request #1248: LUCENE-9134: Port 
ant-regenerate tasks to Gradle build
URL: https://github.com/apache/lucene-solr/pull/1248#discussion_r377374590
 
 

 ##
 File path: gradle/generation/util.gradle
 ##
 @@ -0,0 +1,107 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+apply plugin: "de.undercouch.download"
+
+configure(rootProject) {
+  configurations {
+    utilgen
+  }
+
+  dependencies {
+  }
+
+  task utilgen {
+    description "Regenerate sources for ...lucene/util/automaton and ...lucene/util/packed."
+    group "generation"
+
+    dependsOn ":lucene:core:utilGenPacked"
+    dependsOn ":lucene:core:utilGenLev"
+  }
+}
+
+
+task installMoman(type: Download) {
+  def momanDir = new File(buildDir, "moman").getAbsolutePath()
+  def momanZip = new File(momanDir, "moman.zip").getAbsolutePath()
+
+  src "https://bitbucket.org/jpbarrette/moman/get/5c5c2a1e4dea.zip"
+  dest momanZip
+  onlyIfModified true
+
+  doLast {
+    logger.lifecycle("Downloading moman to: ${buildDir}")
+    ant.unzip(src: momanZip, dest: momanDir, overwrite: "true") {
+      ant.cutdirsmapper(dirs: "1")
+    }
+  }
+}
+
+configure(project(":lucene:core")) {
+  task utilGenPacked(dependsOn: installMoman) {
+    description "Regenerate util/PackedBulkOperationsPacked*.java and Packed64SingleBlock.java"
+    group "generation"
+
+    def workDir = "src/java/org/apache/lucene/util/packed"
+
+    doLast {
+      ['gen_BulkOperation.py', 'gen_Packed64SingleBlock.py'].each { prog ->
+        logger.lifecycle("Executing: ${prog} in ${workDir}")
+        project.exec {
+          workingDir workDir
+          executable "python"
+          args = ['-B', "${prog}"]
+        }
+      }
+      // Correct line endings for Windows.
+      ['Packed64SingleBlock.java', 'BulkOperation*.java'].each { files ->
+        project.ant.fixcrlf(
+            srcDir: workDir,
+            includes: files,
+            encoding: 'UTF-8',
+            eol: 'lf'
+        )
+      }
+    }
+  }
+}
+
+configure(project(":lucene:core")) {
 
 Review comment:
   Good point, done.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] ErickErickson commented on a change in pull request #1248: LUCENE-9134: Port ant-regenerate tasks to Gradle build

2020-02-10 Thread GitBox
ErickErickson commented on a change in pull request #1248: LUCENE-9134: Port 
ant-regenerate tasks to Gradle build
URL: https://github.com/apache/lucene-solr/pull/1248#discussion_r377369604
 
 

 ##
 File path: 
lucene/core/src/java/org/apache/lucene/util/automaton/createLevAutomata.py
 ##
 @@ -22,7 +22,7 @@
 import os
 import sys
 # sys.path.insert(0, 'moman/finenight/python')
-sys.path.insert(0, '../../../../../../../../build/core/moman/finenight/python')
+sys.path.insert(0, '../../../../../../../../../build/moman/finenight/python')
 
 Review comment:
   Yeah. I'm pretty sure it does break the ant regenerate. Since the whole 
regenerate process is apparently run extremely rarely (like every couple of 
years or so from what I can tell), I think we'll be on gradle exclusively the 
next time this is run.
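
(Illustrative aside, not from the thread: the reason the number of `../` segments matters is that a relative `sys.path` entry is resolved against the process's current working directory at import time, not against the script's own location. A minimal sketch of how such a relative entry resolves, assuming the script is run from its own directory:)

```python
# Toy illustration of relative sys.path resolution (the directory names
# below mirror the Lucene tree but this is not the Lucene build itself).
import os

# Directory the generator script lives in and is run from.
script_dir = "lucene/core/src/java/org/apache/lucene/util/automaton"
# The new entry from the patch: nine "../" segments climb out of the
# nine directories in script_dir, landing at the repository root.
rel = "../../../../../../../../../build/moman/finenight/python"

# Joining and normalizing shows which directory the entry points at,
# e.g. "build/moman/finenight/python" on POSIX systems.
resolved = os.path.normpath(os.path.join(script_dir, rel))
print(resolved)
```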


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-14254) Index backcompat break between 8.3.1 and 8.4.1

2020-02-10 Thread Cassandra Targett (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17033993#comment-17033993
 ] 

Cassandra Targett edited comment on SOLR-14254 at 2/10/20 10:57 PM:


[~dsmiley], do you have a recommendation for people who use the Tagger handler 
in 8.x and who upgrade to 8.4? Is the only option to reindex?

LUCENE-9116 is for 8.5, so it was a bit confusing to find the break in 8.4.1, 
but my reading of the comments there is that something was worked out - did I 
misunderstand that discussion (very possible since I didn't look at the code)?

Also, {{tagger-handler.adoc}} for 8.4 and 8.5 still says the 
{{postingsFormat=FST50}} is the official recommendation. That would obviously 
fail - should the {{postingsFormat}} param now be omitted?

Edit: re-reading Hoss' comment I think I missed that the new codec is using the 
same name as the old one (or maybe that's just on master after LUCENE-9116), 
which would mean docs do not need to be updated, but it would be nice to get 
explicit clarification on that. The docs still need to be updated with a major 
caveat that using the param we tell them to use means they are now in 
"non-standard codec" territory and stuff could shift without warning.



> Index backcompat break between 8.3.1 and 8.4.1
> ----------------------------------------------
>
> Key: SOLR-14254
> URL: https://issues.apache.org/jira/browse/SOLR-14254
> Project: Solr
> Issue Type: Improvement
> Security Level: Public (Default Security Level. Issues are Public)
> Reporter: Jason Gerlowski
> Priority: Major
>
> I believe I found a backcompat break between 8.4.1 and 8.3.1.
> I encountered this when a Solr 8.3.1 cluster was upgraded to 8.4.1. On 8.4. 
> nodes, several collections had cores fail to come up with 
> {{CorruptIndexException}}:
> {code}
> 2020-02-10 20:58:26.136 ERROR (coreContainerWorkExecutor-2-thread-1-processing-n:192.168.1.194:8983_solr) [   ] o.a.s.c.CoreContainer Error waiting for SolrCore to be loaded on startup => org.apache.solr.common.SolrException: Unable to create core [testbackcompat_shard1_replica_n1]
> at org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1313)
> org.apache.solr.common.SolrException: Unable to create core [testbackcompat_shard1_replica_n1]
> at org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1313) ~[?:?]
> at org.apache.solr.core.CoreContainer.lambda$load$13(CoreContainer.java:788) ~[?:?]
> at com.codahale.metrics.InstrumentedExecutorService$InstrumentedCallable.call(InstrumentedExecutorService.java:202) ~[metrics-core-4.0.5.jar:4.0.5]
> at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
> at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:210) ~[?:?]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
> at java.lang.Thread.run(Thread.java:834) [?:?]
> Caused by: org.apache.solr.common.SolrException: Error opening new searcher
> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:1072) ~[?:?]
> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:901) ~[?:?]
> at org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1292) ~[?:?]
> ... 7 more
> Caused by: org.apache.solr.common.SolrException: Error opening new searcher
> at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:2182) ~[?:?]
> at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:2302) ~[?:?]
> at org.apache.solr.core.SolrCore.initSearcher(SolrCore.java:1132) ~[?:?]
> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:1013) ~[?:?]
> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:901) ~[?:?]
> at org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1292) ~[?:?]
> ... 7 more
> Caused by: org.apache.lucene.index.CorruptIndexException: codec mismatch: 
> actual 

[jira] [Commented] (SOLR-14254) Index backcompat break between 8.3.1 and 8.4.1

2020-02-10 Thread Chris M. Hostetter (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17033998#comment-17033998
 ] 

Chris M. Hostetter commented on SOLR-14254:
---

{quote}LUCENE-9116 is for 8.5, so it was a bit confusing to find the break in 
8.4.1, but my reading of the comments there is that something was worked out ...
{quote}
IIUC: LUCENE-9027 actually broke backcompat on the {{FST50}} postings format 
(in 8.4) ... but that's not considered a "back compat break" for lucene/solr 
because it was not the default postings format at the time it was changed. (and 
this policy is mentioned in {{field-type-definitions-and-properties.adoc:112}})

David's points/references to LUCENE-9116 are about the fact that in _that_ 
issue the {{FST50}} posting format was (initially) removed entirely from 
branch_8x, and the later decision was to re-add it – but it's still not fully 
back compatible with how it worked in 8.3.1
{quote}Also, {{tagger-handler.adoc}} for 8.4 and 8.5 still says the 
{{postingsFormat=FST50}} is the official recommendation. That would obviously 
fail
{quote}
I believe that usage will still _work_ in 8.4 and higher ... it just creates 
different bytes on disk than it did in 8.3, and it hypothetically may not be 
readable in 8.5, 8.6, etc... if additional changes are made
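
(Illustrative aside, not from the thread: a toy sketch of the header check that produces this kind of failure. The codec names are taken from the reported exception; the functions and the bare-string header encoding are hypothetical simplifications, not Lucene's actual `CodecUtil` implementation, which also writes a magic number and version bounds.)

```python
# Toy model: the writer stamps its codec name into the file header, and a
# reader expecting a different name refuses the file up front, which is
# safer than misparsing postings data written in an older layout.
def write_header(codec_name: str) -> bytes:
    # Hypothetical: real index headers carry a magic number and version too.
    return codec_name.encode("utf-8")

def check_header(header: bytes, expected_codec: str) -> None:
    actual = header.decode("utf-8")
    if actual != expected_codec:
        raise ValueError(
            f"codec mismatch: actual codec={actual} "
            f"vs expected codec={expected_codec}")

# An 8.3-era segment written with the old postings writer...
header = write_header("Lucene50PostingsWriterDoc")
# ...opened by a reader that now expects the new name fails immediately.
try:
    check_header(header, "Lucene84PostingsWriterDoc")
except ValueError as e:
    print(e)
```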


[jira] [Commented] (SOLR-14254) Index backcompat break between 8.3.1 and 8.4.1

2020-02-10 Thread Cassandra Targett (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17033993#comment-17033993
 ] 

Cassandra Targett commented on SOLR-14254:
--

[~dsmiley], do you have a recommendation for people who use the Tagger handler 
in 8.x and who upgrade to 8.4? Is the only option to reindex?

LUCENE-9116 is for 8.5, so it was a bit confusing to find the break in 8.4.1, 
but my reading of the comments there is that something was worked out - did I 
misunderstand that discussion (very possible since I didn't look at the code)?

Also, {{tagger-handler.adoc}} for 8.4 and 8.5 still says the 
{{postingsFormat=FST50}} is the official recommendation. That would obviously 
fail - should the {{postingsFormat}} param now be omitted?


[jira] [Commented] (SOLR-14254) Index backcompat break between 8.3.1 and 8.4.1

2020-02-10 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17033989#comment-17033989
 ] 

David Smiley commented on SOLR-14254:
-

FST50 has a back-compat break but there is no back-compat guarantees for the 
non-default codec -- see LUCENE-9116.  This is documented in 
{{field-type-definitions-and-properties.adoc}} but perhaps this should have 
been called out explicitly in the upgrade notes.


[jira] [Commented] (SOLR-14254) Index backcompat break between 8.3.1 and 8.4.1

2020-02-10 Thread Chris M. Hostetter (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17033973#comment-17033973
 ] 

Chris M. Hostetter commented on SOLR-14254:
---

I think the problem here is that {{FSTPostingsFormat}} has always identified 
(and still identifies) itself with the name "FST50" -- and LUCENE-9027 changed 
the underlying writer/reader of {{FSTPostingsFormat}} from 
{{Lucene50PostingsWriter}} + {{Lucene50PostingsReader}} to 
{{Lucene84PostingsWriter}} + {{Lucene84PostingsReader}} ... w/o changing the 
"FST50" name passed to the {{PostingsFormat}} super constructor (or adding a 
new "FST50PostingsFormat" to the backcompat codecs)

Which means -- IIUC -- when the "FST50" postings format is read from an 8.3.1 
index, {{FSTPostingsFormat}} is used, but it tries to use 
{{Lucene84PostingsReader}} instead of {{Lucene50PostingsReader}}?

(It looks like {{FSTOrdPostingsFormat}} might also have the same bug?)
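
(Illustrative aside, not from the thread: a hypothetical sketch of the naming hazard described above. Postings formats are looked up by their public name alone, so if the name stays "FST50" while the implementation behind it changes, a segment written under the old internals is silently routed to the new reader. The dict-based registry below is a stand-in for Lucene's SPI lookup, not its actual mechanism.)

```python
# Two hypothetical registry snapshots: the public key is identical, but
# the implementation it resolves to differs between releases.
FORMATS_8_3 = {"FST50": "Lucene50PostingsReader"}
FORMATS_8_4 = {"FST50": "Lucene84PostingsReader"}  # same key, new internals

def reader_for(registry, format_name):
    # SPI-style lookup keyed only by the public format name.
    return registry[format_name]

# A segment stamped "FST50" in 8.3 resolves to a different reader in 8.4,
# even though nothing in the index itself changed; the lookup key alone
# cannot distinguish the two on-disk layouts.
old = reader_for(FORMATS_8_3, "FST50")
new = reader_for(FORMATS_8_4, "FST50")
print(old, "->", new)
```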




[GitHub] [lucene-solr] madrob commented on a change in pull request #1248: LUCENE-9134: Port ant-regenerate tasks to Gradle build

2020-02-10 Thread GitBox
madrob commented on a change in pull request #1248: LUCENE-9134: Port 
ant-regenerate tasks to Gradle build
URL: https://github.com/apache/lucene-solr/pull/1248#discussion_r377327836
 
 

 ##
 File path: gradle/generation/util.gradle
 ##
 @@ -0,0 +1,107 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+apply plugin: "de.undercouch.download"
+
+configure(rootProject) {
+  configurations {
+    utilgen
+  }
+
+  dependencies {
+  }
+
+  task utilgen {
+    description "Regenerate sources for ...lucene/util/automaton and ...lucene/util/packed."
+    group "generation"
+
+    dependsOn ":lucene:core:utilGenPacked"
+    dependsOn ":lucene:core:utilGenLev"
+  }
+}
+
+
+task installMoman(type: Download) {
+  def momanDir = new File(buildDir, "moman").getAbsolutePath()
+  def momanZip = new File(momanDir, "moman.zip").getAbsolutePath()
+
+  src "https://bitbucket.org/jpbarrette/moman/get/5c5c2a1e4dea.zip"
+  dest momanZip
+  onlyIfModified true
+
+  doLast {
+    logger.lifecycle("Downloading moman to: ${buildDir}")
+    ant.unzip(src: momanZip, dest: momanDir, overwrite: "true") {
+      ant.cutdirsmapper(dirs: "1")
+    }
+  }
+}
+
+configure(project(":lucene:core")) {
+  task utilGenPacked(dependsOn: installMoman) {
+    description "Regenerate util/PackedBulkOperationsPacked*.java and Packed64SingleBlock.java"
+    group "generation"
+
+    def workDir = "src/java/org/apache/lucene/util/packed"
+
+    doLast {
+      ['gen_BulkOperation.py', 'gen_Packed64SingleBlock.py'].each { prog ->
+        logger.lifecycle("Executing: ${prog} in ${workDir}")
+        project.exec {
+          workingDir workDir
+          executable "python"
+          args = ['-B', "${prog}"]
+        }
+      }
+      // Correct line endings for Windows.
+      ['Packed64SingleBlock.java', 'BulkOperation*.java'].each { files ->
+        project.ant.fixcrlf(
+            srcDir: workDir,
+            includes: files,
+            encoding: 'UTF-8',
+            eol: 'lf'
+        )
+      }
+    }
+  }
+}
+
+configure(project(":lucene:core")) {
 
 Review comment:
   I think you can combine this with the previous configure block, I don't 
think separating them adds readability. Let me know if you did this 
intentionally though
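For illustration, the two `configure(project(":lucene:core"))` blocks could be merged as suggested; a hypothetical sketch of the combined form (task bodies abbreviated, not the actual patch):

```groovy
// Hypothetical merged form of the two configure blocks from the diff above.
configure(project(":lucene:core")) {
  task utilGenPacked(dependsOn: installMoman) {
    description "Regenerate util/packed sources"
    group "generation"
    // ... doLast { ... } body as in the original block ...
  }

  task utilGenLev(dependsOn: installMoman) {
    description "Regenerate util/automaton Levenshtein sources"
    group "generation"
    // ... body of the task previously defined in its own configure block ...
  }
}
```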


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] madrob commented on a change in pull request #1248: LUCENE-9134: Port ant-regenerate tasks to Gradle build

2020-02-10 Thread GitBox
madrob commented on a change in pull request #1248: LUCENE-9134: Port 
ant-regenerate tasks to Gradle build
URL: https://github.com/apache/lucene-solr/pull/1248#discussion_r377327334
 
 

 ##
 File path: gradle/generation/util.gradle
 ##
 @@ -0,0 +1,107 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+apply plugin: "de.undercouch.download"
+
+configure(rootProject) {
+  configurations {
+    utilgen
+  }
+
+  dependencies {
+  }
+
+  task utilgen {
+    description "Regenerate sources for ...lucene/util/automaton and ...lucene/util/packed."
+    group "generation"
+
+    dependsOn ":lucene:core:utilGenPacked"
+    dependsOn ":lucene:core:utilGenLev"
+  }
+}
+
+
+task installMoman(type: Download) {
+  def momanDir = new File(buildDir, "moman").getAbsolutePath()
+  def momanZip = new File(momanDir, "moman.zip").getAbsolutePath()
+
+  src "https://bitbucket.org/jpbarrette/moman/get/5c5c2a1e4dea.zip"
+  dest momanZip
+  onlyIfModified true
+
+  doLast {
+    logger.lifecycle("Downloading moman to: ${buildDir}")
+    ant.unzip(src: momanZip, dest: momanDir, overwrite: "true") {
+      ant.cutdirsmapper(dirs: "1")
+    }
+  }
+}
+
+configure(project(":lucene:core")) {
+  task utilGenPacked(dependsOn: installMoman) {
+    description "Regenerate util/PackedBulkOperationsPacked*.java and Packed64SingleBlock.java"
+    group "generation"
+
+    def workDir = "src/java/org/apache/lucene/util/packed"
+
+    doLast {
+      ['gen_BulkOperation.py', 'gen_Packed64SingleBlock.py'].each { prog ->
+        logger.lifecycle("Executing: ${prog} in ${workDir}")
+        project.exec {
+          workingDir workDir
+          executable "python"
+          args = ['-B', "${prog}"]
+        }
+      }
+      // Correct line endings for Windows.
+      ['Packed64SingleBlock.java', 'BulkOperation*.java'].each { files ->
 
 Review comment:
   Does this need to be an `each` block, or can we specify multiple includes 
for the ant execution?
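To the reviewer's question: Ant's `fixcrlf` task accepts a comma- or space-separated pattern list in `includes` (as Ant filesets generally do), so the `.each` loop could likely collapse to a single call; a sketch:

```groovy
// Hypothetical single-call form, assuming ant.fixcrlf accepts a
// comma-separated pattern list in `includes` (standard Ant fileset behavior).
project.ant.fixcrlf(
    srcDir: workDir,
    includes: 'Packed64SingleBlock.java,BulkOperation*.java',
    encoding: 'UTF-8',
    eol: 'lf')
```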


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] madrob commented on a change in pull request #1248: LUCENE-9134: Port ant-regenerate tasks to Gradle build

2020-02-10 Thread GitBox
madrob commented on a change in pull request #1248: LUCENE-9134: Port 
ant-regenerate tasks to Gradle build
URL: https://github.com/apache/lucene-solr/pull/1248#discussion_r377332813
 
 

 ##
 File path: 
lucene/core/src/java/org/apache/lucene/util/automaton/createLevAutomata.py
 ##
 @@ -22,7 +22,7 @@
 import os
 import sys
 # sys.path.insert(0, 'moman/finenight/python')
-sys.path.insert(0, '../../../../../../../../build/core/moman/finenight/python')
+sys.path.insert(0, '../../../../../../../../../build/moman/finenight/python')
 
 Review comment:
   I think this has been answered before, but please remind me, does this break 
`ant regenerate` meaning that both cannot co-exist?
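Regardless of the answer, one way to make the script less sensitive to where the moman checkout lands (and so less likely to break one build or the other) is to resolve the path relative to the script's own location instead of hard-coding a chain of `..` from the working directory; a sketch, with the target directory assumed from the Gradle layout discussed in this PR:

```python
import os
import sys

# Resolve the moman checkout relative to this script's own location instead of
# a hard-coded "../../.." chain. The "build/moman/finenight/python" target is
# an assumption based on the Gradle layout discussed in this PR; the fallback
# to the current directory covers interactive use where __file__ is undefined.
script_dir = (os.path.dirname(os.path.abspath(__file__))
              if "__file__" in globals() else os.getcwd())
moman_python = os.path.normpath(
    os.path.join(script_dir, *[os.pardir] * 9,
                 "build", "moman", "finenight", "python"))
sys.path.insert(0, moman_python)
```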


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14254) Index backcompat break between 8.3.1 and 8.4.1

2020-02-10 Thread Jason Gerlowski (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gerlowski updated SOLR-14254:
---
Description: 
I believe I found a backcompat break between 8.4.1 and 8.3.1.

I encountered this when a Solr 8.3.1 cluster was upgraded to 8.4.1.  On 8.4 
nodes, several collections had cores fail to come up with 
{{CorruptIndexException}}:

{code}
2020-02-10 20:58:26.136 ERROR (coreContainerWorkExecutor-2-thread-1-processing-n:192.168.1.194:8983_solr) [   ] o.a.s.c.CoreContainer Error waiting for SolrCore to be loaded on startup => org.apache.solr.common.SolrException: Unable to create core [testbackcompat_shard1_replica_n1]
        at org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1313)
org.apache.solr.common.SolrException: Unable to create core [testbackcompat_shard1_replica_n1]
        at org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1313) ~[?:?]
        at org.apache.solr.core.CoreContainer.lambda$load$13(CoreContainer.java:788) ~[?:?]
        at com.codahale.metrics.InstrumentedExecutorService$InstrumentedCallable.call(InstrumentedExecutorService.java:202) ~[metrics-core-4.0.5.jar:4.0.5]
        at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
        at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:210) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
        at java.lang.Thread.run(Thread.java:834) [?:?]
Caused by: org.apache.solr.common.SolrException: Error opening new searcher
        at org.apache.solr.core.SolrCore.<init>(SolrCore.java:1072) ~[?:?]
        at org.apache.solr.core.SolrCore.<init>(SolrCore.java:901) ~[?:?]
        at org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1292) ~[?:?]
        ... 7 more
Caused by: org.apache.solr.common.SolrException: Error opening new searcher
        at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:2182) ~[?:?]
        at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:2302) ~[?:?]
        at org.apache.solr.core.SolrCore.initSearcher(SolrCore.java:1132) ~[?:?]
        at org.apache.solr.core.SolrCore.<init>(SolrCore.java:1013) ~[?:?]
        at org.apache.solr.core.SolrCore.<init>(SolrCore.java:901) ~[?:?]
        at org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1292) ~[?:?]
        ... 7 more
Caused by: org.apache.lucene.index.CorruptIndexException: codec mismatch: actual codec=Lucene50PostingsWriterDoc vs expected codec=Lucene84PostingsWriterDoc (resource=MMapIndexInput(path="/Users/jasongerlowski/run/solrdata/data/testbackcompat_shard1_replica_n1/data/index/_0_FST50_0.doc"))
        at org.apache.lucene.codecs.CodecUtil.checkHeaderNoMagic(CodecUtil.java:208) ~[?:?]
        at org.apache.lucene.codecs.CodecUtil.checkHeader(CodecUtil.java:198) ~[?:?]
        at org.apache.lucene.codecs.CodecUtil.checkIndexHeader(CodecUtil.java:255) ~[?:?]
        at org.apache.lucene.codecs.lucene84.Lucene84PostingsReader.<init>(Lucene84PostingsReader.java:82) ~[?:?]
        at org.apache.lucene.codecs.memory.FSTPostingsFormat.fieldsProducer(FSTPostingsFormat.java:66) ~[?:?]
        at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.<init>(PerFieldPostingsFormat.java:315) ~[?:?]
        at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat.fieldsProducer(PerFieldPostingsFormat.java:395) ~[?:?]
        at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:114) ~[?:?]
        at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:84) ~[?:?]
        at org.apache.lucene.index.ReadersAndUpdates.getReader(ReadersAndUpdates.java:177) ~[?:?]
        at org.apache.lucene.index.ReadersAndUpdates.getReadOnlyClone(ReadersAndUpdates.java:219) ~[?:?]
        at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:109) ~[?:?]
        at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:526) ~[?:?]
        at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:116) ~[?:?]
        at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:92) ~[?:?]
        at org.apache.solr.core.StandardIndexReaderFactory.newReader(StandardIndexReaderFactory.java:39) ~[?:?]
        at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:2146) ~[?:?]
        at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:2302) ~[?:?]
        at org.apache.solr.core.SolrCore.initSearcher(SolrCore.java:1132) ~[?:?]
        at org.apache.solr.core.SolrCore.<init>(SolrCore.java:1013) ~[?:?]
        at org.apache.solr.core.SolrCore.<init>(SolrCore.java:901) ~[?:?]
        at 

[GitHub] [lucene-solr] HoustonPutman commented on a change in pull request #1238: SOLR-14240: Clean up znodes after shard deletion is invoked

2020-02-10 Thread GitBox
HoustonPutman commented on a change in pull request #1238: SOLR-14240: Clean up 
znodes after shard deletion is invoked
URL: https://github.com/apache/lucene-solr/pull/1238#discussion_r377325559
 
 

 ##
 File path: 
solr/core/src/java/org/apache/solr/cloud/api/collections/DeleteShardCmd.java
 ##
 @@ -151,6 +154,21 @@ public void call(ClusterState clusterState, ZkNodeProps message, NamedList resul
           "Error executing delete operation for collection: " + collectionName + " shard: " + sliceId, e);
     }
   }
+
+  private void cleanupZooKeeperShardMetadata(SolrZkClient client, String collection, String sliceId) throws InterruptedException {
+    String leaderElectPath = ZkStateReader.COLLECTIONS_ZKNODE + "/" + collection + "/leader_elect/" + sliceId;
+    String shardLeaderPath = ZkStateReader.COLLECTIONS_ZKNODE + "/" + collection + "/leaders/" + sliceId;
+    String shardTermsPath = ZkStateReader.COLLECTIONS_ZKNODE + "/" + collection + "/terms/" + sliceId;
+
+    try {
+      client.clean(leaderElectPath);
+      client.clean(shardLeaderPath);
+      client.clean(shardTermsPath);
+    } catch (KeeperException ex) {
+      log.warn("Non-fatal error occurred attempting to delete shard metadata on ZooKeeper for collection " + 
 
 Review comment:
   If you are just logging a warning on failure, you might want to loop through 
each one, with the try-catch inside the loop. Therefore if one path fails, the 
others have a chance of succeeding. You can also log the path that failed which 
will help in debugging.
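The per-path try/catch the reviewer suggests can be illustrated with a small standalone sketch (this is not Solr code; `clean` here is a stand-in for `SolrZkClient.clean`, rigged to fail on one path so the behavior is visible):

```java
import java.util.Arrays;
import java.util.List;

public class PerPathCleanup {

    // Stand-in for SolrZkClient.clean(path); throws for one path to
    // demonstrate that the remaining paths are still attempted.
    static void clean(String path) throws Exception {
        if (path.contains("leader_elect")) {
            throw new Exception("simulated KeeperException");
        }
    }

    // try/catch inside the loop: one failing path no longer prevents the
    // others from being cleaned, and the failing path can be logged.
    static int cleanAll(List<String> paths) {
        int cleaned = 0;
        for (String path : paths) {
            try {
                clean(path);
                cleaned++;
            } catch (Exception ex) {
                System.err.println("Non-fatal: could not delete " + path
                        + ": " + ex.getMessage());
            }
        }
        return cleaned;
    }

    public static void main(String[] args) {
        List<String> paths = Arrays.asList(
                "/collections/c1/leader_elect/shard1",
                "/collections/c1/leaders/shard1",
                "/collections/c1/terms/shard1");
        // the two non-failing paths are still cleaned despite the first failure
        System.out.println(cleanAll(paths));
    }
}
```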


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9215) replace checkJavaDocs.py with doclet

2020-02-10 Thread Robert Muir (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17033944#comment-17033944
 ] 

Robert Muir commented on LUCENE-9215:
-

It may be a while before I get to spend time to try to work on this. But i will 
get back to it if nobody else does. Feel free to assign the issue and hack on 
it if you are inspired.

> replace checkJavaDocs.py with doclet
> 
>
> Key: LUCENE-9215
> URL: https://issues.apache.org/jira/browse/LUCENE-9215
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Priority: Major
> Attachments: LUCENE-9215_prototype.patch
>
>
> The current checker runs regular expressions against HTML, and it breaks when 
> newer Java versions change the HTML output. This is not particularly fun to fix: see 
> LUCENE-9213.
> Java releases arrive often now, and when I compared the generated HTML of a simple 
> class across 11, 12, and 13 it surprised me how much changes. So I think we want to 
> avoid parsing their HTML.
> The Javadoc {{Xdoclint}} feature has a "missing checker", but it is black/white: 
> either everything is fully documented or it's not. And while you can 
> enable/disable doclint checks per-package, this also seems black/white 
> (either all checks or no checks at all).
> On the other hand the python checker is able to check per-package at 
> different granularities (package, class, method). It makes it possible to 
> iteratively improve the situation.
> With doclet api we could implement checks in pure java, for example to match 
> checkJavaDocs.py logic:
> {code}
>   private void checkComment(Element element) {
> var tree = docTrees.getDocCommentTree(element);
> if (tree == null) {
>   error(element, "javadocs are missing");
> } else {
>   var normalized = tree.getFirstSentence().get(0).toString()
>.replace('\u00A0', ' ')
>.trim()
>.toLowerCase(Locale.ROOT);
>   if (normalized.isEmpty()) {
> error(element, "blank javadoc comment");
>   } else if (normalized.startsWith("licensed to the apache software 
> foundation") ||
>  normalized.startsWith("copyright 2004 the apache software 
> foundation")) {
> error(element, "comment is really a license");
>   }
> }
> {code}
> If there are problems then they just appear as errors from the output of 
> {{javadoc}} like usual:
> {noformat}
> javadoc: error - org.apache.lucene.nodoc (package): javadocs are missing
> /home/rmuir/workspace/lucene-solr/lucene/core/src/java/org/apache/lucene/search/spans/SpanNearQuery.java:190:
>  error - SpanNearWeight (class): javadocs are missing
> /home/rmuir/workspace/lucene-solr/lucene/core/src/java/org/apache/lucene/search/spans/SpanContainingQuery.java:54:
>  error - SpanContainingWeight (class): javadocs are missing
> /home/rmuir/workspace/lucene-solr/lucene/core/src/java/org/apache/lucene/search/spans/SpanWithinQuery.java:55:
>  error - SpanWithinWeight (class): javadocs are missing
> /home/rmuir/workspace/lucene-solr/lucene/core/src/java/org/apache/lucene/search/spans/SpanTermQuery.java:94:
>  error - SpanTermWeight (class): javadocs are missing
> /home/rmuir/workspace/lucene-solr/lucene/core/src/java/org/apache/lucene/search/spans/SpanNotQuery.java:109:
>  error - SpanNotWeight (class): javadocs are missing
> /home/rmuir/workspace/lucene-solr/lucene/core/src/java/org/apache/lucene/search/spans/SpanOrQuery.java:139:
>  error - SpanOrWeight (class): javadocs are missing
> /home/rmuir/workspace/lucene-solr/lucene/core/src/java/org/apache/lucene/search/spans/SpanPositionCheckQuery.java:77:
>  error - SpanPositionCheckWeight (class): javadocs are missing
> /home/rmuir/workspace/lucene-solr/lucene/core/src/java/org/apache/lucene/search/MultiCollectorManager.java:61:
>  error - Collectors (class): javadocs are missing
> /home/rmuir/workspace/lucene-solr/lucene/core/src/java/org/apache/lucene/search/MultiCollectorManager.java:89:
>  error - LeafCollectors (class): javadocs are missing
> /home/rmuir/workspace/lucene-solr/lucene/core/src/java/org/apache/lucene/util/PagedBytes.java:353:
>  error - PagedBytesDataOutput (class): javadocs are missing
> /home/rmuir/workspace/lucene-solr/lucene/core/src/java/org/apache/lucene/util/PagedBytes.java:285:
>  error - PagedBytesDataInput (class): javadocs are missing
> /home/rmuir/workspace/lucene-solr/lucene/core/src/java/org/apache/lucene/nodoc/EmptyDoc.java:22:
>  error - EmptyDoc (class): javadocs are missing
> /home/rmuir/workspace/lucene-solr/lucene/core/src/java/org/apache/lucene/nodoc/LicenseDoc.java:36:
>  error - LicenseDoc (class): comment is really a license
> /home/rmuir/workspace/lucene-solr/lucene/core/src/java/org/apache/lucene/nodoc/NoDoc.java:19:
>  error - NoDoc (class): javadocs are missing
> FAILURE: Build failed 

[jira] [Resolved] (LUCENE-9209) fix javadocs to be html5, enable doclint html checks, remove jtidy

2020-02-10 Thread Robert Muir (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved LUCENE-9209.
-
Fix Version/s: master (9.0)
   Resolution: Fixed

> fix javadocs to be html5, enable doclint html checks, remove jtidy
> --
>
> Key: LUCENE-9209
> URL: https://issues.apache.org/jira/browse/LUCENE-9209
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Priority: Major
> Fix For: master (9.0)
>
> Attachments: LUCENE-9209.patch, LUCENE-9209_current_state.patch
>
>
> Currently doclint is very angry about all the {{}} elements and similar 
> stuff going on. We claim to be emitting html5 documentation so it is about 
> time to clean it up.
> Then the html check can simply be enabled and we can remove the jtidy stuff 
> completely.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9201) Port documentation-lint task to Gradle build

2020-02-10 Thread Robert Muir (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17033936#comment-17033936
 ] 

Robert Muir commented on LUCENE-9201:
-

maybe we should try to get the PR in and iterate from there? It finds a lot of 
problems! Maybe we should break them into separate issues.

as far as the pr itself:
* if the new lint task fails, maybe that is ok for now? there are problems.
* I didn't look much into the actual innards of the PR, so I am not opinionated 
on that. I have been trying to chase down what it finds.

as far as javadoc generation itself, it is buggy, and we should fix it.
* package.html (and maybe other files) are not included so we need to e.g. use 
the ant-task from gradle or something else. solving the "split package problem" 
and doing package-info.java seems much more difficult.
* some linkoffline is missing, this breaks cross-module links. 

as far as checkers themselves:
* html validation is enabled and jtidy removed: LUCENE-9209.
* prototype doclet to replace checkJavaDocs.py: LUCENE-9215.
* checkJavadocLinks.py is tougher to replace, but a lot of the issues it finds 
are better discovered with javac warnings if the module system is enabled. 
These are API bugs (where public method returns package-private stuff, etc). 
There are even some of these bugs today that python does not find. You will 
find them if you start adding module-info.java's. This is a difficult issue 
because it involves dealing with split-package problem.
* ecj-javadocs-lint is not really doing what it says: the main check 
it does right now is to fail on unused imports. Would love to not need 
the separate pass for that, but 
https://bugs.openjdk.java.net/browse/JDK-4963930 is not encouraging.


> Port documentation-lint task to Gradle build
> 
>
> Key: LUCENE-9201
> URL: https://issues.apache.org/jira/browse/LUCENE-9201
> Project: Lucene - Core
>  Issue Type: Sub-task
>Affects Versions: master (9.0)
>Reporter: Tomoko Uchida
>Assignee: Tomoko Uchida
>Priority: Major
> Attachments: javadocGRADLE.png, javadocHTML4.png, javadocHTML5.png
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Ant build's "documentation-lint" target consists of those two sub targets.
>  * "-ecj-javadoc-lint" (Javadoc linting by ECJ)
>  * "-documentation-lint"(Missing javadocs / broken links check by python 
> scripts)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (SOLR-14253) Remove Sleeps from OverseerCollectionMessageHandler

2020-02-10 Thread Mike Drob (Jira)
Mike Drob created SOLR-14253:


 Summary: Remove Sleeps from OverseerCollectionMessageHandler
 Key: SOLR-14253
 URL: https://issues.apache.org/jira/browse/SOLR-14253
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: Server
Reporter: Mike Drob


From the conversations with Mark Miller a few months back - there are a lot of 
places in the server code where we have hard sleeps instead of relying on 
notifications and watchers to handle state.

I will begin to tackle these one at a time, starting with 
OverseerCollectionMessageHandler.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13579) Create resource management API

2020-02-10 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17033914#comment-17033914
 ] 

David Smiley commented on SOLR-13579:
-

[~ab] Do you think the design here will play nicely with 
{{SolrCores.transientCores}} (not _yet_ SolrCloud compatible but see 
SOLR-5446)? From what I've seen I suppose it'll be fine: cores come and go, 
whether normally (e.g. new collections, rebalancing, or deletion of 
old/no-longer-needed ones) or to keep a fixed number of recently used ones in 
memory.  Then the question in my mind is whether the transientCoreCache should 
implement the new ManagedComponent interface.  Perhaps not, since it is above 
per-core resources, if that matters?  FWIW I imagine the transientCoreCache 
would be configured with a fixed core count.  It's helpful for the 
resource management API / framework to balance the embedded resources to ensure 
the sum total of caches and such is bounded.  And the memory use of each 
core is rather hard to gauge.

> Create resource management API
> --
>
> Key: SOLR-13579
> URL: https://issues.apache.org/jira/browse/SOLR-13579
> Project: Solr
>  Issue Type: New Feature
>Reporter: Andrzej Bialecki
>Assignee: Andrzej Bialecki
>Priority: Major
> Attachments: SOLR-13579.patch, SOLR-13579.patch, SOLR-13579.patch, 
> SOLR-13579.patch, SOLR-13579.patch, SOLR-13579.patch, SOLR-13579.patch, 
> SOLR-13579.patch, SOLR-13579.patch, SOLR-13579.patch
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> Resource management framework API supporting the goals outlined in SOLR-13578.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-9149) Increase data dimension limit in BKD

2020-02-10 Thread Nick Knize (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Knize resolved LUCENE-9149.

Resolution: Implemented

> Increase data dimension limit in BKD
> 
>
> Key: LUCENE-9149
> URL: https://issues.apache.org/jira/browse/LUCENE-9149
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Nick Knize
>Priority: Major
> Attachments: LUCENE-9149.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> LUCENE-8496 added selective indexing; the ability to designate the first K <= 
> N dimensions for driving the construction of the BKD internal nodes. Follow 
> on work stored the "data dimensions" for only the leaf nodes and only the 
> "index dimensions" are stored for the internal nodes. While 
> {{maxPointsInLeafNode}} is still important for managing the BKD heap memory 
> footprint (thus we don't want this to get too large), I'd like to propose 
> increasing the {{MAX_DIMENSIONS}} limit (to something not too crazy like 16; 
> effectively doubling the index dimension limit) while maintaining the 
> {{MAX_INDEX_DIMENSIONS}} at 8.
> Doing this will enable us to encode higher dimension data within a lower 
> dimension index (e.g., 3D tessellated triangles as a 10 dimension point using 
> only the first 6 dimensions for index construction)
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dsmiley commented on a change in pull request #357: [SOLR-12238] Synonym Queries boost

2020-02-10 Thread GitBox
dsmiley commented on a change in pull request #357: [SOLR-12238] Synonym 
Queries boost
URL: https://github.com/apache/lucene-solr/pull/357#discussion_r377238017
 
 

 ##
 File path: 
lucene/core/src/java/org/apache/lucene/util/graph/GraphTokenStreamFiniteStrings.java
 ##
 @@ -124,6 +126,15 @@ public boolean hasSidePath(int state) {
 .toArray(Term[]::new);
   }
 
+  /**
+   * Returns the list of terms that start at the provided state
+   */
+  public QueryBuilder.TermAndBoost[] getTermsAndBoosts(String field, int state) {
 
 Review comment:
   Can't you remove this now?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] andywebb1975 commented on a change in pull request #1247: SOLR-14252 use double rather than Double to avoid NPE

2020-02-10 Thread GitBox
andywebb1975 commented on a change in pull request #1247: SOLR-14252 use double 
rather than Double to avoid NPE
URL: https://github.com/apache/lucene-solr/pull/1247#discussion_r377223058
 
 

 ##
 File path: solr/core/src/java/org/apache/solr/metrics/AggregateMetric.java
 ##
 @@ -93,16 +99,13 @@ public double getMax() {
     if (values.isEmpty()) {
       return 0;
     }
-    Double res = null;
+    double res = 0;
     for (Update u : values.values()) {
       if (!(u.value instanceof Number)) {
+        log.warn("not a Number: " + u.value);
 
 Review comment:
   Note I'm not completely clear whether `u.value` is ever _expected_ to not be 
a `Number` - have seen this line report `false` and `LocalStatsCache` and I'm 
tracing back through to find out why these occur.
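For context on why the `double` vs `Double` change in this PR matters: a boxed accumulator that can remain `null` gets auto-unboxed by the method's primitive return type and throws a `NullPointerException`, while a primitive accumulator cannot. A minimal standalone illustration (not the actual Solr code, just the two shapes):

```java
import java.util.Arrays;
import java.util.List;

public class UnboxingNpeDemo {

    // Pre-patch shape: boxed accumulator that stays null when no element is
    // a Number, then auto-unboxes at the return statement -> NPE.
    static double maxBoxed(List<Object> values) {
        Double res = null;
        for (Object v : values) {
            if (v instanceof Number) {
                double d = ((Number) v).doubleValue();
                if (res == null || d > res) {
                    res = d;
                }
            }
        }
        return res; // NullPointerException here if res is still null
    }

    // Patched shape: primitive accumulator, no unboxing possible.
    static double maxPrimitive(List<Object> values) {
        double res = 0;
        for (Object v : values) {
            if (v instanceof Number) {
                res = Math.max(res, ((Number) v).doubleValue());
            }
        }
        return res;
    }

    public static void main(String[] args) {
        List<Object> noNumbers = Arrays.asList("a", "b");
        System.out.println(maxPrimitive(noNumbers)); // safe: 0.0
        try {
            maxBoxed(noNumbers);
        } catch (NullPointerException expected) {
            System.out.println("NPE from auto-unboxing null");
        }
    }
}
```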


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13390) Provide Query Elevation Component by default

2020-02-10 Thread Jason Gerlowski (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17033703#comment-17033703
 ] 

Jason Gerlowski commented on SOLR-13390:


Sounds reasonable to me. +1

QEC is used somewhat commonly and seems safe to enable by default.

> Provide Query Elevation Component by default
> 
>
> Key: SOLR-13390
> URL: https://issues.apache.org/jira/browse/SOLR-13390
> Project: Solr
>  Issue Type: Bug
>Reporter: Erik Hatcher
>Priority: Major
>
> Like other components, like highlighting and faceting, it'd be useful to have 
> this work out of the box by just enabling it on the request.   Currently the 
> component needs to be added to `/select` and an empty elevate.xml file needs 
> to be added to the config - a bit unnecessarily arduous.
> Let's add the component to `/select` and modify the component to be happy 
> with or without an elevate.xml (since id's can be sent on the request to 
> elevate, so fixed config isn't needed either).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13996) Refactor HttpShardHandler#prepDistributed() into smaller pieces

2020-02-10 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17033687#comment-17033687
 ] 

ASF subversion and git services commented on SOLR-13996:


Commit 78e567c57e45f56ff22ecfc3e43e315255ef3561 in lucene-solr's branch 
refs/heads/branch_8x from Shalin Shekhar Mangar
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=78e567c ]

SOLR-13996: Refactor HttpShardHandler.prepDistributed method (#1220)

SOLR-13996: Refactor HttpShardHandler.prepDistributed method into smaller pieces

This commit introduces an interface named ReplicaSource which is marked as 
experimental. It has two sub-classes named CloudReplicaSource (for solr cloud) 
and LegacyReplicaSource for non-cloud clusters. The prepDistributed method now 
calls out to these sub-classes depending on whether the cluster is running in 
cloud mode or not.

(cherry picked from commit c65b97665c61116632bc93e5f88f84bdb5cccf21)
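The commit message describes a strategy-style split; a schematic sketch of that dispatch (only the type names `ReplicaSource`, `CloudReplicaSource`, and `LegacyReplicaSource` come from the commit message — all method names and bodies below are invented for illustration):

```java
import java.util.Arrays;
import java.util.List;

// Schematic of the refactoring described above, not the actual Solr code.
interface ReplicaSource {
    List<String> replicasFor(String shard);
}

class CloudReplicaSource implements ReplicaSource {
    public List<String> replicasFor(String shard) {
        // in Solr, replicas would come from cluster state in ZooKeeper
        return Arrays.asList("http://cloud-node-1/" + shard);
    }
}

class LegacyReplicaSource implements ReplicaSource {
    public List<String> replicasFor(String shard) {
        // in Solr, replicas would come from the static shards parameter
        return Arrays.asList("http://static-node-1/" + shard);
    }
}

public class PrepDistributedSketch {
    // prepDistributed() now delegates to one strategy instead of
    // intertwining the standalone and cloud code paths.
    static ReplicaSource sourceFor(boolean cloudMode) {
        return cloudMode ? new CloudReplicaSource() : new LegacyReplicaSource();
    }

    public static void main(String[] args) {
        System.out.println(sourceFor(true).replicasFor("shard1"));
    }
}
```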


> Refactor HttpShardHandler#prepDistributed() into smaller pieces
> ---
>
> Key: SOLR-13996
> URL: https://issues.apache.org/jira/browse/SOLR-13996
> Project: Solr
>  Issue Type: Improvement
>Reporter: Ishan Chattopadhyaya
>Assignee: Shalin Shekhar Mangar
>Priority: Major
> Attachments: SOLR-13996.patch, SOLR-13996.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently, it is very hard to understand all the various things being done in 
> HttpShardHandler. I'm starting with refactoring the prepDistributed() method 
> to make it easier to grasp. It has standalone and cloud code intertwined, and 
> wanted to cleanly separate them out. Later, we can even have two separate 
> methods (one each for standalone and cloud).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13996) Refactor HttpShardHandler#prepDistributed() into smaller pieces

2020-02-10 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17033686#comment-17033686
 ] 

ASF subversion and git services commented on SOLR-13996:


Commit 78e567c57e45f56ff22ecfc3e43e315255ef3561 in lucene-solr's branch 
refs/heads/branch_8x from Shalin Shekhar Mangar
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=78e567c ]

SOLR-13996: Refactor HttpShardHandler.prepDistributed method (#1220)

SOLR-13996: Refactor HttpShardHandler.prepDistributed method into smaller pieces

This commit introduces an interface named ReplicaSource which is marked as 
experimental. It has two sub-classes named CloudReplicaSource (for SolrCloud) 
and LegacyReplicaSource (for non-cloud clusters). The prepDistributed method now 
calls out to these sub-classes depending on whether the cluster is running in 
cloud mode or not.

(cherry picked from commit c65b97665c61116632bc93e5f88f84bdb5cccf21)


> Refactor HttpShardHandler#prepDistributed() into smaller pieces
> ---
>
> Key: SOLR-13996
> URL: https://issues.apache.org/jira/browse/SOLR-13996
> Project: Solr
>  Issue Type: Improvement
>Reporter: Ishan Chattopadhyaya
>Assignee: Shalin Shekhar Mangar
>Priority: Major
> Attachments: SOLR-13996.patch, SOLR-13996.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently, it is very hard to understand all the various things being done in 
> HttpShardHandler. I'm starting with refactoring the prepDistributed() method 
> to make it easier to grasp. It has standalone and cloud code intertwined, and 
> I wanted to cleanly separate them out. Later, we can even have two separate 
> methods (one each for standalone and cloud).






[jira] [Commented] (SOLR-14209) Upgrade JQuery to 3.4.1

2020-02-10 Thread Kevin Risden (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17033674#comment-17033674
 ] 

Kevin Risden commented on SOLR-14209:
-

Thanks [~mkhl]! Sorry, I didn't think about the Java version change on JDK 8.

> Upgrade JQuery to 3.4.1
> ---
>
> Key: SOLR-14209
> URL: https://issues.apache.org/jira/browse/SOLR-14209
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Admin UI, contrib - Velocity
>Reporter: Kevin Risden
>Assignee: Kevin Risden
>Priority: Major
> Fix For: 8.5
>
> Attachments: Screen Shot 2020-01-23 at 3.17.07 PM.png, Screen Shot 
> 2020-01-23 at 3.28.47 PM.png
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Currently JQuery is on 2.1.3. It would be good to upgrade to the latest 
> version if possible.






[jira] [Commented] (SOLR-14249) Krb5HttpClientBuilder should not buffer requests

2020-02-10 Thread Kevin Risden (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17033670#comment-17033670
 ] 

Kevin Risden commented on SOLR-14249:
-

Yea SOLR-13270 is definitely relevant.

> Krb5HttpClientBuilder should not buffer requests 
> -
>
> Key: SOLR-14249
> URL: https://issues.apache.org/jira/browse/SOLR-14249
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Authentication, SolrJ
>Affects Versions: 7.4, master (9.0), 8.4.1
>Reporter: Jason Gerlowski
>Priority: Major
> Attachments: SOLR-14249-reproduction.patch
>
>
> When SolrJ clients enable Kerberos authentication, a request interceptor is 
> set up which wraps the actual HttpEntity in a BufferedHttpEntity.  This 
> BufferedHttpEntity, well, buffers the request body in a {{byte[]}} so it can 
> be repeated if needed.  This works fine for small requests, but when requests 
> get large storing the entire request in memory causes contention or 
> OutOfMemoryErrors.
> The easiest way for this to manifest is to use ConcurrentUpdateSolrClient, 
> which opens a connection to Solr and streams documents out in an ever 
> increasing request entity until the doc queue held by the client is emptied.
> I ran into this while troubleshooting a DIH run that would reproducibly load 
> a few hundred thousand documents before progress stalled out.  Solr never 
> crashed and the DIH thread was still alive, but the 
> ConcurrentUpdateSolrClient used by DIH had its "Runner" thread disappear 
> around the time of the stall and an OOM like the one below could be seen in 
> solr-8983-console.log:
> {code}
> WARNING: Uncaught exception in thread: 
> Thread[concurrentUpdateScheduler-28-thread-1,5,TGRP-TestKerberosClientBuffering]
> java.lang.OutOfMemoryError: Java heap space
>   at __randomizedtesting.SeedInfo.seed([371A00FBA76D31DF]:0)
>   at java.base/java.util.Arrays.copyOf(Arrays.java:3745)
>   at 
> java.base/java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:120)
>   at 
> java.base/java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:95)
>   at 
> java.base/java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:156)
>   at 
> org.apache.solr.common.util.FastOutputStream.flush(FastOutputStream.java:213)
>   at 
> org.apache.solr.common.util.FastOutputStream.write(FastOutputStream.java:94)
>   at 
> org.apache.solr.common.util.ByteUtils.writeUTF16toUTF8(ByteUtils.java:145)
>   at org.apache.solr.common.util.JavaBinCodec.writeStr(JavaBinCodec.java:848)
>   at 
> org.apache.solr.common.util.JavaBinCodec.writePrimitive(JavaBinCodec.java:932)
>   at 
> org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:328)
>   at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:228)
>   at 
> org.apache.solr.common.util.JavaBinCodec.writeSolrInputDocument(JavaBinCodec.java:616)
>   at 
> org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:355)
>   at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:228)
>   at 
> org.apache.solr.common.util.JavaBinCodec.writeMapEntry(JavaBinCodec.java:764)
>   at 
> org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:383)
>   at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:228)
>   at 
> org.apache.solr.common.util.JavaBinCodec.writeIterator(JavaBinCodec.java:705)
>   at 
> org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:367)
>   at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:228)
>   at 
> org.apache.solr.common.util.JavaBinCodec.writeNamedList(JavaBinCodec.java:223)
>   at 
> org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:330)
>   at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:228)
>   at org.apache.solr.common.util.JavaBinCodec.marshal(JavaBinCodec.java:155)
>   at 
> org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.marshal(JavaBinUpdateRequestCodec.java:91)
>   at 
> org.apache.solr.client.solrj.impl.BinaryRequestWriter.write(BinaryRequestWriter.java:83)
>   at 
> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner$1.writeTo(ConcurrentUpdateSolrClient.java:264)
>   at org.apache.http.entity.EntityTemplate.writeTo(EntityTemplate.java:73)
>   at 
> org.apache.http.entity.BufferedHttpEntity.&lt;init&gt;(BufferedHttpEntity.java:62)
>   at 
> org.apache.solr.client.solrj.impl.Krb5HttpClientBuilder.lambda$new$3(Krb5HttpClientBuilder.java:155)
>   at 
> org.apache.solr.client.solrj.impl.Krb5HttpClientBuilder$$Lambda$459/0x000800623840.process(Unknown
>  Source)
>   at 
> 
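The buffering behavior at the root of this stack trace can be illustrated with a self-contained sketch using only java.io (this is not Solr or HttpClient code; the Body interface below stands in for HttpEntity). Like BufferedHttpEntity, it copies the entire streamed body into one byte[] so the request can be repeated, which is exactly what grows without bound for streaming uploads:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Minimal illustration (not Solr/HttpClient code) of why wrapping a streamed
// request body in a buffering entity is dangerous: the whole body is copied
// into memory before anything is sent, so heap usage scales with request size.
class BufferingSketch {
    interface Body {
        void writeTo(OutputStream out) throws IOException; // like HttpEntity.writeTo
    }

    // Analogous to BufferedHttpEntity's constructor: materializes the body up
    // front so it becomes "repeatable", at the cost of holding it all in memory.
    static byte[] buffer(Body body) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        body.writeTo(buf);
        return buf.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // A "streaming" body that writes 1024 chunks of 1 KB; with buffering,
        // all 1 MB lands in a single byte[] before the request goes out.
        byte[] chunk = new byte[1024];
        byte[] buffered = buffer(out -> {
            for (int i = 0; i < 1024; i++) out.write(chunk);
        });
        System.out.println(buffered.length); // 1048576
    }
}
```

For a ConcurrentUpdateSolrClient-style request, where documents are streamed into an ever-growing entity, this buffering step is what turns a bounded-memory stream into an OutOfMemoryError.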

[jira] [Updated] (SOLR-14243) ant clean-jars should not delete gradle-wrapper.jar

2020-02-10 Thread Dawid Weiss (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss updated SOLR-14243:
---
Priority: Trivial  (was: Major)

> ant clean-jars should not delete gradle-wrapper.jar
> ---
>
> Key: SOLR-14243
> URL: https://issues.apache.org/jira/browse/SOLR-14243
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Andras Salamon
>Assignee: Dawid Weiss
>Priority: Trivial
> Attachments: SOLR-14243-01.patch
>
>
> Right now ant clean-jars deletes {{gradle/wrapper/gradle-wrapper.jar}}, so if 
> I execute the following command to recreate the checksums it shows up as a 
> deleted file in git:
> {noformat}
> $ ant clean-jars jar-checksums 
> ...
> $ git status -s
>  D gradle/wrapper/gradle-wrapper.jar{noformat}
> I don't think we should delete the gradle-wrapper.jar here






[jira] [Commented] (SOLR-14243) ant clean-jars should not delete gradle-wrapper.jar

2020-02-10 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17033668#comment-17033668
 ] 

ASF subversion and git services commented on SOLR-14243:


Commit b21312f411bdfb069114846f31f45dcc6ec6ecb8 in lucene-solr's branch 
refs/heads/master from Dawid Weiss
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=b21312f ]

SOLR-14243: ant clean-jars should not delete gradle-wrapper.jar.


> ant clean-jars should not delete gradle-wrapper.jar
> ---
>
> Key: SOLR-14243
> URL: https://issues.apache.org/jira/browse/SOLR-14243
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Andras Salamon
>Priority: Major
> Attachments: SOLR-14243-01.patch
>
>
> Right now ant clean-jars deletes {{gradle/wrapper/gradle-wrapper.jar}}, so if 
> I execute the following command to recreate the checksums it shows up as a 
> deleted file in git:
> {noformat}
> $ ant clean-jars jar-checksums 
> ...
> $ git status -s
>  D gradle/wrapper/gradle-wrapper.jar{noformat}
> I don't think we should delete the gradle-wrapper.jar here






[jira] [Assigned] (SOLR-14243) ant clean-jars should not delete gradle-wrapper.jar

2020-02-10 Thread Dawid Weiss (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss reassigned SOLR-14243:
--

Assignee: Dawid Weiss

> ant clean-jars should not delete gradle-wrapper.jar
> ---
>
> Key: SOLR-14243
> URL: https://issues.apache.org/jira/browse/SOLR-14243
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Andras Salamon
>Assignee: Dawid Weiss
>Priority: Major
> Attachments: SOLR-14243-01.patch
>
>
> Right now ant clean-jars deletes {{gradle/wrapper/gradle-wrapper.jar}}, so if 
> I execute the following command to recreate the checksums it shows up as a 
> deleted file in git:
> {noformat}
> $ ant clean-jars jar-checksums 
> ...
> $ git status -s
>  D gradle/wrapper/gradle-wrapper.jar{noformat}
> I don't think we should delete the gradle-wrapper.jar here






[jira] [Updated] (SOLR-14243) ant clean-jars should not delete gradle-wrapper.jar

2020-02-10 Thread Dawid Weiss (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss updated SOLR-14243:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Thank you!

> ant clean-jars should not delete gradle-wrapper.jar
> ---
>
> Key: SOLR-14243
> URL: https://issues.apache.org/jira/browse/SOLR-14243
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Andras Salamon
>Assignee: Dawid Weiss
>Priority: Major
> Attachments: SOLR-14243-01.patch
>
>
> Right now ant clean-jars deletes {{gradle/wrapper/gradle-wrapper.jar}}, so if 
> I execute the following command to recreate the checksums it shows up as a 
> deleted file in git:
> {noformat}
> $ ant clean-jars jar-checksums 
> ...
> $ git status -s
>  D gradle/wrapper/gradle-wrapper.jar{noformat}
> I don't think we should delete the gradle-wrapper.jar here






[jira] [Commented] (LUCENE-9201) Port documentation-lint task to Gradle build

2020-02-10 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17033658#comment-17033658
 ] 

Dawid Weiss commented on LUCENE-9201:
-

Gradle's built-in javadoc task is indeed pretty dumb with respect to the set of 
options allowed by the javadoc tool. Maybe they take the lowest common 
denominator across all javadoc/JVM versions, I don't know.

We can change it, I don't mind. We don't even have to rely on ant - we can just 
run javadoc as an external tool and build the set of options required.

The pull request attached to the issue has some odd fragments in it (filtering 
for projects based on directory structure for example). I'm not familiar with 
what ant does here. Should I review the PR or should we rather focus on trying 
to remove python from the loop and use the built-in html validation combined 
with custom doclet instead?



> Port documentation-lint task to Gradle build
> 
>
> Key: LUCENE-9201
> URL: https://issues.apache.org/jira/browse/LUCENE-9201
> Project: Lucene - Core
>  Issue Type: Sub-task
>Affects Versions: master (9.0)
>Reporter: Tomoko Uchida
>Assignee: Tomoko Uchida
>Priority: Major
> Attachments: javadocGRADLE.png, javadocHTML4.png, javadocHTML5.png
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Ant build's "documentation-lint" target consists of those two sub targets.
>  * "-ecj-javadoc-lint" (Javadoc linting by ECJ)
>  * "-documentation-lint"(Missing javadocs / broken links check by python 
> scripts)






[jira] [Commented] (SOLR-13996) Refactor HttpShardHandler#prepDistributed() into smaller pieces

2020-02-10 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17033639#comment-17033639
 ] 

ASF subversion and git services commented on SOLR-13996:


Commit c65b97665c61116632bc93e5f88f84bdb5cccf21 in lucene-solr's branch 
refs/heads/master from Shalin Shekhar Mangar
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=c65b976 ]

SOLR-13996: Refactor HttpShardHandler.prepDistributed method (#1220)

SOLR-13996: Refactor HttpShardHandler.prepDistributed method into smaller pieces

This commit introduces an interface named ReplicaSource which is marked as 
experimental. It has two sub-classes named CloudReplicaSource (for SolrCloud) 
and LegacyReplicaSource (for non-cloud clusters). The prepDistributed method now 
calls out to these sub-classes depending on whether the cluster is running in 
cloud mode or not.

> Refactor HttpShardHandler#prepDistributed() into smaller pieces
> ---
>
> Key: SOLR-13996
> URL: https://issues.apache.org/jira/browse/SOLR-13996
> Project: Solr
>  Issue Type: Improvement
>Reporter: Ishan Chattopadhyaya
>Assignee: Shalin Shekhar Mangar
>Priority: Major
> Attachments: SOLR-13996.patch, SOLR-13996.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently, it is very hard to understand all the various things being done in 
> HttpShardHandler. I'm starting with refactoring the prepDistributed() method 
> to make it easier to grasp. It has standalone and cloud code intertwined, and 
> I wanted to cleanly separate them out. Later, we can even have two separate 
> methods (one each for standalone and cloud).






[GitHub] [lucene-solr] shalinmangar merged pull request #1220: SOLR-13996: Refactor HttpShardHandler.prepDistributed method

2020-02-10 Thread GitBox
shalinmangar merged pull request #1220: SOLR-13996: Refactor 
HttpShardHandler.prepDistributed method
URL: https://github.com/apache/lucene-solr/pull/1220
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services




[jira] [Commented] (LUCENE-9196) look into adding -XX:ActiveProcessorCount=1 to parallel build jvms

2020-02-10 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17033612#comment-17033612
 ] 

Dawid Weiss commented on LUCENE-9196:
-

Maybe I'm being overprotective here, but it'll make it a pain to switch between 
JVMs if those local settings are just global. And 
the rare people who run with J9 would have the generated defaults that'd 
fail on a subsequent gradle invocation. I'm not saying no - I'm just looking at 
how it could work for everyone in a reasonable way (and I don't see any obvious 
answer).

> look into adding -XX:ActiveProcessorCount=1 to parallel build jvms
> --
>
> Key: LUCENE-9196
> URL: https://issues.apache.org/jira/browse/LUCENE-9196
> Project: Lucene - Core
>  Issue Type: Task
>  Components: general/build
>Reporter: Robert Muir
>Priority: Major
>
> I've been using this in my own gradle.properties (both for test and gradle 
> vms).
> I think otherwise there may be a bad multiplicative effect at play: if you 
> have N cores (say 16), the build defaults to using 8 parallel resources. But 
> each of these JVMs uses ergonomic defaults to size stuff like compiler/GC 
> threads according to the entire machine. Similar to the reasons behind 
> container support, etc.
> I tell each build/test JVM to pretend it runs on a 1-CPU machine with this 
> flag. It seems to give a lower load average when running tests? Something to 
> look into.
> {quote}
>-XX:ActiveProcessorCount=x
>   Overrides the number of CPUs that the VM will use to calculate
>   the size of thread pools it will use for various operations such
>   as Garbage Collection and ForkJoinPool.
>   The VM normally determines the number of available processors
>   from the operating system. This flag can be useful for
>   partitioning CPU resources when running multiple Java processes
>   in docker containers. This flag is honored even if
>   UseContainerSupport is not enabled. See -XX:-UseContainerSupport
>   for a description of enabling and disabling container support.
> {quote}
> cc [~dweiss]
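A quick way to observe the flag's effect is to print the processor count the JVM reports, since on HotSpot JVMs that support the option, Runtime.availableProcessors() respects -XX:ActiveProcessorCount (a small sketch; the class name is made up for illustration):

```java
// Prints the processor count the JVM uses to size things like GC and
// ForkJoinPool thread pools. Run plainly it reports the machine's cores;
// run as `java -XX:ActiveProcessorCount=1 AvailableProcessors` on a
// HotSpot JVM it should report 1 instead.
class AvailableProcessors {
    public static void main(String[] args) {
        System.out.println(Runtime.getRuntime().availableProcessors());
    }
}
```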






[GitHub] [lucene-solr] ErickErickson opened a new pull request #1248: LUCENE-9134: Port ant-regenerate tasks to Gradle build

2020-02-10 Thread GitBox
ErickErickson opened a new pull request #1248: LUCENE-9134: Port ant-regenerate 
tasks to Gradle build
URL: https://github.com/apache/lucene-solr/pull/1248
 
 
   This adds the generation targets for util/packed and util/automaton.
   
   For whatever reason my local Python doesn't do anything weird like it did 
when regenerating the HTML entities; the generated code is identical.
   
   One thing I'd like to draw attention to is that I had to change 
createLevAutomata.py to point to the new place moman is downloaded to.
   
   I'll merge upstream in the next day or two barring objections.
   
   I think this finishes off the regeneration work, so I'll close LUCENE-9134 
after merging.





[GitHub] [lucene-solr] ErickErickson closed pull request #1241: Gradle util

2020-02-10 Thread GitBox
ErickErickson closed pull request #1241: Gradle util
URL: https://github.com/apache/lucene-solr/pull/1241
 
 
   





[GitHub] [lucene-solr] ErickErickson commented on issue #1241: Gradle util

2020-02-10 Thread GitBox
ErickErickson commented on issue #1241: Gradle util
URL: https://github.com/apache/lucene-solr/pull/1241#issuecomment-584119269
 
 
   It didn't link appropriately; I wondered why nobody replied.





[jira] [Commented] (LUCENE-9188) Add jacoco code coverage support to gradle build

2020-02-10 Thread Robert Muir (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17033554#comment-17033554
 ] 

Robert Muir commented on LUCENE-9188:
-

We have to pass the last one precisely for the security policy. Today that is 
just how it is set up for ant, so I gave it the property it wanted to get it 
working... I agree, let's see if we can switch it to something like user.dir

> Add jacoco code coverage support to gradle build
> 
>
> Key: LUCENE-9188
> URL: https://issues.apache.org/jira/browse/LUCENE-9188
> Project: Lucene - Core
>  Issue Type: Task
>  Components: general/build
>Reporter: Robert Muir
>Priority: Major
> Attachments: LUCENE-9188.patch, report.png
>
>
> Seems to be missing. I looked into it a little, all the documented ways of 
> using the jacoco plugin seem to involve black magic if you are using "java" 
> plugin, but we are using "javaLibrary", so I wasn't able to hold it right.
> This one should work very well, it has low overhead and should work fine 
> running tests in parallel (since it supports merging of coverage data files: 
> that's how it works in the ant build)






[jira] [Comment Edited] (LUCENE-9188) Add jacoco code coverage support to gradle build

2020-02-10 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17033549#comment-17033549
 ] 

Dawid Weiss edited comment on LUCENE-9188 at 2/10/20 12:17 PM:
---

I'm not really familiar with jacoco - I can imagine test coverage reports can 
be useful but I don't run them myself. In fact, I ignored jacoco when porting 
ant code because I didn't think anybody was looking into those reports? 

Anyway. The patch looks good overall. Minor comments.

{code}
+  // XXX: too many things called "workingDir" in gradle!
+  def targetDir = "${buildDir}/tmp/tests-cwd"
{code}
There is an extension property set on each project in defaults-tests.gradle:
{code}
project.ext {
  testsCwd = file("${buildDir}/tmp/tests-cwd") 
{code}
so you could replace targetDir with paths referencing that instead.

{code}
+  systemProperty 'junit4.childvm.cwd', workingDir as String
{code}

I don't think we have to pass this property anymore. This was used by 
randomizedtesting ant runner to pass cwd for different forked JVMs (because it 
wasn't known apriori in isolation mode). The property is referenced from 
security policies but it could just become a reference to ${user.dir}.





was (Author: dweiss):
I'm not really familiar with jacoco - I can imagine test coverage reports can 
be useful but I don't run them myself. In fact, I ignored jacoco when porting 
ant code because I didn't think anybody was looking into those reports? 

Anyway. The patch looks good overall. Minor comments.

{code}
+  // XXX: too many things called "workingDir" in gradle!
+  def targetDir = "${buildDir}/tmp/tests-cwd"
{code}
There is an extension property set on each project in defaults-tests.gradle:
{code}
project.ext {
  testsCwd = file("${buildDir}/tmp/tests-cwd") 
{code}
so you could replace targetDir with paths referencing that instead.

{code}
+  systemProperty 'junit4.childvm.cwd', workingDir as String
{code}

I don't think we have to pass this property anymore. This was used by 
randomizedtesting ant runner to pass cwd for different forked JVMs (because it 
wasn't known apriori in isolation mode). The property is referenced from 
security policies but it could just become a reference to ${user.dir}.




> Add jacoco code coverage support to gradle build
> 
>
> Key: LUCENE-9188
> URL: https://issues.apache.org/jira/browse/LUCENE-9188
> Project: Lucene - Core
>  Issue Type: Task
>  Components: general/build
>Reporter: Robert Muir
>Priority: Major
> Attachments: LUCENE-9188.patch, report.png
>
>
> Seems to be missing. I looked into it a little, all the documented ways of 
> using the jacoco plugin seem to involve black magic if you are using "java" 
> plugin, but we are using "javaLibrary", so I wasn't able to hold it right.
> This one should work very well, it has low overhead and should work fine 
> running tests in parallel (since it supports merging of coverage data files: 
> that's how it works in the ant build)






[jira] [Commented] (LUCENE-9188) Add jacoco code coverage support to gradle build

2020-02-10 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17033549#comment-17033549
 ] 

Dawid Weiss commented on LUCENE-9188:
-

I'm not really familiar with jacoco - I can imagine test coverage reports can 
be useful but I don't run them myself. In fact, I ignored jacoco when porting 
ant code because I didn't think anybody was looking into those reports? 

Anyway. The patch looks good overall. Minor comments.

{code}
+  // XXX: too many things called "workingDir" in gradle!
+  def targetDir = "${buildDir}/tmp/tests-cwd"
{code}
There is an extension property set on each project in defaults-tests.gradle:
{code}
project.ext {
  testsCwd = file("${buildDir}/tmp/tests-cwd") 
{code}
so you could replace targetDir with paths referencing that instead.

{code}
+  systemProperty 'junit4.childvm.cwd', workingDir as String
{code}

I don't think we have to pass this property anymore. This was used by 
randomizedtesting ant runner to pass cwd for different forked JVMs (because it 
wasn't known apriori in isolation mode). The property is referenced from 
security policies but it could just become a reference to ${user.dir}.




> Add jacoco code coverage support to gradle build
> 
>
> Key: LUCENE-9188
> URL: https://issues.apache.org/jira/browse/LUCENE-9188
> Project: Lucene - Core
>  Issue Type: Task
>  Components: general/build
>Reporter: Robert Muir
>Priority: Major
> Attachments: LUCENE-9188.patch, report.png
>
>
> Seems to be missing. I looked into it a little, all the documented ways of 
> using the jacoco plugin seem to involve black magic if you are using "java" 
> plugin, but we are using "javaLibrary", so I wasn't able to hold it right.
> This one should work very well, it has low overhead and should work fine 
> running tests in parallel (since it supports merging of coverage data files: 
> that's how it works in the ant build)






[jira] [Commented] (LUCENE-9196) look into adding -XX:ActiveProcessorCount=1 to parallel build jvms

2020-02-10 Thread Robert Muir (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17033543#comment-17033543
 ] 

Robert Muir commented on LUCENE-9196:
-

I think the idea here is that jenkins will pass its own VM options anyway to 
specify the garbage collector it wants and stuff. So it will basically override 
all of our defaults.

> look into adding -XX:ActiveProcessorCount=1 to parallel build jvms
> --
>
> Key: LUCENE-9196
> URL: https://issues.apache.org/jira/browse/LUCENE-9196
> Project: Lucene - Core
>  Issue Type: Task
>  Components: general/build
>Reporter: Robert Muir
>Priority: Major
>
> I've been using this in my own gradle.properties (both for test and gradle 
> vms).
> I think otherwise there may be a bad multiplicative effect at play: if you 
> have N cores (say 16), the build defaults to using 8 parallel resources. But 
> each of these JVMs uses ergonomic defaults to size stuff like compiler/GC 
> threads according to the entire machine. Similar to the reasons behind 
> container support, etc.
> I tell each build/test JVM to pretend it runs on a 1-CPU machine with this 
> flag. It seems to give a lower load average when running tests? Something to 
> look into.
> {quote}
>-XX:ActiveProcessorCount=x
>   Overrides the number of CPUs that the VM will use to calculate
>   the size of thread pools it will use for various operations such
>   as Garbage Collection and ForkJoinPool.
>   The VM normally determines the number of available processors
>   from the operating system. This flag can be useful for
>   partitioning CPU resources when running multiple Java processes
>   in docker containers. This flag is honored even if
>   UseContainerSupport is not enabled. See -XX:-UseContainerSupport
>   for a description of enabling and disabling container support.
> {quote}
> cc [~dweiss]






[jira] [Commented] (LUCENE-9196) look into adding -XX:ActiveProcessorCount=1 to parallel build jvms

2020-02-10 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17033538#comment-17033538
 ] 

Dawid Weiss commented on LUCENE-9196:
-

I agree we could fine-tune the defaults better. I have two concerns: 

1) these XX flags are tricky in that they may come and go. And of course 
they're not compatible with J9. I wonder if it'd be better to detect which JVM 
we're running and whether it supports these options, and then apply them at 
runtime rather than via static local settings. It'll quickly get hairy with 
different variants. 

2) tweaking JVM defaults makes it less likely we can discover hotspot bugs. Not 
that I care about it that much but perhaps it'd be nice to at least allow the 
CI to run with standard, unoptimized settings (gradle workers count can be set 
to 1 on such a job).
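The runtime detection idea in (1) could be sketched with HotSpot's diagnostic MBean. This is an illustration, not code from the issue: on a JVM that doesn't expose this MBean (e.g. J9), the lookup itself fails, which the sketch treats as "unsupported".

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public class VmFlagProbe {
    /** Returns true if the running JVM recognizes the given -XX flag name. */
    static boolean supportsVmOption(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return false; // not a HotSpot-style JVM
            }
            bean.getVMOption(name); // throws IllegalArgumentException for unknown flags
            return true;
        } catch (IllegalArgumentException | UnsupportedOperationException e) {
            return false; // unknown flag, or MBean not usable on this JVM
        }
    }

    public static void main(String[] args) {
        System.out.println("ActiveProcessorCount supported: "
            + supportsVmOption("ActiveProcessorCount"));
    }
}
```

A build could run such a probe once and only append the flag to forked-JVM args when it reports true, avoiding hard-coded per-JVM variants.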

> look into adding -XX:ActiveProcessorCount=1 to parallel build jvms
> --
>
> Key: LUCENE-9196
> URL: https://issues.apache.org/jira/browse/LUCENE-9196
> Project: Lucene - Core
>  Issue Type: Task
>  Components: general/build
>Reporter: Robert Muir
>Priority: Major
>
> I've been using this in my own gradle.properties (both for test and gradle 
> vms).
> I think otherwise there may be a bad multiplicative effect at play: if you 
> have N (say 16 cores), the build defaults to using 8 parallel resources. But 
> each of these jvms uses ergonomic defaults to size stuff like compiler/gc 
> threads according to the entire machine. Similar to the reasons behind 
> container support, etc.
> I tell each build/test JVM to pretend like it runs on 1 cpu machine with this 
> flag. It seems to give a lower load average when running tests? Something to 
> look into.
> {quote}
> -XX:ActiveProcessorCount=x
>   Overrides the number of CPUs that the VM will use to calculate
>   the size of thread pools it will use for various operations such
>   as Garbage Collection and ForkJoinPool.
>   The VM normally determines the number of available processors
>   from the operating system. This flag can be useful for
>   partitioning CPU resources when running multiple Java processes
>   in docker containers. This flag is honored even if
>   UseContainerSupport is not enabled. See -XX:-UseContainerSupport
>   for a description of enabling and disabling container support.
> {quote}
> cc [~dweiss]






[GitHub] [lucene-solr] andywebb1975 commented on issue #1247: SOLR-14252 use double rather than Double to avoid NPE

2020-02-10 Thread GitBox
andywebb1975 commented on issue #1247: SOLR-14252 use double rather than Double 
to avoid NPE
URL: https://github.com/apache/lucene-solr/pull/1247#issuecomment-584087529
 
 
   The PR really just changes an exception to a warning - it may be papering 
over another issue. I'm going to try changing `public Object value;` to `public 
Number value;` at line 41 in order to trigger earlier exceptions.
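The failure mode and the shape of the fix can be sketched outside Solr. The helper names below are hypothetical and the real AggregateMetric code differs; the point is that an auto-unboxed null `Double` throws an NPE, while a primitive `double` accumulator (defaulting to zero, as the PR does) with warn-and-skip does not:

```java
import java.util.Collection;
import java.util.List;

public class MaxDemo {
    // Buggy shape: the Double accumulator stays null when no Number is
    // present, and auto-unboxing the null on return throws NullPointerException.
    static double buggyMax(Collection<Object> values) {
        Double max = null;
        for (Object v : values) {
            if (v instanceof Number) {
                double d = ((Number) v).doubleValue();
                if (max == null || d > max) max = d;
            }
        }
        return max; // NPE here if values held no Number at all
    }

    // Fixed shape (roughly what the PR does): primitive accumulator,
    // warn and skip non-Number values instead of failing.
    static double fixedMax(Collection<Object> values) {
        double max = 0;
        for (Object v : values) {
            if (!(v instanceof Number)) {
                System.err.println("not a Number: " + v);
                continue;
            }
            max = Math.max(max, ((Number) v).doubleValue());
        }
        return max;
    }

    public static void main(String[] args) {
        System.out.println(fixedMax(List.of(false, 1.5, 3.0))); // prints 3.0
        try {
            buggyMax(List.of(false, "LocalStatsCache"));
        } catch (NullPointerException expected) {
            System.out.println("NPE, as described in the issue");
        }
    }
}
```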


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14252) NullPointerException in AggregateMetric

2020-02-10 Thread Andy Webb (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Webb updated SOLR-14252:
-
Description: 
The {{getMax}} and {{getMin}} methods in 
[AggregateMetric|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/metrics/AggregateMetric.java]
 can throw an NPE if non-{{Number}} values are present in {{values}}, when it 
tries to cast a {{null}} {{Double}} to a {{double}}.

This PR prevents the NPE occurring and triggers warnings instead: 
[https://github.com/apache/lucene-solr/pull/1247]

We've seen it report {{not a Number: false}} and {{not a Number: 
LocalStatsCache}} - so the NPE may have been hiding other issues with metrics 
gathering, which warrant further investigation.

  was:
The {{getMax}} and {{getMin}} methods in 
[AggregateMetric|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/metrics/AggregateMetric.java]
 can throw an NPE if non-{{Number}} values are present in {{values}}, when it 
tries to cast a {{null}} {{Double}} to a {{double}}. We've seen the values 
{{false}} 

PR: [https://github.com/apache/lucene-solr/pull/1247]

The patch adds a warning for non-{{Number}} values - we've seen it report {{not 
a Number: false}} and {{not a Number: LocalStatsCache}} - so the NPE may have 
been hiding other issues with metrics gathering, which warrant further 
investigation.


> NullPointerException in AggregateMetric
> ---
>
> Key: SOLR-14252
> URL: https://issues.apache.org/jira/browse/SOLR-14252
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: metrics
>Reporter: Andy Webb
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The {{getMax}} and {{getMin}} methods in 
> [AggregateMetric|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/metrics/AggregateMetric.java]
>  can throw an NPE if non-{{Number}} values are present in {{values}}, when it 
> tries to cast a {{null}} {{Double}} to a {{double}}.
> This PR prevents the NPE occurring and triggers warnings instead: 
> [https://github.com/apache/lucene-solr/pull/1247]
> We've seen it report {{not a Number: false}} and {{not a Number: 
> LocalStatsCache}} - so the NPE may have been hiding other issues with metrics 
> gathering, which warrant further investigation.






[jira] [Updated] (SOLR-14252) NullPointerException in AggregateMetric

2020-02-10 Thread Andy Webb (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Webb updated SOLR-14252:
-
Description: 
The {{getMax}} and {{getMin}} methods in 
[AggregateMetric|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/metrics/AggregateMetric.java]
 can throw an NPE if non-{{Number}} values are present in {{values}}, when it 
tries to cast a {{null}} {{Double}} to a {{double}}. We've seen the values 
{{false}} 

PR: [https://github.com/apache/lucene-solr/pull/1247]

The patch adds a warning for non-{{Number}} values - we've seen it report {{not 
a Number: false}} and {{not a Number: LocalStatsCache}} - so the NPE may have 
been hiding other issues with metrics gathering, which warrant further 
investigation.

  was:
The {{getMax}} and {{getMin}} methods in 
[AggregateMetric|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/metrics/AggregateMetric.java]
 can throw an NPE if non-{{Number}} values are present in {{values}}, when it 
tries to cast a {{null}} {{Double}} to a {{double}}.

PR: [https://github.com/apache/lucene-solr/pull/1247]


> NullPointerException in AggregateMetric
> ---
>
> Key: SOLR-14252
> URL: https://issues.apache.org/jira/browse/SOLR-14252
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: metrics
>Reporter: Andy Webb
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The {{getMax}} and {{getMin}} methods in 
> [AggregateMetric|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/metrics/AggregateMetric.java]
>  can throw an NPE if non-{{Number}} values are present in {{values}}, when it 
> tries to cast a {{null}} {{Double}} to a {{double}}. We've seen the values 
> {{false}} 
> PR: [https://github.com/apache/lucene-solr/pull/1247]
> The patch adds a warning for non-{{Number}} values - we've seen it report 
> {{not a Number: false}} and {{not a Number: LocalStatsCache}} - so the NPE 
> may have been hiding other issues with metrics gathering, which warrant 
> further investigation.






[jira] [Updated] (SOLR-14252) NullPointerException in AggregateMetric

2020-02-10 Thread Andy Webb (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Webb updated SOLR-14252:
-
Description: 
The {{getMax}} and {{getMin}} methods in 
[AggregateMetric|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/metrics/AggregateMetric.java]
 can throw an NPE if non-{{Number}} values are present in {{values}}, when it 
tries to cast a {{null}} {{Double}} to a {{double}}.

PR: [https://github.com/apache/lucene-solr/pull/1247]

  was:The {{getMax}} and {{getMin}} methods in 
[AggregateMetric|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/metrics/AggregateMetric.java]
 can throw an NPE if non-{{Number}} values are present in {{values}}, when it 
tries to cast a {{null}} {{Double}} to a {{double}}.


> NullPointerException in AggregateMetric
> ---
>
> Key: SOLR-14252
> URL: https://issues.apache.org/jira/browse/SOLR-14252
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: metrics
>Reporter: Andy Webb
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The {{getMax}} and {{getMin}} methods in 
> [AggregateMetric|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/metrics/AggregateMetric.java]
>  can throw an NPE if non-{{Number}} values are present in {{values}}, when it 
> tries to cast a {{null}} {{Double}} to a {{double}}.
> PR: [https://github.com/apache/lucene-solr/pull/1247]






[GitHub] [lucene-solr] andywebb1975 opened a new pull request #1247: SOLR-14252 use double rather than Double to avoid NPE

2020-02-10 Thread GitBox
andywebb1975 opened a new pull request #1247: SOLR-14252 use double rather than 
Double to avoid NPE
URL: https://github.com/apache/lucene-solr/pull/1247
 
 
   # Description
   
   The getMax and getMin methods in AggregateMetric can throw an NPE if 
non-Number values are present in values, when it tries to cast a null Double to 
a double.
   
   # Solution
   
   This PR switches to using primitive doubles, defaulting to zero, and warns 
when non-Number values are provided.
   
   # Tests
   
   TBC
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [ ] I have reviewed the guidelines for [How to 
Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms 
to the standards described there to the best of my ability.
   - [ ] I have created a Jira issue and added the issue ID to my pull request 
title.
   - [ ] I have given Solr maintainers 
[access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork)
 to contribute to my PR branch. (optional but recommended)
   - [ ] I have developed this patch against the `master` branch.
   - [ ] I have run `ant precommit` and the appropriate test suite.
   - [ ] I have added tests for my changes.
   - [ ] I have added documentation for the [Ref 
Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) 
(for Solr changes only).
   





[jira] [Created] (SOLR-14252) NullPointerException in AggregateMetric

2020-02-10 Thread Andy Webb (Jira)
Andy Webb created SOLR-14252:


 Summary: NullPointerException in AggregateMetric
 Key: SOLR-14252
 URL: https://issues.apache.org/jira/browse/SOLR-14252
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: metrics
Reporter: Andy Webb


The {{getMax}} and {{getMin}} methods in 
[AggregateMetric|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/metrics/AggregateMetric.java]
 can throw an NPE if non-{{Number}} values are present in {{values}}, when it 
tries to cast a {{null}} {{Double}} to a {{double}}.






[jira] [Updated] (SOLR-14251) Shard Split on HDFS

2020-02-10 Thread Johannes Brucher (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Johannes Brucher updated SOLR-14251:

Description: 
A shard split on an HDFS index evaluates local disk space instead of HDFS space.

When performing a shard split on an index that is stored on HDFS, SplitShardCmd 
evaluates the free disk space on the local file system of the server where Solr 
is installed.

SplitShardCmd assumes that its main phase (when the Lucene index is being 
split) always executes on the local file system of the shard leader; and indeed 
the ShardSplitCmd.checkDiskSpace() checks the local file system's free disk 
space - even though the actual data is written to the HDFS Directory so it 
(almost) doesn't affect the local FS (except for core.properties file).

See also: [https://lucene.472066.n3.nabble.com/HDFS-Shard-Split-td4449920.html]

My setup to reproduce the issue:
 * Solr deployed on Openshift with local disc of about 5GB
 * HDFS configuration based on solrconfig.xml with

{code:java}
<str name="solr.hdfs.home">hdfs://path/to/index/</str>
...
{code}
 * Split command:

{code:java}
.../admin/collections?action=SPLITSHARD&collection=collection1&shard=shard1&async=1234{code}
 * Response:

{code:java}
{
  "responseHeader":{"status":0,"QTime":32},
  "Operation splitshard caused 
exception:":"org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
 not enough free disk space to perform index split on node :8983_solr, required: 294.64909074269235, available: 5.4632568359375",
  "exception":{
    "msg":"not enough free disk space to perform index split on node :8983_solr, required: 294.64909074269235, available: 5.4632568359375",
    "rspCode":500},
  "status":{"state":"failed","msg":"found [1234] in failed tasks"}
}
{code}
 

 

  was:
Shard Split on HDFS Index will evaluate local disc space instead of HDFS space

When performing a shard split on an index that is stored on HDFS the 
SplitShardCmd however evaluates the free disc space on the local file system of 
the server where Solr is installed.

SplitShardCmd assumes that its main phase (when the Lucene index is being 
split) always executes on the local file system of the shard leader; and indeed 
the ShardSplitCmd.checkDiskSpace() checks the local file system's free disk 
space - even though the actual data is written to the HDFS Directory so it 
(almost) doesn't affect the local FS (except for core.properties file).

See also: [https://lucene.472066.n3.nabble.com/HDFS-Shard-Split-td4449920.html]

My setup to reproduce the issue:
 * Solr deployed on Openshift with local disc of about 5GB
 * HDFS configuration based on solrconfig.xml with

{code:java}
<str name="solr.hdfs.home">hdfs://path/to/index/</str>
...
{code}
 * Split command:

{code:java}
.../admin/collections?action=SPLITSHARD&collection=collection1&shard=shard1&async=1234{code}
 * Response:

{code:java}
{
  "responseHeader":{"status":0,"QTime":32},
  "Operation splitshard caused 
exception:":"org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
 not enough free disk space to perform index split on node :8983_solr, required: 294.64909074269235, available: 5.4632568359375",
  "exception":{
    "msg":"not enough free disk space to perform index split on node :8983_solr, required: 294.64909074269235, available: 5.4632568359375",
    "rspCode":500},
  "status":{"state":"failed","msg":"found [1234] in failed tasks"}
}
{code}
 

 


> Shard Split on HDFS 
> 
>
> Key: SOLR-14251
> URL: https://issues.apache.org/jira/browse/SOLR-14251
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: hdfs
>Affects Versions: 8.4
>Reporter: Johannes Brucher
>Priority: Major
>
> Shard Split on HDFS Index will evaluate local disc space instead of HDFS space
> When performing a shard split on an index that is stored on HDFS the 
> SplitShardCmd however evaluates the free disc space on the local file system 
> of the server where Solr is installed.
> SplitShardCmd assumes that its main phase (when the Lucene index is being 
> split) always executes on the local file system of the shard leader; and 
> indeed the ShardSplitCmd.checkDiskSpace() checks the local file system's free 
> disk space - even though the actual data is written to the HDFS Directory so 
> it (almost) doesn't affect the local FS (except for core.properties file).
> See also: 
> [https://lucene.472066.n3.nabble.com/HDFS-Shard-Split-td4449920.html]
> My setup to reproduce the issue:
>  * Solr deployed on Openshift with local disc of about 5GB
>  * HDFS configuration based on solrconfig.xml with
> {code:java}
> <str name="solr.hdfs.home">hdfs://path/to/index/</str>
> ...
> {code}
>  * Split command:
> {code:java}
> .../admin/collections?action=SPLITSHARD&collection=collection1&shard=shard1&async=1234{code}
>  * Response:
> {code:java}
> {
>   "responseHeader":{"status":0,"QTime":32},

[jira] [Commented] (LUCENE-9216) TestDoubleValuesSource#testSortMissingExplicit failure

2020-02-10 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17033499#comment-17033499
 ] 

ASF subversion and git services commented on LUCENE-9216:
-

Commit 49a37708a0f453a57d532ccfdb8afb367eb07f3b in lucene-solr's branch 
refs/heads/branch_8x from Ignacio Vera
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=49a3770 ]

LUCENE-9216: Make sure we index LEAST_DOUBLE_VALUE (#1246)



> TestDoubleValuesSource#testSortMissingExplicit failure
> --
>
> Key: LUCENE-9216
> URL: https://issues.apache.org/jira/browse/LUCENE-9216
> Project: Lucene - Core
>  Issue Type: Test
>Reporter: Ignacio Vera
>Priority: Minor
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Following test has failed:
>  
> {code:java}
> ant test  -Dtestcase=TestDoubleValuesSource 
> -Dtests.method=testSortMissingExplicit -Dtests.seed=B75F561D1F45F362 
> -Dtests.slow=true -Dtests.locale=sr-Cyrl-ME -Dtests.timezone=Etc/GMT-8 
> -Dtests.asserts=true -Dtests.file.encoding=Cp1252 {code}
> It is a problem in the test due to test refactoring.






[jira] [Resolved] (LUCENE-9216) TestDoubleValuesSource#testSortMissingExplicit failure

2020-02-10 Thread Ignacio Vera (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ignacio Vera resolved LUCENE-9216.
--
Fix Version/s: 8.5
 Assignee: Ignacio Vera
   Resolution: Fixed

> TestDoubleValuesSource#testSortMissingExplicit failure
> --
>
> Key: LUCENE-9216
> URL: https://issues.apache.org/jira/browse/LUCENE-9216
> Project: Lucene - Core
>  Issue Type: Test
>Reporter: Ignacio Vera
>Assignee: Ignacio Vera
>Priority: Minor
> Fix For: 8.5
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Following test has failed:
>  
> {code:java}
> ant test  -Dtestcase=TestDoubleValuesSource 
> -Dtests.method=testSortMissingExplicit -Dtests.seed=B75F561D1F45F362 
> -Dtests.slow=true -Dtests.locale=sr-Cyrl-ME -Dtests.timezone=Etc/GMT-8 
> -Dtests.asserts=true -Dtests.file.encoding=Cp1252 {code}
> It is a problem in the test due to test refactoring.






[jira] [Commented] (LUCENE-9216) TestDoubleValuesSource#testSortMissingExplicit failure

2020-02-10 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17033498#comment-17033498
 ] 

ASF subversion and git services commented on LUCENE-9216:
-

Commit 87421d7231cf7f7acb2a912bc3221ada8f992831 in lucene-solr's branch 
refs/heads/master from Ignacio Vera
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=87421d7 ]

LUCENE-9216: Make sure we index LEAST_DOUBLE_VALUE (#1246)



> TestDoubleValuesSource#testSortMissingExplicit failure
> --
>
> Key: LUCENE-9216
> URL: https://issues.apache.org/jira/browse/LUCENE-9216
> Project: Lucene - Core
>  Issue Type: Test
>Reporter: Ignacio Vera
>Priority: Minor
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Following test has failed:
>  
> {code:java}
> ant test  -Dtestcase=TestDoubleValuesSource 
> -Dtests.method=testSortMissingExplicit -Dtests.seed=B75F561D1F45F362 
> -Dtests.slow=true -Dtests.locale=sr-Cyrl-ME -Dtests.timezone=Etc/GMT-8 
> -Dtests.asserts=true -Dtests.file.encoding=Cp1252 {code}
> It is a problem in the test due to test refactoring.






[GitHub] [lucene-solr] iverase merged pull request #1246: LUCENE-9216: Make sure we index LEAST_DOUBLE_VALUE

2020-02-10 Thread GitBox
iverase merged pull request #1246: LUCENE-9216: Make sure we index 
LEAST_DOUBLE_VALUE
URL: https://github.com/apache/lucene-solr/pull/1246
 
 
   





[jira] [Created] (SOLR-14251) Shard Split on HDFS

2020-02-10 Thread Johannes Brucher (Jira)
Johannes Brucher created SOLR-14251:
---

 Summary: Shard Split on HDFS 
 Key: SOLR-14251
 URL: https://issues.apache.org/jira/browse/SOLR-14251
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: hdfs
Affects Versions: 8.4
Reporter: Johannes Brucher


Shard Split on HDFS Index will evaluate local disc space instead of HDFS space

When performing a shard split on an index that is stored on HDFS the 
SplitShardCmd however evaluates the free disc space on the local file system of 
the server where Solr is installed.

SplitShardCmd assumes that its main phase (when the Lucene index is being 
split) always executes on the local file system of the shard leader; and indeed 
the ShardSplitCmd.checkDiskSpace() checks the local file system's free disk 
space - even though the actual data is written to the HDFS Directory so it 
(almost) doesn't affect the local FS (except for core.properties file).

See also: [https://lucene.472066.n3.nabble.com/HDFS-Shard-Split-td4449920.html]

My setup to reproduce the issue:
 * Solr deployed on Openshift with local disc of about 5GB
 * HDFS configuration based on solrconfig.xml with

{code:java}
<str name="solr.hdfs.home">hdfs://path/to/index/</str>
...
{code}
 * Split command:

{code:java}
.../admin/collections?action=SPLITSHARD&collection=collection1&shard=shard1&async=1234{code}
 * Response:

{code:java}
{
  "responseHeader":{"status":0,"QTime":32},
  "Operation splitshard caused 
exception:":"org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
 not enough free disk space to perform index split on node :8983_solr, required: 294.64909074269235, available: 5.4632568359375",
  "exception":{
    "msg":"not enough free disk space to perform index split on node :8983_solr, required: 294.64909074269235, available: 5.4632568359375",
    "rspCode":500},
  "status":{"state":"failed","msg":"found [1234] in failed tasks"}
}
{code}
 

 






[GitHub] [lucene-solr] alessandrobenedetti commented on issue #357: [SOLR-12238] Synonym Queries boost

2020-02-10 Thread GitBox
alessandrobenedetti commented on issue #357: [SOLR-12238] Synonym Queries boost
URL: https://github.com/apache/lucene-solr/pull/357#issuecomment-584059142
 
 
   Latest comments have been addressed, let me know if there's anything else 
needed here :)





[GitHub] [lucene-solr] alessandrobenedetti commented on a change in pull request #357: [SOLR-12238] Synonym Queries boost

2020-02-10 Thread GitBox
alessandrobenedetti commented on a change in pull request #357: [SOLR-12238] 
Synonym Queries boost
URL: https://github.com/apache/lucene-solr/pull/357#discussion_r376982361
 
 

 ##
 File path: 
lucene/analysis/common/src/java/org/apache/lucene/analysis/boost/DelimitedBoostTokenFilter.java
 ##
 @@ -0,0 +1,61 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.analysis.boost;
+
+import org.apache.lucene.analysis.TokenFilter;
+import org.apache.lucene.analysis.TokenStream;
+import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
+import org.apache.lucene.search.BoostAttribute;
+
+import java.io.IOException;
+
+
+/**
+ * Characters before the delimiter are the "token", those after are the boost.
+ * <p>
+ * For example, if the delimiter is '|', then for the string "foo|0.7", foo is the token
+ * and 0.7 is the boost.
+ * <p>
+ * Note make sure your Tokenizer doesn't split on the delimiter, or this won't work
+ */
+public final class DelimitedBoostTokenFilter extends TokenFilter {
+  private final char delimiter;
+  private final CharTermAttribute termAtt = addAttribute(CharTermAttribute.class);
+  private final BoostAttribute boostAtt = addAttribute(BoostAttribute.class);
+
+  public DelimitedBoostTokenFilter(TokenStream input, char delimiter) {
+    super(input);
+    this.delimiter = delimiter;
+  }
+
+  @Override
+  public boolean incrementToken() throws IOException {
+    if (input.incrementToken()) {
+      final char[] buffer = termAtt.buffer();
+      final int length = termAtt.length();
+      for (int i = 0; i < length; i++) {
+        if (buffer[i] == delimiter) {
+          float boost = Float.parseFloat(new String(buffer, i + 1, length - (i + 1)));
+          boostAtt.setBoost(boost);
+          termAtt.setLength(i);
+          return true;
+        }
+      }
+      return true;
+    } else return false;
 
 Review comment:
   coming in the next commit, can you check it? I took it from the 
delimitedPayload, I guess code style is somewhat inconsistent across the 
project (I verified that multiple times in the past)
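For reference, the "token|boost" convention the filter implements can be shown with plain string handling (an illustrative sketch with hypothetical names, not the TokenStream machinery above):

```java
public class BoostSplitDemo {
    // Splits "token|boost" at the first delimiter. When no delimiter is
    // present, the boost defaults to 1.0f, mirroring the filter leaving the
    // boost attribute untouched for undelimited tokens.
    static Object[] split(String s, char delimiter) {
        int i = s.indexOf(delimiter);
        if (i < 0) {
            return new Object[] { s, 1.0f };
        }
        return new Object[] { s.substring(0, i), Float.parseFloat(s.substring(i + 1)) };
    }

    public static void main(String[] args) {
        Object[] r = split("foo|0.7", '|');
        System.out.println(r[0] + " -> " + r[1]); // prints: foo -> 0.7
    }
}
```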





[GitHub] [lucene-solr] alessandrobenedetti commented on a change in pull request #357: [SOLR-12238] Synonym Queries boost

2020-02-10 Thread GitBox
alessandrobenedetti commented on a change in pull request #357: [SOLR-12238] 
Synonym Queries boost
URL: https://github.com/apache/lucene-solr/pull/357#discussion_r376980226
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/util/QueryBuilder.java
 ##
 @@ -63,6 +66,25 @@
   protected boolean enableGraphQueries = true;
   protected boolean autoGenerateMultiTermSynonymsPhraseQuery = false;
 
+  /**
+   * Wraps a term and boost
+   */
+  public static class TermAndBoost {
+private static final float DEFAULT_BOOST = 1.0f;
 
 Review comment:
   I agree, coming in the upcoming commit.
   Furthermore in a lot of places in Lucene and Solr 1.0f is used when it is 
actually the DEFAULT_BOOST, I won't change that, it's not the scope of this 
issue but it would be nice to add a ticket to do that.





[GitHub] [lucene-solr] alessandrobenedetti commented on a change in pull request #357: [SOLR-12238] Synonym Queries boost

2020-02-10 Thread GitBox
alessandrobenedetti commented on a change in pull request #357: [SOLR-12238] 
Synonym Queries boost
URL: https://github.com/apache/lucene-solr/pull/357#discussion_r376980226
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/util/QueryBuilder.java
 ##
 @@ -63,6 +66,25 @@
   protected boolean enableGraphQueries = true;
   protected boolean autoGenerateMultiTermSynonymsPhraseQuery = false;
 
+  /**
+   * Wraps a term and boost
+   */
+  public static class TermAndBoost {
+private static final float DEFAULT_BOOST = 1.0f;
 
 Review comment:
   I agree, coming in the upcoming commit





[GitHub] [lucene-solr] alessandrobenedetti commented on a change in pull request #357: [SOLR-12238] Synonym Queries boost

2020-02-10 Thread GitBox
alessandrobenedetti commented on a change in pull request #357: [SOLR-12238] 
Synonym Queries boost
URL: https://github.com/apache/lucene-solr/pull/357#discussion_r376978767
 
 

 ##
 File path: 
lucene/core/src/java/org/apache/lucene/util/graph/GraphTokenStreamFiniteStrings.java
 ##
 @@ -124,6 +126,15 @@ public boolean hasSidePath(int state) {
 .toArray(Term[]::new);
   }
 
+  /**
+   * Returns the list of terms that start at the provided state
+   */
+  public QueryBuilder.TermAndBoost[] getTermsAndBoosts(String field, int 
state) {
 
 Review comment:
   no worries at all, done in the upcoming commit!





[GitHub] [lucene-solr] alessandrobenedetti commented on a change in pull request #357: [SOLR-12238] Synonym Queries boost

2020-02-10 Thread GitBox
alessandrobenedetti commented on a change in pull request #357: [SOLR-12238] 
Synonym Queries boost
URL: https://github.com/apache/lucene-solr/pull/357#discussion_r376974784
 
 

 ##
 File path: 
lucene/analysis/common/src/java/org/apache/lucene/analysis/boost/package-info.java
 ##
 @@ -0,0 +1,21 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/**
+ * Provides various convenience classes for creating boosts on Tokens.
+ */
+package org.apache.lucene.analysis.boost;
 
 Review comment:
   So let's keep the `boost` package then? No strong opinion on my side.





[GitHub] [lucene-solr] s1monw commented on issue #1215: LUCENE-9164: Ignore ACE on tragic event if IW is closed

2020-02-10 Thread GitBox
s1monw commented on issue #1215: LUCENE-9164: Ignore ACE on tragic event if IW 
is closed
URL: https://github.com/apache/lucene-solr/pull/1215#issuecomment-584038675
 
 
   I will start working on some refactorings to streamline this.





[GitHub] [lucene-solr] romseygeek commented on a change in pull request #357: [SOLR-12238] Synonym Queries boost

2020-02-10 Thread GitBox
romseygeek commented on a change in pull request #357: [SOLR-12238] Synonym 
Queries boost
URL: https://github.com/apache/lucene-solr/pull/357#discussion_r376937485
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/util/QueryBuilder.java
 ##
 @@ -63,6 +66,25 @@
   protected boolean enableGraphQueries = true;
   protected boolean autoGenerateMultiTermSynonymsPhraseQuery = false;
 
+  /**
+   * Wraps a term and boost
+   */
+  public static class TermAndBoost {
+private static final float DEFAULT_BOOST = 1.0f;
 
 Review comment:
   I think this should probably be on `BoostAttribute` rather than here?





[GitHub] [lucene-solr] romseygeek commented on a change in pull request #357: [SOLR-12238] Synonym Queries boost

2020-02-10 Thread GitBox
romseygeek commented on a change in pull request #357: [SOLR-12238] Synonym 
Queries boost
URL: https://github.com/apache/lucene-solr/pull/357#discussion_r376936277
 
 

 ##
 File path: 
lucene/core/src/java/org/apache/lucene/util/graph/GraphTokenStreamFiniteStrings.java
 ##
 @@ -124,6 +126,15 @@ public boolean hasSidePath(int state) {
 .toArray(Term[]::new);
   }
 
+  /**
+   * Returns the list of terms that start at the provided state
+   */
+  public QueryBuilder.TermAndBoost[] getTermsAndBoosts(String field, int 
state) {
 
 Review comment:
   Yes, let's go back to `AttributeSource` - sorry for the back and forth on 
this @alessandrobenedetti 





[GitHub] [lucene-solr] romseygeek commented on a change in pull request #357: [SOLR-12238] Synonym Queries boost

2020-02-10 Thread GitBox
romseygeek commented on a change in pull request #357: [SOLR-12238] Synonym 
Queries boost
URL: https://github.com/apache/lucene-solr/pull/357#discussion_r376935901
 
 

 ##
 File path: 
lucene/analysis/common/src/java/org/apache/lucene/analysis/boost/package-info.java
 ##
 @@ -0,0 +1,21 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/**
+ * Provides various convenience classes for creating boosts on Tokens.
+ */
+package org.apache.lucene.analysis.boost;
 
 Review comment:
   I like the `boost` package - I'm already thinking about a 
`TypeToBoostTokenFilter` that would automatically boost tokens marked with a 
`SYNONYM` type for example, and there are probably other boosting filters we 
can come up with, so a package to collect them all makes sense to me.  I prefer 
to group packages by functionality rather than implementation.
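To make the idea concrete, here is a toy, self-contained simulation of such a type-based boosting filter. Note the hedges: `TypeToBoostTokenFilter` is only a name proposed in the comment above, and this sketch uses plain Java objects instead of Lucene's `TokenFilter`/`TypeAttribute`/`BoostAttribute` machinery:

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for a token: in a real Lucene filter this state would live in
// CharTermAttribute, TypeAttribute and BoostAttribute on the token stream.
class Tok {
  final String term;
  final String type;
  float boost = 1.0f;
  Tok(String term, String type) { this.term = term; this.type = type; }
}

public class TypeToBoostSketch {
  // Simulates the proposed filter: assign a fixed boost to tokens of one type.
  static List<Tok> boostByType(List<Tok> in, String type, float boost) {
    for (Tok t : in) {
      if (type.equals(t.type)) {
        t.boost = boost;   // a real filter would call BoostAttribute#setBoost
      }
    }
    return in;
  }

  public static void main(String[] args) {
    List<Tok> stream = new ArrayList<>();
    stream.add(new Tok("fast", "word"));
    stream.add(new Tok("quick", "SYNONYM")); // injected by a synonym filter
    boostByType(stream, "SYNONYM", 0.4f);
    for (Tok t : stream) {
      System.out.println(t.term + " boost=" + t.boost);
    }
    // prints:
    // fast boost=1.0
    // quick boost=0.4
  }
}
```

The point of the sketch is the grouping argument made above: any filter of this shape (boost by type, by position, by payload) only needs the token's attributes, so such filters naturally collect into one functional package.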





[GitHub] [lucene-solr] iverase opened a new pull request #1246: LUCENE-9216: Make sure we index LEAST_DOUBLE_VALUE

2020-02-10 Thread GitBox
iverase opened a new pull request #1246: LUCENE-9216: Make sure we index 
LEAST_DOUBLE_VALUE
URL: https://github.com/apache/lucene-solr/pull/1246
 
 
   Trivial test fix





[jira] [Created] (LUCENE-9216) TestDoubleValuesSource#testSortMissingExplicit failure

2020-02-10 Thread Ignacio Vera (Jira)
Ignacio Vera created LUCENE-9216:


 Summary: TestDoubleValuesSource#testSortMissingExplicit failure
 Key: LUCENE-9216
 URL: https://issues.apache.org/jira/browse/LUCENE-9216
 Project: Lucene - Core
  Issue Type: Test
Reporter: Ignacio Vera


The following test has failed:

 
{code:java}
ant test  -Dtestcase=TestDoubleValuesSource 
-Dtests.method=testSortMissingExplicit -Dtests.seed=B75F561D1F45F362 
-Dtests.slow=true -Dtests.locale=sr-Cyrl-ME -Dtests.timezone=Etc/GMT-8 
-Dtests.asserts=true -Dtests.file.encoding=Cp1252 {code}

It is a problem in the test itself, introduced by a test refactoring.




--
This message was sent by Atlassian Jira
(v8.3.4#803005)




[jira] [Updated] (LUCENE-9136) Introduce IVFFlat to Lucene for ANN similarity search

2020-02-10 Thread Xin-Chun Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xin-Chun Zhang updated LUCENE-9136:
---
Description: 
Representation learning (RL) has been an established discipline in the machine 
learning space for decades but it draws tremendous attention lately with the 
emergence of deep learning. The central problem of RL is to determine an 
optimal representation of the input data. By embedding the data into a high 
dimensional vector, the vector retrieval (VR) method is then applied to search 
the relevant items.

With the rapid development of RL over the past few years, the technique has 
been used extensively in industry from online advertising to computer vision 
and speech recognition. There exist many open source implementations of VR 
algorithms, such as Facebook's FAISS and Microsoft's SPTAG, providing various 
choices for potential users. However, the aforementioned implementations are 
all written in C++ with no plan to support a Java interface, which makes them 
hard to integrate into Java projects and inaccessible to those who are not 
familiar with C/C++ [https://github.com/facebookresearch/faiss/issues/105]. 

The algorithms for vector retrieval can be roughly classified into four 
categories:
 # Tree-based algorithms, such as the KD-tree;
 # Hashing methods, such as LSH (Locality-Sensitive Hashing);
 # Product-quantization-based algorithms, such as IVFFlat;
 # Graph-based algorithms, such as HNSW, SSG, and NSG;

where IVFFlat and HNSW are the most popular ones among all the VR algorithms.

IVFFlat is better for high-precision applications such as face recognition, 
while HNSW performs better in general scenarios including recommendation and 
personalized advertisement. *The recall ratio of IVFFlat could be gradually 
increased by adjusting the query parameter (nprobe), while it's hard for HNSW 
to improve its accuracy*. In theory, IVFFlat could achieve 100% recall ratio. 

Recently, the implementation of HNSW (Hierarchical Navigable Small World, 
LUCENE-9004) for Lucene has made great progress. The issue has drawn the 
attention of those who are interested in Lucene or hope to use HNSW with 
Solr/Lucene. 

As an alternative for solving ANN similarity search problems, IVFFlat is also 
very popular with many users and supporters. Compared with HNSW, IVFFlat has a 
smaller index size but requires k-means clustering, while HNSW is faster at 
query time (no training required) but needs extra storage for its graphs 
[indexing 1M 
vectors|https://github.com/facebookresearch/faiss/wiki/Indexing-1M-vectors]. 
Another advantage is that IVFFlat can be faster and more accurate when GPU 
parallel computing is enabled (currently not supported in Java). Both 
algorithms have their merits and demerits. Since HNSW is now under 
development, it may be better to provide both implementations (HNSW and 
IVFFlat) for potential users who face very different scenarios and want more 
choices.
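The nprobe/recall trade-off described above can be sketched with a toy, self-contained example. This is an illustration of the IVFFlat idea only, under heavy simplifying assumptions: one-dimensional "vectors", fixed centroids instead of trained k-means, and a single nearest neighbor returned.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Toy IVFFlat-style index over 1-D "vectors": points are bucketed under their
// nearest coarse centroid, and a query scans only the nprobe closest buckets.
public class IvfFlatSketch {
  final double[] centroids;
  final List<List<Double>> buckets;

  IvfFlatSketch(double[] centroids) {
    this.centroids = centroids;
    this.buckets = new ArrayList<>();
    for (int i = 0; i < centroids.length; i++) buckets.add(new ArrayList<>());
  }

  int nearestCentroid(double v) {
    int best = 0;
    for (int i = 1; i < centroids.length; i++) {
      if (Math.abs(centroids[i] - v) < Math.abs(centroids[best] - v)) best = i;
    }
    return best;
  }

  void add(double v) { buckets.get(nearestCentroid(v)).add(v); }

  // Exact nearest-neighbor scan restricted to the nprobe buckets whose
  // centroids are closest to the query; larger nprobe trades speed for recall.
  double search(double query, int nprobe) {
    Integer[] order = new Integer[centroids.length];
    for (int i = 0; i < order.length; i++) order[i] = i;
    Arrays.sort(order,
        Comparator.comparingDouble(i -> Math.abs(centroids[i] - query)));
    double best = Double.NaN;
    double bestDist = Double.POSITIVE_INFINITY;
    for (int p = 0; p < Math.min(nprobe, order.length); p++) {
      for (double v : buckets.get(order[p])) {
        double d = Math.abs(v - query);
        if (d < bestDist) { bestDist = d; best = v; }
      }
    }
    return best;
  }

  public static void main(String[] args) {
    IvfFlatSketch index = new IvfFlatSketch(new double[] {0.0, 10.0});
    for (double v : new double[] {1.0, 4.9, 6.0, 9.0}) index.add(v);
    // nprobe=1 misses the true neighbor 4.9 (it sits in the other bucket);
    // nprobe equal to the bucket count makes the scan exhaustive, which is
    // why recall can in principle be pushed to 100%.
    System.out.println(index.search(5.2, 1)); // 6.0
    System.out.println(index.search(5.2, 2)); // 4.9
  }
}
```

The `main` method shows exactly the behavior the description claims: recall improves monotonically as nprobe grows, at the cost of scanning more of the index.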
