[jira] [Resolved] (LUCENE-9580) Tessellator failure for a certain polygon

2021-03-08 Thread Ignacio Vera (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ignacio Vera resolved LUCENE-9580.
--
Fix Version/s: master (9.0)
 Assignee: Ignacio Vera
   Resolution: Fixed

> Tessellator failure for a certain polygon
> -
>
> Key: LUCENE-9580
> URL: https://issues.apache.org/jira/browse/LUCENE-9580
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 8.5, 8.6
>Reporter: Iurii Vyshnevskyi
>Assignee: Ignacio Vera
>Priority: Major
> Fix For: master (9.0)
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> This bug was discovered while using ElasticSearch (checked with versions 
> 7.6.2 and 7.9.2).
> But I've created an isolated test case just for Lucene: 
> [https://github.com/apache/lucene-solr/pull/2006/files]
>  
> The unit test fails with "java.lang.IllegalArgumentException: Unable to 
> Tessellate shape".
>  
> The polygon contains two holes that share the same vertex, plus one more 
> standalone hole. Removing any of them makes the unit test pass.
>  
> Changing the least significant digit in any coordinate of the "common vertex" 
> in either of the first two holes, so that the vertices become distinct in each 
> hole, also makes the unit test pass.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9580) Tessellator failure for a certain polygon

2021-03-08 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17297916#comment-17297916
 ] 

ASF subversion and git services commented on LUCENE-9580:
-

Commit 578b2aea8f50deead79a70fac229140a63b8221c in lucene-solr's branch 
refs/heads/master from Ignacio Vera
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=578b2ae ]

LUCENE-9580: Fix bug in the polygon tessellator when introducing collinear 
edges during polygon splitting (#2452)
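The commit message refers to collinear edges introduced while splitting a polygon. As a rough illustration of the kind of check involved (a hypothetical sketch, not Lucene's actual tessellator code), three points are collinear exactly when the cross product of the vectors they span is zero; edges like these, created when a split passes through a shared hole vertex, can defeat ear-clipping triangulation:

```java
// Hypothetical sketch (assumed names, not the Lucene implementation):
// detect collinear triples via the 2D cross product.
public class CollinearCheck {
    // Cross product of (b - a) and (c - a); zero means a, b, c are collinear.
    static double cross(double ax, double ay, double bx, double by,
                        double cx, double cy) {
        return (bx - ax) * (cy - ay) - (by - ay) * (cx - ax);
    }

    static boolean isCollinear(double ax, double ay, double bx, double by,
                               double cx, double cy) {
        return cross(ax, ay, bx, by, cx, cy) == 0d;
    }

    public static void main(String[] args) {
        System.out.println(isCollinear(0, 0, 1, 1, 2, 2)); // true
        System.out.println(isCollinear(0, 0, 1, 1, 2, 3)); // false
    }
}
```

In production geometry code an epsilon tolerance would typically replace the exact zero comparison.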









[GitHub] [lucene-solr] iverase merged pull request #2452: LUCENE-9580: Don't introduce collinear edges when splitting polygon

2021-03-08 Thread GitBox


iverase merged pull request #2452:
URL: https://github.com/apache/lucene-solr/pull/2452


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8626) standardise test class naming

2021-03-08 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17297911#comment-17297911
 ] 

Dawid Weiss commented on LUCENE-8626:
-

I've fixed it already.

> standardise test class naming
> -
>
> Key: LUCENE-8626
> URL: https://issues.apache.org/jira/browse/LUCENE-8626
> Project: Lucene - Core
>  Issue Type: Test
>Reporter: Christine Poerschke
>Priority: Major
> Fix For: master (9.0)
>
> Attachments: SOLR-12939.01.patch, SOLR-12939.02.patch, 
> SOLR-12939.03.patch, SOLR-12939_hoss_validation_groovy_experiment.patch
>
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> This was mentioned and proposed on the dev mailing list. Starting this ticket 
> here to start to make it happen?
> History: This ticket was created as 
> https://issues.apache.org/jira/browse/SOLR-12939 ticket and then got 
> JIRA-moved to become https://issues.apache.org/jira/browse/LUCENE-8626 ticket.






[jira] [Commented] (LUCENE-8626) standardise test class naming

2021-03-08 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17297897#comment-17297897
 ] 

ASF subversion and git services commented on LUCENE-8626:
-

Commit 8969225bd2307ef46b9169d69b5446e361cd7740 in lucene-solr's branch 
refs/heads/master from Dawid Weiss
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=8969225 ]

LUCENE-8626: correct test suite name.








[jira] [Commented] (LUCENE-8626) standardise test class naming

2021-03-08 Thread Ignacio Vera (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17297899#comment-17297899
 ] 

Ignacio Vera commented on LUCENE-8626:
--

It seems the last test introduced is making the nightly tests unhappy:

 {{gradlew test --tests StressRamUsageEstimator -Dtests.seed=6D66DDAAD355DB2C 
-Dtests.nightly=true -Dtests.slow=true -Dtests.badapples=true 
-Dtests.locale=ewo-CM -Dtests.timezone=EET -Dtests.asserts=true 
-Dtests.file.encoding=UTF-8}}







[jira] [Commented] (LUCENE-9705) Move all codec formats to the o.a.l.codecs.Lucene90 package

2021-03-08 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17297875#comment-17297875
 ] 

ASF subversion and git services commented on LUCENE-9705:
-

Commit 144ef2a0c054b54ee533f5618f36651931825f7d in lucene-solr's branch 
refs/heads/master from Ignacio Vera
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=144ef2a ]

LUCENE-9705: Create Lucene90StoredFieldsFormat (#2444)



> Move all codec formats to the o.a.l.codecs.Lucene90 package
> ---
>
> Key: LUCENE-9705
> URL: https://issues.apache.org/jira/browse/LUCENE-9705
> Project: Lucene - Core
>  Issue Type: Wish
>Reporter: Ignacio Vera
>Priority: Major
>  Time Spent: 9.5h
>  Remaining Estimate: 0h
>
> Current formats are distributed across different packages, prefixed with the 
> Lucene version in which they were created. With the upcoming release of Lucene 
> 9.0, it would be nice to move all those formats to just the 
> o.a.l.codecs.Lucene90 package (and of course move the current ones to the 
> backward-codecs module).
> This issue would actually facilitate moving the directory API to little 
> endian (LUCENE-9047), as the only codecs that would need to handle backwards 
> compatibility would be those in backward-codecs.
> In addition, it can help formalise the use of internal versions vs format 
> versioning (LUCENE-9616).
>  
>  






[GitHub] [lucene-solr] iverase merged pull request #2444: LUCENE-9705: Create Lucene90StoredFieldsFormat

2021-03-08 Thread GitBox


iverase merged pull request #2444:
URL: https://github.com/apache/lucene-solr/pull/2444


   






[jira] [Commented] (SOLR-15038) Add elevateDocsWithoutMatchingQ and onlyElevatedRepresentative parameters to elevation functionality

2021-03-08 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-15038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17297840#comment-17297840
 ] 

David Smiley commented on SOLR-15038:
-

FYI I filed SOLR-15222 to have Solr *stop* auto-creating "userfiles".

> Add elevateDocsWithoutMatchingQ and onlyElevatedRepresentative parameters to 
> elevation functionality
> 
>
> Key: SOLR-15038
> URL: https://issues.apache.org/jira/browse/SOLR-15038
> Project: Solr
>  Issue Type: Improvement
>  Components: query
>Reporter: Tobias Kässmann
>Priority: Minor
> Fix For: 8.9
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> We've worked a lot with the Query Elevation component recently and we 
> were missing two features:
>  * Elevate only documents that are part of the search result
>  * In combination with collapsing: only show the representative if the 
> elevated documents have the same collapse field value.
> Because of this, we've added these two feature toggles 
> _elevateDocsWithoutMatchingQ_ and _onlyElevatedRepresentative._
>  






[jira] [Deleted] (SOLR-15228) Single host in a bad state can block collection creation for the cluster with autoscaling enabled

2021-03-08 Thread David Smiley (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-15228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley deleted SOLR-15228:



> Single host in a bad state can block collection creation for the cluster with 
> autoscaling enabled
> -
>
> Key: SOLR-15228
> URL: https://issues.apache.org/jira/browse/SOLR-15228
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Andy Throgmorton
>Priority: Minor
>
> We configured a SolrCloud cluster (running 8.2) with this cluster autoscaling 
> policy:
> {noformat}
> {
>   "set-cluster-preferences":[
> {
>   "minimize":"cores",
>   "precision":5
> },
> {
>   "maximize":"freedisk",
>   "precision":25
> },
> {
>   "minimize":"sysLoadAvg",
>   "precision":10
> }],
>   "set-cluster-policy":[
> {
>   "replica": "<2",
>   "node": "#ANY"
> }],
>   "set-trigger": {
> "name":".auto_add_replicas",
> "event":"nodeLost",
> "waitFor":"10m",
> "enabled":true,
> "actions":[
>   {
> "name":"auto_add_replicas_plan",
> "class":"solr.AutoAddReplicasPlanAction"},
>   {
> "name":"execute_plan",
> "class":"solr.ExecutePlanAction"}]
>   }
> }{noformat}
> A node was rebooted at one point, and when that node came back, it had 
> trouble establishing a connection with ZK when it was initializing the 
> CoreContainer. As a result, it returns 404s for (I think?) all admin requests.
> Now, any call to create a collection in that cluster throws an error, with 
> this stacktrace:
> {noformat}
> 2021-03-04 12:47:03.615 ERROR 
> (OverseerThreadFactory-141-thread-4-processing-n:HOST_REDACTED:8983_solr) [   
> ] o.a.s.c.a.c.OverseerCollectionMessageHandler Collection: COLLECTON_REDACTED 
> operation: create failed:org.apache.solr.common.SolrException: Error getting 
> replica locations : unable to get autoscaling policy session
> at 
> org.apache.solr.cloud.api.collections.CreateCollectionCmd.call(CreateCollectionCmd.java:195)
> at 
> org.apache.solr.cloud.api.collections.OverseerCollectionMessageHandler.processMessage(OverseerCollectionMessageHandler.java:264)
> at 
> org.apache.solr.cloud.OverseerTaskProcessor$Runner.run(OverseerTaskProcessor.java:505)
> at 
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209)
> at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: org.apache.solr.common.SolrException: unable to get autoscaling 
> policy session
> at 
> org.apache.solr.client.solrj.cloud.autoscaling.PolicyHelper.getReplicaLocations(PolicyHelper.java:129)
> at 
> org.apache.solr.cloud.api.collections.Assign.getPositionsUsingPolicy(Assign.java:382)
> at 
> org.apache.solr.cloud.api.collections.Assign$PolicyBasedAssignStrategy.assign(Assign.java:630)
> at 
> org.apache.solr.cloud.api.collections.CreateCollectionCmd.buildReplicaPositions(CreateCollectionCmd.java:410)
> at 
> org.apache.solr.cloud.api.collections.CreateCollectionCmd.call(CreateCollectionCmd.java:190)
> ... 6 more
> Caused by: org.apache.solr.common.SolrException: 
> org.apache.solr.common.SolrException: Error getting remote info
> at 
> org.apache.solr.common.cloud.rule.ImplicitSnitch.getTags(ImplicitSnitch.java:78)
> at 
> org.apache.solr.client.solrj.impl.SolrClientNodeStateProvider.fetchTagValues(SolrClientNodeStateProvider.java:139)
> at 
> org.apache.solr.client.solrj.impl.SolrClientNodeStateProvider.getNodeValues(SolrClientNodeStateProvider.java:128)
> at org.apache.solr.client.solrj.cloud.autoscaling.Row.(Row.java:71)
> at 
> org.apache.solr.client.solrj.cloud.autoscaling.Policy$Session.(Policy.java:575)
> at 
> org.apache.solr.client.solrj.cloud.autoscaling.Policy.createSession(Policy.java:396)
> at 
> org.apache.solr.client.solrj.cloud.autoscaling.Policy.createSession(Policy.java:358)
> at 
> org.apache.solr.client.solrj.cloud.autoscaling.PolicyHelper$SessionRef.createSession(PolicyHelper.java:492)
> at 
> org.apache.solr.client.solrj.cloud.autoscaling.PolicyHelper$SessionRef.get(PolicyHelper.java:457)
> at 
> org.apache.solr.client.solrj.cloud.autoscaling.PolicyHelper.getSession(PolicyHelper.java:513)
> at 
> org.apache.solr.client.solrj.cloud.autoscaling.PolicyHelper.getReplicaLocations(PolicyHelper.java:127)
> ... 10 more
> Caused by: org.apache.solr.common.SolrException: Error getting remote info
> at 
> org.apache.solr.client.solrj.impl.SolrClient

[jira] [Resolved] (SOLR-2852) SolrJ doesn't need woodstox jar

2021-03-08 Thread David Smiley (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-2852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley resolved SOLR-2852.

Fix Version/s: master (9.0)
   Resolution: Fixed

> SolrJ doesn't need woodstox jar
> ---
>
> Key: SOLR-2852
> URL: https://issues.apache.org/jira/browse/SOLR-2852
> Project: Solr
>  Issue Type: Improvement
>  Components: clients - java
>Reporter: David Smiley
>Assignee: David Smiley
>Priority: Minor
> Fix For: master (9.0)
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The /dist/solrj-lib/ directory contains wstx-asl-3.2.7.jar (Woodstox StAX 
> API).  SolrJ doesn't actually have any type of dependency on this library. 
> The maven build doesn't have it as a dependency and the tests pass.  Perhaps 
> Woodstox is faster than the JDK's StAX, I don't know, but I find that point 
> quite moot since SolrJ can use the efficient binary format.  Woodstox is not 
> a small library either, weighing in at 524KB, and of course if someone 
> actually wants to use it, they can.
> I propose woodstox be removed as a SolrJ dependency.  I am *not* proposing it 
> be removed as a Solr WAR dependency since it is actually required there due 
> to an obscure XSLT issue.






[jira] [Commented] (SOLR-2852) SolrJ doesn't need woodstox jar

2021-03-08 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-2852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17297835#comment-17297835
 ] 

ASF subversion and git services commented on SOLR-2852:
---

Commit cf1025e576a6cec6f724108994a778795cad6b64 in lucene-solr's branch 
refs/heads/master from David Smiley
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=cf1025e ]

SOLR-2852: SolrJ: remove Woodstox dependency (#2461)

It was never truly required there.
Pervasive use of "javabin" reduces the need to care about client-side XML 
speed.  Better to reduce dependencies and let clients use the libs they want.







[GitHub] [lucene-solr] dsmiley merged pull request #2461: SOLR-2852: SolrJ: remove Woodstox dependency

2021-03-08 Thread GitBox


dsmiley merged pull request #2461:
URL: https://github.com/apache/lucene-solr/pull/2461


   






[GitHub] [lucene-solr] atris commented on a change in pull request #2403: SOLR-15164: Implement Task Management Interface

2021-03-08 Thread GitBox


atris commented on a change in pull request #2403:
URL: https://github.com/apache/lucene-solr/pull/2403#discussion_r589952550



##
File path: solr/solr-ref-guide/src/common-query-parameters.adoc
##
@@ -84,6 +84,18 @@ You can use the `rows` parameter to paginate results from a 
query. The parameter
 
 The default value is `10`. That is, by default, Solr returns 10 documents at a 
time in response to a query.
 
+== canCancel Parameter
+
+This parameter defines if this query is cancellable i.e. can be cancelled 
during execution using the
+task management interface.
+
+== queryUUID Parameter
+
+For cancellable queries, this allows specifying a custom UUID to identify the 
query with. If `canCancel` is specified and `queryUUID` is not set, an auto 
generated UUID will be assigned to the query.
+
+If `queryUUID` is specified, this UUID will be used for identifying the query. 
Note that if using `queryUUID`, the responsibility of ensuring uniqueness of 
the UUID lies with the caller.

Review comment:
   Updated the docs








[GitHub] [lucene-solr] atris commented on a change in pull request #2403: SOLR-15164: Implement Task Management Interface

2021-03-08 Thread GitBox


atris commented on a change in pull request #2403:
URL: https://github.com/apache/lucene-solr/pull/2403#discussion_r589950273



##
File path: 
lucene/core/src/java/org/apache/lucene/search/CancellableCollector.java
##
@@ -0,0 +1,72 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.search;
+
+import java.io.IOException;
+import java.util.Objects;
+import java.util.concurrent.atomic.AtomicBoolean;
+import org.apache.lucene.index.LeafReaderContext;
+
+/** Allows a query to be cancelled */
+public class CancellableCollector implements Collector, CancellableTask {
+
+  /** Thrown when a query gets cancelled */
+  public static class QueryCancelledException extends RuntimeException {}
+
+  private Collector collector;
+  private AtomicBoolean isQueryCancelled;
+
+  public CancellableCollector(Collector collector) {
+Objects.requireNonNull(collector, "Internal collector not provided but 
wrapper collector accessed");
+
+this.collector = collector;
+this.isQueryCancelled = new AtomicBoolean();
+  }
+
+  @Override
+  public LeafCollector getLeafCollector(LeafReaderContext context) throws 
IOException {
+
+if (isQueryCancelled.compareAndSet(true, false)) {

Review comment:
   No -- the idea was that once the cancellation is processed, you "reset" 
the flag for sanity, but I guess it serves no special purpose. Changed, thanks
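The point of the review comment above is that `compareAndSet(true, false)` both checks and clears the cancellation flag, so only the first check after cancellation would observe it. A minimal standalone sketch of the agreed-upon pattern (assumed names, not the final Lucene code) reads the flag with `get()` so it stays set once cancelled:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch: a sticky cancellation flag. Once cancel() is called, every
// subsequent checkCancelled() throws, because get() does not reset the flag.
public class CancellationFlagDemo {
    static class QueryCancelledException extends RuntimeException {}

    private final AtomicBoolean cancelled = new AtomicBoolean();

    void cancel() { cancelled.set(true); }

    void checkCancelled() {
        if (cancelled.get()) {  // get(), not compareAndSet(true, false)
            throw new QueryCancelledException();
        }
    }

    public static void main(String[] args) {
        CancellationFlagDemo demo = new CancellationFlagDemo();
        demo.checkCancelled();  // no exception before cancellation
        demo.cancel();
        boolean thrown = false;
        try { demo.checkCancelled(); } catch (QueryCancelledException e) { thrown = true; }
        System.out.println(thrown); // true
        // the flag stays set: a second check throws as well
        thrown = false;
        try { demo.checkCancelled(); } catch (QueryCancelledException e) { thrown = true; }
        System.out.println(thrown); // true
    }
}
```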








[GitHub] [lucene-solr] atris commented on a change in pull request #2403: SOLR-15164: Implement Task Management Interface

2021-03-08 Thread GitBox


atris commented on a change in pull request #2403:
URL: https://github.com/apache/lucene-solr/pull/2403#discussion_r589948751



##
File path: solr/core/src/java/org/apache/solr/core/SolrCore.java
##
@@ -3245,6 +3252,75 @@ public void postClose(SolrCore core) {
 return blobRef;
   }
 
+  /** Generates a UUID for the given query or if the user provided a UUID
+   * for this query, uses that.
+   */
+  public String generateQueryID(SolrQueryRequest req) {
+String queryID;
+String customQueryUUID = req.getParams().get(CUSTOM_QUERY_UUID, null);
+
+if (customQueryUUID != null) {
+  queryID = customQueryUUID;
+} else {
+  queryID = UUID.randomUUID().toString();

Review comment:
   I didn't quite parse that -- you mean, not use `UUID`?
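The `generateQueryID` logic quoted in the diff above reduces to a simple fallback: use the caller-supplied ID when present, otherwise generate a random UUID. A self-contained sketch (the `SolrQueryRequest` plumbing is omitted; `customId` stands in for the `queryUUID` request parameter):

```java
import java.util.UUID;

// Sketch of the ID-assignment fallback discussed in the review thread.
public class QueryIdDemo {
    static String generateQueryID(String customId) {
        // Caller-supplied ID wins; otherwise fall back to a random UUID.
        // As the ref-guide diff notes, uniqueness of a custom ID is the
        // caller's responsibility.
        return (customId != null) ? customId : UUID.randomUUID().toString();
    }

    public static void main(String[] args) {
        System.out.println(generateQueryID("my-id")); // my-id
        System.out.println(generateQueryID(null));    // some random UUID
    }
}
```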








[jira] [Comment Edited] (SOLR-15225) standardise test class naming

2021-03-08 Thread Mark Robert Miller (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-15225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17297795#comment-17297795
 ] 

Mark Robert Miller edited comment on SOLR-15225 at 3/9/21, 4:54 AM:


I don’t have any comment on this issue, out of the game. And I have no interest 
in asking for anything given I don’t know exactly what’s coming and the selfish 
requirement and element to that work. But while this week is absolutely packed 
and I’m pulling back from my rapid fire add unleashed mind, yes, lots of docs 
and results coming so we can finally have a real discussion. 

* I’m trying stay focused on the work that needs to be done this week and 
disengage to really return to normal, but just an FYI, sure, we may get a 
little less for some reason, perhaps it takes longer, but we are going to gain 
major from this branch regardless of anything going in anywhere. 


was (Author: markrmiller):
I don’t have any comment on this issue, out of the game. And I have no interest 
in asking for anything given I don’t know exactly what’s coming and the selfish 
requirement and element to that work. But while this week is absolutely packed 
and I’m pulling back from my rapid fire add unleashed mind, yes, lots of docs 
and results coming so we can finally have a real discussion. 

* I’m trying stay focused on the work that needs to be done this week and 
disengage to really return to normal, but just an FYI, sure, we get a little 
for some reason, perhaps it takes longer, but we are going to gain major from 
this branch regardless of anything going in anywhere. 

> standardise test class naming
> -
>
> Key: SOLR-15225
> URL: https://issues.apache.org/jira/browse/SOLR-15225
> Project: Solr
>  Issue Type: Test
>Reporter: Christine Poerschke
>Priority: Major
>
> LUCENE-8626 started out as a standardisation effort for both Lucene and Solr 
> tests.
> The standardisation for Lucene tests ({{org.apache.lucene}} package space) is 
> now complete and enforced.
> This SOLR ticket here is for the standardisation of Solr test class names.






[jira] [Comment Edited] (SOLR-15225) standardise test class naming

2021-03-08 Thread Mark Robert Miller (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-15225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17297795#comment-17297795
 ] 

Mark Robert Miller edited comment on SOLR-15225 at 3/9/21, 4:53 AM:


I don’t have any comment on this issue, out of the game. And I have no interest 
in asking for anything given I don’t know exactly what’s coming and the selfish 
requirement and element to that work. But while this week is absolutely packed 
and I’m pulling back from my rapid fire add unleashed mind, yes, lots of docs 
and results coming so we can finally have a real discussion. 

* I’m trying stay focused on the work that needs to be done this week and 
disengage to really return to normal, but just an FYI, sure, we get a little 
for some reason, perhaps it takes longer, but we are going to gain major from 
this branch regardless of anything going in anywhere. 


was (Author: markrmiller):
I don’t have any comment on this issue, out of the game. And I have no interest 
in asking for anything given I don’t know exactly what’s coming and the selfish 
requirement and element to that work. But while this week is absolutely packed 
and I’m pulling back from my rapid fire add unleashed mind, yes, lots of docs 
and results coming so we can finally have a real discussion. 







[jira] [Commented] (LUCENE-8626) standardise test class naming

2021-03-08 Thread Marcus Eagan (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17297815#comment-17297815
 ] 

Marcus Eagan commented on LUCENE-8626:
--

Thank you Christine for wrapping this up.  Looks like it was open for 859d 47m. 




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-15225) standardise test class naming

2021-03-08 Thread Mark Robert Miller (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-15225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17297795#comment-17297795
 ] 

Mark Robert Miller commented on SOLR-15225:
---

I don’t have any comment on this issue, out of the game. And I have no interest 
in asking for anything given I don’t know exactly what’s coming and the selfish 
requirement and element to that work. But while this week is absolutely packed 
and I’m pulling back from my rapid fire add unleashed mind, yes, lots of docs 
and results coming so we can finally have a real discussion. 

> standardise test class naming
> -
>
> Key: SOLR-15225
> URL: https://issues.apache.org/jira/browse/SOLR-15225
> Project: Solr
>  Issue Type: Test
>Reporter: Christine Poerschke
>Priority: Major
>
> LUCENE-8626 started out as a standardisation effort for both Lucene and Solr 
> tests.
> The standardisation for Lucene tests ({{org.apache.lucene}} package space) is 
> now complete and enforced.
> This SOLR ticket here is for the standardisation of Solr test class names.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] anshumg commented on a change in pull request #2403: SOLR-15164: Implement Task Management Interface

2021-03-08 Thread GitBox


anshumg commented on a change in pull request #2403:
URL: https://github.com/apache/lucene-solr/pull/2403#discussion_r589872439



##
File path: 
lucene/core/src/java/org/apache/lucene/search/CancellableCollector.java
##
@@ -0,0 +1,72 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.search;
+
+import java.io.IOException;
+import java.util.Objects;
+import java.util.concurrent.atomic.AtomicBoolean;
+import org.apache.lucene.index.LeafReaderContext;
+
+/** Allows a query to be cancelled */
+public class CancellableCollector implements Collector, CancellableTask {
+
+  /** Thrown when a query gets cancelled */
+  public static class QueryCancelledException extends RuntimeException {}
+
+  private Collector collector;
+  private AtomicBoolean isQueryCancelled;

Review comment:
   This can also be declared final
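   For illustration, a hedged sketch of this suggestion (a placeholder `Object` stands in for `Collector`, since the full class is not reproduced here): a field that is assigned exactly once, in the constructor, can be declared `final`, and the compiler will then reject any later reassignment.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for the reviewed class: both fields are assigned
// exactly once in the constructor, so they can carry the final modifier.
public class FinalFieldSketch {
    private final Object collector;              // stand-in for Collector
    private final AtomicBoolean isQueryCancelled;

    public FinalFieldSketch(Object collector) {
        this.collector = collector;
        this.isQueryCancelled = new AtomicBoolean();
    }

    public Object collector() {
        return collector;
    }

    public boolean cancelled() {
        return isQueryCancelled.get();
    }
}
```

   Besides documenting that the references never change, `final` also gives safe-publication guarantees under the Java memory model when the object is shared across threads.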

##
File path: solr/core/src/test/org/apache/solr/search/TestTaskManagement.java
##
@@ -0,0 +1,263 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.search;
+
+import org.apache.lucene.util.BytesRef;
+import org.apache.solr.client.solrj.SolrRequest;
+import org.apache.solr.client.solrj.request.CollectionAdminRequest;
+import org.apache.solr.client.solrj.request.QueryRequest;
+import org.apache.solr.cloud.SolrCloudTestCase;
+import org.apache.solr.common.SolrInputDocument;
+import org.apache.solr.common.params.ModifiableSolrParams;
+import org.apache.solr.common.util.ExecutorUtil;
+import org.apache.solr.common.util.NamedList;
+import org.eclipse.jetty.util.ConcurrentHashSet;
+import org.junit.After;
+import org.junit.AfterClass;
+import org.junit.Before;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+import java.util.ArrayList;
+import java.util.HashSet;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ExecutorService;
+
+public class TestTaskManagement extends SolrCloudTestCase {
+private static final String COLLECTION_NAME = "collection1";
+
+private ExecutorService executorService;
+
+@BeforeClass
+public static void setupCluster() throws Exception {
+initCore("solrconfig.xml", "schema11.xml");
+
+configureCluster(4)
+.addConfig("conf", configset("sql"))
+.configure();
+}
+
+@AfterClass
+public static void tearDownCluster() throws Exception {
+shutdownCluster();
+}
+
+@Before
+public void setup() throws Exception {
+super.setUp();
+
+CollectionAdminRequest.createCollection(COLLECTION_NAME, "conf", 2, 1)
+.setPerReplicaState(SolrCloudTestCase.USE_PER_REPLICA_STATE)
+.process(cluster.getSolrClient());
+cluster.waitForActiveCollection(COLLECTION_NAME, 2, 2);
+cluster.getSolrClient().setDefaultCollection(COLLECTION_NAME);
+
+cluster.getSolrClient().setDefaultCollection("collection1");
+
+executorService = 
ExecutorUtil.newMDCAwareCachedThreadPool("TestTaskManagement");
+
+List<SolrInputDocument> docs = new ArrayList<>();
+for (int i = 0; i < 100; i++) {
+SolrInputDocument doc = new SolrInputDocument();
+doc.addField("id", i);
+doc.addField("foo1_s", Integer.toString(i));
+doc.addField("foo2_s", Boolean.toString(i % 2 == 0));
+doc.addField("

[GitHub] [lucene-site] anshumg commented on pull request #42: fix the example command

2021-03-08 Thread GitBox


anshumg commented on pull request #42:
URL: https://github.com/apache/lucene-site/pull/42#issuecomment-793238022


   @epugh  - you might want to merge this in soon? :) 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Resolved] (SOLR-15228) Single host in a bad state can block collection creation for the cluster with autoscaling enabled

2021-03-08 Thread Andy Throgmorton (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-15228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Throgmorton resolved SOLR-15228.
-
Resolution: Duplicate

I guess Jira made another bug when I hit refresh?

> Single host in a bad state can block collection creation for the cluster with 
> autoscaling enabled
> -
>
> Key: SOLR-15228
> URL: https://issues.apache.org/jira/browse/SOLR-15228
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: AutoScaling
>Affects Versions: 8.2
>Reporter: Andy Throgmorton
>Priority: Minor
>
> We configured a SolrCloud cluster (running 8.2) with this cluster autoscaling 
> policy:
> {noformat}
> {
>   "set-cluster-preferences":[
> {
>   "minimize":"cores",
>   "precision":5
> },
> {
>   "maximize":"freedisk",
>   "precision":25
> },
> {
>   "minimize":"sysLoadAvg",
>   "precision":10
> }],
>   "set-cluster-policy":[
> {
>   "replica": "<2",
>   "node": "#ANY"
> }],
>   "set-trigger": {
> "name":".auto_add_replicas",
> "event":"nodeLost",
> "waitFor":"10m",
> "enabled":true,
> "actions":[
>   {
> "name":"auto_add_replicas_plan",
> "class":"solr.AutoAddReplicasPlanAction"},
>   {
> "name":"execute_plan",
> "class":"solr.ExecutePlanAction"}]
>   }
> }{noformat}
> A node was rebooted at one point, and when that node came back, it had 
> trouble establishing a connection with ZK when it was initializing the 
> CoreContainer. As a result, it returns 404s for (I think?) all admin requests.
> Now, any call to create a collection in that cluster throws an error, with 
> this stacktrace:
> {noformat}
> 2021-03-04 12:47:03.615 ERROR 
> (OverseerThreadFactory-141-thread-4-processing-n:HOST_REDACTED:8983_solr) [   
> ] o.a.s.c.a.c.OverseerCollectionMessageHandler Collection: COLLECTON_REDACTED 
> operation: create failed:org.apache.solr.common.SolrException: Error getting 
> replica locations : unable to get autoscaling policy session
> at 
> org.apache.solr.cloud.api.collections.CreateCollectionCmd.call(CreateCollectionCmd.java:195)
> at 
> org.apache.solr.cloud.api.collections.OverseerCollectionMessageHandler.processMessage(OverseerCollectionMessageHandler.java:264)
> at 
> org.apache.solr.cloud.OverseerTaskProcessor$Runner.run(OverseerTaskProcessor.java:505)
> at 
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209)
> at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: org.apache.solr.common.SolrException: unable to get autoscaling 
> policy session
> at 
> org.apache.solr.client.solrj.cloud.autoscaling.PolicyHelper.getReplicaLocations(PolicyHelper.java:129)
> at 
> org.apache.solr.cloud.api.collections.Assign.getPositionsUsingPolicy(Assign.java:382)
> at 
> org.apache.solr.cloud.api.collections.Assign$PolicyBasedAssignStrategy.assign(Assign.java:630)
> at 
> org.apache.solr.cloud.api.collections.CreateCollectionCmd.buildReplicaPositions(CreateCollectionCmd.java:410)
> at 
> org.apache.solr.cloud.api.collections.CreateCollectionCmd.call(CreateCollectionCmd.java:190)
> ... 6 more
> Caused by: org.apache.solr.common.SolrException: 
> org.apache.solr.common.SolrException: Error getting remote info
> at 
> org.apache.solr.common.cloud.rule.ImplicitSnitch.getTags(ImplicitSnitch.java:78)
> at 
> org.apache.solr.client.solrj.impl.SolrClientNodeStateProvider.fetchTagValues(SolrClientNodeStateProvider.java:139)
> at 
> org.apache.solr.client.solrj.impl.SolrClientNodeStateProvider.getNodeValues(SolrClientNodeStateProvider.java:128)
> at org.apache.solr.client.solrj.cloud.autoscaling.Row.<init>(Row.java:71)
> at 
> org.apache.solr.client.solrj.cloud.autoscaling.Policy$Session.<init>(Policy.java:575)
> at 
> org.apache.solr.client.solrj.cloud.autoscaling.Policy.createSession(Policy.java:396)
> at 
> org.apache.solr.client.solrj.cloud.autoscaling.Policy.createSession(Policy.java:358)
> at 
> org.apache.solr.client.solrj.cloud.autoscaling.PolicyHelper$SessionRef.createSession(PolicyHelper.java:492)
> at 
> org.apache.solr.client.solrj.cloud.autoscaling.PolicyHelper$SessionRef.get(PolicyHelper.java:457)
> at 
> org.apache.solr.client.solrj.cloud.autoscaling.PolicyHelper.getSession(PolicyHelper.java:513)
> at 
> org.apache.solr.client.solrj.cloud.autoscaling.PolicyHelper.getReplicaLocations(PolicyHelper.java:127)
>  

[jira] [Created] (SOLR-15228) Single host in a bad state can block collection creation for the cluster with autoscaling enabled

2021-03-08 Thread Andy Throgmorton (Jira)
Andy Throgmorton created SOLR-15228:
---

 Summary: Single host in a bad state can block collection creation 
for the cluster with autoscaling enabled
 Key: SOLR-15228
 URL: https://issues.apache.org/jira/browse/SOLR-15228
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: AutoScaling
Affects Versions: 8.2
Reporter: Andy Throgmorton


We configured a SolrCloud cluster (running 8.2) with this cluster autoscaling 
policy:
{noformat}
{
  "set-cluster-preferences":[
{
  "minimize":"cores",
  "precision":5
},
{
  "maximize":"freedisk",
  "precision":25
},
{
  "minimize":"sysLoadAvg",
  "precision":10
}],
  "set-cluster-policy":[
{
  "replica": "<2",
  "node": "#ANY"
}],
  "set-trigger": {
"name":".auto_add_replicas",
"event":"nodeLost",
"waitFor":"10m",
"enabled":true,
"actions":[
  {
"name":"auto_add_replicas_plan",
"class":"solr.AutoAddReplicasPlanAction"},
  {
"name":"execute_plan",
"class":"solr.ExecutePlanAction"}]
  }
}{noformat}
A node was rebooted at one point, and when that node came back, it had trouble 
establishing a connection with ZK when it was initializing the CoreContainer. 
As a result, it returns 404s for (I think?) all admin requests.

Now, any call to create a collection in that cluster throws an error, with this 
stacktrace:
{noformat}
2021-03-04 12:47:03.615 ERROR 
(OverseerThreadFactory-141-thread-4-processing-n:HOST_REDACTED:8983_solr) [   ] 
o.a.s.c.a.c.OverseerCollectionMessageHandler Collection: COLLECTON_REDACTED 
operation: create failed:org.apache.solr.common.SolrException: Error getting 
replica locations : unable to get autoscaling policy session
at 
org.apache.solr.cloud.api.collections.CreateCollectionCmd.call(CreateCollectionCmd.java:195)
at 
org.apache.solr.cloud.api.collections.OverseerCollectionMessageHandler.processMessage(OverseerCollectionMessageHandler.java:264)
at 
org.apache.solr.cloud.OverseerTaskProcessor$Runner.run(OverseerTaskProcessor.java:505)
at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209)
at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: org.apache.solr.common.SolrException: unable to get autoscaling 
policy session
at 
org.apache.solr.client.solrj.cloud.autoscaling.PolicyHelper.getReplicaLocations(PolicyHelper.java:129)
at 
org.apache.solr.cloud.api.collections.Assign.getPositionsUsingPolicy(Assign.java:382)
at 
org.apache.solr.cloud.api.collections.Assign$PolicyBasedAssignStrategy.assign(Assign.java:630)
at 
org.apache.solr.cloud.api.collections.CreateCollectionCmd.buildReplicaPositions(CreateCollectionCmd.java:410)
at 
org.apache.solr.cloud.api.collections.CreateCollectionCmd.call(CreateCollectionCmd.java:190)
... 6 more
Caused by: org.apache.solr.common.SolrException: 
org.apache.solr.common.SolrException: Error getting remote info
at 
org.apache.solr.common.cloud.rule.ImplicitSnitch.getTags(ImplicitSnitch.java:78)
at 
org.apache.solr.client.solrj.impl.SolrClientNodeStateProvider.fetchTagValues(SolrClientNodeStateProvider.java:139)
at 
org.apache.solr.client.solrj.impl.SolrClientNodeStateProvider.getNodeValues(SolrClientNodeStateProvider.java:128)
at org.apache.solr.client.solrj.cloud.autoscaling.Row.<init>(Row.java:71)
at 
org.apache.solr.client.solrj.cloud.autoscaling.Policy$Session.<init>(Policy.java:575)
at 
org.apache.solr.client.solrj.cloud.autoscaling.Policy.createSession(Policy.java:396)
at 
org.apache.solr.client.solrj.cloud.autoscaling.Policy.createSession(Policy.java:358)
at 
org.apache.solr.client.solrj.cloud.autoscaling.PolicyHelper$SessionRef.createSession(PolicyHelper.java:492)
at 
org.apache.solr.client.solrj.cloud.autoscaling.PolicyHelper$SessionRef.get(PolicyHelper.java:457)
at 
org.apache.solr.client.solrj.cloud.autoscaling.PolicyHelper.getSession(PolicyHelper.java:513)
at 
org.apache.solr.client.solrj.cloud.autoscaling.PolicyHelper.getReplicaLocations(PolicyHelper.java:127)
... 10 more
Caused by: org.apache.solr.common.SolrException: Error getting remote info
at 
org.apache.solr.client.solrj.impl.SolrClientNodeStateProvider$AutoScalingSnitch.getRemoteInfo(SolrClientNodeStateProvider.java:364)
at 
org.apache.solr.common.cloud.rule.ImplicitSnitch.getTags(ImplicitSnitch.java:76)
... 20 more
Caused by: 
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error 
from server at https://HOSTNAME_REDACTED:8983/solr: Expected mime type 
application/octet-strea

[jira] [Created] (SOLR-15227) Single host in a bad state can block collection creation for the cluster with autoscaling enabled

2021-03-08 Thread Andy Throgmorton (Jira)
Andy Throgmorton created SOLR-15227:
---

 Summary: Single host in a bad state can block collection creation 
for the cluster with autoscaling enabled
 Key: SOLR-15227
 URL: https://issues.apache.org/jira/browse/SOLR-15227
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: AutoScaling
Affects Versions: 8.2
Reporter: Andy Throgmorton


We configured a SolrCloud cluster (running 8.2) with this cluster autoscaling 
policy:
{noformat}
{
  "set-cluster-preferences":[
{
  "minimize":"cores",
  "precision":5
},
{
  "maximize":"freedisk",
  "precision":25
},
{
  "minimize":"sysLoadAvg",
  "precision":10
}],
  "set-cluster-policy":[
{
  "replica": "<2",
  "node": "#ANY"
}],
  "set-trigger": {
"name":".auto_add_replicas",
"event":"nodeLost",
"waitFor":"10m",
"enabled":true,
"actions":[
  {
"name":"auto_add_replicas_plan",
"class":"solr.AutoAddReplicasPlanAction"},
  {
"name":"execute_plan",
"class":"solr.ExecutePlanAction"}]
  }
}{noformat}
A node was rebooted at one point, and when that node came back, it had trouble 
establishing a connection with ZK when it was initializing the CoreContainer. 
As a result, it returns 404s for (I think?) all admin requests.

Now, any call to create a collection in that cluster throws an error, with this 
stacktrace:
{noformat}
2021-03-04 12:47:03.615 ERROR 
(OverseerThreadFactory-141-thread-4-processing-n:HOST_REDACTED:8983_solr) [   ] 
o.a.s.c.a.c.OverseerCollectionMessageHandler Collection: COLLECTON_REDACTED 
operation: create failed:org.apache.solr.common.SolrException: Error getting 
replica locations : unable to get autoscaling policy session
at 
org.apache.solr.cloud.api.collections.CreateCollectionCmd.call(CreateCollectionCmd.java:195)
at 
org.apache.solr.cloud.api.collections.OverseerCollectionMessageHandler.processMessage(OverseerCollectionMessageHandler.java:264)
at 
org.apache.solr.cloud.OverseerTaskProcessor$Runner.run(OverseerTaskProcessor.java:505)
at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209)
at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: org.apache.solr.common.SolrException: unable to get autoscaling 
policy session
at 
org.apache.solr.client.solrj.cloud.autoscaling.PolicyHelper.getReplicaLocations(PolicyHelper.java:129)
at 
org.apache.solr.cloud.api.collections.Assign.getPositionsUsingPolicy(Assign.java:382)
at 
org.apache.solr.cloud.api.collections.Assign$PolicyBasedAssignStrategy.assign(Assign.java:630)
at 
org.apache.solr.cloud.api.collections.CreateCollectionCmd.buildReplicaPositions(CreateCollectionCmd.java:410)
at 
org.apache.solr.cloud.api.collections.CreateCollectionCmd.call(CreateCollectionCmd.java:190)
... 6 more
Caused by: org.apache.solr.common.SolrException: 
org.apache.solr.common.SolrException: Error getting remote info
at 
org.apache.solr.common.cloud.rule.ImplicitSnitch.getTags(ImplicitSnitch.java:78)
at 
org.apache.solr.client.solrj.impl.SolrClientNodeStateProvider.fetchTagValues(SolrClientNodeStateProvider.java:139)
at 
org.apache.solr.client.solrj.impl.SolrClientNodeStateProvider.getNodeValues(SolrClientNodeStateProvider.java:128)
at org.apache.solr.client.solrj.cloud.autoscaling.Row.<init>(Row.java:71)
at 
org.apache.solr.client.solrj.cloud.autoscaling.Policy$Session.<init>(Policy.java:575)
at 
org.apache.solr.client.solrj.cloud.autoscaling.Policy.createSession(Policy.java:396)
at 
org.apache.solr.client.solrj.cloud.autoscaling.Policy.createSession(Policy.java:358)
at 
org.apache.solr.client.solrj.cloud.autoscaling.PolicyHelper$SessionRef.createSession(PolicyHelper.java:492)
at 
org.apache.solr.client.solrj.cloud.autoscaling.PolicyHelper$SessionRef.get(PolicyHelper.java:457)
at 
org.apache.solr.client.solrj.cloud.autoscaling.PolicyHelper.getSession(PolicyHelper.java:513)
at 
org.apache.solr.client.solrj.cloud.autoscaling.PolicyHelper.getReplicaLocations(PolicyHelper.java:127)
... 10 more
Caused by: org.apache.solr.common.SolrException: Error getting remote info
at 
org.apache.solr.client.solrj.impl.SolrClientNodeStateProvider$AutoScalingSnitch.getRemoteInfo(SolrClientNodeStateProvider.java:364)
at 
org.apache.solr.common.cloud.rule.ImplicitSnitch.getTags(ImplicitSnitch.java:76)
... 20 more
Caused by: 
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error 
from server at https://HOSTNAME_REDACTED:8983/solr: Expected mime type 
application/octet-strea

[GitHub] [lucene-solr] janhoy opened a new pull request #2466: A silly test pr (DO NOT MERGE)

2021-03-08 Thread GitBox


janhoy opened a new pull request #2466:
URL: https://github.com/apache/lucene-solr/pull/2466


   Just for documentation's sake.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-site] janhoy merged pull request #49: Retire the general list from website

2021-03-08 Thread GitBox


janhoy merged pull request #49:
URL: https://github.com/apache/lucene-site/pull/49


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] madrob commented on a change in pull request #2403: SOLR-15164: Implement Task Management Interface

2021-03-08 Thread GitBox


madrob commented on a change in pull request #2403:
URL: https://github.com/apache/lucene-solr/pull/2403#discussion_r589676923



##
File path: solr/core/src/test/org/apache/solr/MinimalSchemaTest.java
##
@@ -117,7 +117,8 @@ public void testAllConfiguredHandlers() {
 handler.startsWith("/terms") ||
 handler.startsWith("/analysis/")||
 handler.startsWith("/debug/") ||
-handler.startsWith("/replication")
+handler.startsWith("/replication") ||
+handler.startsWith("/tasks")

Review comment:
   fix indent

##
File path: 
lucene/core/src/java/org/apache/lucene/search/CancellableCollector.java
##
@@ -0,0 +1,72 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.search;
+
+import java.io.IOException;
+import java.util.Objects;
+import java.util.concurrent.atomic.AtomicBoolean;
+import org.apache.lucene.index.LeafReaderContext;
+
+/** Allows a query to be cancelled */
+public class CancellableCollector implements Collector, CancellableTask {
+
+  /** Thrown when a query gets cancelled */
+  public static class QueryCancelledException extends RuntimeException {}
+
+  private Collector collector;
+  private AtomicBoolean isQueryCancelled;
+
+  public CancellableCollector(Collector collector) {
+Objects.requireNonNull(collector, "Internal collector not provided but 
wrapper collector accessed");
+
+this.collector = collector;
+this.isQueryCancelled = new AtomicBoolean();
+  }
+
+  @Override
+  public LeafCollector getLeafCollector(LeafReaderContext context) throws 
IOException {
+
+if (isQueryCancelled.compareAndSet(true, false)) {

Review comment:
   Why does getLeafCollector "uncancel" it? Is this collector instance 
meant to be reused?
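   To make the semantics in question concrete, here is a small self-contained model of the flag (the class and method names are hypothetical, not the actual Lucene code): `compareAndSet(true, false)` both observes and clears the cancellation flag, which is why a later check would see the collector as "uncancelled" again.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Minimal model of the cancellation flag under discussion. The names are
// hypothetical; only the AtomicBoolean usage mirrors the reviewed code.
public class CancelFlagDemo {
    private final AtomicBoolean cancelled = new AtomicBoolean();

    public void cancel() {
        cancelled.set(true);
    }

    // Mirrors compareAndSet(true, false): reports cancellation once,
    // then resets the flag, so a subsequent call reports "not cancelled".
    public boolean checkAndReset() {
        return cancelled.compareAndSet(true, false);
    }

    // Alternative: observe the flag without clearing it, so the instance
    // stays cancelled until someone explicitly resets it.
    public boolean check() {
        return cancelled.get();
    }
}
```

   With `checkAndReset()`, one cancel call only affects the next check; with `check()`, cancellation is sticky, which is what one would expect unless the collector instance is deliberately meant to be reused.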

##
File path: 
solr/core/src/java/org/apache/solr/handler/component/ActiveTasksListComponent.java
##
@@ -0,0 +1,118 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.handler.component;
+
+import org.apache.solr.common.MapWriter;
+import org.apache.solr.common.util.NamedList;
+
+import java.io.IOException;
+import java.util.Iterator;
+import java.util.Map;
+
+import static java.util.Arrays.asList;
+import static org.apache.solr.common.util.Utils.fromJSONString;
+
+/** List the active tasks that can be cancelled */
+public class ActiveTasksListComponent extends SearchComponent {
+public static final String COMPONENT_NAME = "activetaskslistcomponent";
+
+private boolean shouldProcess;
+
+@Override
+public void prepare(ResponseBuilder rb) throws IOException {
+if (rb.isTaskListRequest()) {
+shouldProcess = true;
+}
+}
+
+@Override
+public void process(ResponseBuilder rb) {
+if (!shouldProcess) {
+return;
+}
+
+if (rb.getTaskStatusCheckUUID() != null) {
+boolean isActiveOnThisShard = 
rb.req.getCore().getCancellableQueryTracker().isQueryIdActive(rb.getTaskStatusCheckUUID());
+
+rb.rsp.add("taskStatus", isActiveOnThisShard);
+return;
+}
+
+rb.rsp.add("taskList", (MapWriter) ew -> {
Iterator<Map.Entry<String, String>> iterator = 
rb.req.getCore().getCancellableQueryTracker().getActiveQueriesGenerated();
+
+while (iterator.hasNext()) {
Map.Entry<String, String> entry = iterator.next();
+ew.put(entry.getKey(), entry.getValue()

[GitHub] [lucene-solr] mayya-sharipova commented on pull request #2186: LUCENE-9334 Consistency of field data structures

2021-03-08 Thread GitBox


mayya-sharipova commented on pull request #2186:
URL: https://github.com/apache/lucene-solr/pull/2186#issuecomment-793004798


   @jpountz  Hi Adrien. Thanks for your review of the PR: 
https://github.com/apache/lucene-solr/pull/2186. I will go through your review 
and address your comments. I wanted to ask about two assumptions I have, and 
check if we are ok with them: 
   
   1) A first doc with a field introduces FieldInfo for this field for the 
whole index, even if eventually this doc doesn't get indexed (e.g. if this doc 
has a too big stored field, the whole doc will be rolled back and deleted).  
   
   2) Doc values updates (`IndexWriter#updateDocValues`) are only applicable 
to fields that are indexed with doc values alone. This was not the case before; 
for example, we could previously update doc values for a field that was indexed 
with postings. 
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (SOLR-15226) org.apache.lucene.index.IndexNotFoundException: no segments* file found in LockValidatingDirectoryWrapper(NRTCachingDirectory

2021-03-08 Thread Charlene de Costa Chaves (Jira)
Charlene de Costa Chaves created SOLR-15226:
---

 Summary: org.apache.lucene.index.IndexNotFoundException: no 
segments* file found in LockValidatingDirectoryWrapper(NRTCachingDirectory
 Key: SOLR-15226
 URL: https://issues.apache.org/jira/browse/SOLR-15226
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: SolrCLI
Affects Versions: 6.1.1
 Environment: Production
Reporter: Charlene de Costa Chaves


We configured 3 cores in our SOLR environment. After a network failure, we are 
experiencing an error in one of the cores.

In the log file we find the following error:

sei-protocols: org.apache.solr.common.SolrException: 
org.apache.solr.common.SolrException: Error opening new searcher

null: org.apache.solr.common.SolrException: SolrCore 'sei-protocols' is not 
available due to init failure: Error opening new searcher

Caused by: org.apache.lucene.index.IndexNotFoundException: no segments* file 
found in LockValidatingDirectoryWrapper(NRTCachingDirectory(MMapDirectory@/aplic/indices/sei-protocols/content/index 
lockFactory=org.apache.lucene.store.NativeFSLockFactory@1435c5; maxCacheMB=48.0 
maxMergeSizeMB=4.0)): files: [_6qmw.fdt, _6qmw.fdx, _6qmw.fnm, _6qmw.nvd, 
_6qmw.nvm, _6qmw.si, _6qmw_3mz.liv, 
_6qmw_Lucene50_0qmw_Lucene50_0qmw_Lucene50_0q6

The index directory is not empty. We have already restarted SOLR and the server, 
and we are still seeing the error.

How can we solve this problem?
 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] jpountz commented on a change in pull request #2186: LUCENE-9334 Consistency of field data structures

2021-03-08 Thread GitBox


jpountz commented on a change in pull request #2186:
URL: https://github.com/apache/lucene-solr/pull/2186#discussion_r589531266



##
File path: lucene/core/src/java/org/apache/lucene/index/FieldInfo.java
##
@@ -111,6 +111,43 @@ public FieldInfo(
 
   /** Performs internal consistency checks. Always returns true (or throws 
IllegalStateException) */
   public boolean checkConsistency() {
+return checkOptionsCorrectness(

Review comment:
   The different name suggests that it's checking something different; I 
think it would be less confusing to use the same method name, since this does 
the same thing?

##
File path: lucene/core/src/java/org/apache/lucene/index/FieldInfo.java
##
@@ -130,127 +167,252 @@ public boolean checkConsistency() {
   }
 }
 
-if (pointDimensionCount < 0) {
+if (docValuesType == null) {
+  throw new IllegalStateException("DocValuesType must not be null (field: 
'" + name + "')");
+}
+if (dvGen != -1 && docValuesType == DocValuesType.NONE) {
   throw new IllegalStateException(
-  "pointDimensionCount must be >= 0; got " + pointDimensionCount);
+  "field '"
+  + name
+  + "' cannot have a docvalues update generation without having docvalues");
 }
 
+if (pointDimensionCount < 0) {
+  throw new IllegalStateException(
+  "pointDimensionCount must be >= 0; got "
+  + pointDimensionCount
+  + " (field: '"
+  + name
+  + "')");
+}
 if (pointIndexDimensionCount < 0) {
   throw new IllegalStateException(
-  "pointIndexDimensionCount must be >= 0; got " + pointIndexDimensionCount);
+  "pointIndexDimensionCount must be >= 0; got "
+  + pointIndexDimensionCount
+  + " (field: '"
+  + name
+  + "')");
 }
-
 if (pointNumBytes < 0) {
-  throw new IllegalStateException("pointNumBytes must be >= 0; got " + pointNumBytes);
+  throw new IllegalStateException(
+  "pointNumBytes must be >= 0; got " + pointNumBytes + " (field: '" + name + "')");
 }
 
 if (pointDimensionCount != 0 && pointNumBytes == 0) {
   throw new IllegalStateException(
-  "pointNumBytes must be > 0 when pointDimensionCount=" + pointDimensionCount);
+  "pointNumBytes must be > 0 when pointDimensionCount="
+  + pointDimensionCount
+  + " (field: '"
+  + name
+  + "')");
 }
-
 if (pointIndexDimensionCount != 0 && pointDimensionCount == 0) {
   throw new IllegalStateException(
-  "pointIndexDimensionCount must be 0 when pointDimensionCount=0");
+  "pointIndexDimensionCount must be 0 when pointDimensionCount=0"
+  + " (field: '"
+  + name
+  + "')");
 }
-
 if (pointNumBytes != 0 && pointDimensionCount == 0) {
   throw new IllegalStateException(
-  "pointDimensionCount must be > 0 when pointNumBytes=" + pointNumBytes);
+  "pointDimensionCount must be > 0 when pointNumBytes="
+  + pointNumBytes
+  + " (field: '"
+  + name
+  + "')");
 }
 
-if (dvGen != -1 && docValuesType == DocValuesType.NONE) {
+if (vectorSearchStrategy == null) {
   throw new IllegalStateException(
-  "field '"
-  + name
-  + "' cannot have a docvalues update generation without having docvalues");
+  "Vector search strategy must not be null (field: '" + name + "')");
 }
-
 if (vectorDimension < 0) {
-  throw new IllegalStateException("vectorDimension must be >=0; got " + vectorDimension);
+  throw new IllegalStateException(
+  "vectorDimension must be >=0; got " + vectorDimension + " (field: '" + name + "')");
 }
-
-if (vectorDimension == 0 && vectorSearchStrategy != VectorValues.SearchStrategy.NONE) {
   throw new IllegalStateException(
-  "vector search strategy must be NONE when dimension = 0; got " + vectorSearchStrategy);
+  "vector search strategy must be NONE when dimension = 0; got "
+  + vectorSearchStrategy
+  + " (field: '"
+  + name
+  + "')");
 }
-
 return true;
   }
 
-  // should only be called by FieldInfos#addOrUpdate
-  void update(
-  boolean storeTermVector,
-  boolean omitNorms,
-  boolean storePayloads,
-  IndexOptions indexOptions,
-  Map attributes,
-  int dimensionCount,
-  int indexDimensionCount,
-  int dimensionNumBytes) {
-if (indexOptions == null) {
-  throw new NullPointerException("IndexOptions must not be null (field: \"" + name + "\")");
-}
-// System.out.println("FI.update field=" + name + " indexed=" + indexed + " omitNorms=" +
-// omitNorms + " this.omitNorms=" + this.omitNorms);
-if (this.indexOptions != indexOptions) {

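The checks reviewed in the diff above all follow one pattern: each `IllegalStateException` message now carries the name of the offending field. A reduced sketch of that pattern; the class is illustrative, not Lucene's actual `FieldInfo`:

```java
// Minimal illustration of field-aware consistency checks: every failure
// message identifies the field so multi-field errors are easy to trace.
public class FieldInfoCheck {
    final String name;
    final int pointDimensionCount;
    final int pointNumBytes;

    FieldInfoCheck(String name, int pointDimensionCount, int pointNumBytes) {
        this.name = name;
        this.pointDimensionCount = pointDimensionCount;
        this.pointNumBytes = pointNumBytes;
    }

    // Returns true, or throws with the field name embedded in the message.
    boolean checkConsistency() {
        if (pointDimensionCount < 0) {
            throw new IllegalStateException(
                "pointDimensionCount must be >= 0; got " + pointDimensionCount
                    + " (field: '" + name + "')");
        }
        if (pointDimensionCount != 0 && pointNumBytes == 0) {
            throw new IllegalStateException(
                "pointNumBytes must be > 0 when pointDimensionCount="
                    + pointDimensionCount + " (field: '" + name + "')");
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(new FieldInfoCheck("title", 2, 8).checkConsistency()); // true
        try {
            new FieldInfoCheck("broken", -1, 0).checkConsistency();
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```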
[jira] [Commented] (SOLR-15225) standardise test class naming

2021-03-08 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-15225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17297627#comment-17297627
 ] 

David Smiley commented on SOLR-15225:
-

It appears [~markrmil...@gmail.com] is finally/actually working on the 
review-ability of the reference branch.  At least, I've been chatting with him 
on Slack lately on this subject.  Upon success of that (even partial success if 
we want bits/pieces), it will be a nightmare if tests are renamed.  Let's give 
this some more time, please?  We can at least bikeshed what the standard names 
should be :)  

> standardise test class naming
> -
>
> Key: SOLR-15225
> URL: https://issues.apache.org/jira/browse/SOLR-15225
> Project: Solr
>  Issue Type: Test
>Reporter: Christine Poerschke
>Priority: Major
>
> LUCENE-8626 started out as standardisation effort for both Lucene and Solr 
> test.
> The standardisation for Lucene tests ({{org.apache.lucene}} package space) is 
> now complete and enforced.
> This SOLR ticket here is for the standardisation of Solr test class names.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-8626) standardise test class naming

2021-03-08 Thread Christine Poerschke (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christine Poerschke resolved LUCENE-8626.
-
Fix Version/s: master (9.0)
   Resolution: Fixed

Thanks everyone for making Lucene test name standardisation happen!

SOLR-15225 opened for standardisation of Solr test class names.

> standardise test class naming
> -
>
> Key: LUCENE-8626
> URL: https://issues.apache.org/jira/browse/LUCENE-8626
> Project: Lucene - Core
>  Issue Type: Test
>Reporter: Christine Poerschke
>Priority: Major
> Fix For: master (9.0)
>
> Attachments: SOLR-12939.01.patch, SOLR-12939.02.patch, 
> SOLR-12939.03.patch, SOLR-12939_hoss_validation_groovy_experiment.patch
>
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> This was mentioned and proposed on the dev mailing list. Starting this ticket 
> here to start to make it happen?
> History: This ticket was created as 
> https://issues.apache.org/jira/browse/SOLR-12939 ticket and then got 
> JIRA-moved to become https://issues.apache.org/jira/browse/LUCENE-8626 ticket.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (SOLR-15225) standardise test class naming

2021-03-08 Thread Christine Poerschke (Jira)
Christine Poerschke created SOLR-15225:
--

 Summary: standardise test class naming
 Key: SOLR-15225
 URL: https://issues.apache.org/jira/browse/SOLR-15225
 Project: Solr
  Issue Type: Test
Reporter: Christine Poerschke


LUCENE-8626 started out as standardisation effort for both Lucene and Solr test.



The standardisation for Lucene tests ({{org.apache.lucene}} package space) is 
now complete and enforced.


This SOLR ticket here is for the standardisation of Solr test class names.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] jtibshirani commented on a change in pull request #2444: LUCENE-9705: Create Lucene90StoredFieldsFormat

2021-03-08 Thread GitBox


jtibshirani commented on a change in pull request #2444:
URL: https://github.com/apache/lucene-solr/pull/2444#discussion_r589618706



##
File path: lucene/backward-codecs/src/java/org/apache/lucene/backward_codecs/lucene50/compressing/Lucene50CompressingStoredFieldsReader.java
##
@@ -654,15 +669,7 @@ public void readBytes(byte[] b, int offset, int len) throws IOException {
 
   @Override
   public void skipBytes(long numBytes) throws IOException {
-if (numBytes < 0) {
-  throw new IllegalArgumentException("numBytes must be >= 0, got " + numBytes);
-}
-while (numBytes > bytes.length) {
-  numBytes -= bytes.length;
-  fillBuffer();
-}
-bytes.offset += numBytes;
-bytes.length -= numBytes;
+skipBytesSlowly(numBytes);

Review comment:
   Thanks for clarifying.
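The block removed in this hunk skipped bytes by draining whole buffers until less than one buffer's worth remained, then advancing within the current buffer. A standalone sketch of that logic over a plain `InputStream`; the class and `fillBuffer` are illustrative, mirroring the diff rather than Lucene's actual reader:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class BufferedSkip {
    private final InputStream in;
    private final byte[] buf = new byte[4];
    private int offset;  // index of the next unread byte in buf
    private int length;  // unread bytes remaining in buf

    BufferedSkip(InputStream in) throws IOException {
        this.in = in;
        fillBuffer();
    }

    private void fillBuffer() throws IOException {
        int n = in.read(buf, 0, buf.length);
        if (n < 0) throw new IOException("read past EOF");
        offset = 0;
        length = n;
    }

    int readByte() throws IOException {
        if (length == 0) fillBuffer();
        length--;
        return buf[offset++] & 0xFF;
    }

    void skipBytes(long numBytes) throws IOException {
        if (numBytes < 0) {
            throw new IllegalArgumentException("numBytes must be >= 0, got " + numBytes);
        }
        // Drain whole buffers while the skip exceeds what is buffered...
        while (numBytes > length) {
            numBytes -= length;
            fillBuffer();
        }
        // ...then advance within the current buffer.
        offset += (int) numBytes;
        length -= (int) numBytes;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[20];
        for (int i = 0; i < data.length; i++) data[i] = (byte) i;
        BufferedSkip r = new BufferedSkip(new ByteArrayInputStream(data));
        r.skipBytes(10);
        System.out.println(r.readByte());  // 10
    }
}
```

The refactoring in the PR replaces this inline loop with a shared `skipBytesSlowly` helper without changing the observable behavior.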





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr-operator] HoustonPutman commented on pull request #231: Add conditional dependency for zk-operator helm chart

2021-03-08 Thread GitBox


HoustonPutman commented on pull request #231:
URL: https://github.com/apache/lucene-solr-operator/pull/231#issuecomment-792876296


   This is starting to look very good! I'll give it a test locally soon, to 
make sure that there are no issues with the Zookeeper Cluster that is created 
in the new version of the zookeeper-operator.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr-operator] HoustonPutman commented on a change in pull request #231: Add conditional dependency for zk-operator helm chart

2021-03-08 Thread GitBox


HoustonPutman commented on a change in pull request #231:
URL: https://github.com/apache/lucene-solr-operator/pull/231#discussion_r589558078



##
File path: helm/solr-operator/Chart.yaml
##
@@ -95,4 +95,10 @@ annotations:
 name: "example"
 numThreads: 4
 image:
-  tag: 8.7.0
\ No newline at end of file
+  tag: 8.7.0
+
+dependencies:
+  - name: 'zookeeper-operator'
+version: 0.2.9
+repository: https://charts.pravega.io
+condition: useZkOperator

Review comment:
   [The helm docs](https://helm.sh/docs/chart_best_practices/dependencies/#conditions-and-tags) give some good guidance. I think this logic is very good in general. We should
just use the name `zookeeper-operator` instead of `zookeeperOperator`, since 
that's the name of the dependency.
   
   We should also think about maybe including backwards compatibility of the 
old `useZookeeperOperator` option, which was unfortunately a string `"true"`.
   
   I can take a stab at updating the documentation around this.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-15167) Move lucene-solr-operator repo

2021-03-08 Thread Houston Putman (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-15167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17297458#comment-17297458
 ] 

Houston Putman commented on SOLR-15167:
---

I'm planning on doing the move on Friday March 12th.

The helm charts are now at a stable location: https://solr.apache.org/charts

> Move lucene-solr-operator repo
> --
>
> Key: SOLR-15167
> URL: https://issues.apache.org/jira/browse/SOLR-15167
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Jan Høydahl
>Assignee: Houston Putman
>Priority: Major
>
> The lucene-solr-operator repo will be (once again) moved, now to 
> "solr-operator". This must be ordered through an INFRA ticket.
> As this will once again break the URL for helm, that also needs an update in 
> relevant places.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8626) standardise test class naming

2021-03-08 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17297457#comment-17297457
 ] 

ASF subversion and git services commented on LUCENE-8626:
-

Commit 419db2304113fd97d79a8148dbdbb492ad879eb6 in lucene-solr's branch 
refs/heads/master from Christine Poerschke
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=419db23 ]

LUCENE-8626: enforce name standardisation for org.apache.lucene tests (#2441)

Co-authored-by: Dawid Weiss 

> standardise test class naming
> -
>
> Key: LUCENE-8626
> URL: https://issues.apache.org/jira/browse/LUCENE-8626
> Project: Lucene - Core
>  Issue Type: Test
>Reporter: Christine Poerschke
>Priority: Major
> Attachments: SOLR-12939.01.patch, SOLR-12939.02.patch, 
> SOLR-12939.03.patch, SOLR-12939_hoss_validation_groovy_experiment.patch
>
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> This was mentioned and proposed on the dev mailing list. Starting this ticket 
> here to start to make it happen?
> History: This ticket was created as 
> https://issues.apache.org/jira/browse/SOLR-12939 ticket and then got 
> JIRA-moved to become https://issues.apache.org/jira/browse/LUCENE-8626 ticket.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] cpoerschke merged pull request #2441: LUCENE-8626: enforce name standardisation for org.apache.lucene tests

2021-03-08 Thread GitBox


cpoerschke merged pull request #2441:
URL: https://github.com/apache/lucene-solr/pull/2441


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-15163) Update DOAP for Solr

2021-03-08 Thread Jira


[ 
https://issues.apache.org/jira/browse/SOLR-15163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17297423#comment-17297423
 ] 

Jan Høydahl commented on SOLR-15163:


DOAP updated, will check tomorrow that the projects.apache.org is updated 
accordingly and that things look sane, before resolving.

> Update DOAP for Solr
> 
>
> Key: SOLR-15163
> URL: https://issues.apache.org/jira/browse/SOLR-15163
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Anshum Gupta
>Assignee: Jan Høydahl
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently two projects exist at projects.apache.org.
> 1. https://projects.apache.org/project.html?solr (managed by Solr)
> 2. https://projects.apache.org/project.html?lucene-solr (Managed by Lucene)
> We need to merge and/or delete the Solr project listed at #2 into #1.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-15163) Update DOAP for Solr

2021-03-08 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-15163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17297422#comment-17297422
 ] 

ASF subversion and git services commented on SOLR-15163:


Commit 605d3a00bbdaa5f0b1726b6452ce2a105f8acd50 in lucene-solr's branch 
refs/heads/master from Jan Høydahl
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=605d3a0 ]

SOLR-15163 Update DOAP file for solr TLP (#2464)



> Update DOAP for Solr
> 
>
> Key: SOLR-15163
> URL: https://issues.apache.org/jira/browse/SOLR-15163
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Anshum Gupta
>Assignee: Jan Høydahl
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently two projects exist at projects.apache.org.
> 1. https://projects.apache.org/project.html?solr (managed by Solr)
> 2. https://projects.apache.org/project.html?lucene-solr (Managed by Lucene)
> We need to merge and/or delete the Solr project listed at #2 into #1.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] janhoy merged pull request #2464: SOLR-15163 Update DOAP file for solr TLP

2021-03-08 Thread GitBox


janhoy merged pull request #2464:
URL: https://github.com/apache/lucene-solr/pull/2464


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] janhoy commented on pull request #2464: SOLR-15163 Update DOAP file for solr TLP

2021-03-08 Thread GitBox


janhoy commented on pull request #2464:
URL: https://github.com/apache/lucene-solr/pull/2464#issuecomment-792804790


   Thanks. I'll push it and tomorrow we'll know if the projects.apache.org site 
catches up and removes solr project from underneath Lucene.
   
   Once the git split is done we must update some svn location to point to the 
DOAP file in solr.git, but for now the DOAP file lives in lucene-solr.git so it 
should work well.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] cpoerschke commented on pull request #2465: SOLR-15224: delete TestXmlQParser class

2021-03-08 Thread GitBox


cpoerschke commented on pull request #2465:
URL: https://github.com/apache/lucene-solr/pull/2465#issuecomment-792799787


   Removed as part of #2448 instead.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] cpoerschke closed pull request #2465: SOLR-15224: delete TestXmlQParser class

2021-03-08 Thread GitBox


cpoerschke closed pull request #2465:
URL: https://github.com/apache/lucene-solr/pull/2465


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-site] mocobeta commented on pull request #49: Retire the general list from website

2021-03-08 Thread GitBox


mocobeta commented on pull request #49:
URL: https://github.com/apache/lucene-site/pull/49#issuecomment-792781414


   > Also remove out of date link to Wiki for archives - replaced with PonyMail
   
   +1!



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14759) Separate the Lucene and Solr builds

2021-03-08 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17297376#comment-17297376
 ] 

ASF subversion and git services commented on SOLR-14759:


Commit b591daad38f729825434b26a0ac390a7b3d3d9c2 in lucene-solr's branch 
refs/heads/master from Dawid Weiss
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=b591daa ]

SOLR-14759: correct build logic.


> Separate the Lucene and Solr builds
> ---
>
> Key: SOLR-14759
> URL: https://issues.apache.org/jira/browse/SOLR-14759
> Project: Solr
>  Issue Type: Sub-task
>  Components: Build
>Reporter: Jan Høydahl
>Assignee: Dawid Weiss
>Priority: Major
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> While still in same git repo, separate the builds, so Lucene and Solr can be 
> built independently.
> The preparation step includes optional building of just Lucene from current 
> master (prior to any code removal):
> Current status of joint and separate builds:
>  * (/) joint build
> {code}
> gradlew assemble check
> {code}
>  * (/) Lucene-only
> {code}
> gradlew -Dskip.solr=true assemble check
> {code}
>  * (/) Solr-only (with minor documentation validation exclusions)
> {code}
> gradlew -Dskip.lucene=true assemble check -x test
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8626) standardise test class naming

2021-03-08 Thread Christine Poerschke (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17297375#comment-17297375
 ] 

Christine Poerschke commented on LUCENE-8626:
-

[https://github.com/apache/lucene-solr/pull/2441] scoped as _"enforce name 
standardisation for org.apache.lucene tests"_ is available for review.

> standardise test class naming
> -
>
> Key: LUCENE-8626
> URL: https://issues.apache.org/jira/browse/LUCENE-8626
> Project: Lucene - Core
>  Issue Type: Test
>Reporter: Christine Poerschke
>Priority: Major
> Attachments: SOLR-12939.01.patch, SOLR-12939.02.patch, 
> SOLR-12939.03.patch, SOLR-12939_hoss_validation_groovy_experiment.patch
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> This was mentioned and proposed on the dev mailing list. Starting this ticket 
> here to start to make it happen?
> History: This ticket was created as 
> https://issues.apache.org/jira/browse/SOLR-12939 ticket and then got 
> JIRA-moved to become https://issues.apache.org/jira/browse/LUCENE-8626 ticket.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14759) Separate the Lucene and Solr builds

2021-03-08 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17297370#comment-17297370
 ] 

ASF subversion and git services commented on SOLR-14759:


Commit 409bc37c138eb081e62fc1c1862f58dd7873abff in lucene-solr's branch 
refs/heads/master from Dawid Weiss
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=409bc37 ]

SOLR-14759: a few initial changes so that Lucene can be built independently 
while Solr code is still in place. (#2448)



> Separate the Lucene and Solr builds
> ---
>
> Key: SOLR-14759
> URL: https://issues.apache.org/jira/browse/SOLR-14759
> Project: Solr
>  Issue Type: Sub-task
>  Components: Build
>Reporter: Jan Høydahl
>Assignee: Dawid Weiss
>Priority: Major
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> While still in same git repo, separate the builds, so Lucene and Solr can be 
> built independently.
> The preparation step includes optional building of just Lucene from current 
> master (prior to any code removal):
> Current status of joint and separate builds:
>  * (/) joint build
> {code}
> gradlew assemble check
> {code}
>  * (/) Lucene-only
> {code}
> gradlew -Dskip.solr=true assemble check
> {code}
>  * (/) Solr-only (with minor documentation validation exclusions)
> {code}
> gradlew -Dskip.lucene=true assemble check -x test
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dweiss merged pull request #2448: SOLR-14759: a few initial changes so that Lucene can be built independently while Solr code is still in place.

2021-03-08 Thread GitBox


dweiss merged pull request #2448:
URL: https://github.com/apache/lucene-solr/pull/2448


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dweiss commented on pull request #2462: SOLR-14759 fix tests that need on lucene test-src

2021-03-08 Thread GitBox


dweiss commented on pull request #2462:
URL: https://github.com/apache/lucene-solr/pull/2462#issuecomment-792772167


   I'll remove it in this PR:
   https://github.com/apache/lucene-solr/pull/2448
   
   On Mon, Mar 8, 2021 at 2:53 PM Christine Poerschke 
   wrote:
   
   > *@cpoerschke* commented on this pull request.
   > --
   >
   > In solr/core/src/test/org/apache/solr/search/TestXmlQParser.java
   > :
   >
   > >  import org.slf4j.Logger;
   >  import org.slf4j.LoggerFactory;
   >
   > -
   > -public class TestXmlQParser extends TestCoreParser {
   > +@Ignore("Was relying on Lucene test sources. Should copy?")
   >
   > Just opened #2465  for
   > the removal before seeing your comment here (but the JIRA ticket).
   >
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] cpoerschke commented on a change in pull request #2462: SOLR-14759 fix tests that need on lucene test-src

2021-03-08 Thread GitBox


cpoerschke commented on a change in pull request #2462:
URL: https://github.com/apache/lucene-solr/pull/2462#discussion_r589436382



##
File path: solr/core/src/test/org/apache/solr/search/TestXmlQParser.java
##
@@ -18,20 +18,23 @@
 
 import java.lang.invoke.MethodHandles;
 
+import org.apache.lucene.analysis.MockAnalyzer;
+import org.apache.lucene.analysis.MockTokenFilter;
+import org.apache.lucene.analysis.MockTokenizer;
 import org.apache.lucene.queryparser.xml.CoreParser;
 
-import org.apache.lucene.queryparser.xml.TestCoreParser;
-
+import org.apache.solr.SolrTestCase;
 import org.apache.solr.util.StartupLoggingUtils;
 import org.apache.solr.util.TestHarness;
 
 import org.junit.AfterClass;
 import org.junit.BeforeClass;
+import org.junit.Ignore;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
-
-public class TestXmlQParser extends TestCoreParser {
+@Ignore("Was relying on Lucene test sources. Should copy?")

Review comment:
   Just opened #2465 for the removal before seeing your comment here (but 
the JIRA ticket).





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dweiss commented on a change in pull request #2462: SOLR-14759 fix tests that need on lucene test-src

2021-03-08 Thread GitBox


dweiss commented on a change in pull request #2462:
URL: https://github.com/apache/lucene-solr/pull/2462#discussion_r589435996



##
File path: solr/core/src/test/org/apache/solr/search/TestXmlQParser.java
##
@@ -18,20 +18,23 @@
 
 import java.lang.invoke.MethodHandles;
 
+import org.apache.lucene.analysis.MockAnalyzer;
+import org.apache.lucene.analysis.MockTokenFilter;
+import org.apache.lucene.analysis.MockTokenizer;
 import org.apache.lucene.queryparser.xml.CoreParser;
 
-import org.apache.lucene.queryparser.xml.TestCoreParser;
-
+import org.apache.solr.SolrTestCase;
 import org.apache.solr.util.StartupLoggingUtils;
 import org.apache.solr.util.TestHarness;
 
 import org.junit.AfterClass;
 import org.junit.BeforeClass;
+import org.junit.Ignore;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
-
-public class TestXmlQParser extends TestCoreParser {
+@Ignore("Was relying on Lucene test sources. Should copy?")

Review comment:
   Thanks Christine. I'll just remove it.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-15224) Resolve the two ignored tests after Lucene test sources become unavailable

2021-03-08 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-15224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17297363#comment-17297363
 ] 

ASF subversion and git services commented on SOLR-15224:


Commit 9112b723fe01ac78db0750f2f268cf1ef9181b0c in lucene-solr's branch 
refs/heads/SOLR-15224-TestXmlQParser from Christine Poerschke
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=9112b72 ]

SOLR-15224: delete TestXmlQParser class

> Resolve the two ignored tests after Lucene test sources become unavailable
> --
>
> Key: SOLR-15224
> URL: https://issues.apache.org/jira/browse/SOLR-15224
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Dawid Weiss
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] cpoerschke opened a new pull request #2465: SOLR-15224: delete TestXmlQParser class

2021-03-08 Thread GitBox


cpoerschke opened a new pull request #2465:
URL: https://github.com/apache/lucene-solr/pull/2465


   https://issues.apache.org/jira/browse/SOLR-15224



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13434) OpenTracing support for Solr

2021-03-08 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17297362#comment-17297362
 ] 

David Smiley commented on SOLR-13434:
-

Why is GlobalTracer.get().close() called from SolrDispatchFilter.close instead 
of CoreContainer.shutdown()?  After all, CC _creates_ the tracer so it is most 
appropriate that it manage the life-cycle.
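The life-cycle rule implied by this comment (the component that creates a resource should be the one that closes it) can be sketched as follows; the nested classes are illustrative stand-ins, not Solr's actual `CoreContainer` or tracer API:

```java
import java.io.Closeable;

public class LifecycleDemo {
    // Stand-in for a tracer resource with an explicit close().
    static class DemoTracer implements Closeable {
        boolean closed;
        @Override public void close() { closed = true; }
    }

    // The container creates the tracer, so it also owns the shutdown.
    static class DemoCoreContainer implements Closeable {
        final DemoTracer tracer = new DemoTracer();  // created here...

        @Override public void close() {
            tracer.close();  // ...so closed here, not by an outer dispatch layer
        }
    }

    public static void main(String[] args) {
        DemoCoreContainer cc = new DemoCoreContainer();
        cc.close();
        System.out.println(cc.tracer.closed);  // true
    }
}
```

Keeping creation and disposal in the same class avoids the situation the comment describes, where a separately owned component must remember to close a resource it never created.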

> OpenTracing support for Solr
> 
>
> Key: SOLR-13434
> URL: https://issues.apache.org/jira/browse/SOLR-13434
> Project: Solr
>  Issue Type: New Feature
>Reporter: Shalin Shekhar Mangar
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: 8.2, master (9.0)
>
> Attachments: SOLR-13434.patch
>
>  Time Spent: 7h 40m
>  Remaining Estimate: 0h
>
> [OpenTracing|https://opentracing.io/] is a vendor neutral API and 
> infrastructure for distributed tracing. Many OSS tracers just as Jaeger, 
> OpenZipkin, Apache SkyWalking as well as commercial tools support OpenTracing 
> APIs. Ideally, we can implement it once and have integrations for popular 
> tracers like we have with metrics and prometheus.
> I'm aware of SOLR-9641 but HTrace has since retired from incubator for lack 
> of activity so this is a fresh attempt at solving this problem.






[GitHub] [lucene-solr] cpoerschke commented on a change in pull request #2462: SOLR-14759 fix tests that need on lucene test-src

2021-03-08 Thread GitBox


cpoerschke commented on a change in pull request #2462:
URL: https://github.com/apache/lucene-solr/pull/2462#discussion_r589433676



##
File path: solr/core/src/test/org/apache/solr/search/TestXmlQParser.java
##
@@ -18,20 +18,23 @@
 
 import java.lang.invoke.MethodHandles;
 
+import org.apache.lucene.analysis.MockAnalyzer;
+import org.apache.lucene.analysis.MockTokenFilter;
+import org.apache.lucene.analysis.MockTokenizer;
 import org.apache.lucene.queryparser.xml.CoreParser;
 
-import org.apache.lucene.queryparser.xml.TestCoreParser;
-
+import org.apache.solr.SolrTestCase;
 import org.apache.solr.util.StartupLoggingUtils;
 import org.apache.solr.util.TestHarness;
 
 import org.junit.AfterClass;
 import org.junit.BeforeClass;
+import org.junit.Ignore;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
-
-public class TestXmlQParser extends TestCoreParser {
+@Ignore("Was relying on Lucene test sources. Should copy?")

Review comment:
   SolrCoreParser extends the (Lucene) CoreParser, and the (Solr) 
XmlQParser[Plugin] uses SolrCoreParser. The extension and usage currently 
exist only to support user customisation of the classes via configuration; 
hence the `testSomeOtherQuery()` below is commented out and the 
`TestXmlQParser` test here doesn't test anything beyond what it inherits from 
the `TestCoreParser` test.
   
   I think it's okay to just delete the `TestXmlQParser` test.
   
   code links:
   
   * 
https://github.com/apache/lucene-solr/blob/releases/lucene-solr%2F8.8.1/lucene/queryparser/src/java/org/apache/lucene/queryparser/xml/CoreParser.java
   * 
https://github.com/apache/lucene-solr/blob/releases/lucene-solr%2F8.8.1/lucene/queryparser/src/test/org/apache/lucene/queryparser/xml/TestCoreParser.java
   
   * 
https://github.com/apache/lucene-solr/blob/releases/lucene-solr%2F8.8.1/solr/core/src/java/org/apache/solr/search/SolrCoreParser.java
   * 
https://github.com/apache/lucene-solr/blob/releases/lucene-solr%2F8.8.1/solr/core/src/test/org/apache/solr/search/TestSolrCoreParser.java
   
   * 
https://github.com/apache/lucene-solr/blob/releases/lucene-solr%2F8.8.1/solr/core/src/java/org/apache/solr/search/XmlQParserPlugin.java
   * 
https://github.com/apache/lucene-solr/blob/releases/lucene-solr%2F8.8.1/solr/core/src/test/org/apache/solr/search/TestXmlQParserPlugin.java
   








[jira] [Created] (SOLR-15224) Resolve the two ignored tests after Lucene test sources become unavailable

2021-03-08 Thread Dawid Weiss (Jira)
Dawid Weiss created SOLR-15224:
--

 Summary: Resolve the two ignored tests after Lucene test sources 
become unavailable
 Key: SOLR-15224
 URL: https://issues.apache.org/jira/browse/SOLR-15224
 Project: Solr
  Issue Type: Task
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Dawid Weiss









[jira] [Commented] (SOLR-14759) Separate the Lucene and Solr builds

2021-03-08 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17297348#comment-17297348
 ] 

ASF subversion and git services commented on SOLR-14759:


Commit 408b3775ddc204f9e8e27f7fc591e95d584308d4 in lucene-solr's branch 
refs/heads/master from Mike Drob
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=408b377 ]

SOLR-14759 fix tests that need on lucene test-src (#2462)

Rewrite one, ignore the other two.

> Separate the Lucene and Solr builds
> ---
>
> Key: SOLR-14759
> URL: https://issues.apache.org/jira/browse/SOLR-14759
> Project: Solr
>  Issue Type: Sub-task
>  Components: Build
>Reporter: Jan Høydahl
>Assignee: Dawid Weiss
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> While still in same git repo, separate the builds, so Lucene and Solr can be 
> built independently.
> The preparation step includes optional building of just Lucene from current 
> master (prior to any code removal):
> Current status of joint and separate builds:
>  * (/) joint build
> {code}
> gradlew assemble check
> {code}
>  * (/) Lucene-only
> {code}
> gradlew -Dskip.solr=true assemble check
> {code}
>  * (/) Solr-only (with minor documentation validation exclusions)
> {code}
> gradlew -Dskip.lucene=true assemble check -x test
> {code}






[GitHub] [lucene-solr] dweiss merged pull request #2462: SOLR-14759 fix tests that need on lucene test-src

2021-03-08 Thread GitBox


dweiss merged pull request #2462:
URL: https://github.com/apache/lucene-solr/pull/2462


   






[GitHub] [lucene-site] janhoy opened a new pull request #50: Edits related to switching from master to main branch

2021-03-08 Thread GitBox


janhoy opened a new pull request #50:
URL: https://github.com/apache/lucene-site/pull/50


   We need edits to README and site-instructions.
   
   Also `asf.yaml` needs to refer to 'main' as whoami, so the site will be 
built from this branch.
   
   NB: Don't merge this PR to main branch until we are ready to do the switch 
(i.e. INFRA has changed default branch?), else a site build will kick off and 
change the site prematurely.






[jira] [Updated] (SOLR-14759) Separate the Lucene and Solr builds

2021-03-08 Thread Dawid Weiss (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss updated SOLR-14759:
---
Description: 
While still in same git repo, separate the builds, so Lucene and Solr can be 
built independently.

The preparation step includes optional building of just Lucene from current 
master (prior to any code removal):

Current status of joint and separate builds:
 * (/) joint build
{code}
gradlew assemble check
{code}
 * (/) Lucene-only
{code}
gradlew -Dskip.solr=true assemble check
{code}
 * (/) Solr-only (with minor documentation validation exclusions)
{code}
gradlew -Dskip.lucene=true assemble check -x test
{code}

  was:
While still in same git repo, separate the builds, so Lucene and Solr can be 
built independently.

The preparation step includes optional building of just Lucene from current 
master (prior to any code removal):

Current status of joint and separate builds:
 * (/) joint build
{code}
gradlew assemble check
{code}
 * (/) Lucene-only
{code}
gradlew -Dskip.solr=true assemble check
{code}
 * (/) Solr-only (with minor documentation validation exclusions)
{code}
gradlew -p solr  -Dskip.lucene=true check -x test
{code}


> Separate the Lucene and Solr builds
> ---
>
> Key: SOLR-14759
> URL: https://issues.apache.org/jira/browse/SOLR-14759
> Project: Solr
>  Issue Type: Sub-task
>  Components: Build
>Reporter: Jan Høydahl
>Assignee: Dawid Weiss
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> While still in same git repo, separate the builds, so Lucene and Solr can be 
> built independently.
> The preparation step includes optional building of just Lucene from current 
> master (prior to any code removal):
> Current status of joint and separate builds:
>  * (/) joint build
> {code}
> gradlew assemble check
> {code}
>  * (/) Lucene-only
> {code}
> gradlew -Dskip.solr=true assemble check
> {code}
>  * (/) Solr-only (with minor documentation validation exclusions)
> {code}
> gradlew -Dskip.lucene=true assemble check -x test
> {code}






[jira] [Updated] (SOLR-14759) Separate the Lucene and Solr builds

2021-03-08 Thread Dawid Weiss (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss updated SOLR-14759:
---
Description: 
While still in same git repo, separate the builds, so Lucene and Solr can be 
built independently.

The preparation step includes optional building of just Lucene from current 
master (prior to any code removal):

Current status of joint and separate builds:
 * (/) joint build
{code}
gradlew assemble check
{code}
 * (/) Lucene-only
{code}
gradlew -Dskip.solr=true assemble check
{code}
 * (/) Solr-only (with minor documentation validation exclusions)
{code}
gradlew -p solr  -Dskip.lucene=true check -x test
{code}

  was:
While still in same git repo, separate the builds, so Lucene and Solr can be 
built independently.

The preparation step includes optional building of just Lucene from current 
master (prior to any code removal):

Current status of joint and separate builds:
 * (/) joint build
{code}
gradlew assemble check
{code}
 * (/) Lucene-only
{code}
gradlew -Dskip.solr=true assemble check
{code}
 * (/) Solr-only (with documentation exclusions)
{code}
gradlew -Dskip.lucene=true assemble check -x test -x checkBrokenLinks -x 
checkLocalJavadocLinksSite
{code}


> Separate the Lucene and Solr builds
> ---
>
> Key: SOLR-14759
> URL: https://issues.apache.org/jira/browse/SOLR-14759
> Project: Solr
>  Issue Type: Sub-task
>  Components: Build
>Reporter: Jan Høydahl
>Assignee: Dawid Weiss
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> While still in same git repo, separate the builds, so Lucene and Solr can be 
> built independently.
> The preparation step includes optional building of just Lucene from current 
> master (prior to any code removal):
> Current status of joint and separate builds:
>  * (/) joint build
> {code}
> gradlew assemble check
> {code}
>  * (/) Lucene-only
> {code}
> gradlew -Dskip.solr=true assemble check
> {code}
>  * (/) Solr-only (with minor documentation validation exclusions)
> {code}
> gradlew -p solr  -Dskip.lucene=true check -x test
> {code}






[jira] [Commented] (LUCENE-8626) standardise test class naming

2021-03-08 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17297341#comment-17297341
 ] 

ASF subversion and git services commented on LUCENE-8626:
-

Commit d53b3da0eaad67130a31e6cf6dc3712dc05e22c1 in lucene-solr's branch 
refs/heads/master from Christine Poerschke
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=d53b3da ]

LUCENE-8626: standardise 3 more Lucene test names (#2440)



> standardise test class naming
> -
>
> Key: LUCENE-8626
> URL: https://issues.apache.org/jira/browse/LUCENE-8626
> Project: Lucene - Core
>  Issue Type: Test
>Reporter: Christine Poerschke
>Priority: Major
> Attachments: SOLR-12939.01.patch, SOLR-12939.02.patch, 
> SOLR-12939.03.patch, SOLR-12939_hoss_validation_groovy_experiment.patch
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> This was mentioned and proposed on the dev mailing list. Starting this ticket 
> here to start to make it happen?
> History: This ticket was created as 
> https://issues.apache.org/jira/browse/SOLR-12939 ticket and then got 
> JIRA-moved to become https://issues.apache.org/jira/browse/LUCENE-8626 ticket.






[GitHub] [lucene-solr] cpoerschke merged pull request #2440: LUCENE-8626: standardise 3 more Lucene test names

2021-03-08 Thread GitBox


cpoerschke merged pull request #2440:
URL: https://github.com/apache/lucene-solr/pull/2440


   






[GitHub] [lucene-site] janhoy opened a new pull request #49: Retire the general list from website

2021-03-08 Thread GitBox


janhoy opened a new pull request #49:
URL: https://github.com/apache/lucene-site/pull/49


   Also removes an out-of-date link to the Wiki for archives, replacing it with PonyMail
   






[jira] [Created] (LUCENE-9826) Move lucene-site to using "main" branch

2021-03-08 Thread Jira
Jan Høydahl created LUCENE-9826:
---

 Summary: Move lucene-site to using "main" branch
 Key: LUCENE-9826
 URL: https://issues.apache.org/jira/browse/LUCENE-9826
 Project: Lucene - Core
  Issue Type: Task
Reporter: Jan Høydahl
Assignee: Jan Høydahl


Now with the Solr site spun off, let's retire the "master" branch and start 
using "main" instead.

I'll open an INFRA Jira to change default git branch, and a PR to change the 
pelican build.






[GitHub] [lucene-solr] donnerpeter commented on a change in pull request #2459: LUCENE-9825: Hunspell: reverse the "words" trie for faster word lookup/suggestions

2021-03-08 Thread GitBox


donnerpeter commented on a change in pull request #2459:
URL: https://github.com/apache/lucene-solr/pull/2459#discussion_r589317275



##
File path: 
lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/WordStorage.java
##
@@ -0,0 +1,338 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.analysis.hunspell;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.function.BiConsumer;
+import org.apache.lucene.store.ByteArrayDataInput;
+import org.apache.lucene.store.ByteArrayDataOutput;
+import org.apache.lucene.store.DataOutput;
+import org.apache.lucene.util.ArrayUtil;
+import org.apache.lucene.util.CharsRef;
+import org.apache.lucene.util.IntsRef;
+import org.apache.lucene.util.IntsRefBuilder;
+import org.apache.lucene.util.fst.IntSequenceOutputs;
+
+/**
+ * A data structure for memory-efficient word storage and fast 
lookup/enumeration. Each dictionary
+ * entry is stored as:
+ *
+ * 
+ *   the last character
+ *   pointer to a similar entry for the prefix (all characters except the 
last one)
+ *   value data: a list of ints representing word flags and morphological 
data, and a pointer to
+ *   hash collisions, if any
+ * 
+ *
+ * There's only one entry for each prefix, so it's like a trie/{@link
+ * org.apache.lucene.util.fst.FST}, but a reversed one: each node points to a 
single previous node
+ * instead of several following ones. For example, "abc" and "abd" point to 
the same prefix entry
+ * "ab" which points to "a" which points to 0.
+ * 
+ * The entries are stored in a contiguous byte array, identified by their 
offsets, using {@link
+ * DataOutput#writeVInt(int) VINT} format for compression.
+ */
+class WordStorage {
+  /**
+   * A map from word's hash (modulo array's length) into the offset of the 
last entry in {@link
+   * #wordData} with this hash. Negated, if there's more than one entry with 
the same hash.
+   */
+  private final int[] hashTable;
+
+  /**
+   * An array of word entries:
+   *
+   * 
+   *   VINT: the word's last character
+   *   VINT: pointer to the entry for the same word without the last 
character. It's relative:
+   *   the difference of this entry's start and the prefix's entry start. 
0 for single-character
+   *   entries
+   *   Optional, for non-leaf entries only:
+   *   
+   * VINT: the length of the word form data, returned from {@link 
#lookupWord}
+   * n * VINT: the word form data
+   * Optional, for hash-colliding entries only:
+   * 
+   *   BYTE: 1 if the next collision entry has further 
collisions, 0 if it's the
+   *   last of the entries with the same hash
+   *   VINT: (relative) pointer to the previous entry with the 
same hash
+   * 
+   *   
+   * 
+   */
+  private final byte[] wordData;
+
+  private WordStorage(int[] hashTable, byte[] wordData) {
+this.hashTable = hashTable;
+this.wordData = wordData;
+  }
+
+  IntsRef lookupWord(char[] word, int offset, int length) {
+assert length > 0;
+
+int hash = Math.abs(CharsRef.stringHashCode(word, offset, length) % 
hashTable.length);
+int pos = hashTable[hash];
+if (pos == 0) {
+  return null;
+}
+
+boolean collision = pos < 0;
+pos = Math.abs(pos);
+
+char lastChar = word[offset + length - 1];
+ByteArrayDataInput in = new ByteArrayDataInput(wordData);
+while (true) {
+  in.setPosition(pos);
+  char c = (char) in.readVInt();
+  int prevPos = pos - in.readVInt();
+  int beforeForms = in.getPosition();
+  boolean found = c == lastChar && isSameString(word, offset, length - 1, 
prevPos, in);
+  if (!collision && !found) {
+return null;
+  }
+
+  in.setPosition(beforeForms);
+  int formLength = in.readVInt();
+  if (found) {
+IntsRef forms = new IntsRef(formLength);
+readForms(forms, in, formLength);
+return forms;
+  } else {
+skipVInts(in, formLength);
+  }
+
+  collision = in.readByte() == 1;
+  pos -= in.readVInt();
+}
+  }
+
+  private static void skipVInts(ByteArrayDataInput in, in
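
The reversed-trie layout described in the WordStorage javadoc quoted above can be sketched as follows. This is a hedged illustration in Python, not the actual Lucene implementation; the class and method names are invented for the sketch, and the real patch packs entries into a contiguous byte array rather than Python tuples.

```python
# Hedged sketch of the reversed-trie idea from the WordStorage javadoc:
# each entry stores only its last character plus a pointer to the entry
# for its prefix, so "abc" and "abd" share the "ab" entry, which shares
# "a", which points to 0 (the root).

class ReversedTrie:
    def __init__(self):
        self.entries = [(None, 0)]       # entry 0 is the root
        self.index = {}                  # (last_char, prefix_id) -> entry id

    def add(self, word):
        prev = 0                         # start at the root
        for ch in word:
            key = (ch, prev)
            if key not in self.index:    # one entry per distinct prefix
                self.index[key] = len(self.entries)
                self.entries.append(key)
            prev = self.index[key]
        return prev                      # id identifying the whole word

trie = ReversedTrie()
abc = trie.add("abc")
abd = trie.add("abd")
# Both words point back to the same "ab" entry.
assert trie.entries[abc][1] == trie.entries[abd][1]
```

In the actual patch the entries live in one contiguous byte array with VInt-compressed relative offsets, and a hash table (whose values are negated when a bucket has collisions) maps a word's hash to the offset of its last entry.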

[GitHub] [lucene-solr] donnerpeter commented on pull request #2459: LUCENE-9825: Hunspell: reverse the "words" trie for faster word lookup/suggestions

2021-03-08 Thread GitBox


donnerpeter commented on pull request #2459:
URL: https://github.com/apache/lucene-solr/pull/2459#issuecomment-792651966


   I've fixed the encountered bugs, added regression tests for them, and 
applied most of these great suggestions, thanks!






[GitHub] [lucene-solr] iverase commented on a change in pull request #2444: LUCENE-9705: Create Lucene90StoredFieldsFormat

2021-03-08 Thread GitBox


iverase commented on a change in pull request #2444:
URL: https://github.com/apache/lucene-solr/pull/2444#discussion_r589307069



##
File path: 
lucene/core/src/java/org/apache/lucene/codecs/compressing/Lucene90CompressingStoredFieldsFormat.java
##
@@ -40,7 +40,7 @@
  *
  * @lucene.experimental
  */
-public class CompressingStoredFieldsFormat extends StoredFieldsFormat {
+public class Lucene90CompressingStoredFieldsFormat extends StoredFieldsFormat {

Review comment:
   done








[GitHub] [lucene-solr] donnerpeter commented on pull request #2459: LUCENE-9825: Hunspell: reverse the "words" trie for faster word lookup/suggestions

2021-03-08 Thread GitBox


donnerpeter commented on pull request #2459:
URL: https://github.com/apache/lucene-solr/pull/2459#issuecomment-792642905


   > For greek, if you analyze the distribution of dictionary (I use 
https://scripts.sil.org/UnicodeCharacterCount ), you can see that smallest 
character in the whole dictionary is `0x386` (decimal 902) and largest is 
`0x3CE` (decimal 974). So, even simpler, you could exploit that and encode a 
single `int base = 0x386; // smallest char in use` for this whole dictionary 
and all characters would be single-byte encoded.
   
   @rmuir This does bring memory usage down a bit, but not as much as I'd 
hoped (11.1->9.4 MB for Greek, 463->458 MB total). As it also complicates the 
code, I'd leave this out, at least for now.
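
A minimal sketch of the base-offset encoding suggested above: assuming every character of a dictionary falls within a 256-code-point window, each character can be stored as one byte relative to the smallest code point in use. The helper names are invented for the sketch.

```python
# Hedged sketch of single-byte encoding relative to a base code point.
# Assumes ord(c) - base fits in one byte for every character in the word.

def encode(word: str, base: int) -> bytes:
    return bytes(ord(c) - base for c in word)

def decode(data: bytes, base: int) -> str:
    return "".join(chr(b + base) for b in data)

base = 0x386                      # smallest char in the Greek dictionary, per the comment
word = "\u03b1\u03b2\u03b3"       # Greek alpha, beta, gamma
encoded = encode(word, base)
assert len(encoded) == len(word)  # one byte per character
assert decode(encoded, base) == word
```
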






[GitHub] [lucene-solr] iverase commented on a change in pull request #2444: LUCENE-9705: Create Lucene90StoredFieldsFormat

2021-03-08 Thread GitBox


iverase commented on a change in pull request #2444:
URL: https://github.com/apache/lucene-solr/pull/2444#discussion_r589289455



##
File path: 
lucene/core/src/java/org/apache/lucene/codecs/compressing/LZ4WithPresetDictCompressionMode.java
##
@@ -14,12 +14,9 @@
  * See the License for the specific language governing permissions and
  * limitations under the License.
  */
-package org.apache.lucene.codecs.lucene87;
+package org.apache.lucene.codecs.compressing;

Review comment:
   Ok, I copy classes to backwards codec so we are not sharing any of those 
classes








[GitHub] [lucene-solr] dweiss commented on a change in pull request #2459: LUCENE-9825: Hunspell: reverse the "words" trie for faster word lookup/suggestions

2021-03-08 Thread GitBox


dweiss commented on a change in pull request #2459:
URL: https://github.com/apache/lucene-solr/pull/2459#discussion_r589287265



##
File path: 
lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/WordStorage.java
##
@@ -0,0 +1,338 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.analysis.hunspell;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.function.BiConsumer;
+import org.apache.lucene.store.ByteArrayDataInput;
+import org.apache.lucene.store.ByteArrayDataOutput;
+import org.apache.lucene.store.DataOutput;
+import org.apache.lucene.util.ArrayUtil;
+import org.apache.lucene.util.CharsRef;
+import org.apache.lucene.util.IntsRef;
+import org.apache.lucene.util.IntsRefBuilder;
+import org.apache.lucene.util.fst.IntSequenceOutputs;
+
+/**
+ * A data structure for memory-efficient word storage and fast 
lookup/enumeration. Each dictionary
+ * entry is stored as:
+ *
+ * 
+ *   the last character
+ *   pointer to a similar entry for the prefix (all characters except the 
last one)
+ *   value data: a list of ints representing word flags and morphological 
data, and a pointer to
+ *   hash collisions, if any
+ * 
+ *
+ * There's only one entry for each prefix, so it's like a trie/{@link
+ * org.apache.lucene.util.fst.FST}, but a reversed one: each node points to a 
single previous node
+ * instead of several following ones. For example, "abc" and "abd" point to 
the same prefix entry
+ * "ab" which points to "a" which points to 0.
+ * 
+ * The entries are stored in a contiguous byte array, identified by their 
offsets, using {@link
+ * DataOutput#writeVInt(int) VINT} format for compression.
+ */
+class WordStorage {
+  /**
+   * A map from word's hash (modulo array's length) into the offset of the 
last entry in {@link
+   * #wordData} with this hash. Negated, if there's more than one entry with 
the same hash.
+   */
+  private final int[] hashTable;
+
+  /**
+   * An array of word entries:
+   *
+   * 
+   *   VINT: the word's last character
+   *   VINT: pointer to the entry for the same word without the last 
character. It's relative:
+   *   the difference of this entry's start and the prefix's entry start. 
0 for single-character
+   *   entries
+   *   Optional, for non-leaf entries only:
+   *   
+   * VINT: the length of the word form data, returned from {@link 
#lookupWord}
+   * n * VINT: the word form data
+   * Optional, for hash-colliding entries only:
+   * 
+   *   BYTE: 1 if the next collision entry has further 
collisions, 0 if it's the
+   *   last of the entries with the same hash
+   *   VINT: (relative) pointer to the previous entry with the 
same hash
+   * 
+   *   
+   * 
+   */
+  private final byte[] wordData;
+
+  private WordStorage(int[] hashTable, byte[] wordData) {
+this.hashTable = hashTable;
+this.wordData = wordData;
+  }
+
+  IntsRef lookupWord(char[] word, int offset, int length) {
+assert length > 0;
+
+int hash = Math.abs(CharsRef.stringHashCode(word, offset, length) % 
hashTable.length);
+int pos = hashTable[hash];
+if (pos == 0) {
+  return null;
+}
+
+boolean collision = pos < 0;
+pos = Math.abs(pos);
+
+char lastChar = word[offset + length - 1];
+ByteArrayDataInput in = new ByteArrayDataInput(wordData);
+while (true) {
+  in.setPosition(pos);
+  char c = (char) in.readVInt();
+  int prevPos = pos - in.readVInt();
+  int beforeForms = in.getPosition();
+  boolean found = c == lastChar && isSameString(word, offset, length - 1, 
prevPos, in);
+  if (!collision && !found) {
+return null;
+  }
+
+  in.setPosition(beforeForms);
+  int formLength = in.readVInt();
+  if (found) {
+IntsRef forms = new IntsRef(formLength);
+readForms(forms, in, formLength);
+return forms;
+  } else {
+skipVInts(in, formLength);
+  }
+
+  collision = in.readByte() == 1;
+  pos -= in.readVInt();
+}
+  }
+
+  private static void skipVInts(ByteArrayDataInput in, int cou

[GitHub] [lucene-solr] dweiss commented on a change in pull request #2459: LUCENE-9825: Hunspell: reverse the "words" trie for faster word lookup/suggestions

2021-03-08 Thread GitBox


dweiss commented on a change in pull request #2459:
URL: https://github.com/apache/lucene-solr/pull/2459#discussion_r589285841



##
File path: 
lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/WordStorage.java
##
@@ -0,0 +1,338 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.analysis.hunspell;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.function.BiConsumer;
+import org.apache.lucene.store.ByteArrayDataInput;
+import org.apache.lucene.store.ByteArrayDataOutput;
+import org.apache.lucene.store.DataOutput;
+import org.apache.lucene.util.ArrayUtil;
+import org.apache.lucene.util.CharsRef;
+import org.apache.lucene.util.IntsRef;
+import org.apache.lucene.util.IntsRefBuilder;
+import org.apache.lucene.util.fst.IntSequenceOutputs;
+
+/**
+ * A data structure for memory-efficient word storage and fast 
lookup/enumeration. Each dictionary
+ * entry is stored as:
+ *
+ * 
+ *   the last character
+ *   pointer to a similar entry for the prefix (all characters except the 
last one)
+ *   value data: a list of ints representing word flags and morphological 
data, and a pointer to
+ *   hash collisions, if any
+ * 
+ *
+ * There's only one entry for each prefix, so it's like a trie/{@link
+ * org.apache.lucene.util.fst.FST}, but a reversed one: each node points to a single
+ * previous node instead of several following ones. For example, "abc" and "abd"
+ * point to the same prefix entry "ab", which points to "a", which points to 0.
+ * 
+ * The entries are stored in a contiguous byte array, identified by their offsets,
+ * using the {@link DataOutput#writeVInt(int) VINT} format for compression.
+ */
+class WordStorage {
+  /**
+   * A map from word's hash (modulo array's length) into the offset of the 
last entry in {@link
+   * #wordData} with this hash. Negated, if there's more than one entry with 
the same hash.
+   */
+  private final int[] hashTable;
+
+  /**
+   * An array of word entries:
+   *
+   * 
+   *   VINT: the word's last character
+   *   VINT: pointer to the entry for the same word without the last 
character. It's relative:
+   *   the difference of this entry's start and the prefix's entry start. 
0 for single-character
+   *   entries
+   *   Optional, for non-leaf entries only:
+   *   
+   * VINT: the length of the word form data, returned from {@link 
#lookupWord}
+   * n * VINT: the word form data
+   * Optional, for hash-colliding entries only:
+   * 
+   *   BYTE: 1 if the next collision entry has further 
collisions, 0 if it's the
+   *   last of the entries with the same hash
+   *   VINT: (relative) pointer to the previous entry with the 
same hash
+   * 
+   *   
+   * 
+   */
+  private final byte[] wordData;
+
+  private WordStorage(int[] hashTable, byte[] wordData) {
+this.hashTable = hashTable;
+this.wordData = wordData;
+  }
+
+  IntsRef lookupWord(char[] word, int offset, int length) {
+assert length > 0;
+
+int hash = Math.abs(CharsRef.stringHashCode(word, offset, length) % 
hashTable.length);
+int pos = hashTable[hash];
+if (pos == 0) {
+  return null;
+}
+
+boolean collision = pos < 0;
+pos = Math.abs(pos);
+
+char lastChar = word[offset + length - 1];
+ByteArrayDataInput in = new ByteArrayDataInput(wordData);
+while (true) {
+  in.setPosition(pos);
+  char c = (char) in.readVInt();
+  int prevPos = pos - in.readVInt();
+  int beforeForms = in.getPosition();
+  boolean found = c == lastChar && isSameString(word, offset, length - 1, 
prevPos, in);
+  if (!collision && !found) {
+return null;
+  }
+
+  in.setPosition(beforeForms);
+  int formLength = in.readVInt();
+  if (found) {
+IntsRef forms = new IntsRef(formLength);
+readForms(forms, in, formLength);
+return forms;
+  } else {
+skipVInts(in, formLength);
+  }
+
+  collision = in.readByte() == 1;
+  pos -= in.readVInt();
+}
+  }
+
+  private static void skipVInts(ByteArrayDataInput in, int cou
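For readers following this review, the reversed-trie idea in the javadoc above can be sketched with plain collections. This is an illustrative toy model, not Lucene's implementation: the real WordStorage packs entries into a contiguous byte[] with VInt-encoded relative pointers, and the class and field names below are invented for the sketch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of the reversed trie: each entry stores only its last character
// and a pointer to the entry for its prefix, so "abc" and "abd" share the
// chain "ab" -> "a" -> root. A hash table maps a word's hash to candidate
// entries; a candidate is confirmed by walking the prefix chain backwards.
class ReversedTrie {
    private static final int ROOT = 0;

    private final List<Character> lastChar = new ArrayList<>();
    private final List<Integer> prefix = new ArrayList<>();        // entry -> its prefix entry
    private final Map<Integer, int[]> forms = new HashMap<>();     // entry -> word form data
    private final Map<String, Integer> entryOf = new HashMap<>();  // build-time index only
    private final Map<Integer, List<Integer>> hashTable = new HashMap<>();

    ReversedTrie() {
        lastChar.add('\0');  // sentinel root entry at index 0
        prefix.add(ROOT);
        entryOf.put("", ROOT);
    }

    void add(String word, int[] wordForms) {
        int entry = ROOT;
        for (int i = 1; i <= word.length(); i++) {
            String p = word.substring(0, i);
            Integer existing = entryOf.get(p);
            if (existing == null) {  // exactly one entry per distinct prefix
                lastChar.add(p.charAt(i - 1));
                prefix.add(entry);
                existing = lastChar.size() - 1;
                entryOf.put(p, existing);
            }
            entry = existing;
        }
        forms.put(entry, wordForms);
        hashTable.computeIfAbsent(word.hashCode(), h -> new ArrayList<>()).add(entry);
    }

    int[] lookup(String word) {
        List<Integer> candidates = hashTable.get(word.hashCode());
        if (candidates == null) return null;
        for (int candidate : candidates) {  // collision chain
            int e = candidate;
            boolean matched = true;
            for (int i = word.length() - 1; i >= 0; i--) {  // compare back to front
                if (e == ROOT || lastChar.get(e) != word.charAt(i)) {
                    matched = false;
                    break;
                }
                e = prefix.get(e);
            }
            if (matched && e == ROOT) return forms.get(candidate);
        }
        return null;
    }
}
```

Unlike this sketch, the real class negates a hash-table slot to flag collisions, stores a BYTE continuation flag per colliding entry, and chains colliding entries through relative VInt pointers rather than lists.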

[GitHub] [lucene-solr] donnerpeter commented on a change in pull request #2459: LUCENE-9825: Hunspell: reverse the "words" trie for faster word lookup/suggestions

2021-03-08 Thread GitBox


donnerpeter commented on a change in pull request #2459:
URL: https://github.com/apache/lucene-solr/pull/2459#discussion_r589282077



##
File path: 
lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/WordStorage.java
##
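The VINT encoding that the WordStorage javadoc references (Lucene's DataOutput#writeVInt) stores a non-negative int in one to five bytes, seven bits at a time with the low-order group first, using the high bit of each byte as a continuation flag. A standalone sketch of that idea (a simplified stand-in, not Lucene's DataOutput class itself):

```java
import java.io.ByteArrayOutputStream;

class VInt {
    // Encode: emit 7 bits per byte, low-order group first; the high bit (0x80)
    // is set on every byte except the last to signal "more bytes follow".
    static void write(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // Decode starting at pos[0]; advances pos[0] past the consumed bytes.
    static int read(byte[] data, int[] pos) {
        int value = 0, shift = 0;
        byte b;
        do {
            b = data[pos[0]++];
            value |= (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return value;
    }
}
```

Small values such as the relative prefix pointers and character codes in WordStorage fit in one or two bytes, which is the point of the format.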

[GitHub] [lucene-solr] muse-dev[bot] commented on a change in pull request #2403: SOLR-15164: Implement Task Management Interface

2021-03-08 Thread GitBox


muse-dev[bot] commented on a change in pull request #2403:
URL: https://github.com/apache/lucene-solr/pull/2403#discussion_r589280471



##
File path: 
solr/core/src/java/org/apache/solr/handler/component/ActiveTasksListHandler.java
##
@@ -0,0 +1,106 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.handler.component;
+
+import org.apache.solr.api.Api;
+import org.apache.solr.api.ApiBag;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.request.SolrRequestHandler;
+import org.apache.solr.response.SolrQueryResponse;
+import org.apache.solr.security.AuthorizationContext;
+import org.apache.solr.security.PermissionNameProvider;
+
+import java.util.Collection;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import static org.apache.solr.common.params.CommonParams.TASK_CHECK_UUID;
+
+/**
+ * Handles requests for listing all active cancellable tasks
+ */
+public class ActiveTasksListHandler extends TaskManagementHandler {
+// This can be a parent level member but we keep it here to allow future 
handlers to have
+// a custom list of components
+private List components;
+
+@Override
+public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp) 
throws Exception {
+Map<String, String> extraParams = null;
+ResponseBuilder rb = buildResponseBuilder(req, rsp, 
getComponentsList());
+
+String taskStatusCheckUUID = req.getParams().get(TASK_CHECK_UUID, 
null);
+
+if (taskStatusCheckUUID != null) {
+if (rb.isDistrib) {
+extraParams = new HashMap<>();
+
+extraParams.put(TASK_CHECK_UUID, taskStatusCheckUUID);
+}
+
+rb.setTaskStatusCheckUUID(taskStatusCheckUUID);
+}
+
+// Let this be visible to handleResponses in the handling component
+rb.setTaskListRequest(true);
+
+processRequest(req, rb, extraParams);
+}
+
+@Override
+public String getDescription() {
+return "activetaskslist";
+}
+
+@Override
+public Category getCategory() {
+return Category.ADMIN;
+}
+
+@Override
+public PermissionNameProvider.Name getPermissionName(AuthorizationContext 
ctx) {
+return PermissionNameProvider.Name.READ_PERM;
+}
+
+@Override
+public SolrRequestHandler getSubHandler(String path) {

Review comment:
   *:*  param being equated to Strings using equals
   (at-me [in a reply](https://docs.muse.dev/docs/talk-to-muse/) with `help` or 
`ignore`)

##
File path: 
solr/core/src/java/org/apache/solr/handler/component/QueryCancellationHandler.java
##
@@ -0,0 +1,108 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.handler.component;
+
+import org.apache.solr.api.Api;
+import org.apache.solr.api.ApiBag;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.request.SolrRequestHandler;
+import org.apache.solr.response.SolrQueryResponse;
+import org.apache.solr.security.AuthorizationContext;
+import org.apache.solr.security.PermissionNameProvider;
+
+import java.util.Collection;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import static org.apache.solr.common.params.CommonParams.QUERY_UUID;
+
+/**
+ * Handles requests for query cancellation for cancellable queries
+ */
+public class QueryCancellationHandler extends TaskManagementHandler 

[GitHub] [lucene-solr] donnerpeter commented on a change in pull request #2459: LUCENE-9825: Hunspell: reverse the "words" trie for faster word lookup/suggestions

2021-03-08 Thread GitBox


donnerpeter commented on a change in pull request #2459:
URL: https://github.com/apache/lucene-solr/pull/2459#discussion_r589276407



##
File path: 
lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/WordStorage.java
##

[GitHub] [lucene-solr] donnerpeter commented on a change in pull request #2459: LUCENE-9825: Hunspell: reverse the "words" trie for faster word lookup/suggestions

2021-03-08 Thread GitBox


donnerpeter commented on a change in pull request #2459:
URL: https://github.com/apache/lucene-solr/pull/2459#discussion_r589273320



##
File path: 
lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/WordStorage.java
##
@@ -0,0 +1,338 @@
+  IntsRef lookupWord(char[] word, int offset, int length) {
+assert length > 0;
+
+int hash = Math.abs(CharsRef.stringHashCode(word, offset, length) % 
hashTable.length);
+int pos = hashTable[hash];
+if (pos == 0) {
+  return null;
+}
+
+boolean collision = pos < 0;
+pos = Math.abs(pos);
+
+char lastChar = word[offset + length - 1];
+ByteArrayDataInput in = new ByteArrayDataInput(wordData);
+while (true) {
+  in.setPosition(pos);
+  char c = (char) in.readVInt();
+  int prevPos = pos - in.readVInt();

Review comment:
   Thanks, but I've tried that and it doesn't seem to make any difference.


[GitHub] [lucene-solr] dweiss commented on a change in pull request #2459: LUCENE-9825: Hunspell: reverse the "words" trie for faster word lookup/suggestions

2021-03-08 Thread GitBox


dweiss commented on a change in pull request #2459:
URL: https://github.com/apache/lucene-solr/pull/2459#discussion_r589271596



##
File path: 
lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/WordStorage.java
##

[GitHub] [lucene-solr] dweiss commented on a change in pull request #2459: LUCENE-9825: Hunspell: reverse the "words" trie for faster word lookup/suggestions

2021-03-08 Thread GitBox


dweiss commented on a change in pull request #2459:
URL: https://github.com/apache/lucene-solr/pull/2459#discussion_r589270751



##
File path: 
lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/WordStorage.java
##
@@ -0,0 +1,338 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.analysis.hunspell;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.function.BiConsumer;
+import org.apache.lucene.store.ByteArrayDataInput;
+import org.apache.lucene.store.ByteArrayDataOutput;
+import org.apache.lucene.store.DataOutput;
+import org.apache.lucene.util.ArrayUtil;
+import org.apache.lucene.util.CharsRef;
+import org.apache.lucene.util.IntsRef;
+import org.apache.lucene.util.IntsRefBuilder;
+import org.apache.lucene.util.fst.IntSequenceOutputs;
+
+/**
+ * A data structure for memory-efficient word storage and fast 
lookup/enumeration. Each dictionary
+ * entry is stored as:
+ *
+ * 
+ *   the last character
+ *   pointer to a similar entry for the prefix (all characters except the 
last one)
+ *   value data: a list of ints representing word flags and morphological 
data, and a pointer to
+ *   hash collisions, if any
+ * 
+ *
+ * There's only one entry for each prefix, so it's like a trie/{@link
+ * org.apache.lucene.util.fst.FST}, but a reversed one: each nodes points to a 
single previous nodes
+ * instead of several following ones. For example, "abc" and "abd" point to 
the same prefix entry
+ * "ab" which points to "a" which points to 0.
+ * 
+ * The entries are stored in a contiguous byte array, identified by their 
offsets, using {@link
+ * DataOutput#writeVInt} ()} VINT} format for compression.
+ */
+class WordStorage {
+  /**
+   * A map from word's hash (modulo array's length) into the offset of the 
last entry in {@link
+   * #wordData} with this hash. Negated, if there's more than one entry with 
the same hash.
+   */
+  private final int[] hashTable;
+
+  /**
+   * An array of word entries:
+   *
+   * 
+   *   VINT: the word's last character
+   *   VINT: pointer to the entry for the same word without the last 
character. It's relative:
+   *   the difference of this entry's start and the prefix's entry start. 
0 for single-character
+   *   entries
+   *   Optional, for non-leaf entries only:
+   *   
+   * VINT: the length of the word form data, returned from {@link 
#lookupWord}
+   * n * VINT: the word form data
+   * Optional, for hash-colliding entries only:
+   * 
+   *   BYTE: 1 if the next collision entry has further 
collisions, 0 if it's the
+   *   last of the entries with the same hash
+   *   VINT: (relative) pointer to the previous entry with the 
same hash
+   * 
+   *   
+   * 
+   */
+  private final byte[] wordData;
+
+  private WordStorage(int[] hashTable, byte[] wordData) {
+    this.hashTable = hashTable;
+    this.wordData = wordData;
+  }
+
+  IntsRef lookupWord(char[] word, int offset, int length) {
+    assert length > 0;
+
+    int hash = Math.abs(CharsRef.stringHashCode(word, offset, length) % hashTable.length);
+    int pos = hashTable[hash];
+    if (pos == 0) {
+      return null;
+    }
+
+    boolean collision = pos < 0;
+    pos = Math.abs(pos);
+
+    char lastChar = word[offset + length - 1];
+    ByteArrayDataInput in = new ByteArrayDataInput(wordData);
+    while (true) {
+      in.setPosition(pos);
+      char c = (char) in.readVInt();
+      int prevPos = pos - in.readVInt();
+      int beforeForms = in.getPosition();
+      boolean found = c == lastChar && isSameString(word, offset, length - 1, prevPos, in);
+      if (!collision && !found) {
+        return null;
+      }
+
+      in.setPosition(beforeForms);
+      int formLength = in.readVInt();
+      if (found) {
+        IntsRef forms = new IntsRef(formLength);
+        readForms(forms, in, formLength);
+        return forms;
+      } else {
+        skipVInts(in, formLength);
+      }
+
+      collision = in.readByte() == 1;
+      pos -= in.readVInt();
+    }
+  }
+
+  private static void skipVInts(ByteArrayDataInput in, int cou
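The negated-offset convention described in the quoted hash-table javadoc above can be illustrated in isolation. This is a toy sketch with made-up offsets, not the actual Lucene byte layout: a slot stores the offset of the last entry with that hash, and is negated when more than one entry shares the hash, which is how `lookupWord` decides whether to walk a collision chain.

```java
// Toy illustration (not the Lucene implementation): hashTable[h] holds the
// offset of the last entry whose hash is h, negated when more than one entry
// shares that hash. Offset 0 means "empty slot". Offsets here are invented.
public class HashSlotSketch {
  public static void main(String[] args) {
    int[] hashTable = new int[8];
    // First entry with hash 3 lands at offset 17: stored as-is (positive).
    hashTable[3] = 17;
    // A second entry with hash 3 arrives at offset 42: stored negated to
    // flag that a collision chain must be followed.
    hashTable[3] = -42;

    int pos = hashTable[3];
    boolean collision = pos < 0; // negative => more entries share this hash
    pos = Math.abs(pos);         // recover the real offset

    if (!collision || pos != 42) throw new AssertionError();
    System.out.println(collision + " " + pos);
  }
}
```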

[GitHub] [lucene-solr] donnerpeter commented on a change in pull request #2459: LUCENE-9825: Hunspell: reverse the "words" trie for faster word lookup/suggestions

2021-03-08 Thread GitBox


donnerpeter commented on a change in pull request #2459:
URL: https://github.com/apache/lucene-solr/pull/2459#discussion_r589269346



##
File path: 
lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/WordStorage.java
##
@@ -0,0 +1,338 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.analysis.hunspell;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.function.BiConsumer;
+import org.apache.lucene.store.ByteArrayDataInput;
+import org.apache.lucene.store.ByteArrayDataOutput;
+import org.apache.lucene.store.DataOutput;
+import org.apache.lucene.util.ArrayUtil;
+import org.apache.lucene.util.CharsRef;
+import org.apache.lucene.util.IntsRef;
+import org.apache.lucene.util.IntsRefBuilder;
+import org.apache.lucene.util.fst.IntSequenceOutputs;
+
+/**
+ * A data structure for memory-efficient word storage and fast 
lookup/enumeration. Each dictionary
+ * entry is stored as:
+ *
+ * 
+ *   the last character
+ *   pointer to a similar entry for the prefix (all characters except the 
last one)
+ *   value data: a list of ints representing word flags and morphological 
data, and a pointer to
+ *   hash collisions, if any
+ * 
+ *
+ * There's only one entry for each prefix, so it's like a trie/{@link
+ * org.apache.lucene.util.fst.FST}, but a reversed one: each node points to a 
single previous node
+ * instead of several following ones. For example, "abc" and "abd" point to 
the same prefix entry
+ * "ab" which points to "a" which points to 0.
+ * 
+ * The entries are stored in a contiguous byte array, identified by their 
offsets, using {@link
+ * DataOutput#writeVInt(int) VINT} format for compression.
+ */
+class WordStorage {
+  /**
+   * A map from word's hash (modulo array's length) into the offset of the 
last entry in {@link
+   * #wordData} with this hash. Negated, if there's more than one entry with 
the same hash.
+   */
+  private final int[] hashTable;
+
+  /**
+   * An array of word entries:
+   *
+   * 
+   *   VINT: the word's last character
+   *   VINT: pointer to the entry for the same word without the last 
character. It's relative:
+   *   the difference of this entry's start and the prefix's entry start. 
0 for single-character
+   *   entries
+   *   Optional, for non-leaf entries only:
+   *   
+   * VINT: the length of the word form data, returned from {@link 
#lookupWord}
+   * n * VINT: the word form data
+   * Optional, for hash-colliding entries only:
+   * 
+   *   BYTE: 1 if the next collision entry has further 
collisions, 0 if it's the
+   *   last of the entries with the same hash
+   *   VINT: (relative) pointer to the previous entry with the 
same hash
+   * 
+   *   
+   * 
+   */
+  private final byte[] wordData;
+
+  private WordStorage(int[] hashTable, byte[] wordData) {
+    this.hashTable = hashTable;
+    this.wordData = wordData;
+  }
+
+  IntsRef lookupWord(char[] word, int offset, int length) {
+    assert length > 0;
+
+    int hash = Math.abs(CharsRef.stringHashCode(word, offset, length) % hashTable.length);
+    int pos = hashTable[hash];
+    if (pos == 0) {
+      return null;
+    }
+
+    boolean collision = pos < 0;
+    pos = Math.abs(pos);
+
+    char lastChar = word[offset + length - 1];
+    ByteArrayDataInput in = new ByteArrayDataInput(wordData);
+    while (true) {
+      in.setPosition(pos);
+      char c = (char) in.readVInt();
+      int prevPos = pos - in.readVInt();
+      int beforeForms = in.getPosition();
+      boolean found = c == lastChar && isSameString(word, offset, length - 1, prevPos, in);
+      if (!collision && !found) {
+        return null;
+      }
+
+      in.setPosition(beforeForms);
+      int formLength = in.readVInt();
+      if (found) {
+        IntsRef forms = new IntsRef(formLength);
+        readForms(forms, in, formLength);
+        return forms;
+      } else {
+        skipVInts(in, formLength);
+      }
+
+      collision = in.readByte() == 1;
+      pos -= in.readVInt();
+    }
+  }
+
+  private static void skipVInts(ByteArrayDataInput in, in
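The reversed-trie layout described in the WordStorage javadoc above ("abc" and "abd" sharing the entry for "ab", which points to "a") can be sketched with plain objects instead of the VInt byte encoding. All names below are illustrative, not the Lucene implementation: each entry stores only its last character plus a pointer to the entry for the one-shorter prefix, and lookup walks the pointers backwards, comparing characters right to left.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of a reversed trie: entries point backwards at their prefix entry,
// so words sharing a prefix share a chain of entries.
public class ReversedTrieSketch {
  private final List<Character> lastChar = new ArrayList<>();
  private final List<Integer> prefix = new ArrayList<>(); // -1 = empty prefix
  private final Map<String, Integer> entryFor = new HashMap<>();

  /** Adds a word, reusing the existing entry chain for its prefix. */
  int add(String word) {
    Integer existing = entryFor.get(word);
    if (existing != null) return existing;
    int prev = word.length() == 1 ? -1 : add(word.substring(0, word.length() - 1));
    lastChar.add(word.charAt(word.length() - 1));
    prefix.add(prev);
    int id = lastChar.size() - 1;
    entryFor.put(word, id);
    return id;
  }

  /** Follows prefix pointers backwards, comparing characters right-to-left. */
  boolean matches(int entry, String word) {
    int i = word.length() - 1;
    while (entry != -1 && i >= 0) {
      if (lastChar.get(entry) != word.charAt(i)) return false;
      entry = prefix.get(entry);
      i--;
    }
    return entry == -1 && i == -1;
  }

  public static void main(String[] args) {
    ReversedTrieSketch t = new ReversedTrieSketch();
    int abc = t.add("abc");
    int abd = t.add("abd");
    // "abc" and "abd" share the entries for "ab" (and transitively "a"):
    if (!t.prefix.get(abc).equals(t.prefix.get(abd))) throw new AssertionError();
    if (!t.matches(abc, "abc") || t.matches(abc, "abd")) throw new AssertionError();
    System.out.println("ok");
  }
}
```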

[GitHub] [lucene-solr] dweiss commented on a change in pull request #2459: LUCENE-9825: Hunspell: reverse the "words" trie for faster word lookup/suggestions

2021-03-08 Thread GitBox


dweiss commented on a change in pull request #2459:
URL: https://github.com/apache/lucene-solr/pull/2459#discussion_r589264065



##
File path: 
lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/WordStorage.java
##
@@ -0,0 +1,338 @@

[GitHub] [lucene-solr] donnerpeter commented on a change in pull request #2459: LUCENE-9825: Hunspell: reverse the "words" trie for faster word lookup/suggestions

2021-03-08 Thread GitBox


donnerpeter commented on a change in pull request #2459:
URL: https://github.com/apache/lucene-solr/pull/2459#discussion_r589262984



##
File path: 
lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/WordStorage.java
##
@@ -0,0 +1,338 @@

[GitHub] [lucene-solr] iverase commented on a change in pull request #2444: LUCENE-9705: Create Lucene90StoredFieldsFormat

2021-03-08 Thread GitBox


iverase commented on a change in pull request #2444:
URL: https://github.com/apache/lucene-solr/pull/2444#discussion_r589262884



##
File path: 
lucene/backward-codecs/src/java/org/apache/lucene/backward_codecs/lucene50/compressing/Lucene50CompressingStoredFieldsFormat.java
##
@@ -0,0 +1,162 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.backward_codecs.lucene50.compressing;
+
+import java.io.IOException;
+import org.apache.lucene.codecs.CodecUtil;
+import org.apache.lucene.codecs.StoredFieldsFormat;
+import org.apache.lucene.codecs.StoredFieldsReader;
+import org.apache.lucene.codecs.StoredFieldsWriter;
+import org.apache.lucene.codecs.compressing.CompressionMode;
+import org.apache.lucene.index.FieldInfos;
+import org.apache.lucene.index.MergePolicy;
+import org.apache.lucene.index.SegmentInfo;
+import org.apache.lucene.store.Directory;
+import org.apache.lucene.store.IOContext;
+import org.apache.lucene.util.packed.DirectMonotonicWriter;
+
+/**
+ * A {@link StoredFieldsFormat} that compresses documents in chunks in order 
to improve the
+ * compression ratio.
+ *
+ * For a chunk size of chunkSize bytes, this {@link 
StoredFieldsFormat} does not
+ * support documents larger than (2^31 - chunkSize) 
bytes.
+ *
+ * For optimal performance, you should use a {@link MergePolicy} that 
returns segments that have
+ * the biggest byte size first.
+ *
+ * @lucene.experimental
+ */
+public class Lucene50CompressingStoredFieldsFormat extends StoredFieldsFormat {
+
+  /** format name */
+  protected final String formatName;

Review comment:
   Not sure as there are some values which are set using special logic. It 
seems important to keep those.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] atris commented on pull request #2403: SOLR-15164: Implement Task Management Interface

2021-03-08 Thread GitBox


atris commented on pull request #2403:
URL: https://github.com/apache/lucene-solr/pull/2403#issuecomment-792591442


   @sigram Updated per comments, please see and let me know your comments






[GitHub] [lucene-solr] atris commented on a change in pull request #2403: SOLR-15164: Implement Task Management Interface

2021-03-08 Thread GitBox


atris commented on a change in pull request #2403:
URL: https://github.com/apache/lucene-solr/pull/2403#discussion_r589256768



##
File path: solr/solrj/src/java/org/apache/solr/common/params/ShardParams.java
##
@@ -28,6 +28,9 @@
 public interface ShardParams {
   /** the shards to use (distributed configuration) */
   String SHARDS = "shards";
+
+  /** UUID of the query */
+  String QUERY_ID = "queryID";

Review comment:
   Will that not cause confusion, since this is the generic queryID 
(generated for every cancellable query) and the other parameter references the 
queryID the incoming cancellation/list request wants to deal with?








[GitHub] [lucene-solr] atris commented on a change in pull request #2403: SOLR-15164: Implement Task Management Interface

2021-03-08 Thread GitBox


atris commented on a change in pull request #2403:
URL: https://github.com/apache/lucene-solr/pull/2403#discussion_r589255458



##
File path: solr/solrj/src/java/org/apache/solr/common/params/CommonParams.java
##
@@ -160,6 +160,26 @@
*/
   String TIME_ALLOWED = "timeAllowed";
 
+  /**
+   * Is the query cancellable?
+   */
+  String IS_QUERY_CANCELLABLE = "canCancel";
+
+  /**
+   * Custom query UUID if provided.
+   */
+  String CUSTOM_QUERY_UUID = "queryUUID";
+
+  /**
+   * UUID for query to be cancelled
+   */
+  String QUERY_CANCELLATION_UUID = "cancelUUID";
+
+  /**
+   * UUID of the task whose status is to be checked
+   */
+  String TASK_CHECK_UUID = "taskUUID";

Review comment:
   We will (I plan to get to the long running collection creation soon)








[GitHub] [lucene-solr] iverase commented on a change in pull request #2444: LUCENE-9705: Create Lucene90StoredFieldsFormat

2021-03-08 Thread GitBox


iverase commented on a change in pull request #2444:
URL: https://github.com/apache/lucene-solr/pull/2444#discussion_r589255161



##
File path: 
lucene/backward-codecs/src/test/org/apache/lucene/backward_codecs/lucene87/Lucene87RWCodec.java
##
@@ -37,6 +38,17 @@ public PostingsFormat getPostingsFormatForField(String 
field) {
   return defaultPF;
 }
   };
+  private final Mode mode;
+
+  public Lucene87RWCodec() {
+    super();
+    this.mode = Mode.BEST_COMPRESSION;

Review comment:
   ok








[GitHub] [lucene-solr] iverase commented on a change in pull request #2444: LUCENE-9705: Create Lucene90StoredFieldsFormat

2021-03-08 Thread GitBox


iverase commented on a change in pull request #2444:
URL: https://github.com/apache/lucene-solr/pull/2444#discussion_r589254949



##
File path: 
lucene/backward-codecs/src/test/org/apache/lucene/backward_codecs/lucene50/compressing/Lucene50RWCompressingStoredFieldsFormat.java
##
@@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.backward_codecs.lucene50.compressing;
+
+import java.io.IOException;
+import org.apache.lucene.codecs.StoredFieldsWriter;
+import org.apache.lucene.codecs.compressing.CompressionMode;
+import org.apache.lucene.index.SegmentInfo;
+import org.apache.lucene.store.Directory;
+import org.apache.lucene.store.IOContext;
+
+/** RW impersonation of Lucene50CompressingStoredFieldsFormat. */
+public class Lucene50RWCompressingStoredFieldsFormat extends 
Lucene50CompressingStoredFieldsFormat {
+
+  /** Sole constructor. */
+  public Lucene50RWCompressingStoredFieldsFormat(

Review comment:
   removed








[GitHub] [lucene-solr] iverase commented on a change in pull request #2444: LUCENE-9705: Create Lucene90StoredFieldsFormat

2021-03-08 Thread GitBox


iverase commented on a change in pull request #2444:
URL: https://github.com/apache/lucene-solr/pull/2444#discussion_r589253022



##
File path: 
lucene/backward-codecs/src/test/org/apache/lucene/backward_codecs/lucene87/Lucene87RWCodec.java
##
@@ -57,4 +69,10 @@ public PostingsFormat postingsFormat() {
   public TermVectorsFormat termVectorsFormat() {
 return new Lucene50RWTermVectorsFormat();
   }
+
+  @Override
+  public StoredFieldsFormat storedFieldsFormat() {
+// TODO needs to consider compression mode?

Review comment:
   This was a left-over








[GitHub] [lucene-solr] atris commented on a change in pull request #2403: SOLR-15164: Implement Task Management Interface

2021-03-08 Thread GitBox


atris commented on a change in pull request #2403:
URL: https://github.com/apache/lucene-solr/pull/2403#discussion_r589251072



##
File path: 
solr/solrj/src/java/org/apache/solr/client/solrj/response/QueryResponse.java
##
@@ -184,6 +187,15 @@ else if ( "terms".equals( n ) ) {
   else if ( "moreLikeThis".equals( n ) ) {
 _moreLikeThisInfo = (NamedList) res.getVal( i );
   }
+  else if ("taskList".equals( n )) {

Review comment:
   I didn't quite parse that -- you mean, reuse `taskInfo` across? How would 
that work, since each of the three calls returns a different type?








[GitHub] [lucene-solr] donnerpeter commented on a change in pull request #2459: LUCENE-9825: Hunspell: reverse the "words" trie for faster word lookup/suggestions

2021-03-08 Thread GitBox


donnerpeter commented on a change in pull request #2459:
URL: https://github.com/apache/lucene-solr/pull/2459#discussion_r589250776



##
File path: 
lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/WordStorage.java
##
@@ -0,0 +1,338 @@

[jira] [Commented] (SOLR-15031) NPE caused by FunctionQParser returning a null ValueSource

2021-03-08 Thread Pieter (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-15031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17297179#comment-17297179
 ] 

Pieter commented on SOLR-15031:
---

Yes, that would help. It will be a lot of work to add the annotations 
everywhere, but that's the case for any solution. The big advantage is that it 
will be compatible with what we have: new issues and possible bugs will be 
flagged, but there will not be a broken build or failing test that needs 
immediate attention, so fixing those issues can be planned.

This issue is a nice demonstration of how that would work (I think, haven't 
tried it out yet): if QueryValueSource marked the Query constructor 
parameter as @NotNull and QParser.getQuery() returned a @Nullable Query, 
then the problem would be detected while building the code. The issue is fixed 
by adding an explicit null check in FunctionQParser, as I did in my PR.

Regarding SOLR-8319; since Solr's QParsers (like ExtendedDismaxQParser) 
delegate building the Query to QueryBuilder, it makes sense to mark methods in 
there to be returning a @Nullable Query as well. Doing so would also deal with 
SOLR-8319.

To explore the solution, we could use this issue: cut a branch before my PR is 
merged, add annotations in there, observe the warnings being generated, merge 
in the PR, and see that the warnings are gone.
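The kind of detection described above can be sketched with a minimal, self-contained model. The annotation types are declared inline as stand-ins for JSR-305/Checker Framework annotations, and `parseQuery`/`queryHash` are hypothetical names for illustration, not Solr's actual API:

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Stand-ins for library-provided annotations; a real build would use
// JSR-305 or Checker Framework types so static analysis tools
// (Error Prone, IntelliJ inspections) can warn at build time.
@Retention(RetentionPolicy.CLASS) @interface Nullable {}
@Retention(RetentionPolicy.CLASS) @interface NotNull {}

public class NullabilityDemo {
  // Plays the role of QParser.getQuery(): may legitimately return null,
  // e.g. when the input parses to nothing but stopwords.
  public static @Nullable String parseQuery(String input) {
    return input.isBlank() ? null : input.trim();
  }

  // Plays the role of the QueryValueSource constructor: requires non-null.
  // An analyzer would flag passing parseQuery's result here unchecked.
  public static int queryHash(@NotNull String query) {
    return query.hashCode();
  }

  public static void main(String[] args) {
    String q = parseQuery("   ");
    // The explicit check the annotations would prompt for:
    if (q == null) {
      System.out.println("no query produced");
    } else {
      System.out.println(queryHash(q));
    }
  }
}
```

Because the annotations only generate warnings, existing code keeps building, which matches the point above about planning the fixes instead of breaking the build.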

> NPE caused by FunctionQParser returning a null ValueSource
> --
>
> Key: SOLR-15031
> URL: https://issues.apache.org/jira/browse/SOLR-15031
> Project: Solr
>  Issue Type: Bug
>Reporter: Pieter
>Assignee: Mike Drob
>Priority: Minor
> Fix For: 8.8, master (9.0)
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> When parsing a sub query in a function query, 
> {{FunctionQParser#parseValueSource}} does not check if the produced query 
> object is null. When it is, it just wraps a null in a {{QueryValueSource}} 
> object. This causes NPEs in code consuming that object. Parsed queries can 
> be null, for example when the query string only contains stopwords, so we 
> need to handle that condition.
> h3. Steps to reproduce the issue
>  # Start solr with the techproducts example collection: {{solr start -e 
> techproducts}}
>  # Add a stopword to 
> SOLR_DIR/example/techproducts/solr/techproducts/conf/stopwords.txt, for 
> example "at"
>  # Reload the core
>  # Execute a function query:
> {code:java}
> http://localhost:8983/solr/techproducts/select?fieldquery={!field%20f=features%20v=%27%22at%22%27}&q={!func}%20if($fieldquery,1,0){code}
> The following stacktrace is produced:
> {code:java}
> 2020-12-03 13:35:38.868 INFO  (qtp2095677157-21) [   x:techproducts] 
> o.a.s.c.S.Request [techproducts]  webapp=/solr path=/select 
> params={q={!func}+if($fieldquery,1,0)&fieldquery={!field+f%3Dfeatures+v%3D'"at"'}}
>  status=500 QTime=34
> 2020-12-03 13:35:38.872 ERROR (qtp2095677157-21) [   x:techproducts] 
> o.a.s.s.HttpSolrCall null:java.lang.NullPointerException
> at 
> org.apache.lucene.queries.function.valuesource.QueryValueSource.hashCode(QueryValueSource.java:63)
> at 
> org.apache.lucene.queries.function.valuesource.IfFunction.hashCode(IfFunction.java:129)
> at 
> org.apache.lucene.queries.function.FunctionQuery.hashCode(FunctionQuery.java:176)
> at 
> org.apache.solr.search.QueryResultKey.<init>(QueryResultKey.java:53)
> at 
> org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1341)
> at 
> org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:580)
> {code}
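The failure mode in the stacktrace (a null Query wrapped silently, then dereferenced later in hashCode() while building the cache key) can be modeled with a self-contained sketch. The class below is a hypothetical stand-in, not Solr's actual QueryValueSource, and the fail-fast guard is one way to realize the null check the PR adds in FunctionQParser:

```java
// Toy stand-in for QueryValueSource: wraps a query and uses it in hashCode(),
// which is exactly where the stacktrace above blows up when the query is null.
public class GuardedValueSource {
  private final String query; // stands in for org.apache.lucene.search.Query

  public GuardedValueSource(String query) {
    // Fail fast at construction instead of NPE-ing later in the query cache:
    // a null here means the parser produced no query (e.g. only stopwords).
    if (query == null) {
      throw new IllegalArgumentException(
          "parsed query is null; cannot build a value source");
    }
    this.query = query;
  }

  @Override
  public int hashCode() {
    return query.hashCode(); // safe: constructor guarantees non-null
  }
}
```

Failing fast with a clear message turns an opaque 500/NullPointerException into a diagnosable request error; substituting a match-nothing query in the caller would be an alternative design.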



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] iverase commented on a change in pull request #2444: LUCENE-9705: Create Lucene90StoredFieldsFormat

2021-03-08 Thread GitBox


iverase commented on a change in pull request #2444:
URL: https://github.com/apache/lucene-solr/pull/2444#discussion_r589249744



##
File path: 
lucene/backward-codecs/src/java/org/apache/lucene/backward_codecs/lucene87/Lucene87Codec.java
##
@@ -101,14 +102,22 @@ public DocValuesFormat getDocValuesFormatForField(String 
field) {
 
   /** Instantiates a new codec. */
   public Lucene87Codec() {
+this(Mode.BEST_COMPRESSION);

Review comment:
   We actually need the default constructor in codecs or you get errors 
like:
   
   ```
   org.apache.lucene.codecs.Codec: 
org.apache.lucene.backward_codecs.lucene87.Lucene87Codec Unable to get public 
no-arg constructor
   java.util.ServiceConfigurationError: org.apache.lucene.codecs.Codec: 
org.apache.lucene.backward_codecs.lucene87.Lucene87Codec Unable to get public 
no-arg constructor
   ```
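The ServiceConfigurationError quoted above stems from how java.util.ServiceLoader instantiates providers: it reflectively looks up a public no-arg constructor on each registered class. A small, self-contained illustration of that requirement (the codec-like classes here are hypothetical, not the actual Lucene87Codec):

```java
public class NoArgCtorDemo {
  // A provider-style class with only a parameterized constructor,
  // like a codec whose default constructor was removed.
  public static class ModeOnlyCodec {
    public ModeOnlyCodec(String mode) {}
  }

  // The same shape with the no-arg constructor restored, delegating to a
  // default mode -- the pattern referred to in the review comment above.
  public static class FixedCodec {
    public FixedCodec() { this("BEST_COMPRESSION"); }
    public FixedCodec(String mode) {}
  }

  public static boolean hasPublicNoArgCtor(Class<?> c) {
    try {
      // Essentially what ServiceLoader does before instantiating a provider;
      // when this fails it reports "Unable to get public no-arg constructor".
      c.getConstructor();
      return true;
    } catch (NoSuchMethodException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(hasPublicNoArgCtor(ModeOnlyCodec.class)); // false
    System.out.println(hasPublicNoArgCtor(FixedCodec.class));    // true
  }
}
```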





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org






[GitHub] [lucene-solr] donnerpeter commented on a change in pull request #2459: LUCENE-9825: Hunspell: reverse the "words" trie for faster word lookup/suggestions

2021-03-08 Thread GitBox


donnerpeter commented on a change in pull request #2459:
URL: https://github.com/apache/lucene-solr/pull/2459#discussion_r589248658



##
File path: 
lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/WordStorage.java
##
@@ -0,0 +1,338 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.analysis.hunspell;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.function.BiConsumer;
+import org.apache.lucene.store.ByteArrayDataInput;
+import org.apache.lucene.store.ByteArrayDataOutput;
+import org.apache.lucene.store.DataOutput;
+import org.apache.lucene.util.ArrayUtil;
+import org.apache.lucene.util.CharsRef;
+import org.apache.lucene.util.IntsRef;
+import org.apache.lucene.util.IntsRefBuilder;
+import org.apache.lucene.util.fst.IntSequenceOutputs;
+
+/**
+ * A data structure for memory-efficient word storage and fast 
lookup/enumeration. Each dictionary
+ * entry is stored as:
+ *
+ * 
+ *   the last character
+ *   pointer to a similar entry for the prefix (all characters except the 
last one)
+ *   value data: a list of ints representing word flags and morphological 
data, and a pointer to
+ *   hash collisions, if any
+ * 
+ *
+ * There's only one entry for each prefix, so it's like a trie/{@link
+ * org.apache.lucene.util.fst.FST}, but a reversed one: each node points to a single
+ * previous node instead of several following ones. For example, "abc" and "abd" point
+ * to the same prefix entry "ab", which points to "a", which points to 0.
+ * 
+ * The entries are stored in a contiguous byte array, identified by their offsets,
+ * using the {@link DataOutput#writeVInt(int) VINT} format for compression.
+ */
+class WordStorage {
+  /**
+   * A map from word's hash (modulo array's length) into the offset of the 
last entry in {@link
+   * #wordData} with this hash. Negated, if there's more than one entry with 
the same hash.
+   */
+  private final int[] hashTable;
+
+  /**
+   * An array of word entries:
+   *
+   * 
+   *   VINT: the word's last character
+   *   VINT: pointer to the entry for the same word without the last 
character. It's relative:
+   *   the difference of this entry's start and the prefix's entry start. 
0 for single-character
+   *   entries
+   *   Optional, for non-leaf entries only:
+   *   
+   * VINT: the length of the word form data, returned from {@link 
#lookupWord}
+   * n * VINT: the word form data
+   * Optional, for hash-colliding entries only:
+   * 
+   *   BYTE: 1 if the next collision entry has further 
collisions, 0 if it's the
+   *   last of the entries with the same hash
+   *   VINT: (relative) pointer to the previous entry with the 
same hash
+   * 
+   *   
+   * 
+   */
+  private final byte[] wordData;
+
+  private WordStorage(int[] hashTable, byte[] wordData) {
+this.hashTable = hashTable;
+this.wordData = wordData;
+  }
+
+  IntsRef lookupWord(char[] word, int offset, int length) {
+assert length > 0;
+
+int hash = Math.abs(CharsRef.stringHashCode(word, offset, length) % 
hashTable.length);
+int pos = hashTable[hash];
+if (pos == 0) {
+  return null;
+}
+
+boolean collision = pos < 0;
+pos = Math.abs(pos);
+
+char lastChar = word[offset + length - 1];
+ByteArrayDataInput in = new ByteArrayDataInput(wordData);
+while (true) {
+  in.setPosition(pos);
+  char c = (char) in.readVInt();
+  int prevPos = pos - in.readVInt();
+  int beforeForms = in.getPosition();
+  boolean found = c == lastChar && isSameString(word, offset, length - 1, 
prevPos, in);
+  if (!collision && !found) {
+return null;
+  }
+
+  in.setPosition(beforeForms);
+  int formLength = in.readVInt();
+  if (found) {
+IntsRef forms = new IntsRef(formLength);
+readForms(forms, in, formLength);
+return forms;
+  } else {
+skipVInts(in, formLength);
+  }
+
+  collision = in.readByte() == 1;
+  pos -= in.readVInt();
+}
+  }
+
+  private static void skipVInts(ByteArrayDataInput in, in
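The reversed-trie layout described in the Javadoc above can be illustrated with a deliberately simplified model: plain parallel char/int arrays instead of VINT-encoded bytes, and none of the hash table, collision chains, or word-form data. Each entry stores only its last character and a pointer to its prefix's entry; matching a word walks that chain backwards, right to left (the names below are illustrative, not the actual WordStorage API):

```java
public class ReversedTrieDemo {
  // Entries built by hand for the words "a", "ab", "abc", "abd".
  // Entry 0 is a sentinel meaning "empty prefix"; "abc" and "abd" both
  // point to the single shared entry for "ab", which points to "a".
  static final char[] LAST_CHAR = {0, 'a', 'b', 'c', 'd'};
  static final int[]  PREFIX    = {0,  0,   1,   2,   2};

  // Walks backwards from an entry, comparing the word right to left --
  // the same shape as lookupWord's prefix check, minus the hash table,
  // VINT decoding, collision chains, and word-form data.
  public static boolean matches(int entry, String word) {
    int i = word.length() - 1;
    while (entry != 0 && i >= 0) {
      if (LAST_CHAR[entry] != word.charAt(i)) {
        return false;
      }
      entry = PREFIX[entry];
      i--;
    }
    // Match only if the word and the entry chain are exhausted together.
    return entry == 0 && i < 0;
  }

  public static void main(String[] args) {
    System.out.println(matches(3, "abc")); // true
    System.out.println(matches(4, "abd")); // true
    System.out.println(matches(3, "abd")); // false: last characters differ
  }
}
```

In the real structure, the hash table maps a word's hash to its candidate entry offset; here the entry index is passed in directly to keep the sketch minimal.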

[GitHub] [lucene-solr] janhoy opened a new pull request #2464: SOLR-15163 Update DOAP file for solr TLP

2021-03-08 Thread GitBox


janhoy opened a new pull request #2464:
URL: https://github.com/apache/lucene-solr/pull/2464


   See https://issues.apache.org/jira/browse/SOLR-15163 for background.
   
   The DOAP file is used to generate projects.apache.org, so updating it will 
fix that directory listing.









[jira] [Updated] (SOLR-15163) Update DOAP for Solr

2021-03-08 Thread Jira


 [ 
https://issues.apache.org/jira/browse/SOLR-15163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated SOLR-15163:
---
Summary: Update DOAP for Solr  (was: Merge or delete the old Solr project 
at projects.apache.org)

> Update DOAP for Solr
> 
>
> Key: SOLR-15163
> URL: https://issues.apache.org/jira/browse/SOLR-15163
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Anshum Gupta
>Assignee: Anshum Gupta
>Priority: Major
>
> Currently two projects exist at projects.apache.org.
> 1. https://projects.apache.org/project.html?solr (managed by Solr)
> 2. https://projects.apache.org/project.html?lucene-solr (Managed by Lucene)
> We need to merge and/or delete the Solr project listed at #2 into #1.







[jira] [Assigned] (SOLR-15163) Update DOAP for Solr

2021-03-08 Thread Jira


 [ 
https://issues.apache.org/jira/browse/SOLR-15163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl reassigned SOLR-15163:
--

Assignee: Jan Høydahl  (was: Anshum Gupta)

> Update DOAP for Solr
> 
>
> Key: SOLR-15163
> URL: https://issues.apache.org/jira/browse/SOLR-15163
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Anshum Gupta
>Assignee: Jan Høydahl
>Priority: Major
>
> Currently two projects exist at projects.apache.org.
> 1. https://projects.apache.org/project.html?solr (managed by Solr)
> 2. https://projects.apache.org/project.html?lucene-solr (Managed by Lucene)
> We need to merge and/or delete the Solr project listed at #2 into #1.







[GitHub] [lucene-solr] donnerpeter commented on a change in pull request #2459: LUCENE-9825: Hunspell: reverse the "words" trie for faster word lookup/suggestions

2021-03-08 Thread GitBox


donnerpeter commented on a change in pull request #2459:
URL: https://github.com/apache/lucene-solr/pull/2459#discussion_r589246351



##
File path: 
lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/WordStorage.java
##
@@ -0,0 +1,338 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.analysis.hunspell;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.function.BiConsumer;
+import org.apache.lucene.store.ByteArrayDataInput;
+import org.apache.lucene.store.ByteArrayDataOutput;
+import org.apache.lucene.store.DataOutput;
+import org.apache.lucene.util.ArrayUtil;
+import org.apache.lucene.util.CharsRef;
+import org.apache.lucene.util.IntsRef;
+import org.apache.lucene.util.IntsRefBuilder;
+import org.apache.lucene.util.fst.IntSequenceOutputs;
+
+/**
+ * A data structure for memory-efficient word storage and fast 
lookup/enumeration. Each dictionary
+ * entry is stored as:
+ *
+ * 
+ *   the last character
+ *   pointer to a similar entry for the prefix (all characters except the 
last one)
+ *   value data: a list of ints representing word flags and morphological 
data, and a pointer to
+ *   hash collisions, if any
+ * 
+ *
+ * There's only one entry for each prefix, so it's like a trie/{@link
+ * org.apache.lucene.util.fst.FST}, but a reversed one: each node points to a single
+ * previous node instead of several following ones. For example, "abc" and "abd" point
+ * to the same prefix entry "ab", which points to "a", which points to 0.
+ * 
+ * The entries are stored in a contiguous byte array, identified by their offsets,
+ * using the {@link DataOutput#writeVInt(int) VINT} format for compression.
+ */
+class WordStorage {
+  /**
+   * A map from word's hash (modulo array's length) into the offset of the 
last entry in {@link
+   * #wordData} with this hash. Negated, if there's more than one entry with 
the same hash.
+   */
+  private final int[] hashTable;
+
+  /**
+   * An array of word entries:
+   *
+   * 
+   *   VINT: the word's last character
+   *   VINT: pointer to the entry for the same word without the last 
character. It's relative:
+   *   the difference of this entry's start and the prefix's entry start. 
0 for single-character
+   *   entries
+   *   Optional, for non-leaf entries only:
+   *   
+   * VINT: the length of the word form data, returned from {@link 
#lookupWord}
+   * n * VINT: the word form data
+   * Optional, for hash-colliding entries only:
+   * 
+   *   BYTE: 1 if the next collision entry has further 
collisions, 0 if it's the
+   *   last of the entries with the same hash
+   *   VINT: (relative) pointer to the previous entry with the 
same hash
+   * 
+   *   
+   * 
+   */
+  private final byte[] wordData;
+
+  private WordStorage(int[] hashTable, byte[] wordData) {
+this.hashTable = hashTable;
+this.wordData = wordData;
+  }
+
+  IntsRef lookupWord(char[] word, int offset, int length) {
+assert length > 0;
+
+int hash = Math.abs(CharsRef.stringHashCode(word, offset, length) % 
hashTable.length);
+int pos = hashTable[hash];
+if (pos == 0) {
+  return null;
+}
+
+boolean collision = pos < 0;
+pos = Math.abs(pos);
+
+char lastChar = word[offset + length - 1];
+ByteArrayDataInput in = new ByteArrayDataInput(wordData);
+while (true) {
+  in.setPosition(pos);
+  char c = (char) in.readVInt();
+  int prevPos = pos - in.readVInt();
+  int beforeForms = in.getPosition();
+  boolean found = c == lastChar && isSameString(word, offset, length - 1, 
prevPos, in);
+  if (!collision && !found) {
+return null;
+  }
+
+  in.setPosition(beforeForms);
+  int formLength = in.readVInt();
+  if (found) {
+IntsRef forms = new IntsRef(formLength);
+readForms(forms, in, formLength);
+return forms;
+  } else {
+skipVInts(in, formLength);
+  }
+
+  collision = in.readByte() == 1;
+  pos -= in.readVInt();
+}
+  }
+
+  private static void skipVInts(ByteArrayDataInput in, in
