[jira] [Commented] (LUCENE-9387) Remove RAM accounting from LeafReader

2021-03-09 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17298594#comment-17298594
 ] 

Dawid Weiss commented on LUCENE-9387:
-------------------------------------

I really like those reported diagnostics. I wouldn't worry about fuzziness for 
small indexes (where the actual-to-reported ratio can be off, but it's still 
overall not that significant), but I'd still like to see them work fairly well 
for larger segments... 

Why is RamEstimator so much off? (I think there is a test that compares 
reported vs. "actual" size.)

I will not stand in the way if you wish to remove it, but I think memory 
reporting is useful in many situations (debugging, regression control, and even 
at runtime for a rough assessment of the remaining releasable heap without 
forcing the GC -- we do use this last example).
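The fuzziness Dawid mentions is easy to see in a hand-rolled shallow-size estimate in the spirit of Lucene's RamUsageEstimator. The sketch below is illustrative only, not Lucene's actual implementation: the header, reference, and alignment constants are typical 64-bit HotSpot values with compressed oops, hardcoded here where a real estimator probes the JVM.

```java
// Illustrative sketch of shallow on-heap size estimation, in the spirit of
// Lucene's RamUsageEstimator. Constants are typical 64-bit HotSpot values with
// compressed oops; real estimators probe the JVM instead of hardcoding them.
public class ShallowSizeSketch {
    static final int OBJECT_HEADER = 16;   // mark word + class pointer, padded
    static final int REF_SIZE = 4;         // compressed reference
    static final int ALIGNMENT = 8;        // objects are 8-byte aligned

    /** Align a raw size up to the JVM's object alignment. */
    static long align(long size) {
        return (size + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    /** Shallow size of an object with the given primitive payload and reference count. */
    static long shallowSize(long primitiveBytes, int numRefs) {
        return align(OBJECT_HEADER + primitiveBytes + (long) numRefs * REF_SIZE);
    }

    /** Size of a long[]: 16-byte array header + 8 bytes per element, aligned. */
    static long sizeOfLongArray(int length) {
        return align(16 + 8L * length);
    }

    public static void main(String[] args) {
        // An object with one int field and two references: 16 + 4 + 8 = 28 -> 32.
        System.out.println(shallowSize(4, 2));
        // A long[1024]: 16 + 8 * 1024 = 8208, already aligned.
        System.out.println(sizeOfLongArray(1024));
    }
}
```

This also illustrates why small indexes report poorly: fixed per-object overheads (headers, alignment padding, ignored side structures such as field infos) dominate tiny payloads, while for large segments the payload dwarfs them and the estimate converges.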

> Remove RAM accounting from LeafReader
> -
>
> Key: LUCENE-9387
> URL: https://issues.apache.org/jira/browse/LUCENE-9387
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Priority: Blocker
> Fix For: master (9.0)
>
>
> Context for this issue can be found at 
> https://lists.apache.org/thread.html/r06b6a63d8689778bbc2736ec7e4e39bf89ae6973c19f2ec6247690fd%40%3Cdev.lucene.apache.org%3E.
> RAM accounting made sense when readers used lots of memory. E.g. when norms 
> were on heap, we could return memory usage of the norms array and memory 
> estimates would be very close to actual memory usage.
> However nowadays, readers consume very little memory, so RAM accounting has 
> become less valuable. Furthermore providing good estimates has become 
> incredibly complex as we can no longer focus on a couple main contributors to 
> memory usage, but would need to start considering things that we historically 
> ignored, such as field infos, segment infos, NIOFS buffers, etc.
> Let's remove RAM accounting from LeafReader?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] atris commented on pull request #2403: SOLR-15164: Implement Task Management Interface

2021-03-09 Thread GitBox


atris commented on pull request #2403:
URL: https://github.com/apache/lucene-solr/pull/2403#issuecomment-795007935


   @sigram @madrob @anshumg Updated the PR, please take a look and let me know.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org






[GitHub] [lucene-solr] atris commented on a change in pull request #2403: SOLR-15164: Implement Task Management Interface

2021-03-09 Thread GitBox


atris commented on a change in pull request #2403:
URL: https://github.com/apache/lucene-solr/pull/2403#discussion_r591126938



##
File path: solr/solr-ref-guide/src/task-management.adoc
##
@@ -0,0 +1,73 @@
+= Task Management
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+Solr allows users to control their running tasks by monitoring them, marking tasks as cancellable, and cancelling them.
+
+This is achieved using the task management interface. Currently, this is supported for queries.
+
+== Types of Operations
+The task management interface (TMI) supports the following types of operations:
+
+1. List all currently running cancellable tasks.
+2. Cancel a specific task.
+3. Query the status of a specific task.
+
+== Listing All Active Cancellable Tasks
+To list all the active cancellable tasks currently running, please use the following syntax:
+
+`\http://localhost:8983/solr/tasks/list`
+
+=== Sample Response
+
+`{responseHeader={status=0, QTime=11370}, taskList={0=q=*%3A*&canCancel=true&queryUUID=0&_stateVer_=collection1%3A4&wt=javabin&version=2, 5=q=*%3A*&canCancel=true&queryUUID=5&_stateVer_=collection1%3A4&wt=javabin&version=2, 7=q=*%3A*&canCancel=true&queryUUID=7&_stateVer_=collection1%3A4&wt=javabin&version=2}`
+
+== Cancelling An Active Cancellable Task
+To cancel an active task, please use the following syntax:
+
+`\http://localhost:8983/solr/tasks/cancel?cancelUUID=foobar`
+
+=== cancelUUID Parameter
+This parameter is used to specify the UUID of the task to be cancelled.
+
+=== Sample Response
+
+==== If the task UUID was found and successfully cancelled
+
+`{responseHeader={status=0, QTime=39}, status=Query with queryID 85 cancelled successfully}`
+
+==== If the task UUID was not found
+
+`{responseHeader={status=0, QTime=39}, status=Query with queryID 85 not found}`
+
+==== If the cancellation failed
+
+`{responseHeader={status=0, QTime=39}, status=Query with queryID 85 could not be cancelled successfully}`
+
+== Check Status of a Specific Task
+To check the status of a specific task, please use the following syntax:
+
+`\http://localhost:8983/solr/tasks/list?taskUUID=foobar`
+
+=== taskUUID Parameter
+The `taskUUID` parameter can be used to specify a task UUID whose status is to be checked.
+
+=== Sample Response
+`{responseHeader={status=0, QTime=6128}, taskStatus=foobar:true}`

Review comment:
   Yes. This checks if the query exists in the system or not.
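For readers of the doc above, the request URLs it describes can be assembled as follows. This is a sketch only: the host, port, and UUID values are the placeholder values from the doc's own examples, and the code merely builds URLs without assuming any particular HTTP client.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Sketch of building requests against the task management endpoints described
// above. Base URL and UUIDs are placeholder values taken from the doc's
// examples; only URL assembly is shown, no client library is assumed.
public class TaskManagementUrls {
    static final String BASE = "http://localhost:8983/solr/tasks";

    /** List all active cancellable tasks. */
    static String listAll() {
        return BASE + "/list";
    }

    /** Cancel the task with the given UUID. */
    static String cancel(String cancelUUID) {
        return BASE + "/cancel?cancelUUID="
                + URLEncoder.encode(cancelUUID, StandardCharsets.UTF_8);
    }

    /** Check the status of the task with the given UUID. */
    static String status(String taskUUID) {
        return BASE + "/list?taskUUID="
                + URLEncoder.encode(taskUUID, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(listAll());
        System.out.println(cancel("foobar"));
        System.out.println(status("foobar"));
    }
}
```

Encoding the UUID keeps the URLs valid even if task identifiers ever contain reserved characters.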











[jira] [Commented] (SOLR-15198) Slf4j logs threadName incorrectly in some cases

2021-03-09 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-15198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17298544#comment-17298544
 ] 

David Smiley commented on SOLR-15198:
-------------------------------------

I think these thread names with MDC are deliberate. See 
org.apache.solr.common.util.ExecutorUtil.MDCAwareThreadPoolExecutor#execute.
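The renaming behavior referenced there can be sketched as follows. This is a simplified, self-contained illustration of the MDCAwareThreadPoolExecutor idea, not Solr's actual code: a plain Map stands in for slf4j's MDC, and the suffix format is invented for the example.

```java
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch of the MDC-aware executor idea: before running a task, the pool
// thread's name is suffixed with the submitter's MDC context and restored
// afterwards, so the logging framework records the decorated name. A plain
// Map stands in for slf4j's MDC; the suffix format is illustrative only.
public class MdcThreadNameSketch {
    /** Build the decorated thread name from the base name plus MDC entries. */
    static String decorate(String baseName, Map<String, String> mdc) {
        StringBuilder sb = new StringBuilder(baseName).append("-processing");
        mdc.forEach((k, v) -> sb.append(' ').append(k).append(':').append(v));
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> mdc = Map.of("n", "host:8983_solr");
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<String> seen = pool.submit(() -> {
            Thread t = Thread.currentThread();
            String original = t.getName();
            t.setName(decorate(original, mdc)); // what execute() does on entry
            try {
                return t.getName();             // what a logger would record as threadName
            } finally {
                t.setName(original);            // restored on exit
            }
        });
        System.out.println(seen.get());
        pool.shutdown();
    }
}
```

This explains the symptom in the report: the MDC context (node, core, async id, and so on) really is part of the thread's name while the task runs, so it shows up verbatim in the `threadName` field.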

> Slf4j logs threadName incorrectly in some cases
> ---
>
> Key: SOLR-15198
> URL: https://issues.apache.org/jira/browse/SOLR-15198
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: logging
>Affects Versions: 8.6
>Reporter: Megan Carey
>Priority: Minor
>
> I'm running Solr 8.6, and I'm seeing some logs report threadName with the 
> entire log message and MDC. I haven't dug in too much, but CoreContainer logs 
> seem to be the biggest culprit. Not sure if it's an issue of thread naming 
> (including delimiter in name?), SolrLogFormat parsing, or something else 
> altogether.
> ```
> { [
>    CoreAdminHandler.action: CREATE
>    CoreAdminHandler.asyncId: 
> rebalance_replicas_trigger/41b16883a3bcTdzc5xqm9leudgekft79dpj5zl/372246269149506
>    collection: collectionName
>    core: collectionName_shard1_0_0_0_1_0_0_0_1_1_0_1_0_0_replica_s7606
>    level: INFO
>    logger: org.apache.solr.core.CoreContainer
>    message: Creating SolrCore 
> 'collectionName_shard1_0_0_0_1_0_0_0_1_1_0_1_0_0_replica_s7606' using 
> configuration from configset crm, trusted=true
>    node_name: REDACTED_HOSTNAME:8983_solr
>    replica: core_node7607
>    shard: shard1_0_0_0_1_0_0_0_1_1_0_1_0_0
>    threadId: 4861
>    *threadName: 
> parallelCoreAdminExecutor-19-thread-6-processing-n:REDACTED_HOSTNAME:8983_solr
>  x:collectionName_shard1_0_0_0_1_0_0_0_1_1_0_1_0_0_replica_s7606 
> t:Shared-932ac44c-06a4-44d3-ba8f-5a64c8f8d708 
> rebalance_replicas_trigger//41b16883a3bcTdzc5xqm9leudgekft79dpj5zl//372246269149506
>  CREATE*
>    timestamp: 2021-02-03T16:26:56.482Z
>    trace_id: Shared-932ac44c-06a4-44d3-ba8f-5a64c8f8d708
> }
> ```






[jira] [Commented] (LUCENE-9634) Highlighting of degenerate spans on fields *with offsets* doesn't work properly

2021-03-09 Thread Zach Chen (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17298529#comment-17298529
 ] 

Zach Chen commented on LUCENE-9634:
-----------------------------------

Hi [~dweiss], I took a look at this issue and am also not sure what the proper 
way of fixing it is. I'm considering a few possible solutions below, but I am 
wondering if there's a better solution as well. Hence I would like to get your 
opinion before I proceed further (I can also open a PR for discussion if that's 
preferred).

For context, the root cause of the issue is that, unlike the positions read in 
*OffsetsFromPositions#get* via *MatchesIterator#startPosition* and 
*MatchesIterator#endPosition*, which account for the *before* / *after* values 
properly through *ExtendedIntervalIterator#start* and 
*ExtendedIntervalIterator#end* respectively, the offsets read in 
*OffsetsFromMatchIterator#get* via *MatchesIterator#startOffset* and 
*MatchesIterator#endOffset* do not adjust the start and end offsets with the 
*before* / *after* values at all, hence the incorrect offset highlight and the 
test failure in *TestMatchRegionRetriever#testDegenerateIntervalsWithOffsets*. 
Looking at the other OffsetsRetrievalStrategy implementations such as 
*OffsetsFromTokens* and *OffsetsFromValues*, since they don't store or use the 
*before* / *after* values either, I suspect they may have the same issue (but I 
haven't tested them to confirm yet). 

For the solution to this, I'm considering the following two options:
 # Deprecate *OffsetsFromMatchIterator* in favor of *OffsetsFromPositions*. 
These two appear to have similar implementations, and since supporting position 
adjustment with *before* / *after* values in *OffsetsFromMatchIterator* 
necessarily requires processing token position information as well, the 
processing work involved might be the same as in *OffsetsFromPositions* when 
*before* / *after* are used. However, under "typical" scenarios where *before* 
/ *after* adjustment is not needed, *OffsetsFromPositions* does do more work 
than *OffsetsFromMatchIterator* due to the conversion from position to offset 
at the end.
 # Implement *OffsetsFromMatchIterator* similarly to *OffsetsFromTokens* and 
*OffsetsFromValues*, by explicitly analyzing and looping over the token stream 
again. This does require that the *before* / *after* values somehow become 
available in *OffsetsFromMatchIterator*, which may require some signature 
changes.

Another option is to create a new class similar to *ExtendedIntervalIterator*, 
but handle position adjustment within *MatchesIterator#startOffset* and 
*MatchesIterator#endOffset* internally with token stream processing. However, 
this option also appears to require changing quite a few signatures, so it may 
not be ideal.

What do you think about the solutions above?
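The adjustment under discussion can be sketched abstractly. This is an illustrative model, not the actual Lucene classes: the `positionOffsets` table (one {startOffset, endOffset} pair per token position) is a hypothetical stand-in for what re-analyzing the token stream would provide.

```java
// Abstract sketch of the missing adjustment: extend a match's position range
// by `before`/`after`, then map positions back to character offsets. The
// int[][] positionOffsets table is a simplified stand-in for token-stream
// re-analysis; it is not how the Lucene classes are actually structured.
public class DegenerateSpanOffsets {
    /** Return {startOffset, endOffset} for a match extended by before/after positions. */
    static int[] adjustedOffsets(int startPos, int endPos, int before, int after,
                                 int[][] positionOffsets) {
        int from = Math.max(0, startPos - before);
        int to = Math.min(positionOffsets.length - 1, endPos + after);
        return new int[] { positionOffsets[from][0], positionOffsets[to][1] };
    }

    public static void main(String[] args) {
        // Tokens "foo bar baz": one {startOffset, endOffset} pair per position.
        int[][] offsets = { {0, 3}, {4, 7}, {8, 11} };
        // A match on position 1 ("bar") extended by before=1 and after=1
        // should highlight offsets 0..11, not the raw 4..7.
        int[] adjusted = adjustedOffsets(1, 1, 1, 1, offsets);
        System.out.println(adjusted[0] + "-" + adjusted[1]);
    }
}
```

In these terms, the bug is that the raw-offset path returns `{4, 7}` where the position path, extended first, correctly yields `{0, 11}`.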

> Highlighting of degenerate spans on fields *with offsets* doesn't work 
> properly
> ---
>
> Key: LUCENE-9634
> URL: https://issues.apache.org/jira/browse/LUCENE-9634
> Project: Lucene - Core
>  Issue Type: Sub-task
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Minor
>
> Match highlighter works fine with degenerate interval positions when 
> {{OffsetsFromPositions}} strategy is used to compute offsets but will show 
> incorrect offset ranges if offsets are read from directly from the 
> {{MatchIterator}} ({{OffsetsFromMatchIterator}}).






[GitHub] [lucene-solr] tflobbe opened a new pull request #2468: SOLR-15154: Document new options for credentials

2021-03-09 Thread GitBox


tflobbe opened a new pull request #2468:
URL: https://github.com/apache/lucene-solr/pull/2468


   Documentation for the client changes introduced in SOLR-15154









[jira] [Assigned] (SOLR-15154) Let Http2SolrClient pass Basic Auth credentials to all requests

2021-03-09 Thread Tomas Eduardo Fernandez Lobbe (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-15154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomas Eduardo Fernandez Lobbe reassigned SOLR-15154:


Assignee: Tomas Eduardo Fernandez Lobbe

> Let Http2SolrClient pass Basic Auth credentials to all requests
> ---
>
> Key: SOLR-15154
> URL: https://issues.apache.org/jira/browse/SOLR-15154
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrJ
>Reporter: Tomas Eduardo Fernandez Lobbe
>Assignee: Tomas Eduardo Fernandez Lobbe
>Priority: Minor
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> In {{HttpSolrClient}}, one could specify credentials [at the JVM 
> level|https://lucene.apache.org/solr/guide/8_8/basic-authentication-plugin.html#global-jvm-basic-auth-credentials],
>  and that would make all requests to Solr have them. This doesn't work in 
> the Http2 client case, and I think it would be very useful.
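Independent of which client attaches them, JVM-level Basic Auth credentials ultimately reduce to a static Authorization header, sketched below. The `basicauth` property name follows the convention of the ref guide page linked above, but treat the details here as illustrative rather than the Http2SolrClient implementation.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Sketch of what JVM-level Basic Auth credentials boil down to: a
// "user:password" pair, base64-encoded into an Authorization header that the
// client would attach to every request. The "basicauth" system property name
// is illustrative, following the convention in the linked ref guide.
public class BasicAuthHeaderSketch {
    /** Build the HTTP Basic Authorization header value from "user:password". */
    static String basicHeader(String credentials) {
        return "Basic " + Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // e.g. the JVM was started with -Dbasicauth=solr:SolrRocks
        String creds = System.getProperty("basicauth", "solr:SolrRocks");
        System.out.println(basicHeader(creds));
    }
}
```

The improvement tracked here is about making the Http2 clients pick such JVM-level credentials up automatically, as HttpSolrClient already does.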






[jira] [Resolved] (SOLR-15216) Invalid JS Object Key data.followers.currentData

2021-03-09 Thread Tomas Eduardo Fernandez Lobbe (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-15216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomas Eduardo Fernandez Lobbe resolved SOLR-15216.
--
Fix Version/s: 8.9
   Resolution: Fixed

Merged the PR. Thanks [~deanpearce]!

> Invalid JS Object Key data.followers.currentData
> 
>
> Key: SOLR-15216
> URL: https://issues.apache.org/jira/browse/SOLR-15216
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Admin UI
>Affects Versions: 8.7, 8.8, 8.8.1
>Reporter: Dean Pearce
>Priority: Major
> Fix For: 8.9
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Minor bug in the Admin UI Angular code: a line was changed to 
> `settings.currentTime = parseDateToEpoch(data.follower.currentDate);` but the 
> underlying API still refers to `data.slave`. I believe this is fixed in the 
> master stream, as the migration to the new leader/follower naming was 
> complete, but it is broken in 8.x (8.7 and onwards).






[jira] [Updated] (SOLR-15216) Invalid JS Object Key data.followers.currentData

2021-03-09 Thread Tomas Eduardo Fernandez Lobbe (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-15216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomas Eduardo Fernandez Lobbe updated SOLR-15216:
-
Priority: Minor  (was: Major)

> Invalid JS Object Key data.followers.currentData
> 
>
> Key: SOLR-15216
> URL: https://issues.apache.org/jira/browse/SOLR-15216
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Admin UI
>Affects Versions: 8.7, 8.8, 8.8.1
>Reporter: Dean Pearce
>Priority: Minor
> Fix For: 8.9
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Minor bug in the Admin UI Angular code: a line was changed to 
> `settings.currentTime = parseDateToEpoch(data.follower.currentDate);` but the 
> underlying API still refers to `data.slave`. I believe this is fixed in the 
> master stream, as the migration to the new leader/follower naming was 
> complete, but it is broken in 8.x (8.7 and onwards).






[jira] [Commented] (SOLR-15216) Invalid JS Object Key data.followers.currentData

2021-03-09 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-15216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17298431#comment-17298431
 ] 

ASF subversion and git services commented on SOLR-15216:


Commit d76dfaf68a86fe3609315f2d37a1edbe43f6b439 in lucene-solr's branch 
refs/heads/branch_8x from Dean Pearce
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=d76dfaf ]

SOLR-15216 Fix for Invalid Reference to data.followers in Admin UI (#2456)

* SOLR-15216 Fix for incorrect reference to data.followers when API in 8.x 
returns data.slave.

* SOLR-15216 Update CHANGES.txt with bug fix.

> Invalid JS Object Key data.followers.currentData
> 
>
> Key: SOLR-15216
> URL: https://issues.apache.org/jira/browse/SOLR-15216
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Admin UI
>Affects Versions: 8.7, 8.8, 8.8.1
>Reporter: Dean Pearce
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Minor bug in the Admin UI Angular code: a line was changed to 
> `settings.currentTime = parseDateToEpoch(data.follower.currentDate);` but the 
> underlying API still refers to `data.slave`. I believe this is fixed in the 
> master stream, as the migration to the new leader/follower naming was 
> complete, but it is broken in 8.x (8.7 and onwards).












[GitHub] [lucene-solr] tflobbe merged pull request #2456: SOLR-15216 Fix for Invalid Reference to data.followers in Admin UI

2021-03-09 Thread GitBox


tflobbe merged pull request #2456:
URL: https://github.com/apache/lucene-solr/pull/2456


   






[jira] [Updated] (SOLR-15237) Distributed search with index sharding is not working with basic authentication plugin enabled

2021-03-09 Thread Samir Huremovic (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-15237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samir Huremovic updated SOLR-15237:
---
Affects Version/s: 7.7.3
  Description: 
Issue confirmed for 7.7.3, 8.7 and 8.8.1.

Steps to reproduce are:
1. Follow the docs for setting up distributed search 
(https://solr.apache.org/guide/8_8/distributed-search-with-index-sharding.html).
1.1 Stop both nodes after confirming that distributed search works without 
basic auth (last step).
2. Enable basic authentication plugin for both nodes, example for node1 
{{example/nodes/node1/security.json}}:
{noformat}
"authentication":{ 
   "blockUnknown": true, 
   "class":"solr.BasicAuthPlugin",
   "credentials":{"solr":"IV0EHq1OnNrj6gvRCwvFwTrZ1+z1oBbnQdiVC3otuq0= 
Ndd7LKvVBAaZIF0QAVi1ekCfAJXr1GGfLtRUXhgrF8c="}, 
   "realm":"My Solr users", 
   "forwardCredentials": false 
}}
{noformat}
3. Configure {{shardsWhitelist}} in {{solr.xml}} for both nodes, example for 
node1 {{example/nodes/node1/solr.xml}}
{noformat}
<shardHandlerFactory name="shardHandlerFactory" class="HttpShardHandlerFactory">
  <int name="socketTimeout">${socketTimeout:60}</int>
  <int name="connTimeout">${connTimeout:6}</int>
  <str name="shardsWhitelist">localhost:8984,localhost:8985</str>
</shardHandlerFactory>
{noformat}
4. Start both nodes.
5. Confirm that searching on one node with basic auth works with {{curl --user 
solr:SolrRocks 
"http://localhost:8984/solr/core1/select?q=*:*&wt=xml&indent=true"}}
6. Confirm that searching on both nodes does not work with {{curl --user 
solr:SolrRocks 
"http://localhost:8984/solr/core1/select?q=*:*&indent=true&shards=localhost:8985/solr/core1,localhost:8984/solr/core1&fl=id,name&wt=xml"}}

Error:
{noformat}
❯ curl --user solr:SolrRocks 
"http://localhost:8984/solr/core1/select?q=*:*&indent=true&shards=localhost:8985/solr/core1,localhost:8984/solr/core1&fl=id,name&wt=xml";




  401
  173
  
*:*
localhost:8985/solr/core1,localhost:8984/solr/core1
true
id,name
xml
  


  
org.apache.solr.client.solrj.impl.BaseHttpSolrClient$RemoteSolrException
org.apache.solr.client.solrj.impl.BaseHttpSolrClient$RemoteSolrException
  
  Error from server at null: Expected mime type 
application/octet-stream but got text/html. 


Error 401 require authentication

HTTP ERROR 401 require authentication

URI:/solr/core1/select
STATUS:401
MESSAGE:require authentication
SERVLET:default
401
{noformat}

See also SOLR-14569, which seems similar, but the patch provided there does not 
help after applying it to 8.8.1, so I think this is not the same issue. Adjust 
priority as necessary. For cases where basic auth is required, this means we 
cannot use Solr as of now.
[jira] [Updated] (SOLR-15237) Distributed search with index sharding is not working with basic authentication plugin enabled



 [ 
https://issues.apache.org/jira/browse/SOLR-15237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samir Huremovic updated SOLR-15237:
---
Description: 
Issue confirmed for 8.7 and 8.8.1.

Steps to reproduce are:
1. Follow the docs for setting up distributed search 
(https://solr.apache.org/guide/8_8/distributed-search-with-index-sharding.html).
1.1 Stop both nodes after confirming that distributed search works without 
basic auth (last step).
2. Enable basic authentication plugin for both nodes, example for node1 
{{example/nodes/node1/security.json}}:
{noformat}
"authentication":{ 
   "blockUnknown": true, 
   "class":"solr.BasicAuthPlugin",
   "credentials":{"solr":"IV0EHq1OnNrj6gvRCwvFwTrZ1+z1oBbnQdiVC3otuq0= 
Ndd7LKvVBAaZIF0QAVi1ekCfAJXr1GGfLtRUXhgrF8c="}, 
   "realm":"My Solr users", 
   "forwardCredentials": false 
}}
{noformat}
3. Configure {{shardsWhitelist}} in {{solr.xml}} for both nodes, example for 
node1 {{example/nodes/node1/solr.xml}}
{noformat}
<shardHandlerFactory name="shardHandlerFactory" class="HttpShardHandlerFactory">
  <int name="socketTimeout">${socketTimeout:60}</int>
  <int name="connTimeout">${connTimeout:6}</int>
  <str name="shardsWhitelist">localhost:8984,localhost:8985</str>
</shardHandlerFactory>
{noformat}
4. Start both nodes.
5. Confirm that searching on one node with basic auth works with {{curl --user 
solr:SolrRocks 
"http://localhost:8984/solr/core1/select?q=*:*&wt=xml&indent=true"}}
6. Confirm that searching on both nodes does not work with {{curl --user 
solr:SolrRocks 
"http://localhost:8984/solr/core1/select?q=*:*&indent=true&shards=localhost:8985/solr/core1,localhost:8984/solr/core1&fl=id,name&wt=xml"}}

Error:
{noformat}
❯ curl --user solr:SolrRocks 
"http://localhost:8984/solr/core1/select?q=*:*&indent=true&shards=localhost:8985/solr/core1,localhost:8984/solr/core1&fl=id,name&wt=xml";




  401
  173
  
*:*
localhost:8985/solr/core1,localhost:8984/solr/core1
true
id,name
xml
  


  
org.apache.solr.client.solrj.impl.BaseHttpSolrClient$RemoteSolrException
org.apache.solr.client.solrj.impl.BaseHttpSolrClient$RemoteSolrException
  
  Error from server at null: Expected mime type 
application/octet-stream but got text/html. 


Error 401 require authentication

HTTP ERROR 401 require authentication

URI:/solr/core1/select
STATUS:401
MESSAGE:require authentication
SERVLET:default
401
{noformat}

See also SOLR-14569, which seems similar, but the patch provided there does not 
help after applying it to 8.8.1, so I think this is not the same issue. Adjust 
priority as necessary. For cases where basic auth is required, this means we 
cannot use Solr as of now.

[jira] [Updated] (SOLR-15237) Distributed search with index sharding is not working with basic authentication plugin enabled



 [ 
https://issues.apache.org/jira/browse/SOLR-15237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samir Huremovic updated SOLR-15237:
---
Description: 
Issue confirmed for 8.7 and 8.8.1.

Steps to reproduce are:
1. Following the docs for setting up distributed search 
(https://solr.apache.org/guide/8_8/distributed-search-with-index-sharding.html).
1.1 Stop both nodes after confirming that distributed search works without 
basic auth (last step).
2. Enable basic authentication plugin for both nodes, example for 
{{example/nodes/node1/security.json}}:
{noformat}
"authentication":{ 
   "blockUnknown": true, 
   "class":"solr.BasicAuthPlugin",
   "credentials":{"solr":"IV0EHq1OnNrj6gvRCwvFwTrZ1+z1oBbnQdiVC3otuq0= 
Ndd7LKvVBAaZIF0QAVi1ekCfAJXr1GGfLtRUXhgrF8c="}, 
   "realm":"My Solr users", 
   "forwardCredentials": false 
}}
{noformat}
3. Configure {{shardsWhitelist}} in {{solr.xml}} of each node 
{{example/nodes/node1/solr.xml}}
{noformat}

${socketTimeout:60}
${connTimeout:6}
localhost:8984,localhost:8985
  
{noformat}
4. Start both nodes.
5. Confirm that searching on one node with basic auth works with {{curl --user 
solr:SolrRocks 
"http://localhost:8984/solr/core1/select?q=*:*&wt=xml&indent=true"}}
6. Confirm that searching on both nodes does not work with {{curl --user 
solr:SolrRocks 
"http://localhost:8984/solr/core1/select?q=*:*&indent=true&shards=localhost:8985/solr/core1,localhost:8984/solr/core1&fl=id,name&wt=xml";

Error:
{noformat}
❯ curl --user solr:SolrRocks 
"http://localhost:8984/solr/core1/select?q=*:*&indent=true&shards=localhost:8985/solr/core1,localhost:8984/solr/core1&fl=id,name&wt=xml";




  401
  173
  
*:*
localhost:8985/solr/core1,localhost:8984/solr/core1
true
id,name
xml
  


  
org.apache.solr.client.solrj.impl.BaseHttpSolrClient$RemoteSolrException
org.apache.solr.client.solrj.impl.BaseHttpSolrClient$RemoteSolrException
  
  Error from server at null: Expected mime type 
application/octet-stream but got text/html. 


Error 401 require authentication

HTTP ERROR 401 require authentication

URI:/solr/core1/select
STATUS:401
MESSAGE:require authentication
SERVLET:default
401
{noformat}

See also SOLR-14569, which seems similar, but the patch provided there does not help after applying it to 8.8.1, so I think this is not the same issue. Adjust priority as necessary. For cases where basic auth is required, this means we cannot use Solr as of now.
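As an aside on the 401s above: `curl --user solr:SolrRocks` simply sends an `Authorization: Basic <base64(user:password)>` header with the request, and the distributed request to the second shard apparently fails that check. A minimal sketch of how that header value is formed (Python for brevity; the credentials are the ones from the reproduction steps, and `basic_auth_header` is an illustrative helper, not a Solr API):

```python
import base64


def basic_auth_header(user: str, password: str) -> str:
    """Build the Authorization header value that `curl --user` sends."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


# Credentials from the reproduction steps above.
print(basic_auth_header("solr", "SolrRocks"))  # → Basic c29scjpTb2xyUm9ja3M=
```

Any client that forwards this header (or for which Solr forwards credentials between nodes) would pass the `blockUnknown` check; the bug report shows the inter-shard hop does not.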

[jira] [Commented] (LUCENE-9827) Small segments are slower to merge due to stored fields since 8.7



[ 
https://issues.apache.org/jira/browse/LUCENE-9827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17298412#comment-17298412
 ] 

Robert Muir commented on LUCENE-9827:
-

You can find some info on what I did before to target this with benchmarks in
the original issue that added tooDirty(): LUCENE-6183
Before that issue, we'd never be able to bulk-merge in practice because of 
"alignment" issues: LUCENE-5646

Bulk merge as done with this tooDirty() stuff in LUCENE-6183 has some
limitations: e.g. it only happens for an append-only use case; if there are any
deletes, we trigger recompression. So we should keep the worst case in mind
going forward too!

But you could address this problem in many totally different ways if we take a 
step back. For example, we could re-compress every merge only with some 
probability, which achieves the same goal (amortizing the recompression cost). 
If there are deletes, maybe we can write some kind of "tombstones" and 
bulk-copy existing data. Eventually it all gets merged away. 

But these are not simple tradeoffs and could result in surprises, so the
tooDirty() approach was trying to achieve the least surprise.
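The "re-compress every merge only with some probability" idea can be sketched as follows (a purely hypothetical policy, not Lucene code; `should_recompress` and `p` are illustrative names). Each merge flips a biased coin, so the expensive recompression path is taken only a fraction p of the time and its cost is amortized across merges, while dirty data still gets cleaned up eventually:

```python
import random


def should_recompress(rng: random.Random, p: float = 0.1) -> bool:
    """Hypothetical merge policy: pay the full recompression cost with
    probability p; otherwise bulk-copy the already-compressed blocks."""
    return rng.random() < p


# Over many merges, roughly a fraction p of them recompress, so the
# expected per-merge cost is p * recompress_cost + (1 - p) * copy_cost.
rng = random.Random(42)  # seeded for reproducibility
recompressed = sum(should_recompress(rng, 0.1) for _ in range(10_000))
print(f"{recompressed} of 10000 merges recompressed")
```

As the comment above notes, the tradeoff is variance: a run of unlucky coin flips can leave a segment dirty longer than a deterministic threshold would.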


> Small segments are slower to merge due to stored fields since 8.7
> -
>
> Key: LUCENE-9827
> URL: https://issues.apache.org/jira/browse/LUCENE-9827
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Adrien Grand
>Priority: Minor
> Attachments: total-merge-time-by-num-docs-on-small-segments.png
>
>
> [~dm] and [~dimitrisli] looked into an interesting case where indexing slowed 
> down after upgrading to 8.7. After digging we identified that this was due to 
> the merging of stored fields, which had become slower on average.
> This is due to changes to stored fields, which now have top-level blocks that 
> are then split into sub-blocks and compressed using shared dictionaries (one 
> dictionary per top-level block). As the top-level blocks are larger than they 
> were before, segments are more likely to be considered "dirty" by the merging 
> logic. Dirty segments are segments where 1% or more of the data consists of 
> incomplete blocks. For large segments, the size of blocks doesn't really 
> affect the dirtiness of segments: if you flush a segment that has 100 blocks 
> or more, it will never be considered dirty as only the last block may be 
> incomplete. But for small segments it does: for instance if your segment is 
> only 10 blocks, it is very likely considered dirty given that the last block 
> is always incomplete. And the fact that we increased the top-level block size 
> means that segments that used to be considered clean might now be considered 
> dirty.
> And indeed benchmarks reported that while large stored fields merges became 
> slightly faster after upgrading to 8.7, the smaller merges actually became 
> slower. See attached chart, which gives the total merge time as a function of 
> the number of documents in the segment.
> I don't know how we can address this, this is a natural consequence of the 
> larger block size, which is needed to achieve better compression ratios. But 
> I wanted to open an issue about it in case someone has a bright idea how we 
> could make things better.
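The dirtiness heuristic in the description above can be modeled with simple arithmetic (illustrative Python; the actual logic lives in Lucene's Java stored-fields merging code and differs in detail, and `is_dirty` plus the strict `>` comparison are assumptions chosen to match the 100-block example in the description). Assuming only the last block of a flushed segment is incomplete, the incomplete fraction is roughly 1/num_blocks:

```python
def is_dirty(num_blocks: int, dirty_threshold: float = 0.01) -> bool:
    """Simplified model: only the final block is incomplete, so the dirty
    fraction is ~1/num_blocks; the segment is dirty when that fraction
    exceeds the threshold."""
    incomplete_fraction = 1.0 / num_blocks
    return incomplete_fraction > dirty_threshold


# Larger top-level blocks mean fewer blocks for the same data, pushing
# small segments over the threshold:
print(is_dirty(10))   # 10 blocks → ~10% incomplete → dirty
print(is_dirty(100))  # 100+ blocks → ≤1% incomplete → clean
```

This shows why the 8.7 block-size increase hurts only small segments: a segment with 100+ blocks stays under the 1% threshold no matter the block size, while halving the block count of a 20-block segment tips it into dirty territory.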



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-15237) Distributed search with index sharding is not working with basic authentication plugin enabled



 [ 
https://issues.apache.org/jira/browse/SOLR-15237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samir Huremovic updated SOLR-15237:
---
Description: 
Issue confirmed for 8.7 and 8.8.1.

Steps to reproduce are:
1. Following the docs for setting up distributed search 
(https://solr.apache.org/guide/8_8/distributed-search-with-index-sharding.html).
1.1 Stop both nodes after confirming that distributed search works without 
basic auth (last step).
2. Enable basic authentication plugin for both nodes, example for 
{{example/nodes/node1/security.json}}:
{{"authentication":{ 
   "blockUnknown": true, 
   "class":"solr.BasicAuthPlugin",
   "credentials":{"solr":"IV0EHq1OnNrj6gvRCwvFwTrZ1+z1oBbnQdiVC3otuq0= 
Ndd7LKvVBAaZIF0QAVi1ekCfAJXr1GGfLtRUXhgrF8c="}, 
   "realm":"My Solr users", 
   "forwardCredentials": false 

3. Configure {{shardsWhitelist}} in {{solr.xml}} of each node 
{{example/nodes/node1/solr.xml}}
{{
${socketTimeout:60}
${connTimeout:6}
localhost:8984,localhost:8985
  }}
4. Start both nodes.
5. Confirm that searching on one node with basic auth works with {{curl --user 
solr:SolrRocks 
"http://localhost:8984/solr/core1/select?q=*:*&wt=xml&indent=true"}}
6. Confirm that searching on both nodes does not work with {{curl --user 
solr:SolrRocks 
"http://localhost:8984/solr/core1/select?q=*:*&indent=true&shards=localhost:8985/solr/core1,localhost:8984/solr/core1&fl=id,name&wt=xml";

Error:
{{❯ curl --user solr:SolrRocks 
"http://localhost:8984/solr/core1/select?q=*:*&indent=true&shards=localhost:8985/solr/core1,localhost:8984/solr/core1&fl=id,name&wt=xml";




  401
  173
  
*:*
localhost:8985/solr/core1,localhost:8984/solr/core1
true
id,name
xml
  


  
org.apache.solr.client.solrj.impl.BaseHttpSolrClient$RemoteSolrException
org.apache.solr.client.solrj.impl.BaseHttpSolrClient$RemoteSolrException
  
  Error from server at null: Expected mime type 
application/octet-stream but got text/html. 


Error 401 require authentication

HTTP ERROR 401 require authentication

URI:/solr/core1/select
STATUS:401
MESSAGE:require authentication
SERVLET:default
401 }}

See also SOLR-14569, which seems similar, but the patch provided there does not help after applying it to 8.8.1, so I think this is not the same issue. Adjust priority as necessary. For cases where basic auth is required, this means we cannot use Solr as of now.

[jira] [Created] (SOLR-15237) Distributed search with index sharding is not working with basic authentication plugin enabled

Samir Huremovic created SOLR-15237:
--

 Summary: Distributed search with index sharding is not working 
with basic authentication plugin enabled
 Key: SOLR-15237
 URL: https://issues.apache.org/jira/browse/SOLR-15237
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: Authentication
Affects Versions: 8.8.1, 8.7
Reporter: Samir Huremovic


Issue confirmed for 8.7 and 8.8.1.

Steps to reproduce are:
1. Following the docs for setting up distributed search 
(https://solr.apache.org/guide/8_8/distributed-search-with-index-sharding.html).
1.1 Stop both nodes after confirming that distributed search works without 
basic auth (last step).
2. Enable basic authentication plugin for both nodes, example for 
{{example/nodes/node1/security.json}}:
{{
{
"authentication":{ 
   "blockUnknown": true, 
   "class":"solr.BasicAuthPlugin",
   "credentials":{"solr":"IV0EHq1OnNrj6gvRCwvFwTrZ1+z1oBbnQdiVC3otuq0= 
Ndd7LKvVBAaZIF0QAVi1ekCfAJXr1GGfLtRUXhgrF8c="}, 
   "realm":"My Solr users", 
   "forwardCredentials": false 
}}

}}
3. Configure {{shardsWhitelist}} in {{solr.xml}} of each node 
{{example/nodes/node1/solr.xml}}
{{
  
${socketTimeout:60}
${connTimeout:6}
localhost:8984,localhost:8985
  
}}
4. Start both nodes.
5. Confirm that searching on one node with basic auth works with {{curl --user 
solr:SolrRocks 
"http://localhost:8984/solr/core1/select?q=*:*&wt=xml&indent=true"}}
6. Confirm that searching on both nodes does not work with {{curl --user 
solr:SolrRocks 
"http://localhost:8984/solr/core1/select?q=*:*&indent=true&shards=localhost:8985/solr/core1,localhost:8984/solr/core1&fl=id,name&wt=xml";

Error:
{{
❯ curl --user solr:SolrRocks 
"http://localhost:8984/solr/core1/select?q=*:*&indent=true&shards=localhost:8985/solr/core1,localhost:8984/solr/core1&fl=id,name&wt=xml";




  401
  173
  
*:*
localhost:8985/solr/core1,localhost:8984/solr/core1
true
id,name
xml
  


  
org.apache.solr.client.solrj.impl.BaseHttpSolrClient$RemoteSolrException
org.apache.solr.client.solrj.impl.BaseHttpSolrClient$RemoteSolrException
  
  Error from server at null: Expected mime type 
application/octet-stream but got text/html. 


Error 401 require authentication

HTTP ERROR 401 require authentication

URI:/solr/core1/select
STATUS:401
MESSAGE:require authentication
SERVLET:default
401 }}

See also SOLR-14569, which seems similar, but the patch provided there does not help after applying it to 8.8.1, so I think this is not the same issue. Adjust priority as necessary. For cases where basic auth is required, this means we cannot use Solr as of now.

[jira] [Updated] (SOLR-15237) Distributed search with index sharding is not working with basic authentication plugin enabled



 [ 
https://issues.apache.org/jira/browse/SOLR-15237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samir Huremovic updated SOLR-15237:
---
Description: 
Issue confirmed for 8.7 and 8.8.1.

Steps to reproduce are:
1. Following the docs for setting up distributed search 
(https://solr.apache.org/guide/8_8/distributed-search-with-index-sharding.html).
1.1 Stop both nodes after confirming that distributed search works without 
basic auth (last step).
2. Enable basic authentication plugin for both nodes, example for 
{{example/nodes/node1/security.json}}:
{{ {
"authentication":{ 
   "blockUnknown": true, 
   "class":"solr.BasicAuthPlugin",
   "credentials":{"solr":"IV0EHq1OnNrj6gvRCwvFwTrZ1+z1oBbnQdiVC3otuq0= 
Ndd7LKvVBAaZIF0QAVi1ekCfAJXr1GGfLtRUXhgrF8c="}, 
   "realm":"My Solr users", 
   "forwardCredentials": false 
}} }}
3. Configure {{shardsWhitelist}} in {{solr.xml}} of each node 
{{example/nodes/node1/solr.xml}}
{{
${socketTimeout:60}
${connTimeout:6}
localhost:8984,localhost:8985
  }}
4. Start both nodes.
5. Confirm that searching on one node with basic auth works with {{curl --user 
solr:SolrRocks 
"http://localhost:8984/solr/core1/select?q=*:*&wt=xml&indent=true"}}
6. Confirm that searching on both nodes does not work with {{curl --user 
solr:SolrRocks 
"http://localhost:8984/solr/core1/select?q=*:*&indent=true&shards=localhost:8985/solr/core1,localhost:8984/solr/core1&fl=id,name&wt=xml";

Error:
{{❯ curl --user solr:SolrRocks 
"http://localhost:8984/solr/core1/select?q=*:*&indent=true&shards=localhost:8985/solr/core1,localhost:8984/solr/core1&fl=id,name&wt=xml";




  401
  173
  
*:*
localhost:8985/solr/core1,localhost:8984/solr/core1
true
id,name
xml
  


  
org.apache.solr.client.solrj.impl.BaseHttpSolrClient$RemoteSolrException
org.apache.solr.client.solrj.impl.BaseHttpSolrClient$RemoteSolrException
  
  Error from server at null: Expected mime type 
application/octet-stream but got text/html. 


Error 401 require authentication

HTTP ERROR 401 require authentication

URI:/solr/core1/select
STATUS:401
MESSAGE:require authentication
SERVLET:default
401 }}

See also SOLR-14569, which seems similar, but the patch provided there does not help after applying it to 8.8.1, so I think this is not the same issue. Adjust priority as necessary. For cases where basic auth is required, this means we cannot use Solr as of now.

[jira] [Created] (SOLR-15236) Distributed search with index sharding is not working with basic authentication plugin enabled

Samir Huremovic created SOLR-15236:
--

 Summary: Distributed search with index sharding is not working 
with basic authentication plugin enabled
 Key: SOLR-15236
 URL: https://issues.apache.org/jira/browse/SOLR-15236
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: Authentication
Affects Versions: 8.8.1, 8.7
 Environment: Arch Linux, Zulu JDK 11, Solr 8.8.1
Reporter: Samir Huremovic


Issue confirmed for 8.7 and 8.8.1.

Steps to reproduce are:
1. Following the docs for setting up distributed search 
(https://solr.apache.org/guide/8_8/distributed-search-with-index-sharding.html).
1.1 Stop both nodes after confirming that distributed search works without 
basic auth (last step).
2. Enable basic authentication plugin for both nodes, example for 
{{example/nodes/node1/security.json}}:
{{
{
"authentication":{ 
   "blockUnknown": true, 
   "class":"solr.BasicAuthPlugin",
   "credentials":{"solr":"IV0EHq1OnNrj6gvRCwvFwTrZ1+z1oBbnQdiVC3otuq0= 
Ndd7LKvVBAaZIF0QAVi1ekCfAJXr1GGfLtRUXhgrF8c="}, 
   "realm":"My Solr users", 
   "forwardCredentials": false 
}}

}}
3. Configure {{shardsWhitelist}} in {{solr.xml}} of each node 
{{example/nodes/node1/solr.xml}}
{{
  
${socketTimeout:60}
${connTimeout:6}
localhost:8984,localhost:8985
  
}}
4. Start both nodes.
5. Confirm that searching on one node with basic auth works with {{curl --user 
solr:SolrRocks 
"http://localhost:8984/solr/core1/select?q=*:*&wt=xml&indent=true"}}
6. Confirm that searching on both nodes does not work with {{curl --user 
solr:SolrRocks 
"http://localhost:8984/solr/core1/select?q=*:*&indent=true&shards=localhost:8985/solr/core1,localhost:8984/solr/core1&fl=id,name&wt=xml";

Error:
{{
❯ curl --user solr:SolrRocks 
"http://localhost:8984/solr/core1/select?q=*:*&indent=true&shards=localhost:8985/solr/core1,localhost:8984/solr/core1&fl=id,name&wt=xml";




  401
  173
  
*:*
localhost:8985/solr/core1,localhost:8984/solr/core1
true
id,name
xml
  


  
org.apache.solr.client.solrj.impl.BaseHttpSolrClient$RemoteSolrException
org.apache.solr.client.solrj.impl.BaseHttpSolrClient$RemoteSolrException
  
  Error from server at null: Expected mime type 
application/octet-stream but got text/html. 


Error 401 require authentication

HTTP ERROR 401 require authentication

URI:/solr/core1/select
STATUS:401
MESSAGE:require authentication
SERVLET:default
401 }}

See also SOLR-14569, which seems similar, but the patch provided there does not help after applying it to 8.8.1, so I think this is not the same issue. Adjust priority as necessary. For cases where basic auth is required, this means we cannot use Solr as of now.

[GitHub] [lucene-solr-operator] HoustonPutman closed issue #228: Handle dependency licenses correctly



HoustonPutman closed issue #228:
URL: https://github.com/apache/lucene-solr-operator/issues/228


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org






[GitHub] [lucene-solr-operator] HoustonPutman merged pull request #229: Resolve dependency license handling



HoustonPutman merged pull request #229:
URL: https://github.com/apache/lucene-solr-operator/pull/229


   






[GitHub] [lucene-solr-operator] HoustonPutman commented on pull request #231: Add conditional dependency for zk-operator helm chart



HoustonPutman commented on pull request #231:
URL: 
https://github.com/apache/lucene-solr-operator/pull/231#issuecomment-794557931


   Hey @chaicesan , I went ahead and changed the variable name myself, and 
added backwards compatibility with the old `useZkOperator` option. I also 
updated various docs across the project.
   
   I'm not so sure about using Chart.lock and keeping the 
`charts/zookeeper-operator-0.2.9.tgz` and `Chart.lock` in the repo. That's 
something that can be generated at release time, when building the chart. Is 
there a reason why you included it in the PR specifically?






[jira] [Created] (SOLR-15235) Distributed search with index sharding is not working with basic authentication plugin enabled

Samir Huremovic created SOLR-15235:
--

 Summary: Distributed search with index sharding is not working 
with basic authentication plugin enabled
 Key: SOLR-15235
 URL: https://issues.apache.org/jira/browse/SOLR-15235
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: Authentication
Affects Versions: 8.8.1, 8.7
 Environment: Arch Linux, zulu JDK 11, Solr 8.8.1
Reporter: Samir Huremovic


Steps to reproduce (from 
https://solr.apache.org/guide/8_8/distributed-search-with-index-sharding.html)
1. Create two local servers and index two files as described in the docs.
2. Check that search is working as described in the docs.
3. Stop the instances.
4. Add {{security.json}} for both nodes with configuration for auth plugin, for 
example
{{{
"authentication":{ 
   "blockUnknown": true, 
   "class":"solr.BasicAuthPlugin",
   "credentials":{"solr":"IV0EHq1OnNrj6gvRCwvFwTrZ1+z1oBbnQdiVC3otuq0= 
Ndd7LKvVBAaZIF0QAVi1ekCfAJXr1GGfLtRUXhgrF8c="}, 
   "realm":"My Solr users", 
   "forwardCredentials": false 

5. Add both nodes to the {{shardsWhitelist}} in both node's {{solr.xml}}, e.g. 
{{example/nodes/node1/solr.xml}}:
{{
  
${socketTimeout:60}
${connTimeout:6}
localhost:8984,localhost:8985
  
}}
6. Start both nodes again.
7. Try searching on a single node, should work: {{curl --user solr:SolrRocks 
"http://localhost:8984/solr/core1/select?q=*:*&wt=xml&indent=true"}}
8. Try distributed search on both nodes, should not work anymore: 
{{curl --user solr:SolrRocks "http://localhost:8984/solr/core1/select?q=*:*&indent=true&shards=localhost:8985/solr/core1,localhost:8984/solr/core1&fl=id,name&wt=xml"}}

Error:
{{
❯ curl --user solr:SolrRocks 
"http://localhost:8984/solr/core1/select?q=*:*&indent=true&shards=localhost:8985/solr/core1,localhost:8984/solr/core1&fl=id,name&wt=xml";




  401
  173
  
*:*
localhost:8985/solr/core1,localhost:8984/solr/core1
true
id,name
xml
  


  
org.apache.solr.client.solrj.impl.BaseHttpSolrClient$RemoteSolrException
org.apache.solr.client.solrj.impl.BaseHttpSolrClient$RemoteSolrException
  
  Error from server at null: Expected mime type 
application/octet-stream but got text/html. 


Error 401 require authentication

HTTP ERROR 401 require authentication

URI:/solr/core1/select
STATUS:401
MESSAGE:require authentication
SERVLET:default
401 }}

Please adjust the priority if needed; for us this means we cannot use Solr with basic auth enabled, which means we cannot use it at all where it is a requirement. I have linked a related issue that seems similar. I applied the patch from that issue to 8.8.1 and it did not help in my case, so I think it is not the exact same issue.

[GitHub] [lucene-solr-operator] anshumg commented on a change in pull request #229: Resolve dependency license handling



anshumg commented on a change in pull request #229:
URL: 
https://github.com/apache/lucene-solr-operator/pull/229#discussion_r59092



##
File path: dependency_licenses.csv
##
@@ -0,0 +1,53 @@
+cloud.google.com/go/compute/metadata,Unknown,Apache-2.0

Review comment:
   Makes sense :)
   
   Let's keep it here.
   
   








[GitHub] [lucene-site] mocobeta edited a comment on pull request #50: Edits related to switching from master to main branch



mocobeta edited a comment on pull request #50:
URL: https://github.com/apache/lucene-site/pull/50#issuecomment-794527801


   I've changed this to a Draft, because I thought it's not harmful. Please 
click "Ready for review" button below anytime when you feel it's ready to be 
merged.






[GitHub] [lucene-site] mocobeta edited a comment on pull request #50: Edits related to switching from master to main branch



mocobeta edited a comment on pull request #50:
URL: https://github.com/apache/lucene-site/pull/50#issuecomment-794527801


   I've changed this to a Draft, because I thought it's not harmful. Please 
click "Ready for review" button below anytime when you feel it's ready.






[GitHub] [lucene-site] mocobeta commented on pull request #50: Edits related to switching from master to main branch



mocobeta commented on pull request #50:
URL: https://github.com/apache/lucene-site/pull/50#issuecomment-794527801


   I've changed this to a Draft, because I thought it's not harmful. Please 
click "Ready for review" button above anytime when you feel it's ready.






[GitHub] [lucene-site] mocobeta commented on pull request #50: Edits related to switching from master to main branch



mocobeta commented on pull request #50:
URL: https://github.com/apache/lucene-site/pull/50#issuecomment-794523349


   > NB: Don't merge this PR to main branch until we are ready to do the switch 
(i.e. INFRA has changed default branch?), else a site build will kick off and 
change the site prematurely.
   
   Would it be better to change the status to "Draft" so it cannot be merged 
accidentally? GitHub allows changing open PRs to drafts via the "Convert to 
draft" link (right under the "Reviewers" section).






[GitHub] [lucene-solr-operator] HoustonPutman commented on a change in pull request #229: Resolve dependency license handling



HoustonPutman commented on a change in pull request #229:
URL: 
https://github.com/apache/lucene-solr-operator/pull/229#discussion_r590757492



##
File path: dependency_licenses.csv
##
@@ -0,0 +1,53 @@
+cloud.google.com/go/compute/metadata,Unknown,Apache-2.0

Review comment:
   If not, that's fine. We could remove the whole workflow around this. If you don't think it's necessary, then I can tear it all out.
   
   I think it's good to have so that we can keep track of the dependency licenses that PRs are adding.








[GitHub] [lucene-solr] rmuir commented on pull request #2459: LUCENE-9825: Hunspell: reverse the "words" trie for faster word lookup/suggestions



rmuir commented on pull request #2459:
URL: https://github.com/apache/lucene-solr/pull/2459#issuecomment-794518533


   > > Do you have candidates in mind for using this structure?
   > 
   > Well, maybe anywhere we use a UTF16 based FST today, and we might want 
faster lookup and can afford more RAM used ... e.g. maybe `SynonymGraphFilter`?
   > 
   > Anyway this can come later :)
   
   Maybe it is interesting for use cases like kuromoji (Japanese) and nori (Korean), which also use FSTs today (sometimes with hacky root-arc caching to improve performance). Their use case is similar to a decompounder+stemmer in what they are doing.






[GitHub] [lucene-solr-operator] anshumg commented on a change in pull request #229: Resolve dependency license handling



anshumg commented on a change in pull request #229:
URL: 
https://github.com/apache/lucene-solr-operator/pull/229#discussion_r590713737



##
File path: NOTICE
##
@@ -26,3 +26,12 @@ distributed under the License is distributed on an "AS IS" 
BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
+
+The Solr Operator project is built using Kubebuilder, which is Apache 2.0 
licensed.
+https://github.com/kubernetes-sigs/kubebuilder
+
+The reconcileStorageFinalizer logic in
+controllers/solrcloud_controller.go
+was influenced by the same logic in the Zookeeper Operator, which is Apache 
2.0 licensed.
+
https://github.com/pravega/zookeeper-operator/blob/v0.2.9/pkg/controller/zookeepercluster/zookeepercluster_controller.go#L629)
+Copyright (c) 2020 Dell Inc., or its subsidiaries. All Rights Reserved.

Review comment:
   nit: missing newline

##
File path: dependency_licenses.csv
##
@@ -0,0 +1,53 @@
+cloud.google.com/go/compute/metadata,Unknown,Apache-2.0

Review comment:
   Do we need to store this file (in this format) ?

##
File path: hack/install_dependencies.sh
##
@@ -32,3 +32,6 @@ if !(which kubebuilder && (kubebuilder version | grep 
${kubebuilder_version}));
 else
   echo "Kubebuilder already installed at $(which kubebuilder)"
 fi
+
+# Install go-licenses
+go get github.com/google/go-licenses

Review comment:
   nit: new line

##
File path: .github/workflows/docker.yaml
##
@@ -17,4 +17,4 @@ jobs:
 
   # Cleanup & Install dependencies
   - run: docker --version
-  - run: make docker-vendor-build
\ No newline at end of file
+  - run: make docker-build

Review comment:
   nit: new line








[GitHub] [lucene-site] janhoy opened a new pull request #52: PYLUCENE-57 Add DOAP file for pylucene



janhoy opened a new pull request #52:
URL: https://github.com/apache/lucene-site/pull/52


   See https://issues.apache.org/jira/browse/PYLUCENE-57
   
   Next step is to add a reference to the file 
(https://lucene.apache.org/pylucene/doap.rdf) to 
https://svn.apache.org/repos/asf/comdev/projects.apache.org/trunk/data/projects.xml
 in svn, and next day it will be listed as a sub project under Lucene TLP at 
https://projects.apache.org/committee.html?lucene






[GitHub] [lucene-solr-operator] HoustonPutman commented on pull request #229: Resolve dependency license handling



HoustonPutman commented on pull request #229:
URL: 
https://github.com/apache/lucene-solr-operator/pull/229#issuecomment-794475698


   This approach sounds good to go according to the feedback in 
[LEGAL-562](https://issues.apache.org/jira/browse/LEGAL-562)






[jira] [Commented] (SOLR-15234) PRS has issues with many concurrent creations



[ 
https://issues.apache.org/jira/browse/SOLR-15234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17298303#comment-17298303
 ] 

Mike Drob commented on SOLR-15234:
--

This was in master, I’ll look for branch_8x tests shortly

> PRS has issues with many concurrent creations
> -
>
> Key: SOLR-15234
> URL: https://issues.apache.org/jira/browse/SOLR-15234
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Reporter: Mike Drob
>Priority: Major
>
> Running in a cloud environment, I am creating lots of 2x2 collections on a 
> single node with 10 threads using PRS. These tests work fine without PRS.
> I see the following errors in my logs:
> {noformat}
> 2021-03-09 05:57:46.311 ERROR 
> (OverseerStateUpdate-72069990450593800-hostname:8983_solr-n_02) [   ] 
> o.a.s.c.c.PerReplicaStatesOps Multi-op exception: [core_node8:2:A:L, 
> core_node6:4:D, core_node2:4:D, core_node4:2:A:L]
> {noformat}
> It would be good to include MDC logging on this if we can, so that I can know 
> what collection the error describes.
> A little bit later:
> {noformat}
> 2021-03-09 05:57:46.694 INFO  (qtp594858858-16) [   
> x:collection-78_shard2_replica_n7] o.a.s.h.a.PrepRecoveryOp Going to wait for 
> coreNodeName: core_node6, state: recovering, checkLive: true, onlyIfLeader: 
> true, onlyIfLeaderActive: true
> 2021-03-09 05:57:46.695 ERROR (qtp594858858-16) [   
> x:collection-78_shard2_replica_n7] o.a.s.h.RequestHandlerBase 
> org.apache.solr.common.SolrException: core not 
> found:collection-78_shard2_replica_n7
>   at 
> org.apache.solr.handler.admin.PrepRecoveryOp.execute(PrepRecoveryOp.java:71)
>   at 
> org.apache.solr.handler.admin.CoreAdminOperation.execute(CoreAdminOperation.java:367)
>   at 
> org.apache.solr.handler.admin.CoreAdminHandler$CallInfo.call(CoreAdminHandler.java:397)
>   at 
> org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:181)
>   at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:214)
>   at 
> org.apache.solr.servlet.HttpSolrCall.handleAdmin(HttpSolrCall.java:836)
>   at 
> org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:800)
>   at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:545)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:518)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:432)
>   at 
> org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:201)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:548)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>   at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:602)
>   at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1612)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1434)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:501)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1582)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1349)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>   at 
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:191)
>   at 
> org.eclipse.jetty.server.handler.InetAccessHandler.handle(InetAccessHandler.java:177)
>   at 
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146)
>   at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
>   at 
> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:322)
>   at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
>   at org.eclipse.jetty.server.Server.handle(Server.java:516)
>   at 
> org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:383)

[jira] [Commented] (SOLR-15229) Link Solr DOAP correctly from website



[ 
https://issues.apache.org/jira/browse/SOLR-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17298283#comment-17298283
 ] 

Houston Putman commented on SOLR-15229:
---

When we update the releaseWizard to do Solr alone, should we move the DOAP file 
to the website?

> Link Solr DOAP correctly from website
> -
>
> Key: SOLR-15229
> URL: https://issues.apache.org/jira/browse/SOLR-15229
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
>
> It is recommended to have the DOAP file on the project's website, as it is 
> confusing that the DOAP file exists in different versions in the various code 
> branches (it is only the DOAP files from master branch that is used). So in 
> this Jira we'll
>  * -Move Solr's DOAP file to the solr-site.git repo (e.g. 
> [https://solr.apache.org/doap-Solr.rdf)|https://solr.apache.org/doap-Solr.rdf]-
>  * Update .htaccess redirect so [https://solr.apache.org/doap/solr.rdf] 
> points to our DOAP in git
>  * Update 
> [https://svn.apache.org/repos/asf/comdev/projects.apache.org/trunk/data/projects.xml]
>  to link to the new location
> Last step will be to update .htaccess once again to point to new solr.git 
> once it is live (see [https://github.com/apache/solr-site/pull/5)] 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9828) Update Lucene's DOAP file redirect



[ 
https://issues.apache.org/jira/browse/LUCENE-9828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17298259#comment-17298259
 ] 

Jan Høydahl commented on LUCENE-9828:
-

Merge https://github.com/apache/lucene-site/pull/51 once new git repo exists

> Update Lucene's DOAP file redirect
> --
>
> Key: LUCENE-9828
> URL: https://issues.apache.org/jira/browse/LUCENE-9828
> Project: Lucene - Core
>  Issue Type: Sub-task
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
>
> ASF recommends hosting the DOAP file on the website, not in svn or git. 
> However, we have some release script dependencies needing it to be in main 
> git repo. There is a .htaccess entry linking 
> [https://lucene.apache.org/core/soap.rdf] to the git location.
> Once code moves to lucene.git, we need to update the git location on 
> lucene-site.git






[GitHub] [lucene-site] janhoy opened a new pull request #51: LUCENE-9828 Update DOAP link



janhoy opened a new pull request #51:
URL: https://github.com/apache/lucene-site/pull/51


   Merge this when new git repo exists
   
   https://issues.apache.org/jira/browse/LUCENE-9828






[jira] [Updated] (LUCENE-9828) Update Lucene's DOAP file redirect



 [ 
https://issues.apache.org/jira/browse/LUCENE-9828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated LUCENE-9828:

Description: 
ASF recommends hosting the DOAP file on the website, not in svn or git. 
However, we have some release script dependencies needing it to be in main git 
repo. There is a .htaccess entry linking 
[https://lucene.apache.org/core/soap.rdf] to the git location.

Once code moves to lucene.git, we need to update the git location on 
lucene-site.git

  was:
ASF recommends hosting the DOAP file on the website, not in svn or git. 
However, we have some release script dependencies needing it to be in main git 
repo. There is a .htaccess entry linking 
[https://lucene.apache.org/core/soap.rdf] to the git location.



See SOLR-15229 for the Solr DOAP


> Update Lucene's DOAP file redirect
> --
>
> Key: LUCENE-9828
> URL: https://issues.apache.org/jira/browse/LUCENE-9828
> Project: Lucene - Core
>  Issue Type: Sub-task
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
>
> ASF recommends hosting the DOAP file on the website, not in svn or git. 
> However, we have some release script dependencies needing it to be in main 
> git repo. There is a .htaccess entry linking 
> [https://lucene.apache.org/core/soap.rdf] to the git location.
> Once code moves to lucene.git, we need to update the git location on 
> lucene-site.git






[jira] [Updated] (LUCENE-9828) Update Lucene's DOAP file redirect



 [ 
https://issues.apache.org/jira/browse/LUCENE-9828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated LUCENE-9828:

Description: 
ASF recommends hosting the DOAP file on the website, not in svn or git. 
However, we have some release script dependencies needing it to be in main git 
repo. There is a .htaccess entry linking 
[https://lucene.apache.org/core/soap.rdf] to the git location.



See SOLR-15229 for the Solr DOAP

  was:
ASF recommends hosting the DOAP file on the website, not in svn or git. So this 
task deletes the file from main git repo and adds it to lucene-site.git under 
[https://lucene.apache.org/doap/lucene.rdf]

There will be a commit to the main repo with updated releaseWizard commands

See SOLR-15229 for the Solr DOAP


> Update Lucene's DOAP file redirect
> --
>
> Key: LUCENE-9828
> URL: https://issues.apache.org/jira/browse/LUCENE-9828
> Project: Lucene - Core
>  Issue Type: Sub-task
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
>
> ASF recommends hosting the DOAP file on the website, not in svn or git. 
> However, we have some release script dependencies needing it to be in main 
> git repo. There is a .htaccess entry linking 
> [https://lucene.apache.org/core/soap.rdf] to the git location.
> See SOLR-15229 for the Solr DOAP






[jira] [Updated] (LUCENE-9828) Update Lucene's DOAP file redirect



 [ 
https://issues.apache.org/jira/browse/LUCENE-9828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated LUCENE-9828:

Summary: Update Lucene's DOAP file redirect  (was: Move Lucene's DOAP file 
to website repo)

> Update Lucene's DOAP file redirect
> --
>
> Key: LUCENE-9828
> URL: https://issues.apache.org/jira/browse/LUCENE-9828
> Project: Lucene - Core
>  Issue Type: Sub-task
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
>
> ASF recommends hosting the DOAP file on the website, not in svn or git. So 
> this task deletes the file from main git repo and adds it to lucene-site.git 
> under [https://lucene.apache.org/doap/lucene.rdf]
> There will be a commit to the main repo with updated releaseWizard commands
> See SOLR-15229 for the Solr DOAP






[jira] [Updated] (SOLR-15229) Link Solr DOAP correctly from website



 [ 
https://issues.apache.org/jira/browse/SOLR-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated SOLR-15229:
---
Description: 
It is recommended to have the DOAP file on the project's website, as it is 
confusing that the DOAP file exists in different versions in the various code 
branches (it is only the DOAP files from master branch that is used). So in 
this Jira we'll
 * -Move Solr's DOAP file to the solr-site.git repo (e.g. 
[https://solr.apache.org/doap-Solr.rdf)|https://solr.apache.org/doap-Solr.rdf]-
 * Update .htaccess redirect so [https://solr.apache.org/doap/solr.rdf] points 
to our DOAP in git
 * Update 
[https://svn.apache.org/repos/asf/comdev/projects.apache.org/trunk/data/projects.xml]
 to link to the new location

Last step will be to update .htaccess once again to point to new solr.git once 
it is live (see [https://github.com/apache/solr-site/pull/5)] 

  was:
It is recommended to have the DOAP file on the project's website, as it is 
confusing that the DOAP file exists in different versions in the various code 
branches (it is only the DOAP files from master branch that is used). So in 
this Jira we'll
 * Move Solr's DOAP file to the solr-site.git repo (e.g. 
[https://solr.apache.org/doap-Solr.rdf)|https://solr.apache.org/doap-Solr.rdf]
 * Update 
[https://svn.apache.org/repos/asf/comdev/projects.apache.org/trunk/data/projects.xml]
 to link to the new location


> Link Solr DOAP correctly from website
> -
>
> Key: SOLR-15229
> URL: https://issues.apache.org/jira/browse/SOLR-15229
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
>
> It is recommended to have the DOAP file on the project's website, as it is 
> confusing that the DOAP file exists in different versions in the various code 
> branches (it is only the DOAP files from master branch that is used). So in 
> this Jira we'll
>  * -Move Solr's DOAP file to the solr-site.git repo (e.g. 
> [https://solr.apache.org/doap-Solr.rdf)|https://solr.apache.org/doap-Solr.rdf]-
>  * Update .htaccess redirect so [https://solr.apache.org/doap/solr.rdf] 
> points to our DOAP in git
>  * Update 
> [https://svn.apache.org/repos/asf/comdev/projects.apache.org/trunk/data/projects.xml]
>  to link to the new location
> Last step will be to update .htaccess once again to point to new solr.git 
> once it is live (see [https://github.com/apache/solr-site/pull/5)] 






[GitHub] [lucene-solr] mikemccand commented on pull request #2459: LUCENE-9825: Hunspell: reverse the "words" trie for faster word lookup/suggestions



mikemccand commented on pull request #2459:
URL: https://github.com/apache/lucene-solr/pull/2459#issuecomment-794291450


   > Do you have candidates in mind for using this structure?
   
   Well, maybe anywhere we use a UTF16 based FST today, and we might want 
faster lookup and can afford more RAM used ... e.g. maybe `SynonymGraphFilter`?
   
   Anyway this can come later :)






[jira] [Updated] (SOLR-15229) Link Solr DOAP correctly from website



 [ 
https://issues.apache.org/jira/browse/SOLR-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated SOLR-15229:
---
Summary: Link Solr DOAP correctly from website  (was: Move Solr's DOAP file 
from git to solr website)

> Link Solr DOAP correctly from website
> -
>
> Key: SOLR-15229
> URL: https://issues.apache.org/jira/browse/SOLR-15229
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
>
> It is recommended to have the DOAP file on the project's website, as it is 
> confusing that the DOAP file exists in different versions in the various code 
> branches (it is only the DOAP files from master branch that is used). So in 
> this Jira we'll
>  * Move Solr's DOAP file to the solr-site.git repo (e.g. 
> [https://solr.apache.org/doap-Solr.rdf)|https://solr.apache.org/doap-Solr.rdf]
>  * Update 
> [https://svn.apache.org/repos/asf/comdev/projects.apache.org/trunk/data/projects.xml]
>  to link to the new location






[jira] [Commented] (SOLR-15229) Move Solr's DOAP file from git to solr website



[ 
https://issues.apache.org/jira/browse/SOLR-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17298250#comment-17298250
 ] 

Jan Høydahl commented on SOLR-15229:


Updated 
[https://svn.apache.org/repos/asf/comdev/projects.apache.org/trunk/data/projects.xml]
 

> Move Solr's DOAP file from git to solr website
> --
>
> Key: SOLR-15229
> URL: https://issues.apache.org/jira/browse/SOLR-15229
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
>
> It is recommended to have the DOAP file on the project's website, as it is 
> confusing that the DOAP file exists in different versions in the various code 
> branches (it is only the DOAP files from master branch that is used). So in 
> this Jira we'll
>  * Move Solr's DOAP file to the solr-site.git repo (e.g. 
> [https://solr.apache.org/doap-Solr.rdf)|https://solr.apache.org/doap-Solr.rdf]
>  * Update 
> [https://svn.apache.org/repos/asf/comdev/projects.apache.org/trunk/data/projects.xml]
>  to link to the new location






[GitHub] [lucene-solr] mikemccand commented on a change in pull request #2459: LUCENE-9825: Hunspell: reverse the "words" trie for faster word lookup/suggestions



mikemccand commented on a change in pull request #2459:
URL: https://github.com/apache/lucene-solr/pull/2459#discussion_r590625033



##
File path: 
lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/WordStorage.java
##
@@ -0,0 +1,338 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.analysis.hunspell;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.function.BiConsumer;
+import org.apache.lucene.store.ByteArrayDataInput;
+import org.apache.lucene.store.ByteArrayDataOutput;
+import org.apache.lucene.store.DataOutput;
+import org.apache.lucene.util.ArrayUtil;
+import org.apache.lucene.util.CharsRef;
+import org.apache.lucene.util.IntsRef;
+import org.apache.lucene.util.IntsRefBuilder;
+import org.apache.lucene.util.fst.IntSequenceOutputs;
+
+/**
+ * A data structure for memory-efficient word storage and fast lookup/enumeration. Each dictionary
+ * entry is stored as:
+ *
+ * <ul>
+ *   <li>the last character
+ *   <li>pointer to a similar entry for the prefix (all characters except the last one)
+ *   <li>value data: a list of ints representing word flags and morphological data, and a pointer to
+ *       hash collisions, if any
+ * </ul>
+ *
+ * There's only one entry for each prefix, so it's like a trie/{@link
+ * org.apache.lucene.util.fst.FST}, but a reversed one: each node points to a single previous node
+ * instead of several following ones. For example, "abc" and "abd" point to the same prefix entry
+ * "ab" which points to "a" which points to 0.
+ *
+ * <p>The entries are stored in a contiguous byte array, identified by their offsets, using {@link
+ * DataOutput#writeVInt} VINT format for compression.
+ */
+class WordStorage {
+  /**
+   * A map from word's hash (modulo array's length) into the offset of the last entry in {@link
+   * #wordData} with this hash. Negated, if there's more than one entry with the same hash.
+   */
+  private final int[] hashTable;
+
+  /**
+   * An array of word entries:
+   *
+   * <ul>
+   *   <li>VINT: the word's last character
+   *   <li>VINT: pointer to the entry for the same word without the last character. It's relative:
+   *       the difference of this entry's start and the prefix's entry start. 0 for single-character
+   *       entries
+   *   <li>Optional, for non-leaf entries only:
+   *       <ul>
+   *         <li>VINT: the length of the word form data, returned from {@link #lookupWord}
+   *         <li>n * VINT: the word form data
+   *         <li>Optional, for hash-colliding entries only:
+   *             <ul>
+   *               <li>BYTE: 1 if the next collision entry has further collisions, 0 if it's the
+   *                   last of the entries with the same hash
+   *               <li>VINT: (relative) pointer to the previous entry with the same hash
+   *             </ul>
+   *       </ul>
+   * </ul>
+   */
+  private final byte[] wordData;
+
+  private WordStorage(int[] hashTable, byte[] wordData) {
+    this.hashTable = hashTable;
+    this.wordData = wordData;
+  }
+
+  IntsRef lookupWord(char[] word, int offset, int length) {
+    assert length > 0;
+
+    int hash = Math.abs(CharsRef.stringHashCode(word, offset, length) % hashTable.length);
+    int pos = hashTable[hash];
+    if (pos == 0) {
+      return null;
+    }
+
+    boolean collision = pos < 0;
+    pos = Math.abs(pos);
+
+    char lastChar = word[offset + length - 1];
+    ByteArrayDataInput in = new ByteArrayDataInput(wordData);
+    while (true) {
+      in.setPosition(pos);
+      char c = (char) in.readVInt();
+      int prevPos = pos - in.readVInt();
+      int beforeForms = in.getPosition();
+      boolean found = c == lastChar && isSameString(word, offset, length - 1, prevPos, in);
+      if (!collision && !found) {
+        return null;
+      }
+
+      in.setPosition(beforeForms);
+      int formLength = in.readVInt();
+      if (found) {
+        IntsRef forms = new IntsRef(formLength);
+        readForms(forms, in, formLength);
+        return forms;
+      } else {
+        skipVInts(in, formLength);
+      }
+
+      collision = in.readByte() == 1;
+      pos -= in.readVInt();
+    }
+  }
+
+  private static void skipVInts(ByteArrayDataInput in, int

[GitHub] [lucene-solr] mikemccand commented on a change in pull request #2459: LUCENE-9825: Hunspell: reverse the "words" trie for faster word lookup/suggestions



mikemccand commented on a change in pull request #2459:
URL: https://github.com/apache/lucene-solr/pull/2459#discussion_r590622129



##
File path: 
lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/WordStorage.java
##
@@ -0,0 +1,338 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.analysis.hunspell;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.function.BiConsumer;
+import org.apache.lucene.store.ByteArrayDataInput;
+import org.apache.lucene.store.ByteArrayDataOutput;
+import org.apache.lucene.store.DataOutput;
+import org.apache.lucene.util.ArrayUtil;
+import org.apache.lucene.util.CharsRef;
+import org.apache.lucene.util.IntsRef;
+import org.apache.lucene.util.IntsRefBuilder;
+import org.apache.lucene.util.fst.IntSequenceOutputs;
+
+/**
+ * A data structure for memory-efficient word storage and fast lookup/enumeration. Each dictionary
+ * entry is stored as:
+ *
+ * <ul>
+ *   <li>the last character
+ *   <li>pointer to a similar entry for the prefix (all characters except the last one)
+ *   <li>value data: a list of ints representing word flags and morphological data, and a pointer to
+ *       hash collisions, if any
+ * </ul>
+ *
+ * There's only one entry for each prefix, so it's like a trie/{@link
+ * org.apache.lucene.util.fst.FST}, but a reversed one: each node points to a single previous node
+ * instead of several following ones. For example, "abc" and "abd" point to the same prefix entry
+ * "ab" which points to "a" which points to 0.
+ *
+ * <p>The entries are stored in a contiguous byte array, identified by their offsets, using {@link
+ * DataOutput#writeVInt} VINT format for compression.
+ */
+class WordStorage {
+  /**
+   * A map from a word's hash (modulo the array's length) to the offset of the last entry in {@link
+   * #wordData} with this hash. Negated if there's more than one entry with the same hash.
+   */
+  private final int[] hashTable;
+
+  /**
+   * An array of word entries:
+   *
+   * <ul>
+   *   <li>VINT: the word's last character
+   *   <li>VINT: pointer to the entry for the same word without the last character. It's relative:
+   *       the difference between this entry's start and the prefix entry's start. 0 for
+   *       single-character entries
+   *   <li>Optional, for non-leaf entries only:
+   *       <ul>
+   *         <li>VINT: the length of the word form data, returned from {@link #lookupWord}
+   *         <li>n * VINT: the word form data
+   *         <li>Optional, for hash-colliding entries only:
+   *             <ul>
+   *               <li>BYTE: 1 if the next collision entry has further collisions, 0 if it's the
+   *                   last of the entries with the same hash
+   *               <li>VINT: (relative) pointer to the previous entry with the same hash
+   *             </ul>
+   *       </ul>
+   * </ul>
+   */
+  private final byte[] wordData;
+
+  private WordStorage(int[] hashTable, byte[] wordData) {
+    this.hashTable = hashTable;
+    this.wordData = wordData;
+  }
+
+  IntsRef lookupWord(char[] word, int offset, int length) {
+    assert length > 0;
+
+    int hash = Math.abs(CharsRef.stringHashCode(word, offset, length) % hashTable.length);
+    int pos = hashTable[hash];
+    if (pos == 0) {
+      return null;
+    }
+
+    boolean collision = pos < 0;
+    pos = Math.abs(pos);
+
+    char lastChar = word[offset + length - 1];
+    ByteArrayDataInput in = new ByteArrayDataInput(wordData);
+    while (true) {
+      in.setPosition(pos);
+      char c = (char) in.readVInt();
+      int prevPos = pos - in.readVInt();
+      int beforeForms = in.getPosition();
+      boolean found = c == lastChar && isSameString(word, offset, length - 1, prevPos, in);
+      if (!collision && !found) {
+        return null;
+      }
+
+      in.setPosition(beforeForms);
+      int formLength = in.readVInt();
+      if (found) {
+        IntsRef forms = new IntsRef(formLength);
+        readForms(forms, in, formLength);
+        return forms;
+      } else {
+        skipVInts(in, formLength);
+      }
+
+      collision = in.readByte() == 1;

Review comment:
   Good idea (storing word length) -- that'd make lookups (that hashed to 
the sa
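The reversed-trie layout described in the WordStorage javadoc above can be sketched in plain Java. This is a hypothetical, simplified model for illustration only: it uses object pointers instead of byte offsets and omits the hashing, collision chains, and VInt packing; the names `ReversedTrie` and `Node` are ours, not Lucene's.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model: each entry stores only its last character plus a pointer to
// the entry for its prefix, so "abc" and "abd" share the single entry for "ab".
public class ReversedTrie {
  record Node(char lastChar, Node prefix) {}

  private final Map<String, Node> nodes = new HashMap<>();

  // Add a word, reusing the existing prefix chain if it is already stored.
  Node add(String word) {
    Node existing = nodes.get(word);
    if (existing != null) {
      return existing;
    }
    Node prefix = word.length() > 1 ? add(word.substring(0, word.length() - 1)) : null;
    Node n = new Node(word.charAt(word.length() - 1), prefix);
    nodes.put(word, n);
    return n;
  }

  // Walk backwards through the prefix pointers to reconstruct the word.
  static String reconstruct(Node n) {
    StringBuilder sb = new StringBuilder();
    for (Node cur = n; cur != null; cur = cur.prefix()) {
      sb.append(cur.lastChar());
    }
    return sb.reverse().toString();
  }

  public static void main(String[] args) {
    ReversedTrie t = new ReversedTrie();
    Node abc = t.add("abc");
    Node abd = t.add("abd");
    System.out.println(abc.prefix() == abd.prefix()); // true: shared "ab" entry
    System.out.println(reconstruct(abc));             // abc
  }
}
```

This mirrors the "each node points to a single previous node" property: enumeration walks backwards, which is why there is exactly one entry per distinct prefix.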

[GitHub] [lucene-solr] mikemccand commented on pull request #2459: LUCENE-9825: Hunspell: reverse the "words" trie for faster word lookup/suggestions



mikemccand commented on pull request #2459:
URL: https://github.com/apache/lucene-solr/pull/2459#issuecomment-794278954


   > bq. Progress not perfection!
   > 
   > I hate that quote, Mike. :)
   
   Wow, why? :)



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9827) Small segments are slower to merge due to stored fields since 8.7



[ 
https://issues.apache.org/jira/browse/LUCENE-9827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17298245#comment-17298245
 ] 

Robert Muir commented on LUCENE-9827:
-

{quote}
I don't know how we can address this, this is a natural consequence of the 
larger block size, which is needed to achieve better compression ratios. But I 
wanted to open an issue about it in case someone has a bright idea how we could 
make things better.
{quote}

Well that's the real issue. I'd argue the current heuristic was always bad 
here, even before the block size increase. But now it happens to be worse 
because blocksize increased, because compression ratio is better, etc etc. But 
lets forget about exact sizes and try to just improve the logic...

I'll inline the code to make it easier to look at:
{code}
  /**
   * Returns true if we should recompress this reader, even though we could bulk merge compressed
   * data.
   *
   * <p>The last chunk written for a segment is typically incomplete, so without recompressing, in
   * some worst-case situations (e.g. frequent reopen with tiny flushes), over time the compression
   * ratio can degrade. This is a safety switch.
   */
  boolean tooDirty(Lucene90CompressingStoredFieldsReader candidate) {
    // more than 1% dirty, or more than hard limit of 1024 dirty chunks
    return candidate.getNumDirtyChunks() > 1024
        || candidate.getNumDirtyDocs() * 100 > candidate.getNumDocs();
  }
{code}

Please ignore the safety switch for now (the 1024-dirty-chunks limit), as it isn't 
relevant to small merges. The logic is really just "more than 1% dirty docs".

I think we should avoid recompressing the data over and over for small merges. 
In other words, I don't want to recompress everything for Merge1, Merge2, 
Merge3, Merge4, and Merge5, and then finally at Merge6 start bulk copying. We were 
probably already doing this to some extent before.

I'd like the formula to have an "expected value" baked into it. In other 
words, we only deliberately recompress everything if it's going to leave us 
less dirty than we were before, or something like that (e.g. we think 
we'll fold some dirty chunks into a "complete chunk" and make "progress").

This would have a cost in compression ratio for the small segments, but it 
shouldn't be bad in the big scheme of things.
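For experimenting with the thresholds, the quoted heuristic can be restated as a standalone function. This is a sketch with our own names; the real method reads these counts off the stored fields reader, and the exact counters are Lucene-internal.

```java
// Standalone restatement of the quoted tooDirty() heuristic (names are ours).
public class DirtyHeuristic {
  static final int MAX_DIRTY_CHUNKS = 1024; // hard safety limit from the quoted code

  // Returns true if the segment should be recompressed rather than bulk-copied:
  // more than 1024 dirty chunks, or more than 1% of docs in dirty (incomplete) chunks.
  static boolean tooDirty(long numDirtyChunks, long numDirtyDocs, long numDocs) {
    return numDirtyChunks > MAX_DIRTY_CHUNKS || numDirtyDocs * 100 > numDocs;
  }

  public static void main(String[] args) {
    // A tiny flushed segment: one incomplete chunk holding 300 of 1000 docs -> 30% dirty.
    System.out.println(tooDirty(1, 300, 1000));       // true: always recompressed
    // A large segment: the same trailing incomplete chunk out of 1M docs -> 0.03% dirty.
    System.out.println(tooDirty(1, 300, 1_000_000));  // false: bulk-copied
  }
}
```

The contrast between the two calls is exactly the small-merge problem discussed above: tiny segments almost always trip the 1% test, so they get recompressed at every merge level.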

> Small segments are slower to merge due to stored fields since 8.7
> -
>
> Key: LUCENE-9827
> URL: https://issues.apache.org/jira/browse/LUCENE-9827
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Adrien Grand
>Priority: Minor
> Attachments: total-merge-time-by-num-docs-on-small-segments.png
>
>
> [~dm] and [~dimitrisli] looked into an interesting case where indexing slowed 
> down after upgrading to 8.7. After digging we identified that this was due to 
> the merging of stored fields, which had become slower on average.
> This is due to changes to stored fields, which now have top-level blocks that 
> are then split into sub-blocks and compressed using shared dictionaries (one 
> dictionary per top-level block). As the top-level blocks are larger than they 
> were before, segments are more likely to be considered "dirty" by the merging 
> logic. Dirty segments are segments where 1% of the data or more consists of 
> incomplete blocks. For large segments, the size of blocks doesn't really 
> affect the dirtiness of segments: if you flush a segment that has 100 blocks 
> or more, it will never be considered dirty as only the last block may be 
> incomplete. But for small segments it does: for instance if your segment is 
> only 10 blocks, it is very likely considered dirty given that the last block 
> is always incomplete. And the fact that we increased the top-level block size 
> means that segments that used to be considered clean might now be considered 
> dirty.
> And indeed benchmarks reported that while large stored fields merges became 
> slightly faster after upgrading to 8.7, the smaller merges actually became 
> slower. See attached chart, which gives the total merge time as a function of 
> the number of documents in the segment.
> I don't know how we can address this, this is a natural consequence of the 
> larger block size, which is needed to achieve better compression ratios. But 
> I wanted to open an issue about it in case someone has a bright idea how we 
> could make things better.
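The block-count arithmetic in the description above can be checked directly: with a single trailing incomplete block, a segment crosses the 1% threshold whenever it has fewer than roughly 100 blocks. This sketch assumes equal-sized blocks for simplicity; the real check counts dirty docs, not blocks.

```java
// Back-of-the-envelope check of the "dirty" threshold from the issue description.
public class DirtinessBound {
  // Fraction of data in the single trailing incomplete block, assuming roughly
  // equal-sized blocks (a simplification; Lucene counts dirty docs, not blocks).
  static double dirtyFraction(int totalBlocks) {
    return 1.0 / totalBlocks;
  }

  public static void main(String[] args) {
    System.out.println(dirtyFraction(10) > 0.01);  // true: a 10-block segment is "dirty"
    System.out.println(dirtyFraction(100) > 0.01); // false: 100+ blocks stays "clean"
  }
}
```

This is why increasing the top-level block size pushed previously "clean" small segments over the threshold: the same flushed data now spans fewer blocks, so the trailing incomplete block is a larger fraction of the whole.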



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-15234) PRS has issues with many concurrent creations



[ 
https://issues.apache.org/jira/browse/SOLR-15234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17298229#comment-17298229
 ] 

Mike Drob commented on SOLR-15234:
--

Looking at the code, that first error seems like it could be totally benign?

> PRS has issues with many concurrent creations
> -
>
> Key: SOLR-15234
> URL: https://issues.apache.org/jira/browse/SOLR-15234
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Reporter: Mike Drob
>Priority: Major
>
> Running in a cloud environment, I am creating lots of 2x2 collections on a 
> single node with 10 threads using PRS. These tests work fine without PRS.
> I see the following errors in my logs:
> {noformat}
> 2021-03-09 05:57:46.311 ERROR 
> (OverseerStateUpdate-72069990450593800-hostname:8983_solr-n_02) [   ] 
> o.a.s.c.c.PerReplicaStatesOps Multi-op exception: [core_node8:2:A:L, 
> core_node6:4:D, core_node2:4:D, core_node4:2:A:L]
> {noformat}
> It would be good to include MDC logging on this if we can, so that I can know 
> what collection the error describes.
> A little bit later:
> {noformat}
> 2021-03-09 05:57:46.694 INFO  (qtp594858858-16) [   
> x:collection-78_shard2_replica_n7] o.a.s.h.a.PrepRecoveryOp Going to wait for 
> coreNodeName: core_node6, state: recovering, checkLive: true, onlyIfLeader: 
> true, onlyIfLeaderActive: true
> 2021-03-09 05:57:46.695 ERROR (qtp594858858-16) [   
> x:collection-78_shard2_replica_n7] o.a.s.h.RequestHandlerBase 
> org.apache.solr.common.SolrException: core not 
> found:collection-78_shard2_replica_n7
>   at 
> org.apache.solr.handler.admin.PrepRecoveryOp.execute(PrepRecoveryOp.java:71)
>   at 
> org.apache.solr.handler.admin.CoreAdminOperation.execute(CoreAdminOperation.java:367)
>   at 
> org.apache.solr.handler.admin.CoreAdminHandler$CallInfo.call(CoreAdminHandler.java:397)
>   at 
> org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:181)
>   at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:214)
>   at 
> org.apache.solr.servlet.HttpSolrCall.handleAdmin(HttpSolrCall.java:836)
>   at 
> org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:800)
>   at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:545)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:518)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:432)
>   at 
> org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:201)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:548)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>   at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:602)
>   at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1612)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1434)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:501)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1582)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1349)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>   at 
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:191)
>   at 
> org.eclipse.jetty.server.handler.InetAccessHandler.handle(InetAccessHandler.java:177)
>   at 
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146)
>   at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
>   at 
> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:322)
>   at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
>   at org.eclipse.jetty.server.Server.handle(Server.java:516)
>   at 
> org.eclipse.jetty.server.HttpChannel.lambda$handle$1(Htt

[jira] [Commented] (SOLR-15234) PRS has issues with many concurrent creations



[ 
https://issues.apache.org/jira/browse/SOLR-15234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17298230#comment-17298230
 ] 

Ishan Chattopadhyaya commented on SOLR-15234:
-

Is this happening on branch_8x / Solr 8.8.1?

> PRS has issues with many concurrent creations
> -
>
> Key: SOLR-15234
> URL: https://issues.apache.org/jira/browse/SOLR-15234
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Reporter: Mike Drob
>Priority: Major
>
> Running in a cloud environment, I am creating lots of 2x2 collections on a 
> single node with 10 threads using PRS. These tests work fine without PRS.
> I see the following errors in my logs:
> {noformat}
> 2021-03-09 05:57:46.311 ERROR 
> (OverseerStateUpdate-72069990450593800-hostname:8983_solr-n_02) [   ] 
> o.a.s.c.c.PerReplicaStatesOps Multi-op exception: [core_node8:2:A:L, 
> core_node6:4:D, core_node2:4:D, core_node4:2:A:L]
> {noformat}
> It would be good to include MDC logging on this if we can, so that I can know 
> what collection the error describes.

[jira] [Created] (SOLR-15234) PRS has issues with many concurrent creations

Mike Drob created SOLR-15234:


 Summary: PRS has issues with many concurrent creations
 Key: SOLR-15234
 URL: https://issues.apache.org/jira/browse/SOLR-15234
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: SolrCloud
Reporter: Mike Drob


Running in a cloud environment, I am creating lots of 2x2 collections on a 
single node with 10 threads using PRS. These tests work fine without PRS.

I see the following errors in my logs:

{noformat}
2021-03-09 05:57:46.311 ERROR 
(OverseerStateUpdate-72069990450593800-hostname:8983_solr-n_02) [   ] 
o.a.s.c.c.PerReplicaStatesOps Multi-op exception: [core_node8:2:A:L, 
core_node6:4:D, core_node2:4:D, core_node4:2:A:L]
{noformat}

It would be good to include MDC logging on this if we can, so that I can know 
what collection the error describes.


[GitHub] [lucene-solr-operator] HoustonPutman commented on issue #228: Handle dependency licenses correctly



HoustonPutman commented on issue #228:
URL: 
https://github.com/apache/lucene-solr-operator/issues/228#issuecomment-794221394


   Since the docker release contains bundled dependencies, its LICENSE and 
NOTICE files will differ from those of the source release (what is included at the 
base of the repo).
   
   We will document where to find these files in the Docker image 
(`/etc/licenses`).



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr-operator] HoustonPutman commented on issue #234: Migrate to apache/solr-operator



HoustonPutman commented on issue #234:
URL: 
https://github.com/apache/lucene-solr-operator/issues/234#issuecomment-794213330


   https://solr.apache.org/charts is now live. Please begin using it, as the 
old helm repo will be decommissioned when the repo is migrated, on Monday March 
15.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-15229) Move Solr's DOAP file from git to solr website



[ 
https://issues.apache.org/jira/browse/SOLR-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17298212#comment-17298212
 ] 

Jan Høydahl commented on SOLR-15229:


See LUCENE-9828 - we cannot move to the website git right now, so I'll repurpose this 
Jira to update the pointer at 
[https://svn.apache.org/repos/asf/comdev/projects.apache.org/trunk/data/projects.xml]
 to point to Solr's website [https://solr.apache.org/doap_Solr.rdf], and to update 
.htaccess on the site so it points to that file in the new solr.git repo.

> Move Solr's DOAP file from git to solr website
> --
>
> Key: SOLR-15229
> URL: https://issues.apache.org/jira/browse/SOLR-15229
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
>
> It is recommended to have the DOAP file on the project's website, as it is 
> confusing that the DOAP file exists in different versions in the various code 
> branches (only the DOAP file from the master branch is used). So in 
> this Jira we'll
>  * Move Solr's DOAP file to the solr-site.git repo (e.g. 
> [https://solr.apache.org/doap-Solr.rdf])
>  * Update 
> [https://svn.apache.org/repos/asf/comdev/projects.apache.org/trunk/data/projects.xml]
>  to link to the new location



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9828) Move Lucene's DOAP file to website repo



[ 
https://issues.apache.org/jira/browse/LUCENE-9828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17298211#comment-17298211
 ] 

Jan Høydahl commented on LUCENE-9828:
-

Just realized that some tools rely on the doap file being local in the same 
repo:
 * buildAndPushRelease.py
 * changes-to-html.gradle

So I'll halt this task for now, and instead make sure to update .htaccess so 
that [http://lucene.apache.org/core/doap.rdf] points to the new git repo.

> Move Lucene's DOAP file to website repo
> ---
>
> Key: LUCENE-9828
> URL: https://issues.apache.org/jira/browse/LUCENE-9828
> Project: Lucene - Core
>  Issue Type: Sub-task
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
>
> ASF recommends hosting the DOAP file on the website, not in svn or git. So 
> this task deletes the file from main git repo and adds it to lucene-site.git 
> under [https://lucene.apache.org/doap/lucene.rdf]
> There will be a commit to the main repo with updated releaseWizard commands
> See SOLR-15229 for the Solr DOAP



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-15233) ConfigurableInternodeAuthHadoopPlugin with Ranger is broken



 [ 
https://issues.apache.org/jira/browse/SOLR-15233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Geza Nagy updated SOLR-15233:
-
Affects Version/s: 8.4.1

> ConfigurableInternodeAuthHadoopPlugin with Ranger is broken
> ---
>
> Key: SOLR-15233
> URL: https://issues.apache.org/jira/browse/SOLR-15233
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 8.4.1
>Reporter: Geza Nagy
>Priority: Major
>  Labels: authentication, authorization
> Attachments: Screenshot 2021-03-09 at 18.15.31.png, security.json
>
>
> Setting up a cluster with multiple Solr nodes using Kerberos, also for 
> internode communication (attached security.json), with Ranger added as the 
> authorization plugin.
> When sending requests, authentication happens against the end user, but 
> authorization is performed for the solr service user.
> Tested two cases (3 nodes, with a collection that has 2 replicas on 2 of the 
> nodes):
> 1. Send a query to a node where the collection has a replica. Authorization is 
> wrong on every node.
> 2. Send a query to a node which doesn't contain a replica. Authorization is 
> fine at the first node, but when the query is distributed it proceeds as if 
> issued by the solr service user.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-15233) ConfigurableInternodeAuthHadoopPlugin with Ranger is broken



 [ 
https://issues.apache.org/jira/browse/SOLR-15233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Geza Nagy updated SOLR-15233:
-
Component/s: Authorization
 Authentication

> ConfigurableInternodeAuthHadoopPlugin with Ranger is broken
> ---
>
> Key: SOLR-15233
> URL: https://issues.apache.org/jira/browse/SOLR-15233
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Authentication, Authorization
>Affects Versions: 8.4.1
>Reporter: Geza Nagy
>Priority: Major
>  Labels: authentication, authorization
> Attachments: Screenshot 2021-03-09 at 18.15.31.png, security.json
>
>
> Setting up a cluster with multiple Solr nodes using Kerberos, also for 
> internode communication (attached security.json), with Ranger added as the 
> authorization plugin.
> When sending requests, authentication happens against the end user, but 
> authorization is performed for the solr service user.
> Tested two cases (3 nodes, with a collection that has 2 replicas on 2 of the 
> nodes):
> 1. Send a query to a node where the collection has a replica. Authorization is 
> wrong on every node.
> 2. Send a query to a node which doesn't contain a replica. Authorization is 
> fine at the first node, but when the query is distributed it proceeds as if 
> issued by the solr service user.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (SOLR-15233) ConfigurableInternodeAuthHadoopPlugin with Ranger is broken

Geza Nagy created SOLR-15233:


 Summary: ConfigurableInternodeAuthHadoopPlugin with Ranger is 
broken
 Key: SOLR-15233
 URL: https://issues.apache.org/jira/browse/SOLR-15233
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Geza Nagy
 Attachments: Screenshot 2021-03-09 at 18.15.31.png, security.json

Setting up a cluster with multiple Solr nodes using Kerberos, also for 
internode communication (attached security.json), with Ranger added as the 
authorization plugin.

When sending requests, authentication happens against the end user, but 
authorization is performed for the solr service user.

Tested two cases (3 nodes, with a collection that has 2 replicas on 2 of the nodes):
1. Send a query to a node where the collection has a replica. Authorization is 
wrong on every node.

2. Send a query to a node which doesn't contain a replica. Authorization is 
fine at the first node, but when the query is distributed it proceeds as if 
issued by the solr service user.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr-operator] HoustonPutman opened a new issue #235: Prepare the Solr Operator for an Apache Release



HoustonPutman opened a new issue #235:
URL: https://github.com/apache/lucene-solr-operator/issues/235


   There are a few items that need to be completed for the Solr Operator to 
have its first release:
   
   - [ ] Migrate the repo to apache/solr-operator #234 
   - [ ] Create the Docker repo apache/solr-operator 
[INFRA-21545](https://issues.apache.org/jira/browse/INFRA-21545)
   - [ ] Create the Solr Operator sub-page of the Solr website 
(https://solr.apache.org/operator)
   - [ ] Make the Solr Operator an official sub-project of Apache Solr 
[SOLR-15211](https://issues.apache.org/jira/browse/SOLR-15211)
   - [ ] Complete all necessary licensing steps for a source and binary docker 
release #228 
   - [ ] Create a release wizard for the Solr Operator
   - [ ] Find a way to host the Solr Operator documentation on the Solr 
Operator website. (Optional)



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (LUCENE-9828) Move Lucene's DOAP file to website repo

Jan Høydahl created LUCENE-9828:
---

 Summary: Move Lucene's DOAP file to website repo
 Key: LUCENE-9828
 URL: https://issues.apache.org/jira/browse/LUCENE-9828
 Project: Lucene - Core
  Issue Type: Sub-task
Reporter: Jan Høydahl
Assignee: Jan Høydahl


ASF recommends hosting the DOAP file on the website, not in svn or git. So this 
task deletes the file from the main git repo and adds it to lucene-site.git under 
[https://lucene.apache.org/doap/lucene.rdf].

There will be a commit to the main repo with updated releaseWizard commands.

See SOLR-15229 for the Solr DOAP.






[jira] [Commented] (SOLR-15230) Missing fields and dynamicfields from schema/fieldtypes API



[ 
https://issues.apache.org/jira/browse/SOLR-15230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17298160#comment-17298160
 ] 

ASHISH MEKALA commented on SOLR-15230:
--

I have edited schema-api.adoc to remove the "fields" and "dynamicFields" 
sections from the example for the FieldTypes Schema API commands.

> Missing fields and dynamicfields from schema/fieldtypes API
> ---
>
> Key: SOLR-15230
> URL: https://issues.apache.org/jira/browse/SOLR-15230
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Andras Salamon
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> SOLR-8736 removed some schema API GET functionality in Solr 6.0, but it was 
> added back by SOLR-8992 in Solr 6.0.1.
> I think some of the functionality is still missing:
> {{schema/fieldtypes}} and {{schema/fieldtypes/typename}} do not give the 
> following information anymore:
>  * {{fields}}: the fields with the given field type
>  * {{dynamicFields}}: the dynamic fields with the given field type
> Here is a sample output for {{text_general}}:
> {noformat}
> {
>   "responseHeader":{
> "status":0,
> "QTime":1},
>   "fieldType":{
> "name":"text_general",
> "class":"solr.TextField",
> "positionIncrementGap":"100",
> "multiValued":true,
> "indexAnalyzer":{
>   "tokenizer":{
> "class":"solr.StandardTokenizerFactory"},
>   "filters":[{
>   "class":"solr.StopFilterFactory",
>   "words":"stopwords.txt",
>   "ignoreCase":"true"},
> {
>   "class":"solr.LowerCaseFilterFactory"}]},
> "queryAnalyzer":{
>   "tokenizer":{
> "class":"solr.StandardTokenizerFactory"},
>   "filters":[{
>   "class":"solr.StopFilterFactory",
>   "words":"stopwords.txt",
>   "ignoreCase":"true"},
> {
>   "class":"solr.SynonymGraphFilterFactory",
>   "expand":"true",
>   "ignoreCase":"true",
>   "synonyms":"synonyms.txt"},
> {
>   "class":"solr.LowerCaseFilterFactory"}]}}}  {noformat}
> I tested it using Solr 7.4 and Solr 8.4






[GitHub] [lucene-solr] amekala2514 opened a new pull request #2467: SOLR-15230: Removed Fields and DynamicFields Section for FieldTypes in Schema-API Command



amekala2514 opened a new pull request #2467:
URL: https://github.com/apache/lucene-solr/pull/2467


   
   
   
   # Description
   
   Removed the fields and dynamicFields sections from the example provided by the 
Schema API command used to gather information on FieldTypes.
   # Solution
   
   Edited the schema-api.adoc file under the fieldTypes API section to remove 
the fields and dynamicFields sections from the example provided.
   
   # Tests
   
   Ran the ./gradlew buildSite command and verified the changes in the generated 
schema-api.html file.
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [X] I have reviewed the guidelines for [How to 
Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms 
to the standards described there to the best of my ability.
   - [X] I have created a Jira issue and added the issue ID to my pull request 
title.
   - [X] I have given Solr maintainers 
[access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork)
 to contribute to my PR branch. (optional but recommended)
   - [X] I have developed this patch against the `master` branch.
   - [ ] I have run `./gradlew check`.
   - [ ] I have added tests for my changes.
   - [ ] I have added documentation for the [Ref 
Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) 
(for Solr changes only).
   






[jira] [Created] (SOLR-15232) Add replica(s) as a part of node startup

Andrzej Bialecki created SOLR-15232:
---

 Summary: Add replica(s) as a part of node startup
 Key: SOLR-15232
 URL: https://issues.apache.org/jira/browse/SOLR-15232
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Andrzej Bialecki
Assignee: Andrzej Bialecki


In containerized environments it would make sense to be able to initialize a 
new node (pod) and designate it immediately to hold newly created replica(s) of 
specified collection/shard(s) once it's up and running.

Currently this is not easy to do: it requires the intervention of an external 
agent that additionally has to first check whether the node is up, all of which 
makes the process needlessly complicated.

This functionality could be as simple as adding a command-line switch to 
{{bin/solr start}}, which would cause it to invoke appropriate ADDREPLICA 
commands once it verifies the node is up.
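The proposed flow could be sketched roughly as follows. The ADDREPLICA action and its URL parameters exist in Solr's Collections API today; wiring the call into node startup (and any command-line switch for it) is the hypothetical part this issue proposes:

```java
// Hedged sketch of the proposal above: after the node reports itself live,
// issue one Collections API ADDREPLICA call per designated collection/shard.
// The ADDREPLICA action and parameters are real; invoking them automatically
// from "bin/solr start" is the hypothetical new behavior.
public class AddReplicaOnStartup {

    // Build the ADDREPLICA URL targeting a specific node.
    static String addReplicaUrl(String baseUrl, String collection, String shard, String nodeName) {
        return baseUrl + "/admin/collections?action=ADDREPLICA"
            + "&collection=" + collection
            + "&shard=" + shard
            + "&node=" + nodeName;
    }

    public static void main(String[] args) {
        // Once the startup script has verified the node is up, it would call:
        String url = addReplicaUrl("http://localhost:8983/solr", "mycoll", "shard1",
                                   "localhost:8983_solr");
        System.out.println(url);
    }
}
```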






[GitHub] [lucene-solr] jpountz commented on a change in pull request #2186: LUCENE-9334 Consistency of field data structures



jpountz commented on a change in pull request #2186:
URL: https://github.com/apache/lucene-solr/pull/2186#discussion_r590479158



##
File path: 
lucene/test-framework/src/java/org/apache/lucene/index/BaseVectorFormatTestCase.java
##
@@ -131,7 +115,8 @@ public void testIllegalDimChangeTwoDocs() throws Exception {
   Document doc = new Document();
   doc.add(new VectorField("f", new float[4], 
VectorValues.SearchStrategy.DOT_PRODUCT_HNSW));
   w.addDocument(doc);
-  if (random().nextBoolean()) {
+  boolean rb = random().nextBoolean();
+  if (rb) {

Review comment:
   nit: maybe we should make sure we test both branches all the time to 
have good coverage, especially now that the error messages differ? (and 
likewise for other places that have similar logic)
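A minimal illustration of the suggestion (the branch labels and method names are illustrative, not the actual test code): iterate over both boolean values deterministically instead of drawing one with random(), so every run exercises both branches and both error messages.

```java
// Sketch of the review suggestion above: cover both branches on every run
// rather than picking one with random(). Labels are illustrative stand-ins
// for what the real test would do in each branch.
import java.util.ArrayList;
import java.util.List;

public class BothBranchesSketch {

    // Collect which branches a single test run would exercise.
    static List<String> exercisedBranches() {
        List<String> branches = new ArrayList<>();
        for (boolean sameField : new boolean[] {true, false}) {
            if (sameField) {
                branches.add("same-field");  // e.g. add the second doc to field "f" again
            } else {
                branches.add("new-field");   // e.g. add the second doc to a fresh field
            }
        }
        return branches;
    }

    public static void main(String[] args) {
        System.out.println(exercisedBranches()); // [same-field, new-field]
    }
}
```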

##
File path: 
lucene/test-framework/src/java/org/apache/lucene/index/BasePointsFormatTestCase.java
##
@@ -1174,8 +1174,11 @@ public void testMixedSchema() throws Exception {
 
 Document doc = new Document();
 doc.add(new IntPoint("id", 0));
-w.addDocument(doc);
-// now we write another segment where the id field does have points:
+IllegalArgumentException ex =
+expectThrows(IllegalArgumentException.class, () -> w.addDocument(doc));
+assertEquals(
+"cannot change field \"id\" from index options=DOCS to inconsistent 
index options=NONE",
+ex.getMessage());

Review comment:
   I think we should refactor or drop this test, as it is not testing the 
points format now, but IndexingChain/FieldsInfos' logic. Maybe we could rename 
the test `testMergeMissing` and configure the first segment to not have the 
`id` field at all.








[GitHub] [lucene-solr] donnerpeter commented on pull request #2459: LUCENE-9825: Hunspell: reverse the "words" trie for faster word lookup/suggestions



donnerpeter commented on pull request #2459:
URL: https://github.com/apache/lucene-solr/pull/2459#issuecomment-794120571


   OK :)






[GitHub] [lucene-solr] dweiss commented on pull request #2459: LUCENE-9825: Hunspell: reverse the "words" trie for faster word lookup/suggestions



dweiss commented on pull request #2459:
URL: https://github.com/apache/lucene-solr/pull/2459#issuecomment-794104640


   Can I use this PR as an example of how to port a PR to the new repo? ;)






[jira] [Commented] (SOLR-14749) Provide a clean API for cluster-level event processing



[ 
https://issues.apache.org/jira/browse/SOLR-14749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17298146#comment-17298146
 ] 

Andrzej Bialecki commented on SOLR-14749:
-

It looks like the underlying cause of the failures was the same as in 
SOLR-15122. I added a Phaser-based mechanism for tests to monitor the changes 
in configuration in the {{ContainerPluginsRegistry}}, similar to the one used 
in SOLR-15122.

I'm leaving this issue open to see if the fix works on jenkins (local beasting 
can't reproduce this failure).

> Provide a clean API for cluster-level event processing
> --
>
> Key: SOLR-14749
> URL: https://issues.apache.org/jira/browse/SOLR-14749
> Project: Solr
>  Issue Type: Improvement
>  Components: AutoScaling
>Reporter: Andrzej Bialecki
>Assignee: Andrzej Bialecki
>Priority: Major
>  Labels: clean-api
> Fix For: master (9.0)
>
>  Time Spent: 22h
>  Remaining Estimate: 0h
>
> This is a companion issue to SOLR-14613 and it aims at providing a clean, 
> strongly typed API for the functionality formerly known as "triggers" - that 
> is, a component for generating cluster-level events corresponding to changes 
> in the cluster state, and a pluggable API for processing these events.
> The 8x triggers have been removed so this functionality is currently missing 
> in 9.0. However, this functionality is crucial for implementing the automatic 
> collection repair and re-balancing as the cluster state changes (nodes going 
> down / up, becoming overloaded / unused / decommissioned, etc).
> For this reason we need this API and a default implementation of triggers 
> that at least can perform automatic collection repair (maintaining the 
> desired replication factor in presence of live node changes).
> As before, the actual changes to the collections will be executed using 
> existing CollectionAdmin API, which in turn may use the placement plugins 
> from SOLR-14613.
> h3. Division of responsibility
>  * built-in Solr components (non-pluggable):
>  ** cluster state monitoring and event generation,
>  ** simple scheduler to periodically generate scheduled events
>  * plugins:
>  ** automatic collection repair on {{nodeLost}} events (provided by default)
>  ** re-balancing of replicas (periodic or on {{nodeAdded}} events)
>  ** reporting (eg. requesting additional node provisioning)
>  ** scheduled maintenance (eg. removing inactive shards after split)
> h3. Other considerations
> These plugins (unlike the placement plugins) need to execute on one 
> designated node in the cluster. Currently the easiest way to implement this 
> is to run them on the Overseer leader node.
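The division of responsibility described above (built-in event generation, pluggable event processing) suggests an API shape roughly like the following sketch. All interface and method names here are illustrative assumptions, not the actual classes this issue adds:

```java
// Hedged sketch of the event API shape described above: built-in components
// generate typed cluster events and hand them to registered listener plugins.
// All names are illustrative, not Solr's actual API.
import java.util.ArrayList;
import java.util.List;

public class ClusterEventsSketch {

    enum EventType { NODE_ADDED, NODE_LOST, SCHEDULED }

    static final class Event {
        final EventType type;
        final String nodeName;
        Event(EventType type, String nodeName) {
            this.type = type;
            this.nodeName = nodeName;
        }
    }

    interface EventListener {
        void onEvent(Event event);
    }

    // Minimal dispatcher standing in for the built-in event producer.
    static final class Producer {
        private final List<EventListener> listeners = new ArrayList<>();
        void register(EventListener listener) { listeners.add(listener); }
        void fire(Event event) { listeners.forEach(l -> l.onEvent(event)); }
    }

    // Runs a "collection repair" style plugin against events and returns the
    // node names it would react to.
    static List<String> demo() {
        Producer producer = new Producer();
        List<String> repaired = new ArrayList<>();
        producer.register(e -> {
            if (e.type == EventType.NODE_LOST) {
                repaired.add(e.nodeName);
            }
        });
        producer.fire(new Event(EventType.NODE_LOST, "node1:8983_solr"));
        producer.fire(new Event(EventType.NODE_ADDED, "node2:8983_solr"));
        return repaired;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [node1:8983_solr]
    }
}
```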






[jira] [Commented] (SOLR-14749) Provide a clean API for cluster-level event processing



[ 
https://issues.apache.org/jira/browse/SOLR-14749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17298138#comment-17298138
 ] 

ASF subversion and git services commented on SOLR-14749:


Commit 7ada4032180b516548fc0263f42da6a7a917f92b in lucene-solr's branch 
refs/heads/master from Andrzej Bialecki
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=7ada403 ]

SOLR-14749: Make sure the plugin config is reloaded on Overseer.


> Provide a clean API for cluster-level event processing
> --
>
> Key: SOLR-14749
> URL: https://issues.apache.org/jira/browse/SOLR-14749
> Project: Solr
>  Issue Type: Improvement
>  Components: AutoScaling
>Reporter: Andrzej Bialecki
>Assignee: Andrzej Bialecki
>Priority: Major
>  Labels: clean-api
> Fix For: master (9.0)
>
>  Time Spent: 22h
>  Remaining Estimate: 0h
>
> This is a companion issue to SOLR-14613 and it aims at providing a clean, 
> strongly typed API for the functionality formerly known as "triggers" - that 
> is, a component for generating cluster-level events corresponding to changes 
> in the cluster state, and a pluggable API for processing these events.
> The 8x triggers have been removed so this functionality is currently missing 
> in 9.0. However, this functionality is crucial for implementing the automatic 
> collection repair and re-balancing as the cluster state changes (nodes going 
> down / up, becoming overloaded / unused / decommissioned, etc).
> For this reason we need this API and a default implementation of triggers 
> that at least can perform automatic collection repair (maintaining the 
> desired replication factor in presence of live node changes).
> As before, the actual changes to the collections will be executed using 
> existing CollectionAdmin API, which in turn may use the placement plugins 
> from SOLR-14613.
> h3. Division of responsibility
>  * built-in Solr components (non-pluggable):
>  ** cluster state monitoring and event generation,
>  ** simple scheduler to periodically generate scheduled events
>  * plugins:
>  ** automatic collection repair on {{nodeLost}} events (provided by default)
>  ** re-balancing of replicas (periodic or on {{nodeAdded}} events)
>  ** reporting (eg. requesting additional node provisioning)
>  ** scheduled maintenance (eg. removing inactive shards after split)
> h3. Other considerations
> These plugins (unlike the placement plugins) need to execute on one 
> designated node in the cluster. Currently the easiest way to implement this 
> is to run them on the Overseer leader node.






[GitHub] [lucene-solr] donnerpeter commented on pull request #2459: LUCENE-9825: Hunspell: reverse the "words" trie for faster word lookup/suggestions



donnerpeter commented on pull request #2459:
URL: https://github.com/apache/lucene-solr/pull/2459#issuecomment-794080863


   Hi, any chance to get this merged before the Great Repository Split?






[jira] [Commented] (SOLR-15122) ClusterEventProducerTest.testEvents is unstable



[ 
https://issues.apache.org/jira/browse/SOLR-15122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17298116#comment-17298116
 ] 

Andrzej Bialecki commented on SOLR-15122:
-

[~mdrob] I think we can close this, the changes that you implemented seem to be 
working well.

> ClusterEventProducerTest.testEvents is unstable
> ---
>
> Key: SOLR-15122
> URL: https://issues.apache.org/jira/browse/SOLR-15122
> Project: Solr
>  Issue Type: Bug
>  Components: Tests
>Reporter: Mike Drob
>Assignee: Andrzej Bialecki
>Priority: Major
> Fix For: master (9.0)
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> This test looks to be unstable according to Jenkins since about Nov 5. I just 
> started seeing occasional failures locally when running the whole suite but 
> cannot reproduce when running in isolation.
> https://lists.apache.org/thread.html/rf0c16b257bc3236ea414be51451806352b55f15d4949f4fd54a3b71a%40%3Cbuilds.lucene.apache.org%3E






[jira] [Resolved] (SOLR-12730) Implement staggered SPLITSHARD requests in IndexSizeTrigger



 [ 
https://issues.apache.org/jira/browse/SOLR-12730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrzej Bialecki resolved SOLR-12730.
-
Fix Version/s: (was: master (9.0))
   Resolution: Fixed

This was fixed a long time ago (and subsequently removed from 9.0).

> Implement staggered SPLITSHARD requests in IndexSizeTrigger
> ---
>
> Key: SOLR-12730
> URL: https://issues.apache.org/jira/browse/SOLR-12730
> Project: Solr
>  Issue Type: Improvement
>  Components: AutoScaling
>Reporter: Andrzej Bialecki
>Assignee: Andrzej Bialecki
>Priority: Major
> Fix For: 8.1
>
>
> Simulated large scale tests uncovered an interesting scenario that occurs 
> also in real clusters where {{IndexSizeTrigger}} is used for controlling the 
> maximum shard size.
> As index size grows and the number of shards grows, if document assignment is 
> more or less even then at equal intervals (on a {{log2}} scale) there will be 
> an avalanche of SPLITSHARD operations, because all shards will reach the 
> critical size at approximately the same time.
> A hundred or more split shard operations running in parallel may severely 
> affect the cluster performance.
> One possible approach to reduce the likelihood of this situation is to split 
> shards not exactly in half but rather fudge the proportions around 60/40% in 
> a random sequence, so that the resulting sub-sub-sub…shards would reach the 
> thresholds at different times. This would require modifications to the 
> SPLITSHARD command to allow this randomization.
> Another approach would be to simply limit the maximum number of parallel 
> split shard operations. However, this would slow down the process of reaching 
> the balance (increase lag) and possibly violate other operational constraints 
> due to some shards waiting too long for the split and significantly exceeding 
> their max size.
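The 60/40 randomization idea above can be sketched with a toy model (not SPLITSHARD's actual implementation; the ratio range is illustrative):

```java
// Toy model of the staggered-split idea: pick the split ratio near 60/40 at
// random so sibling shards drift apart in size and reach the max-size
// threshold at different times. Not Solr code; numbers are illustrative.
import java.util.Random;

public class StaggeredSplitModel {

    // Fraction of documents assigned to the first sub-shard, in [0.4, 0.6).
    static double splitRatio(Random rnd) {
        return 0.4 + rnd.nextDouble() * 0.2;
    }

    public static void main(String[] args) {
        Random rnd = new Random();
        long docs = 1_000_000;
        double ratio = splitRatio(rnd);
        long first = Math.round(docs * ratio);
        // Sub-shard sizes now differ, so their descendants stagger over time.
        System.out.println(first + " / " + (docs - first));
    }
}
```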






[jira] [Comment Edited] (SOLR-15225) standardise test class naming



[ 
https://issues.apache.org/jira/browse/SOLR-15225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17298109#comment-17298109
 ] 

Mark Robert Miller edited comment on SOLR-15225 at 3/9/21, 3:18 PM:


That’s what I’m saying. Not expecting or asking anyone to hold off for me. I 
have my own schedules. (I’m referring to a different and much larger discussion)


was (Author: markrmiller):
That’s what I’m saying. Not expecting or asking anyone to hold off for me. I 
have my own schedules. 

> standardise test class naming
> -
>
> Key: SOLR-15225
> URL: https://issues.apache.org/jira/browse/SOLR-15225
> Project: Solr
>  Issue Type: Test
>Reporter: Christine Poerschke
>Priority: Major
>
> LUCENE-8626 started out as standardisation effort for both Lucene and Solr 
> test.
> The standardisation for Lucene tests ({{org.apache.lucene}} package space) is 
> now complete and enforced.
> This SOLR ticket here is for the standardisation of Solr test class names.






[jira] [Updated] (LUCENE-9827) Small segments are slower to merge due to stored fields since 8.7



 [ 
https://issues.apache.org/jira/browse/LUCENE-9827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated LUCENE-9827:
-
Description: 
[~dm] and [~dimitrisli] looked into an interesting case where indexing slowed 
down after upgrading to 8.7. After digging we identified that this was due to 
the merging of stored fields, which had become slower on average.

This is due to changes to stored fields, which now have top-level blocks that 
are then split into sub-blocks and compressed using shared dictionaries (one 
dictionary per top-level block). As the top-level blocks are larger than they 
were before, segments are more likely to be considered "dirty" by the merging 
logic. Dirty segments are segments where 1% of the data or more consists of 
incomplete blocks. For large segments, the size of blocks doesn't really affect 
the dirtiness of segments: if you flush a segment that has 100 blocks or more, 
it will never be considered dirty as only the last block may be incomplete. But 
for small segments it does: for instance if your segment is only 10 blocks, it 
is very likely considered dirty given that the last block is always incomplete. 
And the fact that we increased the top-level block size means that segments 
that used to be considered clean might now be considered dirty.

And indeed benchmarks reported that while large stored fields merges became 
slightly faster after upgrading to 8.7, the smaller merges actually became 
slower. See attached chart, which gives the total merge time as a function of 
the number of documents in the segment.

I don't know how we can address this, this is a natural consequence of the 
larger block size, which is needed to achieve better compression ratios. But I 
wanted to open an issue about it in case someone has a bright idea how we could 
make things better.

  was:
[~dm] and [~dimitrisli] looked into an interesting case where indexing slowed 
down after upgrading to 8.7. After digging we identified that this was due to 
the merging of stored fields, which had become slower on average.

This is due to changes to stored fields, which now have top-level blocks that 
are then split into sub-blocks and compressed using shared dictionaries (one 
dictionary per top-level block). As the top-level blocks are larger than they 
were before, segments are more likely to be considered "dirty" by the merging 
logic. Dirty segments are segments where 1% of the data or more consists of 
incomplete blocks. For large segments, the size of blocks doesn't really affect 
the dirtiness of segments: if you flush a segment that has 100 blocks or more, 
it will never be considered dirty as only the last block may be incomplete. But 
for small segments it does: for instance if your segment is only 10 blocks, it 
is very likely considered dirty given that the last block is always incomplete. 
And the fact that we increased the top-level block size means that segments 
that used to be considered clean might now be considered dirty.

And indeed benchmarks reported that while large stored fields merges became 
slightly faster after upgrading to 8.7, the smaller merges actually became 
slower. See attached chart, which gives the average merge time as a function of 
the number of documents in the segment.

I don't know how we can address this, this is a natural consequence of the 
larger block size, which is needed to achieve better compression ratios. But I 
wanted to open an issue about it in case someone has a bright idea how we could 
make things better.


> Small segments are slower to merge due to stored fields since 8.7
> -
>
> Key: LUCENE-9827
> URL: https://issues.apache.org/jira/browse/LUCENE-9827
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Adrien Grand
>Priority: Minor
> Attachments: total-merge-time-by-num-docs-on-small-segments.png
>
>
> [~dm] and [~dimitrisli] looked into an interesting case where indexing slowed 
> down after upgrading to 8.7. After digging we identified that this was due to 
> the merging of stored fields, which had become slower on average.
> This is due to changes to stored fields, which now have top-level blocks that 
> are then split into sub-blocks and compressed using shared dictionaries (one 
> dictionary per top-level block). As the top-level blocks are larger than they 
> were before, segments are more likely to be considered "dirty" by the merging 
> logic. Dirty segments are segments where 1% of the data or more consists of 
> incomplete blocks. For large segments, the size of blocks doesn't really 
> affect the dirtiness of segments: if you flush a segment that has 100 blocks 
> or more, it will never be considered dirty as only the last block may be 
> incomplete. But for small segments it does: for instance if your segmen

[jira] [Commented] (SOLR-15225) standardise test class naming



[ 
https://issues.apache.org/jira/browse/SOLR-15225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17298109#comment-17298109
 ] 

Mark Robert Miller commented on SOLR-15225:
---

That’s what I’m saying. Not expecting or asking anyone to hold off for me. I 
have my own schedules. 

> standardise test class naming
> -
>
> Key: SOLR-15225
> URL: https://issues.apache.org/jira/browse/SOLR-15225
> Project: Solr
>  Issue Type: Test
>Reporter: Christine Poerschke
>Priority: Major
>
> LUCENE-8626 started out as standardisation effort for both Lucene and Solr 
> test.
> The standardisation for Lucene tests ({{org.apache.lucene}} package space) is 
> now complete and enforced.
> This SOLR ticket here is for the standardisation of Solr test class names.






[GitHub] [lucene-solr] jpountz commented on pull request #2186: LUCENE-9334 Consistency of field data structures



jpountz commented on pull request #2186:
URL: https://github.com/apache/lucene-solr/pull/2186#issuecomment-794019595


   @mayya-sharipova These assumptions sound right to me.






[jira] [Created] (LUCENE-9827) Small segments are slower to merge due to stored fields since 8.7

Adrien Grand created LUCENE-9827:


 Summary: Small segments are slower to merge due to stored fields 
since 8.7
 Key: LUCENE-9827
 URL: https://issues.apache.org/jira/browse/LUCENE-9827
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Adrien Grand
 Attachments: total-merge-time-by-num-docs-on-small-segments.png

[~dm] and [~dimitrisli] looked into an interesting case where indexing slowed 
down after upgrading to 8.7. After digging we identified that this was due to 
the merging of stored fields, which had become slower on average.

This is due to changes to stored fields, which now have top-level blocks that 
are then split into sub-blocks and compressed using shared dictionaries (one 
dictionary per top-level block). As the top-level blocks are larger than they 
were before, segments are more likely to be considered "dirty" by the merging 
logic. Dirty segments are segments where 1% of the data or more consists of 
incomplete blocks. For large segments, the size of blocks doesn't really affect 
the dirtiness of segments: if you flush a segment that has 100 blocks or more, 
it will never be considered dirty as only the last block may be incomplete. But 
for small segments it does: for instance if your segment is only 10 blocks, it 
is very likely considered dirty given that the last block is always incomplete. 
And the fact that we increased the top-level block size means that segments 
that used to be considered clean might now be considered dirty.

And indeed benchmarks reported that while large stored fields merges became 
slightly faster after upgrading to 8.7, the smaller merges actually became 
slower. See attached chart, which gives the average merge time as a function of 
the number of documents in the segment.

I don't know how we can address this, this is a natural consequence of the 
larger block size, which is needed to achieve better compression ratios. But I 
wanted to open an issue about it in case someone has a bright idea how we could 
make things better.
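The dirtiness heuristic described above can be illustrated with a toy model. The 1% threshold and the segment shapes come from the description; the block size and the exact accounting are simplifying assumptions, not Lucene's actual implementation:

```java
// Toy model of the "dirty segment" heuristic described above: a segment is
// dirty when incomplete blocks account for 1% or more of its data. Block size
// and accounting are illustrative; this is not Lucene's actual code.
public class DirtySegmentModel {

    // True when incompleteBytes is at least 1% of totalBytes.
    static boolean isDirty(long totalBytes, long incompleteBytes) {
        return incompleteBytes * 100 >= totalBytes;
    }

    public static void main(String[] args) {
        long block = 1024; // illustrative top-level block size

        // Small segment: 10 blocks, the last one half full -> ~5% incomplete.
        long smallTotal = 9 * block + block / 2;
        System.out.println(isDirty(smallTotal, block / 2)); // true

        // Large segment: 100 full blocks plus the same half block -> ~0.5%.
        long largeTotal = 100 * block + block / 2;
        System.out.println(isDirty(largeTotal, block / 2)); // false
    }
}
```

This makes the asymmetry in the description concrete: the same half-full trailing block pushes a 10-block segment over the 1% threshold but leaves a 100-block segment clean.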






[jira] [Commented] (LUCENE-9387) Remove RAM accounting from LeafReader



[ 
https://issues.apache.org/jira/browse/LUCENE-9387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17298094#comment-17298094
 ] 

Adrien Grand commented on LUCENE-9387:
--

This is breaking enough that it should probably be done in a major, so I made 
it a 9.0 blocker to make sure we consider it.

I've recently been doing some tests on an index to compare how much memory it 
actually used (by opening the index as many times as possible until the JVM 
OOMEs) vs. how much {{Accountable#ramBytesUsed}} reported.  
{{Accountable#ramBytesUsed}} reported a memory usage of 48kB while the test 
that tries to measure actual memory usage reported 832kB. I'm not especially 
surprised, as there are many things that add up and contribute to memory usage 
like index inputs, various threadlocals, multiplied by the number of fields / 
number of segments / number of threads. I don't think we can realistically make 
memory usage accounting much more accurate without entering rabbit holes such 
as introducing memory accounting on IndexInput (NIOFS uses more heap memory as it needs a buffer) or thread locals (the more threads access an index reader, the higher the memory usage), and I doubt that returning a number that is off by a factor of 20 is actually useful, so I'm leaning towards no longer implementing Accountable.
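The kind of undercounting described above can be pictured with a toy Accountable-style interface. This is a self-contained illustration in the spirit of Lucene's {{org.apache.lucene.util.Accountable}}, not the real class: the reported estimate only sums the components the implementation knows about, while IndexInput buffers, ThreadLocals and per-segment overhead are invisible to it.

```java
import java.util.List;

// Toy version of the Accountable idea: each tracked component reports an
// estimate, and the reader sums them. Anything not on the list (IndexInput
// buffers, ThreadLocals, field infos, ...) is invisible to the estimate,
// which is how a 48kB report can hide ~832kB of actual heap usage.
public class RamEstimateSketch {
  interface Accountable { long ramBytesUsed(); }

  static long reportedBytes(List<Accountable> trackedComponents) {
    return trackedComponents.stream().mapToLong(Accountable::ramBytesUsed).sum();
  }

  public static void main(String[] args) {
    List<Accountable> tracked = List.of(
        () -> 32_768L,  // e.g. a terms index estimate
        () -> 16_384L); // e.g. a points index estimate
    long reported = reportedBytes(tracked);   // 48 KiB
    long untracked = 784L * 1024;             // buffers, ThreadLocals, ... (illustrative)
    long actual = reported + untracked;       // ~832 KiB
    System.out.println("reported=" + reported + " actual=" + actual
        + " ratio=" + (actual / reported));
  }
}
```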



> Remove RAM accounting from LeafReader
> -
>
> Key: LUCENE-9387
> URL: https://issues.apache.org/jira/browse/LUCENE-9387
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Priority: Blocker
> Fix For: master (9.0)
>
>
> Context for this issue can be found at 
> https://lists.apache.org/thread.html/r06b6a63d8689778bbc2736ec7e4e39bf89ae6973c19f2ec6247690fd%40%3Cdev.lucene.apache.org%3E.
> RAM accounting made sense when readers used lots of memory. E.g. when norms 
> were on heap, we could return memory usage of the norms array and memory 
> estimates would be very close to actual memory usage.
> However nowadays, readers consume very little memory, so RAM accounting has 
> become less valuable. Furthermore providing good estimates has become 
> incredibly complex as we can no longer focus on a couple main contributors to 
> memory usage, but would need to start considering things that we historically 
> ignored, such as field infos, segment infos, NIOFS buffers, etc.
> Let's remove RAM accounting from LeafReader?






[jira] [Created] (SOLR-15231) zk upconfig does not recognize ZK_HOST style url

Subhajit Das created SOLR-15231:
---

 Summary: zk upconfig does not recognize ZK_HOST style url
 Key: SOLR-15231
 URL: https://issues.apache.org/jira/browse/SOLR-15231
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: SolrCLI
Affects Versions: 8.8.1
Reporter: Subhajit Das


While uploading a new configset to ZooKeeper using the Solr control script, the -z parameter does not recognize a ZK_HOST style string.

Say, I use host1:2181,host2:2181,host3:2181/solr, then the config is uploaded to the ZooKeeper root directly, instead of under the /solr znode.

Note: host1:2181/solr,host2:2181/solr,host3:2181/solr seems to work.
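For context, a ZK_HOST style string bundles several host:port pairs with a single trailing chroot that applies to the whole ensemble. The sketch below shows the shape the -z parameter apparently fails to honor; it is an illustration of the format, not Solr's actual SolrCLI/SolrZkClient parsing code.

```java
// Sketch of ZK_HOST-style connect string parsing: several host:port pairs
// followed by one optional chroot, e.g. "zk1:2181,zk2:2181,zk3:2181/solr".
// Illustrative only -- not Solr's real parsing code.
public class ZkConnectString {
  final String[] hosts;
  final String chroot; // "/" when no chroot is given

  ZkConnectString(String[] hosts, String chroot) {
    this.hosts = hosts;
    this.chroot = chroot;
  }

  static ZkConnectString parse(String zkHost) {
    int slash = zkHost.indexOf('/');
    String hostPart = slash < 0 ? zkHost : zkHost.substring(0, slash);
    String chroot = slash < 0 ? "/" : zkHost.substring(slash);
    return new ZkConnectString(hostPart.split(","), chroot);
  }

  public static void main(String[] args) {
    ZkConnectString c = parse("zk1:2181,zk2:2181,zk3:2181/solr");
    System.out.println(c.hosts.length + " hosts, chroot=" + c.chroot);
  }
}
```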






[jira] [Updated] (LUCENE-9387) Remove RAM accounting from LeafReader



 [ 
https://issues.apache.org/jira/browse/LUCENE-9387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated LUCENE-9387:
-
Fix Version/s: master (9.0)

> Remove RAM accounting from LeafReader
> -
>
> Key: LUCENE-9387
> URL: https://issues.apache.org/jira/browse/LUCENE-9387
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Priority: Blocker
> Fix For: master (9.0)
>
>
> Context for this issue can be found at 
> https://lists.apache.org/thread.html/r06b6a63d8689778bbc2736ec7e4e39bf89ae6973c19f2ec6247690fd%40%3Cdev.lucene.apache.org%3E.
> RAM accounting made sense when readers used lots of memory. E.g. when norms 
> were on heap, we could return memory usage of the norms array and memory 
> estimates would be very close to actual memory usage.
> However nowadays, readers consume very little memory, so RAM accounting has 
> become less valuable. Furthermore providing good estimates has become 
> incredibly complex as we can no longer focus on a couple main contributors to 
> memory usage, but would need to start considering things that we historically 
> ignored, such as field infos, segment infos, NIOFS buffers, etc.
> Let's remove RAM accounting from LeafReader?






[jira] [Updated] (LUCENE-9387) Remove RAM accounting from LeafReader



 [ 
https://issues.apache.org/jira/browse/LUCENE-9387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated LUCENE-9387:
-
Priority: Blocker  (was: Minor)

> Remove RAM accounting from LeafReader
> -
>
> Key: LUCENE-9387
> URL: https://issues.apache.org/jira/browse/LUCENE-9387
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Priority: Blocker
>
> Context for this issue can be found at 
> https://lists.apache.org/thread.html/r06b6a63d8689778bbc2736ec7e4e39bf89ae6973c19f2ec6247690fd%40%3Cdev.lucene.apache.org%3E.
> RAM accounting made sense when readers used lots of memory. E.g. when norms 
> were on heap, we could return memory usage of the norms array and memory 
> estimates would be very close to actual memory usage.
> However nowadays, readers consume very little memory, so RAM accounting has 
> become less valuable. Furthermore providing good estimates has become 
> incredibly complex as we can no longer focus on a couple main contributors to 
> memory usage, but would need to start considering things that we historically 
> ignored, such as field infos, segment infos, NIOFS buffers, etc.
> Let's remove RAM accounting from LeafReader?






[jira] [Comment Edited] (SOLR-15225) standardise test class naming



[ 
https://issues.apache.org/jira/browse/SOLR-15225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17298063#comment-17298063
 ] 

Jason Gerlowski edited comment on SOLR-15225 at 3/9/21, 1:49 PM:
-

The community already [had a 
discussion|https://lists.apache.org/thread.html/r1931fbe04a8085f4edcd76bf2ed34ae557a0e720e35061ad2a815287%40%3Cdev.lucene.apache.org%3E]
 on the (pre-split) dev list about this test renaming, and the consensus there 
(seemingly with David's agreement?)  was that we shouldn't delay this work if 
someone was willing to get their hands dirty.

If disagreement on this is re-emerging, maybe we should resume on that thread 
rather than start a new discussion here?

Though maybe there's no disagreement here - I'm not 100% sure but it sounds 
like Mark is saying that he doesn't ask/expect anyone to hold off on their work 
for him.




was (Author: gerlowskija):
The community already [had a 
discussion|https://lists.apache.org/thread.html/r1931fbe04a8085f4edcd76bf2ed34ae557a0e720e35061ad2a815287%40%3Cdev.lucene.apache.org%3E]
 on the (pre-split) dev list about this test renaming, and the consensus there 
(seemingly with David's agreement?)  was that we shouldn't delay this work if 
someone was willing to get their hands dirty.

If disagreement on this is re-emerging, maybe we should resume on that thread 
rather than start a new discussion here?



> standardise test class naming
> -
>
> Key: SOLR-15225
> URL: https://issues.apache.org/jira/browse/SOLR-15225
> Project: Solr
>  Issue Type: Test
>Reporter: Christine Poerschke
>Priority: Major
>
> LUCENE-8626 started out as standardisation effort for both Lucene and Solr 
> test.
> The standardisation for Lucene tests ({{org.apache.lucene}} package space) is 
> now complete and enforced.
> This SOLR ticket here is for the standardisation of Solr test class names.






[GitHub] [lucene-solr] dsmiley commented on a change in pull request #2391: SOLR-14341: Move configName into DocCollection class



dsmiley commented on a change in pull request #2391:
URL: https://github.com/apache/lucene-solr/pull/2391#discussion_r590352385



##
File path: solr/solrj/src/java/org/apache/solr/common/cloud/DocCollection.java
##
@@ -300,6 +307,9 @@ public Integer getReplicationFactor() {
 return replicationFactor;
   }
 
+  /**
+   * Return non-null configName.

Review comment:
   ```suggestion
  * Return non-null configset name.
   ```

##
File path: solr/solrj/src/java/org/apache/solr/common/cloud/DocCollection.java
##
@@ -88,10 +88,17 @@ public DocCollection(String name, Map<String, Slice> slices, Map<String, Object> props,
   * @param zkVersion The version of the Collection node in Zookeeper (used for conditional updates).
   */
  public DocCollection(String name, Map<String, Slice> slices, Map<String, Object> props, DocRouter router, int zkVersion) {
-super(props);
-if (props == null || props.containsKey("baseConfigSet")) {
+if (props == null) {

Review comment:
   null || props.isEmpty() can be combined

##
File path: solr/core/src/java/org/apache/solr/cloud/CloudConfigSetService.java
##
@@ -86,12 +86,11 @@ public SolrResourceLoader 
createCoreResourceLoader(CoreDescriptor cd) {
 
 // The configSet is read from ZK and populated.  Ignore CD's pre-existing 
configSet; only populated in standalone
 final String configSetName;
-try {
-  configSetName = zkController.getZkStateReader().readConfigName(colName);
-  cd.setConfigSet(configSetName);
-} catch (KeeperException ex) {
-  throw new ZooKeeperException(SolrException.ErrorCode.SERVER_ERROR, 
"Trouble resolving configSet for collection " + colName + ": " + 
ex.getMessage());
+configSetName = 
zkController.getZkStateReader().getClusterState().getCollection(colName).getConfigName();
+if (configSetName == null) {

Review comment:
   still an issue?

##
File path: 
solr/core/src/java/org/apache/solr/cloud/api/collections/OverseerCollectionMessageHandler.java
##
@@ -554,12 +554,11 @@ private void modifyCollection(ClusterState clusterState, 
ZkNodeProps message, @S
 final String collectionName = 
message.getStr(ZkStateReader.COLLECTION_PROP);
 //the rest of the processing is based on writing cluster state properties
 //remove the property here to avoid any errors down the pipeline due to 
this property appearing
-String configName = (String) 
message.getProperties().remove(CollectionAdminParams.COLL_CONF);
+String configName = (String) 
message.getProperties().get(CollectionAdminParams.COLL_CONF);

Review comment:
   just observing that this innocent looking change seems important to this 
PR.  Previously this data had disappeared from the state.

##
File path: 
solr/core/src/java/org/apache/solr/cloud/OverseerConfigSetMessageHandler.java
##
@@ -367,12 +367,7 @@ private void deleteConfigSet(String configSetName, boolean 
force) throws IOExcep
 
for (Map.Entry<String, DocCollection> entry : zkStateReader.getClusterState().getCollectionsMap().entrySet()) {
   String configName = null;
-  try {
-configName = zkStateReader.readConfigName(entry.getKey());
-  } catch (KeeperException ex) {
-throw new SolrException(ErrorCode.BAD_REQUEST,
-"Can not delete ConfigSet as it is currently being used by 
collection [" + entry.getKey() + "]");
-  }
+  configName = entry.getValue().getConfigName();

Review comment:
   combine declaration and initialization





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org






[jira] [Commented] (LUCENE-8626) standardise test class naming



[ 
https://issues.apache.org/jira/browse/LUCENE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17298073#comment-17298073
 ] 

Ignacio Vera commented on LUCENE-8626:
--

Thanks Dawid!

> standardise test class naming
> -
>
> Key: LUCENE-8626
> URL: https://issues.apache.org/jira/browse/LUCENE-8626
> Project: Lucene - Core
>  Issue Type: Test
>Reporter: Christine Poerschke
>Priority: Major
> Fix For: master (9.0)
>
> Attachments: SOLR-12939.01.patch, SOLR-12939.02.patch, 
> SOLR-12939.03.patch, SOLR-12939_hoss_validation_groovy_experiment.patch
>
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> This was mentioned and proposed on the dev mailing list. Starting this ticket 
> here to start to make it happen?
> History: This ticket was created as 
> https://issues.apache.org/jira/browse/SOLR-12939 ticket and then got 
> JIRA-moved to become https://issues.apache.org/jira/browse/LUCENE-8626 ticket.






[GitHub] [lucene-site] epugh merged pull request #42: fix the example command



epugh merged pull request #42:
URL: https://github.com/apache/lucene-site/pull/42





[jira] [Commented] (SOLR-15225) standardise test class naming



[ 
https://issues.apache.org/jira/browse/SOLR-15225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17298063#comment-17298063
 ] 

Jason Gerlowski commented on SOLR-15225:


The community already [had a 
discussion|https://lists.apache.org/thread.html/r1931fbe04a8085f4edcd76bf2ed34ae557a0e720e35061ad2a815287%40%3Cdev.lucene.apache.org%3E]
 on the (pre-split) dev list about this test renaming, and the consensus there 
(seemingly with David's agreement?)  was that we shouldn't delay this work if 
someone was willing to get their hands dirty.

If disagreement on this is re-emerging, maybe we should resume on that thread 
rather than start a new discussion here?



> standardise test class naming
> -
>
> Key: SOLR-15225
> URL: https://issues.apache.org/jira/browse/SOLR-15225
> Project: Solr
>  Issue Type: Test
>Reporter: Christine Poerschke
>Priority: Major
>
> LUCENE-8626 started out as standardisation effort for both Lucene and Solr 
> test.
> The standardisation for Lucene tests ({{org.apache.lucene}} package space) is 
> now complete and enforced.
> This SOLR ticket here is for the standardisation of Solr test class names.






[jira] [Resolved] (SOLR-15190) Create a release repo for Solr



 [ 
https://issues.apache.org/jira/browse/SOLR-15190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl resolved SOLR-15190.

Resolution: Fixed

Repo created with initial README and KEYS files. Now the link from the webpage is no longer a 404.

> Create a release repo for Solr
> --
>
> Key: SOLR-15190
> URL: https://issues.apache.org/jira/browse/SOLR-15190
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
>
> I think this will be created as we do the first release, i.e. nothing 
> explicit to do here until then?






[jira] [Commented] (SOLR-15190) Create a release repo for Solr



[ 
https://issues.apache.org/jira/browse/SOLR-15190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17298013#comment-17298013
 ] 

Jan Høydahl commented on SOLR-15190:


Added KEYS and README to [https://dist.apache.org/repos/dist/release/solr/]

Will hopefully be visible on [https://downloads.apache.org/solr] soon.

> Create a release repo for Solr
> --
>
> Key: SOLR-15190
> URL: https://issues.apache.org/jira/browse/SOLR-15190
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
>
> I think this will be created as we do the first release, i.e. nothing 
> explicit to do here until then?






[jira] [Created] (SOLR-15230) Missing fields and dynamicfields from schema/fieldtypes API

Andras Salamon created SOLR-15230:
-

 Summary: Missing fields and dynamicfields from schema/fieldtypes 
API
 Key: SOLR-15230
 URL: https://issues.apache.org/jira/browse/SOLR-15230
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Andras Salamon


SOLR-8736 removed some schema API GET functionality in Solr 6.0, but it was 
added back by SOLR-8992 in Solr 6.0.1

I think some of the functionality is still missing:

{{schema/fieldtypes}} and {{schema/fieldtypes/typename}} do not give the following information anymore:
 * {{fields}}: the fields with the given field type
 * {{dynamicFields}}: the dynamic fields with the given field type

Here is a sample output for {{text_general}}:
{noformat}
{
  "responseHeader":{
"status":0,
"QTime":1},
  "fieldType":{
"name":"text_general",
"class":"solr.TextField",
"positionIncrementGap":"100",
"multiValued":true,
"indexAnalyzer":{
  "tokenizer":{
"class":"solr.StandardTokenizerFactory"},
  "filters":[{
  "class":"solr.StopFilterFactory",
  "words":"stopwords.txt",
  "ignoreCase":"true"},
{
  "class":"solr.LowerCaseFilterFactory"}]},
"queryAnalyzer":{
  "tokenizer":{
"class":"solr.StandardTokenizerFactory"},
  "filters":[{
  "class":"solr.StopFilterFactory",
  "words":"stopwords.txt",
  "ignoreCase":"true"},
{
  "class":"solr.SynonymGraphFilterFactory",
  "expand":"true",
  "ignoreCase":"true",
  "synonyms":"synonyms.txt"},
{
  "class":"solr.LowerCaseFilterFactory"}]}}}  {noformat}

I tested it using Solr 7.4 and Solr 8.4
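For comparison, the pre-6.0 style response would additionally include entries along these lines inside the {{fieldType}} object (the field names below are hypothetical, for illustration only):

{noformat}
  "fieldType":{
    "name":"text_general",
    ...
    "fields":["title","description"],
    "dynamicFields":["*_txt"]}
{noformat}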






[jira] [Commented] (SOLR-15190) Create a release repo for Solr



[ 
https://issues.apache.org/jira/browse/SOLR-15190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17297937#comment-17297937
 ] 

Jan Høydahl commented on SOLR-15190:


I'll start by adding a KEYS file and a README so that the space is not empty. 
In the README we'll add a note that previous releases are in lucene/solr area.

> Create a release repo for Solr
> --
>
> Key: SOLR-15190
> URL: https://issues.apache.org/jira/browse/SOLR-15190
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
>
> I think this will be created as we do the first release, i.e. nothing 
> explicit to do here until then?






[jira] [Assigned] (SOLR-15190) Create a release repo for Solr



 [ 
https://issues.apache.org/jira/browse/SOLR-15190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl reassigned SOLR-15190:
--

Assignee: Jan Høydahl

> Create a release repo for Solr
> --
>
> Key: SOLR-15190
> URL: https://issues.apache.org/jira/browse/SOLR-15190
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
>
> I think this will be created as we do the first release, i.e. nothing 
> explicit to do here until then?






[jira] [Created] (SOLR-15229) Move Solr's DOAP file from git to solr website

Jan Høydahl created SOLR-15229:
--

 Summary: Move Solr's DOAP file from git to solr website
 Key: SOLR-15229
 URL: https://issues.apache.org/jira/browse/SOLR-15229
 Project: Solr
  Issue Type: Sub-task
Reporter: Jan Høydahl
Assignee: Jan Høydahl


It is recommended to have the DOAP file on the project's website, as it is confusing that the DOAP file exists in different versions in the various code branches (only the DOAP file from the master branch is actually used). So in this Jira we'll
 * Move Solr's DOAP file to the solr-site.git repo (e.g. [https://solr.apache.org/doap-Solr.rdf])
 * Update [https://svn.apache.org/repos/asf/comdev/projects.apache.org/trunk/data/projects.xml] to link to the new location






[GitHub] [lucene-solr] atris commented on a change in pull request #2403: SOLR-15164: Implement Task Management Interface



atris commented on a change in pull request #2403:
URL: https://github.com/apache/lucene-solr/pull/2403#discussion_r590061216



##
File path: solr/solr-ref-guide/src/task-management.adoc
##
@@ -0,0 +1,73 @@
+= Task Management
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+Solr allows users to control their running tasks by monitoring them, marking tasks as cancellation-enabled, and cancelling them.
+
+This is achieved using the task management interface. Currently, this is 
supported for queries.
+
+== Types of Operations
+Task management interface (TMI) supports the following types of operations:
+
+1. List all currently running cancellable tasks.
+2. Cancel a specific task.
+3. Query the status of a specific task.
+
+== Listing All Active Cancellable Tasks
+To list all the active cancellable tasks currently running, please use the 
following syntax:
+
+`\http://localhost:8983/solr/tasks/list`
+
+ Sample Response
+
+`{responseHeader={status=0, QTime=11370}, 
taskList={0=q=*%3A*&canCancel=true&queryUUID=0&_stateVer_=collection1%3A4&wt=javabin&version=2,
 
5=q=*%3A*&canCancel=true&queryUUID=5&_stateVer_=collection1%3A4&wt=javabin&version=2,
 
7=q=*%3A*&canCancel=true&queryUUID=7&_stateVer_=collection1%3A4&wt=javabin&version=2}`
+
+== Cancelling An Active Cancellable Task
+To cancel an active task, please use the following syntax:
+
+`\http://localhost:8983/solr/tasks/cancel?cancelUUID=foobar`
+
+ cancelUUID Parameter
+This parameter is used to specify the UUID of the task to be cancelled.
+
+ Sample Response
+= If the task UUID was found and successfully cancelled:
+
+`{responseHeader={status=0, QTime=39}, status=Query with queryID 85 cancelled 
successfully}`
+
+= If the task UUID was not found
+
+`{responseHeader={status=0, QTime=39}, status=Query with queryID 85 not found}`
+
+= If the cancellation failed
+
+`{responseHeader={status=0, QTime=39}, status=Query with queryID 85 could not 
be cancelled successfully}`

Review comment:
   Honestly, this is just me being paranoid -- there is no known cause why 
a query will not be cancelled. If there is a SolrServerException or some other 
kind of runtime issue, it will be propagated appropriately without a need for 
this response. Removed the cancellation failed messaging.








[jira] [Commented] (SOLR-15211) Set up solr-operator as a sub project



[ 
https://issues.apache.org/jira/browse/SOLR-15211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17297934#comment-17297934
 ] 

Jan Høydahl commented on SOLR-15211:


Houston, for the DOAP, you can create a DOAP file for solr-operator with this 
tool [https://projects.apache.org/create.html]

Then upload the rdf file to the Solr website somewhere 
(solr.apache.org/doap-SolrOperator.rdf ?) and add a reference to the file in 
[https://svn.apache.org/repos/asf/comdev/projects.apache.org/trunk/data/projects.xml]
 (svn). Then the next day the new project should show up at 
[https://projects.apache.org/projects.html] and linked under Solr.

> Set up solr-operator as a sub project
> -
>
> Key: SOLR-15211
> URL: https://issues.apache.org/jira/browse/SOLR-15211
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Jan Høydahl
>Assignee: Houston Putman
>Priority: Major
>
> * Create DOAP file for it
>  * Find a way to highlight the sub project on Solr webpage (with a link to 
> GitHub)






[jira] [Commented] (SOLR-15163) Update DOAP for Solr



[ 
https://issues.apache.org/jira/browse/SOLR-15163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17297922#comment-17297922
 ] 

Jan Høydahl commented on SOLR-15163:


Confirming that [https://projects.apache.org/project.html?solr] now shows 
correct info for Solr, that Solr is gone from 
[https://projects.apache.org/project.html?lucene] and that there is now only 
one hit for "Solr" when you search. Resolving.

> Update DOAP for Solr
> 
>
> Key: SOLR-15163
> URL: https://issues.apache.org/jira/browse/SOLR-15163
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Anshum Gupta
>Assignee: Jan Høydahl
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently two projects exist at projects.apache.org.
> 1. https://projects.apache.org/project.html?solr (managed by Solr)
> 2. https://projects.apache.org/project.html?lucene-solr (Managed by Lucene)
> We need to merge and/or delete the Solr project listed at #2 into #1.






[jira] [Resolved] (SOLR-15163) Update DOAP for Solr



 [ 
https://issues.apache.org/jira/browse/SOLR-15163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl resolved SOLR-15163.

Resolution: Fixed

> Update DOAP for Solr
> 
>
> Key: SOLR-15163
> URL: https://issues.apache.org/jira/browse/SOLR-15163
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Anshum Gupta
>Assignee: Jan Høydahl
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently two projects exist at projects.apache.org.
> 1. https://projects.apache.org/project.html?solr (managed by Solr)
> 2. https://projects.apache.org/project.html?lucene-solr (Managed by Lucene)
> We need to merge and/or delete the Solr project listed at #2 into #1.


