Re: Limiting Jetty threads?

2023-03-29 Thread Michael Gibney
Yes, this was observed to be a problem with
CompressedStoredFieldsReader, and there may indeed be other cases. The
new storedFields API should help as of Lucene 9.5, but I think we'd
still do well to also pursue a faster QueuedThreadPool shrinking
configuration.

The current Solr default jetty config of 10k maxThreads is less
problematic IMO than the default jetty idleTimeout config of 2
minutes, which amounts to "pool shrinks at the rate of 1 thread every
2 minutes". I suspect a default idleTimeout of 5s or something would
be preferable, but I'm hopeful that we'll have better options soon
from upstream jetty. I wouldn't be surprised if many cases could do
just fine by reducing the maxThreads below 10k, but again I think the
shrink rate is the more important factor, and memory issues are
possible (at least pre-Lucene-9.5) even at way fewer threads (e.g.,
4k).
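
To put rough numbers on the shrink rate, here is a back-of-the-envelope sketch. It assumes the pool retires exactly one idle thread per idleTimeout interval, as described above, and it assumes a minThreads of 10 purely for illustration. The class and method names are made up for this example and are not part of Jetty or Solr.

```java
import java.time.Duration;

class PoolShrinkEstimate {
    /**
     * Time for a fully grown pool to shrink back to minThreads, assuming one
     * idle thread is retired per idleTimeout interval.
     */
    static Duration timeToShrink(int maxThreads, int minThreads, Duration idleTimeout) {
        int surplus = Math.max(0, maxThreads - minThreads);
        return idleTimeout.multipliedBy(surplus);
    }

    public static void main(String[] args) {
        int maxThreads = 10_000; // the 10k default discussed above
        int minThreads = 10;     // assumption, for illustration only
        Duration[] timeouts = {Duration.ofMinutes(2), Duration.ofSeconds(5)};
        for (Duration idleTimeout : timeouts) {
            Duration total = timeToShrink(maxThreads, minThreads, idleTimeout);
            System.out.printf("idleTimeout=%ds -> about %d hours to shrink fully%n",
                    idleTimeout.getSeconds(), total.toHours());
        }
    }
}
```

Under those assumptions, a 2-minute idleTimeout needs 333 hours (nearly 14 days) to release a fully grown pool, while 5s needs just under 14 hours.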

There's a PR eclipse/jetty.project#9498 [1] that improves the
situation somewhat (especially for shrinking fast) and appears to be
close to being merged; but if configured to shrink very fast it will
shrink pool capacity that shouldn't be considered "idle" by any
reasonable definition. I have an open PR eclipse/jetty.project#9532
[2] that's been tracking the first PR, and fully separates the concept
of pool shrink rate from that of idleTimeout as a keepalive/minimumTTL
for pool capacity to be considered idle (I'd appreciate any feedback
on the PR btw). The initial enhancement proposal also provides some
context [3].

Tangential but relevant: I'm definitely eagerly anticipating incorporating
the Lucene 9.5 storedFields change.

[1] https://github.com/eclipse/jetty.project/pull/9498
[2] https://github.com/eclipse/jetty.project/pull/9532
[3] https://github.com/eclipse/jetty.project/issues/9237

On Wed, Mar 29, 2023 at 5:51 PM David Smiley wrote:
>
> Has anyone experimented with reducing the number of request/Jetty threads
> in jetty.xml?  The default is 10K.  I'm concerned about the use of
> ThreadLocals caching stuff, like maybe Lucene analysis chains (Analyzer
> ReuseStrategy) or other things.  I think Michael Gibney reported to the
> Jetty project that these could be closed more aggressively.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley

-
To unsubscribe, e-mail: dev-unsubscr...@solr.apache.org
For additional commands, e-mail: dev-h...@solr.apache.org



Re: Draining a Solr node for traffic before shutting down

2023-03-29 Thread David Smiley
Looks like there's room for improvement.  I too would want the desired
state to be reflected in ZK first before attempting to make it happen.
Remove live_nodes first, then iterate the local replicas to be state=DOWN,
then close down all the things.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Wed, Mar 29, 2023 at 9:16 AM Jan Høydahl wrote:

> Hi,
>
> Trying to prevent traffic being sent to a Solr node that is going to shut
> down, to avoid interruption of service as seen from various clients.
> First part of the puzzle is signaling to any (external) load balancer to
> stop sending requests to the node.
> The other part is having SolrJ understand that the node is being stopped,
> and not routing internal requests to cores on the node.
>
> Does anyone have a good command of the Shutdown logic in Solr?
> My understanding is a bit sparse, but here's what I can see in the code:
>
> 1. bin/solr stop sends a STOP command to Jetty's STOP_PORT with the
>    (not-so-secret) stop key.
> 2. Jetty starts the shutdown process, destroying all servlets and filters,
>    including Solr's dispatchFilter.
> 3. Solr is notified about the shutdown through a callback in
>    CoreContainerProvider.
> 4. CoreContainerProvider#close() is called, which calls CC#shutdown.
> 5. CC shuts down every core on the node and then calls
>    zkController#preClose.
> 6. ZkController#preClose removes the ephemeral live_nodes/myNode and then
>    publishes the down state in state.json.
> 7. Wait for shutdown of executors etc. and let Jetty exit.
>
> I could have got it wrong though.
>
> I was hoping that a Solr node would first publish itself as "not ready" in
> ZK before rejecting requests, but it seems this is all reversed, since
> shutdown is initiated by Jetty?
> So could we instead register our own shutdown-port in Solr, and let our
> bin/solr script trigger that one? There we could orchestrate the shutdown
> as we want:
>
> 1. Remove the live_nodes znode in ZK.
> 2. Publish itself as not ready on the api/node/health handler (or a new
>    api/node/ready?).
> 3. Sleep for a few seconds (or longer with an optional &shutdownDelay
>    argument to our shutdown endpoint).
> 4. Trigger server.stop() to take down Jetty and kill the servlet.
>
> I filed https://issues.apache.org/jira/browse/SOLR-16722 to discuss a
> technical solution.
> The primary goal is to drain traffic right before shutting a node down,
> but it could also be designed as a generic Readiness Probe <
> https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#define-readiness-probes>
> modeled from Kubernetes?
> I'm also aware that any solr client should be prepared to hit a dead node
> due to network/power events, and retry. But it won't hurt to be graceful
> whenever we can..
>
> Happy to hear your thoughts. Is this a made-up problem?
>
> Jan


Limiting Jetty threads?

2023-03-29 Thread David Smiley
Has anyone experimented with reducing the number of request/Jetty threads
in jetty.xml?  The default is 10K.  I'm concerned about the use of
ThreadLocals caching stuff, like maybe Lucene analysis chains (Analyzer
ReuseStrategy) or other things.  I think Michael Gibney reported to the
Jetty project that these could be closed more aggressively.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


ShardSplitTest flakiness investigation

2023-03-29 Thread Alex Deparvu
Hi,

Following David's recent email about the failing tests (and obviously not
knowing any better) I started to look at the ShardSplitTest failures.

The easy part was turning the current NPEs into an actual assertion. The
hard part is that the documents seem to sometimes be missing the _version_
field which causes the test to fail.
If anyone has more pointers as to where the cause might be, please let me
know; I would be happy to dig further down this rabbit hole.
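
For readers who have not opened the PR, the general idea of turning an NPE into an assertion looks roughly like the sketch below. This is not the code from the PR; the class and method names are hypothetical, and a plain Map stands in for a Solr document.

```java
import java.util.Map;

class VersionFieldCheck {
    /**
     * Returns the document's _version_ value. A missing field fails with a
     * descriptive message instead of a NullPointerException further down.
     */
    static long requireVersion(Map<String, Object> doc) {
        Object version = doc.get("_version_");
        if (version == null) {
            throw new AssertionError(
                    "Document " + doc.get("id") + " is missing the _version_ field");
        }
        return ((Number) version).longValue();
    }
}
```

The failure message then names the offending document, which makes a flaky run easier to diagnose than a bare NPE stack trace.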

This is what I have so far https://github.com/apache/solr/pull/1504

best,
alex


Re: Draining a Solr node for traffic before shutting down

2023-03-29 Thread Shawn Heisey

On 3/29/23 07:16, Jan Høydahl wrote:

> Trying to prevent traffic being sent to a Solr node that is going to shut down,
> to avoid interruption of service as seen from various clients.
> First part of the puzzle is signaling to any (external) load balancer to stop
> sending requests to the node.
> The other part is having SolrJ understand that the node is being stopped, and
> not routing internal requests to cores on the node.


I would use the ping handler with a healthcheck file for this.  A load 
balancer can send a request to /solr/CORE_NAME/admin/ping (probably with 
distrib=false) as a healthcheck ... probably best to have a dedicated 
replica of an empty collection for that purpose.  Disable the ping 
handler before shutting the node down, and the load balancer should stop 
sending requests there pretty quickly.
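
A minimal sketch of what such a probe could look like from the load balancer's side, using the JDK's built-in HTTP client. The class and method names are hypothetical, the 2-second timeout is an arbitrary choice, and the sketch assumes that a disabled ping handler answers with a non-200 status.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

class PingHealthCheck {
    /** Builds the ping URL described above for one core on one node. */
    static String pingUrl(String baseUrl, String coreName) {
        return baseUrl + "/solr/" + coreName + "/admin/ping?distrib=false";
    }

    /** Only a plain 200 counts as healthy (assumption: disabled ping is non-200). */
    static boolean isHealthy(int statusCode) {
        return statusCode == 200;
    }

    /** Sends one probe; any I/O failure or timeout is treated as unhealthy. */
    static boolean probe(HttpClient client, String baseUrl, String coreName) {
        HttpRequest request = HttpRequest.newBuilder(URI.create(pingUrl(baseUrl, coreName)))
                .timeout(Duration.ofSeconds(2))
                .GET()
                .build();
        try {
            HttpResponse<Void> response =
                    client.send(request, HttpResponse.BodyHandlers.discarding());
            return isHealthy(response.statusCode());
        } catch (IOException e) {
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

A caller would invoke something like probe(HttpClient.newHttpClient(), "http://localhost:8983", "CORE_NAME") on a schedule and take the node out of rotation after a failed probe.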


I would hope that existing mechanisms in SolrCloud are robust enough to 
handle this transparently in the event of server failure or a deletion 
request via the collections API, but I do not know if that is the case.


Thanks,
Shawn




Draining a Solr node for traffic before shutting down

2023-03-29 Thread Jan Høydahl
Hi,

Trying to prevent traffic being sent to a Solr node that is going to shut down, 
to avoid interruption of service as seen from various clients.
First part of the puzzle is signaling to any (external) load balancer to stop 
sending requests to the node.
The other part is having SolrJ understand that the node is being stopped, and 
not routing internal requests to cores on the node.

Does anyone have a good command of the Shutdown logic in Solr?
My understanding is a bit sparse, but here's what I can see in the code: 

1. bin/solr stop sends a STOP command to Jetty's STOP_PORT with the
   (not-so-secret) stop key.
2. Jetty starts the shutdown process, destroying all servlets and filters,
   including Solr's dispatchFilter.
3. Solr is notified about the shutdown through a callback in
   CoreContainerProvider.
4. CoreContainerProvider#close() is called, which calls CC#shutdown.
5. CC shuts down every core on the node and then calls zkController#preClose.
6. ZkController#preClose removes the ephemeral live_nodes/myNode and then
   publishes the down state in state.json.
7. Wait for shutdown of executors etc. and let Jetty exit.

I could have got it wrong though.

I was hoping that a Solr node would first publish itself as "not ready" in ZK
before rejecting requests, but it seems this is all reversed, since shutdown is
initiated by Jetty?
So could we instead register our own shutdown-port in Solr, and let our 
bin/solr script trigger that one? There we could orchestrate the shutdown as we 
want:

1. Remove the live_nodes znode in ZK.
2. Publish itself as not ready on the api/node/health handler (or a new
   api/node/ready?).
3. Sleep for a few seconds (or longer with an optional &shutdownDelay argument
   to our shutdown endpoint).
4. Trigger server.stop() to take down Jetty and kill the servlet.
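
The proposed ordering can be sketched in a few lines. Everything here is hypothetical: the NodeHooks interface and its method names do not exist in Solr and only stand in for the ZK, health handler, and Jetty calls listed above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

class GracefulShutdownSketch {
    /** Hypothetical hooks; none of these names exist in Solr. */
    interface NodeHooks {
        void removeLiveNode(); // remove the live_nodes znode in ZK
        void markNotReady();   // report "not ready" on the health/ready endpoint
        void stopServer();     // e.g. trigger server.stop() on Jetty
    }

    /** Announces the shutdown first, waits for traffic to drain, then stops. */
    static void drainAndStop(NodeHooks hooks, long shutdownDelayMillis) {
        hooks.removeLiveNode();
        hooks.markNotReady();
        try {
            TimeUnit.MILLISECONDS.sleep(shutdownDelayMillis);
        } catch (InterruptedException e) {
            // Keep the interrupt flag and skip the rest of the delay.
            Thread.currentThread().interrupt();
        }
        hooks.stopServer();
    }

    /** Test double that only records the order in which the hooks fire. */
    static class RecordingHooks implements NodeHooks {
        final List<String> calls = new ArrayList<>();
        @Override public void removeLiveNode() { calls.add("removeLiveNode"); }
        @Override public void markNotReady() { calls.add("markNotReady"); }
        @Override public void stopServer() { calls.add("stopServer"); }
    }

    public static void main(String[] args) {
        RecordingHooks hooks = new RecordingHooks();
        drainAndStop(hooks, 100);
        System.out.println(hooks.calls);
    }
}
```

The point of the sketch is only the ordering: the node announces that it is going away before it stops accepting requests, which is the reverse of the Jetty-initiated flow described earlier.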

I filed https://issues.apache.org/jira/browse/SOLR-16722 to discuss a technical 
solution.
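The primary goal is to drain traffic right before shutting a node down, but it
could also be designed as a generic Readiness Probe
<https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#define-readiness-probes>
modeled from Kubernetes?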
I'm also aware that any solr client should be prepared to hit a dead node due 
to network/power events, and retry. But it won't hurt to be graceful whenever 
we can..

Happy to hear your thoughts. Is this a made-up problem?

Jan

A Message from the Board to PMC members

2023-03-29 Thread Rich Bowen
Dear Apache Project Management Committee (PMC) members,

The Board wants to take just a moment of your time to communicate a few
things that seem to have been forgotten by a number of PMC members,
across the Foundation, over the past few years.  Please note that this
is being sent to all projects - yours has not been singled out.

The Project Management Committee (PMC) as a whole[1] is tasked with the
oversight, health, and sustainability of the project. The PMC members
are responsible collectively, and individually, for ensuring that the
project operates in a way that is in line with ASF philosophy, and in a
way that serves the developers and users of the project.

The PMC Chair is not the project leader, in any sense. It is the person
who files board reports and makes sure they are delivered on time. It
is the secretary for the project, and the project’s ambassador to the
Board of Directors. The VP title is given as an artifact of US
corporate law, and not because the PMC Chair has any special powers. If
you are treating your PMC Chair as the project lead, or granting them
any other special powers or privileges, you need to be aware that
that’s not the intent of the Chair role. The Chair is a PMC member peer
with a few extra duties.

Every PMC member has an equal voice in deliberations. Each has one
vote. Each has veto power. Every vote weighs the same. It is not only
your right, but it is your obligation, to use that vote for the good of
the project and its users, not to appease the Chair, your employer, or
any other voice in the project. 

Every PMC member can, and should, nominate new committers, and new PMC
members. This is not the sole domain of the PMC Chair. This might be
your most important responsibility to the project, as succession
planning is the path to sustainability.

Every PMC member can, and should, respond when the Board sends email to
your private list. You should not wait for the PMC Chair to respond.
The Board views the entire PMC as responsible for the project, not just
one member.

Every PMC member should be subscribed to the private@ mailing list. If
you are not, then you are neglecting your duty of oversight. If you no
longer wish to be responsible for oversight of the project, you should
resign your PMC seat, not merely drop off of the private@ list and
ignore it. You can determine which PMC members are not subscribed to
your private list by looking at your PMC roster at
https://whimsy.apache.org/roster/committee/  Names with an asterisk (*)
next to them are not subscribed to the list. We encourage you to take a
moment to contact them with this information.

Thank you for your attention to these matters, and thank you for
keeping our projects healthy.

Rich, for The Board of Directors

[1] https://apache.org/foundation/how-it-works.html#pmc-members





Re: [DISCUSS] Community Virtual Meetup, April 2023

2023-03-29 Thread Jan Høydahl
You can check one of the public holiday lists, such as

https://www.qppstudio.net/global-holidays-observances/2023.htm

Jan

> On 29 Mar 2023, at 13:10, Ishan Chattopadhyaya wrote:
> 
> Oh, if that's the case, when could be a good day after that to expect
> maximum attendance?
> 
> On Wed, 29 Mar, 2023, 4:37 pm Jason Gerlowski wrote:
> 
>> Hey Ishan,
>> 
>> Thanks for volunteering to organize!  Let me know if you have any
>> questions or I can help at all.
>> 
>> April 6th works for me personally, but I wonder if folks might be off
>> for Passover or Easter observances?  Maybe not, just thinking aloud...
>> 
>> Best,
>> 
>> Jason
>> 
>> On Wed, Mar 29, 2023 at 6:36 AM Ishan Chattopadhyaya wrote:
>>> 
>>> Hey Jason,
>>> I'd like to volunteer for the next virtual meet-up.
>>> 
>>> Would April 6 be okay with everyone?
>>> Thanks and regards,
>>> Ishan
>>> 
>>> On Mon, 27 Mar, 2023, 7:07 pm Jason Gerlowski wrote:
>>> 
>>>> Hey all,
>>>>
>>>> It's time once again to start thinking ahead to our Virtual Meetup for
>>>> April.  As always, there's two main questions to answer in terms of
>>>> planning:
>>>>
>>>> 1. Any volunteers to organize?  Organizer duties are pretty light and
>>>> are summarized here:
>>>> https://cwiki.apache.org/confluence/display/SOLR/Meeting+notes.
>>>> Volunteers get some discretion in terms of picking a meeting time/day
>>>> that works for them.  If no one signs on in the next few days (say, by
>>>> Thursday March 30th), I'll be happy to organize-by-default.
>>>>
>>>> 2. When should we meet?
>>>>
>>>> Excited to see you all "in person" in a few weeks.
>>>>
>>>> Best,
>>>>
>>>> Jason
 
>> 





Re: [DISCUSS] Community Virtual Meetup, April 2023

2023-03-29 Thread Ishan Chattopadhyaya
Oh, if that's the case, when could be a good day after that to expect
maximum attendance?

On Wed, 29 Mar, 2023, 4:37 pm Jason Gerlowski wrote:

> Hey Ishan,
>
> Thanks for volunteering to organize!  Let me know if you have any
> questions or I can help at all.
>
> April 6th works for me personally, but I wonder if folks might be off
> for Passover or Easter observances?  Maybe not, just thinking aloud...
>
> Best,
>
> Jason
>
> On Wed, Mar 29, 2023 at 6:36 AM Ishan Chattopadhyaya wrote:
> >
> > Hey Jason,
> > I'd like to volunteer for the next virtual meet-up.
> >
> > Would April 6 be okay with everyone?
> > Thanks and regards,
> > Ishan
> >
> > On Mon, 27 Mar, 2023, 7:07 pm Jason Gerlowski wrote:
> >
> > > Hey all,
> > >
> > > It's time once again to start thinking ahead to our Virtual Meetup for
> > > April.  As always, there's two main questions to answer in terms of
> > > planning:
> > >
> > > 1. Any volunteers to organize?  Organizer duties are pretty light and
> > > are summarized here:
> > > https://cwiki.apache.org/confluence/display/SOLR/Meeting+notes.
> > > Volunteers get some discretion in terms of picking a meeting time/day
> > > that works for them.  If no one signs on in the next few days (say, by
> > > Thursday March 30th), I'll be happy to organize-by-default.
> > >
> > > 2. When should we meet?
> > >
> > > Excited to see you all "in person" in a few weeks.
> > >
> > > Best,
> > >
> > > Jason
> > >
>


Re: [DISCUSS] Community Virtual Meetup, April 2023

2023-03-29 Thread Jason Gerlowski
Hey Ishan,

Thanks for volunteering to organize!  Let me know if you have any
questions or I can help at all.

April 6th works for me personally, but I wonder if folks might be off
for Passover or Easter observances?  Maybe not, just thinking aloud...

Best,

Jason

On Wed, Mar 29, 2023 at 6:36 AM Ishan Chattopadhyaya wrote:
>
> Hey Jason,
> I'd like to volunteer for the next virtual meet-up.
>
> Would April 6 be okay with everyone?
> Thanks and regards,
> Ishan
>
> On Mon, 27 Mar, 2023, 7:07 pm Jason Gerlowski wrote:
>
> > Hey all,
> >
> > It's time once again to start thinking ahead to our Virtual Meetup for
> > April.  As always, there's two main questions to answer in terms of
> > planning:
> >
> > 1. Any volunteers to organize?  Organizer duties are pretty light and
> > are summarized here:
> > https://cwiki.apache.org/confluence/display/SOLR/Meeting+notes.
> > Volunteers get some discretion in terms of picking a meeting time/day
> > that works for them.  If no one signs on in the next few days (say, by
> > Thursday March 30th), I'll be happy to organize-by-default.
> >
> > 2. When should we meet?
> >
> > Excited to see you all "in person" in a few weeks.
> >
> > Best,
> >
> > Jason
> >




Re: [DISCUSS] Community Virtual Meetup, April 2023

2023-03-29 Thread Ishan Chattopadhyaya
Hey Jason,
I'd like to volunteer for the next virtual meet-up.

Would April 6 be okay with everyone?
Thanks and regards,
Ishan

On Mon, 27 Mar, 2023, 7:07 pm Jason Gerlowski wrote:

> Hey all,
>
> It's time once again to start thinking ahead to our Virtual Meetup for
> April.  As always, there's two main questions to answer in terms of
> planning:
>
> 1. Any volunteers to organize?  Organizer duties are pretty light and
> are summarized here:
> https://cwiki.apache.org/confluence/display/SOLR/Meeting+notes.
> Volunteers get some discretion in terms of picking a meeting time/day
> that works for them.  If no one signs on in the next few days (say, by
> Thursday March 30th), I'll be happy to organize-by-default.
>
> 2. When should we meet?
>
> Excited to see you all "in person" in a few weeks.
>
> Best,
>
> Jason
>