Re: Apache Solr 9.7 branch created

2024-08-07 Thread David Smiley
Anshum, please review the following:

Prometheus Response Writer:
* https://github.com/apache/solr/pull/2616 fixes the content type.
Very trivial; ready to merge.  No CHANGES.txt needed as it improves
the existing merged work.
* FYI there's another PR that does an internal class rename to avoid
confusion before we ship this; it'd be nice to merge this too if you
let me :-)

ThreadPoolExecutor configuration affecting multiple features, albeit in one PR:
* bad performance for backups of large collections
* new multiThreaded executor configuration
* suboptimal for update log replay
See this issue and especially this comment:
https://issues.apache.org/jira/browse/SOLR-17391?focusedCommentId=17871790&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17871790
The PR is ready IMO, albeit missing a CHANGES.txt entry.  I'd be happy to move it along.
Hats off to my colleague Pierre for spotting a really useful option
"allowCoreThreadTimeOut"
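
For readers who haven't used that option: below is a minimal sketch of what allowCoreThreadTimeOut changes on a plain java.util.concurrent.ThreadPoolExecutor. It is not the code from the PR; the class name, pool size and keep-alive value are assumptions picked for illustration. With an unbounded queue, a ThreadPoolExecutor never grows past its core size, so the sketch sets core = max, and allowCoreThreadTimeOut(true) then lets those threads exit once they have been idle for the keep-alive time.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Illustrative only: sizes and names are assumptions, not taken from the Solr PR. */
final class CoreTimeoutSketch {

  /** Fixed-size pool over an unbounded queue whose idle threads are allowed to exit. */
  static ThreadPoolExecutor newPool(int threads, long keepAliveMillis) {
    ThreadPoolExecutor pool =
        new ThreadPoolExecutor(
            threads, threads, keepAliveMillis, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
    // Without this call the core threads would stay alive until shutdown.
    pool.allowCoreThreadTimeOut(true);
    return pool;
  }

  /** Returns {pool size while all threads are busy, pool size after the pool went idle}. */
  static int[] demo(int threads, long keepAliveMillis) {
    ThreadPoolExecutor pool = newPool(threads, keepAliveMillis);
    CountDownLatch started = new CountDownLatch(threads);
    CountDownLatch release = new CountDownLatch(1);
    try {
      for (int i = 0; i < threads; i++) {
        pool.execute(() -> {
          started.countDown();
          try {
            release.await(); // keep the worker busy until the caller releases it
          } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
          }
        });
      }
      started.await();
      int busy = pool.getPoolSize();
      release.countDown();
      // Poll until the idle workers have timed out (bounded wait).
      long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(10);
      while (pool.getPoolSize() > 0 && System.nanoTime() < deadline) {
        Thread.sleep(10);
      }
      return new int[] {busy, pool.getPoolSize()};
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    int[] sizes = demo(4, 50);
    System.out.println("busy=" + sizes[0] + " idle=" + sizes[1]);
  }
}
```

The point of the design is that the pool can use all of its threads under load yet hold none while idle, which a fixed core size alone does not give you.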

On Tue, Aug 6, 2024 at 4:31 PM Anshum Gupta wrote:
>
> Yup, realized the 'release wizard early' bit :)
>
> The wizard is wonderful, especially as the last time I released, it was the
> manual process!
>
> On Tue, Aug 6, 2024 at 11:25 AM Gus Heck  wrote:
>
> > Yeah release wizard early... see my writeup here:
> >
> > https://github.com/apache/solr/blob/main/dev-docs/releasing.adoc#step-1-run-the-release-wizard
> > (and feel free to add to it)
> >
> > Also you'll want to peek at
> >
> > https://issues.apache.org/jira/issues/?jql=project%20%3D%20SOLR%20AND%20resolution%20%3D%20Unresolved%20AND%20component%20%3D%20release-scripts%20ORDER%20BY%20priority%20DESC%2C%20updated%20DESC
> > for a list of gotchas
> >
> > On Tue, Aug 6, 2024 at 12:25 PM Anshum Gupta 
> > wrote:
> >
> > > Thanks for letting me know, Houston and David. I'll take a look and take
> > > care of that.
> > >
> > > On Mon, Aug 5, 2024 at 9:28 PM Houston Putman 
> > wrote:
> > >
> > > > > It is in there. Definitely start the steps now, Anshum. You don't have to
> > > > > be ready to do the release to do the first 1/3 of them.
> > > >
> > > > On Mon, Aug 5, 2024 at 10:54 PM David Smiley 
> > wrote:
> > > >
> > > > > Does the release wizard not have the next version (9.8) in JIRA get
> > > > > created?  Ideally that is done with the branch freezing so as to
> > > > > resolve JIRA issues not making the release.
> > > > >
> > > > > On Mon, Aug 5, 2024 at 11:44 PM David Smiley 
> > > wrote:
> > > > > >
> > > > > > This change here is a super simple fix to the new
> > > > > > PrometheusResponseWriter mime type:
> > > > > > https://github.com/apache/solr/pull/2616
> > > > > > I'd like to get this in.
> > > > > >
> > > > > > On Mon, Aug 5, 2024 at 3:46 PM Anshum Gupta <
> > ans...@anshumgupta.net>
> > > > > wrote:
> > > > > > >
> > > > > > > Hi everyone,
> > > > > > >
> > > > > > > I've created the release branch for 9.7. Can you please let me
> > know
> > > > > what's
> > > > > > > pending and needs to still make it into the release?
> > > > > > >
> > > > > > > Thanks!
> > > > > > >
> > > > > > > --
> > > > > > > Anshum Gupta
> > > > >
> > > > > -
> > > > > To unsubscribe, e-mail: dev-unsubscr...@solr.apache.org
> > > > > For additional commands, e-mail: dev-h...@solr.apache.org
> > > > >
> > > > >
> > > >
> > >
> > >
> > > --
> > > Anshum Gupta
> > >
> >
> >
> > --
> > http://www.needhamsoftware.com (work)
> > https://a.co/d/b2sZLD9 (my fantasy fiction book)
> >
>
>
> --
> Anshum Gupta




Re: Apache Solr 9.7 branch created

2024-08-06 Thread David Smiley
Another simple item reported by a user with a simple fix under review:
https://issues.apache.org/jira/browse/SOLR-17391

On Tue, Aug 6, 2024 at 2:23 PM Gus Heck wrote:
>
> Yeah release wizard early... see my writeup here:
> https://github.com/apache/solr/blob/main/dev-docs/releasing.adoc#step-1-run-the-release-wizard
> (and feel free to add to it)
>
> Also you'll want to peek at
> https://issues.apache.org/jira/issues/?jql=project%20%3D%20SOLR%20AND%20resolution%20%3D%20Unresolved%20AND%20component%20%3D%20release-scripts%20ORDER%20BY%20priority%20DESC%2C%20updated%20DESC
> for a list of gotchas
>
> On Tue, Aug 6, 2024 at 12:25 PM Anshum Gupta  wrote:
>
> > Thanks for letting me know, Houston and David. I'll take a look and take
> > care of that.
> >
> > On Mon, Aug 5, 2024 at 9:28 PM Houston Putman  wrote:
> >
> > > > It is in there. Definitely start the steps now, Anshum. You don't have to
> > > > be ready to do the release to do the first 1/3 of them.
> > >
> > > On Mon, Aug 5, 2024 at 10:54 PM David Smiley  wrote:
> > >
> > > > Does the release wizard not have the next version (9.8) in JIRA get
> > > > created?  Ideally that is done with the branch freezing so as to
> > > > resolve JIRA issues not making the release.
> > > >
> > > > On Mon, Aug 5, 2024 at 11:44 PM David Smiley 
> > wrote:
> > > > >
> > > > > This change here is a super simple fix to the new
> > > > > PrometheusResponseWriter mime type:
> > > > > https://github.com/apache/solr/pull/2616
> > > > > I'd like to get this in.
> > > > >
> > > > > On Mon, Aug 5, 2024 at 3:46 PM Anshum Gupta 
> > > > wrote:
> > > > > >
> > > > > > Hi everyone,
> > > > > >
> > > > > > I've created the release branch for 9.7. Can you please let me know
> > > > what's
> > > > > > pending and needs to still make it into the release?
> > > > > >
> > > > > > Thanks!
> > > > > >
> > > > > > --
> > > > > > Anshum Gupta
> > > >
> > > > -
> > > > To unsubscribe, e-mail: dev-unsubscr...@solr.apache.org
> > > > For additional commands, e-mail: dev-h...@solr.apache.org
> > > >
> > > >
> > >
> >
> >
> > --
> > Anshum Gupta
> >
>
>
> --
> http://www.needhamsoftware.com (work)
> https://a.co/d/b2sZLD9 (my fantasy fiction book)




Re: Apache Solr 9.7 branch created

2024-08-05 Thread David Smiley
Doesn't the release wizard create the next version (9.8) in JIRA?
Ideally that is done when the branch is frozen, so that JIRA issues
not making the release can be resolved against the next version.

On Mon, Aug 5, 2024 at 11:44 PM David Smiley wrote:
>
> This change here is a super simple fix to the new
> PrometheusResponseWriter mime type:
> https://github.com/apache/solr/pull/2616
> I'd like to get this in.
>
> On Mon, Aug 5, 2024 at 3:46 PM Anshum Gupta  wrote:
> >
> > Hi everyone,
> >
> > I've created the release branch for 9.7. Can you please let me know what's
> > pending and needs to still make it into the release?
> >
> > Thanks!
> >
> > --
> > Anshum Gupta




Re: Apache Solr 9.7 branch created

2024-08-05 Thread David Smiley
This change here is a super simple fix to the new
PrometheusResponseWriter mime type:
https://github.com/apache/solr/pull/2616
I'd like to get this in.

On Mon, Aug 5, 2024 at 3:46 PM Anshum Gupta wrote:
>
> Hi everyone,
>
> I've created the release branch for 9.7. Can you please let me know what's
> pending and needs to still make it into the release?
>
> Thanks!
>
> --
> Anshum Gupta




Re: JSON Parsing, Jackson / Noggit

2024-08-05 Thread David Smiley
I just finished some benchmarking work using Solr's benchmark module.
It should be pretty easy to tweak an existing benchmark to try both.
Purely from a maintainability standpoint, we could make a hard break
decision in Solr 10.

On Mon, Aug 5, 2024 at 1:00 PM Jason Gerlowski wrote:
>
> My hunch is that Jackson would be more performant than Noggit, but I
> don't have any hard numbers to back that up so it's just an educated
> guess.  I swear there was some other issue that gave Noggit vs.
> Jackson numbers but I can't find it now.  SOLR-16691 (where Noble
> switched at least some things over to using Jackson) mentions perf
> improvements in the issue description but doesn't quantify those.
> Maybe someone else with context can chime in with data?
>
> Personally, I'd rather see us use Jackson across the board.  I'm sure
> we can write and maintain great serialization code if we want to spend
> that effort, but do we?  Ultimately we're here for Search - it's hard
> to imagine us wanting to spend anywhere near the amount of time on
> serde code that a project like Jackson does as their raison d'etre.
>
> The Noggit lenient parsing *is* really nice for making requests by
> hand, but that's a minority use case.  If there's evidence that
> Jackson is faster, is it worth slowing down 99% of JSON requests just
> so that we can leniently parse the 0.1% of malformed reqs that need
> it?  Is it worth the cost of maintaining our own JSON parsing code in
> perpetuity?
>
> Best,
>
> Jason
>
> On Mon, Aug 5, 2024 at 11:54 AM David Smiley  wrote:
> >
> > We have a couple JSON Parsing libraries -- "Noggit" (internal to Solr)
> > and "Jackson".  Noggit is more lenient in parsing.  I suppose Solr
> > should use Noggit for parsing JSON coming into it, but AFAIK Solr only
> > returns/emits valid JSON; yes?  For parsing JSON that we assume is
> > compliant (e.g. from Solr), should we prefer Jackson or Noggit?  Are
> > there performance advantages?
> >
> > ~ David Smiley
> > Apache Lucene/Solr Search Developer
> > http://www.linkedin.com/in/davidwsmiley
> >
> >
>
>




Re: [DISCUSS] Solr 9.7 release

2024-08-03 Thread David Smiley
Eric, merge away; no need to ask permission.  Without a release branch
(there *still* isn't one), it's not reasonable for the RM to begin a
feature freeze as was kind of hinted at.  It's still open season!

On Sat, Aug 3, 2024 at 7:54 AM Eric Pugh wrote:
>
> I had earlier asked to wait to get both 
> https://github.com/apache/solr/pull/2593 and 
> https://github.com/apache/solr/pull/2577 into 9.7.
>
> 2577 is in, and Jenkins for main and branch_9x seem to have passed (the 
> failures are on the prometheus tests)….   2577 fixes a lot of documentation 
> and user messages and I think is important for 9.7.
>
> 2593 needs more work, and is all “behind the scenes” changes, so if it ends 
> up in 9.8 that is okay.
>
> Eric
>
>
> > On Aug 3, 2024, at 12:59 AM, David Smiley  wrote:
> >
> > The release branch has not yet been cut.  Anshum, based on your
> > comments, I think you should cut it.  I would like to merge things to
> > branch_9x without it going to 9.7 as I would like more bake time on
> > some things.
> >
> > On Fri, Aug 2, 2024 at 3:09 PM Gus Heck  wrote:
> >>
> >> For https://issues.apache.org/jira/browse/SOLR-17298
> >>
> >> I think the key thing that remains is to verify if the exclusion of
> >> graph/join/rerank queries really is correct. The tests didn't fail out of
> >> the box when I removed that restriction, but I want to add some tests that
> >> explicitly turn it on to be sure.
> >>
> >> -Gus
> >>
> >> On Fri, Aug 2, 2024 at 2:50 PM Anshum Gupta  wrote:
> >>
> >>> Do we have an estimate on when the open issues will get wrapped up? I
> >>> certainly would recommend that we don't rush with releasing code and
> >>> instead let it bake. If it's almost done, I can create the branch early
> >>> next week and wait a couple of days before moving forward with the 
> >>> release.
> >>>
> >>> There's no rush, but I wouldn't want to be blocked indefinitely.
> >>>
> >>> On Thu, Aug 1, 2024 at 1:14 PM David Smiley  wrote:
> >>>
> >>>> For the Prometheus test failures, I re-opened the issue.  Consider
> >>>> that one just test noise; we should ignore it if it doesn't get fixed
> >>>> soon.
> >>>> https://issues.apache.org/jira/browse/SOLR-17368
> >>>>
> >>>> On Thu, Aug 1, 2024 at 8:34 AM Gus Heck  wrote:
> >>>>>
> >>>>> Also looking at http://fucit.org/solr-jenkins-reports/failure-report.html
> >>>>> we have a couple tests with high fail rates. This one seems to be
> >>>>> reproducible (reliably with seed, but also not hard to generate without)
> >>>>>
> >>>>> org.apache.solr.cloud.DistribDocExpirationUpdateProcessorTest > testNoAuth FAILED
> >>>>>     java.lang.AssertionError: Give up waiting for no results: q=should_expire_s:yup=0&_trace=init_batch_check&_stateVer_=expiring:13 expected:<0> but was:<55>
> >>>>>         at __randomizedtesting.SeedInfo.seed([11EF0AA307058EA1:CD5AB0CEBDD35C89]:0)
> >>>>>
> >>>>>   2> 563294 ERROR (qtp1873537458-54511-null-134677) [n:127.0.0.1:34489_solr c:expiring_secure s:shard2 r:core_node7 x:expiring_secure_shard2_replica_n3 t:null-134677] o.a.s.h.RequestHandlerBase Client exception
> >>>>>   2>   => org.apache.solr.common.SolrException: can not use FieldCache on a field w/o docValues unless it is indexed uninvertible=true and the type supports Uninversion: _version_
> >>>>>   2> at org.apache.solr.schema.SchemaField.checkFieldCacheSource(SchemaField.java:295)
> >>>>>
> >>>>> On Thu, Aug 1, 2024 at 7:22 AM Christine Poerschke (BLOOMBERG/ LONDON)
> >>> <
> >>>>> cpoersc...@bloomberg.net> wrote:
> >>>>>
> >>>>>> If there is time I'd like to nominate
> >>>>>> https://issues.apache.org/jira/browse/SOLR-17386 and
> >>>>>> https://github.com/apache/solr/pull/2607 to be in the 9.7 release.
> >>>

Re: [DISCUSS] Solr 9.7 release

2024-08-02 Thread David Smiley
The release branch has not yet been cut.  Anshum, based on your
comments, I think you should cut it.  I would like to merge things to
branch_9x without it going to 9.7 as I would like more bake time on
some things.

On Fri, Aug 2, 2024 at 3:09 PM Gus Heck wrote:
>
> For https://issues.apache.org/jira/browse/SOLR-17298
>
> I think the key thing that remains is to verify if the exclusion of
> graph/join/rerank queries really is correct. The tests didn't fail out of
> the box when I removed that restriction, but I want to add some tests that
> explicitly turn it on to be sure.
>
> -Gus
>
> On Fri, Aug 2, 2024 at 2:50 PM Anshum Gupta  wrote:
>
> > Do we have an estimate on when the open issues will get wrapped up? I
> > certainly would recommend that we don't rush with releasing code and
> > instead let it bake. If it's almost done, I can create the branch early
> > next week and wait a couple of days before moving forward with the release.
> >
> > There's no rush, but I wouldn't want to be blocked indefinitely.
> >
> > On Thu, Aug 1, 2024 at 1:14 PM David Smiley  wrote:
> >
> > > For the Prometheus test failures, I re-opened the issue.  Consider
> > > that one just test noise; we should ignore it if it doesn't get fixed
> > > soon.
> > > https://issues.apache.org/jira/browse/SOLR-17368
> > >
> > > On Thu, Aug 1, 2024 at 8:34 AM Gus Heck  wrote:
> > > >
> > > > > Also looking at http://fucit.org/solr-jenkins-reports/failure-report.html
> > > > > we have a couple tests with high fail rates. This one seems to be
> > > > > reproducible (reliably with seed, but also not hard to generate without)
> > > > >
> > > > > org.apache.solr.cloud.DistribDocExpirationUpdateProcessorTest > testNoAuth FAILED
> > > > >     java.lang.AssertionError: Give up waiting for no results: q=should_expire_s:yup=0&_trace=init_batch_check&_stateVer_=expiring:13 expected:<0> but was:<55>
> > > > >         at __randomizedtesting.SeedInfo.seed([11EF0AA307058EA1:CD5AB0CEBDD35C89]:0)
> > > > >
> > > > >   2> 563294 ERROR (qtp1873537458-54511-null-134677) [n:127.0.0.1:34489_solr c:expiring_secure s:shard2 r:core_node7 x:expiring_secure_shard2_replica_n3 t:null-134677] o.a.s.h.RequestHandlerBase Client exception
> > > > >   2>   => org.apache.solr.common.SolrException: can not use FieldCache on a field w/o docValues unless it is indexed uninvertible=true and the type supports Uninversion: _version_
> > > > >   2> at org.apache.solr.schema.SchemaField.checkFieldCacheSource(SchemaField.java:295)
> > > >
> > > > On Thu, Aug 1, 2024 at 7:22 AM Christine Poerschke (BLOOMBERG/ LONDON)
> > <
> > > > cpoersc...@bloomberg.net> wrote:
> > > >
> > > > > If there is time I'd like to nominate
> > > > > https://issues.apache.org/jira/browse/SOLR-17386 and
> > > > > https://github.com/apache/solr/pull/2607 to be in the 9.7 release.
> > > > >
> > > > > -Christine
> > > > >
> > > > > > From: dev@solr.apache.org At: 07/29/24 15:50:24 UTC+1:00 To:
> > > > > dev@solr.apache.org
> > > > > Subject: Re: [DISCUSS] Solr 9.7 release
> > > > >
> > > > > https://issues.apache.org/jira/browse/SOLR-17298 is more or less
> > > ready,
> > > > > but
> > > > > would like it to at least have a few days sitting in main before I
> > > backport
> > > > > it. Without it the multi-threaded search needs to be documented as
> > not
> > > > > working with cpuTimeAllowed.
> > > > >
> > > > > On Thu, Jul 25, 2024 at 12:31 PM Anshum Gupta <
> > ans...@anshumgupta.net>
> > > > > wrote:
> > > > >
> > > > > > Sure, I'll wait until the 30th. Thanks for letting me know.
> > > > > >
> > > > > > On Wed, Jul 24, 2024 at 9:39 PM Ishan Chattopadhyaya <
> > > > > > ichattopadhy...@gmail.com> wrote:
> > > > > >
> > > > > > > Hi Anshum,
> > > > > > > I'm still working on unblocking SOLR-13350. Can we please push
> > the
> > > date
> > > > > &g

Re: Several bugs in the Solr source code found using a static code analyzer

2024-08-01 Thread David Smiley
Hello Daniil,

Thanks!  I like your commentary in the blog; it should be easy work
for someone to fix them.

A dose of reality:  Don't be surprised if nobody takes up this task.
People generally work on problems they experience, not hypothetical
ones.  Preferably, static analysis is considered during development
(in IDE) or PR review (we had this but don't currently).

On Thu, Aug 1, 2024 at 5:50 AM Daniil Liakhov wrote:
>
> Hello!
>
> My name is Daniil and I am a Java software engineer working at PVS-Studio.
> We are developing a static analyzer for C++, C# and Java code, and sometimes
> we check open-source projects with it.
> I recently checked the source code of Solr and found several bugs,
> which I describe in my article and would like to show you.
> Here is the link: https://pvs-studio.com/en/blog/posts/java/1147/
>
> Thank you for your attention.
> --
> Best regards,
> Daniil Liakhov
> PVS-Studio LLC




Re: [DISCUSS] Solr 9.7 release

2024-08-01 Thread David Smiley
For the Prometheus test failures, I re-opened the issue.  Consider
that one just test noise; we should ignore it if it doesn't get fixed
soon.
https://issues.apache.org/jira/browse/SOLR-17368

On Thu, Aug 1, 2024 at 8:34 AM Gus Heck wrote:
>
> Also looking at http://fucit.org/solr-jenkins-reports/failure-report.html
> we have a couple tests with high fail rates. This one seems to be
> reproducible (reliably with seed, but also not hard to generate without)
>
> org.apache.solr.cloud.DistribDocExpirationUpdateProcessorTest > testNoAuth FAILED
>     java.lang.AssertionError: Give up waiting for no results: q=should_expire_s:yup=0&_trace=init_batch_check&_stateVer_=expiring:13 expected:<0> but was:<55>
>         at __randomizedtesting.SeedInfo.seed([11EF0AA307058EA1:CD5AB0CEBDD35C89]:0)
>
>   2> 563294 ERROR (qtp1873537458-54511-null-134677) [n:127.0.0.1:34489_solr c:expiring_secure s:shard2 r:core_node7 x:expiring_secure_shard2_replica_n3 t:null-134677] o.a.s.h.RequestHandlerBase Client exception
>   2>   => org.apache.solr.common.SolrException: can not use FieldCache on a field w/o docValues unless it is indexed uninvertible=true and the type supports Uninversion: _version_
>   2> at org.apache.solr.schema.SchemaField.checkFieldCacheSource(SchemaField.java:295)
>
> On Thu, Aug 1, 2024 at 7:22 AM Christine Poerschke (BLOOMBERG/ LONDON) <
> cpoersc...@bloomberg.net> wrote:
>
> > If there is time I'd like to nominate
> > https://issues.apache.org/jira/browse/SOLR-17386 and
> > https://github.com/apache/solr/pull/2607 to be in the 9.7 release.
> >
> > -Christine
> >
> > From: dev@solr.apache.org At: 07/29/24 15:50:24 UTC+1:00 To:
> > dev@solr.apache.org
> > Subject: Re: [DISCUSS] Solr 9.7 release
> >
> > https://issues.apache.org/jira/browse/SOLR-17298 is more or less ready,
> > but
> > would like it to at least have a few days sitting in main before I backport
> > it. Without it the multi-threaded search needs to be documented as not
> > working with cpuTimeAllowed.
> >
> > On Thu, Jul 25, 2024 at 12:31 PM Anshum Gupta 
> > wrote:
> >
> > > Sure, I'll wait until the 30th. Thanks for letting me know.
> > >
> > > On Wed, Jul 24, 2024 at 9:39 PM Ishan Chattopadhyaya <
> > > ichattopadhy...@gmail.com> wrote:
> > >
> > > > Hi Anshum,
> > > > I'm still working on unblocking SOLR-13350. Can we please push the date
> > > > back by a week, say 30 July?
> > > >
> > > > Thanks and regards,
> > > > Ishan
> > > >
> > > > On Wed, 24 Jul, 2024, 10:52 pm Anshum Gupta, 
> > > > wrote:
> > > >
> > > > > As there are still a few dependency upgrade PRs open, I'll give it a
> > > few
> > > > > days and try and help review + merge those in before starting the
> > > process
> > > > > later this week.
> > > > >
> > > > > On Wed, Jul 17, 2024 at 10:04 AM Eric Pugh <
> > > > > ep...@opensourceconnections.com>
> > > > > wrote:
> > > > >
> > > > > > I would like to see the back port of solr cli changes make it…
> > > > > > https://github.com/apache/solr/pull/2540
> > > > > >
> > > > > > Thanks to some great work from Jan I suspect it can be committed
> > this
> > > > > week.
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Wed, Jul 17, 2024 at 6:55 PM David Smiley 
> > > > wrote:
> > > > > >
> > > > > > > There is at least one blocker, a notable one:
> > > > > > > https://issues.apache.org/jira/browse/SOLR-13350 (search
> > segments
> > > in
> > > > > > > parallel)
> > > > > > >
> > > > > > > On Tue, Jul 16, 2024 at 4:12 PM Anshum Gupta 
> > > > > wrote:
> > > > > > > >
> > > > > > > > Hi everyone,
> > > > > > > >
> > > > > > > > The Change log for Solr 9.7 looks pretty good already with the
> > > > Lucene
> > > > > > > > upgrade and a bunch of other fixes, improvements, and features.
> > > > > > > >
> > > > > > > > I'd like to start the release process next *Tuesday, July 23
> > > > *unless
> > > > > > > there
> > > > > > > > are objections or reasons to wait.
> > > > > > > >
> > > > > > > > -Anshum
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Anshum Gupta
> > > > >
> > > >
> > >
> > >
> > > --
> > > Anshum Gupta
> > >
> >
> >
> > --
> > http://www.needhamsoftware.com (work)
> > https://a.co/d/b2sZLD9 (my fantasy fiction book)
> >
> >
> >
>
> --
> http://www.needhamsoftware.com (work)
> https://a.co/d/b2sZLD9 (my fantasy fiction book)




Re: Cleaning generated jmh classes

2024-08-01 Thread David Smiley
BTW I've recently started using Crave for benchmarks, which sidesteps
this issue.  In fact, I don't recall ever seeing the issue reported here,
even when I was starting off locally.
https://github.com/apache/solr/pull/2548#pullrequestreview-2213174440

On Thu, Aug 1, 2024 at 10:12 AM Houston Putman wrote:
>
> I would think we could put them in the build directory, like we do with the
> api module.
>
> On Thu, Aug 1, 2024 at 9:07 AM Mike Drob  wrote:
>
> > I think the idea is correct, but I would also check if there is something
> > we could do with the generator task to mark those directories as generated
> > sources that should be cleaned up (I could be totally wrong here)
> >
> > On Thu, Aug 1, 2024 at 8:49 AM Jason Gerlowski 
> > wrote:
> >
> > > +1 - that makes sense afaict.
> > >
> > > On Thu, Aug 1, 2024 at 8:31 AM Gus Heck  wrote:
> > > >
> > > > I keep having to waste time on errors like this:
> > > >
> > > > > Task :solr:benchmark:compileJava
> > > > error: Annotation generator had thrown the exception.
> > > > javax.annotation.processing.FilerException: Attempt to recreate a file for type org.apache.solr.bench.search.jmh_generated.QueryResponseWriters_query_jmhTest
> > > >     at jdk.compiler/com.sun.tools.javac.processing.JavacFiler.checkNameAndExistence(JavacFiler.java:732)
> > > >     at jdk.compiler/com.sun.tools.javac.processing.JavacFiler.createSourceOrClassFile(JavacFiler.java:498)
> > > >     at jdk.compiler/com.sun.tools.javac.processing.JavacFiler.createSourceFile(JavacFiler.java:435)
> > > >     at org.gradle.api.internal.tasks.compile.processing.IncrementalFiler.createSourceFile(IncrementalFiler.java:45)
> > > >     at org.openjdk.jmh.generators.annotations.APGeneratorDestinaton.newClass(APGeneratorDestinaton.java:62)
> > > >     at org.openjdk.jmh.generators.core.BenchmarkGenerator.generateClass(BenchmarkGenerator.java:448)
> > > >     at org.openjdk.jmh.generators.core.BenchmarkGenerator.generate(BenchmarkGenerator.java:86)
> > > >     ... etc
> > > >
> > > > Does anyone object to adding this to :solr:benchmark so that clean
> > > > actually cleans everything?
> > > >
> > > > clean {
> > > >   delete "${layout.projectDirectory}/src/java/generated"
> > > > }
> > > >
> > > >
> > > > --
> > > > http://www.needhamsoftware.com (work)
> > > > https://a.co/d/b2sZLD9 (my fantasy fiction book)
> > >
> > >
> > >
> >




Re: Exception handling in background tasks

2024-07-25 Thread David Smiley
(face-palm) -- you were very clear but I misread it; sorry Andrey!

I agree with what you propose / imply: code should *not* call submit()
if it ignores the returned Future; it misses the point of the submit
method!  execute() should be used in that case.  In fact I recently
proposed this very improvement with the same rationale on a PR focused
on one spot.  You propose to do so widely -- okay.  If some spots give
trouble (as you found) then at least change the others that don't and
leave a TODO comment on the others; maybe linking to a JIRA.
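
To make the submit()/execute() difference concrete, here is a minimal sketch using a plain JDK executor rather than Solr's ExecutorUtil. The class and method names are hypothetical. It runs one failing task and counts how many exceptions reach the worker thread's uncaught-exception handler: with submit() the exception is captured in the discarded Future and nobody sees it, while with execute() it propagates out of the worker thread.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Illustrative only; uses a plain JDK executor, not Solr's ExecutorUtil. */
final class SubmitVsExecuteSketch {

  /** Runs one failing task; returns how many exceptions reached the uncaught-exception handler. */
  static int uncaughtCount(boolean useSubmit) {
    AtomicInteger uncaught = new AtomicInteger();
    List<Thread> workers = new CopyOnWriteArrayList<>();
    ThreadFactory factory = runnable -> {
      Thread t = new Thread(runnable, "sketch-worker");
      t.setUncaughtExceptionHandler((thread, error) -> uncaught.incrementAndGet());
      workers.add(t);
      return t;
    };
    ExecutorService pool = Executors.newSingleThreadExecutor(factory);
    Runnable failing = () -> {
      throw new IllegalStateException("background task failed");
    };
    if (useSubmit) {
      pool.submit(failing); // Future discarded: the exception is stored in it and never read
    } else {
      pool.execute(failing); // no Future: the exception escapes the worker thread
    }
    pool.shutdown();
    try {
      pool.awaitTermination(10, TimeUnit.SECONDS);
      // Joining guarantees that any uncaught-exception handler has finished running.
      for (Thread worker : workers) {
        worker.join();
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return uncaught.get();
  }

  public static void main(String[] args) {
    System.out.println("submit():  uncaught = " + uncaughtCount(true));
    System.out.println("execute(): uncaught = " + uncaughtCount(false));
  }
}
```

The handler fires zero times for submit() and once for execute(), which is why swapping one for the other can suddenly surface failures that were previously silent.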

On Thu, Jul 25, 2024 at 11:07 AM Andrey Bozhko wrote:
>
> Hi David,
>
> In the example of SolrZkClient.ProcessWatchWithExecutor#process, it should
> be possible to use ExecutorService#submit and ExecutorService#execute
> methods interchangeably - because either method would run the task in the
> background. So I went ahead and replaced `submit` with `execute`, and ran
> the tests. The outcome was that 200+ tests broke, and reported stacktraces
> like below:
>
>  > com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=68, name=zkCallback-23-thread-1, state=RUNNABLE, group=TGRP-TestConfigSetsAPI]
>  >     at __randomizedtesting.SeedInfo.seed([5ECEE82AA0B50470:5D74B82B43D74CFA]:0)
>  >
>  > Caused by:
>  > org.apache.solr.common.SolrException: Error updating shard term for collection: newcollection
>  >     at __randomizedtesting.SeedInfo.seed([5ECEE82AA0B50470]:0)
>  >     at app//org.apache.solr.cloud.ZkShardTerms.refreshTerms(ZkShardTerms.java:377)
>  >     at app//org.apache.solr.cloud.ZkShardTerms.lambda$registerWatcher$8(ZkShardTerms.java:426)
>  >     at app//org.apache.solr.common.cloud.SolrZkClient$ProcessWatchWithExecutor.lambda$process$1(SolrZkClient.java:1083)
>  >     at app//org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:363)
>  >     at java.base@17.0.11/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
>  >     at java.base@17.0.11/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>  >     at java.base@17.0.11/java.lang.Thread.run(Thread.java:840)
>  >
>  > Caused by:
>  > org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /collections/newcollection/terms/shard1
>  >     at app//org.apache.zookeeper.KeeperException.create(KeeperException.java:117)
>  >     at app//org.apache.zookeeper.KeeperException.create(KeeperException.java:53)
>  >     at app//org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1972)
>  >     at app//org.apache.solr.common.cloud.SolrZkClient.lambda$getData$6(SolrZkClient.java:448)
>  >     at app//org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:70)
>  >     at app//org.apache.solr.common.cloud.SolrZkClient.getData(SolrZkClient.java:448)
>  >     at app//org.apache.solr.cloud.ZkShardTerms.refreshTerms(ZkShardTerms.java:373)
>  >     ... 6 more
> 2> NOTE: reproduce with: gradlew test --tests TestConfigSetsAPI.testUpload -Dtests.seed=5ECEE82AA0B50470 -Dtests.locale=mer-Latn-KE -Dtests.timezone=Asia/Magadan -Dtests.asserts=true -Dtests.file.encoding=UTF-8
>
> While the error indicates that there is some minor bug/race condition/other
> issue with either ZkShardTerms#refreshTerms or the test itself, my takeaway
> here is that this issue (and potentially other kinds of issues) was
> unintentionally concealed because the result of ExecutorService#submit was
> discarded. There is nothing wrong with the implementation of
> ExecutorService#submit, but you could say that the method implies
> "must-use-return-value", and the code didn't do that.
>
> For the schema reload scenario, please refer to the existing test
> "org.apache.solr.pkg.TestPackages.testSchemaPlugins". In that scenario,
> uploading the new package version to the package store triggers reloading
> of the schema:
> -
> https://github.com/apache/solr/blob/661b1dac2284ab556573605ae81a4951a5703c49/solr/core/src/java/org/apache/solr/core/SolrResourceLoader.java#L975-L983
> -
> https://github.com/apache/solr/blob/661b1dac2284ab556573605ae81a4951a5703c49/solr/core/src/java/org/apache/solr/pkg/PackageListeners.java#L82
>
> This test also has a similarly concealed issue, which could be revealed by
> replacing `submit` method with `execute` in CoreContainer#runAsync.
>
> As for the next steps, I agree with Jason that we could start by auditing
> the usage of ExecutorService#submit in the codebase, to see if there are
> any more concealed issues and address them.
>
> I don't think that there's a need to drastically change any exception
> handling/logging in Solr. It may suffice to just discourage/disallow the
> pattern where the code discards the result of ExecutorService#submit - but
> we can 

Re: A thought for our release process...

2024-07-24 Thread David Smiley
I like the idea, yet I also wonder if most folks simply won't care to
provide any input anyway, and then it becomes just yet another release
task.  Also, I wouldn't want to recommend any process that would need
to be rethought if we streamline CHANGES.txt management (e.g. we spoke
of using separate files and a script that generates a combined one).
But we could piggy-back off of such a script: it would end up
generating the CHANGES.txt section for the release, which could then
easily be posted as a PR for follow-up editing.

On Wed, Jul 24, 2024 at 12:44 PM Eric Pugh wrote:
>
> I was thinking that as part of our release process we should have a “review 
> CHANGES.txt” step?   I don’t know if that is already in there, but it might 
> be a good step for us as a community to have a single point in time to review 
> that to make sure it’s clear and accurate….
>
> We have lots of folks contributing to it, each with their own style and their 
> own opinion of what change is what type of change, and so maybe having a step 
> in the release process to say “Hey, now is the time to review these to make 
> it look the best it can” might help?
>
> Eric
>
> ___
> Eric Pugh | Founder | OpenSource Connections, LLC | 434.466.1467 | 
> http://www.opensourceconnections.com
> Co-Author: Apache Solr Enterprise Search Server, 3rd Ed
>
> This e-mail and all contents, including attachments, is considered to be 
> Company Confidential unless explicitly stated otherwise, regardless of 
> whether attachments are marked as such.
>

-
To unsubscribe, e-mail: dev-unsubscr...@solr.apache.org
For additional commands, e-mail: dev-h...@solr.apache.org



Re: Exception handling in background tasks

2024-07-24 Thread David Smiley
I think it's a case-by-case matter.  I don't think there's something
wrong with the Executor.submit method generally.

Looking at callers of CoreContainer.runAsync, I didn't find the core
reload use-case you speak of.  I did find DistribFileStore.delete and
looked closer.  I do see there's an issue there since it didn't
propagate exceptions as it ought to.  To fix that, I suggest the
delete code use a FutureTask and then loop over the results to check
for exceptions.

For SolrZkClient.ProcessWatchWithExecutor#process, I don't know what
you propose should be different; it's up to the Watcher to have its
error handling.  It's fundamentally an async concept.
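
A minimal sketch of the "keep the futures and loop over the results" idea, assuming nothing about the real DistribFileStore code. The class name, method name, and the failure message are hypothetical; only java.util.concurrent is used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SubmitFailureDemo {

  /** Submits every task, then returns the causes of the tasks that failed. */
  public static List<Throwable> runAndCollectFailures(List<Callable<Void>> tasks)
      throws InterruptedException {
    ExecutorService executor = Executors.newFixedThreadPool(2);
    try {
      // Keep every Future instead of discarding the result of submit().
      List<Future<Void>> futures = new ArrayList<>();
      for (Callable<Void> task : tasks) {
        futures.add(executor.submit(task));
      }
      // Loop over the results; get() rethrows a task failure wrapped in ExecutionException.
      List<Throwable> failures = new ArrayList<>();
      for (Future<Void> future : futures) {
        try {
          future.get();
        } catch (ExecutionException e) {
          failures.add(e.getCause());
        }
      }
      return failures;
    } finally {
      executor.shutdown();
    }
  }

  public static void main(String[] args) throws Exception {
    List<Callable<Void>> tasks = new ArrayList<>();
    tasks.add(() -> null); // succeeds
    tasks.add(() -> { throw new IllegalStateException("delete failed (hypothetical)"); });
    for (Throwable failure : runAndCollectFailures(tasks)) {
      System.out.println("task failed: " + failure);
    }
  }
}
```

The caller can then decide per use case whether to log, aggregate, or rethrow the collected failures, which fits the case-by-case view above.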

On Tue, Jul 23, 2024 at 12:55 PM Andrey Bozhko  wrote:
>
> Hi all,
>
> I'd like to bring up for discussion how Solr handles failures of various
> background tasks.
>
>
> Typically with an ExecutorService, the task can be offloaded to a
> background thread via `execute(...)` or `submit(...)` methods:
> - if using `execute(Runnable)` method, any exception thrown by the task
> (assuming that the task doesn't have a try-catch) is intercepted and
> handled by the thread's UncaughtExceptionHandler - e.g., printed to console;
> - if using `submit(Callable)` method, the caller *must* hold on to the
> future instance returned from the method, as the task result (or the task
> failure) can only be retrieved by invoking `get()` on the future instance.
>
>
> That said, there are quite a few places in the Solr codebase where the
> background task is created by invoking the `submit(...)` method but which do
> not retain any reference to the returned future.
> So if the background task fails for any reason, the failure will go
> completely unnoticed.
>
> Some of these places are:
> - CoreContainer#runAsync (
> https://github.com/apache/solr/blob/06950c656f21577db624102b913fb659ef1f0306/solr/core/src/java/org/apache/solr/core/CoreContainer.java#L2588
> )
> - SolrZkClient.ProcessWatchWithExecutor#process (
> https://github.com/apache/solr/blob/06950c656f21577db624102b913fb659ef1f0306/solr/solrj-zookeeper/src/java/org/apache/solr/common/cloud/SolrZkClient.java#L1077-L1085
> )
>
> For example, CoreContainer#runAsync may be used to asynchronously reload
> the collection schema in certain cases - so if the reloading fails, I
> imagine the users would want to be aware of the failure and not let it go
> unnoticed.
>
>
> Does any of the above describe a real issue? Well, so far I tried searching
> the codebase for usage of ExecutorService `submit(...)` methods and
> replacing them with `execute(...)` where it makes sense - and then running
> the tests. Doing so broke 200+ tests due to uncaught exceptions in
> background threads. But I did not go through those uncaught exceptions to
> see which ones indicate a real issue and which ones are harmless.
>
> Thoughts?
>
>
> Best,
> Andrey Bozhko
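
The execute/submit difference described in the quoted message can be observed with a small self-contained sketch. This is an illustration only, not Solr code; the class name, method name, and failure message are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public class ExecuteVsSubmit {

  /** Reports where a task's exception ends up: "uncaught-handler", "future", or "nowhere". */
  public static String whereDidItGo(boolean useSubmit) throws InterruptedException {
    AtomicReference<Throwable> uncaught = new AtomicReference<>();
    CountDownLatch handlerCalled = new CountDownLatch(1);
    // Every worker thread records exceptions that escape its run() method.
    ThreadFactory factory = runnable -> {
      Thread thread = new Thread(runnable);
      thread.setUncaughtExceptionHandler((t, e) -> {
        uncaught.set(e);
        handlerCalled.countDown();
      });
      return thread;
    };
    ExecutorService executor = Executors.newSingleThreadExecutor(factory);
    Runnable failing = () -> { throw new IllegalStateException("schema reload failed (hypothetical)"); };
    Throwable fromFuture = null;
    try {
      if (useSubmit) {
        // submit() wraps the task; the exception is stored in the Future, not thrown on the worker.
        Future<?> future = executor.submit(failing);
        try {
          future.get();
        } catch (ExecutionException e) {
          fromFuture = e.getCause();
        }
      } else {
        // execute() lets the exception escape to the thread's UncaughtExceptionHandler.
        executor.execute(failing);
        handlerCalled.await(5, TimeUnit.SECONDS);
      }
    } finally {
      executor.shutdown();
      executor.awaitTermination(5, TimeUnit.SECONDS);
    }
    if (uncaught.get() != null) {
      return "uncaught-handler";
    }
    return fromFuture != null ? "future" : "nowhere";
  }

  public static void main(String[] args) throws Exception {
    System.out.println("execute -> " + whereDidItGo(false));
    System.out.println("submit  -> " + whereDidItGo(true));
  }
}
```

If the Future in the submit branch were discarded instead of queried, the failure would be visible to neither the handler nor the caller, which is the concealed-failure pattern under discussion.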




Re: [DISCUSS] Community Virtual Meetup, July 2024

2024-07-18 Thread David Smiley
Thanks; yes that works for me.

On Thu, Jul 18, 2024 at 9:05 AM Jason Gerlowski  wrote:
>
> Alright,
>
> I'm happy to organize this month.  I'll create the meeting notes pages
> and links shortly.
>
> As for a time/date, how would 9am PT (noon ET) on Thursday the 25th
> work for everyone?
>
> Best,
>
> Jason
>
> On Mon, Jul 15, 2024 at 8:31 AM Jason Gerlowski  wrote:
> >
> > Hey all,
> >
> > It's time once again to start thinking ahead to this month's virtual meetup!
> >
> > As always, two questions:
> >
> > 1. Does anyone have an interest in organizing?  Duties are light but
> > it's an important job.  I'm happy to organize by default if there
> > aren't any volunteers in the next day or two.  (Addtl details:
> > https://cwiki.apache.org/confluence/display/SOLR/Meeting+notes)
> >
> > 2. Does anyone have preferences on the date or time-of-day?  Maybe we
> > could shoot for a time-slot sometime in the middle of next week?
> >
> > Best,
> >
> > Jason
>




Re: ZkStateReader.getUpdateLock / ClusterState immutability

2024-07-17 Thread David Smiley
On Wed, Jul 17, 2024 at 2:31 AM Mark Miller  wrote:
> And if you really
> wanted to keep an immutable cluster state object, that still doesn’t mean
> ZkStateReader has to use one for its cluster state structure just because
> that’s what it gives out with getClusterState.

Right -- absolutely.  I _think_ most consumers can deal with
mutability fine, and that relatively few consumers require a
snapshot.  They should explicitly ask for that so that the
snapshotting cost isn't needlessly paid.

> The separation of replica states from cluster structure is useful in
> addressing an efficient cluster state structure and update strategy in
> ZkStateReader, if not just so that you know what you actually *need* to
> update. If you get a collection worth of JSON, you have to update it all or
> do some silly gymnastics to reverse engineer what the update actually is.

> For me, a concurrent hashmap in ZkStateReader was better than a cluster
> state object. It mapped a collection to some kind of collection state, and
> in the replica state, I put an atomic integer indicating the state. Then,
> you can throw out the global lock and forget about any lock or object
> creation when updating replica states.

You speak in the past tense as if it worked this way.  Maybe you mean
"would have been" and hopefully you still think so?

Practically speaking, I think "ClusterState" should continue to exist
but simply be mutable (containing a ConcurrentHashMap), effectively
shifting some of ZkStateReader's extensive responsibilities to this
collaborator.  No sense in updating lots of callers needlessly.
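
A rough sketch of the shape being discussed, assuming a much simpler model than the real ClusterState: a ConcurrentHashMap keyed by collection, an AtomicInteger per replica state, and an explicit snapshot call for the few consumers that need one. All class, method, and constant names here are hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class MutableClusterStateSketch {
  public static final int DOWN = 0;
  public static final int ACTIVE = 1;
  public static final int UNKNOWN = -1;

  // collection name -> (replica name -> state code)
  private final ConcurrentHashMap<String, ConcurrentHashMap<String, AtomicInteger>> collections =
      new ConcurrentHashMap<>();

  /** Updates one replica; no global lock, and no allocation once the replica entry exists. */
  public void setReplicaState(String collection, String replica, int state) {
    collections
        .computeIfAbsent(collection, c -> new ConcurrentHashMap<>())
        .computeIfAbsent(replica, r -> new AtomicInteger(DOWN))
        .set(state);
  }

  /** Reads the live state, or UNKNOWN if the collection or replica is not tracked. */
  public int getReplicaState(String collection, String replica) {
    ConcurrentHashMap<String, AtomicInteger> replicas = collections.get(collection);
    if (replicas == null) {
      return UNKNOWN;
    }
    AtomicInteger state = replicas.get(replica);
    return state == null ? UNKNOWN : state.get();
  }

  /**
   * Copies one collection's replica states into an unmodifiable map. The copy is only paid
   * for by callers that ask for it. It is weakly consistent: replicas updated while the
   * copy is in progress may or may not be reflected.
   */
  public Map<String, Integer> snapshot(String collection) {
    Map<String, Integer> copy = new HashMap<>();
    ConcurrentHashMap<String, AtomicInteger> replicas = collections.get(collection);
    if (replicas != null) {
      replicas.forEach((name, state) -> copy.put(name, state.get()));
    }
    return Collections.unmodifiableMap(copy);
  }
}
```

Existing callers could keep using the same object for live reads, while code that needs a stable view calls the snapshot method explicitly.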

> The entire communication path can be made easily 100x+ more scalable with
> changes that are just as simple and straightforward. “Oh, this is crazy.
> And it’s easy to do something that’s not.” And without any brain sweat, you
> end up with a system that works in parallel on independent work, transmits
> state sizes close to the actual state that needs transmitting, doesn’t spam
> updates that are either unnecessary or already outdated, and operates at a
> designed developer drum beat rather than an arbitrary army of drummers.

Then I'm looking forward to seeing you take a stab at this ;-) even if
it's just a draft PR / straw-man.  You can count on me for giving a
code review.

~ David




JIRA "pull-request-available" label

2024-07-17 Thread David Smiley
Our JIRA issues, according to .asf.yaml configuration, *should have*
been getting a label "pull-request-available".  This wasn't working
because an ASF bot needed committer permissions in JIRA, in accordance
with ASF docs on this.  I addressed that matter last night.  A nice
side-effect of the bot doing this is that it notifies anyone watching
the JIRA issue so that we're aware of the PR.  The PR auto-link
doesn't do that.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley




Re: SIP-7 New Admin UI

2024-07-17 Thread David Smiley
to add 3rd party modules, upgrade libs
> >> etc.
> >> I dislike the fact that the UI is hosted by the main Solr process and
> >> talks directly to Solr backend APIs. I'd like for the UI to be served by a
> >> separate servlet/backend that acts as a proxy, so that the Admin UI could
> >> be installed separately in a DMZ network and poke a hole in firewalls
> >> between the AdminUI's own backend and the Solr cluster (which would be on a
> >> secure inner network).
> >>
> >> If we managed to separate the new UI as an independent servlet, perhaps
> >> with its own /login logic, it would be so much easier to later move the
> >> entire UI to a separate repo, should the need arise.
> >>
> >>> - Would you be interested in contributing to the UI implementation?
> >>
> >> I could probably lend a helping hand here and there, do some reviews, and
> >> if we manage to partition the elephant, pick a few tasks further down the
> >> road.
> >>
> >> I do Kotlin in day job, and it is an absolute joy to work with. Not hard
> >> at all, so to committers fearing a "new" language, it is actually not that
> >> different, just skip the "new" keyword and semicolons, hehe.
> >>
> >> Jan Høydahl
> >>
> >>> On 15 July 2024 at 15:49, Christos Malliaridis <
> >> c.malliari...@gmail.com> wrote:
> >>>
> >>> Thanks for the references David, those are very insightful to me. I am
> >>> definitely not the first one coming up with these ideas, that's for sure.
> >>>
> >>> I think the fact that there are multiple third-party frontends for Solr
> >>> shows how important the UI is to the users and it should push us even
> >> more
> >>> to do something about the current state.
> >>>
> >>> *If there is no objection about the proposed approach I would like to
> >>> proceed and discuss the technology stack that could be used and fulfill
> >> our
> >>> current requirements.*
> >>>
> >>> As I already mentioned before, I've been working on a proof-of-concept
> >> with
> >>> Compose Multiplatform (Kotlin) that demonstrates what an integration
> >> would
> >>> look like.
> >>> Since there are many pros and cons for all the available UI frameworks
> >> out
> >>> there, I broke down my point of view and reasons for Compose in a writeup
> >>> <
> >> https://docs.google.com/document/d/17B6TuUbbpvg823ixrsnVPT6hJ4vuVv9UHzIz4jITvHI/edit?usp=sharing
> >>>
> >>> again.
> >>>
> >>> But because this is a very opinionated topic, *your input is needed*. To
> >> be
> >>> more precise, here are a few questions:
> >>> - What technology stack would you consider and why?
> >>> - What was your experience so far with Solr's UI code? What would you
> >> avoid
> >>> doing again, what did you like before?
> >>> - Would you be interested in contributing to the UI implementation?
> >>> - Would you consider a web-based / javascript-based framework easier to
> >> get
> >>> started with, or a JVM-based / kotlin-based UI framework?
> >>>
> >>> Best,
> >>> Christos
> >>>
> >>> On Fri, Jul 12, 2024 at 11:39 PM David Smiley 
> >> wrote:
> >>>
> >>>> An admin UI can definitely be plugged in.  Here is one:
> >>>> https://github.com/yasa-org/yasa
> >>>> And you would not be the first to consider a desktop client.  There is
> >>>> one of those too: https://solr.search-navigator.org/
> >>>>
> >>>> On Tue, Jul 9, 2024 at 9:37 PM Christos Malliaridis
> >>>>  wrote:
> >>>>>
> >>>>> Thanks for your input, votes and feedback so far, I appreciate it.
> >>>>>
> >>>>> The security concerns are justified and are something I am currently
> >>>>> looking into. With a rewrite it will be easier to take that into
> >> account
> >>>>> and consider alternative options that could also enhance security.
> >>>> For
> >>>>> example, I am experimenting with a JVM-based and standalone desktop
> >>>> client
> >>>>> (that is probably a safer option and provides extended authentication
> >>>>> support) that can also be run alongside the current 

Re: [DISCUSS] Solr 9.7 release

2024-07-17 Thread David Smiley
There is at least one blocker, a notable one:
https://issues.apache.org/jira/browse/SOLR-13350 (search segments in
parallel)

On Tue, Jul 16, 2024 at 4:12 PM Anshum Gupta  wrote:
>
> Hi everyone,
>
> The Change log for Solr 9.7 looks pretty good already with the Lucene
> upgrade and a bunch of other fixes, improvements, and features.
>
> I'd like to start the release process next *Tuesday, July 23* unless there
> are objections or reasons to wait.
>
> -Anshum




ZkStateReader.getUpdateLock / ClusterState immutability

2024-07-16 Thread David Smiley
At work, in a scenario when a node starts with thousands of cores for
thousands of collections, we've seen that core registration can
bottleneck on ZkStateReader.forceUpdateCollection(collection) which
synchronizes on getUpdateLock, a global lock (not per-collection).  I
don't know the history or strategy behind that lock, but it's a
code-smell to see a global lock that is used in a circumstance that is
scoped to one collection.  I suspect it's there because ClusterState
is immutable and encompasses basically all state.  If it was instead a
cache that can be snapshotted (for consumers that require an immutable
state to act on), we could probably make getUpdateLock go away.  *If*
a collection's state needs to be locked (and I'm suspicious that it
is, so long as cache insertion is done properly / exclusively), we
could have a lock just for the collection.
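
To make the per-collection idea concrete, here is a minimal sketch of locks scoped to a collection name rather than one global lock. It is not ZkStateReader code; the class and method names are hypothetical, and lock cleanup for deleted collections is left out.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

public class PerCollectionLocks {
  // One lock per collection name, created lazily on first use.
  private final ConcurrentHashMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();

  /** Runs the action while holding only the named collection's lock. */
  public <T> T withCollectionLock(String collection, Supplier<T> action) {
    ReentrantLock lock = locks.computeIfAbsent(collection, c -> new ReentrantLock());
    lock.lock();
    try {
      return action.get();
    } finally {
      lock.unlock();
    }
  }

  /** True if some thread currently holds the named collection's lock. */
  public boolean isLocked(String collection) {
    ReentrantLock lock = locks.get(collection);
    return lock != null && lock.isLocked();
  }
}
```

With this shape, a forced update of one collection would not block core registration for the thousands of other collections on the node.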

Any concerns with this idea?

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley




Re: getRealtimeSearcher isn't needed?

2024-07-16 Thread David Smiley
I did run nightlies as well.  There were some failures but the console
never printed my "nocommit".  This is just a quick experiment I'm
doing to help me understand some things.

I'm inclined to add comments to say that calling getRealtimeSearcher
will "nearly always" return an existing one without having to open
one.  This is useful for anyone reading this code.  Ideally the
comment should describe why there might not be one.

If it's so extremely rare... I'd argue the complexity of making the
distinction is not worth it.  A straw-man proposal, anyway :-)

On Tue, Jul 16, 2024 at 9:27 AM Mark Miller  wrote:
>
> I'm not currently looking at any code, but if the idea is that you put in
> that assert and ran the nightly or non-nightly tests, I wouldn't come to
> the conclusion that that code path is never hit unless you've walked
> through all of the possible concurrency potential around it in the code.
>
> - MRM
>
>
> On Tue, Jul 16, 2024 at 8:04 AM David Smiley  wrote:
>
> > SolrCore.getRealtimeSearcher returns the existing searcher if there is
> > one, or otherwise creates one.  I did a little experiment to "assert
> > false" where it creates one.  It never tripped!  On any commit, that
> > searcher is installed as the realtime searcher, and furthermore a
> > searcher is created on core open as well (of course).  I wonder what
> > led to the distinction in the first place.  Has it always been this
> > way or was there a time long ago when there was a real chance of not
> > having a "real" searcher?  Hmm.
> >
> > ~ David Smiley
> > Apache Lucene/Solr Search Developer
> > http://www.linkedin.com/in/davidwsmiley
> >
> >




getRealtimeSearcher isn't needed?

2024-07-16 Thread David Smiley
SolrCore.getRealtimeSearcher returns the existing searcher if there is
one, or otherwise creates one.  I did a little experiment to "assert
false" where it creates one.  It never tripped!  On any commit, that
searcher is installed as the realtime searcher, and furthermore a
searcher is created on core open as well (of course).  I wonder what
led to the distinction in the first place.  Has it always been this
way or was there a time long ago when there was a real chance of not
having a "real" searcher?  Hmm.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley




Re: IntelliJ not working with latest main

2024-07-15 Thread David Smiley
FWIW I use IntelliJ and merely open the gradle project naturally.  I
have never run "gradlew idea" nor know what the point of that is.

On Mon, Jul 15, 2024 at 12:01 PM Ishan Chattopadhyaya
 wrote:
>
> Thanks Jason and Christos,
> Apologies, I meant "./gradlew idea" instead of "./gradlew main". It
> generates an IntelliJ project that didn't load for me.
> I'll try to reproduce this on another machine and record a screencast for
> it, to help further debugging.
> Thanks and regards,
> Ishan
>
> On Mon, 15 Jul 2024 at 17:59, Jason Gerlowski  wrote:
>
> > > Cannot resolve symbol CollectionsApi, ListCollectionsResponse.
> >
> > +1 to what Christos said.
> >
> > In short - 'CollectionsApi' and 'ListCollectionsResponse' are both
> > Java classes that we generate from our OAS.  That generation is set up
> > in gradle to always happen prior to compiling "solr-solrj" (which of
> > course happens before compiling "solr-core").  If you're not seeing
> > that happen and have a short script to reproduce, I'm happy to take a
> > look!
> >
> > > SolrCore 'collection1' is not available due to init failure:
> > RAMDirectory can only be used with the 'single' lock factory type.
> >
> > Not sure if there's a fix yet, but David discussed this a bit recently
> > in the thread "Use MockDirectoryFactory not RAMDirectoryFactory in
> > test configs".
> >
> > As discussed there at least, it sounded like this only happens when
> > using IntelliJ's "native" test-runner to run tests, instead of having
> > IntelliJ utilize our gradle build.  So a workaround might be to have
> > IntelliJ run the tests using gradle for the time being.
> >
> > (And, now that I think about it, if IntelliJ isn't using the gradle
> > build for some reason, it makes sense that the code-generation
> > wouldn't be happening either.  So maybe switching IntelliJ to using
> > gradle for the test-runner might solve both of your issues?)
> >
> > Jason
> >
> > On Sun, Jul 14, 2024 at 7:40 AM Christos Malliaridis
> >  wrote:
> > >
> > > Hey Ishan, I'm sorry to hear that. The test uses a generated class to
> > > verify the bug fix, since the issue occurs in a template for class
> > > generation.
> > >
> > > The error you sent points to one of generated classes that I used
> > > (CollectionsApi) and that is also in SolrJ module. Since it is a
> > generated
> > > class, is maybe something broken with your build cache? If I am not
> > > mistaken, the classes should always be generated in advance for SolrJ.
> > >
> > > I also tried to reproduce the error and it said for "./gradlew main" that
> > > main does not exist. Is it maybe "./gradlew dev" what you are trying to
> > > execute?
> > >
> > >
> > > On Sun, 14 Jul 2024, 11:54 Ishan Chattopadhyaya, <
> > ichattopadhy...@gmail.com>
> > > wrote:
> > >
> > > > I removed the ApiMustacheTemplateTests.java temporarily, and all errors
> > > > went away.
> > > > However, tried running tests, and ran into this:
> > > >
> > > > BasicFunctionalityTest.java:
> > > > java.lang.RuntimeException:
> > > > org.apache.solr.core.SolrCoreInitializationException: SolrCore
> > > > 'collection1' is not available due to init failure: RAMDirectory can
> > only
> > > > be used with the 'single' lock factory type.
> > > >
> > > > Any ideas, please?
> > > >
> > > >
> > > > On Sun, 14 Jul 2024 at 14:17, Ishan Chattopadhyaya <
> > > > ichattopadhy...@gmail.com> wrote:
> > > >
> > > > > Hi all,
> > > > > I pulled latest commits, but ./gradlew main is resulting in a project
> > > > that
> > > > > doesn't load without errors:
> > > > > ApiMustacheTemplateTests.java: Cannot resolve symbol CollectionsApi,
> > > > > ListCollectionsResponse.
> > > > >
> > > > > Any ideas, please?
> > > > >
> > > > > If I rollback to the commit before the following one, it works fine:
> > > > >
> > > > > commit 461955f00118c69d06f50e72addeff12c8dd8169
> > > > > Author: Christos Malliaridis 
> > > > > Date:   Tue Jun 11 18:15:01 2024 +0200
> > > > >
> > > > > SOLR-17326: Fix references in generated SolrRequest impls (#2510)
> > > > >
> > > > > A handful of the v2 SolrRequest implementations generated
> > > > > by our OAS spec relied on response model classes whose names
> > > > > conflicted with other (unrelated) classes in solrj.  This caused
> > > > > errors at request time as JacksonParsingResponse would try to
> > > > > deserialize the JSON, XML, etc. response body into these
> > > > > unintended classes.
> > > > >
> > > > > This commit fixes this by modifying the 'api.mustache' template
> > > > > so that generated SolrRequest classes now reference their
> > > > > response model using the fully-qualified classname (i.e.
> > including
> > > > > the package).  This resolves the ambiguity.
> > > > >
> > > > > -
> > > > >
> > > > > Co-authored-by: Jason Gerlowski 
> > > > >
> > > > >
> > > > >
> > > >
> >

Re: SIP-7 New Admin UI

2024-07-12 Thread David Smiley
An admin UI can definitely be plugged in.  Here is one:
https://github.com/yasa-org/yasa
And you would not be the first to consider a desktop client.  There is
one of those too: https://solr.search-navigator.org/

On Tue, Jul 9, 2024 at 9:37 PM Christos Malliaridis
 wrote:
>
> Thanks for your input, votes and feedback so far, I appreciate it.
>
> The security concerns are justified and are something I am currently
> looking into. With a rewrite it will be easier to take that into account
> and consider alternative options that could also enhance security. For
> example, I am experimenting with a JVM-based and standalone desktop client
> (that is probably a safer option and provides extended authentication
> support) that can also be run alongside the current Admin UI as a
> WebAssembly app if needed (see changes in
> https://github.com/malliaridis/solr/tree/composeui). Another option I was
> considering was to write and provide the UI as a Solr plugin, but I am not
> sure if this could work with the current way plugins are loaded.
>
> So in my opinion and alongside the current concerns like maintenance of UI
> code, this might be solvable with the right technology selection and API
> implementation (which would be follow-up topics).
>
> On Tue, Jul 9, 2024 at 10:57 PM Gus Heck  wrote:
>
> > Disabling certainly is helpful, but... there's the risk it gets enabled,
> > and it will still contribute to the footprint that vulnerability scanners
> > have to cover.
> >
> > If it's something that can be enabled/disabled or removed from the full
> > distro, and added to the slim distro if desired, that would be even better.
> > The easier all of those things are, the better of course.
> >
> > Food for thought: https://github.com/jetty/jetty.project/issues/5007
> >
> > If the UI is a self contained web-app containing only JS/HTML that can be
> > undeployed that's pretty much a standards based solution to the problem.
> > This sort of wheel was invented long long ago, and we have the basic tools
> > at our disposal already (jetty)... There is no need for the UI to have any
> > java code at all I suspect...
> >
> > -Gus
> >
> > On Tue, Jul 9, 2024 at 3:20 PM David Smiley  wrote:
> >
> > > RE security; disabling it would suffice and if I recall is already
> > > supported.
> > >
> > > On Tue, Jul 9, 2024 at 3:09 PM Gus Heck  wrote:
> > > >
> > > > Also +1 ... "in the same repo and alongside" is how the last migration
> > > was
> > > > done IIRC. The big plus of this is as it's developed to a point of
> > > partial
> > > > utility you can put a link in the old UI to try out the new UI and get
> > > > feedback and make testing much easier.
> > > >
> > > > One thing that might be nice if we can do it, is to make the UI more
> > > > pluggable, and allow those who have no desire to test it to start solr
> > > with
> > > > it fully uninstalled. (i.e because they don't want to account for its
> > > > security in production)
> > > >
> > > > Also it would be very good if we carefully understood how we want to
> > > > achieve security (including information exposure, and role based
> > > > access/display) before we put it in a release.
> > > >
> > > > On Tue, Jul 9, 2024 at 10:40 AM Houston Putman 
> > > wrote:
> > > >
> > > > > I agree with Jason on everything.
> > > > >
> > > > > Thank you so much for putting this much work into something with so
> > > much
> > > > > baggage in the community!
> > > > >
> > > > > I'm a huge +1 here, and love the things I saw in your screenshots on
> > > Slack.
> > > > >
> > > > > - Houston
> > > > >
> > > > > On Mon, Jul 8, 2024 at 2:23 PM Jason Gerlowski <
> > gerlowsk...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hey Christos,
> > > > > >
> > > > > > Sorry for the delay responding here - lots of context to read up
> > on!
> > > > > >
> > > > > > Firstly, thanks for the huge effort you've put into writing this
> > all
> > > > > > up!  Quite the thorough job, and it's really helpful to enable us
> > > > > > non-UI folks to follow along haha.
> > > > > >
> > > > > > If I understand things correctly, there's a few distinct aspects to
> >

Re: SIP-7 New Admin UI

2024-07-09 Thread David Smiley
RE security; disabling it would suffice and if I recall is already supported.

On Tue, Jul 9, 2024 at 3:09 PM Gus Heck  wrote:
>
> Also +1 ... "in the same repo and alongside" is how the last migration was
> done IIRC. The big plus of this is as it's developed to a point of partial
> utility you can put a link in the old UI to try out the new UI and get
> feedback and make testing much easier.
>
> One thing that might be nice if we can do it, is to make the UI more
> pluggable, and allow those who have no desire to test it to start solr with
> it fully uninstalled. (i.e because they don't want to account for its
> security in production)
>
> Also it would be very good if we carefully understood how we want to
> achieve security (including information exposure, and role based
> access/display) before we put it in a release.
>
> On Tue, Jul 9, 2024 at 10:40 AM Houston Putman  wrote:
>
> > I agree with Jason on everything.
> >
> > Thank you so much for putting this much work into something with so much
> > baggage in the community!
> >
> > I'm a huge +1 here, and love the things I saw in your screenshots on Slack.
> >
> > - Houston
> >
> > On Mon, Jul 8, 2024 at 2:23 PM Jason Gerlowski 
> > wrote:
> >
> > > Hey Christos,
> > >
> > > Sorry for the delay responding here - lots of context to read up on!
> > >
> > > Firstly, thanks for the huge effort you've put into writing this all
> > > up!  Quite the thorough job, and it's really helpful to enable us
> > > non-UI folks to follow along haha.
> > >
> > > If I understand things correctly, there's a few distinct aspects to
> > > your proposal:
> > >
> > > 1. New UI would live alongside the existing one (for a time)
> > > 2. The code for the new UI would live in the main repository.
> > > 3. Development would be piece-meal (i.e. not one big code-dump)
> > >
> > > Overall, this sounds like a reasonable approach to me.
> > >
> > > I think a big concern with putting code in the main repo is that it's
> > > pretty far from the (current) PMC's/community's wheelhouse to
> > > maintain.  I definitely share that concern.  But IMO we're already
> > > sort of at a "worst case" in that regard with our existing Admin UI
> > > code.  Doing the "refresh" in the main repo gives us a forcing
> > > function (i.e. the review process itself) to ensure that at least a
> > > few community members will understand the code to at least some
> > > extent.  That'll be a huge improvement over where we are today.
> > >
> > > Anyway, I'm a cautious '+1' based on these details at least.  To quote
> > > a message from Jan in Slack: "I'd rather see some imperfect movement
> > > than a perfect plan never realized."
> > >
> > > (Here's hoping my reply will bump this to the top of folks' Inboxes,
> > > and get you some more feedback.)
> > >
> > > Best,
> > >
> > > Jason
> > >
> > > On Mon, Jul 1, 2024 at 12:25 PM Christos Malliaridis
> > >  wrote:
> > > >
> > > > Hello everyone,
> > > >
> > > > In regards to SIP-7
> > > > <
> > >
> > https://cwiki.apache.org/confluence/display/SOLR/SIP-7+Updated+Solr+Admin+UI
> > > >
> > > > and SIP-10
> > > > <
> > >
> > https://cwiki.apache.org/confluence/display/SOLR/SIP-10+Improve+Getting+Started+experience
> > > >
> > > > I would like to add my perspective and address the current concerns
> > about
> > > > implementing a new UI, so that we can take some actions and improve the
> > > > overall quality and experience of Solr Admin UI.
> > > >
> > > > There are many discussions and opinions about the UI and how to resolve
> > > the
> > > > current issues, but they all led to the topic becoming stale. In my
> > > > opinion, developing and introducing a new UI into the main repository
> > > piece
> > > > by piece without replacing the current UI until feature-complete could
> > > >
> > > > - address all the issues currently reported (and not),
> > > > - add new features,
> > > > - replace the EOL framework and
> > > > - improve the overall user experience.
> > > >
> > > > And the maintenance, which is one of the most important parts, could be
> > > > addressed with the right choice of framework.
> > > >
> > > > I created a detailed writeup
> > > > <
> > >
> > https://docs.google.com/document/d/14F1QARdkIrmKXQ4zuWUuOXduH4v_XwZ_Zrd0d2jE468/edit?usp=sharing
> > > >
> > > > for those who are interested, where I also write about the alternative
> > > > approaches proposed in the past and list the pros and cons of each
> > one
> > > > individually.
> > > >
> > > > I also started to improve this part by simply designing a new UI
> > > > <
> > >
> > https://www.figma.com/design/VdbEfcWQ8mirFNquBzbPk2/Apache-Solr-Admin-UI-v2-Concept
> > > >
> > > > and addressing multiple issues at once. I have already received some
> > > community
> > > > feedback
> > > > ,
> > > but
> > > > it is far from production-ready and needs more input. I think this
> > could
> > > be
> > > > further refined and moved to 

Re: Hi, can you please add me to subscription list ? Tx in advance

2024-07-09 Thread David Smiley
I subscribed you.  You might have received a confirmation.

On Tue, Jul 9, 2024 at 12:40 PM Kaminski, Adi
 wrote:
>
>
>
> Adi Kaminski
> Director of Software Engineering | Analytics COE
> Verint®. The CX Automation Company™
> Email: adi.kamin...@verint.com | +972545914916
> www.verint.com
>
>
> This electronic message may contain proprietary and confidential information 
> of Verint Systems Inc., its affiliates and/or subsidiaries. The information 
> is intended to be for the use of the individual(s) or entity(ies) named 
> above. If you are not the intended recipient (or authorized to receive this 
> e-mail for the intended recipient), you may not use, copy, disclose or 
> distribute to anyone this message or any information contained in this 
> message. If you have received this electronic message in error, please notify 
> us by replying to this e-mail.

-
To unsubscribe, e-mail: dev-unsubscr...@solr.apache.org
For additional commands, e-mail: dev-h...@solr.apache.org



Re: SIP-18: A Solr Kubernetes Module for native integration

2024-07-09 Thread David Smiley
ConfigSetService is pluggable, and this works!(*)  At Salesforce, we
use FileSystemConfigSetService with SolrCloud.  Well, actually a hybrid
thing that we could open-source, but we haven't actually needed it yet, as
the only ConfigSet we use is immutable in our docker image.

(*) there is a feature or two that still makes assumptions.  A mutable
schema is one.

On Tue, Jul 9, 2024 at 10:44 AM Houston Putman  wrote:
>
> I'm not sure this was held up by arguments. The first part that I tried to
> tackle was the Kubernetes Config Set management, and that turned into an
> absolute beast.
> Apparently, a ton of our configSet code relies on the fact that configSets
> live in Zookeeper (even though there is an interface...) if Solr is running
> in Cloud mode.
>
> Maybe I'll get this going again but tackle the security plugins first...
> That would be a huge win independent of the ConfigSet feature.
>
> - Houston
>
> On Tue, May 28, 2024 at 5:53 AM Eric Pugh 
> wrote:
>
> > We need a complete path for scaling from the smallest Solr set ups to the
> > largest that is well supported by the community, and this seems to be key
> > to supporting the largest deployments.   So this makes sense to me.
> >
> > Would saying that this kind of change is targeting Solr 10 take some of
> > the pressure off of us?  Our normal pattern of back porting everything to
> > the current branch means that every code change has to be in a releasable
> > state, which maybe leads to more discussion.  If this is 10x, then maybe
> > less pressure?   I guess this is really up to whoever or group of whoever
> > decide to move this forward ;-).
> >
> >
> >
> > > On May 28, 2024, at 5:53 AM, Jan Høydahl  wrote:
> > >
> > > > I think of this from time to time. To get some progress, should we
> > > > first agree in this thread that it is a decent idea, and that a new
> > > > Solr module is warranted for this?
> > > >
> > > > I'd hate to see good initiatives like this be held up by arguments
> > > > not related to the code itself but to the lifecycle or wish for
> > > > separate git repos etc.
> > > >
> > > > Once we agree to move forward, the JIRA could be split up into
> > > > manageable tasks that more community members could help with.
> > >
> > > Jan
> > >
> > > On 2023/04/05 16:45:26 Houston Putman wrote:
> > >> Hey everyone,
> > >>
> > > >> This is a new SIP, not a duplicate of SIP-17 (Autoscaling on
> > > >> Kubernetes), and completely unrelated.
> > > >>
> > > >> Basically there is a lot of very messy logic we do in the Solr Operator
> > > >> to bootstrap security and manage various things. This logic must exist
> > > >> because Solr has no idea that Kubernetes exists.
> > > >> If we can use Kubernetes APIs to pull in information, instead of
> > > >> relying on the Solr Operator to inject that information in hacky ways,
> > > >> the user experience on Kubernetes is going to get many times better for
> > > >> users wanting to secure their SolrClouds. This will also help us use
> > > >> authorization by default (which we always preach) via the Solr Operator.
> > > >>
> > > >> This SIP is not very filled out because I'm still thinking on various
> > > >> aspects. But in general, we can attack the different plugins one-by-one
> > > >> and the SIP can evolve throughout the process. This SIP is very easy to
> > > >> break up, which is nice.
> > >>
> > >> Please let me know if I can explain more, or how I can make the SIP page
> > >> better.
> > >>
> > >> - Houston
> > >>
> > >
> > > -
> > > To unsubscribe, e-mail: dev-unsubscr...@solr.apache.org
> > > For additional commands, e-mail: dev-h...@solr.apache.org
> > >
> >
> > ___
> > Eric Pugh | Founder | OpenSource Connections, LLC | 434.466.1467 |
> > http://www.opensourceconnections.com <
> > http://www.opensourceconnections.com/> | My Free/Busy <
> > http://tinyurl.com/eric-cal>
> > Co-Author: Apache Solr Enterprise Search Server, 3rd Ed <
> > https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>
> >
> > This e-mail and all contents, including attachments, is considered to be
> > Company Confidential unless explicitly stated otherwise, regardless of
> > whether attachments are marked as such.
> >
> >



Re: ASF Jenkins Build Timeouts

2024-07-08 Thread David Smiley
What is the current timeout?  You propose setting it to 3 hours?  Are
you referring to a Jenkins job specific thing or a Java based test
suite timeout?  I've looked into the latter, which is set to 2 hours
in LuceneTestCase (see @TimeoutSuite) -- pretty crazy and IMO should
not be set there at all.  I've overridden this for the Crave build
with an unsatisfying hack.

On Mon, Jul 8, 2024 at 1:58 PM Jason Gerlowski  wrote:
>
> Hey all,
>
> I spent some time last week looking into a few ASF Jenkins jobs that
> hung indefinitely until the timeout killed them nearly a day later.
> The specific cause has since been fixed, but it raised the question:
> why is the timeout for our builds so long?
>
> Allowing "stuck" builds to linger blocks other jobs from running, and
> seemingly even our longest-running jobs should complete in a few
> hours.  Would anyone object to my lowering some of these long build
> timeouts to, say, 3 hours?
>
> The following jobs would be affected:
>
> - Solr-BadApples-Tests-main
> - Solr-check-9.6
> - Solr-Check-9.x
> - Solr-Check-main
> - Solr-Check-main-s390x
> - Solr-Docker-Nightly-9.6
> - Solr-Docker-Nightly-9.x
> - Solr-Docker-Nightly-main
> - Solr-Docker-Official-Test-main
> - Solr-Docker-Test-main
> - Solr-NightlyTests-main
> - Solr-Smoketest-9.6
> - Solr-Smoketest-9.x
>
> Best,
>
> Jason
>



Use MockDirectoryFactory not RAMDirectoryFactory in test configs

2024-07-01 Thread David Smiley
Some of our tests don't run correctly/consistently via IntelliJ's
internal JUnit runner (compared to Gradle).  IntelliJ's JUnit runner
is faster and better integrated, which helps developer productivity.

I debugged the issue.  The test TestInPlaceUpdatesDistrib failed
because "RAMDirectory can only be used with the 'single' lock factory
type.".  This test's solrconfig.xml is very typical:
class="${solr.directoryFactory:solr.RAMDirectoryFactory}"
The code for this does *not* set the property.  It turns out the
randomization.gradle configures all tests to run with this property
set to "org.apache.solr.core.MockDirectoryFactory".  At least this is
the only one; the others set there have null values.
https://github.com/apache/solr/blob/bde8c14bddecc2a417d2fd36abe965675e8e670e/gradle/testing/randomization.gradle#L129

Proposal: remove this from the gradle config.  Instead, replace all
"solr.directoryFactory:solr.RAMDirectoryFactory" with
"solr.directoryFactory:solr.MockDirectoryFactory" in all test
solrconfig files.
This is ultimately a minor refactoring of sorts; an improvement to the
build.  No JIRA.
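
To illustrate the ${property:default} fallback that the solrconfig.xml
line above relies on, here is a minimal Java sketch.  It is not Solr's
actual substitution code: the class PropertyDefaultSketch, the resolve
helper and the regex are hypothetical and only handle the simple
name:default form.  It shows why a runner that leaves the property unset
lands on the RAM default, while a build that sets it gets the mock factory.

```java
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not Solr's real property substitution.
class PropertyDefaultSketch {
  // Matches ${name} or ${name:default}.
  private static final Pattern PLACEHOLDER =
      Pattern.compile("\\$\\{([^:}]+)(?::([^}]*))?\\}");

  /** Replaces each placeholder with the property value, else its default. */
  static String resolve(String text, Properties props) {
    Matcher m = PLACEHOLDER.matcher(text);
    StringBuilder out = new StringBuilder();
    while (m.find()) {
      // Simplification of this sketch: no default means empty string.
      String fallback = m.group(2) == null ? "" : m.group(2);
      String value = props.getProperty(m.group(1), fallback);
      m.appendReplacement(out, Matcher.quoteReplacement(value));
    }
    m.appendTail(out);
    return out.toString();
  }

  public static void main(String[] args) {
    String config = "class=\"${solr.directoryFactory:solr.RAMDirectoryFactory}\"";

    // Property unset (the IntelliJ runner case described above).
    System.out.println(resolve(config, new Properties()));

    // Property set the way randomization.gradle is described as doing.
    Properties gradleLike = new Properties();
    gradleLike.setProperty(
        "solr.directoryFactory", "org.apache.solr.core.MockDirectoryFactory");
    System.out.println(resolve(config, gradleLike));
  }
}
```

With the proposal applied, the default inside the placeholder would itself
name the mock factory, so both cases would resolve to a mock directory.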

Any insights / concerns?

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley



Re: Looking for final review of SOLR-16842

2024-06-28 Thread David Smiley
Thanks so much Jason for investing your time to fix the build machines!

RE timeout -- we ought to explicitly set a timeout so as to not hog
the machine indefinitely.  For Bats, I don't know where that is.  I've
made changes in the recent past for normal / JUnit tests.

On Fri, Jun 28, 2024 at 5:42 PM Jason Gerlowski  wrote:
>
> > I wanted to confirm that our Jenkins builds are still happy, and while our 
> > normal Jenkins boxes are busted
>
> The boxes are fixed!  Or, lucene-solr-1 should be at least.  I'm still
> looking into lucene-solr-2.  So any failures at this point are
> legitimate issues.
>
> Case in point, it looks like our 9x BATs tests are broken.  I'm able
> to reproduce the error in this build on two different Macs. (Looking
> into a fix, now, so hopefully I can handle this one) [1].
>
> Slight tangent - do we have any time-constraints on these builds?
> Following a bats failure the job hangs, seemingly indefinitely.
> (Current one has been running for ~17 hrs)  It's a real holdup
> considering we only have 1 functional build machine at this point.
>
> > I’m running out of time before vacation for three weeks (hello Spain!) and 
> > so thinking [list of 3 or 4 different options]
>
> Option (3) i.e. the "skip backporting until after your trip" option,
> sounds best to me.  I agree with Jan that users will benefit from 9.x
> deprecation warnings, and that it's worth getting something on
> branch_9x at some point.  But IMO there's no rush on that, as long as
> you're still up for it following your time away.  It's less stress,
> afaik there's no proposed 9.x release that you'd miss by waiting, and
> since the build-machines have only recently gotten fixed the change
> would probably benefit from a bit more time to "bake" on 'main'
> anyway.
>
> Just my two-cents.
>
> When you're back around later this summer I'm willing to pair on some
> of those backporting difficulties, fwiw.
>
> Jason
>
> [1] https://ci-builds.apache.org/job/Solr/job/Solr-Check-9.x/6066/
>
> On Fri, Jun 28, 2024 at 9:39 AM Jan Høydahl  wrote:
> >
> > From an upgrade 9->10 perspective, it would be good to start getting 
> > deprecation warnings in 9.x for old options and change scripts etc so you 
> > don't have a big-bang moment in 10.0.
> >
> > A potential option 5) is to introduce the new options and deprecate old in 
> > 10.0, meaning they won't be removed until 11.0
> >
> > Jan
> >
> > > 28. juni 2024 kl. 13:18 skrev Eric Pugh :
> > >
> > > Hi all….  Quick update that I merged SOLR-16842 into main.
> > >
> > > I wanted to confirm that our Jenkins builds are still happy, and while 
> > > our normal Jenkins boxes are busted, I see that we have 
> > > “Solr-Check-main-s390x”?   Looking at it, it appears that recent code 
> > > merges are fine, that the one test that failed on run 
> > > https://ci-builds.apache.org/job/Solr/job/Solr-Check-main-s390x/646/ is 
> > > probably just flaky.   So feeling good about that!
> > >
> > > https://ci-builds.apache.org/job/Solr/job/Solr-Check-main-s390x/
> > >
> > > I have started a back port PR to branch_9x, however it’s not going well…. 
> > > :-(.   https://github.com/apache/solr/pull/2540.  There are more 
> > > differences between main and branch_9x for the CLI than I quite realized. 
> > >   Bats tests on main that aren’t on branch_9x, all the changes in how we 
> > > craft Solr urls, the fact that we never back ported basic auth for 
> > > SolrCLI tools….  We replaced “CreateCoreTool" and “CreateCollectionTool" 
> > > tools with “CreateTool” on main...  I haven’t committed all my local 
> > > changes as the tests are failing, but may try that….
> > >
> > > I’m running out of time before vacation for three weeks (hello Spain!) 
> > > and so thinking:
> > >
> > > 1) Ask for help.  If someone else wanted to take a crack at the back port 
> > > who has more Git-fu than I do, more than welcome.
> > > 2) Push up the code to the branch even though tests etc fail.
> > > 3) Not worry about back port to branch_9x till I get back last week of 
> > > July.
> > > 4) Pivot on the back port plan and declare SOLR-16842 to be a Solr 10 
> > > only feature.  Figure out a new plan for removing deprecated options from 
> > > code.  (That is starting to feel the path of least resistance to me).
> > >
> > > I would very much like to NOT rollback the change so didn’t list that as 
> > > an option….
> > >
> > >
> > > Thanks
> > >
> > > Eric
> > >
> > >
> > > >> On Jun 22, 2024, at 10:05 AM, Eric Pugh wrote:
> > >>
> > >> Thanks Jason, I’m happy to stall till end of day on Monday to click 
> > >> “merge”!   Thanks.
> > >>
> > >>
> > >>> On Jun 21, 2024, at 12:33 PM, Jason Gerlowski  
> > >>> wrote:
> > >>>
> > >>> Hey Eric,
> > >>>
> > >>> Sorry for the delay - I am hoping to review that PR.  I'll try to have
> > >>> a review done by Monday (if not today).  That should leave a week or
> > >>> so for any followups prior to your big trip!  (Safe travels!)
> > 

Solr "benchmark", wikipedia

2024-06-28 Thread David Smiley
I was thinking of using Solr's "benchmark" module/thing to benchmark
parallel segment search (coming to Solr 9.7 but needs more love).  I
don't notice any substantial data to query for in this module,
however.  Has anyone considered adding wikipedia, like how Lucene's
"luceneutil" does?  Or something else?  Is this a bad idea for this
benchmark module or should I be looking elsewhere like Searchscale's
solr-bench[1]?

[1] https://github.com/searchscale/solr-bench

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley



Re: Looking for final review of SOLR-16842

2024-06-28 Thread David Smiley
I don’t care if this is 10x only; that’s the easiest path

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Fri, Jun 28, 2024 at 1:18 PM Eric Pugh 
wrote:

> Hi all….  Quick update that I merged SOLR-16842 into main.
>
> I wanted to confirm that our Jenkins builds are still happy, and while our
> normal Jenkins boxes are busted, I see that we have
> “Solr-Check-main-s390x”?   Looking at it, it appears that recent code
> merges are fine, that the one test that failed on run
> https://ci-builds.apache.org/job/Solr/job/Solr-Check-main-s390x/646/ is
> probably just flaky.   So feeling good about that!
>
> https://ci-builds.apache.org/job/Solr/job/Solr-Check-main-s390x/
>
>
> I have started a back port PR to branch_9x, however it’s not going well….
> :-(.   https://github.com/apache/solr/pull/2540.  There are more
> differences between main and branch_9x for the CLI than I quite realized.
> Bats tests on main that aren’t on branch_9x, all the changes in how we
> craft Solr urls, the fact that we never back ported basic auth for SolrCLI
> tools….  We replaced “CreateCoreTool" and “CreateCollectionTool" tools with
> “CreateTool” on main...  I haven’t committed all my local changes as the
> tests are failing, but may try that….
>
> I’m running out of time before vacation for three weeks (hello Spain!) and
> so thinking:
>
> 1) Ask for help.  If someone else wanted to take a crack at the back port
> who has more Git-fu than I do, more than welcome.
> 2) Push up the code to the branch even though tests etc fail.
> 3) Not worry about back port to branch_9x till I get back last week of
> July.
> 4) Pivot on the back port plan and declare SOLR-16842 to be a Solr 10 only
> feature.  Figure out a new plan for removing deprecated options from code.
>  (That is starting to feel the path of least resistance to me).
>
> I would very much like to NOT rollback the change so didn’t list that as
> an option….
>
>
> Thanks
>
> Eric
>
>
> On Jun 22, 2024, at 10:05 AM, Eric Pugh 
> wrote:
>
> Thanks Jason, I’m happy to stall till end of day on Monday to click
> “merge”!   Thanks.
>
>
>
> On Jun 21, 2024, at 12:33 PM, Jason Gerlowski 
> wrote:
>
> Hey Eric,
>
> Sorry for the delay - I am hoping to review that PR.  I'll try to have
> a review done by Monday (if not today).  That should leave a week or
> so for any followups prior to your big trip!  (Safe travels!)
>
> If Monday isn't early enough and you need to merge on Saturday, IMO
> that's fine.  The PR's been open for nearly a year and you've been
> more than patient at this point.  In that case I'll just review
> post-merge and we can handle any follow ups as needed.
>
> Best,
>
> Jason
>
> On Fri, Jun 21, 2024 at 12:07 PM Eric Pugh
>  wrote:
>
>
> Hi all, I’m planning on merging https://github.com/apache/solr/pull/1768,
> SOLR-16824: Adopt Linux Command line tool pattern of -- for long option
> commands tomorrow (Saturday).
>
> I’m going to be traveling (Spain!) for three weeks starting June 29th, and
> I’d like to make sure this fairly big change has plenty of time for any
> panicky follow up fixes that turn out to be needed when it goes into main
> and branch_9x.
>
> Eric
>
>

Re: Simplify Ref Guide Examples by Merging Windows and Mac/Linux Examples?

2024-06-14 Thread David Smiley
+1

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Fri, Jun 14, 2024 at 3:30 PM Eric Pugh 
wrote:

> In the ref guide we duplicate all our bin/solr post examples to deal with
> the / for unix/Mac and \ for windows.
>
> I asked ChatGPT about this, and it said that Java just deals with it…
>
> I was thinking we could reduce the duplication by just providing the linux
> example, and not labeling it “Linux/Mac” and not having a separate windows
> one…
>
> Thoughts?
>
> Eric
>
>
> What ChatGPT said:
> In Java, the file path handling is designed to be platform-independent, so
> a path like example/films/films.json will generally work on both Unix-based
> systems (like Linux or macOS) and Windows, regardless of the underlying
> file system conventions.
>
> Java's File class, which is used to interact with the file system,
> automatically handles the differences in path separators between platforms.
> On Unix-based systems, the path separator is the forward slash (/), while
> on Windows, it's the backslash (\).
>
> When you pass a path like example/films/films.json to Java, it will
> interpret the path correctly on both platforms. On Windows, Java will
> automatically convert the forward slashes to backslashes as needed.
>
> Similarly, if you pass a Windows-style path like example\films\films.json,
> Java will also handle that correctly on both Unix-based systems and Windows.
>
> The key point is that Java abstracts away the differences in file system
> conventions between platforms, allowing your code to work consistently
> across different operating systems. As long as you use Java's file system
> APIs (such as File, Path, or Paths), you don't need to worry about the
> underlying path separator characters.
>
>
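
A quick way to check the quoted claim is the Java sketch below (the class
and method names are made up for illustration).  It counts how many name
elements the default file system sees in each spelling of the path.  As
far as I know, the forward-slash form is split into three elements on both
Unix-like systems and Windows, whereas the backslash form is only split on
Windows; on Unix-like systems a backslash is an ordinary filename
character, so that part of the quoted answer does not hold there.

```java
import java.io.File;
import java.nio.file.Path;

// Hypothetical helper for illustration; not part of Solr.
class PathSeparatorSketch {
  /** Number of name elements the default file system sees in the path. */
  static int nameCount(String path) {
    return Path.of(path).getNameCount();
  }

  public static void main(String[] args) {
    System.out.println("platform separator: " + File.separator);
    // Forward slashes are understood on Unix-like systems and on Windows.
    System.out.println("example/films/films.json -> "
        + nameCount("example/films/films.json") + " elements");
    // Backslashes are separators on Windows only.
    System.out.println("example\\films\\films.json -> "
        + nameCount("example\\films\\films.json") + " elements");
  }
}
```

That asymmetry is an argument for keeping the forward-slash example if
only one variant stays in the ref guide.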


CHANGES.txt -- is worthy?

2024-06-13 Thread David Smiley
When I think of CHANGES.txt, I think of communicating to users.  We
write something succinct for the benefit of users to understand how
this change will impact them.

I don't think of CHANGES.txt as a log of all changes.  I think
increasing the scope to such wastes the time of users (and *we* are
also users!), and dilutes the value of more meaningful entries.

I know we have an "Other" section... perhaps this section allows us to
add stuff that no user ought to care about.  Not sure if users know to
not waste their time looking at it.  Still... I think we shouldn't
bother adding some changes to CHANGES.txt at all.

So... if a deprecated method is replaced with an equivalent... let's
just not add this; okay?  Specific example:  replacing new URL("...")
with the new equivalent.  Honestly I don't even need to ask; just
don't.  Perhaps others have input on other examples or have different
perspectives of thinking about this.  The presence of a JIRA issue
doesn't necessarily mean a CHANGES.txt entry needs to be added.
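
For anyone unfamiliar with the example, the kind of mechanical
replacement meant here might look like the sketch below.  It assumes the
deprecation in question is that of the java.net.URL constructors
(deprecated as of JDK 20) and that the "new equivalent" is
URI.create(...).toURL(); the class and helper names are made up for
illustration.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

// Hypothetical helper for illustration; not taken from the Solr codebase.
class UrlReplacementSketch {
  /** Builds a URL via URI instead of the deprecated new URL(String). */
  static URL toUrl(String spec) {
    try {
      return URI.create(spec).toURL();
    } catch (MalformedURLException e) {
      // Wrapped so callers do not have to handle a checked exception.
      throw new IllegalArgumentException("Not a valid URL: " + spec, e);
    }
  }

  public static void main(String[] args) {
    URL url = toUrl("http://localhost:8983/solr");
    System.out.println(url.getHost() + ":" + url.getPort() + url.getPath());
  }
}
```

A change of this shape does not alter behavior a user would notice, which
is the point about leaving it out of CHANGES.txt.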

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley



Re: Integration test failures on Jenkins

2024-06-11 Thread David Smiley
Thanks for investigating. This kind of failure would never occur in a
containerized infrastructure. It’s sad to have to deal with this. Do you
know if it’s even an option to run a particular job in a container?
Anyway, reach infra on Slack #askinfra

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Tue, Jun 11, 2024 at 6:42 PM Houston Putman  wrote:

> Currently the integration tests have been failing on Jenkins since around
> May 29th. It looks like there is a rogue process that has Auth enabled that
> is causing this problem. (Port 38603)
>
> For some reason a lot of the bin/solr commands choose to connect to this
> process instead of the SOLR_PORT that is in use by the integration tests at
> that point. I have a PR for making all commands in bin/solr use the same
> port defaulting, but that can really only land in 10.0.
>
> In the meantime we should see if infra can stop that rogue process so that
> the tests can start passing.
>
> - Houston
>


Re: CI builds for older Solr releases linger... why?

2024-06-10 Thread David Smiley
Much appreciated, Houston!

On Thu, Jun 6, 2024 at 8:50 PM Houston Putman  wrote:
>
> It can definitely be made more explicit, so I did that:
> https://github.com/apache/solr/pull/2504 (as well as clean up some
> deprecation notices)
>
> On Thu, Jun 6, 2024 at 12:37 PM David Smiley  wrote:
>
> > I saw an entry that refers people to the Confluence page I was
> > referring to.  That page clearly instructs to delete old jobs.
> >
> > On Thu, Jun 6, 2024 at 1:19 PM Houston Putman  wrote:
> > >
> > > Ok, I see that in the release wizard there is an item to add new jenkins
> > > task for the release branch, but there is not an item to remove the old
> > > jenkins tasks.
> > >
> > > I'll go ahead and make a PR for that.
> > >
> > > - Houston
> > >
> > > On Thu, Jun 6, 2024 at 8:19 AM Eric Pugh  wrote:
> > >
> > > > David, I went to look at the builds and the long list is rather
> > > > overwhelming!  So +1 to pruning.
> > > >
> > > > On 2024/06/04 19:53:14 David Smiley wrote:
> > > > > > I suspect nobody was reading the conversation Eric and I were having
> > > > > > on bui...@solr.apache.org; maybe because nobody looks there.  Maybe
> > > > > > we should never do that and have it be build-only messages.
> > > > > > Nevertheless all active Solr committers should subscribe to that list
> > > > > > if you haven't (it's a basic project hygiene thing -- monitor builds).
> > > > > > So I am copy-pasting to our dev list.
> > > > > >
> > > > > > I will very soon take action to DELETE (not disable) the Jenkins CI
> > > > > > builds for 8.9, 8.10 (not 8.11), 9.0, 9.1, 9.2, 9.3, 9.4, 9.5 --
> > > > > > there is more than one job for some of these releases.  Our ref guide
> > > > > > instructions actually indicate to do this:
> > > > > > https://cwiki.apache.org/confluence/display/SOLR/JenkinsReleaseBuilds+-+Solr
> > > > > > so I won't wait for someone to tell me not to follow these
> > > > > > instructions ;-).   Yet release after release, nobody has done this
> > > > > > despite this being a release-wizard step (AFAICT).  What's broken in
> > > > > > our process here folks?  (Don't ask me, I only did a *patch* release
> > > > > > once which has no step to do here.)
> > > > >
> > > > > ~ David
> > > > >
> > > > > -- Forwarded message -
> > > > > From: David Smiley 
> > > > > Date: Fri, May 31, 2024 at 11:14 AM
> > > > > Subject: Re: [JENKINS] Solr » Solr-Smoketest-9.4 - Build # 284 -
> > Still
> > > > Failing!
> > > > > To: , Gus Heck ,
> > > > > 
> > > > > Cc: Eric Pugh , 
> > > > >
> > > > >
> > > > > > I don't think we need release jobs for older releases -- older than
> > > > > > the latest.  Our release process refers RMs to visit
> > > > > > https://cwiki.apache.org/confluence/display/SOLR/JenkinsReleaseBuilds+-+Solr
> > > > > > which first instructs to remove old jobs.  I think this only happens
> > > > > > for major/minor releases but not patch releases.
> > > > > >
> > > > > > https://ci-builds.apache.org/job/Solr/
> > > > > >
> > > > > > Gus, you did 9.6.0.  Did the release wizard direct you to
> > > > > > https://cwiki.apache.org/confluence/display/SOLR/JenkinsReleaseBuilds+-+Solr
> > > > > > ?
> > > > > > Jason, you did 9.5.0.  Same question.
> > > > >
> > > > >
> > > > > On Fri, May 31, 2024 at 8:45 AM Eric Pugh
> > > > >  wrote:
> > > > > >
> > > > > > At first blush, running locally things are fine….
> > > > > >
> > > > > > > Is there any chance that the various Jenkins jobs could be
> > > > > > > sharing/communicating across each other where a bad running Solr
> > > > > > > instance in main is still there and causing others to fail?  I ask
> > > > > > > because why would 9.1, 9.3, 9.4, 9.5, 9.6 all start failing between
> > > > > > > 3 days and 10 hours ago and 2 days 9 hours ago?   I get changes on
> > > > > > > 9.6, but not on the previous versions.
> > > > > >
> > > > > >

Re: Failed startup behavior

2024-06-06 Thread David Smiley
Interestingly it was SOLR-179 (Ryan M) that added the prevention of
error propagation because the idea was to start anyway and display a
webpage of the error.  But it was later partially undone by SOLR-1846
(Hossman), although it left this bit in place, probably by mistake -- an
oversight.  It didn't quite work anyway.  Regardless, Jetty still stays
running, albeit in a zombie state (503); so again, it's not working as
designed.  This wart can be removed but won't address the matter.
FWIW I have a WIP BATS test to try to demonstrate the issue.

Despite whatever rationale Jetty / specs have on staying started, we
can take action to shut it down if that's best for Solr (and I think
it is).

I noticed the Solr Operator uses a liveness probe to
/admin/info/system so in practice it may not be all that important in
k8s.
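
To make the zombie-state point concrete, here is a rough Java sketch of
what an HTTP liveness check against that endpoint amounts to.  It is not
the Solr Operator's code.  The class LivenessProbeSketch, the /solr path
prefix and the throwaway stub server standing in for Jetty are
assumptions of the sketch, and the rule that a status from 200 up to (but
excluding) 400 counts as alive is borrowed from how Kubernetes HTTP probes
are documented to behave.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical sketch; names and paths are illustrative assumptions.
class LivenessProbeSketch {
  /** True if the endpoint answers with a status in the range [200, 400). */
  static boolean isLive(URI endpoint) {
    HttpClient client =
        HttpClient.newBuilder().connectTimeout(Duration.ofSeconds(2)).build();
    HttpRequest request =
        HttpRequest.newBuilder(endpoint).timeout(Duration.ofSeconds(5)).GET().build();
    try {
      int status =
          client.send(request, HttpResponse.BodyHandlers.discarding()).statusCode();
      return status >= 200 && status < 400;
    } catch (IOException e) {
      return false; // unreachable counts as not live
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  /** Starts a throwaway local server answering every probe with one status. */
  static HttpServer startStub(int status) {
    try {
      HttpServer server = HttpServer.create(
          new InetSocketAddress(InetAddress.getByName("127.0.0.1"), 0), 0);
      server.createContext("/solr/admin/info/system", exchange -> {
        exchange.sendResponseHeaders(status, -1); // -1 means no response body
        exchange.close();
      });
      server.start();
      return server;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Probe URI for a stub server started by startStub. */
  static URI probeUri(HttpServer server) {
    return URI.create("http://127.0.0.1:" + server.getAddress().getPort()
        + "/solr/admin/info/system");
  }

  public static void main(String[] args) {
    HttpServer zombie = startStub(503);  // stands in for the failed-startup state
    HttpServer healthy = startStub(200);
    try {
      System.out.println("zombie live?  " + isLive(probeUri(zombie)));
      System.out.println("healthy live? " + isLive(probeUri(healthy)));
    } finally {
      zombie.stop(0);
      healthy.stop(0);
    }
  }
}
```

Under these assumptions a 503 from the zombie Jetty fails the probe, which
is why the orchestrator would eventually restart the pod even though the
process itself never exits.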



Re: CI builds for older Solr releases linger... why?

2024-06-06 Thread David Smiley
I saw an entry that refers people to the Confluence page I was
referring to.  That page clearly instructs to delete old jobs.

On Thu, Jun 6, 2024 at 1:19 PM Houston Putman  wrote:
>
> Ok, I see that in the release wizard there is an item to add new jenkins
> task for the release branch, but there is not an item to remove the old
> jenkins tasks.
>
> I'll go ahead and make a PR for that.
>
> - Houston
>
> On Thu, Jun 6, 2024 at 8:19 AM Eric Pugh  wrote:
>
> > David, I went to look at the builds and the long list is rather
> > overwhelming!  So +1 to pruning.
> >
> > On 2024/06/04 19:53:14 David Smiley wrote:
> > > I suspect nobody was reading the conversation Eric and I were having
> > > on bui...@solr.apache.org; maybe because nobody looks there.  Maybe we
> > > should never do that and have it be build-only messages.  Nevertheless
> > > all active Solr committers should subscribe to that list if you
> > > haven't (it's a basic project hygiene thing -- monitor builds).  So I
> > > am copy-pasting to our dev list.
> > >
> > > I will very soon take action to DELETE (not disable) the Jenkins CI
> > > > builds for 8.9, 8.10 (not 8.11), 9.0, 9.1, 9.2, 9.3, 9.4, 9.5 -- there
> > > > is more than one job for some of these releases.  Our ref guide
> > > instructions actually indicate to do this:
> > >
> > https://cwiki.apache.org/confluence/display/SOLR/JenkinsReleaseBuilds+-+Solr
> > > so I won't wait for someone to tell me not to follow these
> > > instructions ;-).   Yet release after release, nobody has done this
> > > despite this being a release-wizard step (AFAICT).  What's broken in
> > > our process here folks?  (Don't ask me, I only did a *patch* release
> > > once which has no step to do here.)
> > >
> > > ~ David
> > >
> > > -- Forwarded message -
> > > From: David Smiley 
> > > Date: Fri, May 31, 2024 at 11:14 AM
> > > Subject: Re: [JENKINS] Solr » Solr-Smoketest-9.4 - Build # 284 - Still
> > Failing!
> > > To: , Gus Heck ,
> > > 
> > > Cc: Eric Pugh , 
> > >
> > >
> > > I don't think we need release jobs for older releases -- older than
> > > the latest.  Our release process refers RMs to visit
> > >
> > https://cwiki.apache.org/confluence/display/SOLR/JenkinsReleaseBuilds+-+Solr
> > > which first instructs to remove old jobs.  I think this only happens
> > > for major/minor releases but not patch releases.
> > >
> > > https://ci-builds.apache.org/job/Solr/
> > >
> > > Gus, you did 9.6.0.  Did the release wizard direct you to
> > >
> > https://cwiki.apache.org/confluence/display/SOLR/JenkinsReleaseBuilds+-+Solr
> > > ?
> > > Jason, you did 9.5.0.  Same question.
> > >
> > >
> > > On Fri, May 31, 2024 at 8:45 AM Eric Pugh
> > >  wrote:
> > > >
> > > > At first blush, running locally things are fine….
> > > >
> > > > > Is there any chance that the various Jenkins jobs could be
> > > > > sharing/communicating across each other where a bad running Solr
> > > > > instance in main is still there and causing others to fail?  I ask
> > > > > because why would 9.1, 9.3, 9.4, 9.5, 9.6 all start failing between
> > > > > 3 days and 10 hours ago and 2 days 9 hours ago?   I get changes on
> > > > > 9.6, but not on the previous versions.
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > > > On May 31, 2024, at 8:19 AM, Eric Pugh
> > > > > > <ep...@opensourceconnections.com> wrote:
> > > > > >
> > > > > > Looks like it’s failing in 9x too. I’ll check out what’s going on.
> > > > > >
> > > > > > What is our policy for having older tests….  Do we actually need to
> > > > > > keep around the checks for 9.0 through 9.5?  If we found a major
> > > > > > issue in a previous release like 9.2, would we just ship an updated
> > > > > > 9.x, so it would be a 9.6.2 or a 9.7?
> > > > > >
> > > > > > Wondering if having fewer Jenkins jobs would make it easier to keep
> > > > > > tabs on them?
> > > > >
> > > > >
> > > > > >> On May 31, 2024, at 1:33 AM, David Smiley wrote:
> > > > > >>
> > > > > >> Eric, maybe you were working on authentication matters and could
> > > > > >> thus guess as to why some smoke tests fail here?  This one is for
> > > > > >> 9.4 but there's another for 9.6
> > > > >>
> > >
> > > -
> > > To unsubscribe, e-mail: dev-unsubscr...@solr.apache.org
> > > For additional commands, e-mail: dev-h...@solr.apache.org
> > >
> > >
> >
> >
> >




Re: Seeking guidance on Upgrading Minimum Java Version for Main Branch

2024-06-06 Thread David Smiley
The OWASP one succeeded.  I updated some other "main" branch builds to
use JDK 21.  I didn't yet touch the ones that run tests but will do
that after seeing if this round goes well.  Ideally our test situation
would be in better shape :-/

On Wed, Jun 5, 2024 at 12:17 AM David Smiley  wrote:
>
> We'll need to upgrade all the build servers (Apache, Thetaphi, Crave).
> I started by choosing just the OWASP build for Solr as a trial to see
> if it works without actually merging your PR.
>
>
>
> On Tue, Jun 4, 2024 at 8:59 AM David Smiley  wrote:
> >
> > +1 to move Solr main to Lucene's version (21) as soon as you wish.
> > Like just go for it now if you are motivated.  I echo Jan's preference
> > to not make "sweeping changes" yet however.  On the server side, we
> > can be fairly up to date with Java versions.  This does force plugin
> > writers to upgrade too (and people love their Solr plugins), so I
> > don't think we should pick 22; that would annoy users who cannot
> > upgrade aggressively.
> >
> > We should not be so aggressive with SolrJ!  Set the language there
> > (and thus solr-api, a dependency) to the 17 language / compiled-class
> > level, at least on the "main" (not test) codeline.  The test codeline
> > uses solr-test-framework which uses solr-core thus that must always
> > match the server end.
> >
> > On Tue, Jun 4, 2024 at 1:47 AM sanjay dutt
> >  wrote:
> > >
> > >  The purpose of this email thread is to initiate a discussion about 
> > > upgrading the minimum Java version (Could be same as Lucene minimum Java 
> > > for main branch i.e. 21) for the main branch. We seek to understand the 
> > > community's major concerns and gather their guidance on this matter.
> > > Regards,Sanjay




Re: CI build failures

2024-06-06 Thread David Smiley
We could add (yet) another GitHub workflow validation.  Or maybe take
the Docker one, which looks at certain paths (including bin/solr), and
generalize it to not just test Docker but also run the BATS
integration tests.  Then think of this as the "integration test"
build.  The "paths" it looks at could be expanded to include some of
the Java source CLI patterns so that it runs in more of the cases that
are likely to be impacted.

I'm slightly reluctant to take the Crave build and slow it down to run
those integration tests that run serially.  It's amazing, particularly
when working locally, to get fast feedback on all Java based tests.
Admittedly it's not an either-or; it's a matter of indicating the
desired targets in the build, so I shouldn't be reluctant here.
Anyway, I'm more keen to expand the scope of the Docker build.

On Thu, Jun 6, 2024 at 9:17 AM Eric Pugh
 wrote:
>
> I think a weekly heads up would be great to have.
>
> You mention “PR validation doesn’t run BATS tests”….   Maybe it should?  
> We’ve had a lot of churn in the CLI, and we’ll probably continue to have that 
> till 10x comes out, so that would be a nice check.   Plus, if we add more 
> integration style BATS tests, why not have them be run on the PR’s?
>
> > On May 31, 2024, at 11:40 AM, David Smiley  wrote:
> >
> > I'm concerned that too few people look at the bui...@solr.apache.org
> > From time to time I go look but we have no notifications other than
> > emailing that list.
> >
> > If hypothetically nobody looked, we might as well not have CI at all;
> > we'd only have PR based validation.  We'd lose out on historic test
> > failure tracking and detection of introducing a problem that got
> > merged anyway.  PR validation doesn't run BATS tests or "Nightly"
> > tests.
> >
> > Long ago, I recall getting a direct email to me if I contributed a
> > change to a CI failure.  I would like this.  I would also like a dev
> > list email periodically (weekly?) listing the CI job status.
> >
> > Any opinions on what to do here?
> >
> > ~ David Smiley
> > Apache Lucene/Solr Search Developer
> > http://www.linkedin.com/in/davidwsmiley
> >
> >
>
>




Re: Changes in JAX-RS APIs mean may need to "gradlew clean" ?

2024-06-06 Thread David Smiley
Thanks Jason!
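
For the archive, here is a toy illustration of the stale-output problem
discussed below.  It is plain Java and has nothing to do with the real
openApiGenerate task: GeneratedSourcesDemo, generate and the file names
are hypothetical.  It only shows why a generator that never cleans its
output directory leaves files behind after a branch switch, and how
wiping the directory first avoids that.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class GeneratedSourcesDemo {

  /** Deletes dir and everything under it, children before parents. */
  static void deleteRecursively(Path dir) throws IOException {
    if (!Files.exists(dir)) return;
    try (Stream<Path> paths = Files.walk(dir)) {
      // Reverse order puts the deepest paths first, so directories are empty when deleted.
      for (Path p : paths.sorted(Comparator.reverseOrder()).collect(Collectors.toList())) {
        Files.delete(p);
      }
    }
  }

  /** Hypothetical generator: writes one file per entry, optionally wiping the output first. */
  static void generate(Path outDir, Map<String, String> sources, boolean cleanFirst)
      throws IOException {
    if (cleanFirst) deleteRecursively(outDir);
    Files.createDirectories(outDir);
    for (Map.Entry<String, String> e : sources.entrySet()) {
      Files.writeString(outDir.resolve(e.getKey()), e.getValue());
    }
  }

  /** Returns the sorted file names directly under dir. */
  static List<String> listNames(Path dir) throws IOException {
    try (Stream<Path> s = Files.list(dir)) {
      return s.map(p -> p.getFileName().toString()).sorted().collect(Collectors.toList());
    }
  }

  public static void main(String[] args) throws IOException {
    Path out = Files.createTempDirectory("generated");
    // "Branch A" defines two APIs; "branch B" defines only one of them.
    generate(out, Map.of("FooApi.java", "// foo", "BarApi.java", "// bar"), false);
    generate(out, Map.of("FooApi.java", "// foo"), false);
    System.out.println("without clean: " + listNames(out)); // BarApi.java lingers
    generate(out, Map.of("FooApi.java", "// foo"), true);
    System.out.println("with clean:    " + listNames(out));
    deleteRecursively(out);
  }
}
```

The leftover file in the first case is the analogue of a generated
SolrRequest class whose model classes no longer exist on the new branch.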

On Thu, Jun 6, 2024 at 9:02 AM Jason Gerlowski  wrote:
>
> It worked!  There's a PR here with the fix:
> https://github.com/apache/solr/pull/2502. Take a look when you get a
> chance and let me know what you think!
>
> (I haven't created a JIRA ticket, since it's a minor change to our
> build.  If anyone would prefer a JIRA ticket to document this beyond
> what this dev-thread and Github PR provide, let me know.)
>
> Best,
>
> Jason
>
> On Thu, Jun 6, 2024 at 7:05 AM Jason Gerlowski  wrote:
> >
> > Yeah, this is definitely a pain from time to time.
> >
> > For anyone who hasn't hit this, steps to reproduce are:
> >
> > 1. Start on a branch that has a new JAX-RS-annotated interface in
> > Solr's 'api' module.
> > 2. Any gradle task that compiles SolrJ will build Solr's OpenAPI spec,
> > generate SolrRequest implementations from that spec, and add them to
> > solrj's build directory as a part of its "source set".  These
> > generated classes often reference 'model' classes (e.g. request or
> > response POJOs) that exist on the current branch, but may not be on
> > other branches.
> > 3. Switch to a new branch, which lacks the new JAX-RS API.
> > 4. If solrj is compiled without running "clean" first, the previously
> > generated SolrRequest implementations will fail to compile (because
> > the request/response POJOs they rely on are missing).
> >
> > In terms of *when* this happens, I often see it when changing from a
> > PR-branch to main.  Though you can also see it going from 'main' to
> > 'branch_9x', if 'main' has a JAX-RS API that hasn't been backported
> > yet.
> >
> > I've been a little despairing in the past about fixing this- I know it
> > should be done, but my gradle knowledge is pretty lacking.  Though I
> > notice in writing this email that the 'openApiGenerate' task itself
> > has a few options that might fix this without any broader gradle
> > changes, particularly the "cleanupOutput" and "skipOverwrite" options.
> > I'll try playing with those a bit and report back if there's any
> > promise.
> >
> > Best,
> >
> > Jason
> >
> >
> > On Tue, Jun 4, 2024 at 6:52 PM David Smiley  wrote:
> > >
> > > I noticed some generated source files, and in particular
> > > solr/solrj/build/generated/src/main/java/org/apache/solr/client/solrj/request/FileStoreApi.java
> > > that suddenly had a compilation issue.  This is almost certainly due
> > > to the API evolving, and SOLR-17302 in particular.  Just do "gradlew
> > > clean" to start fresh.  I've hit this a couple times for different
> > > specific API issues over some months.
> > >
> > > Still... should the build be smart enough to avoid this?  For example
> > > if the generator is blind to the output directory's contents, we may
> > > as well clean the generated directory fully first.  On the other hand,
> > > maybe it smartly recognizes existing generated stuff and can tell that
> > > it doesn't need to re-generate (like javac).
> > >
> > > ~ David Smiley
> > > Apache Lucene/Solr Search Developer
> > > http://www.linkedin.com/in/davidwsmiley
> > >
> > >
>
>




Re: Null Pointer exception while doing data import

2024-06-05 Thread David Smiley
No need to decompile; the source is available :-).
I don't see a branch or tag for that version over there.
Still, the place to ask is there, not here.

On Wed, Jun 5, 2024 at 4:44 AM Mugi, Krishnavamsireddy
 wrote:
>
> Hi David,
>
> Can you please let me how to get support on decompiling the 9.2.1 data import 
> handler jar from the below github repository?
>
> Thanks
> KrishnaVamsi
>
> -----Original Message-----
> From: David Smiley 
> Sent: Tuesday, June 4, 2024 11:21 PM
> To: dev@solr.apache.org
> Cc: Mugi, Krishnavamsireddy 
> Subject: Re: Null Pointer exception while doing data import
>
>
> 
>
>
> Hello,
>
> The DataImportHandler is now at
> https://github.com/SearchScale/dataimporthandler which is the right
> place to get support for it.
>
> On Tue, Jun 4, 2024 at 8:00 AM Mugi, Krishnavamsireddy 
>  wrote:
> >
> > Hi Team,
> >
> > We have multiple feed url's configured in solr, While performing data 
> > import from feed url to solr, we are getting below exception. Can you 
> > please let me know why will we get this exception while performing data 
> > import? FYI, it is not continuous, It is happening intermittently.
> >
> > Exception:
> >
> > "ERROR","org.apache.logging.slf4j.Log4jLogger",
> > "org.apache.solr.handler.dataimport.DataImporter","Full Import failed",
> > "b520b029-0ab3-474c-95b7-97dbd8fd2e7c",9,"logs.APM",52,"Thread-1278",
> > 1717478116656,"java.lang.RuntimeException",
> > "java.lang.NullPointerException: Cannot invoke
> > ""org.apache.solr.common.util.FastOutputStream.size()"" because
> > ""this.fos"" is null",
> >
> > Thanks
> > KrishnaVamsi
> >
>
>




Re: Seeking guidance on Upgrading Minimum Java Version for Main Branch

2024-06-04 Thread David Smiley
We'll need to upgrade all the build servers (Apache, Thetaphi, Crave).
I started by choosing just the OWASP build for Solr as a trial to see
if it works without actually merging your PR.



On Tue, Jun 4, 2024 at 8:59 AM David Smiley  wrote:
>
> +1 to move Solr main to Lucene's version (21) as soon as you wish.
> Like just go for it now if you are motivated.  I echo Jan's preference
> to not make "sweeping changes" yet however.  On the server side, we
> can be fairly up to date with Java versions.  This does force plugin
> writers to upgrade too (and people love their Solr plugins), so I
> don't think we should pick 22; that would annoy users who cannot
> upgrade aggressively.
>
> We should not be so aggressive with SolrJ!  Set the language there
> (and thus solr-api, a dependency) to the 17 language / compiled-class
> level, at least on the "main" (not test) codeline.  The test codeline
> uses solr-test-framework which uses solr-core thus that must always
> match the server end.
>
> On Tue, Jun 4, 2024 at 1:47 AM sanjay dutt
>  wrote:
> >
> >  The purpose of this email thread is to initiate a discussion about 
> > upgrading the minimum Java version (Could be same as Lucene minimum Java 
> > for main branch i.e. 21) for the main branch. We seek to understand the 
> > community's major concerns and gather their guidance on this matter.
> > Regards,Sanjay




CI builds for older Solr releases linger... why?

2024-06-04 Thread David Smiley
I suspect nobody was reading the conversation Eric and I were having
on bui...@solr.apache.org; maybe because nobody looks there.  Maybe we
should never do that and have it be build-only messages.  Nevertheless
all active Solr committers should subscribe to that list if you
haven't (it's a basic project hygiene thing -- monitor builds).  So I
am copy-pasting to our dev list.

I will very soon take action to DELETE (not disable) the Jenkins CI
builds for 8.9, 8.10 (not 8.11), 9.0, 9.1, 9.2, 9.3, 9.4, 9.5 -- there
is more than one job for some of these releases.  Our release
instructions actually say to do this:
https://cwiki.apache.org/confluence/display/SOLR/JenkinsReleaseBuilds+-+Solr
so I won't wait for someone to tell me not to follow these
instructions ;-).   Yet release after release, nobody has done this
despite this being a release-wizard step (AFAICT).  What's broken in
our process here folks?  (Don't ask me, I only did a *patch* release
once, which has no such step.)

~ David

-- Forwarded message -
From: David Smiley 
Date: Fri, May 31, 2024 at 11:14 AM
Subject: Re: [JENKINS] Solr » Solr-Smoketest-9.4 - Build # 284 - Still Failing!
To: , Gus Heck ,

Cc: Eric Pugh , 


I don't think we need release jobs for older releases -- older than
the latest.  Our release process refers RMs to visit
https://cwiki.apache.org/confluence/display/SOLR/JenkinsReleaseBuilds+-+Solr
which first instructs to remove old jobs.  I think this only happens
for major/minor releases but not patch releases.

https://ci-builds.apache.org/job/Solr/

Gus, you did 9.6.0.  Did the release wizard direct you to
https://cwiki.apache.org/confluence/display/SOLR/JenkinsReleaseBuilds+-+Solr
?
Jason, you did 9.5.0.  Same question.


On Fri, May 31, 2024 at 8:45 AM Eric Pugh
 wrote:
>
> At first blush, running locally things are fine….
>
> Is there any chance that the various Jenkins jobs could be 
> sharing/communicating across each other where a bad running Solr instance in 
> main is still there and causing others to fail?  I ask because why would 9.1, 
> 9.3, 9.4, 9.5, 9.6 all start failing between 3 days and 10 hours ago and 2 
> days 9 hours ago?   I get changes on 9.6, but not on the previous versions.
>
>
>
>
>
>
> > On May 31, 2024, at 8:19 AM, Eric Pugh  
> > wrote:
> >
> > Looks like it’s failing in 9x too. I’ll check out what’s going on.
> >
> > What is our policy for having older tests….  Do we actually need to keep 
> > around the checks for 9.0 through 9.5?  If we found a major issue in a 
> > previous release like 9.2, would we just ship an updated 9.x, so it would 
> > be a 9.6.2 or a 9.7?
> >
> > Wondering if having fewer Jenkins jobs would make it easier to keep tabs on 
> > them?
> >
> >
> >> On May 31, 2024, at 1:33 AM, David Smiley  wrote:
> >>
> >> Eric, maybe you were working on authentication matters and could thus
> >> guess as to why some smoke tests fail here?  This one is for 9.4 but
> >> there's another for 9.6
> >>




Re: Null Pointer exception while doing data import

2024-06-04 Thread David Smiley
Hello,

The DataImportHandler is now at
https://github.com/SearchScale/dataimporthandler which is the right
place to get support for it.

On Tue, Jun 4, 2024 at 8:00 AM Mugi, Krishnavamsireddy
 wrote:
>
> Hi Team,
>
> We have multiple feed url's configured in solr, While performing data import 
> from feed url to solr, we are getting below exception. Can you please let me 
> know why will we get this exception while performing data import? FYI, it is 
> not continuous, It is happening intermittently.
>
> Exception:
>
> "ERROR","org.apache.logging.slf4j.Log4jLogger",
> "org.apache.solr.handler.dataimport.DataImporter","Full Import failed",
> "b520b029-0ab3-474c-95b7-97dbd8fd2e7c",9,"logs.APM",52,"Thread-1278",
> 1717478116656,"java.lang.RuntimeException",
> "java.lang.NullPointerException: Cannot invoke
> ""org.apache.solr.common.util.FastOutputStream.size()"" because
> ""this.fos"" is null",
>
> Thanks
> KrishnaVamsi
>




Re: Seeking guidance on Upgrading Minimum Java Version for Main Branch

2024-06-04 Thread David Smiley
+1 to move Solr main to Lucene's version (21) as soon as you wish.
Like just go for it now if you are motivated.  I echo Jan's preference
to not make "sweeping changes" yet however.  On the server side, we
can be fairly up to date with Java versions.  This does force plugin
writers to upgrade too (and people love their Solr plugins), so I
don't think we should pick 22; that would annoy users who cannot
upgrade aggressively.

We should not be so aggressive with SolrJ!  Set the language there
(and thus solr-api, a dependency) to the 17 language / compiled-class
level, at least on the "main" (not test) codeline.  The test codeline
uses solr-test-framework which uses solr-core thus that must always
match the server end.

On Tue, Jun 4, 2024 at 1:47 AM sanjay dutt
 wrote:
>
>  The purpose of this email thread is to initiate a discussion about upgrading
> the minimum Java version for the main branch (it could be the same as Lucene's
> minimum Java version for its main branch, i.e. 21). We seek to understand the
> community's major concerns and gather their guidance on this matter.
>
> Regards,
> Sanjay




TimeOut vs ZkStateReader.waitForState

2024-06-03 Thread David Smiley
We've got this utility TimeOut that assists waiting for a condition to
hold in a limited period of time. As a generic utility, of course it
involves a sleep period.  We use it in many places.  IMO it's a sad
choice to use when it's possible to alternatively wait on a condition
that will wake up the thread.  Those who have touched SolrCloud should
be aware of ZkStateReader.waitForState, a specific alternative I have
in mind.

Say for example, in CreateCollectionCmd
https://github.com/apache/solr/blob/cae69c7973303653cade8f9de7b96e26ccd0919e/solr/core/src/java/org/apache/solr/cloud/api/collections/CreateCollectionCmd.java#L224
There's this:

// wait for a while until we see the collection
TimeOut waitUntil =
    new TimeOut(30, TimeUnit.SECONDS, ccc.getSolrCloudManager().getTimeSource());
boolean created = false;
while (!waitUntil.hasTimedOut()) {
  waitUntil.sleep(100);
  created = ccc.getSolrCloudManager().getClusterState().hasCollection(collectionName);
  if (created) break;
}
if (!created) {
  throw new SolrException(
      SolrException.ErrorCode.SERVER_ERROR,
      "Could not fully create collection: " + collectionName);
}

But imagine replacing it with:

try {
  zkStateReader.waitForState(collectionName, 30, TimeUnit.SECONDS, Objects::nonNull);
} catch (TimeoutException e) {
  throw new SolrException(
      SolrException.ErrorCode.SERVER_ERROR,
      "Could not fully create collection: " + collectionName,
      e);
}

I see other places in "Cmd" classes.  We also have
CloudUtil.waitForState which uses TimeOut, even for cases clearly
based on cluster state :-(

Q:  Is this replacement fine or am I missing something?  TimeSource
may be pluggable but it seems sad to forgo waitForState over such a
matter.
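
To make the difference concrete, here is a minimal, self-contained sketch
in plain Java.  It does not use Solr's classes: WaitStyles, pollUntil and
WatchedState are hypothetical names made up for illustration, and
WatchedState is only a stand-in for watched cluster state, not how
ZkStateReader is actually implemented.  pollUntil mimics the TimeOut-style
loop; WatchedState.waitForState blocks until a writer notifies it or the
deadline passes.

```java
import java.util.Objects;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;
import java.util.function.Predicate;

public class WaitStyles {

  /** Polling style (like a TimeOut loop): check, sleep, repeat until the deadline. */
  static boolean pollUntil(BooleanSupplier condition, long timeoutMs, long sleepMs)
      throws InterruptedException {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
    while (System.nanoTime() - deadline < 0) {
      if (condition.getAsBoolean()) return true;
      Thread.sleep(sleepMs); // the change may happen right after we fall asleep
    }
    return condition.getAsBoolean(); // one last check at the deadline
  }

  /** Notification style: hypothetical stand-in for state that is watched for changes. */
  static final class WatchedState<T> {
    private T value; // guarded by "this"

    synchronized void set(T newValue) {
      value = newValue;
      notifyAll(); // wake all waiters so each can re-test its predicate
    }

    synchronized T waitForState(Predicate<? super T> predicate, long wait, TimeUnit unit)
        throws InterruptedException, TimeoutException {
      long deadline = System.nanoTime() + unit.toNanos(wait);
      while (!predicate.test(value)) { // loop guards against spurious wakeups
        long remainingNanos = deadline - System.nanoTime();
        if (remainingNanos <= 0) {
          throw new TimeoutException("state not reached in time");
        }
        // Never pass 0 to wait(): that would mean "wait forever".
        wait(Math.max(1, TimeUnit.NANOSECONDS.toMillis(remainingNanos)));
      }
      return value;
    }
  }

  public static void main(String[] args) throws Exception {
    WatchedState<String> state = new WatchedState<>();
    Thread creator = new Thread(() -> state.set("collection1"));
    creator.start();
    // Returns as soon as set() runs; no fixed sleep interval is involved.
    System.out.println(state.waitForState(Objects::nonNull, 5, TimeUnit.SECONDS));
    creator.join();
  }
}
```

With polling, the waiter can notice the change up to one sleep interval
late; with notification it wakes when set() is called, and it still gives
up with a TimeoutException at the deadline.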

I could file a work item -- should be an easy "newdev" one.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley




Re: Purpose of IndexUpgraderTool

2024-06-03 Thread David Smiley
FWIW, neither I nor my colleagues have run this tool at any stage of
my career that I can remember.  For one thing, all the systems could
re-index if they needed to.
It may be best to remove this page, as it likely causes more
confusion than it resolves.

On Mon, Jun 3, 2024 at 1:34 PM Jason Gerlowski  wrote:
>
> Hey all,
>
> I was poking around the ref-guide a bit recently and noticed our page
> on the "IndexUpgraderTool" that Lucene produces. [1]
>
> AFAICT, the page doesn't hint at when/why a user might want to use the
> IndexUpgraderTool.  Maybe at one point the tool might've been
> preferred to loading the index in an upgraded Solr version, but I
> haven't heard of anyone doing that in the last few years.
>
> Is this something we expect users to still do?  If so, for what
> usecase?  And if not - should we drop it from the ref-guide - it seems
> like it might confuse folks since it's not actually needed to upgrade
> Solr versions...
>
> Best,
>
> Jason
>
> [1] 
> https://solr.apache.org/guide/solr/latest/deployment-guide/indexupgrader-tool.html
>
>




Re: Berlin Buzzwords attendance

2024-05-31 Thread David Smiley
Yup me as well, and some other speakers like Varun will be there.

On Fri, May 31, 2024 at 12:13 PM Doug Turnbull
 wrote:
>
> I'll be there! I'll be talking about a project that uses Solr Learning to
> Rank :)
>
> https://2024.berlinbuzzwords.de/sessions/?id=WKMGWH
>
> On Fri, May 31, 2024 at 10:18 AM Ilan Ginzburg  wrote:
>
> > Hi Solr devs!
> >
> > I will be attending Berlin Buzzwords this year (June 9-11) and I know a few
> > others here will be present as well.
> > Would be nice to meet in person! 
> >
> > Ilan
> >




CI build failures

2024-05-31 Thread David Smiley
I'm concerned that too few people look at the bui...@solr.apache.org
list.  From time to time I go look, but we have no notifications other
than emailing that list.

If hypothetically nobody looked, we might as well not have CI at all;
we'd only have PR based validation.  We'd lose out on historic test
failure tracking and detection of introducing a problem that got
merged anyway.  PR validation doesn't run BATS tests or "Nightly"
tests.

Long ago, I recall getting a direct email to me if I contributed a
change to a CI failure.  I would like this.  I would also like a dev
list email periodically (weekly?) listing the CI job status.

Any opinions on what to do here?

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley




Re: [JENKINS] Solr » Solr-Smoketest-9.4 - Build # 284 - Still Failing!

2024-05-31 Thread David Smiley
I don't think we need release jobs for older releases -- older than
the latest.  Our release process refers RMs to visit
https://cwiki.apache.org/confluence/display/SOLR/JenkinsReleaseBuilds+-+Solr
which first instructs to remove old jobs.  I think this only happens
for major/minor releases but not patch releases.

https://ci-builds.apache.org/job/Solr/

Gus, you did 9.6.0.  Did the release wizard direct you to
https://cwiki.apache.org/confluence/display/SOLR/JenkinsReleaseBuilds+-+Solr
?
Jason, you did 9.5.0.  Same question.


On Fri, May 31, 2024 at 8:45 AM Eric Pugh
 wrote:
>
> At first blush, running locally things are fine….
>
> Is there any chance that the various Jenkins jobs could be 
> sharing/communicating across each other where a bad running Solr instance in 
> main is still there and causing others to fail?  I ask because why would 9.1, 
> 9.3, 9.4, 9.5, 9.6 all start failing between 3 days and 10 hours ago and 2 
> days 9 hours ago?   I get changes on 9.6, but not on the previous versions.
>
>
>
>
>
>
> > On May 31, 2024, at 8:19 AM, Eric Pugh  
> > wrote:
> >
> > Looks like it’s failing in 9x too. I’ll check out what’s going on.
> >
> > What is our policy for having older tests….  Do we actually need to keep 
> > around the checks for 9.0 through 9.5?  If we found a major issue in a 
> > previous release like 9.2, would we just ship an updated 9.x, so it would 
> > be a 9.6.2 or a 9.7?
> >
> > Wondering if having fewer Jenkins jobs would make it easier to keep tabs on 
> > them?
> >
> >
> >> On May 31, 2024, at 1:33 AM, David Smiley  wrote:
> >>
> >> Eric, maybe you were working on authentication matters and could thus
> >> guess as to why some smoke tests fail here?  This one is for 9.4 but
> >> there's another for 9.6
> >>
> >> On Fri, May 31, 2024 at 1:21 AM Apache Jenkins Server
> >>  wrote:
> >>>
> >>> Build: https://ci-builds.apache.org/job/Solr/job/Solr-Smoketest-9.4/284/
> >>>
> >>> Log:
> >>> Started by timer
> >>> Started by timer
> >>> Running as SYSTEM
> >>> [EnvInject] - Loading node environment variables.
> >>> Building remotely on lucene-solr-1 (solr lucene) in workspace 
> >>> /home/jenkins/jenkins-slave/workspace/Solr/Solr-Smoketest-9.4
> >>> [WS-CLEANUP] Deleting project workspace...
> >>> [WS-CLEANUP] Deferred wipeout is used...
> >>> [WS-CLEANUP] Done
> >>> The recommended git tool is: NONE
> >>> No credentials specified
> >>> Cloning the remote Git repository
> >>> Cloning repository https://github.com/apache/solr.git
> >>>> git init /home/jenkins/jenkins-slave/workspace/Solr/Solr-Smoketest-9.4 # 
> >>>> timeout=10
> >>> Fetching upstream changes from https://github.com/apache/solr.git
> >>>> git --version # timeout=10
> >>>> git --version # 'git version 2.17.1'
> >>>> git fetch --tags --progress -- https://github.com/apache/solr.git 
> >>>> +refs/heads/*:refs/remotes/origin/* # timeout=10
> >>>> git config remote.origin.url https://github.com/apache/solr.git # 
> >>>> timeout=10
> >>>> git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* 
> >>>> # timeout=10
> >>> Avoid second fetch
> >>>> git rev-parse origin/branch_9_4^{commit} # timeout=10
> >>> Checking out Revision fe0b0fda7c235857231e1af2eb03b3622cd5a4cf 
> >>> (origin/branch_9_4)
> >>>> git config core.sparsecheckout # timeout=10
> >>>> git checkout -f fe0b0fda7c235857231e1af2eb03b3622cd5a4cf # timeout=10
> >>>> git branch -a -v --no-abbrev # timeout=10
> >>>> git checkout -b branch_9_4 fe0b0fda7c235857231e1af2eb03b3622cd5a4cf # 
> >>>> timeout=10
> >>> Commit message: "[RefGuide] Update system-requirements (#2361)"
> >>>> git rev-list --no-walk fe0b0fda7c235857231e1af2eb03b3622cd5a4cf # 
> >>>> timeout=10
> >>> No emails were triggered.
> >>> provisioning config files...
> >>> copy managed file [gradle.properties] to 
> >>> file:/home/jenkins/jenkins-slave/workspace/Solr/Solr-Smoketest-9.4/gradle.properties
> >>> [Gradle] - Launching build.
> >>> [Solr-Smoketest-9.4] $ 
> >>> /home/jenkins/jenkins-slave/workspace/Solr/Solr-Smoketest-9.4/gradlew 
> >>> -Dversion.suffix= 
> >>> -Dgradle.user.home=/home/jenkins/jenkins-slave/workspace/Solr/.

Re: Found a bug in 9.6...

2024-05-29 Thread David Smiley
Yeah, +1 to what Houston said, thus continue with releasing 9.6.1 without
this fix.

On Wed, May 29, 2024 at 3:50 PM Houston Putman  wrote:

> Honestly there are some pretty bad bugs that 9.6.1 is aiming to fix, so
> I'm hesitant to fail the RC for this. If you have a quick fix, I can make a
> new RC, but otherwise I'd likely just leave it for 9.6.2 or 9.7.0.
>
> These releases aren't hard to do, we can always do another one in a few
> weeks.
>
> - Houston
>
> On Wed, May 29, 2024 at 1:15 PM Eric Pugh 
> wrote:
>
>> I don’t know that it’s worth holding up 9.6.1 for….
>>
>>
>> [SOLR-17315] Bug in messaging when creating a collection, and then you
>> can't actually call the config method to set-user-property - ASF JIRA
>> (issues.apache.org)
>>
>> It started as just a bug in the messaging from bin/solr create -c
>> mycollection, but found that the bin/solr config -c mycollection -action
>> set-user-property -property update.autoCreateFields -value false
>>
>> Is failing….Works on main ;-(
>>
>>
>>


Re: Bugfix release Solr 9.6.1

2024-05-28 Thread David Smiley
Curious, did you generate the list of people to thank with
https://github.com/apache/solr/pull/2424 ?

On Thu, May 23, 2024 at 2:10 PM Houston Putman  wrote:
>
> Release notes can be found here:
> https://cwiki.apache.org/confluence/display/SOLR/ReleaseNote9_6_1
>
> On Thu, May 23, 2024 at 12:55 PM Houston Putman  wrote:
>
> > NOTICE:
> >
> > I am now preparing for a bugfix release from branch branch_9_6
> >
> > Please observe the normal rules for committing to this branch:
> >
> > * Before committing to the branch, reply to this thread and argue
> >   why the fix needs backporting and how long it will take.
> > * All issues accepted for backporting should be marked with 9.6.1
> >   in JIRA, and issues that should delay the release must be marked as
> > Blocker
> > * All patches that are intended for the branch should first be committed
> >   to the unstable branch, merged into the stable branch, and then into
> >   the current release branch.
> > * Only Jira issues with Fix version 9.6.1 and priority "Blocker" will delay
> >   a release candidate build.
> >




Re: Amazon ion-java library version

2024-05-27 Thread David Smiley
Hi Francisco,

There are no plans for another Solr 8.11.x release, and I don't expect
there to be one if the remedy is upgrading the JAR manually, which any
user can do.  The library is also only used by a Solr module that
users must opt in to, so this applies to only a subset of users.  Also,
that CVE is just a DoS attack -- these usually don't concern me.

~ David

On Mon, May 27, 2024 at 6:52 AM Francisco Jose Mulero
 wrote:
>
> Hi
>
> The library software.amazon.ion/ion-java is currently fixed to version
> 1.0.2 [1]. That  library is provided along with the version 8.11.3. I am
> not sure where it comes from but that version has a high CVE reported
> (CVE-2024-21634 [2]) . Is there any plan to update it?
>
> [1]
> https://github.com/apache/solr/blob/2b28161cc565f695e0ec0761a0c3b0f4c09074f9/versions.lock#L453C1-L453C35
> [2] https://nvd.nist.gov/vuln/detail/CVE-2024-21634
>




Re: [VOTE] Release Solr 9.6.1 RC1

2024-05-24 Thread David Smiley
+1

SUCCESS! [0:43:15.535671]

On Thu, May 23, 2024 at 3:40 PM Houston Putman  wrote:
>
> Please vote for release candidate 1 for Solr 9.6.1
>
> The artifacts can be downloaded from:
> https://dist.apache.org/repos/dist/dev/solr/solr-9.6.1-RC1-rev-d7f7166567f52f1b31e3315b0188e11f2c4c9b60
>
> You can run the smoke tester directly with this command:
>
> python3 -u dev-tools/scripts/smokeTestRelease.py \
> https://dist.apache.org/repos/dist/dev/solr/solr-9.6.1-RC1-rev-d7f7166567f52f1b31e3315b0188e11f2c4c9b60
>
> You can build a release-candidate of the official docker images (full &
> slim) using the following command:
>
> SOLR_DOWNLOAD_SERVER=https://dist.apache.org/repos/dist/dev/solr/solr-9.6.1-RC1-rev-d7f7166567f52f1b31e3315b0188e11f2c4c9b60/solr && \
>   docker build $SOLR_DOWNLOAD_SERVER/9.6.1/docker/Dockerfile.official-full \
> --build-arg SOLR_DOWNLOAD_SERVER=$SOLR_DOWNLOAD_SERVER \
> -t solr-rc:9.6.1-1 && \
>   docker build $SOLR_DOWNLOAD_SERVER/9.6.1/docker/Dockerfile.official-slim \
> --build-arg SOLR_DOWNLOAD_SERVER=$SOLR_DOWNLOAD_SERVER \
> -t solr-rc:9.6.1-1-slim
>
> The vote will be open for at least 72 hours i.e. until 2024-05-29 20:00 UTC.
> (Extended due to weekend and US holiday)
>
> [ ] +1  approve
> [ ] +0  no opinion
> [ ] -1  disapprove (and reason why)
>
> Here is my +1




Welcome Sanjay Dutt as Solr committer!

2024-05-20 Thread David Smiley
The Project Management Committee (PMC) for Apache Solr has invited
Sanjay Dutt to become a committer and we are pleased to announce that
they have accepted.

Sanjay, the tradition is that new committers introduce themselves with
a brief bio.

Congratulations and welcome!




Re: Monthly Virtual Meetups and continued use of Meetup.com

2024-05-20 Thread David Smiley
Thanks for doing that Ishan!

On Sat, May 18, 2024 at 3:06 PM Ishan Chattopadhyaya
 wrote:
>
> I've been paying for our group's Meetup.com account, but I no longer have
> the funds to do so.
>
> On Sat, 18 May 2024 at 15:01, Mark Miller  wrote:
>
> > Wow, I never knew meetup.com charged - and it looks like as much as a
> > streaming service. Wow. The only value I know of that meetup.com really
> > provides is discoverability and rsvp. Neither of which seems that valuable
> > for this.
> >
> > It would be crazy to pay that monthly fee if you didn’t already. If it did
> > have some value, I’d try and replace it with a free option. I think
> > EventBrite is free for free events, and LinkedIn has a LinkedIn Event
> > feature.
> >




Re: Solr and Java JRE support

2024-05-16 Thread David Smiley
Please use the us...@solr.apache.org list.  "dev" is for internal
development.

On Thu, May 16, 2024 at 3:12 PM Perera, Priyantha 
wrote:

> We have a Sitecore 10.3 installation running Solr 8.11.2.
>
> We have just been informed that there is a vulnerability related to Oracle
> JRE 1.8.0.381.
>
> What Java version is supported for Solr 8.11.2?
>
> Priyantha Perera
> Sr Implementation Engineer
> Wescom Credit Union
> wescom.org
> 123 South Marengo Ave
> Pasadena, California 91101
> pper...@wescom.org
> Phone: 888-493-7266 x8822
>


CHANGES.txt, process improvement solicitation

2024-05-16 Thread David Smiley
Managing CHANGES.txt in each PR/change we do is a pain.  It's so
merge-conflict prone.  I don't mean to call for the removal of
CHANGES.txt (although I've wished for this off-and-on), but want to
solicit inputs on what can be done to make this easier.

I could be mistaken but maybe it was Calvin Cowie that recommended a
scheme something like the following: each change/PR merely adds its
own txt file to a new CHANGES directory instead of adding to
CHANGES.txt.  Then it will be aggregated / concatenated to CHANGES.txt
at a release boundary by the RM using a script.  The per-change file
then goes away.  The category (e.g. New Feature vs Bug Fix) would need
to be encoded somewhere, like maybe in the file name.  Even the JIRA
number could be part of the file name and not its content.  No ad-hoc
newline use either; just write the message and the script will
line-wrap it.  This is probably the simplest and least disruptive
change.
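
To make the idea concrete, here is a minimal sketch of what such an
aggregation script might look like.  Everything specific in it is an
assumption for illustration: the "<category>--<JIRA id>.txt" file naming,
the category keys, and the output layout are hypothetical, not an existing
Solr convention.  It takes the per-change files as a mapping of file name to
message; reading them from a CHANGES directory is left out.

```python
import textwrap
from collections import defaultdict

# Hypothetical naming convention: "<category>--<JIRA id>.txt",
# e.g. "bug-fixes--SOLR-17275.txt".  Nothing in Solr defines this today.
CATEGORY_TITLES = {
    "new-features": "New Features",
    "improvements": "Improvements",
    "bug-fixes": "Bug Fixes",
}


def aggregate_changes(entries, width=78):
    """Build a CHANGES.txt section from {file name: message} pairs."""
    by_category = defaultdict(list)
    for file_name, message in entries.items():
        stem = file_name.rsplit(".", 1)[0]      # drop the ".txt" extension
        category, jira = stem.split("--", 1)    # category and JIRA id from the name
        # Collapse ad-hoc newlines and spaces; the script owns line wrapping.
        text = " ".join(message.split())
        by_category[category].append((jira, text))

    lines = []
    for category, title in CATEGORY_TITLES.items():
        if category not in by_category:
            continue
        lines += [title, "-" * len(title)]
        for jira, text in sorted(by_category[category]):
            lines.append(
                textwrap.fill(f"* {jira}: {text}", width=width, subsequent_indent="  ")
            )
        lines.append("")
    return "\n".join(lines)
```

Because each change lives in its own file, two PRs never touch the same
lines, which is the whole point: the merge conflicts go away and only the
RM's script writes to CHANGES.txt.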

I'm kind of envious of small projects that can simply rely on GitHub's
release notes generator [1].  Yeah it's just the first-line commit
message, which emphasizes brevity over clarity.

[1] 
https://docs.github.com/en/repositories/releasing-projects-on-github/automatically-generated-release-notes

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley




Re: [DISCUSS] Community Virtual Meetup, May 2024

2024-05-14 Thread David Smiley
23rd or any.  Possible discussion topic: SolrJ API backwards compatibility.

On Tue, May 14, 2024 at 1:47 AM raghavan m  wrote:
>
> Hello everyone
>
> I am soliciting votes for the following days and times. Please reply to
> this thread with your preferred 1) *date and time options from below *and
> 2) *topic you want to discuss *
>
> *Voting closes May 17th 11:59 pm*
>
> *Options*
> 1. 05/22/2024 9 am pacific time
> 2. 05/23/2024 9 am pacific time
> 3. 05/24/2024 9 am pacific time
> 4. 05/27/2024 9 am pacific time
>
> thanks,
> *Raghavan*
>
>
> On Sun, May 12, 2024 at 7:03 PM raghavan m  wrote:
>
> > Thanks Jason.
> > I will start a thread and create a page, to collect topics.
> > *Raghavan*
> >
> >
> > On Sun, May 12, 2024 at 7:00 PM Jason Gerlowski 
> > wrote:
> >
> >> Raghavan - absolutely, thanks for stepping up!  (And welcome back to
> >> the country!).
> >>
> >> On Thu, May 9, 2024 at 1:51 PM raghavan m  wrote:
> >> >
> >> > Hey Jason
> >> > I am back in the country. Can I volunteer?
> >> >
> >> > Sent from iPhone
> >> >
> >> >
> >> > On Thu, May 9, 2024 at 9:35 AM Jason Gerlowski 
> >> > wrote:
> >> >
> >> > > Hey all,
> >> > >
> >> > > It's time once again to start thinking ahead to this month's virtual
> >> > > meetup!
> >> > >
> >> > > As always, two questions:
> >> > >
> >> > > 1. Does anyone have an interest in organizing?  Duties are light but
> >> > > it's an important job.  I'm happy to organize by default if there
> >> > > aren't any volunteers by mid-next-week.  (Addtl details:
> >> > > https://cwiki.apache.org/confluence/display/SOLR/Meeting+notes)
> >> > >
> >> > > 2. Does anyone have preferences on the date or time-of-day?
> >> > >
> >> > > Best,
> >> > >
> >> > > Jason
> >> > >
> >> > >
> >> > >
> >>
> >>
> >>




Re: Synonyms

2024-05-14 Thread David Smiley
I don't think so but the comma and any other char can be escaped with
a backslash.
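
If the synonym file is generated from a list of substance names, the
escaping can be done at generation time.  A minimal sketch, assuming only
commas and backslashes need escaping (the helper names are made up, and "TDI"
is just an example synonym); it does not handle "=>" mappings, and whether
the comma survives analysis still depends on the tokenizer in use:

```python
def escape_synonym_term(term):
    """Backslash-escape backslashes and commas in a single synonym term."""
    return term.replace("\\", "\\\\").replace(",", "\\,")


def synonym_line(terms):
    """Join already-split terms into one comma-separated synonyms line."""
    return ",".join(escape_synonym_term(term) for term in terms)


# Prints: 2\,4-Toluylendiisocyanate,TDI
print(synonym_line(["2,4-Toluylendiisocyanate", "TDI"]))
```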

On Tue, May 14, 2024 at 10:51 AM Bauchwitz, Leonardo  wrote:
>
> Hello,
>
>
> I'm sorry if I ask a very basic question. I'm setting up a synonym file for 
> chemical substances. The problem is that several of these substances have 
> names like this: 2,4-Toluylendiisocyanate. There are commas included in the 
> name, but the synonym file uses the comma as a separator for synonyms. Is it 
> possible to change the separator used by Solr's SynonymGraphFilterFactory?
>
> Regards
> Leonardo F. Bauchwitz




Re: [DISCUSS] Solr 9.6.1 release

2024-05-14 Thread David Smiley
Thanks for volunteering!
https://issues.apache.org/jira/browse/SOLR-17275 is a blocker IMO.
It's being looked at now.

On Tue, May 14, 2024 at 3:38 PM Houston Putman  wrote:
>
> Hey everyone,
>
> I've finished up two tickets (SOLR-17049 and SOLR-17261) that I believe are
> serious enough bugs to release a 9.6.1. I think SOLR-17049 is pretty
> long-standing, but SOLR-17261 was introduced in 9.5, so definitely not
> something that people have learned to live with.
>
> I'm planning on starting the release process next Monday, May 20, unless
> anyone has an objection. That will give everyone some time to get other bug
> fixes in that they may need.
>
> - Houston




Removing ZK Watches

2024-05-14 Thread David Smiley
Apparently ZooKeeper 3.5 implements the removal of Watchers[1] (thanks
Ilan for this FYI).
It seems helpful for SolrCloud to remove watchers that are no longer
pertinent.  This isn't an area I'm familiar with.  I see that the
Curator framework supports it too.  Has anyone else looked into using
this?  For example to manage state.json watchers.  Granted for that
one, you may need a local ref-count, probably.  Or use one Watcher per
replica; not sure if that's a scale concern.  Also curious if others
have considered what it would look like for Curator to manage watching
collection states -- like what utilities Curator has that would be
well suited.
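
To illustrate the local ref-count idea, here is a rough sketch.  It is
deliberately not tied to ZooKeeper or Curator: the add_watch / remove_watch
callbacks are hypothetical hooks standing in for whatever actually registers
and removes the watcher, and the class name is made up.

```python
import threading
from collections import Counter


class WatchRefCounter:
    """Registers a watch on first interest and removes it on last release."""

    def __init__(self, add_watch, remove_watch):
        self._add_watch = add_watch          # hypothetical hook: register watcher
        self._remove_watch = remove_watch    # hypothetical hook: remove watcher
        self._counts = Counter()
        self._lock = threading.Lock()

    def acquire(self, path):
        with self._lock:
            self._counts[path] += 1
            if self._counts[path] == 1:
                # First interested party: actually set the watch.
                self._add_watch(path)

    def release(self, path):
        with self._lock:
            if self._counts[path] == 0:
                raise ValueError(f"no active watch for {path}")
            self._counts[path] -= 1
            if self._counts[path] == 0:
                # Last interested party is gone: the watch is no longer pertinent.
                del self._counts[path]
                self._remove_watch(path)
```

With something like this, N local replicas of a collection share one
state.json watcher, and it is only removed when the last of them goes away.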

[1] 
https://zookeeper.apache.org/doc/current/zookeeperProgrammers.html#sc_WatchRemoval

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley




Re: SolrJ collection creation API, replica type specificity

2024-05-08 Thread David Smiley
For handlers on a core or collection, defaults are easy to specify in
a  in solrconfig.xml.  It'd be cool if we could have a
solr.xml that specifies handlers, and with deeper specificity to the
operations.  Kind of a shame to have these dispatching handlers; if it
were at the admin operation level then we'd get cool features for free
like distinct metrics.

On Fri, May 3, 2024 at 3:09 PM David Smiley  wrote:
>
> I totally understand that the client should be empowered to be
> specific, and it is right now.  But I also think we should support the
> client being unspecific, and instead allow Solr service owners via
> Solr-side configuration to choose what makes sense.  Where I work,
> there are different teams between client and server, the
> people/service at the client don't care about Solr infrastructure
> specifics and new-fangled options (PRS being another) and replica
> types.  Updating their client to tweak options around this is
> annoying.  They just want a collection to be created, even with an
> assumed configSet as this Solr cluster is only for servicing the needs
> of that client.  The Solr service owner (me) is responsible for Solr
> specifics.  One could imagine, for one app, assuming TLOG and PULL, and
> for another, both NRT, or whatever really.
>
> On Fri, May 3, 2024 at 2:58 PM Jason Gerlowski  wrote:
> >
> > You didn't mention it by name, but it sounds like you're talking about
> > the v1 API's "replicationFactor" parameter (which has defaulted to
> > creating NRT replicas for a while now)?
> >
> > Personally, I'd rather see that parameter (and corresponding SolrJ
> > code) go away altogether.  Some things (e.g. the configset name, the
> > number of shards) are important enough that we force users to be
> > explicit about them...IMO the number and type of replicas fall into
> > that category.
> >
> > But while the ambiguous "replicationFactor" parameter exists I think
> > some sort of "default replica type" concept makes sense.  (Granted
> > that we find a way to handle certain complications, like how "PULL"
> > replicas *must* be used in conjunction with replicas of other
> > leader-eligible replica types.)
> >
> > On Wed, May 1, 2024 at 2:32 PM David Smiley  wrote:
> > >
> > > In the interests of supporting different replica types better, I'd
> > > like our SolrJ CollectionAdminRequest methods to not *locally* assume
> > > NRT when creating a request.  Calls like createCollection(collection
> > > name, numShards, numReplicas) are nicely ambiguous as to the type, nor
> > > do javadocs indicate what the type is.  This is good, I think.  Yet
> > > the default behavior is to create a v1 API call that specifies how
> > > many NRT replicas (yes of this specific type) to make.  Instead, I'd
> > > like to see the replica type decision made by the server (Solr).
> > > Today it also assumes NRT but I could imagine something as simple as
> > > EnvUtils (env var / sys prop) deciding what the default type should
> > > be.  So far this is merely changing CollectionAdminRequest and
> > > consequently the specificity of its v1 requests.  It'd be followed up
> > > by improving a number of tests to be less specific to NRT.  Any
> > > concerns here?
> > >
> > > *Actually* using other replica types (like TLOG or ZERO) may raise
> > > issues for some tests beyond this, sure.  In particular, many tests
> > > assume read-your-write (index, commit, query --> find it) but this
> > > won't hold if the query randomly routes to a non-leader.  For this I'm
> > > thinking an automatically applied
> > > shards.preference=replica.leader:true
> > > https://solr.apache.org/guide/solr/latest/deployment-guide/solrcloud-distributed-requests.html#shards-preference-parameter
> > > -- only when the default replica type isn't NRT.
> > >
> > > ~ David Smiley
> > > Apache Lucene/Solr Search Developer
> > > http://www.linkedin.com/in/davidwsmiley
> > >
> > >
> >
> >




Re: SolrJ collection creation API, replica type specificity

2024-05-03 Thread David Smiley
I totally understand that the client should be empowered to be
specific, and it is right now.  But I also think we should support the
client being unspecific, and instead allow Solr service owners via
Solr-side configuration to choose what makes sense.  Where I work,
there are different teams between client and server, the
people/service at the client don't care about Solr infrastructure
specifics and new-fangled options (PRS being another) and replica
types.  Updating their client to tweak options around this is
annoying.  They just want a collection to be created, even with an
assumed configSet as this Solr cluster is only for servicing the needs
of that client.  The Solr service owner (me) is responsible for Solr
specifics.  One could imagine, for one app, assuming TLOG and PULL, and
for another, both NRT, or whatever really.
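
As a rough sketch of the server-side decision I have in mind: when the
client says nothing about types, the server picks one from its own
configuration.  The environment variable name SOLR_DEFAULT_REPLICA_TYPE and
the function below are hypothetical, purely for illustration; no such
setting exists today, and a mixed default such as TLOG plus PULL would need
a richer setting than this.

```python
import os

REPLICA_TYPES = ("NRT", "TLOG", "PULL")


def resolve_replica_counts(num_replicas, explicit=None, env=None):
    """Return {replica type: count} for a collection-creation request."""
    env = os.environ if env is None else env
    if explicit:
        # The client was specific; honor it as-is.
        return dict(explicit)
    # Hypothetical server-side setting; falls back to today's NRT behavior.
    default_type = env.get("SOLR_DEFAULT_REPLICA_TYPE", "NRT").upper()
    if default_type not in REPLICA_TYPES:
        raise ValueError(f"unknown replica type: {default_type}")
    if default_type == "PULL":
        # PULL replicas cannot lead, so they cannot be the only type.
        raise ValueError("PULL must be combined with a leader-eligible type")
    return {default_type: num_replicas}
```

The client code stays the same across clusters; only the service owner's
configuration differs.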

On Fri, May 3, 2024 at 2:58 PM Jason Gerlowski  wrote:
>
> You didn't mention it by name, but it sounds like you're talking about
> the v1 API's "replicationFactor" parameter (which has defaulted to
> creating NRT replicas for a while now)?
>
> Personally, I'd rather see that parameter (and corresponding SolrJ
> code) go away altogether.  Some things (e.g. the configset name, the
> number of shards) are important enough that we force users to be
> explicit about them...IMO the number and type of replicas fall into
> that category.
>
> But while the ambiguous "replicationFactor" parameter exists I think
> some sort of "default replica type" concept makes sense.  (Granted
> that we find a way to handle certain complications, like how "PULL"
> replicas *must* be used in conjunction with replicas of other
> leader-eligible replica types.)
>
> On Wed, May 1, 2024 at 2:32 PM David Smiley  wrote:
> >
> > In the interests of supporting different replica types better, I'd
> > like our SolrJ CollectionAdminRequest methods to not *locally* assume
> > NRT when creating a request.  Calls like createCollection(collection
> > name, numShards, numReplicas) are nicely ambiguous as to the type, nor
> > do javadocs indicate what the type is.  This is good, I think.  Yet
> > the default behavior is to create a v1 API call that specifies how
> > many NRT replicas (yes of this specific type) to make.  Instead, I'd
> > like to see the replica type decision made by the server (Solr).
> > Today it also assumes NRT but I could imagine something as simple as
> > EnvUtils (env var / sys prop) deciding what the default type should
> > be.  So far this is merely changing CollectionAdminRequest and
> > consequently the specificity of its v1 requests.  It'd be followed up
> > by improving a number of tests to be less specific to NRT.  Any
> > concerns here?
> >
> > *Actually* using other replica types (like TLOG or ZERO) may raise
> > issues for some tests beyond this, sure.  In particular, many tests
> > assume read-your-write (index, commit, query --> find it) but this
> > won't hold if the query randomly routes to a non-leader.  For this I'm
> > thinking an automatically applied
> > shards.preference=replica.leader:true
> > https://solr.apache.org/guide/solr/latest/deployment-guide/solrcloud-distributed-requests.html#shards-preference-parameter
> > -- only when the default replica type isn't NRT.
> >
> > ~ David Smiley
> > Apache Lucene/Solr Search Developer
> > http://www.linkedin.com/in/davidwsmiley
> >
> >
>
>




Re: PRS, important changes needed

2024-05-02 Thread David Smiley
Thanks for the update and willingness to help on this journey, Justin!
Maybe an umbrella JIRA would make sense for tracking purposes, with
child issues linked or added as sub-tasks.  Perhaps the goal is "PRS
enabled by default".

I don't know if this thread is the best place to discuss it but having
the PRS znode be ephemeral would be really beneficial to dramatically
strengthen the goals of PRS by efficiently handling restarts.  No need
to mark a node's replicas as down!

On Thu, May 2, 2024 at 3:45 PM Justin Sweeney
 wrote:
>
> We (Fullstory) have been running PRS for quite a while now with great
> stability and a huge performance benefit for us particularly in terms of
> cluster restarts. That said, our use case certainly isn't everyone's use
> case. We run large clusters with lots of cores so we get a particular
> benefit. My expectation in the current state is that as far as performance
> it will help some use cases without hurting any use case.
>
> I don't know what timing will look like but I am +1 with David on moving to
> PRS in Solr 10 as it would make code maintenance much better going forward.
> Both myself and other devs at Fullstory can definitely make contributions
> toward getting PRS to a state where others also are getting the performance
> benefits and feel comfortable with this decision.
>
> I can look at adding some JIRA issues along these lines and would be happy to
> discuss more along the way.
>
> On Thu, May 2, 2024 at 1:59 PM Houston Putman  wrote:
>
> > I'm all for moving towards it if it has both (or at least a good tradeoff
> > between):
> >
> >- A proven stability, like the current implementation
> >- A noted increase in performance for common use cases
> >
> > It seems to me that without the performance benefits, the loss in stability
> > (PRS has had a few bad bugs in 9x releases) is worrisome.
> > I'd be very happy to move to PRS if we can improve it to give us concrete
> > benefits, but until then I'm not in favor of making it the default.
> >
> > Maybe the ephemeral node for replica state
> > <https://github.com/apache/solr/pull/2432#discussion_r158684> is the
> > logic we really need to make PRS "pop", but I haven't thought about it
> > a ton.
> >
> > - Houston
> >
> > On Thu, May 2, 2024 at 12:52 PM David Smiley  wrote:
> >
> > > Note that PRS has existed for all of the 9x series.  I say in 10,
> > > let's finally move on.  Be bold.
> > >
> > > On Thu, May 2, 2024 at 1:19 PM Ilan Ginzburg  wrote:
> > > >
> > > > > There is no plan to remove the non PRS way to manage replica state
> > > > > before making PRS the default way to manage replica state (in
> > > > > addition to the current state.json option) then letting PRS bake for
> > > > > a while with all new deployments (for example a whole release), right?
> > > >
> > > > Ilan
> > > >
> > > >
> > > >
> > > > > On Thu, May 2, 2024 at 6:25 PM David Smiley wrote:
> > > >
> > > > > In the meetup, my colleague Aparna shared her explorative findings of
> > > > > enabling PRS, which uncovered 2 matters that seem to defeat much of
> > > > > PRS's idealized benefits:
> > > > > * Shard leader elections still touch state.json
> > > > > * Replica state is still in state.json
> > > > > Given those two matters, we didn't notice any improvement (or
> > > > > regression).  Some FullStory devs, who use this mode in production,
> > > > > shared that the first matter wasn't noticed by them because they only
> > > > > run with one replica per shard.  The other... was unclear why; maybe
> > > > > for backwards-compatibility?  In my mind, there shouldn't be such a
> > > > > > concern as it's enabled per-collection and you'd only do this once
> > > > > > all servers are known to be PRS-enabled (e.g. have a modern Solr
> > > > > > version).
> > > > >
> > > > > If we can identify JIRA issues to capture the work involved, we could
> > > > > converse more and track the work to completion.  I think it's
> > > > > important to get to a future in Solr 10 where there is one mode (PRS)
> > > > > not two.
> > > > >
> > > > > ~ David Smiley
> > > > > Apache Lucene/Solr Search Developer
> > > > > http://www.linkedin.com/in/davidwsmiley
> > > > >
> > > > > -
> > > > > To unsubscribe, e-mail: dev-unsubscr...@solr.apache.org
> > > > > For additional commands, e-mail: dev-h...@solr.apache.org
> > > > >
> > > > >
> > >
> > > -
> > > To unsubscribe, e-mail: dev-unsubscr...@solr.apache.org
> > > For additional commands, e-mail: dev-h...@solr.apache.org
> > >
> > >
> >




Re: PRS, important changes needed

2024-05-02 Thread David Smiley
Note that PRS has existed for all of the 9x series.  I say in 10,
let's finally move on.  Be bold.

On Thu, May 2, 2024 at 1:19 PM Ilan Ginzburg  wrote:
>
> There is no plan to remove the non PRS way to manage replica state before
> making PRS the default way to manage replica state (in addition to the
> current state.json option) then letting PRS bake for a while with all new
> deployments (for example a whole release), right?
>
> Ilan
>
>
>
> On Thu, May 2, 2024 at 6:25 PM David Smiley  wrote:
>
> > In the meetup, my colleague Aparna shared her explorative findings of
> > enabling PRS, which uncovered 2 matters that seem to defeat much of
> > PRS's idealized benefits:
> > * Shard leader elections still touch state.json
> > * Replica state is still in state.json
> > Given those two matters, we didn't notice any improvement (or
> > regression).  Some FullStory devs, who use this mode in production,
> > shared that the first matter wasn't noticed by them because they only
> > run with one replica per shard.  The other... was unclear why; maybe
> > for backwards-compatibility?  In my mind, there shouldn't be such a
> > concern as it's enabled per-collection and you'd only do this once all
> > servers are known to be PRS-enabled (e.g. have a modern Solr version).
> >
> > If we can identify JIRA issues to capture the work involved, we could
> > converse more and track the work to completion.  I think it's
> > important to get to a future in Solr 10 where there is one mode (PRS)
> > not two.
> >
> > ~ David Smiley
> > Apache Lucene/Solr Search Developer
> > http://www.linkedin.com/in/davidwsmiley
> >
> >
> >




PRS, important changes needed

2024-05-02 Thread David Smiley
In the meetup, my colleague Aparna shared her explorative findings of
enabling PRS, which uncovered 2 matters that seem to defeat much of
PRS's idealized benefits:
* Shard leader elections still touch state.json
* Replica state is still in state.json
Given those two matters, we didn't notice any improvement (or
regression).  Some FullStory devs, who use this mode in production,
shared that the first matter wasn't noticed by them because they only
run with one replica per shard.  The other... was unclear why; maybe
for backwards-compatibility?  In my mind, there shouldn't be such a
concern as it's enabled per-collection and you'd only do this once all
servers are known to be PRS-enabled (e.g. have a modern Solr version).

If we can identify JIRA issues to capture the work involved, we could
converse more and track the work to completion.  I think it's
important to get to a future in Solr 10 where there is one mode (PRS)
not two.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley




Re: solr query alerting

2024-05-01 Thread David Smiley
Luwak is good to me!

On Tue, Apr 30, 2024 at 4:01 PM Luke Kot-Zaniewski (BLOOMBERG/ 919 3RD
A)  wrote:
>
> I love the name "luwak"! I was about to suggest the same but was worried 
> about the trademark concerns and I assumed there was a reason they changed 
> the name when donating it to lucene.
>
> From: dev@solr.apache.org At: 04/30/24 15:56:22 UTC-4:00
> To: dev@solr.apache.org
> Subject: Re: solr query alerting
>
> Luwak is the original name of the Lucene monitor, contributed by Flax back in
> the days: https://github.com/flaxsearch/luwak
>
> Perhaps we could go full circle (if no trademark issues) to call it the Solr
> luwak module? Luwak is a type of coffee, and thus related to percolator.
>
> Otherwise “stored-queries” is an option.
>
> Jan Høydahl
>
> > 30. apr. 2024 kl. 19:26 skrev David Smiley :
> >
> > I agree the feature is relevant / useful.
> >
> > Another angle on the module vs sandbox or wherever else is maintenance
> > cost.  If a lot of code is being contributed as is here, then as a PMC
> > member I hope to get a subjective sense that folks are interested in
> > maintaining it.  On one hand we have a number of committers here from
> > Bloomberg, yet the abandoned and now-removed "analytics" component
> > shows that abandonment is a risk nonetheless.  I don't know how to
> > conclude this thought but I'm hoping to hear from folks that they
> > intend to look after this module.  It's not just being "thrown over
> > the wall", so to speak.
> >
> > Naming is hard...
> > * ...-monitor-: sorry I hate it
> > * ...-percolator- No clue why this was chosen for ElasticSearch.
> > I can appreciate a curious/non-obvious name like this that is not
> > going to conflict with anyone's guesses at what a general name might
> > convey.
> > * "indexed-queries" or "query-indexing" would be a good name?  This is
> > the best technical name I can think of.
> > *  "reverse search" came to mind (based on the Netflix article)
> > although that makes me think of leading-wildcard / suffix-search.
> > * "inverted-search"
> > *  "indexed-query-alerts" incorporates "alerts" thus might better
> > convey the use-case
> >
> >> On Mon, Apr 1, 2024 at 3:53 PM Luke Kot-Zaniewski (BLOOMBERG/ 919 3RD
> >> A)  wrote:
> >>
> >> Hi All,
> >>
> >> A few months ago I wrote the user list about potentially integrating lucene
> >> monitor into solr. I have raised this PR with a first attempt at implementing
> >> this integration. I'd greatly appreciate any feedback on this even though I
> >> still have it marked as draft. I want to make sure I'm heading in the right
> >> direction here so input from solr dev community would be extremely valuable
> >> :-)
> >>
> >> Many thanks,
> >> Luke
> >
> >
>
>




Failed startup behavior

2024-04-30 Thread David Smiley
My colleague Vincent and I noticed (he in more detail, including the root
cause) that if Solr's CoreContainer fails to start up for almost any
reason, Solr only logs an error (in SolrDispatchFilter).  From Jetty's
perspective, startup has succeeded, thus the server is running.  In a
test, this can look like a hang: Jetty is running but Solr rejects all
requests.  Shouldn't it propagate the exception and fail startup?

https://github.com/apache/solr/blob/242e6c861b8fdf4ac75ed9e1b083f3c2f18f6c40/solr/core/src/java/org/apache/solr/servlet/SolrDispatchFilter.java#L163
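
To spell out the two behaviors being contrasted, here is a toy sketch.  It
is not Solr or Jetty code; ServerSketch, init_cores, and the "rejected"
result are hypothetical stand-ins for the container, the CoreContainer
startup, and whatever Solr actually returns to clients.

```python
import logging

log = logging.getLogger("startup-sketch")


class ServerSketch:
    """Hypothetical stand-in for a servlet container hosting one filter."""

    def __init__(self, init_cores, fail_fast):
        self._init_cores = init_cores  # callable that may raise
        self._fail_fast = fail_fast
        self.cores = None

    def start(self):
        try:
            self.cores = self._init_cores()
        except Exception:
            if self._fail_fast:
                raise  # the container sees the failure and startup aborts
            # Log-only: the container believes startup succeeded.
            log.exception("Could not start core container")

    def handle(self, request):
        # With log-only startup, every request is rejected from here on.
        return "rejected" if self.cores is None else f"served {request}"
```

With fail_fast=False the process stays up and looks healthy from the
outside, which is what makes a test appear to hang; with fail_fast=True the
caller of start() gets the exception immediately.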

I could be wrong but I thought this behavior was new since a big
refactor of SolrDispatchFilter by Gus a couple of years ago.  Yet the
pertinent lines of code go way back to Mark in 2014, so who knows.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley




Re: solr query alerting

2024-04-30 Thread David Smiley
I agree the feature is relevant / useful.

Another angle on the module vs sandbox or wherever else is maintenance
cost.  If a lot of code is being contributed, as is the case here, then as
a PMC member I hope to get a subjective sense that folks are interested in
maintaining it.  On one hand we have a number of committers here from
Bloomberg, yet the abandoned and now-removed "analytics" component
shows that abandonment is a risk nonetheless.  I don't know how to
conclude this thought but I'm hoping to hear from folks that they
intend to look after this module.  It's not just being "thrown over
the wall", so to speak.

Naming is hard...
* ...-monitor-: sorry I hate it
* ...-percolator- No clue why this was chosen for ElasticSearch.
I can appreciate a curious/non-obvious name like this that is not
going to conflict with anyone's guesses at what a general name might
convey.
* "indexed-queries" or "query-indexing" would be a good name?  This is
the best technical name I can think of.
*  "reverse search" came to mind (based on the Netflix article)
although that makes me think of leading-wildcard / suffix-search.
* "inverted-search"
*  "indexed-query-alerts" incorporates "alerts" thus might better
convey the use-case

On Mon, Apr 1, 2024 at 3:53 PM Luke Kot-Zaniewski (BLOOMBERG/ 919 3RD
A)  wrote:
>
> Hi All,
>
> A few months ago I wrote the user list about potentially integrating lucene 
> monitor into solr. I have raised this PR with a first attempt at implementing 
> this integration. I'd greatly appreciate any feedback on this even though I 
> still have it marked as draft. I want to make sure I'm heading in the right 
> direction here so input from solr dev community would be extremely valuable 
> :-)
>
> Many thanks,
> Luke




Re: Tracking contributors uniquely

2024-04-26 Thread David Smiley
LOL meanwhile I posted https://github.com/apache/solr/pull/2424 for
the script I developed and improved today.
I think CHANGES.txt is the best source for a release centric view
while git log is best for project health metrics.
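
For the git log side, a rough sketch of counting authors and
Co-authored-by trailers might look like the following.  It is not the
script from either PR; it assumes log text produced with something
like `git log --format='Author: %an <%ae>%n%B'`, and the function name
and sample log are hypothetical.

```python
import re
from collections import Counter

# Assumed input: one "Author: Name <email>" line per commit, followed by
# the raw commit message (which may carry Co-authored-by trailers).
AUTHOR = re.compile(r"^Author: (.+?) <[^>]*>$", re.MULTILINE)
CO_AUTHOR = re.compile(
    r"^Co-authored-by: (.+?) <[^>]*>$", re.MULTILINE | re.IGNORECASE
)


def count_authors(log_text):
    """Count how often each name appears as author or co-author."""
    counts = Counter()
    for pattern in (AUTHOR, CO_AUTHOR):
        counts.update(name.strip() for name in pattern.findall(log_text))
    return counts


# Made-up sample; real input would come from git log.
SAMPLE_LOG = """\
Author: Jane Doe <jane@example.org>
SOLR-00001: Fix a thing

Co-authored-by: John Roe <john@example.org>
Author: John Roe <john@example.org>
SOLR-00002: Another change
"""

for name, commits in sorted(count_authors(SAMPLE_LOG).items()):
    print(name, commits)
```

Counting by name alone will split one person across spelling
variants, which is the "tracking uniquely" problem this thread started
with.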

On Fri, Apr 26, 2024 at 4:38 PM Jan Høydahl  wrote:
>
> I think it is a good idea to include a list of contributors in the release
> note mail. It is a tiny encouragement for folks to contribute more. The list
> should perhaps exclude committers, so we only list external contributors?
>
> I already added a script to dev-tools to parse SolrBot contributions from git 
> log and add to CHANGES:
> https://github.com/apache/solr/blob/main/dev-tools/scripts/addDepsToChanges.py
>
> Based on this I did a similar script that parses out Authors and 
> Co-Authored-By from git log
> since last release, see https://github.com/apache/solr/pull/2423 for a Draft.
>
> There's a risk of this method missing the names of some contributors who did 
> not actually commit anything to a PR but still are listed in CHANGES for the 
> release. That can be fixed by us being more careful when merging PRs, and 
> when committing patches from JIRA.
>
> Jan
>
> > 26. apr. 2024 kl. 15:39 skrev David Smiley :
> >
> > On Fri, Apr 26, 2024 at 9:35 AM Gus Heck  wrote:
> >>
> >> I don't know if it's relevant, but I recall that back in the early 2000's
> >> around the time of the adoption of the ASL 2.0 (when I was contributing to
> >> Ant) the ASF had us stop using @author tags in code. I was not a fan at the
> >> time, but they had some reason I don't fully recall relating to shielding
> >> the contributors in the event of someone hitting a bug and then trying to
> >> sue folks to recover losses or something. I wonder if that logic still
> >> exists, and if this could be seen as related to that. It's also possible
> >> that this memory has severely mutated while hanging out in the back of my
> >> brain for 20 years :).
> >
> > The context of the name appearing as I propose in a "thank you" is
> > merely to thank them, not to indirectly hold them to stability/quality
> > measures.
> >
> > I don't think it's related.  @author tags can repel a collaborative
> > ownership mindset on a specific bit of code.  I used to @author my
> > code out of pride but long ago I realized those tags are a bad idea
> > and also kind of needless with git-blame anyway.
> >
> >
>




Re: Ideas for release notes for 9.6

2024-04-26 Thread David Smiley
I added a "Thanks to all contributors" section at the end based on the
9.6 CHANGES.txt.

On Mon, Apr 22, 2024 at 10:48 PM Gus Heck  wrote:
>
> Initial release notes have been drafted here, please flesh out, refine,
> copy edit as needed.
>
> https://cwiki.apache.org/confluence/display/SOLR/ReleaseNote9_6_0
>
> -Gus
>
> --
> http://www.needhamsoftware.com (work)
> https://a.co/d/b2sZLD9 (my fantasy fiction book)




Re: Tracking contributors uniquely

2024-04-26 Thread David Smiley
On Fri, Apr 26, 2024 at 9:35 AM Gus Heck  wrote:
>
> I don't know if it's relevant, but I recall that back in the early 2000's
> around the time of the adoption of the ASL 2.0 (when I was contributing to
> Ant) the ASF had us stop using @author tags in code. I was not a fan at the
> time, but they had some reason I don't fully recall relating to shielding
> the contributors in the event of someone hitting a bug and then trying to
> sue folks to recover losses or something. I wonder if that logic still
> exists, and if this could be seen as related to that. It's also possible
> that this memory has severely mutated while hanging out in the back of my
> brain for 20 years :).

The context of the name appearing as I propose in a "thank you" is
merely to thank them, not to indirectly hold them to stability/quality
measures.

I don't think it's related.  @author tags can repel a collaborative
ownership mindset on a specific bit of code.  I used to @author my
code out of pride but long ago I realized those tags are a bad idea
and also kind of needless with git-blame anyway.




Re: Tracking contributors uniquely

2024-04-25 Thread David Smiley
The 9.6 release is upon us.  I'd like to find a way of highlighting
more prominently who contributed to the release in the release
announcement.  Something like:

  Thank you to those who contributed to this release:  David Smiley,
Gus Heck, Christine Poerschke

(Of course the actual list for 9.6 is much longer).  Back when I
started this thread, I wrote a script that I put up in a Gist:
https://gist.github.com/dsmiley/876f37089778d7d8abb49ef6121b4e1a that
parses CHANGES.txt, which I think is the ideal source for use in
featuring people in a release announcement, but not ideal for project
health / metrics.  It's very rough; I don't really know Python but due
to the miracles of ChatGPT, I get by.  It includes false-positives so
requires a bit of cleanup.  The regexp needs improvement to reduce
them, like requiring a newline after the trailing close-parenthesis.
The input should not be hard-coded either!  Ultimately I'm hoping the
script is cleaned up and tweaked to make it easy for the RM to run and
incorporate into the notes.  It can be contributed to
dev-tools/scripts.
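
To make the regexp idea concrete, here is a minimal sketch (not the
Gist's actual code).  It assumes each CHANGES.txt attribution is the
last parenthesized group on its line, with names separated by commas
or "via".  The function name, the pattern and the sample entries are
all made up for illustration.

```python
import re

# Assumption: an attribution is a parenthesized group that is directly
# followed by the end of its line, e.g. "(Jane Doe, John Roe via Someone)".
ATTRIBUTION = re.compile(r"\(([^()\n]+)\)[ \t]*$", re.MULTILINE)


def contributors(changes_text):
    """Return the sorted, de-duplicated names found in attributions."""
    names = set()
    for match in ATTRIBUTION.finditer(changes_text):
        # Split "A, B via C" into individual names.
        for name in re.split(r",|\bvia\b", match.group(1)):
            name = name.strip()
            if name:
                names.add(name)
    return sorted(names)


# Made-up sample in the general shape of CHANGES.txt entries.
SAMPLE = """\
* SOLR-00001: Fix a thing (see the ref guide) when it happens.
  (Jane Doe, John Roe via David Smiley)
* SOLR-00002: Another change (Gus Heck)
"""

print(contributors(SAMPLE))
```

Requiring the end of line after the close-parenthesis is what skips
"(see the ref guide)" in the first entry; parentheticals that happen
to end a line would still need manual cleanup.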

WDYT folks?




Re: [VOTE] Release Solr 9.6.0 RC1

2024-04-25 Thread David Smiley
False alarm; it's a test bug that randomly re-orders docs at merge.
I'll file a PR in a bit.

+1 vote for the release.

On Thu, Apr 25, 2024 at 9:56 PM David Smiley  wrote:
>
> I got a test failure that reproduces:
>  ./gradlew :solr:core:test --tests
> "org.apache.solr.uninverting.TestUninvertingReader.testSortedSetFloat"
> -Ptests.seed=5827A2FA13E7BE3C
> Based on GE 
> https://ge.apache.org/scans/tests?search.relativeStartTime=P90D=solr-root=America%2FNew_York=org.apache.solr.uninverting.TestUninvertingReader=testSortedSetFloat
> and 
> http://fucit.org/solr-jenkins-reports/history-trend-of-recent-failures.html#series/org.apache.solr.uninverting.TestUninvertingReader.testSortedSetInteger
> (notice looking at testSortedSetFloat vs testSortedSetInteger)
> these test methods on this suite have failed in the past since the
> start of this year.  I ran a "git bisect" exploration and the test
> broke as of SOLR-17097: Upgrade Solr to use Lucene 9.9.2 (#2176).
> That shipped in Solr 9.5.  I wish we had reproducer-detection /
> alerts.  I'll file a JIRA issue for this bug.
>
> As to the seriousness... well this affects anyone using our Legacy
> numerics (vs Points) and who uninverts them (i.e. didn't enable
> docValues, which people _should_ be doing but it's easy to forget).
>
> The Smoketest passed for me when I ran it a second time.
>




Re: [VOTE] Release Solr 9.6.0 RC1

2024-04-25 Thread David Smiley
I got a test failure that reproduces:
 ./gradlew :solr:core:test --tests
"org.apache.solr.uninverting.TestUninvertingReader.testSortedSetFloat"
-Ptests.seed=5827A2FA13E7BE3C
Based on GE 
https://ge.apache.org/scans/tests?search.relativeStartTime=P90D=solr-root=America%2FNew_York=org.apache.solr.uninverting.TestUninvertingReader=testSortedSetFloat
and 
http://fucit.org/solr-jenkins-reports/history-trend-of-recent-failures.html#series/org.apache.solr.uninverting.TestUninvertingReader.testSortedSetInteger
(notice looking at testSortedSetFloat vs testSortedSetInteger)
these test methods on this suite have failed in the past since the
start of this year.  I ran a "git bisect" exploration and the test
broke as of SOLR-17097: Upgrade Solr to use Lucene 9.9.2 (#2176).
That shipped in Solr 9.5.  I wish we had reproducer-detection /
alerts.  I'll file a JIRA issue for this bug.

As to the seriousness... well this affects anyone using our Legacy
numerics (vs Points) and who uninverts them (i.e. didn't enable
docValues, which people _should_ be doing but it's easy to forget).

The Smoketest passed for me when I ran it a second time.

On Thu, Apr 25, 2024 at 7:04 PM Tomás Fernández Löbbe
 wrote:
>
> +1 (binding)
>
> SUCCESS! [0:46:03.145181]
>
> On Thu, Apr 25, 2024 at 12:39 PM Anshum Gupta  wrote:
>
> > +1 (binding)
> >
> > Only ran the smoke tester and basic searching/indexing.
> >
> > On Tue, Apr 23, 2024 at 10:36 AM Gus Heck  wrote:
> >
> > > Please vote for release candidate 1 for Solr 9.6.0
> > >
> > > The artifacts can be downloaded from:
> > >
> > >
> > https://dist.apache.org/repos/dist/dev/solr/solr-9.6.0-RC1-rev-f8e5a93c11267e13b7b43005a428bfb910ac6e57
> > >
> > > You can run the smoke tester directly with this command:
> > >
> > > > python3 -u dev-tools/scripts/smokeTestRelease.py \
> > > >   https://dist.apache.org/repos/dist/dev/solr/solr-9.6.0-RC1-rev-f8e5a93c11267e13b7b43005a428bfb910ac6e57
> > >
> > > You can build a release-candidate of the official docker images (full &
> > > slim) using the following command:
> > >
> > > > SOLR_DOWNLOAD_SERVER=https://dist.apache.org/repos/dist/dev/solr/solr-9.6.0-RC1-rev-f8e5a93c11267e13b7b43005a428bfb910ac6e57/solr && \
> > > >   docker build $SOLR_DOWNLOAD_SERVER/9.6.0/docker/Dockerfile.official-full \
> > > >     --build-arg SOLR_DOWNLOAD_SERVER=$SOLR_DOWNLOAD_SERVER \
> > > >     -t solr-rc:9.6.0-1 && \
> > > >   docker build $SOLR_DOWNLOAD_SERVER/9.6.0/docker/Dockerfile.official-slim \
> > > >     --build-arg SOLR_DOWNLOAD_SERVER=$SOLR_DOWNLOAD_SERVER \
> > > >     -t solr-rc:9.6.0-1-slim
> > >
> > > The vote will be open for at least 72 hours i.e. until 2024-04-26 06:00
> > > UTC.
> > >
> > > [ ] +1  approve
> > > [ ] +0  no opinion
> > > [ ] -1  disapprove (and reason why)
> > >
> > > Here is my +1
> > >
> > >
> > > --
> > > http://www.needhamsoftware.com (work)
> > > https://a.co/d/b2sZLD9 (my fantasy fiction book)
> > >
> >




SolrJ HTTP SolrClient class hierarchy

2024-04-25 Thread David Smiley
Our SolrJ class hierarchy is looking rather confusing right now for
the HTTP ones especially.  This message is mostly a big FYI, with some
reflections and a recommendation or two.

SolrClient
- BaseHttpSolrClient (NOT yet deprecated but should be?)
- - HttpSolrClient  (based on Apache HttpClient; deprecated)
- - - DelegationTokenHttpSolrClient
- CloudSolrClient
- - CloudHttp2SolrClient
- - CloudLegacySolrClient (based on Apache HttpClient; deprecated)
- ConcurrentUpdateHttp2SolrClient
- - ...
- ConcurrentUpdateSolrClient (based on Apache HttpClient; deprecated)
- - ...
- HttpSolrClientBase (this is new)
- - Http2SolrClient
- - HttpJdkSolrClient (this is new; based on the JDK HttpClient)
- LBSolrClient
- - LBHttp2SolrClient
- - LBHttpSolrClient (based on Apache HttpClient; deprecated)

In retrospect, we can see that some past names weren't so great after
all.  I think our clients should reflect the vendor/source of the
HttpClient.  "HttpJdkSolrClient" is the newest client, and it reflects
the vendor (JDK provided HttpClient).  Personally I don't care enough
to rename all the ones with "2" in there to have "Jetty" but that's
what they are -- if it has a "2", it's using Jetty (and it supports
1.1 too; FYI the JDK client supports both 1.1 and 2).  The clients for
Apache HttpClient are all deprecated so perhaps we continue to leave
them be, mostly.  Removing them will take some time; they are
entrenched!  BaseHttpSolrClient (the parent of HttpSolrClient) is at
the moment even more confusing because HttpSolrClientBase was just
added.  BaseHttpSolrClient should be removed now; it only holds 2
static inner classes for RemoteSolrException and
RemoteExecutionException which should find a new home somewhere.
Since they are referenced so much, that will happen only in main.
HttpSolrClientBase is a tempting home but SolrClient would be fine, I
think.

Also, just because we have a nice new HttpJdkSolrClient doesn't mean
we can advise anyone to safely remove Apache & Jetty dependencies
*yet*!  We have no tests that this works, and a quick attempt I did
recently showed there are some obscure references still!  Modularizing
SolrJ further (for Jetty & Apache) will help reveal where we have some
references, after which we can finally free users of needing those
dependencies.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley




Replicas-on-demand

2024-04-23 Thread David Smiley
I’m soliciting interest / feedback on something.

Maybe some of you are familiar with transient-cores (SOLR-1028), an
LRU cache SolrCore mechanism that allows you to have an almost
unlimited number of cores on a node in name-only with a limited number
that are actually loaded at any one time.  It’s for use-cases where
the working set is significantly smaller than the total possible
addressable set.  Transient-cores only works in standalone Solr; I
tried to get it to work in SolrCloud but it proved difficult / buggy,
especially with leader election entanglements.  Furthermore, if we
imagine tens of thousands of replicas on a node, actually maintaining
that in SolrCloud / ZooKeeper is a ton of information and book-keeping
/ watching etc.
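
For anyone unfamiliar with the mechanism, the LRU idea can be sketched
roughly as below.  This is an illustration only, not Solr's
implementation: the class name and the load_core / close_core
callbacks are hypothetical stand-ins for whatever actually opens and
closes a core.

```python
from collections import OrderedDict


class TransientCoreCache:
    """Keeps at most `capacity` cores loaded; evicts the least recently used."""

    def __init__(self, capacity, load_core, close_core):
        self.capacity = capacity
        self.load_core = load_core    # hypothetical: core name -> loaded core
        self.close_core = close_core  # hypothetical: releases a loaded core
        self.loaded = OrderedDict()   # core name -> loaded core, oldest first

    def get(self, name):
        if name in self.loaded:
            self.loaded.move_to_end(name)  # mark as most recently used
            return self.loaded[name]
        core = self.load_core(name)        # known by name only until now
        self.loaded[name] = core
        if len(self.loaded) > self.capacity:
            _, evicted = self.loaded.popitem(last=False)  # least recently used
            self.close_core(evicted)
        return core


closed = []
cache = TransientCoreCache(
    2, load_core=lambda name: "core:" + name, close_core=closed.append
)
for core_name in ["a", "b", "a", "c"]:
    cache.get(core_name)
print(list(cache.loaded), closed)
```

The sketch ignores concurrency and in-flight requests on an evicted
core, which is part of what makes the real thing hard.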

I am imagining another approach where replicas are created and removed
on-demand and thus the underlying core as well.  And at a higher level
of abstraction (at the request level) that can make more informed
decisions than the “SolrCores”/transient-cores mechanism can.  If a
request comes in and we have 0 replicas and a shared file system,
create one similar to autoAddReplicas[1].  If we have 1 replica, maybe
we should asynchronously arrange for another to maintain good
availability.  If the core seems saturated with query activity, create
another (to have more than 2 total).  That might depend on /select vs
/update and be replica-type specific.  Meanwhile a node listener can
remove replicas that have not been used recently, especially to limit
the number of replicas per node.  It can consider the number of
*other* replicas for the same shard that exist, and leadership and
replica type in its decision.  Of course different users/apps might
want to tune such settings differently, and it doesn’t imply a shared
file system to be useful either.
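
As a rough sketch of the decisions described above, under assumptions
of my own: the function names, the Action values and the thresholds
are hypothetical, and failing a request when there are 0 replicas and
no shared file system is just one possible policy.

```python
from enum import Enum


class Action(Enum):
    NONE = "nothing to do"
    CREATE_NOW = "create a replica, then serve the request"
    CREATE_ASYNC = "serve the request, create a replica in the background"
    FAIL = "cannot serve the request"


def on_request(replicas, shared_fs, saturated):
    """Decide what to do when a request arrives for a shard."""
    if replicas == 0:
        # Assumption: without a shared file system there is no data to load.
        return Action.CREATE_NOW if shared_fs else Action.FAIL
    if replicas == 1:
        return Action.CREATE_ASYNC  # improve availability
    if saturated:
        return Action.CREATE_ASYNC  # go beyond 2 replicas under load
    return Action.NONE


def removable(idle_seconds, max_idle_seconds, other_replicas, is_leader):
    """Node-listener side: may this idle replica be removed?"""
    # Assumption: never drop the last replica or the leader.
    return idle_seconds > max_idle_seconds and other_replicas >= 1 and not is_leader


print(on_request(replicas=0, shared_fs=True, saturated=False))
print(removable(idle_seconds=600, max_idle_seconds=300, other_replicas=1, is_leader=False))
```

A real policy would also look at /select vs /update and replica type,
and with a shared file system it might well allow dropping to 0
replicas.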

To support such a feature, I don’t think much is needed of Solr.  The
request “demand driven” aspect suggests a new plugin type within
HttpSolrCall that resolves a request to a SolrCore, perhaps called a
“RequestToCoreResolver”, or we make HttpSolrCall itself more
extensible.  For the 0-replica scenario, CloudSolrClient probably
needs a little more tolerance to just get the request to Solr instead
of prematurely failing.  HttpShardHandler (for distributed-search)
might need the same.  There is no core-listener; it could be added, or we
do polling, or we extend SolrCores as a collaborating plugin.

One risk/concern is ensuring the core data is retained after the
replica is removed.  It’s not necessary to do that but if it’s
removed, then it’s expensive & slow to create it again — a problem if
there are no replicas to serve a request.  I haven’t yet checked on
the feasibility of keeping data lying around, and using it again when
re-creating the replica.

A significant motivation of mine with this proposal is to help SIP-20
“Zero Replicas” [2].  The biggest obstacle I see with it is that
unused cores are empty until queried, yet still are in state ACTIVE
the whole time (don’t even use the RECOVERY state).  A hack in
SearchHandler throws an exception and gets the data.  This stuff
doesn’t *need* to be in that branch (it’s not fundamental to Zero even
if it’s fundamental to how we scale with Zero right now) but a
replica-on-demand approach would obsolete that.

[1]:  Solr used to have an “autoAddReplicas” feature prior to Solr 9.
In Solr 9 a substitute was developed — CollectionsRepairEventListener.
Regardless, the idea is to create replicas automatically in response
to nodes going away (or maybe other circumstances) in order to
maintain a replicationFactor.  There are many references to
autoAddReplicas in CHANGES.txt; originally in SOLR-5656 it was
intended for shared file system but was later expanded to be more
general SOLR-10397.  In the case of a shared file system, you can even
reach 0 replicas and nonetheless create more later.

[2]: SIP-20 https://cwiki.apache.org/confluence/x/8YokEQ and branch
https://github.com/apache/solr/tree/jira/solr-17125-zero-replicas

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley




Re: Fix version 9.6 vs 9.6.0

2024-04-21 Thread David Smiley
Whoever created the JIRA "release" named "9.6.0" (and 9.5.0 for that
matter) made a small mistake; it should have been 9.6 (and 9.5) based
on past conventions.  You should simply re-assign the existing issues
using 9.6 (there are only 2), *delete* that one, then *rename* "9.6.0"
to "9.6".  And rename 9.5.0 to 9.5 for that matter.

I also noticed that we are not updating the status of each release to
be of state "Released" (via the "Build & Release" action)
consistently.  Not sure what consequence there is to it but
nonetheless it ought to be added to the release wizard if it isn't
there already.

On Sun, Apr 21, 2024 at 10:08 PM Gus Heck  wrote:
>
> It seems in the last couple of point releases we've tagged stuff as 9.4 and
> 9.5 in Jira, but this time around 9.6.0 has been used in 59 out of 61
> issues...
>
> I'm guessing, we want to bulk change all of that to 9.6 to match prior
> releases...
>
> LMK if this doesn't sound right, or I missed the memo on a change in
> practice.
>
> -Gus
>
> --
> http://www.needhamsoftware.com (work)
> https://a.co/d/b2sZLD9 (my fantasy fiction book)




Re: Is SolrBot too noisy / being ignored...?

2024-04-19 Thread David Smiley
I think it’s satisfactory if we merely have advice to ourselves in the PR
to remind us what little we need to do. Like… if you are a committer and
this is passing, just merge it.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Fri, Apr 19, 2024 at 7:51 AM Eric Pugh 
wrote:

> From the perspective of commits being merged by a bot?
>
> Assuming the legal side was okay, what are your thoughts about having the
> commits be merged by a bot based on the criteria I suggested?  Crazy?
> Reasonable?
>
>
>
> > On Apr 18, 2024, at 7:00 PM, Mike Drob  wrote:
> >
> > That’s probably a question for asf legal
> >
> > On Thu, Apr 18, 2024 at 5:36 PM Eric Pugh <
> ep...@opensourceconnections.com <mailto:ep...@opensourceconnections.com>>
> > wrote:
> >
> >> Thanks for the work that has been done on some of these.
> >>
> >> I actually just ran through the process of updating commons-cli based on
> >> what SolrBot provided.   I *did* have to update a Java class, and I did
> >> regenerate the licenses, and that was about it…
> >>
> >> Which made me wonder..   If SolrBot opens a dependency upgrade, and
> >> precommit and the tests pass, could we have it just commit the update
> >> automatically?
> >>
> >> I looked at one that I constantly see, the update to the awssdk:
> >> https://github.com/apache/solr/pull/2056.   The tests all pass, and it
> >> appears all I need to do to make precommit happy is drop in some new
> >> licenses.   Other than that, I believe that I could merge that PR, and I
> >> wouldn’t need to do any other steps….  So, if there were no new license and
> >> precommit had passed, couldn’t SolrBot merge it for us?
> >>
> >> Basically, do we actually need a human in the loop on this when at least
> >> this human, me, wouldn’t really be doing anything else if all the checks
> >> passed….
> >>
> >>> On Apr 9, 2024, at 8:01 AM, Eric Pugh wrote:
> >>>
> >>> The update that I see a lot is for the software.amazon.awssdk and
> >>> com.google.cloud packages….  I checked renovate.json and they should only
> >>> happen once a month.
> >>>
> >>> I just checked and there has been an update today, yesterday, and the
> >>> day before for the software.amazon.awssdk package.
> >>>
> >>> Looks like they all go to https://github.com/apache/solr/pull/2056
> >>> however.  Is this because once it opens the PR, it is just updating the PR
> >>> as needed?
> >>>
> >>> How can we get a smoother workflow?   The constant updates are noisy,
> >>> and now I think they are just ignored…!   I saw that Kevin approved this
> >>> back in November 2023.   Do we want to be more on top of these and merge as
> >>> they go?
> >>>
> >>> And maybe for these frequently changing ones, maybe move to a quarterly
> >>> schedule?   Or, do we add it to the release manager process, though I know
> >>> that approach was discussed and then viewed as too burdensome for the RM.
> >>>
> >>>
> >>> Eric
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> ___
> >>> Eric Pugh | Founder | OpenSource Connections, LLC | 434.466.1467 |
> >> http://www.opensourceconnections.com <
> >> http://www.opensourceconnections.com/> | My Free/Busy <
> >> http://tinyurl.com/eric-cal>
> >>> Co-Author: Apache Solr Enterprise Search Server, 3rd Ed <
> >>
> https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw
> >
> >>
> >>> This e-mail and all contents, including attachments, is considered to
> be
> >> Company Confidential unless explicitly stated otherwise, regardless of
> >> whether attachments are marked as such.
> >>>
> >>
>
>
>


Welcome Jason Gerlowski as Solr's new PMC Chair

2024-04-18 Thread David Smiley
The PMC has voted Jason Gerlowski as Solr's new PMC Chair.  I am the
outgoing chair.  The change was approved by the ASF board yesterday,
April 17th.

Thanks for stepping up Jason!  I'll be happy to assist as needed.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley




Re: solr query alerting

2024-04-18 Thread David Smiley
I'm probably not a great reviewer of this TBH as I have only heard &
read about lucene-monitor/percolator; haven't used such a thing yet.

I did a *quick* review to see how the PR touches the codebase.  It's
wonderful that it is purely isolated; it requires no changes of Solr
itself.  I might ask why it is here vs the Solr sandbox or somewhere
else with publicity on http://solr.cool

The name "Monitor" sounds like an infrastructure monitoring feature.

p.s. I'm on vacation this week with minimal time to collaborate

On Mon, Apr 15, 2024 at 10:27 AM Luke Kot-Zaniewski (BLOOMBERG/ 919
3RD A)  wrote:
>
> Hey David,
>
> Just wanted to bump this. I appreciate you taking a look, and wanted to 
> stress that I am looking for even just higher level feedback at this point.
>
> Basically, I am questioning the direction I am going in which involves 
> exposing some of the lucene-monitor internals (so currently have an implicit 
> dependence on a lucene change). OTOH I couldn't figure out a better way to 
> apply the lucene-monitor optimizations while still utilizing solr's 
> sophisticated index management and scaling. Lucene-monitor is a very sealed 
> interface and the way it manages the index and caching (via Monitor 
> interface) is not easy to integrate on its own.
>
> Again, I'd appreciate any feedback, even partial, that you may have!
>
> Thanks,
> Luke
>
> From: dev@solr.apache.org At: 04/08/24 13:43:52 UTC-4:00
> To: dev@solr.apache.org
> Subject: Re: solr query alerting
>
> I'm so glad someone has started this!  Thanks for contributing.  I'll
> take a look
>
> On Mon, Apr 1, 2024 at 3:53 PM Luke Kot-Zaniewski (BLOOMBERG/ 919 3RD
> A)  wrote:
> >
> > Hi All,
> >
> > A few months ago I wrote the user list about potentially integrating lucene
> monitor into solr. I have raised this PR with a first attempt at implementing
> this integration. I'd greatly appreciate any feedback on this even though I
> still have it marked as draft. I want to make sure I'm heading in the right
> direction here so input from solr dev community would be extremely valuable 
> :-)
> >
> > Many thanks,
> > Luke
>
>
>




Re: [DISCUSS] Solr 9.6 release

2024-04-12 Thread David Smiley
We do have a lot of PRs that appear to be in an advanced state of
readiness.  Maybe some have even been reviewed by local search teams,
albeit not publicly here.  I'm hoping we could rally around more
peer review of these to get the hard work of others merged.  I suppose
this is an ever-present concern but an impending release reminds me of
our PR backlog.  For my part, I scanned through a couple pages of PRs;
looked at some.

On Mon, Apr 8, 2024 at 12:55 PM Gus Heck  wrote:
>
> It's been about 3 months since we started our last release discussion, and
> Jira shows that we have:
>
> 5 bug fixes
> 1 feature (query time distributed stats disable)
> 11 improvements
> 7 sub tasks, several of which represent new features including CPU limited
> requests
> 3 tasks, including the upgrade to Lucene 9.10
>
> Only two are not resolved, but one seems to have commits and the other had
> a PR ready in late Feb...
>
> It seems like there are quite a few things now that should be made more
> widely available to users.
>
> I'm happy to volunteer as RM, though it will be my first time so I may have
> questions. I propose that we cut the branch Next Monday April 15 and
> prepare the first RC.
>
> - Gus
>
> --
> http://www.needhamsoftware.com (work)
> https://a.co/d/b2sZLD9 (my fantasy fiction book)




Re: The design of ClusterStateProvider & ClusterState

2024-04-09 Thread David Smiley
On Fri, Apr 5, 2024 at 8:52 AM Ilan Ginzburg  wrote:
>
> I would suggest doing any such change in two independent steps:
> - Moving classes around without any functional change ("pure" refactoring)
> - A change to what a class exposes, its behavior etc.
> Otherwise it is very hard to track what has simply moved and what has
> changed.

Makes sense.  I heard this approach is championed by Kent Beck in his
"Tidy First" book.

> Is the principal motivation here class content refactoring or change of
> behavior?

My initial motivation is *refactoring* to avoid expensive methods that
appear innocent.  In our community we sometimes call them "traps" or
"trappy behavior".  ClusterStateProvider.getClusterState may just be a
getter in ZkClientClusterStateProvider but for
HttpClusterStateProvider -- it's always O(Collections) (with an HTTP
call per collection *every* time).  So there are callers out there
that call it and think nothing of it.  Watch out!  Likewise
org.apache.solr.common.cloud.ClusterState#getCollectionsMap is
O(Collections) -- ouch!  This is *the* way to list collections in
SolrCloud.  So at least with the refactoring, we'd be clear when we're
looping per collection and fetching; wouldn't be hidden behind an
innocent looking method.  Out of scope is reducing/optimizing these
cases, which should probably happen first for some of them.  In this
phase the API changes should be minimized for 9x compatibility (not
strictly but reasonable effort).

My motivation beyond that is to reconsider what the ClusterState API /
behavior *ought* to be, especially to be a cache...

> Independent of which classes do the work, I would like the cluster state to
> be considered for what it is: a *stale cache* of the ZooKeeper data (at
> varying levels of details and staleness).
> Rather than try to keep it up to date with constant watches (it is still
> stale given these are async), consider it can be very stale and deal with
> the staleness when encountered.

> For example, DO NOT watch the whole collection list (for collections not
> represented on the node). If a request for an unknown collection is
> received, check (in ZK) if it exists. Implement of course a level of
> caching of fetched data.
>
> Similarly, for "watched" collections, no need to get all updates. A
> periodic re-fetch (or re-check) from ZooKeeper might be justified, or in
> general fetch from ZooKeeper the collection state when it is absent locally
> or is identified as stale.
>
> I believe approaching the distribution of cluster state to all nodes in
> such a way will greatly limit the load and chattiness with ZooKeeper esp on
> "dynamic" clusters with many state changes.

100% to all that; thanks for expanding upon my economy of words.  I
think the StateCache stuff in CloudSolrClient resembles what we're
after.
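
To illustrate the "stale cache" view (fetch on a miss, re-fetch when
too old, drop the entry when staleness is detected, and never list
every collection), here is a minimal sketch.  It is not the StateCache
code in CloudSolrClient; the class name, the fetch callback and the
max-age policy are assumptions for illustration.

```python
import time


class CollectionStateCache:
    """Per-collection cache that tolerates staleness instead of watching."""

    def __init__(self, fetch, max_age_seconds, clock=time.monotonic):
        self.fetch = fetch              # hypothetical: name -> state, or None
        self.max_age = max_age_seconds  # how stale an entry may get
        self.clock = clock
        self.entries = {}               # name -> (state, fetched_at)

    def get(self, name):
        now = self.clock()
        entry = self.entries.get(name)
        if entry is not None and now - entry[1] <= self.max_age:
            return entry[0]             # possibly stale, by design
        state = self.fetch(name)        # one lookup for one collection
        if state is None:
            self.entries.pop(name, None)
            return None
        self.entries[name] = (state, now)
        return state

    def invalidate(self, name):
        """Call when a request reveals that the cached state is stale."""
        self.entries.pop(name, None)


# Made-up fetcher and clock so the example runs without ZooKeeper.
calls = []


def fake_fetch(name):
    calls.append(name)
    return {"name": name, "version": len(calls)}


fake_now = [0.0]
cache = CollectionStateCache(fake_fetch, 30, clock=lambda: fake_now[0])
cache.get("c1")       # miss: fetches
cache.get("c1")       # hit: no fetch
fake_now[0] = 31.0
cache.get("c1")       # too old: fetches again
print(len(calls))
```

The cost is paid per collection actually requested, which is the
opposite of an innocent-looking getter that is O(Collections).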

~ David




Re: solr query alerting

2024-04-08 Thread David Smiley
I'm so glad someone has started this!  Thanks for contributing.  I'll
take a look

On Mon, Apr 1, 2024 at 3:53 PM Luke Kot-Zaniewski (BLOOMBERG/ 919 3RD
A)  wrote:
>
> Hi All,
>
> A few months ago I wrote the user list about potentially integrating lucene 
> monitor into solr. I have raised this PR with a first attempt at implementing 
> this integration. I'd greatly appreciate any feedback on this even though I 
> still have it marked as draft. I want to make sure I'm heading in the right 
> direction here so input from solr dev community would be extremely valuable 
> :-)
>
> Many thanks,
> Luke




Re: [DISCUSS] Solr 9.6 release

2024-04-08 Thread David Smiley
Thanks for volunteering Gus.

On Mon, Apr 8, 2024 at 12:55 PM Gus Heck  wrote:
>
> It's been about 3 months since we started our last release discussion, and
> Jira shows that we have:
>
> 5 bug fixes
> 1 feature (query time distributed stats disable)
> 11 improvements
> 7 sub tasks, several of which represent new features including CPU limited
> requests
> 3 tasks, including the upgrade to Lucene 9.10
>
> Only two are not resolved, but one seems to have commits and the other had
> a PR ready in late Feb...
>
> It seems like there are quite a few things now that should be made more
> widely available to users.
>
> I'm happy to volunteer as RM, though it will be my first time so I may have
> questions. I propose that we cut the branch Next Monday April 15 and
> prepare the first RC.
>
> - Gus
>
> --
> http://www.needhamsoftware.com (work)
> https://a.co/d/b2sZLD9 (my fantasy fiction book)




Re: [EXTERNAL] - Re: Unable to build tag releases/solr/9.5.0

2024-04-06 Thread David Smiley
JDK 11 is definitely supported!

On Fri, Apr 5, 2024 at 6:18 PM Isabelle Giguere
 wrote:
>
> Thanks for the reply, Shawn.
>
> So... Building Solr 9.5 with Java 11?  I thought that was still supported.
>
> Regards;
>
> Isabelle Giguère
> Computational Linguist & Java Developer
> Linguiste informaticienne & développeur java
>
>
> 
> From: Shawn Heisey
> Sent: April 4, 2024 16:46
> To: dev@solr.apache.org
> Subject: [EXTERNAL] - Re: Unable to build tag releases/solr/9.5.0
>
>
> On 4/4/24 09:00, Isabelle Giguere wrote:
> > Hello devs;
> >
> > I checked out tag releases/solr/9.5.0
> > https://github.com/apache/solr/tree/releases/solr/9.5.0
> >
> > I'm trying to build locally:
> > ./gradlew build --write-locks
> >
> > The build fails very fast:
>
> I ran these commands on an Ubuntu system with openjdk 17.0.10 installed,
> and it built the tarball successfully.
>
> git clone https://github.com/apache/solr.git foo
> cd foo
> git checkout releases/solr/9.5.0
> ./gradlew clean distTar
>
> After the build:
>
> sheisey@sheisey-desktop:~/git/foo$ find . -name "*.tgz"
> ./solr/packaging/build/distributions/solr-9.5.0-SNAPSHOT-slim.tgz
> ./solr/packaging/build/distributions/solr-9.5.0-SNAPSHOT.tgz
>
> Thanks,
> Shawn
>
>
>




The design of ClusterStateProvider & ClusterState

2024-04-04 Thread David Smiley
I've been looking at HttpClusterStateProvider lately, and at ClusterState.
It has a method getClusterState which goes and loads the complete
ClusterState (all collections with all state info).  ClusterState is
immutable.  At a massive collection scale, such a method is very
disconcerting!  Thankfully, there's a method getState(collection)
returning a CollectionRef (holder of DocCollection) implemented by
fetching only the state of the pertinent collection.  Likewise the
live nodes can be retrieved directly from ClusterStateProvider without
requiring using ClusterState.

I'd like to make a bold proposal: Merge ClusterState with
ClusterStateProvider, keeping the same ClusterState name & package and
all/most API methods.  This means it would lose its immutability
designation.  If an immutable variation is needed, one could exist.

Don't include methods like getCollectionsMap which is evil at
many-collection scale.  Listing/looping collections should be done
sparingly; don't make it too easy to do by accident.

Possibly also move CloudSolrClient's StateCache (a cache of
DocCollection keyed by collection name) into the new & improved
ClusterState.

The end-game is ClusterState being where we can list live nodes,
aliases, collections, and most importantly a cache of DocCollection.
With an eventually consistent mind-set; anything can be out of date
and may need to be re-fetched.
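
As a rough illustration of the API shape proposed above, here is a minimal
sketch, assuming a merged interface that exposes live nodes and
per-collection lookup but deliberately has no getCollectionsMap-style bulk
accessor.  MergedClusterState, CollectionState and InMemoryClusterState are
hypothetical names made up for this example; they are not the real
ClusterState / DocCollection classes.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.stream.Stream;

/** Hypothetical stand-in for DocCollection; only a name and a version are modeled. */
final class CollectionState {
  final String name;
  final int version;

  CollectionState(String name, int version) {
    this.name = name;
    this.version = version;
  }
}

/** Hypothetical merged API: no bulk "give me every collection's state" method. */
interface MergedClusterState {
  Set<String> getLiveNodes();

  /** Per-collection lookup; may be served from a cache and may be out of date. */
  Optional<CollectionState> getCollection(String name);

  /** Listing is explicit and yields names only, so looping over full state is hard to do by accident. */
  Stream<String> streamCollectionNames();
}

/** Trivial in-memory implementation, only here to make the sketch runnable. */
final class InMemoryClusterState implements MergedClusterState {
  private final Set<String> liveNodes;
  private final Map<String, CollectionState> collections;

  InMemoryClusterState(Set<String> liveNodes, Map<String, CollectionState> collections) {
    this.liveNodes = Set.copyOf(liveNodes);
    this.collections = Map.copyOf(collections);
  }

  @Override
  public Set<String> getLiveNodes() {
    return liveNodes;
  }

  @Override
  public Optional<CollectionState> getCollection(String name) {
    return Optional.ofNullable(collections.get(name));
  }

  @Override
  public Stream<String> streamCollectionNames() {
    return collections.keySet().stream();
  }
}
```

A real implementation would presumably fetch and cache lazily rather than hold
a fixed map; the in-memory class only demonstrates the method surface.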

Has anyone thought similarly or have concerns in such a pursuit?

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley




Re: SolrParams implementations

2024-04-03 Thread David Smiley
I've felt we need one or two fewer, maybe.  MultiMapSolrParams can often
be substituted by ModifiableSolrParams.  Javadocs could be improved in
general.

RequiredSolrParams and DefaultSolrParams (plus subclasses) could be
package scope so as to discourage direct use, which I think is never
needed.

On Wed, Apr 3, 2024 at 10:52 AM Gus Heck  wrote:
>
> We have quite a few, including MapSolrParams which seems to explicitly defy
> the contract of SolrParams being a multimap...
>
> Has anyone spent any time considering if we really need all these variants:
>
>- AppendedSolrParams
>- DefaultSolrParams
>- DocRowParams
>- MapSolrParams
>- ModifiableSolrParams
>- MultiMapSolrParams
>- RequiredSolrParams
>- SolrQuery
>- VersionedParams
>
> Additionally there seem to be 3 places where we create anonymous subclasses
> of SolrParams...
>
> Most of these seem to have some angle or additional fields, so they aren't
> useless per-se but it does make it hard to reason about the behavior of a
> method that accepts SolrParams... for example RequestUtil.processParams()
> has the suspicious code:
>
> public static void processParams(
>     SolrRequestHandler handler,
>     SolrQueryRequest req,
>     SolrParams defaults,
>     SolrParams appends,
>     SolrParams invariants) {
>
>   boolean searchHandler = handler instanceof SearchHandler;
>   SolrParams params = req.getParams();
>
>   // Handle JSON stream for search requests
>   if (searchHandler && req.getContentStreams() != null) {
>
>     Map<String, String[]> map = MultiMapSolrParams.asMultiMap(params, false);
>
>     if (!(params instanceof MultiMapSolrParams || params instanceof ModifiableSolrParams)) {
>       // need to set params on request since we weren't able to access the original map
>       params = new MultiMapSolrParams(map);
>       req.setParams(params);
>     }
>
>
> This seems to A) possibly fail to restrict defaults and appends to the
> actual subclasses that are associated with that functionality (and if that
> is not a good idea, why do we have them?), and B) discard the request
> parameters if someone uses anything other than the expected two subclasses
> of SolrParams in the request.
>
> Has this been considered before?
>
> Archives search only brought up my previous irritations with SolrParams
> implementations...
>
> https://lists.apache.org/thread/tkoj75z736x1nzotgh2xsn7wdnnsoc8g
>
> -Gus
>
> --
> http://www.needhamsoftware.com (work)
> https://a.co/d/b2sZLD9 (my fantasy fiction book)
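
The multimap concern raised above can be shown with a small self-contained
sketch.  It deliberately uses plain java.util maps rather than the real
SolrParams classes, so ParamsContractSketch and its two helper methods are
hypothetical; the point is only that a single-valued map cannot honor a
multi-valued contract and drops repeated parameters silently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration; these helpers are not part of Solr. */
final class ParamsContractSketch {
  /** Multi-valued view: a name maps to all of its values, in order. */
  static String[] getAll(Map<String, String[]> multi, String name) {
    String[] values = multi.get(name);
    return values == null ? new String[0] : values.clone();
  }

  /** Single-valued copy: keeps only the first value per name, dropping the rest. */
  static Map<String, String> toSingleValued(Map<String, String[]> multi) {
    Map<String, String> single = new LinkedHashMap<>();
    for (Map.Entry<String, String[]> e : multi.entrySet()) {
      if (e.getValue().length > 0) {
        single.put(e.getKey(), e.getValue()[0]);
      }
    }
    return single;
  }

  public static void main(String[] args) {
    Map<String, String[]> multi = new LinkedHashMap<>();
    multi.put("q", new String[] {"hdfs"});
    multi.put("fq", new String[] {"type:doc", "lang:en"});

    System.out.println(getAll(multi, "fq").length); // both filter queries survive
    System.out.println(toSingleValued(multi).get("fq")); // only the first one survives
  }
}
```

A method that accepts the abstract type has no way of knowing which of these
two behaviors it will get, which is the difficulty described above.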




Re: timeout HTTP response code; use 524?

2024-03-31 Thread David Smiley
It was an 8x but it's more hypothetical if /update were to support
this.  It does where I work :-)

Weird that you got status=0 and fewer results without specifying
shards.tolerant=true
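
Since status=0 can apparently accompany a truncated result, a client may want
to look at the partialResults flag as well.  Here is a minimal sketch,
assuming the responseHeader has already been parsed into a Map by whatever
JSON library the client uses; PartialResultsCheck is a hypothetical helper,
not SolrJ API.

```java
import java.util.Map;

/** Hypothetical client-side helper; not part of SolrJ. */
final class PartialResultsCheck {
  /** True only when status is 0 and the header does not flag partial results. */
  static boolean isComplete(Map<String, Object> responseHeader) {
    Object status = responseHeader.get("status");
    boolean ok = status instanceof Number && ((Number) status).intValue() == 0;
    boolean partial = Boolean.TRUE.equals(responseHeader.get("partialResults"));
    return ok && !partial;
  }
}
```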

On Sun, Mar 31, 2024 at 4:41 PM Gus Heck  wrote:
>
> Hmm, I took the initial statement about returning 500 at face value when I
> wrote my response above, but with actual testing I'm not seeing that
> behavior in any recent version... (tested 9.1, 9.4, 9.5, main) this caused
> some confusion since I am working in that code right now and having read
> this I made sure in paths I added I threw 500 and then discovered that it
> wasn't reliably thrown all the time when I was testing... I initially
> assumed I must have broken it until I stashed everything and made this
> discovery :).
>
> David, what version were you observing 500 status for a time limit
> exceedance with timeAllowed? Is it an 8.x? I didn't test any of those...
>
> Searching an index of the ref guide here which I use for testing sometimes
> (this index is easily created via
> https://github.com/nsoft/index-solr-ref-guide) in each of those versions I
> got:
> {
>   "responseHeader":{
> "zkConnected":true,
> "partialResults":true,
> "status":0,
> "QTime":20,
> "params":{
>   "q":"hdfs",
>   "indent":"true",
>   "q.op":"OR",
>   "timeAllowed":"1",
>   "useParams":"",
>   "_":"1711914604030"}},
>
> "response":{"numFound":2,"start":0,"maxScore":3.6057181,"numFoundExact":true,"docs":[
>
> Or without timeAllowed:
>
> {
>   "responseHeader":{
> "zkConnected":true,
> "status":0,
> "QTime":66,
> "params":{
>   "q":"hdfs",
>   "indent":"true",
>   "q.op":"OR",
>   "useParams":"",
>   "_":"1711914604030"}},
>
> "response":{"numFound":9,"start":0,"maxScore":4.5576544,"numFoundExact":true,"docs":[
>
> As a side quirk, the numFoundExact seems a bit misleading... alongside
> partialResults:true
>
> On Wed, Mar 20, 2024 at 5:59 PM Gus Heck  wrote:
>
> > I don't like the current 500. That's sharing the same conceptual space as
> > NPE, bugs and other hard failures for which IT should be paged at 2:30 am.
> >
> > 5xx indicates that there is a problem with the server and that there is
> > nothing the client can do, and also would typically be flagged by
> > monitoring tools as a system problem. That's not true, often the query can
> > be modified, or resubmitted at off-peak hours etc... The server is not
> > compromised.
> >
> > 4xx indicates that "the client should correct their request" which is true
> > if things are taking a long time because the request was dumb in the first
> > place, but since that's entirely relative to the index and the hardware,
> > it's unknowable from our perspective so it also feels wrong to send a 400,
> > because it could be a perfectly fine request sent in the middle of a DOS
> > attack...  Not the client's fault.
> >
> > Stepping back a moment let's think about the request: it says "run this
> > query for up to X milliseconds" ... the server did that successfully so
> > really the response ought to be a success.
> >
> > So 206 Partial Content seems to make the most sense
> >
> > -Gus
> >
> >
> > On Wed, Mar 20, 2024 at 2:06 PM Chris Hostetter 
> > wrote:
> >
> >>
> >> : I still think 503 is appropriate when timeAllowed is exceeded. The
> >> : service requested is a response within the set time. That service is not
> >> : available. Here are the RFC definitions of 500 and 503. Exceeding
> >> : timeAllowed isn’t an “unexpected condition”, it is part of the normal
> >> : operation of that limit.
> >>
> >> Devil's advocate argument: it is an "unexpected condition" -- it's
> >> unexpected that the request wasn't able to be processed w/in the
> >> configured limits.  If you expected that the request could *NOT* be
> >> processed, why send it w/those limits?
> >>
> >> I'm -0 to using 503 because it's worded to very explicitly be a temporal
> >> situation: failure "due to a temporary overload or scheduled maintenance" --
> >> those aren't *examples* of why a server MAY return 503, those are the
> >> explicit circumstances indicated by a 503 -- and neither one is directly
> >> applicable for a query limit being exceeded.  An otherwise idle server
> >> might still fail a request with a query limit configured -- regardless of
> >> whether the server is "overloaded" or down for "scheduled maintenance" --
> >> and there is no reason for a client to assume retrying a request with the
> >> same query limits will succeed again in the future.
> >>
> >>
> >> Bottom line for me: I'm fine w/changing the default response code to
> >> anything -- as long as we make it configurable.  But i *strongly* urge us
> >> not to use a non-standard error code that already has very specific, well
> >> established, meaning with well known proxy software.
> >>
> >>
> >> : 6.6.1.  500 Internal Server Error
> >> :
> >> :The 500 (Internal Server Error) status code indicates that the server
> >> :encountered 

Re: PR review for SolrZkClient

2024-03-30 Thread David Smiley
It's getting some attention now :-)

On Thu, Mar 28, 2024 at 10:54 PM Aman Raj  wrote:
>
> Hi team,
>
> Can someone from the Solr community please review this PR - SOLR-17220 - Make 
> the SolrZkClient thread as a daemon thread by amanraj2520 · Pull Request 
> #2376 · apache/solr (github.com).
>
> Thanks,
> Aman.




Re: DISCUSS: Optionality of JIRA issues

2024-03-25 Thread David Smiley
The only missing “status” in a PR is fix-version. Maybe a fix-version
solution is needed before a PR should exist without a JIRA.  Again, at
least for anything of “substance”.

Distinguishing “substantive” or not for internal APIs, and the degree to
which a bug is “small”, is obviously a slippery slope, so I’d rather we
leave that to a guidance/preference document than a mandate.  I would
prefer a weaker distinction than yours, letting minor changes go through
without a JIRA. A rare NPE fix for example (so rare it’s a waste of time to
add it or have many people read it in CHANGES.txt IMO). Minor Java API
changes, very subjective there of course. Walking down the slippery slope
we go, here. See what I mean?  I’m trying to make it easier for
contributors and us too for that matter.

No matter how simple it is to create a JIRA issue, it’s still sometimes
wasted time (what’s the point again?) — that breaks up one’s flow. It’s
generally not less than a minute if thought is put into it (isn’t rushed).
Differences in formatting between both systems.

A PR with a mandated pointless JIRA does itself contribute to bifurcation.
If I find the PR and see a JIRA link, I wonder what’s there. I look and
ascertain if it’s the ~same description. If it’s the same, then it was
wasted time.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Mon, Mar 25, 2024 at 5:46 AM Gus Heck  wrote:

> I definitely prefer to have a real description of what is intended, and a
> clear place for status and reporting of fix versions and the like for any
> "substantive" change. Also a central place to subscribe to hear what people
> say, or learn when things are addressed/decided. I also really don't want to
> have to search more than one place to find changes relating to a feature,
> and I'm particularly interested in being able to find both when a feature
> was directly touched, and when it was discussed as part of some other
> development (e.g. if something about streaming expressions impacted how we
> migrated solrj code from one technology to another). If discussions are
> spread across multiple locations this gets harder and harder. As it stands
> now, things might be discussed on the list or in Jira... we have tried hard
> not to make Slack a 3rd location and having Github sneak in as a 4th is
> equally undesirable to me, and at present the situation with discussions on
> PR's is only tolerable because the PR will be auto-linked from the Jira.
> Without that Jira as a hub, the existence of the discussion will be much
> harder to discover. Good Jira tickets link the archived discussion on the
> list when applicable.
>
> So what's substantive? Below is my opinion by way of examples.
>
>- Anything that changes back compatibility
>- Anything that changes an API (in code or HTTP) including throwing new
>exceptions or responding with new response codes.
>- Anything that adds a feature
>- Anything that removes a feature (see item 1)
>- Anything that alters (or might alter) performance
>(CPU/Mem/Disk/Network) intentionally
>- Anything that fixes a bug in a released feature (because even
>seemingly simple stuff can cause fix/breaks).
>- Anything worthy of an entry in CHANGES.txt (i.e. something that users
>might want to know about).
>- Any change that moves/renames a file.
>- Anything that has (should have) a unit test.
>
> What's clearly not substantive:
>
>- Fixing typos
>- Elimination or suppression of simple static analysis warnings where
>the change is localized to a few lines in a single method.
>- Improvement of docs to better portray the *previously existing*
>features
>
> Gray zone:
>
>    - Minor but not trivial refactoring such as extracting a method and then
>    updating various code to utilize it to reduce duplication. Such a thing
>    greatly depends on how complicated the method is, and if the usages are
>    all identical. This trends toward substantive if there are varying
>    patterns of argument usage with null being commonly passed in, or
>    introduction of "result objects", etc.
>
> For anything substantive I really want fix version information, and I think
> it's a fundamental part of good programming to communicate what it was you
> intended to do so that later when someone looks at the code (especially if
> it's been updated by several more people) there's a prayer that they can
> know if it's drifted away from its intent (and if that was on purpose as
> evidenced by subsequent JIRAs)
>
> I seriously don't understand what's so difficult about making a JIRA
> ticket. It takes like 30 seconds plus the time to write the same
> description that **I hope** you would also write in a PR. It sm

Re: DISCUSS: Optionality of JIRA issues

2024-03-24 Thread David Smiley
On Sun, Mar 24, 2024 at 6:54 AM Ilan Ginzburg  wrote:
>
> I do like having a place where a discussion can be had on a code change.
> Years later it helps. Also, some Jiras get comments or questions long after
> the code has been merged.

Maybe you mean *in advance of* a code change; i.e. to discuss an idea?
 This is what a JIRA issue does pretty well; indeed we will continue
to create them sometimes -- I will.  Dev list threads get more
visibility though.  It's rare to link to a dev list thread from a PR;
I wish that was easier -- we should encourage that when it applies.
Speaking for myself, I monitor the dev list daily but not the issues
list (hey I have a day job LOL).  I suspect new people interested in
Solr development miss subscribing to that list.  If you mean *after* a
code change -- note the PR isn't closed for comments.  I've commented
on a PR after a merge to follow up.  The participants to the PR
(subscribers) should get notified.  I've found it's increasingly rare
for people to even "Watch" the JIRA issue if there is a PR
accompanying it, making me wonder if whatever I write there will be
read.

When GitHub PRs with code review commentary were a new contribution
approach for Solr, I had suggested at the time that JIRA issues be
where we discuss the high level matter -- requirements and approach.
In practice, once you start looking at the PR, you instinctively want
to comment there no matter what level of detail; it's just natural.
Not just for little things but bigger things (i.e. should we even be
doing it this way?).  Sometimes I've conversed back on the JIRA issue
to discuss the bigger picture.

Independently of the forthcoming vote thread, more could be done to
make our use of JIRA better.  For example, if we could get a weekly
(or more often?) digest summary of what's happening in JIRA and post
this to the dev list, I think it'd be very helpful to bring visibility
there.  And we could re-examine the JIRA-PR synchronization options
that the ASF gives us.  Originally it was configured for all
individual updates to be copied to JIRA which was way too much but if
it could be configured to be top level comments only, I think that'd
be a nice balance.

Nevertheless; if someone has a PR and doesn't want to create a JIRA; I
don't blame them :-)

> If people find that it really slows them down to create a jira, we can
> create a catch-all jira per released Solr version, referenced by all PR’s
> that would like no jira. Whoever references that jira will have to think
> twice and might decide it makes more sense to create a specific jira
> instead.

I don't get the point.  If it is only to satisfy a JIRA mandate, we
should really reconsider the mandate.

> Ilan
>
> On Fri 22 Mar 2024 at 21:57, David Smiley  wrote:
>
> > There was a recent conversation in priv...@solr.apache.org that should
> > not have been conducted there so I am moving it to the dev list.  I
> > intend to hold a VOTE to formalize the decision soon.  It would be a
> > procedural vote and as-such needs a majority of voters approving for
> > it to pass.
> >
> > -- PROPOSAL BEGIN --
> > JIRA issues are optional for code changes, except non-public matters
> > like fixing a security vulnerability.  We may have a documented
> > *preference* (e.g. please create them for "big" things but not "small"
> > things), but ultimately it's optional.  So if a GitHub PR appears to
> > be ready but lacks an associated JIRA -- it may be merged anyway,
> > regardless of documented JIRA usage preferences.
> > -- END --
> >
> > This isn't the same decision as GitHub Issues vs JIRA --
> > https://issues.apache.org/jira/browse/SOLR-16455 because this isn't
> > about GitHub Issues at all.  Nonetheless I think fans of SOLR-16455
> > would eagerly vote +1 to the proposal here.
> >
> > This doesn't retire Solr's use of JIRA nor make it deprecated.  JIRA
> > is required for private/security issues that can't be disclosed.
> >
> > Not requiring JIRA does not substantively change our use of
> > CHANGES.txt.  Use "PR#2320" syntax, for example (that's an excerpt I
> > copy-pasted).  PRs don't necessarily need a CHANGES.txt either (minor
> > stuff might omit).  No change in policy/guidance.
> >
> > Commit messages thus won't have a SOLR- prefix.  There is no
> > guidance on what the prefix should be.  Please don't use "NO-JIRA".
> > GitHub adds a suffix like (#2320), which is fine; easily click-able in
> > an IDE & GitHub UI.  Separately from this discussion thread, I hope we
> > might discuss a useful prefix standard.
> >
> > I acknowledge that optionality in use of JIRA will stymie people's
> > attempts to use JIRA as a comprehensive resource of 

DISCUSS: Optionality of JIRA issues

2024-03-22 Thread David Smiley
There was a recent conversation in priv...@solr.apache.org that should
not have been conducted there so I am moving it to the dev list.  I
intend to hold a VOTE to formalize the decision soon.  It would be a
procedural vote and as-such needs a majority of voters approving for
it to pass.

-- PROPOSAL BEGIN --
JIRA issues are optional for code changes, except non-public matters
like fixing a security vulnerability.  We may have a documented
*preference* (e.g. please create them for "big" things but not "small"
things), but ultimately it's optional.  So if a GitHub PR appears to
be ready but lacks an associated JIRA -- it may be merged anyway,
regardless of documented JIRA usage preferences.
-- END --

This isn't the same decision as GitHub Issues vs JIRA --
https://issues.apache.org/jira/browse/SOLR-16455 because this isn't
about GitHub Issues at all.  Nonetheless I think fans of SOLR-16455
would eagerly vote +1 to the proposal here.

This doesn't retire Solr's use of JIRA nor make it deprecated.  JIRA
is required for private/security issues that can't be disclosed.

Not requiring JIRA does not substantively change our use of
CHANGES.txt.  Use "PR#2320" syntax, for example (that's an excerpt I
copy-pasted).  PRs don't necessarily need a CHANGES.txt either (minor
stuff might omit).  No change in policy/guidance.

Commit messages thus won't have a SOLR- prefix.  There is no
guidance on what the prefix should be.  Please don't use "NO-JIRA".
GitHub adds a suffix like (#2320), which is fine; easily click-able in
an IDE & GitHub UI.  Separately from this discussion thread, I hope we
might discuss a useful prefix standard.

I acknowledge that optionality in use of JIRA will stymie people's
attempts to use JIRA as a comprehensive resource of Solr work
tracking.  For example if you have a JIRA filter / alert on keywords
of interest; it will become less effective; try switching to emails on
the iss...@solr.apache.org list instead.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley




Automatic "tidy" (formatting) in PR, GitHub action

2024-03-22 Thread David Smiley
Sometimes we make changes and forget to run tidy.  It's rather annoying.

It occurred to me that our "precommit" GitHub PR action could be modified
to first run tidy and to commit the changes (if any) beforehand, pushing to
the source branch (generally on someone's fork).  Here's a blog post with
examples on how to do this:
https://peterevans.dev/posts/github-actions-how-to-automate-code-formatting-in-pull-requests/
There are some caveats listed there... like a possible permissions issue.
It seems there may be a solution --
https://github.com/orgs/community/discussions/26865
Also, the post recommends a "slash command" approach instead but I think
that would just add an extra step; we know what the solution is every
time.  And of course a contributor can do manual follow-up editing of the
results when it doesn't flow nicely.  Ultimately it all gets squashed
anyway.

Any thoughts on this or an alternative?

~ David


Re: [jira] [Created] (SOLR-16455) Migrate Jira to Github Issues and Github Projects, and migrate mailing lists to Github Discussions

2024-03-22 Thread David Smiley
I found the correct thread, let's discuss there not here.
"Do we require Jira's for bug fixes?" March 2nd




Re: [jira] [Created] (SOLR-16455) Migrate Jira to Github Issues and Github Projects, and migrate mailing lists to Github Discussions

2024-03-22 Thread David Smiley
I can't seem to find the conversation I was thinking of but this one
is close enough.
Can we have more clarity on when a JIRA issue is mandatory?  Or the
reverse -- give clarity that small bugs don't need a JIRA issue.  My
goal is to remove/reduce barriers for contributors as much as
possible.

I'd like to give a specific example of a simple & obscure NPE in
QueryComponent related to exception handling:
https://issues.apache.org/jira/browse/SOLR-17209
https://github.com/apache/solr/pull/2354
Yes I was the one who asked for a JIRA to be created but I felt bad
doing so (for something so trivial) and was only doing it because I
recall one of us recently saying/suggesting all bugs need a JIRA.

~ David




Re: [DISCUSS] Community Virtual Meetup, March 2024

2024-03-21 Thread David Smiley
Sounds good; I can probably join then.  Thanks for organizing!

On Thu, Mar 21, 2024 at 11:26 AM Jason Gerlowski  wrote:
>
> Hey all,
>
> Haven't heard any suggestions on scheduling, so maybe we can aim for noon
> ET on Thursday of next week?  That gives us a full week between then and
> now.  Pending any last minute objections I'll create the Confluence page
> and Google Meet link later today.
>
> Hope to see many of you there!
>
> Jason
>
> On Tue, Mar 19, 2024 at 8:34 AM Jason Gerlowski 
> wrote:
>
> > Hey all,
> >
> > It's time once again to start thinking ahead to our Virtual Meetup for
> > March!  (Apologies for starting this discussion a bit late this month, as
> > I've been afk for a few weeks.)
> >
> > Since we are pretty far into the month I'll volunteer to organize the
> > meeting for March, so all we need to do is pick a day and time to meet.
> > Does anyone have opinions on that?  Maybe one day next week would make for
> > a good target?
> >
> > Best,
> >
> > Jason
> >




Re: timeout HTTP response code; use 524?

2024-03-19 Thread David Smiley
I'm glad I raised this topic -- great feedback from all of you!

I suppose we could indeed use a new/unused response code.  Making it
configurable seems fine but I'd only want to support arcane things
like that in the most basic way, e.g. using the new EnvUtils.

Walter's response really resonates with me -- 503 "temporary overload"
wording is a reasonable characterization of exceeding a timeout in
practice, at least a good proportion of the time.  Perhaps less clear
when the user's query is really expensive and there's ample resources
to finish it.  But a timeout's existence is largely to prevent
resource exhaustion (overload) so this distinction of *is* overloaded
vs. protecting ourselves from such is splitting hairs needlessly IMO.
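
For what "configurable in the most basic way" might look like, here is a
minimal sketch, assuming a single property that overrides the status code
returned when a time limit is exceeded.  The property name
solr.timeAllowed.statusCode, the 100-599 validity range and the
TimeoutStatusSketch class are all assumptions made up for illustration; the
default of 500 mirrors the behavior described at the start of this thread.
It reads a plain system property and does not use EnvUtils, to avoid guessing
at that class's API.

```java
/** Hypothetical sketch; the property name below is not an actual Solr setting. */
final class TimeoutStatusSketch {
  static final String PROP = "solr.timeAllowed.statusCode"; // assumed name
  static final int DEFAULT_STATUS = 500;

  /** Parses a configured status code, falling back to the default when unset or invalid. */
  static int resolveStatus(String configured) {
    if (configured == null || configured.trim().isEmpty()) {
      return DEFAULT_STATUS;
    }
    try {
      int code = Integer.parseInt(configured.trim());
      // Assumption: only codes in the conventional HTTP range are accepted.
      return (code >= 100 && code <= 599) ? code : DEFAULT_STATUS;
    } catch (NumberFormatException e) {
      return DEFAULT_STATUS;
    }
  }

  /** Status for a request: 200 normally, the configured code when the limit was exceeded. */
  static int statusFor(boolean timeLimitExceeded, String configured) {
    return timeLimitExceeded ? resolveStatus(configured) : 200;
  }

  public static void main(String[] args) {
    // e.g. java -Dsolr.timeAllowed.statusCode=503 TimeoutStatusSketch
    System.out.println(statusFor(true, System.getProperty(PROP)));
  }
}
```

Keeping the parsing separate from the property lookup makes the fallback
behavior easy to test without touching global state.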

On Tue, Mar 19, 2024 at 6:54 PM Walter Underwood  wrote:
>
> I still think 503 is appropriate when timeAllowed is exceeded. The service 
> requested is a response within the set time. That service is not available. 
> Here are the RFC definitions of 500 and 503. Exceeding timeAllowed isn’t an 
> “unexpected condition”, it is part of the normal operation of that limit.
>
> 6.6.1.  500 Internal Server Error
>
>The 500 (Internal Server Error) status code indicates that the server
>encountered an unexpected condition that prevented it from fulfilling
>the request.
> https://datatracker.ietf.org/doc/html/rfc7231#section-6.6.1
>
>  6.6.4 503 Service Unavailable
>
>The 503 (Service Unavailable) status code indicates that the server
>is currently unable to handle the request due to a temporary overload
>or scheduled maintenance, which will likely be alleviated after some
>delay.  The server MAY send a Retry-After header field
>(Section 7.1.3) to suggest an appropriate amount of time for the
>client to wait before retrying the request
> https://datatracker.ietf.org/doc/html/rfc7231#section-6.6.4
>
> Solr could even return 503 with a message of “timeAllowed exceeded”.
>
> I spent about a decade working on a search engine with an integrated web 
> spider. Accurate HTTP response codes are really useful.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
> > On Mar 19, 2024, at 3:12 PM, Chris Hostetter  
> > wrote:
> >
> >
> > Agree on all of Uwe's points below
> >
> > I think 500 is the most appropriate for exceeding QueryLimits --
> > unless/until we decide we want Solr to start using custom response codes in
> > some cases, but in that case i would suggest we explicitly *avoid* using
> > 504, 524, & 529 precisely because they already have specific meanings in
> > well known HTTP proxies/services that don't match what we're talking about
> > here.
> >
> > As far as one of David's specific observations...
> >
> > : > ideal IMO because Solr's health could reasonably be judged by looking
> > : > for 500's specifically as a sign of a general error that service
> > : > operators should pay attention to.
> >
> > Any client that is interpreting a '500' error as a *general* indication of
> > a problem with Solr, and not specific to that request, would not be
> > respecting the spec on what '500' means.  *Some* '5xx' are documented
> > to indicate that there may be a general problem afflicting the
> > server/service as a whole (notably '503') but most do not.
> >
> > But i also think that if we really want to cover our bases -- we can
> > always make it configurable.  Let people configure Solr to return
> > 500, 400, 418, 666, 999, ... wtf they want ... but 500 is probably the
> > best sane default that doesn't carry around implicit baggage.
> >
> > : 524 or 504 both refer to timeouts, but both are meant for proxies (so reverse
> > : proxy can't reach the backend server in time). So both of them do not match.
> > :
> > : 408 is "request timeout", but that's client's fault (4xx code). In that case
> > : it's a more technical code because it also requires to close the connection and
> > : not keep it alive, so we can't trigger that from Servlet API in a correct way.
> > :
> > : 503 does not fit well as Solr is not overloaded, but would be the only
> > : alternative I see. Maybe add a new Solr-specific one? Anyways, I think 500
> > : seems the best response unless you find another one not proxy-related.
> > :
> > : Uwe
> >
> >
> > -Hoss
> > http://www.lucidworks.com/
> >
>
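
For what it's worth, the "make it configurable" idea quoted above, combined
with the 503 plus Retry-After option from the RFC text, could be sketched
roughly as follows. This is Python for brevity rather than Solr's Java, and
limit_exceeded_response, LimitResponse, and the configured_status setting are
all hypothetical names made up for illustration; nothing in this thread says
Solr has such an option.

```python
from dataclasses import dataclass, field

# Hypothetical default, matching the status Solr returns today per this thread.
DEFAULT_LIMIT_EXCEEDED_STATUS = 500


@dataclass
class LimitResponse:
    """Hypothetical container for the pieces of an HTTP error response."""
    status: int
    message: str
    headers: dict = field(default_factory=dict)


def limit_exceeded_response(configured_status=DEFAULT_LIMIT_EXCEEDED_STATUS,
                            retry_after_seconds=None):
    """Build the response for a request that exceeded timeAllowed."""
    # Assumption: only 4xx/5xx codes make sense for a failed request.
    if not 400 <= configured_status <= 599:
        raise ValueError("status must be a 4xx or 5xx code")
    headers = {}
    # RFC 7231 section 6.6.4: a 503 MAY carry a Retry-After header.
    if configured_status == 503 and retry_after_seconds is not None:
        headers["Retry-After"] = str(retry_after_seconds)
    return LimitResponse(configured_status, "timeAllowed exceeded", headers)
```

Calling limit_exceeded_response() keeps the current 500 behaviour, while
limit_exceeded_response(503, 30) gives the 503 variant with a Retry-After
hint. The range check is a choice made for this sketch, not something anyone
in the thread asked for.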




timeout HTTP response code; use 524?

2024-03-18 Thread David Smiley
If timeAllowed is set and Solr takes too long then we fail the
response with an HTTP 500 response code.  It's not bad but it's not
ideal IMO because Solr's health could reasonably be judged by looking
for 500's specifically as a sign of a general error that service
operators should pay attention to.  There is a 529 response code used
by CloudFlare (judging from Wikipedia):
https://en.wikipedia.org/wiki/List_of_HTTP_status_codes

Any opinion on the use of 529 instead of 500; or alternative perspectives?

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley
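
To make the monitoring concern concrete, here is a rough sketch of what an
operator-side tally might look like while timeouts and genuine failures both
arrive as 500. It is Python for brevity; the summarize function, the bucket
names, and the assumption that the error message contains the string
"timeAllowed" are all hypothetical and not taken from Solr.

```python
from collections import Counter


def summarize(responses):
    """Tally (status, message) pairs into coarse health buckets."""
    counts = Counter()
    for status, message in responses:
        if status == 503:
            # 503 is the code documented as a service-wide condition.
            counts["service_unavailable"] += 1
        elif status == 500 and "timeAllowed" in message:
            # Assumption: the only way to spot a limit hit is the message text.
            counts["limit_exceeded"] += 1
        elif 500 <= status <= 599:
            counts["server_error"] += 1
        else:
            counts["other"] += 1
    return counts
```

With a shared 500, the monitor has to inspect message text to separate the
two cases; a distinct status code would reduce that branch to a plain
comparison on the status.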



