Re: AbstractMultiTermQueryConstantScoreWrapper cost estimates (https://github.com/apache/lucene/issues/13029)

2024-08-02 Thread Michael Froh
Incidentally, speaking as someone with only a superficial understanding of how the FSTs work, I'm wondering if there is risk of cost in expanding the first few terms. Say we have a million terms, but only one contains an 'a'. If someone searches for '*a*', does that devolve into a term scan? Or

Re: AbstractMultiTermQueryConstantScoreWrapper cost estimates (https://github.com/apache/lucene/issues/13029)

2024-08-02 Thread Michael Froh
Exactly! My initial implementation added some potential cost. (I think I enumerated up to 128 terms before giving up.) Now that Mayya moved the (probably tiny) cost of expanding the first 16 terms upfront, my change is theoretically "free". Froh On Fri, Aug 2, 2024 at 3:25 PM Greg Miller
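As a rough illustration of "enumerate up to N terms before giving up", the idea can be sketched in plain Java. This is only a sketch under assumptions, not the code from the PR or from AbstractMultiTermQueryConstantScoreWrapper: the class name, the `estimate` method, and the `fallbackCost` parameter are hypothetical, and per-term document frequencies are modelled as a simple iterator of integers rather than a Lucene TermsEnum.

```java
import java.util.Iterator;

/** Hypothetical sketch of a bounded term expansion; not Lucene's implementation. */
final class BoundedCostEstimate {
  private BoundedCostEstimate() {}

  /**
   * Sums document frequencies for at most maxTerms matching terms.
   * If the terms run out within the budget, the sum (capped by fallbackCost)
   * is returned. If more terms remain after the budget is spent, the method
   * gives up and returns fallbackCost.
   */
  static long estimate(Iterator<Integer> docFreqs, int maxTerms, long fallbackCost) {
    long sum = 0;
    for (int seen = 0; seen < maxTerms; seen++) {
      if (!docFreqs.hasNext()) {
        // All matching terms were visited: the sum bounds the number of matches.
        return Math.min(sum, fallbackCost);
      }
      sum += docFreqs.next();
    }
    // Budget spent: only trust the sum if no further terms remain.
    return docFreqs.hasNext() ? fallbackCost : Math.min(sum, fallbackCost);
  }
}
```

With a budget of 16 and three matching terms with document frequencies 3, 5 and 2, this returns 10; with a budget of 2 it returns the fallback value.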

Re: AbstractMultiTermQueryConstantScoreWrapper cost estimates (https://github.com/apache/lucene/issues/13029)

2024-08-02 Thread Greg Miller
Hey Froh- I got some time to look through your PR (most of the time was actually refreshing my memory on the change history leading up to your PR and digesting the issue described). I think this makes a ton of sense. If I'm understanding properly, the latest version of your PR essentially takes

Re: AbstractMultiTermQueryConstantScoreWrapper cost estimates (https://github.com/apache/lucene/issues/13029)

2024-08-01 Thread Greg Miller
Hi Froh- Thanks for raising this and sorry I missed your tag in GH#13201 back in June (had some vacation and was generally away). I'd be interested to see what others think as well, but I'll at least commit to looking through your PR tomorrow or Monday to get a better handle on what's being

AbstractMultiTermQueryConstantScoreWrapper cost estimates (https://github.com/apache/lucene/issues/13029)

2024-08-01 Thread Michael Froh
Hi there, For a few months, some of us have been running into issues with the cost estimate from AbstractMultiTermQueryConstantScoreWrapper. ( https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/AbstractMultiTermQueryConstantScoreWrapper.java#L300 ) In

Re: Intra-segment search concurrency implementation

2024-08-01 Thread Luca Cavanna
Hey Alan, Thanks for the feedback. I need to give it some more thought, but I kind of assumed that we would not want to create different instances of leaf reader context for partitions of the same segment. The mapping between the physical layout of a segment and leaf reader context should remain

Re: Intra-segment search concurrency implementation

2024-07-31 Thread Alan Woodward
Hi Luca, This is very exciting! I haven’t followed the dev process very closely so far, so this may already have been looked at and dismissed as unworkable for various reasons, but I’m wondering if we definitely need a new abstraction for a LeafReaderContext partition? Could we instead find

Re: IndexWriter.getReader speed (NRT)

2024-07-31 Thread Adrien Grand
Michael's work was indeed never merged. This approach of searching the IndexWriter buffer is tempting on paper, but I worry that it would come with lots of asterisks and be hard to use in practice. - Terms and points are not sorted in the IW buffer, so some queries like range queries would be

Re: [JENKINS] Lucene-main-MacOSX (64bit/hotspot/jdk-21.0.1) - Build # 11653 - Unstable!

2024-07-30 Thread Jakub Slowinski
Reproducible. Attempted the fix in https://github.com/apache/lucene/pull/13621. On Tue, 30 Jul 2024 at 13:55, Policeman Jenkins Server wrote: > Build: https://jenkins.thetaphi.de/job/Lucene-main-MacOSX/11653/ > Java: 64bit/hotspot/jdk-21.0.1 -XX:-UseCompressedOops -XX:+UseParallelGC > > 1 tests

Re: IndexWriter.getReader speed (NRT)

2024-07-30 Thread David Smiley
On Mon, Jul 29, 2024 at 3:51 PM Michael Froh wrote: > > Hi David, > > Great meeting you at Buzzwords last month! Nice seeing you too! ... > Adding an overload to IndexWriter.getReader would be pretty easy, but that > method is package-private. The hairier part probably involves deciding which

SearchWithCollectorTask breaking changes

2024-07-30 Thread Luca Cavanna
Hi all, I've been working to remove leftover usages of the deprecated search(Query,Collector) method. Some adjustments are needed in ReadTask as well as SearchWithCollectorTask. I made the changes in https://github.com/apache/lucene/pull/13602 . I could use a review from somebody who is familiar

Re: Welcome Armin Braun as Lucene comitter

2024-07-29 Thread Nhat Nguyen
Congrats, Armin! On Mon, Jul 29, 2024 at 11:08 AM Michael Gibney wrote: > Welcome, Armin! > > On Mon, Jul 29, 2024 at 7:54 AM Gus Heck wrote: > > > > Welcome, Armin! > > > > On Mon, Jul 29, 2024 at 2:14 AM Stefan Vodita > wrote: > >> > >> Welcome, Armin! > >> > >> On Thu, 25 Jul 2024 at

Intra-segment search concurrency implementation

2024-07-29 Thread Luca Cavanna
Hey all, I have been working on an initial implementation of intra-segment search concurrency for Lucene. My goal is to introduce the ability to concurrently search partitions of the same segment, think of a force-merged segment for instance, in a way that's as transparent as possible to users.
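To make "partitions of the same segment" concrete, here is a minimal sketch that splits a segment's doc ID space into contiguous ranges. It is an assumption-laden illustration, not the design proposed in this thread: the class `DocIdPartitions`, the `split` method, and the even-split policy are all hypothetical, and each range is represented as a two-element array `{minDocId, maxDocIdExclusive}`.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: even split of the doc ID range [0, maxDoc) of one segment. */
final class DocIdPartitions {
  private DocIdPartitions() {}

  /** Returns up to `partitions` non-empty half-open ranges {start, end} covering [0, maxDoc). */
  static List<int[]> split(int maxDoc, int partitions) {
    if (maxDoc < 0 || partitions < 1) {
      throw new IllegalArgumentException("maxDoc must be >= 0 and partitions >= 1");
    }
    List<int[]> ranges = new ArrayList<>();
    int base = maxDoc / partitions;       // minimum size of each range
    int remainder = maxDoc % partitions;  // the first `remainder` ranges get one extra doc
    int start = 0;
    for (int i = 0; i < partitions && start < maxDoc; i++) {
      int size = base + (i < remainder ? 1 : 0);
      ranges.add(new int[] {start, start + size});
      start += size;
    }
    return ranges;
  }
}
```

For example, `split(10, 3)` yields the ranges [0, 4), [4, 7) and [7, 10), which a search could hand to separate threads while the segment itself stays a single physical unit.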

Re: IndexWriter.getReader speed (NRT)

2024-07-29 Thread Michael Froh
Hi David, Great meeting you at Buzzwords last month! (Sorry for the late reply -- I was on vacation for weeks.) You can modify maxFullFlushMergeWaitMillis at the IndexWriterConfig level, but that sets it for any caller who tries to open an IndexReader from the IndexWriter. It sounds like you

Re: Welcome Armin Braun as Lucene comitter

2024-07-29 Thread Michael Gibney
Welcome, Armin! On Mon, Jul 29, 2024 at 7:54 AM Gus Heck wrote: > > Welcome, Armin! > > On Mon, Jul 29, 2024 at 2:14 AM Stefan Vodita wrote: >> >> Welcome, Armin! >> >> On Thu, 25 Jul 2024 at 12:08, Luca Cavanna wrote: >>> >>> I'm pleased to announce that Armin Braun has accepted the PMC's

Re: Welcome Armin Braun as Lucene comitter

2024-07-29 Thread Gus Heck
Welcome, Armin! On Mon, Jul 29, 2024 at 2:14 AM Stefan Vodita wrote: > Welcome, Armin! > > On Thu, 25 Jul 2024 at 12:08, Luca Cavanna wrote: > >> I'm pleased to announce that Armin Braun has accepted the PMC's >> invitation to become a Lucene committer. >> >> Armin, the tradition is that new

Re: Welcome Armin Braun as Lucene comitter

2024-07-29 Thread Stefan Vodita
Welcome, Armin! On Thu, 25 Jul 2024 at 12:08, Luca Cavanna wrote: > I'm pleased to announce that Armin Braun has accepted the PMC's invitation > to become a Lucene committer. > > Armin, the tradition is that new committers introduce themselves with a > brief bio. > > Thanks for your

Re: Lucene Cyborg

2024-07-28 Thread Armin Braun
> What is of course very crazy and the main reason for his improvements: He > figured out that the query part can be made faster by using some tricks like > not using virtual function calls. This is not possible in Java I've seen the same thing in profiling many times over. Method calls use

Re: Welcome Armin Braun as Lucene comitter

2024-07-27 Thread Michael Sokolov
Welcome Armin! On Fri, Jul 26, 2024 at 7:24 PM Greg Miller wrote: > > Welcome Armin! > > On Fri, Jul 26, 2024 at 10:51 AM Patrick Zhai wrote: >> >> Congrats and welcome, Armin! >> >> >> On Fri, Jul 26, 2024, 10:30 Vigya Sharma wrote: >>> >>> Congratulations and welcome, Armin! Volunteering as

Re: Welcome Armin Braun as Lucene comitter

2024-07-26 Thread Greg Miller
Welcome Armin! On Fri, Jul 26, 2024 at 10:51 AM Patrick Zhai wrote: > Congrats and welcome, Armin! > > On Fri, Jul 26, 2024, 10:30 Vigya Sharma wrote: > >> Congratulations and welcome, Armin! Volunteering as a firefighter is >> amazing, respect! >> >> On Fri, Jul 26, 2024 at 1:46 AM Ignacio

Re: Please help me find a good first issue

2024-07-26 Thread Greg Miller
Hi Lucas! Thanks for your interest and for reaching out. Sounds like your background could provide you with a useful set of fresh eyes to view our codebase through! This question comes up a lot (good starter issues) and I don't think our answers are ever all that satisfying. I will share a few

Re: Welcome Armin Braun as Lucene comitter

2024-07-26 Thread Patrick Zhai
Congrats and welcome, Armin! On Fri, Jul 26, 2024, 10:30 Vigya Sharma wrote: > Congratulations and welcome, Armin! Volunteering as a firefighter is > amazing, respect! > > On Fri, Jul 26, 2024 at 1:46 AM Ignacio Vera wrote: > >> Welcome Armin! >> >> On Fri, Jul 26, 2024 at 10:16 AM Chris

Re: Welcome Armin Braun as Lucene comitter

2024-07-26 Thread Vigya Sharma
Congratulations and welcome, Armin! Volunteering as a firefighter is amazing, respect! On Fri, Jul 26, 2024 at 1:46 AM Ignacio Vera wrote: > Welcome Armin! > > On Fri, Jul 26, 2024 at 10:16 AM Chris Hegarty > wrote: > > > > Welcome Armin! > > > > -Chris. > > > > > On 26 Jul 2024, at 05:24,

Re: Welcome Armin Braun as Lucene comitter

2024-07-26 Thread Ignacio Vera
Welcome Armin! On Fri, Jul 26, 2024 at 10:16 AM Chris Hegarty wrote: > > Welcome Armin! > > -Chris. > > > On 26 Jul 2024, at 05:24, Anshum Gupta wrote: > > > > Congratulations and welcome, Armin! > > > > On Thu, Jul 25, 2024 at 2:10 AM Luca Cavanna wrote: > > I'm pleased to announce that Armin

Re: Welcome Armin Braun as Lucene comitter

2024-07-26 Thread Chris Hegarty
Welcome Armin! -Chris. > On 26 Jul 2024, at 05:24, Anshum Gupta wrote: > > Congratulations and welcome, Armin! > > On Thu, Jul 25, 2024 at 2:10 AM Luca Cavanna wrote: > I'm pleased to announce that Armin Braun has accepted the PMC's invitation to > become a Lucene committer. > > Armin,

Re: Welcome Armin Braun as Lucene comitter

2024-07-25 Thread Anshum Gupta
Congratulations and welcome, Armin! On Thu, Jul 25, 2024 at 2:10 AM Luca Cavanna wrote: > I'm pleased to announce that Armin Braun has accepted the PMC's invitation > to become a Lucene committer. > > Armin, the tradition is that new committers introduce themselves with a > brief bio. > >

Re: Welcome Armin Braun as Lucene comitter

2024-07-25 Thread Armin Braun
Hi all, Thanks everyone for the warm welcome! A couple of words about me: I'm Armin, originally from Germany, now living in Switzerland. I've been with Elastic for a little over seven years now. Started out in the Logstash team, then moved to Elasticsearch about six years ago. Until about a year ago

Re: Welcome Armin Braun as Lucene comitter

2024-07-25 Thread Mike Drob
Congrats and welcome, Armin On Thu, Jul 25, 2024 at 8:49 AM Adrien Grand wrote: > Welcome Armin! > > On Thu, Jul 25, 2024 at 3:44 PM Uwe Schindler wrote: > >> Welcome Armin! Great to have you onboard. >> >> Yesterday I forgot to let you merge the PR on your own! You can now add >> your own

Re: Welcome Armin Braun as Lucene comitter

2024-07-25 Thread Adrien Grand
Welcome Armin! On Thu, Jul 25, 2024 at 3:44 PM Uwe Schindler wrote: > Welcome Armin! Great to have you onboard. > > Yesterday I forgot to let you merge the PR on your own! You can now add > your own changes entry: https://github.com/apache/lucene/pull/13608 > > Uwe > > On 25.07.2024 at 11:06

Re: Welcome Armin Braun as Lucene comitter

2024-07-25 Thread Uwe Schindler
Welcome Armin! Great to have you onboard. Yesterday I forgot to let you merge the PR on your own! You can now add your own changes entry: https://github.com/apache/lucene/pull/13608 Uwe On 25.07.2024 at 11:06, Luca Cavanna wrote: I'm pleased to announce that Armin Braun has accepted the

Re: Welcome Armin Braun as Lucene comitter

2024-07-25 Thread 张超
Welcome Armin! -- Zhang Chao > On 25 July 2024 at 18:33, Dawid Weiss wrote: > > > Welcome, Armin! > > On Thu, Jul 25, 2024 at 11:10 AM Luca Cavanna > wrote: >> I'm pleased to announce that Armin Braun has accepted the PMC's invitation >> to become a Lucene committer. >> >>

Re: Welcome Armin Braun as Lucene comitter

2024-07-25 Thread Dawid Weiss
Welcome, Armin! On Thu, Jul 25, 2024 at 11:10 AM Luca Cavanna wrote: > I'm pleased to announce that Armin Braun has accepted the PMC's invitation > to become a Lucene committer. > > Armin, the tradition is that new committers introduce themselves with a > brief bio. > > Thanks for your

Re: Welcome Armin Braun as Lucene comitter

2024-07-25 Thread Alan Woodward
Welcome Armin! > On 25 Jul 2024, at 10:06, Luca Cavanna wrote: > > I'm pleased to announce that Armin Braun has accepted the PMC's invitation to > become a Lucene committer. > > Armin, the tradition is that new committers introduce themselves with a brief > bio. > > Thanks for your

Re: Welcome Armin Braun as Lucene comitter

2024-07-25 Thread Michael McCandless
Welcome Armin! Mike McCandless http://blog.mikemccandless.com On Thu, Jul 25, 2024 at 5:10 AM Luca Cavanna wrote: > I'm pleased to announce that Armin Braun has accepted the PMC's invitation > to become a Lucene committer. > > Armin, the tradition is that new committers introduce themselves

Welcome Armin Braun as Lucene comitter

2024-07-25 Thread Luca Cavanna
I'm pleased to announce that Armin Braun has accepted the PMC's invitation to become a Lucene committer. Armin, the tradition is that new committers introduce themselves with a brief bio. Thanks for your contributions so far and looking forward to the upcoming ones :) Congratulations and

Please help me find a good first issue

2024-07-24 Thread Lucas Wolf
Hi everyone, My name is Lucas and I am interested in contributing to Lucene. I have read through the issues list on GitHub but felt that I was lacking a bit of context on what is achievable/impactful to tackle as a newcomer. Perhaps someone here can help me out. :) My background is mostly in

Re: Lucene Cyborg

2024-07-22 Thread Uwe Schindler
Hi, First of all: We have to be a bit careful with the results. E.g., the SegmentReader::get_live_docs() returns null, so the code does not use deleted documents. Of course this is not relevant if the original Lucene Index is also without deletions, but you need to keep an eye on it. What's

Re: Lucene Cyborg

2024-07-22 Thread Michael McCandless
Thanks for sharing Adrien, this is really cool! It's neat that the relative gains of Java vs C are quite a bit less than they were ~11 years ago when I played with a much smaller subset of queries. Also, COUNT on disjunction queries with Lucene Cyborg got slower. What a feat, to port so much of

Lucene Cyborg

2024-07-22 Thread Adrien Grand
Hello everyone, I recently stumbled on this paper after Ishan shared it on LinkedIn: https://github.com/0ctopus13prime/lucene-cyborg-paper/blob/main/LuceneCyborg_Hybrid_Search_Engine_Written_in_Java_and_C%2B%2B.pdf . This is quite impressive: this person did a high-fidelity rewrite of Lucene in

JDK 23 RDP2 | Removal of the legacy COMPAT locale provider and more heads-up!

2024-07-21 Thread David Delabassee
Welcome to the OpenJDK Quality Outreach summer update. JDK 23 is now in Rampdown Phase Two [1]; its overall feature set was frozen a few weeks ago. Per the JDK Release Process, we have now turned our focus to P1 and P2 bugs, which can be fixed with approval [2]. Late enhancements are still

Cacheable DocAndScoreQuery in AbstractKnnVectorQuery

2024-07-19 Thread Alexey Gorlenko
Hi all, I see that DocAndScoreQuery's Weight in AbstractKnnVectorQuery is cacheable now. Is this correct? I have doubts, because if we use KnnQuery as part of a filter (for example and(someFilter, or(knnQuery1, knnQuery2))), we get DocAndScoreQuery cached. But as far as I understand after index

Re: Supporting more than 2 versions of indexes

2024-07-16 Thread Anshum Gupta
Sorry for the late reply, Mike. Yes, I am suggesting a 'best effort' here that at least allows users to attempt using older indices. We could potentially even print a warning. On Mon, Jul 1, 2024 at 3:43 AM Michael McCandless wrote: > Increasing the scope/duration of backwards compatibility

Re: `./gradlew clean check` no longer works on main branch?

2024-07-15 Thread Dawid Weiss
> matter? Searching our build.gradle, I only see: >includeBuild("dev-tools/missing-doclet") > when searching for "includeBuild" but maybe it's other syntax too? > Look at settings.gradle - includeBuild("build-tools/build-infra") D. > On Fri, Jul 12, 2024 at 6:01 AM Dawid Weiss wrote: >

Re: `./gradlew clean check` no longer works on main branch?

2024-07-15 Thread David Smiley
What "composite subproject" / "included build" specifically is the matter? Searching our build.gradle, I only see: includeBuild("dev-tools/missing-doclet") when searching for "includeBuild" but maybe it's other syntax too? On Fri, Jul 12, 2024 at 6:01 AM Dawid Weiss wrote: > > > I've spent

Re: `./gradlew clean check` no longer works on main branch?

2024-07-12 Thread Dawid Weiss
I've spent some time on this. The problem is a result of the change from buildSrc to an included build (composite subproject). I don't think this should matter much but task ordering from the composite affects and is tangled with the top level project tasks - I filed this issue and provided a link

Re: `./gradlew clean check` no longer works on main branch?

2024-07-10 Thread Dawid Weiss
> > Fair -- I was mainly just wondering if you (or anyone) knew/remembered off > the top of your head something that had changed in the "architecture" of > our gradle deps on the main branch (since this problem doesn't seem to > affect branch_9x, even though it also uses the same version of gradle)

Re: `./gradlew clean check` no longer works on main branch?

2024-07-09 Thread Chris Hostetter
: The build does need an overhaul - there are some problems with it that I : know about - but it's not as simple as reviewing the dependencies you : mentioned - there are also implicit dependencies that are derived from : configurations dependencies, source sets, etc. Fair -- I was mainly just

Re: `./gradlew clean check` no longer works on main branch?

2024-07-09 Thread Dawid Weiss
> ...so maybe there is a way to avoid this problem in lucene by > removing / re-thinking / tweaking some of dependsOn, mustRunAfter, > or finalizedBy declarations somewhere? The build does need an overhaul - there are some problems with it that I know about - but it's not as simple as reviewing

Re: `./gradlew clean check` no longer works on main branch?

2024-07-08 Thread Chris Hostetter
: Seems like a bug in gradle somewhere, Chris - : https://github.com/gradle/gradle/issues/21325 Following some of the linked commits/issues/PRs from that URL, and reading between the lines, it sounds like gradle 7.5 (and above) changed something in how gradle analyzes what/where/when/how it

Re: `./gradlew clean check` no longer works on main branch?

2024-07-08 Thread Dawid Weiss
Seems like a bug in gradle somewhere, Chris - https://github.com/gradle/gradle/issues/21325 On Mon, Jul 8, 2024 at 10:28 PM Dawid Weiss wrote: > > Yep, same for me. Looks terrible. I don't know what this is, sorry. > > Dawid > > > On Mon, Jul 8, 2024 at 9:30 PM Chris Hostetter > wrote: > >> >>

Re: `./gradlew clean check` no longer works on main branch?

2024-07-08 Thread Dawid Weiss
Yep, same for me. Looks terrible. I don't know what this is, sorry. Dawid On Mon, Jul 8, 2024 at 9:30 PM Chris Hostetter wrote: > > It's been a bit since I built anything on the main branch, but it seems > like something is now broken in the gradle task > dependencies/parallelization if you

`./gradlew clean check` no longer works on main branch?

2024-07-08 Thread Chris Hostetter
It's been a bit since I built anything on the main branch, but it seems like something is now broken in the gradle task dependencies/parallelization if you try to run... ./gradlew clean check A truncated example of what the output looks like is below, the final failure message

Re: scalar quantization heap usage during merge

2024-07-04 Thread Gautam Worah
Thanks for the PR Ben. I'll try to take a look in the next couple of days. On leave for now.. I got the setup working yesterday, and thought of sharing some learnings. I changed the LiveIndexWriterConfig#ramBufferSizeMB to 2048 and that made things work. I was even able to keep merging on, and

Re: scalar quantization heap usage during merge

2024-07-03 Thread Benjamin Trent
Hey Gautam & Michael, I opened a PR that will help slightly. It should reduce the heap usage by a smallish factor. But, I would still expect the cost to be dominated by the `float[]` vectors held in memory before flush. https://github.com/apache/lucene/pull/13538 The other main overhead is the
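The point that raw `float[]` vectors dominate heap before flush can be checked with back-of-the-envelope arithmetic. The sketch below is illustrative only: the class and method names are hypothetical, the dimension of 768 used in the example is an assumption (the thread does not state one), and the calculation ignores object headers, graph structures, and any other per-document overhead.

```java
/** Hypothetical helper for rough heap arithmetic on unquantized float vectors. */
final class VectorHeapEstimate {
  private VectorHeapEstimate() {}

  /** Bytes needed for the raw float values only: numVectors * dimension * 4. */
  static long rawFloatBytes(long numVectors, int dimension) {
    long floats = Math.multiplyExact(numVectors, (long) dimension);
    return Math.multiplyExact(floats, (long) Float.BYTES);
  }

  /** How many vectors' raw float values fit into a buffer of the given size. */
  static long vectorsPerBuffer(long bufferBytes, int dimension) {
    return bufferBytes / ((long) dimension * Float.BYTES);
  }
}
```

Under the assumed dimension of 768, one million vectors need 3,072,000,000 bytes for the float values alone, and a 2048 MB buffer holds the raw values of roughly 699,050 vectors before a flush would be due.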

Re: scalar quantization heap usage during merge

2024-07-02 Thread Gautam Worah
Hi Ben, I am working on something very close to what Michael Sokolov has done. I see OOMs on the Writer when it tries to index 130M 8 bit / 4 bit quantized vectors on a single big box with a 40 GB heap, with HNSW disabled. I've tried indexing all the vectors as plain vectors converted to floats

Re: github notification delay

2024-07-02 Thread Michael Sokolov
ah that helps, thanks On Tue, Jul 2, 2024 at 2:41 PM Robert Muir wrote: > > On Tue, Jul 2, 2024 at 1:59 PM Michael Sokolov wrote: > > > > Hi all - I wonder if anyone else is observing weird email behavior > > from Github. I'm starting to see emails generated from PRs and issues > > that are

Re: github notification delay

2024-07-02 Thread Robert Muir
On Tue, Jul 2, 2024 at 1:59 PM Michael Sokolov wrote: > > Hi all - I wonder if anyone else is observing weird email behavior > from Github. I'm starting to see emails generated from PRs and issues > that are wildly out of date. Like one dated yesterday that was > generated from a comment that is

github notification delay

2024-07-02 Thread Michael Sokolov
Hi all - I wonder if anyone else is observing weird email behavior from Github. I'm starting to see emails generated from PRs and issues that are wildly out of date. Like one dated yesterday that was generated from a comment that is weeks old. And I am missing many current updates -- as if there

Re: Should all 'static final' CharArray(Set|Map)s in stock Analyzers be "public" ?

2024-07-02 Thread Chris Hostetter
: Should we keep the HOLDER.DEFAULT pattern to not create the default stop : set if not needed (when there is a custom building)? I did not mean to imply that i think we eliminate the HOLDER pattern/optimization -- i just didn't include it in my "(simplified)" example to try and focus on the
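For readers unfamiliar with the holder pattern being discussed, a minimal sketch of the general Java idiom (initialization-on-demand holder) follows. It is not the Lucene source: it uses `java.util.Set` in place of `CharArraySet`, and the class name, the `loads` counter, the stop words, and the `stopSetFor` helper are hypothetical names chosen for illustration.

```java
import java.util.Set;

/** Hypothetical sketch of the lazy holder idiom using plain java.util.Set. */
final class StopWordsExample {
  private StopWordsExample() {}

  /** Counts how often the default set was built; only here to make the laziness visible. */
  static int loads = 0;

  /** The JVM initializes this nested class only on first access to one of its members. */
  private static final class DefaultSetHolder {
    static final Set<String> DEFAULT_STOP_SET = load();

    private static Set<String> load() {
      loads++;
      return Set.of("a", "an", "the"); // Set.of returns an unmodifiable set
    }
  }

  /** Public accessor: triggers loading of the default set on first call. */
  static Set<String> getDefaultStopSet() {
    return DefaultSetHolder.DEFAULT_STOP_SET;
  }

  /** A caller-supplied set never touches the holder, so the default is never built. */
  static Set<String> stopSetFor(Set<String> custom) {
    return custom != null ? custom : getDefaultStopSet();
  }
}
```

A caller that always passes its own set to `stopSetFor` leaves `loads` at 0, which is the optimization the question is about; exposing the default through a public method rather than a public field keeps that laziness intact.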

Re: Should all 'static final' CharArray(Set|Map)s in stock Analyzers be "public" ?

2024-07-02 Thread Bruno Roustant
Should we keep the HOLDER.DEFAULT pattern to not create the default stop set if not needed (when there is a custom building)? On Tue, 2 Jul 2024 at 01:45, Chris Hostetter wrote: > > : There's also one other problem with those sets: Unfortunately they are > : modifiable, because they are not

Re: Should all 'static final' CharArray(Set|Map)s in stock Analyzers be "public" ?

2024-07-01 Thread Chris Hostetter
: There's also one other problem with those sets: Unfortunately they are : modifiable, because they are not real "Set" but CharArraySets. There : is no 100% unmodifiable view of them. This was the main reason why we did not : make them public for newer variants of analyzers. I think we should

Re: lucene-solr-1 Jenkins Agent Management

2024-07-01 Thread Jason Gerlowski
Thanks again Uwe! On Wed, Jun 26, 2024 at 4:41 PM Uwe Schindler wrote: > > Sorry, > > it's dead again. So fixing the stuck slave is up to ASF INFRA. Maybe it's > something like a wrong Java version. Without any logfiles it's impossible > to guess. > > Uwe > > On 26.06.2024 at 22:39, Uwe

Re: Supporting more than 2 versions of indexes

2024-07-01 Thread Michael McCandless
Increasing the scope/duration of backwards compatibility index support across the board adds a big taxation and risk on ongoing development. It's hard enough just supporting N-1 major release written indices. Or are we talking about the "best effort" (e.g. sandbox Codecs) that I think Simon

Re: Should all 'static final' CharArray(Set|Map)s in stock Analyzers be "public" ?

2024-06-30 Thread Uwe Schindler
Hi again, There's also one other problem with those sets: Unfortunately they are modifiable, because they are not real "Set" but CharArraySets. There is no 100% unmodifiable view of them. This was the main reason why we did not make them public for newer variants of analyzers. I think we

Re: Should all 'static final' CharArray(Set|Map)s in stock Analyzers be "public" ?

2024-06-30 Thread Uwe Schindler
Hi, I am fine with this. But on the other hand: Why do you want to replicate the files into Solr's config folder? A Solr configuration should better be able to load the stopwords file from resources, too. I was always wondering why we have that tons of files in the default configset, some of

Re: Should all 'static final' CharArray(Set|Map)s in stock Analyzers be "public" ?

2024-06-27 Thread Dawid Weiss
+1 to make it more consistent (with preference for a public method). Dawid On Fri, Jun 28, 2024 at 2:16 AM Chris Hostetter wrote: > > Over in Solr, there's an open jira regarding some "drift" that has > happened over time between some of the lang specific stopword files that > Solr ships in

Should all 'static final' CharArray(Set|Map)s in stock Analyzers be "public" ?

2024-06-27 Thread Chris Hostetter
Over in Solr, there's an open jira regarding some "drift" that has happened over time between some of the lang specific stopword files that Solr ships in its default configset and the equivalent files that are provided in the lucene jars (and loaded by the corresponding lucene Analyzers

Re: Supporting more than 2 versions of indexes

2024-06-27 Thread Anshum Gupta
I'm actually only considering support for 8x+ but I think the default codec, used by most users, should allow for 7x indexes to be read by 9x. If we can do this for 8x+ i.e. indexes generated with 8x being supported by 10 would be a good starting point as well. On Thu, Jun 27, 2024 at 1:23 PM

Re: Supporting more than 2 versions of indexes

2024-06-27 Thread Ishan Chattopadhyaya
+1, we should definitely give this a try. Do you have any particular version combinations in mind that don't work for users now? On my end, I see Solr 8x users who would love to use Solr 9x, but with Lucene 8x indexes (previously upgraded from Lucene 7x). On Thu, 27 Jun 2024 at 23:17, Anshum

Supporting more than 2 versions of indexes

2024-06-27 Thread Anshum Gupta
Hi everyone, At Buzzwords and Community Over Code this last month, the topic of supporting indexes for over 2 versions came up. While there are times that require breaking compatibility, I think it would be really useful to support the indexes especially if you use a codec that doesn't have a

[ANNOUNCE] Apache Lucene 9.11.1 released

2024-06-27 Thread Ignacio Vera
The Lucene PMC is pleased to announce the release of Apache Lucene 9.11.1. Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

[RESULT] [VOTE] Release Lucene 9.11.1 RC1

2024-06-27 Thread Ignacio Vera
It's been >72h since the vote was initiated and the result is: +1: 11 (11 binding), 0: 0, -1: 0. This vote has PASSED

Re: [VOTE] Release Lucene 9.11.1 RC1

2024-06-27 Thread Ignacio Vera
Hello, This vote is now closed. I will be sending an email with the results, Thank you for participating! Ignacio On Thu, Jun 27, 2024 at 1:32 PM Michael McCandless < luc...@mikemccandless.com> wrote: > +1 > > SUCCESS! [0:19:43.387183] > > Thank you for RMing Ignacio! > > Mike McCandless > >

Re: [VOTE] Release Lucene 9.11.1 RC1

2024-06-27 Thread Michael McCandless
+1 SUCCESS! [0:19:43.387183] Thank you for RMing Ignacio! Mike McCandless http://blog.mikemccandless.com On Mon, Jun 24, 2024 at 1:29 AM Ignacio Vera wrote: > Please vote for release candidate 1 for Lucene 9.11.1 > > > The artifacts can be downloaded from: > > >

Re: Lucene 10

2024-06-27 Thread Michael McCandless
Thanks Adrien. Longish term planning in open source is such a hard thing so I'm glad you are helping to herd us cats ;) I've also finally switched our nightly benchmarks to use concurrent search (intra-query concurrency)! It's annotation GM in the charts. Some queries got faster, like

Re: [VOTE] Release Lucene 9.11.1 RC1

2024-06-27 Thread Dawid Weiss
+1 SUCCESS! [2:56:37.226214] On Mon, Jun 24, 2024 at 7:28 AM Ignacio Vera wrote: > Please vote for release candidate 1 for Lucene 9.11.1 > > > The artifacts can be downloaded from: > > >

Re: [VOTE] Release Lucene 9.11.1 RC1

2024-06-27 Thread Ignacio Vera
Hi Uwe, I will keep the vote open until today 12:00 UTC. I hope it gives you enough time to run the smoke tester with JDK 22. Cheers, Ignacio On Wed, Jun 26, 2024 at 11:05 PM Uwe Schindler wrote: > Hi, > > Small update: Give me a few hours, I will add Java 22 to the list > (important for

Re: [VOTE] Release Lucene 9.11.1 RC1

2024-06-26 Thread Uwe Schindler
Hi, Small update: Give me a few hours, I will add Java 22 to the list (important for vectors). My vote still counts unless I veto. Uwe On 26 June 2024 22:57:48 CEST, Uwe Schindler wrote: >Hi, > >+1 from my side. > >Policeman Jenkins checked the release with several JDKs for me: > >+

Re: [VOTE] Release Lucene 9.11.1 RC1

2024-06-26 Thread Uwe Schindler
Hi, +1 from my side. Policeman Jenkins checked the release with several JDKs for me: + python3 -u dev-tools/scripts/smokeTestRelease.py --test-alternative-java /home/jenkins/tools/java/64bit/hotspot/latest-jdk17 --test-alternative-java /home/jenkins/tools/java/64bit/hotspot/latest-jdk19

Re: lucene-solr-1 Jenkins Agent Management

2024-06-26 Thread Uwe Schindler
Sorry, it's dead again. So fixing the stuck slave is up to ASF INFRA. Maybe it's something like a wrong Java version. Without any logfiles it's impossible to guess. Uwe On 26.06.2024 at 22:39, Uwe Schindler wrote: Hi, looks like killing the stuck slave process ("sudo pkill -9 -u jenkins")

Re: lucene-solr-1 Jenkins Agent Management

2024-06-26 Thread Uwe Schindler
Hi, looks like killing the stuck slave process ("sudo pkill -9 -u jenkins") helped lucene-solr-2 to recover. Uwe On 26.06.2024 at 16:30, Jason Gerlowski wrote: Hi Uwe Thanks for the context! A few follow-up questions for you (or anyone else that can answer): 1. What's the hostname for

Re: lucene-solr-1 Jenkins Agent Management

2024-06-26 Thread Uwe Schindler
Hi Jason, On 26.06.2024 at 16:30, Jason Gerlowski wrote: Hi Uwe Thanks for the context! A few follow-up questions for you (or anyone else that can answer): 1. What's the hostname for lucene-solr-1? Maybe my Jenkins knowledge is lacking, but I can't find the full hostname exposed anywhere

Re: Lucene 10

2024-06-26 Thread Adrien Grand
Hello everyone, Time flies, I started this email thread ~3.5 months ago and we now have ~3 months before September 22nd, where 10.0 will go on feature freeze. Robert kindly added a description to the GitHub milestone that refers to this thread: https://github.com/apache/lucene/milestone/2.

Re: lucene-solr-1 Jenkins Agent Management

2024-06-26 Thread Jason Gerlowski
Hi Uwe Thanks for the context! A few follow-up questions for you (or anyone else that can answer): 1. What's the hostname for lucene-solr-1? Maybe my Jenkins knowledge is lacking, but I can't find the full hostname exposed anywhere in Jenkins, and I'd like to try SSH-ing in. (I'm hoping it

Re: [JENKINS] Lucene-main-MacOSX (64bit/hotspot/jdk-21.0.1) - Build # 11500 - Still Failing!

2024-06-26 Thread Uwe Schindler
The fault is also on my side, because the Hackintosh VM is way too outdated :-) It is now also EOL for Apple Updates, so I may need to spend some time to reinstall or upgrade it to latest x64 macOS (and hope it still works, the VM needs some patched cpuid, as it is AMD and Apple doesn't like

Re: lucene-solr-1 Jenkins Agent Management

2024-06-26 Thread Uwe Schindler
Hi, yes e.g., I have access to that machine and can issue a reboot. The two machines are VMs specifically owned by the Lucene PMC (not Solr as far as I remember), but they were created before the split of projects. Some people have direct access, not sure how this is managed. Basically to

Re: [VOTE] Release Lucene 9.11.1 RC1

2024-06-25 Thread Houston Putman
+1 SUCCESS! [0:37:06.564011] - Houston On Tue, Jun 25, 2024 at 1:27 PM Anshum Gupta wrote: > +1 > > SUCCESS! [0:44:08.897482] > > On Sun, Jun 23, 2024 at 10:29 PM Ignacio Vera wrote: > >> Please vote for release candidate 1 for Lucene 9.11.1 >> >> >> The artifacts can be downloaded from: >>

Re: [VOTE] Release Lucene 9.11.1 RC1

2024-06-25 Thread Anshum Gupta
+1 SUCCESS! [0:44:08.897482] On Sun, Jun 23, 2024 at 10:29 PM Ignacio Vera wrote: > Please vote for release candidate 1 for Lucene 9.11.1 > > > The artifacts can be downloaded from: > > >

lucene-solr-1 Jenkins Agent Management

2024-06-25 Thread Jason Gerlowski
Hey all, Sending this email to discuss the ASF Jenkins' 'lucene-solr-1' worker node, which both Lucene and Solr use to run builds. While investigating a recent issue with some Solr builds, I asked INFRA to restart 'lucene-solr-1'. They were happy to oblige (and I'm now unblocked), but in the

Re: [VOTE] Release Lucene 9.11.1 RC1

2024-06-25 Thread Tomás Fernández Löbbe
+1 SUCCESS! [1:10:51.244392] On Tue, Jun 25, 2024 at 11:20 AM Chris Hegarty wrote: > +1 SUCCESS! [0:54:35.855891] > > -Chris. > > > > On 25 Jun 2024, at 09:27, Adrien Grand wrote: > > > > +1 SUCCESS! [1:25:33.092957] > > > > On Mon, Jun 24, 2024 at 4:22 PM Michael Sokolov > wrote: > >

Re: [VOTE] Release Lucene 9.11.1 RC1

2024-06-25 Thread Chris Hegarty
+1 SUCCESS! [0:54:35.855891] -Chris. > On 25 Jun 2024, at 09:27, Adrien Grand wrote: > > +1 SUCCESS! [1:25:33.092957] > > On Mon, Jun 24, 2024 at 4:22 PM Michael Sokolov wrote: > SUCCESS! [0:55:48.190137] > > (tested w/Corretto JDK) > > +1 > > On Mon, Jun 24, 2024 at 8:01 AM Benjamin

Re: [JENKINS] Lucene-main-MacOSX (64bit/hotspot/jdk-21.0.1) - Build # 11500 - Still Failing!

2024-06-25 Thread Chris Hegarty
Thanks Uwe, I upgraded Gradle (to 8.8), but of course didn’t notice this issue. Apologies. Happy that you found a workaround. -Chris. > On 22 Jun 2024, at 15:34, Dawid Weiss wrote: > > > Thank you for digging, Uwe! > > On Fri, Jun 21, 2024 at 10:24 PM Uwe Schindler wrote: > Hi, > it

Re: [VOTE] Release Lucene 9.11.1 RC1

2024-06-25 Thread Adrien Grand
+1 SUCCESS! [1:25:33.092957] On Mon, Jun 24, 2024 at 4:22 PM Michael Sokolov wrote: > SUCCESS! [0:55:48.190137] > > (tested w/Corretto JDK) > > +1 > > On Mon, Jun 24, 2024 at 8:01 AM Benjamin Trent > wrote: > > > > SUCCESS! [0:40:46.898514] > > > > +1 > > > > On Mon, Jun 24, 2024 at 1:29 AM

Re: [VOTE] Release Lucene 9.11.1 RC1

2024-06-24 Thread Michael Sokolov
SUCCESS! [0:55:48.190137] (tested w/Corretto JDK) +1 On Mon, Jun 24, 2024 at 8:01 AM Benjamin Trent wrote: > > SUCCESS! [0:40:46.898514] > > +1 > > On Mon, Jun 24, 2024 at 1:29 AM Ignacio Vera wrote: > > > > Please vote for release candidate 1 for Lucene 9.11.1 > > > > > > The artifacts can

Re: [VOTE] Release Lucene 9.11.1 RC1

2024-06-24 Thread Benjamin Trent
SUCCESS! [0:40:46.898514] +1 On Mon, Jun 24, 2024 at 1:29 AM Ignacio Vera wrote: > > Please vote for release candidate 1 for Lucene 9.11.1 > > > The artifacts can be downloaded from: > > https://dist.apache.org/repos/dist/dev/lucene/lucene-9.11.1-RC1-rev-0c087dfdd10e0f6f3f6faecc6af4415e671a9e69

Re: Bugfix release 9.11.1

2024-06-23 Thread Ignacio Vera
Here are the release notes for the release: https://cwiki.apache.org/confluence/pages/resumedraft.action?draftId=311626871=460c13d9-7e8e-4e24-912b-2933f64bb746; Please feel free to edit them. On Fri, Jun 21, 2024 at 11:55 AM Stefan Vodita wrote: > The fix is now in main, branch_9x, and
