[VOTE] Release Lucene 9.0.0 RC2

2021-11-23 Thread Adrien Grand
Please vote for release candidate 2 for Lucene 9.0.0. The artifacts can be downloaded from: https://dist.apache.org/repos/dist/dev/lucene/lucene-9.0.0-RC2-rev-95072f3b71e67e308d71a6149323bf7dd04c8f75 You can run the smoke tester directly with this command: python3 -u

Re: [VOTE] Release Lucene 9.0.0 RC1

2021-11-22 Thread Adrien Grand
tch - but wanted to give notice > > of that, just in case. > > > > [1] > https://github.com/apache/lucene/commit/4193bcbc02313c82afcf8cf9e2d14e47466cb1c3 > > > > Tomoko > > > > On Mon, Nov 22, 2021 at 6:18 Adrien Grand wrote: > > > > > > Fair en

Re: What should we do of branch_8x?

2021-11-21 Thread Adrien Grand
.4711. > >> >> >> >> > >> >> >> >> As said before: let's close branch 8.x and add protection to > it in GitHub. Anybody may merge bugfixes directly from Solr or Lucene main > to branch_8_11. I see no problem. Just no index changes!

Re: [VOTE] Release Lucene 9.0.0 RC1

2021-11-21 Thread Adrien Grand
> >> On Sun, Nov 21, 2021 at 5:11 AM Robert Muir wrote: > > >> > > > >> > -1 to release lucene 9.0, as long as branch_8x remains. > > >> > > > >> > I know you made a separate thread for this, but it is a real

What should we do of branch_8x?

2021-11-20 Thread Adrien Grand
Uwe brought up the question on the vote thread: we are not going to do an 8.12 release, so what should we do with branch_8x?

Re: [VOTE] Release Lucene 9.0.0 RC1

2021-11-20 Thread Adrien Grand
delete it? > > Uwe > > On November 20, 2021 at 13:15:23 UTC, Adrien Grand wrote: >> >> We need to keep the 8.11 jobs, but I think they can be disabled. We >> typically only enable them when we start discussing doing a new patch >> release? >> >> On Sat.

Re: [VOTE] Release Lucene 9.0.0 RC1

2021-11-20 Thread Adrien Grand
er > Achterdiek 19, D-28357 Bremen > https://www.thetaphi.de > eMail: u...@thetaphi.de > > > -Original Message- > > From: Adrien Grand > > Sent: Saturday, November 20, 2021 9:25 AM > > To: Lucene Dev > > Subject: [VOTE] Release Lucene 9.0.0 RC1 >

[VOTE] Release Lucene 9.0.0 RC1

2021-11-20 Thread Adrien Grand
Please vote for release candidate 1 for Lucene 9.0.0. The artifacts can be downloaded from: https://dist.apache.org/repos/dist/dev/lucene/lucene-9.0.0-RC1-rev-903ee94dc50643299c15dfa954410f3ee4d62075 You can run the smoke tester directly with this command: python3 -u

Re: Expected name of release candidates

2021-11-19 Thread Adrien Grand
943 > > Dawid > > On Fri, Nov 19, 2021 at 10:48 PM Adrien Grand wrote: > > > > Hello, > > > > I'm running the release scripts, and there is some inconsistency > > between the build logic that creates a release candidate where the > > name is > >

Expected name of release candidates

2021-11-19 Thread Adrien Grand
Hello, I'm running the release scripts, and there is some inconsistency between the build logic that creates a release candidate where the name is lucene-{version}-RC1-rev-{git hash} while the smoketester assumes a name that is lucene-{version}-RC1-rev{git hash} (Note that there is no
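
The two naming schemes quoted above differ only in the hyphen after `rev`. As a rough illustration, here is a minimal Python sketch of a pattern that tolerates both forms. The name `parse_rc_name`, the regular expression, and the assumption of a full 40-character hash are all hypothetical; this is not the smoke tester's actual code.

```python
import re

# Hypothetical pattern: accepts both "rev-<hash>" and "rev<hash>".
# Assumes a three-part version number and a full 40-character git hash.
RC_NAME = re.compile(
    r"^lucene-(?P<version>\d+\.\d+\.\d+)-RC(?P<rc>\d+)-rev-?(?P<rev>[0-9a-f]{40})$"
)


def parse_rc_name(name):
    """Return (version, rc_number, git_hash), or None if the name does not match."""
    m = RC_NAME.match(name)
    if m is None:
        return None
    return m.group("version"), int(m.group("rc")), m.group("rev")
```

Making the hyphen optional is only one way to reconcile the two sides; agreeing on a single format in both the build logic and the smoke tester would avoid the ambiguity altogether.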

Re: Lucene 9.0 release candidate

2021-11-19 Thread Adrien Grand
Greg already has a PR for it: https://github.com/apache/lucene/pull/458. On Fri, Nov 19, 2021 at 5:30 PM Robert Muir wrote: > > Can we also push commit to branch 9x (just don't want it to get forgotten) > > On Fri, Nov 19, 2021 at 10:50 AM Adrien Grand wrote: > > > > Th

Re: Lucene 9.0 release candidate

2021-11-19 Thread Adrien Grand
Thanks Greg, Patrick, Mike and Robert for the quick turnaround on getting these changes merged! I'll now resume work on the 9.0 release. On Fri, Nov 19, 2021 at 4:44 PM Greg Miller wrote: > > Heads up that both LUCENE-10122 and LUCENE-10062 have been merged onto > branch_9_0 now. @Adrie

8.11 / 8.x CI disabled

2021-11-17 Thread Adrien Grand
Hello, Now that 8.11 is out, I disabled all Lucene and Solr 8.11 / 8.x jobs on ASF Jenkins. We can re-enable the 8.11 builds if/when we want to do an 8.11.1 release. Uwe, should we disable these builds on Policeman Jenkins too? -- Adrien

Re: [ANNOUNCE] Apache Lucene 8.11.0 released

2021-11-17 Thread Adrien Grand
/core/8_11/changes/Changes.html > > I got 404 Not Found due to missing "_0" of "8_11_0" in the link for > Changes. > > Koji > > On 2021/11/16 21:59, Adrien Grand wrote: > > The Lucene PMC is pleased to announce the release of Apache Lucene 8.11. >

[ANNOUNCE] Apache Lucene 8.11.0 released

2021-11-16 Thread Adrien Grand
The Lucene PMC is pleased to announce the release of Apache Lucene 8.11. Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform. This

Re: Lucene 9.0 release candidate

2021-11-15 Thread Adrien Grand
I'll update on this thread when PRs have been merged. > > Cheers, > -Greg > > On Mon, Nov 15, 2021 at 6:20 AM Adrien Grand wrote: > > > > Thanks Dawid. > > > > @Greg Miller What do you think about getting these two PRs in for 9.0? > > > > On Sun, Nov

Re: Lucene 9.0 release candidate

2021-11-15 Thread Adrien Grand
add this - already have a local > patch that does it and enables Luke to become a first-class module, > for example. > > Dawid > > On Sat, Nov 13, 2021 at 8:49 PM Adrien Grand wrote: > > > > Hello, > > > > I plan to build a RC for Lucene 9.0 in the next few day

Lucene 9.0 release candidate

2021-11-13 Thread Adrien Grand
Hello, I plan to build a RC for Lucene 9.0 in the next few days. We don't have blockers left, but there are two faceting changes that look like we could save some backward compatibility logic in 10.x by folding them into 9.0: - LUCENE-10062 :

[RESULT][VOTE] Release Lucene/Solr 8.11.0 RC1

2021-11-13 Thread Adrien Grand
It's been >72h since the vote was initiated and the result is: +1: 14 (13 binding); 0: 0; -1: 0. This vote has PASSED. -- Adrien

Re: Add custom merge policy to Lucene sandbox ?

2021-11-10 Thread Adrien Grand
+1 The entry bar for the sandbox is very low, it's a good place for experimental functionality. We can then look into folding it into the default merge policy if that makes sense. On Wed, Nov 10, 2021 at 08:49, Anand Kotriwal wrote: > Hi, > > We (at Amazon product search) have customized

[VOTE] Release Lucene/Solr 8.11.0 RC1

2021-11-09 Thread Adrien Grand
Please vote for release candidate 1 for Lucene/Solr 8.11.0. The artifacts can be downloaded from: https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-8.11.0-RC1-reve912fdd5b632267a9088507a2a6bcbc75108f381 You can run the smoke tester directly with this command: python3 -u

Re: 8.11 release candidate

2021-11-09 Thread Adrien Grand
The Solr change has been merged. There's a good bugfix for concurrent search that has been approved: https://github.com/apache/lucene/pull/431, so I'll wait for this one to be merged to branch_8_11 and I'll resume building a release candidate. On Fri, Nov 5, 2021 at 2:35 PM Adrien Grand wrote

Re: 8.11 release candidate

2021-11-05 Thread Adrien Grand
On Fri, Nov 5, 2021 at 5:05 AM Adrien Grand wrote: > >> All blockers are now addressed, I'll proceed with building a release >> candidate. >> >> On Thu, Nov 4, 2021 at 6:51 PM Adrien Grand wrote: >> >>> Thanks Jan! >>> >>> On Thu,

Re: 8.11 release candidate

2021-11-05 Thread Adrien Grand
All blockers are now addressed, I'll proceed with building a release candidate. On Thu, Nov 4, 2021 at 6:51 PM Adrien Grand wrote: > Thanks Jan! > > On Thu, Nov 4, 2021 at 2:32 PM Jan Høydahl wrote: > >> I added a PR https://github.com/apache/lucene-solr/pull/2603 for >>

Re: Use lucene custom scorer for highlighting?

2021-11-05 Thread Adrien Grand
Hi Chris, While this is theoretically possible, this would require rewriting all queries that you might want to run, so this would be a huge investment. In general doing something like that is a bad idea since it requires computing highlights for many documents that may not make it to the top-k

Re: 8.11 release candidate

2021-11-04 Thread Adrien Grand
Thanks Jan! On Thu, Nov 4, 2021 at 2:32 PM Jan Høydahl wrote: > I added a PR https://github.com/apache/lucene-solr/pull/2603 for > SOLR-14438, that I had in the workings. I think it will do... > > Jan > > On Nov 4, 2021 at 10:34, Adrien Grand wrote: > > Joel recommended

Re: 8.11 release candidate

2021-11-04 Thread Adrien Grand
actually be treated as a blocker for 8.11? On Wed, Nov 3, 2021 at 7:32 PM Adrien Grand wrote: > Jason's above issue isn't fixed yet but it looks like we should be able to > have a fix in the very near future. > > However we seem to still have these two other open Solr blockers for 8.1

Re: 8.11 release candidate

2021-11-03 Thread Adrien Grand
timeline for cutting the RC. >> >> Best, >> >> Jason >> >> On Tue, Nov 2, 2021 at 1:31 PM Adrien Grand wrote: >> > >> > Hello, >> > >> > Assuming CI is green and no blockers have been raised until then, I >> plan to bu

Re: [JENKINS] Lucene » Lucene-Solr-Tests-8.11 - Build # 7 - Unstable!

2021-11-03 Thread Adrien Grand
Thanks for the pointer Dawid. On Wed, Nov 3, 2021 at 4:15 PM Dawid Weiss wrote: > > It's this one, Adrien - > https://issues.apache.org/jira/browse/LUCENE-10190 > > On Wed, Nov 3, 2021 at 4:01 PM Adrien Grand wrote: > >> I ran 100 iterations with ant beast but could no

Re: [JENKINS] Lucene » Lucene-Solr-Tests-8.11 - Build # 7 - Unstable!

2021-11-03 Thread Adrien Grand
I ran 100 iterations with ant beast but could not reproduce this failure. I won't treat this test failure as a blocker for the release, but I suspect there's still something wrong with it. On Wed, Nov 3, 2021 at 12:36 PM Apache Jenkins Server < jenk...@builds.apache.org> wrote: > Build: >

Bump minimum Java version to 17 on main (10.0)

2021-11-03 Thread Adrien Grand
Hello, Now that the main branch is the future 10.0 version, would there be any concern if we bumped the minimum Java version to 17 instead of 11? -- Adrien

Re: 8.11 release candidate

2021-11-02 Thread Adrien Grand
like to try to get it into 8.11. Would that > be ok? > > I'm just digging into things now so I don't have a complete > understanding yet, but I'm optimistic it won't long delay your > timeline for cutting the RC. > > Best, > > Jason > > On Tue, Nov 2, 2021

Re: Taxonomy backward compatibility tests

2021-11-02 Thread Adrien Grand
420. > It'll also need to pass the backward compatibility check, so for me, it > seems it's better to keep it. And if I can make it work before the 9.0 > release then we can delete it after, and if not, we need a backward > compatibility test between 9 and 10 I guess? > > Best >

Re: Taxonomy backward compatibility tests

2021-11-02 Thread Adrien Grand
Thanks Gautam! On Tue, Nov 2, 2021 at 6:36 PM Gautam Worah wrote: > I think this makes sense. Lucene 9 continues to be backwards compatible > with Lucene 8. Lucene 10 will be able to read Lucene 9,10 taxonomy indexes. > LGTM! > > -- > Gautam Worah. > > > On Tue, No

8.11 release candidate

2021-11-02 Thread Adrien Grand
Hello, Assuming CI is green and no blockers have been raised until then, I plan to build the first release candidate for Lucene/Solr 8.11 on Thursday November 4th. Please let me know if you are aware of any blocker that we should address before building the first RC. -- Adrien

8.11 and 9.0 release notes

2021-11-02 Thread Adrien Grand
Hello, I started working on the 8.11 and 9.0 release notes, can you help me add highlights that I missed? https://cwiki.apache.org/confluence/display/LUCENE/Release+Notes+8.11 https://cwiki.apache.org/confluence/display/LUCENE/Release+Notes+9.0 -- Adrien

Taxonomy backward compatibility tests

2021-11-02 Thread Adrien Grand
Hello here, As part of changing the version of the main branch from 9.0 to 10.0, I had to address some backward compatibility logic and tests. In particular, lucene/facet had a backward compatibility test for the taxonomy index due to the move from doc values to stored fields. I deleted this

8.11 and 9.0 feature freeze

2021-11-02 Thread Adrien Grand
Hello all, I just created branches in preparation for the upcoming 8.11 and 9.0 releases. Here's how branches map to Lucene versions now: - main: Lucene 10.0 - branch_9x: Lucene 9.1 - branch_9_0: Lucene 9.0 - branch_8_11: Lucene/Solr 8.11 - branch_8x: Shouldn't be used anymore, there will

Re: Lucene 9.0 release

2021-11-02 Thread Adrien Grand
lease branch. Hope there won't be > too many problems along the way. > > Dawid > > On Fri, Oct 29, 2021 at 9:09 PM Adrien Grand wrote: > >> This sounds good to me Dawid. Please update this thread when you are done >> and I will proceed with branching. >> >

Re: Lucene 9.0 release

2021-10-29 Thread Adrien Grand
y-pick the > necessary changes there. I think it's easier if they land on main though. > > Dawid > > On Fri, Oct 29, 2021 at 6:00 PM Adrien Grand wrote: > >> Hearing no objections, I will be moving forward with the plan I outlined >> above. Next Monday is a holiday in

Re: Lucene 9.0 release

2021-10-29 Thread Adrien Grand
> ~ David Smiley > > > Apache Lucene/Solr Search Developer > > > http://www.linkedin.com/in/davidwsmiley > > > > > > > > > On Fri, Oct 15, 2021 at 3:30 AM Adrien Grand > wrote: > > >> > > >> For visibility, I recently opene

Re: Slow DV equivalent of TermInSetQuery

2021-10-26 Thread Adrien Grand
I opened https://issues.apache.org/jira/browse/LUCENE-10207 about these ideas. On Tue, Oct 26, 2021 at 7:52 PM Robert Muir wrote: > On Tue, Oct 26, 2021 at 1:37 PM Adrien Grand wrote: > > > > > And then we could make an IndexOrDocValuesQuery with both th

Re: Slow DV equivalent of TermInSetQuery

2021-10-26 Thread Adrien Grand
> And then we could make an IndexOrDocValuesQuery with both the TermInSetQuery and this SDV.newSlowInSetQuery? Unfortunately IndexOrDocValuesQuery relies on the fact that the "index" query can evaluate its cost (ScorerSupplier#cost) without doing anything costly, which isn't the case for

Re: Why is AttributeSource#addAttribute a hotspot for nightly benchmarks?

2021-10-25 Thread Adrien Grand
Thanks Robert and Mike for helping dig. I opened https://issues.apache.org/jira/browse/LUCENE-10203. On Thu, Oct 21, 2021 at 3:22 PM Michael McCandless < luc...@mikemccandless.com> wrote: > LOL don't cross the tokenstreams! > > Yeah should be 555 or 556 flushes I think. Probably times the

Re: [JENKINS] Lucene » Lucene-Check-main - Build # 3617 - Unstable!

2021-10-21 Thread Adrien Grand
I pushed a fix for this bug. On Thu, Oct 21, 2021 at 9:08 AM Apache Jenkins Server < jenk...@builds.apache.org> wrote: > Build: https://ci-builds.apache.org/job/Lucene/job/Lucene-Check-main/3617/ > > 1 tests failed. > FAILED: >

Why is AttributeSource#addAttribute a hotspot for nightly benchmarks?

2021-10-21 Thread Adrien Grand
Hello, I've been looking a bit more carefully at nightly benchmarks recently and I'm puzzled by the fact that indexing spends almost 5% of the time on AttributeSource#addAttribute. Here is the link

Re: [JENKINS] Lucene » Lucene-NightlyTests-main - Build # 430 - Unstable!

2021-10-21 Thread Adrien Grand
Nhat pushed a fix for this one: https://github.com/apache/lucene/commit/4c2692e897eb6095a5b6a8416dd9b927c84eb066 On Wed, Oct 20, 2021 at 9:16 AM Apache Jenkins Server < jenk...@builds.apache.org> wrote: > Build: > https://ci-builds.apache.org/job/Lucene/job/Lucene-NightlyTests-main/430/ > > 2

Re: should we clean up dev-docs?

2021-10-15 Thread Adrien Grand
I suspect that the adoc format got used because it was the format that was already being used for the Solr ref guide. +1 to move to Markdown. On Fri, Oct 15, 2021 at 3:49 PM Michael Sokolov wrote: > I was poking around looking for info on how we release Lucene, and I > stumbled into this

Re: Lucene 9.0 release

2021-10-15 Thread Adrien Grand
60b8056c56%40%3Cdev.lucene.apache.org%3E > > - "9.0 release": > https://lists.apache.org/thread.html/r7bef0af668860fdbfedb4b58261efd01d9fb26dc280915284c121065%40%3Cdev.lucene.apache.org%3E > > > > Jan > > > > On Aug 17, 2021 at 11:13, Adrien Grand wrote: > > > > +1 to your suggest

Re: Accessibility of CollectedSearchGroup's state

2021-10-14 Thread Adrien Grand
I feel sorry for increasing the scope of all these requests for changes that you make, but the way Elasticsearch overrides this collector feels wrong to me as any change in the implementation details of this collector would probably break Elasticsearch's collector too. In my opinion,

Re: [VOTE] Release Lucene/Solr 8.10.1 RC1

2021-10-13 Thread Adrien Grand
+1 SUCCESS! [2:58:22.541739] On Wed, Oct 13, 2021 at 4:27 PM Jan Høydahl wrote: > Ran the smoke tester > > SUCCESS! [1:21:37.076780] > > +1 > > Jan (binding) > > On Oct 13, 2021 at 01:59, Mayya Sharipova wrote: > > Please vote for release candidate 1 for Lucene/Solr 8.10.1 > > The artifacts can

Re: How to run Lucene 9 tests in eclipse?

2021-10-12 Thread Adrien Grand
Hi Praveen, Have you seen this page on the wiki? https://cwiki.apache.org/confluence/display/LUCENE/DeveloperTips#DeveloperTips-TipstoconfigureIDEs By running `gradlew tasks`, you should see a section about IDE tasks that help with importing the project in Eclipse. For me running tests then

Re: Welcome Michael Gibney as Lucene committer

2021-10-07 Thread Adrien Grand
Welcome Michael! On Thu, Oct 7, 2021 at 4:12 PM Namgyu Kim wrote: > Congratulations and welcome, Michael! :D > > On Thu, Oct 7, 2021 at 10:53 PM Michael Gibney > wrote: > >> Thank you all for the welcome! >> >> I work as a software developer at the University of Pennsylvania >> libraries

Re: 8.10.1 Patch release?

2021-10-06 Thread Adrien Grand
new and was present in the previous versions as well, > but was discovered quite recently. > > On Tue, Oct 5, 2021 at 3:54 PM Mike Drob wrote: > >> Is the bug new in 8.10? If it affects older versions as well then I feel >> like 8.10.1 might be less urgent. >> >> Mike >

Re: 8.10.1 Patch release?

2021-10-05 Thread Adrien Grand
+1 to a 8.10.1 patch release On Tue, Oct 5, 2021 at 2:03 AM Mayya Sharipova wrote: > Thanks for the update, Robert. Would be nice to have these bug fixes as > well. > > On Mon, Oct 4, 2021 at 7:56 PM Robert Muir wrote: > >> FYI Looks like there are already six items currently listed under >>

Re: Should Queries be able to throw CollectionTerminationException?

2021-10-05 Thread Adrien Grand
Hi Greg, Maybe one clean way to make it happen would be to make timeouts an IndexSearcher feature. Whenever a timeout is set, IndexSearcher could split the doc ID space into ranges of X docs and check the timeout between every range. This way, the CollectionTerminatedException wouldn't be raised
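
The idea in this message (split the doc ID space into ranges of X docs and check the timeout between ranges) can be sketched roughly as follows. This is a plain Python illustration under stated assumptions, not Lucene code: `search_with_timeout`, `collect_range`, and the injectable `clock` are hypothetical names introduced only for this example.

```python
import time


def search_with_timeout(max_doc, range_size, collect_range, deadline,
                        clock=time.monotonic):
    """Collect docs range by range, stopping once the deadline has passed.

    collect_range(start, end) is a hypothetical callback that scores and
    collects the docs in [start, end). range_size must be positive.
    Returns (docs_visited, timed_out).
    """
    start = 0
    while start < max_doc:
        # The timeout is only checked between ranges, never per document,
        # so no exception has to be thrown from inside the query itself.
        if clock() >= deadline:
            return start, True
        end = min(start + range_size, max_doc)
        collect_range(start, end)
        start = end
    return max_doc, False
```

The trade-off is granularity: a larger range size means fewer clock reads but a later reaction to an expired deadline.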

Re: Accessibility of SegmentInfo::setDiagnostics

2021-09-24 Thread Adrien Grand
I'd +1 a change that makes setDiagnostics public. Longer term I wonder if we should have a more locked down API that _only_ allows setting diagnostics. There are lots of things in SegmentCommitInfo that merges should never override like the segment ID, and I can't think of anything else than

Re: Soften Jira's note when opening new issues?

2021-09-24 Thread Adrien Grand
want help or have a feature idea, please ask on the mailing list > or IRC channel before submitting a Jira issue. > > > > wunder > > Walter Underwood > > wun...@wunderwood.org > > http://observer.wunderwood.org/ (my blog) > > > > On Sep 22, 2021, at 9:18 AM, Ad

Re: Soften Jira's note when opening new issues?

2021-09-22 Thread Adrien Grand
bserving is expected or not, please > discuss it there first. > ``` > > Cheers, > -Greg > > On Wed, Sep 22, 2021 at 5:35 AM Adrien Grand wrote: > > > Hi Walter, > > Though it doesn't invalidate your comment, I was considering changing the > message only fo

Re: Soften Jira's note when opening new issues?

2021-09-22 Thread Adrien Grand
.org/ (my blog) > > On Sep 21, 2021, at 12:23 AM, Adrien Grand wrote: > > I think you made a good point, Alexandre. Would something like this read > better: > > ``` > This project has a user mailing list and an IRC channel for support. If > you are looking for suppor

Re: Accessibility of MergeThread.rateLimiter

2021-09-22 Thread Adrien Grand
c(; > } > runOnMergeFinished(mergeSource); >} catch (Throwable exc) { > > > Then ES can leverage such from the infoStream, right? ( thus avoiding the > need for ES extract the inaccessible information directly itself, while > also being more general

Re: Accessibility of MergeThread.rateLimiter

2021-09-22 Thread Adrien Grand
Hi Chris, I looked into this and Elasticsearch seems to only need access to the rate limiter for logging purposes, without adding any information that Lucene doesn't have. Maybe another option would consist of moving the logging to Lucene? Having information in the IndexWriter's InfoStream about

Re: Soften Jira's note when opening new issues?

2021-09-21 Thread Adrien Grand
it there first. ``` On Mon, Sep 20, 2021 at 2:22 PM Alexandre Rafalovitch wrote: > +1. > Ideally, the final version could still be several shorter sentences. To > avoid needing to be a programmer to parse the deeply nested, if totally > logical, structure. > > On Mon., Sep. 20, 2021

Soften Jira's note when opening new issues?

2021-09-20 Thread Adrien Grand
Hello, Jira gives the following note when opening an issue: ``` This project has a user mailing list and an IRC channel for support. Please ensure that you have discussed your problem using one of those resources BEFORE creating this ticket. ``` This can be quite intimidating for someone who

Re: Java 11/17 Version Matrix

2021-09-14 Thread Adrien Grand
I think we should discuss options when Project Panama is released. Doing frequent major releases forces users to reindex more often. If Project Panama was released shortly and we decided to release Lucene 10 immediately, this would force users to reindex their 8.x data to be able to upgrade, I

Re: Are the new index consistency checks too strict?

2021-09-02 Thread Adrien Grand
> index a field as both docvalue and terms, then it is not (currently), > > > which seems weird. I guess the same is true of a field that has no > > > docvalues on some docs, and has them on others, but is also indexed as > > > terms everywhere. I think th

Re: Are the new index consistency checks too strict?

2021-09-01 Thread Adrien Grand
This additional validation that we introduced in Lucene 9 feels like a natural extension of the validation that we already had before, such as the fact that you cannot have some docs that use SORTED doc values and other docs that use NUMERIC doc values on the same field. Actually I would have

Re: 8.10 release soon?

2021-08-26 Thread Adrien Grand
+1 to a 8.10 release and cutting a branch next week On Tue, Aug 24, 2021 at 8:02 PM Timothy Potter wrote: > Hi folks, > > Looks like we have a number of nice enhancements and bug fixes in > Lucene and Solr for 8.10. > > https://github.com/apache/lucene-solr/blob/branch_8x/lucene/CHANGES.txt >

Re: Lucene 9.0 release

2021-08-17 Thread Adrien Grand
+1 to your suggestions I just commented on LUCENE-9959 to suggest reverting since the changes are currently half baked and I don't think that they should block 9.0. There are no other blockers left to my knowledge. On Sat, Aug 14, 2021 at 6:24

Re: Rationale behind different maxDocsPerChunk in Lucene90CompressingStoredFieldsFormat

2021-08-10 Thread Adrien Grand
I left a comment on the other thread where you asked a similar question. On Mon, Aug 9, 2021 at 6:14 PM Praveen Nishchal wrote: > Any thoughts on maxDocsPerChunk <-> chunkSize relation ? > > On Fri, Jul 30, 2021 at 2:10 PM Praveen Nishchal > wrote: > >> Hello, >> >> What is the rationale

Re: Deduplication/inversion for dimensional points

2021-07-20 Thread Adrien Grand
On Tue, Jul 20, 2021 at 5:50 PM Michael McCandless < luc...@mikemccandless.com> wrote: > To my knowledge, we don't have more deduplication logic. When an inner >> block has a single value, the IntersectVisitor likely >> returns CELL_INSIDE_QUERY and Lucene will only collect doc IDs for all leaf

Re: Deduplication/inversion for dimensional points

2021-07-16 Thread Adrien Grand
Hey Mike, I believe that we always handled the case when all documents in a leaf have the same value efficiently. The case that got improved more recently is the case when there are many duplicates within a leaf but still more than one value. We added run-length encoding for this case
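
For readers unfamiliar with the technique mentioned here, a minimal Python sketch of run-length encoding over the values of one leaf follows. It only illustrates the general idea on an in-memory list; it makes no claim about Lucene's actual on-disk points format, and the function names are hypothetical.

```python
def run_length_encode(values):
    """Encode a sequence as [value, run_length] pairs over consecutive duplicates."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1  # extend the current run
        else:
            runs.append([v, 1])  # start a new run
    return runs


def run_length_decode(runs):
    """Expand [value, run_length] pairs back into the original sequence."""
    out = []
    for v, n in runs:
        out.extend([v] * n)
    return out
```

The encoding pays off when a leaf holds many duplicates but more than one distinct value, which is the case the message describes; with all-distinct values it stores more than the raw list.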

Re: [Lucene] Selection of threshold

2021-07-01 Thread Adrien Grand
Hi, This is just a number that proved to work well in practice. The general idea is that we want to narrow down the set of candidates periodically in order to speed up query execution. If we do it too often, then we might spend more time narrowing down the set of candidates than actually

Re: Two-phase range queries?

2021-06-29 Thread Adrien Grand
Hi Greg, Have you looked at IndexOrDocValuesQuery? It dynamically chooses between computing the range up-front using the BKD tree and running the range query using doc values depending on the estimated cost of the range query (computed by counting the number of leaf nodes of the BKD tree that

Re: Welcome Mayya Sharipova to the Lucene PMC

2021-06-28 Thread Adrien Grand
Congratulations, Mayya! Well deserved! On Mon, Jun 28, 2021 at 3:26 PM Christian Moen wrote: > Congrats! > > On Mon, Jun 28, 2021 at 22:16 Robert Muir wrote: > >> I am pleased to announce that Mayya has accepted an invitation to join >> the Lucene PMC! >> >> Congratulations, and welcome

Re: Boolean Scorer

2021-06-21 Thread Adrien Grand
a > Reentrant lock for synchronization in the collector. > > I just wanted reviews on this since I tried this and some tests were not > passing. So if you could tell what is wrong in this approach, I > would appreciate it. > > Thanking You in advance, > Arihant. > >

Re: [VOTE] Release Lucene/Solr 8.9.0 RC1

2021-06-15 Thread Adrien Grand
+1 SUCCESS! [1:36:12.056443] On Tue, Jun 15, 2021 at 4:26 PM Mayya Sharipova wrote: > Thanks Robert for such detailed investigations. > > Lucene-Solr-SmokeRelease-8.9 also had 2 recent failures. Failures are not > reproducible on my local machine. > > build #13: ant test

Re: Use DirectMonotonicWriter store sorted NumericDocValues

2021-06-15 Thread Adrien Grand
> > On Tue, Jun 15, 2021 at 9:04 AM Adrien Grand wrote: > > > > I believe that this sort of optimization would be more effective and > robust if we made doc values look more like postings, with relatively small > blocks of values that would get compressed independently a

Re: Boolean Scorer

2021-06-15 Thread Adrien Grand
r that up. > Appreciate you taking the time to explain! > > Cheers, > -Greg > > On Mon, Jun 14, 2021 at 2:35 AM Adrien Grand wrote: > >> Hello Arihant, >> >> The Scorer for disjunctions uses a heap data structure that needs to be >> reordered upon eve

Re: Use DirectMonotonicWriter store sorted NumericDocValues

2021-06-15 Thread Adrien Grand
I believe that this sort of optimization would be more effective and robust if we made doc values look more like postings, with relatively small blocks of values that would get compressed independently and decompressed in bulk. This way, we wouldn't require data to be sorted across entire segments

Re: Handling Archive Data Using Lucene 7.6

2021-06-14 Thread Adrien Grand
Hi Rashmi, This upgrade skips 3 major versions, the simplest path will be to reindex your content. On Fri, Jun 11, 2021 at 10:40 AM Rashmi Bisanal wrote: > Hi Lucene Support Team , > > > > Objective : Upgrade Lucene 3.6 to 7.6 > > > > Description : We have huge data against version Lucene 3.6

Re: Boolean Scorer

2021-06-14 Thread Adrien Grand
Hello Arihant, The Scorer for disjunctions uses a heap data structure that needs to be reordered upon every hit. While reordering heaps is efficient as it runs in logarithmic time, the fact that it needs to run on every document might add non-negligible overhead. BooleanScorer tries to work
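
To make the heap cost described above concrete, here is a small Python sketch of a heap-based disjunction over sorted doc ID lists. It is a simplified model, not Lucene's scorer: the function name `disjunction` and the list-of-lists input are assumptions for illustration, and each input list is assumed to be strictly increasing.

```python
import heapq


def disjunction(postings):
    """Yield (doc_id, num_matching_clauses) in increasing doc ID order.

    postings is a list of strictly increasing doc ID lists, one per clause.
    """
    iters = [iter(p) for p in postings]
    heap = []
    for i, it in enumerate(iters):
        doc = next(it, None)
        if doc is not None:
            heap.append((doc, i))
    heapq.heapify(heap)
    while heap:
        doc = heap[0][0]
        matches = 0
        # Every clause positioned on this doc must be advanced, and each
        # advance reorders the heap in O(log n): this is the per-hit cost.
        while heap and heap[0][0] == doc:
            _, i = heap[0]
            matches += 1
            nxt = next(iters[i], None)
            if nxt is None:
                heapq.heappop(heap)
            else:
                heapq.heapreplace(heap, (nxt, i))
        yield doc, matches
```

For example, `list(disjunction([[1, 3, 5], [3, 4], [5]]))` walks all three clauses in one pass and performs at least one heap operation for every matching document.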

Re: debugging query execution plan

2021-06-09 Thread Adrien Grand
FYI this just got checked in: https://issues.apache.org/jira/browse/LUCENE-9965. I'd be curious to know if it helps with your problem, Mike. On Wed, May 12, 2021 at 1:54 PM Adrien Grand wrote: > Indeed this code is ASL2 pre-7.10, but I wouldn't have expected any > concerns regardless

Welcome Greg Miller as Lucene committer

2021-05-29 Thread Adrien Grand
I'm pleased to announce that Greg Miller has accepted the PMC's invitation to become a committer. Greg, the tradition is that new committers introduce themselves with a brief bio. Congratulations and welcome! -- Adrien

Re: Should segment order impact hit count estimation?

2021-05-24 Thread Adrien Grand
rning. If we're proving a hit count with >>> "equal to" semantics, it should be correct and shouldn't change based >>> on segment ordering, etc. But, on the other hand, if we're just >>> providing a floor (i.e., "there are at least this many hits"), then >>

Re: Should segment order impact hit count estimation?

2021-05-21 Thread Adrien Grand
Hi Patrick, Why do you feel weird about the fact that segment order impacts the hit count estimation? It feels ok to me, especially as segment order has deeper implications, e.g. you could get different top hits given that Lucene uses the global doc ID as a tie breaker for documents that produce
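
The tie-breaking behaviour mentioned above can be illustrated with a short Python sketch. It assumes that ties occur between documents with equal scores and that the lower doc ID wins; `top_k` and the `(doc_id, score)` tuples are hypothetical and do not correspond to a Lucene API.

```python
import heapq


def top_k(hits, k):
    """Return the k best (doc_id, score) pairs.

    Highest score first; on equal scores the lower doc ID wins.
    """
    return heapq.nsmallest(k, hits, key=lambda hit: (-hit[1], hit[0]))
```

Because global doc IDs depend on the order of segments, reordering segments can change which of two equally scored documents makes it into the top k, even though the set of matching documents is unchanged.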

Re: debugging query execution plan

2021-05-12 Thread Adrien Grand
tart from the Elasticsearch implementation for low-level query > execution tracing, which I think is from (pre-7.10) ASL2 licensed code? > > That sounds helpful, even with the Heisenberg caveats. > > Mike McCandless > > http://blog.mikemccandless.com > > On Thu, May 6, 2021 at

Re: Release Lucene/Solr 8.9.0 should we have it soon

2021-05-11 Thread Adrien Grand
I would like to backport LUCENE-9827 <https://issues.apache.org/jira/browse/LUCENE-9827>, which addresses a performance regression in stored fields merges, before we release 8.9. I'll work on this as soon as possible. On Thu, May 6, 2021 at 10:28 PM Adrien Grand wrote: > +1 > > Mayya, are y

Re: Release Lucene/Solr 8.9.0 should we have it soon

2021-05-06 Thread Adrien Grand
+1 Mayya, are you volunteering to be the release manager? On Thu, May 6, 2021 at 18:06, Ishan Chattopadhyaya wrote: > +1 > > On Thu, May 6, 2021 at 7:50 PM Mayya Sharipova > wrote: > >> Hello everyone, >> I was wondering if we can have a 8.9.0 release. It has been more than 3 >> months since

Re: debugging query execution plan

2021-05-06 Thread Adrien Grand
We have something like that in Elasticsearch that wraps queries in order to be able to report cost, matchCost and the number of calls to nextDoc/advance/matches/score/advanceShallow/getMaxScore for every node in the query tree. It's not perfect as it needs to disable some optimizations in order

Re: Exploring a different approach to skip lists

2021-04-27 Thread Adrien Grand
Hi Greg, I like that Lucene can scale to index sizes that are much larger than the amount of main memory, so I would like the default codec to keep optimizing for sequential reads. We do random access for some parts of the index like the terms index and the points index, but the expectation is

Re: hello i have a question

2021-04-19 Thread Adrien Grand
The order in which segments get searched depends on the order of a List that IndexWriter internally maintains (IndexWriter.segmentInfos.segments). When segments get flushed, they are appended to the end of this list of segments. So more recent segments get However when segments get merged

Welcome Zach Chen as Lucene committer

2021-04-19 Thread Adrien Grand
I'm pleased to announce that Zach Chen has accepted the PMC's invitation to become a committer. Zach, the tradition is that new committers introduce themselves with a brief bio. Congratulations and welcome! -- Adrien

Re: 9.0 release

2021-04-14 Thread Adrien Grand
oped to get some strong consensus, but that proved challenging. > Given that, I'm OK leaving things as-is, marking these apis > @experimental and potentially revisiting naming issues later, eg once > we have a second vector ANN implementation. > > On Wed, Apr 14, 2021 at 11:07 AM Adrien Gr

Re: 9.0 release

2021-04-14 Thread Adrien Grand
ed today. I'd like to >> get that sorted out before 9.0 - I'll hunt up the ticket(s) and mark >> as blockers >> >> On Sun, Mar 28, 2021 at 11:02 AM Adrien Grand wrote: >> > >> > Hello Jan, >> > >> > The list of blockers should be mostly

Re: Welcome Peter Gromov as Lucene committer

2021-04-06 Thread Adrien Grand
Welcome Peter! On Tue, Apr 6, 2021 at 7:48 PM Robert Muir wrote: > I'm pleased to announce that Peter Gromov has accepted the PMC's > invitation to become a committer. > > Peter, the tradition is that new committers introduce themselves with a > brief bio. > > Congratulations and welcome! > >

Re: Questions about the new vector API

2021-04-06 Thread Adrien Grand
I created a JIRA about moving VectorValues#search to VectorReader: https://issues.apache.org/jira/browse/LUCENE-9908. On Tue, Mar 16, 2021 at 7:14 PM Adrien Grand wrote: > Hello Mike, > > On Tue, Mar 16, 2021 at 5:05 PM Michael Sokolov > wrote: > >> I think the

Re: 9.0 release

2021-03-28 Thread Adrien Grand
Adrien says there are also > other scripts that need updating. > > Jan > > On Jan 13, 2021 at 15:10, Adrien Grand wrote: > > +1 to start planning 9.0. > > Since you mentioned the Gradle build, I believe that we still need to > migrate some of the rel

Re: PFOR for docids?

2021-03-18 Thread Adrien Grand
how this type of evaluation is generally done in the context of an > > upstream change? As a first step, I can open a Jira issue to track the > > evaluation if you think that would be useful. Thanks again! > > > > Cheers, > > -Greg > > > > > > On Tue, Ma

Re: Questions about the new vector API

2021-03-17 Thread Adrien Grand
ed to write these parameters to the index; > we're free to use different values when merging for example. > > -Mike > > On Tue, Mar 16, 2021 at 2:15 PM Adrien Grand wrote: > > > > Hello Mike, > > > > On Tue, Mar 16, 2021 at 5:05 PM Michael Sokolov > wrote:
