Re: Welcome Bruno to the Apache Lucene PMC

2021-03-10 Thread Adrien Grand
Welcome Bruno! On Thu, Mar 11, 2021 at 2:39 AM Michael Sokolov wrote: > Welcome, Bruno! > > On Wed, Mar 10, 2021, 7:56 PM Mike Drob wrote: > >> I am pleased to announce that Bruno has accepted an invitation to join >> the Lucene PMC! >> >> Congratulations, and welcome aboard! >> >> Mike >> >

Re: Proposal for the Lucene Dependency after git repo split

2021-02-26 Thread Adrien Grand
FYI Elasticsearch has been regularly depending on builds of specific commits of Lucene for this case of features that need changes both in Lucene and Elasticsearch. The workflow usually looks like this: - Do work in Lucene. - When it becomes clear that the next release of Lucene should happen

Re: Question about QueryCache

2021-02-26 Thread Adrien Grand
It does recurse indeed! To reuse Mike's example, in that case the cache would consider caching: - A, - B, - C, - D, - (C D), - +A +B +(C D) One weakness of this cache is that it doesn't consider caching subsets of boolean queries (except single clauses). E.g. in the above example, it would
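To illustrate the recursion described above, here is a minimal Python sketch. It is not Lucene code: the `Term` and `Bool` classes and the `render` and `cache_candidates` functions are hypothetical stand-ins, and the sketch only enumerates which sub-queries would be considered, not whether any of them actually gets cached.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Term:
    name: str


@dataclass(frozen=True)
class Bool:
    # Each clause is (occur, query); occur is "+" for required, "" for optional.
    clauses: Tuple[Tuple[str, "Query"], ...]


Query = Union[Term, Bool]


def render(query):
    """Print a query in the +A +B +(C D) notation used in the thread."""
    if isinstance(query, Term):
        return query.name
    parts = []
    for occur, sub in query.clauses:
        text = render(sub)
        if isinstance(sub, Bool):
            text = "(" + text + ")"
        parts.append(occur + text)
    return " ".join(parts)


def cache_candidates(query):
    """Recurse into boolean queries: sub-queries first, then the query itself."""
    found = []
    if isinstance(query, Bool):
        for _, sub in query.clauses:
            found.extend(cache_candidates(sub))
    found.append(render(query))
    return found


inner = Bool((("", Term("C")), ("", Term("D"))))
query = Bool((("+", Term("A")), ("+", Term("B")), ("+", inner)))
print(cache_candidates(query))
```

Note that no subset such as `+A +B` appears among the candidates, which is the weakness mentioned in the message.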

Re: Congratulations to the new Lucene PMC Chair, Michael Sokolov!

2021-02-18 Thread Adrien Grand
Congratulations, Mike! We're lucky that you agreed to take on this role. On Wed, Feb 17, 2021 at 10:32 PM Anshum Gupta wrote: > Every year, the Lucene PMC rotates the Lucene PMC chair and Apache Vice > President position. > > This year we nominated and elected Michael Sokolov as the Chair, a >

Re: Behaviour change of Query.parse(String query) in 8.7.0 vs 2.9.4

2021-02-04 Thread Adrien Grand
I believe that this is related to the fact that the classic query parser no longer splits on whitespace and instead relies on the analyzer to compute query terms. So if your tokenizer makes no difference between spaces and dashes, you would indeed get a disjunction. If you need to restore this
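A rough Python sketch of the effect described above, under stated assumptions: `analyze` is a hypothetical analyzer that treats spaces and dashes alike, and `to_disjunction` is an illustrative helper, not the classic query parser's API. The field name `body` and the sample text are made up.

```python
import re


def analyze(text):
    # Hypothetical analyzer: whitespace and dashes are both token separators.
    return [tok.lower() for tok in re.split(r"[\s-]+", text) if tok]


def to_disjunction(field, text):
    # The whole string is handed to the analyzer (no whitespace pre-splitting);
    # in this sketch every resulting token becomes one optional clause.
    return " ".join(f"{field}:{tok}" for tok in analyze(text))


print(to_disjunction("body", "wi-fi"))
print(analyze("wi-fi") == analyze("wi fi"))
```

Because the analyzer cannot tell `wi-fi` from `wi fi`, both inputs end up as the same two optional terms.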

Re: Merging segment parts concurrently (SegmentMerger)

2021-01-26 Thread Adrien Grand
first not-postings related] > 2021-01-19 18:08:43,697 TRACE l4g.lucene.infostream: IW: 529 msec to > finish stored fields [qt:worker-2:T8] > 2 > > Darn. That splitting by field sounds all more attractive now! > > D. > > > On Tue, Jan 26, 2021 at 2:29 PM Adr

Re: Merging segment parts concurrently (SegmentMerger)

2021-01-26 Thread Adrien Grand
Parallelizing segment merges would be nice. When indexing a dataset into a single segment, it is not rare that the final merge down to 1 segment takes longer than indexing, simply because merging can only use one thread. It's frustrating to wait for this merge to finish with only one busy core. :-)

Re: 2021-01 Lucene/Solr Committer meeting

2021-01-17 Thread Adrien Grand
Out of curiosity, why do we expect a long delay between Lucene 9 and Solr 9, is it because we have many Solr JIRAs that need to go in Solr 9? On Sat, Jan 16, 2021 at 11:54 PM David Smiley wrote: > On Sat, Jan 16, 2021 at 4:50 PM Adrien Grand wrote: > >> It's not fully cle

Re: 2021-01 Lucene/Solr Committer meeting

2021-01-16 Thread Adrien Grand
ts to 9.1. Am I missing something? > > Jan Høydahl > > On Jan 16, 2021, at 18:43, Adrien Grand wrote: > > > On Thu, Jan 14, 2021 at 8:02 PM David Smiley wrote: > >> I'm not familiar with why the testing needs to be manual instead of >> automated. After having

Re: 2021-01 Lucene/Solr Committer meeting

2021-01-16 Thread Adrien Grand
On Thu, Jan 14, 2021 at 8:02 PM David Smiley wrote: > I'm not familiar with why the testing needs to be manual instead of > automated. After having a RC of 8.9, couldn't we add the back-compat > indices to branch_9x and check that 9.0 is happy with them (running > applicable automated tests) as

Re: Blog post - Profiling the Lucene nightly benchmarks

2021-01-16 Thread Adrien Grand
This is very cool, thanks for sharing Anton! On Fri, Jan 15, 2021 at 23:40, Anton Hägerstrand wrote: > Hello everyone! > > I recently wrote a blog post which looks into profiling data of the Lucene > nightly benchmarks. I emailed Michael McCandless (the maintainer of the > benchmarks) and he

Re: 2021-01 Lucene/Solr Committer meeting

2021-01-14 Thread Adrien Grand
On Thu, Jan 14, 2021 at 18:16, Mike Drob wrote: > > 9.0 Release Planning > - Reminder that there are issues with new minors after a major release > (8.9 after a 9.0) - somebody to research. > > The main challenge with this is backwards compatibility testing. If you release 8.9 after

Re: 9.0 release

2021-01-13 Thread Adrien Grand
+1 to start planning 9.0. Since you mentioned the Gradle build, I believe that we still need to migrate some of the release tooling from Ant to Gradle, e.g. dev-tools/scripts/addBackcompatIndexes.py. These scripts are not easy to test without actually doing a release so the 9.0 RM might have some

Re: RFC: N-2 compatibility for file formats

2021-01-13 Thread Adrien Grand
+1 this strikes me as a good balance between increasing backward compatibility guarantees and still keeping room for innovation. David, actually I would like to advocate in favor of still disallowing opening N-2 indices by default, as they might not match Lucene's current expectations (e.g.

Re: [Lucene] confusion in posting encoding

2021-01-13 Thread Adrien Grand
Hello, It is indeed because I could get the compiler to use SIMD instructions with the loop written this way. On Wed, Jan 13, 2021 at 11:29 AM LuXugang wrote: > Hi Adrien, > > I have some confusion about the method collapse8(long[ ] arr) in* ForUtil* > class > > > > > On line 85, the loop

Re: Old programmers do fade away

2020-12-31 Thread Adrien Grand
Finding something that interests you even more is a great reason to move forward, I wish you a lot of fun with the welder and hope the squirrels will leave your tomatoes alone. Thank you for all your contributions and your great community spirit. On Wed, Dec 30, 2020 at 3:09 PM Erick Erickson

Re: Question by solr queries optimization

2020-12-23 Thread Adrien Grand
Hi Alex, Indeed Solr would automatically rewrite this query to `id:%key^3` since versions 7.1 / 8.0. This happens via BooleanQuery#rewrite, you can check out the JIRA where this was implemented: https://issues.apache.org/jira/browse/LUCENE-7925. On Wed, Dec 23, 2020 at 3:13 PM Alex Bulygin

Re: Deterministic index construction

2020-12-19 Thread Adrien Grand
Have you considered leveraging Lucene's built-in index sorting? It supports concurrent indexing and is quite fast. On Fri, Dec 18, 2020 at 7:26 PM Haoyu Zhai wrote: > Hi > Our team is seeking a way of construct (or rebuild) a deterministic sorted > index concurrently (I know lucene could

Re: Processing query clause combinations at indexing time

2020-12-15 Thread Adrien Grand
I like this idea. I can think of several users who have a priori knowledge of frequently used filters and would appreciate having Lucene take care of transparently optimizing the execution of such filters instead of having to do it manually. I'm not sure a separate project is the best option, it

Re: 8.8 Release

2020-12-10 Thread Adrien Grand
This sounds good to me. Thanks for volunteering! On Thu, Dec 10, 2020 at 5:11 PM Ishan Chattopadhyaya < ichattopadhy...@gmail.com> wrote: > Hi Devs, > There are lots of changes accumulated and some underway. I wish to > volunteer for a 8.8 release, if there are no objections. I'm planning to >

Re: Welcome Houston Putman to the PMC

2020-12-02 Thread Adrien Grand
Welcome Houston! On Wed, Dec 2, 2020 at 1:40 PM Erick Erickson wrote: > Welcome Houston! > > > On Dec 2, 2020, at 1:34 AM, Shalin Shekhar Mangar < > shalinman...@gmail.com> wrote: > > > > Congratulations Houston! > > > > On Wed, Dec 2, 2020 at 2:50 AM Mike Drob wrote: > > I am pleased to

Re: SOLR: Why do we have a CHANGES.txt/md to maintain?

2020-11-30 Thread Adrien Grand
I have a preference for maintaining a separate CHANGES file because it allows us to keep JIRA focused for a committer/contributor audience while the CHANGES file can describe changes that matter for users. Elasticsearch uses a similar mechanism for release notes to what you are proposing, using

Re: Welcome Julie Tibshirani as Lucene/Solr committer

2020-11-18 Thread Adrien Grand
Welcome Julie! On Wed, Nov 18, 2020 at 4:09 PM Alan Woodward wrote: > Congratulations and welcome Julie! > > > On 18 Nov 2020, at 15:06, Michael Sokolov wrote: > > > > I'm pleased to announce that Julie Tibshirani has accepted the PMC's > > invitation to become a committer. > > > > Julie, the

Re: 8.7 Release

2020-11-06 Thread Adrien Grand
>>>>> >>>>>> I have no bandwidth to tackle the deprecations, nor the energy to >>>>>> fight those who oppose it. >>>>>> >>>>>> On Tue, 20 Oct, 2020, 2:43 pm Atri Sharma, wrote: >>>>>> >>>

Re: Solr 8.x and contribs requiring Java 11

2020-11-02 Thread Adrien Grand
> > On Fri, 30 Oct, 2020, 11:22 pm Adrien Grand, wrote: > >> Ishan, why would this be a blocker for 8.7? Would it be good enough to >> remove in branch_8x? >> >> On Fri, Oct 30, 2020 at 18:33, Ishan Chattopadhyaya < >> ichattopadhy...@gmail.com> wrote:

Re: Solr 8.x and contribs requiring Java 11

2020-10-30 Thread Adrien Grand
Ishan, why would this be a blocker for 8.7? Would it be good enough to remove in branch_8x? On Fri, Oct 30, 2020 at 18:33, Ishan Chattopadhyaya < ichattopadhy...@gmail.com> wrote: > +1 to removing it (in 8.7 with a respin, if needed). If we can't support > it, there's no need to keep it. If

Re: [VOTE] Release Lucene/Solr 8.7.0 RC1

2020-10-30 Thread Adrien Grand
+1 SUCCESS! [2:11:05.149743] On Fri, Oct 30, 2020 at 5:54 AM Atri Sharma wrote: > Please vote for release candidate 1 for Lucene/Solr 8.7.0 > > > The artifacts can be downloaded from: > > >

Re: 8.7 Release Blockers

2020-10-26 Thread Adrien Grand
22, 2020, at 4:35 PM, Adrien Grand wrote: > > Can someone help review this PR to get the above blocker resolved? > https://github.com/apache/lucene-solr/pull/2019 > > On Thu, Oct 22, 2020 at 4:11 PM Atri Sharma wrote: > >> Reminder: This is still a blocker for 8.7: >

Re: 8.7 Release Blockers

2020-10-22 Thread Adrien Grand
Can someone help review this PR to get the above blocker resolved? https://github.com/apache/lucene-solr/pull/2019 On Thu, Oct 22, 2020 at 4:11 PM Atri Sharma wrote: > Reminder: This is still a blocker for 8.7: > > https://issues.apache.org/jira/browse/SOLR-14354 > > On Tue, Oct 20, 2020 at

Re: Seeking Inputs for Release Highlights

2020-10-22 Thread Adrien Grand
RAFT-ReleaseNote87 > > On Thu, Oct 22, 2020 at 4:37 PM Adrien Grand wrote: > > > > Hi Atri, if you can create a page in the wiki, I'll be happy to add some > items to the release highlights. I have the compression for stored fields > improvements in mind as well as the speedu

Re: Seeking Inputs for Release Highlights

2020-10-22 Thread Adrien Grand
Hi Atri, if you can create a page in the wiki, I'll be happy to add some items to the release highlights. I have the compression for stored fields improvements in mind as well as the speedups for queries sorted by a field indexed with points. On Thu, Oct 22, 2020 at 12:54 PM Atri Sharma wrote:

Re: 8.7 Release

2020-10-19 Thread Adrien Grand
:54, Atri Sharma wrote: > > > > Fixed the issue. Cherry picking to branch_8_7 now. > > > > Apologies, I must have created branch_8_x accidentally. Let me delete. > > > > On Mon, Oct 19, 2020 at 1:40 AM Adrien Grand wrote: > >> > >> 1. This is failing 8.

Re: 8.8 section in the changelog?

2020-10-18 Thread Adrien Grand
use case. > > On Fri, Oct 16, 2020, 1:07 PM Adrien Grand wrote: > >> Hello, >> >> I'm confused that master now has a non-empty 8.8 section in the Changelog >> while branch_8_7 has not been cut yet, was it done by mistake? >> >> -- >> Adrien >> > -- Adrien

Re: 8.7 Release

2020-10-18 Thread Adrien Grand
>>>>> [exec] >>>>> [exec] Missing javadocs were found! >>>>> >>>>> That package is missing a "package-info.java". It's been this way for >>>>> a while... I wonder why it hasn't been noticed by others yet? >>

Re: 8.7 Release

2020-10-17 Thread Adrien Grand
As the branch has been cut, I deleted 8.6 jobs and created 8.7 jobs on the ASF Jenkins: https://ci-builds.apache.org/job/Lucene/. On Tue, Oct 13, 2020 at 6:16 PM Adrien Grand wrote: > This sounds good to me, thank you! > > On Tue, Oct 13, 2020 at 6:06 PM Atri Sharma wrote: > >

8.8 section in the changelog?

2020-10-16 Thread Adrien Grand
Hello, I'm confused that master now has a non-empty 8.8 section in the Changelog while branch_8_7 has not been cut yet, was it done by mistake? -- Adrien

Re: 8.7 Release

2020-10-13 Thread Adrien Grand
This sounds good to me, thank you! On Tue, Oct 13, 2020 at 6:06 PM Atri Sharma wrote: > I will start the first build candidate on upcoming Monday. This is my > first release so fingers crossed :) > > On Tue, Oct 13, 2020 at 7:01 PM Adrien Grand wrote: > > > > Thanks At

Re: 8.7 Release

2020-10-13 Thread Adrien Grand
I plan on merging SOLR-14907 >> <https://issues.apache.org/jira/browse/SOLR-14907> to master and 8x >> tomorrow. If you wouldn't mind waiting to cut 8.7 until then, I would >> appreciate it. >> >> - Houston >> >> On Mon, Oct 12, 2020 at 4:59 AM Adrien Grand wrote: >

Re: 8.7 Release

2020-10-12 Thread Adrien Grand
> >>>> >>>> >>>> >>>> >>>> On Sep 14, 2020, at 10:06 AM, Christine Poerschke (BLOOMBERG/ LONDON) < >>>> cpoersc...@bloomberg.net> wrote: >>>> >>>> >>>> >>>> >>>> >

Re: [VOTE] Release Lucene/Solr 8.6.3 RC1

2020-10-05 Thread Adrien Grand
+1 SUCCESS! [1:36:10.395992] On Mon, Oct 5, 2020 at 2:15 PM Michael McCandless wrote: > +1 (binding) > > SUCCESS! [0:44:16.898412] > > > Mike McCandless > > http://blog.mikemccandless.com > > > On Mon, Oct 5, 2020 at 3:28 AM Atri Sharma wrote: > >> +1 (binding) >> >> SUCCESS! [1:04.32.39193]

Re: 8.6.3 Release

2020-09-29 Thread Adrien Grand
+1 Erick On Mon, Sep 28, 2020 at 8:05 PM Erick Erickson wrote: > For me, there’s a sharp distinction between changing a dependency in a > point release just because there’s a new version, and changing the > dependency because there’s a bug in it. That said, if someone can use > 8.6.3, what’s

Re: restlet dependencies

2020-09-29 Thread Adrien Grand
I've occasionally also seen connections reset with Lucene on Maven Central and I've been pointed to https://github.com/gradle/gradle/pull/13144 by one of our build engineers at Elastic. So maybe we also need to upgrade to Gradle 6.6.1? On Fri, Sep 25, 2020 at 4:50 PM David Smiley wrote: > I

Re: PointInSetQuery dose not terminate early if DocIdSetBuilder's bitSet is null

2020-09-28 Thread Adrien Grand
What are you storing in your points? If you are storing numbers, I wonder if a better approach to this problem might be to start leveraging IndexOrDocValuesQuery and scorerSupplier() for point-in-set queries like we did for range queries. The approach you suggested would help in some cases, but

Re: 8.6.3 Release

2020-09-23 Thread Adrien Grand
Simultaneous releases are problematic for our backward-compatibility tests, but we do not need to wait between releases, we could start the release process right away when 8.6.3 is out. I don't think there's any potential for confusion. We've done this several times in the past, e.g. 7.1.0 was

Re: [JENKINS] Lucene » Lucene-Solr-Tests-8.x - Build # 118 - Still Failing!

2020-09-04 Thread Adrien Grand
I pushed a fix, there is no Deflater#setDictionary(ByteBuffer) on JDK 8. On Fri, Sep 4, 2020 at 11:04 AM Apache Jenkins Server < jenk...@builds.apache.org> wrote: > Build: > https://ci-builds.apache.org/job/Lucene/job/Lucene-Solr-Tests-8.x/118/ > > All tests passed > > Build Log: > [...truncated

Re: [JENKINS] Lucene-Solr-8.x-Linux (64bit/jdk-14.0.1) - Build # 4265 - Still Failing!

2020-09-03 Thread Adrien Grand
I pushed a fix for these failures about solr.cmd having tabs on branch_8x. On Thu, Sep 3, 2020 at 10:42 AM Policeman Jenkins Server < jenk...@thetaphi.de> wrote: > Build: https://jenkins.thetaphi.de/job/Lucene-Solr-8.x-Linux/4265/ > Java: 64bit/jdk-14.0.1 -XX:+UseCompressedOops -XX:+UseSerialGC

Re: [JENKINS] Lucene » Lucene-Solr-NightlyTests-master - Build # 23 - Still unstable!

2020-09-02 Thread Adrien Grand
I'm looking into it. On Wed, Sep 2, 2020 at 7:54 AM Apache Jenkins Server < jenk...@builds.apache.org> wrote: > Build: > https://ci-builds.apache.org/job/Lucene/job/Lucene-Solr-NightlyTests-master/23/ > > 3 tests failed. > FAILED: >

Re: Approach towards solving split package issues?

2020-09-01 Thread Adrien Grand
+1 Changing packages of many classes should be done in a major. On Tue, Sep 1, 2020 at 5:50 PM Tomoko Uchida wrote: > Just to make sure, could I confirm "when the changes will be out"... > Resolving split package issues should break backward compatibility > (changing package names and moving

Re: [VOTE] Lucene logo contest, third time's a charm

2020-09-01 Thread Adrien Grand
A1, A2, D (binding) On Tue, Sep 1, 2020 at 10:21 PM Ryan Ernst wrote: > Dear Lucene and Solr developers! > > Sorry for the multiple threads. This should be the last one. > > In February a contest was started to design a new logo for Lucene > [jira-issue]. The initial attempt [first-vote] to

Re: [VOTE] Lucene logo contest, here we go again

2020-09-01 Thread Adrien Grand
Ryan, FYI the links to proposals C are incorrect as they all use the attachment ID of the first proposal, so all links point to the same logo, here are the correct links: [C1] https://issues.apache.org/jira/secure/attachment/13006392/lucene_logo1_full.pdf [C2]

Re: Performance in Solr 9 / Java 11

2020-08-30 Thread Adrien Grand
nitely had no idea about them when doing 8.5.2 and did not >> even think to verify anything about it. >> >> On Sat, Aug 29, 2020 at 4:05 PM Adrien Grand wrote: >> >>> It may only be indirectly related to your question, but there is support >>> for vec

Re: Performance in Solr 9 / Java 11

2020-08-29 Thread Adrien Grand
It may only be indirectly related to your question, but there is support for vectorized operations of byte[] arrays that was added in JDK 13 (this blog https://richardstartin.github.io/posts/vectorised-byte-operations explains well what it is about) that we started leveraging for compressing terms

Re: 8.7 Release

2020-08-20 Thread Adrien Grand
something that actually holds up a release. > Regards, > Ishan > > On Fri, Aug 21, 2020 at 1:56 AM Adrien Grand wrote: > >> Noble, I'm curious what blockers you have in mind. I just checked JIRA, >> and while I see a number of 9.0 blockers, I'm not counting many 8.7 >>

Re: 8.7 Release

2020-08-20 Thread Adrien Grand
Noble, I'm curious what blockers you have in mind. I just checked JIRA, and while I see a number of 9.0 blockers, I'm not counting many 8.7 blockers? On Thu, Aug 20, 2020 at 11:13 AM Noble Paul wrote: > There are a lot of blockers for 8.7. It's good to plan in advance > > On Thu, Aug 20, 2020

Re: Welcome Atri Sharma to the PMC

2020-08-20 Thread Adrien Grand
Welcome Atri! On Thu, Aug 20, 2020 at 8:24 PM David Smiley wrote: > Welcome! > > ~ David Smiley > Apache Lucene/Solr Search Developer > http://www.linkedin.com/in/davidwsmiley > > > On Thu, Aug 20, 2020 at 2:16 PM Ishan Chattopadhyaya < > ichattopadhy...@gmail.com> wrote: > >> I am pleased to

Re: Standardize Leading Test or Trailing Test

2020-08-06 Thread Adrien Grand
+1 On Thu, Aug 6, 2020 at 1:54 PM Erick Erickson wrote: > This has amused/annoyed me for a long time. But did I ever have the > energy to tackle it? N. > > +1 > > > On Aug 6, 2020, at 1:50 AM, Tomás Fernández Löbbe > wrote: > > > > +1 > > > > On Wed, Aug 5, 2020 at 10:37 PM David

Re: Welcome Namgyu Kim to the PMC

2020-08-03 Thread Adrien Grand
Welcome! On Mon, Aug 3, 2020 at 5:12 AM Koji Sekiguchi wrote: > Welcome, Namgyu! > > Koji > > On 2020/08/03 8:18, Ishan Chattopadhyaya wrote: > > I am pleased to announce that Namgyu Kim has accepted the PMC's > invitation to join. > > > > Congratulations and welcome, Namgyu! > >

Re: Welcome Mike Drob to the PMC

2020-07-27 Thread Adrien Grand
Welcome Mike! On Fri, Jul 24, 2020 at 10:03 PM Anshum Gupta wrote: > I am pleased to announce that Mike Drob has accepted the PMC's invitation > to join. > > Congratulations and welcome, Mike! > > -- > Anshum Gupta > -- Adrien

Welcome Tomoko Uchida to the PMC

2020-07-04 Thread Adrien Grand
I am pleased to announce that Tomoko Uchida has accepted the PMC's invitation to join. Welcome Tomoko! -- Adrien

Welcome Michael Sokolov to the PMC

2020-07-03 Thread Adrien Grand
I am pleased to announce that Michael Sokolov has accepted the PMC's invitation to join. Welcome Michael! -- Adrien

Re: Welcome Ilan Ginzburg as Lucene/Solr committer

2020-06-22 Thread Adrien Grand
Welcome Ilan! On Sun, Jun 21, 2020 at 11:44 AM Noble Paul wrote: > Hi all, > > Please join me in welcoming Ilan Ginzburg as the latest Lucene/Solr > committer. > Ilan, it's tradition for you to introduce yourself with a brief bio. > > Congratulations and Welcome! > Noble > -- Adrien

Re: [JENKINS] Lucene-Solr-Tests-8.x - Build # 1677 - Unstable

2020-06-21 Thread Adrien Grand
This is tracked at https://issues.apache.org/jira/browse/LUCENE-9409. I disabled the test until the PR is merged. On Sun, Jun 21, 2020 at 4:11 AM Apache Jenkins Server < jenk...@builds.apache.org> wrote: > Build: https://builds.apache.org/job/Lucene-Solr-Tests-8.x/1677/ > > 1 tests failed. >

Re: [VOTE] Lucene logo contest

2020-06-17 Thread Adrien Grand
A. (PMC) I like that it retains the same idea as our current logo with a more modern look. On Wed, Jun 17, 2020 at 4:58 PM Andi Vajda wrote: > > C. (current logo) > > Andi.. (pmc) > > On Jun 15, 2020, at 15:08, Ryan Ernst wrote: > > > Dear Lucene and Solr developers! > > In February a

Re: 8.6 release

2020-06-16 Thread Adrien Grand
+1 Thanks Bruno! On Tue, Jun 16, 2020 at 3:33 PM David Smiley wrote: > +1 Thanks for volunteering Bruno! > > ~ David Smiley > Apache Lucene/Solr Search Developer > http://www.linkedin.com/in/davidwsmiley > > > On Tue, Jun 16, 2020 at 9:31 AM Bruno Roustant > wrote: > >> Hi all, >> >> It’s been

Re: Welcome Mayya Sharipova as Lucene/Solr committer

2020-06-09 Thread Adrien Grand
Welcome, Mayya! On Mon, Jun 8, 2020 at 6:58 PM jim ferenczi wrote: > Hi all, > > Please join me in welcoming Mayya Sharipova as the latest Lucene/Solr > committer. > Mayya, it's tradition for you to introduce yourself with a brief bio. > > Congratulations and Welcome! > > Jim > -- Adrien

Re: ram estimate for docvalues is incorrect

2020-05-28 Thread Adrien Grand
%tg > difference). > > Mike McCandless > > http://blog.mikemccandless.com > > > On Thu, May 28, 2020 at 2:51 AM Adrien Grand wrote: > >> To be clear, there is no plan to remove RAM accounting from readers yet, >> this is just something that I have bee

Re: ram estimate for docvalues is incorrect

2020-05-28 Thread Adrien Grand
be the correct way to handle > this? > > Thank you > > -John > > > On Wed, May 27, 2020 at 1:36 PM Adrien Grand wrote: > >> A couple major versions ago, Lucene required tons of heap memory to keep >> a reader open, e.g. norms were on heap and so on. To my knowled

Re: ram estimate for docvalues is incorrect

2020-05-27 Thread Adrien Grand
A couple major versions ago, Lucene required tons of heap memory to keep a reader open, e.g. norms were on heap and so on. To my knowledge, the only thing that is now kept in memory and is a function of maxDoc is live docs, all other codec components require very little memory. I'm actually

Re: [VOTE] Solr to become a top-level Apache project (TLP)

2020-05-13 Thread Adrien Grand
+1 On Tue, May 12, 2020 at 9:37 AM Dawid Weiss wrote: > Dear Lucene and Solr developers! > > According to an earlier [DISCUSS] thread on the dev list [2], I am > calling for a vote on the proposal to make Solr a top-level Apache > project (TLP) and separate Lucene and Solr development into two

Re: [DISCUSS] Lucene-Solr split (Solr promoted to TLP)

2020-05-11 Thread Adrien Grand
On Mon, May 11, 2020 at 1:17 AM Shawn Heisey wrote: > I think the presence of Solr in the codebase > has diluted Lucene's releases, making them come far too quickly. I > would bet that without Solr, Lucene would probably be somewhere in 6.x, > not 8.x. > Actually I think that Lucene would be

Re: [DISCUSS] Lucene-Solr split (Solr promoted to TLP)

2020-05-10 Thread Adrien Grand
On Sun, May 10, 2020 at 8:20 AM David Smiley wrote: > I wonder if ElasticSearch tries to do this on their side too; does it? > Yes, Elasticsearch regularly upgrades to new snapshots of Lucene[1][2], often multiple times per minor version. It helps give Lucene more test and performance coverage,

Re: [DISCUSS] 8.5.2 Release?

2020-05-09 Thread Adrien Grand
I'd rather not change signatures on a patch release. Are there ways we could have the bug fix on 8.5 without backporting this API change? On Fri, May 8, 2020 at 5:18 PM Mike Drob wrote: > While back porting LUCENE-9350, I ran into needing LUCENE-9349[1] which > introduces a small API change to

Re: [DISCUSS] 8.5.2 Release?

2020-05-09 Thread Adrien Grand
I just reenabled 8.5 builds on the ASF's Jenkins. On Fri, May 8, 2020 at 5:18 PM Mike Drob wrote: > While back porting LUCENE-9350, I ran into needing LUCENE-9349[1] which > introduces a small API change to QueryVisitor. Do we think this is > acceptable in a bug fix release, or do I need to

Re: [DISCUSS] 8.5.2 Release?

2020-05-07 Thread Adrien Grand
+1 On Thu, May 7, 2020 at 19:13, Mike Drob wrote: > Devs, > > I know that we had 8.5.1 only a few weeks ago, but with the fix for > LUCENE-9350 I think we should consider another bug-fix. I know that without > it I will be explicitly recommending several users to stay off of 8.5.x on > their

Re: [DISCUSS] Lucene-Solr split (Solr promoted to TLP)

2020-05-07 Thread Adrien Grand
There are definitely pros and cons of splitting vs. being a single project. The bigger pains for me until now have been the following ones: Digging into Solr failures: the theory is that Solr failures can help find Lucene bugs that Lucene tests wouldn't catch, and while this occurred a couple times, I

Re: 7.7.3 bugfix release

2020-05-04 Thread Adrien Grand
I just disabled the 7.7 builds since 7.7 has made it to the mirrors. On Wed, Apr 22, 2020 at 1:23 AM Noble Paul wrote: > thanks Adrien > > On Tue, Apr 21, 2020 at 6:22 PM Adrien Grand wrote: > > > > FYI I just re-enabled 7.7 builds on the Apache Jenkins. > > > >

Re: Update French stop Words

2020-04-30 Thread Adrien Grand
Thanks Philippe. I was first confused because these changes are not reflected at http://snowball.tartarus.org/algorithms/french/stop.txt, but they are in the source tree indeed: https://github.com/snowballstem/snowballstem.github.io/blob/master/algorithms/french/stop.txt . JIRA screens suggest

Re: [VOTE] Release Lucene/Solr 7.7.3 RC1

2020-04-21 Thread Adrien Grand
I didn't get the same error as Andrzej while running the smoketester. +1 SUCCESS! [3:49:55.694980] On Tue, Apr 21, 2020 at 12:31 PM Andrzej Białecki wrote: > Hi, > > I’m getting the following error, looks like the checksum doesn’t match the > file: > > Test Solr... > test basics... > check

Re: Require consistency between different data-structures sharing the same field name as of 9.0?

2020-04-21 Thread Adrien Grand
factory methods for queries >> and sorts on the field type. So for example a LongPointAndValue field >> would automatically index its value into both BKD and NumericDocValues, and >> then LongPointAndValue#newRangeQuery() would build the relevant >> IndexOrDocVal

Re: Lucene/Solr 8.5.1 bugfix release

2020-04-21 Thread Adrien Grand
I just disabled the Apache Jenkins builds for 8.5. On Wed, Apr 15, 2020 at 2:04 PM Adrien Grand wrote: > Solr doesn't use addIndexes(Directory) so this is not relevant to Solr > indeed. > > On Wed, Apr 15, 2020 at 11:40 AM Ignacio Vera wrote: > >> I updated the Solr's re

Re: 7.7.3 bugfix release

2020-04-21 Thread Adrien Grand
FYI I just re-enabled 7.7 builds on the Apache Jenkins. On Thu, Apr 16, 2020 at 10:17 PM jim ferenczi wrote: > Hi, > > I merged LUCENE-9300 in > the 7.7 branch. > > > I shall cut the branch in a day or two > > I guess you meant create the

Require consistency between different data-structures sharing the same field name as of 9.0?

2020-04-20 Thread Adrien Grand
Hello, Lucene currently doesn't require consistency across data-structures. For instance it is possible to have different values in points and doc values under the same field name. Until now, we worked around it either by making features use a single data-structure, e.g. facets only use doc

Re: Lucene/Solr 8.5.1 bugfix release

2020-04-15 Thread Adrien Grand
ion when only one node down >>>> >>>> Jan >>>> >>>> On Apr 3, 2020, at 22:29, Jan Høydahl wrote: >>>> >>>> I plan to merge this to branch_8_5 >>>> >>>>*SOLR-14359 >>>> <https://iss

Re: [VOTE] Release Lucene/Solr 8.5.1 RC1

2020-04-10 Thread Adrien Grand
+1 SUCCESS! [2:38:04.620259] On Thu, Apr 9, 2020 at 1:28 PM jim ferenczi wrote: > +1 > > SUCCESS! [2:10:08.094546] > > On Thu, Apr 9, 2020 at 10:19, Alan Woodward > wrote: > >> +1 >> >> SUCCESS! [1:18:54.574272] >> >> On 8 Apr 2020, at 21:21, Nhat Nguyen >> wrote: >> >> +1 >> >> SUCCESS!

Re: Welcome Eric Pugh as a Lucene/Solr committer

2020-04-06 Thread Adrien Grand
Welcome Eric! On Mon, Apr 6, 2020 at 2:21 PM Jan Høydahl wrote: > Hi all, > > Please join me in welcoming Eric Pugh as the latest Lucene/Solr committer! > > Eric has been part of the Solr community for over a decade, as a code > contributor, book author, company founder, blogger and mailing

Re: Lucene/Solr 8.5.1 bugfix release

2020-04-02 Thread Adrien Grand
My general take on this is that it's ok to upgrade a dependency in a patch release if the dependency upgrade itself is a new patch release of the same minor version. The changelog of Tika 1.24 seems to include not only bug fixes but also some enhancements[1], so I'd rather do a 8.6 release in the

Re: Lucene/Solr 8.5.1 bugfix release

2020-04-02 Thread Adrien Grand
+1 On Thu, Apr 2, 2020 at 7:47 PM Ignacio Vera wrote: > Hi, > > I propose a quick 8.5.1 bugfix release and I volunteer as RM. The main > motivation for this release is LUCENE-9300 where Jim addressed a serious > bug that can lead to data corruption when merging indices via IW#addIndices. > > If

Re: Inconsistent query results in Lucene 8.1.0

2020-03-10 Thread Adrien Grand
Thanks for digging into this issue, Michele. On Tue, Mar 10, 2020 at 5:04 PM Michele Palmia wrote: > Fiona - I opened a ticket > for this. You can > find some recommendations there that might help you fix your issue. > -- Adrien

Re: Do we leverage index sort for filters?

2020-03-05 Thread Adrien Grand
We don't directly take advantage of index sort in this case, but index sorting still makes this faster. I had mentioned it in a presentation a couple years ago https://speakerdeck.com/elastic/get-the-lay-of-the-lucene-land-1?slide=14: querying geonames for TYPE:CITY AND CONTRY_CODE_US ran 1.6x

Re: CHANGES.txt and issue categorization

2020-03-04 Thread Adrien Grand
t;> * Other: Anything else: Refactorings, tests, build, docs, etc. And >> adding log statements. >> >> >> I recommend the following changes to Lucene 8.5: >> >> These are "Improvements" that I think are better categorized as >> &q

Welcome Nhat Nguyen to the PMC

2020-03-03 Thread Adrien Grand
I am pleased to announce that Nhat Nguyen has accepted the PMC's invitation to join. Welcome Nhat! -- Adrien

Re: 8.5 release

2020-02-18 Thread Adrien Grand
+1 On Tue, Feb 18, 2020 at 4:58 PM Alan Woodward wrote: > Hi all, > > It’s been a while since we released lucene-solr 8.4, and we’ve accumulated > quite a few nice new features since then. I’d like to volunteer to be a > release manager for an 8.5 release. If there's agreement, then I plan to

Re: Info on document number limitations

2020-02-14 Thread Adrien Grand
Lucene has a limit of 2^31-1-128 documents per index, see IndexWriter.MAX_DOCS. Users don't often run into this limit but I've seen it happen multiple times. I think that it's unlikely that Lucene will ever remove this limit on a per-segment basis, however there have been some discussions about
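For reference, the arithmetic behind the limit quoted above can be checked in a couple of lines of Python. The variable names are illustrative; the only assumption is that the first term, 2^31-1, is the largest signed 32-bit integer.

```python
# 2^31 - 1 is the largest value of a signed 32-bit int.
INT_MAX = 2**31 - 1

# The limit quoted in the message: 2^31 - 1 - 128 documents per index.
MAX_DOCS = INT_MAX - 128

print(INT_MAX)
print(MAX_DOCS)
```

This works out to 2,147,483,519 documents, a little over 2.1 billion.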

Re: [JENKINS-Experimental-GC] Lucene-Solr-8.x-Linux (64bit/jdk-12.0.2) - Build # 2035 - Still Unstable!

2020-02-06 Thread Adrien Grand
It should be fixed now. On Thu, Feb 6, 2020 at 11:55 AM Adrien Grand wrote: > I'm looking into it. > > On Thu, Feb 6, 2020 at 8:56 AM Policeman Jenkins Server < > jenk...@thetaphi.de> wrote: > >> Build: https://jenkins.thetaphi.de/job/Lucene-Solr-8.x-Linux/2035/ >

Re: [JENKINS-Experimental-GC] Lucene-Solr-8.x-Linux (64bit/jdk-12.0.2) - Build # 2035 - Still Unstable!

2020-02-06 Thread Adrien Grand
I'm looking into it. On Thu, Feb 6, 2020 at 8:56 AM Policeman Jenkins Server wrote: > Build: https://jenkins.thetaphi.de/job/Lucene-Solr-8.x-Linux/2035/ > Java: 64bit/jdk-12.0.2 -XX:+UseCompressedOops > -XX:+UnlockExperimentalVMOptions -XX:+UseShenandoahGC > > 1 tests failed. > FAILED:

Re: Solr 9.0?

2020-02-02 Thread Adrien Grand
One Lucene issue I'd like to have in 9.0 is https://issues.apache.org/jira/browse/LUCENE-9047, which is about making our directory abstractions little-endian instead of big-endian. It would be breaking enough that it should be done in a major. On Sun, Feb 2, 2020 at 3:35 PM Erick Erickson wrote:
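As a generic illustration of why switching byte order is a breaking change, the Python sketch below encodes the same 32-bit integer both ways with the standard `struct` module. It says nothing about Lucene's actual file formats; it only shows that bytes written in one order are misread in the other.

```python
import struct

value = 1

big = struct.pack(">i", value)     # big-endian: most significant byte first
little = struct.pack("<i", value)  # little-endian: least significant byte first

print(big.hex())     # 00000001
print(little.hex())  # 01000000

# Reading big-endian bytes as little-endian yields a different number.
misread = struct.unpack("<i", big)[0]
print(misread)
```

A reader and a writer that disagree on byte order therefore cannot share data without an explicit conversion step.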

Re: [lucene-solr] branch master updated: Synchronizing 8.4.1 changes

2020-01-22 Thread Adrien Grand
Given that 8.5.0 is not released yet, moving these entries from 8.5 to 8.4.1 feels consistent with instructions we have on the wiki: https://cwiki.apache.org/confluence/display/lucene/ReleaseTodo#ReleaseTodo-SynchronizeCHANGES.txt. Indeed we would have duplicated entries if we were releasing 8.4.2

Re: Congratulations to the new Lucene/Solr PMC Chair, Anshum Gupta!

2020-01-16 Thread Adrien Grand
Congratulations, Anshum! On Wed, Jan 15, 2020 at 10:15 PM Cassandra Targett wrote: > Every year, the Lucene PMC rotates the Lucene PMC chair and Apache Vice > President position. > > This year we have nominated and elected Anshum Gupta as the Chair, a > decision that the board approved in its

Heads up: the FSTOrd postings format is about to be removed

2020-01-15 Thread Adrien Grand
Hello all, Short version: I'm about to remove the FSTOrd postings format. Longer version: I'm trying to simplify the API we have for the terms dictionary to interact with postings. This involves removing an API that is only used by the FST and FSTOrd postings format, as a way to store

Re: [JENKINS] Lucene-Solr-NightlyTests-8.x - Build # 314 - Failure

2020-01-07 Thread Adrien Grand
I disabled NRTCachingDirectory when tests request a FSDirectory to fix these failures. On Sat, Jan 4, 2020 at 2:30 PM Apache Jenkins Server < jenk...@builds.apache.org> wrote: > Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-8.x/314/ > > 1 tests failed. > FAILED:
