Re: [VOTE] Release Lucene 10.0.0 RC2

2024-10-03 Thread Michael McCandless
+1

SUCCESS! [0:37:28.511639]

Mike McCandless

http://blog.mikemccandless.com


On Thu, Oct 3, 2024 at 3:44 PM Tomás Fernández Löbbe 
wrote:

> +1
> SUCCESS! [1:46:57.466093]
>
> On Thu, Oct 3, 2024 at 11:43 AM Michael Sokolov 
> wrote:
>
>> um, +1
>>
>> On Thu, Oct 3, 2024 at 10:39 AM Michael Sokolov 
>> wrote:
>> >
>> > SUCCESS! [1:24:53.393070]
>> >
>> > On Thu, Oct 3, 2024 at 9:43 AM Benjamin Trent 
>> wrote:
>> > >
>> > > +1 SUCCESS! [0:56:38.403983]
>> > >
>> > > On Thu, Oct 3, 2024 at 5:51 AM Stefan Vodita 
>> wrote:
>> > > >
>> > > > +1 SUCCESS! [0:39:04.597088]
>> > > >
>> > > >
>> > > > On Thu, 3 Oct 2024 at 07:48, Luca Cavanna 
>> wrote:
>> > > >>
>> > > >> Please vote for release candidate 2 for Lucene 10.0.0
>> > > >>
>> > > >> I published a draft of the release notes at
>> https://cwiki.apache.org/confluence/display/LUCENE/Release+Notes+10.0.0
>> . Feedback is welcome. Feel free to edit directly.
>> > > >>
>> > > >> The artifacts can be downloaded from:
>> > > >>
>> https://dist.apache.org/repos/dist/dev/lucene/lucene-10.0.0-RC2-rev-f76fdb293e1a9be99f3bb1fde38d85ad8272b081
>> > > >>
>> > > >> You can run the smoke tester directly with this command:
>> > > >>
>> > > >> python3 -u dev-tools/scripts/smokeTestRelease.py \
>> > > >>
>> https://dist.apache.org/repos/dist/dev/lucene/lucene-10.0.0-RC2-rev-f76fdb293e1a9be99f3bb1fde38d85ad8272b081
>> > > >>
>> > > >> The vote will be open for at least 72 hours i.e. until 2024-10-05
>> 21:00 UTC.
>> > > >>
>> > > >> [ ] +1  approve
>> > > >> [ ] +0  no opinion
>> > > >> [ ] -1  disapprove (and reason why)
>> > > >>
>> > > >> Here is my +1
>> > > >>
>> > > >> SUCCESS! [0:38:09.149221]
>> > > >>
>> > > >>
>> > >
>> > > -
>> > > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> > > For additional commands, e-mail: dev-h...@lucene.apache.org
>> > >
>>
>>


Re: [VOTE] Release Lucene 10.0.0 RC1

2024-10-01 Thread Michael McCandless
Hi Team,

I'm worried about https://github.com/apache/lucene/issues/13844 -- a hang
during HNSW indexing, uncovered in nightly benchy.  I'm seeing if the hang
reproduces ... but it might be a blocker for 10.0.0 release if it's real /
repros?  Then again, it must be quite rare ...

Mike McCandless

http://blog.mikemccandless.com


On Tue, Oct 1, 2024 at 12:36 PM Chris Hegarty
 wrote:

>
> SUCCESS! [0:53:16.589622]
>
> +1 (binding)
>
> -Chris.
>
> > On 1 Oct 2024, at 16:32, Luca Cavanna  wrote:
> >
> > Please vote for release candidate 1 for Lucene 10.0.0
> >
> > I published a draft of the release notes at
> https://cwiki.apache.org/confluence/display/LUCENE/Release+Notes+10.0.0 .
> Feedback is welcome. Feel free to edit directly.
> >
> > The artifacts can be downloaded from:
> >
> https://dist.apache.org/repos/dist/dev/lucene/lucene-10.0.0-RC1-rev-4461bc1eff44c26c149b6ff96a667ce68c6866a4
> >
> > You can run the smoke tester directly with this command:
> >
> > python3 -u dev-tools/scripts/smokeTestRelease.py \
> >
> https://dist.apache.org/repos/dist/dev/lucene/lucene-10.0.0-RC1-rev-4461bc1eff44c26c149b6ff96a667ce68c6866a4
> >
> > The vote will be open for at least 72 hours i.e. until 2024-10-04 15:00
> UTC.
> >
> > [ ] +1  approve
> > [ ] +0  no opinion
> > [ ] -1  disapprove (and reason why)
> >
> > Here is my +1:
> >
> > SUCCESS! [0:45:06.247409]
> >
> >
> > Cheers
> > Luca
>
>
>


Re: [VOTE] Release Lucene 9.12.0 RC2

2024-09-27 Thread Michael McCandless
On Wed, Sep 25, 2024 at 12:51 PM Dawid Weiss  wrote:

>
>
>> Mike's laptop is more than 2x faster than mine?!?!? Dang, do I need an
>> upgrade?
>
>
> Mike lives in the future and only occasionally drops by to the present.
> His computers are also from the future. It's always been like this - you
> can
> check the archives. You can't buy his "beast" machine because it hasn't
> been fabricated yet. Uwe can probably confirm that not only his machines
> are from the future but also the shower knobs at his house. :)
>

HA!

This run was on my Raptor Lake (i9 13900K CPU) workstation.  It's a fast
CPU, though maybe time to think about upgrading my dev box again ... you
know one computer year equates to 35 human years!  Similar to how one dog
year equates to 7 human years ...

The shower knobs are indeed from the future.

Mike

>


Re: [VOTE] Release Lucene 9.12.0 RC2

2024-09-25 Thread Michael McCandless
+1 to release:  SUCCESS! [0:19:16.468055]

Thanks Chris!

Mike McCandless

http://blog.mikemccandless.com


On Wed, Sep 25, 2024 at 9:57 AM Chris Hegarty
 wrote:

> Please vote for release candidate 2 for Lucene 9.12.0
>
> This build includes the bwc stuff with 8.11.4.
>
> The artifacts can be downloaded from:
>
> https://dist.apache.org/repos/dist/dev/lucene/lucene-9.12.0-RC2-rev-e913796758de3d9b9440669384b29bec07e6a5cd
>
> You can run the smoke tester directly with this command:
>
> python3 -u dev-tools/scripts/smokeTestRelease.py \
>
> https://dist.apache.org/repos/dist/dev/lucene/lucene-9.12.0-RC2-rev-e913796758de3d9b9440669384b29bec07e6a5cd
>
> The vote will be open for at least 72 hours i.e. until 2024-09-28 17:00
> UTC.
>
> [ ] +1  approve
> [ ] +0  no opinion
> [ ] -1  disapprove (and reason why)
>
> Here is my +1
>
> -Chris.
>
> P.S. Uwe, could I please ask you to run this new RC2 through your
> Policeman build and test?
>
>
>
>


Re: [VOTE] Release Lucene 9.12.0 RC1

2024-09-25 Thread Michael McCandless
On Wed, Sep 25, 2024 at 7:01 AM Chris Hegarty <
christopher.hega...@elastic.co> wrote:

> +1 for release, unless the uncovered issue with TieredMergePolicy causes
> malfunction.
>
> My analysis concludes that there is no product bug here. Yeah, maybe the
> policy could be a little more aggressive in its merging, but the issue that
> the test runs into looks like a corner case on the boundary of what the
> heuristics look for.
>
> https://github.com/apache/lucene/issues/13818#issuecomment-2371868712
>
> It would be good to have Adrien or Mike confirm.


I added a comment -- this looks like a rare corner case bug or test issue.
It should not be a 9.12.0 release blocker.

Thanks Chris.

Mike McCandless

http://blog.mikemccandless.com


Re: Lucene 10.0 and 9.12 blockers

2024-09-11 Thread Michael McCandless
On Tue, Sep 10, 2024 at 9:52 AM Michael McCandless <
luc...@mikemccandless.com> wrote:

>> I'm not clear if someone is actively looking into the recall issues with
>> 8-bit quantization <https://github.com/apache/lucene/issues/13519>?
>>
>
> If nobody jumps on this in the next day or so, I'll work up a PR to remove
> int8 for now.  Especially with possible RaBitQ (aside: I am not loving this
> sudden trend for acronyms to be case sensitive like this!!) coming, maybe
> the (less impactful?) scalar quantization will become less interesting...
>

OK I posted an initial PR for this:
https://github.com/apache/lucene/pull/13767

Mike McCandless

http://blog.mikemccandless.com

>


Re: Lucene 10.0 and 9.12 blockers

2024-09-10 Thread Michael McCandless
Thanks for driving this release process Adrien, and everyone for helping to
resolve the blockers!  More comments below:

On Mon, Sep 9, 2024 at 1:45 PM Adrien Grand  wrote:

> I just reviewed the humongous PR that migrates more classes to records,
> it looks pretty good to
> me. If someone can look at my comments (hopefully much quicker than
> reviewing the whole PR!), I would appreciate it.
>

+1 to merge this.

> I would like to get support for intra-segment search concurrency in,
> as it is breaking
> enough that we could not easily introduce it in a minor later on. It seems
> to be almost ready, so hopefully it will get merged before feature freeze
> next week?
>

+1


> I'm not clear if someone is actively looking into the recall issues with
> 8-bit quantization?
>

If nobody jumps on this in the next day or so, I'll work up a PR to remove
int8 for now.  Especially with possible RaBitQ (aside: I am not loving this
sudden trend for acronyms to be case sensitive like this!!) coming, maybe
the (less impactful?) scalar quantization will become less interesting...

Mike McCandless

http://blog.mikemccandless.com

>


Re: [JENKINS] Lucene-MMAPv2-Windows (64bit/hotspot/jdk-21.0.1) - Build # 1517 - Unstable!

2024-09-05 Thread Michael McCandless
I opened https://github.com/apache/lucene/issues/13720

Mike McCandless

http://blog.mikemccandless.com


On Thu, Sep 5, 2024 at 9:00 AM Michael McCandless 
wrote:

> This reproduces for me, on Linux and current main.  I'll open an issue ...
> looks likely to be a floating point ulp issue, wrong/different order of
> operations or so.
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Thu, Sep 5, 2024 at 12:07 AM Policeman Jenkins Server <
> jenk...@thetaphi.de> wrote:
>
>> Build: https://jenkins.thetaphi.de/job/Lucene-MMAPv2-Windows/1517/
>> Java: 64bit/hotspot/jdk-21.0.1 -XX:+UseCompressedOops -XX:+UseG1GC
>>
>> 1 tests failed.
>> FAILED:
>> org.apache.lucene.facet.taxonomy.TestTaxonomyFacetAssociations.testFloatSumAssociation
>>
>> Error Message:
>> java.lang.AssertionError: expected:<1832078.25> but was:<1832078.0>
>>
>> Stack Trace:
>> java.lang.AssertionError: expected:<1832078.25> but was:<1832078.0>
>> at
>> __randomizedtesting.SeedInfo.seed([DCA9002AF6829540:B1E122AE39E2A1D]:0)
>> at junit@4.13.1/org.junit.Assert.fail(Assert.java:89)
>> at junit@4.13.1/org.junit.Assert.failNotEquals(Assert.java:835)
>> at junit@4.13.1/org.junit.Assert.assertEquals(Assert.java:555)
>> at junit@4.13.1/org.junit.Assert.assertEquals(Assert.java:685)
>> at
>> org.apache.lucene.facet.taxonomy.TestTaxonomyFacetAssociations.assertFloatFacetResultsEqual(TestTaxonomyFacetAssociations.java:657)
>> at
>> org.apache.lucene.facet.taxonomy.TestTaxonomyFacetAssociations.testFloatSumAssociation(TestTaxonomyFacetAssociations.java:330)
>> at
>> java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
>> at java.base/java.lang.reflect.Method.invoke(Method.java:580)
>> at randomizedtesting.runner@2.8.1
>> /com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
>> at randomizedtesting.runner@2.8.1
>> /com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
>> at randomizedtesting.runner@2.8.1
>> /com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
>> at randomizedtesting.runner@2.8.1
>> /com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
>> at org.apache.lucene.test_framework@10.0.0-SNAPSHOT
>> /org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
>> at org.apache.lucene.test_framework@10.0.0-SNAPSHOT
>> /org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
>> at org.apache.lucene.test_framework@10.0.0-SNAPSHOT
>> /org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
>> at org.apache.lucene.test_framework@10.0.0-SNAPSHOT
>> /org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
>> at org.apache.lucene.test_framework@10.0.0-SNAPSHOT
>> /org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
>> at junit@4.13.1
>> /org.junit.rules.RunRules.evaluate(RunRules.java:20)
>> at randomizedtesting.runner@2.8.1
>> /com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at randomizedtesting.runner@2.8.1
>> /com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
>> at randomizedtesting.runner@2.8.1
>> /com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
>> at randomizedtesting.runner@2.8.1
>> /com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
>> at randomizedtesting.runner@2.8.1
>> /com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
>> at randomizedtesting.runner@2.8.1
>> /com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
>> at randomizedtesting.runner@2.8.1
>> /com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
>> at randomizedtesting.runner@2.8.1
>> /com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
>> at org.apache.lucene.test_framework@10.0.0-SNAPSHOT
>>

Re: [JENKINS] Lucene-MMAPv2-Windows (64bit/hotspot/jdk-21.0.1) - Build # 1517 - Unstable!

2024-09-05 Thread Michael McCandless
This reproduces for me, on Linux and current main.  I'll open an issue ...
looks likely to be a floating point ulp issue, wrong/different order of
operations or so.
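
To illustrate the kind of effect suspected here, below is a minimal sketch (not the test's actual code; the class and method names are made up for illustration) of how summing the same float values in a different order can change the result. For scale: floats between 2^20 and 2^21 are spaced 0.125 apart, so the reported gap between 1832078.25 and 1832078.0 is only two ulps.

```java
public class FloatSumOrder {
    // Hypothetical helper: sums left to right, rounding to float after each add.
    static float sumForward(float[] values) {
        float sum = 0f;
        for (float v : values) {
            sum += v;
        }
        return sum;
    }

    // Hypothetical helper: sums right to left, rounding to float after each add.
    static float sumBackward(float[] values) {
        float sum = 0f;
        for (int i = values.length - 1; i >= 0; i--) {
            sum += values[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        // 2^24 is where float spacing becomes 2, so adding 1 to it is lost to rounding.
        float[] values = {16777216f, 1f, 1f};
        System.out.println("forward:  " + sumForward(values));
        System.out.println("backward: " + sumBackward(values));
        System.out.println("ulp near the failing value: " + Math.ulp(1832078.0f));
    }
}
```

Going forward, each 1 is rounded away against the large value; going backward, the two 1s combine first and survive. A test that compares such sums with exact equality is therefore sensitive to accumulation order.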

Mike McCandless

http://blog.mikemccandless.com


On Thu, Sep 5, 2024 at 12:07 AM Policeman Jenkins Server <
jenk...@thetaphi.de> wrote:

> Build: https://jenkins.thetaphi.de/job/Lucene-MMAPv2-Windows/1517/
> Java: 64bit/hotspot/jdk-21.0.1 -XX:+UseCompressedOops -XX:+UseG1GC
>
> 1 tests failed.
> FAILED:
> org.apache.lucene.facet.taxonomy.TestTaxonomyFacetAssociations.testFloatSumAssociation
>
> Error Message:
> java.lang.AssertionError: expected:<1832078.25> but was:<1832078.0>
>
> Stack Trace:
> java.lang.AssertionError: expected:<1832078.25> but was:<1832078.0>
> at
> __randomizedtesting.SeedInfo.seed([DCA9002AF6829540:B1E122AE39E2A1D]:0)
> at junit@4.13.1/org.junit.Assert.fail(Assert.java:89)
> at junit@4.13.1/org.junit.Assert.failNotEquals(Assert.java:835)
> at junit@4.13.1/org.junit.Assert.assertEquals(Assert.java:555)
> at junit@4.13.1/org.junit.Assert.assertEquals(Assert.java:685)
> at
> org.apache.lucene.facet.taxonomy.TestTaxonomyFacetAssociations.assertFloatFacetResultsEqual(TestTaxonomyFacetAssociations.java:657)
> at
> org.apache.lucene.facet.taxonomy.TestTaxonomyFacetAssociations.testFloatSumAssociation(TestTaxonomyFacetAssociations.java:330)
> at
> java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
> at java.base/java.lang.reflect.Method.invoke(Method.java:580)
> at randomizedtesting.runner@2.8.1
> /com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
> at randomizedtesting.runner@2.8.1
> /com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
> at randomizedtesting.runner@2.8.1
> /com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
> at randomizedtesting.runner@2.8.1
> /com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
> at org.apache.lucene.test_framework@10.0.0-SNAPSHOT
> /org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
> at org.apache.lucene.test_framework@10.0.0-SNAPSHOT
> /org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
> at org.apache.lucene.test_framework@10.0.0-SNAPSHOT
> /org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
> at org.apache.lucene.test_framework@10.0.0-SNAPSHOT
> /org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
> at org.apache.lucene.test_framework@10.0.0-SNAPSHOT
> /org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
> at junit@4.13.1
> /org.junit.rules.RunRules.evaluate(RunRules.java:20)
> at randomizedtesting.runner@2.8.1
> /com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at randomizedtesting.runner@2.8.1
> /com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
> at randomizedtesting.runner@2.8.1
> /com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
> at randomizedtesting.runner@2.8.1
> /com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
> at randomizedtesting.runner@2.8.1
> /com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
> at randomizedtesting.runner@2.8.1
> /com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
> at randomizedtesting.runner@2.8.1
> /com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
> at randomizedtesting.runner@2.8.1
> /com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
> at org.apache.lucene.test_framework@10.0.0-SNAPSHOT
> /org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
> at randomizedtesting.runner@2.8.1
> /com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at org.apache.lucene.test_framework@10.0.0-SNAPSHOT
> /org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
> at randomizedtesting.runner@2.8.1
> /com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
> at randomizedtesting.runner@2.8.1
> /com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
>

Re: Lucene 10.0 and 9.12 blockers

2024-09-04 Thread Michael McCandless
On Sat, Aug 31, 2024 at 2:00 PM Shubham Chaudhary 
wrote:

> Hi, regarding the 10.0 release, should we also consider
> https://github.com/apache/lucene/pull/13328? It was planned for 10.0 (
> https://github.com/apache/lucene/issues/13207) and is waiting on review,
> so I think it'll be good if we could consider it. Looking forward to views
> and seeing if there are any concerns with the change I'm unaware of.
>

+1

It looks like this one is super close?  A couple of rounds of feedback from
Uwe, folded into the PR.  Maybe mark it blocker so we don't lose track?

Thanks Shubham.

Mike McCandless

http://blog.mikemccandless.com


> - Shubham
>
> On Thu, Aug 8, 2024 at 10:20 PM Adrien Grand  wrote:
>
>> Hello everyone,
>>
>> As previously discussed, I
>> plan on releasing 9.last and 10.0 under the following timeline:
>> - ~September 15th: 10.0 feature freeze - main becomes 11.0
>> - ~September 22nd: 9.last release,
>> - ~October 1st: 10.0 release.
>>
>> Unless someone shortly volunteers to do a 9.x release, this 9.last
>> release will likely be 9.12.
>>
>> As these dates are coming shortly, I would like to start tracking
>> blockers. Please reply to this thread with issues that you know about that
>> should delay the 9.last or 10.0 releases.
>>
>> Chris, Uwe: I also wanted to check with you if this timeline works well
>> with regards to supporting Java 23 in 9.last and 10.0?
>>
>> --
>> Adrien
>>
>


Re: Lucene 10.0 and 9.12 blockers

2024-09-04 Thread Michael McCandless
On Thu, Aug 29, 2024 at 5:15 AM Luca Cavanna 
wrote:

> For Lucene 10.0, I have two topics to raise:
>
> 1. Remove the deprecated IndexSearcher#search(Query, Collector) in favour
> of IndexSearcher#search(Query, CollectorManager)  (
> https://github.com/apache/lucene/issues/12892): this involves removing
> the leftover usages in facet, grouping, join and test-framework, plus in
> some tests. A list of the leftover usages is in the description of the
> issue. It would be great to complete this for Lucene 10, otherwise this
> deprecated method and usages will stick around for much longer. What do
> others think? Should we make this a blocker for the release? I think this
> is not a huge effort and it is parallelizable across different people.
>

+1, major releases are a great time to finish switching off of deprecated
classes and then removing them.


> 2. Intra-segment concurrency (https://github.com/apache/lucene/pull/13542):
> current thinking is to add support for partitioning segments when
> searching, and searching across segment partitions concurrently. My
> intention is to introduce breaking changes and documentation in Lucene 10
> (really only the basics), without switching the default slicing of
> IndexSearcher to create segment partitions. We will want to leverage
> segment partitions in testing. More iterations are going to be needed to
> remove duplicated work across partitions of the same segment, which is my
> next step, but currently out of scope for Lucene 10. Judging from the
> reviews I got so far, my PR is not far and I am working on it to address
> comments, polish it a bit more and merge it soon.
>

+1 to this strategy.
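
For readers unfamiliar with the idea, here is a rough sketch of what splitting one segment into doc ID partitions could look like. This is not the code from the PR: the `Partition` record and the `partition` helper are hypothetical names invented for illustration, and the even-split policy is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

public class SegmentPartitions {
    /** Hypothetical half-open doc ID range [minDocId, maxDocId) within one segment. */
    record Partition(int minDocId, int maxDocId) {}

    /**
     * Hypothetical helper: splits the doc ID space [0, maxDoc) of a single segment
     * into at most numPartitions contiguous ranges of near-equal size.
     */
    static List<Partition> partition(int maxDoc, int numPartitions) {
        if (maxDoc < 0 || numPartitions <= 0) {
            throw new IllegalArgumentException("maxDoc must be >= 0 and numPartitions > 0");
        }
        List<Partition> result = new ArrayList<>();
        if (maxDoc == 0) {
            return result; // empty segment: nothing to search
        }
        int count = Math.min(numPartitions, maxDoc); // never create empty partitions
        int base = maxDoc / count;
        int remainder = maxDoc % count;
        int start = 0;
        for (int i = 0; i < count; i++) {
            // The first `remainder` partitions take one extra doc each.
            int size = base + (i < remainder ? 1 : 0);
            result.add(new Partition(start, start + size));
            start += size;
        }
        return result;
    }

    public static void main(String[] args) {
        // Each range could then be handed to a separate search task.
        System.out.println(partition(10, 3));
    }
}
```

Each partition can be searched by its own thread, but per-segment setup work is then repeated once per partition, which is the duplicated work Luca mentions as a follow-up.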

Thanks Luca.

Mike McCandless

http://blog.mikemccandless.com

>


Re: gradle build gives spurious warnings about unreferenced license files?

2024-08-28 Thread Michael McCandless
I opened https://github.com/apache/lucene/issues/13695 to try to get to the
bottom of this.

Mike McCandless

http://blog.mikemccandless.com


On Tue, Aug 27, 2024 at 4:29 PM Michael McCandless <
luc...@mikemccandless.com> wrote:

> Hi Team,
>
> When I run "gradle check" on main, sometimes I get seemingly false
> positive warnings about dozens of unreferenced license files, like this:
>
> WARNING: there were unreferenced files under license folder:
>   - /s1/l/trunk/lucene/licenses/antlr4-runtime-LICENSE-BSD.txt
>   - /s1/l/trunk/lucene/licenses/antlr4-runtime-NOTICE.txt
>   - /s1/l/trunk/lucene/licenses/asm-LICENSE-BSD_LIKE.txt
>   - /s1/l/trunk/lucene/licenses/asm-NOTICE.txt
>   - /s1/l/trunk/lucene/licenses/asm-commons-LICENSE-BSD_LIKE.txt
>   - /s1/l/trunk/lucene/licenses/asm-commons-NOTICE.txt
>   - /s1/l/trunk/lucene/licenses/assertj-core-LICENSE-ASL.txt
>   - /s1/l/trunk/lucene/licenses/assertj-core-NOTICE.txt
>   - /s1/l/trunk/lucene/licenses/commons-codec-LICENSE-ASL.txt
>   - /s1/l/trunk/lucene/licenses/commons-codec-NOTICE.txt
>   ...
>
> If I then do a "./gradlew clean" and again "./gradlew check" the warnings
> seem to go away, but then if I immediately run "./gradlew check" again
> (after clean then check), the warnings return.
>
> Does anyone know why this happens?  Can we stop the false positives?  God
> forbid I someday actually introduce a true positive warning, I would never
> notice :)
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>


Re: Lucene 10.0 and 9.12 blockers

2024-08-28 Thread Michael McCandless
I think maybe also https://github.com/apache/lucene/issues/13519 should be
a blocker?  It looks like 8 bit vector HNSW quantization is broken (unless
I'm making a silly mistake with luceneutil tooling).

I've also set its milestone to 10.0.0.

Do we really not have a way to mark an issue a blocker for a given
release?  That's insane.  OK, well, I went and created a "blocker" label and
added it to GH 13519.  Greg, I'll also go mark your linked issue as
"blocker".

Mike McCandless

http://blog.mikemccandless.com


On Sat, Aug 24, 2024 at 2:33 PM Uwe Schindler  wrote:

> Hi,
>
> I updated Policeman Jenkins to have JDK 23 RC and JDK 24 EA releases.
>
> Uwe
>
> P.S.: Unfortunately I have to update the macOS Hackintosh VM to have a
> newer operating system version: JDK 22 and later no longer run on this
> machine.
> On 23.08.2024 at 10:41, Uwe Schindler wrote:
>
> Hi,
>
> In 9.x there's still the backport of
> https://github.com/apache/lucene/pull/13570 to be done. The PR appears in
> the changelog, but was not backported yet. Chris and I will do this soon.
>
> 9.last release on Sept 22 fits perfectly with the JDK 23 release (and we
> will have Panama Vector Support). I am setting up a Jenkins job with the
> latest RC now to verify that all the vector stuff works with 23.
>
> Uwe
> On 08.08.2024 at 18:50, Adrien Grand wrote:
>
> Hello everyone,
>
> As previously discussed, I
> plan on releasing 9.last and 10.0 under the following timeline:
> - ~September 15th: 10.0 feature freeze - main becomes 11.0
> - ~September 22nd: 9.last release,
> - ~October 1st: 10.0 release.
>
> Unless someone shortly volunteers to do a 9.x release, this 9.last release
> will likely be 9.12.
>
> As these dates are coming shortly, I would like to start tracking
> blockers. Please reply to this thread with issues that you know about that
> should delay the 9.last or 10.0 releases.
>
> Chris, Uwe: I also wanted to check with you if this timeline works well
> with regards to supporting Java 23 in 9.last and 10.0?
>
> --
> Adrien
>
> --
> Uwe Schindler
> Achterdiek 19, D-28357 Bremen
> https://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>
>


gradle build gives spurious warnings about unreferenced license files?

2024-08-27 Thread Michael McCandless
Hi Team,

When I run "gradle check" on main, sometimes I get seemingly false positive
warnings about dozens of unreferenced license files, like this:

WARNING: there were unreferenced files under license folder:
  - /s1/l/trunk/lucene/licenses/antlr4-runtime-LICENSE-BSD.txt
  - /s1/l/trunk/lucene/licenses/antlr4-runtime-NOTICE.txt
  - /s1/l/trunk/lucene/licenses/asm-LICENSE-BSD_LIKE.txt
  - /s1/l/trunk/lucene/licenses/asm-NOTICE.txt
  - /s1/l/trunk/lucene/licenses/asm-commons-LICENSE-BSD_LIKE.txt
  - /s1/l/trunk/lucene/licenses/asm-commons-NOTICE.txt
  - /s1/l/trunk/lucene/licenses/assertj-core-LICENSE-ASL.txt
  - /s1/l/trunk/lucene/licenses/assertj-core-NOTICE.txt
  - /s1/l/trunk/lucene/licenses/commons-codec-LICENSE-ASL.txt
  - /s1/l/trunk/lucene/licenses/commons-codec-NOTICE.txt
  ...

If I then do a "./gradlew clean" and again "./gradlew check" the warnings
seem to go away, but then if I immediately run "./gradlew check" again
(after clean then check), the warnings return.

Does anyone know why this happens?  Can we stop the false positives?  God
forbid I someday actually introduce a true positive warning, I would never
notice :)

Mike McCandless

http://blog.mikemccandless.com


Re: Branchless binary search in Java?

2024-08-27 Thread Michael McCandless
Thanks Adrien, this is really cool!  And the 10.4% speedup for
CountAndHighHigh, and ~2-4% speedup for other conjunctive queries, is
impressive!  It looks worth merging?

I put a few comments on the PR.  I do think other places in Lucene could
benefit over time (we don't need to test them before merging this already
impressive gain), e.g. range facet counting is heavy on binary search, and
is likely more "random access" ish: each value to be counted is binary
search'd to find the right bin in the range atom histogram to increment (
https://github.com/apache/lucene/blob/main/lucene/facet/src/java/org/apache/lucene/facet/range/LongRangeCounter.java#L77-L114
).
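
As a rough illustration of the technique (a sketch only, not the binarySearch4/5/6 code from Adrien's PR and not what LongRangeCounter does today), a branchless-style lower bound over sorted bin boundaries might look like the following. The class name, the method name, and the convention that each boundary is the inclusive upper end of a bin are assumptions made for this example, and whether the JIT actually emits cmov for the conditional is not guaranteed.

```java
public class BranchlessSearch {
    /**
     * Returns the index of the first element >= target in the sorted array,
     * or sorted.length if every element is smaller than target.
     */
    static int lowerBound(long[] sorted, long target) {
        if (sorted.length == 0) {
            return 0;
        }
        int base = 0;
        int len = sorted.length;
        // The loop runs a fixed number of times for a given length; the only
        // data-dependent choice is a side-effect-free select the JIT may turn into cmov.
        while (len > 1) {
            int half = len >>> 1;
            base = sorted[base + half - 1] < target ? base + half : base;
            len -= half;
        }
        // One final comparison decides between base and base + 1.
        return sorted[base] < target ? base + 1 : base;
    }

    public static void main(String[] args) {
        // Hypothetical bins: (-inf, 9], (9, 19], (19, 29], and an overflow bin.
        long[] upperBounds = {9L, 19L, 29L};
        int[] counts = new int[upperBounds.length + 1];
        for (long value : new long[] {3L, 15L, 19L, 42L}) {
            counts[lowerBound(upperBounds, value)]++;
        }
        System.out.println(java.util.Arrays.toString(counts));
    }
}
```

Because each counted value lands in an unpredictable bin, the comparison outcomes are hard for the branch predictor to guess, which is the situation where replacing branches with conditional moves tends to help.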

Mike McCandless

http://blog.mikemccandless.com


On Tue, Aug 27, 2024 at 5:21 AM Adrien Grand  wrote:

> For the curious, I managed to make Java use cmov instructions for binary
> search, see https://github.com/apache/lucene/pull/13692. binarySearch4,
> binarySearch5 and binarySearch6 internally use the cmov instruction.
>
> However, translating it into speedups for query evaluation proved quite
> challenging, as most queries in our benchmarks find the desired doc ID in
> the couple doc IDs after the current doc ID, which makes linear search hard
> to beat. I thought I'd still share it on this thread as we might have other
> places where making binary search branchless could prove useful. Some of
> you might also have other ideas how we could make advancing within a block
> faster.
>
> On Mon, Aug 14, 2023 at 5:31 PM Michael McCandless <
> luc...@mikemccandless.com> wrote:
>
>> Oh I realized this super interesting read (thanks Rob!) was accidentally
>> sent privately to me, so I'm reply-all'ing explicitly so everyone else gets
>> to read this too :)
>>
>> I especially love the charts and math at the end!
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>>
>> On Thu, Jul 27, 2023 at 8:02 AM Rob Audenaerde 
>> wrote:
>>
>>> Super interesting read!
>>>
>>> Btw. following the links of the internet I somehow ended up here, which
>>> is a nice in-depth comparison on binary-search approaches:
>>>
>>> https://en.algorithmica.org/hpc/data-structures/binary-search/
>>>
>>>
>>>
>>> On Thu, Jul 27, 2023 at 1:40 PM Michael McCandless <
>>> luc...@mikemccandless.com> wrote:
>>>
>>>> Hi Team,
>>>>
>>>> At Amazon (customer facing product search team) we've been playing with
>>>> / benchmarking Tantivy (exciting Rust search engine loosely inspired by
>>>> Lucene: https://github.com/quickwit-oss/tantivy, created by Paul
>>>> Masurel and developed now by Quickwit and the Tantivy open source dev
>>>> community) vs Lucene, by building a benchmark that blends
>>>> Tantivy's search-benchmark-game (
>>>> https://github.com/quickwit-oss/search-benchmark-game) and Lucene's
>>>> nightly benchmarks (luceneutil:
>>>> https://github.com/mikemccand/luceneutil and
>>>> https://home.apache.org/~mikemccand/lucenebench).
>>>>
>>>> It's great fun, and we would love more eyeballs to spot any remaining
>>>> unfair aspects of the comparison (thank you Uwe and Adrien for catching
>>>> things already!).  We are trying hard for an apples to apples comparison:
>>>> same (enwiki) corpus, same queries, confirming we get precisely the same
>>>> top N hits and total hit counts, latest versions of both engines.
>>>>
>>>> Indeed, Tantivy is substantially (2-3X?) faster than Lucene for many
>>>> queries, and we've been trying to understand why.
>>>>
>>>> Sometimes it is due to more efficient algorithms, e.g. the count()
>>>> API for pure disjunctive queries, which Adrien is now porting to Lucene
>>>> (thank you!), showing sizable (~80% faster in one query) speedups:
>>>> https://github.com/apache/lucene/issues/12358.  Other times may be due
>>>> to Rust's more efficient/immediate/Python-like GC, or direct access to SIMD
>>>> (Lucene now has this for aKNN search -- big speedup -- and maybe soon for
>>>> postings too?), and unsafe code, different skip data / block postings
>>>> encoding, or ...
>>>>
>>>> Specifically, one of the fascinating Tantivy optimizations is the
>>>> branchless binary search:
>>>> https://quickwit.io/blog/search-a-sorted-block.  Here's another blog
>>>> post about it (implemented in C++):
>>>> https://probablydance.com/2023

Re: Welcome Armin Braun as Lucene committer

2024-07-25 Thread Michael McCandless
Welcome Armin!

Mike McCandless

http://blog.mikemccandless.com


On Thu, Jul 25, 2024 at 5:10 AM Luca Cavanna  wrote:

> I'm pleased to announce that Armin Braun has accepted the PMC's invitation
> to become a Lucene committer.
>
> Armin, the tradition is that new committers introduce themselves with a
> brief bio.
>
> Thanks for your contributions so far and looking forward to the upcoming
> ones :)
>
> Congratulations and welcome!
>


Re: Lucene Cyborg

2024-07-22 Thread Michael McCandless
Thanks for sharing Adrien, this is really cool!  It's neat that the
relative gains of Java vs C are quite a bit less than they were ~11 years
ago when I played with a much smaller subset of queries.  Also, COUNT on
disjunction queries with Lucene Cyborg got slower.  What a feat, to port so
much of our complex Search code to C!

Mike McCandless

http://blog.mikemccandless.com


On Mon, Jul 22, 2024 at 9:43 AM Adrien Grand  wrote:

> Hello everyone,
>
> I recently stumbled on this paper after Ishan shared it on LinkedIn:
> https://github.com/0ctopus13prime/lucene-cyborg-paper/blob/main/LuceneCyborg_Hybrid_Search_Engine_Written_in_Java_and_C%2B%2B.pdf
> .
>
> This is quite impressive: this person did a high-fidelity rewrite of
> Lucene in C++: it can even read indexes created by Lucene as-is. Then they
> ran the Tantivy benchmark to compare performance with Lucene, Tantivy and
> PISA. There are many takeaways, this is an interesting read.
>
> --
> Adrien
>


Re: Supporting more than 2 versions of indexes

2024-07-01 Thread Michael McCandless
Increasing the scope/duration of backwards compatibility index support
across the board adds a big tax and risk to ongoing development.  It's
hard enough just supporting N-1 major release written indices.

Or are we talking about the "best effort" (e.g. sandbox Codecs) that I
think Simon pursued a while back?

Mike McCandless

http://blog.mikemccandless.com


On Thu, Jun 27, 2024 at 4:39 PM Anshum Gupta  wrote:

> I'm actually only considering support for 8x+ but I think the default
> codec, used by most users, should allow for 7x indexes to be read by 9x. If
> we can do this for 8x+ i.e. indexes generated with 8x being supported by 10
> would be a good starting point as well.
>
> On Thu, Jun 27, 2024 at 1:23 PM Ishan Chattopadhyaya <
> ichattopadhy...@gmail.com> wrote:
>
>> +1, we should definitely give this a try. Do you have any particular
>> version combinations in mind that don't work for users now? On my end, I
>> see Solr 8x users who would love to use Solr 9x, but with Lucene 8x indexes
>> (previously upgraded from Lucene 7x).
>>
>> On Thu, 27 Jun 2024 at 23:17, Anshum Gupta 
>> wrote:
>>
>>> Hi everyone,
>>>
>>> At Buzzwords and Community Over Code this last month, the topic of
>>> supporting indexes for over 2 versions came up.
>>>
>>> While there are times that require breaking compatibility, I think it
>>> would be really useful to support the indexes especially if you use a codec
>>> that doesn't have a breaking change. This would be extremely useful for
>>> users and would allow them to upgrade without the need to plan for complete
>>> reindexing.
>>>
>>> What do other folks think?
>>>
>>> --
>>> Anshum Gupta
>>>
>>
>
> --
> Anshum Gupta
>


Re: [VOTE] Release Lucene 9.11.1 RC1

2024-06-27 Thread Michael McCandless
+1

SUCCESS! [0:19:43.387183]

Thank you for RMing Ignacio!

Mike McCandless

http://blog.mikemccandless.com


On Mon, Jun 24, 2024 at 1:29 AM Ignacio Vera  wrote:

> Please vote for release candidate 1 for Lucene 9.11.1
>
> The artifacts can be downloaded from:
> https://dist.apache.org/repos/dist/dev/lucene/lucene-9.11.1-RC1-rev-0c087dfdd10e0f6f3f6faecc6af4415e671a9e69
>
> You can run the smoke tester directly with this command:
>
> python3 -u dev-tools/scripts/smokeTestRelease.py \
> https://dist.apache.org/repos/dist/dev/lucene/lucene-9.11.1-RC1-rev-0c087dfdd10e0f6f3f6faecc6af4415e671a9e69
>
> The vote will be open for at least 72 hours i.e. until 2024-06-27 07:00
> UTC.
>
> [ ] +1  approve
> [ ] +0  no opinion
> [ ] -1  disapprove (and reason why)
>
> Here is my +1
>


Re: Lucene 10

2024-06-27 Thread Michael McCandless
Thanks Adrien.  Longish term planning in open source is such a hard thing
so I'm glad you are helping to herd us cats ;)

I've also finally switched our nightly benchmarks to use concurrent search
(intra-query concurrency)!  It's annotation GM in the charts.  Some queries
got faster, like BooleanQuery disjunction of two high frequency terms (
https://home.apache.org/~mikemccand/lucenebench/OrHighHigh.html) and some
got slower e.g. simple TermQuery (
https://home.apache.org/~mikemccand/lucenebench/Term.html).  Now as we make
improvements to Lucene's cross-slice / cross-thread search concurrency,
e.g. intra-segment concurrency, we should be able to see the gains in our
nightly benchmarks.  Adding concurrency to Lucene has been such a long and
fun road, and we are really only getting started in search-time concurrency.
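
As a rough picture of what intra-query concurrency means here, the sketch below splits an index into slices of segments, collects a top-k per slice on a thread pool, and merges the partial results. It is a toy model under my own assumptions (hits are plain (score, doc_id) tuples, and the function names are invented); it is not IndexSearcher's implementation.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def search_slice(segments, k):
    """Collect the top-k (score, doc_id) hits from one slice of segments."""
    hits = (hit for segment in segments for hit in segment)
    return heapq.nlargest(k, hits)


def concurrent_top_k(slices, k, max_workers=4):
    """Search every slice on the pool, then merge the per-slice top-k lists."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        per_slice = list(pool.map(lambda s: search_slice(s, k), slices))
    merged = (hit for top in per_slice for hit in top)
    return heapq.nlargest(k, merged)


# Two slices; each segment is a list of (score, doc_id) pairs (toy data).
slices = [
    [[(0.9, 1), (0.2, 2)], [(0.5, 3)]],
    [[(0.7, 4), (0.8, 5)]],
]
top_hits = concurrent_top_k(slices, k=3)
```

Each slice does its own top-k collection and there is an extra merge step at the end, so the per-query cost is not simply divided by the number of threads.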

Mike McCandless

http://blog.mikemccandless.com


On Wed, Jun 26, 2024 at 10:59 AM Adrien Grand  wrote:

> Hello everyone,
>
> Time flies, I started this email thread ~3.5 months ago and we now have ~3
> months before September 22nd, where 10.0 will go on feature freeze.
>
> Robert kindly added a description to the GitHub milestone that refers to
> this thread: https://github.com/apache/lucene/milestone/2.
>
> Overall, progress looks rather good to me:
>  - I/O concurrency is progressing nicely
> https://github.com/apache/lucene/issues/13179. In particular I'm hoping
> to merge I/O concurrency for terms dictionary lookups soon.
>  - Ignacio recently merged initial support for sparse indexing.
> https://github.com/apache/lucene/issues/11432 There are follow-ups we
> need to address, but they look reasonable in terms of amount of work and
> uncontroversial.
>
> Some things have got less traction:
>  - We haven't made significant progress on intra-segment search
> concurrency: https://github.com/apache/lucene/issues/9721.
>  - Relatedly, if we think that IndexSearcher should enable concurrency by
> default, a major version is a good time to make such a big change to
> runtime behavior. https://github.com/apache/lucene/issues/11523
>
> In any case, help is welcome. I know people have been creating more issues
> that they attached to the 10.0 milestone, e.g. doing more off-heap scoring
> for vectors https://github.com/apache/lucene/issues/13515 or deprecating
> the COSINE similarity https://github.com/apache/lucene/issues/13281. This
> is great too, the list isn't closed, I'll start thinking harder about which
> changes specifically should block the release as we get closer to September
> (I can't think of any at the moment). In the meantime, it's fine to
> optimistically attach issues to the 10.0 milestone.
>
> On Wed, Mar 20, 2024 at 2:09 PM Adrien Grand  wrote:
>
>> Thanks Mike and Dawid for the kind words, and thanks Patrick, Luca and
>> Egor for your interest in decoupling index geometry from search
>> concurrency, this would be a great release highlight if we can get it into
>> Lucene 10!
>>
>> I haven't seen pushback on the proposed schedule so I plan on proceeding
>> with this timeline in mind.
>>
>> If you have changes that you would like to include in Lucene 10.0, please
>> add the 10.0 milestone to them. It's ok to
>> be a bit ambitious at this stage and optimistically mark some changes as
>> scheduled for 10.0, we'll have opportunities for removing items from this
>> list when the date comes closer and some issues are not getting proper
>> traction. I'll take care of that.
>>
>> On Mon, Mar 18, 2024 at 11:39 AM Dawid Weiss 
>> wrote:
>>
>>> [...] but Adrien I don't honestly believe anyone who is
>>> paying attention thinks that is what you have been doing!
>>>
>>>
>>> +1. I wish I were procrastinating as productively!
>>>
>>> D.
>>>
>>
>>
>> --
>> Adrien
>>
>
>
> --
> Adrien
>


Re: [VOTE] Release Lucene 9.11.0 RC1

2024-06-05 Thread Michael McCandless
+1 SUCCESS! [0:24:55.332837]

Mike McCandless

http://blog.mikemccandless.com


On Wed, Jun 5, 2024 at 11:21 AM Adrien Grand  wrote:

> +1 SUCCESS! [1:09:30.262027]
>
> On Wed, Jun 5, 2024 at 4:15 PM Tomás Fernández Löbbe <
> tomasflo...@gmail.com> wrote:
>
>> +1
>>
>> SUCCESS! [1:12:30.029470]
>>
>> On Wed, Jun 5, 2024 at 9:22 AM Bruno Roustant 
>> wrote:
>>
>>> +1
>>>
>>> SUCCESS! [0:41:14.593265]
>>>
>>> Bruno
>>>

>
> --
> Adrien
>


Re: Lucene 9.11

2024-05-29 Thread Michael McCandless
Thanks Ben!

Mike McCandless

http://blog.mikemccandless.com


On Wed, May 29, 2024 at 12:45 AM Stefan Vodita 
wrote:

> Ben, I just merged #13414,
> so it's not a blocker for the release.
> Thanks again for volunteering to be release manager!
>
> Stefan
>
> On Tue, 28 May 2024 at 14:58, Benjamin Trent 
> wrote:
>
>> Hey y'all,
>>
>> I am planning on starting the release process tomorrow (May 29).
>>
>> I am in the Eastern USA time zone, so I will start the process around
>> noon UTC.
>>
>> I noticed one PR from Stefan. I can wait for that one if I need to.
>>
>> Did we figure out the hppc concerns? I saw some PR activity, wanted to
>> make sure we are all still good with starting the release process this week.
>>
>> Anything else I should be aware of or wait for?
>>
>> Thanks!
>>
>> Ben Trent
>>
>> On Wed, May 15, 2024, 3:58 AM Chris Hegarty
>>  wrote:
>>
>>> +1
>>>
>>> -Chris.
>>>
>>> > On 14 May 2024, at 16:10, Adrien Grand  wrote:
>>> >
>>> > +1 the 9.11 changelog looks great!
>>> >
>>> > On Tue, May 14, 2024 at 4:50 PM Benjamin Trent 
>>> wrote:
>>> > Hey y'all,
>>> >
>>> > Looking at changes for 9.11, we are building a significant list. I
>>> propose we do a release in the next couple of weeks.
>>> >
>>> > While this email is a little early (I am about to go on vacation for a
>>> bit), I volunteer myself as release manager.
>>> >
>>> > Unless there are objections, I plan on kicking off the release process
>>> May 28th.
>>> >
>>> > Thanks!
>>> >
>>> > Ben
>>> >
>>> >
>>> > --
>>> > Adrien
>>>
>>>
>>>
>>>


Re: [JENKINS] Lucene » Lucene-NightlyTests-main - Build # 1315 - Still Unstable!

2024-04-02 Thread Michael McCandless
Hmm this failure looks not great.

I tried the "Reproduce with:" for one of the failures (see below) but it
fails to run any tests at all?  Maybe because of the cool parameterized
testing we now have for our back compat tests?  If I remove the "{...}"
pattern then the failures do repro.

./gradlew :lucene:backward-codecs:test --tests
"org.apache.lucene.backward_index.TestBinaryBackwardsCompatibility.testSearchOldIndex
{Lucene-Version:9.10.1; Pattern: unsupported.%1$s-cfs.zip}" -Ptests.jvms=4
-Ptests.jvmargs= -Ptests.seed=AED171B21972F50D
-Ptests.multiplier=2 -Ptests.nightly=true -Ptests.gui=true
-Ptests.file.encoding=ISO-8859-1
-Ptests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene/Lucene-NightlyTests-main/test-data/enwiki.random.lines.txt
-Ptests.vectorsize=256
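
The workaround described above (removing the "{...}" pattern from the test name before passing it to --tests) can be sketched as a small helper. The helper name is made up, and the regular expression assumes the parameter description is always a single trailing "{...}" group, which is all the Jenkins output here shows.

```python
import re


def strip_parameters(test_name):
    """Drop a trailing ' {...}' parameter description from a test name."""
    return re.sub(r"\s*\{.*\}\s*$", "", test_name)


failed = (
    "org.apache.lucene.backward_index.TestBinaryBackwardsCompatibility"
    ".testSearchOldIndex {Lucene-Version:9.10.1; Pattern: unsupported.%1$s-cfs.zip}"
)
# Value to pass to: ./gradlew :lucene:backward-codecs:test --tests "<tests_filter>"
tests_filter = strip_parameters(failed)
```

With the suffix removed the filter names the whole test method, so presumably every parameterized variant of it runs rather than only the failing one.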

Mike McCandless

http://blog.mikemccandless.com


On Tue, Apr 2, 2024 at 4:52 AM Apache Jenkins Server <
jenk...@builds.apache.org> wrote:

> Build:
> https://ci-builds.apache.org/job/Lucene/job/Lucene-NightlyTests-main/1315/
>
> 6 tests failed.
> FAILED:
> org.apache.lucene.backward_index.TestBinaryBackwardsCompatibility.testSearchOldIndex
> {Lucene-Version:9.10.1; Pattern: unsupported.%1$s-cfs.zip}
>
> Error Message:
> java.lang.AssertionError: Index name 9.10.1 not found:
> unsupported.9.10.1-cfs.zip
>
> Stack Trace:
> java.lang.AssertionError: Index name 9.10.1 not found:
> unsupported.9.10.1-cfs.zip
> at
> __randomizedtesting.SeedInfo.seed([AED171B21972F50D:E4679B8937FD59F]:0)
> at junit@4.13.1/org.junit.Assert.fail(Assert.java:89)
> at junit@4.13.1/org.junit.Assert.assertTrue(Assert.java:42)
> at junit@4.13.1/org.junit.Assert.assertNotNull(Assert.java:713)
> at
> org.apache.lucene.backward_index.BackwardsCompatibilityTestBase.setUp(BackwardsCompatibilityTestBase.java:145)
> at
> java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
> at java.base/java.lang.reflect.Method.invoke(Method.java:580)
> at randomizedtesting.runner@2.8.1
> /com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
> at randomizedtesting.runner@2.8.1
> /com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:980)
> at randomizedtesting.runner@2.8.1
> /com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
> at org.apache.lucene.test_framework@10.0.0-SNAPSHOT
> /org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
> at org.apache.lucene.test_framework@10.0.0-SNAPSHOT
> /org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
> at org.apache.lucene.test_framework@10.0.0-SNAPSHOT
> /org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
> at org.apache.lucene.test_framework@10.0.0-SNAPSHOT
> /org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
> at org.apache.lucene.test_framework@10.0.0-SNAPSHOT
> /org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
> at junit@4.13.1
> /org.junit.rules.RunRules.evaluate(RunRules.java:20)
> at randomizedtesting.runner@2.8.1
> /com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at randomizedtesting.runner@2.8.1
> /com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
> at randomizedtesting.runner@2.8.1
> /com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
> at randomizedtesting.runner@2.8.1
> /com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
> at randomizedtesting.runner@2.8.1
> /com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
> at randomizedtesting.runner@2.8.1
> /com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
> at randomizedtesting.runner@2.8.1
> /com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
> at randomizedtesting.runner@2.8.1
> /com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
> at org.apache.lucene.test_framework@10.0.0-SNAPSHOT
> /org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
> at randomizedtesting.runner@2.8.1
> /com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at org.apache.lucene.test_framework@10.0.0-SNAPSHOT
> /org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
> at randomizedtesting.runner@2.8.1
> /com.carrotsearch.randomizedtesting.r

Re: Query about the GitHub statistics for Lucene

2024-03-06 Thread Michael McCandless
On Wed, Mar 6, 2024 at 4:41 AM Chris Hegarty 
wrote:

> Seems that I’ve fallen into the newbie PMC Chair rabbit hole! ;-) - the
> reporting tool has long standing issues. Maybe they’re fixable, maybe not,
> but it’s possible we don’t necessarily need it now.
>

Sorry :)  Seems to be a rite-of-passage at this point!  It should be
mentioned in the handover instructions... or, we should simply merge Daniel
Gruno's one-line fix to the regexp that Kibble/Whimsy/reporter tool uses:
https://issues.apache.org/jira/browse/COMDEV-425?focusedCommentId=17823767&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17823767

> @Mike is it possible to add a “created since” filter?
>

Ahh good idea, done!
https://githubsearch.mikemccandless.com/search.py?sort=recentlyUpdated&dd=created%3APast+3+months&dd=issue_or_pr%3APR
(this is PRs created in the Past 3 months ... it shows 36 open and 162
closed right now, close to the GitHub counts you found).

Here's the luceneserver commit that adds it:
https://github.com/mikemccand/luceneserver/commit/397942573bed3e2c4fd00ab0a324a19fd014bfd4
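
Judging only from the URLs in this thread, each facet drill-down appears to travel as a repeated dd= query parameter of the form "dimension:value". Under that assumption (I have not checked it against the luceneserver source), a reporting script could build and read such links like this; the helper name is made up.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE = "https://githubsearch.mikemccandless.com/search.py"


def search_url(sort, filters):
    """Build a search URL with one repeated dd= parameter per facet filter."""
    params = [("sort", sort)] + [("dd", f"{dim}:{value}") for dim, value in filters]
    return BASE + "?" + urlencode(params)


url = search_url(
    "recentlyUpdated",
    [("created", "Past 3 months"), ("issue_or_pr", "PR")],
)
# Reading the filters back out of a URL:
drill_downs = parse_qs(urlsplit(url).query)["dd"]
```

The generated URL is the same "PRs created in the Past 3 months" link quoted above.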

Mike McCandless

http://blog.mikemccandless.com


Re: Query about the GitHub statistics for Lucene

2024-03-05 Thread Michael McCandless
Found the prior discussion/issue:
https://lists.apache.org/thread/fhzw0y7kpnf48cxfml8t0313sdswdv6b

And a prior prior discussion:
https://lists.apache.org/thread/6rsr8v982fjqgyopprqzw057cpzfnz3z

Issue: https://issues.apache.org/jira/browse/COMDEV-425.  Jan seemed to get
close to fixing the (regexp?) bug!

Mike McCandless

http://blog.mikemccandless.com


On Tue, Mar 5, 2024 at 1:03 PM Michael McCandless 
wrote:

>
> On Tue, Mar 5, 2024 at 4:49 AM Chris Hegarty <
> christopher.hega...@elastic.co> wrote:
>
>> In preparation for the project’s upcoming ASF board report, I came across
>> and reported [1] an issue with the GH statistics, available at:
>> https://reporter.apache.org/wizard/statistics?lucene
>>
>> It appears that there is no GH activity for 2024! Clearly this is
>> incorrect. I’ve yet to track down what’s going on with this. Familiar to
>> anyone here?
>
>
> There is a long-standing INFRA issue about this.  Lemme try to locate it
> ...
>
>> @Mike. Would it be possible to add a “Past 3 months” to
>> https://githubsearch.mikemccandless.com/search.py ? Which would be
>> helpful when reporting.
>>
>
> Good idea!  Done!
> https://githubsearch.mikemccandless.com/search.py?sort=recentlyUpdated&dd=status%3AOpen&dd=updated%3APast+3+months
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>


Re: Query about the GitHub statistics for Lucene

2024-03-05 Thread Michael McCandless
On Tue, Mar 5, 2024 at 4:49 AM Chris Hegarty 
wrote:

> In preparation for the project’s upcoming ASF board report, I came across
> and reported [1] an issue with the GH statistics, available at:
> https://reporter.apache.org/wizard/statistics?lucene
>
> It appears that there is no GH activity for 2024! Clearly this is
> incorrect. I’ve yet to track down what’s going on with this. Familiar to
> anyone here?


There is a long-standing INFRA issue about this.  Lemme try to locate it
...

> @Mike. Would it be possible to add a “Past 3 months” to
> https://githubsearch.mikemccandless.com/search.py ? Which would be
> helpful when reporting.
>

Good idea!  Done!
https://githubsearch.mikemccandless.com/search.py?sort=recentlyUpdated&dd=status%3AOpen&dd=updated%3APast+3+months

Mike McCandless

http://blog.mikemccandless.com


Re: Announcing githubsearch!

2024-02-26 Thread Michael McCandless
Done!  Deployed!  Thank you Mike S.

Though on my "dark mode" Chrome on a Macbook, it's super dark.  I can make
it out but I gotta stare for a bit ... do they make light and dark mode
.ico files in one!?

Mike McCandless

http://blog.mikemccandless.com


On Sun, Feb 25, 2024 at 6:05 PM Michael Sokolov  wrote:

> here is a favicon you might want to try: I cropped the "VL" from the
> Apache Lucene logo (ok I guess it's an AL) -- if you save it as
> favicon.ico in the root of your website (ie as url /favicon.ico) it
> should show up in bookmarks, browser toolbars, etc as a handy memory
> aid. Of course you might have other ideas for a picture - it's
> actually pretty easy to make the favicon once you have a picture you
> like; I followed the instructions here
>
> https://www.logikfabrik.se/blog/how-to-create-a-multisize-favicon-using-gimp/
>
> On Thu, Feb 22, 2024 at 10:48 AM Zhang Chao <80152...@qq.com.invalid>
> wrote:
> >
> > Great job! Thanks Mike!
> >
> > On Feb 22, 2024, at 22:31, Alessandro Benedetti wrote:
> >
> > That's cool Mike! Well done!
> >
> > On Wed, 21 Feb 2024, 22:02 Anshum Gupta,  wrote:
> >>
> >> This is great! Like always, thank you Mike!
> >>
> >> On Mon, Feb 19, 2024 at 8:40 AM Michael McCandless <
> luc...@mikemccandless.com> wrote:
> >>>
> >>> Hi Team,
> >>>
> >>> ~1.5 years ago (August 2022) we migrated our Lucene issue tracking
> from Jira to GitHub. Thank you Tomoko for all the hard work doing such a
> complex, multi-phased, high-fidelity migration!
> >>>
> >>> I finally finished also migrating jirasearch to GitHub:
> githubsearch.mikemccandless.com. It was tricky because GitHub issues/PRs
> are fundamentally more complex than Jira's data model, and the GitHub REST
> API is also quite rich / heavily normalized. All of the source code for
> githubsearch lives here. The UI remains its barebones self ;)
> >>>
> >>> Githubsearch is dog food for us: it showcases Lucene (currently
> 9.8.0), and many of its fun features like infix autosuggest, block join
> queries (each comment is a sub-document on the issue/PR), DrillSideways
> faceting, near-real-time indexing/searching, synonyms (try “oome”),
> expressions, non-relevance and blended-relevance sort, etc.  (This old blog
> post goes into detail.)  Plus, it’s meta-fun to use Lucene to search its
> own issues, to help us be more productive in improving Lucene!  Nicely
> recursive.
> >>>
> >>> In addition to good ol’ searching by text, githubsearch has some
> new/fun features:
> >>>
> >>> - Drill down to just PRs or issues
> >>> - Filter by “review requested” for a given user: poor Adrien has 8
> (open) now (sorry)! Or see your mentions (Robert is mentioned in 27 open
> issues/PRs). Or PRs that you reviewed (Uwe has reviewed 9 still-open PRs).
> Or issues and PRs where a user has had any involvement at all (Dawid has
> interacted on 197 issues/PRs).
> >>> - Find still-open PRs that were created by a New Contributor (an author
> who has no changes merged into our repository) or Contributor
> (non-committer who has had some changes merged into our repository) or
> Member
> >>> - Here are the uber-stale (last touched more than a month ago) open PRs
> by outside contributors. We should ideally keep this at 0, but it’s 83 now!
> >>> - “Link to this search” to get a short-er, more permanent URL (it is NOT
> a URL shortener, though!)
> >>> - Save named searches you frequently run (they just save to local cookie
> state on that one browser)
> >>>
> >>> I’m sure there are exciting bugs, feedback/patches welcome!  If you
> see problems, please reply to this email or file an issue here.
> >>>
> >>> Note that jirasearch remains running, to search Solr, Tika and Infra
> issues.
> >>>
> >>> Happy Searching,
> >>>
> >>> Mike McCandless
> >>>
> >>> http://blog.mikemccandless.com
> >>
> >>
> >>
> >> --
> >> Anshum Gupta
> >
> >
>


Re: [Vote] Bump the Lucene main branch to Java 21

2024-02-26 Thread Michael McCandless
+1, exciting!

Mike McCandless

http://blog.mikemccandless.com


On Fri, Feb 23, 2024 at 6:24 AM Chris Hegarty
 wrote:

> Hi,
>
> Since the discussion on bumping the Lucene main branch to Java 21 is
> winding down, let's hold a vote on this important change.
>
> Once bumped, the next major release of Lucene (whenever that will be) will
> require a version of Java greater than or equal to Java 21.
>
> The vote will be open for at least 72 hours (and allow some additional
> time for the weekend) i.e. until 2024-02-28 12:00 UTC.
>
> [ ] +1  approve
> [ ] +0  no opinion
> [ ] -1  disapprove (and reason why)
>
> Here is my +1
>
> -Chris.
>
>


Re: Bump the Lucene main branch to Java 21

2024-02-21 Thread Michael McCandless
On Wed, Feb 21, 2024 at 7:41 AM Chris Hegarty
 wrote:

> So I think this means we are now free to use all the newfangled language
> features since Java 11 (min required for Lucene 9.x) -> Java 21?
>
> For the _main_ branch, yes.
>
> The _branch_9x_ remains unchanged - it stays on Java 11.
>
> So, if you’re planning to backport a change from main to 9x, then you may
> want to consider what Java language feature and/or JDK API you use - to
> make the backport more straightforward. But this is nothing new, _main_ is
> already on Java 17, while 9x is on Java 11, so the scenario already exists,
> just that the range is changing with this proposal. Hope this helps.
>

Thanks Chris, this makes sense!  So what's new with this change is on main
branch we can now use new language features from Java 17 -> Java 21.  But
on backport to 9.x we must still use only Java 11.

Thanks!

Mike McCandless

http://blog.mikemccandless.com


Re: Bump the Lucene main branch to Java 21

2024-02-21 Thread Michael McCandless
Thank you for the heads up Chris.

So I think this means we are now free to use all the newfangled language
features since Java 11 (min required for Lucene 9.x) -> Java 21?

Mike McCandless

http://blog.mikemccandless.com


On Wed, Feb 21, 2024 at 3:58 AM Chris Hegarty
 wrote:

> Hi,
>
> A number of us have been iterating on a PR to bump the Lucene main branch
> to a minimum of Java 21 [1]. The work is in a good state and is almost
> ready to commit.
>
> While the changes themselves are not large, the impact is arguably larger.
> So I’m raising awareness here with the wider group.
>
> Clearly one could conflate the bump to Java 21 with the question of when
> will Lucene have a next major release, but those issues, while somewhat
> related, are orthogonal. My position is that the next Lucene major should
> be on Java 21, regardless of when that will happen.
>
> Comments, feedback, suggestions welcome.
>
> Thanks,
> -Chris.
>
> [1] https://github.com/apache/lucene/pull/12753
>
>
>
>


Re: Welcome Zhang Chao as Lucene committer

2024-02-21 Thread Michael McCandless
Welcome Chao!

Mike McCandless

http://blog.mikemccandless.com


On Wed, Feb 21, 2024 at 5:02 AM Stefan Vodita 
wrote:

> Congratulations, Chao!
>
> On Tue, 20 Feb 2024 at 17:28, Adrien Grand  wrote:
>
>> I'm pleased to announce that Zhang Chao has accepted the PMC's
>> invitation to become a committer.
>>
>> Chao, the tradition is that new committers introduce themselves with a
>> brief bio.
>>
>> Congratulations and welcome!
>>
>> --
>> Adrien
>>
>


Re: Announcing githubsearch!

2024-02-21 Thread Michael McCandless
On Tue, Feb 20, 2024 at 10:06 AM Stefan Vodita 
wrote:

> Thank you Mike, I really like all the facets!
>

Me too lol.  It was one of the big motivators for me to build this out.
GitHub's search didn't have all the facet drill-downs/up/sideways I
wanted.  Some of them are super useful like "which PRs have review
requested for me" or "where am I mentioned".
Also, GitHub's filter choices do not seem to be dynamically generated for
this query -- so you can pick a filter value and it brings you to 0 hits,
violating the "no dead end" promise of Lucene's facets.
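
To make the "no dead end" point concrete, here is a small pure-Python sketch of drill-sideways style counting: the counts offered for a dimension are computed over the documents that pass every other active filter, so any value shown with a count leads to at least one hit. The documents, field names and function name are invented for illustration; this is not Lucene's DrillSideways API.

```python
from collections import Counter


def sideways_counts(docs, filters, dim):
    """Count values of `dim` over docs that pass every filter except `dim`'s own."""
    others = {d: v for d, v in filters.items() if d != dim}
    counts = Counter(
        doc[dim]
        for doc in docs
        if all(doc.get(d) == v for d, v in others.items())
    )
    return dict(counts)


docs = [
    {"status": "Open", "type": "PR"},
    {"status": "Open", "type": "issue"},
    {"status": "Closed", "type": "PR"},
]
filters = {"status": "Open"}
type_counts = sideways_counts(docs, filters, "type")      # choices offered under "type"
status_counts = sideways_counts(docs, filters, "status")  # sideways: ignores its own filter
```

Because "status" ignores its own filter, the user still sees how many Closed items they would get by switching, instead of a list that has collapsed to the one selected value.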

I was also disappointed with GitHub search's lack of hit highlighting, to
solve the "final inch" problem (show me specifically where, in this
massive massive list of comments on a PR/issue, my search terms appear),
and also not showing me the individual comment or code review comment
(multiple ones of those on a PR) where my search terms appear, lack of
linking directly to that comment, etc.  Githubsearch uses Lucene's block
joins to achieve this.
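
For anyone unfamiliar with block joins, the idea can be sketched without any Lucene code: child documents (here, comments) are indexed in one contiguous block immediately before their parent (the issue/PR), so a matching comment maps to the first parent doc id at or after it. The doc ids and function names below are invented, and githubsearch's real schema is not shown in this thread.

```python
def parent_of(child_doc_id, parent_doc_ids):
    """Return the first parent doc id at or after the child (the parent ends its block)."""
    return next(p for p in parent_doc_ids if p >= child_doc_id)


def join_to_parents(matching_children, parent_doc_ids):
    """Group matching child doc ids (comments) under their parent (issue/PR)."""
    grouped = {}
    for child in matching_children:
        grouped.setdefault(parent_of(child, parent_doc_ids), []).append(child)
    return grouped


# Doc ids 0-2 are comments on the issue stored at doc 3; 4-5 belong to doc 6.
parent_doc_ids = [3, 6]
hits = join_to_parents([1, 2, 5], parent_doc_ids)
```

Keeping the matching child ids per parent is what makes it possible to show and link the specific comment where the search terms appear.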

GitHub's search doesn't offer a blended relevance+recency sort, which I
think makes a great default.  It looks like it does support phrase search
(with double-quotes), curious how that works with ngrams.

I do like that the text query language includes all of the sort/filter
criteria -- the "is:open" and "sort:comments-desc".  Githubsearch doesn't
support that through the text query language, just the facets UI / REST
query URL.

Anyway, I don't want to complain (too much) about GitHub's search efforts.
Search is clearly hard, and we all (Lucene experts) have a fairly
biased/opinionated take on it all, heh.  I've never met a search engine
that I'm fully happy with ;)

> One thing that bothered me about GitHub's own search was that it would
> return different results if I wasn't signed in. Maybe it does early
> stopping for non-authenticated users? In any case, this won't be a
> problem with githubsearch.
>

Oh, that is very interesting -- I didn't know that.

Wow, I just tested -- indeed, you cannot even search the source code (for
Lucene's repo anyways) if you are not signed in.  That's weird.

For issues/PRs searching, the three queries I tried seem to produce the
same results signed in or out.  But it is scary/dangerous if this can
differ!!


> Have you considered indexing the Lucene source code too?
>

Oh my, I have not (until now lol).  That's a great idea.  Source code
tokenization would be such a fun problem ... I wonder if GitHub
open-sources how they tokenize the many different languages' source code.
GitHub's code search is in Rust (not using Lucene nor Rucene), a custom
search engine they recently built / switched to:
https://github.blog/2023-02-06-the-technology-behind-githubs-new-code-search,
away from Elasticsearch previously I think.  It looks like they use ngrams,
maybe instead of language-specific tokenization (?), to do the initial
matching/retrieval.  I would try normal lexical tokenization to see if
highlighting could work well.

I opened this luceneserver/GitHubSearch issue to think about this
... it'd sure be fun to build and use :)  Thank you for the suggestion
Stefan!

Mike McCandless

http://blog.mikemccandless.com



Re: Announcing githubsearch!

2024-02-20 Thread Michael McCandless
Thank you for all the warm feedback everyone, and all the exciting issues
already uncovered / ideas for improvements.  Now I have some more fun work
to do!

Mike McCandless

http://blog.mikemccandless.com


On Mon, Feb 19, 2024 at 12:58 PM Julie Tibshirani 
wrote:

> This is so cool! Thank you Mike for developing and hosting these services!
>
> Julie
>
> On Mon, Feb 19, 2024 at 9:40 AM Michael Wechner 
> wrote:
>
>> thank you very much!
>>
>> On 19.02.24 at 17:39, Michael McCandless wrote:
>>
>> Hi Team,
>>
>> ~1.5 years ago (August 2022) we migrated our Lucene issue tracking from
>> Jira to GitHub. Thank you Tomoko for all the hard work doing such a
>> complex, multi-phased, high-fidelity migration!
>>
>> I finally finished also migrating jirasearch to GitHub:
>> githubsearch.mikemccandless.com. It was tricky because GitHub issues/PRs
>> are fundamentally more complex than Jira's data model, and the GitHub REST
>> API is also quite rich / heavily normalized. All of the source code for
>> githubsearch lives here
>> <https://github.com/mikemccand/luceneserver/tree/master/examples/githubsearch>.
>> The UI remains its barebones self ;)
>>
>> Githubsearch
>> <https://github.com/mikemccand/luceneserver/tree/master/examples/githubsearch>
>> is dog food for us: it showcases Lucene (currently 9.8.0), and many of its
>> fun features like infix autosuggest, block join queries (each comment is a
>> sub-document on the issue/PR), DrillSideways faceting, near-real-time
>> indexing/searching, synonyms (try “oome
>> <https://githubsearch.mikemccandless.com/search.py?text=oome&dd=status%3AOpen>”),
>> expressions, non-relevance and blended-relevance sort, etc.  (This old
>> blog post
>> <https://blog.mikemccandless.com/2016/10/jiraseseach-20-dog-food-using-lucene-to.html>
>>  goes
>> into detail.)  Plus, it’s meta-fun to use Lucene to search its own issues,
>> to help us be more productive in improving Lucene!  Nicely recursive.
>>
>> In addition to good ol’ searching by text, githubsearch
>> <https://githubsearch.mikemccandless.com/> has some new/fun features:
>>
>>- Drill down to just PRs or issues
>>- Filter by “review requested” for a given user: poor Adrien has 8
>>(open) now
>>
>> <https://githubsearch.mikemccandless.com/search.py?sort=recentlyUpdated&dd=status%3AOpen&dd=requested_reviewers%3Ajpountz>
>>(sorry)! Or see your mentions (Robert is mentioned in 27 open
>>issues/PRs
>>
>> <https://githubsearch.mikemccandless.com/search.py?sort=recentlyUpdated&dd=status%3AOpen&dd=mentioned_users%3Armuir>).
>>Or PRs that you reviewed (Uwe has reviewed 9 still-open PRs
>>
>> <https://githubsearch.mikemccandless.com/search.py?sort=recentlyUpdated&dd=status%3AOpen&dd=reviewed_users%3Auschindler>).
>>Or issues and PRs where a user has had any involvement at all (Dawid
>>has interacted on 197 issues/PRs
>>
>> <https://githubsearch.mikemccandless.com/search.py?sort=recentlyUpdated&dd=reviewed_users%3Adweiss>
>>).
>>- Find still-open PRs that were created by a New Contributor
>>
>> <https://githubsearch.mikemccandless.com/search.py?chg=dds&text=&a1=author_association&a2=New+contributor&page=0&searcher=25792&sort=recentlyUpdated&format=list&id=cjhfx60attlt&dd=status%3AOpen&newText=>
>>(an author who has no changes merged into our repository) or
>>Contributor
>>
>> <https://githubsearch.mikemccandless.com/search.py?sort=recentlyUpdated&dd=status%3AOpen&dd=author_association%3AContributor>
>>(non-committer who has had some changes merged into our repository) or
>>Member
>>
>> <https://githubsearch.mikemccandless.com/search.py?sort=recentlyUpdated&dd=status%3AOpen&dd=author_association%3AMember>
>>- Here are the uber-stale (last touched more than a month ago) open
>>PRs by outside contributors
>>
>> <https://githubsearch.mikemccandless.com/search.py?sort=recentlyUpdated&dd=status%3AOpen&dd=author_association%3ANew+contributor%2CContributor%2CNone&dd=updated_ago%3A%3E+1+month+ago&dd=issue_or_pr%3APR>.
>>We should ideally keep this at 0, but it’s 83 now!
>>- “Link to this search” to get a short-er, more permanent URL (it is
>>NOT a URL shortener, though!)
>>- Save named searches you frequently run (they just save to local
>>cookie state on that one browser)
>>
>> I’m sure there are exciting bugs, feedback/patches welcome!  If you see
>> problems, please reply to this email or file an issue here
>> <https://github.com/mikemccand/luceneserver/issues>.
>>
>> Note that jirasearch <https://jirasearch.mikemccandless.com/search.py>
>> remains running, to search Solr, Tika and Infra issues.
>>
>> Happy Searching,
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>>
>>


Re: Announcing githubsearch!

2024-02-20 Thread Michael McCandless
On Mon, Feb 19, 2024 at 1:00 PM Walter Underwood 
wrote:

> It appears to always search prefixes, so there is no way to search for
> “wunder” without getting “wundermap” and “wunderground”. Putting the term
> in quotes doesn’t turn that off.
>

Hmm that shouldn't be the case?  It does split on camel case though (thank
you WordDelimiterFilter!).  E.g. try searching on infix and you should see
it highlighted inside terms like AnalyzingInfixSuggester.
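
To illustrate the camel-case splitting mentioned above, here is a minimal
Python sketch. It is not Lucene's WordDelimiterFilter, only an approximation
of its camel-case rule, and split_camel_case and matches_whole_part are
hypothetical names made up for this example.

```python
import re

# Assumption: only the camel-case rule is modelled here. The real filter has
# many more options (digits, delimiters, catenation) that this sketch ignores.
_CAMEL_PARTS = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+")


def split_camel_case(token: str) -> list[str]:
    """Split a token at lower-to-upper case boundaries and lowercase the parts."""
    return [part.lower() for part in _CAMEL_PARTS.findall(token)]


def matches_whole_part(query: str, token: str) -> bool:
    """True if the query equals one complete part of the token (no prefix matching)."""
    return query.lower() in split_camel_case(token)
```

Under these assumptions, infix matches one part of AnalyzingInfixSuggester,
while wunder does not match wundermap, because an all-lowercase token is
never split.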

In fact when I search for wunder I get a horrible exception, I think I know
why (it happens for any query that gets no hits!).  I opened this issue.
I'll try to fix that soon.

Walter, I'm not sure how you were able to even search on "wunder" -- did
you get actual results?  From githubsearch?

Mike McCandless

http://blog.mikemccandless.com


Re: Announcing githubsearch!

2024-02-20 Thread Michael McCandless
On Tue, Feb 20, 2024 at 6:01 AM Michael Sokolov  wrote:

> I love the gray all text UI. Don't change it! But I wonder if it's time for
> a favicon?
>

LOL favicon!  You do NOT want to have to confront my artistic skills!

Mike McCandless

http://blog.mikemccandless.com



Announcing githubsearch!

2024-02-19 Thread Michael McCandless
Hi Team,

~1.5 years ago (August 2022) we migrated our Lucene issue tracking from
Jira to GitHub. Thank you Tomoko for all the hard work doing such a
complex, multi-phased, high-fidelity migration!

I finally finished also migrating jirasearch to GitHub:
githubsearch.mikemccandless.com. It was tricky because GitHub issues/PRs
are fundamentally more complex than Jira's data model, and the GitHub REST
API is also quite rich / heavily normalized. All of the source code for
githubsearch lives here.
The UI remains its barebones self ;)

Githubsearch is dog food for us: it showcases Lucene (currently 9.8.0), and
many of its fun features like infix autosuggest, block join queries (each
comment is a sub-document on the issue/PR), DrillSideways faceting,
near-real-time indexing/searching, synonyms (try “oome”), expressions,
non-relevance and blended-relevance sort, etc.  (This old blog post goes
into detail.)  Plus, it’s meta-fun to use Lucene to search its own issues,
to help us be more productive in improving Lucene!  Nicely recursive.

In addition to good ol’ searching by text, githubsearch has some new/fun
features:

   - Drill down to just PRs or issues
   - Filter by “review requested” for a given user: poor Adrien has 8
   (open) now (sorry)! Or see your mentions (Robert is mentioned in 27 open
   issues/PRs). Or PRs that you reviewed (Uwe has reviewed 9 still-open
   PRs). Or issues and PRs where a user has had any involvement at all
   (Dawid has interacted on 197 issues/PRs).
   - Find still-open PRs that were created by a New Contributor (an author
   who has no changes merged into our repository) or Contributor
   (non-committer who has had some changes merged into our repository) or
   Member
   - Here are the uber-stale (last touched more than a month ago) open PRs
   by outside contributors. We should ideally keep this at 0, but it’s 83
   now!
   - “Link to this search” to get a short-er, more permanent URL (it is NOT
   a URL shortener, though!)
   - Save named searches you frequently run (they just save to local cookie
   state on that one browser)

I’m sure there are exciting bugs, feedback/patches welcome!  If you see
problems, please reply to this email or file an issue here.

Note that jirasearch remains running, to search Solr, Tika and Infra issues.

Happy Searching,

Mike McCandless

http://blog.mikemccandless.com


Re: [VOTE] Release Lucene 9.10.0 RC1

2024-02-19 Thread Michael McCandless
+1

SUCCESS! [0:19:57.370204]

Mike McCandless

http://blog.mikemccandless.com


On Mon, Feb 19, 2024 at 6:26 AM Chris Hegarty
 wrote:

>
> +1   SUCCESS! [1:14:49.683559]
>
> -Chris.
>
> > On 15 Feb 2024, at 21:08, Uwe Schindler  wrote:
> >
> > Hi,
> > I used Stefan Vodita's Hack to make the Smoketester run on a large list
> of JDKs: https://github.com/apache/lucene/pull/13108
> > See the console of running Java 11, Java 17, Java 19, Java 20, Java 21.
> Due to limitations of Gradle I wasn't able to do the smoker checks on Java
> 22 release candidate, but as there are no changes to 9.x branch I assume
> that everything also works in Java 22. If anybody else has time to run a
> test project with Java 22 using mmap and vectors it would be great!
> > Log file:
> https://jenkins.thetaphi.de/job/Lucene-Release-Tester-v2/3/console
> > Result was:
> > SUCCESS! [2:42:55.968473]
> >
> > Here is my +1 (binding).
> > Uwe
> >
> > Am 15.02.2024 um 12:50 schrieb Uwe Schindler:
> >> Hi,
> >> I ran the default smoke tester with Java 11 and Java 17 on Policeman
> Jenkins; all looks fine:
> https://jenkins.thetaphi.de/job/Lucene-Release-Tester/32/console
> >> SUCCESS! [1:04:45.740708]
> >> I only have one problem. Now that Java 21 LTS is out and more and more
> people use it, it would be good to also run the smoke tester with Java 21.
> I tried that locally by just passing the home dir of Java 21 instead of
> Java 17, but that failed due to some check in smoker.
> >> I will work this evening on patching Smoke tester to also allow it to
> pass Java 21. Maybe the best would be to pass multiple Java versions as a
> comma-separated list, just the default one must be Java 11 (the baseline).
> This would allow me to spin Policeman Jenkins with Java 11, Java 17, Java
> 19, Java 20, Java 21 and Java 22-rc1. Takes a while but would make sure all
> works in the officially MR-JAR supported releases + LTS.
> >> What do you think?
> >> I will give my +1 later when I checked the options and also looked into
> the downloaded artifacts.
> >> Uwe
> >> Am 14.02.2024 um 20:28 schrieb Adrien Grand:
> >>> Please vote for release candidate 1 for Lucene 9.10.0
> >>>
> >>> The artifacts can be downloaded from:
> >>>
> https://dist.apache.org/repos/dist/dev/lucene/lucene-9.10.0-RC1-rev-695c0ac84508438302cd346a812cfa2fdc5a10df
> >>>
> >>> You can run the smoke tester directly with this command:
> >>>
> >>> python3 -u dev-tools/scripts/smokeTestRelease.py \
> >>>
> https://dist.apache.org/repos/dist/dev/lucene/lucene-9.10.0-RC1-rev-695c0ac84508438302cd346a812cfa2fdc5a10df
> >>>
> >>> The vote will be open for at least 72 hours i.e. until 2024-02-17
> 20:00 UTC.
> >>>
> >>> [ ] +1  approve
> >>> [ ] +0  no opinion
> >>> [ ] -1  disapprove (and reason why)
> >>>
> >>> Here is my +1
> >>>
> >>> --
> >>> Adrien
> >> --
> >> Uwe Schindler
> >> Achterdiek 19, D-28357 Bremen
> >> https://www.thetaphi.de
> >> eMail: u...@thetaphi.de
> > --
> > Uwe Schindler
> > Achterdiek 19, D-28357 Bremen
> > https://www.thetaphi.de
> > eMail: u...@thetaphi.de
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>


Re: Lucene 9.10

2024-02-08 Thread Michael McCandless
+1 to release 9.10.  Thank you for volunteering Adrien!

Mike McCandless

http://blog.mikemccandless.com


On Wed, Feb 7, 2024 at 9:57 AM Adrien Grand  wrote:

> Hello all,
>
> It's been 2 months since we released 9.9 and we accumulated a good number
> of changes, so I'd like to propose that we release 9.10.0.
>
> If there are no objections, I volunteer to be the release manager and
> suggest cutting the branch next Monday (February 12th) and starting the
> release process on Wednesday, one week from now (February 14th).
>
> +Uwe Schindler  I remember that there are JDK22-related
> changes that you'd like to get into 9.10, feel free to let me know if this
> timeline doesn't work for you.
>
> --
> Adrien
>


Re: Needs help reviewing on Lucene PostingsFormat memory improvement

2024-02-07 Thread Michael McCandless
Hi Anh Dũng Bùi,

Thank you for tackling these and being so gently patient/persistent!  Sorry
for the delay.  I will try to review them soon.  The off-heap (streaming?)
building of FSTs is really a massive improvement to Lucene, inspired by
Tantivy's FST implementation: https://blog.burntsushi.net/transducers/

Read-time for Lucene90BlockTreePostingsFormat was already off-heap?  And
your PR changes write-time to do so as well?  This will reduce RAM pressure
during indexing which is great.  And some Lucene usages generate incredibly
large FSTs (I'm looking at you HathiTrust!). I don't think we need to
explicitly measure any performance impact before merging, but let's watch
the nightly benchy to see if there is any measurable impact.

And, yes, Lucene90BlockTreePostingsFormat is the default.  You find the
default codec from Codec.getDefault() and then trace downwards to all its
sources.

Maybe building the synonyms FST (SynonymMap.Builder) would be a good place
for off-heap writing too?

And this exciting PR  (still a
work in progress) would likely strongly benefit from streaming FST building,
since its FSTs will be much larger than the Lucene90BlockTree since it
stores all terms (not just the sampled prefix/index) in a single FST for
the segment.

Mike McCandless

http://blog.mikemccandless.com


On Thu, Feb 1, 2024 at 10:40 PM Anh Dũng Bùi  wrote:

> Hi Lucene devs!
>
> I have 2 PRs to optimize Lucene PostingsFormat
> (Lucene90BlockTreePostingsFormat and FSTPostingsFormat) by utilizing a new
> feature to stream the FST to IndexOutput directly, bypassing the on-heap
> writing:
> - https://github.com/apache/lucene/pull/12980
> - https://github.com/apache/lucene/pull/12985
>
> It would be great if someone can help reviewing. I also have some general
> questions:
> - How do I measure the memory improvement impact in Lucene?
> - Is Lucene90BlockTreePostingsFormat the main index format used in Lucene?
> If not, what is the main format?
> - Are there other places worth using the new streaming FST feature?
>
> Thank you!
> Anh Dung Bui
>


Re: [VOTE] Release Lucene 9.9.2 RC1

2024-01-25 Thread Michael McCandless
+1

SUCCESS! [0:18:29.298410]

Thank you Chris!

Mike McCandless

http://blog.mikemccandless.com


On Thu, Jan 25, 2024 at 6:57 AM Chris Hegarty
 wrote:

> Please vote for release candidate 1 for Lucene 9.9.2
>
> The artifacts can be downloaded from:
>
> https://dist.apache.org/repos/dist/dev/lucene/lucene-9.9.2-RC1-rev-a2939784c4ca60bc28bf488b5479c02fc2e5e22c
>
> You can run the smoke tester directly with this command:
>
> python3 -u dev-tools/scripts/smokeTestRelease.py \
>
> https://dist.apache.org/repos/dist/dev/lucene/lucene-9.9.2-RC1-rev-a2939784c4ca60bc28bf488b5479c02fc2e5e22c
>
> The vote will be open for 96 hours ( allowing some additional time for
> weekend span) i.e. until 2024-01-29 12:00 UTC.
>
> [ ] +1  approve
> [ ] +0  no opinion
> [ ] -1  disapprove (and reason why)
>
> Here is my +1
>
> Draft release notes can be found at
> https://cwiki.apache.org/confluence/display/LUCENE/ReleaseNote9_9_2
>
> -Chris.
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>


Welcome Stefan Vodita as Lucene committer

2024-01-18 Thread Michael McCandless
Hi Team,

I'm pleased to announce that Stefan Vodita has accepted the Lucene PMC's
invitation to become a committer!

Stefan, the tradition is that new committers introduce themselves with a
brief bio.

Congratulations, welcome, and thank you for all your improvements to Lucene
and our community,

Mike McCandless

http://blog.mikemccandless.com


Re: [JENKINS] Lucene-main-Linux (64bit/hotspot/jdk-21.0.1) - Build # 46049 - Unstable!

2024-01-04 Thread Michael McCandless
Hmm this is an interesting failure ... one of the hits scores is off by a
wee bit (one ULP?).
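
One way to check the "one ULP?" guess is to count how many representable
single-precision values separate the two scores. The sketch below assumes the
scores are IEEE 754 32-bit floats (as Java's float is); float32_ordinal and
ulp_distance are hypothetical helper names, not part of Lucene or its test
framework.

```python
import struct


def float32_ordinal(x: float) -> int:
    """Map x, rounded to single precision, to an integer that increases with x."""
    (bits,) = struct.unpack("<i", struct.pack("<f", x))
    # Negative floats carry the sign bit, so their raw bit patterns sort in
    # reverse; mirror them below zero so the mapping stays monotonic.
    return bits if bits >= 0 else -(bits & 0x7FFFFFFF)


def ulp_distance(a: float, b: float) -> int:
    """Number of representable float32 steps between a and b."""
    return abs(float32_ordinal(a) - float32_ordinal(b))


if __name__ == "__main__":
    # The two scores reported for doc 6 in the failure quoted below.
    print(ulp_distance(7.7155914, 7.7155924))
```

A small distance here would point at floating-point summation order rather
than a logic bug, which is only a guess until the seed is reproduced.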

Mike McCandless

http://blog.mikemccandless.com


On Tue, Jan 2, 2024 at 7:33 AM Policeman Jenkins Server 
wrote:

> Build: https://jenkins.thetaphi.de/job/Lucene-main-Linux/46049/
> Java: 64bit/hotspot/jdk-21.0.1 -XX:+UseCompressedOops -XX:+UseG1GC
>
> 1 tests failed.
> FAILED:
> org.apache.lucene.search.TestBooleanMinShouldMatch.testRandomQueries
>
> Error Message:
> java.lang.AssertionError: Doc 6 scores don't match
> TopDocs totalHits=3 hits top=3
> 0) doc=0 score=8.962799
> 1) doc=6 score=7.7155914
> 2) doc=4 score=7.1769814
> TopDocs totalHits=3 hits top=3
> 0) doc=0 score=8.962799
> 1) doc=6 score=7.7155924
> 2) doc=4 score=7.1769814
> for query:(data:X +data:3 data:3 data:3 data:6 -data:B data:4 data:3
> data:1)~4 expected:<7.7155924> but was:<7.7155914>
>
> Stack Trace:
> java.lang.AssertionError: Doc 6 scores don't match
> TopDocs totalHits=3 hits top=3
> 0) doc=0 score=8.962799
> 1) doc=6 score=7.7155914
> 2) doc=4 score=7.1769814
> TopDocs totalHits=3 hits top=3
> 0) doc=0 score=8.962799
> 1) doc=6 score=7.7155924
> 2) doc=4 score=7.1769814
> for query:(data:X +data:3 data:3 data:3 data:6 -data:B data:4 data:3
> data:1)~4 expected:<7.7155924> but was:<7.7155914>
> at
> __randomizedtesting.SeedInfo.seed([DB978BEEAF5DCEF5:85BC3B029787E36B]:0)
> at org.junit.Assert.fail(Assert.java:89)
> at org.junit.Assert.failNotEquals(Assert.java:835)
> at org.junit.Assert.assertEquals(Assert.java:577)
> at
> org.apache.lucene.search.TestBooleanMinShouldMatch.assertSubsetOfSameScores(TestBooleanMinShouldMatch.java:384)
> at
> org.apache.lucene.search.TestBooleanMinShouldMatch.testRandomQueries(TestBooleanMinShouldMatch.java:357)
> at
> java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
> at java.base/java.lang.reflect.Method.invoke(Method.java:580)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
> at
> org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
> at
> org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
> at
> org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
> at
> org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
> at
> org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> at
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
> at
> com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
> at
> com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
> at
> org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
> at
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
> at
> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
> at
> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
> at
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> org.apache.lucene.tests.util.TestRuleAssertion

Heads up: upcoming GitHub action to mark stale Lucene PRs

2024-01-04 Thread Michael McCandless
Hi Team,

Stefan Vodita made an awesome simple PR adding a GitHub action to remind /
nag us about stale PRs: https://github.com/apache/lucene/pull/12813

This happened after an in-person discussion at the last Community Over Code
NA in Halifax where Stefan learned about the nice automation Apache Beam
uses to nudge PRs forward.  This change is just a baby step to try to get
our stale PRs into a healthier state / workflow.

In the ultimate irony, that PR itself had become stale recently (2 weeks of
no activity) -- a "meta-stale PR"!

I would like to merge this PR soon, but:
* It will generate a bunch of one-time noise because we have ~163 open
PRs many of which are stale:
https://githubsearch.mikemccandless.com/search.py?dd=status%3AOpen&dd=issue_or_pr%3APR
* I know nothing about GitHub actions YAML format, but worst comes to
worst we push it, it fails in some exotic way, and we revert.

I assume lazy consensus soon ;)

Mike McCandless

http://blog.mikemccandless.com


Re: [JENKINS] Lucene-main-Linux (64bit/hotspot/jdk-19) - Build # 45856 - Unstable!

2023-12-20 Thread Michael McCandless
I'll try to get to the bottom of this Adrien, thanks for digging!

I wonder if we are violating this (subtle) requirement in the
Terms.intersect API:

 Note that the provided startTerm must be accepted by the
automaton.

The failures involving DirectPostingsFormat seem angry that we are indeed
violating this.
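
The requirement quoted above can be sketched in a few lines of Python. This
is not Lucene's implementation: intersect is a hypothetical stand-in for
Terms.intersect, a plain predicate stands in for the compiled automaton, and
raising ValueError is an assumption about how a strict implementation might
react to a rejected start term.

```python
from typing import Callable, Iterable, Iterator, Optional


def intersect(
    sorted_terms: Iterable[str],
    accepts: Callable[[str], bool],
    start_term: Optional[str] = None,
) -> Iterator[str]:
    """Return accepted terms that sort strictly after start_term (if given)."""
    # The contract under discussion: a start term that the automaton itself
    # rejects is a caller error, so check it eagerly before iterating.
    if start_term is not None and not accepts(start_term):
        raise ValueError(f"start term {start_term!r} is not accepted")
    return (
        term
        for term in sorted_terms
        if accepts(term) and (start_term is None or term > start_term)
    )


if __name__ == "__main__":
    terms = ["aa", "ab", "abc", "b"]
    is_ab_prefixed = lambda term: term.startswith("ab")
    print(list(intersect(terms, is_ab_prefixed, "ab")))
```

A check like the new one in CheckIndex would then have to choose its start
term from strings the automaton accepts, otherwise an assertion in the codec
is the expected outcome rather than a codec bug.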

Mike McCandless

http://blog.mikemccandless.com


On Wed, Dec 20, 2023 at 6:09 AM Adrien Grand  wrote:

> I don't fully understand it yet. I opened an issue:
> https://github.com/apache/lucene/issues/12957.
>
> On Tue, Dec 19, 2023 at 6:02 PM Adrien Grand  wrote:
>
>> This looks like a real bug with the default codec when the prefix
>> compares greater than every indexed term. I'll look into it tomorrow if
>> nobody beats me to it.
>>
>> On Tue, Dec 19, 2023 at 12:35 PM Policeman Jenkins Server <
>> jenk...@thetaphi.de> wrote:
>>
>>> Build: https://jenkins.thetaphi.de/job/Lucene-main-Linux/45856/
>>> Java: 64bit/hotspot/jdk-19 -XX:+UseCompressedOops -XX:+UseSerialGC
>>>
>>> 1 tests failed.
>>> FAILED:  org.apache.lucene.index.TestTerms.testTermMinMaxRandom
>>>
>>> Error Message:
>>> java.lang.AssertionError
>>>
>>> Stack Trace:
>>> java.lang.AssertionError
>>> at
>>> __randomizedtesting.SeedInfo.seed([CBF65306049672F4:8785DC72680AA991]:0)
>>> at
>>> org.apache.lucene.codecs.lucene90.blocktree.IntersectTermsEnum.getState(IntersectTermsEnum.java:245)
>>> at
>>> org.apache.lucene.codecs.lucene90.blocktree.IntersectTermsEnum.seekToStartTerm(IntersectTermsEnum.java:288)
>>> at
>>> org.apache.lucene.codecs.lucene90.blocktree.IntersectTermsEnum.<init>(IntersectTermsEnum.java:126)
>>> at
>>> org.apache.lucene.codecs.lucene90.blocktree.FieldReader.intersect(FieldReader.java:223)
>>> at
>>> org.apache.lucene.index.CheckIndex.checkTermsIntersect(CheckIndex.java:2374)
>>> at
>>> org.apache.lucene.index.CheckIndex.checkFields(CheckIndex.java:2327)
>>> at
>>> org.apache.lucene.index.CheckIndex.testPostings(CheckIndex.java:2529)
>>> at
>>> org.apache.lucene.index.CheckIndex.testSegment(CheckIndex.java:1067)
>>> at
>>> org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:783)
>>> at
>>> org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:550)
>>> at
>>> org.apache.lucene.tests.util.TestUtil.checkIndex(TestUtil.java:340)
>>> at
>>> org.apache.lucene.tests.store.MockDirectoryWrapper.close(MockDirectoryWrapper.java:909)
>>> at
>>> org.apache.lucene.index.TestTerms.testTermMinMaxRandom(TestTerms.java:85)
>>> at
>>> java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
>>> at java.base/java.lang.reflect.Method.invoke(Method.java:578)
>>> at
>>> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
>>> at
>>> com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
>>> at
>>> com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
>>> at
>>> com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
>>> at
>>> org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
>>> at
>>> org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
>>> at
>>> org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
>>> at
>>> org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
>>> at
>>> org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
>>> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>>> at
>>> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>>> at
>>> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
>>> at
>>> com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
>>> at
>>> com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
>>> at
>>> com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
>>> at
>>> com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
>>> at
>>> com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
>>> at
>>> com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
>>> at
>>> org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
>>> at
>>> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>>>   

Re: [JENKINS] Lucene » Lucene-Check-9.x - Build # 7187 - Still Unstable!

2023-12-20 Thread Michael McCandless
I'll try to bottom this one out.

Mike McCandless

http://blog.mikemccandless.com


On Wed, Dec 20, 2023 at 6:25 AM Apache Jenkins Server <
jenk...@builds.apache.org> wrote:

> Build: https://ci-builds.apache.org/job/Lucene/job/Lucene-Check-9.x/7187/
>
> 1 tests failed.
> FAILED:  org.apache.lucene.index.TestTerms.testTermMinMaxRandom
>
> Error Message:
> java.lang.AssertionError
>
> Stack Trace:
> java.lang.AssertionError
> at
> __randomizedtesting.SeedInfo.seed([C8D1EBB5035DA9F:40FE91CF3CA901FA]:0)
> at
> org.apache.lucene.codecs.memory.DirectPostingsFormat$DirectField$DirectIntersectTermsEnum.<init>(DirectPostingsFormat.java:1055)
> at
> org.apache.lucene.codecs.memory.DirectPostingsFormat$DirectField.intersect(DirectPostingsFormat.java:655)
> at
> org.apache.lucene.index.CheckIndex.checkTermsIntersect(CheckIndex.java:2391)
> at
> org.apache.lucene.index.CheckIndex.checkFields(CheckIndex.java:2344)
> at
> org.apache.lucene.index.CheckIndex.testPostings(CheckIndex.java:2550)
> at
> org.apache.lucene.index.CheckIndex.testSegment(CheckIndex.java:1083)
> at
> org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:797)
> at
> org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:564)
> at
> org.apache.lucene.tests.util.TestUtil.checkIndex(TestUtil.java:345)
> at
> org.apache.lucene.tests.store.MockDirectoryWrapper.close(MockDirectoryWrapper.java:909)
> at
> org.apache.lucene.index.TestTerms.testTermMinMaxRandom(TestTerms.java:85)
> at
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
> Method)
> at
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.base/java.lang.reflect.Method.invoke(Method.java:566)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
> at
> org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
> at
> org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
> at
> org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
> at
> org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
> at
> org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> at
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
> at
> com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
> at
> com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
> at
> org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
> at
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
> at
> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
> at
> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
> at
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
> at
> org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
> at
> org.apache.lucene.tests

Re: [JENKINS] Lucene-main-Linux (64bit/hotspot/jdk-17.0.9) - Build # 45858 - Still Unstable!

2023-12-19 Thread Michael McCandless
Oh this is the new (awesome) check Adrien recently added to CheckIndex, so
maybe this check is catching some pre-existing bugs in one of our
(hopefully experimental, not default!) codecs?

Mike McCandless

http://blog.mikemccandless.com


On Tue, Dec 19, 2023 at 3:43 PM Michael McCandless <
luc...@mikemccandless.com> wrote:

> Hmm anyone know why this test suddenly started failing...?
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Tue, Dec 19, 2023 at 3:35 PM Policeman Jenkins Server <
> jenk...@thetaphi.de> wrote:
>
>> Build: https://jenkins.thetaphi.de/job/Lucene-main-Linux/45858/
>> Java: 64bit/hotspot/jdk-17.0.9 -XX:-UseCompressedOops -XX:+UseG1GC
>>
>> 1 tests failed.
>> FAILED:  org.apache.lucene.index.TestTerms.testTermMinMaxRandom
>>
>> Error Message:
>> java.lang.AssertionError
>>
>> Stack Trace:
>> java.lang.AssertionError
>> at
>> __randomizedtesting.SeedInfo.seed([3D3B000BC0883C14:71488F7FAC14E771]:0)
>> at
>> org.apache.lucene.codecs.lucene90.blocktree.IntersectTermsEnum.getState(IntersectTermsEnum.java:245)
>> at
>> org.apache.lucene.codecs.lucene90.blocktree.IntersectTermsEnum.seekToStartTerm(IntersectTermsEnum.java:288)
>> at
>> org.apache.lucene.codecs.lucene90.blocktree.IntersectTermsEnum.<init>(IntersectTermsEnum.java:126)
>> at
>> org.apache.lucene.codecs.lucene90.blocktree.FieldReader.intersect(FieldReader.java:223)
>> at
>> org.apache.lucene.tests.index.AssertingLeafReader$AssertingTerms.intersect(AssertingLeafReader.java:191)
>> at
>> org.apache.lucene.index.CheckIndex.checkTermsIntersect(CheckIndex.java:2374)
>> at
>> org.apache.lucene.index.CheckIndex.checkFields(CheckIndex.java:2327)
>> at
>> org.apache.lucene.index.CheckIndex.testPostings(CheckIndex.java:2529)
>> at
>> org.apache.lucene.index.CheckIndex.testSegment(CheckIndex.java:1067)
>> at
>> org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:783)
>> at
>> org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:550)
>> at
>> org.apache.lucene.tests.util.TestUtil.checkIndex(TestUtil.java:340)
>> at
>> org.apache.lucene.tests.store.MockDirectoryWrapper.close(MockDirectoryWrapper.java:909)
>> at
>> org.apache.lucene.index.TestTerms.testTermMinMaxRandom(TestTerms.java:85)
>> at
>> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
>> Method)
>> at
>> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>> at
>> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> at java.base/java.lang.reflect.Method.invoke(Method.java:568)
>> at
>> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
>> at
>> com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
>> at
>> com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
>> at
>> com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
>> at
>> org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
>> at
>> org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
>> at
>> org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
>> at
>> org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
>> at
>> org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
>> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>> at
>> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at
>> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
>> at
>> com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
>> at
>> com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
>> at
>> com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
>> at
>> com.carrotsearch.ra

Re: [JENKINS] Lucene-main-Linux (64bit/hotspot/jdk-17.0.9) - Build # 45858 - Still Unstable!

2023-12-19 Thread Michael McCandless
Hmm anyone know why this test suddenly started failing...?

Mike McCandless

http://blog.mikemccandless.com


On Tue, Dec 19, 2023 at 3:35 PM Policeman Jenkins Server <
jenk...@thetaphi.de> wrote:

> Build: https://jenkins.thetaphi.de/job/Lucene-main-Linux/45858/
> Java: 64bit/hotspot/jdk-17.0.9 -XX:-UseCompressedOops -XX:+UseG1GC
>
> 1 tests failed.
> FAILED:  org.apache.lucene.index.TestTerms.testTermMinMaxRandom
>
> Error Message:
> java.lang.AssertionError
>
> Stack Trace:
> java.lang.AssertionError
> at
> __randomizedtesting.SeedInfo.seed([3D3B000BC0883C14:71488F7FAC14E771]:0)
> at
> org.apache.lucene.codecs.lucene90.blocktree.IntersectTermsEnum.getState(IntersectTermsEnum.java:245)
> at
> org.apache.lucene.codecs.lucene90.blocktree.IntersectTermsEnum.seekToStartTerm(IntersectTermsEnum.java:288)
> at
> org.apache.lucene.codecs.lucene90.blocktree.IntersectTermsEnum.<init>(IntersectTermsEnum.java:126)
> at
> org.apache.lucene.codecs.lucene90.blocktree.FieldReader.intersect(FieldReader.java:223)
> at
> org.apache.lucene.tests.index.AssertingLeafReader$AssertingTerms.intersect(AssertingLeafReader.java:191)
> at
> org.apache.lucene.index.CheckIndex.checkTermsIntersect(CheckIndex.java:2374)
> at
> org.apache.lucene.index.CheckIndex.checkFields(CheckIndex.java:2327)
> at
> org.apache.lucene.index.CheckIndex.testPostings(CheckIndex.java:2529)
> at
> org.apache.lucene.index.CheckIndex.testSegment(CheckIndex.java:1067)
> at
> org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:783)
> at
> org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:550)
> at
> org.apache.lucene.tests.util.TestUtil.checkIndex(TestUtil.java:340)
> at
> org.apache.lucene.tests.store.MockDirectoryWrapper.close(MockDirectoryWrapper.java:909)
> at
> org.apache.lucene.index.TestTerms.testTermMinMaxRandom(TestTerms.java:85)
> at
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
> Method)
> at
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
> at
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.base/java.lang.reflect.Method.invoke(Method.java:568)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
> at
> org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
> at
> org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
> at
> org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
> at
> org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
> at
> org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> at
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
> at
> com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
> at
> com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
> at
> org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
> at
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
> at
> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
> at
> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
> at
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(Statemen

Re: [VOTE] Release Lucene 9.9.1 RC1

2023-12-14 Thread Michael McCandless
On Thu, Dec 14, 2023 at 11:33 AM Adrien Grand wrote:

> Thanks Chris for taking care of this release.
>

+1!

And sorry about the respin...

Mike McCandless

http://blog.mikemccandless.com


Re: [JENKINS] Lucene » Lucene-NightlyTests-main - Build # 1209 - Unstable!

2023-12-14 Thread Michael McCandless
Ha, no worries, it was a good suggestion / idea!

Mike McCandless

http://blog.mikemccandless.com


On Mon, Dec 11, 2023 at 12:47 PM Adrien Grand wrote:

> Woops, sorry for suggesting this change in the first place! I didn't know
> we had this validation for points, but not for postings.
>
> On Fri, Dec 8, 2023 at 2:16 PM Michael McCandless <
> luc...@mikemccandless.com> wrote:
>
>> OK I reverted the "optimization" to not pull FieldInfo for a field when
>> getting Points values from SlowCompositeCodecReaderWrapper!  Clearly it was
>> not safe ;)
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>>
>> On Fri, Dec 8, 2023 at 8:06 AM Michael McCandless <
>> luc...@mikemccandless.com> wrote:
>>
>>> Uh oh -- I'll dig.  We may need to put back the FieldInfo check before
>>> pulling points.  Tricky!
>>>
>>> Mike McCandless
>>>
>>> http://blog.mikemccandless.com
>>>
>>>
>>> On Fri, Dec 8, 2023 at 3:55 AM Apache Jenkins Server <
>>> jenk...@builds.apache.org> wrote:
>>>
>>>> Build:
>>>> https://ci-builds.apache.org/job/Lucene/job/Lucene-NightlyTests-main/1209/
>>>>
>>>> 3 tests failed.
>>>> FAILED:  org.apache.lucene.index.TestPointValues.testSparsePoints
>>>>
>>>> Error Message:
>>>> java.lang.IllegalStateException: this writer hit an unrecoverable
>>>> error; cannot merge
>>>>
>>>> Stack Trace:
>>>> java.lang.IllegalStateException: this writer hit an unrecoverable
>>>> error; cannot merge
>>>> at
>>>> __randomizedtesting.SeedInfo.seed([ADA30A2081CE6DA4:A05414293C35A568]:0)
>>>> at
>>>> org.apache.lucene.index.IndexWriter.hasPendingMerges(IndexWriter.java:2425)
>>>> at
>>>> org.apache.lucene.index.IndexWriter$IndexWriterMergeSource.hasPendingMerges(IndexWriter.java:6527)
>>>> at
>>>> org.apache.lucene.index.ConcurrentMergeScheduler.maybeStall(ConcurrentMergeScheduler.java:589)
>>>> at
>>>> org.apache.lucene.index.ConcurrentMergeScheduler.merge(ConcurrentMergeScheduler.java:540)
>>>> at
>>>> org.apache.lucene.index.IndexWriter.executeMerge(IndexWriter.java:2315)
>>>> at
>>>> org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:2310)
>>>> at
>>>> org.apache.lucene.index.IndexWriter.processEvents(IndexWriter.java:5985)
>>>> at
>>>> org.apache.lucene.index.IndexWriter.flushNextBuffer(IndexWriter.java:3606)
>>>> at
>>>> org.apache.lucene.tests.index.RandomIndexWriter.flushAllBuffersSequentially(RandomIndexWriter.java:263)
>>>> at
>>>> org.apache.lucene.tests.index.RandomIndexWriter.maybeFlushOrCommit(RandomIndexWriter.java:235)
>>>> at
>>>> org.apache.lucene.tests.index.RandomIndexWriter.addDocument(RandomIndexWriter.java:226)
>>>> at
>>>> org.apache.lucene.index.TestPointValues.testSparsePoints(TestPointValues.java:697)
>>>> at
>>>> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
>>>> Method)
>>>> at
>>>> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>>>> at
>>>> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>> at java.base/java.lang.reflect.Method.invoke(Method.java:568)
>>>> at
>>>> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
>>>> at
>>>> com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
>>>> at
>>>> com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
>>>> at
>>>> com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
>>>> at
>>>> org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
>>>> at
>>>> org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
>>>> at
>>>> org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
>>>>  

Re: [JENKINS] Lucene-MMAPv2-Linux (64bit/hotspot/jdk-17.0.9) - Build # 1537 - Failure!

2023-12-14 Thread Michael McCandless
Timeout.

Mike McCandless

http://blog.mikemccandless.com


On Thu, Dec 14, 2023 at 9:09 AM Policeman Jenkins Server <
jenk...@thetaphi.de> wrote:

> Build: https://jenkins.thetaphi.de/job/Lucene-MMAPv2-Linux/1537/
> Java: 64bit/hotspot/jdk-17.0.9 -XX:-UseCompressedOops -XX:+UseShenandoahGC
>
> All tests passed


Re: Running tests with streaming console output but NOT verbose?

2023-12-14 Thread Michael McCandless
 do here? For the record, all the test
> options and their defaults are displayed with:
>
> gradlew -p lucene\core testOpts
>
> Dawid
>
> On Tue, Dec 12, 2023 at 8:45 PM Michael McCandless <
> luc...@mikemccandless.com> wrote:
>
>> Ahh thanks for the quick explanation and temporary solution Dawid!
>> Naming is the hardest part :)
>>
>> I think long ago we used to call it "-Dtests.stdout=true" or so?  Not the
>> greatest name tho.  Maybe "tests.liveConsoleOut"?  "tests.liveConsole"?
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>>
>> On Tue, Dec 12, 2023 at 2:31 PM Dawid Weiss wrote:
>>
>>>
>>> This is actually an accidental (?) clash between Lucene's system
>>> property and what's in defaults-tests.gradle.
>>> You can manually prepend true || ... to the following in
>>> defaults-tests.gradle.
>>>
>>> def verboseMode = resolvedTestOption("tests.verbose").toBoolean()
>>>
>>> I can't remember why it aligns with Lucene's logger. Maybe it
>>> should/could be a
>>> separate property? I find it difficult to come up with a reasonable name
>>> though.
>>>
>>> D.
>>>
>>> On Tue, Dec 12, 2023 at 8:03 PM Michael McCandless <
>>> luc...@mikemccandless.com> wrote:
>>>
>>>> Hi Team,
>>>>
>>>> This is prolly a Dawid question...
>>>>
>>>> Sometimes I want to run a test (like a slow Monster test), seeing its
>>>> ongoing musings popping out on the console in "real time" (not buffered).
>>>>
>>>> I can do this today by adding "-Dtests.verbose=true" to the ./gradlew
>>>> invocation that's running the test.
>>>>
>>>> But that also turns on LuceneTestCase.VERBOSE which sometimes produces
>>>> insane amounts of mostly not helpful content.
>>>>
>>>> Is there any way to do the first (stream console output) without the
>>>> second (mega verbosity enabled)?
>>>>
>>>> Thanks,
>>>>
>>>> Mike McCandless
>>>>
>>>> http://blog.mikemccandless.com
>>>>
>>>


Re: [VOTE] Release Lucene 9.9.1 RC1

2023-12-14 Thread Michael McCandless
+1

SUCCESS! [0:14:52.296147]


I also cracked a bit of rust off our Monster tests and all but one passed:
https://github.com/apache/lucene/pull/12942

Mike McCandless

http://blog.mikemccandless.com


On Wed, Dec 13, 2023 at 4:24 PM Benjamin Trent wrote:

> SUCCESS! [1:06:02.232333]
>
> + 1!
>
> On Wed, Dec 13, 2023 at 3:26 PM Greg Miller wrote:
>
>> SUCCESS! [2:27:01.875939]
>>
>> +1
>>
>> Thanks!
>> -Greg
>>
>> On Wed, Dec 13, 2023 at 3:58 AM Chris Hegarty wrote:
>>
>>> And (short) release note:
>>>
>>>   https://cwiki.apache.org/confluence/display/LUCENE/ReleaseNote9_9_1
>>>
>>> -Chris.
>>>
>>> > On 13 Dec 2023, at 11:55, Chris Hegarty <
>>> christopher.hega...@elastic.co> wrote:
>>> >
>>> > Hi,
>>> >
>>> > Please vote for release candidate 1 for Lucene 9.9.1
>>> >
>>> > The artifacts can be downloaded from:
>>> >
>>> https://dist.apache.org/repos/dist/dev/lucene/lucene-9.9.1-RC1-rev-eee32cbf5e072a8c9d459c349549094230038308
>>> >
>>> > You can run the smoke tester directly with this command:
>>> >
>>> > python3 -u dev-tools/scripts/smokeTestRelease.py \
>>> >
>>> https://dist.apache.org/repos/dist/dev/lucene/lucene-9.9.1-RC1-rev-eee32cbf5e072a8c9d459c349549094230038308
>>> >
>>> > The vote will be open for at least 72 hours i.e. until 2023-12-16
>>> 12:00 UTC.
>>> >
>>> > [ ] +1  approve
>>> > [ ] +0  no opinion
>>> > [ ] -1  disapprove (and reason why)
>>> >
>>> > Here is my +1
>>> >
>>> > -Chris.
>>> >


Re: [JENKINS] Lucene-9.x-MacOSX (64bit/hotspot/jdk-17.0.9) - Build # 3226 - Failure!

2023-12-14 Thread Michael McCandless
Build timed out.

Mike McCandless

http://blog.mikemccandless.com


On Wed, Dec 13, 2023 at 7:50 PM Policeman Jenkins Server <
jenk...@thetaphi.de> wrote:

> Build: https://jenkins.thetaphi.de/job/Lucene-9.x-MacOSX/3226/
> Java: 64bit/hotspot/jdk-17.0.9 -XX:+UseCompressedOops -XX:+UseG1GC
>
> All tests passed


Re: Running tests with streaming console output but NOT verbose?

2023-12-12 Thread Michael McCandless
Ahh thanks for the quick explanation and temporary solution Dawid!  Naming
is the hardest part :)

I think long ago we used to call it "-Dtests.stdout=true" or so?  Not the
greatest name tho.  Maybe "tests.liveConsoleOut"?  "tests.liveConsole"?

Mike McCandless

http://blog.mikemccandless.com
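
To make the naming question concrete, here is a minimal Java sketch of two independent switches, one for chatty diagnostics and one for live console streaming. Only "tests.verbose" comes from this thread as an existing property; "tests.liveConsole" is just one of the names floated above, and the class ConsoleFlags and its methods are hypothetical. This is not how defaults-tests.gradle or LuceneTestCase is actually wired, only an illustration of keeping the two decisions separate.

```java
import java.util.Properties;

public class ConsoleFlags {
    // Existing switch discussed in the thread: turns on verbose test diagnostics.
    static boolean verbose(Properties props) {
        return Boolean.parseBoolean(props.getProperty("tests.verbose", "false"));
    }

    // Hypothetical switch (name is an assumption): stream test output as it is produced.
    // Verbose mode implies live output, but live output does not imply verbose mode.
    static boolean liveConsole(Properties props) {
        return verbose(props)
            || Boolean.parseBoolean(props.getProperty("tests.liveConsole", "false"));
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("tests.liveConsole", "true");
        System.out.println("verbose=" + verbose(props) + ", liveConsole=" + liveConsole(props));
    }
}
```

With a split like this, a slow Monster test could stream its progress without also enabling the verbose diagnostics.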


On Tue, Dec 12, 2023 at 2:31 PM Dawid Weiss wrote:

>
> This is actually an accidental (?) clash between Lucene's system property
> and what's in defaults-tests.gradle.
> You can manually prepend true || ... to the following in
> defaults-tests.gradle.
>
> def verboseMode = resolvedTestOption("tests.verbose").toBoolean()
>
> I can't remember why it aligns with Lucene's logger. Maybe it should/could
> be a
> separate property? I find it difficult to come up with a reasonable name
> though.
>
> D.
>
> On Tue, Dec 12, 2023 at 8:03 PM Michael McCandless <
> luc...@mikemccandless.com> wrote:
>
>> Hi Team,
>>
>> This is prolly a Dawid question...
>>
>> Sometimes I want to run a test (like a slow Monster test), seeing its
>> ongoing musings popping out on the console in "real time" (not buffered).
>>
>> I can do this today by adding "-Dtests.verbose=true" to the ./gradlew
>> invocation that's running the test.
>>
>> But that also turns on LuceneTestCase.VERBOSE which sometimes produces
>> insane amounts of mostly not helpful content.
>>
>> Is there any way to do the first (stream console output) without the
>> second (mega verbosity enabled)?
>>
>> Thanks,
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>


Running tests with streaming console output but NOT verbose?

2023-12-12 Thread Michael McCandless
Hi Team,

This is prolly a Dawid question...

Sometimes I want to run a test (like a slow Monster test), seeing its
ongoing musings popping out on the console in "real time" (not buffered).

I can do this today by adding "-Dtests.verbose=true" to the ./gradlew
invocation that's running the test.

But that also turns on LuceneTestCase.VERBOSE which sometimes produces
insane amounts of mostly not helpful content.

Is there any way to do the first (stream console output) without the second
(mega verbosity enabled)?

Thanks,

Mike McCandless

http://blog.mikemccandless.com


Re: The need for a Lucene 9.9.1 release

2023-12-12 Thread Michael McCandless
OK this is merged now.  Are there any other 9.9.1 blockers?  I am trying to
pass all Monster tests but that can probably just run concurrently with
voting (optimistic concurrency!)?

Mike McCandless

http://blog.mikemccandless.com


On Tue, Dec 12, 2023 at 9:18 AM Chris Hegarty wrote:

> Hi Mike,
>
> > On 12 Dec 2023, at 12:56, Michael McCandless wrote:
> >
> > Hi Chris,
> >
> > I think we should also regenerate the FSTs for 9.9.1?
>
> Seems reasonable.
>
> > https://github.com/apache/lucene/pull/12924
>
> I added my comments and review on the PR.
>
> -Chris.
>
> > Thanks,
> >
> > Mike


Re: The need for a Lucene 9.9.1 release

2023-12-12 Thread Michael McCandless
Hi Chris,

I think we should also regenerate the FSTs for 9.9.1?

https://github.com/apache/lucene/pull/12924

Thanks,

Mike

On Tue, Dec 12, 2023 at 7:54 AM Guo Feng wrote:

> Heads up:
>
> The bug fix PR (https://github.com/apache/lucene/pull/12900) has been
> merged to main, and backported to lucene_9x & lucene_9_9.
>
> On 2023/12/11 20:27:48 Chris Hegarty wrote:
> > Just a quick update on this...
> >
> > > On 9 Dec 2023, at 09:09, Chris Hegarty wrote:
> > >
> > > Hi,
> > >
> > > We’ve encountered two very serious issues with the recent Lucene 9.9.0
> release, both of which (even if taken by themselves) would warrant a 9.9.1.
> The issues are:
> > >
> > > 1. https://github.com/apache/lucene/issues/12895 - Corruption read on
> term dictionaries in Lucene 9.9
> >
> > Great work has been done re-adding tests, creating a new test to
> reproduce, and also working on an underlying fix. It feels like we’re
> getting close! :-)
> >
> > > 2. https://github.com/apache/lucene/issues/12898 - JVM SIGSEGV crash
> when compiling computeCommonPrefixLengthAndBuildHistogram Lucene 9.9.0
> >
> > Merged to branch_9_9.
> >
> > Once no.1 is merged, I’ll build a 9.9.1 RC1 and start a vote.
> >
> > -Chris


Re: [JENKINS] Lucene » Lucene-NightlyTests-main - Build # 1209 - Unstable!

2023-12-08 Thread Michael McCandless
OK I reverted the "optimization" to not pull FieldInfo for a field when
getting Points values from SlowCompositeCodecReaderWrapper!  Clearly it was
not safe ;)

Mike McCandless

http://blog.mikemccandless.com
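
For anyone following along, the idea of "check FieldInfo before pulling points" can be sketched in Java as below. This uses stand-in types only: GuardedPoints, StrictPointsReader, and the map of point dimension counts are hypothetical and are not Lucene's classes, and the exception type and message are assumptions. The sketch just shows a reader that validates the field and a caller that consults the field metadata first so the strict reader never throws for fields without points.

```java
import java.util.HashMap;
import java.util.Map;

public class GuardedPoints {
    // Stand-in for a per-segment points reader that validates the requested field.
    static class StrictPointsReader {
        private final Map<String, Integer> pointDimsByField;

        StrictPointsReader(Map<String, Integer> pointDimsByField) {
            this.pointDimsByField = pointDimsByField;
        }

        String getValues(String field) {
            Integer dims = pointDimsByField.get(field);
            if (dims == null || dims == 0) {
                // Assumed behaviour: asking for points on a field without points is an error.
                throw new IllegalArgumentException("field \"" + field + "\" has no point values");
            }
            return "points(" + field + ")";
        }
    }

    // Consult the field metadata first; return null for fields without points.
    static String pointsOrNull(
            Map<String, Integer> pointDimsByField, StrictPointsReader reader, String field) {
        Integer dims = pointDimsByField.get(field);
        if (dims == null || dims == 0) {
            return null;
        }
        return reader.getValues(field);
    }

    public static void main(String[] args) {
        Map<String, Integer> dims = new HashMap<>();
        dims.put("location", 2); // indexed with 2-dimensional points
        dims.put("title", 0);    // no points on this field
        StrictPointsReader reader = new StrictPointsReader(dims);
        System.out.println(pointsOrNull(dims, reader, "location"));
        System.out.println(pointsOrNull(dims, reader, "title"));
    }
}
```

Skipping the metadata lookup saves a little work per field, but it leaves the strict reader's validation as the only safeguard.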


On Fri, Dec 8, 2023 at 8:06 AM Michael McCandless wrote:

> Uh oh -- I'll dig.  We may need to put back the FieldInfo check before
> pulling points.  Tricky!
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Fri, Dec 8, 2023 at 3:55 AM Apache Jenkins Server <
> jenk...@builds.apache.org> wrote:
>
>> Build:
>> https://ci-builds.apache.org/job/Lucene/job/Lucene-NightlyTests-main/1209/
>>
>> 3 tests failed.
>> FAILED:  org.apache.lucene.index.TestPointValues.testSparsePoints
>>
>> Error Message:
>> java.lang.IllegalStateException: this writer hit an unrecoverable error;
>> cannot merge
>>
>> Stack Trace:
>> java.lang.IllegalStateException: this writer hit an unrecoverable error;
>> cannot merge
>> at
>> __randomizedtesting.SeedInfo.seed([ADA30A2081CE6DA4:A05414293C35A568]:0)
>> at
>> org.apache.lucene.index.IndexWriter.hasPendingMerges(IndexWriter.java:2425)
>> at
>> org.apache.lucene.index.IndexWriter$IndexWriterMergeSource.hasPendingMerges(IndexWriter.java:6527)
>> at
>> org.apache.lucene.index.ConcurrentMergeScheduler.maybeStall(ConcurrentMergeScheduler.java:589)
>> at
>> org.apache.lucene.index.ConcurrentMergeScheduler.merge(ConcurrentMergeScheduler.java:540)
>> at
>> org.apache.lucene.index.IndexWriter.executeMerge(IndexWriter.java:2315)
>> at
>> org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:2310)
>> at
>> org.apache.lucene.index.IndexWriter.processEvents(IndexWriter.java:5985)
>> at
>> org.apache.lucene.index.IndexWriter.flushNextBuffer(IndexWriter.java:3606)
>> at
>> org.apache.lucene.tests.index.RandomIndexWriter.flushAllBuffersSequentially(RandomIndexWriter.java:263)
>> at
>> org.apache.lucene.tests.index.RandomIndexWriter.maybeFlushOrCommit(RandomIndexWriter.java:235)
>> at
>> org.apache.lucene.tests.index.RandomIndexWriter.addDocument(RandomIndexWriter.java:226)
>> at
>> org.apache.lucene.index.TestPointValues.testSparsePoints(TestPointValues.java:697)
>> at
>> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
>> Method)
>> at
>> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>> at
>> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> at java.base/java.lang.reflect.Method.invoke(Method.java:568)
>> at
>> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
>> at
>> com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
>> at
>> com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
>> at
>> com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
>> at
>> org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
>> at
>> org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
>> at
>> org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
>> at
>> org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
>> at
>> org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
>> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>> at
>> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at
>> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
>> at
>> com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
>> at
>> com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
>> at
>> com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
>> at
>> com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
>> at
>> com.carrotse

Re: [JENKINS] Lucene » Lucene-NightlyTests-main - Build # 1209 - Unstable!

2023-12-08 Thread Michael McCandless
Uh oh -- I'll dig.  We may need to put back the FieldInfo check before
pulling points.  Tricky!

Mike McCandless

http://blog.mikemccandless.com


On Fri, Dec 8, 2023 at 3:55 AM Apache Jenkins Server <
jenk...@builds.apache.org> wrote:

> Build:
> https://ci-builds.apache.org/job/Lucene/job/Lucene-NightlyTests-main/1209/
>
> 3 tests failed.
> FAILED:  org.apache.lucene.index.TestPointValues.testSparsePoints
>
> Error Message:
> java.lang.IllegalStateException: this writer hit an unrecoverable error;
> cannot merge
>
> Stack Trace:
> java.lang.IllegalStateException: this writer hit an unrecoverable error;
> cannot merge
> at
> __randomizedtesting.SeedInfo.seed([ADA30A2081CE6DA4:A05414293C35A568]:0)
> at
> org.apache.lucene.index.IndexWriter.hasPendingMerges(IndexWriter.java:2425)
> at
> org.apache.lucene.index.IndexWriter$IndexWriterMergeSource.hasPendingMerges(IndexWriter.java:6527)
> at
> org.apache.lucene.index.ConcurrentMergeScheduler.maybeStall(ConcurrentMergeScheduler.java:589)
> at
> org.apache.lucene.index.ConcurrentMergeScheduler.merge(ConcurrentMergeScheduler.java:540)
> at
> org.apache.lucene.index.IndexWriter.executeMerge(IndexWriter.java:2315)
> at
> org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:2310)
> at
> org.apache.lucene.index.IndexWriter.processEvents(IndexWriter.java:5985)
> at
> org.apache.lucene.index.IndexWriter.flushNextBuffer(IndexWriter.java:3606)
> at
> org.apache.lucene.tests.index.RandomIndexWriter.flushAllBuffersSequentially(RandomIndexWriter.java:263)
> at
> org.apache.lucene.tests.index.RandomIndexWriter.maybeFlushOrCommit(RandomIndexWriter.java:235)
> at
> org.apache.lucene.tests.index.RandomIndexWriter.addDocument(RandomIndexWriter.java:226)
> at
> org.apache.lucene.index.TestPointValues.testSparsePoints(TestPointValues.java:697)
> at
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
> Method)
> at
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
> at
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.base/java.lang.reflect.Method.invoke(Method.java:568)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
> at
> org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
> at
> org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
> at
> org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
> at
> org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
> at
> org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> at
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
> at
> com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
> at
> com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
> at
> org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
> at
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
> at
> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
> at
> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
> at
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> com

Re: [JENKINS] Lucene-MMAPv2-Linux (64bit/openj9/jdk-17.0.8) - Build # 1500 - Unstable!

2023-12-07 Thread Michael McCandless
Oh, nevermind -- we have seen it before, and added a comment on the
upstream (Open J9) issue:
https://github.com/eclipse-openj9/openj9/issues/18400#issuecomment-1795093834

Mike McCandless

http://blog.mikemccandless.com


On Thu, Dec 7, 2023 at 8:32 AM Michael McCandless wrote:

> Hmm -- this looks like maybe another Open J9 specific failure?  I have not
> seen this one before I think...
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Fri, Dec 1, 2023 at 10:20 PM Policeman Jenkins Server <
> jenk...@thetaphi.de> wrote:
>
>> Build: https://jenkins.thetaphi.de/job/Lucene-MMAPv2-Linux/1500/
>> Java: 64bit/openj9/jdk-17.0.8 -XX:-UseCompressedOops -Xgcpolicy:gencon
>>
>> 1 tests failed.
>> FAILED:
>> org.apache.lucene.index.TestIndexWriterThreadsToSegments.testSegmentCountOnFlushRandom
>>
>> Error Message:
>> java.lang.AssertionError
>>
>> Stack Trace:
>> java.lang.AssertionError
>> at __randomizedtesting.SeedInfo.seed([701E8537ED3618D8]:0)
>> at app//org.junit.Assert.fail(Assert.java:87)
>> at app//org.junit.Assert.assertTrue(Assert.java:42)
>> at app//org.junit.Assert.assertTrue(Assert.java:53)
>> at
>> app//org.apache.lucene.index.TestIndexWriterThreadsToSegments$CheckSegmentCount.run(TestIndexWriterThreadsToSegments.java:150)
>> at
>> java.base@17.0.8.1/java.util.concurrent.CyclicBarrier.dowait(CyclicBarrier.java:222)
>> at
>> java.base@17.0.8.1/java.util.concurrent.CyclicBarrier.await(CyclicBarrier.java:364)
>> at
>> app//org.apache.lucene.index.TestIndexWriterThreadsToSegments$2.run(TestIndexWriterThreadsToSegments.java:236)


Re: [JENKINS] Lucene-MMAPv2-Linux (64bit/openj9/jdk-17.0.8) - Build # 1500 - Unstable!

2023-12-07 Thread Michael McCandless
Hmm -- this looks like maybe another Open J9 specific failure?  I have not
seen this one before I think...

Mike McCandless

http://blog.mikemccandless.com


On Fri, Dec 1, 2023 at 10:20 PM Policeman Jenkins Server <
jenk...@thetaphi.de> wrote:

> Build: https://jenkins.thetaphi.de/job/Lucene-MMAPv2-Linux/1500/
> Java: 64bit/openj9/jdk-17.0.8 -XX:-UseCompressedOops -Xgcpolicy:gencon
>
> 1 tests failed.
> FAILED:
> org.apache.lucene.index.TestIndexWriterThreadsToSegments.testSegmentCountOnFlushRandom
>
> Error Message:
> java.lang.AssertionError
>
> Stack Trace:
> java.lang.AssertionError
> at __randomizedtesting.SeedInfo.seed([701E8537ED3618D8]:0)
> at app//org.junit.Assert.fail(Assert.java:87)
> at app//org.junit.Assert.assertTrue(Assert.java:42)
> at app//org.junit.Assert.assertTrue(Assert.java:53)
> at
> app//org.apache.lucene.index.TestIndexWriterThreadsToSegments$CheckSegmentCount.run(TestIndexWriterThreadsToSegments.java:150)
> at
> java.base@17.0.8.1/java.util.concurrent.CyclicBarrier.dowait(CyclicBarrier.java:222)
> at
> java.base@17.0.8.1/java.util.concurrent.CyclicBarrier.await(CyclicBarrier.java:364)
> at
> app//org.apache.lucene.index.TestIndexWriterThreadsToSegments$2.run(TestIndexWriterThreadsToSegments.java:236)


Re: [JENKINS] Lucene-9.x-Linux (64bit/hotspot/jdk-11.0.21) - Build # 14204 - Unstable!

2023-12-01 Thread Michael McCandless
Hmm this reproduces for me, and looks new/unique.  Could it be related to
recent 9.9.0 changes / release blocker?

Mike
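
In case it helps others reproduce: the synthetic __randomizedtesting.SeedInfo.seed([...]) frame in the quoted trace carries the seed chain for the failing run. Below is a small Java sketch that pulls it out of a trace. The class SeedFromTrace and its methods are hypothetical helpers, and treating the first component as the master seed is my understanding rather than something stated in this thread. How the value is then passed as the tests.seed option depends on the build, so check the project's test help for the exact flag.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SeedFromTrace {
    // Matches e.g. SeedInfo.seed([6CA57EA3FB50CA0D:302BB278E1397FA3] and captures the chain.
    private static final Pattern SEED_FRAME =
        Pattern.compile("SeedInfo\\.seed\\(\\[([0-9A-Fa-f]+(?::[0-9A-Fa-f]+)*)\\]");

    // Returns the full seed chain from the first matching frame, if any.
    static Optional<String> seedChain(String trace) {
        Matcher m = SEED_FRAME.matcher(trace);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    // Returns only the first component of the chain (assumed to be the master seed).
    static Optional<String> masterSeed(String trace) {
        return seedChain(trace).map(chain -> chain.split(":")[0]);
    }

    public static void main(String[] args) {
        String frame =
            "__randomizedtesting.SeedInfo.seed([6CA57EA3FB50CA0D:302BB278E1397FA3]:0)";
        masterSeed(frame).ifPresent(seed -> System.out.println("tests.seed=" + seed));
    }
}
```

The pattern also accepts a single-component seed, as seen in some of the other traces on this list.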

On Fri, Dec 1, 2023 at 3:33 PM Policeman Jenkins Server wrote:

> Build: https://jenkins.thetaphi.de/job/Lucene-9.x-Linux/14204/
> Java: 64bit/hotspot/jdk-11.0.21 -XX:+UseCompressedOops -XX:+UseParallelGC
>
> 1 tests failed.
> FAILED:  org.apache.lucene.index.TestParallelLeafReader.testQueries
>
> Error Message:
> org.junit.ComparisonFailure: expected: but was:
>
> Stack Trace:
> org.junit.ComparisonFailure: expected: but was:
> at
> __randomizedtesting.SeedInfo.seed([6CA57EA3FB50CA0D:302BB278E1397FA3]:0)
> at org.junit.Assert.assertEquals(Assert.java:117)
> at org.junit.Assert.assertEquals(Assert.java:146)
> at
> org.apache.lucene.index.TestParallelLeafReader.queryTest(TestParallelLeafReader.java:263)
> at
> org.apache.lucene.index.TestParallelLeafReader.testQueries(TestParallelLeafReader.java:48)
> at
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
> Method)
> at
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.base/java.lang.reflect.Method.invoke(Method.java:566)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
> at
> org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
> at
> org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
> at
> org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
> at
> org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
> at
> org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> at
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
> at
> com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
> at
> com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
> at
> org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
> at
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
> at
> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
> at
> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
> at
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
> at
> org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
> at
> org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
> at
> org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
> at
> org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47)
> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> at
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.

Re: [VOTE] Release Lucene 9.9.0 RC2

2023-12-01 Thread Michael McCandless
+1


SUCCESS! [0:20:12.297376]


Mike McCandless

http://blog.mikemccandless.com


On Fri, Dec 1, 2023 at 9:21 AM Uwe Schindler  wrote:

> Hi,
>
> I let Policeman Jenkins run the smoke tester with Java 11 and Java 17
> (unfortunately we have no support for 21 yet, so new MMap and Vectors were
> not tested). But this was tested long enough, so I trust everything. I just
> did some cross-checking and validated the MR-JAR to contain all classes and
> that Javadocs are up to date. Looks fine after the manual review.
>
> Here is Policeman's work and opinion: SUCCESS! [1:02:37.749085] (
> https://jenkins.thetaphi.de/job/Lucene-Release-Tester/30/console)
>
> Here is my personal opinion:
> +1 to release
>
> Uwe
> On 30.11.2023 at 18:31, Chris Hegarty wrote:
>
> Please vote for release candidate 2 for Lucene 9.9.0
>
>
> The artifacts can be downloaded from:
>
>
> https://dist.apache.org/repos/dist/dev/lucene/lucene-9.9.0-RC2-rev-06070c0dceba07f0d33104192d9ac98ca16fc500
>
>
> You can run the smoke tester directly with this command:
>
>
> python3 -u dev-tools/scripts/smokeTestRelease.py \
>
>
> https://dist.apache.org/repos/dist/dev/lucene/lucene-9.9.0-RC2-rev-06070c0dceba07f0d33104192d9ac98ca16fc500
>
>
> The vote will be open for at least 72 hours, and given the weekend in
> between, let’s keep it open until 2023-12-04 12:00 UTC.
>
> [ ] +1  approve
>
> [ ] +0  no opinion
>
> [ ] -1  disapprove (and reason why)
>
>
> Here is my +1
>
>
> -Chris.
>
> --
> Uwe Schindler
> Achterdiek 19, D-28357 Bremen
> https://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>


Re: [VOTE] Release Lucene 9.9.0 RC1

2023-11-30 Thread Michael McCandless
On Thu, Nov 30, 2023 at 9:56 AM Chris Hegarty
 wrote:

> P.S. I’m less sure about this, but the RC 2 starts a 72hr voting time
> again? (Just so I know what TTL to put on that)
>

Yeah a new 72 hour clock starts with each new RC :)
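
As a rough illustration of that rule, the minimum window can be computed from each RC's own announcement time. This is only a sketch: the timestamps below are made up, and `earliest_vote_close` is a hypothetical helper, not part of any Lucene release tooling.

```python
from datetime import datetime, timedelta, timezone

MIN_VOTE_HOURS = 72  # minimum voting window mentioned in the thread


def earliest_vote_close(rc_announced_at):
    """Earliest time a vote may close, counted from that RC's own announcement."""
    if rc_announced_at.tzinfo is None:
        raise ValueError("use a timezone-aware datetime")
    return rc_announced_at.astimezone(timezone.utc) + timedelta(hours=MIN_VOTE_HOURS)


# Hypothetical announcement times, for illustration only.
rc1 = datetime(2023, 11, 29, 14, 0, tzinfo=timezone.utc)
rc2 = datetime(2023, 11, 30, 17, 30, tzinfo=timezone.utc)

# Each RC gets its own clock; RC2's window does not inherit RC1's start.
print(earliest_vote_close(rc1).isoformat())
print(earliest_vote_close(rc2).isoformat())
```

A release manager may of course keep the vote open longer, e.g. across a weekend.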

Mike McCandless

http://blog.mikemccandless.com


Re: GitHub issues vs PRs vs Lucene's CHANGES.txt

2023-11-30 Thread Michael McCandless
Well, I created a starting tool to at least help us keep the
what-should-be-identical-yet-is-nearly-impossible-for-us-to-achieve
sections in CHANGES.txt in sync: https://github.com/apache/lucene/pull/12860

Right now it finds a number of mostly minor differences in the 9.9.0
sections in main vs branch_9_9:

NOTE: resolving branch_9_9 --> https://raw.githubusercontent.com/apache/lucene/branch_9_9/lucene/CHANGES.txt
NOTE: resolving main --> https://raw.githubusercontent.com/apache/lucene/main/lucene/CHANGES.txt
15a16,18
> * GITHUB#12646, GITHUB#12690: Move FST#addNode to FSTCompiler to avoid a circular dependency
>   between FST and FSTCompiler (Anh Dung Bui)
>
27,30c30
< * GITHUB#12646, GITHUB#12690: Move FST#addNode to FSTCompiler to avoid a circular dependency
<   between FST and FSTCompiler (Anh Dung Bui)
<
< * GITHUB#12709 Consolidate FSTStore and BytesStore in FST. Created FSTReader which contains the common methods
---
> * GITHUB#12709: Consolidate FSTStore and BytesStore in FST. Created FSTReader which contains the common methods
33,34d32
< * GITHUB#12735: Remove FSTCompiler#getTermCount() and FSTCompiler.UnCompiledNode#inputCount (Anh Dung Bui)
<
37a36,37
> * GITHUB#12735: Remove FSTCompiler#getTermCount() and FSTCompiler.UnCompiledNode#inputCount (Anh Dung Bui)
>
166a167,168
> * GITHUB#12748: Specialize arc store for continuous label in FST. (Guo Feng, Zhang Chao)
>
173,177d174
< * GITHUB#12748: Specialize arc store for continuous label in FST. (Guo Feng, Chao Zhang)
<
< * GITHUB#12825, GITHUB#12834: Hunspell: improved dictionary loading performance, allowed in-memory entry sorting.
<   (Peter Gromov)
<
185,186d181
<
< * GITHUB#12552: Make FSTPostingsFormat load FSTs off-heap. (Tony X)
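
For anyone curious what such a check boils down to, here is a minimal sketch. It is not the script from the PR: the function names are hypothetical, and it assumes each release section begins with a header line of the form `======= Lucene X.Y.Z =======`. It extracts the same release section from two CHANGES.txt texts and diffs them:

```python
import difflib
import re


def extract_section(changes_text, version):
    """Return the lines of one release section of a CHANGES.txt-style text.

    Assumption: each release section starts with a header line such as
    '======= Lucene X.Y.Z =======' and runs until the next such header
    or the end of the text.
    """
    header = re.compile(r"^=+ Lucene (\d+\.\d+\.\d+) =+$")
    section = []
    inside = False
    for line in changes_text.splitlines():
        match = header.match(line.strip())
        if match:
            # Entering the requested section, or leaving it at the next header.
            inside = match.group(1) == version
            continue
        if inside:
            section.append(line.rstrip())
    return section


def diff_sections(text_a, text_b, version, name_a="branch_9_9", name_b="main"):
    """Unified diff of the same release section taken from two branches."""
    a = extract_section(text_a, version)
    b = extract_section(text_b, version)
    return list(
        difflib.unified_diff(a, b, fromfile=name_a, tofile=name_b, lineterm="")
    )
```

An empty result means the two sections are identical; anything else is a candidate for manual cleanup. Note that a plain line diff also flags reordered entries and trivial differences such as a missing colon or a swapped author name, which is most of what shows up above.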


Mike McCandless

http://blog.mikemccandless.com


On Wed, Nov 29, 2023 at 6:01 AM Michael McCandless <
luc...@mikemccandless.com> wrote:

> Oh, and that the CHANGES.txt entries in e.g. 9.9.0 section match on 9.x
> and main branches... I think that one we have some automation to catch?
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Wed, Nov 29, 2023 at 5:58 AM Michael McCandless <
> luc...@mikemccandless.com> wrote:
>
>> Hi Team,
>>
>> I see Chris is tagging issues that were left open after their linked PRs
>> were merged (thanks!).
>>
>> Is there something in our release tooling that cross-checks all the
>> weakly linked metadata today: Milestone marked (or more often: not) on an
>> issue vs commits to the respective branches vs location in Lucene's
>> CHANGES.txt vs open/closed issue matching the linked PRs?
>>
>> It seems like some simple automation here could catch mistakes.  E.g. I'm
>> uncertain I properly moved all the FST related CHANGES.txt entries to the
>> right places.
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>


Re: [VOTE] Release Lucene 9.9.0 RC1

2023-11-30 Thread Michael McCandless
+1 to release.

I hit a corner-case test failure and opened a PR to fix it:
https://github.com/apache/lucene/pull/12859

I don't think this should block the release? -- it looks exotic.

Thanks Chris!

Mike McCandless

http://blog.mikemccandless.com


On Thu, Nov 30, 2023 at 1:16 AM Patrick Zhai  wrote:

> SUCCESS! [1:03:54.880200]
>
> +1. Thank you Chris!
>
> On Wed, Nov 29, 2023 at 8:45 PM Nhat Nguyen 
> wrote:
>
>> SUCCESS! [1:11:30.037919]
>>
>> +1. Thanks, Chris!
>>
>> On Wed, Nov 29, 2023 at 8:53 AM Chris Hegarty
>>  wrote:
>>
>>> Hi,
>>>
>>>
>>> Please vote for release candidate 1 for Lucene 9.9.0
>>>
>>>
>>> The artifacts can be downloaded from:
>>>
>>>
>>> https://dist.apache.org/repos/dist/dev/lucene/lucene-9.9.0-RC1-rev-92a5e5b02e0e083126c4122f2b7a02426c21a037
>>>
>>>
>>> You can run the smoke tester directly with this command:
>>>
>>>
>>> python3 -u dev-tools/scripts/smokeTestRelease.py \
>>>
>>>
>>> https://dist.apache.org/repos/dist/dev/lucene/lucene-9.9.0-RC1-rev-92a5e5b02e0e083126c4122f2b7a02426c21a037
>>>
>>>
>>> The vote will be open for at least 72 hours, and given the weekend in
>>> between, let’s keep it open until 2023-12-04 12:00 UTC.
>>>
>>>
>>> [ ] +1  approve
>>>
>>> [ ] +0  no opinion
>>>
>>> [ ] -1  disapprove (and reason why)
>>>
>>>
>>> Here is my +1
>>>
>>>
>>> Draft release highlights can be viewed here (comments and feedback
>>> welcome):
>>> https://cwiki.apache.org/confluence/display/LUCENE/ReleaseNote9_9_0
>>>
>>> -Chris.
>>>
>>


Re: [JENKINS] Lucene » Lucene-Check-main - Build # 10750 - Unstable!

2023-11-30 Thread Michael McCandless
I hit this one running the smoke tester on 9.9.0 RC 0, and it repros.  I'll
open an issue ... I think it's just a missing null check in the
SlowCompositeCodecReaderWrapper.

Mike McCandless

http://blog.mikemccandless.com


On Tue, Nov 28, 2023 at 6:37 PM Apache Jenkins Server <
jenk...@builds.apache.org> wrote:

> Build:
> https://ci-builds.apache.org/job/Lucene/job/Lucene-Check-main/10750/
>
> 2 tests failed.
> FAILED:
> org.apache.lucene.search.TestPointQueries.testAllPointDocsWereDeletedAndThenMergedAgain
>
> Error Message:
> java.io.IOException: background merge hit exception:
> _3(10.0.0):C2:[diagnostics={lucene.version=10.0.0, source=merge,
> os.arch=amd64, java.runtime.version=17.0.7+7, mergeFactor=1, os=Linux,
> java.vendor=Eclipse Adoptium, os.version=5.4.0-167-generic,
> timestamp=1701213730838,
> mergeMaxNumSegments=1}]:[attributes={Lucene90StoredFieldsFormat.mode=BEST_SPEED}]
> :id=4i1dci11qy6ymdhw6cirss6iz into _4 [maxNumSegments=1]
>
> Stack Trace:
> java.io.IOException: background merge hit exception:
> _3(10.0.0):C2:[diagnostics={lucene.version=10.0.0, source=merge,
> os.arch=amd64, java.runtime.version=17.0.7+7, mergeFactor=1, os=Linux,
> java.vendor=Eclipse Adoptium, os.version=5.4.0-167-generic,
> timestamp=1701213730838,
> mergeMaxNumSegments=1}]:[attributes={Lucene90StoredFieldsFormat.mode=BEST_SPEED}]
> :id=4i1dci11qy6ymdhw6cirss6iz into _4 [maxNumSegments=1]
> at
> __randomizedtesting.SeedInfo.seed([F9345050587589D4:B625E461757388B1]:0)
> at
> org.apache.lucene.index.IndexWriter.forceMerge(IndexWriter.java:2170)
> at
> org.apache.lucene.index.IndexWriter.forceMerge(IndexWriter.java:2099)
> at
> org.apache.lucene.search.TestPointQueries.testAllPointDocsWereDeletedAndThenMergedAgain(TestPointQueries.java:1212)
> at
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
> Method)
> at
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
> at
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.base/java.lang.reflect.Method.invoke(Method.java:568)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
> at
> org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
> at
> org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
> at
> org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
> at
> org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
> at
> org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> at
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
> at
> com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
> at
> com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
> at
> org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
> at
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
> at
> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
> at
> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
> at
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> org.apache.lucen

Re: GitHub issues vs PRs vs Lucene's CHANGES.txt

2023-11-29 Thread Michael McCandless
Oh, and that the CHANGES.txt entries in e.g. 9.9.0 section match on 9.x and
main branches... I think that one we have some automation to catch?

Mike McCandless

http://blog.mikemccandless.com


On Wed, Nov 29, 2023 at 5:58 AM Michael McCandless <
luc...@mikemccandless.com> wrote:

> Hi Team,
>
> I see Chris is tagging issues that were left open after their linked PRs
> were merged (thanks!).
>
> Is there something in our release tooling that cross-checks all the weakly
> linked metadata today: Milestone marked (or more often: not) on an issue vs
> commits to the respective branches vs location in Lucene's CHANGES.txt vs
> open/closed issue matching the linked PRs?
>
> It seems like some simple automation here could catch mistakes.  E.g. I'm
> uncertain I properly moved all the FST related CHANGES.txt entries to the
> right places.
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>


GitHub issues vs PRs vs Lucene's CHANGES.txt

2023-11-29 Thread Michael McCandless
Hi Team,

I see Chris is tagging issues that were left open after their linked PRs
were merged (thanks!).

Is there something in our release tooling that cross-checks all the weakly
linked metadata today: Milestone marked (or more often: not) on an issue vs
commits to the respective branches vs location in Lucene's CHANGES.txt vs
open/closed issue matching the linked PRs?

It seems like some simple automation here could catch mistakes.  E.g. I'm
uncertain I properly moved all the FST related CHANGES.txt entries to the
right places.

Mike McCandless

http://blog.mikemccandless.com


Re: [JENKINS] Lucene-9.x-Linux (64bit/hotspot/jdk-19) - Build # 14180 - Failure!

2023-11-29 Thread Michael McCandless
JVM crashed:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x7f9fa8545493, pid=2982126, tid=2990096
#
# JRE version: OpenJDK Runtime Environment (19.0+36) (build 19+36-2238)
# Java VM: OpenJDK 64-Bit Server VM (19+36-2238, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, serial gc, linux-amd64)
# Problematic frame:
# V  [libjvm.so+0xb45493]  PhaseIdealLoop::build_loop_late_post_work(Node*, bool)+0xf3
#
# No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /home/jenkins/workspace/Lucene-9.x-Linux/lucene/core/build/tmp/tests-cwd/hs_err_pid2982126.log
#
# Compiler replay data is saved as:
# /home/jenkins/workspace/Lucene-9.x-Linux/lucene/core/build/tmp/tests-cwd/replay_pid2982126.log
#
# If you would like to submit a bug report, please visit:
#   https://bugreport.java.com/bugreport/crash.jsp
#


Mike McCandless

http://blog.mikemccandless.com


On Wed, Nov 29, 2023 at 3:49 AM Policeman Jenkins Server <
jenk...@thetaphi.de> wrote:

> Build: https://jenkins.thetaphi.de/job/Lucene-9.x-Linux/14180/
> Java: 64bit/hotspot/jdk-19 -XX:+UseCompressedOops -XX:+UseSerialGC
>
> All tests passed


Re: [JENKINS] Lucene » Lucene-NightlyTests-main - Build # 1199 - Unstable!

2023-11-28 Thread Michael McCandless
OK I pushed a fix.

Mike

On Tue, Nov 28, 2023 at 7:32 PM Michael McCandless <
luc...@mikemccandless.com> wrote:

> I think maybe LuceneTestCase.newSearcher is turning on concurrency
> (setting the executor randomly).  Since this test explicitly passes a "no
> concurrency" collector manager I think we should switch to "new
> IndexSearcher(...)".
>
> Mike
>
> On Tue, Nov 28, 2023 at 7:29 PM Michael McCandless <
> luc...@mikemccandless.com> wrote:
>
>> This reproduces for me.
>>
>> Maybe related to LUCENE-10002 / #240?
>>
>> Mike
>>
>> On Tue, Nov 28, 2023 at 1:58 AM Apache Jenkins Server <
>> jenk...@builds.apache.org> wrote:
>>
>>> Build:
>>> https://ci-builds.apache.org/job/Lucene/job/Lucene-NightlyTests-main/1199/
>>>
>>> 1 tests failed.
>>> FAILED:  org.apache.lucene.search.TestTopFieldCollector.testSort
>>>
>>> Error Message:
>>> java.lang.IllegalStateException: This TopFieldCollectorManager was
>>> created without concurrency (supportsConcurrency=false), but multiple
>>> collectors are being created
>>>
>>> Stack Trace:
>>> java.lang.IllegalStateException: This TopFieldCollectorManager was
>>> created without concurrency (supportsConcurrency=false), but multiple
>>> collectors are being created
>>> at
>>> __randomizedtesting.SeedInfo.seed([4B0B913D92123C6D:1AEEB914595F267D]:0)
>>> at
>>> org.apache.lucene.search.TopFieldCollectorManager.newCollector(TopFieldCollectorManager.java:142)
>>> at
>>> org.apache.lucene.search.TopFieldCollectorManager.newCollector(TopFieldCollectorManager.java:31)
>>> at
>>> org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:623)
>>> at
>>> org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:607)
>>> at
>>> org.apache.lucene.search.TestTopFieldCollector.testSort(TestTopFieldCollector.java:124)
>>> at
>>> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
>>> Method)
>>> at
>>> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>>> at
>>> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> at java.base/java.lang.reflect.Method.invoke(Method.java:568)
>>> at
>>> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
>>> at
>>> com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
>>> at
>>> com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
>>> at
>>> com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
>>> at
>>> org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
>>> at
>>> org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
>>> at
>>> org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
>>> at
>>> org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
>>> at
>>> org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
>>> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>>> at
>>> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>>> at
>>> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
>>> at
>>> com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
>>> at
>>> com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
>>> at
>>> com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
>>> at
>>> com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
>>> at
>>> com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
>>> at
>>> com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(Random

Re: [JENKINS] Lucene » Lucene-NightlyTests-main - Build # 1199 - Unstable!

2023-11-28 Thread Michael McCandless
I think maybe LuceneTestCase.newSearcher is turning on concurrency (setting
the executor randomly).  Since this test explicitly passes a "no
concurrency" collector manager I think we should switch to "new
IndexSearcher(...)".
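
To make the suspected failure mode concrete, here is a toy model in Python. It is not Lucene code: the class and function names are hypothetical, and the check is only inferred from the error message in the report. A manager created without concurrency support rejects a second collector, and a searcher that splits work across several slices asks for one collector per slice.

```python
class SingleCollectorManager:
    """Toy stand-in for a collector manager with a supports_concurrency flag."""

    def __init__(self, supports_concurrency):
        self.supports_concurrency = supports_concurrency
        self.collectors_created = 0

    def new_collector(self):
        # Mirrors the reported error: refuse a second collector when
        # the manager was created without concurrency support.
        if not self.supports_concurrency and self.collectors_created >= 1:
            raise RuntimeError(
                "created without concurrency, but multiple collectors are being created"
            )
        self.collectors_created += 1
        return []  # a collector is modeled as a plain list of hits


def search(manager, slices):
    """Toy searcher: one collector per slice, merged into a sorted result."""
    collectors = []
    for docs in slices:
        collector = manager.new_collector()
        collector.extend(docs)
        collectors.append(collector)
    return sorted(hit for c in collectors for hit in c)
```

In this model a single slice works with either kind of manager, while two or more slices only work when the manager supports concurrency. That is why a searcher that never uses an executor would sidestep the problem.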

Mike

On Tue, Nov 28, 2023 at 7:29 PM Michael McCandless <
luc...@mikemccandless.com> wrote:

> This reproduces for me.
>
> Maybe related to LUCENE-10002 / #240?
>
> Mike
>
> On Tue, Nov 28, 2023 at 1:58 AM Apache Jenkins Server <
> jenk...@builds.apache.org> wrote:
>
>> Build:
>> https://ci-builds.apache.org/job/Lucene/job/Lucene-NightlyTests-main/1199/
>>
>> 1 tests failed.
>> FAILED:  org.apache.lucene.search.TestTopFieldCollector.testSort
>>
>> Error Message:
>> java.lang.IllegalStateException: This TopFieldCollectorManager was
>> created without concurrency (supportsConcurrency=false), but multiple
>> collectors are being created
>>
>> Stack Trace:
>> java.lang.IllegalStateException: This TopFieldCollectorManager was
>> created without concurrency (supportsConcurrency=false), but multiple
>> collectors are being created
>> at
>> __randomizedtesting.SeedInfo.seed([4B0B913D92123C6D:1AEEB914595F267D]:0)
>> at
>> org.apache.lucene.search.TopFieldCollectorManager.newCollector(TopFieldCollectorManager.java:142)
>> at
>> org.apache.lucene.search.TopFieldCollectorManager.newCollector(TopFieldCollectorManager.java:31)
>> at
>> org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:623)
>> at
>> org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:607)
>> at
>> org.apache.lucene.search.TestTopFieldCollector.testSort(TestTopFieldCollector.java:124)
>> at
>> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
>> Method)
>> at
>> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>> at
>> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> at java.base/java.lang.reflect.Method.invoke(Method.java:568)
>> at
>> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
>> at
>> com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
>> at
>> com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
>> at
>> com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
>> at
>> org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
>> at
>> org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
>> at
>> org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
>> at
>> org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
>> at
>> org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
>> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>> at
>> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at
>> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
>> at
>> com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
>> at
>> com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
>> at
>> com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
>> at
>> com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
>> at
>> com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
>> at
>> com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
>> at
>> org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
>> at
>> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at
>> org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
>> at
>> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverr

Re: [JENKINS] Lucene » Lucene-NightlyTests-main - Build # 1199 - Unstable!

2023-11-28 Thread Michael McCandless
This reproduces for me.

Maybe related to LUCENE-10002 / #240?

Mike

On Tue, Nov 28, 2023 at 1:58 AM Apache Jenkins Server <
jenk...@builds.apache.org> wrote:

> Build:
> https://ci-builds.apache.org/job/Lucene/job/Lucene-NightlyTests-main/1199/
>
> 1 tests failed.
> FAILED:  org.apache.lucene.search.TestTopFieldCollector.testSort
>
> Error Message:
> java.lang.IllegalStateException: This TopFieldCollectorManager was created
> without concurrency (supportsConcurrency=false), but multiple collectors
> are being created
>
> Stack Trace:
> java.lang.IllegalStateException: This TopFieldCollectorManager was created
> without concurrency (supportsConcurrency=false), but multiple collectors
> are being created
> at
> __randomizedtesting.SeedInfo.seed([4B0B913D92123C6D:1AEEB914595F267D]:0)
> at
> org.apache.lucene.search.TopFieldCollectorManager.newCollector(TopFieldCollectorManager.java:142)
> at
> org.apache.lucene.search.TopFieldCollectorManager.newCollector(TopFieldCollectorManager.java:31)
> at
> org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:623)
> at
> org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:607)
> at
> org.apache.lucene.search.TestTopFieldCollector.testSort(TestTopFieldCollector.java:124)
> at
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
> Method)
> at
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
> at
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.base/java.lang.reflect.Method.invoke(Method.java:568)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
> at
> org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
> at
> org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
> at
> org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
> at
> org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
> at
> org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> at
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
> at
> com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
> at
> com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
> at
> org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
> at
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
> at
> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
> at
> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
> at
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
> at
> org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
> at
> org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
> at
> org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
> at
> org.apache.lucene.tests.util.TestRuleIgnoreTestSui

Re: [JENKINS] Lucene-main-Linux (64bit/hotspot/jdk-19) - Build # 45643 - Failure!

2023-11-25 Thread Michael McCandless
Hmm JVM crashed (there's an hs_err file there):

> Process 'Gradle Test Executor 33' finished with non-zero exit value 134

Mike McCandless

http://blog.mikemccandless.com


On Sat, Nov 25, 2023 at 6:34 AM Policeman Jenkins Server <
jenk...@thetaphi.de> wrote:

> Build: https://jenkins.thetaphi.de/job/Lucene-main-Linux/45643/
> Java: 64bit/hotspot/jdk-19 -XX:+UseCompressedOops -XX:+UseParallelGC
>
> All tests passed


Re: Lucene 9.9.0 Release

2023-11-21 Thread Michael McCandless
+1

Thank you for volunteering as RM, Chris!

Mike McCandless

http://blog.mikemccandless.com


On Tue, Nov 21, 2023 at 4:52 AM Chris Hegarty
 wrote:

> Hi,
>
> It's been a while since the 9.8.0 release and we’ve accumulated quite a
> few changes. I’d like to propose that we release 9.9.0.
>
> If there are no objections, I volunteer to be the release manager and will
> cut the feature branch a week from now, 12:00 28th Nov UTC.
>
> -Chris.
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>


Re: [JENKINS] Lucene-9.x-MacOSX (64bit/hotspot/jdk-11.0.21) - Build # 3165 - Still Failing!

2023-11-20 Thread Michael McCandless
Build timed out (after 169 minutes). Marking the build as aborted.
Build timed out (after 169 minutes). Marking the build as failed.

Mike McCandless

http://blog.mikemccandless.com


On Mon, Nov 20, 2023 at 5:01 PM Policeman Jenkins Server <
jenk...@thetaphi.de> wrote:

> Build: https://jenkins.thetaphi.de/job/Lucene-9.x-MacOSX/3165/
> Java: 64bit/hotspot/jdk-11.0.21 -XX:-UseCompressedOops -XX:+UseSerialGC
>
> All tests passed
>
> -
> To unsubscribe, e-mail: builds-unsubscr...@lucene.apache.org
> For additional commands, e-mail: builds-h...@lucene.apache.org


Re: [JENKINS] Lucene-MMAPv2-Linux (64bit/hotspot/jdk-17.0.9) - Build # 1442 - Failure!

2023-11-14 Thread Michael McCandless
Hmm again timeout.  Something seems amiss.  Do our super slow tests still
print out HEARTBEAT periodically?  Or did we lose that in the gradle
migration maybe?

Build timed out (after 126 minutes). Marking the build as aborted.
Build timed out (after 126 minutes). Marking the build as failed.

Mike McCandless

http://blog.mikemccandless.com


On Tue, Nov 14, 2023 at 7:59 AM Policeman Jenkins Server <
jenk...@thetaphi.de> wrote:

> Build: https://jenkins.thetaphi.de/job/Lucene-MMAPv2-Linux/1442/
> Java: 64bit/hotspot/jdk-17.0.9 -XX:+UseCompressedOops -XX:+UseShenandoahGC
>
> All tests passed
>
> -
> To unsubscribe, e-mail: builds-unsubscr...@lucene.apache.org
> For additional commands, e-mail: builds-h...@lucene.apache.org


Re: [JENKINS] Lucene-main-Linux (64bit/hotspot/jdk-17.0.9) - Build # 45536 - Failure!

2023-11-14 Thread Michael McCandless
Thanks Uwe.

OK so this might just be a high-sigma outlier-ish event due to unluckily
slow seed selection?

I wonder whether the distribution of total run time of each full "./gradlew
test" on each JVM configuration is roughly Gaussian-ish?

Mike McCandless

http://blog.mikemccandless.com


On Tue, Nov 14, 2023 at 5:40 AM Uwe Schindler  wrote:

> Hi,
>
> Actually this is the default JVM, so it's not OpenJ9 or another EA
> release. It could be one of the tests hanging, but we can't figure that out.
>
> P.S.: Jenkins kills jobs that take much longer than usual (it has no hard
> limit; it takes the average time of previous runs and kills a run that
> takes much longer).
>
> Uwe
>
> On 14.11.2023 at 11:06, Michael McCandless wrote:
>
> Hmm build timed out -- not sure why it's taking so long to run tests:
>
> Build timed out (after 137 minutes). Marking the build as aborted.
> Build timed out (after 137 minutes). Marking the build as failed.
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Tue, Nov 14, 2023 at 1:39 AM Policeman Jenkins Server <
> jenk...@thetaphi.de> wrote:
>
>> Build: https://jenkins.thetaphi.de/job/Lucene-main-Linux/45536/
>> Java: 64bit/hotspot/jdk-17.0.9 -XX:+UseCompressedOops -XX:+UseShenandoahGC
>>
>> All tests passed
>>
>> -
>> To unsubscribe, e-mail: builds-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: builds-h...@lucene.apache.org
>
> --
> Uwe Schindler
> Achterdiek 19, D-28357 Bremen
> https://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>
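
The two ideas in this thread (a kill threshold derived from the average of
previous runs, and asking how many sigmas out a slow run is) can be sketched
roughly as below. This is only an illustration: the durations are made up, the
1.5 factor is a hypothetical setting, and none of this is taken from the actual
Jenkins configuration.

```python
import statistics

def elastic_timeout(previous_minutes, factor=1.5):
    """Timeout derived from the mean of earlier runs (factor is hypothetical)."""
    return statistics.mean(previous_minutes) * factor

def sigma_distance(previous_minutes, run_minutes):
    """How many sample standard deviations a run sits above the historical mean."""
    mean = statistics.mean(previous_minutes)
    stdev = statistics.stdev(previous_minutes)
    return (run_minutes - mean) / stdev

# Made-up durations (minutes) of earlier "./gradlew test" runs on one JVM config.
history = [88, 92, 85, 95, 90]

limit = elastic_timeout(history)
print("kill threshold (minutes):", limit)
print("137-minute run exceeds threshold:", 137 > limit)
print("sigmas above mean:", round(sigma_distance(history, 137), 1))
```

If run times really were roughly Gaussian, a run many sigmas above the mean
would be very unlikely to come from slow seed selection alone, which would
point toward a hang instead.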


Re: [JENKINS] Lucene-main-Linux (64bit/hotspot/jdk-17.0.9) - Build # 45536 - Failure!

2023-11-14 Thread Michael McCandless
Hmm build timed out -- not sure why it's taking so long to run tests:

Build timed out (after 137 minutes). Marking the build as aborted.
Build timed out (after 137 minutes). Marking the build as failed.

Mike McCandless

http://blog.mikemccandless.com


On Tue, Nov 14, 2023 at 1:39 AM Policeman Jenkins Server <
jenk...@thetaphi.de> wrote:

> Build: https://jenkins.thetaphi.de/job/Lucene-main-Linux/45536/
> Java: 64bit/hotspot/jdk-17.0.9 -XX:+UseCompressedOops -XX:+UseShenandoahGC
>
> All tests passed
>
> -
> To unsubscribe, e-mail: builds-unsubscr...@lucene.apache.org
> For additional commands, e-mail: builds-h...@lucene.apache.org


Re: Boolean field type

2023-11-13 Thread Michael McCandless
Hi Michael/Mikhails, yet another Mike here:

If you create a NumericDocValuesField, and it only ever has one value per
doc (0, 1), I think the default Codec will compress it well, though maybe
not as well as your idea.  It's a neat idea to notice a "very common
default value" and not store it and just store the sparse non-default
values.  I don't think Codec does that today.

For the search-time opto, I thought somewhere in Lucene we do something
like your idea, converting to a negation if it has lower estimated
cardinality than the positive form.  It might only be for points?  If the
field were stored as postings, you could consult the metadata in the terms
dictionary to know the cardinality of each case, and perhaps that the field
is single valued and fully set (no missing values), at which point your
optimization logic might be able to apply during rewrite maybe?
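
A rough sketch of that rewrite check, assuming per-segment statistics along
the lines of what the terms dictionary records. The class and function names
here are hypothetical and are not Lucene APIs:

```python
from dataclasses import dataclass

@dataclass
class SegmentStats:
    max_doc: int          # number of docs in the segment
    docs_with_field: int  # docs that have at least one value for the field
    doc_freq: dict        # term -> number of docs containing that term

def can_negate(stats):
    """True only if every doc has exactly one of the two values."""
    total = stats.doc_freq.get("true", 0) + stats.doc_freq.get("false", 0)
    # Every doc has at least one value, and the postings add up to max_doc,
    # so no doc can carry both values.
    return stats.docs_with_field == stats.max_doc and total == stats.max_doc

def plan(stats, wanted):
    """Return ("NOT", other) when negating the rarer term is safe and cheaper."""
    other = "false" if wanted == "true" else "true"
    if can_negate(stats) and stats.doc_freq.get(other, 0) < stats.doc_freq.get(wanted, 0):
        return ("NOT", other)
    return ("TERM", wanted)

dense = SegmentStats(max_doc=1000, docs_with_field=1000,
                     doc_freq={"true": 950, "false": 50})
gappy = SegmentStats(max_doc=1000, docs_with_field=900,
                     doc_freq={"true": 850, "false": 50})

print(plan(dense, "true"))   # rarer term is "false", field is fully set
print(plan(gappy, "true"))   # missing values make the negation unsafe
```

The second case is the important one: as soon as some docs lack the field,
"NOT false" would also match those docs, so the plain term query has to stay.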

Mike McCandless

http://blog.mikemccandless.com


On Fri, Nov 10, 2023 at 6:05 PM Michael Froh  wrote:

> Thanks Mikhail and Mike!
>
> Mikhail, since you replied, I remembered your work on block joins in Solr
> (thank you for that, by the way!), which reminded me that it's not unusual
> for docs in a Lucene index to "mix" their schemata, like in parent/child
> blocks. If 90% of parent docs are "true" on a Boolean field, but the field
> doesn't exist for the child docs, my suggested approach would probably see
> "true" as the sparse value (assuming there are at least as many children as
> parents). Ideally, I would want to only track the "false" parents (and
> leave the field off of the children).
>
> Indeed the idea of a "required" field in Lucene is tricky (though Mike's
> suggestion of missing defaults could help). Even worse, I think we'd also
> need a way to enforce "exactly one value", since a "Boolean" term field can
> really have four states -- true, false, neither, or both.
>
> It feels like there's not a workable short-term solution to implement
> something like this as a regular IndexableField in Lucene (or at least I'm
> not seeing it).
>
> I don't think I see enough value to pitch the idea of adding a new
> field-like "thing" (that would have exactly one value for every doc, and
> maybe could be counted relative to docs in a block). For now, I think it's
> probably only practical to implement something like this as part of a
> schema definition in one of the higher-level search servers.
>
> Thanks for the discussion!
> Froh
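
The four states and the parent/child pitfall described above can be made
concrete with a toy segment. This is only a sketch: docs are modeled as sets of
terms, and the 10 parents / 10 children split is an invented example.

```python
from collections import Counter

def state(terms):
    """Classify the terms indexed for one doc into one of four states."""
    has_true, has_false = "true" in terms, "false" in terms
    if has_true and has_false:
        return "both"
    if has_true:
        return "true"
    if has_false:
        return "false"
    return "neither"

# Invented block-join style segment: 10 parents (9 true, 1 false) and
# 10 children that do not have the field at all.
parents = [{"true"}] * 9 + [{"false"}]
children = [set()] * 10
docs = parents + children

counts = Counter(state(d) for d in docs)
print(dict(counts))

# Rewriting a query for "true" as NOT "false" also matches every child.
not_false = [d for d in docs if "false" not in d]
print(len(not_false), "docs match NOT false, but only", counts["true"], "are true")
```

With the children included, "true" covers fewer than half of the docs in the
segment even though 90% of parents are true, which is the situation described
above.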
>
> On Thu, Nov 9, 2023 at 5:01 AM Michael Sokolov  wrote:
>
>> Can you require the user to specify missing: true or missing: false
>> semantics? With that you can decide what to do with the missing values.
>>
>> On Thu, Nov 9, 2023, 7:55 AM Mikhail Khludnev  wrote:
>>
>>> Hello Michael.
>>> This optimization, "NOT the less common value", assumes that the Boolean
>>> field is required, but how would one enforce this mandatory-field
>>> constraint in Lucene? I'm not aware of anything like a Solr schema or mapping.
>>> If, say, foo:true is common, the postings list is a dense, sequentially
>>> increasing run of doc IDs (1, 2, 3, 4, 5, ...). Might it already be
>>> compressed well by codecs, using something like
>>> https://lucene.apache.org/core/9_2_0/core/org/apache/lucene/util/packed/MonotonicBlockPackedWriter.html
>>> ?
>>>
>>> On Thu, Nov 9, 2023 at 3:31 AM Michael Froh  wrote:
>>>
>>>> Hey,
>>>>
>>>> I've been musing about ideas for a "clever" Boolean field type on
>>>> Lucene for a while, and I think I might have an idea that could work. That
>>>> said, this popped into my head this afternoon and has not been fully-baked.
>>>> It may not be very clever at all.
>>>>
>>>> My experience is that Boolean fields tend to be overwhelmingly true or
>>>> overwhelmingly false. I've had pretty good luck with using a keyword-style
>>>> field, where the only term represents the more sparse value. (For example,
>>>> I did a thing years ago with explicit tombstones, where versioned deletes
>>>> would have the field "deleted" with a value of "true", and live
>>>> documents didn't have the deleted field at all. Every query would add a
>>>> filter on "NOT deleted:true".)
>>>>
>>>> That's great when you know up-front what the sparse value is going to
>>>> be. Working on OpenSearch, I just created an issue suggesting that we take
>>>> a hint from users for which value they think is going to be more common so
>>>> we only index the less common one:
>>>> https://github.com/opensearch-project/OpenSearch/issues/11143
>>>>
>>>> At the Lucene level, though, we could index a Boolean field type as the
>>>> less common term when we flush (by counting the values and figuring out
>>>> which is less common). Then, per segment, we can rewrite any query for the
>>>> more common value as NOT the less common value.
>>>>
>>>> You can compute upper/lower bounds on the value frequencies cheaply
>>>> during a merge, so I think you could usually write the doc IDs for the less
>>>> common value directly (without needing to count them first), even when
>>>> input segments disagree on whic

Re: Healthy PR Approaches from Apache Beam

2023-11-13 Thread Michael McCandless
Thanks Stefan!

Mike McCandless

http://blog.mikemccandless.com


On Sat, Nov 11, 2023 at 5:22 AM Stefan Vodita 
wrote:

> Thank you for going through all those PRs Mike!
> I opened an issue for porting some of the bot functionality:
> https://github.com/apache/lucene/issues/12796
>
> Stefan
>
>
> On Thu, 2 Nov 2023 at 15:30, Michael McCandless 
> wrote:
>
>> Thanks for raising this Stefan.  This is an impressive approach to more
>> rigorously responding on PRs and taking them through their lifecycle,
>> giving a better community experience especially for newcomers.  I love
>> their docs too.
>>
>> Those graphs are awesome!  Much better than the simple PR open/closed
>> count chart we have in our nightlies:
>> https://home.apache.org/~mikemccand/lucenebench/github_pr_counts.html
>>
>> I just made a pass through some of our PRs (sorted oldest to newest, and
>> sorry for all the dev list noise!) and it's sad how many PRs we (Lucene dev
>> community) really should have responded to, but failed to, in a
>> timely manner.  I think something like the Apache Beam bot could help with
>> this, though we don't really document attaching labels to newly opened PRs.
>>
>> I wonder what baby step we could adopt from Beam's approach to PRs?
>> Maybe open an issue on GitHub so we can discuss?
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>>
>> On Tue, Oct 31, 2023 at 5:39 AM Stefan Vodita 
>> wrote:
>>
>>> Hi all,
>>>
>>> I recently learned a few interesting things that the Beam
>>> <https://github.com/apache/beam> project does to
>>> promote and maintain good interactions on PRs.
>>>
>>> 1. Community metrics dashboard
>>> <http://35.193.202.176/d/code_velocity/code-velocity?orgId=1>. The
>>> graphs are pretty and insightful. You can
>>> see things like the number of open PRs across time or the mean time to
>>> first interaction on a new PR.
>>>
>>> 2. Life cycle management for PRs.
>>> a. A bot labels the PR and assigns reviewers based on the labels
>>> (example
>>> <https://github.com/apache/beam/pull/26424#issuecomment-1522788593>).
>>> b. Authors can run and re-run the pre-commit checks (doc
>>> <https://github.com/apache/beam/blob/master/CONTRIBUTING.md#create-a-pull-request>
>>> ).
>>> c. If the PR is not reviewed within 3 business days, the author is
>>> encouraged to notify the mailing list (doc
>>> <https://github.com/apache/beam/blob/master/CONTRIBUTING.md#get-reviewed>
>>> ).
>>> d. If the PR doesn't have activity, the bot comments on it, warning
>>> that it
>>> will be closed (example
>>> <https://github.com/apache/beam/pull/26424#issuecomment-1671254755>).
>>>
>>> It's hard for me to tell which of these ideas would translate well to the
>>> Lucene community, but we can try out something small, like an automated
>>> comment
>>> on stale PRs.
>>>
>>>
>>> Stefan
>>>
>>>
>>>


Re: [JENKINS] Lucene-9.x-Linux (64bit/openj9/jdk-11.0.20) - Build # 14015 - Still Failing!

2023-11-13 Thread Michael McCandless
I linked to this thread on the upstream (OpenJ9) issue about recent Lucene
CI build failures: https://github.com/eclipse-openj9/openj9/issues/18400

Mike McCandless

http://blog.mikemccandless.com


On Mon, Nov 13, 2023 at 3:09 AM Dawid Weiss  wrote:

>
> Sure, thanks. What's strange is that we don't use add-opens anywhere, I
> think (there is a mention of it I left in one of the
> comments, but nothing else across the codebase uses this directive).
>
> > Task :lucene:distribution.tests:compileTestJava
> warning: [options] --add-opens has no effect at compile time
>
>
>
> On Sun, Nov 12, 2023 at 10:56 PM Uwe Schindler  wrote:
>
>> Will check tomorrow, it's too late now.
>>
>> On Jenkins there were no Windows builds with IBM and Java 11 yet:
>> https://jenkins.thetaphi.de/job/Lucene-9.x-Windows/
>>
>> On 12.11.2023 at 22:00, Dawid Weiss wrote:
>>
>>
>> Hi Uwe,
>>
>> Can you reproduce this on Windows with the same JVM versions though?
>> Seems like I have exactly the same setup and yet this works for me just
>> fine. Strange.
>>
>> Dawid
>>
>> On Sun, Nov 12, 2023 at 9:52 PM Uwe Schindler  wrote:
>>
>>> This one was my first idea, too.
>>>
>>> It fails only with IBM Semeru in combination with Gradle using Temurin.
>>>
>>> I will dig tomorrow on Jenkins server and print all debug info.
>>>
>>> Uwe
>>>
>>>
>>> On 12 November 2023 at 21:48:54 CET, Dawid Weiss <
>>> dawid.we...@gmail.com> wrote:
>>>

 I can't reproduce this though - used exactly the same JVMs (on Windows):

 > gradlew :lucene:distribution.tests:compileTestJava --rerun-tasks
 --console=plain
 Generating gradle.properties
 ...
 > Task :altJvmWarning
 NOTE: Alternative java toolchain will be used for compilation and tests:
   Project will use 11 (IBM JDK 11.0.20.1+1, home at:
 c:\_tmp\jdk-11.0.20.1+1)
   Gradle runs with 11 (Eclipse Temurin JDK 11.0.21+9, home at:
 C:\_tmp\jdk-11.0.21+9)
 ...
 > Task :lucene:distribution.tests:compileJava NO-SOURCE
 > Task :lucene:distribution.tests:classes UP-TO-DATE
 > Task :lucene:distribution.tests:compileTestJava

 BUILD SUCCESSFUL in 23s
 5 actionable tasks: 5 executed

 On main branch it works, no idea why:
>

 I thought it's because of this:

 https://github.com/apache/lucene/commit/2e12a35c876a

 but I don't think so... seems to work for me on Windows on branch_9x
 just fine?

 D.

>>> --
>>> Uwe Schindler
>>> Achterdiek 19, 28357 Bremen
>>> https://www.thetaphi.de
>>>
>> --
>> Uwe Schindler
>> Achterdiek 19, D-28357 Bremen
>> https://www.thetaphi.de
>> eMail: u...@thetaphi.de
>>
>>


Welcome Patrick Zhai to the Lucene PMC

2023-11-10 Thread Michael McCandless
I'm happy to announce that Patrick Zhai has accepted an invitation to join
the Lucene Project Management Committee (PMC)!

Congratulations Patrick, thank you for all your hard work improving
Lucene's community and source code, and welcome aboard!

Mike McCandless

http://blog.mikemccandless.com


Re: [JENKINS] Lucene » Lucene-Check-9.x - Build # 6861 - Failure!

2023-11-06 Thread Michael McCandless
I pushed a fix.

Curious -- the build indeed fails if I use Java 11 on 9.x, but passes if I
use Java 17+.

I'm really confused.  Did the javadoc checking get weaker with newer JDKs?
Anyway, I'll port this fix to main.

Mike McCandless

http://blog.mikemccandless.com


On Mon, Nov 6, 2023 at 10:27 AM Apache Jenkins Server <
jenk...@builds.apache.org> wrote:

> Build: https://ci-builds.apache.org/job/Lucene/job/Lucene-Check-9.x/6861/
>
> No tests ran.
>
> Build Log:
> [...truncated 496 lines...]
> BUILD FAILED in 2m 8s
> 300 actionable tasks: 300 executed
>
> Publishing build scan...
> https://ge.apache.org/s/embosyxjwma5i
>
> Build step 'Invoke Gradle script' changed build result to FAILURE
> Build step 'Invoke Gradle script' marked build as failure
> Archiving artifacts
> Recording test results
> ERROR: Step ‘Publish JUnit test result report’ failed: No test report
> files were found. Configuration error?
> Email was triggered for: Failure - Any
> Sending email for trigger: Failure - Any
>
> -
> To unsubscribe, e-mail: builds-unsubscr...@lucene.apache.org
> For additional commands, e-mail: builds-h...@lucene.apache.org


Re: [JENKINS] Lucene » Lucene-Check-9.x - Build # 6861 - Failure!

2023-11-06 Thread Michael McCandless
Woops -- I'll fix.  renderJavadoc failure!

Mike McCandless

http://blog.mikemccandless.com


On Mon, Nov 6, 2023 at 10:27 AM Apache Jenkins Server <
jenk...@builds.apache.org> wrote:

> Build: https://ci-builds.apache.org/job/Lucene/job/Lucene-Check-9.x/6861/
>
> No tests ran.
>
> Build Log:
> [...truncated 496 lines...]
> BUILD FAILED in 2m 8s
> 300 actionable tasks: 300 executed
>
> Publishing build scan...
> https://ge.apache.org/s/embosyxjwma5i
>
> Build step 'Invoke Gradle script' changed build result to FAILURE
> Build step 'Invoke Gradle script' marked build as failure
> Archiving artifacts
> Recording test results
> ERROR: Step ‘Publish JUnit test result report’ failed: No test report
> files were found. Configuration error?
> Email was triggered for: Failure - Any
> Sending email for trigger: Failure - Any
>
> -
> To unsubscribe, e-mail: builds-unsubscr...@lucene.apache.org
> For additional commands, e-mail: builds-h...@lucene.apache.org


Re: [JENKINS] Lucene-main-Linux (64bit/openj9/jdk-17.0.5) - Build # 45394 - Unstable!

2023-11-06 Thread Michael McCandless
On Sun, Nov 5, 2023 at 5:01 AM Uwe Schindler  wrote:

> I will update the J9 runtime later today. But this was a real bug, so
> it's good that it caught this :-) So - no - I won't remove OpenJ9 support at
> all.
>

I see, it's great that the J9 build is indeed catching real Lucene bugs!  +1
to keep running it in CI builds.


> The errors that sometimes happen are bugs; they might get better with the
> latest versions. I see there's now also a Java 20 version. I will give it a
> try, too - especially regarding Panama (+ Vector). Want to see how it behaves.
>
+1

Thanks Uwe.

Mike McCandless

http://blog.mikemccandless.com
>
>


Re: [JENKINS] Lucene-main-Linux (64bit/openj9/jdk-17.0.5) - Build # 45394 - Unstable!

2023-11-04 Thread Michael McCandless
OK I opened https://github.com/eclipse-openj9/openj9/issues/18400 -- let's
see where that goes.

Uwe, should we upgrade to the latest OpenJ9 again maybe?

Mike McCandless

http://blog.mikemccandless.com


On Sat, Nov 4, 2023 at 12:25 PM Michael McCandless <
luc...@mikemccandless.com> wrote:

> Should we maybe stop testing J9?  Reduce its frequency?  So much noise ...
>
> I know I can filter these out from my gmail box.
>
> I will try opening an issue in the OpenJ9 GitHub repo:
> https://github.com/eclipse-openj9/openj9/issues
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Fri, Nov 3, 2023 at 7:43 PM Policeman Jenkins Server <
> jenk...@thetaphi.de> wrote:
>
>> Build: https://jenkins.thetaphi.de/job/Lucene-main-Linux/45394/
>> Java: 64bit/openj9/jdk-17.0.5 -XX:-UseCompressedOops -Xgcpolicy:metronome
>>
>> 2 tests failed.
>> FAILED:  org.apache.lucene.util.TestRamUsageEstimator.testReferenceSize
>>
>> Error Message:
>> java.lang.AssertionError: For 64 bit JVMs, reference size must be 8,
>> unless compressed references are enabled expected:<8> but was:<4>
>>
>> Stack Trace:
>> java.lang.AssertionError: For 64 bit JVMs, reference size must be 8,
>> unless compressed references are enabled expected:<8> but was:<4>
>> at
>> __randomizedtesting.SeedInfo.seed([91923EC152043BB:15B168BF99C02E62]:0)
>> at app//org.junit.Assert.fail(Assert.java:89)
>> at app//org.junit.Assert.failNotEquals(Assert.java:835)
>> at app//org.junit.Assert.assertEquals(Assert.java:647)
>> at
>> app//org.apache.lucene.util.TestRamUsageEstimator.testReferenceSize(TestRamUsageEstimator.java:195)
>> at 
>> java.base@17.0.5/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
>> Method)
>> at java.base@17.0.5
>> /jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>> at java.base@17.0.5
>> /jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> at java.base@17.0.5
>> /java.lang.reflect.Method.invoke(Method.java:568)
>> at
>> app//com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
>> at
>> app//com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
>> at
>> app//com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
>> at
>> app//com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
>> at
>> app//org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
>> at
>> app//org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
>> at
>> app//org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
>> at
>> app//org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
>> at
>> app//org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
>> at app//org.junit.rules.RunRules.evaluate(RunRules.java:20)
>> at
>> app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at
>> app//com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
>> at
>> app//com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
>> at
>> app//com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
>> at
>> app//com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
>> at
>> app//com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
>> at
>> app//com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
>> at
>> app//com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
>> at
>> app//org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
>> at
>> app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at
>> app//org.apache.lucene.tests.util.TestR

Re: [JENKINS] Lucene-main-Linux (64bit/openj9/jdk-17.0.5) - Build # 45394 - Unstable!

2023-11-04 Thread Michael McCandless
Should we maybe stop testing J9?  Reduce its frequency?  So much noise ...

I know I can filter these out from my gmail box.

I will try opening an issue in the OpenJ9 GitHub repo:
https://github.com/eclipse-openj9/openj9/issues

Mike McCandless

http://blog.mikemccandless.com


On Fri, Nov 3, 2023 at 7:43 PM Policeman Jenkins Server 
wrote:

> Build: https://jenkins.thetaphi.de/job/Lucene-main-Linux/45394/
> Java: 64bit/openj9/jdk-17.0.5 -XX:-UseCompressedOops -Xgcpolicy:metronome
>
> 2 tests failed.
> FAILED:  org.apache.lucene.util.TestRamUsageEstimator.testReferenceSize
>
> Error Message:
> java.lang.AssertionError: For 64 bit JVMs, reference size must be 8,
> unless compressed references are enabled expected:<8> but was:<4>
>
> Stack Trace:
> java.lang.AssertionError: For 64 bit JVMs, reference size must be 8,
> unless compressed references are enabled expected:<8> but was:<4>
> at
> __randomizedtesting.SeedInfo.seed([91923EC152043BB:15B168BF99C02E62]:0)
> at app//org.junit.Assert.fail(Assert.java:89)
> at app//org.junit.Assert.failNotEquals(Assert.java:835)
> at app//org.junit.Assert.assertEquals(Assert.java:647)
> at
> app//org.apache.lucene.util.TestRamUsageEstimator.testReferenceSize(TestRamUsageEstimator.java:195)
> at 
> java.base@17.0.5/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
> Method)
> at java.base@17.0.5
> /jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
> at java.base@17.0.5
> /jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.base@17.0.5
> /java.lang.reflect.Method.invoke(Method.java:568)
> at
> app//com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
> at
> app//com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
> at
> app//com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
> at
> app//com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
> at
> app//org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
> at
> app//org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
> at
> app//org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
> at
> app//org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
> at
> app//org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
> at app//org.junit.rules.RunRules.evaluate(RunRules.java:20)
> at
> app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> app//com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
> at
> app//com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
> at
> app//com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
> at
> app//com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
> at
> app//com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
> at
> app//com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
> at
> app//com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
> at
> app//org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
> at
> app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> app//org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
> at
> app//com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
> at
> app//com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
> at
> app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> app//org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
> at
> app//org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
> at
> app//org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
> at
> app//org.apache.lu

Re: [JENKINS] Lucene-main-Linux (64bit/openj9/jdk-17.0.5) - Build # 45409 - Unstable!

2023-11-04 Thread Michael McCandless
Likely J9 specific?

Mike McCandless

http://blog.mikemccandless.com


On Sat, Nov 4, 2023 at 11:34 AM Policeman Jenkins Server <
jenk...@thetaphi.de> wrote:

> Build: https://jenkins.thetaphi.de/job/Lucene-main-Linux/45409/
> Java: 64bit/openj9/jdk-17.0.5 -XX:-UseCompressedOops -Xgcpolicy:gencon
>
> 2 tests failed.
> FAILED:  org.apache.lucene.util.TestRamUsageEstimator.testQuery
>
> Error Message:
> java.lang.AssertionError: expected:<1160.0> but was:<1760.0>
>
> Stack Trace:
> java.lang.AssertionError: expected:<1160.0> but was:<1760.0>
> at
> __randomizedtesting.SeedInfo.seed([7FE6F0C1668A9364:F49784B29700A9B1]:0)
> at app//org.junit.Assert.fail(Assert.java:89)
> at app//org.junit.Assert.failNotEquals(Assert.java:835)
> at app//org.junit.Assert.assertEquals(Assert.java:555)
> at app//org.junit.Assert.assertEquals(Assert.java:685)
> at
> app//org.apache.lucene.util.TestRamUsageEstimator.testQuery(TestRamUsageEstimator.java:189)
> at 
> java.base@17.0.5/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
> Method)
> at java.base@17.0.5
> /jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
> at java.base@17.0.5
> /jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.base@17.0.5
> /java.lang.reflect.Method.invoke(Method.java:568)
> at
> app//com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
> at
> app//com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
> at
> app//com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
> at
> app//com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
> at
> app//org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
> at
> app//org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
> at
> app//org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
> at
> app//org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
> at
> app//org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
> at app//org.junit.rules.RunRules.evaluate(RunRules.java:20)
> at
> app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> app//com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
> at
> app//com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
> at
> app//com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
> at
> app//com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
> at
> app//com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
> at
> app//com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
> at
> app//com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
> at
> app//org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
> at
> app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> app//org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
> at
> app//com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
> at
> app//com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
> at
> app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> app//org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
> at
> app//org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
> at
> app//org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
> at
> app//org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
> at
> app//org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47)
> at app//org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   

Re: [JENKINS] Lucene-main-Windows (64bit/openj9/jdk-17.0.5) - Build # 13400 - Unstable!

2023-11-04 Thread Michael McCandless
Maybe J9 specific?

Mike McCandless

http://blog.mikemccandless.com


On Sat, Nov 4, 2023 at 11:01 AM Policeman Jenkins Server <
jenk...@thetaphi.de> wrote:

> Build: https://jenkins.thetaphi.de/job/Lucene-main-Windows/13400/
> Java: 64bit/openj9/jdk-17.0.5 -XX:-UseCompressedOops -Xgcpolicy:gencon
>
> 2 tests failed.
> FAILED:  org.apache.lucene.util.TestRamUsageEstimator.testReferenceSize
>
> Error Message:
> java.lang.AssertionError: For 64 bit JVMs, reference size must be 8,
> unless compressed references are enabled expected:<8> but was:<4>
>
> Stack Trace:
> java.lang.AssertionError: For 64 bit JVMs, reference size must be 8,
> unless compressed references are enabled expected:<8> but was:<4>
> at
> __randomizedtesting.SeedInfo.seed([41AB595A28A8656B:5D031209A44808B2]:0)
> at app//org.junit.Assert.fail(Assert.java:89)
> at app//org.junit.Assert.failNotEquals(Assert.java:835)
> at app//org.junit.Assert.assertEquals(Assert.java:647)
> at
> app//org.apache.lucene.util.TestRamUsageEstimator.testReferenceSize(TestRamUsageEstimator.java:195)
> at 
> java.base@17.0.5/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
> Method)
> at java.base@17.0.5
> /jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
> at java.base@17.0.5
> /jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.base@17.0.5
> /java.lang.reflect.Method.invoke(Method.java:568)
> at
> app//com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
> at
> app//com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
> at
> app//com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
> at
> app//com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
> at
> app//org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
> at
> app//org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
> at
> app//org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
> at
> app//org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
> at
> app//org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
> at app//org.junit.rules.RunRules.evaluate(RunRules.java:20)
> at
> app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> app//com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
> at
> app//com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
> at
> app//com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
> at
> app//com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
> at
> app//com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
> at
> app//com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
> at
> app//com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
> at
> app//org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
> at
> app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> app//org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
> at
> app//com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
> at
> app//com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
> at
> app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> app//org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
> at
> app//org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
> at
> app//org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
> at
> app//org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
> at
> app//org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(Test

Re: Squash vs merge of PRs

2023-11-04 Thread Michael McCandless
I didn't realize the community had decided squashing (rewriting history)
was our standard.

> Comparing histories between branches with git-bisect to find bugs is just
> one example.

But if the bug was introduced in one of the N local commits the developer
had done, wouldn't that be helpful?  You could see that one commit instead
of all N squashed, and get better context on how/why the bug was introduced?

I would prefer history-preserving commits.  It can reveal/preserve
important information -- like we tried one approach, and discovered some
issue, tweaked it to a better approach.  This can be useful in the future
if someone is working on that part of the code and is trying to understand
why it was done a certain way.  It preserves the natural and healthy
iterations we all experience when working closely together.  Why discard
such possibly helpful history?

Also, one can always wear hazy glasses in the future to "summarize" the
full history down to a view that's more palatable to them personally, if
you don't like seeing merge commit branching.  But we cannot do the
reverse.  Discarding the actual development history is a one-way door.

http://blog.mikemccandless.com


On Sat, Nov 4, 2023 at 11:03 AM Gus Heck  wrote:

> Also, since (as noted) this is a previously decided issue, not sure why
> this is a list email instead of a simple direct query to Robert seeking to
> understand the specific case? No need to make a public discussion unless
> it's a long term pattern, actually breaking something, or we want to change
> something?
>
> On Sat, Nov 4, 2023 at 9:37 AM Benjamin Trent 
> wrote:
>
>> TL;DR, forcing non-committers to squash things is a good idea. Enforcing
>> through some measure for committers is a bad idea.
>>
>> Since this thread is now in Robert's spam, I am guessing it won't have
>> any impact :). I do not think Robert is actively trying to hurt the project in
>> any way. It seems to me that he doesn't think a clean git history is worth
>> the effort.
>>
>> Having a clean git history makes things easier for everyone. Comparing
>> histories between branches with git-bisect to find bugs is just one
>> example. Another is simply reading commits to see when
>> features/bug fixes/etc. were added.
>>
>> I do NOT think we should add procedures or branch protections to actively
>> enforce this.
>>
>> Small personal sacrifices (like dealing with commit conflicts) are
>> necessary for a community. Being part of a community is about buying into
>> what the community is about and working towards a common goal. Many times
>> we do things we don't agree with, or make things slightly more difficult
>> for us, for the community as a whole. This thing being OSS shows that we
>> all buy into its importance and are willing to put work into the project.
>>
>> Having a cultural default of "make things nice for others" is good.
>> Enforcing this ideology on others is the antithesis of its definition.
>>
>>
>>
>> On Sat, Nov 4, 2023 at 9:02 AM Robert Muir  wrote:
>>
>>> This isn't a community issue, it is me avoiding useless unnecessary
>>> merge conflicts. Word "community" is invoked here to try to make it
>>> out, like you can hold a vote about what git commands i should type on
>>> my computer? You know that isn't gonna work. have some humility.
>>>
>>> thread moved to spam.
>>>
>>> On Sat, Nov 4, 2023 at 8:36 AM Mike Drob  wrote:
>>> >
>>> > We all agree on using Java though, and using a specific version, and
>>> > even the style output from gradle tidy. Is that nanny state or community
>>> > consensus?
>>> >
>>> > On Sat, Nov 4, 2023 at 7:29 AM Robert Muir  wrote:
>>> >>
>>> >> example of a nanny state IMO, trying to dictate what git commands to
>>> >> use, or what editor to use. Maybe this works for you in your corporate
>>> >> hellholes, but I think some folks have a bit of a power issue, are
>>> >> accustomed to dictating this stuff to their employees and so on, but
>>> >> this is open-source. I don't report to you, i dont use the editor you
>>> >> tell me, or the git commands you tell me.
>>> >>
>>> >> On Sat, Nov 4, 2023 at 8:21 AM Uwe Schindler  wrote:
>>> >> >
>>> >> > Hi,
>>> >> >
>>> >> > I just wanted to give your attention to the following discussion:
>>> >> > https://github.com/apache/lucene/pull/12737#issuecomment-1793426911
>>> >> >
>>> >> > From my knowledge the Lucene (and Solr) community decided a while back
>>> >> > to disable merging and only allow squashing of PRs. Robert always did
>>> >> > this, but because of a one-time problem with two branches he was working
>>> >> > on in parallel, he suddenly changed his mind and did merges on his own,
>>> >> > not squashing the branch and pushing to ASF Git.
>>> >> >
>>> >> > I am also not a fan of removing all history, but especially for heavy
>>> >> > committing branches like the given PR, I think we should invite our
>>> >> > committers to also adhere to community standards everyone else
>>> >> > practices. I would agree with merging those 

Re: Healthy PR Approaches from Apache Beam

2023-11-02 Thread Michael McCandless
Thanks for raising this Stefan.  This is an impressive approach to more
rigorously responding on PRs and taking them through their lifecycle,
giving a better community experience especially for newcomers.  I love
their docs too.

Those graphs are awesome!  Much better than the simple PR open/closed count
chart we have in our nightlies:
https://home.apache.org/~mikemccand/lucenebench/github_pr_counts.html

I just made a pass through some of our PRs (sorted oldest to newest, and
sorry for all the dev list noise!) and it's sad how many PRs we (Lucene dev
community) really should have responded to, but failed to, in a
timely manner.  I think something like the Apache Beam bot could help with this,
though we don't really document attaching labels to newly opened PRs.

I wonder what baby step we could adopt from Beam's approach to PRs?  Maybe
open an issue on GitHub so we can discuss?

Mike McCandless

http://blog.mikemccandless.com


On Tue, Oct 31, 2023 at 5:39 AM Stefan Vodita 
wrote:

> Hi all,
>
> I recently learned a few interesting things that the Beam [1] project does
> to promote and maintain good interactions on PRs.
>
> 1. Community metrics dashboard [2]. The graphs are pretty and insightful.
>    You can see things like the number of open PRs across time or the mean
>    time to first interaction on a new PR.
>
> 2. Life cycle management for PRs.
>    a. A bot labels the PR and assigns reviewers based on the labels
>       (example [3]).
>    b. Authors can run and re-run the pre-commit checks (doc [4]).
>    c. If the PR is not reviewed within 3 business days, the author is
>       encouraged to notify the mailing list (doc [5]).
>    d. If the PR doesn't have activity, the bot comments on it, warning
>       that it will be closed (example [6]).
>
> It's hard for me to tell which of these ideas would translate well to the
> Lucene community, but we can try out something small, like an automated
> comment on stale PRs.
>
>
> Stefan
>
>
> [1] https://github.com/apache/beam
> [2] http://35.193.202.176/d/code_velocity/code-velocity?orgId=1
> [3] https://github.com/apache/beam/pull/26424#issuecomment-1522788593
> [4] https://github.com/apache/beam/blob/master/CONTRIBUTING.md#create-a-pull-request
> [5] https://github.com/apache/beam/blob/master/CONTRIBUTING.md#get-reviewed
> [6] https://github.com/apache/beam/pull/26424#issuecomment-1671254755
>
>


Re: [JENKINS] Lucene-9.x-Linux (64bit/openj9/jdk-17.0.5) - Build # 13732 - Unstable!

2023-10-31 Thread Michael McCandless
Wow, thank you Adrien!  Cascaded merges count as uncommitted changes ...

Mike McCandless

http://blog.mikemccandless.com


On Tue, Oct 31, 2023 at 4:51 AM Adrien Grand  wrote:

> I pushed a fix for these failures:
> https://github.com/apache/lucene/commit/85f5d3bb0bf84fed46ca4c093c1aa084e4a43873
>
> On Fri, Oct 27, 2023 at 9:55 AM Policeman Jenkins Server <
> jenk...@thetaphi.de> wrote:
>
>> Build: https://jenkins.thetaphi.de/job/Lucene-9.x-Linux/13732/
>> Java: 64bit/openj9/jdk-17.0.5 -XX:-UseCompressedOops -Xgcpolicy:gencon
>>
>> 1 tests failed.
>> FAILED:  org.apache.lucene.index.TestIndexWriter.testHasUncommittedChanges
>>
>> Error Message:
>> java.lang.AssertionError
>>
>> Stack Trace:
>> java.lang.AssertionError
>> at
>> __randomizedtesting.SeedInfo.seed([63AADDD55C51D4C2:45E1C0A475266832]:0)
>> at app//org.junit.Assert.fail(Assert.java:87)
>> at app//org.junit.Assert.assertTrue(Assert.java:42)
>> at app//org.junit.Assert.assertFalse(Assert.java:65)
>> at app//org.junit.Assert.assertFalse(Assert.java:75)
>> at
>> app//org.apache.lucene.index.TestIndexWriter.testHasUncommittedChanges(TestIndexWriter.java:2400)
>> at 
>> java.base@17.0.5/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
>> Method)
>> at java.base@17.0.5
>> /jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>> at java.base@17.0.5
>> /jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> at java.base@17.0.5
>> /java.lang.reflect.Method.invoke(Method.java:568)
>> at
>> app//com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
>> at
>> app//com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
>> at
>> app//com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
>> at
>> app//com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
>> at
>> app//org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
>> at
>> app//org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
>> at
>> app//org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
>> at
>> app//org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
>> at
>> app//org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
>> at app//org.junit.rules.RunRules.evaluate(RunRules.java:20)
>> at
>> app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at
>> app//com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
>> at
>> app//com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
>> at
>> app//com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
>> at
>> app//com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
>> at
>> app//com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
>> at
>> app//com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
>> at
>> app//com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
>> at
>> app//org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
>> at
>> app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at
>> app//org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
>> at
>> app//com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
>> at
>> app//com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
>> at
>> app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at
>> app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at
>> app//org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
>> at
>> app//org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
>> at
>> app//org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
>> at
>> app//org.apache.lucene.tests.util.TestRuleIgnoreAfterM

Re: [JENKINS] Lucene » Lucene-Coverage-main - Build # 937 - Unstable!

2023-10-29 Thread Michael McCandless
OK I pushed a fix:
https://github.com/apache/lucene/commit/11436a848cbcc8302b31ca01e63e57dd35a93b1e

Mike

On Sat, Oct 28, 2023 at 3:10 PM Apache Jenkins Server <
jenk...@builds.apache.org> wrote:

> Build:
> https://ci-builds.apache.org/job/Lucene/job/Lucene-Coverage-main/937/
>
> 1 tests failed.
> FAILED:  org.apache.lucene.util.automaton.TestAutomaton.testRandomFinite
>
> Error Message:
> java.lang.IllegalArgumentException: This builder doesn't allow terms that
> are larger than 1000 characters, got [e3 8e be e3 8f 80 e3 8c 92 e3 8c a5
> e3 8f b4 e3 8d 89 e3 8c 89 e3 8f 82 e3 8f b4 e3 8c b5 f0 90 8a a2 f0 90 8a
> bb f0 90 8b 83 f0 90 8a ab f0 90 8a b9 f0 90 8b 87 f0 90 8b 89 f0 90 8a b1
> f0 90 8a a3 f0 90 8a a2 f0 90 8a b3 f0 90 8a aa f0 90 8a ad f0 90 8a b4 f0
> 90 91 98 f0 90 91 98 f0 90 91 ad f0 90 91 aa f0 90 91 9d f0 90 91 b8 f0 90
> 91 92 f0 90 91 a0 f0 90 91 af f0 90 91 9e f0 90 91 a3 f0 90 91 94 f0 90 91
> 9d f0 90 91 99 ea 9a 99 ea 99 88 ea 99 b9 ea 99 af ea 99 9d ea 99 95 ea 99
> 8d ea 99 8b ea 9a 9f ea 9a 90 ea 99 bb ea 99 a8 ea 99 b8 ea 99 8e ef b8 81
> ef b8 85 ef b8 87 ef b8 8c ef b8 8b ef b8 8f ef b8 8e ef b8 8f ef b8 82 ef
> b8 82 ef b8 86 ef b8 84 ef b8 8c ef b8 8b f0 90 83 bc f0 90 83 85 f0 90 83
> aa f0 90 83 b5 f0 90 82 92 f0 90 83 8e f0 90 82 b2 f0 90 83 a1 f0 90 82 aa
> d4 a3 d4 91 d4 84 d4 9a d4 90 d4 9d e3 bb a2 e4 b1 92 e3 a9 ac e4 9f 8a e4
> ab 80 e3 a0 b2 e4 9b ad e4 97 bc e4 af a4 e4 b3 a9 e3 9c ba e3 a6 b8 e4 a8
> a9 e4 8b 8e f0 90 ad b0 f0 90 ad be f0 90 ad b5 f0 90 ad b2 f0 90 ad ac f0
> 90 ad b4 f0 90 ad ae f0 90 ad b7 f0 90 ad aa f0 90 ad a3 f0 90 ad b5 f0 90
> ad a1 f0 90 ad b6 f0 90 ad a1 f0 90 ad ad f0 90 ad b3 f0 90 ad aa ea 9b 94
> ea 9b ba ea 9a a1 e1 83 a4 e1 83 a5 e1 82 b9 e1 82 b7 e1 83 86 e1 83 b5 e1
> 83 ab e1 83 bd e1 83 95 e1 82 be e1 83 a4 e1 83 8c e1 83 9d e2 bd 94 e2 be
> 98 e2 be 9c e2 be a5 e2 be a6 e2 be ba e2 bc ae e2 bc b6 f0 90 ad aa e1 a5
> b6 e1 a5 9e e1 a5 98 e1 a5 93 e1 a5 ab e1 a5 a1 e1 a5 90 e1 a5 b7 e1 a5 9d
> f3 b0 9d ad f3 bb b2 bf f3 b9 90 b3 f3 b5 86 8c f3 b8 9e a4 f3 b8 a6 8b f3
> b4 ae a2 f3 b5 95 b5 f3 bf 8f 8d f3 bd 8e a7 f3 b4 ba 81 f3 b4 81 be f3 b4
> 93 b2 f3 bb bf 84 f3 b0 aa 8e f3 b9 8d 9c f3 b5 a2 85 f3 b7 8d 93 f0 92 91
> b6 f0 92 90 8e f0 92 91 8d f0 92 91 8b f0 92 91 b2 f0 92 91 8c f0 92 90 86
> f0 9d 8d 8c f0 9d 8c 85 f0 9d 8d 94 f0 9d 8c be f0 9d 8c b0 f0 9d 8c b1 f0
> 9d 8c a5 f0 9d 8d 8d f0 9d 8c 81 f0 9d 8c b2 f0 9d 8d 93 f0 9d 8c b7 f0 9d
> 8d 85 f0 9d 8c bd f0 9d 8c 87 e1 b1 a5 e1 b1 b8 e1 b1 b9 e1 b1 a0 e1 b1 a2
> e1 b1 a6 e1 b1 b3 e1 b1 a7 e1 b1 b3 e1 b1 ab e1 b1 ac c7 af c7 93 c8 ae c7
> b2 c6 9d c8 be c8 a5 c7 81 c8 b6 c9 8a c9 81 e1 86 89 e1 84 9d e1 87 ad e1
> 86 b2 e1 86 a0 e1 87 92 e1 87 ad e1 85 87 e1 86 a2 e1 86 ae e1 84 86 e1 84
> 98 e1 85 b5 e1 85 b8 e1 87 b3 e1 85 be e1 86 b0 e1 84 98 e1 87 8a e1 87 97
> f0 9f 85 b9 f0 9f 85 a3 f0 9f 86 ab f0 9f 84 a6 f0 9f 85 b7 f0 9d 8c b0 e0
> ba a6 e0 ba 84 e0 ba 8b e0 ba 94 e0 bb 8f e0 bb ba e0 ba ad e0 bb a6 e0 bb
> 88 e0 ba a3 e0 bb 93 e0 ba 99 e0 bb 92 e0 ba b3 e0 ba 88 e0 bb b5 e0 ba 87
> e0 ba bc e1 8f 81 e1 8f 94 e1 8f ab e1 8f 98 e1 8e b0 e1 8f a0 e1 8f a7 e1
> 8e bd e1 8e b4 e1 8f a5 e1 8f 9d e1 8f b2 e1 8f bb e1 8e b8 f0 90 b1 80 f0
> 90 b0 8e f0 90 b0 a8 f0 90 b0 99 f0 90 b1 8a f0 90 b1 81 f0 90 b1 87 f0 90
> b1 87 f0 90 b1 8d f0 90 b0 af f0 90 b1 80 f0 90 b0 8d f0 90 b0 b7 f0 90 b1
> 84 f0 90 b0 8c f0 90 b0 b2 ef b9 9b ef b9 99 ef b9 91 ef b9 9e ef b9 9e ef
> b9 a7 ef b9 93 ef b9 9f e0 ac 8c e0 ad bb e0 ac 82 e0 ad 8a e0 ac 84 e0 ac
> bb e0 ad 8b e0 ac b1 e0 ad a7 e0 ac a4 e0 ad 94 e0 ad be e0 ad 88 e0 ad 8d]
>
> Stack Trace:
> java.lang.IllegalArgumentException: This builder doesn't allow terms that
> are larger than 1000 characters, got [e3 8e be e3 8f 80 e3 8c 92 e3 8c a5
> e3 8f b4 e3 8d 89 e3 8c 89 e3 8f 82 e3 8f b4 e3 8c b5 f0 90 8a a2 f0 90 8a
> bb f0 90 8b 83 f0 90 8a ab f0 90 8a b9 f0 90 8b 87 f0 90 8b 89 f0 90 8a b1
> f0 90 8a a3 f0 90 8a a2 f0 90 8a b3 f0 90 8a aa f0 90 8a ad f0 90 8a b4 f0
> 90 91 98 f0 90 91 98 f0 90 91 ad f0 90 91 aa f0 90 91 9d f0 90 91 b8 f0 90
> 91 92 f0 90 91 a0 f0 90 91 af f0 90 91 9e f0 90 91 a3 f0 90 91 94 f0 90 91
> 9d f0 90 91 99 ea 9a 99 ea 99 88 ea 99 b9 ea 99 af ea 99 9d ea 99 95 ea 99
> 8d ea 99 8b ea 9a 9f ea 9a 90 ea 99 bb ea 99 a8 ea 99 b8 ea 99 8e ef b8 81
> ef b8 85 ef b8 87 ef b8 8c ef b8 8b ef b8 8f ef b8 8e ef b8 8f ef b8 82 ef
> b8 82 ef b8 86 ef b8 84 ef b8 8c ef b8 8b f0 90 83 bc f0 90 83 85 f0 90 83
> aa f0 90 83 b5 f0 90 82 92 f0 90 83 8e f0 90 82 b2 f0 90 83 a1 f0 90 82 aa
> d4 a3 d4 91 d4 84 d4 9a d4 90 d4 9d e3 bb a2 e4 b1 92 e3 a9 ac e4 9f 8a e4
> ab 80 e3 a0 b2 e4 9b ad e4 97 bc e4 af a4 e4 b3 a9 e3 9c ba e3 a6 b8 e4 a8
> a9 e4 8b 8e f0 90 ad b0 f0 90 ad be f0 90 ad b5 f0 90 ad b2 f0 90 ad ac f0
> 90 ad b4 f0 90 ad ae f0 90 ad b7 f0 90 ad aa f0 90 ad a3 f0 90 ad b5 f0 90
> ad a1 f0 90 ad b6 f0 90 ad a1 f0 90 ad ad f0 90 ad b3 f0 90 ad aa ea 9b 94
> ea 9b ba ea 9a a1 e1 83 a4 e1 83 a5 e1 82 b9 e1

Re: [JENKINS] Lucene » Lucene-Coverage-main - Build # 937 - Unstable!

2023-10-29 Thread Michael McCandless
This repros for me ... silly test bug.  I'll commit a fix.

Mike

On Sat, Oct 28, 2023 at 3:10 PM Apache Jenkins Server <
jenk...@builds.apache.org> wrote:

> Build:
> https://ci-builds.apache.org/job/Lucene/job/Lucene-Coverage-main/937/
>
> 1 tests failed.
> FAILED:  org.apache.lucene.util.automaton.TestAutomaton.testRandomFinite
>
> Error Message:
> java.lang.IllegalArgumentException: This builder doesn't allow terms that
> are larger than 1000 characters, got [e3 8e be e3 8f 80 e3 8c 92 e3 8c a5
> e3 8f b4 e3 8d 89 e3 8c 89 e3 8f 82 e3 8f b4 e3 8c b5 f0 90 8a a2 f0 90 8a
> bb f0 90 8b 83 f0 90 8a ab f0 90 8a b9 f0 90 8b 87 f0 90 8b 89 f0 90 8a b1
> f0 90 8a a3 f0 90 8a a2 f0 90 8a b3 f0 90 8a aa f0 90 8a ad f0 90 8a b4 f0
> 90 91 98 f0 90 91 98 f0 90 91 ad f0 90 91 aa f0 90 91 9d f0 90 91 b8 f0 90
> 91 92 f0 90 91 a0 f0 90 91 af f0 90 91 9e f0 90 91 a3 f0 90 91 94 f0 90 91
> 9d f0 90 91 99 ea 9a 99 ea 99 88 ea 99 b9 ea 99 af ea 99 9d ea 99 95 ea 99
> 8d ea 99 8b ea 9a 9f ea 9a 90 ea 99 bb ea 99 a8 ea 99 b8 ea 99 8e ef b8 81
> ef b8 85 ef b8 87 ef b8 8c ef b8 8b ef b8 8f ef b8 8e ef b8 8f ef b8 82 ef
> b8 82 ef b8 86 ef b8 84 ef b8 8c ef b8 8b f0 90 83 bc f0 90 83 85 f0 90 83
> aa f0 90 83 b5 f0 90 82 92 f0 90 83 8e f0 90 82 b2 f0 90 83 a1 f0 90 82 aa
> d4 a3 d4 91 d4 84 d4 9a d4 90 d4 9d e3 bb a2 e4 b1 92 e3 a9 ac e4 9f 8a e4
> ab 80 e3 a0 b2 e4 9b ad e4 97 bc e4 af a4 e4 b3 a9 e3 9c ba e3 a6 b8 e4 a8
> a9 e4 8b 8e f0 90 ad b0 f0 90 ad be f0 90 ad b5 f0 90 ad b2 f0 90 ad ac f0
> 90 ad b4 f0 90 ad ae f0 90 ad b7 f0 90 ad aa f0 90 ad a3 f0 90 ad b5 f0 90
> ad a1 f0 90 ad b6 f0 90 ad a1 f0 90 ad ad f0 90 ad b3 f0 90 ad aa ea 9b 94
> ea 9b ba ea 9a a1 e1 83 a4 e1 83 a5 e1 82 b9 e1 82 b7 e1 83 86 e1 83 b5 e1
> 83 ab e1 83 bd e1 83 95 e1 82 be e1 83 a4 e1 83 8c e1 83 9d e2 bd 94 e2 be
> 98 e2 be 9c e2 be a5 e2 be a6 e2 be ba e2 bc ae e2 bc b6 f0 90 ad aa e1 a5
> b6 e1 a5 9e e1 a5 98 e1 a5 93 e1 a5 ab e1 a5 a1 e1 a5 90 e1 a5 b7 e1 a5 9d
> f3 b0 9d ad f3 bb b2 bf f3 b9 90 b3 f3 b5 86 8c f3 b8 9e a4 f3 b8 a6 8b f3
> b4 ae a2 f3 b5 95 b5 f3 bf 8f 8d f3 bd 8e a7 f3 b4 ba 81 f3 b4 81 be f3 b4
> 93 b2 f3 bb bf 84 f3 b0 aa 8e f3 b9 8d 9c f3 b5 a2 85 f3 b7 8d 93 f0 92 91
> b6 f0 92 90 8e f0 92 91 8d f0 92 91 8b f0 92 91 b2 f0 92 91 8c f0 92 90 86
> f0 9d 8d 8c f0 9d 8c 85 f0 9d 8d 94 f0 9d 8c be f0 9d 8c b0 f0 9d 8c b1 f0
> 9d 8c a5 f0 9d 8d 8d f0 9d 8c 81 f0 9d 8c b2 f0 9d 8d 93 f0 9d 8c b7 f0 9d
> 8d 85 f0 9d 8c bd f0 9d 8c 87 e1 b1 a5 e1 b1 b8 e1 b1 b9 e1 b1 a0 e1 b1 a2
> e1 b1 a6 e1 b1 b3 e1 b1 a7 e1 b1 b3 e1 b1 ab e1 b1 ac c7 af c7 93 c8 ae c7
> b2 c6 9d c8 be c8 a5 c7 81 c8 b6 c9 8a c9 81 e1 86 89 e1 84 9d e1 87 ad e1
> 86 b2 e1 86 a0 e1 87 92 e1 87 ad e1 85 87 e1 86 a2 e1 86 ae e1 84 86 e1 84
> 98 e1 85 b5 e1 85 b8 e1 87 b3 e1 85 be e1 86 b0 e1 84 98 e1 87 8a e1 87 97
> f0 9f 85 b9 f0 9f 85 a3 f0 9f 86 ab f0 9f 84 a6 f0 9f 85 b7 f0 9d 8c b0 e0
> ba a6 e0 ba 84 e0 ba 8b e0 ba 94 e0 bb 8f e0 bb ba e0 ba ad e0 bb a6 e0 bb
> 88 e0 ba a3 e0 bb 93 e0 ba 99 e0 bb 92 e0 ba b3 e0 ba 88 e0 bb b5 e0 ba 87
> e0 ba bc e1 8f 81 e1 8f 94 e1 8f ab e1 8f 98 e1 8e b0 e1 8f a0 e1 8f a7 e1
> 8e bd e1 8e b4 e1 8f a5 e1 8f 9d e1 8f b2 e1 8f bb e1 8e b8 f0 90 b1 80 f0
> 90 b0 8e f0 90 b0 a8 f0 90 b0 99 f0 90 b1 8a f0 90 b1 81 f0 90 b1 87 f0 90
> b1 87 f0 90 b1 8d f0 90 b0 af f0 90 b1 80 f0 90 b0 8d f0 90 b0 b7 f0 90 b1
> 84 f0 90 b0 8c f0 90 b0 b2 ef b9 9b ef b9 99 ef b9 91 ef b9 9e ef b9 9e ef
> b9 a7 ef b9 93 ef b9 9f e0 ac 8c e0 ad bb e0 ac 82 e0 ad 8a e0 ac 84 e0 ac
> bb e0 ad 8b e0 ac b1 e0 ad a7 e0 ac a4 e0 ad 94 e0 ad be e0 ad 88 e0 ad 8d]
>
> Stack Trace:
> java.lang.IllegalArgumentException: This builder doesn't allow terms that
> are larger than 1000 characters, got [e3 8e be e3 8f 80 e3 8c 92 e3 8c a5
> e3 8f b4 e3 8d 89 e3 8c 89 e3 8f 82 e3 8f b4 e3 8c b5 f0 90 8a a2 f0 90 8a
> bb f0 90 8b 83 f0 90 8a ab f0 90 8a b9 f0 90 8b 87 f0 90 8b 89 f0 90 8a b1
> f0 90 8a a3 f0 90 8a a2 f0 90 8a b3 f0 90 8a aa f0 90 8a ad f0 90 8a b4 f0
> 90 91 98 f0 90 91 98 f0 90 91 ad f0 90 91 aa f0 90 91 9d f0 90 91 b8 f0 90
> 91 92 f0 90 91 a0 f0 90 91 af f0 90 91 9e f0 90 91 a3 f0 90 91 94 f0 90 91
> 9d f0 90 91 99 ea 9a 99 ea 99 88 ea 99 b9 ea 99 af ea 99 9d ea 99 95 ea 99
> 8d ea 99 8b ea 9a 9f ea 9a 90 ea 99 bb ea 99 a8 ea 99 b8 ea 99 8e ef b8 81
> ef b8 85 ef b8 87 ef b8 8c ef b8 8b ef b8 8f ef b8 8e ef b8 8f ef b8 82 ef
> b8 82 ef b8 86 ef b8 84 ef b8 8c ef b8 8b f0 90 83 bc f0 90 83 85 f0 90 83
> aa f0 90 83 b5 f0 90 82 92 f0 90 83 8e f0 90 82 b2 f0 90 83 a1 f0 90 82 aa
> d4 a3 d4 91 d4 84 d4 9a d4 90 d4 9d e3 bb a2 e4 b1 92 e3 a9 ac e4 9f 8a e4
> ab 80 e3 a0 b2 e4 9b ad e4 97 bc e4 af a4 e4 b3 a9 e3 9c ba e3 a6 b8 e4 a8
> a9 e4 8b 8e f0 90 ad b0 f0 90 ad be f0 90 ad b5 f0 90 ad b2 f0 90 ad ac f0
> 90 ad b4 f0 90 ad ae f0 90 ad b7 f0 90 ad aa f0 90 ad a3 f0 90 ad b5 f0 90
> ad a1 f0 90 ad b6 f0 90 ad a1 f0 90 ad ad f0 90 ad b3 f0 90 ad aa ea 9b 94
> ea 9b ba ea 9a a1 e1 83 a4 e1 83 a5 e1 82 b9 e1 82 b7 e1 83 86 e1 83 b5 e1
> 83 ab e1 83

Re: Welcome Guo Feng to the Lucene PMC

2023-10-25 Thread Michael McCandless
Welcome Feng!

Mike McCandless

http://blog.mikemccandless.com


On Wed, Oct 25, 2023 at 5:05 AM Michael Sokolov  wrote:

> Welcome, gf2121!
>
> On Wed, Oct 25, 2023, 3:03 AM Ishan Chattopadhyaya <
> ichattopadhy...@gmail.com> wrote:
>
>> Congratulations and welcome, Feng!
>>
>> On Tue, 24 Oct 2023 at 22:35, Adrien Grand  wrote:
>>
>>> I'm pleased to announce that Guo Feng has accepted an invitation to join
>>> the Lucene PMC!
>>>
>>> Congratulations Feng, and welcome aboard!
>>>
>>> --
>>> Adrien
>>>
>>


Re: Could we allow an IndexInput to read from a still writing IndexOutput?

2023-10-23 Thread Michael McCandless
Thanks everyone!  Responses below:

On Thu, Oct 19, 2023 at 11:17 AM Robert Muir  wrote:

> what will happen on windows?
>
> sorry, could not resist.
>

LOL, yeah, sigh.

On Thu, Oct 19, 2023 at 10:36 PM Dawid Weiss  wrote:

>
> I think there is a certain beauty (of tape-backed storage flavor...) in
> existing abstractions and I wouldn't change them unless absolutely
> necessary (FST construction isn't the dominant cost in indexing). Also,
> random seeks all over the place may be really problematic in certain
> scenarios (as is opening a written-to file for reading, as Robert
> mentioned).
>

I do agree.  I love how minimalistic the IO semantics Lucene actually
requires are.

> Failing that, our plan B is to wastefully duplicate the byte[] slices
> from the already written bytes into our own private (heap resident, boo)
> copy, which would use quite a bit more RAM while building the FST, and make
> less minimal FSTs for a given RAM budget.
>
> Well, this node cache doesn't have to be on heap... It can be a plain
> temporary file (with full random access). It's a scratch-only structure
> which you can delete after the fst is written. It does add I/O overhead but
> doesn't interfere with the rest of the code in Lucene. Perhaps, instead of
> changing IndexInput and IndexOutput, one could start with a plain temp file
> (NIO API)?
>

That's an interesting option.  I had ruled out "bypassing Directory
abstraction and going straight to JDK IO APIs", but maybe it's OK to do
so.  I like this option Dawid!
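
To make the "plain temp file (NIO API)" idea concrete, here is a minimal
sketch of what such a scratch structure might look like.  It is only an
illustration under assumptions: the class name ScratchNodeFile and its
append/read methods are hypothetical and are not part of Lucene's FST code.
It relies on nothing beyond the JDK's FileChannel positional read and write
calls, and it deletes the file on close since the structure is scratch-only.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical scratch store: append-only writes plus positional reads of bytes already written. */
final class ScratchNodeFile implements AutoCloseable {
  private final Path path;
  private final FileChannel channel;
  private long length; // number of bytes appended so far

  ScratchNodeFile() throws IOException {
    path = Files.createTempFile("fst-nodes", ".tmp");
    channel = FileChannel.open(path, StandardOpenOption.READ, StandardOpenOption.WRITE);
  }

  /** Appends the bytes at the end of the file and returns the offset where they start. */
  long append(byte[] bytes) throws IOException {
    long start = length;
    ByteBuffer buf = ByteBuffer.wrap(bytes);
    while (buf.hasRemaining()) {
      // Positional write; it may be partial, so loop until the buffer is drained.
      length += channel.write(buf, length);
    }
    return start;
  }

  /** Reads len bytes starting at offset; only bytes already appended may be read. */
  byte[] read(long offset, int len) throws IOException {
    if (offset < 0 || len < 0 || offset + len > length) {
      throw new IllegalArgumentException("range is outside the bytes written so far");
    }
    ByteBuffer buf = ByteBuffer.allocate(len);
    long pos = offset;
    while (buf.hasRemaining()) {
      int n = channel.read(buf, pos);
      if (n < 0) {
        throw new EOFException("unexpected end of scratch file");
      }
      pos += n;
    }
    return buf.array();
  }

  /** Closes the channel and deletes the scratch file. */
  @Override
  public void close() throws IOException {
    try {
      channel.close();
    } finally {
      Files.deleteIfExists(path);
    }
  }
}
```

The sketch keeps the Directory abstraction untouched: the scratch file lives
outside the index and every read costs a system call, which is the extra I/O
overhead Dawid mentions.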


> I also think that the tradeoffs presented in graphs on the fst-node-cache
> issue are not so bad at all. Yes, the FST is not minimal, but the
> construction-space vs output size is quite all right to me.
>

Well, the tradeoffs I posted in this PR (now merged to main, and eventually
to 9.x) apply only if we still buffer the whole FST in RAM, and so we use
that as our random-access cache to past FST nodes.  If we succeed in
changing FST writing to fully off-heap (append bytes directly to disk),
then we need to duplicate that random-access RAM somewhere else (maybe a
direct NIO file, maybe just duplicated byte[] copies in the NodeHash).  So
net/net those curves will get worse -- more RAM required to achieve the
same minimality.  I haven't tested just how much worse yet ... I wanted to
probe this possibility (random read access on an appending write file)
first to not wastefully duplicate these bytes in RAM.

On Sat, Oct 21, 2023 at 1:09 AM Uwe Schindler  wrote:

> Hi, the biggest problem is with some IndexInputs that work on FS Cache
> (mmapdir). The file size changes while you are writing therefore it could
> cause strange issues. Especially the mapping of mmap may not see the
> changes you have already written as there is no happens-before relationship.
>
Hmm, I didn't realize Panama's MMap implementation had this limitation.  Or
maybe you are saying this is an OS level limitation?  Because when you map
a region of a file, you must give a bounded range (0 .. file-length), and
then if the file grows, you would have to re-map or make a 2nd, 3rd, ...
map?  Yeah OK this seems problematic indeed.
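
The bounded-range part of this can be seen with plain JDK calls.  The sketch
below is only an illustration, not Lucene's MMapDirectory code, and the class
name MmapGrowthDemo is made up: it maps a 16-byte file, appends 16 more bytes
through the same channel, and shows that the first mapping keeps its original
capacity while a fresh map() call is needed to cover the new bytes.  Whether
the first mapping would observe later writes inside its own range is left
platform-dependent by the JDK documentation, so the sketch does not test that.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical demo: a mapping covers a fixed range and does not grow with the file. */
final class MmapGrowthDemo {

  /** Returns {capacity of first mapping, file size after append, capacity of second mapping}. */
  static long[] run() throws IOException {
    Path path = Files.createTempFile("mmap-growth", ".bin");
    // Deleting a file that is still mapped can fail on Windows, so defer cleanup to JVM exit.
    path.toFile().deleteOnExit();
    try (FileChannel ch =
        FileChannel.open(path, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
      writeFully(ch, new byte[16], 0);
      // The mapped range is fixed here: 0 .. current file length.
      MappedByteBuffer first = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
      // Grow the file after the mapping was created.
      writeFully(ch, new byte[16], 16);
      // A new mapping is required to address the appended bytes.
      MappedByteBuffer second = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
      return new long[] {first.capacity(), ch.size(), second.capacity()};
    }
  }

  /** Writes all bytes at the given position, looping over partial writes. */
  private static void writeFully(FileChannel ch, byte[] bytes, long pos) throws IOException {
    ByteBuffer buf = ByteBuffer.wrap(bytes);
    while (buf.hasRemaining()) {
      pos += ch.write(buf, pos);
    }
  }
}
```

An appending writer that is also read through mmap would therefore need to
re-map (or add further mappings) every time it needs to see bytes past the
end of the last mapped range.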

> So as said by the others, if you need stuff already written, keep it in
> memory (like nodes). We should really not change our IO model for this
> singleton. A 1% slowdown while writing due to some caching or buffering
> does not matter, and is not worth the risk of corrupting indexes or running
> into errors while reading.
>
Yeah OK I'm convinced :)  Let's leave Lucene's IO "WORM" semantics intact,
and either use direct NIO for the suffix hash (NodeHash), or, burn the RAM
in duplicating the FST nodes (and measure the impact on RAM vs minimality).

Thanks everyone,

Mike McCandless

http://blog.mikemccandless.com


Re: Can we get rid of "Approve & Run" on GitHub PRs by new contributors (non-committers)?

2023-10-23 Thread Michael McCandless
Thanks everyone!  Responses below:

On Mon, Oct 16, 2023 at 7:37 AM Uwe Schindler  wrote:

> this seems to be a safety feature and is also enabled in general for
> Github. I found no options in asf.yaml to enable/disable it:
>
OK, thanks for checking Uwe.

> Nevertheless, I see no problem for pressing the button. When I quickly
> review a PR, I generally press the button.
>
I press it as well, but this is just a "best effort" and leaves many runs
unapproved for quite some time.  When I checked a few days ago, there were 52
pending runs (I think for 26 PRs, seems to be 2 runs per unapproved PR),
ranging up to 19 days in age:
https://github.com/apache/lucene/actions?query=is%3Aaction_required  (I
have since approved all of them).  We committers are not consistent in
checking all pending runs ...

> For safety reasons this is required in most projects I was contributing,
> too (not only ASF).
>
But this is a silly way to achieve safety -- it's "assume guilty until
proven innocent" of our newest contributors, when past evidence that I've
seen shows it's 100% the opposite: all of our new contributors opening PRs
are not bad agents.  Can't we instead assume innocent until proven guilty
of our newest contributors?

Sure, those of us with the karma to push the "Approve and run" don't see a
problem: we long ago lost the fresh eyes / Shoshin that new contributors
bring and experience.

Uwe, put yourself in the shoes of a new contributor: you see a small issue,
you know how to make PRs so you make one, submit it, and then no response.
You see that other contributors' PRs quickly get this nice GitHub action
catching problems, but for some reason yours does not (maybe for 19 days).
(I think this "Approve and run" button is only seen by committers.)  You're
not sure what you did wrong, what you should do next.  You feel like this
community doesn't listen to new people's PRs.  Some random time later, say
12 days, the jobs run, and now you see you made some silly mistake and you
fix it and push to your PR.  And, again, nothing happens to confirm you did
fix the problems from the first run, for maybe another 6 days.  You wonder
why you had to wait 12 days to see the first silly mistake and another 6 to
see the next...

We should constantly strive to make the new contributor experience as
wonderful / frictionless / responsive as we can, not the opposite (this
approval step).  Such brave new people are how our community grows.  And we
old timer committers are blind to the pains they feel.

(Separately, we have another problem: gradually growing number of
still-open PRs:
https://home.apache.org/~mikemccand/lucenebench/github_pr_counts.html)

> What's the problem in pressing the button? Of course you take
> responsibility when the crypto miner starts, but if there is a huge PR
> by an external contributor, I would first ask if they could split it into
> smaller pieces. At some point we have to review it, and most external
> people creating huge PRs did bad stuff like pressing the format button in
> their IDE.
>
> I think running "./gradlew precommit" is a must for new contributors. The
> online checks on Github are more for me as reviewer/committer, to make sure
> all is fine before I press the merge button (for many PRs I don't even
> checkout the code after review). So it is fine to not trigger it by
> end-users.
>
New contributors don't necessarily know everything that we old-timers
expect/know, like running precommit (they are not committers, yet!), or
tidy.  That's what's great about our Github actions -- they run that for
the contributor and the contributor can quickly see what went wrong.
Inserting committer approval there just gums up that whole nice feedback
loop for a new contributor (up to 19 days!).

> -1 to ask INFRA to enable this.
>
OK, I won't ask INFRA to change anything!

On Mon, Oct 16, 2023 at 7:53 AM Robert Muir  wrote:

> I think running the builds with a timeout is a good thing to do
> anyway, for any CI build. I'm sure github actions has some fancy yaml
> for that, but you can just do "timeout -k 1m 1h ./gradlew..." instead
> of "./gradlew" too.


+1, that seems like a much better way to achieve safety without harming the
new contributor onboarding experience.

On Mon, Oct 16, 2023 at 11:02 AM Dawid Weiss  wrote:

> I filed a PR here -
> https://github.com/apache/lucene/pull/12687
>

Ooh, thank you Dawid!  And it's now merged, so we now have a decent timeout
protection, so if a bad actor tries to crypto mine or run some distributed
LLM or whatever, at least the wasted resources are bounded by how long a
"typical" legitimate run takes, plus generous buffer.  So given this
protection, why require the added manual approval step :)

Net/net I don't think we have to do anything more here ... for now I'll try
to make a periodic effort myself to approve & run these blocked jobs.
Maybe that's enough to create a smoother first-contributor experience.

But I still strongly disagree with intentionally harming the 

Re: Welcome Luca Cavanna to the Lucene PMC

2023-10-22 Thread Michael McCandless
Welcome Luca!

Mike

On Sun, Oct 22, 2023 at 4:34 PM Tomás Fernández Löbbe 
wrote:

> Congratulations Luca!
>
> On Sun, Oct 22, 2023 at 10:51 AM Michael Sokolov 
> wrote:
>
>> Congratulations and welcome, Luca!
>>
>> On Sun, Oct 22, 2023 at 1:42 PM Julie Tibshirani 
>> wrote:
>> >
>> > Congratulations Luca!!
>> >
>> > On Fri, Oct 20, 2023 at 1:45 AM Bruno Roustant <
>> bruno.roust...@gmail.com> wrote:
>> >>
>> >> Welcome, congratulations!
>> >>
>> >> Le ven. 20 oct. 2023 à 10:02, Dawid Weiss  a
>> écrit :
>> >>>
>> >>>
>> >>> Congratulations, Luca!
>> >>>
>> >>> On Fri, Oct 20, 2023 at 7:51 AM Adrien Grand 
>> wrote:
>> >>>>
>> >>>> I'm pleased to announce that Luca Cavanna has accepted an invitation
>> to join the Lucene PMC!
>> >>>>
>> >>>> Congratulations Luca, and welcome aboard!
>> >>>>
>> >>>> --
>> >>>> Adrien
>>


Could we allow an IndexInput to read from a still writing IndexOutput?

2023-10-19 Thread Michael McCandless
Hi Team,

Today, Lucene's Directory abstraction does not allow opening an IndexInput
on a file until the file is fully written and closed via IndexOutput.  We
enforce this in tests, and some of our core Directory implementations
demand this (e.g. caching the file's length on opening an IndexInput).

Yet, most filesystems will easily allow simultaneous read/append of a
single file.  We just don't expose these IO semantics to Lucene, but could
we allow random-access reads with append-only writes on one file?  Is there
a strong reason that we don't allow this?
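
For reference, this is roughly what "random-access reads with append-only
writes" looks like when going straight to the JDK.  It is only a sketch with
FileChannel, bypassing Directory / IndexInput / IndexOutput entirely, and the
class and method names are hypothetical:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class: JDK FileChannel only, no Lucene Directory involved.
class ReadWhileAppending {

  /** Appends in two steps and reads already written bytes back in between. */
  static byte[] appendAndReadBack() {
    try {
      Path path = Files.createTempFile("append-read", ".bin");
      path.toFile().deleteOnExit();
      try (FileChannel writer =
              FileChannel.open(path, StandardOpenOption.WRITE, StandardOpenOption.APPEND);
          FileChannel reader = FileChannel.open(path, StandardOpenOption.READ)) {
        writer.write(ByteBuffer.wrap(new byte[] {10, 20, 30}));
        // Positional read of byte 1 while the writer is still open.
        byte early = readByteAt(reader, 1);
        writer.write(ByteBuffer.wrap(new byte[] {40, 50}));
        // Positional read of a byte from the second append.
        byte late = readByteAt(reader, 4);
        return new byte[] {early, late};
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  private static byte readByteAt(FileChannel reader, long position) throws IOException {
    ByteBuffer one = ByteBuffer.allocate(1);
    if (reader.read(one, position) != 1) {
      throw new IOException("could not read byte at position " + position);
    }
    return one.get(0);
  }

  public static void main(String[] args) {
    byte[] seen = appendAndReadBack();
    System.out.println("read back " + seen[0] + " and " + seen[1]);
  }
}
```

The reads here go through read() calls on a second channel rather than a
memory mapping, and everything happens on one thread in one process, so it
sidesteps the harder cases (mmap, concurrent readers).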

Quick TL;DR context: we are trying to enable FST compilation to write
off-heap (directly to disk), enabling creating arbitrarily large FSTs with
bounded heap, matching how FSTs can now be read off-heap, and it would be
much much more RAM efficient if we could read/append the same file at once.

Full gory details context: inspired by how Tantivy (awesome and fast Rust
search engine!) writes its FSTs, over in this issue and PR, we (thank you
Dzung Bui / @dungba88!) are trying to fix Lucene's FST
building to immediately stream the FST to disk, instead of buffering the
whole thing in RAM and then writing to disk.

This would allow building arbitrarily large FSTs without using up heap, and
symmetrically matches how we can now read FSTs off-heap, plus FST building
is already (mostly) append-only. This would also allow removing some of the
crazy abstractions we have for writing FST bytes into RAM (FSTStore,
BytesStore).  It would enable interesting things like a Codec whose term
dictionary is stored entirely in an FST (also inspired by Tantivy).

The wrinkle is that, while the FST is building, it sometimes looks back and
reads previously written bytes, to share suffixes and create a minimal (or
near minimal) FST.  So if IndexInput could read those bytes, even as the
FST is still appending to IndexOutput, it would "just work".

Failing that, our plan B is to wastefully duplicate the byte[] slices from
the already written bytes into our own private (heap resident, boo) copy,
which would use quite a bit more RAM while building the FST, and make less
minimal FSTs for a given RAM budget.  I haven't measured the added wasted
RAM if we have to go this route but I fear it is sizable in practice, i.e.
it strongly negates the whole idea of writing an FST off-heap since it's
effectively storing a possibly large portion of the FST in many duplicated
byte[] fragments (in the NodeHash).
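
A toy sketch of what plan B amounts to, to show where the RAM vs minimality
tradeoff comes from.  This is not Lucene's NodeHash: the class, its LRU
policy, and the ByteArrayOutputStream standing in for an append-only
IndexOutput are all assumptions for illustration.  Each "node" is appended
once, and a bounded heap cache of copied bytes is the only way to find an
earlier identical node to share:

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: bounded heap cache of node bytes over an append-only output.
class BoundedNodeCache {
  // Stand-in for an append-only IndexOutput; we never read from it.
  private final ByteArrayOutputStream out = new ByteArrayOutputStream();
  // Copied node bytes -> address where they were written, evicted in LRU order.
  private final LinkedHashMap<ByteBuffer, Long> cache;

  BoundedNodeCache(final int maxEntries) {
    this.cache =
        new LinkedHashMap<ByteBuffer, Long>(16, 0.75f, true) {
          @Override
          protected boolean removeEldestEntry(Map.Entry<ByteBuffer, Long> eldest) {
            return size() > maxEntries;
          }
        };
  }

  /** Returns the address of an identical cached node, or appends the bytes and returns the new address. */
  long addNode(byte[] nodeBytes) {
    // The clone is the "wasteful duplicate": a private heap copy of bytes we also write out.
    ByteBuffer key = ByteBuffer.wrap(nodeBytes.clone());
    Long existing = cache.get(key);
    if (existing != null) {
      return existing; // shared: nothing new is written
    }
    long address = out.size();
    out.write(nodeBytes, 0, nodeBytes.length);
    cache.put(key, address);
    return address;
  }

  long bytesWritten() {
    return out.size();
  }

  public static void main(String[] args) {
    byte[] a = {1, 2};
    byte[] b = {3, 4, 5};
    for (int budget : new int[] {16, 1}) {
      BoundedNodeCache fst = new BoundedNodeCache(budget);
      fst.addNode(a);
      fst.addNode(b);
      fst.addNode(a);
      System.out.println("cache entries=" + budget + " bytes written=" + fst.bytesWritten());
    }
  }
}
```

With room for both nodes, the repeated node is shared; with room for only
one entry, it has been evicted by the time it repeats and gets written
again.  That is the "less minimal FST for a given RAM budget" effect, in
miniature.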

So ... could we somehow relax Lucene's Directory semantics to allow opening
an IndexInput on a still appending IndexOutput, since most filesystems are
fine with this?

Mike McCandless

http://blog.mikemccandless.com

