[jira] [Updated] (PYLUCENE-70) JCC --generate missing additional \ on windows

2024-03-17 Thread Jira


 [ 
https://issues.apache.org/jira/browse/PYLUCENE-70?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Petrus Hyvönen updated PYLUCENE-70:
---
Description: 
The --generate seems to be missing double \\ in package_dir parameter on 
windows platform

In tests at JCC conda build feedstock 
([https://github.com/conda-forge/jcc-feedstock/tree/main/recipe/test/java-example-test-parameters])

--generate gives a setup.py with row:
  package_dir={"test2": "build\test2"},
 
But should be:
  package_dir={"test2": "build\\test2"},
 

  was:
The --generate seems to be missing double \\ in package_dir parameter on 
windows platform

In tests at JCC conda build feedstock 
(https://github.com/conda-forge/jcc-feedstock/tree/main/recipe/test/java-example-test-parameters)

--generate gives a setup.py with row:
  package_dir={"test2": "build\test2"},
 
But should be:
  package_dir={"test2": "build\\test2"},
 


> JCC --generate missing additional \ on windows
> --
>
> Key: PYLUCENE-70
> URL: https://issues.apache.org/jira/browse/PYLUCENE-70
> Project: PyLucene
>  Issue Type: Bug
> Environment: Windows11, conda python package
>Reporter: Petrus Hyvönen
>Priority: Blocker
>
> The --generate seems to be missing double \\ in package_dir parameter on 
> windows platform
> In tests at JCC conda build feedstock 
> ([https://github.com/conda-forge/jcc-feedstock/tree/main/recipe/test/java-example-test-parameters])
> --generate gives a setup.py with row:
>   package_dir={"test2": "build\test2"},
>  
> But should be:
>   package_dir={"test2": "build\\test2"},
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (PYLUCENE-70) JCC --generate missing additional \ on windows

2024-03-17 Thread Jira


 [ 
https://issues.apache.org/jira/browse/PYLUCENE-70?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Petrus Hyvönen updated PYLUCENE-70:
---
Description: 
The --generate seems to be missing double \\ in package_dir parameter on 
windows platform

In tests at JCC conda build feedstock 
([https://github.com/conda-forge/jcc-feedstock/tree/main/recipe/test/java-example-test-parameters])

--generate gives a setup.py with row:
  package_dir={"test2": "build\test2"},
 
But should be:
  package_dir={"test2": "build\\test2"},
 

  was:
The --generate seems to be missing double \\ in package_dir parameter on 
windows platform

In tests at JCC conda build feedstock 
([https://github.com/conda-forge/jcc-feedstock/tree/main/recipe/test/java-example-test-parameters])

--generate gives a setup.py with row:
  package_dir={"test2": "build\test2"},
 
But should be:
  package_dir={"test2": "build\\test2"},
 


> JCC --generate missing additional \ on windows
> --
>
> Key: PYLUCENE-70
> URL: https://issues.apache.org/jira/browse/PYLUCENE-70
> Project: PyLucene
>  Issue Type: Bug
> Environment: Windows11, conda python package
>Reporter: Petrus Hyvönen
>Priority: Blocker
>
> The --generate seems to be missing double \\ in package_dir parameter on 
> windows platform
> In tests at JCC conda build feedstock 
> ([https://github.com/conda-forge/jcc-feedstock/tree/main/recipe/test/java-example-test-parameters])
> --generate gives a setup.py with row:
>   package_dir={"test2": "build\test2"},
>  
> But should be:
>   package_dir={"test2": "build\\test2"},
>  





[jira] [Created] (PYLUCENE-70) JCC --generate missing additional \ on windows

2024-03-17 Thread Jira
Petrus Hyvönen created PYLUCENE-70:
--

 Summary: JCC --generate missing additional \ on windows
 Key: PYLUCENE-70
 URL: https://issues.apache.org/jira/browse/PYLUCENE-70
 Project: PyLucene
  Issue Type: Bug
 Environment: Windows11, conda python package
Reporter: Petrus Hyvönen


The --generate seems to be missing double \\ in package_dir parameter on 
windows platform

In tests at JCC conda build feedstock 
(https://github.com/conda-forge/jcc-feedstock/tree/main/recipe/test/java-example-test-parameters)

--generate gives a setup.py with row:
  package_dir={"test2": "build\test2"},
 
But should be:
  package_dir={"test2": "build\\test2"},
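
A note on why the single backslash matters: in a Python string literal, \t is the tab escape, so a generated "build\test2" does not name the build\test2 directory. The following is a minimal sketch of the problem and of one possible fix, assuming the generated line is produced by plain string formatting. render_package_dir and parse_package_dir are hypothetical helpers for illustration only, not JCC's actual code.

```python
import ast
import ntpath  # Windows path rules, usable on any platform


def render_package_dir(name: str, path: str) -> str:
    """Hypothetical helper: emit a package_dir line for a generated setup.py.

    repr() doubles backslashes, so the emitted literal evaluates back to
    the original path.
    """
    return f"package_dir={{{name!r}: {path!r}}},"


def parse_package_dir(line: str) -> dict:
    """Evaluate the dict literal on the right-hand side of the line."""
    return ast.literal_eval(line.split("=", 1)[1].rstrip(","))


path = ntpath.join("build", "test2")  # the 11 characters build\test2

# Naive formatting pastes the raw path into the generated source text.
naive_line = 'package_dir={"test2": "%s"},' % path
# Escaped formatting via repr().
safe_line = render_package_dir("test2", path)

print(naive_line)
print(safe_line)
print(parse_package_dir(naive_line)["test2"] == path)  # False: \t became a tab
print(parse_package_dir(safe_line)["test2"] == path)   # True
```

Using repr() (or an explicit replace of each backslash with two) is only one way to do this; emitting forward slashes would be another option, since setuptools documents package_dir paths as slash-separated.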
 





Re: Lucene 10

2024-03-15 Thread Patrick Zhai
Thanks Adrien +1 to the timelines.

I'm also willing to work on / review the "Decouple search concurrency from
index geometry" task; that would be very nice to have for latency-sensitive
applications (rather than having to tune the merge policy case by case). But I
cannot guarantee anything yet, so if others are also working on it I'm happy
to share ideas / effort (if any).

Patrick

On Thu, Mar 14, 2024 at 12:09 PM Michael Sokolov  wrote:

> timing makes sense to me. +1 for having a deadline to reduce
> procrastination, but Adrien I don't honestly believe anyone who is
> paying attention thinks that is what you have been doing!
>
> On Wed, Mar 13, 2024 at 10:40 AM Adrien Grand  wrote:
> >
> > Hello everyone!
> >
> > It's been ~2.5 years since we released Lucene 9.0 (December 2021) and
> I'd like us to start working towards Lucene 10.0. I'm volunteering for
> being the release manager and propose the following timeline:
> >  - ~September 15th: main gets bumped to 11.x, branch_10x gets created
> >  - ~September 22nd: Do a last 9.x minor release.
> >  - ~October 1st: Release 10.0.
> >
> > This may sound like a long notice period. My motivation is that there
> are a few changes I have on my mind that are likely worthy of a major
> release, and I plan on taking advantage of a date being set to stop
> procrastinating and finally start moving these enhancements forward. These
> are not blockers, only my wish list for Lucene 10.0, if they are not ready
> in time we can have discussions about letting them slip until the next
> major.
> >  - Greater I/O concurrency. Can Lucene better utilize modern disks that
> are plenty concurrent?
> >  - Decouple search concurrency from index geometry. Can Lucene better
> utilize modern CPUs that are plenty concurrent?
> >  - "Sparse indexing" / "zone indexing" for sorted indexes. This is one
> of the most efficient techniques that OLAP databases take advantage of to
> make search fast. Let's bring it to Lucene.
> >
> > This list isn't meant to be an exhaustive list of release highlights for
> Lucene 10, feel free to add your own. There are also a number of cleanups
> we may want to consider. I wanted to share this list for visibility though
> in case you have thoughts on these enhancements and/or would like to help.
> >
> > --
> > Adrien
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>


Re: Lucene 10

2024-03-14 Thread Michael Sokolov
timing makes sense to me. +1 for having a deadline to reduce
procrastination, but Adrien I don't honestly believe anyone who is
paying attention thinks that is what you have been doing!

On Wed, Mar 13, 2024 at 10:40 AM Adrien Grand  wrote:
>
> Hello everyone!
>
> It's been ~2.5 years since we released Lucene 9.0 (December 2021) and I'd 
> like us to start working towards Lucene 10.0. I'm volunteering for being the 
> release manager and propose the following timeline:
>  - ~September 15th: main gets bumped to 11.x, branch_10x gets created
>  - ~September 22nd: Do a last 9.x minor release.
>  - ~October 1st: Release 10.0.
>
> This may sound like a long notice period. My motivation is that there are a 
> few changes I have on my mind that are likely worthy of a major release, and 
> I plan on taking advantage of a date being set to stop procrastinating and 
> finally start moving these enhancements forward. These are not blockers, only 
> my wish list for Lucene 10.0, if they are not ready in time we can have 
> discussions about letting them slip until the next major.
>  - Greater I/O concurrency. Can Lucene better utilize modern disks that are 
> plenty concurrent?
>  - Decouple search concurrency from index geometry. Can Lucene better utilize 
> modern CPUs that are plenty concurrent?
>  - "Sparse indexing" / "zone indexing" for sorted indexes. This is one of the 
> most efficient techniques that OLAP databases take advantage of to make 
> search fast. Let's bring it to Lucene.
>
> This list isn't meant to be an exhaustive list of release highlights for 
> Lucene 10, feel free to add your own. There are also a number of cleanups we 
> may want to consider. I wanted to share this list for visibility though in 
> case you have thoughts on these enhancements and/or would like to help.
>
> --
> Adrien




Lucene 10

2024-03-13 Thread Adrien Grand
Hello everyone!

It's been ~2.5 years since we released Lucene 9.0 (December 2021) and I'd
like us to start working towards Lucene 10.0. I'm volunteering for being
the release manager and propose the following timeline:
 - ~September 15th: main gets bumped to 11.x, branch_10x gets created
 - ~September 22nd: Do a last 9.x minor release.
 - ~October 1st: Release 10.0.

This may sound like a long notice period. My motivation is that there are a
few changes I have on my mind that are likely worthy of a major release,
and I plan on taking advantage of a date being set to stop procrastinating
and finally start moving these enhancements forward. These are not
blockers, only my wish list for Lucene 10.0, if they are not ready in time
we can have discussions about letting them slip until the next major.
 - Greater I/O concurrency. Can Lucene better utilize modern disks that are
plenty concurrent?
 - Decouple search concurrency from index geometry. Can Lucene better utilize
modern CPUs that are plenty concurrent?
 - "Sparse indexing" / "zone indexing" for sorted indexes. This is one of the
most efficient techniques that OLAP databases take advantage of to make
search fast. Let's bring it to Lucene.

This list isn't meant to be an exhaustive list of release highlights for
Lucene 10, feel free to add your own. There are also a number of cleanups
we may want to consider. I wanted to share this list for visibility though
in case you have thoughts on these enhancements and/or would like to help.

-- 
Adrien


Re: [JENKINS] Lucene » Lucene-Coverage-main - Build # 1065 - Still Failing!

2024-03-10 Thread Dawid Weiss
I did turn off the security manager for coverage runs, a workaround but
better than none.

On Sun, Mar 10, 2024 at 6:11 PM Apache Jenkins Server <
jenk...@builds.apache.org> wrote:

> Build:
> https://ci-builds.apache.org/job/Lucene/job/Lucene-Coverage-main/1065/
>
> All tests passed
>
> Build Log:
> [...truncated 97716 lines...]
> BUILD FAILED in 10m 51s
> 318 actionable tasks: 318 executed
>
> Publishing build scan...
> https://ge.apache.org/s/3a3dhos6frshe
>
> Build step 'Invoke Gradle script' changed build result to FAILURE
> Build step 'Invoke Gradle script' marked build as failure
> Archiving artifacts
> Recording test results
> [Checks API] No suitable checks publisher found.
> Email was triggered for: Failure - Any
> Sending email for trigger: Failure - Any
>
> -
> To unsubscribe, e-mail: builds-unsubscr...@lucene.apache.org
> For additional commands, e-mail: builds-h...@lucene.apache.org


Re: Inlining, virtual calls and BKDPointsTree

2024-03-06 Thread Anton Hägerstrand
> I tried this and also tried benchmarking the change on 2 other types of
indexes, with slightly varying attributes. They roughly correlate to
indexes for different categories of products.
> Performance on both throughput and latency was flat.

Thank you very much for running the benchmarks and reviewing the code!

After thinking a bit about this I think that it would be best if the PR
could be proven to improve performance in luceneutil before merging. Since
luceneutil does not currently, as far as I understand things, have good
coverage for point range queries and numerical sorting, this means adding
that functionality to luceneutil. I'll start looking into that next week
(I'm currently travelling).

best regards,
Anton

On Thu, 7 Mar 2024 at 02:30, Gautam Worah  wrote:

> > I'll try tweaking the query set to target queries with more Point hits
> during the week and see what comes out..
>
> I tried this and also tried benchmarking the change on 2 other types of
> indexes, with slightly varying attributes. They roughly correlate to
> indexes for different categories of products.
> Performance on both throughput and latency was flat.
>
> The change still LGTM.
>
> Regards,
> Gautam Worah.
>
>
> On Sat, Mar 2, 2024 at 8:45 AM Gautam Worah 
> wrote:
>
>> > I am running Amazon Product Search's benchmarks to see if the change
>> is needle moving for us.
>>
>> Results were flat to slightly positive (+0.94% redline QPS) on our
>> workload.
>> Although we do have numeric range queries that would've improved, I
>> suspect it is flat because our workload is heavily dominated by TermQueries
>> and their combinations with various clauses.
>>
>> I'll try tweaking the query set to target queries with more Point hits
>> during the week and see what comes out..
>>
>> Regards,
>> Gautam Worah.
>>
>>
>> On Sat, Mar 2, 2024 at 2:33 AM Anton Hägerstrand 
>> wrote:
>>
>>> Thank you Gautam!
>>>
>>> > Yeah, it seems like luceneutil is not stressing the code path that
>>> ElasticSearch's benchmarks are?
>>>
>>> Yes, as far as I understand it - though it might just be that I don't
>>> understand luceneutil well enough. I believe that in order to see the
>>> performance diff, numerical range queries or numerical sorting would have to
>>> be involved - the more documents matched the larger the difference. This is
>>> what the relevant benchmark operations from Elastic do.
>>>
>>> > So it seems like switching over from an iterative visit(int docID)
>>> call to a bulk visit(DocIdSetIterator iterator) gave us these gains?
>>> Cool!
>>>
>>> Yes, it seems like it, based on this benchmark.
>>>
>>> > I am running Amazon Product Search's benchmarks to see if the change
>>> is needle moving for us.
>>>
>>> Thank you, much appreciated!
>>>
>>> > Small suggestion on the blog...
>>>
>>> Thank you for the feedback! The post is definitely a bit confusing, I
>>> struggled with keeping it clear. I will try to make some edits to make it
>>> clearer what conclusions can be made after each section.
>>>
>>> /Anton
>>>
>>> On Sat, 2 Mar 2024 at 00:30, Gautam Worah 
>>> wrote:
>>>
>>>> Hi Anton,
>>>>
>>>> It took me a while to get through the blog post, and I suspect I will
>>>> need to read through a couple more times to understand it fully.
>>>> Thanks for writing up something so detailed. I learnt a lot!
>>>> (especially about JVM inlining methods).
>>>>
>>>> > I have not been able to reproduce the speedup with luceneutil - I
>>>> suspect that the default tasks in it would not trigger this code path that
>>>> much.
>>>>
>>>> Yeah, it seems like luceneutil is not stressing the code path that
>>>> ElasticSearch's benchmarks are?
>>>>
>>>> > I tried changing the DocIdsWriter::readInts32 (and readDelta16),
>>>> instead calling the IntersectVisitor with a DocIdSetIterator, to reduce the
>>>> number of virtual calls. In the benchmark setup by Elastic [2] I saw a
>>>> decrease of execution time of 35-45% for range queries and numerical
>>>> sorting with this patch applied.
>>>>
>>>> So it seems like switching over from an iterative visit(int docID)
>>>> call to a bulk visit(DocIdSetIterator iterator) gave us these gains?
>>>> Cool!
>>>>
>>>> I am running Amazon Product Search's benchmarks to see if the change is
>>>> needle moving for us.
>>>>
>>>> Small suggestion on the blog: The JVM inlining, ElasticSearch
>>>> short-circuiting/opto causing a difference in performance could've been a
>>>> blog on its own, part 1 maybe.. I got confused when the blog shifted from
>>>> the performance differences between ElasticSearch and OpenSearch, to how
>>>> you ended up improving Lucene.
>>>>
>>>> Regards,
>>>> Gautam Worah.
>>>>
>>>>
>>>> On Fri, Mar 1, 2024 at 2:42 AM Anton Hägerstrand 
>>>> wrote:
>>>>
>>>>> Hi everyone, long time lurker here.
>>>>>
>>>>> I recently investigated Elasticsearch/OpenSearch performance in a blog
>>>>> post [1], and saw some interesting behavior of numerical range queries and
>>>>> numerical sorting with regards to inlining and virtual 

Re: Inlining, virtual calls and BKDPointsTree

2024-03-06 Thread Gautam Worah
> I'll try tweaking the query set to target queries with more Point hits
during the week and see what comes out..

I tried this and also tried benchmarking the change on 2 other types of
indexes, with slightly varying attributes. They roughly correlate to
indexes for different categories of products.
Performance on both throughput and latency was flat.

The change still LGTM.

Regards,
Gautam Worah.


On Sat, Mar 2, 2024 at 8:45 AM Gautam Worah  wrote:

> > I am running Amazon Product Search's benchmarks to see if the change is
> needle moving for us.
>
> Results were flat to slightly positive (+0.94% redline QPS) on our
> workload.
> Although we do have numeric range queries that would've improved, I
> suspect it is flat because our workload is heavily dominated by TermQueries
> and their combinations with various clauses.
>
> I'll try tweaking the query set to target queries with more Point hits
> during the week and see what comes out..
>
> Regards,
> Gautam Worah.
>
>
> On Sat, Mar 2, 2024 at 2:33 AM Anton Hägerstrand 
> wrote:
>
>> Thank you Gautam!
>>
>> > Yeah, it seems like luceneutil is not stressing the code path that
>> ElasticSearch's benchmarks are?
>>
>> Yes, as far as I understand it - though it might just be that I don't
>> understand luceneutil well enough. I believe that in order to see the
>> performance diff, numerical range queries or numerical sorting would have to
>> be involved - the more documents matched the larger the difference. This is
>> what the relevant benchmark operations from Elastic do.
>>
>> > So it seems like switching over from an iterative visit(int docID)
>> call to a bulk visit(DocIdSetIterator iterator) gave us these gains?
>> Cool!
>>
>> Yes, it seems like it, based on this benchmark.
>>
>> > I am running Amazon Product Search's benchmarks to see if the change
>> is needle moving for us.
>>
>> Thank you, much appreciated!
>>
>> > Small suggestion on the blog...
>>
>> Thank you for the feedback! The post is definitely a bit confusing, I
>> struggled with keeping it clear. I will try to make some edits to make it
>> clearer what conclusions can be made after each section.
>>
>> /Anton
>>
>> On Sat, 2 Mar 2024 at 00:30, Gautam Worah  wrote:
>>
>>> Hi Anton,
>>>
>>> It took me a while to get through the blog post, and I suspect I will
>>> need to read through a couple more times to understand it fully.
>>> Thanks for writing up something so detailed. I learnt a lot! (especially
>>> about JVM inlining methods).
>>>
>>> > I have not been able to reproduce the speedup with luceneutil - I
>>> suspect that the default tasks in it would not trigger this code path that
>>> much.
>>>
>>> Yeah, it seems like luceneutil is not stressing the code path that
>>> ElasticSearch's benchmarks are?
>>>
>>> > I tried changing the DocIdsWriter::readInts32 (and readDelta16),
>>> instead calling the IntersectVisitor with a DocIdSetIterator, to reduce the
>>> number of virtual calls. In the benchmark setup by Elastic [2] I saw a
>>> decrease of execution time of 35-45% for range queries and numerical
>>> sorting with this patch applied.
>>>
>>> So it seems like switching over from an iterative visit(int docID) call
>>> to a bulk visit(DocIdSetIterator iterator) gave us these gains? Cool!
>>>
>>> I am running Amazon Product Search's benchmarks to see if the change is
>>> needle moving for us.
>>>
>>> Small suggestion on the blog: The JVM inlining, ElasticSearch
>>> short-circuiting/opto causing a difference in performance could've been a
>>> blog on its own, part 1 maybe.. I got confused when the blog shifted from
>>> the performance differences between ElasticSearch and OpenSearch, to how
>>> you ended up improving Lucene.
>>>
>>> Regards,
>>> Gautam Worah.
>>>
>>>
>>> On Fri, Mar 1, 2024 at 2:42 AM Anton Hägerstrand 
>>> wrote:
>>>
>>>> Hi everyone, long time lurker here.
>>>>
>>>> I recently investigated Elasticsearch/OpenSearch performance in a blog
>>>> post [1], and saw some interesting behavior of numerical range queries and
>>>> numerical sorting with regards to inlining and virtual calls.
>>>>
>>>> In short, the DocIdsWriter::readInts method seems to get much slower if
>>>> it is called with 3 or more implementations of IntersectVisitor during the
>>>> JVM lifetime. I believe that this is due to IntersectVisitor.visit(docid)
>>>> being heavily inlined with 2 or fewer IntersectVisitor implementations,
>>>> while becoming a virtual call with 3 or more.
>>>>
>>>> This leads to two interesting points wrt Lucene
>>>>
>>>> 1) For benchmarks, warm-ups should be done not only to trigger speedups
>>>> by the JIT, but also to bring the JVM into a realistic production state. For
>>>> the BKDPointTree, this means at least 3 implementations of the
>>>> IntersectVisitor. I'm not sure if this is top of mind when writing Lucene
>>>> benchmarks?
>>>> 2) I tried changing the DocIdsWriter::readInts32 (and readDelta16),
>>>> instead calling the IntersectVisitor with a DocIdSetIterator, to reduce the

Re: [JENKINS] Lucene » Lucene-Check-main (s390x big endian) - Build # 460 - Still Failing!

2024-03-06 Thread Uwe Schindler

See this issue: https://github.com/apache/lucene/issues/13161

The s390x server (big endian) has no Java 21 yet. I'll keep the job 
enabled, should work soon.


Uwe

On 06.03.2024 at 23:09, Apache Jenkins Server wrote:

> Build: 
> https://ci-builds.apache.org/job/Lucene/job/Lucene-Check-main%20(s390x%20big%20endian)/460/
>
> No tests ran.
>
> Build Log:
> [...truncated 29 lines...]
> ERROR: JAVA_HOME is set to an invalid directory: 
> /home/jenkins/tools/java/latest21
>
> Please set the JAVA_HOME variable in your environment to match the
> location of your Java installation.
>
> Build step 'Invoke Gradle script' changed build result to FAILURE
> Build step 'Invoke Gradle script' marked build as failure
> Archiving artifacts
> Recording test results
> ERROR: Step ‘Publish JUnit test result report’ failed: No test report files 
> were found. Configuration error?
> Email was triggered for: Failure - Any
> Sending email for trigger: Failure - Any



--
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de
eMail: u...@thetaphi.de





Re: Query about the GitHub statistics for Lucene

2024-03-06 Thread Uwe Schindler

Hi,

Yes, we should contact INFRA so they get all the repository links 
up to date. They should maybe send us a list of tracked repos/issue 
trackers for us to review. There were also some crazy things, like the 
temporary repository that we used to migrate our issues from JIRA to 
GitHub being used for statistics, but NOT the apache/lucene one.


The statistics for JIRA are clearly wrong, too. The last change in JIRA 
was Aug 19, 2022.


Uwe

On 05.03.2024 at 14:26, Robert Muir wrote:

> On Tue, Mar 5, 2024 at 4:50 AM Chris Hegarty
>  wrote:
>> It appears that there is no GH activity for 2024! Clearly this is incorrect. 
>> I’ve yet to track down what’s going on with this. Familiar to anyone here?
>
> Last time I looked at this, it appeared it is looking at the incorrect
> github repositories, for example https://github.com/apache/lucene-solr
> and not https://github.com/apache/lucene



--
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de
eMail: u...@thetaphi.de





Re: Query about the GitHub statistics for Lucene

2024-03-06 Thread Chris Hegarty
Hi Mike,

> On 6 Mar 2024, at 10:47, Michael McCandless  wrote:
> 
> On Wed, Mar 6, 2024 at 4:41 AM Chris Hegarty  
> wrote:
> 
> Seems that I’ve fallen into the newbie PMC Chair rabbit hole! ;-) - the 
> reporting tool has long standing issues. Maybe they’re fixable, maybe not, 
> but it’s possible we don’t necessarily need it now.
> 
> Sorry :)  Seems to be a rite-of-passage at this point! 

Ha! Just happy that I’m not alone on this.

> It should be mentioned in the handover instructions... or, we should simply 
> merge Daniel Gruno's one-line fix to the regexp that Kibble/Whimsy/reporter 
> tool uses: 
> https://issues.apache.org/jira/browse/COMDEV-425?focusedCommentId=17823767=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17823767

That would be great, but I’m not sure why it’s not been done before at this 
point. I’ll add a note to future handover instructions if it cannot be resolved.

> @Mike is it possible to add “created since” filter?
> 
> Ahh good idea, done!  
> https://githubsearch.mikemccandless.com/search.py?sort=recentlyUpdated=created%3APast+3+months=issue_or_pr%3APR
>   (this is PRs created in the Past 3 months ... it shows 36 open and 162 
> closed right now, close to the GitHub counts you found).

This looks right, thanks. I think we can use Githubsearch going forward. :-) 

> Here's the luceneserver commit that adds it: 
> https://github.com/mikemccand/luceneserver/commit/397942573bed3e2c4fd00ab0a324a19fd014bfd4

Thank you,
-Chris.



Re: Query about the GitHub statistics for Lucene

2024-03-06 Thread Michael McCandless
On Wed, Mar 6, 2024 at 4:41 AM Chris Hegarty 
wrote:

> Seems that I’ve fallen into the newbie PMC Chair rabbit hole! ;-) - the
> reporting tool has long standing issues. Maybe they’re fixable, maybe not,
> but it’s possible we don’t necessarily need it now.
>

Sorry :)  Seems to be a rite-of-passage at this point!  It should be
mentioned in the handover instructions... or, we should simply merge Daniel
Gruno's one-line fix to the regexp that Kibble/Whimsy/reporter tool uses:
https://issues.apache.org/jira/browse/COMDEV-425?focusedCommentId=17823767=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17823767

> @Mike is it possible to add “created since” filter?
>

Ahh good idea, done!
https://githubsearch.mikemccandless.com/search.py?sort=recentlyUpdated=created%3APast+3+months=issue_or_pr%3APR
(this is PRs created in the Past 3 months ... it shows 36 open and 162
closed right now, close to the GitHub counts you found).

Here's the luceneserver commit that adds it:
https://github.com/mikemccand/luceneserver/commit/397942573bed3e2c4fd00ab0a324a19fd014bfd4

Mike McCandless

http://blog.mikemccandless.com


Re: Query about the GitHub statistics for Lucene

2024-03-06 Thread Chris Hegarty
Hi,

Seems that I’ve fallen into the newbie PMC Chair rabbit hole! ;-) - the 
reporting tool has long standing issues. Maybe they’re fixable, maybe not, but 
it’s possible we don’t necessarily need it now.

> On 5 Mar 2024, at 18:22, Michael McCandless  wrote:
> 
> ...
> @Mike. Would it be possible to add a “Past 3 months” to 
> https://githubsearch.mikemccandless.com/search.py ? Which would be helpful 
> when reporting.
> 
> Good idea!  Done!  
> https://githubsearch.mikemccandless.com/search.py?sort=recentlyUpdated=status%3AOpen=updated%3APast+3+months

Cool. Thanks.

The stats I’m trying to retrieve are for PRs created in the past 3 months. 
GitHub allows me to get that with:
   https://github.com/apache/lucene/pulls?q=is%3Apr+created%3A%3E2023-12-05

, which (when run today) shows:  PRs - 36 Open   163 Closed

Another interesting stat is PRs UPDATED in the past 3 months, e.g.
  https://github.com/apache/lucene/pulls?q=is%3Apr+updated%3A%3E2023-12-05+
   ~355 PRs updated.
   ( which we can also see from Mike’s githubsearch [1])
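
The two GitHub search links above can be rebuilt for any cutoff date. This is a minimal sketch using only the Python standard library; pr_search_url is a hypothetical helper name, and the only assumption about GitHub is the `is:pr created:>DATE` / `updated:>DATE` query syntax already visible in the links above.

```python
from datetime import date
from urllib.parse import urlencode


def pr_search_url(repo: str, field: str, since: date) -> str:
    """Build a GitHub pull-request search URL (hypothetical helper).

    field is "created" or "updated"; PRs after the given date match.
    """
    if field not in ("created", "updated"):
        raise ValueError(f"unsupported field: {field}")
    query = f"is:pr {field}:>{since.isoformat()}"
    # urlencode percent-encodes ':' and '>' and turns the space into '+'.
    return f"https://github.com/{repo}/pulls?" + urlencode({"q": query})


print(pr_search_url("apache/lucene", "created", date(2023, 12, 5)))
print(pr_search_url("apache/lucene", "updated", date(2023, 12, 5)))
```

The open/closed counts still have to be read off the resulting page; the sketch only saves retyping the encoded query for each report.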

@Mike is it possible to add “created since” filter?

Another very rough approximation of activity / health is commits, e.g.

  $ git log --pretty='format:%cd' --since='3 months ago' | wc -l
  244
  $ git log --all --pretty='format:%cd' --since='3 months ago' | wc -l
  472

So 472 commits on all branches in the past 3 months.

-Chris

[1] 
https://githubsearch.mikemccandless.com/search.py?chg=du==status=undefined=0=29577=recentlyUpdated=list=uzz5ht9buk98=status%3AOpen=updated%3APast+3+months=issue_or_pr%3APR=





Re: Query about the GitHub statistics for Lucene

2024-03-05 Thread Michael McCandless
Found the prior discussion/issue:
https://lists.apache.org/thread/fhzw0y7kpnf48cxfml8t0313sdswdv6b

And a prior prior discussion:
https://lists.apache.org/thread/6rsr8v982fjqgyopprqzw057cpzfnz3z

Issue: https://issues.apache.org/jira/browse/COMDEV-425.  Jan seemed to get
close to fixing the (regexp?) bug!

Mike McCandless

http://blog.mikemccandless.com


On Tue, Mar 5, 2024 at 1:03 PM Michael McCandless 
wrote:

>
> On Tue, Mar 5, 2024 at 4:49 AM Chris Hegarty <
> christopher.hega...@elastic.co> wrote:
>
>> In preparation for the project’s upcoming ASF board report, I came across
>> and reported [1] an issue with the GH statistics, available at:
>> https://reporter.apache.org/wizard/statistics?lucene
>>
>> It appears that there is no GH activity for 2024! Clearly this is
>> incorrect. I’ve yet to track down what’s going on with this. Familiar to
>> anyone here?
>
>
> There is a long-standing INFRA issue about this.  Lemme try to locate it
> ...
>
>> @Mike. Would it be possible to add a “Past 3 months” to
>> https://githubsearch.mikemccandless.com/search.py ? Which would be
>> helpful when reporting.
>>
>
> Good idea!  Done!
> https://githubsearch.mikemccandless.com/search.py?sort=recentlyUpdated=status%3AOpen=updated%3APast+3+months
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>


Re: Query about the GitHub statistics for Lucene

2024-03-05 Thread Michael McCandless
On Tue, Mar 5, 2024 at 4:49 AM Chris Hegarty 
wrote:

> In preparation for the project’s upcoming ASF board report, I came across
> and reported [1] an issue with the GH statistics, available at:
> https://reporter.apache.org/wizard/statistics?lucene
>
> It appears that there is no GH activity for 2024! Clearly this is
> incorrect. I’ve yet to track down what’s going on with this. Familiar to
> anyone here?


There is a long-standing INFRA issue about this.  Lemme try to locate it
...

> @Mike. Would it be possible to add a “Past 3 months” to
> https://githubsearch.mikemccandless.com/search.py ? Which would be
> helpful when reporting.
>

Good idea!  Done!
https://githubsearch.mikemccandless.com/search.py?sort=recentlyUpdated=status%3AOpen=updated%3APast+3+months

Mike McCandless

http://blog.mikemccandless.com


Re: Query about the GitHub statistics for Lucene

2024-03-05 Thread Dawid Weiss
Perhaps this is what you meant by 'gh' but wanted to mention it -
https://github.com/apache/lucene/pulse/monthly

On Tue, Mar 5, 2024 at 4:34 PM Chris Hegarty
 wrote:

>
> > On 5 Mar 2024, at 13:26, Robert Muir  wrote:
> >
> > On Tue, Mar 5, 2024 at 4:50 AM Chris Hegarty
> >  wrote:
> >> It appears that there is no GH activity for 2024! Clearly this is
> >> incorrect. I’ve yet to track down what’s going on with this. Familiar to
> >> anyone here?
> >>
> >
> > Last time I looked at this, it appeared it is looking at the incorrect
> > github repositories, for example https://github.com/apache/lucene-solr
> > and not https://github.com/apache/lucene
>
> Ah, that could explain it!!
>
> I’ll try to take a look at what repo those report stats are generated
> from, and how we might be able to get that updated. Mostly for convenience,
> and also having a single source of truth.
>
> Anyway, thankfully git and GH are good enough to get the kind of basic
> stats we typically want - just that it’s not as clear when comparing to
> previously gathered stats. Well… commits are commits, and counting PRs
> should not result in different numbers, but you know ... ;-)
>
> Thanks,
> -Chris.
>
>


Re: Query about the GitHub statistics for Lucene

2024-03-05 Thread Chris Hegarty


> On 5 Mar 2024, at 13:26, Robert Muir  wrote:
> 
> On Tue, Mar 5, 2024 at 4:50 AM Chris Hegarty
>  wrote:
>> It appears that there is no GH activity for 2024! Clearly this is incorrect. 
>> I’ve yet to track down what’s going on with this. Familiar to anyone here?
>> 
> 
> Last time I looked at this, it appeared it is looking at the incorrect
> github repositories, for example https://github.com/apache/lucene-solr
> and not https://github.com/apache/lucene

Ah, that could explain it!!

I’ll try to take a look at what repo those report stats are generated from, and 
how we might be able to get that updated. Mostly for convenience, and also 
having a single source of truth.

Anyway, thankfully git and GH are good enough to get the kind of basic stats we 
typically want - just that it’s not as clear when comparing to previously 
gathered stats. Well… commits are commits, and counting PRs should not result 
in different numbers, but you know ... ;-) 

Thanks,
-Chris.



Re: Query about the GitHub statistics for Lucene

2024-03-05 Thread Robert Muir
On Tue, Mar 5, 2024 at 4:50 AM Chris Hegarty
 wrote:
> It appears that there is no GH activity for 2024! Clearly this is incorrect. 
> I’ve yet to track down what’s going on with this. Familiar to anyone here?
>

Last time I looked at this, it appeared it is looking at the incorrect
github repositories, for example https://github.com/apache/lucene-solr
and not https://github.com/apache/lucene




Query about the GitHub statistics for Lucene

2024-03-05 Thread Chris Hegarty
Hi,

In preparation for the project’s upcoming ASF board report, I came across and 
reported [1] an issue with the GH statistics, available at: 
https://reporter.apache.org/wizard/statistics?lucene

It appears that there is no GH activity for 2024! Clearly this is incorrect. 
I’ve yet to track down what’s going on with this. Familiar to anyone here? 

@Mike. Would it be possible to add a “Past 3 months” to 
https://githubsearch.mikemccandless.com/search.py ? Which would be helpful when 
reporting.

-Chris

[1] https://lists.apache.org/thread/78fh8hb68zybbkz63odb0hzohgrddzkq



Re: [VOTE] Release PyLucene 9.10.0-rc1

2024-03-04 Thread Andi Vajda



Thank you all who voted.
Thank you Dawid and Mike for your PMC +1 votes as well.

This vote has passed !
Expect a release shortly...

Andi..

On Wed, 21 Feb 2024, Andi Vajda wrote:


The PyLucene 9.10.0 (rc1) release tracking the recent release of
Apache Lucene 9.10.0 is ready.

A release candidate is available from:
  https://dist.apache.org/repos/dist/dev/lucene/pylucene/9.10.0-rc1/

PyLucene 9.10.0 is built with JCC 3.14, included in these release artifacts.

Apart from the catch-up to Lucene 9.10.0, the other major new feature in this 
release candidate is that JCC can now generate a setup.py file instead of 
calling Setup() directly. This makes it possible to use modern Python 
packaging without falling afoul of "python setup.py install" being
deprecated. Setup.py itself is not deprecated, only some of its associated 
commands are; see [1] for more information about this.


In PyLucene's Makefile, there now is a new MODERN_PACKAGING variable, which 
can be set to true so that "python -m build" and "python -m pip install" are 
used for building and installing PyLucene.


JCC 3.14 supports Python 3.3 up to Python 3.12.
PyLucene may also be built with Python 2 but this configuration is no longer
tested.

Please vote to release these artifacts as PyLucene 9.10.0.
Anyone interested in this release can and should vote !

Thanks !

Andi..

ps: the KEYS file for PyLucene release signing is at:
https://dist.apache.org/repos/dist/release/lucene/pylucene/KEYS
https://dist.apache.org/repos/dist/dev/lucene/pylucene/KEYS

pps: here is my +1

[1] https://packaging.python.org/en/latest/discussions/setup-py-deprecated/



Re: The future of the PyLucene project

2024-03-04 Thread Andi Vajda



So it does look like there are users of PyLucene who would like the project 
to continue, after all. As long as there is interest I'm happy to continue 
with it as well.


Thank you all who responded to this thread !

Andi..

On Wed, 28 Feb 2024, Andi Vajda wrote:



Hi PyLucene users and Lucene PMC,

A week ago, on Wednesday February 21st, I started a voting thread for 
qualifying a new PyLucene release candidate to catch-up with the recent 
Lucene 9.10.0 release and fix a bug in JCC.


Usually these voting threads get a couple of +1 from PyLucene users before 
getting votes from a couple of people on the Lucene PMC, always the same ones 
;-) Three PMC +1 votes -> a release can happen.


This time, crickets, the voting thread has been completely quiet.

If there are no PyLucene users anymore, maybe it's time to shut the project 
down ? Personally, I think that the "software value" in the project is all in 
JCC. PyLucene itself is 99% machine generated by JCC around Java Lucene.


Of course, having Java Lucene available that way from Python is pretty cool 
so I don't want to diminish PyLucene's "usage value", but from a software 
engineering standpoint, the itch, if you prefer, all the cool stuff is done 
in JCC.


If the Lucene PMC agrees and no PyLucene users come forward, I propose the 
following:

 - shut down the PyLucene project
 - fork JCC to my gitlab (https://gitlab.pyicu.org/main) where it can
   get the occasional fix or improvement before being released to PyPI.
   JCC has been distributed from PyPI forever,
 https://pypi.org/project/JCC/#history
   so JCC users shouldn't even notice this...

What do you all think ?
This message is not a vote, I'm just trying to gauge interest in PyLucene and 
JCC.


Andi..

ps: for those who have never heard of PyLucene, it is a sub-project of
   Apache Lucene hosted here:
 https://lucene.apache.org/pylucene/index.html
pps: for those who have never heard of JCC, it is a sub-project of PyLucene
 hosted here: https://lucene.apache.org/pylucene/jcc/index.html



Re: [VOTE] Release PyLucene 9.10.0-rc1

2024-03-04 Thread Michael McCandless
+1 to release.

I successfully ran my standard PyLucene smoke test of indexing the first
100K enwiki documents, running a couple queries, force merging to one
segment, and running again.

This was on Python 3.11, OpenJDK 21, Arch Linux kernel 6.4.1.

I am sad that this may be the last official PyLucene release!!  Sorry for
the long delay on completing my vote.

Mike McCandless

http://blog.mikemccandless.com


On Wed, Feb 21, 2024 at 4:50 PM Andi Vajda  wrote:

>
> The PyLucene 9.10.0 (rc1) release tracking the recent release of
> Apache Lucene 9.10.0 is ready.
>
> A release candidate is available from:
> https://dist.apache.org/repos/dist/dev/lucene/pylucene/9.10.0-rc1/
>
> PyLucene 9.10.0 is built with JCC 3.14, included in these release
> artifacts.
>
> Apart from the catch-up to Lucene 9.10.0, the other major new feature in
> this release candidate is that JCC can now generate a setup.py file
> instead
> of calling Setup() directly. This makes it possible to use modern Python
> packaging without falling afoul of "python setup.py install" being
> deprecated. Setup.py itself is not deprecated, only some of its associated
> commands are; see [1] for more information about this.
>
> In PyLucene's Makefile, there now is a new MODERN_PACKAGING variable,
> which
> can be set to true so that "python -m build" and "python -m pip install"
> are
> used for building and installing PyLucene.
>
> JCC 3.14 supports Python 3.3 up to Python 3.12.
> PyLucene may also be built with Python 2 but this configuration is no
> longer
> tested.
>
> Please vote to release these artifacts as PyLucene 9.10.0.
> Anyone interested in this release can and should vote !
>
> Thanks !
>
> Andi..
>
> ps: the KEYS file for PyLucene release signing is at:
> https://dist.apache.org/repos/dist/release/lucene/pylucene/KEYS
> https://dist.apache.org/repos/dist/dev/lucene/pylucene/KEYS
>
> pps: here is my +1
>
> [1]
> https://packaging.python.org/en/latest/discussions/setup-py-deprecated/
>


Re: Inlining, virtual calls and BKDPointsTree

2024-03-02 Thread Gautam Worah
 > I am running Amazon Product Search's benchmarks to see if the change is
needle moving for us.

Results were flat to slightly positive (+0.94% redline QPS) on our workload.
Although we do have numeric range queries that would've improved, I suspect
it is flat because our workload is heavily dominated by TermQueries and
their combinations with various clauses.

I'll try tweaking the query set to target queries with more Point hits
during the week and see what comes out..

Regards,
Gautam Worah.


On Sat, Mar 2, 2024 at 2:33 AM Anton Hägerstrand  wrote:

> Thank you Gautam!
>
> > Yeah, it seems like luceneutil is not stressing the code path that
> ElasticSearch's benchmarks are?
>
> Yes, as far as I understand it - though it might just be that I don't
> understand luceneutil well enough. I believe that in order to see the
> performance diff, numerical range queries or numerical sorting would have to
> be involved - the more documents matched, the larger the difference. This is
> what the relevant benchmark operations from Elastic do.
>
> > So it seems like switching over from an iterative visit(int docID) call
> to a bulk visit(DocIdSetIterator iterator) gave us these gains? Cool!
>
> Yes, it seems like it, based on this benchmark.
>
> > I am running Amazon Product Search's benchmarks to see if the change is
> needle moving for us.
>
> Thank you, much appreciated!
>
> > Small suggestion on the blog...
>
> Thank you for the feedback! The post is definitely a bit confusing, I
> struggled with keeping it clear. I will try to make some edits to make it
> clearer what conclusions can be made after each section.
>
> /Anton
>
> On Sat, 2 Mar 2024 at 00:30, Gautam Worah  wrote:
>
>> Hi Anton,
>>
>> It took me a while to get through the blog post, and I suspect I will
>> need to read through a couple more times to understand it fully.
>> Thanks for writing up something so detailed. I learnt a lot! (especially
>> about JVM inlining methods).
>>
>> > I have not been able to reproduce the speedup with luceneutil - I
>> suspect that the default tasks in it would not trigger this code path that
>> much.
>>
>> Yeah, it seems like luceneutil is not stressing the code path that
>> ElasticSearch's benchmarks are?
>>
>> > I tried changing the DocIdsWriter::readInts32 (and readDelta16),
>> instead calling the IntersectVisitor with a DocIdSetIterator, to reduce the
>> number of virtual calls. In the benchmark setup by Elastic [2] I saw a
>> decrease of execution time of 35-45% for range queries and numerical
>> sorting with this patch applied.
>>
>> So it seems like switching over from an iterative visit(int docID) call
>> to a bulk visit(DocIdSetIterator iterator) gave us these gains? Cool!
>>
>> I am running Amazon Product Search's benchmarks to see if the change is
>> needle moving for us.
>>
>> Small suggestion on the blog: The JVM inlining, ElasticSearch
>> short-circuiting/opto causing a difference in performance could've been a
>> blog on its own, part 1 maybe.. I got confused when the blog shifted from
>> the performance differences between ElasticSearch and OpenSearch, to how
>> you ended up improving Lucene.
>>
>> Regards,
>> Gautam Worah.
>>
>>
>> On Fri, Mar 1, 2024 at 2:42 AM Anton Hägerstrand 
>> wrote:
>>
>>> Hi everyone, long time lurker here.
>>>
>>> I recently investigated Elasticsearch/OpenSearch performance in a blog
>>> post [1], and saw some interesting behavior of numerical range queries and
>>> numerical sorting with regards to inlining and virtual calls.
>>>
>>> In short, the DocIdsWriter::readInts method seems to get much slower if
>>> it is called with 3 or more implementations of IntersectVisitor during the
>>> JVM lifetime. I believe that this is due to IntersectVisitor.visit(docid)
>>> being heavily inlined with 2 or fewer IntersectVisitor implementations,
>>> while becoming a virtual call with 3 or more.
>>>
>>> This leads to two interesting points wrt Lucene
>>>
>>> 1) For benchmarks, warm ups should not only be done to trigger speedups
>>> by the JIT, instead making the JVM be in a realistic production state. For
>>> the BKDPointTree, this means at least 3 implementations of the
>>> IntersectVisitor. I'm not sure if this is top of mind when writing Lucene
>>> benchmarks?
>>> 2) I tried changing the DocIdsWriter::readInts32 (and readDelta16),
>>> instead calling the IntersectVisitor with a DocIdSetIterator, to reduce the
>>> number of virtual calls. In the benchmark setup by Elastic [2] I saw a
>>> decrease of execution time of 35-45% for range queries and numerical
>>> sorting with this patch applied. PR:
>>> https://github.com/apache/lucene/pull/13149
>>>
>>> I have not been able to reproduce the speedup with luceneutil - I suspect
>>> that the default tasks in it would not trigger this code path that much.
>>>
>>> If you want to understand more of my line of thinking, consider skimming
>>> through the blog post [1]
>>>
>>> [1] https://blunders.io/posts/es-benchmark-4-inlining
>>> [2] 

Re: Inlining, virtual calls and BKDPointsTree

2024-03-02 Thread Anton Hägerstrand
Thank you Gautam!

> Yeah, it seems like luceneutil is not stressing the code path that
ElasticSearch's benchmarks are?

Yes, as far as I understand it - though it might just be that I don't
understand luceneutil well enough. I believe that in order to see the
performance diff, numerical range queries or numerical sorting would have to
be involved - the more documents matched, the larger the difference. This is
what the relevant benchmark operations from Elastic do.

> So it seems like switching over from an iterative visit(int docID) call
to a bulk visit(DocIdSetIterator iterator) gave us these gains? Cool!

Yes, it seems like it, based on this benchmark.

> I am running Amazon Product Search's benchmarks to see if the change is
needle moving for us.

Thank you, much appreciated!

> Small suggestion on the blog...

Thank you for the feedback! The post is definitely a bit confusing, I
struggled with keeping it clear. I will try to make some edits to make it
clearer what conclusions can be made after each section.

/Anton

On Sat, 2 Mar 2024 at 00:30, Gautam Worah  wrote:

> Hi Anton,
>
> It took me a while to get through the blog post, and I suspect I will need
> to read through a couple more times to understand it fully.
> Thanks for writing up something so detailed. I learnt a lot! (especially
> about JVM inlining methods).
>
> > I have not been able to reproduce the speedup with luceneutil - I suspect
> that the default tasks in it would not trigger this code path that much.
>
> Yeah, it seems like luceneutil is not stressing the code path that
> ElasticSearch's benchmarks are?
>
> > I tried changing the DocIdsWriter::readInts32 (and readDelta16), instead
> calling the IntersectVisitor with a DocIdSetIterator, to reduce the number
> of virtual calls. In the benchmark setup by Elastic [2] I saw a decrease of
> execution time of 35-45% for range queries and numerical sorting with this
> patch applied.
>
> So it seems like switching over from an iterative visit(int docID) call
> to a bulk visit(DocIdSetIterator iterator) gave us these gains? Cool!
>
> I am running Amazon Product Search's benchmarks to see if the change is
> needle moving for us.
>
> Small suggestion on the blog: The JVM inlining, ElasticSearch
> short-circuiting/opto causing a difference in performance could've been a
> blog on its own, part 1 maybe.. I got confused when the blog shifted from
> the performance differences between ElasticSearch and OpenSearch, to how
> you ended up improving Lucene.
>
> Regards,
> Gautam Worah.
>
>
> On Fri, Mar 1, 2024 at 2:42 AM Anton Hägerstrand 
> wrote:
>
>> Hi everyone, long time lurker here.
>>
>> I recently investigated Elasticsearch/OpenSearch performance in a blog
>> post [1], and saw some interesting behavior of numerical range queries and
>> numerical sorting with regards to inlining and virtual calls.
>>
>> In short, the DocIdsWriter::readInts method seems to get much slower if
>> it is called with 3 or more implementations of IntersectVisitor during the
>> JVM lifetime. I believe that this is due to IntersectVisitor.visit(docid)
>> being heavily inlined with 2 or fewer IntersectVisitor implementations,
>> while becoming a virtual call with 3 or more.
>>
>> This leads to two interesting points wrt Lucene
>>
>> 1) For benchmarks, warm ups should not only be done to trigger speedups
>> by the JIT, instead making the JVM be in a realistic production state. For
>> the BKDPointTree, this means at least 3 implementations of the
>> IntersectVisitor. I'm not sure if this is top of mind when writing Lucene
>> benchmarks?
>> 2) I tried changing the DocIdsWriter::readInts32 (and readDelta16),
>> instead calling the IntersectVisitor with a DocIdSetIterator, to reduce the
>> number of virtual calls. In the benchmark setup by Elastic [2] I saw a
>> decrease of execution time of 35-45% for range queries and numerical
>> sorting with this patch applied. PR:
>> https://github.com/apache/lucene/pull/13149
>>
>> I have not been able to reproduce the speedup with luceneutil - I suspect
>> that the default tasks in it would not trigger this code path that much.
>>
>> If you want to understand more of my line of thinking, consider skimming
>> through the blog post [1]
>>
>> [1] https://blunders.io/posts/es-benchmark-4-inlining
>> [2] https://github.com/elastic/elasticsearch-opensearch-benchmark
>>
>> best regards,
>> Anton H
>>
>


Re: Inlining, virtual calls and BKDPointsTree

2024-03-01 Thread Gautam Worah
Hi Anton,

It took me a while to get through the blog post, and I suspect I will need
to read through a couple more times to understand it fully.
Thanks for writing up something so detailed. I learnt a lot! (especially
about JVM inlining methods).

> I have not been able to reproduce the speedup with luceneutil - I suspect
that the default tasks in it would not trigger this code path that much.

Yeah, it seems like luceneutil is not stressing the code path that
ElasticSearch's benchmarks are?

> I tried changing the DocIdsWriter::readInts32 (and readDelta16), instead
calling the IntersectVisitor with a DocIdSetIterator, to reduce the number
of virtual calls. In the benchmark setup by Elastic [2] I saw a decrease of
execution time of 35-45% for range queries and numerical sorting with this
patch applied.

So it seems like switching over from an iterative visit(int docID) call to
a bulk visit(DocIdSetIterator iterator) gave us these gains? Cool!

I am running Amazon Product Search's benchmarks to see if the change is
needle moving for us.

Small suggestion on the blog: The JVM inlining, ElasticSearch
short-circuiting/opto causing a difference in performance could've been a
blog on its own, part 1 maybe.. I got confused when the blog shifted from
the performance differences between ElasticSearch and OpenSearch, to how
you ended up improving Lucene.

Regards,
Gautam Worah.


On Fri, Mar 1, 2024 at 2:42 AM Anton Hägerstrand  wrote:

> Hi everyone, long time lurker here.
>
> I recently investigated Elasticsearch/OpenSearch performance in a blog
> post [1], and saw some interesting behavior of numerical range queries and
> numerical sorting with regards to inlining and virtual calls.
>
> In short, the DocIdsWriter::readInts method seems to get much slower if it
> is called with 3 or more implementations of IntersectVisitor during the JVM
> lifetime. I believe that this is due to IntersectVisitor.visit(docid)
> being heavily inlined with 2 or fewer IntersectVisitor implementations,
> while becoming a virtual call with 3 or more.
>
> This leads to two interesting points wrt Lucene
>
> 1) For benchmarks, warm ups should not only be done to trigger speedups by
> the JIT, instead making the JVM be in a realistic production state. For the
> BKDPointTree, this means at least 3 implementations of the
> IntersectVisitor. I'm not sure if this is top of mind when writing Lucene
> benchmarks?
> 2) I tried changing the DocIdsWriter::readInts32 (and readDelta16),
> instead calling the IntersectVisitor with a DocIdSetIterator, to reduce the
> number of virtual calls. In the benchmark setup by Elastic [2] I saw a
> decrease of execution time of 35-45% for range queries and numerical
> sorting with this patch applied. PR:
> https://github.com/apache/lucene/pull/13149
>
> I have not been able to reproduce the speedup with luceneutil - I suspect
> that the default tasks in it would not trigger this code path that much.
>
> If you want to understand more of my line of thinking, consider skimming
> through the blog post [1]
>
> [1] https://blunders.io/posts/es-benchmark-4-inlining
> [2] https://github.com/elastic/elasticsearch-opensearch-benchmark
>
> best regards,
> Anton H
>


Re: [VOTE] Release PyLucene 9.10.0-rc1

2024-03-01 Thread Greg Kuperberg
Hello folks,

I agree with everyone else that PyLucene is still useful, and I am glad to
see that it is still supported and that people are voting on the new
release.

That said, unfortunately, I never found time to update my own project that
would use these newer versions of PyLucene.  I tried to unsubscribe from this
mailing list, but somehow it didn't work.   (Maybe I used the wrong syntax
in the e-mail to the listserv?)

-- 
0 Greg Kuperberg
01234 Distinguished Professor of Mathematics and Computer Science
02413 University of California, Davis
03142 http://www.math.ucdavis.edu/~greg/
04321 g...@math.ucdavis.edu


Inlining, virtual calls and BKDPointsTree

2024-03-01 Thread Anton Hägerstrand
Hi everyone, long time lurker here.

I recently investigated Elasticsearch/OpenSearch performance in a blog post
[1], and saw some interesting behavior of numerical range queries and
numerical sorting with regards to inlining and virtual calls.

In short, the DocIdsWriter::readInts method seems to get much slower if it
is called with 3 or more implementations of IntersectVisitor during the JVM
lifetime. I believe that this is due to IntersectVisitor.visit(docid)
being heavily inlined with 2 or fewer IntersectVisitor implementations,
while becoming a virtual call with 3 or more.

This leads to two interesting points wrt Lucene

1) For benchmarks, warm ups should not only be done to trigger speedups by
the JIT, instead making the JVM be in a realistic production state. For the
BKDPointTree, this means at least 3 implementations of the
IntersectVisitor. I'm not sure if this is top of mind when writing Lucene
benchmarks?
2) I tried changing the DocIdsWriter::readInts32 (and readDelta16), instead
calling the IntersectVisitor with a DocIdSetIterator, to reduce the number
of virtual calls. In the benchmark setup by Elastic [2] I saw a decrease of
execution time of 35-45% for range queries and numerical sorting with this
patch applied. PR: https://github.com/apache/lucene/pull/13149
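
To make the difference in call shape concrete, here is a toy Python sketch. It is not Lucene code and says nothing about JIT inlining itself: `CountingVisitor`, `read_ints_per_doc` and `read_ints_bulk` are hypothetical stand-ins for an IntersectVisitor and the two read paths. It only shows that the per-doc path dispatches to the visitor once per doc ID, while the bulk path dispatches once per block and loops inside the visitor.

```python
from typing import Iterable, Iterator, List


class CountingVisitor:
    """Hypothetical stand-in for an IntersectVisitor; counts dispatches."""

    def __init__(self) -> None:
        self.calls = 0              # how many times the reader called into us
        self.docs: List[int] = []   # doc IDs collected so far

    def visit(self, doc_id: int) -> None:
        # Per-document entry point: one dynamic dispatch per doc ID.
        self.calls += 1
        self.docs.append(doc_id)

    def visit_bulk(self, doc_ids: Iterator[int]) -> None:
        # Bulk entry point: one dispatch, then a loop inside the visitor.
        self.calls += 1
        self.docs.extend(doc_ids)


def read_ints_per_doc(block: Iterable[int], visitor: CountingVisitor) -> None:
    # Old shape: the reader loops and calls the visitor for every doc ID.
    for doc_id in block:
        visitor.visit(doc_id)


def read_ints_bulk(block: Iterable[int], visitor: CountingVisitor) -> None:
    # New shape: the reader hands the whole block over as an iterator.
    visitor.visit_bulk(iter(block))


block = [3, 7, 8, 15, 42]
per_doc, bulk = CountingVisitor(), CountingVisitor()
read_ints_per_doc(block, per_doc)
read_ints_bulk(block, bulk)
print(per_doc.calls, bulk.calls)  # 5 1
```

Both visitors end up with the same doc IDs; only the number of calls across the reader/visitor boundary differs, which is the part that matters once the call site can no longer be inlined.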

I have not been able to reproduce the speedup with luceneutil - I suspect
that the default tasks in it would not trigger this code path that much.

If you want to understand more of my line of thinking, consider skimming
through the blog post [1]

[1] https://blunders.io/posts/es-benchmark-4-inlining
[2] https://github.com/elastic/elasticsearch-opensearch-benchmark

best regards,
Anton H


Re: [VOTE] Release PyLucene 9.10.0-rc1

2024-03-01 Thread Bart Moelans
+1


From: Dawid Weiss 
Date: Thursday, 29 February 2024 at 20:31
To: pylucene-dev@lucene.apache.org 
Cc: priv...@lucene.apache.org 
Subject: Re: [VOTE] Release PyLucene 9.10.0-rc1


+1.

On Wed, Feb 21, 2024 at 10:50 PM Andi Vajda  wrote:

>
> The PyLucene 9.10.0 (rc1) release tracking the recent release of
> Apache Lucene 9.10.0 is ready.
>
> A release candidate is available from:
> 
> https://dist.apache.org/repos/dist/dev/lucene/pylucene/9.10.0-rc1/
>
> PyLucene 9.10.0 is built with JCC 3.14, included in these release
> artifacts.
>
> Apart from the catch-up to Lucene 9.10.0, the other major new feature in
> this release candidate is that JCC can now generate a setup.py file
> instead
> of calling Setup() directly. This makes it possible to use modern Python
> packaging without falling afoul of "python setup.py install" being
> deprecated. Setup.py itself is not deprecated, only some of its associated
> commands are; see [1] for more information about this.
>
> In PyLucene's Makefile, there now is a new MODERN_PACKAGING variable,
> which
> can be set to true so that "python -m build" and "python -m pip install"
> are
> used for building and installing PyLucene.
>
> JCC 3.14 supports Python 3.3 up to Python 3.12.
> PyLucene may also be built with Python 2 but this configuration is no
> longer
> tested.
>
> Please vote to release these artifacts as PyLucene 9.10.0.
> Anyone interested in this release can and should vote !
>
> Thanks !
>
> Andi..
>
> ps: the KEYS file for PyLucene release signing is at:
> https://dist.apache.org/repos/dist/release/lucene/pylucene/KEYS
> https://dist.apache.org/repos/dist/dev/lucene/pylucene/KEYS
>
> pps: here is my +1
>
> [1]
> https://packaging.python.org/en/latest/discussions/setup-py-deprecated/
>


Re: [VOTE] Release PyLucene 9.10.0-rc1

2024-02-29 Thread Dawid Weiss
+1.

On Wed, Feb 21, 2024 at 10:50 PM Andi Vajda  wrote:

>
> The PyLucene 9.10.0 (rc1) release tracking the recent release of
> Apache Lucene 9.10.0 is ready.
>
> A release candidate is available from:
> https://dist.apache.org/repos/dist/dev/lucene/pylucene/9.10.0-rc1/
>
> PyLucene 9.10.0 is built with JCC 3.14, included in these release
> artifacts.
>
> Apart from the catch-up to Lucene 9.10.0, the other major new feature in
> this release candidate is that JCC can now generate a setup.py file
> instead
> of calling Setup() directly. This makes it possible to use modern Python
> packaging without falling afoul of "python setup.py install" being
> deprecated. Setup.py itself is not deprecated, only some of its associated
> commands are; see [1] for more information about this.
>
> In PyLucene's Makefile, there now is a new MODERN_PACKAGING variable,
> which
> can be set to true so that "python -m build" and "python -m pip install"
> are
> used for building and installing PyLucene.
>
> JCC 3.14 supports Python 3.3 up to Python 3.12.
> PyLucene may also be built with Python 2 but this configuration is no
> longer
> tested.
>
> Please vote to release these artifacts as PyLucene 9.10.0.
> Anyone interested in this release can and should vote !
>
> Thanks !
>
> Andi..
>
> ps: the KEYS file for PyLucene release signing is at:
> https://dist.apache.org/repos/dist/release/lucene/pylucene/KEYS
> https://dist.apache.org/repos/dist/dev/lucene/pylucene/KEYS
>
> pps: here is my +1
>
> [1]
> https://packaging.python.org/en/latest/discussions/setup-py-deprecated/
>


Re: [VOTE] Release PyLucene 9.10.0-rc1

2024-02-29 Thread Nelia Vb
+1

On Thu, 29 Feb 2024, 19:36 Laurent Jakubina, 
wrote:

> +1
>
> Le jeu. 29 févr. 2024 à 01:21, Jeff Breidenbach  a
> écrit :
>
> > +1
> >
> > On Wed, Feb 21, 2024 at 1:51 PM Andi Vajda  wrote:
> >
> > >
> > > The PyLucene 9.10.0 (rc1) release tracking the recent release of
> > > Apache Lucene 9.10.0 is ready.
> > >
> > > A release candidate is available from:
> > > https://dist.apache.org/repos/dist/dev/lucene/pylucene/9.10.0-rc1/
> > >
> > > PyLucene 9.10.0 is built with JCC 3.14, included in these release
> > > artifacts.
> > >
> > > Apart from the catch-up to Lucene 9.10.0, the other major new feature
> in
> > > this release candidate is that JCC can now generate a setup.py file
> > > instead
> > > of calling Setup() directly. This makes it possible to use modern
> Python
> > > packaging without falling afoul of "python setup.py install" being
> > > deprecated. Setup.py itself is not deprecated, only some of its
> > associated
> > > commands are; see [1] for more information about this.
> > >
> > > In PyLucene's Makefile, there now is a new MODERN_PACKAGING variable,
> > > which
> > > can be set to true so that "python -m build" and "python -m pip
> install"
> > > are
> > > used for building and installing PyLucene.
> > >
> > > JCC 3.14 supports Python 3.3 up to Python 3.12.
> > > PyLucene may also be built with Python 2 but this configuration is no
> > > longer
> > > tested.
> > >
> > > Please vote to release these artifacts as PyLucene 9.10.0.
> > > Anyone interested in this release can and should vote !
> > >
> > > Thanks !
> > >
> > > Andi..
> > >
> > > ps: the KEYS file for PyLucene release signing is at:
> > > https://dist.apache.org/repos/dist/release/lucene/pylucene/KEYS
> > > https://dist.apache.org/repos/dist/dev/lucene/pylucene/KEYS
> > >
> > > pps: here is my +1
> > >
> > > [1]
> > >
> https://packaging.python.org/en/latest/discussions/setup-py-deprecated/
> > >
> >
>


Re: [VOTE] Release PyLucene 9.10.0-rc1

2024-02-29 Thread Laurent Jakubina
+1

On Thu, 29 Feb 2024 at 01:21, Jeff Breidenbach wrote:

> +1
>
> On Wed, Feb 21, 2024 at 1:51 PM Andi Vajda  wrote:
>
> >
> > The PyLucene 9.10.0 (rc1) release tracking the recent release of
> > Apache Lucene 9.10.0 is ready.
> >
> > A release candidate is available from:
> > https://dist.apache.org/repos/dist/dev/lucene/pylucene/9.10.0-rc1/
> >
> > PyLucene 9.10.0 is built with JCC 3.14, included in these release
> > artifacts.
> >
> > Apart from the catch-up to Lucene 9.10.0, the other major new feature in
> > this release candidate is that JCC can now generate a setup.py file
> > instead
> > of calling Setup() directly. This makes it possible to use modern Python
> > packaging without falling afoul of "python setup.py install" being
> > deprecated. Setup.py itself is not deprecated, only some of its
> associated
> > commands are; see [1] for more information about this.
> >
> > In PyLucene's Makefile, there now is a new MODERN_PACKAGING variable,
> > which
> > can be set to true so that "python -m build" and "python -m pip install"
> > are
> > used for building and installing PyLucene.
> >
> > JCC 3.14 supports Python 3.3 up to Python 3.12.
> > PyLucene may also be built with Python 2 but this configuration is no
> > longer
> > tested.
> >
> > Please vote to release these artifacts as PyLucene 9.10.0.
> > Anyone interested in this release can and should vote !
> >
> > Thanks !
> >
> > Andi..
> >
> > ps: the KEYS file for PyLucene release signing is at:
> > https://dist.apache.org/repos/dist/release/lucene/pylucene/KEYS
> > https://dist.apache.org/repos/dist/dev/lucene/pylucene/KEYS
> >
> > pps: here is my +1
> >
> > [1]
> > https://packaging.python.org/en/latest/discussions/setup-py-deprecated/
> >
>


Re: [Vote] Bump the Lucene main branch to Java 21

2024-02-29 Thread Chris Hegarty
Hi, 

> On 29 Feb 2024, at 11:38, Uwe Schindler  wrote:
> 
> Hi,
> 
> this vote has passed.

I was about to send a note about this, but you beat me to it! ;-)  The 
substantive point is that the vote passed - Awesome!

> 
> I wanted to wait for Chris to merge the PR, but due to heavy working on main 
> removing ByteBufferIndexInput and updating Java versions, I accidentally 
> pushed the wrong branch to main, so it is already merged. The PR was closed 
> manually.
> 
> Lucene "main" (10.0) is now on Java 21.
> 
> Sorry, Chris - my fault!

Apology not needed. Thank you, and the other contributors on that PR, so much 
for getting this done. I’m super happy with the outcome.

-Chris.

> Uwe
> 
> On 23.02.2024 at 12:24, Chris Hegarty wrote:
>> Hi,
>> 
>> Since the discussion on bumping the Lucene main branch to Java 21 is winding 
>> down, let's hold a vote on this important change.
>> 
>> Once bumped, the next major release of Lucene (whenever that will be) will 
>> require a version of Java greater than or equal to Java 21.
>> 
>> The vote will be open for at least 72 hours (and allow some additional time 
>> for the weekend) i.e. until 2024-02-28 12:00 UTC.
>> 
>> [ ] +1  approve
>> [ ] +0  no opinion
>> [ ] -1  disapprove (and reason why)
>> 
>> Here is my +1
>> 
>> -Chris.
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>> 
> -- 
> Uwe Schindler
> Achterdiek 19, D-28357 Bremen
> https://www.thetaphi.de
> eMail: u...@thetaphi.de
> 
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
> 


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [Vote] Bump the Lucene main branch to Java 21

2024-02-29 Thread Uwe Schindler

Hi,

this vote has passed.

I wanted to wait for Chris to merge the PR, but due to heavy working on 
main removing ByteBufferIndexInput and updating Java versions, I 
accidentally pushed the wrong branch to main, so it is already merged. 
The PR was closed manually.


Lucene "main" (10.0) is now on Java 21.

Sorry, Chris - my fault!

Uwe

On 23.02.2024 at 12:24, Chris Hegarty wrote:

Hi,

Since the discussion on bumping the Lucene main branch to Java 21 is winding 
down, let's hold a vote on this important change.

Once bumped, the next major release of Lucene (whenever that will be) will 
require a version of Java greater than or equal to Java 21.

The vote will be open for at least 72 hours (and allow some additional time for 
the weekend) i.e. until 2024-02-28 12:00 UTC.

[ ] +1  approve
[ ] +0  no opinion
[ ] -1  disapprove (and reason why)

Here is my +1

-Chris.
-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org


--
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de
eMail: u...@thetaphi.de


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: The future of the PyLucene project

2024-02-28 Thread Bart Moelans
Dear Andi

I probably missed the mail to vote, I apologize for that. At Antwerp University 
we still use PyLucene for several purposes on production services. So please 
continue the good work.

Best regards

Bart

dr. Bart Moelans  
(he/him/his)
Technologisch Manager

Bibliotheek UAntwerpen
Stadscampus - Lokaal Ve35.304
Venusstraat 35 - 2000 Antwerpen - België
bart.moel...@uantwerpen.be
T +32 486 78 01 85


From: Andi Vajda 
Date: Wednesday, 28 February 2024 at 20:48
To: pylucene-dev@lucene.apache.org , 
priv...@lucene.apache.org 
Subject: The future of the PyLucene project


  Hi PyLucene users and Lucene PMC,

A week ago, on Wednesday February 21st, I started a voting thread for
qualifying a new PyLucene release candidate to catch-up with the recent
Lucene 9.10.0 release and fix a bug in JCC.

Usually these voting threads get a couple of +1 for PyLucene users before
getting votes from a couple of people on the Lucene PMC, always the same
ones ;-) Three PMC +1 votes -> a release can happen.

This time, crickets, the voting thread has been completely quiet.

If there are no PyLucene users anymore, maybe it's time to shut the project
down ? Personally, I think that the "software value" in the project is all
in JCC. PyLucene itself is 99% machine generated by JCC around Java Lucene.

Of course, having Java Lucene available that way from Python is pretty cool
so I don't want to diminish PyLucene's "usage value", but from a software
engineering standpoint, the itch, if you prefer, all the cool stuff is done
in JCC.

If the Lucene PMC agrees and no PyLucene users come forward, I propose the
following:
   - shutdown the PyLucene project
   - fork JCC to my gitlab (https://gitlab.pyicu.org/main) where it can
 get the occasional fix or improvement before being released to PyPI.
 JCC has been distributed from PyPI forever,
   https://pypi.org/project/JCC/#history
 so JCC users shouldn't even notice this...

What do you all think ?
This message is not a vote, I'm just trying to gauge interest in PyLucene
and JCC.

Andi..

ps: for those who have never heard of PyLucene, it is a sub-project of
 Apache Lucene hosted here:
   https://lucene.apache.org/pylucene/index.html
pps: for those who have never heard of JCC, it is a sub-project of PyLucene
   hosted here: https://lucene.apache.org/pylucene/jcc/index.html


Re: The future of the PyLucene project

2024-02-28 Thread Jeff Breidenbach
My excuse is I'm increasingly bad at reading email. Still using. Still
encouraging.


On Wed, Feb 28, 2024 at 9:32 PM Aric Coady  wrote:

> On Feb 28, 2024, at 2:29 PM, Andi Vajda  wrote:
> > Of course anyone can vote !
> > Anyone interested in this project can and should vote !
> > If no one does, how do we know anyone cares ?
>
> +0.5. I’m still maintaining a docker image (coady/pylucene:rc), a homebrew
> formula, and a dependent project (lupyne). But the state of that project is
> much the same - I don’t know how much interest there still is in it.
>
> I feel like Lucene should have python bindings in principle, but I don’t
> personally have a use case anymore. Thanks for your work on this, whatever
> you decide.
>
>


Re: [VOTE] Release PyLucene 9.10.0-rc1

2024-02-28 Thread Jeff Breidenbach
+1

On Wed, Feb 21, 2024 at 1:51 PM Andi Vajda  wrote:

>
> The PyLucene 9.10.0 (rc1) release tracking the recent release of
> Apache Lucene 9.10.0 is ready.
>
> A release candidate is available from:
> https://dist.apache.org/repos/dist/dev/lucene/pylucene/9.10.0-rc1/
>
> PyLucene 9.10.0 is built with JCC 3.14, included in these release
> artifacts.
>
> Apart from the catch-up to Lucene 9.10.0, the other major new feature in
> this release candidate is that JCC can now generate a setup.py file
> instead
> of calling Setup() directly. This makes it possible to use modern Python
> packaging without falling afoul of "python setup.py install" being
> deprecated. Setup.py itself is not deprecated, only some of its associated
> commands are; see [1] for more information about this.
>
> In PyLucene's Makefile, there now is a new MODERN_PACKAGING variable,
> which
> can be set to true so that "python -m build" and "python -m pip install"
> are
> used for building and installing PyLucene.
>
> JCC 3.14 supports Python 3.3 up to Python 3.12.
> PyLucene may also be built with Python 2 but this configuration is no
> longer
> tested.
>
> Please vote to release these artifacts as PyLucene 9.10.0.
> Anyone interested in this release can and should vote !
>
> Thanks !
>
> Andi..
>
> ps: the KEYS file for PyLucene release signing is at:
> https://dist.apache.org/repos/dist/release/lucene/pylucene/KEYS
> https://dist.apache.org/repos/dist/dev/lucene/pylucene/KEYS
>
> pps: here is my +1
>
> [1]
> https://packaging.python.org/en/latest/discussions/setup-py-deprecated/
>


Re: The future of the PyLucene project

2024-02-28 Thread Aric Coady
On Feb 28, 2024, at 2:29 PM, Andi Vajda  wrote:
> Of course anyone can vote !
> Anyone interested in this project can and should vote !
> If no one does, how do we know anyone cares ?

+0.5. I’m still maintaining a docker image (coady/pylucene:rc), a homebrew 
formula, and a dependent project (lupyne). But the state of that project is 
much the same - I don’t know how much interest there still is in it.

I feel like Lucene should have python bindings in principle, but I don’t 
personally have a use case anymore. Thanks for your work on this, whatever you 
decide.



Re: The future of the PyLucene project

2024-02-28 Thread Andi Vajda



On Wed, 28 Feb 2024, Erik Groeneveld LPV wrote:

I always followed new releases and checked the change log for both 
PyLucene and Lucene. I never felt entitled to vote however.


This seems to be a common misconception.
Everyone can vote on a release, everyone is entitled to.
It's just an Apache Rule that 3 PMC +1 votes are required to make a release.

Look at it this way: if a non-PMC user votes +1, it's a sign of interest.
If such a user votes -1, it's even more: a sign of participation and a 
non-binding veto to the release.
Sure, according to the rules, one can ignore it and go ahead with the 
release but what does that say to your user community ?
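
The rule described here can be sketched as a small tally. This is an illustration only: the `Vote` type, the function names and the voters in the example are invented, and the code encodes just what this message states (three PMC +1 votes are required, and votes from non-PMC users are non-binding). The full Apache voting policy may impose further conditions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    voter: str
    value: int      # +1, 0 or -1
    is_pmc: bool    # True if the voter is on the PMC (binding vote)


def binding_plus_ones(votes):
    """Count +1 votes cast by PMC members."""
    return sum(1 for v in votes if v.is_pmc and v.value == 1)


def release_can_happen(votes, required=3):
    """Apply only the rule quoted in the thread: 3 PMC +1 votes."""
    return binding_plus_ones(votes) >= required


def community_signal(votes):
    """Non-binding votes do not decide the release; they show interest."""
    return [(v.voter, v.value) for v in votes if not v.is_pmc]


if __name__ == "__main__":
    # Hypothetical voters, not taken from the actual thread.
    votes = [
        Vote("pmc-a", 1, True),
        Vote("pmc-b", 1, True),
        Vote("user-x", 1, False),
        Vote("user-y", -1, False),
    ]
    print(release_can_happen(votes))
    print(community_signal(votes))
```

With two PMC +1 votes the example does not yet reach the threshold, however many users vote; the user votes still show up as a separate signal of interest.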


Of course anyone can vote !
Anyone interested in this project can and should vote !
If no one does, how do we know anyone cares ?

Andi..



I can still vote, but I think it would be more appropriate if Thijs does 
that.


Keep up the good work!
Erik

On Wed, Feb 28, 2024 at 21:08, Dawid Weiss <dawid.we...@gmail.com> wrote:


Hi Andi,

This time, crickets, the voting thread has been completely quiet.




For me - and it's not an excuse at all - you hit winter holidays, I'm
really sorry!


If the Lucene PMC agrees and no PyLucene users come forward, I propose the
following:
- shutdown the PyLucene project
- fork JCC to my gitlab (https://gitlab.pyicu.org/main) where it can
get the occasional fix or improvement before being released to PyPI.
JCC has been distributed from PyPI forever,
https://pypi.org/project/JCC/#history
so JCC users shouldn't even notice this...



I think open source is mostly about the community and folks coding together
for fun... And not many of us seem to be
able to help you with PyLucene development - I can't, for that matter,
because my Python is really limited.

Your plan sounds good to me. And you'd get more freedom from procedural
release
requirements at Apache too, which sounds like an added benefit?... :)

I also hope that, regardless of the status of PyLucene and JCC, you remain
with the Lucene project.

Dawid

--
Seecr is a small group of highly experienced full-cycle software engineers. We
specialize in Linux, search and data processing using the latest
techniques. Want to know more? Visit seecr.nl.


Re: The future of the PyLucene project

2024-02-28 Thread Erik Groeneveld LPV
Hi Andi,

Thank you very much for PyLucene!

Seecr uses PyLucene extensively in all kinds of projects, in production systems.

A few weeks ago I sold the company, but I am sure they still use PyLucene and 
will continue doing so. I cc’d the new owner, Thijs.

I always followed new releases and checked the change log for both PyLucene and 
Lucene. I never felt entitled to vote however.

I can still vote, but I think it would be more appropriate if Thijs does that.

Keep up the good work!
Erik

On Wed, Feb 28, 2024 at 21:08, Dawid Weiss <dawid.we...@gmail.com> wrote:

> Hi Andi,
>
> This time, crickets, the voting thread has been completely quiet.
>>
>
> For me - and it's not an excuse at all - you hit winter holidays, I'm
> really sorry!
>
>> If the Lucene PMC agrees and no PyLucene users come forward, I propose the
>> following:
>> - shutdown the PyLucene project
>> - fork JCC to my gitlab (https://gitlab.pyicu.org/main) where it can
>> get the occasional fix or improvement before being released to PyPI.
>> JCC has been distributed from PyPI forever,
>> https://pypi.org/project/JCC/#history
>> so JCC users shouldn't even notice this...
>>
>
> I think open source is mostly about the community and folks coding together
> for fun... And not many of us seem to be
> able to help you with PyLucene development - I can't, for that matter,
> because my Python is really limited.
>
> Your plan sounds good to me. And you'd get more freedom from procedural
> release
> requirements at Apache too, which sounds like an added benefit?... :)
>
> I also hope that, regardless of the status of PyLucene and JCC, you remain
> with the Lucene project.
>
> Dawid
>
> --
> Seecr is a small group of highly experienced full-cycle software engineers. We
> specialize in Linux, search and data processing using the latest
> techniques. Want to know more? Visit seecr.nl.

Re: The future of the PyLucene project

2024-02-28 Thread Dawid Weiss
Hi Andi,

> This time, crickets, the voting thread has been completely quiet.
>

For me - and it's not an excuse at all - you hit winter holidays, I'm
really sorry!


> If the Lucene PMC agrees and no PyLucene users come forward, I propose the
> following:
>- shutdown the PyLucene project
>- fork JCC to my gitlab (https://gitlab.pyicu.org/main) where it can
>  get the occasional fix or improvement before being released to PyPI.
>  JCC has been distributed from PyPI forever,
>https://pypi.org/project/JCC/#history
>  so JCC users shouldn't even notice this...
>

I think open source is mostly about the community and folks coding together
for fun... And not many of us seem to be
able to help you with PyLucene development - I can't, for that matter,
because my Python is really limited.

Your plan sounds good to me. And you'd get more freedom from procedural
release
requirements at Apache too, which sounds like an added benefit?... :)

I also hope that, regardless of the status of PyLucene and JCC, you remain
with the Lucene project.

Dawid


The future of the PyLucene project

2024-02-28 Thread Andi Vajda



 Hi PyLucene users and Lucene PMC,

A week ago, on Wednesday February 21st, I started a voting thread for 
qualifying a new PyLucene release candidate to catch-up with the recent 
Lucene 9.10.0 release and fix a bug in JCC.


Usually these voting threads get a couple of +1 for PyLucene users before 
getting votes from a couple of people on the Lucene PMC, always the same 
ones ;-) Three PMC +1 votes -> a release can happen.


This time, crickets, the voting thread has been completely quiet.

If there are no PyLucene users anymore, maybe it's time to shut the project 
down ? Personally, I think that the "software value" in the project is all 
in JCC. PyLucene itself is 99% machine generated by JCC around Java Lucene.


Of course, having Java Lucene available that way from Python is pretty cool 
so I don't want to diminish PyLucene's "usage value", but from a software 
engineering standpoint, the itch, if you prefer, all the cool stuff is done 
in JCC.


If the Lucene PMC agrees and no PyLucene users come forward, I propose the 
following:

  - shutdown the PyLucene project
  - fork JCC to my gitlab (https://gitlab.pyicu.org/main) where it can
get the occasional fix or improvement before being released to PyPI.
JCC has been distributed from PyPI forever,
  https://pypi.org/project/JCC/#history
so JCC users shouldn't even notice this...

What do you all think ?
This message is not a vote, I'm just trying to gauge interest in PyLucene 
and JCC.


Andi..

ps: for those who have never heard of PyLucene, it is a sub-project of
Apache Lucene hosted here:
  https://lucene.apache.org/pylucene/index.html
pps: for those who have never heard of JCC, it is a sub-project of PyLucene
  hosted here: https://lucene.apache.org/pylucene/jcc/index.html


Re: Announcing githubsearch!

2024-02-27 Thread Michael Sokolov
No, I think you only get one version. Maybe we can try adding the green
background, or rather making it gray and keeping the transparent
background?

On Mon, Feb 26, 2024, 2:53 PM Michael McCandless 
wrote:

> Done!  Deployed!  Thank you Mike S.
>
> Though on my "dark mode" Chrome on a Macbook, it's super dark.  I can make
> it out but I gotta stare for a bit ... do they make light and dark mode
> .ico files in one!?
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Sun, Feb 25, 2024 at 6:05 PM Michael Sokolov 
> wrote:
>
>> here is a favicon you might want to try: I cropped the "VL" from the
>> Apache Lucene logo (ok I guess it's an AL) -- if you save it as
>> favicon.ico in the root of your website (ie as url /favicon.ico) it
>> should show up in bookmarks, browser toolbars, etc as a handy memory
>> aid. Of course you might have other ideas for a picture - it's
>> actually pretty easy to make the favicon once you have a picture you
>> like; I followed the instructions here
>>
>> https://www.logikfabrik.se/blog/how-to-create-a-multisize-favicon-using-gimp/
>>
>> On Thu, Feb 22, 2024 at 10:48 AM Zhang Chao <80152...@qq.com.invalid>
>> wrote:
>> >
>> > Great job! Thanks Mike!
>> >
>> > On 22 Feb 2024 at 22:31, Alessandro Benedetti wrote:
>> >
>> > That's cool Mike! Well done!
>> >
>> > On Wed, 21 Feb 2024, 22:02 Anshum Gupta, 
>> wrote:
>> >>
>> >> This is great! Like always, thank you Mike!
>> >>
>> >> On Mon, Feb 19, 2024 at 8:40 AM Michael McCandless <
>> luc...@mikemccandless.com> wrote:
>> >>>
>> >>> Hi Team,
>> >>>
>> >>> ~1.5 years ago (August 2022) we migrated our Lucene issue tracking
>> from Jira to GitHub. Thank you Tomoko for all the hard work doing such a
>> complex, multi-phased, high-fidelity migration!
>> >>>
>> >>> I finally finished also migrating jirasearch to GitHub:
>> githubsearch.mikemccandless.com. It was tricky because GitHub issues/PRs
>> are fundamentally more complex than Jira's data model, and the GitHub REST
>> API is also quite rich / heavily normalized. All of the source code for
>> githubsearch lives here. The UI remains its barebones self ;)
>> >>>
>> >>> Githubsearch is dog food for us: it showcases Lucene (currently
>> 9.8.0), and many of its fun features like infix autosuggest, block join
>> queries (each comment is a sub-document on the issue/PR), DrillSideways
>> faceting, near-real-time indexing/searching, synonyms (try “oome”),
>> expressions, non-relevance and blended-relevance sort, etc.  (This old blog
>> post goes into detail.)  Plus, it’s meta-fun to use Lucene to search its
>> own issues, to help us be more productive in improving Lucene!  Nicely
>> recursive.
>> >>>
>> >>> In addition to good ol’ searching by text, githubsearch has some
>> new/fun features:
>> >>>
>> >>> Drill down to just PRs or issues
>> >>> Filter by “review requested” for a given user: poor Adrien has 8
>> (open) now (sorry)! Or see your mentions (Robert is mentioned in 27 open
>> issues/PRs). Or PRs that you reviewed (Uwe has reviewed 9 still-open PRs).
>> Or issues and PRs where a user has had any involvement at all (Dawid has
>> interacted on 197 issues/PRs).
>> >>> Find still-open PRs that were created by a New Contributor (an author
>> who has no changes merged into our repository) or Contributor
>> (non-committer who has had some changes merged into our repository) or
>> Member
>> >>> Here are the uber-stale (last touched more than a month ago) open PRs
>> by outside contributors. We should ideally keep this at 0, but it’s 83 now!
>> >>> “Link to this search” to get a short-er, more permanent URL (it is
>> NOT a URL shortener, though!)
>> >>> Save named searches you frequently run (they just save to local
>> cookie state on that one browser)
>> >>>
>> >>> I’m sure there are exciting bugs, feedback/patches welcome!  If you
>> see problems, please reply to this email or file an issue here.
>> >>>
>> >>> Note that jirasearch remains running, to search Solr, Tika and Infra
>> issues.
>> >>>
>> >>> Happy Searching,
>> >>>
>> >>> Mike McCandless
>> >>>
>> >>> http://blog.mikemccandless.com
>> >>
>> >>
>> >>
>> >> --
>> >> Anshum Gupta
>> >
>> >
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>


Re: [Vote] Bump the Lucene main branch to Java 21

2024-02-27 Thread Guo Feng
+1

On 2024/02/23 11:24:10 Chris Hegarty wrote:
> Hi,
> 
> Since the discussion on bumping the Lucene main branch to Java 21 is winding 
> down, let's hold a vote on this important change.
> 
> Once bumped, the next major release of Lucene (whenever that will be) will 
> require a version of Java greater than or equal to Java 21.
> 
> The vote will be open for at least 72 hours (and allow some additional time 
> for the weekend) i.e. until 2024-02-28 12:00 UTC.
> 
> [ ] +1  approve
> [ ] +0  no opinion
> [ ] -1  disapprove (and reason why)
> 
> Here is my +1
> 
> -Chris.
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
> 
> 
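
The voting window quoted above can be checked with a little date arithmetic. This is a sketch under one assumption: the timestamp in the quoted header (2024/02/23 11:24:10) is treated as UTC, which the message does not state. The variable names are illustrative.

```python
from datetime import datetime, timedelta, timezone

# Timestamp from the quoted header above; treating it as UTC is an assumption.
vote_opened = datetime(2024, 2, 23, 11, 24, 10, tzinfo=timezone.utc)
# Deadline stated in the vote e-mail.
announced_close = datetime(2024, 2, 28, 12, 0, tzinfo=timezone.utc)

# "At least 72 hours" is the minimum window named in the e-mail.
minimum_window = timedelta(hours=72)
earliest_close = vote_opened + minimum_window

# Whatever remains beyond the minimum is the allowance for the weekend.
extra_time = announced_close - earliest_close

print("earliest close:", earliest_close.isoformat())
print("extra time allowed:", extra_time)
```

Under that assumption the 72-hour minimum ends on 2024-02-26 at 11:24:10 UTC, so the announced deadline leaves a little over two further days for the weekend.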

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Announcing githubsearch!

2024-02-26 Thread Michael McCandless
Done!  Deployed!  Thank you Mike S.

Though on my "dark mode" Chrome on a Macbook, it's super dark.  I can make
it out but I gotta stare for a bit ... do they make light and dark mode
.ico files in one!?

Mike McCandless

http://blog.mikemccandless.com


On Sun, Feb 25, 2024 at 6:05 PM Michael Sokolov  wrote:

> here is a favicon you might want to try: I cropped the "VL" from the
> Apache Lucene logo (ok I guess it's an AL) -- if you save it as
> favicon.ico in the root of your website (ie as url /favicon.ico) it
> should show up in bookmarks, browser toolbars, etc as a handy memory
> aid. Of course you might have other ideas for a picture - it's
> actually pretty easy to make the favicon once you have a picture you
> like; I followed the instructions here
>
> https://www.logikfabrik.se/blog/how-to-create-a-multisize-favicon-using-gimp/
>
> On Thu, Feb 22, 2024 at 10:48 AM Zhang Chao <80152...@qq.com.invalid>
> wrote:
> >
> > Great job! Thanks Mike!
> >
> > On 22 Feb 2024 at 22:31, Alessandro Benedetti wrote:
> >
> > That's cool Mike! Well done!
> >
> > On Wed, 21 Feb 2024, 22:02 Anshum Gupta,  wrote:
> >>
> >> This is great! Like always, thank you Mike!
> >>
> >> On Mon, Feb 19, 2024 at 8:40 AM Michael McCandless <
> luc...@mikemccandless.com> wrote:
> >>>
> >>> Hi Team,
> >>>
> >>> ~1.5 years ago (August 2022) we migrated our Lucene issue tracking
> from Jira to GitHub. Thank you Tomoko for all the hard work doing such a
> complex, multi-phased, high-fidelity migration!
> >>>
> >>> I finally finished also migrating jirasearch to GitHub:
> githubsearch.mikemccandless.com. It was tricky because GitHub issues/PRs
> are fundamentally more complex than Jira's data model, and the GitHub REST
> API is also quite rich / heavily normalized. All of the source code for
> githubsearch lives here. The UI remains its barebones self ;)
> >>>
> >>> Githubsearch is dog food for us: it showcases Lucene (currently
> 9.8.0), and many of its fun features like infix autosuggest, block join
> queries (each comment is a sub-document on the issue/PR), DrillSideways
> faceting, near-real-time indexing/searching, synonyms (try “oome”),
> expressions, non-relevance and blended-relevance sort, etc.  (This old blog
> post goes into detail.)  Plus, it’s meta-fun to use Lucene to search its
> own issues, to help us be more productive in improving Lucene!  Nicely
> recursive.
> >>>
> >>> In addition to good ol’ searching by text, githubsearch has some
> new/fun features:
> >>>
> >>> Drill down to just PRs or issues
> >>> Filter by “review requested” for a given user: poor Adrien has 8
> (open) now (sorry)! Or see your mentions (Robert is mentioned in 27 open
> issues/PRs). Or PRs that you reviewed (Uwe has reviewed 9 still-open PRs).
> Or issues and PRs where a user has had any involvement at all (Dawid has
> interacted on 197 issues/PRs).
> >>> Find still-open PRs that were created by a New Contributor (an author
> who has no changes merged into our repository) or Contributor
> (non-committer who has had some changes merged into our repository) or
> Member
> >>> Here are the uber-stale (last touched more than a month ago) open PRs
> by outside contributors. We should ideally keep this at 0, but it’s 83 now!
> >>> “Link to this search” to get a short-er, more permanent URL (it is NOT
> a URL shortener, though!)
> >>> Save named searches you frequently run (they just save to local cookie
> state on that one browser)
> >>>
> >>> I’m sure there are exciting bugs, feedback/patches welcome!  If you
> see problems, please reply to this email or file an issue here.
> >>>
> >>> Note that jirasearch remains running, to search Solr, Tika and Infra
> issues.
> >>>
> >>> Happy Searching,
> >>>
> >>> Mike McCandless
> >>>
> >>> http://blog.mikemccandless.com
> >>
> >>
> >>
> >> --
> >> Anshum Gupta
> >
> >
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org


Re: [Vote] Bump the Lucene main branch to Java 21

2024-02-26 Thread Michael McCandless
+1, exciting!

Mike McCandless

http://blog.mikemccandless.com


On Fri, Feb 23, 2024 at 6:24 AM Chris Hegarty
 wrote:

> Hi,
>
> Since the discussion on bumping the Lucene main branch to Java 21 is
> winding down, let's hold a vote on this important change.
>
> Once bumped, the next major release of Lucene (whenever that will be) will
> require a version of Java greater than or equal to Java 21.
>
> The vote will be open for at least 72 hours (and allow some additional
> time for the weekend) i.e. until 2024-02-28 12:00 UTC.
>
> [ ] +1  approve
> [ ] +0  no opinion
> [ ] -1  disapprove (and reason why)
>
> Here is my +1
>
> -Chris.
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>


Re: [Vote] Bump the Lucene main branch to Java 21

2024-02-26 Thread Houston Putman
+1

- Houston

On Mon, Feb 26, 2024 at 7:00 AM Jan Høydahl  wrote:

> +1
>
> Jan
>
> On 23 Feb 2024 at 20:01, Patrick Zhai wrote:
>
> +1
>
> On Fri, Feb 23, 2024 at 9:34 AM Dawid Weiss  wrote:
>
>>
>> I'm fine with this requirement.
>>
>> +1.
>>
>> On Fri, Feb 23, 2024 at 12:24 PM Chris Hegarty
>>  wrote:
>>
>>> Hi,
>>>
>>> Since the discussion on bumping the Lucene main branch to Java 21 is
>>> winding down, let's hold a vote on this important change.
>>>
>>> Once bumped, the next major release of Lucene (whenever that will be)
>>> will require a version of Java greater than or equal to Java 21.
>>>
>>> The vote will be open for at least 72 hours (and allow some additional
>>> time for the weekend) i.e. until 2024-02-28 12:00 UTC.
>>>
>>> [ ] +1  approve
>>> [ ] +0  no opinion
>>> [ ] -1  disapprove (and reason why)
>>>
>>> Here is my +1
>>>
>>> -Chris.
>>> -
>>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>>
>>>
>


Re: [Vote] Bump the Lucene main branch to Java 21

2024-02-26 Thread Jan Høydahl
+1

Jan

> On 23 Feb 2024, at 20:01, Patrick Zhai wrote:
> 
> +1
> 
> On Fri, Feb 23, 2024 at 9:34 AM Dawid Weiss wrote:
>> 
>> I'm fine with this requirement. 
>> 
>> +1.
>> 
>> On Fri, Feb 23, 2024 at 12:24 PM Chris Hegarty 
>>  wrote:
>>> Hi,
>>> 
>>> Since the discussion on bumping the Lucene main branch to Java 21 is 
>>> winding down, let's hold a vote on this important change.
>>> 
>>> Once bumped, the next major release of Lucene (whenever that will be) will 
>>> require a version of Java greater than or equal to Java 21.
>>> 
>>> The vote will be open for at least 72 hours (and allow some additional time 
>>> for the weekend) i.e. until 2024-02-28 12:00 UTC.
>>> 
>>> [ ] +1  approve
>>> [ ] +0  no opinion
>>> [ ] -1  disapprove (and reason why)
>>> 
>>> Here is my +1
>>> 
>>> -Chris.
>>> -
>>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org 
>>> 
>>> For additional commands, e-mail: dev-h...@lucene.apache.org 
>>> 
>>> 



Re: Welcome Zhang Chao as Lucene committer

2024-02-25 Thread Michael Sokolov
Welcome and congratulations, Chao!

On Sat, Feb 24, 2024 at 8:51 PM Christian Moen  wrote:
>
> Congrats, Chao!
>
> On Wed, Feb 21, 2024 at 2:28 AM Adrien Grand  wrote:
>>
>> I'm pleased to announce that Zhang Chao has accepted the PMC's
>> invitation to become a committer.
>>
>> Chao, the tradition is that new committers introduce themselves with a
>> brief bio.
>>
>> Congratulations and welcome!
>>
>> --
>> Adrien




Re: [Vote] Bump the Lucene main branch to Java 21

2024-02-25 Thread Michael Sokolov
+1

On Fri, Feb 23, 2024 at 7:08 PM Stefan Vodita  wrote:
>
> +1
>
> On Fri, 23 Feb 2024 at 11:24, Chris Hegarty 
>  wrote:
>>
>> Hi,
>>
>> Since the discussion on bumping the Lucene main branch to Java 21 is winding 
>> down, let's hold a vote on this important change.
>>
>> Once bumped, the next major release of Lucene (whenever that will be) will 
>> require a version of Java greater than or equal to Java 21.
>>
>> The vote will be open for at least 72 hours (and allow some additional time 
>> for the weekend) i.e. until 2024-02-28 12:00 UTC.
>>
>> [ ] +1  approve
>> [ ] +0  no opinion
>> [ ] -1  disapprove (and reason why)
>>
>> Here is my +1
>>
>> -Chris.
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>




Re: Announcing githubsearch!

2024-02-25 Thread Michael Sokolov
here is a favicon you might want to try: I cropped the "VL" from the
Apache Lucene logo (ok I guess it's an AL) -- if you save it as
favicon.ico in the root of your website (ie as url /favicon.ico) it
should show up in bookmarks, browser toolbars, etc as a handy memory
aid. Of course you might have other ideas for a picture - it's
actually pretty easy to make the favicon once you have a picture you
like; I followed the instructions here
https://www.logikfabrik.se/blog/how-to-create-a-multisize-favicon-using-gimp/

On Thu, Feb 22, 2024 at 10:48 AM Zhang Chao <80152...@qq.com.invalid> wrote:
>
> Great job! Thanks Mike!
>
> On 22 Feb 2024, at 22:31, Alessandro Benedetti wrote:
>
> That's cool Mike! Well done!
>
> On Wed, 21 Feb 2024, 22:02 Anshum Gupta wrote:
>>
>> This is great! Like always, thank you Mike!
>>
>> On Mon, Feb 19, 2024 at 8:40 AM Michael McCandless 
>>  wrote:
>>>
>>> Hi Team,
>>>
>>> ~1.5 years ago (August 2022) we migrated our Lucene issue tracking from 
>>> Jira to GitHub. Thank you Tomoko for all the hard work doing such a 
>>> complex, multi-phased, high-fidelity migration!
>>>
>>> I finally finished also migrating jirasearch to GitHub: 
>>> githubsearch.mikemccandless.com. It was tricky because GitHub issues/PRs 
>>> are fundamentally more complex than Jira's data model, and the GitHub REST 
>>> API is also quite rich / heavily normalized. All of the source code for 
>>> githubsearch lives here. The UI remains its barebones self ;)
>>>
>>> Githubsearch is dog food for us: it showcases Lucene (currently 9.8.0), and 
>>> many of its fun features like infix autosuggest, block join queries (each 
>>> comment is a sub-document on the issue/PR), DrillSideways faceting, 
>>> near-real-time indexing/searching, synonyms (try “oome”), expressions, 
>>> non-relevance and blended-relevance sort, etc.  (This old blog post goes 
>>> into detail.)  Plus, it’s meta-fun to use Lucene to search its own issues, 
>>> to help us be more productive in improving Lucene!  Nicely recursive.
>>>
>>> In addition to good ol’ searching by text, githubsearch has some new/fun 
>>> features:
>>>
>>> - Drill down to just PRs or issues
>>> - Filter by “review requested” for a given user: poor Adrien has 8 (open)
>>>   now (sorry)! Or see your mentions (Robert is mentioned in 27 open
>>>   issues/PRs). Or PRs that you reviewed (Uwe has reviewed 9 still-open
>>>   PRs). Or issues and PRs where a user has had any involvement at all
>>>   (Dawid has interacted on 197 issues/PRs).
>>> - Find still-open PRs that were created by a New Contributor (an author
>>>   who has no changes merged into our repository) or Contributor
>>>   (non-committer who has had some changes merged into our repository) or
>>>   Member
>>> - Here are the uber-stale (last touched more than a month ago) open PRs by
>>>   outside contributors. We should ideally keep this at 0, but it’s 83 now!
>>> - “Link to this search” to get a short-er, more permanent URL (it is NOT a
>>>   URL shortener, though!)
>>> - Save named searches you frequently run (they just save to local cookie
>>>   state on that one browser)
>>>
>>> I’m sure there are exciting bugs, feedback/patches welcome!  If you see 
>>> problems, please reply to this email or file an issue here.
>>>
>>> Note that jirasearch remains running, to search Solr, Tika and Infra issues.
>>>
>>> Happy Searching,
>>>
>>> Mike McCandless
>>>
>>> http://blog.mikemccandless.com
>>
>>
>>
>> --
>> Anshum Gupta
>
>


Re: Welcome Zhang Chao as Lucene committer

2024-02-24 Thread Christian Moen
Congrats, Chao!

On Wed, Feb 21, 2024 at 2:28 AM Adrien Grand  wrote:

> I'm pleased to announce that Zhang Chao has accepted the PMC's
> invitation to become a committer.
>
> Chao, the tradition is that new committers introduce themselves with a
> brief bio.
>
> Congratulations and welcome!
>
> --
> Adrien
>


Re: Welcome Zhang Chao as Lucene committer

2024-02-24 Thread Julie Tibshirani
Congratulations!!

On Fri, Feb 23, 2024 at 8:49 PM Michael Gibney 
wrote:

> Welcome, Chao!
>
> On Thu, Feb 22, 2024 at 7:23 PM Nhat Nguyen
>  wrote:
> >
> > Congrats, Chao!
> >
> > On Wed, Feb 21, 2024 at 1:03 PM Anshum Gupta wrote:
> >>
> >> Congratulations and welcome, Chao!
> >>
> >> On Tue, Feb 20, 2024 at 9:28 AM Adrien Grand  wrote:
> >>>
> >>> I'm pleased to announce that Zhang Chao has accepted the PMC's
> >>> invitation to become a committer.
> >>>
> >>> Chao, the tradition is that new committers introduce themselves with a
> >>> brief bio.
> >>>
> >>> Congratulations and welcome!
> >>>
> >>> --
> >>> Adrien
> >>
> >>
> >>
> >> --
> >> Anshum Gupta
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>


Re: Welcome Zhang Chao as Lucene committer

2024-02-23 Thread Michael Gibney
Welcome, Chao!

On Thu, Feb 22, 2024 at 7:23 PM Nhat Nguyen
 wrote:
>
> Congrats, Chao!
>
> On Wed, Feb 21, 2024 at 1:03 PM Anshum Gupta  wrote:
>>
>> Congratulations and welcome, Chao!
>>
>> On Tue, Feb 20, 2024 at 9:28 AM Adrien Grand  wrote:
>>>
>>> I'm pleased to announce that Zhang Chao has accepted the PMC's
>>> invitation to become a committer.
>>>
>>> Chao, the tradition is that new committers introduce themselves with a
>>> brief bio.
>>>
>>> Congratulations and welcome!
>>>
>>> --
>>> Adrien
>>
>>
>>
>> --
>> Anshum Gupta




Re: [Vote] Bump the Lucene main branch to Java 21

2024-02-23 Thread Stefan Vodita
+1

On Fri, 23 Feb 2024 at 11:24, Chris Hegarty
 wrote:

> Hi,
>
> Since the discussion on bumping the Lucene main branch to Java 21 is
> winding down, let's hold a vote on this important change.
>
> Once bumped, the next major release of Lucene (whenever that will be) will
> require a version of Java greater than or equal to Java 21.
>
> The vote will be open for at least 72 hours (and allow some additional
> time for the weekend) i.e. until 2024-02-28 12:00 UTC.
>
> [ ] +1  approve
> [ ] +0  no opinion
> [ ] -1  disapprove (and reason why)
>
> Here is my +1
>
> -Chris.
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>


Re: [Vote] Bump the Lucene main branch to Java 21

2024-02-23 Thread Tomás Fernández Löbbe
SGTM!

+1

On Fri, Feb 23, 2024 at 11:04 AM Patrick Zhai  wrote:

> +1
>
> On Fri, Feb 23, 2024 at 9:34 AM Dawid Weiss  wrote:
>
>>
>> I'm fine with this requirement.
>>
>> +1.
>>
>> On Fri, Feb 23, 2024 at 12:24 PM Chris Hegarty
>>  wrote:
>>
>>> Hi,
>>>
>>> Since the discussion on bumping the Lucene main branch to Java 21 is
>>> winding down, let's hold a vote on this important change.
>>>
>>> Once bumped, the next major release of Lucene (whenever that will be)
>>> will require a version of Java greater than or equal to Java 21.
>>>
>>> The vote will be open for at least 72 hours (and allow some additional
>>> time for the weekend) i.e. until 2024-02-28 12:00 UTC.
>>>
>>> [ ] +1  approve
>>> [ ] +0  no opinion
>>> [ ] -1  disapprove (and reason why)
>>>
>>> Here is my +1
>>>
>>> -Chris.
>>> -
>>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>>
>>>


Re: [Vote] Bump the Lucene main branch to Java 21

2024-02-23 Thread Patrick Zhai
+1

On Fri, Feb 23, 2024 at 9:34 AM Dawid Weiss  wrote:

>
> I'm fine with this requirement.
>
> +1.
>
> On Fri, Feb 23, 2024 at 12:24 PM Chris Hegarty
>  wrote:
>
>> Hi,
>>
>> Since the discussion on bumping the Lucene main branch to Java 21 is
>> winding down, let's hold a vote on this important change.
>>
>> Once bumped, the next major release of Lucene (whenever that will be)
>> will require a version of Java greater than or equal to Java 21.
>>
>> The vote will be open for at least 72 hours (and allow some additional
>> time for the weekend) i.e. until 2024-02-28 12:00 UTC.
>>
>> [ ] +1  approve
>> [ ] +0  no opinion
>> [ ] -1  disapprove (and reason why)
>>
>> Here is my +1
>>
>> -Chris.
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>>


Re: [Vote] Bump the Lucene main branch to Java 21

2024-02-23 Thread Dawid Weiss
I'm fine with this requirement.

+1.

On Fri, Feb 23, 2024 at 12:24 PM Chris Hegarty
 wrote:

> Hi,
>
> Since the discussion on bumping the Lucene main branch to Java 21 is
> winding down, let's hold a vote on this important change.
>
> Once bumped, the next major release of Lucene (whenever that will be) will
> require a version of Java greater than or equal to Java 21.
>
> The vote will be open for at least 72 hours (and allow some additional
> time for the weekend) i.e. until 2024-02-28 12:00 UTC.
>
> [ ] +1  approve
> [ ] +0  no opinion
> [ ] -1  disapprove (and reason why)
>
> Here is my +1
>
> -Chris.
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>


Re: [Vote] Bump the Lucene main branch to Java 21

2024-02-23 Thread Anshum Gupta
+1

On Fri, Feb 23, 2024 at 3:24 AM Chris Hegarty
 wrote:

> Hi,
>
> Since the discussion on bumping the Lucene main branch to Java 21 is
> winding down, let's hold a vote on this important change.
>
> Once bumped, the next major release of Lucene (whenever that will be) will
> require a version of Java greater than or equal to Java 21.
>
> The vote will be open for at least 72 hours (and allow some additional
> time for the weekend) i.e. until 2024-02-28 12:00 UTC.
>
> [ ] +1  approve
> [ ] +0  no opinion
> [ ] -1  disapprove (and reason why)
>
> Here is my +1
>
> -Chris.
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>

-- 
Anshum Gupta


Re: [Vote] Bump the Lucene main branch to Java 21

2024-02-23 Thread Ignacio Vera
+1

On Fri, Feb 23, 2024 at 3:34 PM Benjamin Trent 
wrote:

> +1
>
> On Fri, Feb 23, 2024 at 8:54 AM Adrien Grand  wrote:
>
>> +1
>>
>> On Fri, Feb 23, 2024 at 12:54 PM Uwe Schindler  wrote:
>> >
>> > Here is my +1
>> >
>> > Uwe
>> >
>> > On 23.02.2024 at 12:24, Chris Hegarty wrote:
>> > > Hi,
>> > >
>> > > Since the discussion on bumping the Lucene main branch to Java 21 is
>> > > winding down, let's hold a vote on this important change.
>> > >
>> > > Once bumped, the next major release of Lucene (whenever that will be)
>> > > will require a version of Java greater than or equal to Java 21.
>> > >
>> > > The vote will be open for at least 72 hours (and allow some
>> > > additional time for the weekend) i.e. until 2024-02-28 12:00 UTC.
>> > >
>> > > [ ] +1  approve
>> > > [ ] +0  no opinion
>> > > [ ] -1  disapprove (and reason why)
>> > >
>> > > Here is my +1
>> > >
>> > > -Chris.
>> > > -
>> > > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> > > For additional commands, e-mail: dev-h...@lucene.apache.org
>> > >
>> > --
>> > Uwe Schindler
>> > Achterdiek 19, D-28357 Bremen
>> > https://www.thetaphi.de
>> > eMail: u...@thetaphi.de
>> >
>> >
>> > -
>> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> > For additional commands, e-mail: dev-h...@lucene.apache.org
>> >
>>
>>
>> --
>> Adrien
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>>


Re: [Vote] Bump the Lucene main branch to Java 21

2024-02-23 Thread Benjamin Trent
+1

On Fri, Feb 23, 2024 at 8:54 AM Adrien Grand  wrote:

> +1
>
> On Fri, Feb 23, 2024 at 12:54 PM Uwe Schindler  wrote:
> >
> > Here is my +1
> >
> > Uwe
> >
> > On 23.02.2024 at 12:24, Chris Hegarty wrote:
> > > Hi,
> > >
> > > Since the discussion on bumping the Lucene main branch to Java 21 is
> > > winding down, let's hold a vote on this important change.
> > >
> > > Once bumped, the next major release of Lucene (whenever that will be)
> > > will require a version of Java greater than or equal to Java 21.
> > >
> > > The vote will be open for at least 72 hours (and allow some additional
> > > time for the weekend) i.e. until 2024-02-28 12:00 UTC.
> > >
> > > [ ] +1  approve
> > > [ ] +0  no opinion
> > > [ ] -1  disapprove (and reason why)
> > >
> > > Here is my +1
> > >
> > > -Chris.
> > > -
> > > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> > > For additional commands, e-mail: dev-h...@lucene.apache.org
> > >
> > --
> > Uwe Schindler
> > Achterdiek 19, D-28357 Bremen
> > https://www.thetaphi.de
> > eMail: u...@thetaphi.de
> >
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> > For additional commands, e-mail: dev-h...@lucene.apache.org
> >
>
>
> --
> Adrien
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>


Re: [Vote] Bump the Lucene main branch to Java 21

2024-02-23 Thread Adrien Grand
+1

On Fri, Feb 23, 2024 at 12:54 PM Uwe Schindler  wrote:
>
> Here is my +1
>
> Uwe
>
> On 23.02.2024 at 12:24, Chris Hegarty wrote:
> > Hi,
> >
> > Since the discussion on bumping the Lucene main branch to Java 21 is 
> > winding down, let's hold a vote on this important change.
> >
> > Once bumped, the next major release of Lucene (whenever that will be) will 
> > require a version of Java greater than or equal to Java 21.
> >
> > The vote will be open for at least 72 hours (and allow some additional time 
> > for the weekend) i.e. until 2024-02-28 12:00 UTC.
> >
> > [ ] +1  approve
> > [ ] +0  no opinion
> > [ ] -1  disapprove (and reason why)
> >
> > Here is my +1
> >
> > -Chris.
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> > For additional commands, e-mail: dev-h...@lucene.apache.org
> >
> --
> Uwe Schindler
> Achterdiek 19, D-28357 Bremen
> https://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>


-- 
Adrien




Re: [Vote] Bump the Lucene main branch to Java 21

2024-02-23 Thread Uwe Schindler

Here is my +1

Uwe

On 23.02.2024 at 12:24, Chris Hegarty wrote:

> Hi,
>
> Since the discussion on bumping the Lucene main branch to Java 21 is
> winding down, let's hold a vote on this important change.
>
> Once bumped, the next major release of Lucene (whenever that will be) will
> require a version of Java greater than or equal to Java 21.
>
> The vote will be open for at least 72 hours (and allow some additional
> time for the weekend) i.e. until 2024-02-28 12:00 UTC.
>
> [ ] +1  approve
> [ ] +0  no opinion
> [ ] -1  disapprove (and reason why)
>
> Here is my +1
>
> -Chris.
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org


--
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de
eMail: u...@thetaphi.de





[Vote] Bump the Lucene main branch to Java 21

2024-02-23 Thread Chris Hegarty
Hi,

Since the discussion on bumping the Lucene main branch to Java 21 is winding 
down, let's hold a vote on this important change.

Once bumped, the next major release of Lucene (whenever that will be) will 
require a version of Java greater than or equal to Java 21.

The vote will be open for at least 72 hours (and allow some additional time for 
the weekend) i.e. until 2024-02-28 12:00 UTC.

[ ] +1  approve
[ ] +0  no opinion
[ ] -1  disapprove (and reason why)

Here is my +1

-Chris.
-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
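
The timing in the call for votes can be checked with a little date arithmetic. The sketch below is only an illustration: the 11:24 UTC opening time is an assumption inferred from the local times shown in the reply headers of this thread, and the helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the vote mail went out at about 11:24 UTC on 2024-02-23,
# inferred from the reply headers; the closing time is stated in the mail.
VOTE_OPENED = datetime(2024, 2, 23, 11, 24, tzinfo=timezone.utc)
VOTE_CLOSES = datetime(2024, 2, 28, 12, 0, tzinfo=timezone.utc)
MINIMUM_WINDOW = timedelta(hours=72)


def earliest_close(opened, minimum=MINIMUM_WINDOW):
    """Earliest moment the vote could close under the 72-hour minimum."""
    return opened + minimum


def weekend_days(start, end):
    """Count Saturdays and Sundays whose dates fall between start and end."""
    day = start.date()
    count = 0
    while day <= end.date():
        if day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
            count += 1
        day += timedelta(days=1)
    return count


if __name__ == "__main__":
    print("72-hour minimum ends:", earliest_close(VOTE_OPENED).isoformat())
    print("announced window:", VOTE_CLOSES - VOTE_OPENED)
    print("weekend days inside the window:", weekend_days(VOTE_OPENED, VOTE_CLOSES))
```

Under that assumption the announced deadline leaves roughly two extra days beyond the 72-hour minimum, which matches the stated intent of allowing additional time for the weekend.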



Re: Welcome Zhang Chao as Lucene committer

2024-02-22 Thread Nhat Nguyen
Congrats, Chao!

On Wed, Feb 21, 2024 at 1:03 PM Anshum Gupta  wrote:

> Congratulations and welcome, Chao!
>
> On Tue, Feb 20, 2024 at 9:28 AM Adrien Grand  wrote:
>
>> I'm pleased to announce that Zhang Chao has accepted the PMC's
>> invitation to become a committer.
>>
>> Chao, the tradition is that new committers introduce themselves with a
>> brief bio.
>>
>> Congratulations and welcome!
>>
>> --
>> Adrien
>>
>
>
> --
> Anshum Gupta
>


Welcome Ben Trent to the Lucene PMC

2024-02-22 Thread Luca Cavanna
I'm pleased to announce that Ben Trent has accepted an invitation to join
the Lucene PMC!

Congratulations Ben, and welcome aboard!


Cheers
Luca


Re: Announcing githubsearch!

2024-02-22 Thread Zhang Chao
Great job! Thanks Mike!

> On 22 Feb 2024, at 22:31, Alessandro Benedetti wrote:
> 
> That's cool Mike! Well done!
> 
> On Wed, 21 Feb 2024, 22:02 Anshum Gupta wrote:
>> This is great! Like always, thank you Mike!
>> 
>> On Mon, Feb 19, 2024 at 8:40 AM Michael McCandless
>> <luc...@mikemccandless.com> wrote:
>>> Hi Team,
>>> 
>>> ~1.5 years ago (August 2022) we migrated our Lucene issue tracking from 
>>> Jira to GitHub. Thank you Tomoko for all the hard work doing such a 
>>> complex, multi-phased, high-fidelity migration!
>>> 
>>> I finally finished also migrating jirasearch to GitHub:
>>> githubsearch.mikemccandless.com. It was tricky because GitHub issues/PRs
>>> are fundamentally more complex than Jira's data model, and the GitHub REST
>>> API is also quite rich / heavily normalized. All of the source code for
>>> githubsearch lives here. The UI remains its barebones self ;)
>>>
>>> Githubsearch is dog food for us: it showcases Lucene (currently 9.8.0), and
>>> many of its fun features like infix autosuggest, block join queries (each
>>> comment is a sub-document on the issue/PR), DrillSideways faceting,
>>> near-real-time indexing/searching, synonyms (try “oome”), expressions,
>>> non-relevance and blended-relevance sort, etc.  (This old blog post goes
>>> into detail.)  Plus, it’s meta-fun to use Lucene to search its own issues,
>>> to help us be more productive in improving Lucene!  Nicely recursive.
>>>
>>> In addition to good ol’ searching by text, githubsearch has some new/fun
>>> features:
>>>
>>> - Drill down to just PRs or issues
>>> - Filter by “review requested” for a given user: poor Adrien has 8 (open)
>>>   now (sorry)! Or see your mentions (Robert is mentioned in 27 open
>>>   issues/PRs). Or PRs that you reviewed (Uwe has reviewed 9 still-open
>>>   PRs). Or issues and PRs where a user has had any involvement at all
>>>   (Dawid has interacted on 197 issues/PRs).
>>> - Find still-open PRs that were created by a New Contributor (an author
>>>   who has no changes merged into our repository) or Contributor
>>>   (non-committer who has had some changes merged into our repository) or
>>>   Member
>>> - Here are the uber-stale (last touched more than a month ago) open PRs by
>>>   outside contributors. We should ideally keep this at 0, but it’s 83 now!
>>> - “Link to this search” to get a short-er, more permanent URL (it is NOT a
>>>   URL shortener, though!)
>>> - Save named searches you frequently run (they just save to local cookie
>>>   state on that one browser)
>>>
>>> I’m sure there are exciting bugs, feedback/patches welcome!  If you see
>>> problems, please reply to this email or file an issue here.
>>>
>>> Note that jirasearch remains running, to search Solr, Tika and Infra
>>> issues.
>>> 
>>> Happy Searching,
>>> 
>>> Mike McCandless
>>> 
>>> http://blog.mikemccandless.com 
>> 
>> -- 
>> Anshum Gupta



Re: Announcing githubsearch!

2024-02-22 Thread Alessandro Benedetti
That's cool Mike! Well done!

On Wed, 21 Feb 2024, 22:02 Anshum Gupta wrote:

> This is great! Like always, thank you Mike!
>
> On Mon, Feb 19, 2024 at 8:40 AM Michael McCandless <
> luc...@mikemccandless.com> wrote:
>
>> Hi Team,
>>
>> ~1.5 years ago (August 2022) we migrated our Lucene issue tracking from
>> Jira to GitHub. Thank you Tomoko for all the hard work doing such a
>> complex, multi-phased, high-fidelity migration!
>>
>> I finally finished also migrating jirasearch to GitHub:
>> githubsearch.mikemccandless.com. It was tricky because GitHub issues/PRs
>> are fundamentally more complex than Jira's data model, and the GitHub REST
>> API is also quite rich / heavily normalized. All of the source code for
>> githubsearch lives here. The UI remains its barebones self ;)
>>
>> Githubsearch is dog food for us: it showcases Lucene (currently 9.8.0), and
>> many of its fun features like infix autosuggest, block join queries (each
>> comment is a sub-document on the issue/PR), DrillSideways faceting,
>> near-real-time indexing/searching, synonyms (try “oome”), expressions,
>> non-relevance and blended-relevance sort, etc.  (This old blog post goes
>> into detail.)  Plus, it’s meta-fun to use Lucene to search its own issues,
>> to help us be more productive in improving Lucene!  Nicely recursive.
>>
>> In addition to good ol’ searching by text, githubsearch has some new/fun
>> features:
>>
>>    - Drill down to just PRs or issues
>>    - Filter by “review requested” for a given user: poor Adrien has 8
>>    (open) now (sorry)! Or see your mentions (Robert is mentioned in 27 open
>>    issues/PRs). Or PRs that you reviewed (Uwe has reviewed 9 still-open
>>    PRs). Or issues and PRs where a user has had any involvement at all
>>    (Dawid has interacted on 197 issues/PRs).
>>    - Find still-open PRs that were created by a New Contributor (an author
>>    who has no changes merged into our repository) or Contributor
>>    (non-committer who has had some changes merged into our repository) or
>>    Member
>>    - Here are the uber-stale (last touched more than a month ago) open
>>    PRs by outside contributors. We should ideally keep this at 0, but it’s
>>    83 now!
>>    - “Link to this search” to get a short-er, more permanent URL (it is
>>    NOT a URL shortener, though!)
>>    - Save named searches you frequently run (they just save to local
>>    cookie state on that one browser)
>>
>> I’m sure there are exciting bugs, feedback/patches welcome!  If you see
>> problems, please reply to this email or file an issue here.
>>
>> Note that jirasearch remains running, to search Solr, Tika and Infra
>> issues.
>>
>> Happy Searching,
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>
>
> --
> Anshum Gupta
>


[VOTE] Release PyLucene 9.10.0-rc1

2024-02-21 Thread Andi Vajda



The PyLucene 9.10.0 (rc1) release tracking the recent release of
Apache Lucene 9.10.0 is ready.

A release candidate is available from:
   https://dist.apache.org/repos/dist/dev/lucene/pylucene/9.10.0-rc1/

PyLucene 9.10.0 is built with JCC 3.14, included in these release artifacts.

Apart from the catch-up to Lucene 9.10.0, the other major new feature in
this release candidate is that JCC can now generate a setup.py file instead
of calling setup() directly. This makes it possible to use modern Python
packaging without falling afoul of "python setup.py install" being
deprecated. setup.py itself is not deprecated, only some of its associated
commands are; see [1] for more information about this.


In PyLucene's Makefile, there is now a new MODERN_PACKAGING variable, which
can be set to true so that "python -m build" and "python -m pip install" are
used for building and installing PyLucene.
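
The difference between the two packaging paths can be sketched as the command lines each one ends up running. This is only an illustration and not the actual Makefile logic: the function name, the "." install target and the legacy "setup.py build" step are assumptions; only "python -m build", "python -m pip install" and "python setup.py install" come from the text above.

```python
import sys


def packaging_commands(modern_packaging, python=sys.executable):
    """Return build and install commands as argv lists, without running them.

    Hypothetical helper: the real Makefile may pass different arguments.
    """
    if modern_packaging:
        return [
            [python, "-m", "build"],
            # Assumption: install from the current source directory.
            [python, "-m", "pip", "install", "."],
        ]
    # Legacy path: calling setup.py commands directly is what is deprecated.
    return [
        [python, "setup.py", "build"],
        [python, "setup.py", "install"],
    ]


if __name__ == "__main__":
    for argv in packaging_commands(modern_packaging=True):
        print(" ".join(argv))
```

Returning the commands as lists keeps the sketch free of side effects; they could be passed to subprocess.run once the arguments are confirmed against the real Makefile.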


JCC 3.14 supports Python 3.3 up to Python 3.12.
PyLucene may also be built with Python 2 but this configuration is no longer
tested.
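
As a rough illustration of the supported range stated above, a build script could check the running interpreter before going further. The constant and function names here are hypothetical and are not part of JCC.

```python
import sys

# Range stated in this announcement for JCC 3.14.
JCC_MIN_PYTHON = (3, 3)
JCC_MAX_PYTHON = (3, 12)


def is_supported_python(version_info=sys.version_info):
    """True if the (major, minor) version lies within the tested range."""
    major_minor = tuple(version_info[:2])
    return JCC_MIN_PYTHON <= major_minor <= JCC_MAX_PYTHON


if __name__ == "__main__":
    if not is_supported_python():
        print("warning: this Python version is outside the tested range")
```

Python 2 falls outside this check, which fits the note that such builds may still work but are no longer tested.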

Please vote to release these artifacts as PyLucene 9.10.0.
Anyone interested in this release can and should vote!

Thanks!

Andi..

ps: the KEYS file for PyLucene release signing is at:
https://dist.apache.org/repos/dist/release/lucene/pylucene/KEYS
https://dist.apache.org/repos/dist/dev/lucene/pylucene/KEYS

pps: here is my +1

[1] https://packaging.python.org/en/latest/discussions/setup-py-deprecated/


Re: Announcing githubsearch!

2024-02-21 Thread Anshum Gupta
This is great! Like always, thank you Mike!

On Mon, Feb 19, 2024 at 8:40 AM Michael McCandless <
luc...@mikemccandless.com> wrote:

> Hi Team,
>
> ~1.5 years ago (August 2022) we migrated our Lucene issue tracking from
> Jira to GitHub. Thank you Tomoko for all the hard work doing such a
> complex, multi-phased, high-fidelity migration!
>
> I finally finished also migrating jirasearch to GitHub:
> githubsearch.mikemccandless.com. It was tricky because GitHub issues/PRs
> are fundamentally more complex than Jira's data model, and the GitHub REST
> API is also quite rich / heavily normalized. All of the source code for
> githubsearch lives here. The UI remains its barebones self ;)
>
> Githubsearch is dog food for us: it showcases Lucene (currently 9.8.0), and
> many of its fun features like infix autosuggest, block join queries (each
> comment is a sub-document on the issue/PR), DrillSideways faceting,
> near-real-time indexing/searching, synonyms (try “oome”), expressions,
> non-relevance and blended-relevance sort, etc.  (This old blog post goes
> into detail.)  Plus, it’s meta-fun to use Lucene to search its own issues,
> to help us be more productive in improving Lucene!  Nicely recursive.
>
> In addition to good ol’ searching by text, githubsearch has some new/fun
> features:
>
>    - Drill down to just PRs or issues
>    - Filter by “review requested” for a given user: poor Adrien has 8
>    (open) now (sorry)! Or see your mentions (Robert is mentioned in 27 open
>    issues/PRs). Or PRs that you reviewed (Uwe has reviewed 9 still-open
>    PRs). Or issues and PRs where a user has had any involvement at all
>    (Dawid has interacted on 197 issues/PRs).
>    - Find still-open PRs that were created by a New Contributor (an author
>    who has no changes merged into our repository) or Contributor
>    (non-committer who has had some changes merged into our repository) or
>    Member
>    - Here are the uber-stale (last touched more than a month ago) open
>    PRs by outside contributors. We should ideally keep this at 0, but it’s
>    83 now!
>    - “Link to this search” to get a short-er, more permanent URL (it is
>    NOT a URL shortener, though!)
>    - Save named searches you frequently run (they just save to local
>    cookie state on that one browser)
>
> I’m sure there are exciting bugs, feedback/patches welcome!  If you see
> problems, please reply to this email or file an issue here.
>
> Note that jirasearch remains running, to search Solr, Tika and Infra
> issues.
>
> Happy Searching,
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>


-- 
Anshum Gupta


Re: Welcome Zhang Chao as Lucene committer

2024-02-21 Thread Anshum Gupta
Congratulations and welcome, Chao!

On Tue, Feb 20, 2024 at 9:28 AM Adrien Grand  wrote:

> I'm pleased to announce that Zhang Chao has accepted the PMC's
> invitation to become a committer.
>
> Chao, the tradition is that new committers introduce themselves with a
> brief bio.
>
> Congratulations and welcome!
>
> --
> Adrien
>


-- 
Anshum Gupta


Re: Welcome Zhang Chao as Lucene committer

2024-02-21 Thread Gus Heck
Welcome :)

On Wed, Feb 21, 2024 at 12:03 PM Dawid Weiss  wrote:

>
> Congratulations and welcome!
>
> On Tue, Feb 20, 2024 at 6:28 PM Adrien Grand  wrote:
>
>> I'm pleased to announce that Zhang Chao has accepted the PMC's
>> invitation to become a committer.
>>
>> Chao, the tradition is that new committers introduce themselves with a
>> brief bio.
>>
>> Congratulations and welcome!
>>
>> --
>> Adrien
>>
>

-- 
http://www.needhamsoftware.com (work)
https://a.co/d/b2sZLD9 (my fantasy fiction book)


Re: Welcome Zhang Chao as Lucene committer

2024-02-21 Thread Dawid Weiss
Congratulations and welcome!

On Tue, Feb 20, 2024 at 6:28 PM Adrien Grand  wrote:

> I'm pleased to announce that Zhang Chao has accepted the PMC's
> invitation to become a committer.
>
> Chao, the tradition is that new committers introduce themselves with a
> brief bio.
>
> Congratulations and welcome!
>
> --
> Adrien
>


Re: Bump the Lucene main branch to Java 21

2024-02-21 Thread Michael McCandless
On Wed, Feb 21, 2024 at 7:41 AM Chris Hegarty
 wrote:

> So I think this means we are now free to use all the newfangled language
> features since Java 11 (min required for Lucene 9.x) -> Java 21?
>
> For the _main_ branch, yes.
>
> The _branch_9x_ remains unchanged - it stays on Java 11.
>
> So, if you’re planning to backport a change from main to 9x, then you may
> want to consider what Java language feature and/or JDK API you use - to
> make the backport more straightforward. But this is nothing new, _main_ is
> already on Java 17, while 9x is on Java 11, so the scenario already exists,
> just that the range is changing with this proposal. Hope this helps.
>

Thanks Chris, this makes sense!  So what's new with this change is on main
branch we can now use new language features from Java 17 -> Java 21.  But
on backport to 9.x we must still use only Java 11.
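
To make the backport constraint concrete, here is a small sketch that maps a few Java language features to the release in which they became final and checks them against each branch's minimum Java version. The branch minimums come from this thread (main after the proposed bump); the feature list is a selection for illustration and the helper names are hypothetical.

```python
# Minimum Java version per branch as discussed in this thread
# (main reflects the proposed bump to Java 21).
BRANCH_MIN_JAVA = {"main": 21, "branch_9x": 11}

# Java release in which each language feature became final (non-preview).
FEATURE_FINAL_IN = {
    "var for local variables": 10,
    "switch expressions": 14,
    "text blocks": 15,
    "records": 16,
    "pattern matching for instanceof": 16,
    "sealed classes": 17,
    "pattern matching for switch": 21,
    "record patterns": 21,
}


def usable_on(branch, feature):
    """True if the feature is final in the branch's minimum Java version."""
    return FEATURE_FINAL_IN[feature] <= BRANCH_MIN_JAVA[branch]


def backport_blockers(features, target="branch_9x"):
    """Features used by a change that would need rewriting on the target."""
    return sorted(f for f in features if not usable_on(target, f))


if __name__ == "__main__":
    change_uses = ["records", "var for local variables", "record patterns"]
    print(backport_blockers(change_uses))
```

A change on main that relies on records or record patterns would need to be rewritten before a backport, while one that only uses var would carry over unchanged.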

Thanks!

Mike McCandless

http://blog.mikemccandless.com


Re: Bump the Lucene main branch to Java 21

2024-02-21 Thread Chris Hegarty
Hi Mike,

> On 21 Feb 2024, at 12:34, Michael McCandless  
> wrote:
> 
> Thank you for the heads up Chris.
> 
> So I think this means we are now free to use all the newfangled language 
> features since Java 11 (min required for Lucene 9.x) -> Java 21?

For the _main_ branch, yes.

The _branch_9x_ remains unchanged - it stays on Java 11.

So, if you’re planning to backport a change from main to 9x, then you may want 
to consider what Java language feature and/or JDK API you use - to make the 
backport more straightforward. But this is nothing new, _main_ is already on 
Java 17, while 9x is on Java 11, so the scenario already exists, just that the 
range is changing with this proposal. Hope this helps.

-Chris.

> 
> Mike McCandless
> 
> http://blog.mikemccandless.com
> 
> 
> On Wed, Feb 21, 2024 at 3:58 AM Chris Hegarty 
>  wrote:
> Hi,
> 
> A number of us have been iterating on a PR to bump the Lucene main branch to 
> a minimum of Java 21 [1]. The work is in a good state and is almost ready to 
> commit.
> 
> While the changes themselves are not large, the impact is arguably larger. So 
> I’m raising awareness here with the wider group.
> 
> Clearly one could conflate the bump to Java 21 with the question of when will 
> Lucene have a next major release, but those issues, while somewhat related, 
> are orthogonal. My position is that the next Lucene major should be on Java 
> 21, regardless of when that will happen.
> 
> Comments, feedback, suggestions welcome.
> 
> Thanks,
> -Chris.
> 
> [1] https://github.com/apache/lucene/pull/12753
> 
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
> 





Re: Bump the Lucene main branch to Java 21

2024-02-21 Thread Michael McCandless
Thank you for the heads up Chris.

So I think this means we are now free to use all the newfangled language
features since Java 11 (min required for Lucene 9.x) -> Java 21?

Mike McCandless

http://blog.mikemccandless.com


On Wed, Feb 21, 2024 at 3:58 AM Chris Hegarty
 wrote:

> Hi,
>
> A number of us have been iterating on a PR to bump the Lucene main branch
> to a minimum of Java 21 [1]. The work is in a good state and is almost
> ready to commit.
>
> While the changes themselves are not large, the impact is arguably larger.
> So I’m raising awareness here with the wider group.
>
> Clearly one could conflate the bump to Java 21 with the question of when
> will Lucene have a next major release, but those issues, while somewhat
> related, are orthogonal. My position is that the next Lucene major should
> be on Java 21, regardless of when that will happen.
>
> Comments, feedback, suggestions welcome.
>
> Thanks,
> -Chris.
>
> [1] https://github.com/apache/lucene/pull/12753
>
>
>
>


Re: Welcome Zhang Chao as Lucene committer

2024-02-21 Thread Michael McCandless
Welcome Chao!

Mike McCandless

http://blog.mikemccandless.com


On Wed, Feb 21, 2024 at 5:02 AM Stefan Vodita 
wrote:

> Congratulations, Chao!
>
> On Tue, 20 Feb 2024 at 17:28, Adrien Grand  wrote:
>
>> I'm pleased to announce that Zhang Chao has accepted the PMC's
>> invitation to become a committer.
>>
>> Chao, the tradition is that new committers introduce themselves with a
>> brief bio.
>>
>> Congratulations and welcome!
>>
>> --
>> Adrien
>>
>


Re: Announcing githubsearch!

2024-02-21 Thread Michael McCandless
On Tue, Feb 20, 2024 at 10:06 AM Stefan Vodita 
wrote:

> Thank you Mike, I really like all the facets!
>

Me too lol.  It was one of the big motivators for me to build this out.
GitHub's search didn't have all the facet drill-downs/up/sideways I
wanted.  Some of them are super useful like "which PRs have review
requested for me" or "where am I mentioned".
Also, GitHub's filter choices do not seem to be dynamically generated for
this query -- so you can pick a filter value and it brings you to 0 hits,
violating the "no dead end" promise of Lucene's facets.
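
The "no dead end" point can be sketched as follows: if the filter choices are counted from the hits of the current query, every value offered has at least one hit behind it. This is only a toy illustration in plain Java (the class and method names are assumptions, and it does not use Lucene's facets module).

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class FacetCountSketch {

  /** Counts how many hits carry each facet value; values with no hits never appear. */
  static Map<String, Integer> countFacet(List<String> facetValueOfEachHit) {
    Map<String, Integer> counts = new TreeMap<>();
    for (String value : facetValueOfEachHit) {
      counts.merge(value, 1, Integer::sum);
    }
    return counts;
  }

  public static void main(String[] args) {
    // Hypothetical "state" facet values for the three hits of some query.
    System.out.println(countFacet(List.of("open", "closed", "open"))); // prints {closed=1, open=2}
  }
}
```

A value such as "merged" is simply absent from the map here, so the UI would never offer it for this query.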

I was also disappointed with GitHub search's lack of hit highlighting, to
solve the "final inch" problem (show me specifically where, in this
massive massive list of comments on a PR/issue, my search terms appear),
and also not showing me the individual comment or code review comment
(multiple ones of those on a PR) where my search terms appear, lack of
linking directly to that comment, etc.  Githubsearch uses Lucene's block
joins to achieve this.

GitHub's search doesn't offer a blended relevance+recency sort, which I
think makes a great default.  It looks like it does support phrase search
(with double-quotes), curious how that works with ngrams.
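
As a rough idea of what a blended relevance+recency sort can look like, here is a minimal sketch. The decay formula, the half-life parameter and all names are assumptions chosen for illustration; this is not the expression githubsearch actually uses.

```java
final class BlendedSortSketch {

  /**
   * Hypothetical blend: the relevance score is halved for every halfLifeDays of age,
   * so a fresh, slightly less relevant hit can outrank an old, more relevant one.
   */
  static double blendedScore(double relevance, double ageDays, double halfLifeDays) {
    return relevance * Math.pow(0.5, ageDays / halfLifeDays);
  }

  public static void main(String[] args) {
    System.out.println(blendedScore(2.0, 30, 30)); // 30-day-old hit, 30-day half-life
    System.out.println(blendedScore(1.5, 0, 30));  // brand new hit with lower relevance
  }
}
```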

I do like that the text query language includes all of the sort/filter
criteria -- the "is:open" and "sort:comments-desc".  Githubsearch doesn't
support that through the text query language, just the facets UI / REST
query URL.

Anyway, I don't want to complain (too much) about GitHub's search efforts.
Search is clearly hard, and we all (Lucene experts) have a fairly
biased/opinionated take on it all, heh.  I've never met a search engine
that I'm fully happy with ;)

> One thing that bothered me about GitHub's own search was that it would
> return different results if I wasn't signed in. Maybe it does early
> stopping for non-authenticated users? In any case, this won't be a
> problem with githubsearch.
>

Oh, that is very interesting -- I didn't know that.

Wow, I just tested -- indeed, you cannot even search the source code (for
Lucene's repo anyways) if you are not signed in.  That's weird.

For issues/PRs searching, the three queries I tried seem to produce the
same results signed in or out.  But it is scary/dangerous if this can
differ!!


> Have you considered indexing the Lucene source code too?
>

Oh my, I have not (until now lol).  That's a great idea.  Source code
tokenization would be such a fun problem ... I wonder if GitHub
open-sources how they tokenize the many different languages' source code.
GitHub's code search is in Rust (not using Lucene nor Rucene), a custom
search engine they recently built / switched to:
https://github.blog/2023-02-06-the-technology-behind-githubs-new-code-search,
away from Elasticsearch previously I think.  It looks like they use ngrams,
maybe instead of language-specific tokenization (?), to do the initial
matching/retrieval.  I would try normal lexical tokenization to see if
highlighting could work well.

I opened this luceneserver/GitHubSearch issue to think about this
... it'd sure be fun to build and use :)  Thank you for the suggestion
Stefan!

Mike McCandless

http://blog.mikemccandless.com

>


Re: Welcome Zhang Chao as Lucene committer

2024-02-21 Thread Stefan Vodita
Congratulations, Chao!

On Tue, 20 Feb 2024 at 17:28, Adrien Grand  wrote:

> I'm pleased to announce that Zhang Chao has accepted the PMC's
> invitation to become a committer.
>
> Chao, the tradition is that new committers introduce themselves with a
> brief bio.
>
> Congratulations and welcome!
>
> --
> Adrien
>


Re: Welcome Zhang Chao as Lucene committer

2024-02-21 Thread Lu Xugang
Congrats and welcome, Chao
Xugang
https://www.amazingkoala.com.cn/


Alan Woodward  于2024年2月21日周三 17:18写道:

> Congratulations and welcome!
>
> - Alan
>
> > On 20 Feb 2024, at 17:28, Adrien Grand  wrote:
> >
> > I'm pleased to announce that Zhang Chao has accepted the PMC's
> > invitation to become a committer.
> >
> > Chao, the tradition is that new committers introduce themselves with a
> > brief bio.
> >
> > Congratulations and welcome!
> >
> > --
> > Adrien
>
>
>
>


Re: Welcome Zhang Chao as Lucene committer

2024-02-21 Thread Alan Woodward
Congratulations and welcome!

- Alan

> On 20 Feb 2024, at 17:28, Adrien Grand  wrote:
> 
> I'm pleased to announce that Zhang Chao has accepted the PMC's
> invitation to become a committer.
> 
> Chao, the tradition is that new committers introduce themselves with a
> brief bio.
> 
> Congratulations and welcome!
> 
> -- 
> Adrien





Bump the Lucene main branch to Java 21

2024-02-21 Thread Chris Hegarty
Hi,

A number of us have been iterating on a PR to bump the Lucene main branch to a 
minimum of Java 21 [1]. The work is in a good state and is almost ready to 
commit.

While the changes themselves are not large, the impact is arguably larger. So 
I’m raising awareness here with the wider group.

Clearly one could conflate the bump to Java 21 with the question of when will 
Lucene have a next major release, but those issues, while somewhat related, are 
orthogonal. My position is that the next Lucene major should be on Java 21, 
regardless of when that will happen.

Comments, feedback, suggestions welcome.

Thanks,
-Chris.

[1] https://github.com/apache/lucene/pull/12753





Re:Welcome Zhang Chao as Lucene committer

2024-02-20 Thread 80152403
Thank you all! I am very honored to be invited as a committer, and thanks again
for all the great suggestions on my PRs/issues.

I am from Beijing, China, currently working at ByteDance. Around 2017, I started
working on Elasticsearch optimization in the infrastructure team. At first, my
main work was related to usability and cost optimization. In recent years, I have
focused on improving performance; I have always been very interested in this field.

Outside of work, I like running and playing table tennis, with the goal of improving
health and fitness. I used to like flying RC model aircraft, but flying 3D RC helicopter
aerobatics requires a lot of practice, and there are more important things to do, so
only the sports have been kept.



I am very happy to work with the best people in the world to improve Lucene!



--
Cheers
Zhang Chao










   
----- Original Email -----
Sender: "Adrien Grand" <jpou...@gmail.com>
Sent Time: 2024/2/21 1:28
To: "Lucene Dev" <dev@lucene.apache.org>
Cc: "Zhang Chao" <80152...@qq.com>
Subject: Welcome Zhang Chao as Lucene committer

I'm pleased to announce that Zhang Chao has accepted the PMC's
invitation to become a committer.

Chao, the tradition is that new committers introduce themselves with a
brief bio.

Congratulations and welcome!


--
Adrien

Re: Welcome Zhang Chao as Lucene committer

2024-02-20 Thread Guo Feng
Congratulations and welcome, Chao!

On 2024/02/20 17:41:07 Luca Cavanna wrote:
> Congrats and welcome
> 
> On Tue, Feb 20, 2024 at 6:28 PM Adrien Grand  wrote:
> 
> > I'm pleased to announce that Zhang Chao has accepted the PMC's
> > invitation to become a committer.
> >
> > Chao, the tradition is that new committers introduce themselves with a
> > brief bio.
> >
> > Congratulations and welcome!
> >
> > --
> > Adrien
> >
> 




Re: Welcome Zhang Chao as Lucene committer

2024-02-20 Thread Jianliang Qi
Congratulations Chao!!

On Wed, Feb 21, 2024 at 2:54 AM Vigya Sharma  wrote:

> Congratulations Zhang!
>
> On Tue, Feb 20, 2024 at 9:51 AM Chris Hegarty
>  wrote:
>
>> Congratulations and welcome!!
>>
>> -Chris.
>>
>> > On 20 Feb 2024, at 17:28, Adrien Grand  wrote:
>> >
>> > I'm pleased to announce that Zhang Chao has accepted the PMC's
>> > invitation to become a committer.
>> >
>> > Chao, the tradition is that new committers introduce themselves with a
>> > brief bio.
>> >
>> > Congratulations and welcome!
>> >
>> > --
>> > Adrien
>>
>>
>>
>>
>
> --
> - Vigya
>


-- 
Best regards.
Jianliang Qi


Re: (lucene) branch main updated: Fix bw index generation logic.

2024-02-20 Thread Adrien Grand
I had to fix a couple things for addBackcompatIndexes.py to work
properly. I pushed directly because it would have been a bit
cumbersome to run this script without pushing these changes first, but
I'd still appreciate a review if anyone is up for it.

On Tue, Feb 20, 2024 at 10:14 PM  wrote:
>
> This is an automated email from the ASF dual-hosted git repository.
>
> jpountz pushed a commit to branch main
> in repository https://gitbox.apache.org/repos/asf/lucene.git
>
>
> The following commit(s) were added to refs/heads/main by this push:
>  new 13d561af1d6 Fix bw index generation logic.
> 13d561af1d6 is described below
>
> commit 13d561af1d624f35f8a27a05490062ac2472e786
> Author: Adrien Grand 
> AuthorDate: Tue Feb 20 22:10:01 2024 +0100
>
> Fix bw index generation logic.
> ---
>  dev-tools/scripts/addBackcompatIndexes.py  | 13 +++-
>  .../BackwardsCompatibilityTestBase.java| 23 
> +++---
>  .../backward_index/TestGenerateBwcIndices.java |  2 ++
>  3 files changed, 25 insertions(+), 13 deletions(-)
>
> diff --git a/dev-tools/scripts/addBackcompatIndexes.py 
> b/dev-tools/scripts/addBackcompatIndexes.py
> index bbaf0b40630..7faacb8b8e3 100755
> --- a/dev-tools/scripts/addBackcompatIndexes.py
> +++ b/dev-tools/scripts/addBackcompatIndexes.py
> @@ -45,16 +45,13 @@ def create_and_add_index(source, indextype, 
> index_version, current_version, temp
>'emptyIndex': 'empty'
>  }[indextype]
>if indextype in ('cfs', 'nocfs'):
> -dirname = 'index.%s' % indextype
>  filename = '%s.%s-%s.zip' % (prefix, index_version, indextype)
>else:
> -dirname = indextype
>  filename = '%s.%s.zip' % (prefix, index_version)
>
>print('  creating %s...' % filename, end='', flush=True)
>module = 'backward-codecs'
>index_dir = os.path.join('lucene', module, 
> 'src/test/org/apache/lucene/backward_index')
> -  test_file = os.path.join(index_dir, filename)
>if os.path.exists(os.path.join(index_dir, filename)):
>  print('uptodate')
>  return
> @@ -76,24 +73,20 @@ def create_and_add_index(source, indextype, 
> index_version, current_version, temp
>  '-Dtests.codec=default'
>])
>base_dir = os.getcwd()
> -  bc_index_dir = os.path.join(temp_dir, dirname)
> -  bc_index_file = os.path.join(bc_index_dir, filename)
> +  bc_index_file = os.path.join(temp_dir, filename)
>
>if os.path.exists(bc_index_file):
>  print('alreadyexists')
>else:
> -if os.path.exists(bc_index_dir):
> -  shutil.rmtree(bc_index_dir)
>  os.chdir(source)
>  scriptutil.run('./gradlew %s' % gradle_args)
> -os.chdir(bc_index_dir)
> -scriptutil.run('zip %s *' % filename)
> +if not os.path.exists(bc_index_file):
> +  raise Exception("Expected file can't be found: %s" %bc_index_file)
>  print('done')
>
>print('  adding %s...' % filename, end='', flush=True)
>scriptutil.run('cp %s %s' % (bc_index_file, os.path.join(base_dir, 
> index_dir)))
>os.chdir(base_dir)
> -  scriptutil.run('rm -rf %s' % bc_index_dir)
>print('done')
>
>  def update_backcompat_tests(index_version, current_version):
> diff --git 
> a/lucene/backward-codecs/src/test/org/apache/lucene/backward_index/BackwardsCompatibilityTestBase.java
>  
> b/lucene/backward-codecs/src/test/org/apache/lucene/backward_index/BackwardsCompatibilityTestBase.java
> index 8df28d40dbc..b131bb9497b 100644
> --- 
> a/lucene/backward-codecs/src/test/org/apache/lucene/backward_index/BackwardsCompatibilityTestBase.java
> +++ 
> b/lucene/backward-codecs/src/test/org/apache/lucene/backward_index/BackwardsCompatibilityTestBase.java
> @@ -17,6 +17,7 @@
>  package org.apache.lucene.backward_index;
>
>  import com.carrotsearch.randomizedtesting.annotations.Name;
> +import java.io.FileOutputStream;
>  import java.io.IOException;
>  import java.io.InputStream;
>  import java.io.LineNumberReader;
> @@ -38,11 +39,17 @@ import java.util.function.Predicate;
>  import java.util.regex.Matcher;
>  import java.util.regex.Pattern;
>  import java.util.stream.Collectors;
> +import java.util.zip.ZipEntry;
> +import java.util.zip.ZipOutputStream;
>  import org.apache.lucene.codecs.Codec;
>  import org.apache.lucene.index.DirectoryReader;
>  import org.apache.lucene.index.LeafReaderContext;
>  import org.apache.lucene.index.SegmentReader;
>  import org.apache.lucene.store.Directory;
> +import org.apache.lucene.store.FSDirectory;
> +import org.apache.lucene.store.IOContext;
> +import org.apache.lucene.store.IndexInput;
> +import org.apache.lucene.store.OutputStreamDataOutput;
>  import org.apache.lucene.tests.util.LuceneTestCase;
>  import org.apache.lucene.tests.util.TestUtil;
>  import org.apache.lucene.util.BytesRef;
> @@ -253,10 +260,20 @@ public abstract class BackwardsCompatibilityTestBase 
> extends LuceneTestCase {
>protected abstract void createIndex(Directory directory) throws 
> IOException;
>
>public final void createBWCIndex() throws IOException {
> -

Re: Welcome Zhang Chao as Lucene committer

2024-02-20 Thread Vigya Sharma
Congratulations Zhang!

On Tue, Feb 20, 2024 at 9:51 AM Chris Hegarty
 wrote:

> Congratulations and welcome!!
>
> -Chris.
>
> > On 20 Feb 2024, at 17:28, Adrien Grand  wrote:
> >
> > I'm pleased to announce that Zhang Chao has accepted the PMC's
> > invitation to become a committer.
> >
> > Chao, the tradition is that new committers introduce themselves with a
> > brief bio.
> >
> > Congratulations and welcome!
> >
> > --
> > Adrien
>
>
>
>

-- 
- Vigya


Re: Welcome Zhang Chao as Lucene committer

2024-02-20 Thread Chris Hegarty
Congratulations and welcome!! 

-Chris.

> On 20 Feb 2024, at 17:28, Adrien Grand  wrote:
> 
> I'm pleased to announce that Zhang Chao has accepted the PMC's
> invitation to become a committer.
> 
> Chao, the tradition is that new committers introduce themselves with a
> brief bio.
> 
> Congratulations and welcome!
> 
> -- 
> Adrien





Re: Welcome Zhang Chao as Lucene committer

2024-02-20 Thread Luca Cavanna
Congrats and welcome

On Tue, Feb 20, 2024 at 6:28 PM Adrien Grand  wrote:

> I'm pleased to announce that Zhang Chao has accepted the PMC's
> invitation to become a committer.
>
> Chao, the tradition is that new committers introduce themselves with a
> brief bio.
>
> Congratulations and welcome!
>
> --
> Adrien
>


[ANNOUNCE] Apache Lucene 9.10.0 released

2024-02-20 Thread Adrien Grand
The Lucene PMC is pleased to announce the release of Apache Lucene 9.10.

Apache Lucene is a high-performance, full-featured search engine library
written entirely in Java. It is a technology suitable for nearly any
application that requires structured search, full-text search, faceting,
nearest-neighbor search on high-dimensionality vectors, spell correction or
query suggestions.

This release contains numerous features, optimizations, and improvements,
some of which are highlighted below. The release is available for immediate
download at:
  https://lucene.apache.org/core/downloads.html

Lucene 9.10 Release Highlights

New Features

 * Support for similarity-based vector searches, i.e. finding all nearest
neighbors whose similarity is greater than a configured threshold from a
query vector. See [Byte|Float]VectorSimilarityQuery.

 * Index sorting is now compatible with block joins. See
IndexWriterConfig#setParentField.

 * MMapDirectory now takes advantage of the now finalized JDK foreign
memory API internally when running on Java 22 (or later). This was only
supported with Java 19 to 21 until now.

 * SIMD vectorization now takes advantage of JDK vector incubator on Java
22. This was only supported with Java 20 or 21 until now.
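
To illustrate the first item above: a similarity-threshold search returns every vector whose similarity to the query exceeds a cutoff, rather than a fixed top-k. The sketch below is a brute-force illustration in plain Java under stated assumptions (cosine similarity, in-memory arrays, hypothetical names); it does not use or describe the internals of [Byte|Float]VectorSimilarityQuery.

```java
import java.util.ArrayList;
import java.util.List;

final class SimilarityThresholdSketch {

  /** Cosine similarity of two vectors of equal length. */
  static double cosine(float[] a, float[] b) {
    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.length; i++) {
      dot += a[i] * b[i];
      normA += a[i] * a[i];
      normB += b[i] * b[i];
    }
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
  }

  /** Returns the indexes of all docs whose similarity to the query is greater than threshold. */
  static List<Integer> matchesAbove(float[][] docs, float[] query, double threshold) {
    List<Integer> hits = new ArrayList<>();
    for (int doc = 0; doc < docs.length; doc++) {
      if (cosine(docs[doc], query) > threshold) {
        hits.add(doc);
      }
    }
    return hits;
  }

  public static void main(String[] args) {
    float[][] docs = {{1, 0}, {0, 1}, {1, 1}};
    System.out.println(matchesAbove(docs, new float[] {1, 0}, 0.5)); // prints [0, 2]
  }
}
```

Unlike a top-k search, the number of results here depends entirely on the data and the threshold.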

Optimizations

 * Tail postings are now encoded using group-varint. This yielded speedups
on queries that match lots of terms that have short postings lists in
Lucene's nightly benchmarks.

 * Range queries on points now exit earlier when evaluating a segment that
has no matches. This will improve performance when intersected with other
queries that have a high up-front cost such as multi-term queries.

 * BooleanQueries that mix SHOULD and FILTER clauses now propagate minimum
competitive scores to the SHOULD clauses, yielding significant speedups for
top-k queries sorted by descending score.

 * IndexSearcher#count has been optimized on pure disjunctions of two term
queries.
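
For readers unfamiliar with group-varint (first item above), here is a minimal sketch of the classic scheme: four ints share one tag byte that stores each value's byte length in 2 bits, followed by the value bytes. The byte order and class name are assumptions for illustration; Lucene's actual on-disk layout may differ in its details.

```java
import java.io.ByteArrayOutputStream;

final class GroupVarintSketch {

  /** Number of bytes (1..4) needed to hold v, treating it as unsigned. */
  static int numBytes(int v) {
    if ((v & 0xFFFFFF00) == 0) return 1;
    if ((v & 0xFFFF0000) == 0) return 2;
    if ((v & 0xFF000000) == 0) return 3;
    return 4;
  }

  /** Encodes exactly four ints: one tag byte, then 4 to 16 little-endian data bytes. */
  static byte[] encodeGroup(int[] values) {
    if (values.length != 4) {
      throw new IllegalArgumentException("expected exactly 4 values");
    }
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    int tag = 0;
    for (int i = 0; i < 4; i++) {
      tag |= (numBytes(values[i]) - 1) << (i * 2); // 2 bits per value
    }
    out.write(tag);
    for (int v : values) {
      for (int b = 0, n = numBytes(v); b < n; b++) {
        out.write((v >>> (8 * b)) & 0xFF);
      }
    }
    return out.toByteArray();
  }

  /** Decodes one group produced by encodeGroup. */
  static int[] decodeGroup(byte[] in) {
    int tag = in[0] & 0xFF;
    int pos = 1;
    int[] values = new int[4];
    for (int i = 0; i < 4; i++) {
      int n = ((tag >>> (i * 2)) & 0x3) + 1; // byte length of value i
      int v = 0;
      for (int b = 0; b < n; b++) {
        v |= (in[pos++] & 0xFF) << (8 * b);
      }
      values[i] = v;
    }
    return values;
  }

  public static void main(String[] args) {
    byte[] encoded = encodeGroup(new int[] {1, 300, 70000, 0x7FFFFFFF});
    System.out.println(encoded.length); // prints 11 (1 tag byte + 1 + 2 + 3 + 4 data bytes)
  }
}
```

The appeal over plain varint is that the decoder reads all four lengths from the tag byte up front instead of testing a continuation bit on every byte.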

... plus a multitude of helpful bug fixes!

Further details of changes are available in the change log available at:
http://lucene.apache.org/core/9_10_0/changes/Changes.html.

Please report any feedback to the mailing lists (
http://lucene.apache.org/core/discussion.html)

Note: The Apache Software Foundation uses an extensive mirroring network
for distributing releases. It is possible that the mirror you are using may
not have replicated the release yet. If that is the case, please try
another mirror. This also applies to Maven access.

-- 
Adrien


Welcome Zhang Chao as Lucene committer

2024-02-20 Thread Adrien Grand
I'm pleased to announce that Zhang Chao has accepted the PMC's
invitation to become a committer.

Chao, the tradition is that new committers introduce themselves with a
brief bio.

Congratulations and welcome!

-- 
Adrien


Re: Announcing githubsearch!

2024-02-20 Thread Walter Underwood
Oops, I followed a link which went to the main GitHub search. Nevermind.

I’m getting zero results for “wunder” now, no error. Looks like my username 
there is “wrunderwood”, that is working correctly as are quoted searches for my 
name.

I’ll fool around some more, but so far it looks clean and fast. 

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Feb 20, 2024, at 3:29 AM, Michael McCandless  
> wrote:
> 
> On Mon, Feb 19, 2024 at 1:00 PM Walter Underwood  > wrote:
> 
>> It appears to always search prefixes, so there is no way to search for 
>> “wunder” without getting “wundermap” and “wunderground”. Putting the term in 
>> quotes doesn’t turn that off.
> 
> Hmm that shouldn't be the case?  It does split on camel case though (thank 
> you WordDelimiterFilter!).  E.g. try searching on infix 
> 
>  and you should see it highlighted inside terms like AnalyzingInfixSuggester.
> 
> In fact when I search for wunder 
> 
>  I get a horrible exception, I think I know why (it happens for any query 
> that gets no hits!).  I opened this issue 
> .  I'll try to fix that 
> soon.
> 
> Walter, I'm not sure how you were able to even search on "wunder" -- did you 
> get actual results?  From githubsearch 
> ?
> 
> Mike McCandless
> 
> http://blog.mikemccandless.com 
> 



Re: Announcing githubsearch!

2024-02-20 Thread Stefan Vodita
Thank you Mike, I really like all the facets!

One thing that bothered me about GitHub's own search was that it would
return
different results if I wasn't signed in. Maybe it does early stopping for
non-authenticated users? In any case, this won't be a problem with
githubsearch.

Have you considered indexing the Lucene source code too?


Stefan

On Mon, 19 Feb 2024 at 16:40, Michael McCandless 
wrote:

> Hi Team,
>
> ~1.5 years ago (August 2022) we migrated our Lucene issue tracking from
> Jira to GitHub. Thank you Tomoko for all the hard work doing such a
> complex, multi-phased, high-fidelity migration!
>
> I finally finished also migrating jirasearch to GitHub:
> githubsearch.mikemccandless.com. It was tricky because GitHub issues/PRs
> are fundamentally more complex than Jira's data model, and the GitHub REST
> API is also quite rich / heavily normalized. All of the source code for
> githubsearch lives here. The UI remains its barebones self ;)
>
> Githubsearch is dog food for us: it showcases Lucene (currently 9.8.0),
> and many of its fun features like infix autosuggest, block join queries
> (each comment is a sub-document on the issue/PR), DrillSideways faceting,
> near-real-time indexing/searching, synonyms (try “oome”), expressions,
> non-relevance and blended-relevance sort, etc.  (This old blog post goes
> into detail.)  Plus, it’s meta-fun to use Lucene to search its own issues,
> to help us be more productive in improving Lucene!  Nicely recursive.
>
> In addition to good ol’ searching by text, githubsearch has some new/fun
> features:
>
>    - Drill down to just PRs or issues
>    - Filter by “review requested” for a given user: poor Adrien has 8
>    (open) now (sorry)! Or see your mentions (Robert is mentioned in 27
>    open issues/PRs). Or PRs that you reviewed (Uwe has reviewed 9
>    still-open PRs). Or issues and PRs where a user has had any
>    involvement at all (Dawid has interacted on 197 issues/PRs).
>    - Find still-open PRs that were created by a New Contributor (an
>    author who has no changes merged into our repository) or Contributor
>    (non-committer who has had some changes merged into our repository)
>    or Member
>    - Here are the uber-stale (last touched more than a month ago) open
>    PRs by outside contributors. We should ideally keep this at 0, but
>    it’s 83 now!
>    - “Link to this search” to get a short-er, more permanent URL (it is
>    NOT a URL shortener, though!)
>    - Save named searches you frequently run (they just save to local
>    cookie state on that one browser)
>
> I’m sure there are exciting bugs, feedback/patches welcome!  If you see
> problems, please reply to this email or file an issue here.
>
> Note that jirasearch remains running, to search Solr, Tika and Infra
>
> Happy Searching,
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>


Re: Announcing githubsearch!

2024-02-20 Thread Chris Hegarty
Awesome! I love it. Very useful.

-Chris.

> On 20 Feb 2024, at 11:40, Michael McCandless  
> wrote:
> 
> Thank you for all the warm feedback everyone, and all the exciting issues 
> already uncovered / ideas for improvements.  Now I have some more fun work to 
> do!
> 
> Mike McCandless
> 
> http://blog.mikemccandless.com
> 
> 
> On Mon, Feb 19, 2024 at 12:58 PM Julie Tibshirani  wrote:
> This is so cool! Thank you Mike for developing and hosting these services!
> 
> Julie
> 
> On Mon, Feb 19, 2024 at 9:40 AM Michael Wechner  
> wrote:
> thank you very much!
> 
> Am 19.02.24 um 17:39 schrieb Michael McCandless:
>> Hi Team,
>> 
>> ~1.5 years ago (August 2022) we migrated our Lucene issue tracking from Jira 
>> to GitHub. Thank you Tomoko for all the hard work doing such a complex, 
>> multi-phased, high-fidelity migration!
>> 
>> I finally finished also migrating jirasearch to GitHub: 
>> githubsearch.mikemccandless.com. It was tricky because GitHub issues/PRs are 
>> fundamentally more complex than Jira's data model, and the GitHub REST API 
>> is also quite rich / heavily normalized. All of the source code for 
>> githubsearch lives here. The UI remains its barebones self ;)
>> 
>> Githubsearch is dog food for us: it showcases Lucene (currently 9.8.0), and 
>> many of its fun features like infix autosuggest, block join queries (each 
>> comment is a sub-document on the issue/PR), DrillSideways faceting, 
>> near-real-time indexing/searching, synonyms (try “oome”), expressions, 
>> non-relevance and blended-relevance sort, etc.  (This old blog post goes 
>> into detail.)  Plus, it’s meta-fun to use Lucene to search its own issues, 
>> to help us be more productive in improving Lucene!  Nicely recursive.
>> 
>> In addition to good ol’ searching by text, githubsearch has some new/fun 
>> features:
>> • Drill down to just PRs or issues
>> • Filter by “review requested” for a given user: poor Adrien has 8 
>> (open) now (sorry)! Or see your mentions (Robert is mentioned in 27 open 
>> issues/PRs). Or PRs that you reviewed (Uwe has reviewed 9 still-open PRs). 
>> Or issues and PRs where a user has had any involvement at all (Dawid has 
>> interacted on 197 issues/PRs).
>> • Find still-open PRs that were created by a New Contributor (an author 
>> who has no changes merged into our repository) or Contributor (non-committer 
>> who has had some changes merged into our repository) or Member
>> • Here are the uber-stale (last touched more than a month ago) open PRs 
>> by outside contributors. We should ideally keep this at 0, but it’s 83 now!
>> • “Link to this search” to get a short-er, more permanent URL (it is NOT 
>> a URL shortener, though!)
>> • Save named searches you frequently run (they just save to local cookie 
>> state on that one browser)
>> I’m sure there are exciting bugs, feedback/patches welcome!  If you see 
>> problems, please reply to this email or file an issue here. 
>> 
>> Note that jirasearch remains running, to search Solr, Tika and Infra issues.
>> 
>> Happy Searching,
>> 
>> Mike McCandless
>> 
>> http://blog.mikemccandless.com
> 





unsubscribe

2024-02-20 Thread Gino Rodrigues



Re: Announcing githubsearch!

2024-02-20 Thread Michael McCandless
Thank you for all the warm feedback everyone, and all the exciting issues
already uncovered / ideas for improvements.  Now I have some more fun work
to do!

Mike McCandless

http://blog.mikemccandless.com


On Mon, Feb 19, 2024 at 12:58 PM Julie Tibshirani 
wrote:

> This is so cool! Thank you Mike for developing and hosting these services!
>
> Julie
>
> On Mon, Feb 19, 2024 at 9:40 AM Michael Wechner 
> wrote:
>
>> thank you very much!
>>
>> Am 19.02.24 um 17:39 schrieb Michael McCandless:
>>
>> Hi Team,
>>
>> ~1.5 years ago (August 2022) we migrated our Lucene issue tracking from
>> Jira to GitHub. Thank you Tomoko for all the hard work doing such a
>> complex, multi-phased, high-fidelity migration!
>>
>> I finally finished also migrating jirasearch to GitHub:
>> githubsearch.mikemccandless.com. It was tricky because GitHub issues/PRs
>> are fundamentally more complex than Jira's data model, and the GitHub REST
>> API is also quite rich / heavily normalized. All of the source code for
>> githubsearch lives here
>> .
>> The UI remains its barebones self ;)
>>
>> Githubsearch
>> 
>> is dog food for us: it showcases Lucene (currently 9.8.0), and many of its
>> fun features like infix autosuggest, block join queries (each comment is a
>> sub-document on the issue/PR), DrillSideways faceting, near-real-time
>> indexing/searching, synonyms (try “oome
>> ”),
>> expressions, non-relevance and blended-relevance sort, etc.  (This old
>> blog post
>> 
>>  goes
>> into detail.)  Plus, it’s meta-fun to use Lucene to search its own issues,
>> to help us be more productive in improving Lucene!  Nicely recursive.
>>
>> In addition to good ol’ searching by text, githubsearch
>>  has some new/fun features:
>>
>>- Drill down to just PRs or issues
>>- Filter by “review requested” for a given user: poor Adrien has 8
>>(open) now
>>
>> 
>>(sorry)! Or see your mentions (Robert is mentioned in 27 open
>>issues/PRs
>>
>> ).
>>Or PRs that you reviewed (Uwe has reviewed 9 still-open PRs
>>
>> ).
>>Or issues and PRs where a user has had any involvement at all (Dawid
>>has interacted on 197 issues/PRs
>>
>> 
>>).
>>- Find still-open PRs that were created by a New Contributor
>>
>> 
>>(an author who has no changes merged into our repository) or
>>Contributor
>>
>> 
>>(non-committer who has had some changes merged into our repository) or
>>Member
>>
>> 
>>- Here are the uber-stale (last touched more than a month ago) open
>>PRs by outside contributors
>>
>> .
>>We should ideally keep this at 0, but it’s 83 now!
>>- “Link to this search” to get a short-er, more permanent URL (it is
>>NOT a URL shortener, though!)
>>- Save named searches you frequently run (they just save to local
>>cookie state on that one browser)
>>
>> I’m sure there are exciting bugs, feedback/patches welcome!  If you see
>> problems, please reply to this email or file an issue here.
>>
>> Note that jirasearch remains running, to search Solr, Tika and Infra
>> issues.
>>
>> Happy Searching,
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>>
>>


Re: Announcing githubsearch!

2024-02-20 Thread Michael McCandless
On Mon, Feb 19, 2024 at 1:00 PM Walter Underwood wrote:

> It appears to always search prefixes, so there is no way to search for
> “wunder” without getting “wundermap” and “wunderground”. Putting the term
> in quotes doesn’t turn that off.
>

Hmm that shouldn't be the case?  It does split on camel case though (thank
you WordDelimiterFilter!).  E.g. try searching on infix and you should see
it highlighted inside terms like AnalyzingInfixSuggester.
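
To illustrate the camel-case splitting mentioned above, here is a minimal
Python sketch. It is only a rough stand-in for that one behavior, not
Lucene's WordDelimiterFilter (which does much more), and split_camel_case
is a hypothetical helper name used just for this example:

```python
import re


def split_camel_case(token):
    """Split a token at camel-case boundaries and lowercase the parts.

    Hypothetical helper: it approximates only the camel-case splitting
    step and ignores the other rules a real analyzer would apply.
    """
    # Alternatives, in order: a run of capitals not followed by a
    # lowercase letter (an acronym), an optionally capitalized word,
    # or a run of digits.
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", token)
    return [part.lower() for part in parts]


print(split_camel_case("AnalyzingInfixSuggester"))
print(split_camel_case("wundermap"))
```

With tokens split this way, a query for infix can match the middle part of
AnalyzingInfixSuggester, while an all-lowercase token like wundermap stays a
single term.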

In fact when I search for wunder I get a horrible exception. I think I know
why (it happens for any query that gets no hits!).  I opened this issue.
I'll try to fix that soon.

Walter, I'm not sure how you were able to even search on "wunder" -- did
you get actual results?  From githubsearch?

Mike McCandless

http://blog.mikemccandless.com


Re: Announcing githubsearch!

2024-02-20 Thread Michael McCandless
On Tue, Feb 20, 2024 at 6:01 AM Michael Sokolov wrote:

> I love the gray all text UI. Don't change it! But I wonder if it's time for
> a favicon?
>

LOL favicon!  You do NOT want to have to confront my artistic skills!

Mike McCandless

http://blog.mikemccandless.com

>


Re: Announcing githubsearch!

2024-02-20 Thread Michael Sokolov
I love the gray all text UI. Don't change it! But I wonder if it's time for
a favicon?

On Tue, Feb 20, 2024, 4:40 AM Adrien Grand wrote:

> Very cool, thank you Mike!
>
> On Mon, Feb 19, 2024 at 5:40 PM Michael McCandless <
> luc...@mikemccandless.com> wrote:
>
>> Hi Team,
>>
>> ~1.5 years ago (August 2022) we migrated our Lucene issue tracking from
>> Jira to GitHub. Thank you Tomoko for all the hard work doing such a
>> complex, multi-phased, high-fidelity migration!
>>
>> I finally finished also migrating jirasearch to GitHub:
>> githubsearch.mikemccandless.com. It was tricky because GitHub issues/PRs
>> are fundamentally more complex than Jira's data model, and the GitHub REST
>> API is also quite rich / heavily normalized. All of the source code for
>> githubsearch lives here.
>> The UI remains its barebones self ;)
>>
>> Githubsearch is dog food for us: it showcases Lucene (currently 9.8.0),
>> and many of its fun features like infix autosuggest, block join queries
>> (each comment is a sub-document on the issue/PR), DrillSideways faceting,
>> near-real-time indexing/searching, synonyms (try “oome”), expressions,
>> non-relevance and blended-relevance sort, etc.  (This old blog post goes
>> into detail.)  Plus, it’s meta-fun to use Lucene to search its own issues,
>> to help us be more productive in improving Lucene!  Nicely recursive.
>>
>> In addition to good ol’ searching by text, githubsearch has some new/fun
>> features:
>>
>> - Drill down to just PRs or issues
>> - Filter by “review requested” for a given user: poor Adrien has 8
>>   (open) now (sorry)! Or see your mentions (Robert is mentioned in 27
>>   open issues/PRs). Or PRs that you reviewed (Uwe has reviewed 9
>>   still-open PRs). Or issues and PRs where a user has had any
>>   involvement at all (Dawid has interacted on 197 issues/PRs).
>> - Find still-open PRs that were created by a New Contributor (an author
>>   who has no changes merged into our repository) or Contributor
>>   (non-committer who has had some changes merged into our repository)
>>   or Member
>> - Here are the uber-stale (last touched more than a month ago) open
>>   PRs by outside contributors. We should ideally keep this at 0, but
>>   it’s 83 now!
>> - “Link to this search” to get a short-er, more permanent URL (it is
>>   NOT a URL shortener, though!)
>> - Save named searches you frequently run (they just save to local
>>   cookie state on that one browser)
>>
>> I’m sure there are exciting bugs, feedback/patches welcome!  If you see
>> problems, please reply to this email or file an issue here.
>>
>> Note that jirasearch remains running, to search Solr, Tika and Infra
>> issues.
>>
>> Happy Searching,
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>
>
> --
> Adrien
>


Re: Announcing githubsearch!

2024-02-20 Thread Adrien Grand
Very cool, thank you Mike!

On Mon, Feb 19, 2024 at 5:40 PM Michael McCandless <
luc...@mikemccandless.com> wrote:

> Hi Team,
>
> ~1.5 years ago (August 2022) we migrated our Lucene issue tracking from
> Jira to GitHub. Thank you Tomoko for all the hard work doing such a
> complex, multi-phased, high-fidelity migration!
>
> I finally finished also migrating jirasearch to GitHub:
> githubsearch.mikemccandless.com. It was tricky because GitHub issues/PRs
> are fundamentally more complex than Jira's data model, and the GitHub REST
> API is also quite rich / heavily normalized. All of the source code for
> githubsearch lives here.
> The UI remains its barebones self ;)
>
> Githubsearch is dog food for us: it showcases Lucene (currently 9.8.0),
> and many of its fun features like infix autosuggest, block join queries
> (each comment is a sub-document on the issue/PR), DrillSideways faceting,
> near-real-time indexing/searching, synonyms (try “oome”), expressions,
> non-relevance and blended-relevance sort, etc.  (This old blog post goes
> into detail.)  Plus, it’s meta-fun to use Lucene to search its own issues,
> to help us be more productive in improving Lucene!  Nicely recursive.
>
> In addition to good ol’ searching by text, githubsearch has some new/fun
> features:
>
> - Drill down to just PRs or issues
> - Filter by “review requested” for a given user: poor Adrien has 8
>   (open) now (sorry)! Or see your mentions (Robert is mentioned in 27
>   open issues/PRs). Or PRs that you reviewed (Uwe has reviewed 9
>   still-open PRs). Or issues and PRs where a user has had any
>   involvement at all (Dawid has interacted on 197 issues/PRs).
> - Find still-open PRs that were created by a New Contributor (an author
>   who has no changes merged into our repository) or Contributor
>   (non-committer who has had some changes merged into our repository)
>   or Member
> - Here are the uber-stale (last touched more than a month ago) open
>   PRs by outside contributors. We should ideally keep this at 0, but
>   it’s 83 now!
> - “Link to this search” to get a short-er, more permanent URL (it is
>   NOT a URL shortener, though!)
> - Save named searches you frequently run (they just save to local
>   cookie state on that one browser)
>
> I’m sure there are exciting bugs, feedback/patches welcome!  If you see
> problems, please reply to this email or file an issue here.
>
> Note that jirasearch remains running, to search Solr, Tika and Infra
> issues.
>
> Happy Searching,
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>


-- 
Adrien

