Re: Lucene 9.2 release

2022-05-03 Thread Tomoko Uchida
+1 Thank you Alan! I wonder if it makes sense to mention in the highlighted updates that pull requests to the GitHub repository no longer require Jira issues? I'm trying to make the contribution workflow more GitHub-oriented, and there is a related issue https://issues.apache.org/jira/browse/LUC

Re: Lucene 9.2 release

2022-05-03 Thread Ignacio Vera
+1 Thanks Alan!

> On 3. May 2022, at 13:01, Alan Woodward wrote:
>
> Hi all,
>
> It’s been six weeks or so since we released 9.1, and we have a bunch of nice
> new features and enhancements piling up in the 9.x branch. I’d like to
> volunteer to be a release manager for a 9.2 release. I

Re: Changing type of the tokens generated by pattern tokenizer

2022-05-03 Thread Robert Muir
As an alternative to writing a custom tokenizer, you can use the built-in PatternTypingFilter, which does exactly this (it sets the token type based on whether the token matches some regex).

https://lucene.apache.org/core/9_1_0/analysis/common/org/apache/lucene/analysis/pattern/PatternTypingFilter.html

On Tue, May 3, 202

REMINDER - Travel Assistance available for ApacheCon NA New Orleans 2022

2022-05-03 Thread Gavin McDonald
Hi All Contributors and Committers, This is a first reminder email that travel assistance applications for ApacheCon NA 2022 are now open! We will be supporting ApacheCon North America in New Orleans, Louisiana, on October 3rd through 6th, 2022. TAC exists to help those that would like to attend

Lucene 9.2 release

2022-05-03 Thread Alan Woodward
Hi all, It’s been six weeks or so since we released 9.1, and we have a bunch of nice new features and enhancements piling up in the 9.x branch. I’d like to volunteer to be a release manager for a 9.2 release. I propose to cut a branch this time next week, 10th May. - Alan

Re: Changing type of the tokens generated by pattern tokenizer

2022-05-03 Thread Tomoko Uchida
Hi, you pass input.toString() to the matcher - this is the entire source character stream to be tokenized; I think this would lead to the result you saw. If you'd like to match the pattern to the specific token (a substring of the input), I think you may want to give the substring of the input stri
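
To illustrate the point about matching a substring rather than the whole stream, here is a minimal sketch using only java.util.regex. It is not the original poster's tokenizer: the class name TokenTypeSketch, the helper typeOf, the digit pattern, and the "NUM"/"WORD" type names are all hypothetical, chosen only for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TokenTypeSketch {
    // Hypothetical rule: a token made only of digits is typed "NUM", anything else "WORD".
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    /** Types one token by matching only the characters in [start, end) of the input. */
    public static String typeOf(CharSequence input, int start, int end) {
        Matcher m = DIGITS.matcher(input.subSequence(start, end));
        return m.matches() ? "NUM" : "WORD";
    }

    public static void main(String[] args) {
        String input = "abc 123";

        // Matching against the entire input gives one answer for the whole stream,
        // so every token would receive the same type.
        System.out.println(DIGITS.matcher(input).find()); // prints "true"

        // Matching against each token's own character range gives a per-token answer.
        System.out.println(typeOf(input, 0, 3)); // prints "WORD"
        System.out.println(typeOf(input, 4, 7)); // prints "NUM"
    }
}
```

In a real tokenizer the start and end offsets would come from the token boundaries the tokenizer has already computed; the sketch hard-codes them for brevity.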

Changing type of the tokens generated by pattern tokenizer

2022-05-03 Thread dishant sharma
I am creating a custom Pattern Tokenizer to change the type of the generated tokens. My incrementToken() function looks like the code below:

public boolean incrementToken() {
  if (index >= str.length()) return false;
  clearAttributes();
  if (group >= 0) {
    // match a specific grou