Hello,
I'm surprised, and I doubt that can happen. Would you mind uploading a short
test that reproduces it?
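
As a starting point for such a test, here is a minimal plain-Java sketch. It
does not use Lucene at all: `PercentTokenSketch`, `whitespaceTokens` and
`alphanumericTokens` are hypothetical helpers that only approximate
whitespace splitting and a punctuation-dropping tokenizer, so the real
WhitespaceTokenizer and StandardTokenizer may behave differently in detail.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration only: these helpers approximate tokenizer behaviour
// and are NOT Lucene's WhitespaceTokenizer or StandardTokenizer.
class PercentTokenSketch {

    // Split on runs of whitespace; punctuation such as % and - stays in the token.
    static List<String> whitespaceTokens(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
    }

    // Rough stand-in for a tokenizer that drops punctuation:
    // keep only runs of letters and digits.
    static List<String> alphanumericTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // The three sample documents from the original question.
        String[] docs = {
            "Number of boys 50",
            "My score was 50%",
            "40-50% for pass score"
        };
        for (String doc : docs) {
            System.out.println(doc);
            System.out.println("  whitespace:   " + whitespaceTokens(doc));
            System.out.println("  alphanumeric: " + alphanumericTokens(doc));
        }
    }
}
```

In this sketch, whitespace splitting keeps "50%" as a token for the second
document but produces the single token "40-50%" for the third, so an exact
term match on "50%" would find only the second document. That is worth
checking against the real analyzer chain in the test.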

On Wed, Sep 20, 2023 at 11:44 PM Amitesh Kumar <amiteshk...@gmail.com>
wrote:

> Thanks Mikhail!
>
> I have tried all the other tokenizers from Lucene 4.4. With
> WhitespaceTokenizer, it loses the romanizing of special chars like - etc.
>
>
> On Wed, Sep 20, 2023 at 16:39 Mikhail Khludnev <m...@apache.org> wrote:
>
> > Hello,
> > Check the whitespace tokenizer.
> >
> > On Wed, Sep 20, 2023 at 7:46 PM Amitesh Kumar <amiteshk...@gmail.com>
> > wrote:
> >
> > > Hi,
> > >
> > > I am facing a requirement change to retain the % sign in searches,
> > > e.g.
> > >
> > > Sample search docs:
> > > 1. Number of boys 50
> > > 2. My score was 50%
> > > 3. 40-50% for pass score
> > >
> > > Search query: 50%
> > > Expected results: Doc-2, Doc-3, i.e.
> > > 1. My score was 50%
> > > 2. 40-50% for pass score
> > >
> > > Actual result: All 3 documents (because the tokenizer strips off the %
> > > during both indexing and searching, and hence matches all docs with 50
> > > in them).
> > >
> > > On the implementation front, I am using a set of filters like
> > > LowerCaseFilter, EnglishPossessiveFilter, etc. in addition to the base
> > > tokenizer, StandardTokenizer.
> > >
> > > Per my analysis, StandardTokenizer strips off the % sign, hence the
> > > behavior. Has anyone faced a similar requirement? Any help/guidance is
> > > highly appreciated.
> > >
> >
> >
> > --
> > Sincerely yours
> > Mikhail Khludnev
> >
>


-- 
Sincerely yours
Mikhail Khludnev
