date:20160412

Re: Custom indexing

2016-04-12 Thread Jack Krupansky

The standard analyzer/tokenizer should do a decent job of splitting on dot,
hyphen, and underscore, in addition to whitespace and other punctuation.

Can you post some specific test cases you are concerned with? (You should
always run some test cases.)
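
One way to set up such test cases is sketched below in plain Java. It is only an illustration: the class name TokenCases and the stand-in tokenizer splitOnDotUnderscoreSpace are hypothetical and are not Lucene classes. The idea is to list input strings with the tokens you expect and report any mismatch; the analyzer actually configured would be plugged in where the stand-in is used. Because StandardTokenizer follows Unicode word-break rules, the dot and underscore cases are worth confirming rather than assuming.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class TokenCases {

    // Stand-in tokenizer (an assumption for this sketch, not a Lucene class):
    // splits on '.', '_' and whitespace.
    static List<String> splitOnDotUnderscoreSpace(String text) {
        return Arrays.asList(text.trim().split("[._\\s]+"));
    }

    // Runs every case through the tokenizer and counts the mismatches.
    static int countFailures(Function<String, List<String>> tokenizer,
                             Map<String, List<String>> cases) {
        int failures = 0;
        for (Map.Entry<String, List<String>> e : cases.entrySet()) {
            List<String> actual = tokenizer.apply(e.getKey());
            if (!actual.equals(e.getValue())) {
                failures++;
                System.out.println("MISMATCH " + e.getKey()
                        + ": expected " + e.getValue() + " but got " + actual);
            }
        }
        return failures;
    }

    // Example inputs paired with the tokens we expect from them.
    static Map<String, List<String>> sampleCases() {
        Map<String, List<String>> cases = new LinkedHashMap<>();
        cases.put("foo.bar", Arrays.asList("foo", "bar"));
        cases.put("foo_bar", Arrays.asList("foo", "bar"));
        cases.put("a.b_c d", Arrays.asList("a", "b", "c", "d"));
        return cases;
    }

    public static void main(String[] args) {
        int failures = countFailures(TokenCases::splitOnDotUnderscoreSpace, sampleCases());
        System.out.println("failures: " + failures);
    }
}
```

Replacing the stand-in with a function that collects the terms produced by the real analyzer would show which of the cases it handles as expected.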

-- Jack Krupansky

On Tue, Apr 12, 2016 at 10:35 AM, Ahmet Arslan wrote:

> Hi Chamarty,
>
> Well, there are a lot of options here.
>
> 1) Use LetterTokenizer
> 2) Use WordDelimiterFilter combined with WhitespaceTokenizer
> 3) Use MappingCharFilter to replace those characters with spaces
> .
> .
> .
>
> Ahmet
>
>
> On Tuesday, April 12, 2016 3:58 PM, PrasannaKumar Chamarty <
> tech.kumar...@gmail.com> wrote:
>
>
>
> Hi,
>
> What is the best way (in terms of maintenance required with new Lucene
> releases) to allow splitting of words on "." and "_" for indexing? Thank
> you.
>
> -
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>

Re: Custom indexing

2016-04-12 Thread Ahmet Arslan

Hi Chamarty,

Well, there are a lot of options here.

1) Use LetterTokenizer
2) Use WordDelimiterFilter combined with WhitespaceTokenizer
3) Use MappingCharFilter to replace those characters with spaces
.
.
.
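
To make the difference between options 1 and 3 concrete, here is a minimal sketch in plain Java. It does not use the Lucene classes themselves; the class SplitSketch and its methods are hypothetical names that only mimic the underlying ideas: keeping runs of letters (the rule LetterTokenizer applies) versus mapping "." and "_" to spaces before splitting on whitespace (what a MappingCharFilter in front of a whitespace-based tokenizer would do).

```java
import java.util.ArrayList;
import java.util.List;

public class SplitSketch {

    // Option 1 idea: keep maximal runs of letters; every other character separates tokens.
    static List<String> letterRuns(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(c);
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    // Option 3 idea: map '.' and '_' to spaces, then split on whitespace.
    static List<String> mapThenSplit(String text) {
        String mapped = text.replace('.', ' ').replace('_', ' ');
        List<String> tokens = new ArrayList<>();
        for (String part : mapped.trim().split("\\s+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(letterRuns("file_name.v2"));   // prints [file, name, v]
        System.out.println(mapThenSplit("file_name.v2")); // prints [file, name, v2]
    }
}
```

The letter-only rule also drops digits, so it fits only if numbers do not need to be searchable; the mapping approach changes nothing except the two characters in question.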

Ahmet


On Tuesday, April 12, 2016 3:58 PM, PrasannaKumar Chamarty wrote:



Hi,

What is the best way (in terms of maintenance required with new Lucene
releases) to allow splitting of words on "." and "_" for indexing? Thank
you.


Jackrabbit - Custom indexing

2016-04-12 Thread PrasannaKumar Chamarty

Hi,

What is the best way (in terms of maintenance required with new Lucene
releases) to allow splitting of words (into tokens) on "." and "_" for
indexing?

Please note that I am using Lucene through Jackrabbit. Jackrabbit's search
configuration is described at http://wiki.apache.org/jackrabbit/Search

The default analyzer is org.apache.lucene.analysis.standard.StandardAnalyzer.
If writing a custom analyzer is the only option, how can I do that without
incurring maintenance overhead with new Lucene releases?

Thank you.
