RE: Tokenizing text custom way

Pleasant, Tracy Tue, 25 Nov 2003 09:09:46 -0800

Not exactly and answer to the question but I haven't yet used the Token 
classes/functionality that came with Lucene. Can someone give me an idea of how and 
why one may use this?

-----Original Message-----
From: Dragan Jotanovic [mailto:[EMAIL PROTECTED]
Sent: Tuesday, November 25, 2003 6:42 AM
To: Lucene Users List
Subject: Tokenizing text custom way

Hi. I need to tokenize text while indexing but I don't want space to be delimiter. 
Delimiter should be my custom character (for example comma). I understand that I would 
probably need to implement my own analyzer, but could someone help me where to start. 
Is there any other way to do this without writing custom analyzer?

This is what I want to achieve.
If I have some text that will be indexed like following:

man, people, time out, sun

and if I enter 'time' as a search word, I don't want to get "time out" in results. I 
need exact keyword matching. I would achieve this if I tokenize "time out" as one 
token while idexing.

Maybe someone had similar problem? If someone knows how to handle this, please help me.

Dragan Jotanovic

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

RE: Tokenizing text custom way

Reply via email to