You most certainly want to index the whole token, and likely portions of it 
(didn't you already ask this a few weeks ago?).
You will want to write your own Analyzer + Tokenizer that's 
email-address-format-aware and does things like:
emit the whole token
emit the username portion
email the fully qualified host name
email the fully qualified domain name

Perhaps you'll want to index some of these as separate fields, or perhaps 
you'll want to also index [EMAIL PROTECTED] even if an email address looks like 
[EMAIL PROTECTED]

Otis
 

----- Original Message ----
From: Michael J. Prichard <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Wednesday, July 26, 2006 4:33:10 PM
Subject: To Tokenize or Un_Tokenize?

If I want to search an email address (i.e. [EMAIL PROTECTED]) do I need to 
Tokenize that field?

doc.add(new Field("from", (String) itemContent.get("from"), 
Field.Store.YES, Field.Index.TOKENIZED));

-OR-

doc.add(new Field("from", (String) itemContent.get("from"), 
Field.Store.YES, Field.Index.UN_TOKENIZED));

Thx.
Michael


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]





---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to