Parsing email addresses with StandardTokenizer.

Minh Kama Yie Sun, 27 Oct 2002 21:35:47 -0800

Hi all,

Please forgive me if this question has been asked elsewhere but I can't seem to find 
an answer for this in the documentation. The code for StandardTokenizer is a little 
too deep to go into right now :), so I thought I'd post to the list first.


If I'm using the standard analyzer, which in turn uses StandardTokenizer, how would 
the following email addresses be parsed?

- [EMAIL PROTECTED]
- [EMAIL PROTECTED]

If I did a search for "abc.com", which entries should turn up? 
Right now I'm only getting [EMAIL PROTECTED], and if this is correct then what are the 
standard tokenizing rules regarding the "@" sign, and where can I read up on this 
without looking at the hexadecimal values in StandardTokenizer?

I've basically been asked why the document for [EMAIL PROTECTED] doesn't turn up in the 
search results for "abc.com".

Thanks in advance.

Regards,

Minh Kama Yie

This message is intended only for the named recipient. 
If you are not the intended recipient you are notified that
disclosing, copying, distributing or taking any action 
in reliance on the contents of this information is strictly 
prohibited.
