> 1 - And if the sample is like this:
> sunBuy is shinigViagrawww.xyx.com/dfdf.html
> ?
>
> 2 - How many tokens will there be?

Marshall:~/spambayes tameyer$ python -c "from spambayes import tokenizer;print list(tokenizer.tokenize('sunBuy is shinigViagrawww.xyx.com/dfdf.html'))"
['content-type:text/plain', 'from:none', 'to:none', 'cc:none',
'sender:none', 'reply-to:none', 'x-mailer:none', 'message-id:invalid',
'sunbuy', 'skip:s 30']

The first eight tokens (the name:value ones, such as 'from:none') are
all header tokens, which leaves two tokens from the body, so I presume
the answer you are looking for is "two".

It is just as Tim & Tim said: basically it's split-on-whitespace, and  
reading tokenizer.py is what you should do to learn more (and ask  
questions if parts don't make sense).
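
As a rough illustration of how the two body tokens above come about, here
is a simplified sketch. It is not the code in tokenizer.py: the function
name sketch_body_tokens and the length limits (3 and 12) are assumptions
chosen to reproduce the output shown above, and the real tokenizer does a
good deal more (headers, URLs, email addresses and so on). It is written
for Python 3.

```python
def sketch_body_tokens(text, min_len=3, max_len=12):
    """Approximate body tokenizing: lowercase, then split on whitespace.

    Hypothetical helper for illustration only; the limits are assumed.
    """
    tokens = []
    for word in text.lower().split():
        n = len(word)
        if n < min_len:
            # Very short words (such as "is") are dropped.
            continue
        if n <= max_len:
            # Ordinary words become tokens unchanged.
            tokens.append(word)
        else:
            # Long words collapse to a summary token: the first
            # character plus the length rounded down to a multiple of 10.
            tokens.append("skip:%s %d" % (word[0], n // 10 * 10))
    return tokens


print(sketch_body_tokens('sunBuy is shinigViagrawww.xyx.com/dfdf.html'))
# → ['sunbuy', 'skip:s 30']
```

Because there is no whitespace inside "shinigViagrawww.xyx.com/dfdf.html",
the whole 33-character run is treated as one long word, which is why it
shows up as a single 'skip:s 30' token rather than as separate words.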

=Tony.Meyer

-- 
Please always include the list (spambayes at python.org) in your replies
(reply-all), and please don't send me personal mail about SpamBayes.
http://www.massey.ac.nz/~tameyer/writing/reply_all.html explains this.

