Re: Look for strange encodings -- tokenization

2007-09-05 Thread poeta simbolista
for Natural Language Processing > http://www.cnlp.org/tech/lucene.asp

Re: Look for strange encodings -- tokenization

2007-09-05 Thread Steven Rowe
poeta simbolista wrote:
> I'd like to know the best way to look for strange encodings in a Lucene
> index. I have several inputs that may have been encoded in different
> character sets, and I don't always know whether my guess about the
> encoding was right. Hence, I'd thought of querying the index for some
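The quoted question is cut off, so the poster's actual approach is not known. As a rough illustration of the general idea (checking text for signs of a wrong encoding guess), here is a minimal sketch that is independent of Lucene. The function name `looks_misdecoded` and the two heuristics are assumptions made for this example, not something taken from the thread or from the Lucene API: it flags text containing U+FFFD, and text that looks like UTF-8 bytes read as Latin-1.

```python
# Hypothetical helper, not part of Lucene: flags text that looks mis-decoded.
REPLACEMENT_CHAR = "\ufffd"  # U+FFFD, emitted by lenient decoders on bad bytes


def looks_misdecoded(text: str) -> bool:
    """Return True if text shows common signs of a wrong encoding guess."""
    # Heuristic 1: a lenient decoder already hit undecodable bytes.
    if REPLACEMENT_CHAR in text:
        return True
    # Heuristic 2: UTF-8 bytes that were read as Latin-1 can be turned back
    # into bytes and decoded as UTF-8, yielding a different (repaired) string.
    try:
        repaired = text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not representable in Latin-1, or not valid UTF-8: no evidence here.
        return False
    return repaired != text


if __name__ == "__main__":
    for sample in ["plain ascii", "café", "cafÃ©", "caf\ufffd"]:
        print(repr(sample), looks_misdecoded(sample))
```

A check like this could be run over field values before indexing, or over terms read back from an index. It can misfire on text that legitimately contains sequences such as "Ã©", so it is best treated as a way to shortlist documents for manual review.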

Look for strange encodings -- tokenization

2007-09-04 Thread poeta simbolista
wledge of Lucene.) I have the feeling this can be done better somehow. Thanks a lot in advance! -- View this message in context: http://www.nabble.com/Look-for-strange-encodingstokenization-tf4378064.html#a12479370