On 4 Aug, 11:59, Fred Mangusta <[EMAIL PROTECTED]> wrote:
> Hi,
>
> are you aware of any nlp packages or algorithms in Python to spot
> whether a '.' represents an end of sentence or rather something else (eg
> Mr., [EMAIL PROTECTED], etc)?

I wouldn't mind finding out about such packages, either. I see that NLTK
offers a few options; the following tokeniser looks interesting if you
don't mind training the software:

http://nltk.org/doc/guides/tokenize.html#punkt-tokenizer

There was also a discussion of this topic on Ned Batchelder's blog a while
back:

http://nedbatchelder.com/blog/200804/separating_sentences.html

My comment there (that I'm using a regular expression with some
postprocessing) still stands.

Paul
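
To illustrate the regular-expression-plus-postprocessing idea mentioned above, here is a minimal sketch. It is not the code referred to in the blog comment: the abbreviation list, the `BOUNDARY` pattern and the `split_sentences` name are all assumptions made up for this example. The regular expression proposes a split wherever '.', '!' or '?' is followed by whitespace, and the postprocessing step re-joins pieces that end in a known abbreviation.

```python
import re

# Hypothetical, deliberately short abbreviation list (an assumption for this
# sketch); a real list would be much longer and probably domain-specific.
ABBREVIATIONS = {"mr.", "mrs.", "ms.", "dr.", "prof.", "etc.", "e.g.", "i.e.", "eg.", "vs."}

# Candidate boundary: a run of whitespace preceded by '.', '!' or '?'.
# A '.' inside an e-mail address or number is not followed by whitespace,
# so it never becomes a candidate.
BOUNDARY = re.compile(r"(?<=[.!?])\s+")


def split_sentences(text):
    """Split text at candidate boundaries, then undo the false splits."""
    sentences = []
    pending = ""
    for piece in BOUNDARY.split(text.strip()):
        if not piece:
            continue
        # Re-join with a single space (original whitespace is not preserved).
        pending = f"{pending} {piece}" if pending else piece
        last_word = pending.split()[-1].lower()
        if last_word in ABBREVIATIONS:
            # Postprocessing: the split came after an abbreviation,
            # so keep accumulating instead of ending the sentence here.
            continue
        sentences.append(pending)
        pending = ""
    if pending:
        sentences.append(pending)
    return sentences


if __name__ == "__main__":
    sample = "Mr. Smith arrived late. He emailed fred@example.com about it. Was that OK?"
    for sentence in split_sentences(sample):
        print(sentence)
```

One known weakness of this sketch: a sentence that genuinely ends in an abbreviation ("... apples, pears, etc.") gets merged with the sentence after it. That kind of ambiguity is what a trained tokeniser is meant to handle.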