Haibao Tang wrote:
> with high accuracy...
>
> My temporary plan is to first recognize two or three consecutive
> initial-capitalized words, but certainly we need to do more than that?
> Does anyone have suggestions?
>
> Thanks in advance.

It's not easy to say without seeing the HTML. If the structure
allows it, a couple of str.split() calls is probably the easiest way;
if not, there is always BeautifulSoup.

http://www.crummy.com/software/BeautifulSoup/
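
As a rough sketch of the str.split() approach, assuming each name sits
between a fixed pair of markers: the HTML fragment, the "author" class,
the sample names and the extract_between() helper below are all made up
for illustration, since the real page has not been posted.

```python
# Hypothetical fragment; the structure of the real page is unknown.
html = ('<tr><td class="author">Ada Lovelace</td></tr>'
        '<tr><td class="author">Alan Turing</td></tr>')

def extract_between(text, start, end):
    """Return every substring of text found between start and end."""
    results = []
    # Every piece after the first begins right after an occurrence of start.
    for piece in text.split(start)[1:]:
        # Keep only what precedes the closing marker, if there is one.
        if end in piece:
            results.append(piece.split(end, 1)[0].strip())
    return results

names = extract_between(html, '<td class="author">', '</td>')
print(names)
```

This only holds up while the markup around the names is regular. If the
attributes or nesting vary from row to row, a real parser such as
BeautifulSoup is the safer choice.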

-- 
http://mail.python.org/mailman/listinfo/python-list
