Thanks. I just found that all the item numbers, such as 1.1.1, are gone. How can I get these numbers? Also, if all the items are in a table, how can I get the contents of all the items and ignore the table structure? Thanks.
--- On Tue, 6/14/11, Tim Roberts <t...@probo.com> wrote:

From: Tim Roberts <t...@probo.com>
Subject: Re: [python-win32] UnicodeEncodingError when print a doc file
To: "python-win32@python.org" <python-win32@python.org>
Date: Tuesday, June 14, 2011, 9:02 PM

cool_go_blue wrote:
> Thanks. It works. Actually, what I want to do is to parse the whole
> document. How can I retrieve the list of words in the
> document? I use the following code:
>
> for word in doc.Content.Text.encode("cp1252", "replace"):
>     print word
>
> It seems that word is each a character.
>

No, what you are getting back is a Python string. When you enumerate through a string, you get characters. This is basic Python.

If your words are all separated by spaces, you can use split:

    for word in doc.Content.Text.encode("cp1252","replace").split():
        print word

Note, however, that you don't need to convert it to an 8-bit character set until you want to print it. If you are going to process these words, then you might as well leave them in Unicode.

--
Tim Roberts, t...@probo.com
Providenza & Boekelheide, Inc.
_______________________________________________
python-win32 mailing list
python-win32@python.org
http://mail.python.org/mailman/listinfo/python-win32
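The advice quoted above (split the Unicode text first, and convert to an 8-bit character set only when printing) can be sketched as follows. This is a minimal illustration, not code from the thread: the sample string is a made-up stand-in for doc.Content.Text so that it runs without Word or win32com, and the helper name to_printable is hypothetical. It also assumes, as Word text normally does, that paragraphs are separated by "\r".

```python
# Made-up stand-in for doc.Content.Text (assumption: Word separates
# paragraphs with "\r"). The arrow (U+2192) has no cp1252 equivalent.
text = u"1.1.1 Caf\u00e9 menu\rna\u00efve draft \u2192 end"

# Iterating over a string yields single characters, not words.
chars = [c for c in text]

# split() with no arguments splits on any whitespace, including "\r",
# and the results stay Unicode for further processing.
words = text.split()


def to_printable(word, encoding="cp1252"):
    """Hypothetical helper: make one word safe for an 8-bit console.

    Characters the encoding cannot represent are replaced with "?".
    """
    return word.encode(encoding, "replace").decode(encoding)


# Convert only at the very end, when the words are about to be shown.
printable = [to_printable(w) for w in words]
```

Each item in printable could then be passed to print. Keeping words in Unicode until that last step means no characters are lost while the text is still being processed.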