If I understand correctly, this process is to be used to translate between languages. I trust that all concerned appreciate that the differences between languages' grammars mean that word-for-word substitution is very unlikely to produce particularly legible results. You're really into Machine Translation issues here, and probably need to pass whole sentences into a proper language translator that will also deal with the grammar. Even then the results are likely to be somewhat stilted.
Linguistically,
Richard.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Tim Roberts
Sent: Tuesday, October 11, 2005 5:14 PM
To: python-win32@python.org
Subject: Re: [python-win32] Translating MS-Word documents

On Tue, 11 Oct 2005 11:32:53 +0200 (CEST), Øyvind <[EMAIL PROTECTED]> wrote:

>I need to translate several Word-documents. I have a list with
>approximately 5000 words and its translation, and would like to read
>thru a Word-document, look for the words in the list and replace them.
>However, I need to keep the current formatting of the Word-documents.
>(Using Word 2003 and XP).
>
>What is the best way of doing this as fast and efficient as possible?
>
>1) Search and replace for each word directly in Word
>
>2) Extract the text, run it thru regex and thereafter do a search and
>replace in Word.
>
>3) Some other way?
>
>(The only language I know is Python, so writing some C++ stuff that can
>do it a lot faster is not an option).

This is a hard problem. If you can let this run for a number of hours, the simplest answer is to use the Word object model to open each file in turn and use the Document.Find method to search and replace. It'll take a while, but the computer won't complain.

Here's an MSDN article that shows how to use Find and Replace within a selection; the same syntax should work with a Document:

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dv_wrcore/html/wrtskhowtoreplacetext.asp

However, in many cases, it is easier to use the Word macro recorder to record what you want to do ONCE, and then use the generated VBA to create your script.

If your document formatting will survive a change to RTF and back, you could convert to RTF (which is easily machine readable) and do the replacements in plain text. However, few documents survive that change completely intact.

--
Tim Roberts, [EMAIL PROTECTED]
Providenza & Boekelheide, Inc.
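For anyone finding this thread later, here is a rough sketch of how the object-model approach might look from Python with pywin32. It is not taken from the MSDN article and has not been tuned for speed. It combines options 1 and 2 from the question: the document text is pulled out once, plain Python works out which glossary words actually occur, and Word's Find is then run only for those. The names `words_present`, `translate_document` and `glossary` are made up for illustration. The sketch assumes single-word, case-sensitive glossary entries, and it reaches Find through `Document.Content` (a Range), since in Word's object model Find hangs off Range and Selection objects. The numeric values 0 and 2 stand for the Word constants wdFindStop and wdReplaceAll.

```python
import re


def words_present(text, glossary):
    """Return the glossary entries whose source word occurs as a whole word in text."""
    tokens = set(re.findall(r"\w+", text))
    return {src: dst for src, dst in glossary.items() if src in tokens}


def translate_document(path, glossary):
    """Replace glossary words in one Word document, leaving formatting to Word.

    Needs Windows, Word and pywin32, so the import is kept local to this function.
    """
    import win32com.client

    word = win32com.client.Dispatch("Word.Application")
    word.Visible = False
    try:
        doc = word.Documents.Open(path)
        try:
            # Only search for words that actually occur in this document.
            needed = words_present(doc.Content.Text, glossary)
            for src, dst in needed.items():
                find = doc.Content.Find
                find.ClearFormatting()
                find.Replacement.ClearFormatting()
                # Positional arguments: FindText, MatchCase, MatchWholeWord,
                # MatchWildcards, MatchSoundsLike, MatchAllWordForms, Forward,
                # Wrap (0 = wdFindStop), Format, ReplaceWith,
                # Replace (2 = wdReplaceAll).
                find.Execute(src, True, True, False, False, False,
                             True, 0, False, dst, 2)
            doc.Save()
        finally:
            doc.Close()
    finally:
        word.Quit()
```

Calling `translate_document(r"C:\docs\report.doc", glossary)` with a hypothetical path would process one file; looping over a folder is left out. One thing to watch: if a translated word is itself a source word elsewhere in the list, later replacements will translate it a second time, so the order of the glossary can matter.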
_______________________________________________
Python-win32 mailing list
Python-win32@python.org
http://mail.python.org/mailman/listinfo/python-win32