Hi,

I'm trying to edit MS Word tables with a Python script. Here's a snippet:

    import string

    def msw2htmlTables():
        input = "/usr/home/me/test.doc"
        input = open(input,'r')
        word = "whatever"
        inputFlag = 0
        splitString = []
        for line in input:
            # Check first the inputFlag, since we only want to delete the top
            if inputFlag == 0:
                splitString = line.split(word)
                try:
                    keep = splitString[1]
                except:
                    keep = "nada"
                print len(splitString)
                inputFlag = 1
            elif inputFlag == 1:
                # This means we've deleted the top junk. Let's search for the bottom junk.
                splitString = line.split(word)
                try:
                    keep = splitString[0]
                    inputFlag = 2
                    print len(splitString)
                except:
                    keep += line
            elif inputFlag == 2:
                # This means everything else is junk.
                pass

Now, if the variable "word" is "orange", it never prints the length of
splitString. If it's "dark", it does. The only difference is the way they
appear in the document: "orange" appears with a space character to the left
and some MS garbage character to the right, while "dark" appears with a space
character to the left and a comma to the right. Furthermore, if I use MSW junk
characters as the definition of "word" (such as " Ù ", which is what I really
need to search for), the script never even compiles (Python complains of an
unpaired quote). It appears that Python doesn't like MSW's junk characters.
What should I do?
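
For reference, here is a minimal sketch of one possible approach, not a
definitive fix. It assumes Python 3, and it assumes the marker is the single
byte 0xD9 (which displays as "Ù" in Latin-1 and cp1252) with a space on each
side; the bytes actually stored in the .doc file may differ. The names
extract_between, read_table_region and MARKER are hypothetical. Two ideas are
shown: the marker is written with an escape sequence so that no raw junk
character has to be pasted into the source file, and the file is opened in
binary mode and read in one piece rather than line by line, since a binary
file has no meaningful lines. Note also that split() never raises when the
separator is missing (it returns a one-element list), so a try/except around
splitString[0] can never reach its except branch; partition() makes the
"not found" case explicit.

```python
# Hypothetical marker: space, byte 0xD9, space. The escape keeps the source
# file pure ASCII. The real bytes in the document are an assumption.
MARKER = b" \xd9 "


def extract_between(data, marker):
    """Return the bytes between the first two occurrences of marker.

    Returns None when marker occurs fewer than two times.
    """
    # partition() never raises; an empty separator means "not found".
    _top_junk, sep, rest = data.partition(marker)
    if not sep:
        return None
    keep, sep, _bottom_junk = rest.partition(marker)
    if not sep:
        return None
    return keep


def read_table_region(path, marker=MARKER):
    """Read the whole file as raw bytes and extract the marked region."""
    # "rb" means no decoding and no newline translation is applied.
    with open(path, "rb") as handle:
        return extract_between(handle.read(), marker)
```

Under the same assumptions, read_table_region("/usr/home/me/test.doc") would
return the bytes between the first two markers, or None if fewer than two
markers are present. In Python 2 the same idea applies with open(path, "rb")
and the literal " \xd9 ".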

TIA,

Tony
