Re: Windows file paths, again
Dan Guido dguido at gmail.com writes:
> Hi Anthony,
>
> Thanks for your reply, but I don't think your tests have any control
> characters in them. Try again with a \v, a \n, or a \x in your input
> and I think you'll find it doesn't work as expected.
>
> --
> Dan Guido

Why don't you try it yourself? He gave you the code.

I changed cfg.ini to contain the following:

    [foo]
    bar=C:\x\n\r\a\01\x32\foo.py

Which produced the following output:

    C:\x\n\r\a\01\x32\foo.py
    'C:\\x\\n\\r\\a\\01\\x32\\foo.py'

Looks like ConfigParser worked just fine to me. There is a difference between a Python string literal written inside of a Python script and a string read from a file. When reading from a file (or the registry), what you see is what you get. There is no need to do so much work.

Matt McCredie
--
http://mail.python.org/mailman/listinfo/python-list
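Matt's experiment can be reproduced without touching the disk. The following is a minimal sketch, assuming Python 3 (where the module is named configparser rather than the ConfigParser used in this thread) and feeding the ini text in from a string instead of cfg.ini. The name CFG_TEXT is made up for the illustration.

```python
import configparser

# The same ini content Matt used, held in a raw string so that this
# source file does not itself interpret the backslashes.
CFG_TEXT = r"""
[foo]
bar=C:\x\n\r\a\01\x32\foo.py
"""

config = configparser.RawConfigParser()
config.read_string(CFG_TEXT)      # stands in for config.read("cfg.ini")
value = config.get("foo", "bar")

print(value)        # the path exactly as written in the ini text
print(repr(value))  # repr() doubles each backslash for display only
```

The value comes back character for character as it was written: the parser never applies Python's escape rules to data it reads.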
Windows file paths, again
I'm trying to write a few methods that normalize Windows file paths. I've gotten it to work in 99% of the cases, but it seems like my code still chokes on '\x'. I've pasted my code below, can someone help me figure out a better way to write this? This seems overly complicated for such a simple problem...

    # returns normalized filepath with arguments removed
    def remove_arguments(filepath):
        #print "removing args from: " + filepath
        (head, tail) = os.path.split(filepath)
        pathext = os.environ['PATHEXT'].split(";")

        while(tail != ''):
            #print "trying: " + os.path.join(head,tail)

            # does it just work?
            if os.path.isfile(os.path.join(head, tail)):
                #print "it just worked"
                return os.path.join(head, tail)

            # try every extension
            for ext in pathext:
                if os.path.isfile(os.path.join(head, tail) + ext):
                    return os.path.join(head, tail) + ext

            # remove the last word, try again
            tail = tail.split()[:-1]
            tail = " ".join(tail)

        return None

    escape_dict={'\a':r'\a',
                 '\b':r'\b',
                 '\c':r'\c',
                 '\f':r'\f',
                 '\n':r'\n',
                 '\r':r'\r',
                 '\t':r'\t',
                 '\v':r'\v',
                 '\'':r'\'',
                 #'\':r'\',
                 '\0':r'\0',
                 '\1':r'\1',
                 '\2':r'\2',
                 '\3':r'\3',
                 '\4':r'\4',
                 '\5':r'\5',
                 '\6':r'\6',
                 '\7':r'\a', #i have no idea
                 '\8':r'\8',
                 '\9':r'\9'}

    def raw(text):
        """Returns a raw string representation of text"""
        new_string=''
        for char in text:
            try:
                new_string+=escape_dict[char]
                #print "escaped"
            except KeyError:
                new_string+=char
                #print "keyerror"
        #print new_string
        return new_string

    # returns the normalized path to a file if it exists
    # returns None if it doesn't exist
    def normalize_path(path):
        #print "not normal: " + path

        # make sure it's not blank
        if(path == ""):
            return None

        # get rid of mistakenly escaped bytes
        path = raw(path)
        #print "step1: " + path

        # remove quotes
        path = path.replace('"', '')
        #print "step2: " + path

        #convert to lowercase
        lower = path.lower()
        #print "step3: " + lower

        # expand all the normally formed environ variables
        expanded = os.path.expandvars(lower)
        #print "step4: " + expanded

        # chop off \??\
        if expanded[:4] == "\\??\\":
            expanded = expanded[4:]
        #print "step5: " + expanded

        # strip a leading '/'
        if expanded[:1] == "\\":
            expanded = expanded[1:]
        #print "step7: " + expanded

        systemroot = os.environ['SYSTEMROOT']

        # sometimes systemroot won't have %
        r = re.compile('systemroot', re.IGNORECASE)
        expanded = r.sub(systemroot, expanded)
        #print "step8: " + expanded

        # prepend the %systemroot% if its missing
        if expanded[:8] == "system32" or "syswow64":
            expanded = os.path.join(systemroot, expanded)
        #print "step9: " + expanded

        stripped = remove_arguments(expanded.lower())

        # just in case you're running as LUA
        # this is a race condition but you can suck it
        if(stripped):
            if os.access(stripped, os.R_OK):
                return stripped

        return None

    def test_normalize():
        test1 = "\??\C:\WINDOWS\system32\Drivers\CVPNDRVA.sys"
        test2 = "C:\WINDOWS\system32\msdtc.exe"
        test3 = "%SystemRoot%\system32\svchost.exe -k netsvcs"
        test4 = "\SystemRoot\System32\drivers\vga.sys"
        test5 = "system32\DRIVERS\compbatt.sys"
        test6 = "C:\Program Files\ABC\DEC Windows Services\Client Services.exe"
        test7 = "c:\Program Files\Common Files\Symantec Shared\SNDSrvc.exe"
        test8 = "C:\WINDOWS\system32\svchost -k dcomlaunch"
        test9 = ""
        test10 = "SysWow64\drivers\AsIO.sys"
        test11 = "\SystemRoot\system32\DRIVERS\amdsbs.sys"
        test12 = "C:\windows\system32\xeuwhatever.sys" #this breaks everything

        print normalize_path(test1)
        print normalize_path(test2)
        print normalize_path(test3)
        print normalize_path(test4)
        print normalize_path(test5)
        print normalize_path(test6)
        print normalize_path(test7)
        print normalize_path(test8)
        print normalize_path(test9)
        print normalize_path(test10)
        print normalize_path(test11)
        print
Re: Windows file paths, again
Dan Guido wrote:
> I'm trying to write a few methods that normalize Windows file paths.
> I've gotten it to work in 99% of the cases, but it seems like my code
> still chokes on '\x'. I've pasted my code below, can someone help me
> figure out a better way to write this? This seems overly complicated
> for such a simple problem...
>
> [snip]
>
>     test12 = "C:\windows\system32\xeuwhatever.sys" #this breaks everything

If I'm getting this right, what you are trying to do is convert characters that come from string-literal escape codes back to their literal representation. Why?

A simple

    test12 = r"C:\windows\system32\xeuwhatever.sys"

is all you need - note the leading r. Then

    test12[2] == "\\" # need escape on the right because of the backslashes-at-end-of-raw-string-literals rule.

holds.

Diez
Re: Windows file paths, again
Hi Diez,

The source of the string literals is ConfigParser, so I can't just mark them with an 'r'.

    config = ConfigParser.RawConfigParser()
    config.read(filename)
    crazyfilepath = config.get(name, "ImagePath")
    normalfilepath = normalize_path(crazyfilepath)

The ultimate origin of the strings is the _winreg function. Here I also can't mark them with an 'r'.

    regkey = OpenKey(HKEY_LOCAL_MACHINE,
                     "SYSTEM\\CurrentControlSet\\Services\\" + name)
    crazyimagepath = QueryValueEx(regkey, "ImagePath")[0]
    CloseKey(regkey)

--
Dan Guido

On Wed, Oct 21, 2009 at 2:34 PM, Diez B. Roggisch de...@nospam.web.de wrote:
> [snip]
Re: Windows file paths, again
On Oct 21, 3:20 pm, Dan Guido dgu...@gmail.com wrote:
> Hi Diez,
>
> The source of the string literals is ConfigParser, so I can't just
> mark them with an 'r'.
>
> [snip]
>
> The ultimate origin of the strings is the _winreg function. Here I
> also can't mark them with an 'r'.
>
> [snip]

I just did a quick test using Python 2.5.1 with the following script on Windows:

    # start of test.py

    import ConfigParser

    config = ConfigParser.RawConfigParser()
    config.read("cfg.ini")
    x = config.get("foo", "bar")
    print x
    print repr(x)

    from _winreg import *

    regkey = OpenKey(HKEY_LOCAL_MACHINE,
                     r"SYSTEM\CurrentControlSet\Services\IPSec")
    x = QueryValueEx(regkey, "ImagePath")[0]
    CloseKey(regkey)
    print x
    print repr(x)

    # end of test.py

Here are the contents of cfg.ini:

    [foo]
    bar=c:\dir\file.txt

Here is the output of the script:

    c:\dir\file.txt
    'c:\\dir\\file.txt'
    system32\DRIVERS\ipsec.sys
    u'system32\\DRIVERS\\ipsec.sys'

In either case, I don't see the functions returning strings that require special handling. The backslashes are properly escaped in the repr of both strings.

Something else must be going on if the strings are getting messed up along the way.
Re: Windows file paths, again
Dan Guido wrote:
> I'm trying to write a few methods that normalize Windows file paths.
> I've gotten it to work in 99% of the cases, but it seems like my code
> still chokes on '\x'. I've pasted my code below, can someone help me
> figure out a better way to write this? This seems overly complicated
> for such a simple problem...
>
> [snip]
>
>     test12 = "C:\windows\system32\xeuwhatever.sys" #this breaks everything
>
> [snip]
>
> --
> Dan Guido

That is overly complicated. I would recommend you use either raw strings for Windows paths, or double backslashes.

The problem you are observing is that \x (unlike the simpler ones such as \t) takes a hex number after the \x, so the whole thing would be, for example, \xa9. Because Python is looking for two hex digits after the \x, and not finding them, you get the error (long before your 'fix-it' routine gets a chance to run).

Hope this helps.

~Ethan~
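Ethan's explanation can be checked directly. The sketch below assumes Python 3; the variable names are made up for the illustration. It shows that the raw and doubled-backslash spellings produce the same string, and that the unescaped spelling is rejected by the compiler before any code runs.

```python
# Two equivalent ways to write the path from test12 safely.
raw_path = r"C:\windows\system32\xeuwhatever.sys"
escaped_path = "C:\\windows\\system32\\xeuwhatever.sys"
same = raw_path == escaped_path

# The broken literal, kept as *source text* inside a raw string so that
# this file itself still compiles.
bad_source = r'p = "C:\windows\system32\xeuwhatever.sys"'

try:
    compile(bad_source, "<test12>", "exec")
    rejected = False
except SyntaxError:
    # \x must be followed by two hex digits; "eu" is not a valid pair.
    rejected = True

print(same, rejected)
```

Because the failure happens at compile time, no runtime helper such as raw() can repair it; the literal has to be written correctly in the source.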
Re: Windows file paths, again
Hi Anthony,

Thanks for your reply, but I don't think your tests have any control characters in them. Try again with a \v, a \n, or a \x in your input and I think you'll find it doesn't work as expected.

--
Dan Guido

On Wed, Oct 21, 2009 at 3:50 PM, Anthony Tolle anthony.to...@gmail.com wrote:
> [snip]
Re: Windows file paths, again
Dan Guido wrote:
> Hi Diez,
>
> The source of the string literals is ConfigParser, so I can't just
> mark them with an 'r'.
>
> [snip]
>
> On Wed, Oct 21, 2009 at 2:34 PM, Diez B. Roggisch de...@nospam.web.de wrote:
>> [snip]

Your first problem is that you're mixing tabs and spaces in your source code. Dangerous and confusing, not to mention an error in Python 3.x.

The second problem is that your test_normalize() is called with a bunch of invalid literals. Backslashes in quote literals need to be escaped, or you need to use the raw form of literal. Now this may have nothing to do with the data you get from ConfigParser or QueryValueEx(), but it sure makes testing confusing.

The third problem is
Re: Windows file paths, again
Dan Guido wrote:
> Hi Anthony,
>
> Thanks for your reply, but I don't think your tests have any control
> characters in them. Try again with a \v, a \n, or a \x in your input
> and I think you'll find it doesn't work as expected.

A path read from a file, config file, or winreg would never contain control characters unless the stored value itself contains a control character.

My crystal ball thinks that you used eval or exec somewhere in your script, which may cause a perfectly escaped path to get unescaped, like here:

    # python 3
    path = 'C:\\path\\to\\somewhere.txt'
    script = 'open("%s")' % path # this calls str(path)
    exec(script)

OR you stored the path incorrectly. Try seeing what exactly is stored in the registry using regedit.

Remember that escape characters don't really exist in the in-memory representation of the string. The escape characters exist only in string literals (i.e. source code) and when you print the string using repr().
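The failure mode Lie Ryan describes can be shown without opening any file. This is a minimal sketch, assuming Python 3; the path and the names mangled and intact are made up for the illustration. Pasting str(path) into generated source makes Python parse the backslashes a second time, while pasting repr(path) round-trips.

```python
path = r"C:\new\table.txt"   # in memory: exactly two backslashes

# %s pastes the bare characters into source text, so the generated
# literal is "C:\new\table.txt" and \n, \t are parsed as escapes.
namespace = {}
exec('result = "%s"' % path, namespace)
mangled = namespace["result"]

# %r pastes repr(path), a correctly escaped literal that evaluates
# back to the original string.
namespace = {}
exec("result = %r" % path, namespace)
intact = namespace["result"]

print(repr(mangled))
print(repr(intact))
```

If generated source is unavoidable, repr() is the safer way to embed a string; passing the value as data and avoiding exec altogether is safer still.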
Re: Windows file paths, again
I'm writing a test case right now, will update in a few minutes :-). I'm using Python 2.6.x.

I need to read these values in from a ConfigParser file or the Windows registry, get MD5 sums of the actual files on the filesystem, and copy the files to a new location. The open() method completely barfs if I don't normalize the paths to the files first. I'll show the list, just give me a little bit more time to separate the code from my project that demonstrates this bug.

--
Dan Guido

On Wed, Oct 21, 2009 at 4:49 PM, Lie Ryan lie.1...@gmail.com wrote:
> [snip]
Re: Windows file paths, again
Dan Guido wrote:
> Hi Diez,
>
> The source of the string literals is ConfigParser, so I can't just
> mark them with an 'r'.

Python string literals only exist in Python source code. Functions and methods only return *strings*, not literals. If you mistakenly put the str() representation of a string (such as print gives you) into source code, rather than the repr() output, then you may have trouble.

tjr
Re: Windows file paths, again
On Wed, Oct 21, 2009 at 5:40 PM, Dan Guido dgu...@gmail.com wrote:
> This doesn't give me quite the results I expected, so I'll have to
> take a closer look at my project as a whole tomorrow. The test cases
> clearly show the need for all the fancy parsing I'm doing on the path
> though.

To get back to what I think was your original question, there is an easy way to take a string with control characters and turn it back into a string with the control characters escaped, which could replace your escape_dict and raw() function in normalize.py:

    >>> s = 'Foo\t\n\n\x12Bar'
    >>> print s
    Foo

    Bar
    >>> r = s.encode('string-escape')
    >>> print r
    Foo\t\n\n\x12Bar

(Python 2.6.1 on Windows XP)

More generally, it sounds like you have some bad data in either the registry or your ini file. You shouldn't have control characters in there (unless you really have directories with control characters in their names). If you have control over how those values are written, you should probably fix the bad data at the source instead of fixing it as you pull it back in.

--
Jerry
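The 'string-escape' codec Jerry uses exists only in Python 2. For readers on Python 3, a rough equivalent is the 'unicode_escape' codec, sketched below under the assumption that the input is plain ASCII, as in Jerry's example. The variable names are made up for the illustration.

```python
s = "Foo\t\n\n\x12Bar"

# Encode control characters as backslash escapes, then decode the
# resulting bytes back to str.
escaped = s.encode("unicode_escape").decode("ascii")
print(escaped)

# The reverse direction turns the escape sequences back into the
# original control characters.
restored = escaped.encode("ascii").decode("unicode_escape")
```

As Jerry notes, this only papers over bad data; a path that genuinely contains a tab or newline was most likely corrupted before it was stored.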
Re: Windows file paths, again
Dan Guido wrote:
> This doesn't give me quite the results I expected, so I'll have to
> take a closer look at my project as a whole tomorrow. The test cases
> clearly show the need for all the fancy parsing I'm doing on the path
> though.
>
> Looks like I'll return to this tomorrow and post an update as
> appropriate. Thanks for the help so far!
>
> --
> Dan Guido
>
> On Wed, Oct 21, 2009 at 5:34 PM, Terry Reedy tjre...@udel.edu wrote:
>> [snip]

For none of your test data does raw() change anything at all. These strings do *not* need escaping.

Now some of the other things you do are interesting:

1) \??\ - presumably you're looking for a long UNC. But that's signaled by \\?\. It's used to indicate to some functions that filenames over about 260 bytes are permissible.

2) The line:

    if expanded[:8] == "system32" or "syswow64":

doesn't do what you think it does. It'll always evaluate as true, since == has higher priority than or, and "syswow64" is a non-empty string. If you want to compare the string to both, you need to expand it out, either:

    if expanded[:8] == "system32" or expanded[:8] == "syswow64":

or simpler:

    if expanded.startswith("system32") or expanded.startswith("syswow64"):

3) Removing a leading backslash should imply that you replace it with the current directory, at least in most contexts. I'm not sure what's the right thing here.

DaveA
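DaveA's second point is easy to verify. The sketch below assumes Python 3; the two function names are hypothetical and not part of Dan's code. It contrasts the original condition with a corrected one that passes a tuple of prefixes to str.startswith.

```python
def needs_systemroot_buggy(expanded):
    # Parsed as (expanded[:8] == "system32") or "syswow64".
    # The non-empty string on the right is always truthy.
    return bool(expanded[:8] == "system32" or "syswow64")


def needs_systemroot(expanded):
    # startswith accepts a tuple of prefixes; lower() keeps the check
    # case-insensitive, matching the lowercasing done earlier in
    # normalize_path.
    return expanded.lower().startswith(("system32", "syswow64"))


for candidate in (r"system32\drivers\compbatt.sys",
                  r"SysWow64\drivers\AsIO.sys",
                  r"c:\windows\notepad.exe"):
    print(candidate,
          needs_systemroot_buggy(candidate),
          needs_systemroot(candidate))
```

With the original condition, every path gets %SystemRoot% prepended, including ones that are already absolute.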