Re: Windows file paths, again

2009-10-22 Thread Matt McCredie
Dan Guido dguido at gmail.com writes:

 
 Hi Anthony,
 
 Thanks for your reply, but I don't think your tests have any control
 characters in them. Try again with a \v, a \n, or a \x in your input
 and I think you'll find it doesn't work as expected.
 
 --
 Dan Guido


Why don't you try it yourself? He gave you the code. I changed cfg.ini to
contain the following:

[foo]
bar=C:\x\n\r\a\01\x32\foo.py


Which produced the following output:
C:\x\n\r\a\01\x32\foo.py
'C:\\x\\n\\r\\a\\01\\x32\\foo.py'

Looks like ConfigParser worked just fine to me. There is a difference between a
Python string literal written inside a Python script and a string read from a
file. When reading from a file (or the registry), what you see is what you get.
There is no need to do so much work.
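
For anyone who wants to repeat that check without creating cfg.ini, here is a minimal sketch. It assumes Python 3 (where the module is spelled `configparser`) and feeds the same ini text in through `read_string()` instead of a file; the name `INI_TEXT` is made up for the illustration.

```python
import configparser  # Python 3 spelling of the ConfigParser module

# The ini text is held in a raw string so the backslashes are stored as typed.
INI_TEXT = r"""
[foo]
bar=C:\x\n\r\a\01\x32\foo.py
"""

config = configparser.RawConfigParser()
config.read_string(INI_TEXT)
value = config.get("foo", "bar")

print(value)        # the text exactly as it appears in the ini data
print(repr(value))  # repr() doubles each backslash for display only
```

The value holds seven ordinary backslash characters and no newline or bell character: nothing interprets `\n` or `\a` when the text comes from data rather than from a Python literal.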

Matt McCredie




-- 
http://mail.python.org/mailman/listinfo/python-list


Windows file paths, again

2009-10-21 Thread Dan Guido
I'm trying to write a few methods that normalize Windows file paths.
I've gotten it to work in 99% of the cases, but it seems like my code
still chokes on '\x'. I've pasted my code below, can someone help me
figure out a better way to write this? This seems overly complicated
for such a simple problem...


import os
import re

# returns normalized filepath with arguments removed
def remove_arguments(filepath):
    #print "removing args from: " + filepath
    (head, tail) = os.path.split(filepath)
    pathext = os.environ['PATHEXT'].split(";")

    while(tail != ''):
        #print "trying: " + os.path.join(head,tail)

        # does it just work?
        if os.path.isfile(os.path.join(head, tail)):
            #print "it just worked"
            return os.path.join(head, tail)

        # try every extension
        for ext in pathext:
            if os.path.isfile(os.path.join(head, tail) + ext):
                return os.path.join(head, tail) + ext

        # remove the last word, try again
        tail = tail.split()[:-1]
        tail = " ".join(tail)

    return None

escape_dict={'\a':r'\a',
   '\b':r'\b',
   '\c':r'\c',
   '\f':r'\f',
   '\n':r'\n',
   '\r':r'\r',
   '\t':r'\t',
   '\v':r'\v',
   '\'':r'\'',
   #'\"':r'\"',
   '\0':r'\0',
   '\1':r'\1',
   '\2':r'\2',
   '\3':r'\3',
   '\4':r'\4',
   '\5':r'\5',
   '\6':r'\6',
   '\7':r'\a', #i have no idea
   '\8':r'\8',
   '\9':r'\9'}

def raw(text):
    """Returns a raw string representation of text"""
    new_string=''
    for char in text:
        try:
            new_string+=escape_dict[char]
            #print "escaped"
        except KeyError:
            new_string+=char
            #print "keyerror"
        #print new_string
    return new_string

# returns the normalized path to a file if it exists
# returns None if it doesn't exist
def normalize_path(path):
    #print "not normal: " + path

    # make sure it's not blank
    if(path == ""):
        return None

    # get rid of mistakenly escaped bytes
    path = raw(path)
    #print "step1: " + path

    # remove quotes
    path = path.replace('"', '')
    #print "step2: " + path

    #convert to lowercase
    lower = path.lower()
    #print "step3: " + lower

    # expand all the normally formed environ variables
    expanded = os.path.expandvars(lower)
    #print "step4: " + expanded

    # chop off \??\
    if expanded[:4] == "\\??\\":
        expanded = expanded[4:]
    #print "step5: " + expanded

    # strip a leading backslash
    if expanded[:1] == "\\":
        expanded = expanded[1:]
    #print "step7: " + expanded

    systemroot = os.environ['SYSTEMROOT']

    # sometimes systemroot won't have %
    r = re.compile('systemroot', re.IGNORECASE)
    expanded = r.sub(systemroot, expanded)
    #print "step8: " + expanded

    # prepend the %systemroot% if its missing
    if expanded[:8] == "system32" or "syswow64":
        expanded = os.path.join(systemroot, expanded)
    #print "step9: " + expanded

    stripped = remove_arguments(expanded.lower())

    # just in case you're running as LUA
    # this is a race condition but you can suck it
    if(stripped):
        if os.access(stripped, os.R_OK):
            return stripped

    return None

def test_normalize():
    test1 = "\??\C:\WINDOWS\system32\Drivers\CVPNDRVA.sys"
    test2 = "C:\WINDOWS\system32\msdtc.exe"
    test3 = "%SystemRoot%\system32\svchost.exe -k netsvcs"
    test4 = "\SystemRoot\System32\drivers\vga.sys"
    test5 = "system32\DRIVERS\compbatt.sys"
    test6 = "C:\Program Files\ABC\DEC Windows Services\Client Services.exe"
    test7 = "c:\Program Files\Common Files\Symantec Shared\SNDSrvc.exe"
    test8 = "C:\WINDOWS\system32\svchost -k dcomlaunch"
    test9 = ""
    test10 = "SysWow64\drivers\AsIO.sys"
    test11 = "\SystemRoot\system32\DRIVERS\amdsbs.sys"
    test12 = "C:\windows\system32\xeuwhatever.sys" #this breaks everything

    print normalize_path(test1)
    print normalize_path(test2)
    print normalize_path(test3)
    print normalize_path(test4)
    print normalize_path(test5)
    print normalize_path(test6)
    print normalize_path(test7)
    print normalize_path(test8)
    print normalize_path(test9)
    print normalize_path(test10)
    print normalize_path(test11)
print 

Re: Windows file paths, again

2009-10-21 Thread Diez B. Roggisch
Dan Guido wrote:

 I'm trying to write a few methods that normalize Windows file paths.
 I've gotten it to work in 99% of the cases, but it seems like my code
 still chokes on '\x'. I've pasted my code below, can someone help me
 figure out a better way to write this? This seems overly complicated
 for such a simple problem...
 
 
 [snip]

 test12 = "C:\windows\system32\xeuwhatever.sys" #this breaks everything

If I'm getting this right, what you are trying to do is convert characters that
come from string-literal escape codes back to their literal representation. Why?

A simple

  test12 = r"C:\windows\system32\xeuwhatever.sys"

is all you need - note the leading r. Then

  test12[2] == "\\" # the escape is needed on the right because a raw string
                    # literal cannot end in a single backslash

holds.
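
A short sketch of that point, written for Python 3 (the behaviour shown is the same in Python 2). The variable names are only for illustration: the raw literal and the doubled-backslash literal produce exactly the same string object contents.

```python
# Two spellings of the same path: doubled backslashes vs. a raw string literal.
escaped = "C:\\windows\\system32\\xeuwhatever.sys"
raw = r"C:\windows\system32\xeuwhatever.sys"

same_text = (escaped == raw)   # both literals yield identical strings
third_char = raw[2]            # the single backslash right after "C:"

print(same_text)
print(repr(third_char))
```

The `r` prefix only changes how the source code is read; once the string exists in memory there is no such thing as a "raw" string.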

Diez


Re: Windows file paths, again

2009-10-21 Thread Dan Guido
Hi Diez,

The source of the string literals is ConfigParser, so I can't just
mark them with an 'r'.

config = ConfigParser.RawConfigParser()
config.read(filename)
crazyfilepath = config.get(name, "ImagePath")
normalfilepath = normalize_path(crazyfilepath)

The ultimate origin of the strings is the _winreg function. Here I
also can't mark them with an 'r'.

regkey = OpenKey(HKEY_LOCAL_MACHINE,
    "SYSTEM\\CurrentControlSet\\Services\\" + name)
crazyimagepath = QueryValueEx(regkey, "ImagePath")[0]
CloseKey(key)

--
Dan Guido





Re: Windows file paths, again

2009-10-21 Thread Anthony Tolle
On Oct 21, 3:20 pm, Dan Guido dgu...@gmail.com wrote:
 Hi Diez,

 The source of the string literals is ConfigParser, so I can't just
 mark them with an 'r'.

 config = ConfigParser.RawConfigParser()
 config.read(filename)
 crazyfilepath = config.get(name, "ImagePath")
 normalfilepath = normalize_path(crazyfilepath)

 The ultimate origin of the strings is the _winreg function. Here I
 also can't mark them with an 'r'.

 regkey = OpenKey(HKEY_LOCAL_MACHINE,
     "SYSTEM\\CurrentControlSet\\Services\\" + name)
 crazyimagepath = QueryValueEx(regkey, "ImagePath")[0]
 CloseKey(key)

 --
 Dan Guido


I just did a quick test using Python 2.5.1 with the following script
on Windows:

# start of test.py
import ConfigParser
config = ConfigParser.RawConfigParser()
config.read("cfg.ini")
x = config.get("foo", "bar")
print x
print repr(x)
from _winreg import *
regkey = OpenKey(HKEY_LOCAL_MACHINE,
    r"SYSTEM\CurrentControlSet\Services\IPSec")
x = QueryValueEx(regkey, "ImagePath")[0]
CloseKey(regkey)
print x
print repr(x)
# end of test.py


Here are the contents of cfg.ini:

[foo]
bar=c:\dir\file.txt


Here is the output of the script:

c:\dir\file.txt
'c:\\dir\\file.txt'
system32\DRIVERS\ipsec.sys
u'system32\\DRIVERS\\ipsec.sys'


In either case, I don't see the functions returning strings that
require special handling.  The backslashes are properly escaped in
the repr of both strings.

Something else must be going on if the strings are getting messed up
along the way.


Re: Windows file paths, again

2009-10-21 Thread Ethan Furman

Dan Guido wrote:

I'm trying to write a few methods that normalize Windows file paths.
I've gotten it to work in 99% of the cases, but it seems like my code
still chokes on '\x'. I've pasted my code below, can someone help me
figure out a better way to write this? This seems overly complicated
for such a simple problem...



[snip]


test12 = "C:\windows\system32\xeuwhatever.sys" #this breaks everything


[snip]


--
Dan Guido


That is overly complicated.  I would recommend you use either raw 
strings for Windows paths, or double backslashes.


The problem you are observing is that \x (unlike the simpler ones such 
as \t) takes a hex number after the \x, so the whole thing would be, for 
example, \xa9.  Because Python is looking for two hex digits after the 
\x, and not finding them, you get the error when the source is compiled 
(long before your 'fix-it' routine gets a chance to run).
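
To see that the failure happens at compile time, here is a small sketch for Python 3, where the error surfaces as a SyntaxError (Python 2 reports it as "ValueError: invalid \x escape" instead). The helper `literal_compiles` is a made-up name for this illustration.

```python
def literal_compiles(source):
    """Return True if `source` compiles as a Python expression, else False."""
    try:
        compile(source, "<demo>", "eval")
        return True
    except SyntaxError:
        return False

# The source text is kept in raw strings so the backslashes reach compile() intact.
bad_literal = r'"\xeu"'     # \x followed by something that is not two hex digits
good_literal = r'"\xa9"'    # \x followed by two hex digits: one character
raw_literal = r'r"\xeu"'    # a raw literal: no escape processing at all

bad_ok = literal_compiles(bad_literal)
good_ok = literal_compiles(good_literal)
raw_ok = literal_compiles(raw_literal)

print(bad_ok, good_ok, raw_ok)
```

Since the bad literal never compiles, no function in the same file, raw() included, ever gets to look at it.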


Hope this helps.

~Ethan~


Re: Windows file paths, again

2009-10-21 Thread Dan Guido
Hi Anthony,

Thanks for your reply, but I don't think your tests have any control
characters in them. Try again with a \v, a \n, or a \x in your input
and I think you'll find it doesn't work as expected.

--
Dan Guido





Re: Windows file paths, again

2009-10-21 Thread Dave Angel

Dan Guido wrote:

Hi Diez,

The source of the string literals is ConfigParser, so I can't just
mark them with an 'r'.

config = ConfigParser.RawConfigParser()
config.read(filename)
crazyfilepath = config.get(name, "ImagePath")
normalfilepath = normalize_path(crazyfilepath)

The ultimate origin of the strings is the _winreg function. Here I
also can't mark them with an 'r'.

regkey = OpenKey(HKEY_LOCAL_MACHINE,
    "SYSTEM\\CurrentControlSet\\Services\\" + name)
crazyimagepath = QueryValueEx(regkey, "ImagePath")[0]
CloseKey(key)

--
Dan Guido



  
Your first problem is that you're mixing tabs and spaces in your source 
code.  That is dangerous and confusing, not to mention an error in Python 3.x.


The second problem is that your test_normalize() is called with a bunch 
of invalid literals.  Backslashes in quote literals need to be escaped, 
or you need to use the raw form of literal.  Now this may have nothing 
to do with the data you get from  ConfigParser or QueryValueEx(), but it 
sure makes testing confusing.


The third problem is 

Re: Windows file paths, again

2009-10-21 Thread Lie Ryan

Dan Guido wrote:

Hi Anthony,

Thanks for your reply, but I don't think your tests have any control
characters in them. Try again with a \v, a \n, or a \x in your input
and I think you'll find it doesn't work as expected.


A path read from a file, config file, or winreg will never contain 
control characters unless the stored value itself contains them.


My crystal ball thinks that you used eval or exec somewhere in your 
script, which may cause a perfectly escaped path to get unescaped, like 
here:


# python 3
path = 'C:\\path\\to\\somewhere.txt'
script = 'open("%s")' % path    # this calls str(path), not repr(path)
exec(script)

OR

you stored the path incorrectly. Try checking exactly what is stored in 
the registry using regedit.


Remember that escape sequences don't really exist in the in-memory 
representation of the string. They exist only in string 
literals (i.e. source code) and in the output of repr().
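
A sketch of how that unescaping can happen, for Python 3. The path below is a hypothetical one, picked because `\n` and `\t` are valid escapes and make the damage easy to see; `eval` stands in for whatever re-parses the text as source code.

```python
path = 'C:\\new\\table.txt'      # in memory this is C:\new\table.txt

# Pasting str(path) between quotes builds a NEW literal, which Python then
# re-parses: \n and \t are now read as a newline and a tab.
reparsed_str = eval("'%s'" % path)

# repr(path) emits a literal with the backslashes escaped, so it round-trips.
reparsed_repr = eval(repr(path))

print(repr(reparsed_str))
print(repr(reparsed_repr))
```

If a path really has to be embedded in generated source, formatting it with `%r` rather than `%s` keeps it intact.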



Re: Windows file paths, again

2009-10-21 Thread Dan Guido
I'm writing a test case right now, will update in a few minutes :-).
I'm using Python 2.6.x

I need to read these values in from a ConfigParser file or the Windows
registry and get MD5 sums of the actual files on the filesystem and
copy the files to a new location. The open() method completely barfs
if I don't normalize the paths to the files first. I'll show the list,
just give me a little bit more time to separate the code from my
project that demonstrates this bug.
--
Dan Guido





Re: Windows file paths, again

2009-10-21 Thread Terry Reedy

Dan Guido wrote:

Hi Diez,

The source of the string literals is ConfigParser, so I can't just
mark them with an 'r'.


Python string literals only exist in Python source code. Functions and 
methods only return *strings*, not literals.  If you mistakenly put the 
str() representation of a string (such as print gives you) into source 
code, rather than the repr() output, then you may have trouble.


tjr



Re: Windows file paths, again

2009-10-21 Thread Jerry Hill
On Wed, Oct 21, 2009 at 5:40 PM, Dan Guido dgu...@gmail.com wrote:
 This doesn't give me quite the results I expected, so I'll have to
 take a closer look at my project as a whole tomorrow. The test cases
 clearly show the need for all the fancy parsing I'm doing on the path
 though.

To get back to what I think was your original question, there is an
easy way to take a string with control characters and turn it back
into a string with the control characters escaped, which could replace
your escape_dict and raw() function in normalize.py:

>>> s = 'Foo\t\n\n\x12Bar'
>>> print s
Foo 

Bar
>>> r = s.encode('string-escape')
>>> print r
Foo\t\n\n\x12Bar


(Python 2.6.1 on Windows XP)
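
The 'string-escape' codec exists only in Python 2. For readers on Python 3, a rough equivalent (an assumption on my part, not something from the session above) is the 'unicode_escape' codec, which returns bytes and so needs a decode step:

```python
s = 'Foo\t\n\n\x12Bar'

# unicode_escape turns control characters back into backslash escapes;
# the result is bytes, so decode it to get a str again.
escaped = s.encode('unicode_escape').decode('ascii')

# Decoding with the same codec reverses the transformation.
restored = escaped.encode('ascii').decode('unicode_escape')

print(escaped)
```

Note that a backslash already present in the string is doubled by this codec, so it is not something to run blindly over a path that was correct to begin with.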

More generally, it sounds like you have some bad data in either the
registry, or your ini file.  You shouldn't have control characters in
there (unless you really have directories with control characters in
their names).  If you have control over how those values are written,
you should probably fix the bad data at the source instead of fixing
it as you pull it back in.

-- 
Jerry


Re: Windows file paths, again

2009-10-21 Thread Dave Angel

Dan Guido wrote:

This doesn't give me quite the results I expected, so I'll have to
take a closer look at my project as a whole tomorrow. The test cases
clearly show the need for all the fancy parsing I'm doing on the path
though.

Looks like I'll return to this tomorrow and post an update as
appropriate. Thanks for the help so far!
--
Dan Guido







For none of your test data does raw() change anything at all.  These 
strings do *not* need escaping.


Now some of the other things you do are interesting:

1) \??\   - presumably you're looking for a long UNC.  But that's 
signaled by  \\?\   which is used to indicate to some functions that 
filenames over about 260 characters are permissible.


2) The line:

   if expanded[:8] == "system32" or "syswow64":

doesn't do what you think it does.  It'll always evaluate as true, since 
== has higher priority than or, and "syswow64" is a non-empty string.  If 
you want to compare the string to both, you need to expand it out:


   either  if expanded[:8] == "system32" or expanded[:8] == "syswow64":
or simpler:
   if expanded.startswith("system32") or expanded.startswith("syswow64"):

3) removing a leading backslash should imply that you replace it with 
the current drive's root, at least in most contexts.  I'm not sure what's 
the right thing here.
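
Point 2 is easy to check in isolation. A minimal sketch follows; the two function names are invented for the illustration, and the tuple form of `startswith` is just a shorter spelling of the fix given above.

```python
def has_system_prefix_buggy(path):
    # Parsed as (path[:8] == "system32") or "syswow64"; the second operand is
    # a non-empty string, so the whole expression is always truthy.
    return bool(path[:8] == "system32" or "syswow64")

def has_system_prefix(path):
    # str.startswith accepts a tuple of candidate prefixes.
    return path.startswith(("system32", "syswow64"))

for candidate in (r"system32\drivers\vga.sys", r"c:\program files\abc.exe"):
    print(candidate, has_system_prefix_buggy(candidate), has_system_prefix(candidate))
```

With the buggy test, every path gets %SystemRoot% joined in front of it, including ones that already start with a drive letter.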




DaveA