On May 4, 2008, at 2:02 PM, Binger David wrote:

In any case, what seems to me the natural and consistent behaviour is to just quote the match and/or replacement strings if they are not already h8 instances.

The problem is that replace (like strip and slice) can break entities.

w.replace(';', '.').

Ah, that is a concern I was not even thinking about. It would indeed be worrisome to have things like that done to your XML. On the other hand, qpy constrains itself strictly to *generic* string manipulation, not XML string manipulation. That is, XML semantics, beyond the escape characters, are by choice not addressed in any way. To do proper XML-semantics-respecting string replacement, proper XML parsing would be needed...
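A minimal sketch of that concern, using the standard library's html.escape as a stand-in for qpy quoting (an assumption made only so the example runs without qpy; it is Python 3, not the 2.5 session shown in this thread):

```python
from html import escape

# Stand-in for a quoted string: "<" becomes the entity "&lt;".
quoted = escape("a < b")

# Generic string operations know nothing about entities, so they can cut
# one in half or rewrite its terminating semicolon.
broken_by_replace = quoted.replace(";", ".")
broken_by_slice = quoted[:4]

print(quoted)             # a &lt; b
print(broken_by_replace)  # a &lt. b
print(broken_by_slice)    # a &l
```

Both results are no longer well-formed escaped text, even though each operation is perfectly valid on a plain string.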

Plus, thinking more about someone actually doing the crazy substitution above: whether you first cast the match ";" and replacement "." strings to h8, or "downcast" the reference string to unicode (and do the operation in unicode), the result will always be the same, because neither string contains a character that quoting would change. So the concern is not addressed either way.
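To illustrate why both approaches agree for ";" and ".", here is a small sketch that again uses html.escape as an assumed stand-in for casting to h8 (it is not qpy's own code):

```python
from html import escape

match, replacement = ";", "."
text = escape("a < b")  # stand-in for an already-quoted string

# Neither argument contains &, <, > or a quote character, so escaping
# them is a no-op and the two strategies perform the same replacement.
plain_result = text.replace(match, replacement)
quoted_result = text.replace(escape(match), escape(replacement))

print(plain_result == quoted_result)  # True
```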

But if the match or the replacement string were a special character, then it would make a difference (though in the opposite sense: quoting makes things safer!):

$ python
Python 2.5.1 (r251:54863, Feb  4 2008, 21:48:13)
[GCC 4.0.1 (Apple Inc. build 5465)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from qpy import h8
>>> h = h8("<xml/>")
>>> h.replace(">", "<")
u'<xml/<'
>>> h.replace(h8("")+">", h8("")+"<") # ensure match/replacements are safely escaped
u'<xml/>'
>>>

So, if I understand correctly, the concern you raise is better addressed by ensuring that the match and replacement strings are safely quoted.
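A rough sketch of what such a quoting replace could look like. The Quoted class and its quote method are hypothetical names invented for this illustration; this is not qpy's h8 implementation, and html.escape stands in for qpy's quoting:

```python
from html import escape


class Quoted(str):
    """Hypothetical stand-in for an already-escaped string type (not qpy's h8)."""

    @classmethod
    def quote(cls, value):
        # Already-quoted values pass through; anything else is escaped first.
        if isinstance(value, cls):
            return value
        return cls(escape(str(value), quote=True))

    def replace(self, old, new, count=-1):
        # Quote both arguments so raw special characters can neither match
        # existing markup nor inject new markup.
        quoted_old = Quoted.quote(old)
        quoted_new = Quoted.quote(new)
        return Quoted(str.replace(self, quoted_old, quoted_new, count))


markup = Quoted("<xml/>")               # constructed directly: trusted markup
unsafe = str.replace(markup, ">", "<")  # plain-string behaviour
safe = markup.replace(">", "<")         # arguments are quoted first

print(unsafe)  # <xml/<
print(safe)    # <xml/>
```

With the arguments quoted, ">" becomes "&gt;", which does not occur in the trusted markup, so the tag is left intact, matching the second call in the session above.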

mario

_______________________________________________
QP mailing list
[email protected]
http://mail.mems-exchange.org/mailman/listinfo/qp
