Steve Jorgensen wrote:
> Based on responses to my previous proposal, I am convinced that it was
> over-ambitious and not appropriate for inclusion in the Python standard
> library, so I am starting over with a more narrowly scoped suggestion.
>
> Proposal:
>
> Add a new function (possibly os.path.sanitizepart) to sanitize a value for
> use as a single component of a path. In the default case, the value must
> also not be a reference to the current or parent directory ("." or "..")
> and must not contain control characters.
>
> When an invalid character is encountered, ValueError will be raised in the
> default case, or the character may be replaced or escaped.
>
> When an invalid name is encountered, ValueError will be raised in the
> default case, or the first character may be replaced, escaped, or prefixed.
>
> Control characters (those in the Unicode general category of "C") are
> treated as invalid by default.
>
> After applying any transformations, if the result would still be invalid,
> then an exception is raised.
>
> Proposed function signature:
>
> sanitizepart(name, replace=None, escape=None, prefix=None, flags=0)
>
> When replace is supplied, it is used as a replacement for any invalid
> characters. When prefix is not also supplied, it is also used as the
> replacement for the first character of the name if the name itself is
> invalid (that is, not merely because it contains invalid characters).
>
> When escape is supplied (typically "%"), it is used as the escape character
> in the same way that "%" is used in URL encoding. When a non-ASCII
> character is escaped, it is represented as a sequence of encoded
> bytes/octets. When prefix is not also supplied, it is also used to escape
> the first character of the name if the name itself is invalid (that is,
> not merely because it contains invalid characters).
>
> replace and escape are mutually exclusive.
>
> When prefix is supplied (typically "_"), it is prepended to the name if the
> name itself is invalid (that is, not merely because it contains invalid
> characters).
>
> Flags:
>
> path.PERMIT_RELATIVE (1): Permit relative path values ("." "..")
> path.PERMIT_CTRL (2): Permit characters in the Unicode general category of
> "C".
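
To make the quoted proposal concrete, here is a rough sketch of how it might behave. Nothing like this exists in the standard library. Several details are my own assumptions: the invalid characters are taken to be the running platform's separators (`os.sep` and `os.altsep`), escaped characters are encoded as UTF-8, passing both `replace` and `escape` raises ValueError, and the helper names `_bad_char`, `_bad_name`, and `_escape` are made up for illustration.

```python
import os
import unicodedata

PERMIT_RELATIVE = 1  # permit "." and ".."
PERMIT_CTRL = 2      # permit characters in Unicode general category "C"

# Assumption: only the running platform's separators count as invalid characters.
_SEPARATORS = {s for s in (os.sep, os.altsep) if s}


def _bad_char(ch, flags):
    """True if ch is a separator or (by default) a category "C" character."""
    if ch in _SEPARATORS:
        return True
    return not flags & PERMIT_CTRL and unicodedata.category(ch).startswith("C")


def _bad_name(name, flags):
    """True if the whole name refers to the current or parent directory."""
    return not flags & PERMIT_RELATIVE and name in (".", "..")


def _escape(ch, escape):
    # Assumption: UTF-8 octets, two uppercase hex digits each, as in URL encoding.
    return "".join(f"{escape}{b:02X}" for b in ch.encode("utf-8"))


def sanitizepart(name, replace=None, escape=None, prefix=None, flags=0):
    if replace is not None and escape is not None:
        raise ValueError("replace and escape are mutually exclusive")

    def fix(ch):
        if replace is not None:
            return replace
        if escape is not None:
            return _escape(ch, escape)
        raise ValueError(f"invalid character {ch!r} in {name!r}")

    # Replace or escape each invalid character (raises in the default case).
    result = "".join(fix(ch) if _bad_char(ch, flags) else ch for ch in name)

    # Handle a name that is invalid as a whole ("." or "..").
    if _bad_name(result, flags):
        if prefix is not None:
            result = prefix + result
        elif replace is not None or escape is not None:
            result = fix(result[0]) + result[1:]
        else:
            raise ValueError(f"invalid name {name!r}")

    # The transformations themselves may have produced something invalid.
    if _bad_name(result, flags) or any(_bad_char(c, flags) for c in result):
        raise ValueError(f"{name!r} is still invalid after sanitizing")
    return result
```

With this sketch, `sanitizepart("a/b", escape="%")` gives `"a%2Fb"` and `sanitizepart("..", prefix="_")` gives `"_.."`, while `sanitizepart("a/b")` raises ValueError. The sketch does not escape the escape character itself, so the result is not guaranteed to be reversible.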
Somewhere between the first and second proposals, I lost track of the system-specificity issue. Even with this more focused proposal, path separators differ between Windows and *nix, so the function needs another argument for that. Presumably, it would default to `None`, meaning the current platform, and there would be constants `NIX`, `WIN`, and `GENERAL`, where `WIN` and `GENERAL` behave the same, recognizing either "/" or "\" as a path separator character.
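
As a rough illustration of that extra argument, the separator lookup might look like the sketch below. The constant values and the helper names `separators_for` and `has_separator` are hypothetical. Only the names `NIX`, `WIN`, and `GENERAL` and the `None` default come from the description above.

```python
import os

# Hypothetical values; only the constant names come from the proposal.
NIX = "nix"
WIN = "win"
GENERAL = "general"


def separators_for(platform=None):
    """Return the characters treated as path separators for a platform."""
    if platform is None:
        # None means "use the platform we are running on".
        platform = WIN if os.name == "nt" else NIX
    if platform == NIX:
        return frozenset("/")
    if platform in (WIN, GENERAL):
        # WIN and GENERAL behave the same: either slash is a separator.
        return frozenset("/\\")
    raise ValueError(f"unknown platform {platform!r}")


def has_separator(name, platform=None):
    """True if name contains a separator for the given platform."""
    seps = separators_for(platform)
    return any(ch in seps for ch in name)
```

Under these assumptions, `has_separator("a\\b", NIX)` is false, since a backslash is an ordinary file name character on *nix, while the same name is rejected under `WIN` or `GENERAL`.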