Steve Jorgensen wrote:
> Based on responses to my previous proposal, I am convinced that it was 
> over-ambitious
> and not appropriate for inclusion in the Python standard library, so starting 
> over with a
> more narrowly scoped suggestion.
> Proposal:
> Add a new function (possibly os.path.sanitizepart) to sanitize a value for
> use as a single component of a path. In the default case, the value must also 
> not be a
> reference to the current or parent directory ("." or "..") and must not 
> contain control
> characters.
> When an invalid character is encountered, then ValueError will be raised
> in the default case, or the character may be replaced or escaped.
> When an invalid name is encountered, then ValueError will be raised in the
> default case, or the first character may be replaced, escaped, or prefixed.
> Control characters (those in the Unicode general category of "C") are treated 
> as invalid
> by default.
> After applying any transformations, if the result would still be invalid, 
> then an
> exception is raised.
> Proposed function signature: sanitizepart(name, replace=None, escape=None,
> prefix=None, flags=0)
> When replace is supplied, it is used as a replacement for any invalid
> characters or for the first character of an invalid name. When prefix is not
> also supplied, this is also used as the replacement for the first character 
> of the name if
> it is invalid, not simply due to containing invalid characters.
> When escape is supplied (typically "%") it is used as the escape character
> in the same way that "%" is used in URL encoding. When a non-ASCII character 
> is escaped,
> it is represented as a sequence of encoded bytes/octets. When prefix is not
> also supplied, this is also used to escape the first character of the name if 
> it is
> invalid, not simply due to containing invalid characters.
> replace and escape are mutually exclusive.
> When prefix is supplied (typically "_"), it is prepended the name if it is
> invalid, not simply due to containing invalid characters.
> Flags:
> 
> path.PERMIT_RELATIVE (1): Permit relative path values ("." "..")
> path.PERMIT_CTRL (2): Permit characters in the Unicode general category of 
> "C".

Somewhere between the 1st and 2nd proposal, I lost track of the 
system-specificity issue. Even with this more focused proposal, there is the 
issue of different path separators on Windows vs *nix, so the function needs 
another argument for that. Presumably, it would have a default of `None` 
meaning to use the current platform and would have constants for `NIX`, `WIN`, 
and `GENERAL` where `WIN` and `GENERAL` behave the same, recognizing either "/" 
or "\" as a file separator character.
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/SRQJ2BZHYYVIPW7CGABLNCWLZMOMCZO3/
Code of Conduct: http://python.org/psf/codeofconduct/

Reply via email to