-----Original Message-----
From: Steven D'Aprano
Sent: 11 May 2020 06:02
To: [email protected]
Subject: [Python-ideas] Re: Improve handling of Unicode quotes and hyphens
On Mon, May 11, 2020 at 04:28:38AM +, Steve Barnes wrote:
> So we currently have a situation where not only does
A third-party module on PyPI for "fix-the-horrible-things-Outlook-does"
could be useful. There is no way the standard library can or should keep
up with the newest mangling techniques mail handlers employ in this week's
version.
I don't understand what you mean by the current interpreter not tell
Based on responses to my previous proposal, I am convinced that it was
over-ambitious and not appropriate for inclusion in the Python standard
library, so starting over with a more narrowly scoped suggestion.
Proposal:
Add a new function (possibly `os.path.sanitizepart`) to sanitize a value for
From: David Mertz
Sent: 11 May 2020 08:34
To: Steve Barnes
Cc: [email protected]
Subject: Re: [Python-ideas] Re: Improve handling of Unicode quotes and hyphens
A third-party module on PyPI for "fix-the-horrible-things-Outlook-does" could
be useful. There is no way the standard library
On Mon, May 11, 2020 at 6:09 PM Steve Barnes wrote:
>
> Actually, in the case of the “wrong quotes” it puts the pointer under the
> character before the space character or at the end of the line (if you have a
> fixed spacing font – worse if you don’t) – it still doesn’t tell you which
> charac
Steve Jorgensen wrote:
> When escape is supplied (typically "%") it is used as the escape character
> in the same way that "%" is used in URL encoding. When a non-ASCII character
> is escaped,
> it is represented as a sequence of encoded bytes/octets.
I neglected to say that the octet sequence
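As a rough illustration of the URL-style escaping described in the quoted proposal, the sketch below percent-escapes a path component so that a non-ASCII character becomes a sequence of encoded octets. The helper name `escape_part` is hypothetical and is not the proposed `os.path.sanitizepart`; UTF-8 is assumed as the encoding, and only the standard `urllib.parse` functions are used.

```python
from urllib.parse import quote, unquote


def escape_part(value: str) -> str:
    """Hypothetical helper: percent-escape everything except unreserved characters.

    Assumes UTF-8, so a non-ASCII character becomes one %XX per encoded byte.
    """
    return quote(value, safe="", encoding="utf-8")


original = "na\u00efve/name"          # contains a non-ASCII character and a slash
escaped = escape_part(original)
print(escaped)                        # na%C3%AFve%2Fname
print(unquote(escaped) == original)   # True: the escaping is reversible
```

Note that `quote` leaves `.` untouched, so `"."` and `".."` would pass through unchanged and would still need separate handling.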
10.05.20 10:09, Steve Barnes wrote:
4. Start accepting hyphens as minus & Unicode quotation marks – this
would be the ideal answer for pasted code but has a lot of possible
things to iron out such as do we require that the quotes match and
are in the typographically correct order. It
11.05.20 03:34, Steven D'Aprano wrote:
There are a couple of professionally published Python books written
using reStructuredText, Sphinx and Python. So people do have a choice,
or at least a technical choice.
There was a similar issue with Sphinx. It uses third-party tools to
"improve" the HTML
Steve Jorgensen wrote:
> Based on responses to my previous proposal, I am convinced that it was
> over-ambitious
> and not appropriate for inclusion in the Python standard library, so starting
> over with a
> more narrowly scoped suggestion.
> Proposal:
> Add a new function (possibly os.path.sani
On May 11, 2020, at 00:40, Steve Jorgensen wrote:
>
> Proposal:
>
> Add a new function (possibly `os.path.sanitizepart`) to sanitize a value for
> use as a single component of a path. In the default case, the value must also
> not be a reference to the current or parent directory ("." or "..")
On 2020-05-11 09:21, Chris Angelico wrote:
On Mon, May 11, 2020 at 6:09 PM Steve Barnes wrote:
Actually, in the case of the “wrong quotes” it puts the pointer under the
character before the space character or at the end of the line (if you have a
fixed spacing font – worse if you don’t) – it
On May 10, 2020, at 21:51, Christopher Barker wrote:
>
>
> On Sun, May 10, 2020 at 9:36 PM Andrew Barnert wrote:
>
>> However, there is one potential problem with the property I hadn’t thought
>> of until just now: I think people will understand that mylist.view[2:] is
>> not mutable, but w
On Mon, May 11, 2020 at 12:50 AM Christopher Barker
wrote:
> I'm still confused about what you mean by extending to all iterators. Do you
> mean that you could use slice syntax with anything iterable?
>
> And where does this fit into the iterable vs iterator continuum?
>
> iterables will return an iterator
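For readers following the iterable-versus-iterator question in this message, here is a small sketch of the distinction using only the standard library. It also shows `itertools.islice`, the existing way to take a slice of an arbitrary iterator; it does not implement the slice syntax being discussed in the thread.

```python
from collections.abc import Iterable, Iterator
from itertools import islice

numbers = [1, 2, 3]

# A list is an iterable, but it is not itself an iterator.
assert isinstance(numbers, Iterable)
assert not isinstance(numbers, Iterator)

# Calling iter() on an iterable returns a fresh iterator each time.
it = iter(numbers)
assert isinstance(it, Iterator)

# An iterator is its own iterator, and it can only be consumed once.
assert iter(it) is it
assert list(it) == [1, 2, 3]
assert list(it) == []

# The list can be iterated again; islice "slices" any iterator lazily.
assert list(islice(iter(numbers), 1, None)) == [2, 3]
```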
On May 10, 2020, at 22:36, Stephen J. Turnbull
wrote:
>
> Andrew Barnert via Python-ideas writes:
>
>> A lot of people get this confused. I think the problem is that we
>> don’t have a word for “iterable that’s not an iterator”,
>
> I think part of the problem is that people rarely see explic
On May 11, 2020, at 10:57, Alex Hall wrote:
>
>
>> On Mon, May 11, 2020 at 12:50 AM Christopher Barker
>> wrote:
>
>
>> Though it is heading in a different direction than where Andrew was
>> proposing, that this would be about making and using views on sequences,
>> which really wouldn't
What does sanitizepart do with newlines \n \r \r\n in filenames? Are these
control characters?
What does sanitizepart do with a leading slash?
assert os.path.join("a", "/b") == "/b"
A new safejoin() or joinsafe() or join(safe=True) could call
sanitizepart() such that:
assert joinsafe("a\n", "
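The `os.path.join` behaviour raised in this message can be reproduced with `posixpath`, which gives the same result on any platform. The sketch below also shows one possible shape for a `joinsafe`; both `is_single_component` and this `joinsafe` are hypothetical illustrations, and they reject unsafe parts instead of sanitizing them, which may not be what the proposal intends.

```python
import posixpath

# A later absolute component silently discards everything before it.
assert posixpath.join("a", "/b") == "/b"


def is_single_component(part: str) -> bool:
    """Hypothetical check: a plain name, not empty, not '.' or '..',
    with no separator and no NUL byte."""
    return part not in ("", ".", "..") and "/" not in part and "\0" not in part


def joinsafe(base: str, *parts: str) -> str:
    """Hypothetical safe join: refuse any part that could escape `base`."""
    for part in parts:
        if not is_single_component(part):
            raise ValueError(f"unsafe path component: {part!r}")
    return posixpath.join(base, *parts)


print(joinsafe("a", "b"))  # a/b
```

Newlines are legal in POSIX filenames, so this check deliberately lets them through; whether they should be stripped or escaped is the open question in the message.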
> On 10 May 2020, at 01:34, Steve Jorgensen wrote:
>
> I believe the Python standard library should include a means of sanitizing a
> filesystem entry, and this should not be something requiring a 3rd party
> package.
snip
I found that I needed to have code that could tell me if a filename
FWIW, here are some of the CWE codes for related vulnerabilities/weaknesses
in implementations:
CWE-73: External Control of File Name or Path
https://cwe.mitre.org/data/definitions/73.html
CWE-707: Improper Neutralization
https://cwe.mitre.org/data/definitions/707.html
CWE-22: Improper Limitatio
(Is it almost always better to just use a hash of the provided filename
(maybe in a p/a/ir/tree234 implementation to avoid the max files in a
directory limit of whichever filesystem) instead of the user-supplied
filename string?)
On Mon, May 11, 2020 at 4:48 PM Wes Turner wrote:
> FWIW, here are
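The hashing idea raised at the top of this message could look roughly like the sketch below: the user-supplied name never reaches the filesystem, and the leading hex digits of the digest become nested directories so that no single directory grows too large. The function name `hashed_path`, the choice of SHA-256, and the two-level, two-character sharding are all assumptions made for illustration.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_path(user_filename: str, levels: int = 2, width: int = 2) -> PurePosixPath:
    """Hypothetical helper: map any user-supplied name to a sharded, hash-based path."""
    digest = hashlib.sha256(user_filename.encode("utf-8")).hexdigest()
    # Use the leading hex digits as nested directory names, e.g. ab/cd/abcd...
    shards = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return PurePosixPath(*shards, digest)


path = hashed_path("../../etc/passwd")
print(len(path.parts))   # 3: two shard directories plus the digest itself
```

The trade-off is that the original name has to be stored somewhere else (for example in a database) if it ever needs to be shown to the user again.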
Andrew Barnert wrote:
> On May 11, 2020, at 00:40, Steve Jorgensen [email protected] wrote:
> > Proposal:
> > Add a new function (possibly os.path.sanitizepart) to sanitize a value for
> > use as a single component of a path. In the default case, the value must
> > also not be a
> > reference to
> On 11 May 2020, at 18:09, Andrew Barnert via Python-ideas
> wrote:
>
> More generally, what’s the use case for %-encoding filenames like this? Are
> people expecting it to interact transparently with URLs, so if I save a file
> “spam\0eggs” in a Python script and then try to browse to file
Andrew Barnert wrote:
> On May 11, 2020, at 00:40, Steve Jorgensen [email protected] wrote:
> > Proposal:
> > Add a new function (possibly os.path.sanitizepart) to sanitize a value for
> > use as a single component of a path. In the default case, the value must
> > also not be a
> > reference to
On May 11, 2020, at 12:59, Barry Scott wrote:
>
>
>> On 11 May 2020, at 18:09, Andrew Barnert via Python-ideas
>> wrote:
>>
>> More generally, what’s the use case for %-encoding filenames like this? Are
>> people expecting it to interact transparently with URLs, so if I save a file
>> “spam
On May 11, 2020, at 12:54, Wes Turner wrote:
>
>
> What does sanitizepart do with newlines \n \r \r\n in filenames? Are these
> control characters?
>>> import unicodedata
>>> unicodedata.category('\n')
'Cc'
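Building on that category lookup, a minimal sketch of dropping every control character (Unicode category `Cc`, which covers `\n`, `\r` and `\t`) from a candidate filename might look like this. The helper name `strip_control` is hypothetical, and whether such characters should be removed, escaped, or rejected is not settled in the thread.

```python
import unicodedata


def strip_control(name: str) -> str:
    """Hypothetical helper: remove every character in Unicode category Cc."""
    return "".join(ch for ch in name if unicodedata.category(ch) != "Cc")


print(strip_control("spam\r\neggs\t.txt"))  # spameggs.txt
```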
> On May 11, 2020, at 14:18, Steve Jorgensen wrote:
>
> Andrew Barnert wrote:
>>> On May 11, 2020, at 00:40, Steve Jorgensen [email protected] wrote:
>>> Proposal:
>>> Add a new function (possibly os.path.sanitizepart) to sanitize a value for
>>> use as a single component of a path. In the defa
On May 11, 2020, at 13:31, Barry Scott wrote:
>
> macOS and Unix version (I only use Unicode input so avoid the random bytes
> problems):
But that doesn’t avoid the problem. If someone gives you a character whose
encoding on the target filesystem includes a null or pathsep byte, your
sanitize
On Mon, May 11, 2020 at 09:12:52PM -, Steve Jorgensen
wrote:
> When the platform is Windows, certainly, ":" should not be allowed,
> and perhaps colon should not be allowed at all.
https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file
Forbidden characters:
chr(0) < > : "
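A check along the lines of the Microsoft naming page linked above could be sketched as follows. The helper `windows_name_problems` is hypothetical; the reserved characters, the reserved device names, and the rule about trailing spaces and periods are taken from that page as I understand it, so treat the sets as an assumption and not as an authoritative list.

```python
# Characters the Windows naming rules reserve, plus NUL and other control codes.
WINDOWS_RESERVED_CHARS = set('<>:"/\\|?*') | {chr(i) for i in range(32)}

# Device names that are reserved even when followed by an extension.
WINDOWS_RESERVED_NAMES = (
    {"CON", "PRN", "AUX", "NUL"}
    | {f"COM{i}" for i in range(1, 10)}
    | {f"LPT{i}" for i in range(1, 10)}
)


def windows_name_problems(name: str) -> list:
    """Hypothetical helper: list the reasons `name` is not a valid Windows filename."""
    problems = []
    bad = sorted(set(name) & WINDOWS_RESERVED_CHARS)
    if bad:
        problems.append(f"reserved characters: {bad!r}")
    if name.split(".")[0].upper() in WINDOWS_RESERVED_NAMES:
        problems.append("reserved device name")
    if name.endswith((" ", ".")):
        problems.append("ends with a space or a period")
    return problems


print(windows_name_problems("report.txt"))  # []
```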
On Mon, May 11, 2020 at 11:38 AM Andrew Barnert wrote:
> On May 11, 2020, at 10:57, Alex Hall wrote:
>
>
> On Mon, May 11, 2020 at 12:50 AM Christopher Barker
> wrote:
>
>
>> Though it is heading in a different direction than where Andrew was
>> proposing, that this would be about making and