New submission from Josh Rosenberg <shadowranger+pyt...@gmail.com>:

At least on PosixPath (not currently able to try on Windows to check 
WindowsPath, but from a quick code check I think it'll behave the same way), 
the negative indexing added in #21041 is implemented incorrectly for absolute 
paths. Passing either -1 or -2 will return a path representing the root, '/' 
for PosixPath (which should only be returned for -1), and passing an index of 
-3 or beyond returns the value expected for that index + 1, e.g. -3 gets the 
result expected for -2, -4 gets the result for -3, etc. And for the negative 
index that should be equivalent to index 0, you end up with an IndexError.

The underlying problem appears to be that absolute paths (at least, those 
created from a string) are represented in self._parts with the root '/' 
included (redundantly, since self._root has it too), so all the actual 
components of the path are offset by one.

This does not affect slicing (slicing is implemented using range and 
slice.indices to perform normalization from negative to positive indices, so it 
never indexes with a negative index).

Example:

>>> from pathlib import Path
>>> p = Path('/1/2/3')
>>> p._parts
['/', '1', '2', '3']
>>> p.parents[:]
(PosixPath('/1/2'), PosixPath('/1'), PosixPath('/'))
>>> p.parents[-1]
PosixPath('/')
>>> p.parents[-1]._parts  # Still behaves normally as self._root is still '/'
[]
>>> p.parents[-2]
PosixPath('/')
>>> p.parents[-2]._parts
['/']
>>> p.parents[-3]
PosixPath('/1')
>>> p.parents[-4]
Traceback (most recent call last):
    ...
IndexError: -4

It looks like the underlying problem is that the negative indexing code doesn't 
account for the possibility of '/' being in _parts and behaving as a component 
separate from the directory/files in the path. Frankly, it's a little odd that 
_parts includes '/' at all (Path has a ._root/.root attribute that stores it 
too, and even when '/' isn't in the ._parts/.parts, the generated complete path 
includes it because of ._root), but it looks like the docs guaranteed that 
behavior in their examples.

It looks like one of two options must be chosen:

1. Fix the negative indexing code to account for absolute paths, and ensure 
absolute paths store '/' in ._parts consistently (it should not be possible to 
get two identical Paths, one of which includes '/' in _parts, one of which does 
not, which is possible with the current negative indexing bug; not sure if 
there are any documented code paths that might produce this warped sort of 
object outside of the buggy .parents), or

2. Make no changes to the negative indexing code, but make absolute paths 
*never* store the root as the first element of _parts (.parts can prepend 
self._drive/self._root on demand to match documentation). This probably 
involves more changes (lots of places assume _parts includes the root, e.g. the 
_PathParents class's own __len__ method raises a ValueError when called on the 
warped object returned by p.parents[-1], because it adjusts for the root, and 
the lack of one means it returns a length of -1).

I think #1 is probably the way to go. I believe all that would require is to 
add:

if idx < 0:
    return self.__getitem__(len(self) + idx)

just before:

return self._pathcls._from_parsed_parts(self._drv, self._root, 
self._parts[:-idx - 1])

so it never tries to use a negative idx directly (it has to occur after the 
check for valid index in [-len(self), len(self) so very negative indices don't 
recurse until they become positive).

This takes advantage of _PathParents's already adjusting the reported length 
for the presence of drive/root, keeping the code simple; the alternative I came 
up with that doesn't recurse changes the original return line:

return self._pathcls._from_parsed_parts(self._drv, self._root, 
self._parts[:-idx - 1])

to:

adjust = idx >= 0 or not (self._drv or self._root)
return self._pathcls._from_parsed_parts(self._drv, self._root, 
self._parts[:-idx - adjust])

which is frankly terrible, even if it's a little faster.

----------
components: Library (Lib)
messages: 403488
nosy: josh.r
priority: normal
severity: normal
status: open
title: pathlib.Path.parents negative indexing is wrong for absolute paths
versions: Python 3.10, Python 3.11

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue45414>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to