Hi Todd,
My comments are below. I would also offer my time for reviewing/testing if wanted.
On 22.11.20 20:53, Todd wrote:
I know enhancements to pathlib get brought up occasionally, but it
doesn't look like anyone has been willing to take the initiative and
see things through to completion. I am willing to keep the ball
rolling here and even implement these myself. I have some suggestions
and I would like to discuss them. I don't think any of them are
significant enough to require a PEP. These can be split into
independent threads if anyone prefers.
1. copy
The big one people keep bringing up that I strongly agree on is a
"copy" method. This is really the only common file manipulation task
that currently isn't possible. You can make files, read them, move
them, delete them, create directories, even do less common operations
like change owners or create symlinks or hard links.
I would really appreciate that one. If I could throw in another detail
that we needed a lot:
- atomic_copy or copy(atomic=True), whichever form you prefer
It is not as easy to achieve as it may look at first sight,
especially when it comes to tempfiles and permissions. Our use cases for
atomic copy included several parallel accesses to the same files,
such as caches in web development.
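To make the tempfile and permission point concrete, here is a minimal sketch of what I mean. The name atomic_copy is hypothetical (nothing like it exists in pathlib), and it assumes that the temp file and the destination are on the same filesystem, which is what makes os.replace atomic on POSIX:

```python
import os
import shutil
import tempfile
from pathlib import Path


def atomic_copy(src: Path, dst: Path) -> Path:
    """Hypothetical helper: copy src to dst without exposing a half-written dst."""
    src, dst = Path(src), Path(dst)
    # Create the temp file next to the destination so that the final
    # os.replace stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=dst.parent, prefix=dst.name + ".", suffix=".tmp")
    os.close(fd)
    tmp = Path(tmp_name)
    try:
        shutil.copyfile(src, tmp)   # copy the data into the temp file
        shutil.copymode(src, tmp)   # mkstemp creates mode 0600; carry over the source mode
        os.replace(tmp, dst)        # swap the finished copy into place
    except BaseException:
        tmp.unlink(missing_ok=True)  # do not leave the temp file behind on failure
        raise
    return dst
```

Readers then see either the old dst or the complete new one. Ownership, extended attributes and fsync for durability are deliberately left out here, which is part of why I say it is harder than it looks.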
A common objection is that pathlib doesn't work on multiple paths.
But that isn't the case. There are a ton of methods that do that,
including:
* symlink_to
* link_to
* rename
* replace
* glob
* rglob
* iterdir
* is_relative_to
* relative_to
* samefile
I think this is really the only common file operation that someone
would need to switch to a different module to do, and it seems pretty
strange to me to be able to make symbolic or hard links to a file but
not straight up copy one.
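For comparison, this is roughly what a plain copy looks like today, stepping out to shutil. The wrapper name copy_file is only for illustration:

```python
import shutil
from pathlib import Path


def copy_file(src: Path, dst: Path) -> Path:
    """Illustrative wrapper: copy src to dst via shutil and return a Path."""
    # shutil.copy2 copies the data and tries to preserve file metadata.
    return Path(shutil.copy2(src, dst))
```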
2. recursive remove
This could be a "recursive" option to "rmdir" or a "rmtree" method (I
prefer the option). The main reason for this is symmetry. It is
possible to create a tree of folders (using "mkdir(parents=True)"),
but once you do that you cannot remove it again in a straightforward way.
Importing shutil does not seem to be a big deal, but I agree that it's
somewhat weird for this to be missing.
Correct me if I'm wrong, but os.path is somehow closer to OS-level
operations, whereas shutil basically provides all the missing convenience
features that sh provides.
So, to me it boils down to the question of whether pathlib is a completely
new paradigm. If so, then sure, let's add it. Additionally, I like the
"batteries included" theme of Python.
Last but not least, I tend more towards the "rmtree" method just to make
it crystal clear to everyone. Maybe the docs could cross-reference both methods.
Tree manipulations are inherently complicated and a lot can go wrong.
Symmetry is not 100% given as you might delete more than what you've
created (which was a single node path).
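A small sketch of the asymmetry I have in mind, using shutil.rmtree as it exists today. The directory and file names are made up for the demo:

```python
import shutil
import tempfile
from pathlib import Path

base = Path(tempfile.mkdtemp())          # scratch area for the demo
root = base / "project"
root.mkdir()
(root / "notes.txt").write_text("existed before mkdir(parents=True)")

# Only the two missing directories "a" and "a/b" are created here.
(root / "a" / "b").mkdir(parents=True)

# rmtree on the top-level directory removes notes.txt as well, i.e. more
# than the mkdir call above created.
shutil.rmtree(root)
leftovers = [p.name for p in base.iterdir()]

shutil.rmtree(base)                      # clean up the scratch area
```

So mkdir(parents=True) adds single nodes along one path, while the removal takes everything below the chosen root.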
3. newline for write_text
This is the only relevant option that "Path.open" has but
"Path.write_text" doesn't, and is a serious omission when dealing with
multiple operating systems.
+1
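For reference, a sketch of the workaround via Path.open, which already accepts newline. The function name write_text_with_newline is hypothetical:

```python
from pathlib import Path


def write_text_with_newline(path: Path, data: str, newline: str = "\n",
                            encoding: str = "utf-8") -> int:
    """Hypothetical stand-in for a write_text that takes a newline argument."""
    # With an explicit newline, every "\n" written is translated to that
    # string instead of the platform default.
    with path.open("w", encoding=encoding, newline=newline) as f:
        return f.write(data)
```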
4. uid and gid
You can get the owner and group name of a file (with the "owner" and
"group" methods), but there is no easy way to get the corresponding
number.
+1
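As far as I know, the route to the numbers today goes through stat(). A sketch, with uid_gid being a made-up helper name:

```python
from pathlib import Path
from typing import Tuple


def uid_gid(path: Path) -> Tuple[int, int]:
    """Made-up helper: numeric owner and group of path."""
    st = path.stat()   # follows symlinks; lstat() would describe the link itself
    return st.st_uid, st.st_gid
```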
5. Stem with no suffixes
The stem property only takes off the last suffix, but even in the
example given ('my/library.tar.gz') it isn't really useful because the
suffix has two parts ('.tar' and '.gz'). I suggest another property,
probably called "rootstem" or "basestem", that takes off all the
suffixes, using the same logic as the "suffixes" property. This is
another symmetry issue: it is possible to extract all the suffixes,
but not remove them.
+1
Does anybody rely on this behavior of ".stem"? It always seemed odd to
me but that might be because of the use-cases I work with.
So, another possibility would be to fix "stem" to do what makes sense.
Maybe also rename the concept "suffix" to "final_suffix" (which is also
more consistent with what the docs say: "The file extension of the final
component, if any:").
To me that has always been the weirdest conceptual behavior of the lib.
Not sure if that's possible to fix before people need time machines.
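A sketch of how such a property could behave, written as a free function. The name base_stem is just one of the candidates from above, not an existing API, and it reuses the "suffixes" logic as proposed:

```python
from pathlib import PurePath


def base_stem(path: PurePath) -> str:
    """Sketch: the final component with every suffix removed."""
    name = path.name
    # Cut off as many characters as all suffixes together occupy.
    return name[: len(name) - sum(len(s) for s in path.suffixes)]


p = PurePath("my/library.tar.gz")
current = p.stem          # "library.tar"
proposed = base_stem(p)   # "library"
```

One caveat with reusing "suffixes": a name like "pkg-1.2.tar.gz" would come out as "pkg-1", since ".2" counts as a suffix too.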
6. with_suffixes
Equivalent to with_suffix, but replacing all suffixes. Again, this is
a symmetry issue. It is hard to manipulate all the suffixes right
now, as the example shows. You can add them or extract them, but not
change them without doing several steps.
+1
Same comment as for basestem.
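Again as a sketch only, here is a free function standing in for the proposed method. The signature (one string holding all new suffixes) is my assumption:

```python
from pathlib import PurePath


def with_suffixes(path: PurePath, new_suffixes: str) -> PurePath:
    """Sketch: replace all suffixes of the final component at once."""
    name = path.name
    stem = name[: len(name) - sum(len(s) for s in path.suffixes)]
    return path.with_name(stem + new_suffixes)


p = PurePath("my/library.tar.gz")
zipped = with_suffixes(p, ".zip")                  # my/library.zip
today = p.with_suffix("").with_suffix(".zip")      # same result, in two steps
```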
7. exist_ok for is_* methods
Currently all the is_* methods (such as is_file) return False if the
file doesn't exist or if it is a broken symlink. This can be
dangerous, since it is not trivially easy to tell if you are dealing
with the wrong type of file vs. a missing file. And it isn't obvious
behavior just from the method name. I suggest adding an "exist_ok"
argument to all of these, with the default being "True" for
backwards-compatibility. This argument name is already in use
elsewhere in pathlib. If this is False and the file is not present, a
"FileNotFoundError" is raised.
+1
Maybe missing_ok would do more to help people understand what the
parameter actually does.
exist_ok is used for creation methods (mkdir and touch), so the name
makes more sense in that context.
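To show the semantics I would expect with that name, a sketch for is_file only. is_file_strict is a hypothetical helper, not a proposal for the final spelling:

```python
import stat
from pathlib import Path


def is_file_strict(path: Path, missing_ok: bool = True) -> bool:
    """Hypothetical: like is_file(), but can raise when nothing is there."""
    try:
        mode = path.stat().st_mode   # follows symlinks, so a broken link counts as missing
    except FileNotFoundError:
        if missing_ok:
            return False             # current behavior: silently False
        raise                        # missing_ok=False: surface the missing file
    # Note: only FileNotFoundError is handled in this sketch; other OS errors
    # propagate, unlike in the real is_file().
    return stat.S_ISREG(mode)
```

With missing_ok=False, a False result then always means "exists, but is not a regular file".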
Best
Sven
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/OKFUWDK6UM7LZBKZV3XHBP46WZMCDCAA/