Re: [Python-Dev] PEP 3121, 384 Refactoring Issues
On 14 Jul 2014 11:41, "Brett Cannon" wrote:
> I agree for PEP 3121, which is the initialization/finalization work. The
> stable ABI is not necessary. So maybe we should re-examine the patches and
> accept the bits that clean up init/finalization and leave out any
> ABI-related changes.

Martin's right about improving the subinterpreter support - every type
declaration we move from a static struct to the dynamic type creation API is
one that isn't shared between subinterpreters any more.

That argument is potentially valid even for *builtin* modules and types, not
just those in extension modules.

Cheers,
Nick.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Remaining decisions on PEP 471 -- os.scandir()
On 14 Jul 2014 22:50, "Ben Hoyt" wrote:
> In light of that, I propose I update the PEP to basically follow
> Victor's model of is_X() and stat() following symlinks by default, and
> allowing you to specify follow_symlinks=False if you want something
> other than that.
>
> Victor had one other question:
>
> > What happens to name and full_name with followlinks=True?
> > Do they contain the name in the directory (name of the symlink)
> > or name of the linked file?
>
> I would say they should contain the name and full path of the entry --
> the symlink, NOT the linked file. They kind of have to, right,
> otherwise they'd have to be method calls that potentially call the
> system.

It would be worth explicitly pointing out "os.readlink(entry.full_name)" in
the docs as the way to get the target of a symlink entry.

Alternatively, it may be worth including a readlink() method directly on the
entry objects. (That can easily be added later though, so no need for it in
the initial proposal).

> In any case, here's the modified proposal:
>
> scandir(path='.') -> generator of DirEntry objects, which have:
>
> * name: name as per listdir()
> * full_name: full path name (not necessarily absolute), equivalent of
>   os.path.join(path, entry.name)
> * is_dir(follow_symlinks=True): like os.path.isdir(entry.full_name),
>   but free in most cases; cached per entry
> * is_file(follow_symlinks=True): like os.path.isfile(entry.full_name),
>   but free in most cases; cached per entry
> * is_symlink(): like os.path.islink(), but free in most cases; cached
>   per entry
> * stat(follow_symlinks=True): like os.stat(entry.full_name,
>   follow_symlinks=follow_symlinks); cached per entry
>
> The above may not be quite perfect, but it's good, and I think there's
> been enough bike-shedding on the API. :-)

+1, sounds good to me (and I like having the caching guarantees listed -
helps make it clear how DirEntry differs from pathlib.Path)

Cheers,
Nick.

> So please speak now or forever hold your peace. :-) I intend to update
> the PEP to reflect this and make a few other clarifications in the
> next few days.
>
> -Ben
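As an illustration of the proposal above, here is a minimal sketch written against the os.scandir() that later shipped in Python 3.5. One assumption to flag: in the released API the attribute proposed here as full_name is spelled path, so the sketch uses entry.path. The helper name describe_entries is hypothetical.

```python
import os


def describe_entries(directory):
    """Return {name: kind} for each entry, using DirEntry's cached checks."""
    kinds = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_symlink():
                # The entry names the link itself, so the target has to be
                # requested explicitly, as suggested in the thread.
                kinds[entry.name] = "symlink -> " + os.readlink(entry.path)
            elif entry.is_dir(follow_symlinks=False):
                kinds[entry.name] = "dir"
            elif entry.is_file(follow_symlinks=False):
                kinds[entry.name] = "file"
            else:
                kinds[entry.name] = "other"
    return kinds
```

Calling describe_entries(".") lists the current directory without a separate stat() call per entry in the common case.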
Re: [Python-Dev] Remaining decisions on PEP 471 -- os.scandir()
> Looks doable. Just make sure the cached entries reflect the
> 'follow_symlinks' setting -- so a symlink could end up with both an lstat
> cached entry and a stat cached entry.

Yes, good point -- basically the functions will use the _stat cache if
follow_symlinks=True, otherwise the _lstat cache. If the entry is not a
symlink (the usual case), they'll be the same value.

-Ben
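The two-cache idea can be sketched in pure Python as follows. This is only an illustration under stated assumptions: CachedEntry is a hypothetical stand-in, not the real DirEntry, and the attribute names _stat and _lstat are borrowed from the message above.

```python
import os


class CachedEntry:
    """Hypothetical stand-in for DirEntry; not the real implementation."""

    def __init__(self, directory, name):
        self.name = name
        self.full_name = os.path.join(directory, name)
        self._stat = None   # cached result when following symlinks
        self._lstat = None  # cached result for the entry itself

    def stat(self, *, follow_symlinks=True):
        if follow_symlinks:
            if self._stat is None:
                self._stat = os.stat(self.full_name)
            return self._stat
        if self._lstat is None:
            self._lstat = os.lstat(self.full_name)
        return self._lstat
```

For an entry that is not a symlink, both caches end up describing the same file, which matches the "usual case" Ben mentions.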
Re: [Python-Dev] Remaining decisions on PEP 471 -- os.scandir()
> Sorry, I don't remember who but someone proposed to add the follow_symlinks
> parameter in scandir() directly. If the parameter is added to methods,
> there is no such issue.

Yeah, I think having the DirEntry methods do different things depending on
how scandir() was called is a really bad idea. It seems you're agreeing with
this?

> Again: remove any guarantee about the cache in the definitions of methods,
> instead copy the doc from os.path and os. Add a global remark saying that
> most methods don't need any syscall in general, except for symlinks (with
> follow_symlinks=True).

I'm not sure I follow this -- surely it *has* to be documented that the
values of DirEntry.is_X() and DirEntry.stat() are cached per entry, in
contrast to os.path.isX()/os.stat()?

I don't mind a global remark about not needing syscalls, but I do think it
makes sense to make it explicit -- that is_X() almost never needs syscalls,
whereas stat() does only on POSIX but is free on Windows (except for
symlinks).

-Ben
Re: [Python-Dev] Remaining decisions on PEP 471 -- os.scandir()
> I'd *keep DirEntry.lstat() method* regardless of existence of
> .stat(*, follow_symlinks=True) method (despite the slight violation of
> DRY principle) for readability. `dir_entry.lstat().st_mode` is more
> concise than `dir_entry.stat(follow_symlinks=False).st_mode` and the
> meaning of lstat is well-established -- get (symbolic link) status [2].

The meaning of lstat() is well-established, so I don't mind this. But I
don't think it's necessary, either. My thought would be that in new
code/functions we should kind of prescribe best practices rather than leave
the options open. Yes, it's a few more characters, but
"follow_symlinks=True" is a lot clearer than "l" to describe this behaviour,
especially for non-Linux hackers.

> I suggest *renaming .full_name -> .path* due to reasons outlined in [1].
>
> [1]: https://mail.python.org/pipermail/python-dev/2014-July/135441.html

Hmmm, perhaps. You suggest .full_name implies it's the absolute path, which
isn't true. I don't mind .path, but it kind of sounds like "the Path object
associated with this entry". I think "full_name" is fine -- it's not
"abs_name".

> follow_symlinks (if added) should be *keyword-only parameter* because
> `dir_entry.is_dir(False)` is unreadable (it is not clear at a glance
> what `False` means in this case).

Agreed follow_symlinks should be a keyword-only parameter (as it is in
os.stat() in Python 3).

> Exceptions are part of the public API. pathlib is inconsistent with
> os.path here e.g., os.path.isdir() ignores all OS errors raised by
> the stat() call but the corresponding pathlib call ignores only broken
> symlinks (non-existent entries).
>
> The cherry-picking of which stat errors to silence (implicitly) seems
> worse than either silencing the errors (like os.path.isdir does) or
> allowing them to propagate.

Hmmm, you're right there's a subtle difference here. I think the
os.path.isdir() behaviour could mask real errors, and the pathlib behaviour
is more correct. pathlib's behaviour is not implicit though -- it's clearly
documented in the docs:
https://docs.python.org/3/library/pathlib.html#pathlib.Path.is_dir

> Returning False instead of raising OSError in is_dir() method simplifies
> the usage greatly without (much) negative consequences. It is a *rare*
> case when silencing errors could be more practical.

I think is_X() *should* fail if there are permissions errors or other fatal
errors. Whether or not they should fail if the file doesn't exist (unlikely
to happen anyway) or on a broken symlink is a different question, but
there's a good precedent with the existing os/pathlib functions there.

-Ben
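To make the keyword-only point and the error-handling question concrete, here is a small sketch. SketchEntry is a hypothetical class, and the choice to return False only for a missing path while letting other OSError subclasses propagate is one possible reading of Ben's position, not a statement of what DirEntry does.

```python
import os
import stat


class SketchEntry:
    """Hypothetical entry type with a keyword-only follow_symlinks."""

    def __init__(self, path):
        self.path = path

    def is_dir(self, *, follow_symlinks=True):
        # Permission errors and other OSError subclasses propagate; only a
        # missing path (or broken symlink) is reported as "not a directory".
        try:
            st = os.stat(self.path, follow_symlinks=follow_symlinks)
        except FileNotFoundError:
            return False
        return stat.S_ISDIR(st.st_mode)
```

With the bare `*` in the signature, `entry.is_dir(False)` raises TypeError, so callers have to write `entry.is_dir(follow_symlinks=False)`.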
Re: [Python-Dev] Remaining decisions on PEP 471 -- os.scandir()
On 15 July 2014 13:19, Ben Hoyt wrote:
> Hmmm, perhaps. You suggest .full_name implies it's the absolute path,
> which isn't true. I don't mind .path, but it kind of sounds like "the
> Path object associated with this entry". I think "full_name" is fine
> -- it's not "abs_name".

Interesting. I hadn't really thought about it, but I might have assumed
full_name was absolute. However, now I see that it's "only as absolute as
the directory argument to scandir is".

Having said that, I don't think that full_name *implies* that, just that
it's a possible mistake people could make. I agree that "path" could be seen
as implying a Path object.

My preference would be to retain the name full_name, but just make it
explicit in the documentation that it is based on the directory name
argument.

Paul
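A short sketch of the "only as absolute as the directory argument" point, using the os.path.join() equivalence given in the proposal. The function name full_name and the file name setup.py are placeholders chosen for illustration.

```python
import os


def full_name(directory, name):
    # Mirrors the proposal text: equivalent of os.path.join(path, entry.name).
    return os.path.join(directory, name)


# A relative directory argument gives a relative result ...
relative = full_name(".", "setup.py")
# ... and an absolute directory argument gives an absolute one.
absolute = full_name(os.path.abspath("."), "setup.py")
```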
Re: [Python-Dev] Remaining decisions on PEP 471 -- os.scandir()
On 07/14/2014 11:25 PM, Victor Stinner wrote:
> Again: remove any guarantee about the cache in the definitions of methods,
> instead copy the doc from os.path and os. Add a global remark saying that
> most methods don't need any syscall in general, except for symlinks (with
> follow_symlinks=True).

I don't understand what you're saying here. The fact that DirEntry.is_xxx
will use cached values *must* be documented, or our users will waste huge
amounts of time trying to figure out why an unknowingly cached value is no
longer matching the current status.

~Ethan~
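The kind of surprise Ethan describes can be shown with the os.scandir() that later shipped in Python 3.5. This is a sketch, and stale_entry_demo is a hypothetical helper name; it asks the entry once, deletes the file, and then compares the entry's answer with a fresh os.path.isfile() call.

```python
import os
import tempfile


def stale_entry_demo():
    """Return (answer_from_entry, live_answer) after the file is deleted."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "data.txt")
        open(target, "w").close()
        with os.scandir(tmp) as entries:
            entry = next(entries)
            entry.is_file()  # make sure the entry has its answer cached
        os.remove(target)
        # The entry still reports what it saw at scan time.
        return entry.is_file(), os.path.isfile(target)
```

The two values disagree after the deletion, which is why the caching needs to be spelled out in the documentation.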
Re: [Python-Dev] Another case for frozendict
In article , Chris Angelico wrote:
> On Mon, Jul 14, 2014 at 12:04 AM, Jason R. Coombs wrote:
> > I can achieve what I need by constructing a set on the ‘items’ of the
> > dict.
> >
> > > set(tuple(doc.items()) for doc in res)
> >
> > {(('n', 1), ('err', None), ('ok', 1.0))}
>
> This is flawed; the tuple-of-tuples depends on iteration order, which
> may vary. It should be a frozenset of those tuples, not a tuple. Which
> strengthens your case; it's that easy to get it wrong in the absence
> of an actual frozendict.

I would love to see frozendict in python.

I find myself using dicts for translation tables, usually tables that should
not be modified. Documentation usually suffices to get that idea across, but
it's not ideal.

frozendict would also be handy as a default value for function arguments. In
that case documentation isn't enough and one has to resort to using a
default value of None and then changing it in the function body.

I like frozendict because I feel it is expressive and adds some safety.

-- Russell
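Both workarounds mentioned in this message can be sketched briefly. The function names unique_docs and translate, and the sample translation table, are hypothetical; the first function applies Chris's frozenset correction, the second shows the None-default idiom Russell describes.

```python
def unique_docs(docs):
    """Collect distinct dicts regardless of key order.

    Values must be hashable for this to work.
    """
    return {frozenset(doc.items()) for doc in docs}


def translate(word, table=None):
    # A dict default would be a single shared, mutable object, so the
    # default is None and the real table is created inside the body.
    if table is None:
        table = {"hello": "bonjour"}
    return table.get(word, word)
```

With tuple(doc.items()) instead of frozenset(doc.items()), two dicts holding the same items in a different order would count as different documents.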
Re: [Python-Dev] Another case for frozendict
On 2014-07-16 00:48, Russell E. Owen wrote:
> In article , Chris Angelico wrote:
> > On Mon, Jul 14, 2014 at 12:04 AM, Jason R. Coombs wrote:
> > > I can achieve what I need by constructing a set on the ‘items’ of
> > > the dict.
> > >
> > > set(tuple(doc.items()) for doc in res)
> > >
> > > {(('n', 1), ('err', None), ('ok', 1.0))}
> >
> > This is flawed; the tuple-of-tuples depends on iteration order, which
> > may vary. It should be a frozenset of those tuples, not a tuple. Which
> > strengthens your case; it's that easy to get it wrong in the absence
> > of an actual frozendict.
>
> I would love to see frozendict in python.
>
> I find myself using dicts for translation tables, usually tables that
> should not be modified. Documentation usually suffices to get that idea
> across, but it's not ideal.
>
> frozendict would also be handy as a default value for function
> arguments. In that case documentation isn't enough and one has to resort
> to using a default value of None and then changing it in the function
> body.
>
> I like frozendict because I feel it is expressive and adds some safety.

Here's another use-case.

Using the 're' module:

    >>> import re
    >>> # Make a regex.
    ... p = re.compile(r'(?P<first>\w+)\s+(?P<second>\w+)')
    >>>
    >>> # What are the named groups?
    ... p.groupindex
    {'first': 1, 'second': 2}
    >>>
    >>> # Perform a match.
    ... m = p.match('FIRST SECOND')
    >>> m.groupdict()
    {'first': 'FIRST', 'second': 'SECOND'}
    >>>
    >>> # Try modifying the pattern object.
    ... p.groupindex['JUNK'] = 'foobar'
    >>>
    >>> # What are the named groups now?
    ... p.groupindex
    {'first': 1, 'second': 2, 'JUNK': 'foobar'}
    >>>
    >>> # And the match object?
    ... m.groupdict()
    Traceback (most recent call last):
      File "<stdin>", line 2, in <module>
    IndexError: no such group

It can't find a named group called 'JUNK'.

And with a bit more tinkering it's possible to crash Python. (I'll leave
that as an exercise for the reader! :-))

The 'regex' module, on the other hand, rebuilds the dict each time:

    >>> import regex
    >>> # Make a regex.
    ... p = regex.compile(r'(?P<first>\w+)\s+(?P<second>\w+)')
    >>>
    >>> # What are the named groups?
    ... p.groupindex
    {'second': 2, 'first': 1}
    >>>
    >>> # Perform a match.
    ... m = p.match('FIRST SECOND')
    >>> m.groupdict()
    {'second': 'SECOND', 'first': 'FIRST'}
    >>>
    >>> # Try modifying the regex.
    ... p.groupindex['JUNK'] = 'foobar'
    >>>
    >>> # What are the named groups now?
    ... p.groupindex
    {'second': 2, 'first': 1}
    >>>
    >>> # And the match object?
    ... m.groupdict()
    {'second': 'SECOND', 'first': 'FIRST'}

Using a frozendict instead would be a nicer solution.
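Python has no built-in frozendict, but types.MappingProxyType (in the standard library since Python 3.3) gives a read-only view of a dict, which addresses this particular use-case without rebuilding the dict on every access. The sketch below uses a hypothetical Pattern class for illustration; it is not the real re or regex pattern type.

```python
from types import MappingProxyType


class Pattern:
    """Hypothetical pattern object; not the real re or regex class."""

    def __init__(self, groups):
        self._groups = dict(groups)
        # Read-only view: no copy per access, and callers cannot mutate it.
        self.groupindex = MappingProxyType(self._groups)


p = Pattern({"first": 1, "second": 2})
try:
    p.groupindex["JUNK"] = "foobar"
except TypeError:
    rejected = True   # item assignment is refused by the proxy
else:
    rejected = False
```

The proxy is a view rather than a frozen copy, so the owning object can still update _groups internally; that is a weaker guarantee than a true frozendict would give.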
Re: [Python-Dev] Remaining decisions on PEP 471 -- os.scandir()
I was going to stay out of this one...

On 14Jul2014 10:25, Victor Stinner wrote:
> 2014-07-14 4:17 GMT+02:00 Nick Coghlan :
> > Or the ever popular symlink to "." (or a directory higher in the tree).
>
> "." and ".." are explicitly ignored by os.listdir() and os.scandir().
>
> > I think os.walk() is a good source of inspiration here: call the flag
> > "followlink" and default it to False.

I also think followlinks should be spelt like os.walk, and also default to
False.

> IMO the specific function os.walk() is not a good example. It includes
> symlinks to directories in the dirs list and then it does not follow
> symlink,

I agree that is a bad mix.

> it is a recursive function and has a followlinks optional parameter
> (default: False).

Which I think is desirable.

> Moreover, in 92% of cases, functions using os.listdir() and
> os.path.isdir() *follow* symlinks:
> https://mail.python.org/pipermail/python-dev/2014-July/135435.html

Sigh. This is a historic artifact, a convenience, and a side effect of
bringing symlinks into UNIX in the first place.

The objective was that symlinks should largely be transparent to users for
naive operation. So the UNIX calls open/cd/listdir all follow symlinks so
that things work transparently and a million C programs do not break.

However, so do chmod/chgrp/chown, for the same reasons and with generally
less desirable effects.

Conversely, the find command, for example, does not follow symlinks and this
is generally a good thing. "ls" is the same. Like os.walk, they are for
inspecting stuff, and shouldn't indirect unless asked.

I think following symlinks, especially for something like os.walk and
os.scandir, should default to False. I DO NOT want to quietly wander to
remote parts of the file space because someone has stuck a symlink somewhere
unfortunate, lurking like a little bomb (or perhaps trapdoor, waiting to
suck me down into an unexpected dark place).

It is also slower to follow symlinks by default.

I am also against flag parameters that default to True, on the whole; they
are a failure of ergonomic design. Leaving off a flag should usually be like
setting it to False. A missing flag is an "off" flag.

For these reasons (and others I have not yet thought through :-) I am voting
for a:

    followlinks=False

optional parameter. If you want to follow links, it is hardly difficult.

Cheers,
Cameron Simpson

Our job is to make the questions so painful that the only way to make the
pain go away is by thinking. - Fred Friendly
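The behaviour Cameron argues for can be sketched as a small recursive walker. iter_tree is a hypothetical helper, not os.walk itself; it is written against the os.scandir() that later shipped in Python 3.5 and borrows the followlinks spelling from os.walk().

```python
import os


def iter_tree(top, followlinks=False):
    """Yield every path under top.

    Symlinked directories are listed but not entered unless followlinks is
    true. With followlinks=True, a symlink pointing back up the tree would
    recurse without end; this sketch does no loop detection.
    """
    with os.scandir(top) as entries:
        for entry in entries:
            yield entry.path
            if entry.is_dir(follow_symlinks=followlinks):
                yield from iter_tree(entry.path, followlinks)
```

With the default, a symlink to "." or to a parent directory shows up once in the output and the walk terminates.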