[Python-ideas] Re: pathlib enhancements

2021-01-09 Thread David Mertz
For my entire filesystem:

 124920 cpython-38.pyc
  50034 html.gz
  31158 cpython-39.pyc
  31032 d.ts
  30415 cpython-37.pyc
  21473 cpython-36.pyc
  19000 js.map
   9888 symbolic.png
   5086 cpython-35.pyc
   5004 1.gz
   4657 cpython-38-x86_64-linux-gnu.so
   4261 pypy36.pyc
   4152 Debian.gz
   4041 png.i
   3534 cpython-33.pyc
   3421 cpython-34.pyc
   2950 min.js
   2880 cpython-34.pyo
   2668 unix.ip
   2668 unix.gid
   2668 rpcsec.init
   2668 rpcsec.context
   2656 yarn-metadata.json
   2615 csv.gz
   2614 yarn-tarball.tgz
   2575 c.i
   2442 3.gz
   2202 tar.bz2
   2128 so.0
   2124 ts.map

On Sat, Jan 9, 2021 at 9:37 PM David Mertz  wrote:

> On my system:
>
>  % find ~ -name '*.*.*' | rev | cut -d. -f-2 | rev | sort | uniq -c | sort -nr | head -30
>   17278 d.ts
>   11314 js.map
>    6600 symbolic.png
>    4041 png.i
>    3968 cpython-37.pyc
>    2656 yarn-metadata.json
>    2614 yarn-tarball.tgz
>    2575 c.i
>    2526 csv.gz
>    1727 h.i
>    1659 opt-1.pyc
>    1590 opt-2.pyc
>    1302 autogen.js
>    1151 ts.map
>    1148 js.flow
>     854 svg.i
>     852 min.js
>     744 test.js
>     651 travis.yml
>     560 gif.i
>     522 so.0
>     403 indexeddb.leveldb
>     384 pom.sha1
>     368 ref.css
>     367 0.0
>     357 so.1
>     311 event.jsonlz4
>     283 xpm.i
>     278 ref.ui
>     275 am.i
>
> Most of those I honestly have no idea what they are.  That's just starting
> from $HOME.  System wide, who knows.
>
> On Sat, Jan 9, 2021 at 7:27 PM <2qdxy4rzwzuui...@potatochowder.com> wrote:
>
>> On 2021-01-10 at 05:03:08 +1100,
>> Chris Angelico  wrote:
>>
>> > On Sun, Jan 10, 2021 at 4:51 AM Stephen J. Turnbull
>> >  wrote:
>> > >
>> > > Joseph Martinot-Lagarde writes:
>> > >
>> > >  > One remark about this : .tar.gz files are the exception rather than
>> > >  > the rule, and AFAIK maybe the only one ?
>> > >
>> > > Not really.  stem.ext -> stem.ext.zzz where zzz is a compression
>> > > extension is a pretty common naming convention.  For me ext == 'tar'
>> > > is by far the most common case (74%), 'tis true, but 'patch' (10%),
>> > > 'txt' (6%), 'tab', 'gml', 'xml', 'svg', 'pdf', 'ps', 'dvi', 'diff',
>> > > 'pdb', 'cpp', 'el', and 'data' also exist somewhere under $HOME.  I'll
>> > > bet others show up if I search /usr, /var, and /opt.
>> >
>> > Yep, and most of my man pages are compressed, so there's
>> > usr/share/man/man1/*.1.gz and friends.
>> >
>> > I'd say the most common case with multiple extensions is indeed
>> > precisely two, where the first one is the type of file (or in the case
>> > of man pages, the section), and the second is a compression format.
>> > But there'll be less common cases too.
>>
>> I also have a pile of whatever-x.y.z.* files, where the * is some kind
>> of compression extension and x.y.z is a major.minor.patch identifier.
>>
>> Most of the time, my brain is big enough to spot where x.y.z ends and
>> the extension(s) begin(s), but throw in a version identifier like
>> 4.3.beta, and all bets are off (unless I happen to know exactly what to
>> look for, in which case I wouldn't bother with a general purpose library
>> function that might make the wrong assumption).


-- 
The dead increasingly dominate and strangle both the living and the
not-yet born.  Vampiric capital and undead corporate persons abuse
the lives and control the thoughts of homo faber. Ideas, once born,
become abortifacients against new conceptions.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/SVUGIHQTECSX4BTXMYD4NMW4DWOO5MLI/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: pathlib enhancements

2021-01-09 Thread David Mertz
On my system:

 % find ~ -name '*.*.*' | rev | cut -d. -f-2 | rev | sort | uniq -c | sort -nr | head -30
  17278 d.ts
  11314 js.map
   6600 symbolic.png
   4041 png.i
   3968 cpython-37.pyc
   2656 yarn-metadata.json
   2614 yarn-tarball.tgz
   2575 c.i
   2526 csv.gz
   1727 h.i
   1659 opt-1.pyc
   1590 opt-2.pyc
   1302 autogen.js
   1151 ts.map
   1148 js.flow
    854 svg.i
    852 min.js
    744 test.js
    651 travis.yml
    560 gif.i
    522 so.0
    403 indexeddb.leveldb
    384 pom.sha1
    368 ref.css
    367 0.0
    357 so.1
    311 event.jsonlz4
    283 xpm.i
    278 ref.ui
    275 am.i

Most of those I honestly have no idea what they are.  That's just starting
from $HOME.  System wide, who knows.
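For anyone who would rather run the same survey from Python, here is a rough sketch of the pipeline above. The function name `count_double_extensions` is made up for illustration, and the traversal details (hidden files, unreadable directories) may not match `find` exactly on every platform.

```python
from collections import Counter
from pathlib import Path


def count_double_extensions(root, top=30):
    """Tally the last two dot-separated fields of names matching '*.*.*'.

    Hypothetical helper: roughly mirrors
    find ROOT -name '*.*.*' | rev | cut -d. -f-2 | rev | sort | uniq -c | sort -nr
    """
    counts = Counter()
    for path in Path(root).rglob("*.*.*"):
        # Keep only the last two fields of the file name, e.g. "tar.gz".
        counts[".".join(path.name.split(".")[-2:])] += 1
    # most_common() sorts by descending count, like `sort -nr | head -N`.
    return counts.most_common(top)
```

Calling `count_double_extensions(Path.home())` should give a list of `(ending, count)` pairs comparable to the listing above.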

On Sat, Jan 9, 2021 at 7:27 PM <2qdxy4rzwzuui...@potatochowder.com> wrote:

> On 2021-01-10 at 05:03:08 +1100,
> Chris Angelico  wrote:
>
> > On Sun, Jan 10, 2021 at 4:51 AM Stephen J. Turnbull
> >  wrote:
> > >
> > > Joseph Martinot-Lagarde writes:
> > >
> > >  > One remark about this : .tar.gz files are the exception rather than
> > >  > the rule, and AFAIK maybe the only one ?
> > >
> > > Not really.  stem.ext -> stem.ext.zzz where zzz is a compression
> > > extension is a pretty common naming convention.  For me ext == 'tar'
> > > is by far the most common case (74%), 'tis true, but 'patch' (10%),
> > > 'txt' (6%), 'tab', 'gml', 'xml', 'svg', 'pdf', 'ps', 'dvi', 'diff',
> > > 'pdb', 'cpp', 'el', and 'data' also exist somewhere under $HOME.  I'll
> > > bet others show up if I search /usr, /var, and /opt.
> >
> > Yep, and most of my man pages are compressed, so there's
> > usr/share/man/man1/*.1.gz and friends.
> >
> > I'd say the most common case with multiple extensions is indeed
> > precisely two, where the first one is the type of file (or in the case
> > of man pages, the section), and the second is a compression format.
> > But there'll be less common cases too.
>
> I also have a pile of whatever-x.y.z.* files, where the * is some kind
> of compression extension and x.y.z is a major.minor.patch identifier.
>
> Most of the time, my brain is big enough to spot where x.y.z ends and
> the extension(s) begin(s), but throw in a version identifier like
> 4.3.beta, and all bets are off (unless I happen to know exactly what to
> look for, in which case I wouldn't bother with a general purpose library
> function that might make the wrong assumption).


___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/RPT46UYKGOBABCUDG5DUPZYYFZP2Y6K5/


[Python-ideas] Re: pathlib enhancements

2021-01-09 Thread 2QdxY4RzWzUUiLuE
On 2021-01-10 at 05:03:08 +1100,
Chris Angelico  wrote:

> On Sun, Jan 10, 2021 at 4:51 AM Stephen J. Turnbull
>  wrote:
> >
> > Joseph Martinot-Lagarde writes:
> >
> >  > One remark about this : .tar.gz files are the exception rather than
> >  > the rule, and AFAIK maybe the only one ?
> >
> > Not really.  stem.ext -> stem.ext.zzz where zzz is a compression
> > extension is a pretty common naming convention.  For me ext == 'tar'
> > is by far the most common case (74%), 'tis true, but 'patch' (10%),
> > 'txt' (6%), 'tab', 'gml', 'xml', 'svg', 'pdf', 'ps', 'dvi', 'diff',
> > 'pdb', 'cpp', 'el', and 'data' also exist somewhere under $HOME.  I'll
> > bet others show up if I search /usr, /var, and /opt.
> 
> Yep, and most of my man pages are compressed, so there's
> usr/share/man/man1/*.1.gz and friends.
> 
> I'd say the most common case with multiple extensions is indeed
> precisely two, where the first one is the type of file (or in the case
> of man pages, the section), and the second is a compression format.
> But there'll be less common cases too.

I also have a pile of whatever-x.y.z.* files, where the * is some kind
of compression extension and x.y.z is a major.minor.patch identifier.

Most of the time, my brain is big enough to spot where x.y.z ends and
the extension(s) begin(s), but throw in a version identifier like
4.3.beta, and all bets are off (unless I happen to know exactly what to
look for, in which case I wouldn't bother with a general purpose library
function that might make the wrong assumption).
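To make the ambiguity concrete, here is a small sketch of what pathlib currently reports for names like these. The helper `describe` and the sample file names are illustrative only.

```python
from pathlib import PurePosixPath


def describe(name):
    """Return (suffix, suffixes, stem) as pathlib sees them for a file name."""
    p = PurePosixPath(name)
    return p.suffix, p.suffixes, p.stem


for name in ("ls.1.gz", "whatever-1.2.3.tar.gz", "whatever-4.3.beta.tar.gz"):
    print(name, describe(name))
```

pathlib treats every dot after the first as the start of a suffix, so `suffixes` for `whatever-1.2.3.tar.gz` comes out as `['.2', '.3', '.tar', '.gz']`, with the version number split across the list.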
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/WPDXKRGXDDLC4GOCFW3OIHTPHOM7KJMZ/


[Python-ideas] Re: pathlib enhancements

2021-01-09 Thread Chris Angelico
On Sun, Jan 10, 2021 at 4:51 AM Stephen J. Turnbull
 wrote:
>
> Joseph Martinot-Lagarde writes:
>
>  > One remark about this : .tar.gz files are the exception rather than
>  > the rule, and AFAIK maybe the only one ?
>
> Not really.  stem.ext -> stem.ext.zzz where zzz is a compression
> extension is a pretty common naming convention.  For me ext == 'tar'
> is by far the most common case (74%), 'tis true, but 'patch' (10%),
> 'txt' (6%), 'tab', 'gml', 'xml', 'svg', 'pdf', 'ps', 'dvi', 'diff',
> 'pdb', 'cpp', 'el', and 'data' also exist somewhere under $HOME.  I'll
> bet others show up if I search /usr, /var, and /opt.

Yep, and most of my man pages are compressed, so there's
usr/share/man/man1/*.1.gz and friends.

I'd say the most common case with multiple extensions is indeed
precisely two, where the first one is the type of file (or in the case
of man pages, the section), and the second is a compression format.
But there'll be less common cases too.

ChrisA
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/WFZN5EO5SKGDITPSAB4JYP35A6FBJNX7/


[Python-ideas] Re: pathlib enhancements

2021-01-09 Thread Stephen J. Turnbull
Joseph Martinot-Lagarde writes:

 > One remark about this : .tar.gz files are the exception rather than
 > the rule, and AFAIK maybe the only one ?

Not really.  stem.ext -> stem.ext.zzz where zzz is a compression
extension is a pretty common naming convention.  For me ext == 'tar'
is by far the most common case (74%), 'tis true, but 'patch' (10%),
'txt' (6%), 'tab', 'gml', 'xml', 'svg', 'pdf', 'ps', 'dvi', 'diff',
'pdb', 'cpp', 'el', and 'data' also exist somewhere under $HOME.  I'll
bet others show up if I search /usr, /var, and /opt.
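As a sketch of how the stem.ext.zzz convention could be handled with today's pathlib: the helper `split_compressed` and the `COMPRESSION_SUFFIXES` set below are assumptions for illustration, not an existing or proposed API, and the set is deliberately not exhaustive.

```python
from pathlib import PurePosixPath

# Assumed, non-exhaustive set of compression suffixes (illustrative only).
COMPRESSION_SUFFIXES = {".gz", ".bz2", ".xz", ".zst", ".Z"}


def split_compressed(name):
    """Split a name shaped like stem.ext.zzz into (stem, ext, compression).

    The compression part is only peeled off when the final suffix is in
    COMPRESSION_SUFFIXES; otherwise it is returned as an empty string.
    """
    p = PurePosixPath(name)
    compression = p.suffix if p.suffix in COMPRESSION_SUFFIXES else ""
    # Drop the compression suffix, then read stem/ext off what remains.
    inner = p.with_suffix("") if compression else p
    return inner.stem, inner.suffix, compression
```

For example, `split_compressed("fix.patch.bz2")` returns `("fix", ".patch", ".bz2")`, while a name with version dots in the stem would still be split at the wrong place.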
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/EHKHFHQDALHA7IGPZEJINTMSG5AMZKTE/


[Python-ideas] Re: Support reversed(itertools.chain(x, y, z))

2021-01-09 Thread Chris Angelico
On Sun, Jan 10, 2021 at 12:29 AM Oscar Benjamin
 wrote:
> I haven't ever wanted to reverse a chain but I have wanted to be able
> to reverse an enumerate many times:
>
> >>> reversed(enumerate([1, 2, 3]))
> ...
> TypeError
>
> The alternative zip(range(len(obj)-1, -1, -1), reversed(obj)) is
> fairly cryptic in comparison as well as probably being less efficient.
> There could be a __reversed__ method for enumerate with the same
> caveat as for chain: if the underlying object is not reversible then
> you get a TypeError. Otherwise reversed(enumerate(seq)) works fine for
> any sequence seq.

To clarify, you want reversed(enumerate(x)) to yield the exact same
pairs that reversed(list(enumerate(x))) would return, yes? If so, it
absolutely must have a length, AND be reversible. I don't think
spelling it reversed(enumerate(x)) will work, due to issues with
partial consumption; but it wouldn't be too hard to define a
renumerate function:

def renumerate(seq):
    """Equivalent to reversed(list(enumerate(seq))) but more efficient"""
    return zip(range(len(seq))[::-1], reversed(seq))

(I prefer spelling it [::-1] to risking getting the range args
wrong, but otherwise it's the same as what you had)

But if you'd rather not do this, then a cleaner solution might be for
enumerate() to grow a step parameter:

enumerate(iterable, start=0, step=1)

And then, what you want is simply:

enumerate(reversed(seq), len(seq) - 1, -1)

I'm +0 on enumerate gaining a parameter, and otherwise, -1 on actual
changes to the stdlib - this is a one-liner that you can have in your
personal library if you need it. Might be a cool recipe for itertools
docs, or one of the third-party more-itertools packages, or something,
though.
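Since enumerate() has no step parameter today, the idea can be tried out with a stand-in. In this sketch `enumerate_step` is a hypothetical helper built on itertools.count, and `renumerate` is rewritten on top of it.

```python
from itertools import count


def enumerate_step(iterable, start=0, step=1):
    """Hypothetical enumerate() variant with a step; not part of the stdlib."""
    # count(start, step) supplies the indices; zip stops when iterable ends.
    return zip(count(start, step), iterable)


def renumerate(seq):
    """Same pairs as reversed(list(enumerate(seq))), without building a list."""
    return enumerate_step(reversed(seq), len(seq) - 1, -1)
```

Like the zip/range version, this needs `seq` to support both len() and reversed().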

ChrisA
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/5AM7GR2IXPAOHA74RMO7YRF5FGMFKL2P/


[Python-ideas] Re: Add command-line option to unittest for enabling post-mortem debugging

2021-01-09 Thread Dominik Vilsmeier
In case someone is interested, I created a corresponding pull request here: 
https://github.com/python/cpython/pull/23900
It's a lightweight change since the relevant methods `TestCase.debug` and 
`TestSuite.debug` were already in place. The documentation of these methods makes 
it clear that this is what they were intended to be used for:

"Run the test without collecting the result. This allows exceptions raised by 
the test to be propagated to the caller, and can be used to support running 
tests under a debugger."

Dominik Vilsmeier wrote:
> Consider the following example:
> import unittest
> 
> def foo():
>     for x in [1, 2, 'oops', 4]:
>         print(x + 100)
> 
> class TestFoo(unittest.TestCase):
>     def test_foo(self):
>         self.assertIs(foo(), None)
> 
> if __name__ == '__main__':
>     unittest.main()
> 
> If we were calling foo directly we could enter post-mortem debugging via
> python -m pdb test.py.
> However, since foo is wrapped in a test case, unittest eats the
> exception and thus prevents post-mortem debugging. --failfast doesn't help;
> the exception is still swallowed.
> Since I am not aware of a solution that enables post-mortem debugging in
> such a case (without modifying the test scripts; please correct me if one
> exists), I propose adding a command-line option to unittest for running
> test cases in debug mode so that post-mortem debugging can be used.
> P.S.: There is also this SO
> question on a similar topic.
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/NXFAAECYVGHB6O35SJAVNVTNVKIO5KOU/


[Python-ideas] Re: Support reversed(itertools.chain(x, y, z))

2021-01-09 Thread Paul Moore
On Sat, 9 Jan 2021 at 13:29, Oscar Benjamin  wrote:
> The argument to reversed either needs to be a sequence with __len__
> and __getitem__ or an object with a __reversed__ method that returns
> an iterator. The arguments to chain have to be iterables. Every
> sequence is an iterable so there is a significant intersection between
> the possible inputs to chain and reversed. Also some non-sequences
> such as dict can work with reversed.
>
> You say it's hard to see how it could be made to work but you've shown
> precisely how it can already be done above:
>
>  reversed(chain(*args))   ==   chain(*map(reversed, reversed(args)))
>
> We can try that out and it certainly seems to work:
>
> >>> from itertools import chain
> >>> args = [[1, 2], [3, 4]]
> >>> list(chain(*args))
> [1, 2, 3, 4]
> >>> list(chain(*map(reversed, reversed(args))))
> [4, 3, 2, 1]
>
> This wouldn't work with chain.from_iterable without preconsuming the
> top-level iterable but in the case of chain the iterables are already
> in a *args tuple so flipping that order is always possible in a lazy
> way. That means the operation works fine if each arg in args is
> reversible. Otherwise if any arg is not reversible then it should give
> a TypeError just like reversed(set()) does except the error would
> potentially be delayed if some of the args are reversible and some are
> not.
>
> I haven't ever wanted to reverse a chain but I have wanted to be able
> to reverse an enumerate many times:
>
> >>> reversed(enumerate([1, 2, 3]))
> ...
> TypeError
>
> The alternative zip(range(len(obj)-1, -1, -1), reversed(obj)) is
> fairly cryptic in comparison as well as probably being less efficient.
> There could be a __reversed__ method for enumerate with the same
> caveat as for chain: if the underlying object is not reversible then
> you get a TypeError. Otherwise reversed(enumerate(seq)) works fine for
> any sequence seq.
>
> The thornier issue is how to handle reversed if the chain/enumerate
> iterator has already been partially consumed. If it's possible just to
> give an error in that case then reversed could still be useful in the
> common case.

I think you're about right here - both chain and enumerate could
reasonably be expected to be reversible. There are some fiddly edge
cases, and some potentially weird situations (as soon as we assume
no-one would ever expect to reverse a partially consumed iterator, I
bet someone will...) which probably warrant no more than "don't do
that then" but will end up being the subject of questions/confusion.

The question is whether the change is worth the cost. For me:

1. enumerate is probably more important than chain. I use enumerate a
*lot* and I've very rarely used chain.
2. Consistency is a benefit - as we've already seen, people assume
things work by analogy with other cases, and waste time when they
don't.
3. How easy it is to write your own matters. If chain or enumerate
objects exposed the iterables they were based on, you could write your
own reverser more easily.
4. How problematic are the workarounds? reversed(list(some_iter))
works fine - is turning the iterator into a concrete list that much of
an issue?

And of course, the key point - how often do people want to do this anyway?

If someone wants to do the work to implement this, I would say go for
it - raise a bpo issue and create a PR, and see what the response is.
Getting "community support" via this list is probably not crucial for
something like this. It's more of a quality of life change than a big
feature.

Paul
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/P2R4KOFMDL2JELV3UGQDGRVCNKAIMFYU/


[Python-ideas] Re: Support reversed(itertools.chain(x, y, z))

2021-01-09 Thread Oscar Benjamin
On Fri, 8 Jan 2021 at 22:25, Greg Ewing  wrote:
>
> On 9/01/21 10:19 am, Ram Rachum wrote:
> > In short, I want `reversed(itertools.chain(x, y, z))` that behaves like
> > `itertools.chain(map(reversed, (z, y, x)))`.
>
> I think you mean `itertools.chain(*map(reversed, (z, y, x)))`
>
> You can get this with
>
>  itertools.chain(*map(reversed, reversed(t)))
>
> Making `reversed(itertools.chain(x, y, z))` do this would be a
> backwards incompatible change.
>
> Also it's hard to see how it could be made to work, because the
> argument to reversed() necessarily has to be a sequence, not an
> iterator.

The argument to reversed either needs to be a sequence with __len__
and __getitem__ or an object with a __reversed__ method that returns
an iterator. The arguments to chain have to be iterables. Every
sequence is an iterable so there is a significant intersection between
the possible inputs to chain and reversed. Also some non-sequences
such as dict can work with reversed.
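A short sketch of the two routes reversed() accepts. The classes `Countdown` and `Letters` are made up for illustration.

```python
class Countdown:
    """Sequence-style: reversed() falls back to __len__ and __getitem__."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, i):
        if not 0 <= i < self.n:
            raise IndexError(i)
        return self.n - i


class Letters:
    """Non-sequence: reversed() calls __reversed__ directly."""

    def __iter__(self):
        return iter("abc")

    def __reversed__(self):
        return iter("cba")


print(list(reversed(Countdown(3))))        # indexes 2, 1, 0 in turn
print(list(reversed(Letters())))           # uses __reversed__
print(list(reversed({"a": 1, "b": 2})))    # dict is reversible since 3.8
```

By contrast, `reversed(set())` raises TypeError because set offers neither route.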

You say it's hard to see how it could be made to work but you've shown
precisely how it can already be done above:

 reversed(chain(*args))   ==   chain(*map(reversed, reversed(args)))

We can try that out and it certainly seems to work:

>>> from itertools import chain
>>> args = [[1, 2], [3, 4]]
>>> list(chain(*args))
[1, 2, 3, 4]
>>> list(chain(*map(reversed, reversed(args))))
[4, 3, 2, 1]

This wouldn't work with chain.from_iterable without preconsuming the
top-level iterable but in the case of chain the iterables are already
in a *args tuple so flipping that order is always possible in a lazy
way. That means the operation works fine if each arg in args is
reversible. Otherwise if any arg is not reversible then it should give
a TypeError just like reversed(set()) does except the error would
potentially be delayed if some of the args are reversible and some are
not.
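A rough sketch of that behaviour as a wrapper class. `ReversibleChain` is a hypothetical name, and unlike itertools.chain it is a re-iterable object rather than an iterator, so the partial-consumption question does not come up here.

```python
from itertools import chain


class ReversibleChain:
    """Illustrative stand-in for chain(*iterables) that supports reversed()."""

    def __init__(self, *iterables):
        self._iterables = iterables

    def __iter__(self):
        return chain(*self._iterables)

    def __reversed__(self):
        # chain.from_iterable keeps the map lazy, so reversed(arg) is only
        # called when that argument is reached; a non-reversible argument
        # therefore raises TypeError late rather than up front.
        return chain.from_iterable(map(reversed, reversed(self._iterables)))
```

With `chain(*map(reversed, reversed(args)))` the unpacking would call reversed() on every argument immediately, so the TypeError would not be delayed; which of the two is preferable is a design choice.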

I haven't ever wanted to reverse a chain but I have wanted to be able
to reverse an enumerate many times:

>>> reversed(enumerate([1, 2, 3]))
...
TypeError

The alternative zip(range(len(obj)-1, -1, -1), reversed(obj)) is
fairly cryptic in comparison as well as probably being less efficient.
There could be a __reversed__ method for enumerate with the same
caveat as for chain: if the underlying object is not reversible then
you get a TypeError. Otherwise reversed(enumerate(seq)) works fine for
any sequence seq.

The thornier issue is how to handle reversed if the chain/enumerate
iterator has already been partially consumed. If it's possible just to
give an error in that case then reversed could still be useful in the
common case.
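One possible shape for the "just give an error" option, sketched as a pure-Python class. `ReversibleEnumerate` is a hypothetical name and says nothing about how the built-in enumerate would implement it.

```python
class ReversibleEnumerate:
    """Illustrative enumerate-like iterator that refuses late reversal."""

    def __init__(self, seq, start=0):
        self._seq = seq
        self._start = start
        self._it = enumerate(seq, start)
        self._consumed = False

    def __iter__(self):
        return self

    def __next__(self):
        self._consumed = True
        return next(self._it)

    def __reversed__(self):
        if self._consumed:
            raise TypeError("cannot reverse a partially consumed iterator")
        # len() and reversed() raise TypeError themselves if seq lacks them.
        n = len(self._seq)
        indices = range(self._start + n - 1, self._start - 1, -1)
        return zip(indices, reversed(self._seq))
```

Reversing a fresh object works for any sequence; once next() has been called, reversed() raises TypeError instead of guessing what the caller meant.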

--
Oscar
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/5LH3EL3QF4WN62BE5X2RTNWT3GRHANAA/