After messing around with `Enum` for a while, there's one small thing that I'd
like to see improved. It seems limiting to me that the only way to trigger
`_generate_next_value_` is to pass `auto()`.
What if, for a particular `Enum`, I would like to be able to use `()` as a
shorthand for `auto()`
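
For context, a minimal sketch of how `auto()` currently triggers `_generate_next_value_`. The class name `Color` and the lowercase-name policy are illustrative assumptions, not taken from the message, and the proposed `()` shorthand is not implemented here.

```python
from enum import Enum, auto


class Color(Enum):
    # Hypothetical policy: use the lowercased member name as the value.
    # This hook must be defined before the members that rely on it.
    def _generate_next_value_(name, start, count, last_values):
        return name.lower()

    RED = auto()    # value comes from _generate_next_value_
    GREEN = auto()
    BLUE = "navy"   # an explicit value bypasses the hook


values = [member.value for member in Color]
```

Only the members assigned `auto()` go through the hook; any other assigned object is taken as the value itself.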
A short example would help. I read all that and I'm still not sure what you
meant.
> On 26 Oct 2019, at 13:13, Steve Jorgensen wrote:
>
> After messing around with `Enum` for a while, there's one small thing that
> I'd like to see improved. It seems limiting to me that the only way to
> tri
On 24/10/2019 18:19:27, Andrew Barnert via Python-ideas wrote:
On Oct 23, 2019, at 23:47, Inada Naoki wrote:
But if we use + for dict merging, I think we should add + to set too.
Then the set has `.union()`, `|` and `+` for the same behavior.
I don’t think we really need that. If set and dic
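
A small sketch of the behavior under discussion, as it stood when this thread was written: sets spell union as `.union()` or `|`, and `+` is not defined for them. The dict merge shown uses unpacking, chosen here only because it does not depend on any operator being debated.

```python
a = {1, 2}
b = {2, 3}

# Two existing spellings of set union.
union_method = a.union(b)
union_operator = a | b

# `+` is not defined for sets, so this raises TypeError.
try:
    a + b
    plus_supported = True
except TypeError:
    plus_supported = False

# Merging two dicts via unpacking; on key collisions the right-hand dict wins.
d1 = {"x": 1, "y": 2}
d2 = {"y": 20, "z": 30}
merged = {**d1, **d2}
```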
On Fri, Oct 25, 2019 at 08:44:17PM -0700, Ben Rudiak-Gould wrote:
> Nothing good can come of decomposing strings into Unicode code points.
Sure there is. In Python, it's the fastest way to calculate the digit
sum of an integer. It's also useful for implementing classical
encryption algorithms,
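
A sketch of the digit-sum idea mentioned above: iterate over the characters of `str(n)` and convert each back to an `int`. The function names are hypothetical, and the arithmetic version is included only for comparison; no timing claim is made here.

```python
def digit_sum_str(n: int) -> int:
    """Sum the decimal digits by iterating over the string's characters."""
    return sum(map(int, str(abs(n))))


def digit_sum_arith(n: int) -> int:
    """Sum the decimal digits with repeated division by 10."""
    n = abs(n)
    total = 0
    while n:
        n, digit = divmod(n, 10)
        total += digit
    return total
```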
On Sun, Oct 13, 2019 at 12:41:55PM -0700, Andrew Barnert via Python-ideas wrote:
> On Oct 13, 2019, at 12:02, Steve Jorgensen wrote:
[...]
> > This proposal is a serious breakage of backward compatibility, so
> > would be something for Python 4.x, not 3.x.
>
> I’m pretty sure almost nobody wants
On Sat, Oct 26, 2019, 7:29 PM Steven D'Aprano
> (At worst, a code-point in UTF-8 takes three bytes, compared to four in
> UTF-16 or UTF-32.)
>
http://www.fileformat.info/info/unicode/char/1/index.htm
>
___
Python-ideas mailing list -- python-ideas
On Sat, Oct 26, 2019 at 07:38:19PM -0400, David Mertz wrote:
> On Sat, Oct 26, 2019, 7:29 PM Steven D'Aprano
>
>
> > (At worst, a code-point in UTF-8 takes three bytes, compared to four in
> > UTF-16 or UTF-32.)
> >
>
> http://www.fileformat.info/info/unicode/char/1/index.htm
Oops, you're r
Absolutely, utf-8 is a wonderful encoding. And indeed, worst case is the
same storage requirement as utf-16 or utf-32. For O(1) random access into
all strings, we have to eat 32-bits per character, one way or the other,
but of course there are space/speed trade-offs one could make for
intermediate
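
To make the storage comparison concrete, a short sketch that measures how many bytes a single code point needs in each encoding. The helper name `encoded_sizes` and the sample characters are assumptions chosen for illustration; the `-le` codec variants are used so no byte-order mark is counted.

```python
def encoded_sizes(ch: str) -> tuple:
    """Return (utf-8, utf-16, utf-32) byte counts for one code point."""
    return (
        len(ch.encode("utf-8")),
        len(ch.encode("utf-16-le")),
        len(ch.encode("utf-32-le")),
    )


# One sample from each UTF-8 length class.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]
sizes = {ch: encoded_sizes(ch) for ch in samples}
```

For a code point outside the Basic Multilingual Plane, such as U+1F600, all three encodings need four bytes.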
On Oct 26, 2019, at 16:28, Steven D'Aprano wrote:
>
>> On Sun, Oct 13, 2019 at 12:41:55PM -0700, Andrew Barnert via Python-ideas
>> wrote:
>> On Oct 13, 2019, at 12:02, Steve Jorgensen wrote:
> [...]
>>> This proposal is a serious breakage of backward compatibility, so
>>> would be something f
On Wed, Oct 23, 2019, at 19:00, Christopher Barker wrote:
> On Sun, Oct 13, 2019 at 12:52 PM Andrew Barnert via Python-ideas
> wrote:
> > The main problem is that a str is a sequence of single-character str, each
> > of which is a one-element sequence of itself, etc. forever. If you wanted
> >
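
The point about a `str` being a sequence of one-character `str` objects can be seen with a recursive flatten. This is a sketch under the assumption that strings should be treated as atoms; `flatten` is a hypothetical helper, not something proposed in the thread.

```python
from collections.abc import Iterable

s = "abc"
first = s[0]
# Indexing a one-character string returns an equal one-character string,
# so naive recursion into strings would never bottom out.
same_forever = first[0][0][0] == first


def flatten(items):
    """Yield leaves of a nested iterable, treating str and bytes as atoms."""
    for item in items:
        if isinstance(item, Iterable) and not isinstance(item, (str, bytes)):
            yield from flatten(item)
        else:
            yield item


flat = list(flatten([1, ["ab", [2, "c"]], (3,)]))
```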
On Sat, Oct 26, 2019, at 20:26, David Mertz wrote:
> Absolutely, utf-8 is a wonderful encoding. And indeed, worst case is
> the same storage requirement as utf-16 or utf-32. For O(1) random
> access into all strings, we have to eat 32-bits per character, one way
> or the other, but of course the
Ok, true enough that dereferencing and limited linear search is still O(1).
I could have phrased that slightly more precisely.
But the trade-off part is true. Indexing into character 1 million of a
utf-32 string is just one memory offset calculation, then following the
reference. Indexing into the
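
A rough sketch of the trade-off being described, operating on raw encoded bytes rather than on Python `str` objects. Both function names are hypothetical, and this is not how any particular interpreter stores strings: with fixed-width UTF-32 the n-th code point sits at byte offset `4 * n`, while with UTF-8 a plain scan has to count lead bytes from the start.

```python
def utf32_index(data: bytes, n: int) -> str:
    """Fixed width: one offset calculation (assumes little-endian, no BOM)."""
    if not 0 <= n < len(data) // 4:
        raise IndexError(n)
    return data[4 * n:4 * n + 4].decode("utf-32-le")


def utf8_index(data: bytes, n: int) -> str:
    """Variable width: scan from the start, counting non-continuation bytes."""
    count = -1
    start = None
    for i, byte in enumerate(data):
        if byte & 0xC0 != 0x80:  # lead byte: a new code point starts here
            count += 1
            if count == n:
                start = i
            elif count == n + 1:
                return data[start:i].decode("utf-8")
    if start is None:
        raise IndexError(n)
    return data[start:].decode("utf-8")  # n was the last code point
```

The scan is linear in the index unless some auxiliary index is kept alongside the bytes.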
On Sun, Oct 27, 2019 at 2:37 PM David Mertz wrote:
> What does actual CPython do currently to find that s[1_000_000], assuming
> utf-8 internal representation?
>
Mu.
CPython does not have a UTF-8 internal representation.
ChrisA
On Sat, Oct 26, 2019 at 11:34:34PM -0400, David Mertz wrote:
> What does actual CPython do currently to find that s[1_000_000], assuming
> utf-8 internal representation?
CPython doesn't use a UTF-8 internal representation.
MicroPython *may*, but I don't know if they do anything fancy to avoid
O
PEP 393
The Unicode string type is changed to support multiple internal
representations, depending on the character with the largest Unicode
ordinal (1, 2, or 4 bytes)
... Ah, OK. I get it. One byte representation is only ASCII, which happens
to match utf-8. Well, the latin-1 oddness. But the int
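
One way to observe the PEP 393 representations from Python is to compare `sys.getsizeof` for strings of two different lengths, so the fixed object header cancels out. This is a sketch that assumes CPython 3.3 or later; `bytes_per_char` is a hypothetical helper and other implementations may report different numbers.

```python
import sys


def bytes_per_char(ch: str, n: int = 1000) -> int:
    """Estimate per-character storage by differencing two string sizes."""
    short = sys.getsizeof(ch * n)
    long_ = sys.getsizeof(ch * (2 * n))
    return (long_ - short) // n


widths = {
    "ascii": bytes_per_char("a"),
    "latin-1": bytes_per_char("\u00e9"),
    "bmp": bytes_per_char("\u20ac"),
    "astral": bytes_per_char("\U0001F600"),
}
```

On CPython the widest character in the string selects the width for the whole string.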