date:20191023

On Oct 23, 2019, at 22:45, Greg Ewing  wrote:
> 
> Andrew Barnert via Python-ideas wrote:
>> Someone earlier in this thread said we could optimize calling split on a
>> string literal, just as we can and do optimize iterating over a list literal
>> in a for statement.
>> The counter argument—which I thought you were adding onto—is that this would
>> be bad because it would make people write bad code for older/alternative
>> Pythons.
> 
> There's a precedent for this kind of thing -- there's an optimisation
> for repeatedly concatenating onto a string in some circumstances, even
> though building a list and joining it is recommended if you want
> guaranteed good performance. So the fact that it wouldn't apply to all
> versions and implementations of Python shouldn't really matter.
> 
> I'm not sure how much it would really help, though. Lists being
> mutable, it would have to build a new list every time,

Sure, but a small number of LOAD_CONSTs and a BUILD_LIST has to be faster than 
1 LOAD_CONST and a call to the split method.

From testing some different random examples, the split takes anywhere from 1.8x 
to 3.9x as long, and I assume with longer element strings it would be even more 
of a difference.

I still doubt this ever occurs anywhere near a bottleneck in real-life code—but 
if it did, it seems like the optimization would be worth it. (Assuming a better 
micro-benchmark verifies my quick&dirty test.)
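
A minimal sketch of the kind of micro-benchmark described above, using the standard timeit module. The example strings and the run counts are illustrative assumptions, not the original quick test, and the ratio will differ between interpreters and versions.

```python
import timeit

# Illustrative inputs (assumptions, not the strings used in the original test).
literal_stmt = "['cyan', 'forest', 'green', 'burnt', 'umber']"
split_stmt = "'cyan forest green burnt umber'.split()"

# Both statements must build the same list for the comparison to be fair.
assert eval(literal_stmt) == eval(split_stmt)


def best_time(stmt, number=200_000, repeat=5):
    """Return the smallest total time over several timeit runs."""
    return min(timeit.repeat(stmt, number=number, repeat=repeat))


literal_time = best_time(literal_stmt)
split_time = best_time(split_stmt)

print(f"list literal: {literal_time:.4f}s")
print(f"str.split():  {split_time:.4f}s")
print(f"split takes {split_time / literal_time:.1f}x as long")
```

Taking the minimum of several repeats reduces noise from other processes; it says nothing about whether the difference matters in real code.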

___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/4NAWS4DQDI6OLJL3LZXRMQ3AG52G47ZR/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Python 4000: Have stringlike objects provide sequence views rather than being sequences

2019-10-23 Thread Anders Hovmöller



> On 24 Oct 2019, at 01:02, Christopher Barker  wrote:
> 
> 
>> On Sun, Oct 13, 2019 at 12:52 PM Andrew Barnert via Python-ideas 
>>  wrote:
> 
>> The main problem is that a str is a sequence of single-character str, each 
>> of which is a one-element sequence of itself, etc. forever. If you wanted to 
>> change this, I think it would make more sense to go the opposite way: leave 
>> str a sequence, but make it a sequence of char objects. (And likewise, bytes 
>> and bytearray could be sequences of byte objects—or just go all the way to 
>> making them sequences of ints.) And then maybe add a c prefix for defining 
>> char constants, and you’ve solved all the problems without having to add new 
>> confusing methods or properties.
> 
> I've thought for a long time that this would be a "good thing". the "string 
> or sequence of strings" issues is pretty much the only hidden-bug-triggering 
> type error I've gotten since "true division".
> 
> The only way we really live with it fairly easily is that strings are pretty 
> much never duck typed -- so I can check if I got a string, and then I know I 
> didn't get a sequence of strings. But I've always wondered how disruptive it 
> would be to add a char type -- it doesn't seem like it would be very 
> disruptive, but I have not thought it through at all. And I'm not sure how 
> much string functionality a char should have -- probably next to none, as the 
> point is that it would be easy to distinguish from a string that happened to 
> have one character.
> 
> By the way, the bytes and bytearray types already does this -- index into or 
> loop through a bytes object, you get an int.

I would think it's fine if we deprecate the iter on str and supply a chars() 
method. Personally I think that can yield str and not int. There could be a 
codes() or char_codes() method for that. 
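
A rough sketch of what the two methods suggested above might look like, written as free functions. The names chars and char_codes are hypothetical stand-ins for the proposed methods; nothing like them exists on str today.

```python
from typing import Iterator


def chars(s: str) -> Iterator[str]:
    """Hypothetical stand-in for a str.chars() method: yield 1-character strings."""
    for i in range(len(s)):
        yield s[i]


def char_codes(s: str) -> Iterator[int]:
    """Hypothetical stand-in for codes()/char_codes(): yield integer code points."""
    for ch in chars(s):
        yield ord(ch)


print(list(chars("abc")))
print(list(char_codes("abc")))
```

With explicit views like these, code that wants characters has to ask for them, which is the point of deprecating plain iteration over str.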

/ Anders
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/IM4ANTZ6NDDHPOPRO3SAWFS4ECZL5MKV/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

2019-10-23 Thread Greg Ewing


Andrew Barnert via Python-ideas wrote:

Someone earlier in this thread said we could optimize calling split on a
string literal, just as we can and do optimize iterating over a list literal
in a for statement.

The counter argument—which I thought you were adding onto—is that this would
be bad because it would make people write bad code for older/alternative
Pythons.


There's a precedent for this kind of thing -- there's an optimisation
for repeatedly concatenating onto a string in some circumstances, even
though building a list and joining it is recommended if you want
guaranteed good performance. So the fact that it wouldn't apply to all
versions and implementations of Python shouldn't really matter.

I'm not sure how much it would really help, though. Lists being
mutable, it would have to build a new list every time, unless it was
also being used in a context where a tuple could be substituted,
making it a doubly special case. I question whether there are many
examples of such cases in the wild.
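
As an illustration of the precedent mentioned above, here is a small sketch comparing repeated concatenation with building a list and joining it. The data size and run counts are arbitrary assumptions; on CPython the two timings are often close because of the in-place concatenation optimisation, while other implementations may behave very differently.

```python
import timeit


def concat(parts):
    """Build a string by repeated +=, which CPython can sometimes do in place."""
    result = ""
    for part in parts:
        result += part
    return result


def join(parts):
    """Build the same string with str.join, the generally recommended approach."""
    return "".join(parts)


parts = ["spam"] * 2_000  # illustrative data, not from the thread
assert concat(parts) == join(parts)

concat_time = timeit.timeit(lambda: concat(parts), number=50)
join_time = timeit.timeit(lambda: join(parts), number=50)
print(f"+=  : {concat_time:.4f}s")
print(f"join: {join_time:.4f}s")
```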

--
Greg
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/3H7XULZEJDXSJUSP3PLNGRPT67PALCNO/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

On Oct 23, 2019, at 18:59, Christopher Barker  wrote:
> 
> Since I'm doing this, the three that aren't are:
> 
> U+180E MONGOLIAN VOWEL SEPARATOR
> U+200B ZERO WIDTH SPACE
> U+FEFF ZERO WIDTH NO-BREAK SPACE
> 
> The Mongolian vowel separator makes some sense (not knowing Mongolian in the 
> least). Though I wonder what the point of a zero-width space is if it's NOT 
> going to be a separator?

It’s a Cf (formatting character), because it’s not used for spacing, it’s used 
for controlling higher-level formatting like soft line breaks. Or, put another 
way, it’s a bit more like a soft hyphen than it is like a space. It’s a weird 
distinction, but not as weird as, say, U+2028 and U+2029, which are also used 
for controlling formatting but literally have “separator” in their name, so 
they ended up creating a special category for each one so they can be Z but not 
Zs.

Anyway, some of the answers the Unicode committee came up with are odd, but 
they’re the right answers by definition. Plus, even if I had a time machine and 
an unlimited life span, I’m pretty sure I wouldn’t want to participate in those 
arguments.
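
The categories discussed above can be inspected with the standard unicodedata module. This is only a sketch: the list of code points is chosen for illustration (SPACE and SOFT HYPHEN are included for comparison), and the reported values depend on the Unicode version bundled with the interpreter.

```python
import unicodedata

# Code points mentioned in the discussion, plus SPACE and SOFT HYPHEN
# for comparison.
code_points = [0x0020, 0x00AD, 0x180E, 0x200B, 0xFEFF, 0x2028, 0x2029]

for cp in code_points:
    ch = chr(cp)
    print(f"U+{cp:04X} {unicodedata.name(ch):<28} "
          f"category={unicodedata.category(ch)} isspace={ch.isspace()}")
```

On a recent CPython this should report Cf for U+200B, and the dedicated Zl and Zp categories for U+2028 and U+2029.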
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/YNONA2X63SZSOVDGEELO3DJONSDXC7CY/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

D'uh! stupid bug:

> > Is this the same code points identified by `str.isspace`?
>
> I haven't checked -- so I will:
>
> and the answer is no:

wrong, the answer is yes:

$ python weird_spaces.py
x x x x᠎x x x x x x x x x x x xx x x xx
['x', 'x', 'x', 'x\u180ex', 'x', 'x', 'x', 'x', 'x', 'x', 'x', 'x', 'x',
'x', 'x\u200bx', 'x', 'x', 'x\ufeffx']
out of 20, 17 were used as split chars
out of 20, 17 were True according to .isspace

That makes far more sense.

Since I'm doing this, the three that aren't are:

U+180E MONGOLIAN VOWEL SEPARATOR
U+200B ZERO WIDTH SPACE
U+FEFF ZERO WIDTH NO-BREAK SPACE

The Mongolian vowel separator makes some sense (not knowing Mongolian in
the least). Though I wonder what the point of a zero-width space is if it's
NOT going to be a separator?
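
A short sketch that looks up the names of the odd ones out directly, assuming the same twenty code points used in weird_spaces.py. The variable names are illustrative only.

```python
import unicodedata

# The twenty space-like code points used in weird_spaces.py.
candidates = [0x0020, 0x00A0, 0x1680, 0x180E, *range(0x2000, 0x200C),
              0x202F, 0x205F, 0x3000, 0xFEFF]

# Keep only the ones that str.isspace() does not accept.
not_space = [cp for cp in candidates if not chr(cp).isspace()]

for cp in not_space:
    print(f"U+{cp:04X} {unicodedata.name(chr(cp))}")
```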

-CHB


-- 
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython

#!/usr/bin/env python

# Twenty space-like code points, each surrounded by 'x'.
weird_spaces = ("x\u0020x\u00A0x\u1680x\u180Ex\u2000x\u2001x\u2002"
                "x\u2003x\u2004x\u2005x\u2006x\u2007x\u2008x\u2009"
                "x\u200Ax\u200Bx\u202Fx\u205Fx\u3000x\uFEFFx")

print(weird_spaces)
splitted = weird_spaces.split()
print(splitted)

# The string alternates 'x' and space-like characters: 21 'x' and 20 others.
total_spacelike = (len(weird_spaces) - 1) // 2
# Each character that split() treats as a separator adds one more piece.
num_split = len(splitted) - 1

print(f"out of {total_spacelike}, {num_split} were used as split chars")

# Skip the 'x' placeholders so only the space-like characters are tested.
isspace = [c.isspace() for c in weird_spaces if c != 'x']

# print(isspace)

print(f"out of {total_spacelike}, {sum(isspace)} were True according to .isspace")


Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/S3QCQSB2IZJ6CSR4IGXMJBL6NZN6YT6A/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

On Wed, Oct 23, 2019 at 6:04 PM David Mertz  wrote:

> Is this the same code points identified by `str.isspace`?
>

I haven't checked -- so I will:

and the answer is no:

$ python weird_spaces.py
x x x x᠎x x x x x x x x x x x xx x x xx
['x', 'x', 'x', 'x\u180ex', 'x', 'x', 'x', 'x', 'x', 'x', 'x', 'x', 'x',
'x', 'x\u200bx', 'x', 'x', 'x\ufeffx']
41
18
[False, True, False, True, False, True, False, False, False, True, False,
True, False, True, False, True, False, True, False, True, False, True,
False, True, False, True, False, True, False, True, False, False, False,
True, False, True, False, True, False, False, False]

There are only three that didn't split, but many more than three that
failed .isspace.

> Thanks for doing that. I would have soon otherwise. Still, "most of them"
> isn't actually a precise answer for an uncertain string. :-)
>

nope.

But it could be defined somewhere, and presumably is, though maybe not
consistently.

-CHB

On Wed, Oct 23, 2019, 8:57 PM Christopher Barker 
wrote:

> On Wed, Oct 23, 2019 at 5:53 PM Andrew Barnert via Python-ideas <
> python-ideas@python.org> wrote:
>
>> > To be fair, I also don't know which of those split on str.split() with
>> no arguments to the method either.
>>
>
> I couldn't resist -- the answer is most of them:
>
> #!/usr/bin/env python
> weird_spaces = ("x\u0020x\u00A0x\u1680x\u180Ex\u2000x\u2001x\u2002"
> "x\u2003x\u2004x\u2005x\u2006x\u2007x\u2008x\u2009"
> "x\u200Ax\u200Bx\u202Fx\u205Fx\u3000x\uFEFFx")
> print(weird_spaces)
> splitted = weird_spaces.split()
> print(splitted)
>
> print(len(weird_spaces))
> print(len(splitted))
>
> $ python weird_spaces.py
> x x x x᠎x x x x x x x x x x x xx x x xx
> ['x', 'x', 'x', 'x\u180ex', 'x', 'x', 'x', 'x', 'x', 'x', 'x', 'x', 'x',
> 'x', 'x\u200bx', 'x', 'x', 'x\ufeffx']
> 41
> 18
>
> -CHB


-- 
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
#!/usr/bin/env python

weird_spaces = ("x\u0020x\u00A0x\u1680x\u180Ex\u2000x\u2001x\u2002"
"x\u2003x\u2004x\u2005x\u2006x\u2007x\u2008x\u2009"
"x\u200Ax\u200Bx\u202Fx\u205Fx\u3000x\uFEFFx")

print(weird_spaces)
splitted = weird_spaces.split()
print(splitted)

print(len(weird_spaces))
print(len(splitted))


isspace = [c.isspace() for c in weird_spaces]

print(isspace)


Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/ICM2RIS7EA3RXCRVRYTSDALFUQUEDM35/

[Python-ideas] Re: Python 4000: Have stringlike objects provide sequence views rather than being sequences

There's a reason I've never actually proposed adding a char 

On Wed, Oct 23, 2019 at 5:34 PM Andrew Barnert  wrote:

> Well, just adding a char type (and presumably a way of defining char
> literals) wouldn’t be too disruptive.

sure.

> But changing str to iterate chars instead of strs, that probably would be.

And that would be the whole point -- a char type by itself isn't very
useful. In some sense, the only difference between a char and a str would
be that a char isn't iterable -- but the benefit would be that a string is
an iterable (and sequence) of chars, rather than an (infinitely recursable)
iterable of strings.

> Also, you’d have to go through a lot of functions and decide what types
> they should take.

sure would -- a lot of thought to see how disruptive it would be ...

> For example, does str.join still accept a string instead of an iterable
> of strings? Does it accept other iterables of char too?

if it accepted an iterable of either char or str, then I *think* there
would be little disruption.

> Can you pass a char to str.__contains__

yes, that's a no brainer, the whole point is that a string would be a
sequence of chars.

> or str.endswith?

I would think so -- a char would behave like a length-one string as much as
possible.

> What about a tuple of chars?

that's an odd one -- but I'm not sure I see the point, if you have a tuple
of chars, you could "".join() them if you want a string, in any context.

> Or should we take the backward-compat breaking opportunity to eliminate
> the “str or tuple of str” thing and instead use *args, or at least change
> it to “str or iterable of str (which no longer includes str itself)”?

Is this for .endswith() and friends? if so, there was discussion a while
back about that -- but probably not the time to introduce even more
backward incompatible changes.
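
For reference, a small sketch of the current “str or tuple of str” behaviour of str.endswith that the question above refers to. The filename is an arbitrary example.

```python
filename = "report.csv"

# str.endswith accepts a single str or a tuple of str.
print(filename.endswith(".csv"))
print(filename.endswith((".txt", ".csv")))

# Other iterables of str, such as a list, are rejected.
try:
    filename.endswith([".txt", ".csv"])
except TypeError as exc:
    print("TypeError:", exc)

# A str is itself an iterable of 1-character str, which is why accepting
# "str or any iterable of str" would be ambiguous today.
print(list(".csv"))
```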

>> And I'm not sure how much string functionality a char should have --
>> probably next to none, as the point is that it would be easy to distinguish
>> from a string that happened to have one character.

> Surely you’d want to be able to do things like isdigit or swapcase. Even
> C has functions to do most of that kind of stuff on chars.

probably -- it would be least disruptive for a char to act as much as
possible the same as a length-one string -- so maybe iterability and
indexability would be it.

> But I think that, other than join and maybe encode and translate,

not sure why encode or translate should be an issue off the top of my head
-- it would surely be a unicode char :-)

> there’s an obvious right answer for every str method and operator, so
> this isn’t too much of a problem.

well, we'd have to go through all of them, and do a lot of thinking...

I think the greater confusion is where can you use a char instead of a
string in other places? using it as a filename, for instance would make it
pointless for at least the cases I commonly deal with (list of filenames).

I can only imagine how many "things" take a string where a char would make
sense, but then it gets harder to distinguish them all.

> Speaking of operators, should char+int and char-int and char-char be
legal? (What about char%int? A thousand students doing the rot13 assignment
would rejoice, but allowing % without * and // is kind of weird, and
allowing * and // even weirder—as well as potentially confusing with
str*int being legal but meaning something very different.)

I would say no -- in C a char IS an unsigned 8-bit int, but that's C -- in
Python a char and a number are very different things.

ord() and chr() would work, of course.
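
A minimal sketch of the rot13 exercise written with ord() and chr() only, so it needs no arithmetic on characters themselves. The function names are illustrative; the standard library's rot_13 codec is used just as a cross-check.

```python
import codecs


def rot13_char(ch: str) -> str:
    """Rotate one ASCII letter by 13 places; leave anything else unchanged."""
    if "a" <= ch <= "z":
        base = ord("a")
    elif "A" <= ch <= "Z":
        base = ord("A")
    else:
        return ch
    return chr((ord(ch) - base + 13) % 26 + base)


def rot13(text: str) -> str:
    """Apply rot13_char to every character of text."""
    return "".join(rot13_char(ch) for ch in text)


message = "Hello, World!"
print(rot13(message))
# Cross-check against the codec shipped with Python.
assert rot13(message) == codecs.encode(message, "rot_13")
```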

>> By the way, the bytes and bytearray types already does this -- index into
>> or loop through a bytes object, you get an int.

> Sure, but b'abc'.find(66) is -1, and b'abc'.replace(66, 70) is a TypeError,
> and so on.

I wonder if they need to be -- would we need a "byte" type, or would it be
OK to accept an int in all those sorts of places?

> Fixing those inconsistencies is what I meant by “go all the way to making
> them sequences of ints”. But it might be friendlier to undo the changes and
> instead add a byte type like the char type for bytes to be a sequence of.
> I’m not sure which is better.

me neither.

> But anyway, I think all of these questions are questions for a new
> language. If making str not iterate str was too big a change even for 3.0,
> how could it be reasonable for any future version?

Well, I don't know that it was seriously considered -- with the Unicode
changes, that WOULD have been the time to do it!

Again though, it seems like it would be pretty disruptive, so a
non-starter, but maybe not?

-CHB

-- 
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

Is this the same code points identified by `str.isspace`?

Thanks for doing that. I would have soon otherwise. Still, "most of them"
isn't actually a precise answer for an uncertain string. :-)

On Wed, Oct 23, 2019, 8:57 PM Christopher Barker 
wrote:

> On Wed, Oct 23, 2019 at 5:53 PM Andrew Barnert via Python-ideas <
> python-ideas@python.org> wrote:
>
>> > To be fair, I also don't know which of those split on str.split() with
>> no arguments to the method either.
>>
>
> I couldn't resist -- the answer is most of them:
>
> #!/usr/bin/env python
> weird_spaces = ("x\u0020x\u00A0x\u1680x\u180Ex\u2000x\u2001x\u2002"
> "x\u2003x\u2004x\u2005x\u2006x\u2007x\u2008x\u2009"
> "x\u200Ax\u200Bx\u202Fx\u205Fx\u3000x\uFEFFx")
> print(weird_spaces)
> splitted = weird_spaces.split()
> print(splitted)
>
> print(len(weird_spaces))
> print(len(splitted))
>
> $ python weird_spaces.py
> x x x x᠎x x x x x x x x x x x xx x x xx
> ['x', 'x', 'x', 'x\u180ex', 'x', 'x', 'x', 'x', 'x', 'x', 'x', 'x', 'x',
> 'x', 'x\u200bx', 'x', 'x', 'x\ufeffx']
> 41
> 18
>
> -CHB
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/YB3PVH7IMINKYN5AQPULWSWK6QFCF5U2/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

On Wed, Oct 23, 2019 at 5:53 PM Andrew Barnert via Python-ideas <
python-ideas@python.org> wrote:

> > To be fair, I also don't know which of those split on str.split() with
> no arguments to the method either.
>

I couldn't resist -- the answer is most of them:

#!/usr/bin/env python
weird_spaces = ("x\u0020x\u00A0x\u1680x\u180Ex\u2000x\u2001x\u2002"
"x\u2003x\u2004x\u2005x\u2006x\u2007x\u2008x\u2009"
"x\u200Ax\u200Bx\u202Fx\u205Fx\u3000x\uFEFFx")
print(weird_spaces)
splitted = weird_spaces.split()
print(splitted)

print(len(weird_spaces))
print(len(splitted))

$ python weird_spaces.py
x x x x᠎x x x x x x x x x x x xx x x xx
['x', 'x', 'x', 'x\u180ex', 'x', 'x', 'x', 'x', 'x', 'x', 'x', 'x', 'x',
'x', 'x\u200bx', 'x', 'x', 'x\ufeffx']
41
18

-CHB


-- 
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/PELOTR3J4HV2EIF54ZHISWZZX45QJY7U/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

On Oct 23, 2019, at 16:26, David Mertz  wrote:
> 
> To be fair, I also don't know which of those split on str.split() with no 
> arguments to the method either.

I would assume the rule is the same rule used by str.isspace, and that this 
rule is either the simple one (category is Zs) or the full one (category is Zs 
or bidi class is one of the handful of bidi space classes) from the same 
version of Unicode that the unicodedata module handles.

In fact, it’s more than an assumption—if it isn’t true, I’d expect to find a 
good rationale in the docs, or it’s probably a bug in the str class. You can’t 
document something as a method of Unicode strings that splits on “whitespace” 
using anything other than a Unicode definition of whitespace without a good 
reason.
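
One way to check the assumption above empirically is to compare the two behaviours over every code point. This is a sketch for whichever interpreter runs it; the helper name splits_on is made up for illustration, and an empty result only shows that split() and isspace() agree with each other, not which Unicode rule they implement.

```python
import sys


def splits_on(ch: str) -> bool:
    """True if str.split() with no arguments treats ch as a separator."""
    return len(f"x{ch}x".split()) == 2


# Collect every code point where the two notions of whitespace differ.
mismatches = [
    cp for cp in range(sys.maxunicode + 1)
    if splits_on(chr(cp)) != chr(cp).isspace()
]

print(f"{len(mismatches)} code points where split() and isspace() disagree")
```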

Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/5BJ76KZPOGQXJHH6GV5OLOV7DSWUYJ5T/

[Python-ideas] Re: Python 4000: Have stringlike objects provide sequence views rather than being sequences

On Oct 23, 2019, at 16:00, Christopher Barker  wrote:
> 
>> On Sun, Oct 13, 2019 at 12:52 PM Andrew Barnert via Python-ideas 
>>  wrote:
> 
>> The main problem is that a str is a sequence of single-character str, each 
>> of which is a one-element sequence of itself, etc. forever. If you wanted to 
>> change this, I think it would make more sense to go the opposite way: leave 
>> str a sequence, but make it a sequence of char objects. (And likewise, bytes 
>> and bytearray could be sequences of byte objects—or just go all the way to 
>> making them sequences of ints.) And then maybe add a c prefix for defining 
>> char constants, and you’ve solved all the problems without having to add new 
>> confusing methods or properties.
> 
> I've thought for a long time that this would be a "good thing". the "string 
> or sequence of strings" issues is pretty much the only hidden-bug-triggering 
> type error I've gotten since "true division".
> 
> The only way we really live with it fairly easily is that strings are pretty 
> much never duck typed -- so I can check if I got a string, and then I know I 
> didn't get a sequence of strings. But I've always wondered how disruptive it 
> would be to add a char type -- it doesn't seem like it would be very 
> disruptive, but I have not thought it through at all.

Well, just adding a char type (and presumably a way of defining char literals) 
wouldn’t be too disruptive. 

But changing str to iterate chars instead of strs, that probably would be.

Also, you’d have to go through a lot of functions and decide what types they 
should take. For example, does str.join still accept a string instead of an 
iterable of strings? Does it accept other iterables of char too? (I have used
' '.join on a string in real life production code, even if I did feel guilty
about it…) Can you pass a char to str.__contains__ or str.endswith? What about 
a tuple of chars? Or should we take the backward-compat breaking opportunity to 
eliminate the “str or tuple of str” thing and instead use *args, or at least 
change it to “str or iterable of str (which no longer includes str itself)”?

> And I'm not sure how much string functionality a char should have -- probably 
> next to none, as the point is that it would be easy to distinguish from a 
> string that happened to have one character.

Surely you’d want to be able to do things like isdigit or swapcase. Even C has 
functions to do most of that kind of stuff on chars.

But I think that, other than join and maybe encode and translate, there’s an 
obvious right answer for every str method and operator, so this isn’t too much 
of a problem.

Speaking of operators, should char+int and char-int and char-char be legal? 
(What about char%int? A thousand students doing the rot13 assignment would 
rejoice, but allowing % without * and // is kind of weird, and allowing * and 
// even weirder—as well as potentially confusing with str*int being legal but 
meaning something very different.)

> By the way, the bytes and bytearray types already does this -- index into or 
> loop through a bytes object, you get an int.

Sure, but b'abc'.find(66) is -1, and b'abc'.replace(66, 70) is a TypeError, and 
so on.

Fixing those inconsistencies is what I meant by “go all the way to making them 
sequences of ints”. But it might be friendlier to undo the changes and instead 
add a byte type like the char type for bytes to be a sequence of. I’m not sure 
which is better.
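
The inconsistencies described above can be seen in a short sketch; the byte values are arbitrary examples.

```python
data = b"abc"

print(data[0])        # indexing gives an int
print(list(data))     # iteration gives ints
print(98 in data)     # membership accepts an int
print(data.find(98))  # find accepts an int as well

# replace, by contrast, insists on bytes-like arguments.
try:
    data.replace(98, 66)
except TypeError as exc:
    print("TypeError:", exc)
```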

But anyway, I think all of these questions are questions for a new language. If 
making str not iterate str was too big a change even for 3.0, how could it be 
reasonable for any future version?

Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/WRVOKGHNK7JKR66WG7MG73FUFZODLC4R/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

On Wed, Oct 23, 2019 at 7:17 PM David Mertz  wrote:

> Contains any of the following (non-escaped) characters. If they occur
> inside quotes, it seems straightforward, but in this new '%w[]' thing, who
> knows?
>
> U+00A0 NO-BREAK SPACE foo bar As a space, but often not adjusted
> U+1680 OGHAM SPACE MARK foo bar Unspecified; usually not really a space
> but a dash
> U+180E MONGOLIAN VOWEL SEPARATOR foo᠎bar 0
> U+2000 EN QUAD foo bar 1 en (= 1/2 em)
>
...

To be fair, I also don't know which of those split on str.split() with no
arguments to the method either.

Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons.  Intellectual property is
to the 21st century what the slave trade was to the 16th.
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/TJGSNQUNMKAFS7UES6SSC3PT4UGRELT2/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

On Wed, Oct 23, 2019, 4:31 PM Steven D'Aprano

> David, you literally wrote the book on text processing in Python. I think
> you are being disingenuous here, and below when you describe a standard
> string hex-escape \x20 that has been in Python forever and in just about
> all C-like languages as "weird".
>

I'm so flattered anyone remembers that from long ago. It was a very fun
book to write. :-)

I think, however, that I've never written '\x20' before this moment in my
life. I do know the ASCII and Unicode code point for a space. I've run the
'hexdump' utility plenty of times. But it's hard to think of an occasion
when I would have needed to enter a space by code point rather than just
quoted.

So I don't think it's so disingenuous to think needing to do that would be
"weird." I've escaped lots of other characters that don't have a giant key
about 7x the width of other keys on my keyboard.

> If you can understand why this works:
> string = "Single\n quoted\n string\n containing newlines!"
> you can understand the burnt\x20umber example.
>

I can discern your intention for the new behavior, yes.  But:

In [2]: "burnt\x20umber".split()
Out[2]: ['burnt', 'umber']
In [3]: "Single\n quoted\n string\n containing newlines!".split()
Out[3]: ['Single', 'quoted', 'string', 'containing', 'newlines!']

So this new syntax would behave in a way that is counter-intuitive for
folks familiar with Python strings to date.

Also, I genuinely am not clear what should happen if an expression like

%w[cyan   forest green  burnt\x20umber]

Contains any of the following (non-escaped) characters. If they occur
inside quotes, it seems straightforward, but in this new '%w[]' thing, who
knows?

U+00A0 NO-BREAK SPACE foo bar As a space, but often not adjusted
U+1680 OGHAM SPACE MARK foo bar Unspecified; usually not really a space but
a dash
U+180E MONGOLIAN VOWEL SEPARATOR foo᠎bar 0
U+2000 EN QUAD foo bar 1 en (= 1/2 em)
U+2001 EM QUAD foo bar 1 em (nominally, the height of the font)
U+2002 EN SPACE (nut) foo bar 1 en (= 1/2 em)
U+2003 EM SPACE (mutton) foo bar 1 em
U+2004 THREE-PER-EM SPACE (thick space) foo bar 1/3 em
U+2005 FOUR-PER-EM SPACE (mid space) foo bar 1/4 em
U+2006 SIX-PER-EM SPACE foo bar 1/6 em
U+2007 FIGURE SPACE foo bar “Tabular width”, the width of digits
U+2008 PUNCTUATION SPACE foo bar The width of a period “.”
U+2009 THIN SPACE foo bar 1/5 em (or sometimes 1/6 em)
U+200A HAIR SPACE foo bar Narrower than THIN SPACE
U+200B ZERO WIDTH SPACE foobar 0
U+202F NARROW NO-BREAK SPACE foo bar Narrower than NO-BREAK SPACE (or
SPACE), “typically the width of a thin space or a mid space”
U+205F MEDIUM MATHEMATICAL SPACE foo bar 4/18 em
U+3000 IDEOGRAPHIC SPACE foo bar The width of ideographic (CJK) characters.
U+FEFF


Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/AIKPD6PMBHTGKKR6D52LZZO4VQ2W6BNK/

[Python-ideas] Re: Python 4000: Have stringlike objects provide sequence views rather than being sequences

On Sun, Oct 13, 2019 at 12:52 PM Andrew Barnert via Python-ideas <
python-ideas@python.org> wrote:

> The main problem is that a str is a sequence of single-character str, each
> of which is a one-element sequence of itself, etc. forever. If you wanted
> to change this, I think it would make more sense to go the opposite way:
> leave str a sequence, but make it a sequence of char objects. (And
> likewise, bytes and bytearray could be sequences of byte objects—or just go
> all the way to making them sequences of ints.) And then maybe add a c
> prefix for defining char constants, and you’ve solved all the problems
> without having to add new confusing methods or properties.
>

I've thought for a long time that this would be a "good thing". The "string
or sequence of strings" issue is pretty much the only
hidden-bug-triggering type error I've gotten since "true division".

The only way we really live with it fairly easily is that strings are
pretty much never duck typed -- so I can check if I got a string, and then
I know I didn't get a sequence of strings. But I've always wondered how
disruptive it would be to add a char type -- it doesn't seem like it would
be very disruptive, but I have not thought it through at all. And I'm not
sure how much string functionality a char should have -- probably next to
none, as the point is that it would be easy to distinguish from a string
that happened to have one character.

By the way, the bytes and bytearray types already do this -- index into
or loop through a bytes object, you get an int.
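
To make the "string or sequence of strings" problem concrete, here is a minimal sketch of a flatten helper (a hypothetical function, not something from the thread) that relies on exactly the explicit string check described above.

```python
from collections.abc import Iterable


def flatten(items):
    """Yield the leaves of arbitrarily nested iterables, treating str as a leaf."""
    for item in items:
        # Without the str/bytes check, a 1-character string would recurse
        # forever, because iterating "a" yields "a" again.
        if isinstance(item, Iterable) and not isinstance(item, (str, bytes)):
            yield from flatten(item)
        else:
            yield item


print(list(flatten(["ab", ["cd", ("ef", ["gh"])]])))

# A str element is again a str, at any depth.
print("a"[0][0][0] == "a")

# A bytes element is an int, so the recursion stops on its own.
print(list(b"ab"))
```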

-CHB

-- 
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/QV2SLFQAR2VKOLD5Y7ACRO6LBX4ZE5UQ/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

2019-10-23 Thread Richard Musil

On Thu, Oct 24, 2019 at 12:34 AM Richard Musil  wrote:

> Can we agree on the reply from Serhiy and close this discussion?
>
> The proposed change does not bring any advantage apart from a few saved
> keystrokes, and even that is questionable, because it makes the code more
> prone to misreading/misinterpretation.
>
> I can parse separately quoted string literals in the list (especially when
> they are highlighted by syntax coloring) much faster than read the one big
> string literal, while doing the mental split, keeping in mind which
> separators the author decided to use to make the split, and filtering some
> hardcoded chars which would otherwise get cut off.
>

Please, ignore the last paragraph of my reply, I guess I need to go to
bed...

Richard
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/HKJQ5TWU43JBAQBWMORXXQ5F6WVPGVXR/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

2019-10-23 Thread Richard Musil

On Wed, Oct 23, 2019 at 4:33 PM Serhiy Storchaka 
wrote:

> 23.10.19 14:00, Steven D'Aprano wrote:
> > So please do educate me Serhiy, which one is the One Obvious Way that we
> > should all agree is the right thing to do?
>
> If you need a constant number, the most obvious way is to write it as a
> number literal, not int('123'). If you need a constant string, the most
> obvious way is to write it as a string literal, not bytes([65,
> 66]).decode(). If you need a list of constant strings, the most obvious
> way is to write it as a list display consisting of string literals. It
> works in all Python versions.
>
> The second way works too in all actual Python versions (starting from
> 1.6), and nobody will beat you if you use it in your code. It can save
> you a few keystrokes. But it is less obvious and less general.
>

Can we agree on the reply from Serhiy and close this discussion?

The proposed change does not bring any advantage apart from a few saved
keystrokes, and even that is questionable, because it makes the code more
prone to misreading/misinterpretation.

I can parse separately quoted string literals in the list (especially when
they are highlighted by syntax coloring) much faster than read the one big
string literal, while doing the mental split, keeping in mind which
separators the author decided to use to make the split, and filtering some
hardcoded chars which would otherwise get cut off.

Richard
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/HCSOZAE4UKCOUBFYUJWYEPECC373NYAY/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

On Thu, Oct 24, 2019 at 9:12 AM Christopher Barker  wrote:
>
> On Wed, Oct 23, 2019 at 1:41 PM Steven D'Aprano  wrote:
>> Virtually overnight, the Python community got used to the opposite
>> change, with f-strings: something that looks like a string is actually
>> code containing identifiers and even arbitrary expressions:
>>
>> f"Your score is {score}"
>
>
> well, it's technically code, yes, but it's functionally still a string -- it 
> looks like a string, and it evaluates to a string. I don't think that's 
> analogous.

An f-string is syntactic sugar for something (very approximately) like:

"".join(["Your score is ", format(score)])

Is that a string? It results in a string. Is a list comprehension a
list? It results in a list.

Programmer intention and concrete implementation are completely
different. Having syntactic sugar for the creation of a list of
strings is quite different from having a string which you then split,
even if the implementation is a string being split.
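
A rough sketch of that distinction, with `score` and the colour names chosen purely for illustration. The join form is only an approximation of what an f-string does, not the actual bytecode:

```python
score = 42

as_fstring = f"Your score is {score}"
# Very roughly what the f-string does: format each expression, then join
# the pieces. Note that str.join takes a single iterable argument.
as_join = "".join(["Your score is ", format(score)])

# Two spellings that produce equal lists but express different intent:
from_display = ["red", "green", "blue"]     # "here is a list of strings"
from_split = "red green blue".split()       # "here is a string; now split it"
```

Both pairs compare equal at run time; the argument here is about what the source says to a reader, not about the resulting objects.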

> > so I don't believe that this will be anywhere near the cognitive load that 
> > you state
>
> Again, this is all gut feeling, but we're talking about adding something new 
> here -- a tiny bit better, and maybe worse for some, is NOT enough to add a 
> new feature.
>
> I can't keep track of who's who, but quite amazing to me that this is getting 
> traction, and on the next thread over (some) people seem convinced that
>
> dict1 + dict2 would be incredibly confusing!
>
> oh well, language design is hard.
>

Yeah, well... welcome to the insanity that we call "python-ideas" :)

ChrisA
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/CSZ2KDBGX25OGVDM6GOGFQRLPJBQWV53/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

On Wed, Oct 23, 2019 at 1:41 PM Steven D'Aprano  wrote:

> In another comment, you asserted that we all have editors that help
> with typing quotes. Don't you have an editor that formats identifiers
> differently from string literals?
>

OK -- but which is it? Do we expect people to have smart editors or not? If
we do, then these are essentially equivalent in ease of reading and writing,
and if not, then the new way is easier to write, but harder to read (and
frankly, I think harder to write correctly if there is white space in the
individual strings).

An example of that: I think it's really handy that Python allows me to use
" as a string delimiter when writing actual text, so I don't have to escape
the apostrophe (and vice versa for the less common " in the actual
string). Escaping is a pain and error prone -- and worse when you need to
use codes: "\x20" is at least a bit harder than "\n" -- at least "\n" is a
nice mnemonic. And "\u0020" is even worse.
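
For anyone who doesn't have the escapes memorized, a tiny sketch (variable names are mine, just for illustration) showing that the three spellings of a space are the same character, and that picking the other quote character avoids escaping entirely:

```python
space_literal = " "
space_hex = "\x20"        # hex escape for code point 0x20
space_unicode = "\u0020"  # 16-bit Unicode escape for the same code point

# Choosing the other quote character avoids the backslash altogether.
with_double_quotes = "don't"
with_escape = 'don\'t'
```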

Without an actual study, we are all going with our gut here, but I doubt
I'd ever use this except for simple collections of strings that don't have
spaces in them. So then there are now Three ways, rather than two obvious
ways to do it :-)

> I predict that even without colour or stylistic hinting, people will
> soon get used to the syntax. The fact that space-separated identifiers
> are not legal in Python is a pretty huge hint that these aren't
> identifiers.
>

Not really, because while in a list, you need commas, in regular code,
space is (at least conventionally) used to separate identifiers and tokens
-- that space doesn't scream out at me.

> Virtually overnight, the Python community got used to the opposite
> change, with f-strings: something that looks like a string is actually
> code containing identifiers and even arbitrary expressions:
>
> f"Your score is {score}"
>

well, it's technically code, yes, but it's functionally still a string --
it looks like a string, and it evaluates to a string. I don't think that's
analogous.

> so I don't believe that this will be anywhere near the cognitive load
> that you state

Again, this is all gut feeling, but we're talking about adding something
new here -- a tiny bit better, and maybe worse for some, is NOT enough to
add a new feature.

I can't keep track of who's who, but quite amazing to me that this is
getting traction, and on the next thread over (some) people seem convinced
that

dict1 + dict2 would be incredibly confusing!

oh well, language design is hard.

-CHB

-- 
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/2FUJRDXBQWQHDV4ZYPPBLICUAUQ67ELQ/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

On Oct 23, 2019, at 13:10, Steven D'Aprano  wrote:
> 
> David, you literally wrote the book on text processing in Python. I 
> think you are being disingenuous here, and below when you describe a 
> standard string hex-escape \x20 that has been in Python forever and in 
> just about all C-like languages as "weird".

I think what he’s saying is that it’s weird that \x20 doesn’t count as white 
space here, when it literally means a space character.

We do have to deal with this kind of weirdness in regexes, and that’s part of 
the reason we have raw string literals, and this is no more confusing than 
passing a raw string literal to re.compile.

But arguably it’s also no _less_ confusing than passing a raw string literal to 
re.compile, and that does actually confuse people, and now we’re talking about 
promoting that kind of confusion from a parser buried inside a module that 
novices don’t have to use to the actual Python parser that handles every line 
you type.

> If you can understand why this works:
> 
>string = "Single\n quoted\n string\n containing newlines!"
> 
> you can understand the burnt\x20umber example.

Not really. Your string contains new lines; it also contains spaces. Your 
burnt\x20umber example doesn’t contain a space.

Or, rather, it doesn’t contain a space that separates the elements, but one of 
the elements does anyway. As if this:

strings = "Single\n quoted\n string\n containing newlines!".splitlines()

… gave you a list of one string that contains new lines instead of a list of 
four strings that don’t.
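
A small sketch of that contrast, using only existing str methods. The raw-string case is my own stand-in for "the splitter never sees the escaped character"; it is an analogy, not how the proposed syntax is specified:

```python
text = "Single\n quoted\n string\n containing newlines!"

# The \n escapes are real newline characters by the time splitlines() runs,
# so the string is split at each of them.
lines = text.splitlines()

# In a raw string the backslash and the letter n stay as two ordinary
# characters, so splitlines() finds nothing to split on.
raw_lines = r"Single\n quoted\n string\n containing newlines!".splitlines()

print(len(lines), len(raw_lines))
```

The proposed syntax would sit somewhere in between: split first on the source text, then interpret the escapes, which is the surprising part.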

Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/ZIH4LOSWOB5WYKFBA2T23O4GNIN64YA5/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

On Wed, Oct 23, 2019 at 09:06:49AM -0700, Christopher Barker wrote:

> As for:
> 
> %w[red green blue]
> 
> The [] make it pretty clear at a glance that I'm dealing with a list -- but
> the lack of quotes is really likely to confuse me -- particularly if I have
> identifiers with similar names!

In another comment, you asserted that we all have editors that help 
with typing quotes. Don't you have an editor that formats identifiers 
differently from string literals?

I predict that even without colour or stylistic hinting, people will 
soon get used to the syntax. The fact that space-separated identifiers 
are not legal in Python is a pretty huge hint that these aren't 
identifiers.

Virtually overnight, the Python community got used to the opposite 
change, with f-strings: something that looks like a string is actually 
code containing identifiers and even arbitrary expressions:

f"Your score is {score}"

so I don't believe that this will be anywhere near the cognitive load 
that you state, especially if you are using an editor that displays 
strings in a different style from identifiers or numbers.

-- 
Steven
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/OQ7ZKW2X4EPJZBK4GZQ4BLU7PTIFW27T/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

On Wed, Oct 23, 2019 at 11:59:44AM -0400, Todd wrote:

> Compare that to:
> 
> colors2 = "cyan,forest green,burnt umber".split(',')

Sure, that's not going away. But consider that you're using this inside 
a tight loop:

    for something in lots_of_items:
        for another in more_items:
            function(spam, eggs, "cyan,forest green,burnt umber".split(','))


That's easy to fix, you say. Move the list outside the loop:

    L = "cyan,forest green,burnt umber".split(',')
    for something in lots_of_items:
        for another in more_items:
            function(spam, eggs, L)

What's wrong with this picture?
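
For anyone who wants to measure rather than guess, a quick-and-dirty sketch follows. The three function names are hypothetical, and the loop bodies are stand-ins for the nested loops above. I make no claim about what numbers you will see; that depends on the interpreter and version.

```python
import timeit


def split_each_time(n):
    """Call str.split on every iteration, as in the first example."""
    total = 0
    for _ in range(n):
        total += len("cyan,forest green,burnt umber".split(','))
    return total


def hoisted(n):
    """Split once, outside the loop, as in the second example."""
    colors = "cyan,forest green,burnt umber".split(',')
    total = 0
    for _ in range(n):
        total += len(colors)
    return total


def display_each_time(n):
    """Build the list from a list display on every iteration."""
    total = 0
    for _ in range(n):
        total += len(["cyan", "forest green", "burnt umber"])
    return total


if __name__ == "__main__":
    for fn in (split_each_time, hoisted, display_each_time):
        seconds = timeit.timeit(lambda: fn(10_000), number=20)
        print(f"{fn.__name__}: {seconds:.4f}s")
```

One thing the timing does not show: the hoisted version passes the same list object on every call, so if the callee mutates it, later calls see the change.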



-- 
Steven
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/HKWL77SF2ZDK4RWX4N6FB7CA7QZF5ZV5/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

On Wed, Oct 23, 2019 at 12:02:37PM -0400, David Mertz wrote:
> >
> >
> > > colors2 = "cyan   forest green  burnt umber".split()
> > > # oops, not what I wanted, quote each separately

Ha, speaking about "Oops" moments, I *totally* failed to notice that 
"forest green" is intended to be a single colour. The perils of posting 
in the wee hours of the morning, sorry.

> > It isn't shared by the proposal.
> >
> >   colors2 = %w[cyan   forest green  burnt\x20umber]
> >
> 
> I don't get it. There is weird escaping of spaces that aren't split? 

The source code has spaces between cyan and "forest-green" (let's 
pretend that's what it said all along...) and between forest-green and 
"burnt\x20umber". The parser/lexer splits on whitespace in the source 
code, giving three tokens:

cyan
forest-green
burnt\x20umber

each of which is treated as a string, complete with standard string 
escaping.

> That is confusing and a bug magnet.

David, you literally wrote the book on text processing in Python. I 
think you are being disingenuous here, and below when you describe a 
standard string hex-escape \x20 that has been in Python forever and in 
just about all C-like languages as "weird".

If you can understand why this works:

string = "Single\n quoted\n string\n containing newlines!"

you can understand the burnt\x20umber example.

> What are the rules for escaping all
> whitespace, exactly? All the Unicode space-like code points, or just x20?

(1) I am assuming that we don't change any of the existing string 
escapes. That would be a backwards-incompatible change that would change 
the meaning of existing strings.

(2) The parser splits on whitespace in the source code. After that, the 
tokens are treated as normal string tokens except that you don't need to 
put start/end delimiters (quotes) on them.
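
Rules (1) and (2) can be emulated today with a small function, which may make the behaviour easier to reason about. This is only a sketch: `percent_w` is a made-up name, it works on a raw string standing in for the source text between the brackets, and it assumes tokens contain no double quotes and do not end in a backslash.

```python
import ast


def percent_w(source):
    """Emulate the hypothetical %w[...] literal on its raw source text."""
    result = []
    # Step 1: split on whitespace as it appears in the source code.
    for token in source.split():
        # Step 2: treat each token as an ordinary string literal, so the
        # standard escapes (\x20, \n, \u0020, ...) are processed afterwards.
        result.append(ast.literal_eval('"' + token + '"'))
    return result


# The raw string plays the role of what the lexer would see.
colors = percent_w(r"cyan   forest-green  burnt\x20umber")
print(colors)
```

The escaped space survives the split because, at split time, it is still the four source characters backslash, x, 2, 0.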

-- 
Steven
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/MHVEJFLJDSZXNK4OVP3B5IXLLASC7WZ3/

[Python-ideas] Re: PEP 584: Add + and += operators to the built-in dict class.

2019-10-23 Thread Dominik Vilsmeier

Jan Greis wrote:
> On 22/10/2019 06:43, Richard Musil wrote:
> > It is not a "concatenation" though, because you lost
> > {"key1": "val1"} 
> > in the process. The concatenation is not _just_ "writing something 
> > after something", you can do it with anything, but the actual 
> > operation, producing the result.
> My point is that if I saw {"key1": "val1", "key2": "val2"} + {"key1": 
> "val3"}, I would expect that it would be equivalent to {"key1": "val1", 
> "key2": "val2", "key1": "val3"}.

But that reasoning only works with literals. And chances are that you're not 
going to see something like this in real code. Because why would you add two 
dict literals?

Instead you're going to see something like this: `d1 + d2`. And if one has to 
infer the details of that operation by coming up with some hypothetical example 
involving literals, that doesn't speak in favor of the syntax.

As mentioned, here it is up to the variable names to be clear about what 
happens. E.g.

default_preferences + user_preferences

For that example it's pretty clear that `user_preferences` is meant to 
supersede `default_preferences`. But variable names might not always be 
completely clear or even if they are, they might not allow the reader to infer 
any precedence. And then, "in the face of [that] ambiguity", one has to "refuse 
the temptation to guess". Maybe it's better not to introduce that ambiguity in 
the first place.
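
For reference, here is how the existing spellings resolve the precedence question. This sketch uses only syntax that already exists (dict unpacking and `update`), not the proposed operator, and the preference values are invented for the example.

```python
default_preferences = {"theme": "light", "font_size": 12}
user_preferences = {"theme": "dark"}

# Dict unpacking: later mappings win on duplicate keys.
merged = {**default_preferences, **user_preferences}

# dict.update has the same "last one wins" behaviour.
via_update = dict(default_preferences)
via_update.update(user_preferences)

# A dict display with a repeated key also keeps the last value, while the
# key keeps the position of its first occurrence.
literal = {"key1": "val1", "key2": "val2", "key1": "val3"}
```

Whether a reader of `d1 + d2` would guess the same rule without the names to help is exactly the open question.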

> Similarly, I would expect that
> deque([1, 2, 3], maxlen=4) + deque([4, 5]) == deque([1, 2, 3, 4, 5], 
> maxlen=4) == deque([2, 3, 4, 5], maxlen=4)
> which indeed is true.
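
The quoted deque claim can be checked directly. This sketch assumes a Python version where deque supports `+` (3.5 or later, as far as I know):

```python
from collections import deque

left = deque([1, 2, 3], maxlen=4)
combined = left + deque([4, 5])

# Constructing with more items than maxlen drops items from the left.
truncated = deque([1, 2, 3, 4, 5], maxlen=4)
```

So the bounded deque silently drops the oldest element on concatenation, which is another "lossy" operation people seem to accept.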
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/RDBCJGWMK45RY676YEXQATIPHWMLVQ3Z/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

2019-10-23 Thread Dominik Vilsmeier

I don't see what's wrong with `["one", "two", "three"]`. It's the most explicit 
and from the compiler perspective it's probably also as optimal as it can get. 
Also it doesn't hurt readability. Actually it helps. With syntax highlighting 
the word boundaries immediately become clear.

If you're having long lists of string literals and you're annoyed by having to 
type `"` and `,` for every element, then it is the job of your IDE to properly 
support you while coding, not the job of the syntax (as long as it's clear and 
concise).

For that reason all the advanced IDEs with all their features exist. Without 
code completion for example you could also ask for new syntax that helps you 
abbreviate long variable names, because it's too much to type. So instead of 
writing `this_is_a_very_long_but_expressive_name` you could do `this_is...` in 
case there's only one name that starts with "this_is" which can be resolved 
from your scope. That would even shorten the code. Nevertheless I think that 
code completion is a good idea and that we have to use the exact same name 
every time.

The same applies to these "word literals". If you need a list of words, you can 
already create a list literal with the words inside. If that's too much typing, 
then you should ask your favorite IDE to implement corresponding refactoring 
assistance. I'm pretty sure the guys at PyCharm would consider adding something 
like this (e.g. if the caret is inside a string literal you can access the 
context menu via + and there could be something like "split words").

Steve Jorgensen wrote:
> See 
> https://en.wikibooks.org/wiki/Ruby_Programming/Syntax/Literals#The_%_Notatio...
> for what Ruby offers.
> For me, the arrays are the most useful aspect.
> %w{one two three}
> => ["one", "two", "three"]
> 
> I did a search, and I don't see that this has been suggested before, but I 
> might have
> missed something. I'm guessing I'm not the first person to ask whether this 
> seems like a
> desirable feature to add to Python.
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/J7X7BGBNZY43NANEB5OLJXCQFMZ7KHJH/

[Python-ideas] Re: PEP 584: Add + and += operators to the built-in dict class.

2019-10-23 Thread Sebastian Kreft

On Wed, Oct 23, 2019 at 1:19 PM Christopher Barker 
wrote:

> On Wed, Oct 23, 2019 at 5:42 AM Rhodri James  wrote:
>
>> > I'm surprised by that description. I don't think it is just newcomers
>> > who either suggest or prefer plus over pipe, and I don't think that pipe
>> > is "more accurate".
>>
>> +1 (as one of the non-newcomers who prefers plus)
>>
>
> me too.
>
> frankly, the | is obscure to most of us. And it started as "bitwise or",
> and evokes the __or__ magic method -- so why are we all convinced that
> somehow it's inextricably linked to "set union"? And set union is a bit
> obscure as well -- I don't think that many people (newbies or not) would
> jump right to this logic:
>
In my particular case I do know what `|` means in a set context. However,
when I see code using it, it takes me a while to understand what it means
and I tend to replace the operator with an explicit call to the union
method.

The only problem with that is when `dict_keys` are in use, as they do
implement the `|` operator, but not the `union` method. They only have
`isdisjoint`.
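
A short sketch of that asymmetry, with made-up dict contents. The workaround at the end (converting to a set first) is just one option:

```python
d1 = {"a": 1, "b": 2}
d2 = {"b": 3, "c": 4}

# The keys views support the | operator and return a plain set...
union_of_keys = d1.keys() | d2.keys()

# ...but they have no .union() method, only isdisjoint().
has_union_method = hasattr(d1.keys(), "union")
has_isdisjoint = hasattr(d1.keys(), "isdisjoint")

# To spell it with an explicit method call, build a set first.
explicit = set(d1).union(d2)
```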


>
> I need to put two dicts together
> That's kind of like a set union operation
> The set object uses | for union
> That's probably what dict uses -- I'll give that a try.
>
> Rather than:
>
> I need to put two dicts together
> I wonder if dicts support addition?
> I'll give that a try.
>
> so, I'm
>
> +1 on +
> +0 on |
>
> if | is implemented, I'll bet you dollars to donuts that it will get used
> far less than if + were used.
>
> (though some on this thread would think that's a good thing :-) )
>
> -CHB


-- 
Sebastian Kreft
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/MV3P7FXG6VBIDB5WEYYPQF3KCSRYPBTV/

[Python-ideas] Re: PEP 584: Add + and += operators to the built-in dict class.

2019-10-23 Thread Mike Miller

Other folks (and I earlier) have explained why we think | is the better choice, 
if less obvious.


On 2019-10-22 21:41, Steven D'Aprano wrote:

I think that is patronising to anyone, newbies and experienced
programmers alike, who know and expect that merging dicts with an
operator will have the same semantics as merging them with the update
method.

Should we also force newcomers to give a moment's reflection on why
item assignment ``mydict[key] = value`` is potentially "lossy"? How
about ``mystring.replace(old, new)`` or opening a file for writing?

Even if they don't know what they are doing, it is not the place of the
interpreter to treat them as an ignoramus that needs to be forced into
reflecting on the consequences of their action, as if they were a
naughty little schoolboy being told off by their headmaster.



This is an odd take, that a helpful error message is "patronising" and treats 
you like an ignoramus.  The alternative, losing data is expected, builds 
character / "puts hair on your chest."


This attitude reminds me of the bad old days on Slashdot.  I had thought it had 
gone out of fashion after the tech community grew after the turn of the century. 
 I'll take all the helpful error messages I can get personally, assuming they 
make sense.


-Mike
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/ZKYTWL7DTTFNILNR46UFK7QKVL5NVV4J/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

On Thu, Oct 24, 2019 at 4:22 AM Andrew Barnert via Python-ideas
 wrote:
>
> On Oct 23, 2019, at 10:04, Christopher Barker  wrote:
> >
> > This talk about optimization is confusing me:
>
> The main argument for why “a b c”.split() is not good enough, and therefore 
> we need a new syntax, is that it’s “too slow”.
>
> Someone earlier in this thread said we could optimize calling split on a 
> string literal, just as we can and do optimize iterating over a list literal 
> in a for statement.

I was the one to post it in this thread, but it wasn't my invention -
talk of optimizing method calls on literals has been around before.

> I agree. That’s why I think “too slow” isn’t a good argument, and to the tiny 
> extent that it is, “then let’s write an optimizer for the already-common 
> idiom” is a good answer, not “let’s come up with a whole new syntax that does 
> the same thing”.
>

Agreed. The value of creating new syntax is (must be) that it better
expresses programmer intent, not that it's easier to optimize.

ChrisA
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/HMP2G65MSY3UK6REF7INPBNWSFPEHUFX/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

On Oct 23, 2019, at 10:04, Christopher Barker  wrote:
> 
> This talk about optimization is confusing me:

The main argument for why “a b c”.split() is not good enough, and therefore we 
need a new syntax, is that it’s “too slow”.

Someone earlier in this thread said we could optimize calling split on a string 
literal, just as we can and do optimize iterating over a list literal in a for 
statement.
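
If you want to look at what the compiler actually does with these two forms, something like the following works. `constants_of` is a hypothetical helper of mine. What ends up in the constants is an implementation detail: on the CPython versions I have looked at, the list display in the for statement is stored as a tuple constant, while the split call is left as a run-time method call on a string constant, but other versions and implementations may differ.

```python
import dis


def constants_of(source):
    """Compile a snippet and return the constants stored in its code object."""
    code = compile(source, "<example>", "exec")
    return code.co_consts


loop_consts = constants_of("for x in [1, 2, 3]:\n    pass\n")
split_consts = constants_of('words = "a b c".split()\n')

if __name__ == "__main__":
    print("for-loop constants:", loop_consts)
    print("split constants:   ", split_consts)
    # Full disassembly of the split version, to see the method call.
    dis.dis(compile('words = "a b c".split()\n', "<example>", "exec"))
```

An optimizer for the split idiom would presumably have to turn that call into the same code a list display produces, building a fresh list each time because lists are mutable.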

The counter argument—which I thought you were adding onto—is that this would be 
bad because it would make people write bad code for older/alternative Pythons.

The reason I thought you were adding onto that argument is that you said people 
should be able to write something and know it’ll be _efficient_ on every Python 
implementation. Why does efficient matter if this code will only show up in 
places where you are, as you say below, already not concerned with performance?

That’s what I was responding to. If that wasn’t your point, I apologize for 
misreading it.

> These are literals -- they should only get processed once, generally on 
> module import.
> 
> If you are putting a long list of literal strings inside a tight loop, you 
> are already not concerned with performance.
> 
> Performance is absolutely the LAST reason to consider any proposal like this.

I agree. That’s why I think “too slow” isn’t a good argument, and to the tiny 
extent that it is, “then let’s write an optimizer for the already-common idiom” 
is a good answer, not “let’s come up with a whole new syntax that does the same 
thing”.

Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/SMDBEQXRMVVKQ4RIGTVLQSSVMAU3OYPN/

[Python-ideas] Re: PEP 584: Add + and += operators to the built-in dict class.

On Mon, Oct 21, 2019 at 4:49 PM Brandt Bucher 
wrote:

> Meitham Jamaa wrote:
> > The fact dict is a mutable object makes this PEP very complicated.
>

no, it doesn't -- a mutable inside a container of any sort has the same
issues:

Yes, that does cause its confusion, but this proposal doesn't change that.
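
A minimal sketch of the kind of issue meant here. This is my own illustration with invented names, not an example taken from the PEP: any container holds references, so a mutable object placed in a list, a tuple, or a merged dict is shared rather than copied.

```python
inner = {"key": "val"}

outer_list = [inner]
outer_tuple = (inner,)
# A shallow merge using existing syntax; the values are shared, not copied.
merged = {**{"a": inner}, **{"b": 1}}

# Mutating the inner dict is visible through every container that holds it.
inner["key"] = "changed"
```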

-CHB

-- 
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/BQO3HM2QEK752GSDYT4AWGAJCS2QJKUX/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

This talk about optimization is confusing me:

These are literals -- they should only get processed once, generally on
module import.

If you are putting a long list of literal strings inside a tight loop, you
are already not concerned with performance.

Performance is absolutely the LAST reason to consider any proposal like
this.

I'm not saying that things like this shouldn't be optimized -- faster
import is a good thing, but I am saying it's not a reason to add a language
feature.

-CHB



On Wed, Oct 23, 2019 at 9:42 AM Andrew Barnert via Python-ideas <
python-ideas@python.org> wrote:

> On Oct 23, 2019, at 03:08, Steven D'Aprano  wrote:
> >
> > It could also be done by a source code preprocessor, or an AST
> > transformation, without changing syntax.
> >
> > But the advantage of changing the syntax is that it becomes the One
> > Obvious Way, and you know it will be efficient whatever version or
> > implementation of Python you are using.
>
> The advantage of just optimizing split on a literal is that split becomes
> the One Obvious Way, and you know it will work and be correct in whatever
> version or implementation of Python you are using, back to 0.9; it’ll just
> be faster in CPython 3.9+.
>
> In fact, given that we already use split all over the place, and even
> offer shorthand for it in places like namedtuple, and people recommend it
> on python-list and StackOverflow without any pushback, I think it already
> is TOOWTDI for many cases. So why not optimize it?
>
> And your argument is really an argument against adding any optimizations
> to CPython. The fact that nested tuple literals are now as fast as
> constants means someone could be constructing one right in the middle of a
> bottleneck, making their code appear to work on all Python versions and
> pass benchmarks in current CPython but then be unacceptably slow when they
> deploy on CPython 3.4 or uPython or whatever. But would you say that
> optimization was a mistake, and we should have instead left nested tuple
> displays slow and invented a new syntax for nested tuple constants that
> would make it an obvious SyntaxError in 3.4 or uPython, just because it’s
> possible that one person might run into that unacceptably slow case one
> day, even though nobody has ever complained about it?
>
> And this is almost certainly the same thing. If someone has a case where
> they wrote out a long list of strings as a list literal with quotes instead
> of using split because benchmarking required it, where they would have been
> misled into using split if it were faster in 3.8 even though some of their
> deployment targets are 3.7, then we should listen. But I doubt anyone does.
> The optimization will just be a small QoI thing that adds to Python 3.9
> being on average a bit faster than 3.8.
>


-- 
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/RHY6NSJKHRO3GO3OIN7MUFNWL7GST3ST/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

On 23/10/2019 16:16, Steven D'Aprano wrote:
> On Wed, Oct 23, 2019 at 03:16:51PM +0100, Rhodri James wrote:
>
>> I'm seriously not getting the issue people have with
>>
>> colours1 = ["red", "green", "blue"]
>>
>> which has the advantage of saying what it means.
>
> As opposed to the alternatives, which say something different from what
> they mean?

Well, yes.

["red", "green", "blue"]

says that this is a list of strings. End of.

"red green blue".split()

says that this is a string that is now -- ta dah! -- a list of strings.
Nothing up my sleeves. No, don't clap, just throw money.

It's only a little bit of extra cognitive load in this case, but then
you start meeting corner cases like wanting spaces in your strings and
it stops being nearly so little.

The proposed:

%w[red green blue]

says that this is something, good luck figuring out what. If you know,
it's only a little more cognitive load, but again gets messier as you
get into the corner cases, as you've been demonstrating. If you don't
know, looking it up is not going to be easy.

> Wherever possible, we should let the interpreter or compiler do the
> repetitive stuff.

I prefer to let my editor do the work, actually. When I have had to do
long lists of strings (or anything, really) like this, I mostly type it
in as:

NOTIONAL_CONSTANT = [
red
blue
green
burnt umber
burnt cake
really long name with lots of spaces in it
and so on
and so on
]

and then write a quick editor macro to add the quotes and comma and tab
into a more beautiful (and syntactically correct) form. Not much more
trouble than typing it all in as an escaped string, and no extra runtime
loading either. The result is immediately readable source, which I
consider a major win.
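
For readers without a convenient macro facility, the same transformation can be sketched in a few lines of Python. This is only an illustration of the idea, not the macro described above; the function name `quote_lines` and the output layout are assumptions made up for the example.

```python
def quote_lines(name, raw):
    """Turn one-item-per-line text into source code for a list of strings.

    Hypothetical helper: `name` is the variable to assign to, `raw` is the
    unquoted text as it might be typed into an editor.
    """
    # Drop blank lines and surrounding whitespace; keep inner spaces intact.
    items = [line.strip() for line in raw.splitlines() if line.strip()]
    # repr() adds the quotes and escapes anything that needs escaping.
    body = "".join(f"    {item!r},\n" for item in items)
    return f"{name} = [\n{body}]\n"


raw = """
red
blue
burnt umber
really long name with lots of spaces in it
"""

source = quote_lines("NOTIONAL_CONSTANT", raw)
print(source)
```

The generated text is ordinary source, so nothing is split or parsed at run time.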

--
Rhodri James *-* Kynesim Ltd
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/2UJPTFEJLLUPLC552BOH7PBXQ7FXNVMY/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

On Oct 23, 2019, at 03:08, Steven D'Aprano  wrote:
> 
> It could also be done by a source code preprocessor, or an AST 
> transformation, without changing syntax.
> 
> But the advantage of changing the syntax is that it becomes the One 
> Obvious Way, and you know it will be efficient whatever version or 
> implementation of Python you are using.

The advantage of just optimizing split on a literal is that split becomes the 
One Obvious Way, and you know it will work and be correct in whatever version 
or implementation of Python you are using, back to 0.9; it’ll just be faster in
CPython 3.9+.
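
To make the "AST transformation" route from the quoted text concrete, here is a rough sketch using the standard `ast` module. It is not an existing CPython optimization; the class name `FoldLiteralSplit` and the helper `compile_folded` are hypothetical, and it only handles the simplest case (a string literal with a no-argument `.split()`).

```python
import ast


class FoldLiteralSplit(ast.NodeTransformer):
    """Rewrite "a b c".split() into ["a", "b", "c"] before compiling."""

    def visit_Call(self, node):
        self.generic_visit(node)  # transform nested calls first
        func = node.func
        if (
            isinstance(func, ast.Attribute)
            and func.attr == "split"
            and isinstance(func.value, ast.Constant)
            and isinstance(func.value.value, str)
            and not node.args
            and not node.keywords
        ):
            # Do the split once, now, and emit a plain list display.
            words = func.value.value.split()
            folded = ast.List(
                elts=[ast.Constant(value=word) for word in words],
                ctx=ast.Load(),
            )
            return ast.copy_location(folded, node)
        return node


def compile_folded(source, filename="<folded>"):
    """Parse, fold literal splits, and compile (illustrative helper)."""
    tree = FoldLiteralSplit().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return compile(tree, filename, "exec")


namespace = {}
exec(compile_folded('colors = "red green blue".split()'), namespace)
print(namespace["colors"])
```

The source text stays valid on every Python version; only the compiled form changes where the transformer is applied.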

In fact, given that we already use split all over the place, and even offer 
shorthand for it in places like namedtuple, and people recommend it on 
python-list and StackOverflow without any pushback, I think it already is 
TOOWTDI for many cases. So why not optimize it?
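
As a reminder of the namedtuple shorthand mentioned here: `collections.namedtuple` accepts the field names either as a sequence of strings or as a single whitespace- or comma-separated string. A minimal sketch (the type name `Point` is just an example):

```python
from collections import namedtuple

# Field names given as one space-separated string...
PointA = namedtuple("Point", "x y z")
# ...or as an explicit list of strings.
PointB = namedtuple("Point", ["x", "y", "z"])

print(PointA._fields)
print(PointB._fields)
```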

And your argument is really an argument against adding any optimizations to 
CPython. The fact that nested tuple literals are now as fast as constants means 
someone could be constructing one right in the middle of a bottleneck, making 
their code appear to work on all Python versions and pass benchmarks in current 
CPython but then be unacceptably slow when they deploy on CPython 3.4 or 
uPython or whatever. But would you say that optimization was a mistake, and we 
should have instead left nested tuple displays slow and invented a new syntax 
for nested tuple constants that would make it an obvious SyntaxError in 3.4 or 
uPython, just because it’s possible that one person might run into that 
unacceptably slow case one day, even though nobody has ever complained about it?

And this is almost certainly the same thing. If someone has a case where they 
wrote out a long list of strings as a list literal with quotes instead of using 
split because benchmarking required it, where they would have been misled into 
using split if it were faster in 3.8 even though some of their deployment 
targets are 3.7, then we should listen. But I doubt anyone does. The 
optimization will just be a small QoI thing that adds to Python 3.9 being on 
average a bit faster than 3.8.
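
Anyone who wants to check the relative cost on their own interpreter could start from a quick-and-dirty micro-benchmark along these lines. It is only a sketch: the iteration count `N` is arbitrary, and the numbers will vary by version, implementation, and machine.

```python
import timeit

N = 200_000  # arbitrary repetition count for the sketch

# Build the list directly from a list display.
t_display = timeit.timeit('["red", "green", "blue"]', number=N)
# Build the same list by calling split() on a string literal.
t_split = timeit.timeit('"red green blue".split()', number=N)

print(f"list display: {t_display:.4f}s")
print(f"split():      {t_split:.4f}s")
print(f"ratio:        {t_split / t_display:.2f}x")
```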

___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/7F3FC3YR7XUG4CHXT27BXQAED7WR3F35/

[Python-ideas] Re: PEP 584: Add + and += operators to the built-in dict class.

On Wed, Oct 23, 2019 at 5:42 AM Rhodri James  wrote:

> > I'm surprised by that description. I don't think it is just newcomers
> > who either suggest or prefer plus over pipe, and I don't think that pipe
> > is "more accurate".
>
> +1 (as one of the non-newcomers who prefers plus)
>

me too.

frankly, the | is obscure to most of us. And it started as "bitwise or",
and evokes the __or__ magic method -- so why are we all convinced that
somehow it's inextricably linked to "set union"? And set union is a bit
obscure as well -- I don't think that many people (newbies or not) would
jump right to this logic:

I need to put two dicts together
That's kind of like a set union operation
The set object uses | for union
That's probably what dict uses -- I'll give that a try.

Rather than:

I need to put two dicts together
I wonder if dicts support addition?
I'll give that a try.

so, I'm

+1 on +
+0 on |

if | is implemented, I'll bet you dollars to donuts that it will get used
far less than if + were used.

(though some on this thread would think that's a good thing :-) )
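
For context, a small sketch of what "putting two dicts together" looks like with spellings that already work without any new operator (dict unpacking, or copy plus update), shown next to set union with `|`. The variable names are purely illustrative.

```python
defaults = {"colour": "red", "size": 10}
overrides = {"size": 12, "shape": "round"}

# Dict unpacking: keys from the right-hand dict win on conflict.
merged = {**defaults, **overrides}

# The longhand equivalent: copy, then update in place.
merged_alt = defaults.copy()
merged_alt.update(overrides)

# Set union, where | is already the established spelling.
tags = {"a", "b"} | {"b", "c"}

print(merged)
print(tags)
```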

-CHB

-- 
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/TE2IFV2PIUADS35TD6BHPEO5KG6DDTNT/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

On Wed, Oct 23, 2019 at 8:45 AM Chris Angelico  wrote:

> > This would be a good argument if Python were a write-only language.
>
> I'm pretty sure the character counts are the same whether you're
> reading or writing. If anything, writing is based on keystrokes, but
> reading is based on characters.
>

It's not that simple -- it takes more work to type the quotes -- it may
take more work to read them, but they provide useful information -- this is
a string. If I see:

colors = ["red",  "green", "blue"]

It is VERY clear to me, at a glance, that it is a list of strings.

but when I see:

colors = "red green blue".split()

I need to think about it a bit.

As for:

%w[red green blue]

The [] make it pretty clear at a glance that I'm dealing with a list -- but
the lack of quotes is really likely to confuse me -- particularly if I have
identifiers with similar names!

and:

%w[1 2 3]

would really take a cognitive load to remember that that is a list of
strings.
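
The existing split idiom has the same property, which may help show why the unquoted digits are easy to misread. A minimal sketch:

```python
# Splitting a string always yields strings, even when they look numeric.
as_strings = "1 2 3".split()

# Getting integers takes an explicit conversion step.
as_ints = [int(s) for s in as_strings]

print(as_strings)
print(as_ints)
```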

I won't say that I (as a pretty bad typist) don't get annoyed at having to
type quotes a lot, but I really do appreciate that clear distinction
between identifiers and strings when reading code.

-CHB

-- 
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/HX3FPTRGEA73CNEJTSJDR24LVEJ6DFMV/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

On Wed, Oct 23, 2019 at 11:44 AM Chris Angelico  wrote:

> On Thu, Oct 24, 2019 at 2:39 AM Serhiy Storchaka 
> wrote:
> >
> > 23.10.19 18:16, Steven D'Aprano wrote:
> > > The average word length in English is five characters. That means that
> > > in a list of typical English words, more than a third of the expression
> > > is made up of the quotes and commas. In the example you give, there are
> > > twelve characters in the words themselves and eight characters worth of
> > > boilerplate surrounding them (quotes and commas, not including the
> > > spaces or brackets).
> >
> > This would be a good argument if Python were a write-only language.
>
> I'm pretty sure the character counts are the same whether you're
> reading or writing. If anything, writing is based on keystrokes, but
> reading is based on characters.
>
>
Reading really isn't based on characters.  People generally read words as a
single unit rather than reading each character individually.
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/JIHD3FWMVFHEBUCE46XAVV2GJNVVHDDJ/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

>
>
> > colors2 = "cyan   forest green  burnt umber".split()
> > # oops, not what I wanted, quote each separately
>
> It isn't shared by the proposal.
>
>   colors2 = %w[cyan   forest green  burnt\x20umber]
>

I don't get it. There is weird escaping of spaces that aren't split? That
is confusing and a bug magnet. What are the rules for escaping all
whitespace, exactly? All the Unicode space-like code points, or just x20?

Plus your example doesn't capture the color "forest green" correctly in any
way I can imagine.  But I suppose more weird escapes in the middle could do
that.

Overall... the proposal becomes incredibly ugly, and probably more
characters that are harder to type, than existing syntax.

Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons.  Intellectual property is
to the 21st century what the slave trade was to the 16th.
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/3KGLLDA43Q6X46RM7NEHHCNVIATQ6OLX/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

On Wed, Oct 23, 2019 at 10:59 AM Steven D'Aprano 
wrote:

> On Wed, Oct 23, 2019 at 10:09:41AM -0400, David Mertz wrote:
>
> > One big problem with the current obvious way would be shared by the
> > proposal. This hits me fairly often.
> >
> > colors1 = "red green blue".split()  # happy
> >
> > Later
> >
> > colors2 = "cyan   forest green  burnt umber".split()
> > # oops, not what I wanted, quote each separately
>
> It isn't shared by the proposal.
>
>   colors2 = %w[cyan   forest green  burnt\x20umber]
>
>
> Escaping the space ``\ `` might be nicer, but escaping an invisible
> character is problematic (see the problems with the explicit line
> continuation character ``\``) and we may not be able to add any new
> escape characters to the language. However a hex escape will do the
> trick.
>

Compare that to:

colors2 = "cyan,forest green,burnt umber".split(',')

or, if you follow pep8-style commas:

colors2 = "cyan, forest green, burnt umber".split(', ')

This is one of the many cases where being able to specify the delimiter
helps.
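
A short sketch of the delimiter variants, plus one tolerant form for uneven spacing around the commas. The third variant is an addition for illustration, not something proposed in this thread.

```python
# Comma as the delimiter, no spaces after it.
plain = "cyan,forest green,burnt umber".split(",")

# Comma plus one space as the delimiter.
spaced = "cyan, forest green, burnt umber".split(", ")

# Tolerate uneven spacing by stripping each piece after splitting.
tolerant = [s.strip() for s in "cyan ,forest green,  burnt umber".split(",")]

print(plain)
print(spaced)
print(tolerant)
```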
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/DF5N3ER7GL224SAAS74G2ASERPDYG3MO/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

I have to say that I'm really surprised that this idea is gaining this much
traction. And this is why:

Shorthand for a list of strings, whether this proposal, or the "list of
strings".split() "hack" -- is useful primarily for what I'd call
"scripting", rather than "software development".

There is no clear distinction, of course, but in (my definition of)
scripting, the write:read ratio (and the write:everything-else ratio: e.g.
running, testing, debugging, reviewing) is much higher, and it is a lot
more common to have a bunch of literals. I know I often put a pile of
literals at the top of a script, whereas a program would use a config file,
or command line arguments, or pull data from a database or web service, or
...

So why am I surprised? Because Python, over the years, has become more of a
"programming language", and a bit less of a scripting language.

print x => print(x) is a prime example -- but there are many others.

I'd say f-strings are the only exception I can think of: a feature that
is probably more useful to "scripting" than "programming". But less so than
this proposal.

On to this one -- despite the fact that I do a fair bit of quick
scripting, I don't think this is worth it -- it's really only useful for a
particular subset of lists of strings -- once you add escaping whitespace
and all that (and what do you do with quotes?), it isn't a good general
solution. Sure it's a common use case, but then, the "a bunch of
words".split() solution is fine in that case.

As for "one obvious way to do it" -- that is aspirational -- there simply
can't be one obvious way to do everything. And sometimes "it" is not one
thing. I'd say:

if you need to build a quick list of simple single words that isn't likely
to get more complex, then use .split(), if you need to build a list of
strings that are not simple words, and/or may get more complex, then use
the full set of quotation marks.

Final point: ideally, we all have editors that help with the quotes, so
it's not *quite* as much extra typing.

TL;DR -- not a really wide use case, and makes the language that much more
"PERL-like".

-CHB

On Wed, Oct 23, 2019 at 8:00 AM Steven D'Aprano  wrote:

> On Wed, Oct 23, 2019 at 10:09:41AM -0400, David Mertz wrote:
>
> > One big problem with the current obvious way would be shared by the
> > proposal. This hits me fairly often.
> >
> > colors1 = "red green blue".split()  # happy
> >
> > Later
> >
> > colors2 = "cyan   forest green  burnt umber".split()
> > # oops, not what I wanted, quote each separately
>
> It isn't shared by the proposal.
>
>   colors2 = %w[cyan   forest green  burnt\x20umber]
>
>
> Escaping the space ``\ `` might be nicer, but escaping an invisible
> character is problematic (see the problems with the explicit line
> continuation character ``\``) and we may not be able to add any new
> escape characters to the language. However a hex escape will do the
> trick.
>
>
> --
> Steven

-- 
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/MJX3YCKX76XR5AYPGTVSGBVV2E34SYWG/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

On Thu, Oct 24, 2019 at 2:39 AM Serhiy Storchaka  wrote:
>
> 23.10.19 18:16, Steven D'Aprano wrote:
> > The average word length in English is five characters. That means that
> > in a list of typical English words, more than a third of the expression
> > is made up of the quotes and commas. In the example you give, there are
> > twelve characters in the words themselves and eight characters worth of
> > boilerplate surrounding them (quotes and commas, not including the
> > spaces or brackets).
>
> This would be a good argument if Python were a write-only language.

I'm pretty sure the character counts are the same whether you're
reading or writing. If anything, writing is based on keystrokes, but
reading is based on characters.

ChrisA
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/UUPAPTV6UBODZPSTZGHHKW4JZ7ASU3OI/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

2019-10-23 Thread Serhiy Storchaka


23.10.19 18:16, Steven D'Aprano wrote:
> The average word length in English is five characters. That means that
> in a list of typical English words, more than a third of the expression
> is made up of the quotes and commas. In the example you give, there are
> twelve characters in the words themselves and eight characters worth of
> boilerplate surrounding them (quotes and commas, not including the
> spaces or brackets).

This would be a good argument if Python were a write-only language.
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/WUBZYS5O6Y6F3I4FWBRFALNG55ZYQ4MB/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

On Thu, Oct 24, 2019 at 2:20 AM Steven D'Aprano  wrote:
> Hand-writing repetitive, dumb, mechanical code is an anti-pattern. I'm
> sure that, somewhere out there, there's a coder who prefers to write:
>
> [mylist[1], mylist[2], mylist[3], mylist[4], mylist[5]]
>
> instead of the obvious slice, but most of us didn't become programmers
> because we love the tedious, repetitive boilerplate.
>

Sigh... that one actually strikes home with me. Some of my
non-Python coding is in a language called SourcePawn, which doesn't
have any sort of "bulk operations" like slicing or *args or anything.
So I might have code like this:

SmokeLog("[%d-A] Smoke (%.2f, %.2f, %.2f) - (%.2f, %.2f)",
client, pos[0], pos[1], pos[2], angle[0], angle[1]);

where "pos" and "angle" are vectors - arrays of three floating-point
values. In Python, a Vector would be directly stringifiable, of
course, but even if not, you could at least say *pos,*angle.
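
A Python rendering of that call might look like the sketch below. The function name `smoke_log` is a stand-in that returns the formatted text instead of logging it, and the vectors are assumed to be plain three-element sequences.

```python
def smoke_log(client, pos, angle):
    """Format the same message as the SourcePawn call, using * unpacking."""
    # *pos expands to three values; only the first two angle components
    # are used, matching the original call.
    return "[%d-A] Smoke (%.2f, %.2f, %.2f) - (%.2f, %.2f)" % (
        client,
        *pos,
        *angle[:2],
    )


print(smoke_log(3, (1.0, 2.5, 3.0), (90.0, 45.0, 0.0)))
# prints: [3-A] Smoke (1.00, 2.50, 3.00) - (90.00, 45.00)
```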

So if someone is coming from a background in languages that can't do
this sort of thing, then yes, Python's way doesn't "look like what it
does". Quite frankly, that's a feature, not a flaw. It looks like what
the programmer intends, instead of looking like what mechanically
happens on the fly. We don't write code that looks like "push this
value onto the stack, push that value onto the stack, add the top two
values and leave the result on the stack", even though that's how
CPython byte code works. We write code that says "a + b", because
that's the programmer's intention.

If your intention is to iterate over a series of words, you do not
need all the mechanical boilerplate of constructing a list and
properly delimiting all the pieces. In Python, we don't iterate over
numbers by saying "start at 5, continue so long as we're below 20, and
add 1 every time". We say "iterate over range(5, 20)". And Python is
better for having that. (Trust me, I've messed up C-style for loops
enough times to be 100% certain of that.) You might argue that a
blank-separated words notation is unnecessary, but it should be
obvious that it's a valid way of expressing *programmer intention*.
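
The range comparison can be put side by side in a short sketch: the first loop spells out the mechanics C-style, the second states the intention.

```python
# Mechanics: start at 5, continue while below 20, add 1 each time.
manual = []
i = 5
while i < 20:
    manual.append(i)
    i += 1

# Intention: iterate over range(5, 20).
by_range = [n for n in range(5, 20)]

print(manual == by_range)
```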

ChrisA
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/HRYW4F7S7NCEDE37RUZMIW6WU23PQWLP/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

On Wed, Oct 23, 2019 at 03:16:51PM +0100, Rhodri James wrote:

> I'm seriously not getting the issue people have with
> 
> colours1 = ["red", "green", "blue"]
> 
> which has the advantage of saying what it means.

As opposed to the alternatives, which say something different from what 
they mean?

The existing alternative:

"red green blue".split()

equally "has the advantage of saying what it means", and so will the 
proposed alternative, just as it already does in Ruby.

I know that code is read more than it's written, but it still has to be 
written, and maintained, and writing out long lists of words is annoying 
to write and tedious to read.

An example like "red", "green", "blue" isn't too bad, but try it with 30 
or more single-word strings. I have. 1 out of 5, would not recommend.

Hand-writing repetitive, dumb, mechanical code is an anti-pattern. I'm 
sure that, somewhere out there, there's a coder who prefers to write:

[mylist[1], mylist[2], mylist[3], mylist[4], mylist[5]]

instead of the obvious slice, but most of us didn't become programmers 
because we love the tedious, repetitive boilerplate. 

[ QUOTE red QUOTE COMMA 
  QUOTE green QUOTE COMMA 
  QUOTE blue QUOTE COMMA 
  QUOTE yellow QUOTE COMMA 
  QUOTE magenta QUOTE COMMA 
  ... 
  ]

Wherever possible, we should let the interpreter or compiler do the 
repetitive stuff.
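
For the slice example above, a minimal sketch of the two spellings (the contents of `mylist` are made up):

```python
mylist = list("abcdefgh")

# The hand-written, element-by-element version.
by_hand = [mylist[1], mylist[2], mylist[3], mylist[4], mylist[5]]

# The slice: indices 1 up to but not including 6.
by_slice = mylist[1:6]

print(by_hand == by_slice)
```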

The average word length in English is five characters. That means that 
in a list of typical English words, more than a third of the expression 
is made up of the quotes and commas. In the example you give, there are 
twelve characters in the words themselves and eight characters worth of 
boilerplate surrounding them (quotes and commas, not including the 
spaces or brackets).
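
The arithmetic behind those figures, worked through as a small sketch. The "typical" case assumes five letters per word with two quotes and one comma each.

```python
words = ["red", "green", "blue"]

# Characters that carry information: 3 + 5 + 4.
word_chars = sum(len(w) for w in words)

# Boilerplate: two quotes per word, one comma between each pair of words.
quote_chars = 2 * len(words)
comma_chars = len(words) - 1
boilerplate = quote_chars + comma_chars

# Typical five-letter word: 3 boilerplate characters out of 8 in total.
typical_fraction = 3 / (5 + 3)

print(word_chars, boilerplate, typical_fraction)
```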

-- 
Steven
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/LXHDLMUBG72F4G56SEZKCV4SPATRVHQP/

[Python-ideas] Re: Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]

2019-10-23 Thread Guido van Rossum

On Wed, Oct 23, 2019 at 7:03 AM Todd  wrote:

> On Sat, Apr 7, 2018 at 4:48 AM Paul Moore  wrote:
>
>> On 7 April 2018 at 08:44, Raymond Hettinger 
>> wrote:
>> > Agreed that the "chain([x], it)" step is obscure.  That's a bit of a
>> bummer -- one of the goals for the itertools module was to be a generic
>> toolkit for chopping-up, modifying, and splicing iterator streams (sort of
>> a CRISPR for iterators).  The docs probably need another recipe to show
>> this pattern:
>> >
>> > def prepend(value, iterator):
>> >     "prepend(1, [2, 3, 4]) -> 1 2 3 4"
>> >     return chain([value], iterator)
>> >
>> > Thanks for taking a look at the proposal.  I was -0 when it came up
>> once before. Once I saw a use case pop-up on this list, I thought it might
>> be worth discussing again.
>>
>> I don't have much to add here - I typically agree that an explicit
>> loop is simpler, but my code tends not to be the sort that does this
>> type of operation, so my experience is either where it's not
>> appropriate, or where I'm unfamiliar with the algorithms, so terseness
>> is more of a problem to me than it would be to a domain expert.
>>
>> Having said that, I find that the arguments that it's easy to add and
>> it broadens the applicability of the function to be significant.
>> Certainly, writing a helper is simple, but as Tim pointed out, the
>> trick to writing that helper is obscure. Also, in the light of the
>> itertools design goal to be a toolkit for iterators, I often find that
>> the tools are just slightly *too* low level for my use case - they are
>> designed to be combined, certainly, but in practice I find that
>> building my own loop is often quicker than working out how to combine
>> them. (I don't have concrete examples, unfortunately - this feeling
>> comes from working back from the question of why I don't use itertools
>> more than I do). So I tend to favour such slight extensions to the use
>> cases of itertools functions.
>>
>> A recipe would help, but I don't know how much use the recipes see in
>> practice. I see a lot of questions where "there's a recipe for that"
>> is the answer - indicating that people don't always spot the recipes.
>>
>
> Part of the problem with the recipes is, as far as I am aware, the
> license.  The recipes appear to be under the Python-2.0 license, which
> complicates the licensing of any project you use them in that isn't already
> under that license.
>

That can be solved. We could explicitly license the recipes in the docs
under a simpler license. Please start a new thread, as this one has
attracted too much spam.
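
For reference, a runnable form of the `prepend` recipe quoted above, together with the `initial` keyword that `itertools.accumulate` accepts from Python 3.8 onwards. The sample values are made up for the sketch.

```python
from itertools import accumulate, chain


def prepend(value, iterator):
    "prepend(1, [2, 3, 4]) -> 1 2 3 4"
    return chain([value], iterator)


# Running totals seeded with a start value via the chain([x], it) trick.
via_prepend = list(accumulate(prepend(10, [1, 2, 3])))

# The same result using the keyword argument (Python 3.8+).
via_initial = list(accumulate([1, 2, 3], initial=10))

print(via_prepend)
print(via_initial)
```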

-- 
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*

___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/RMJKNJHMZZ7P4WDO6SMHQDMFWEV3XBL5/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

On Wed, Oct 23, 2019 at 10:09:41AM -0400, David Mertz wrote:

> One big problem with the current obvious way would be shared by the
> proposal. This hits me fairly often.
> 
> colors1 = "red green blue".split()  # happy
> 
> Later
> 
> colors2 = "cyan   forest green  burnt umber".split()
> # oops, not what I wanted, quote each separately

It isn't shared by the proposal.

  colors2 = %w[cyan   forest green  burnt\x20umber]


Escaping the space ``\ `` might be nicer, but escaping an invisible 
character is problematic (see the problems with the explicit line
continuation character ``\``) and we may not be able to add any new 
escape characters to the language. However a hex escape will do the 
trick.
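
Since `%w[...]` is not valid Python, the behaviour can only be approximated. The sketch below uses a hypothetical function `w()` and assumes that words are split on whitespace first and escapes are expanded afterwards; the proposal as written does not pin down those rules.

```python
def w(text):
    """Hypothetical stand-in for %w[...]: split, then expand escapes."""
    return [
        # Round-trip through bytes so that \x20 and similar escapes are
        # expanded, while non-Latin-1 characters survive unchanged.
        word.encode("latin-1", "backslashreplace").decode("unicode_escape")
        for word in text.split()
    ]


colors2 = w(r"cyan   forest green  burnt\x20umber")
print(colors2)
# prints: ['cyan', 'forest', 'green', 'burnt umber']
```

Under these assumptions, only the escaped space survives the split; "forest" and "green" still come out as separate items unless the space between them is escaped too.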


-- 
Steven
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/TCIWZCAPCIQJA2LMAKK6H4TWQNBJPUU7/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

2019-10-23 Thread Serhiy Storchaka


23.10.19 14:00, Steven D'Aprano wrote:
> On Wed, Oct 23, 2019 at 01:42:11PM +0300, Serhiy Storchaka wrote:
>> 23.10.19 13:08, Steven D'Aprano wrote:
>>> But the advantage of changing the syntax is that it becomes the One
>>> Obvious Way, and you know it will be efficient whatever version or
>>> implementation of Python you are using.
>>
>> There is already the One Obvious Way, and you know it will work whatever
>> version or implementation of Python you are using.
>
> Your "One Obvious Way" is not obvious to me. Should I write this:
>
>  # This is from actual code I have used.
>
>  ["zero", "one", "two", "three", "four", "five", "six", "seven",
>  "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
>  "fifteen", "sixteen", "seventeen", "eighteen", "nineteen",
>  "twenty", "twenty-one", "twenty-two", "twenty-three", "twenty-four"
>  "twenty-five", "twenty-six", "twenty-seven", "twenty-eight",
>  "twenty-nine", "thirty"]
>
> Or this?
>
>  """zero one two three four five six seven eight nine ten eleven
>  twelve thirteen fourteen fifteen sixteen seventeen eighteen nineteen
>  twenty twenty-one twenty-two twenty-three twenty-four twenty-five
>  twenty-six twenty-seven twenty-eight twenty-nine thirty""".split()
>
> I've been told by people that if I use the first style I'm obviously
> ignorant and don't know Python very well, and by other people that the
> second one is a hack and that I would fail a code review for using it.
>
> So please do educate me Serhiy, which one is the One Obvious Way that we
> should all agree is the right thing to do?


If you need a constant number, the most obvious way is to write it as a 
number literal, not int('123'). If you need a constant string, the most 
obvious way is to write it as a string literal, not bytes([65, 
66]).decode(). If you need a list of constant strings, the most obvious 
way is to write it as a list display consisting of string literals. It 
works in all Python versions.
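
The three comparisons in this paragraph, written out as a small sketch. Each pair evaluates to the same value; the difference is only in how directly the source says so.

```python
# A constant number: literal vs. conversion from a string.
number_literal = 123
number_converted = int("123")

# A constant string: literal vs. decoding a bytes object.
string_literal = "AB"
string_decoded = bytes([65, 66]).decode()

# A list of constant strings: list display vs. split().
list_display = ["red", "green", "blue"]
list_split = "red green blue".split()

print(number_literal == number_converted)
print(string_literal == string_decoded)
print(list_display == list_split)
```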


The second way works too in all actual Python versions (starting from 
1.6), and nobody will beat you if you use it in your code. It can save
you a few keystrokes. But it is less obvious and less general.
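
One hazard of long hand-written list displays is worth keeping in mind: Python joins adjacent string literals, so a dropped comma is not a syntax error. The list display quoted earlier in this message, as it appears in this archive, has no comma after "twenty-four", which would have exactly this effect. A minimal sketch with a shortened list:

```python
# Note the missing comma after "twenty-four": the two adjacent literals
# are silently concatenated into one string.
numbers = ["twenty-three", "twenty-four"
           "twenty-five", "twenty-six"]

print(len(numbers))
print(numbers[1])
```

The split form does not have this particular failure mode, although it has its own (items containing spaces).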

___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/DBGOE4HGYKIQ5WZITZDF65U3KAYFHWOT/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby


On 23/10/2019 15:09, David Mertz wrote:
> One big problem with the current obvious way would be shared by the
> proposal. This hits me fairly often.
>
> colors1 = "red green blue".split()  # happy
>
> Later
>
> colors2 = "cyan   forest green  burnt umber".split()
> # oops, not what I wanted, quote each separately


I'm seriously not getting the issue people have with

colours1 = ["red", "green", "blue"]

which has the advantage of saying what it means.

--
Rhodri James *-* Kynesim Ltd
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/SHGLAOIVC32AR5F2TWKNHNI7WGLTNME7/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

One big problem with the current obvious way would be shared by the
proposal. This hits me fairly often.

colors1 = "red green blue".split()  # happy

Later

colors2 = "cyan   forest green  burnt umber".split()
# oops, not what I wanted, quote each separately


On Wed, Oct 23, 2019, 7:03 AM Steven D'Aprano  wrote:

> On Wed, Oct 23, 2019 at 01:42:11PM +0300, Serhiy Storchaka wrote:
> > 23.10.19 13:08, Steven D'Aprano пише:
> > >But the advantage of changing the syntax is that it becomes the One
> > >Obvious Way, and you know it will be efficient whatever version or
> > >implementation of Python you are using.
> >
> > There is already the One Obvious Way, and you know it will work whatever
> > version or implementation of Python you are using.
>
> Your "One Obvious Way" is not obvious to me. Should I write this:
>
> # This is from actual code I have used.
>
> ["zero", "one", "two", "three", "four", "five", "six", "seven",
> "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
> "fifteen", "sixteen", "seventeen", "eighteen", "nineteen",
> "twenty", "twenty-one", "twenty-two", "twenty-three", "twenty-four",
> "twenty-five", "twenty-six", "twenty-seven", "twenty-eight",
> "twenty-nine", "thirty"]
>
> Or this?
>
> """zero one two three four five six seven eight nine ten eleven
> twelve thirteen fourteen fifteen sixteen seventeen eighteen nineteen
> twenty twenty-one twenty-two twenty-three twenty-four twenty-five
> twenty-six twenty-seven twenty-eight twenty-nine thirty""".split()
>
>
> I've been told by people that if I use the first style I'm obviously
> ignorant and don't know Python very well, and by other people that the
> second one is a hack and that I would fail a code review for using it.
>
> So please do educate me Serhiy, which one is the One Obvious Way that we
> should all agree is the right thing to do?
>
>
>
> --
> Steven
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/NHRHYL5JUO7BNR3ZAHEBSTIZA5FZ5ORG/

[Python-ideas] Re: Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]

On Sat, Apr 7, 2018 at 4:48 AM Paul Moore  wrote:

> On 7 April 2018 at 08:44, Raymond Hettinger 
> wrote:
> > Agreed that the "chain([x], it)" step is obscure.  That's a bit of a
> bummer -- one of the goals for the itertools module was to be a generic
> toolkit for chopping-up, modifying, and splicing iterator streams (sort of
> a CRISPR for iterators).  The docs probably need another recipe to show
> this pattern:
> >
> > def prepend(value, iterator):
> >     "prepend(1, [2, 3, 4]) -> 1 2 3 4"
> >     return chain([value], iterator)
> >
> > Thanks for taking a look at the proposal.  I was -0 when it came up once
> before. Once I saw a use case pop-up on this list, I thought it might be
> worth discussing again.
>
> I don't have much to add here - I typically agree that an explicit
> loop is simpler, but my code tends not to be the sort that does this
> type of operation, so my experience is either where it's not
> appropriate, or where I'm unfamiliar with the algorithms, so terseness
> is more of a problem to me than it would be to a domain expert.
>
> Having said that, I find that the arguments that it's easy to add and
> it broadens the applicability of the function to be significant.
> Certainly, writing a helper is simple, but as Tim pointed out, the
> trick to writing that helper is obscure. Also, in the light of the
> itertools design goal to be a toolkit for iterators, I often find that
> the tools are just slightly *too* low level for my use case - they are
> designed to be combined, certainly, but in practice I find that
> building my own loop is often quicker than working out how to combine
> them. (I don't have concrete examples, unfortunately - this feeling
> comes from working back from the question of why I don't use itertools
> more than I do). So I tend to favour such slight extensions to the use
> cases of itertools functions.
>
> A recipe would help, but I don't know how much use the recipes see in
> practice. I see a lot of questions where "there's a recipe for that"
> is the answer - indicating that people don't always spot the recipes.
>

Part of the problem with the recipes is, as far as I am aware, the
license.  The recipes appear to be under the Python-2.0 license, which
complicates the licensing of any project you use them in that isn't already
under that license.
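
For readers following the quoted recipe, here is a small self-contained
sketch of how prepend() can stand in for a start argument to
accumulate(). The sample numbers are made up for illustration; the
`initial` keyword shown at the end exists only in Python 3.8 and later.

```python
from itertools import accumulate, chain


def prepend(value, iterator):
    "prepend(1, [2, 3, 4]) -> 1 2 3 4"
    return chain([value], iterator)


# Running totals that start from 10 instead of from the first item.
totals = list(accumulate(prepend(10, [1, 2, 3])))

# Python 3.8+ spells the same thing with a keyword argument.
totals_initial = list(accumulate([1, 2, 3], initial=10))

print(totals)
```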
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/3JRKWJ7H7QIQP2RNVGJBK3XHB3J5OIXL/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

On Wed, Oct 23, 2019 at 5:30 AM Steven D'Aprano  wrote:

> On Tue, Oct 22, 2019 at 08:53:53PM -0400, Todd wrote:
>
> [I wrote this]
> > > I would expect %w{ ... } to return a set, not a list:
> > >
> > > %w[ ... ]  # list
> > > %w{ ... }  # set
> > > %w( ... )  # tuple
> > >
>
> [Todd replied]
> > This is growing into an entire new group of constructors for a very, very
> > limited number of operations that have been privileged for some reason.
>
> Sure. That's what syntactic sugar is: privileging one particular thing
> over another. That's why, for example, we privilege the idiom:
>
> import spam
> eggs = spam.eggs
>
> by giving it special syntax, but not
>
> class Spam: ...
> spam = Spam(args)
> del Spam
>
> Some things are privileged. We privilege for-loops as comprehensions,
> but not while-loops; we privilege getting a bunch of indexes in a
> sequence as a slice ``sequence[start:end]`` but not getting a bunch of
> items from a dict. Not everything can be syntactic sugar; but that
> doesn't mean nothing should be syntactic sugar.
>

This is getting bogged down in details.  Let me explain as simply as I can
why I don't think this is a good idea.

Everyone has a different set of things they want privileged with a new
syntax.  Everyone has different things they consider to be "annoyances"
that they wish took fewer characters to do.  And everyone who wants a new
syntax thinks that new syntax should be the "one way" of doing that
operation.  If we accepted every syntax everyone wants the language would
be unusable.  We have to draw the line somewhere.  For any new syntax I can
think of, it significantly simplified real use-cases, was more expressive
in some way, or made things more consistent.

This, on the other hand, does none of these.  Getting a performance benefit
doesn't require a new syntax.  So the only benefit this has is saving a few
characters once per operation, at the expense of being less flexible.  And
again, if we made a new syntax every time someone wanted to save a few
characters the language would be unusable.  So I just don't think this
reaches what is my understanding of the bar new syntax has to reach.
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/3SI4MAL5IOHXOAK2NROVQP2U4JKWEPAM/

[Python-ideas] Re: PEP 584: Add + and += operators to the built-in dict class.


On 23/10/2019 05:41, Steven D'Aprano wrote:

On Tue, Oct 22, 2019 at 11:39:59AM -0700, Mike Miller wrote:

On 2019-10-18 10:23, Ricky Teachey wrote:

but i'm -0 because i am very concerned it will not be obvious to new
learners, without constantly looking it up, whether adding two mappings
together would either:

The big trade off I'm gathering from this mega-thread is that the |, |=
operators are more accurate, but less obvious to newcomers, who will first
try +, += instead.

I'm surprised by that description. I don't think it is just newcomers
who either suggest or prefer plus over pipe, and I don't think that pipe
is "more accurate".


+1 (as one of the non-newcomers who prefers plus)

--
Rhodri James *-* Kynesim Ltd
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/KTZTMR3YM2YPFHXZ64N66WVZ6ZF4ARCI/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby


On 22/10/2019 20:53, Steve Jorgensen wrote:

See 
https://en.wikibooks.org/wiki/Ruby_Programming/Syntax/Literals#The_%_Notation 
for what Ruby offers.

For me, the arrays are the most useful aspect.

 %w{one two three}
 => ["one", "two", "three"]


This smells like Perl's quoting operators.  I wasn't a big fan of them 
even when I was a Perlmonger.  Given the choice of "glyph doing 
something" and "glyph doing something I understand", I'll take the 
latter every time.


--
Rhodri James *-* Kynesim Ltd
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/LI6YD57LQIHVLHWOVUH6VR4HLB6LJAUM/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

2019-10-23 Thread Dan Sommers


On 10/23/19 6:07 AM, Ricky Teachey wrote:


I would expect %w{ ... } to return a set, not a list:

     %w[ ... ]  # list
     %w{ ... }  # set
     %w( ... )  # tuple

and I would describe them as list/set/tuple "word literals". Unlike
list etc displays [spam, eggs, cheese] these would actually be true
literals that can be determined entirely at compile-time.

A more convenient way to populate lists/tuples/sets full of strings at 
compile time seems like a win.


If I might be allowed to bikeshed: the w seems unnecessary. Why not drop 
it in favor of a single character like %, and use an optional r for raw 
strings?


     %[words]  # "words".split()
     %{words}  # set("words".split())
     %(words)  # tuple("words".split())
     %r[wo\rds]  # "wo\\rds".split()
     %r{wo\rds}  # set("wo\\rds".split())
     %r(wo\rds)  # tuple("wo\\rds".split())


At that point, the "obvious" choice is an "s" (short for "split")
string rather than a whole new construct:

>>> s"one two three"
["one", "two", "three"]

which could be combined with "r" like f and b strings.
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/PN4CSBEU6GFNT37HBRU325F6BDRPWR7N/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

2019-10-23 Thread Ricky Teachey

> I would expect %w{ ... } to return a set, not a list:
>
> %w[ ... ]  # list
> %w{ ... }  # set
> %w( ... )  # tuple
>
> and I would describe them as list/set/tuple "word literals". Unlike
> list etc displays [spam, eggs, cheese] these would actually be true
> literals that can be determined entirely at compile-time.


A more convenient way to populate lists/tuples/sets full of strings at
compile time seems like a win.

If I might be allowed to bikeshed: the w seems unnecessary. Why not drop it
in favor of a single character like %, and use an optional r for raw
strings?

%[words]  # "words".split()
%{words}  # set("words".split())
%(words)  # tuple("words".split())
%r[wo\rds]  # "wo\\rds".split()
%r{wo\rds}  # set("wo\\rds".split())
%r(wo\rds)  # tuple("wo\\rds".split())
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/LZK45BJ5WGSOQ5PVLMV6YTWBW376RIJ4/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

On Wed, Oct 23, 2019 at 01:42:11PM +0300, Serhiy Storchaka wrote:
> 23.10.19 13:08, Steven D'Aprano пише:
> >But the advantage of changing the syntax is that it becomes the One
> >Obvious Way, and you know it will be efficient whatever version or
> >implementation of Python you are using.
> 
> There is already the One Obvious Way, and you know it will work whatever 
> version or implementation of Python you are using.

Your "One Obvious Way" is not obvious to me. Should I write this:

# This is from actual code I have used.

["zero", "one", "two", "three", "four", "five", "six", "seven",
"eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
"fifteen", "sixteen", "seventeen", "eighteen", "nineteen",
"twenty", "twenty-one", "twenty-two", "twenty-three", "twenty-four",
"twenty-five", "twenty-six", "twenty-seven", "twenty-eight",
"twenty-nine", "thirty"]

Or this?

"""zero one two three four five six seven eight nine ten eleven
twelve thirteen fourteen fifteen sixteen seventeen eighteen nineteen
twenty twenty-one twenty-two twenty-three twenty-four twenty-five
twenty-six twenty-seven twenty-eight twenty-nine thirty""".split()


I've been told by people that if I use the first style I'm obviously 
ignorant and don't know Python very well, and by other people that the 
second one is a hack and that I would fail a code review for using it.

So please do educate me Serhiy, which one is the One Obvious Way that we 
should all agree is the right thing to do?



-- 
Steven
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/ZM3O3BF7C6A5Y6NF3LKJNAN2WDXQMLTY/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

On Wed, Oct 23, 2019 at 9:23 PM Steven D'Aprano  wrote:
>
> On Wed, Oct 23, 2019 at 08:50:06PM +1100, Chris Angelico wrote:
> > On Wed, Oct 23, 2019 at 8:33 PM Steven D'Aprano  wrote:
> > > The most complicated feature I can think of is whether we should allow
> > > escaping spaces or not:
> > >
> > > names = %w[Aaron Susan Helen Fred Mary\ Beth]
> > > names = %w[Aaron Susan Helen Fred Mary%x20Beth]
> > >
> >
> > The second one? No. If you want that, use a post-processor or something.
>
> Ouch! Sorry, that was a brain-fart, I meant \x20 like in a string.

Oh! Then I withdraw the objection, heh.

> We surely will want to support the standard range of string escapes, not
> just ASCII identifiers, so once you support string escapes, you get \x20
> for free. The words should be arbitrary sequences of Unicode characters,
> not just limited to identifiers.
>

If you have string escapes, is "\]" a literal close bracket? It isn't
in a string literal, and yet people will expect to be able to escape
the delimiter. I think the proposal would work fine with a restricted
alphabet for the tokens, with room to potentially expand it in the
future.

ChrisA
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/ZVLA32RIYNIPEFDC4GKNSCUO2BT3LNTJ/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

2019-10-23 Thread Serhiy Storchaka


23.10.19 13:08, Steven D'Aprano пише:

But the advantage of changing the syntax is that it becomes the One
Obvious Way, and you know it will be efficient whatever version or
implementation of Python you are using.


There is already the One Obvious Way, and you know it will work whatever 
version or implementation of Python you are using.

___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/GUMSBFZK2SVYSKVEDRXZM6BWGKSOHOJ3/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

On Wed, Oct 23, 2019 at 08:50:06PM +1100, Chris Angelico wrote:
> On Wed, Oct 23, 2019 at 8:33 PM Steven D'Aprano  wrote:
> > The most complicated feature I can think of is whether we should allow
> > escaping spaces or not:
> >
> > names = %w[Aaron Susan Helen Fred Mary\ Beth]
> > names = %w[Aaron Susan Helen Fred Mary%x20Beth]
> >
> 
> The second one? No. If you want that, use a post-processor or something.

Ouch! Sorry, that was a brain-fart, I meant \x20 like in a string.

We surely will want to support the standard range of string escapes, not 
just ASCII identifiers, so once you support string escapes, you get \x20 
for free. The words should be arbitrary sequences of Unicode characters, 
not just limited to identifiers.
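
One wrinkle worth seeing concretely: with today's split() idiom the
escape is resolved before the split happens, so \x20 does not protect a
space. A word literal would presumably have to find the word boundaries
first and process escapes afterwards; that ordering is an assumption
about a possible design, not something settled in this thread.

```python
# The string literal already contains a real space by the time split() runs,
# so "Mary" and "Beth" end up as separate items.
names = "Aaron Susan Helen Fred Mary\x20Beth".split()
print(names)
```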

> But I'm not a fan of the %w syntax.

I'm not wedded to it :-)


-- 
Steven
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/YSB6TM5YAIGKYFPO7RX7SWT2YMAJ74RX/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

On Wed, Oct 23, 2019 at 11:47:04AM +1100, Chris Angelico wrote:

> This could be done as an optimization without changing syntax or
> semantics. As long as the initial string is provided as a literal, it
> should be possible to call the method at compile time, since (as far
> as I know) every string method is a pure function.

Sure, it could be done as an optimization, similar to one of the 
proposals here:

https://bugs.python.org/issue36906

It could also be done by a source code preprocessor, or an AST 
transformation, without changing syntax.

But the advantage of changing the syntax is that it becomes the One 
Obvious Way, and you know it will be efficient whatever version or 
implementation of Python you are using.
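
As a rough illustration of the AST-transformation route, here is a
minimal sketch. The class name FoldLiteralSplit is made up for this
example, it only handles split() with no arguments on a string literal,
and it assumes Python 3.8+ (where literals parse to ast.Constant). It is
not how CPython or the linked issue does it.

```python
import ast


class FoldLiteralSplit(ast.NodeTransformer):
    """Rewrite "a b c".split() (no arguments) into ["a", "b", "c"]."""

    def visit_Call(self, node):
        self.generic_visit(node)  # fold nested calls first
        func = node.func
        if (
            isinstance(func, ast.Attribute)
            and func.attr == "split"
            and isinstance(func.value, ast.Constant)
            and isinstance(func.value.value, str)
            and not node.args
            and not node.keywords
        ):
            # Do the split now and emit a list display of constants.
            words = func.value.value.split()
            folded = ast.List(
                elts=[ast.Constant(value=w) for w in words], ctx=ast.Load()
            )
            return ast.copy_location(folded, node)
        return node


source = 'colors = "red green blue".split()'
tree = ast.fix_missing_locations(FoldLiteralSplit().visit(ast.parse(source)))

namespace = {}
exec(compile(tree, "<folded>", "exec"), namespace)
print(namespace["colors"])
```

Because the result is a list display, a fresh list is still built each
time the statement runs, so mutability is not affected.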

-- 
Steven
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/ZYJWMGFYDTTCSRND43BUG2LPEQZ4YZTL/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

On Wed, Oct 23, 2019 at 8:33 PM Steven D'Aprano  wrote:
> The most complicated feature I can think of is whether we should allow
> escaping spaces or not:
>
> names = %w[Aaron Susan Helen Fred Mary\ Beth]
> names = %w[Aaron Susan Helen Fred Mary%x20Beth]
>

The second one? No. If you want that, use a post-processor or something.

Using a backslash to escape a space would be a decent option, but I'd
also be fine with disallowing it, if it makes it easier to define the
grammar. If this syntax is restricted to a blank-separated sequence of
atoms ("NAME" in the grammar), it will still be of significant value,
and there's always the option to make it more flexible in the future.
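
For comparison, the standard library already offers backslash-escaped
spaces at run time through shlex.split(), which uses shell-like rules.
This is only a sketch of the behaviour being discussed, not a suggestion
that the proposed syntax should follow shell quoting.

```python
import shlex

# A raw string keeps the backslash so shlex can see it;
# in POSIX mode "\ " means a literal space inside a word.
names = shlex.split(r"Aaron Susan Helen Fred Mary\ Beth")
print(names)
```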

But I'm not a fan of the %w syntax. If it comes to selection of colour
for the bikeshed, I'd rather that the list be created using another
variant of the same syntax we currently have for list creation:

numbers = [1, 2, 3, 4, 5]
from_loop = [x * 2 for x in numbers]
names = [from Aaron Susan Helen Fred Mary]

"Build a list from this set of words." Every list creation starts with
an open bracket and ends with a close bracket.

But that's just bikeshedding.

ChrisA
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/E3MOFJMRL3FWXGKYYVV7OZGOOW66DHVA/

[Python-ideas] Re: Percent notation for array and string literals, similar to Perl, Ruby

On Tue, Oct 22, 2019 at 08:53:53PM -0400, Todd wrote:

[I wrote this]
> > I would expect %w{ ... } to return a set, not a list:
> >
> > %w[ ... ]  # list
> > %w{ ... }  # set
> > %w( ... )  # tuple
> >

[Todd replied] 
> This is growing into an entire new group of constructors for a very, very
> limited number of operations that have been privileged for some reason.

Sure. That's what syntactic sugar is: privileging one particular thing 
over another. That's why, for example, we privilege the idiom:

import spam
eggs = spam.eggs

by giving it special syntax, but not

class Spam: ...
spam = Spam(args)
del Spam

Some things are privileged. We privilege for-loops as comprehensions, 
but not while-loops; we privilege getting a bunch of indexes in a 
sequence as a slice ``sequence[start:end]`` but not getting a bunch of 
items from a dict. Not everything can be syntactic sugar; but that 
doesn't mean nothing should be syntactic sugar.

> Should %{a=b c=d} create dicts, too?  Why not?  

Probably not, because we can already say ``dict(spam=x)`` to get the key 
"spam". That's specifically one of the motivating examples why dict 
takes keyword arguments. In the early days of Python, it didn't.
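
A small sketch of what is already available without new syntax. The
keys and values are placeholders, and the second form is just one way
to get the effect of the hypothetical %{a=b c=d} from a plain string.

```python
# Keyword arguments give string keys without quoting them.
by_keyword = dict(spam=1, eggs=2)

# dict() also accepts an iterable of key/value pairs, so a
# whitespace-separated "key=value" string can be turned into a dict.
from_text = dict(item.split("=") for item in "a=b c=d".split())

print(by_keyword, from_text)
```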

> Why should strings be privileged over, say, numbers?

Because we don't write ints or floats or complex numbers with 
delimiters. We say 5, not "5".

> Why should %w[1 2 3] make ['1', '2', '3'] instead of [1, 2, 3]?

Because the annoyance factor of having to quote each word is far greater 
than the annoyance factor of having to put commas between values.

> And why whitespace instead of a comma?

Because separating words with whitespace is convenient when you have a 
lot of data. The spacebar, Tab and Enter keys are nice, big targets 
which are easy to hit, the comma isn't.

Splitting on whitespace means that spaces and newlines Just Work:

data = %w[alpha beta gamma ...
  psi chi omega]
# gives ['alpha', 'beta', 'gamma', ... 'psi', 'chi', 'omega']

whereas splitting on commas alone gives you a nasty surprise:

data = %w[alpha, beta, gamma, ...,
  psi, chi, omega]
# ['alpha', ' beta', ' gamma', ..., '\n  psi', ' chi', ' omega']

To avoid that, you need to complicate the rule to something like "commas 
or whitespace", or "commas optionally followed by whitespace", or 
something even more complicated. The more complicated the rule, the more 
surprising it will be when you get caught out by some odd corner case of 
the rule you weren't expecting.

Splitting on whitespace is a nice, simple rule that cannot go wrong. Why 
make it more complicated than it needs to be?
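
The difference can be reproduced today with ordinary string methods. In
this sketch the regular expression is one possible reading of "commas
optionally followed by whitespace", and the trailing-comma case shows
the kind of corner such a rule brings with it.

```python
import re

data = "alpha, beta, gamma,\n  psi, chi, omega"

# Splitting on commas alone keeps the surrounding whitespace.
on_commas = data.split(",")

# "Commas optionally followed by whitespace" cleans that up...
on_pattern = re.split(r",\s*", data)

# ...but a trailing comma now produces an empty final item.
trailing = re.split(r",\s*", "alpha, beta,")

# Plain whitespace splitting has no such corner.
on_whitespace = "alpha beta gamma\n  psi chi omega".split()

print(on_commas, on_pattern, trailing, on_whitespace, sep="\n")
```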

> We have
> general ways to handle all of this stuff that doesn't lock us into a single
> special case.

Who is talking about locking us into a special case? "string 
literal".split() will still work, so will ["string", "literal"].

> > and I would describe them as list/set/tuple "word literals". Unlike
> > list etc displays [spam, eggs, cheese] these would actually be true
> > literals that can be determined entirely at compile-time.
> >
> 
> I don't know enough about the internals to say whether this would be
> possible or not.

It would be a pretty awful compiler that couldn't take a space-separated 
sequence of characters and compile them as strings.

I'm not wedded to the leading and trailing delimiters %w[ and ] if they 
turn out to be ambiguous with something else (the % operator?), but I 
don't think they will be.

[...]
> Yes, I understand that Python has syntactic sugar.  But any new syntactic
> sugar necessarily has an uphill battle due to people having to learn it, books
> and classes having to be updated, linters updated, new pep8 guidelines
> written, etc.  We already have a way to split strings.  So the question is
> why we need this in addition to what we already have,

Because it smooths out a minor annoyance and makes for a more pleasant 
programming experience for the coder, without having to worry (rightly 
or wrongly) about performance.

The status quo is that every time I need to write a list or set of 
words, I have to stop and think:

"Should I quote them all by hand, like a caveman, or get the 
interpreter to split it? If I get the interpreter to split 
it, will it hurt performance?"

but with this proposed syntax, there is One Obvious Way to write a list 
of words. I won't have to think about it, or worry that I should be 
worrying about performance.
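
Anyone who wants to check the performance question on their own
interpreter can use a quick sketch along these lines. The three-word
example and the repeat count are arbitrary, and the timings will vary
by machine, version, and implementation, so no numbers are claimed here.

```python
import timeit

display = '["red", "green", "blue"]'
split_call = '"red green blue".split()'

# Both spellings build the same list.
same_result = eval(display) == eval(split_call)

# Time each spelling; only the ratio between the two is meaningful.
t_display = timeit.timeit(display, number=100_000)
t_split = timeit.timeit(split_call, number=100_000)

print(f"list display: {t_display:.4f}s   split(): {t_split:.4f}s")
```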

> especially
> considering it is so radically different than anything else in Python.

Your idea of "radically different" is a lot less radical than mine.

To me, radically different would mean something like Hypertalk syntax:

put the value of the third line of text into word seven of result

or Reverse Polish Notation syntax. Not adding a prefix to list 
delimiters. We already have string prefixes, we already have list 
delimiters, putting the two concepts together is not a huge 
conceptu

[Python-ideas] Fwd: Re: PEP 584: Add + and += operators to the built-in dict class.