[Python-ideas] List assignment - extended slicing inconsistency

2018-02-22 Thread Alexander Heger
From what little documentation I could find, providing a stride on the
assignment target for a list is supposed to trigger 'advanced slicing',
causing element-wise replacement - and hence requiring that the source
iterable has the appropriate number of elements.

>>> a = [0,1,2,3]
>>> a[::2] = [4,5]
>>> a
[4, 1, 5, 3]
>>> a[::2] = [4,5,6]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: attempt to assign sequence of size 3 to extended slice of size 2

This is in contrast to regular slicing (*without* a stride), which allows
replacing a *range* by another sequence of arbitrary length.

>>> a = [0,1,2,3]
>>> a[:3] = [4]
>>> a
[4, 3]

Issue
=====
When, however, a stride of `1` is specified, advanced slicing is not
triggered.

>>> a = [0,1,2,3]
>>> a[:3:1] = [4]
>>> a
[4, 3]

If advanced slicing had been triggered, there should have been a ValueError
instead.

Expected behaviour:

>>> a = [0,1,2,3]
>>> a[:3:1] = [4]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: attempt to assign sequence of size 1 to extended slice of size 3

I think that is an inconsistency in the language that should be fixed.

Why do we need this?

One may want this as an extra check so that the list does not change
size.  Depending on the implementation, it may come with performance
benefits as well.

One could, though, argue that you still get the same result if you do
everything correctly:

>>> a = [0,1,2,3]
>>> a[:3:1] = [4,5,6]
>>> a
[4, 5, 6, 3]

But I disagree that there should be no error when it is wrong.
*Strides that are not None should always trigger advanced slicing.*

Other Data Types

This change should also be applied to bytearray, etc., though see below.

Concerns

It may break code that specifies a stride of `1` and relies on regular
slicing taking place.  I assume such cases are exceptionally rare, and the
error message should be clear enough to allow fixes.

If the implementation relies on `slice.indices(len(seq))[2] == 1` to
decide whether or not to use advanced slicing, that would require some
refactoring.  If it only checks `slice.step in (1, None)`, then this could
easily be replaced by a check against None.
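
A minimal sketch of the proposed rule at the Python level. `StrictList` is a
hypothetical subclass made up for illustration, not an existing type; it
inspects the raw `step` attribute of the slice object before anything
normalises it, so any step other than None triggers the length check.

```python
class StrictList(list):
    """Hypothetical list subclass: any explicit step triggers the size check."""

    def __setitem__(self, key, value):
        if isinstance(key, slice) and key.step is not None:
            # Materialise the source once so iterators are not consumed twice.
            value = list(value)
            # range(*indices) covers exactly the positions the slice selects.
            target_len = len(range(*key.indices(len(self))))
            if len(value) != target_len:
                raise ValueError(
                    f"attempt to assign sequence of size {len(value)} "
                    f"to extended slice of size {target_len}"
                )
        super().__setitem__(key, value)


a = StrictList([0, 1, 2, 3])
a[:3] = [4]            # no step given: regular slicing, the list may resize
b = StrictList([0, 1, 2, 3])
b[:3:1] = [4, 5, 6]    # explicit step, matching length: element-wise replace
```

With this sketch, `b[:3:1] = [4]` raises ValueError, while `a[:3] = [4]`
keeps resizing the list as before.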

Will there be issues with syntax consistency with other data types, in
particular outside the core library?
- I have always found the dynamic behaviour of lists with respect to
non-advanced slicing somewhat peculiar in the first place, though,
undeniably, it can be immensely useful.
- Most external data types with fixed memory, such as numpy arrays, do not
have this dynamic flexibility, and the behaviour of regular slicing on
assignment is the same as that of advanced slicing.  The proposed change
would increase consistency with these other data types.

More surprises
==============
>>> import array
>>> a[1::2] = a[3:3]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: attempt to assign sequence of size 0 to extended slice of size 2

whereas

>>> a = [1,2,3,4,5]
>>> a[1::2] = a[3:3]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: attempt to assign sequence of size 0 to extended slice of size 2

>>> a = bytearray(b'12345')
>>> a[1::2] = a[3:3]
>>> a
bytearray(b'135')

but numpy

>>> import numpy as np
>>> a = np.array([1,2,3,4,5])
>>> a[1::2] = a[3:3]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: could not broadcast input array from shape (0) into shape (2)

and

>>> import numpy as np
>>> a[1:2] = a[3:3]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: could not broadcast input array from shape (0) into shape (1)

The latter two as expected.  memoryview behaves the same.
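
The standard-library part of this comparison can be rerun as one script. This
is only a sketch: `try_strided_delete` is a hypothetical helper name, and the
results depend on the interpreter version, so the output is printed rather
than assumed.

```python
import array


def try_strided_delete(seq):
    """Assign an empty slice of seq to seq[1::2] and report what happens."""
    try:
        seq[1::2] = seq[3:3]
    except ValueError as exc:
        return f"ValueError: {exc}"
    return repr(seq)


for sample in ([1, 2, 3, 4, 5],
               bytearray(b"12345"),
               array.array("i", [1, 2, 3, 4, 5])):
    print(type(sample).__name__, "->", try_strided_delete(sample))
```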

Issue 2
=======
Whereas NumPy is known to behave differently as a data type with fixed
memory layout, and is not part of the standard library anyway, I find the
difference in behaviour between lists and arrays disconcerting.
This should be resolved to a consistent behaviour.

Proposal 2
==========
Arrays and bytearrays should adopt the same advanced slicing
behaviour I suggest for lists.

Concerns 2
==========
This has the potential for a lot more side effects in existing code, but as
before, in most cases an error message should be triggered.

Summary
=======
I do not find it acceptable as good language design that there is a large
range of behaviours for slicing in assignment targets across the different
native (and standard library) data types of seemingly similar kind, and that
users have to figure out for each data type by testing - or at the very
least remember, if documented - how it behaves on slicing in assignment
targets.  There should be consistent behaviour at the very least, ideally
with a clear user interface as suggested for lists.

-Alexander
___
Python-ideas mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] List assignment - extended slicing inconsistency

2018-02-22 Thread Guido van Rossum
On Thu, Feb 22, 2018 at 2:18 PM, Alexander Heger wrote:

> What little documentation I could find, providing a stride on the
> assignment target for a list is supposed to trigger 'advanced slicing'
> causing element-wise replacement - and hence requiring that the source
> iterable has the appropriate number of elements.
>
> >>> a = [0,1,2,3]
> >>> a[::2] = [4,5]
> >>> a
> [4, 1, 5, 3]
> >>> a[::2] = [4,5,6]
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ValueError: attempt to assign sequence of size 3 to extended slice of size
> 2
>
> This is in contrast to regular slicing (*without* a stride), allowing to
> replace a *range* by another sequence of arbitrary length.
>
> >>> a = [0,1,2,3]
> >>> a[:3] = [4]
> >>> a
> [4, 3]
>
> Issue
> =====
> When, however, a stride of `1` is specified, advanced slicing is not
> triggered.
>
> >>> a = [0,1,2,3]
> >>> a[:3:1] = [4]
> >>> a
> [4, 3]
>
> If advanced slicing had been triggered, there should have been a
> ValueError instead.
>
> Expected behaviour:
>
> >>> a = [0,1,2,3]
> >>> a[:3:1] = [4]
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ValueError: attempt to assign sequence of size 1 to extended slice of size
> 3
>
> I think that is an inconsistency in the language that should be fixed.
>
> Why do we need this?
> 
> One may want this as extra check as well so that list does not change
> size.  Depending on implementation, it may come with performance benefits
> as well.
>
> One could, though, argue that you still get the same result if you do all
> correctly
>
> >>> a = [0,1,2,3]
> >>> a[:3:1] = [4,5,6]
> >>> a
> [4, 5, 6, 3]
>
> But I disagree that there should be no error when it is wrong.
> *Strides that are not None should always trigger advanced slicing.*
>

This makes sense.

(I wonder if the discrepancy is due to some internal interface that loses
the distinction between None and 1 before the decision is made whether to
use advanced slicing or not. But that's a possible explanation, not an
excuse.)


> Other Data Types
> 
> This change should also be applied to bytearray, etc., though see below.
>

Sure.


> Concerns
> 
> It may break some code that uses advanced slicing and expects regular
> slicing to occur?  These cases should be rare, and the error message should
> be clear enough to allow fixes? I assume these cases should be
> exceptionally rare.
>

Yeah, backwards compatibility sometimes prevents fixing a design bug. I
don't know if that's the case here, we'll need reports from real-world code.


> If the implementation relies on `slice.indices(len(seq))[2] == 1` to
> determine about advance slicing or not, that would require some
> refactoring.  If it is only `slice.stride in (1, None)` then this could
> easily replaced by checking against None.
>
> Will there be issues with syntax consistency with other data types, in
> particular outside the core library?
>

Things outside the stdlib are responsible for their own behavior. Usually
they can move faster and with less worry about breaking backward
compatibility.

>
> - I always found that the dynamic behaviour of lists w/r non-advanced
> slicing to be somewhat peculiar in the first place, though, undeniably, it
> can be immensely useful.
>

If you're talking about the ability to resize a list by assigning to a
slice, that's as intended. It predates advanced slicing by a decade or more.
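
For reference, a short illustration of that resizing behaviour with plain
(step-less) slices on a built-in list; the values are arbitrary and chosen
only for the example.

```python
a = [0, 1, 2, 3]
a[1:1] = [10, 11]   # empty target slice: inserts without replacing anything
print(a)            # [0, 10, 11, 1, 2, 3]
a[:2] = []          # empty source: deletes the first two elements
print(a)            # [11, 1, 2, 3]
a[1:] = [7]         # replaces a three-element tail with a single element
print(a)            # [11, 7]
```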


> - Most external data types with fixed memory such as numpy do not have
> this dynamic flexibility, and the behavior of regular slicing on assignment
> is the same as regular slicing.  The proposed change would increase
> consistency with these other data types.
>

How? Resizing through slice assignment will stay for builtin types -- if
numpy doesn't support that, so be it.


> More surprises
> ==============
> >>> import array
> >>> a[1::2] = a[3:3]
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ValueError: attempt to assign sequence of size 0 to extended slice of size
> 2
>
> whereas
>
> >>> a = [1,2,3,4,5]
> >>> a[1::2] = a[3:3]
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ValueError: attempt to assign sequence of size 0 to extended slice of size
> 2
>

OK, so array doesn't use the same rules. That should be fixed too probably
(assuming whatever is valid today remains valid).


> >>> a = bytearray(b'12345')
> >>> a[1::2] = a[3:3]
> >>> a
> bytearray(b'135')
>

Bytearray should also follow the same rules.


> but numpy
>
> >>> import numpy as np
> >>> a = np.array([1,2,3,4,5])
> >>> a[1::2] = a[3:3]
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ValueError: could not broadcast input array from shape (0) into shape (2)
>
> and
>
> >>> import numpy as np
> >>> a[1:2] = a[3:3]
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ValueError: could not broadcast input array from shape (0) into shape (1)
>
> The latter two as expected.  memoryview behaves the same.
>

Let's le

Re: [Python-ideas] List assignment - extended slicing inconsistency

2018-02-22 Thread Nick Coghlan
On 23 February 2018 at 11:51, Guido van Rossum wrote:
> On Thu, Feb 22, 2018 at 2:18 PM, Alexander Heger wrote:
>> But I disagree that there should be no error when it is wrong.
>> *Strides that are not None should always trigger advanced slicing.*
>
> This makes sense.
>
> (I wonder if the discrepancy is due to some internal interface that loses
> the distinction between None and 1 before the decision is made whether to
> use advanced slicing or not. But that's a possible explanation, not an
> excuse.)

That explanation seems pretty likely to me, as for the data types
implemented in C, we tend to switch to the Py_ssize_t form of slices
pretty early, and that can't represent the None/1 distinction.

Even for Python level collections, you lose the distinction as soon as
you call slice.indices (as that promises to return a 3-tuple of
integers).
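
A quick way to see this at the Python level (a small sketch; the names
`implicit` and `explicit` are just illustrative): the slice objects differ in
their raw `step`, but `slice.indices` maps both to the same tuple.

```python
implicit = slice(None, 3)        # what a[:3] passes to __setitem__
explicit = slice(None, 3, 1)     # what a[:3:1] passes to __setitem__

print(implicit.step, explicit.step)   # None 1
print(implicit.indices(4))            # (0, 3, 1)
print(explicit.indices(4))            # (0, 3, 1)
```

So any check that has to tell the two apart needs to look at the slice
object itself, before `indices` is called.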

>> Concerns
>> 
>> It may break some code that uses advanced slicing and expects regular
>> slicing to occur?  These cases should be rare, and the error message should
>> be clear enough to allow fixes? I assume these cases should be exceptionally
>> rare.
>
> Yeah, backwards compatibility sometimes prevents fixing a design bug. I
> don't know if that's the case here, we'll need reports from real-world code.

In this case, we should be able to start with a DeprecationWarning in
3.8, since we already have the checks in place to raise ValueError
when the step is 2 or more - any patch would just need to make sure
those checks either have access to the original slice object (so they
can check the raw step value), or else an internal flag indicating
whether or not an explicit step was provided.

So the next step would be to file an issue pointing back to this
thread for acknowledgement that this is a design bug to be handled
with a DeprecationWarning in 3.8, and a ValueError in 3.9+.
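
As a rough Python-level sketch of what that transition could look like,
assuming a hypothetical `WarningList` subclass and made-up warning text
(the real change would live in the C implementation of list):

```python
import warnings


class WarningList(list):
    """Hypothetical transition behaviour: warn where a later version would raise."""

    def __setitem__(self, key, value):
        # Only an explicit step of 1 is affected; a step of None stays silent.
        if isinstance(key, slice) and key.step == 1:
            value = list(value)
            target_len = len(range(*key.indices(len(self))))
            if len(value) != target_len:
                warnings.warn(
                    "assigning a sequence of a different size to a slice "
                    "with an explicit step of 1 is deprecated",
                    DeprecationWarning,
                    stacklevel=2,
                )
        super().__setitem__(key, value)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    a = WarningList([0, 1, 2, 3])
    a[:3:1] = [4]      # still resizes for now, but emits a warning
```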

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia